In probability theory and statistics, a random variable X has an expected value μ = E(X) and a variance σ² = E((X − μ)²). These are the first two cumulants: μ = κ₁ and σ² = κ₂.
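The cumulants κₙ are defined by the cumulant-generating function g(t):
$$g(t) = \log \operatorname{E}\left(e^{tX}\right) = \sum_{n=1}^{\infty} \kappa_n \frac{t^n}{n!} = \mu t + \sigma^2 \frac{t^2}{2} + \cdots$$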
The cumulants are then given by derivatives (at zero) of g(t):
$$\kappa_1 = g'(0) = \mu, \qquad \kappa_2 = g''(0) = \sigma^2, \qquad \kappa_n = g^{(n)}(0).$$
A distribution with given cumulants κₙ can be approximated through the Edgeworth series.
The cumulants of a distribution are closely related to the distribution's moments, as described later. Working with cumulants can have an advantage over using moments because, for independent random variables X and Y,
$$g_{X+Y}(t) = \log \operatorname{E}\left(e^{t(X+Y)}\right) = \log\left(\operatorname{E}\left(e^{tX}\right) \operatorname{E}\left(e^{tY}\right)\right) = g_X(t) + g_Y(t),$$
so that each cumulant of a sum is the sum of the corresponding cumulants of the addends.
Some writers[1][2] prefer to define the cumulant generating function via the characteristic function:
$$h(t) = \log \operatorname{E}\left(e^{itX}\right) = \sum_{n=1}^{\infty} \kappa_n \frac{(it)^n}{n!} = \mu i t - \sigma^2 \frac{t^2}{2} + \cdots$$
This characterization of cumulants is valid even for distributions whose higher moments do not exist.

Introducing the variance-to-mean ratio
$$\varepsilon = \mu^{-1} \sigma^2 = \kappa_1^{-1} \kappa_2,$$
the probability distributions listed below share a unified formula for the derivative of the cumulant generating function:
$$g'(t) = \mu \cdot \left(1 + \varepsilon \cdot \left(e^{-t} - 1\right)\right)^{-1}.$$
The second derivative is
$$g''(t) = g'(t) \cdot \left(1 + e^{t} \cdot \left(\varepsilon^{-1} - 1\right)\right)^{-1},$$
confirming that the first cumulant is κ₁ = g′(0) = μ and the second cumulant is κ₂ = g″(0) = μ·ε. The constant random variables X = μ have ε = 0. The binomial distributions have ε = 1 − p, so that 0 < ε < 1. The Poisson distributions have ε = 1. The negative binomial distributions have ε = p⁻¹, so that ε > 1. Note the analogy to the eccentricity theory of the conic sections: circles ε = 0, ellipses 0 < ε < 1, parabolas ε = 1, hyperbolas ε > 1.
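As a quick consistency check, the unified formula can be verified by hand for two of these families. This is only a worked sketch: it assumes the formula takes the form g′(t) = μ·(1 + ε·(e^(−t) − 1))^(−1), and it uses the standard cumulant generating functions of the Poisson and binomial distributions, with n (a symbol not used in the text above) denoting the number of binomial trials.

```latex
% Poisson with mean \mu: g(t) = \mu (e^{t} - 1), so \varepsilon = 1
g'(t) = \mu e^{t} = \mu \left(1 + 1 \cdot (e^{-t} - 1)\right)^{-1}

% Binomial with n trials and success probability p:
% g(t) = n \log(1 - p + p e^{t}), \quad \mu = n p, \quad \varepsilon = 1 - p
g'(t) = \frac{n p e^{t}}{1 - p + p e^{t}}
      = \frac{n p}{p + (1 - p) e^{-t}}
      = \mu \left(1 + (1 - p)(e^{-t} - 1)\right)^{-1}
```

In both cases setting t = 0 gives g′(0) = μ, in agreement with the first cumulant being the expected value.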
The first cumulant is shift-equivariant; all of the others are shift-invariant. To state this less tersely, denote by κₙ(X) the nth cumulant of the probability distribution of the random variable X. The statement is that if c is constant then κ₁(X + c) = κ₁(X) + c and κₙ(X + c) = κₙ(X) for n ≥ 2, i.e., c is added to the first cumulant, but all higher cumulants are unchanged.
The nth cumulant is homogeneous of degree n, i.e., if c is any constant, then
$$\kappa_n(cX) = c^n \kappa_n(X).$$
If X and Y are independent random variables then κₙ(X + Y) = κₙ(X) + κₙ(Y).
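The three properties above (shift, homogeneity, additivity) can be checked exactly on small discrete distributions. The following is a minimal sketch rather than a general-purpose routine: the helper names and the two example distributions are hypothetical, and it only covers the first four cumulants, assuming the relations κ₂ = μ₂, κ₃ = μ₃ and κ₄ = μ₄ − 3μ₂² between cumulants and central moments.

```python
from fractions import Fraction
from itertools import product


def first_four_cumulants(pmf):
    """Return (k1, k2, k3, k4) for a finite distribution {value: probability}.

    Assumes k1 = mean, k2 = mu2, k3 = mu3, k4 = mu4 - 3 * mu2**2,
    where mu_r denotes the r-th central moment.
    """
    mean = sum(p * x for x, p in pmf.items())

    def central(r):
        return sum(p * (x - mean) ** r for x, p in pmf.items())

    mu2, mu3, mu4 = central(2), central(3), central(4)
    return (mean, mu2, mu3, mu4 - 3 * mu2 ** 2)


def shift(pmf, c):
    """Distribution of X + c."""
    return {x + c: p for x, p in pmf.items()}


def scale(pmf, c):
    """Distribution of c * X (c must be nonzero so the values stay distinct)."""
    return {c * x: p for x, p in pmf.items()}


def independent_sum(pmf_x, pmf_y):
    """Distribution of X + Y for independent X and Y (discrete convolution)."""
    out = {}
    for (x, p), (y, q) in product(pmf_x.items(), pmf_y.items()):
        out[x + y] = out.get(x + y, 0) + p * q
    return out


# Two small illustrative distributions with exact rational probabilities.
X = {0: Fraction(1, 2), 1: Fraction(1, 3), 4: Fraction(1, 6)}
Y = {-1: Fraction(1, 4), 2: Fraction(3, 4)}
c = Fraction(3)

kx = first_four_cumulants(X)
ky = first_four_cumulants(Y)
shifted = first_four_cumulants(shift(X, c))
scaled = first_four_cumulants(scale(X, c))
summed = first_four_cumulants(independent_sum(X, Y))
```

Because the arithmetic is done with `Fraction`, the comparisons are exact: `shifted` differs from `kx` only in its first entry, `scaled` multiplies the nth entry by cⁿ, and `summed` is the entrywise sum of `kx` and `ky`.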
Given the results for the cumulants of the normal distribution, it might be hoped to find families of distributions for which κₘ = κₘ₊₁ = ... = 0 for some m > 3, with the lower-order cumulants (orders 3 to m − 1) being non-zero. There are no such distributions[3]. The underlying result here is that the cumulant generating function cannot be a finite-order polynomial of degree greater than 2.
The moment generating function is:
$$1 + \sum_{n=1}^{\infty} \frac{\mu'_n t^n}{n!} = \exp\left(\sum_{n=1}^{\infty} \frac{\kappa_n t^n}{n!}\right) = \exp(g(t)),$$
so the cumulant generating function is the logarithm of the moment generating function. The first cumulant is the expected value; the second and third cumulants are respectively the second and third central moments (the second central moment is the variance); but the higher cumulants are neither moments nor central moments, but rather more complicated polynomial functions of the moments.
The cumulants are related to the moments by the following recursion formula:
$$\kappa_n = \mu'_n - \sum_{m=1}^{n-1} \binom{n-1}{m-1} \kappa_m \mu'_{n-m}.$$
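The nth moment μ′ₙ is an nth-degree polynomial in the first n cumulants, thus:
$$\mu'_1 = \kappa_1$$
$$\mu'_2 = \kappa_2 + \kappa_1^2$$
$$\mu'_3 = \kappa_3 + 3\kappa_2\kappa_1 + \kappa_1^3$$
$$\mu'_4 = \kappa_4 + 4\kappa_3\kappa_1 + 3\kappa_2^2 + 6\kappa_2\kappa_1^2 + \kappa_1^4$$
$$\mu'_5 = \kappa_5 + 5\kappa_4\kappa_1 + 10\kappa_3\kappa_2 + 10\kappa_3\kappa_1^2 + 15\kappa_2^2\kappa_1 + 10\kappa_2\kappa_1^3 + \kappa_1^5$$
$$\mu'_6 = \kappa_6 + 6\kappa_5\kappa_1 + 15\kappa_4\kappa_2 + 15\kappa_4\kappa_1^2 + 10\kappa_3^2 + 60\kappa_3\kappa_2\kappa_1 + 20\kappa_3\kappa_1^3 + 15\kappa_2^3 + 45\kappa_2^2\kappa_1^2 + 15\kappa_2\kappa_1^4 + \kappa_1^6$$
The coefficients are precisely those that occur in Faà di Bruno's formula.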
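The recursion relating cumulants to moments translates directly into a short routine. The sketch below is illustrative only: it assumes the recursion κₙ = μ′ₙ − Σ C(n−1, m−1) κₘ μ′ₙ₋ₘ (sum over m from 1 to n − 1), and the function name `cumulants_from_moments` is hypothetical. As test data it uses the exponential distribution with rate 1, whose raw moments are n! and whose cumulants are (n − 1)!.

```python
from math import comb, factorial


def cumulants_from_moments(raw_moments):
    """Convert raw moments [mu'_1, ..., mu'_N] to cumulants [k_1, ..., k_N].

    Implements k_n = mu'_n - sum_{m=1}^{n-1} C(n-1, m-1) * k_m * mu'_{n-m}.
    """
    kappa = []
    for n in range(1, len(raw_moments) + 1):
        # raw_moments[n - m - 1] is mu'_{n-m}; kappa[m - 1] is k_m.
        correction = sum(
            comb(n - 1, m - 1) * kappa[m - 1] * raw_moments[n - m - 1]
            for m in range(1, n)
        )
        kappa.append(raw_moments[n - 1] - correction)
    return kappa


# Exponential distribution with rate 1: raw moments are n!.
moments = [factorial(n) for n in range(1, 7)]
cumulants = cumulants_from_moments(moments)
print(cumulants)
```

Integer inputs keep the arithmetic exact; for sample moments one would pass floats or `Fraction` values instead.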
The "prime" distinguishes the moments μ′ₙ from the central moments μₙ. To express the central moments as functions of the cumulants, just drop from these polynomials all terms in which κ₁ appears as a factor:
$$\mu_1 = 0$$
$$\mu_2 = \kappa_2$$
$$\mu_3 = \kappa_3$$
$$\mu_4 = \kappa_4 + 3\kappa_2^2$$
$$\mu_5 = \kappa_5 + 10\kappa_3\kappa_2$$
$$\mu_6 = \kappa_6 + 15\kappa_4\kappa_2 + 10\kappa_3^2 + 15\kappa_2^3$$
These polynomials have a remarkable combinatorial interpretation: the coefficients count certain partitions of sets. A general form of these polynomials is
$$\mu'_n = \sum_{\pi} \prod_{B \in \pi} \kappa_{|B|}$$
where π runs through the list of all partitions of a set of size n, "B ∈ π" means that B is one of the blocks into which the set is partitioned, and |B| is the size of the block B.
Thus each monomial is a constant times a product of cumulants in which the sum of the indices is n (e.g., in the term κ₃ κ₂² κ₁, the sum of the indices is 3 + 2 + 2 + 1 = 8; this appears in the polynomial that expresses the 8th moment as a function of the first eight cumulants). A partition of the integer n corresponds to each term. The coefficient in each term is the number of partitions of a set of n members that collapse to that partition of the integer n when the members of the set become indistinguishable.
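The counting interpretation can be made concrete by brute force: enumerate every partition of an n-element set and tally the partitions by their multiset of block sizes. The code below is a small sketch of that idea; the function names are hypothetical, and the enumeration is exponential in n, so it is only meant for small n.

```python
from collections import Counter


def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition


def moment_coefficients(n):
    """Map each integer partition of n (block sizes, largest first) to the
    number of set partitions of {1, ..., n} with exactly those block sizes."""
    counts = Counter()
    for partition in set_partitions(list(range(1, n + 1))):
        block_sizes = tuple(sorted((len(b) for b in partition), reverse=True))
        counts[block_sizes] += 1
    return counts


coeffs8 = moment_coefficients(8)
# Coefficient of k3 * k2^2 * k1 in the polynomial for the 8th moment.
print(coeffs8[(3, 2, 2, 1)])
```

The value printed for block sizes (3, 2, 2, 1) agrees with the closed-form count 8!/(3!·2!·2!·1!·2!) = 840, where the extra 2! accounts for the two interchangeable blocks of size 2.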
The joint cumulant of several random variables X₁, ..., Xₙ is
$$\kappa(X_1, \dots, X_n) = \sum_{\pi} (|\pi| - 1)! \, (-1)^{|\pi| - 1} \prod_{B \in \pi} \operatorname{E}\left(\prod_{i \in B} X_i\right)$$
where π runs through the list of all partitions of { 1, ..., n }, B runs through the list of all blocks of the partition π, and |π| is the number of blocks in π. For example,
$$\kappa(X, Y, Z) = \operatorname{E}(XYZ) - \operatorname{E}(XY)\operatorname{E}(Z) - \operatorname{E}(XZ)\operatorname{E}(Y) - \operatorname{E}(YZ)\operatorname{E}(X) + 2\operatorname{E}(X)\operatorname{E}(Y)\operatorname{E}(Z).$$
The joint cumulant of just one random variable is its expected value, and that of two random variables is their covariance. If some of the random variables are independent of all of the others, then the joint cumulant is zero. If all n random variables are the same, then the joint cumulant is the nth ordinary cumulant.
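The partition formula for the joint cumulant can be evaluated exactly for a finite joint distribution. The sketch below is an illustration under stated assumptions: it takes the weight of a partition with b blocks to be (b − 1)!·(−1)^(b−1), and the function names and the small example distribution (X and Y dependent, Z independent of both) are hypothetical.

```python
from fractions import Fraction
from math import factorial, prod


def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        yield [[first]] + partition


def joint_cumulant(outcomes, indices):
    """Joint cumulant of the coordinates listed in `indices`.

    `outcomes` is a list of (probability, values) pairs describing a finite
    joint distribution; `values` holds one value per random variable.
    Repeating an index, e.g. [1, 1, 1], gives an ordinary cumulant.
    """
    def expectation(block):
        return sum(p * prod(v[i] for i in block) for p, v in outcomes)

    total = 0
    # Partition argument positions, so repeated variables are handled too.
    for partition in set_partitions(list(range(len(indices)))):
        b = len(partition)
        weight = (-1) ** (b - 1) * factorial(b - 1)
        total += weight * prod(
            expectation([indices[pos] for pos in block]) for block in partition
        )
    return total


# (X, Y) are dependent; Z is independent of the pair (X, Y).
xy = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 4), (1, 2): Fraction(1, 4)}
z = {0: Fraction(1, 3), 3: Fraction(2, 3)}
outcomes = [(p * q, (x, y, zv)) for (x, y), p in xy.items() for zv, q in z.items()]

print(joint_cumulant(outcomes, [0, 1]))     # covariance of X and Y
print(joint_cumulant(outcomes, [0, 1, 2]))  # involves the independent Z
```

With this data the two-variable cumulant equals the covariance of X and Y, and the three-variable cumulant vanishes because Z is independent of the other two.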
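The combinatorial meaning of the expression of moments in terms of cumulants is easier to understand than that of cumulants in terms of moments:
$$\operatorname{E}(X_1 \cdots X_n) = \sum_{\pi} \prod_{B \in \pi} \kappa(X_i : i \in B).$$
For example:
$$\operatorname{E}(XYZ) = \kappa(X, Y, Z) + \kappa(X, Y)\kappa(Z) + \kappa(X, Z)\kappa(Y) + \kappa(Y, Z)\kappa(X) + \kappa(X)\kappa(Y)\kappa(Z).$$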
Another important property of joint cumulants is multilinearity:
$$\kappa(X + Y, Z_1, Z_2, \dots) = \kappa(X, Z_1, Z_2, \dots) + \kappa(Y, Z_1, Z_2, \dots).$$
Just as the second cumulant is the variance, the joint cumulant of just two random variables is the covariance. The familiar identity
$$\operatorname{var}(X + Y) = \operatorname{var}(X) + 2\operatorname{cov}(X, Y) + \operatorname{var}(Y)$$
generalizes to cumulants:
$$\kappa_n(X + Y) = \sum_{j=0}^{n} \binom{n}{j} \kappa(\underbrace{X, \dots, X}_{j}, \underbrace{Y, \dots, Y}_{n-j}).$$
The law of total expectation and the law of total variance generalize naturally to conditional cumulants. The case n = 3, expressed in the language of (central) moments rather than that of cumulants, says
$$\mu_3(X) = \operatorname{E}(\mu_3(X \mid Y)) + \mu_3(\operatorname{E}(X \mid Y)) + 3\operatorname{cov}(\operatorname{E}(X \mid Y), \operatorname{var}(X \mid Y)).$$
The general result stated below first appeared in 1969 in The Calculation of Cumulants via Conditioning by David R. Brillinger in volume 21 of Annals of the Institute of Statistical Mathematics, pages 215-218.
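In general, we have
$$\kappa(X_1, \dots, X_n) = \sum_{\pi} \kappa\left(\kappa(X_{\pi_1} \mid Y), \dots, \kappa(X_{\pi_b} \mid Y)\right)$$
where the sum is over all partitions π of the set { 1, ..., n } of indices, π_1, ..., π_b are the blocks of the partition π, and κ(X_{π_k} | Y) denotes the conditional joint cumulant, given Y, of the random variables whose indices are in that block.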
Cumulants were first introduced by the Danish astronomer, actuary, mathematician, and statistician Thorvald N. Thiele (1838 - 1910) in 1889. Thiele called them half-invariants. They were first called cumulants in a 1931 paper, "The derivation of the pattern formulae of two-way partitions from those of simpler patterns", Proceedings of the London Mathematical Society, Series 2, v. 33, pp. 195-208, by the great statistical geneticist Sir Ronald Fisher and the statistician John Wishart, eponym of the Wishart distribution. The historian Stephen Stigler has said that the name cumulant was suggested to Fisher in a letter from Harold Hotelling. In another paper published in 1929, Fisher had called them cumulative moment functions.
More generally, the cumulants of a sequence { mₙ : n = 1, 2, 3, ... }, not necessarily the moments of any probability distribution, are given by
$$1 + \sum_{n=1}^{\infty} \frac{m_n t^n}{n!} = \exp\left(\sum_{n=1}^{\infty} \frac{\kappa_n t^n}{n!}\right)$$
where the values of κₙ for n = 1, 2, 3, ... are found formally, i.e., by algebra alone, in disregard of questions of whether any series converges. All of the difficulties of the "problem of cumulants" are absent when one works formally. The simplest example of such a difficulty is that the second cumulant of a probability distribution must always be nonnegative, and is zero only if all of the higher cumulants are zero. Formal cumulants are subject to no such constraints.
In combinatorics, the nth Bell number is the number of partitions of a set of size n. All of the cumulants of the sequence of Bell numbers are equal to 1. The Bell numbers are the moments of the Poisson distribution with expected value 1.
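The statement about Bell numbers can be checked by converting the formal cumulants 1, 1, 1, ... into formal moments. The sketch below assumes the moment-cumulant recursion solved for the moment, mₙ = Σ C(n−1, j−1) κⱼ mₙ₋ⱼ (sum over j from 1 to n, with m₀ = 1), and compares the result with Bell numbers generated independently by the Bell triangle. Both function names are hypothetical.

```python
from math import comb


def moments_from_cumulants(kappa):
    """Convert formal cumulants [k_1, ..., k_N] to formal moments [m_1, ..., m_N].

    Uses m_n = sum_{j=1}^{n} C(n-1, j-1) * k_j * m_{n-j} with m_0 = 1.
    """
    m = [1]  # m[0] is the zeroth moment
    for n in range(1, len(kappa) + 1):
        m.append(sum(comb(n - 1, j - 1) * kappa[j - 1] * m[n - j]
                     for j in range(1, n + 1)))
    return m[1:]


def bell_numbers(count):
    """First `count` Bell numbers B_1, B_2, ... via the Bell triangle."""
    row, out = [1], []
    for _ in range(count):
        out.append(row[-1])          # last entry of each row is a Bell number
        new_row = [row[-1]]          # next row starts with that entry
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return out


moments = moments_from_cumulants([1] * 8)
print(moments)
print(bell_numbers(8))
```

The two printed lists coincide, which is the formal counterpart of the fact that a Poisson distribution with expected value 1 has every cumulant equal to 1.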
For any sequence { κₙ : n = 1, 2, 3, ... } of scalars in a field of characteristic zero, being considered formal cumulants, there is a corresponding sequence { μ′ₙ : n = 1, 2, 3, ... } of formal moments, given by the polynomials above. For those polynomials, construct a polynomial sequence in the following way. Out of the polynomial
$$\mu'_6 = \kappa_6 + 6\kappa_5\kappa_1 + 15\kappa_4\kappa_2 + 15\kappa_4\kappa_1^2 + 10\kappa_3^2 + 60\kappa_3\kappa_2\kappa_1 + 20\kappa_3\kappa_1^3 + 15\kappa_2^3 + 45\kappa_2^2\kappa_1^2 + 15\kappa_2\kappa_1^4 + \kappa_1^6$$
make a new polynomial in these plus one additional variable x:
$$p_6(x) = \kappa_6 x + \left(6\kappa_5\kappa_1 + 15\kappa_4\kappa_2 + 10\kappa_3^2\right) x^2 + \left(15\kappa_4\kappa_1^2 + 60\kappa_3\kappa_2\kappa_1 + 15\kappa_2^3\right) x^3 + \left(20\kappa_3\kappa_1^3 + 45\kappa_2^2\kappa_1^2\right) x^4 + 15\kappa_2\kappa_1^4 x^5 + \kappa_1^6 x^6$$
and generalize the pattern. The pattern is that the numbers of blocks in the aforementioned partitions are the exponents on x. Each coefficient is a polynomial in the cumulants; these are the Bell polynomials, named after Eric Temple Bell.
This sequence of polynomials is of binomial type. In fact, no other sequences of binomial type exist; every polynomial sequence of binomial type is completely determined by its sequence of formal cumulants.
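The binomial-type property, pₙ(x + y) = Σ C(n, k) pₖ(x) pₙ₋ₖ(y), can be spot-checked numerically. The sketch below is only an illustration: it assumes that pₙ(x) is obtained by giving every block of a set partition an extra factor x, which yields the recursion pₙ(x) = x·Σ C(n−1, j−1) κⱼ pₙ₋ⱼ(x) with p₀(x) = 1. The function names and the arbitrary rational cumulants are hypothetical.

```python
from fractions import Fraction
from math import comb


def p(n, x, kappa):
    """Evaluate p_n(x), where each block of a set partition contributes a factor x.

    Uses p_k(x) = x * sum_{j=1}^{k} C(k-1, j-1) * kappa_j * p_{k-j}(x), p_0 = 1.
    """
    values = [Fraction(1)]
    for k in range(1, n + 1):
        values.append(x * sum(comb(k - 1, j - 1) * kappa[j - 1] * values[k - j]
                              for j in range(1, k + 1)))
    return values[n]


def binomial_type_holds(n, x, y, kappa):
    """Check p_n(x + y) == sum_k C(n, k) * p_k(x) * p_{n-k}(y) exactly."""
    lhs = p(n, x + y, kappa)
    rhs = sum(comb(n, k) * p(k, x, kappa) * p(n - k, y, kappa)
              for k in range(n + 1))
    return lhs == rhs


# Arbitrary formal cumulants k_1, ..., k_6 (no probabilistic meaning required).
kappa = [Fraction(2), Fraction(-1, 3), Fraction(5),
         Fraction(7, 2), Fraction(0), Fraction(1)]
x, y = Fraction(3, 4), Fraction(-2)

print(all(binomial_type_holds(n, x, y, kappa) for n in range(7)))
```

A check at a few rational points is evidence rather than proof; the identity itself follows from exp((x + y)·g(t)) = exp(x·g(t))·exp(y·g(t)).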
In the identity
$$\operatorname{E}(X_1 \cdots X_n) = \sum_{\pi} \prod_{B \in \pi} \kappa(X_i : i \in B)$$
one sums over all partitions of the set { 1, ..., n }. If instead one sums only over the noncrossing partitions, then one gets "free cumulants" rather than the conventional cumulants treated above. These play a central role in free probability theory. In that theory, rather than considering independence of random variables, defined in terms of Cartesian products of algebras of random variables, one considers instead "freeness" of random variables, defined in terms of free products of algebras rather than Cartesian products of algebras.
The ordinary cumulants of degree higher than 2 of the normal distribution are zero. The free cumulants of degree higher than 2 of the Wigner semicircle distribution are zero. This is one respect in which the role of the Wigner distribution in free probability theory is analogous to that of the normal distribution in conventional probability theory.
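The restriction to noncrossing partitions can be illustrated by brute force for a single variable. The sketch below makes several assumptions that are not in the text above: the Wigner semicircle distribution is taken to have mean 0 and variance 1, so that its only nonzero free cumulant is the second one, equal to 1; its moments are then expected to be 0 in odd orders and the Catalan numbers 1, 2, 5, 14 in even orders. All function names are hypothetical.

```python
from itertools import combinations
from math import prod


def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        yield [[first]] + partition


def is_noncrossing(partition):
    """True unless two blocks interleave as a < b < c < d with a, c in one
    block and b, d in the other."""
    for block1, block2 in combinations(partition, 2):
        for a, c in combinations(sorted(block1), 2):
            for b, d in combinations(sorted(block2), 2):
                if a < b < c < d or b < a < d < c:
                    return False
    return True


def moment_from_free_cumulants(n, free_kappa):
    """n-th moment as a sum over noncrossing partitions of {1, ..., n}.

    `free_kappa` maps a block size to the free cumulant of that order;
    orders that are missing are treated as zero.
    """
    total = 0
    for partition in set_partitions(list(range(1, n + 1))):
        if is_noncrossing(partition):
            total += prod(free_kappa.get(len(block), 0) for block in partition)
    return total


semicircle = {2: 1}  # assumed: mean 0, variance 1, higher free cumulants 0
moments = [moment_from_free_cumulants(n, semicircle) for n in range(1, 9)]
print(moments)
```

Summing over all partitions instead of only the noncrossing ones, with the same cumulants, would give the moments 1, 3, 15, 105 of the standard normal distribution in even orders, which is the analogy described above.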