
Binomial
[Figure: Probability mass function for the binomial distribution]
[Figure: Cumulative distribution function for the binomial distribution. Colors match the image above.]
Parameters: n \geq 0, number of trials (integer); 0 \leq p \leq 1, success probability (real)
Support: k \in \{0,\dots,n\}
Probability mass function (pmf): {n\choose k} p^k (1-p)^{n-k}
Cumulative distribution function (cdf): I_{1-p}(n-\lfloor k\rfloor, 1+\lfloor k\rfloor)
Mean: np
Median: one of \{\lfloor np\rfloor, \lceil np \rceil\} [1]
Mode: \lfloor (n+1)\,p\rfloor
Variance: np(1-p)
Skewness: \frac{1-2p}{\sqrt{np(1-p)}}
Excess kurtosis: \frac{1-6p(1-p)}{np(1-p)}
Entropy: \frac{1}{2} \ln \left( 2 \pi n e p (1-p) \right) + O \left( \frac{1}{n} \right)
Moment-generating function (mgf): (1-p + pe^t)^n
Characteristic function: (1-p + pe^{it})^n

In probability theory and statistics, the binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p. Such a success/failure experiment is also called a Bernoulli experiment or Bernoulli trial. In fact, when n = 1, the binomial distribution is a Bernoulli distribution. The binomial distribution is the basis for the popular binomial test of statistical significance. A binomial distribution should not be confused with a bimodal distribution.

Examples

An elementary example is this: Roll a standard die ten times and count the number of sixes. The distribution of this random number is a binomial distribution with n = 10 and p = 1/6.

As another example, assume 5% of a very large population to be green-eyed. You pick 100 people randomly. The number of green-eyed people you pick is a random variable X which follows a binomial distribution with n = 100 and p = 0.05.
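The green-eyed example can be imitated by simulation. The following is a minimal sketch rather than a definitive procedure: the helper name count_successes, the seed, and the number of repetitions are all illustrative assumptions, and only the Python standard library is used.

```python
import random

def count_successes(n, p, rng):
    """Run n independent yes/no trials and return the number of successes."""
    return sum(1 for _ in range(n) if rng.random() < p)

rng = random.Random(0)   # fixed seed (an arbitrary choice) for repeatability
repetitions = 5000       # illustrative number of repeated samples

# Each entry is the number of green-eyed people in one sample of 100.
counts = [count_successes(100, 0.05, rng) for _ in range(repetitions)]
average = sum(counts) / repetitions
print(average)  # expected to land near 100 * 0.05 = 5
```

Each entry of counts is one draw from B(100, 0.05), so a histogram of the list should resemble the probability mass function of that distribution.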

Specification

Probability mass function

In general, if the random variable K follows the binomial distribution with parameters n and p, we write K ~ B(n, p). The probability of getting exactly k successes in n trials is given by the probability mass function:

 \Pr(K = k) = f(k;n,p)={n\choose k}p^k(1-p)^{n-k}

for k = 0, 1, 2, ..., n, where

{n\choose k}=\frac{n!}{k!(n-k)!}

is the binomial coefficient (hence the name of the distribution) "n choose k" (also denoted C(n, k) or nCk). The formula can be understood as follows: we want k successes (p^k) and n − k failures ((1 − p)^(n − k)). However, the k successes can occur anywhere among the n trials, and there are C(n, k) different ways of distributing k successes in a sequence of n trials.
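The probability mass function translates directly into code. The sketch below assumes Python 3.8 or later (for math.comb); the function name binomial_pmf is an illustrative choice, and the die example from the Examples section is used as input.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Die example: probability of exactly two sixes in ten rolls.
two_sixes = binomial_pmf(2, 10, 1 / 6)

# The probabilities over k = 0, ..., n should add up to 1.
total = sum(binomial_pmf(k, 10, 1 / 6) for k in range(11))

print(two_sixes, total)
```

For large n the factors p**k and (1 - p)**(n - k) can underflow in floating point; working with logarithms is a common remedy, but that is outside the scope of this sketch.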

Reference tables of binomial probabilities are usually filled in only up to n/2 values. This is because for k > n/2, the probability can be calculated from its complement as

f(k;n,p)=f(n-k;n,1-p).\,\!

One therefore looks up a different k (namely n − k) and a different p (namely 1 − p); the binomial distribution is not symmetric in general.
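The complement identity f(k; n, p) = f(n − k; n, 1 − p) can be checked numerically. This is a small sketch with arbitrarily chosen parameters n = 10 and p = 0.3; the function name binomial_pmf is illustrative.

```python
from math import comb, isclose

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3  # arbitrary illustrative parameters

# Compare f(k; n, p) with f(n - k; n, 1 - p) for every k in the support.
identity_holds = all(
    isclose(binomial_pmf(k, n, p), binomial_pmf(n - k, n, 1 - p))
    for k in range(n + 1)
)
print(identity_holds)
```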

Cumulative distribution function

The cumulative distribution function can be expressed in terms of the regularized incomplete beta function, as follows:

 F(k;n,p) = \Pr(X \le k) = I_{1-p}(n-k, k+1) \!

provided k is an integer and 0 ≤ k < n (for k = n the value is simply 1). If x is not necessarily an integer or not necessarily positive, one can express it thus:

F(x;n,p) = \Pr(X \le x) = \sum_{j=0}^{\lfloor x \rfloor} {n\choose j}p^j(1-p)^{n-j}.
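The summation form of the cumulative distribution function is straightforward to evaluate for moderate n. The sketch below is one possible reading of it: the name binomial_cdf and the handling of arguments outside the range 0 to n (returning 0 below the support and capping the upper limit at n) are assumptions of this example.

```python
from math import comb, floor

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_cdf(x, n, p):
    """Pr(X <= x) for real x, summing the pmf up to floor(x)."""
    if x < 0:
        return 0.0                 # no mass below the support
    upper = min(floor(x), n)       # no mass above n either
    return sum(binomial_pmf(j, n, p) for j in range(upper + 1))

print(binomial_cdf(2, 10, 0.5))    # (1 + 10 + 45) / 1024
print(binomial_cdf(2.7, 10, 0.5))  # same value, since floor(2.7) = 2
```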

For k ≤ np, upper bounds for the lower tail of the distribution function can be derived. In particular, Hoeffding's inequality yields the bound

 F(k;n,p) \leq \exp\left(-2 \frac{(np-k)^2}{n}\right), \!

and Chernoff's inequality can be used to derive the bound

 F(k;n,p) \leq \exp\left(-\frac{1}{2\,p} \frac{(np-k)^2}{n}\right). \!
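To get a feel for how loose these tail bounds are, they can be compared with the exact lower tail. The following sketch uses illustrative parameters n = 100, p = 0.5 and k = 40, and the function names are assumptions of this example.

```python
from math import comb, exp

def binomial_cdf(k, n, p):
    """Exact Pr(X <= k) for integer k between 0 and n."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def hoeffding_bound(k, n, p):
    """Upper bound exp(-2 (np - k)^2 / n), valid for k <= np."""
    return exp(-2 * (n * p - k) ** 2 / n)

def chernoff_bound(k, n, p):
    """Upper bound exp(-(np - k)^2 / (2 p n)), valid for k <= np."""
    return exp(-((n * p - k) ** 2) / (2 * p * n))

n, p, k = 100, 0.5, 40  # illustrative values with k <= np
exact = binomial_cdf(k, n, p)
print(exact, hoeffding_bound(k, n, p), chernoff_bound(k, n, p))
```

Comparing the two exponents shows that the Hoeffding bound is the tighter one when p > 1/4 and the Chernoff bound is tighter when p < 1/4.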

Mean, variance, and mode

If X ~ B(n, p) (that is, X is a binomially distributed random variable), then the expected value of X is

\operatorname{E}(X)=np\,\!

and the variance is

\operatorname{Var}(X)=np(1-p).\,\!

This fact is easily proven as follows. Suppose first that we have exactly one Bernoulli trial. We have two possible outcomes, 1 and 0, with the first having probability p and the second having probability 1 − p; the mean for this trial is given by μ = p. Using the definition of variance, we have

\sigma^2= \left(1 - p\right)^2p + (0-p)^2(1 - p) = p(1-p).

Now suppose that we want the variance for n such trials (i.e. for the general binomial distribution). Since the trials are independent, we may add the variances for each trial, giving

\sigma^2_n = \sum_{k=1}^n \sigma^2 = np(1 - p). \quad

The mode of X is the greatest integer less than or equal to (n + 1)p; if m = (n + 1)p is an integer, then m − 1 and m are both modes.
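The rule for the mode can be checked by brute force: evaluate the probability mass function at every k and keep the values that attain the maximum. This is only a sketch; the name binomial_modes and the use of math.isclose to detect ties are choices made for this example.

```python
from math import comb, isclose

def binomial_modes(n, p):
    """Return every k in 0..n whose probability attains the maximum."""
    probs = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    best = max(probs)
    return [k for k, q in enumerate(probs) if isclose(q, best)]

# (n + 1) p = 11/6 is not an integer: the single mode is floor(11/6) = 1.
print(binomial_modes(10, 1 / 6))

# (n + 1) p = 5 is an integer: both 4 and 5 are modes.
print(binomial_modes(9, 0.5))
```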

Explicit derivations of mean and variance

We derive these quantities from first principles. Certain particular sums occur in these two derivations. We rearrange the sums and terms so that sums solely over complete binomial probability mass functions (pmf) arise, which are always unity

 \sum_{k=0}^n \operatorname{Pr}(X=k) = \sum_{k=0}^n {n\choose k}p^k(1-p)^{n-k} = 1

Mean

We apply the definition of the expected value of a discrete random variable to the binomial distribution

\operatorname{E}(X) = \sum_k x_k \cdot \operatorname{Pr}(x_k) = \sum_{k=0}^n k \cdot \operatorname{Pr}(X=k)

= \sum_{k=0}^n k \cdot {n\choose k}p^k(1-p)^{n-k}

The first term of the series (with index k = 0) has value 0 since the first factor, k, is zero. It may thus be discarded, i.e. we can change the lower limit to k = 1:

\operatorname{E}(X) = \sum_{k=1}^n k \cdot \frac{n!}{k!(n-k)!} p^k(1-p)^{n-k}

=  \sum_{k=1}^n k \cdot \frac{n\cdot(n-1)!}{k\cdot(k-1)!(n-k)!} \cdot p \cdot p^{k-1}(1-p)^{n-k}

We've pulled factors of n and k out of the factorials, and one power of p has been split off. We are preparing to redefine the indices.

\operatorname{E}(X) = np \cdot \sum_{k=1}^n \frac{(n-1)!}{(k-1)!(n-k)!} p^{k-1}(1-p)^{n-k}

We rename m = n - 1 and s = k - 1. The value of the sum is not changed by this, but it now becomes readily recognizable

\operatorname{E}(X) = np \cdot \sum_{s=0}^m \frac{(m)!}{(s)!(m-s)!} p^s(1-p)^{m-s}

= np \cdot \sum_{s=0}^m {m\choose s} p^s(1-p)^{m-s}

The ensuing sum is a sum over a complete binomial pmf (of one order lower than the initial sum, as it happens). Thus

\operatorname{E}(X) = np \cdot 1 = np

Variance

It can be shown that the variance is equal to (see the computational formula for variance):

\operatorname{Var}(X) = \operatorname{E}(X^2) - (\operatorname{E}(X))^2.

In using this formula we see that we now also need the expected value of X^2, which is

\operatorname{E}(X^2) = \sum_{k=0}^n k^2 \cdot \operatorname{Pr}(X=k)

= \sum_{k=0}^n k^2 \cdot {n\choose k}p^k(1-p)^{n-k}.

We can use our experience gained above in deriving the mean. We know how to process one factor of k. This gets us as far as

\operatorname{E}(X^2) = np \cdot \sum_{s=0}^m k \cdot {m\choose s} p^s(1-p)^{m-s}
= np \cdot \sum_{s=0}^m (s+1) \cdot {m\choose s} p^s(1-p)^{m-s}

(again, with m = n - 1 and s = k - 1). We split the sum into two separate sums and we recognize each one

\operatorname{E}(X^2) = np \cdot \bigg( \sum_{s=0}^m s \cdot {m\choose s} p^s(1-p)^{m-s} + \sum_{s=0}^m 1 \cdot {m\choose s} p^s(1-p)^{m-s} \bigg).

The first sum is identical in form to the one we calculated in the Mean (above). It sums to mp. The second sum is unity.

\operatorname{E}(X^2) = np \cdot ( mp + 1) = np((n-1)p + 1) = np(np - p + 1).

Using this result in the expression for the variance, along with the Mean (E(X) = np), we get

\operatorname{Var}(X) = \operatorname{E}(X^2) - (\operatorname{E}(X))^2 = np(np - p + 1) - (np)^2 = np(1-p).
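The closed forms E(X) = np, E(X^2) = np(np − p + 1) and Var(X) = np(1 − p) can be confirmed numerically by summing over the probability mass function directly. The sketch below uses n = 100 and p = 0.05 as illustrative values; the function name binomial_moments is an assumption of this example.

```python
from math import comb

def binomial_moments(n, p):
    """Return (E[X], E[X^2], Var[X]) by direct summation over the pmf."""
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    mean = sum(k * q for k, q in enumerate(pmf))
    second_moment = sum(k * k * q for k, q in enumerate(pmf))
    return mean, second_moment, second_moment - mean**2

n, p = 100, 0.05  # illustrative parameters
mean, second_moment, variance = binomial_moments(n, p)

# Direct sums next to the closed-form expressions.
print(mean, n * p)
print(second_moment, n * p * (n * p - p + 1))
print(variance, n * p * (1 - p))
```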

Relationship to other distributions

Sums of binomials

If X ~ B(n, p) and Y ~ B(m, p) are independent binomial variables, then X + Y is again a binomial variable; its distribution is

X+Y \sim B(n+m, p).\,
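The distribution of a sum of two independent variables is the convolution of their probability mass functions, so the statement can be checked by convolving two binomial pmfs and comparing the result with B(n + m, p). The parameters (4, 6 and p = 0.3) and the helper names are illustrative assumptions.

```python
from math import comb, isclose

def binomial_pmf_list(n, p):
    """List of Pr(X = k) for k = 0, ..., n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def convolve(a, b):
    """pmf of the sum of two independent variables with pmfs a and b."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

p = 0.3  # common success probability (illustrative)
summed = convolve(binomial_pmf_list(4, p), binomial_pmf_list(6, p))
direct = binomial_pmf_list(10, p)

matches = all(isclose(s, d, rel_tol=1e-9, abs_tol=1e-12)
              for s, d in zip(summed, direct))
print(matches)
```

The check relies on both variables sharing the same p; with different success probabilities the sum is in general no longer binomial.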

Normal approximation

Binomial PDF and normal approximation for n = 6 and p = 0.5.

If n is large enough, the skew of the distribution is not too great, and a suitable continuity correction is used, then an excellent approximation to B(n, p) is given by the normal distribution

 \operatorname{N}(np, np(1-p)).\,\!

Various rules of thumb may be used to decide whether n is large enough. One rule is that both np and n(1 − p) must be greater than 5. However, the specific number varies from source to source, and depends on how good an approximation one wants; some sources give 10. Another commonly used rule holds that the above normal approximation is appropriate only if

\mu \pm 3 \sigma = np \pm 3 \sqrt{np(1-p)} \in [0,n].

The following is an example of applying a continuity correction: Suppose one wishes to calculate Pr(X ≤ 8) for a binomial random variable X. If Y has a distribution given by the normal approximation, then Pr(X ≤ 8) is approximated by Pr(Y ≤ 8.5). The addition of 0.5 is the continuity correction; the uncorrected normal approximation gives considerably less accurate results.
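The effect of the continuity correction can be seen numerically. The text does not fix n and p for this example, so the sketch below assumes n = 20 and p = 0.5 (for which np and n(1 − p) both equal 10); the normal cdf is computed from math.erf, and the function names are illustrative.

```python
from math import comb, erf, sqrt

def binomial_cdf(k, n, p):
    """Exact Pr(X <= k) for integer k between 0 and n."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def normal_cdf(x, mean, sd):
    """Pr(Y <= x) for a normal variable with the given mean and sd."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))

n, p = 20, 0.5                       # assumed values, not from the text
mean, sd = n * p, sqrt(n * p * (1 - p))

exact = binomial_cdf(8, n, p)
corrected = normal_cdf(8.5, mean, sd)    # with continuity correction
uncorrected = normal_cdf(8.0, mean, sd)  # without it

print(exact, corrected, uncorrected)
```

With these assumed parameters the corrected value should agree with the exact probability to roughly three decimal places, while the uncorrected value is off by several hundredths.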

This approximation is a huge time-saver (exact calculations with large n are very onerous); historically, it was the first use of the normal distribution, introduced in Abraham de Moivre's book The Doctrine of Chances in 1733. Nowadays, it can be seen as a consequence of the central limit theorem since B(n, p) is a sum of n independent, identically distributed Bernoulli variables with parameter p.

For example, suppose you randomly sample n people out of a large population and ask them whether they agree with a certain statement. The proportion of people who agree will of course depend on the sample. If you sampled groups of n people repeatedly and truly randomly, the proportions would follow an approximate normal distribution with mean equal to the true proportion p of agreement in the population and with standard deviation σ = (p(1 − p)/n)^(1/2). Large sample sizes n are good because the standard deviation, as a proportion of the expected value, gets smaller, which allows a more precise estimate of the unknown parameter p.
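The standard deviation of a sample proportion is the square root of p(1 − p)/n, so it shrinks with the square root of the sample size. A brief sketch, with p = 0.5 and the sample sizes chosen purely for illustration:

```python
from math import sqrt

def proportion_sd(p, n):
    """Standard deviation of the sample proportion for sample size n."""
    return sqrt(p * (1 - p) / n)

# Quadrupling the sample size halves the standard deviation.
for n in (100, 400, 1600):
    print(n, proportion_sd(0.5, n))
```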

Poisson approximation

The binomial distribution converges towards the Poisson distribution as the number of trials goes to infinity while the product np remains fixed. Therefore the Poisson distribution with parameter λ = np can be used as an approximation to the binomial distribution B(n, p) if n is sufficiently large and p is sufficiently small. According to two rules of thumb, this approximation is good if n ≥ 20 and p ≤ 0.05, or if n ≥ 100 and np ≤ 10. [2]
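The quality of the Poisson approximation can be gauged by comparing the two probability mass functions term by term. The sketch below assumes n = 100 and p = 0.05, which satisfies both rules of thumb; the function names are illustrative.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Poisson probability of observing exactly k events."""
    return exp(-lam) * lam**k / factorial(k)

n, p = 100, 0.05   # assumed parameters meeting both rules of thumb
lam = n * p        # matching Poisson parameter

# Largest pointwise difference between the two pmfs over k = 0, ..., n.
largest_gap = max(
    abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(n + 1)
)
print(largest_gap)
```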

Limits of binomial distributions

As n approaches ∞ while p remains fixed, the distribution of

{X-np \over \sqrt{np(1-p)\ }}

approaches the normal distribution with expected value 0 and variance 1 (this is just a specific case of the central limit theorem).

References

  1. ^ Hamza, K. (1995). The smallest uniform upper bound on the distance between the mean and the median of the binomial and Poisson distributions. Statist. Probab. Lett. 23 21–25.
  2. ^ NIST/SEMATECH, '6.3.3.1. Counts Control Charts', e-Handbook of Statistical Methods, <http://www.itl.nist.gov/div898/handbook/pmc/section3/pmc331.htm> [accessed 25 October 2006]

© 2009 citizendia.org; parts available under the terms of GNU Free Documentation License, from http://en.wikipedia.org