Probability Distributions 3 — The Binomial Distribution

Sandeep Sharma
3 min read · Apr 1, 2022

The binomial distribution is a discrete probability distribution that models the number of successes in a fixed number of random trials of some experiment or event. It is defined by two parameters: the number of trials n and the probability of success p in any given trial. The binomial distribution tells you how likely it is to achieve a given number of successes in n trials of the experiment.

For example, we could model flipping a fair coin 10 times with a binomial distribution where the number of trials is set to 10 and the probability of success is set to 0.5. In this case the distribution would tell us how likely it is to get zero heads, 1 head, 2 heads and so on.

Properties:

  • Each trial has only two possible outcomes: success and failure.
  • The total number of trials is fixed.
  • The probabilities of success and failure remain the same throughout all the trials.
  • The trials are independent of each other.
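To make these properties concrete, here is a minimal sketch that builds binomial outcomes from scratch out of independent success/failure trials. The variable names and the seed are illustrative choices, not part of the original post.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # seed chosen arbitrarily for reproducibility

n_flips = 10          # fixed number of trials per experiment
p_success = 0.5       # same success probability for every trial
n_experiments = 10000 # how many times the whole experiment is repeated

# Each row is one experiment: n_flips independent trials, True = success
trials = rng.random((n_experiments, n_flips)) < p_success

# Counting the successes in each row gives one binomial outcome per experiment
successes = trials.sum(axis=1)

print(successes[:10])      # first few simulated outcomes
print(successes.mean())    # should be close to n_flips * p_success = 5
```

Each count in `successes` is an integer between 0 and 10, which is exactly the kind of value a binomial random variable with n = 10 takes.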
PMF of a binomial random variable:

$$P(X = k) = \binom{n}{k} p^{k} (1 - p)^{n - k}$$

In the formula above:
n = number of trials
p = probability of success
1-p = probability of failure
k = number of successes
n-k = number of failures
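Using the definitions of n, p and k above, the PMF can also be evaluated by hand with the standard library and compared against SciPy. This is only a sketch; `binomial_pmf` is a helper name made up for this illustration.

```python
from math import comb
from scipy import stats

def binomial_pmf(k, n, p):
    """P(X = k) = C(n, k) * p**k * (1 - p)**(n - k)  (hypothetical helper)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Probability of exactly 3 heads in 10 flips of a fair coin
manual = binomial_pmf(k=3, n=10, p=0.5)
library = stats.binom.pmf(k=3, n=10, p=0.5)

print(manual)    # 120 / 1024
print(library)   # should agree with the manual value up to floating-point error
```

Here `comb(n, k)` counts the ways to choose which k of the n trials are successes, and the remaining factors give the probability of any one such arrangement.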

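import pandas as pd
from scipy import stats

fair_coin_flips = stats.binom.rvs(n=10,        # Number of flips (trials) per experiment
                                  p=0.5,       # Success probability
                                  size=10000)  # Number of simulated experiments

# Print table of counts
print(pd.crosstab(index="counts", columns=fair_coin_flips))

# Plot histogram
pd.DataFrame(fair_coin_flips).hist(range=(-0.5, 10.5), bins=11);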
[Figure: frequency table and histogram produced by the code above]

Note: since the binomial distribution is discrete, it only takes on integer values, so we can summarize binomial data with a frequency table and its distribution with a histogram. The histogram shows that a binomial distribution with a 50% probability of success is roughly symmetric, with the most likely outcomes lying at the center. This is reminiscent of the normal distribution, but if we alter the success probability, the distribution is no longer symmetric.
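The symmetry claim can be checked numerically, without simulation, by asking SciPy for the skewness of the distribution. This is a small sketch; the dictionary `skew_by_p` is just an illustrative container.

```python
from scipy import stats

skew_by_p = {}
for p in (0.5, 0.8):
    # moments="mvs" requests mean, variance and skewness
    mean, var, skew = stats.binom.stats(n=10, p=p, moments="mvs")
    skew_by_p[p] = float(skew)
    print(f"p={p}: mean={float(mean):.2f}, variance={float(var):.2f}, "
          f"skewness={float(skew):.3f}")
```

A skewness of zero corresponds to a symmetric distribution. For p = 0.8 the skewness is negative, meaning the longer tail is on the side of fewer successes.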

biased_coin_flips = stats.binom.rvs(n=10,        # Number of flips (trials) per experiment
                                    p=0.8,       # Success probability
                                    size=10000)  # Number of simulated experiments

# Print table of counts
print(pd.crosstab(index="counts", columns=biased_coin_flips))

# Plot histogram
pd.DataFrame(biased_coin_flips).hist(range=(-0.5, 10.5), bins=11);

The cdf() (cumulative distribution function) method lets us check the probability of achieving a number of successes within a certain range:

stats.binom.cdf(k=5,    # Probability of k = 5 successes or less
                n=10,   # With 10 flips
                p=0.8)  # And success probability 0.8

Out[]: 0.03279349759999996

1 - stats.binom.cdf(k=8,    # Probability of k = 9 successes or more
                    n=10,   # With 10 flips
                    p=0.8)  # And success probability 0.8

Out[]: 0.37580963840000003
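As a rough sketch of how these range probabilities relate to each other, the snippet below rebuilds the CDF as a running sum of PMF values, uses the survival function `sf()` for the upper tail, and takes a difference of two CDF values for a middle range. The variable names are illustrative.

```python
from scipy import stats

n, p = 10, 0.8

# P(X <= 5) as a sum of PMF values for k = 0..5, compared with cdf()
cdf_by_sum = sum(stats.binom.pmf(k, n, p) for k in range(6))
cdf_direct = stats.binom.cdf(5, n, p)

# P(X >= 9) via the survival function: sf(k) = P(X > k) = 1 - cdf(k)
upper_tail = stats.binom.sf(8, n, p)

# P(6 <= X <= 8) as a difference of two CDF values
middle = stats.binom.cdf(8, n, p) - stats.binom.cdf(5, n, p)

print(cdf_by_sum, cdf_direct)
print(upper_tail)
print(middle)
```

The three ranges (5 or fewer, 6 to 8, 9 or more) cover every possible outcome, so their probabilities add up to 1.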

For continuous probability density functions, you use pdf() to check the probability density at a given x value. For discrete distributions like the binomial, use stats.distribution.pmf() (probability mass function), where distribution is the distribution's name (here stats.binom.pmf()), to check the mass (proportion of observations) at a given number of successes k:

stats.binom.pmf(k=5,    # Probability of k = 5 successes
                n=10,   # With 10 flips
                p=0.5)  # And success probability 0.5

Out[]: 0.24609375000000025

stats.binom.pmf(k=8,    # Probability of k = 8 successes
                n=10,   # With 10 flips
                p=0.8)  # And success probability 0.8

Out[]: 0.30198988799999998
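pmf() also accepts an array of k values, so the whole distribution can be tabulated in one call. The sketch below does this for 10 flips with success probability 0.8. The names `k_values` and `probs` are illustrative.

```python
import numpy as np
from scipy import stats

k_values = np.arange(0, 11)                     # every possible number of successes
probs = stats.binom.pmf(k_values, n=10, p=0.8)  # P(X = k) for each k

for k, prob in zip(k_values, probs):
    print(f"k={k:2d}  P(X=k)={prob:.4f}")

print(probs.sum())                   # the masses add up to 1 (up to rounding)
print(k_values[probs.argmax()])      # most likely number of successes
```

With these parameters the most likely outcome is 8 successes, which matches the pmf value of about 0.302 computed above.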
