Probability Distributions 5: The Poisson Distribution

Sandeep Sharma
2 min read · Apr 8, 2022

The Poisson distribution models the probability of seeing a given number of successes within a time interval, where the waiting time until the next success follows an exponential distribution. It can be used to model traffic, such as the number of arrivals a hospital can expect in an hour or the number of emails you’d expect to receive in a week.
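To make the link between exponential waiting times and Poisson counts concrete, here is a minimal simulation sketch. The function name `simulate_counts`, the rate of 3 and the number of intervals are illustrative choices, not part of the original code. It draws exponential gaps between arrivals, accumulates them into arrival times, and counts how many arrivals land in each unit-length interval. If the model holds, the mean and variance of the counts should both sit close to the rate.

```python
import numpy as np

def simulate_counts(rate, n_intervals, seed=0):
    """Count arrivals per unit-length interval when gaps between arrivals are exponential."""
    rng = np.random.default_rng(seed)
    # Draw more gaps than needed so the arrival times cover the whole horizon
    n_gaps = int(rate * n_intervals * 1.5) + 100
    gaps = rng.exponential(scale=1.0 / rate, size=n_gaps)
    arrival_times = np.cumsum(gaps)
    arrival_times = arrival_times[arrival_times < n_intervals]
    # Interval i covers [i, i + 1), so truncating to int gives each arrival's interval
    return np.bincount(arrival_times.astype(int), minlength=n_intervals)

counts = simulate_counts(rate=3.0, n_intervals=20000)
print("mean:", counts.mean(), "variance:", counts.var())
```

A Poisson distribution has equal mean and variance, so seeing both numbers near 3 is a quick sanity check that the counts behave as expected.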

The SciPy name for the Poisson distribution is “poisson”. Let’s generate and plot some data from a Poisson distribution with an arrival rate of 1 per time unit:

import numpy as np
import pandas as pd
import scipy.stats as stats

np.random.seed(42)  # SciPy draws from NumPy's global random state

arrival_rate_1 = stats.poisson.rvs(size=10000,  # Generate Poisson data
                                   mu=1)        # Average arrival rate of 1

# Print table of counts
print(pd.crosstab(index="counts", columns=arrival_rate_1))

# Plot histogram with one bin per integer count
pd.DataFrame(arrival_rate_1).hist(range=(-0.5, max(arrival_rate_1) + 0.5),
                                  bins=max(arrival_rate_1) + 1);

The histogram shows that when arrivals are relatively infrequent, it is rare to see more than a couple of arrivals in each time period. When the arrival rate is high, it becomes increasingly rare to see a low number of arrivals, and the distribution starts to look more symmetric:

np.random.seed(42)  # SciPy draws from NumPy's global random state

arrival_rate_10 = stats.poisson.rvs(size=10000,  # Generate Poisson data
                                    mu=10)       # Average arrival rate of 10

# Print table of counts
print(pd.crosstab(index="counts", columns=arrival_rate_10))

# Plot histogram with one bin per integer count
pd.DataFrame(arrival_rate_10).hist(range=(-0.5, max(arrival_rate_10) + 0.5),
                                   bins=max(arrival_rate_10) + 1);
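The move toward symmetry as the rate grows can also be checked numerically. The sketch below asks SciPy for the skewness of a Poisson distribution at a few rates and prints it next to 1/sqrt(mu), the known closed form for Poisson skewness. The helper name `poisson_skewness` and the rates 1, 10 and 100 are illustrative choices.

```python
import math
from scipy import stats

def poisson_skewness(mu):
    """Skewness of a Poisson distribution with mean mu, as reported by SciPy."""
    _, _, skew = stats.poisson.stats(mu, moments="mvs")
    return float(skew)

for mu in (1, 10, 100):
    print(f"mu={mu:>3}: skewness={poisson_skewness(mu):.3f}, "
          f"1/sqrt(mu)={1 / math.sqrt(mu):.3f}")
```

A skewness of 0 means a perfectly symmetric distribution, so a value that shrinks as mu increases matches what the two histograms suggest.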

As with other discrete probability distributions, we can use cdf() to check the probability of seeing at most a certain number of successes (subtract it from 1 for the probability of seeing more) and pmf() to check the probability of obtaining exactly a specific number of successes.

stats.poisson.cdf(k=5,    # Check the probability of 5 arrivals or fewer
                  mu=10)  # With arrival rate 10

Out[]: 0.06708596287903189

stats.poisson.pmf(k=10,   # Check the probability of exactly 10 arrivals
                  mu=10)  # With arrival rate 10

Out[]: 0.12511003572113372
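For the "more than" direction and for ranges, a short sketch along the following lines may help. It uses `sf()`, SciPy's survival function (1 - cdf), together with differences of `cdf()` values. The variable names and the particular cut-offs are illustrative.

```python
from scipy import stats

mu = 10  # arrival rate, matching the examples above

p_at_most_5 = stats.poisson.cdf(5, mu)    # P(X <= 5)
p_more_than_5 = stats.poisson.sf(5, mu)   # P(X > 5), i.e. 1 - cdf(5)
p_at_least_5 = stats.poisson.sf(4, mu)    # P(X >= 5) = P(X > 4)
p_6_to_10 = stats.poisson.cdf(10, mu) - stats.poisson.cdf(5, mu)  # P(6 <= X <= 10)

# The cdf is the running sum of the pmf, so this should match p_at_most_5
p_sum = sum(stats.poisson.pmf(k, mu) for k in range(6))

print("P(X <= 5)       =", p_at_most_5)
print("P(X > 5)        =", p_more_than_5)
print("P(X >= 5)       =", p_at_least_5)
print("P(6 <= X <= 10) =", p_6_to_10)
print("sum of pmf(0..5) =", p_sum)
```

Because the distribution is discrete, "at least 5" and "more than 5" differ by pmf(5), which is why the cut-off passed to `sf()` shifts by one between the two.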

Thank you for reading this article. The code for this and other distributions is available on my GitHub profile.

Links to some other blogs:
The Geometric and Exponential Distributions
Uniform Distribution
Normal Distribution
Binomial Distribution
Central Limit Theorem
10 alternatives for Cloud based Jupyter notebook!!
