Bayesian Generalized Linear Model (Bayesian GLM) — 1

Sandeep Sharma
Jun 23, 2022


Bayesian linear regression is an approach to linear regression in which the statistical analysis is undertaken within the framework of Bayesian inference.

It acts as a more informative version of standard linear regression.

Standard linear regression gives you single values for the model parameters and for the predictions; Bayesian linear regression gives you full distributions for both.

Bayesian inference methods are particularly useful for system identification tasks where a large number of parameters need to be estimated.

The Bayesian framework allows one to control model complexity even when the model parameters are under-constrained by the data, because imposing a prior distribution over the parameters regularizes the fitting procedure.
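To make the "distributions instead of single values" point concrete, here is a minimal NumPy sketch of Bayesian linear regression with a conjugate Gaussian prior and a known noise level. The simulated data and the prior/noise precision values (alpha, beta) are illustrative assumptions, not taken from this post.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: y = 0.5 + 1.5 * x + noise (assumed for illustration)
X = np.column_stack([np.ones(50), rng.uniform(-1, 1, 50)])  # intercept + slope columns
y = 0.5 + 1.5 * X[:, 1] + rng.normal(0, 0.3, 50)

alpha = 2.0          # prior precision on the coefficients (acts as regularization)
beta = 1 / 0.3**2    # assumed known noise precision

# Posterior over the coefficients is Gaussian: N(mean, cov)
cov = np.linalg.inv(alpha * np.eye(2) + beta * X.T @ X)
mean = beta * cov @ X.T @ y

print("posterior mean of coefficients:", mean)
print("posterior std of coefficients:", np.sqrt(np.diag(cov)))

# The prediction at a new point is also a distribution, not a single value
x_new = np.array([1.0, 0.8])
pred_mean = x_new @ mean
pred_var = 1 / beta + x_new @ cov @ x_new
print("predictive mean:", pred_mean, "predictive std:", np.sqrt(pred_var))
```

The posterior standard deviations and the predictive standard deviation are exactly the extra information that standard least squares, which returns only point estimates, does not give you directly.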

Why Bayes?

• Coherent, consistent inductive inference

• Allows a wider class of inference than the frequency interpretation of probability

• Prior information can allow reasonable inference with moderate sample sizes

• Asymptotically equivalent to MLE estimates

• Intuitive interpretation

• Natural for complex or hierarchical models

• Bayesian models allow external information to be incorporated into the estimates

The main advantage of the Bayesian approach is the use of external information to improve the estimates of the linear model coefficients.

Lasso and other regularized estimators can be viewed as Bayesian estimators with a particular prior.

Accounting for the uncertainty introduced by additional model complexity leads to a natural regularization.
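As a sketch of that correspondence: the familiar regularized estimators are posterior modes (MAP estimates) under particular priors, up to how the penalty weight λ is parameterized (the exact constant depends on the noise variance).

```latex
% MAP estimation: maximize log-likelihood plus log-prior
\hat{\beta}_{\mathrm{MAP}} = \arg\max_{\beta}\; \log p(y \mid X, \beta) + \log p(\beta)

% Gaussian prior on the coefficients, p(\beta) \propto \exp(-\tfrac{\lambda}{2}\lVert\beta\rVert_2^2),
% gives ridge regression:
\hat{\beta}_{\mathrm{ridge}} = \arg\min_{\beta}\; \lVert y - X\beta\rVert_2^2 + \lambda \lVert\beta\rVert_2^2

% Laplace (double-exponential) prior, p(\beta) \propto \exp(-\lambda\lVert\beta\rVert_1),
% gives the lasso:
\hat{\beta}_{\mathrm{lasso}} = \arg\min_{\beta}\; \lVert y - X\beta\rVert_2^2 + \lambda \lVert\beta\rVert_1
```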

Basics of Bayesian Methods

• The incorporation of prior information (e.g., expert opinion, a thorough literature review of the same or similar variables, and/or prior data)

• The prior is combined with a likelihood function. The likelihood function represents the data (i.e., the distribution of the estimate produced by the data)

• The combination of the prior with the likelihood function results in a posterior distribution of coefficient values

• Simulated draws ("simulates") are taken from the posterior distribution to create an empirical distribution of likely values for the population parameter

• Basic statistics are used to summarize this empirical distribution of posterior draws. The mode (or median or mean) of the empirical distribution serves as the point estimate of the coefficient's true population value (the posterior mode is the maximum a posteriori estimate), and a credible interval captures the true population value with a stated probability, as sketched below
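A minimal sketch of those last two steps, assuming posterior draws for a single coefficient are already available (here a stand-in Normal posterior is used in place of a real model's output):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in posterior for one coefficient; in practice these draws would come
# from the posterior of your own model (analytically or via MCMC)
draws = rng.normal(loc=1.5, scale=0.2, size=10_000)

point_estimate = np.mean(draws)                      # posterior mean (median/mode also common)
ci_low, ci_high = np.percentile(draws, [2.5, 97.5])  # 95% credible interval

print(f"posterior mean: {point_estimate:.3f}")
print(f"95% credible interval: ({ci_low:.3f}, {ci_high:.3f})")
```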

Bayesian statistics mostly involves conditional probability, which is the probability of an event A given an event B, and it can be calculated using Bayes' rule.

The conditional probability of the event A given the event B is

P(A | B) = P(B | A) P(A) / P(B)

An example of the above: calculate the probability that an adult American uses an online dating site, given that they fall in the 30–49 age group.

Bayes Rule — A Use Case

Early HIV Screening in the US Military

- First, screen with ELISA
- If positive, then two more rounds of ELISA
- If either is positive, two Western blot assays
- Only if both are positive is the recruit determined to be HIV positive

ELISA

• Sensitivity (true positive rate): 93%
denoted by P(+1 | HIV) = 0.93

• Specificity (true negative rate): 99%
denoted by P(-1 | no HIV) = 0.99

• Prevalence: 1.48/1000
denoted by P(HIV) = 0.00148

Question — what is the probability that a recruit who tests positive on the first ELISA actually has HIV?
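The answer follows directly from Bayes' rule and the law of total probability, using the numbers quoted above. A short computation (the result, roughly 12%, is surprisingly low given the 93% sensitivity, because the prevalence is so small):

```python
p_hiv = 0.00148        # prevalence: P(HIV)
sensitivity = 0.93     # P(+1 | HIV)
specificity = 0.99     # P(-1 | no HIV), so P(+1 | no HIV) = 1 - specificity

# P(+1) by the law of total probability
p_pos = sensitivity * p_hiv + (1 - specificity) * (1 - p_hiv)

# P(HIV | +1) by Bayes' rule
p_hiv_given_pos = sensitivity * p_hiv / p_pos
print(f"P(HIV | first ELISA positive) = {p_hiv_given_pos:.4f}")  # about 0.12
```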

Thank you for reading. Links to other blogs:

Central Limit Theorem — Statistics
General Linear Model — 2
General and Generalized Linear Models
The Poisson Distribution
Uniform Distribution
Normal Distribution
Binomial Distribution
10 alternatives for Cloud based Jupyter notebook!!
