Bayesian Generalized Linear Model (Bayesian GLM) — 1
Bayesian linear regression is an approach to linear regression in which the statistical analysis is carried out within the framework of Bayesian inference
It can be thought of as a richer version of standard linear regression
Standard linear regression gives you single values for the model parameters and the predictions, while Bayesian linear regression gives you full probability distributions for both
Bayesian inference methods are particularly useful for system identification tasks where a large number of parameters need to be estimated
The Bayesian framework makes it possible to control model complexity even when the model parameters are underconstrained by the data, because imposing a prior distribution over the parameters regularizes the fitting procedure
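As a minimal sketch of what "distributions instead of point estimates" looks like in practice, the snippet below fits a conjugate Bayesian linear regression with a Gaussian prior on the weights and a known noise level. All data and hyperparameter values here are made up for illustration; the posterior over the weights is Gaussian with a closed-form mean and covariance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: y = 1 + 2x plus Gaussian noise
X = np.column_stack([np.ones(50), rng.uniform(-1, 1, 50)])
y = X @ np.array([1.0, 2.0]) + rng.normal(0, 0.3, 50)

alpha = 1.0           # prior precision on the weights (assumed)
beta = 1.0 / 0.3**2   # noise precision (assumed known)

# With a N(0, (1/alpha) I) prior, the posterior over weights is Gaussian N(m, S)
S = np.linalg.inv(alpha * np.eye(2) + beta * X.T @ X)  # posterior covariance
m = beta * S @ X.T @ y                                 # posterior mean

print("posterior mean:", m)                 # point estimates of intercept and slope
print("posterior std: ", np.sqrt(np.diag(S)))  # uncertainty about each weight
```

Instead of two numbers, the fit returns a full Gaussian over the coefficients; the diagonal of `S` quantifies how confident the model is about each one.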
Why Bayes?
Coherent, consistent inductive inference
Allows a wider class of inferences than the frequentist interpretation of probability
Prior information can allow for reasonable inference with moderate samples
Asymptotically equivalent to MLE estimates.
Intuitive interpretation
Natural for complex or hierarchical models
Bayesian models allow external information to be incorporated into estimates
The main advantage of the Bayesian approach is the use of external information to improve the estimates of the linear model coefficients.
Lasso and other regularized estimators can be viewed as Bayesian MAP estimators with a particular prior (the Lasso corresponds to a Laplace prior, ridge regression to a Gaussian prior)
Averaging over the uncertainty introduced by additional model complexity leads to a natural regularization.
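The claim that regularized estimators are Bayesian estimators in disguise can be checked numerically. The sketch below (with made-up data) verifies that the ridge regression solution coincides with the MAP estimate under a zero-mean Gaussian prior, when the ridge penalty equals the ratio of prior precision to noise precision.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
y = X @ np.array([0.5, -1.0, 2.0]) + rng.normal(0, 0.1, 40)

lam = 0.7  # ridge penalty; plays the role of alpha / beta below

# Ridge regression: argmin ||y - Xw||^2 + lam * ||w||^2
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

# MAP estimate with prior w ~ N(0, (1/alpha) I) and noise precision beta,
# choosing alpha = lam * beta so the two objectives match
beta = 1.0
alpha = lam * beta
w_map = np.linalg.solve(alpha * np.eye(3) + beta * X.T @ X, beta * X.T @ y)

print(np.allclose(w_ridge, w_map))  # prints True: the two estimates coincide
```

The penalty term in ridge regression is exactly the negative log of a Gaussian prior, which is why the two solutions agree to machine precision.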
Basics of Bayesian Methods
•The incorporation of prior information (e.g., expert opinion, a thorough literature review of the same or similar variables, and/or prior data)
•The prior is combined with a likelihood function. The likelihood function represents the data (i.e., how probable the observed data are under each candidate parameter value)
•The combination of the prior with the likelihood function results in the creation of a posterior distribution of coefficient values
•Samples are drawn from the posterior distribution to create an empirical distribution of likely values for the population parameter
•Basic statistics are used to summarize the empirical distribution of posterior samples. The mode of this empirical distribution is the maximum a posteriori (MAP) estimate of the true population value (i.e. population parameter); the median or mean can serve as alternative point estimates, and a credible interval captures the true value with a stated probability
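The steps above can be sketched end to end with a conjugate Beta–Binomial model, where the posterior is available in closed form and sampling from it is trivial. The prior and data below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

# Step 1: prior information, here a Beta(2, 2) prior on a proportion (hypothetical)
a_prior, b_prior = 2, 2

# Steps 2-3: combine with the likelihood of hypothetical data (13 successes in 50
# trials). The Beta prior is conjugate to the binomial likelihood, so the
# posterior is simply Beta(a_prior + successes, b_prior + failures)
successes, n = 13, 50
a_post, b_post = a_prior + successes, b_prior + (n - successes)

# Step 4: draw samples from the posterior
draws = rng.beta(a_post, b_post, size=100_000)

# Step 5: summarize the empirical distribution
point_estimate = np.median(draws)
ci_low, ci_high = np.quantile(draws, [0.025, 0.975])
print(f"median {point_estimate:.3f}, 95% credible interval ({ci_low:.3f}, {ci_high:.3f})")
```

The credible interval has the direct reading the bullet list promises: given the prior and the data, the parameter lies inside it with 95% probability.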
Bayesian statistics mostly involves conditional probability, the probability of an event A given event B, which can be calculated using Bayes' rule.
An example to explain the above: calculate the probability that an adult American uses an online dating site, given that they fall in the 30–49 age group.
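A small helper makes the mechanics of Bayes' rule concrete: expand the denominator P(B) with the law of total probability, then divide. The numbers plugged in below are placeholders for the dating-site example, not figures from any actual survey.

```python
def bayes_rule(p_b_given_a, p_a, p_b_given_not_a):
    """P(A|B) via Bayes' rule, expanding P(B) with the law of total probability."""
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
    return p_b_given_a * p_a / p_b

# Placeholder numbers (hypothetical, not survey data):
# A = "uses an online dating site", B = "aged 30-49"
p = bayes_rule(p_b_given_a=0.40, p_a=0.15, p_b_given_not_a=0.30)
print(f"P(A|B) = {p:.3f}")
```

Swapping in real survey proportions for the three inputs would give the actual answer to the question posed above.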
Bayes Rule — A Use Case
Early HIV Screening in the US Military
-First screen with ELISA
-If positive, then two more rounds of ELISA
-If either positive, two Western blot assays
-Only if both are positive is the recruit determined to be HIV positive
ELISA
•Sensitivity (True Positive Rate): 93%, denoted by P(+1 | HIV) = 0.93
•Specificity (True Negative Rate): 99%, denoted by P(-1 | no HIV) = 0.99
•Prevalence: 1.48/1000, denoted by P(HIV) = 0.00148
Question — Calculate the probability that a recruit who tests positive on the first ELISA actually has HIV.
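The answer follows directly from Bayes' rule and the three numbers given above; a few lines of plain Python are enough to compute it.

```python
sensitivity = 0.93      # P(+1 | HIV)
specificity = 0.99      # P(-1 | no HIV)
prevalence = 0.00148    # P(HIV)

# Bayes' rule: P(HIV | +1) = P(+1 | HIV) * P(HIV) / P(+1),
# expanding P(+1) over the two cases HIV / no HIV
false_positive_rate = 1 - specificity  # P(+1 | no HIV)
p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)
p_hiv_given_positive = sensitivity * prevalence / p_positive

print(f"P(HIV | +1) = {p_hiv_given_positive:.4f}")  # ≈ 0.12
```

Because the prevalence is so low, a single positive ELISA still leaves only about a 12% probability of actually having HIV, which is exactly why the screening protocol requires multiple follow-up tests.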
Thank you for reading. Links to other blogs:
Central Limit Theorem — Statistics
General Linear Model — 2
General and Generalized Linear Models
The Poisson Distribution
Uniform Distribution
Normal Distribution
Binomial Distribution
10 alternatives for Cloud based Jupyter notebook!!