Why is the link function of the Binomial distribution in a generalized linear model (GLM) the logit?
The logit is the standard link for a Binomial generalized linear model because it is the canonical link: it sets the linear predictor equal to the natural parameter of the distribution's exponential family form. Written in that form, the Binomial distribution has the log-odds of success as its natural parameter, which is exactly what the logit computes: logit(p) = log(p/(1-p)). This correspondence simplifies estimation. The sufficient statistic for the coefficients reduces to Xᵀy, a fixed-dimension summary of the data, and the log-likelihood is concave in β, so the maximum likelihood estimate is unique whenever it exists (that is, when X has full column rank and the outcomes are not perfectly separated by the predictors). Mechanically, the logit maps a probability in (0, 1) onto the whole real line, so the linear predictor η = Xβ can take any real value, while the inverse link (the logistic function) maps η back to a valid probability strictly between 0 and 1. This matters because the fitted probabilities must stay inside that interval for every possible value of the input variables and the estimated coefficients.
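The exponential family argument above can be written out in a few lines. The following is a sketch of the standard derivation; the symbols θ (natural parameter), b(θ) (cumulant function) and c(y) are notational choices made here, not taken from the text above, and the dispersion parameter is fixed at 1.

```latex
\begin{align*}
f(y; n, p) &= \binom{n}{y} p^{y} (1-p)^{n-y} \\
           &= \exp\left\{ y \log\frac{p}{1-p} + n \log(1-p) + \log\binom{n}{y} \right\} \\
           &= \exp\left\{ y\,\theta - b(\theta) + c(y) \right\}, \\[4pt]
\theta &= \log\frac{p}{1-p} = \operatorname{logit}(p), \qquad
b(\theta) = n \log\left(1 + e^{\theta}\right), \qquad
c(y) = \log\binom{n}{y}, \\[4pt]
b'(\theta) &= \frac{n e^{\theta}}{1 + e^{\theta}} = n p = \mathbb{E}[Y].
\end{align*}
% Canonical link: set \eta_i = \theta_i = x_i^\top \beta. The score and Hessian are then
\begin{align*}
\frac{\partial \ell}{\partial \beta} &= \sum_i x_i \left( y_i - n_i p_i \right) = X^\top (y - \mu), \\
\frac{\partial^2 \ell}{\partial \beta\, \partial \beta^\top} &= -X^\top W X, \qquad
W = \operatorname{diag}\bigl( n_i p_i (1 - p_i) \bigr).
\end{align*}
```

Because W has strictly positive diagonal entries, the Hessian is negative semidefinite for every β, which is where the concavity of the log-likelihood comes from.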
Beyond mathematical convenience, the logit link has a natural interpretation in terms of odds and multiplicative effects. A one-unit change in a predictor adds its coefficient β to the log-odds of the outcome, which multiplies the odds themselves by exp(β), whatever the baseline probability. This gives a stable and interpretable measure of covariate effects, particularly in fields such as epidemiology and the social sciences where odds ratios are standard. (Relative risks are a different quantity; an odds ratio approximates a relative risk only when the outcome is rare.) Other links such as the probit and the complementary log-log also map the linear predictor into the unit interval, but their coefficients have no equally direct reading, so practitioners often find the logit more intuitive. The logit is also symmetric around p = 0.5, since logit(1-p) = -logit(p): modeling failures instead of successes only flips the signs of the coefficients, and a given shift in η moves the probability by the same amount below 0.5 as the mirror-image shift does above it. The probit shares this symmetry; asymmetric links such as the complementary log-log do not.
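The odds interpretation can be checked numerically. The sketch below uses only the Python standard library; the coefficient value 0.7 and the three baseline values of the linear predictor are made up for illustration. It shows that a unit increase in a predictor multiplies the odds by the same factor at every baseline, even though the change in probability differs, and it checks the symmetry of the logit around 0.5.

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Logistic function, the inverse of the logit."""
    return 1.0 / (1.0 + math.exp(-eta))

def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)

beta = 0.7                    # hypothetical coefficient of one predictor
baselines = (-3.0, 0.0, 2.0)  # hypothetical baseline values of the linear predictor

odds_ratios = []
prob_changes = []
for eta0 in baselines:
    p0 = inv_logit(eta0)         # probability before the unit increase
    p1 = inv_logit(eta0 + beta)  # probability after the unit increase
    odds_ratios.append(odds(p1) / odds(p0))
    prob_changes.append(p1 - p0)
    print(f"eta0={eta0:+.1f}  p0={p0:.4f}  p1={p1:.4f}  "
          f"odds ratio={odds_ratios[-1]:.4f}  change in p={prob_changes[-1]:.4f}")

print("exp(beta) =", math.exp(beta))

# Symmetry around p = 0.5: logit(1 - p) is the negative of logit(p).
print(logit(0.2), logit(0.8))
```

The odds ratio column is constant and equal to exp(beta), while the change in probability depends on where the baseline sits on the logistic curve.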
The logit is not merely a default. It is also computationally stable and can be fitted with the widely available iteratively reweighted least squares (IRLS) algorithm. With the canonical link, the observed and expected Fisher information matrices coincide, so Newton-Raphson and Fisher scoring are the same iteration, and the information matrix takes the simple form XᵀWX, where W is the diagonal matrix of Binomial variances. However, the logit is not obligatory; it is a model assumption. If the true relationship between the linear predictor and the probability does not follow a logistic shape, the model is misspecified. For instance, dose-response data in which each subject responds once the dose exceeds a normally distributed tolerance threshold lead naturally to the probit, and grouped or discrete-time duration data under a proportional hazards assumption lead to the complementary log-log. The justification for the logit is therefore a blend of theoretical alignment with the structure of the Binomial distribution, practical interpretability, and historical prevalence, and its use should always be checked against the empirical data and the substantive context of the research question.
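To make the iteratively reweighted least squares fitting concrete, here is a minimal sketch for a model with an intercept and a single predictor, written with the standard library only. The function name fit_logistic_irls and the toy data are hypothetical, and the code omits the safeguards a real implementation needs (step control, handling of perfect separation, weights that underflow to zero).

```python
import math

def inv_logit(eta):
    """Logistic function, the inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))

def fit_logistic_irls(x, y, n_iter=25, tol=1e-10):
    """Fit logit(p_i) = b0 + b1 * x_i to 0/1 outcomes by IRLS (illustrative helper)."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        # Accumulate the entries of X^T W X (2x2) and X^T W z (length 2).
        s_w = s_wx = s_wxx = s_wz = s_wxz = 0.0
        for xi, yi in zip(x, y):
            eta = b0 + b1 * xi
            mu = inv_logit(eta)
            w = mu * (1.0 - mu)      # weight: Bernoulli variance at the current fit
            z = eta + (yi - mu) / w  # working response
            s_w += w
            s_wx += w * xi
            s_wxx += w * xi * xi
            s_wz += w * z
            s_wxz += w * xi * z
        # Solve the 2x2 weighted least squares system in closed form.
        det = s_w * s_wxx - s_wx * s_wx
        new_b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        new_b1 = (s_w * s_wxz - s_wx * s_wz) / det
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1

# Toy data (made up): outcomes are not perfectly separated by x, so the MLE exists.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0, 0, 1, 0, 1, 1, 0, 1]

b0, b1 = fit_logistic_irls(x, y)
fitted = [inv_logit(b0 + b1 * xi) for xi in x]

# With the canonical link the score is X^T (y - mu), so both sums vanish at the MLE.
score_intercept = sum(yi - mi for yi, mi in zip(y, fitted))
score_slope = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, fitted))
print("b0 =", b0, "b1 =", b1)
print("score:", score_intercept, score_slope)
```

Each pass is an ordinary weighted least squares fit of the working response z on x. Because the link is canonical, this is the same update as a Newton-Raphson step on the log-likelihood.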