Example 4: Maximum Likelihood Estimation with Logit Model (Binary Dependent Variable Case)


Problem Statement

Like the probit model introduced in Example 3, a logit (or logistic regression) model is a type of regression in which the dependent variable is categorical. The dependent variable may be binary or multinomial; in the multinomial case, it may be either ordered or unordered. The logit differs from the probit in the distribution assumed for the error term: logistic rather than normal.

This example covers the binary logit, in which the dependent variable takes only two values (0/1). Greene (1992) estimated a model of consumer behavior that examined whether or not an individual had a major derogatory report in his or her credit history. The file credit.gdx contains information on the credit history of a sample of more than 1,000 individuals. For descriptions of variables in the data file, see here.

In order to examine the determinants of whether a credit card holder experiences a derogatory credit report, we set up the following discrete choice model similar to what we did in Example 3:
$$y_t = x'_t\beta + \mu_t,$$
where $y_t$ is a discrete (0/1) response variable of card holding satisfying:
\[ y_t = \left\{ \begin{aligned}
1 & \quad \text{if number of major derogatory credit reports $> 0$} \\
0 & \quad \text{otherwise,}
\end{aligned} \right.\]
and $x_t$ is a vector of exogenous variables. The conditional probability $\Pr(y_t = 1|x_t)$ therefore measures the chance that the dependent variable takes the "noteworthy" outcome, here the probability of receiving a major derogatory credit report given the exogenous variables. $\mu_t$ is the error term of observation $t$, and the coefficient vector $\beta$ governs how $\Pr(y_t = 1|x_t)$ responds to a unit change in $x_t$ (as introduced in Example 3); in a logit model the marginal effect itself is $\Pr(y_t = 1|x_t)\,[1-\Pr(y_t = 1|x_t)]\,\beta$, so it varies with $x_t$. Given this model, we estimate it using the logit specification and maximum likelihood techniques.

Unlike in the probit model, we now assume that the error terms $\mu_t$ are i.i.d. with a logistic distribution, so the conditional probability takes the logistic form:
$$\Pr(y_t = 1|x_t) = \frac{\exp(x'_t\beta)}{1+\exp(x'_t\beta)}.$$
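
As a small illustration of the logistic form above, the following Python sketch evaluates $\Pr(y_t = 1|x_t)$ and the implied marginal effects for one observation. It is not part of the GAMS model; the function names and the numbers in `x` and `beta` are hypothetical and chosen only for the example.

```python
import math

def logistic_prob(x, beta):
    """Pr(y = 1 | x) = exp(x'beta) / (1 + exp(x'beta)), evaluated stably."""
    z = sum(x_i * b_i for x_i, b_i in zip(x, beta))
    if z >= 0:
        # Equivalent form that avoids overflow for large positive z
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def marginal_effects(x, beta):
    """d Pr(y = 1 | x) / dx = p * (1 - p) * beta for the logit model."""
    p = logistic_prob(x, beta)
    return [p * (1.0 - p) * b_i for b_i in beta]

# Hypothetical observation: a constant term plus two regressors
x = [1.0, 0.5, -2.0]
beta = [0.2, 1.0, 0.3]

p = logistic_prob(x, beta)       # x'beta = 0.1, so p is about 0.525
me = marginal_effects(x, beta)   # one marginal effect per coefficient
print(p, me)
```

Because the marginal effects scale with $p(1-p)$, they are largest when the predicted probability is near 0.5 and shrink toward zero in the tails.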

Mathematical Formulation

A standard econometrics textbook such as Greene (2011) shows that the estimator $\hat{\beta}$ is obtained by maximizing the following log-likelihood function $\ln\mathcal{L}(\beta)$:

$$\hat{\beta} = \arg\max_{\beta}\left[\ln\mathcal{L}(\beta)\right] = \arg\max_{\beta}\left[\sum_t\left( y_t\ln\left(\frac{\exp(x'_t\beta)}{1+\exp(x'_t\beta)}\right)+ (1-y_t)\ln\left(\frac{1}{1+\exp(x'_t\beta)}\right)\right)\right].$$
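
The maximization above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the GAMS formulation: it uses Newton-Raphson (the GAMS model hands the problem to a nonlinear solver instead), a tiny made-up data set rather than credit.gdx, and hypothetical function names. It relies on the identity $y_t\ln\Lambda_t + (1-y_t)\ln(1-\Lambda_t) = y_t x'_t\beta - \ln(1+\exp(x'_t\beta))$, where $\Lambda_t$ is the logistic probability.

```python
import math

def dot(a, b):
    return sum(a_i * b_i for a_i, b_i in zip(a, b))

def logistic(z):
    """Logistic CDF, written to avoid overflow in exp."""
    return 1.0 / (1.0 + math.exp(-z)) if z >= 0 else math.exp(z) / (1.0 + math.exp(z))

def log_likelihood(beta, X, y):
    """ln L(beta) = sum_t [ y_t * x_t'beta - ln(1 + exp(x_t'beta)) ]."""
    total = 0.0
    for x_t, y_t in zip(X, y):
        z = dot(x_t, beta)
        log1pexp = max(z, 0.0) + math.log1p(math.exp(-abs(z)))  # stable ln(1 + e^z)
        total += y_t * z - log1pexp
    return total

def score_and_information(beta, X, y):
    """Gradient of ln L and the negative Hessian (information matrix)."""
    k = len(beta)
    g = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for x_t, y_t in zip(X, y):
        p = logistic(dot(x_t, beta))
        for i in range(k):
            g[i] += (y_t - p) * x_t[i]
            for j in range(k):
                info[i][j] += p * (1.0 - p) * x_t[i] * x_t[j]
    return g, info

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_logit(X, y, tol=1e-10, max_iter=50):
    """Newton-Raphson: beta <- beta + information^{-1} * score."""
    beta = [0.0] * len(X[0])
    for _ in range(max_iter):
        g, info = score_and_information(beta, X, y)
        step = solve(info, g)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Made-up data: a constant and one regressor; outcomes overlap, so the MLE exists
X = [[1.0, v] for v in (-2.0, -1.0, -1.0, 0.0, 0.0, 1.0, 1.0, 2.0)]
y = [0, 0, 1, 0, 1, 0, 1, 1]

beta_hat = fit_logit(X, y)
print(beta_hat, log_likelihood(beta_hat, X, y))
```

The log-likelihood of the logit model is globally concave in $\beta$, which is why a plain Newton iteration started at zero is usually adequate; a perfectly separated sample would have no finite maximizer, and the iteration would not converge.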

Similar to Example 3, we report estimated variances based on the diagonal elements of the covariance matrix $\hat{V}_{\hat{\beta}}$ along with t-statistics and p-values.
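
The reporting step can be sketched as follows. The sketch assumes the usual maximum likelihood covariance estimate, the inverse of the negative Hessian evaluated at $\hat{\beta}$, and computes two-sided p-values from the standard normal distribution; the covariance definition from Example 3 and the reference distribution used in the GAMS model may differ. The data and function names are hypothetical.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z)) if z >= 0 else math.exp(z) / (1.0 + math.exp(z))

def score_and_information(beta, X, y):
    """Score vector and information matrix sum_t p_t (1 - p_t) x_t x_t'."""
    k = len(beta)
    g = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for x_t, y_t in zip(X, y):
        p = logistic(sum(a * b for a, b in zip(x_t, beta)))
        for i in range(k):
            g[i] += (y_t - p) * x_t[i]
            for j in range(k):
                info[i][j] += p * (1.0 - p) * x_t[i] * x_t[j]
    return g, info

def invert(A):
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        d = M[col][col]
        M[col] = [v / d for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [row[n:] for row in M]

def fit_and_test(X, y, iters=25):
    """Newton-Raphson fit, then standard errors, t-statistics, and p-values."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        g, info = score_and_information(beta, X, y)
        V = invert(info)
        beta = [beta[i] + sum(V[i][j] * g[j] for j in range(k)) for i in range(k)]
    _, info = score_and_information(beta, X, y)
    V = invert(info)                                  # estimated covariance of beta_hat
    se = [math.sqrt(V[i][i]) for i in range(k)]       # roots of the diagonal elements
    t = [b / s for b, s in zip(beta, se)]
    # Two-sided p-value under a standard normal reference (an assumption)
    pvals = [math.erfc(abs(t_i) / math.sqrt(2.0)) for t_i in t]
    return beta, V, se, t, pvals

# Made-up data: a constant and one regressor
X = [[1.0, v] for v in (-2.0, -1.0, -1.0, 0.0, 0.0, 1.0, 1.0, 2.0)]
y = [0, 0, 1, 0, 1, 0, 1, 1]

beta_hat, V_hat, se, t_stats, p_values = fit_and_test(X, y)
for name, b, s, t_i, p_i in zip(("const", "x"), beta_hat, se, t_stats, p_values):
    print(f"{name:>5}  coef={b: .4f}  se={s:.4f}  t={t_i: .3f}  p={p_i:.3f}")
```

With only eight observations the asymptotic approximation behind these standard errors is rough; the example is meant to show the mechanics, not to produce meaningful inference.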


Check out the demo of Example 4 to experiment with estimating and statistically testing the logit discrete choice model.


A printable version of the model is here: logit_gdx.gms with gdx form data and logit_txt.gms with text form data.


  • Greene, William. 1992. A Statistical Model for Credit Scoring. Working Paper #92-29, Department of Economics, Stern School of Business, New York University, New York.
  • Kalvelagen, Erwin. 2007. Least Squares Calculations with GAMS. Available for download at http://www.amsterdamoptimization.com/pdf/ols.pdf.
  • Greene, William. 2011. Econometric Analysis, 7th ed. Prentice Hall, Upper Saddle River, NJ.
