E1: Nonlinear Least Squares with Cobb-Douglas Production Function

Problem Statement

In economics, a production function relates the output of a production process to its inputs. The Cobb–Douglas (C-D) production function, a particular functional form of the production function, is widely used to represent the technological relationship between the amounts of two or more inputs, typically physical capital and labor, and the amount of output that those inputs can produce. The C-D production function is a special case of the Constant Elasticity of Substitution (CES) production function. Production functions with a concave functional form, such as CES functions, are popular in economic modeling because they exhibit diminishing returns: the marginal (incremental) output of a production process decreases as the amount of a single factor of production is incrementally increased while the amounts of all other factors of production stay constant.

Here, we estimate the Cobb-Douglas function in its most standard form, a single good with two factors (labor and capital), using Mizon’s 1977 data set. In his 1977 paper, Mizon estimated a variety of specifications for production functions, including the C-D and CES, allowing for additive as well as multiplicative error terms (Example 2 will cover the CES models). For these estimations, Mizon used U.K. data on capital, labor use, and a common output measure for 24 industries covering the years 1954, 1957 and 1960.

To estimate the vector of unknowns of the Cobb-Douglas function, we minimize the Sum of Squared Errors (SSE) in a nonlinear least squares model. We also report some standard regression statistics at the estimation point: standard errors, t-statistics (also called T values), and p-values.

Mathematical Formulation

In a typical nonlinear least squares problem, we estimate a vector of unknowns $\theta$ by solving a constrained optimization problem. More specifically, given $n$ observations indexed by $t$, we search for the estimator $\hat{\theta}$ of $\theta$ that minimizes the SSE subject to the fitting constraints:

$$\hat{\theta} = \arg\min_{\theta}\sum_{t=1}^n\mu_t^2,$$

$$\mbox{s.t.} \quad q_t = f_t(x_t,\theta) + \mu_t.$$

Based on the data set from Mizon (1977), we observe exogenous data $x_t = (k_t, l_t)$, where $k_t$ is the capital used at time $t$ and $l_t$ is the labor employed at time $t$, together with the dependent variable $q_t$ (quantity of output). We are interested in estimating the unknown vector $\theta = (\theta_{i0}, \theta_{i1}, \theta_{i2})'$, in which $\theta_{i0}$ is a scale factor and $\theta_{i1}$ and $\theta_{i2}$ are the output elasticities of the input factors. The constraints are:

$$ {q_t} = {\theta_{i0}} {k_t^{\theta_{i1}}} {l_t^{\theta_{i2}}} + {\mu_t}.$$

In more general cases, with more than two inputs, we can denote the inputs as $V_{1}, V_{2}, \dots, V_{m}$, $m > 2$. Then the fitting constraints are:

$$ {q_t} = {\theta_{i0}} {(V_{1t})^{\theta_{i1}}} {(V_{2t})^{\theta_{i2}}} \dots {(V_{mt})^{\theta_{im}}} + {\mu_t},$$

and we need to estimate an unknown vector $\theta = (\theta_{i0}, \theta_{i1}, \theta_{i2},\dots, \theta_{im})'$ with $m + 1$ elements. Both the standard and the general form of the Cobb-Douglas production function are included in the demo for Example 1.
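As an illustration of the objective being minimized, the following Python sketch evaluates the Cobb-Douglas prediction for any number of inputs and the resulting SSE. It is not the GAMS model used in the demo; the function names `cobb_douglas` and `sse` and the small data set are hypothetical, and the numbers are not taken from Mizon (1977).

```python
import math

def cobb_douglas(theta, inputs):
    """Predicted output theta[0] * prod_j inputs[j] ** theta[j + 1]."""
    scale, elasticities = theta[0], theta[1:]
    if len(elasticities) != len(inputs):
        raise ValueError("need one elasticity per input")
    return scale * math.prod(v ** e for v, e in zip(inputs, elasticities))

def sse(theta, outputs, input_rows):
    """Sum of squared additive errors mu_t = q_t - f_t(x_t, theta)."""
    return sum((q - cobb_douglas(theta, x)) ** 2
               for q, x in zip(outputs, input_rows))

# Hypothetical noise-free data generated from theta = (2.0, 0.3, 0.6);
# these numbers are illustrative and are not taken from Mizon (1977).
true_theta = (2.0, 0.3, 0.6)
input_rows = [(1.0, 2.0), (2.0, 1.5), (3.0, 4.0), (5.0, 2.5)]  # (k_t, l_t)
outputs = [cobb_douglas(true_theta, x) for x in input_rows]

print(sse(true_theta, outputs, input_rows))       # zero: the data are noise-free
print(sse((2.0, 0.4, 0.6), outputs, input_rows))  # positive at a wrong theta
```

A nonlinear least squares solver searches over `theta` for the smallest value of `sse`; with real, noisy data the minimum is positive rather than zero.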

To report standard errors at the estimated point, we need to estimate the covariance matrix of the coefficients, $\hat{V}_{\hat{\theta}}$. Following standard econometrics references such as Greene (2011),
\begin{equation}
\hat{V}_{\hat{\theta}} = \frac{\sum_t(q_t - f_t(x_t, \hat{\theta}))^2}{n-m}{(J^TJ)^{-1}}.
\end{equation}
Here $\frac{\sum_t(q_t - f_t(x_t, \hat{\theta}))^2}{n-m}$ is the estimated variance of the residuals, where $n-m$ is the degrees of freedom, with $n$ = number of observations and $m$ = number of unknowns (three in the two-input model; note that here $m$ counts unknowns, not inputs). The Jacobian matrix $J$ is an $n \times m$ matrix with its $(t,i)^{th}$ element defined as $J_{t,i} = \frac{\partial{\mu_t(\theta)}}{\partial{\theta_i}}$, where $t=1,2,\dots, n$ and $i=1,2, \dots, m$. GAMS provides a mechanism to generate the Jacobian matrix $J$ at the solution point $\hat{\theta}$. In the GAMS model for this example, cd_gdx.gms (Cobb-Douglas model with gdx input), we rely on the convertd solver with the options DictMap and Jacobian to generate a dictionary map from the solver to GAMS and the Jacobian matrix at the solution point. These are saved in the data files dictmap.gdx and jacobian.gdx, respectively. Combining the information from these two files provides the Jacobian matrix $J$ at the solution point $\hat{\theta}$.
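The covariance formula above can be sketched in Python for the two-input model. This is only an illustration under stated assumptions: it uses an analytic Jacobian of the residuals in place of the Jacobian that the demo obtains from the convertd solver, the helper names are hypothetical, and the data and the value of `theta_hat` are made-up placeholders (so `theta_hat` is not an actual least squares solution and the printed standard errors are purely illustrative).

```python
import math

def residual_jacobian(theta, input_rows):
    """Rows of J: derivatives of mu_t = q_t - t0 * k**t1 * l**t2 w.r.t. theta."""
    t0, t1, t2 = theta
    rows = []
    for k, l in input_rows:
        f = k ** t1 * l ** t2              # prediction without the scale factor
        rows.append([-f, -t0 * f * math.log(k), -t0 * f * math.log(l)])
    return rows

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def covariance(theta, outputs, input_rows):
    """Estimated covariance: SSE / (n - m) * (J^T J)^{-1}."""
    n, m = len(outputs), len(theta)
    t0, t1, t2 = theta
    sse = sum((q - t0 * k ** t1 * l ** t2) ** 2
              for q, (k, l) in zip(outputs, input_rows))
    jac = residual_jacobian(theta, input_rows)
    jtj = [[sum(row[i] * row[j] for row in jac) for j in range(m)] for i in range(m)]
    s2 = sse / (n - m)                     # estimated residual variance
    return [[s2 * v for v in row] for row in invert(jtj)]

# Hypothetical data and a placeholder estimate; in practice theta_hat comes
# from the least squares solve and J from the solver.
input_rows = [(1.5, 2.0), (2.0, 1.5), (3.0, 4.0), (5.0, 2.5), (4.0, 6.0)]
outputs = [3.4, 3.2, 6.5, 5.6, 8.9]
theta_hat = (2.0, 0.3, 0.6)
cov = covariance(theta_hat, outputs, input_rows)
std_errors = [math.sqrt(cov[i][i]) for i in range(3)]
print(std_errors)
```

The standard error of each coefficient is the square root of the corresponding diagonal element of the covariance matrix. The logarithms in the Jacobian are one reason the input data must be strictly positive.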

Once we have the estimators and their standard errors, we can address the common question of how good the estimators are. To test whether an estimator is significantly different from zero, we can compute a t-statistic for the corresponding hypothesis. Following Greene (2011), let $\hat{\beta}$ be an estimator of the true value $\beta$ and let $\beta_0$ be the value of $\beta$ under the null hypothesis; then the t-statistic for $\beta$ is defined as

$$t_{\hat{\beta}} = \frac{\hat{\beta}-\beta_0}{s.e.(\hat{\beta})}.$$

Note that we usually set $\beta_0 = 0$ when testing the significance of an estimator, so the null hypothesis is that the true coefficient is zero. When the absolute value of the t-statistic is larger than the critical value for some significance level $\alpha$, we reject the null hypothesis and conclude that the estimator is significantly different from zero. The p-value is defined as the probability of obtaining a result equal to or “more extreme” than what was actually observed, assuming that the hypothesis under consideration is true. In this case, the p-value is the probability of observing a t-statistic at least as large in absolute value as the one computed, assuming that the true coefficient is zero. If the p-value is less than the significance level $\alpha$, then it is “unlikely” that the null hypothesis is true, and we reject it. For the same test, a larger absolute t-statistic corresponds to a smaller p-value.
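To make the relationship between the t-statistic and the p-value concrete, here is a minimal Python sketch using only the standard library. It computes the two-sided p-value by numerically integrating the Student's t density with Simpson's rule, which is an approximation chosen for this illustration and not necessarily how the demo computes it. The function names, the estimate 0.6, and the standard error 0.2 are hypothetical; the 69 degrees of freedom assume that all 72 industry-year observations (24 industries, three years) and three unknowns are used.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|) via Simpson's rule on the density over [0, |t_stat|]."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2.0 * total * h / 3.0   # P(-|t| <= T <= |t|) by symmetry
    return max(0.0, 1.0 - central_mass)

def t_statistic(estimate, std_error, null_value=0.0):
    """(estimate - null_value) / std_error, as in the formula above."""
    return (estimate - null_value) / std_error

# Hypothetical estimate and standard error, with 72 - 3 = 69 degrees of freedom
t = t_statistic(0.6, 0.2)
print(t, two_sided_p_value(t, df=69))
```

With a significance level of $\alpha = 0.05$, a coefficient would be judged significant whenever the returned p-value falls below 0.05.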

Demo

This demo provides two data input options for variable estimation and reports regression statistics based on a Cobb-Douglas production function. The reported statistics include the estimates, standard errors, T values, and p-values (for the null hypothesis that each coefficient is zero) at the estimated point.

Option 1: Data in a text file

Users who have access to the data needed in the estimation should create a text file with the data, for example, the capital, labor, and production data collected in Mizon (1977). See mizon_data.txt. User-provided data files must satisfy the following restrictions:

  • The first column of the data file must be a column of output indexed by Q, denoting the quantity of output.
  • The second and subsequent columns of the data file that contain input data may not contain any negative or zero input values.
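The two restrictions above can be checked before running the model. The Python sketch below works on an already parsed table, because the exact layout of the text file is not described here; the function name `validate_table` and the column labels `K` and `L` are hypothetical.

```python
def validate_table(header, rows):
    """Return a list of problems found in a parsed data table.

    header: column labels, e.g. ["Q", "K", "L"] (labels other than "Q" are
            hypothetical here)
    rows:   one list of numbers per observation, in header order
    """
    problems = []
    if not header or header[0] != "Q":
        problems.append("first column must be the output column Q")
    for t, row in enumerate(rows, start=1):
        if len(row) != len(header):
            problems.append(f"row {t}: expected {len(header)} values, got {len(row)}")
            continue
        for label, value in zip(header[1:], row[1:]):
            if value <= 0:
                problems.append(f"row {t}: input {label} must be positive, got {value}")
    return problems

print(validate_table(["Q", "K", "L"], [[3.4, 1.5, 2.0], [3.2, 0.0, 1.5]]))
# prints ['row 2: input K must be positive, got 0.0']
```

An empty list means that the table satisfies both restrictions.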

The estimated variables in the Cobb-Douglas model are indexed by i0 for the scale factor, i1 for the output elasticity of input 1, i2 for the output elasticity of input 2, i3 for the output elasticity of input 3, and so on.

Users can then download a sample GAMS model file, cd_text.gms (Cobb-Douglas model with text input), and modify it to solve their own estimation problems. Users should specify their own set definitions (sets “t” and “m” in the sample), include their own table of data (as described above), and run the modified model to obtain the estimation results.

Option 2: Data in a GAMS data exchange (gdx) file

Users who have access to the data in a GAMS data exchange (gdx) file can use one of the following two methods.

  • Method 1: Solve using the NEOS Server
    Users can click on the “Solve with NEOS” button to find estimation results based on the default gdx file, i.e., the file with the capital, labor, and production data collected in Mizon (1977). See mizon.gdx. Alternatively, users can upload their own data by clicking on the button next to “Upload GDX File” and then “Solve with NEOS”. User-provided gdx files must satisfy the same restrictions as listed above in Option 1.

    The estimated variables in the Cobb-Douglas model are indexed by i0 for the scale factor, i1 for the output elasticity of input 1, i2 for the output elasticity of input 2, i3 for the output elasticity of input 3, and so on. Clicking on the “Reset” button will clear the solution.

  • Method 2: Calculate the regression statistics locally
    Users who have access to GAMS can download the GAMS model file, cd_gdx.gms (Cobb-Douglas model with gdx input), and solve the model locally with the following command:

    • “gams cd_gdx --in=mydata”

    where mydata.gdx is a data file provided by the user. The gdx file must satisfy the restrictions as described above in Option 1.

Model

A printable version of the nonlinear least squares model is here: cd_gdx.gms (Cobb-Douglas model with gdx input) or cd_text.gms (Cobb-Douglas model with text input).

References

  • Mizon, Grayham E. 1977. Inferential Procedures in Nonlinear Models: An Application in a UK Industrial Cross Section Study of Factor Substitution and Returns to Scale. Econometrica 45(5), 1221-1242.
  • Kalvelagen, Erwin. 2007. Least Squares Calculations with GAMS. Available for download at http://www.amsterdamoptimization.com/pdf/ols.pdf.
  • Greene, William. 2011. Econometric Analysis, 7th ed. Prentice Hall, Upper Saddle River, NJ.