
Log-Linear Regressions: Three Things To Keep In Mind

August 22, 2015

Early on in introductory econometrics courses, students learn about linear regression and two related siblings: log-linear (also called log-level) and log-log regressions. This post is about three things I came to learn about log-linear regressions that should be known to anyone even thinking of running such a specification. I am summarizing two excellent blog posts, by William Gould (CEO of StataCorp) and Dave Giles (Economics Professor at the University of Victoria), which I was told about by Anthony D’Agostino and Kyle Meng, respectively.

In a log-linear regression we have a dependent variable, y_i, of which we take a log transformation, \log(y_i), and a set of explanatory variables \mathbf{X}. For now let us focus on one continuous explanatory variable, x_i, and one binary (dummy) variable, D_i. Our model then takes the following form:

\log(y_i) = \beta_0 + \beta_1 x_i + \beta_2 D_i + \varepsilon_i

Why even bother with a log-linear model? Because the way we interpret the coefficients is different. In a level-level model (a standard linear regression), when the independent variable increases by one unit, the dependent variable increases by the coefficient amount. However, in a log-linear model, when the continuous independent variable (emphasis on continuous, explained shortly) increases by one unit, we interpret the coefficient as the relative change in the dependent variable. Meaning that if \beta_1=0.1 we would interpret this as (approximately) a 10% increase in \mathbb{E}(y_i).

This matters because if different units of observation have different baselines of y_i, they might respond differently to the same change in x_i. If one person showers for 30 minutes and another showers for 5 minutes, and suddenly there is a water price increase, we would not expect the same absolute reduction in shower times. It is much easier for the former shower enthusiast to shave 5 minutes off their shower time, whereas the latter, already frugal showerer would find doing the same quite undesirable (as would those around them). It is more likely that each will reduce their showering time in a relative manner. If they cut their shower times to 24 and 4 minutes, respectively, then the coefficient from a log-linear model will capture that 20% reduction.
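To see this in the model’s own terms, both reductions correspond to exactly the same change in logs, which is what the log-linear specification measures:

\log(24) - \log(30) = \log(4) - \log(5) = \log(0.8) \approx -0.223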

Then what are the three big “secrets” about log-linear models?

  1. There is an extra step when it comes to getting the predicted values that you might not be aware of.
    • You are probably aware that once you run a log-linear model you need to exponentiate the predicted values, something of the form:
      • \hat{y}_i = e^{\hat{\beta}_0 + \hat{\beta}_1 x_i + \hat{\beta}_2 D_i}
    • I was ignorant enough to think that this was enough. But if you read through William’s post you will understand why you need the next step as well:
      • \tilde{y}_i = \hat{y}_i \times e^{\frac{\hat{\sigma}^2}{2}}
      • Where \hat{\sigma}^2 is the square of the Root Mean Square Error (RMSE) of the regression. You need this step because when you write the log-linear model back in level form, what enters the expectation is not e^{\mathbb{E}(\varepsilon_i)} (which equals one, since \mathbb{E}(\varepsilon_i)=0) but rather \mathbb{E}(e^{\varepsilon_i}), which under normally distributed errors equals e^{\sigma^2/2} > 1; that makes all the difference in the world. Again, more detail in the original post by William Gould, and see the first sketch after this list.
  2. You might actually want to run a Poisson regression instead.
    • Often the dependent variable equals zero for some observations, and we cannot take the log of those values. We are forced to drop them from the sample, yet they might hold important information. Dropping observations also makes it harder to compare against the benchmark level-level model that you are probably running as well. With a Poisson regression you do not have to drop those observations.
    • The sample might also have values that are not zero but are very small. Those get picked up as very big differences in a log-linear model, something a Poisson regression knows how to handle better.
    • Make sure you are using robust standard errors, as you need to relax the assumption that the mean and the variance of a Poisson distribution are equal. All software packages will do this easily, but you have to remember to tell them to do so (see the second sketch after this list)!
    • Keep in mind that Poisson models work great with count data when it comes to interpretation, but there are ample caveats when it comes to continuous data.
  3. The coefficient interpretation above is valid for continuous variables, but requires more care when it comes to dummy variables.
    • I saved the best for last. In his blog post, Giles covers several very interesting topics about dummy variables. The one relevant to log-linear models is that interpreting the coefficient on a dummy as the relative change in the dependent variable is wrong. You need to apply a simple correction to get the correct magnitude, which is no longer symmetric: the size of the effect differs depending on whether the dummy variable goes from zero to one or from one to zero.
    • You can easily convince yourself that the relative change in our model from the beginning of this post, when D_i goes from zero to one, is:
      • \frac{e^{\beta_0 + \beta_1 x_i + \beta_2 + \varepsilon_i} - e^{\beta_0 + \beta_1 x_i + \varepsilon_i}}{e^{\beta_0 + \beta_1 x_i + \varepsilon_i}}\times 100\%
      • Carrying the algebra through, the correct size of the relative change is:
        • When D_i=0\rightarrow D_i=1 is (e^{\beta_2} - 1)\times 100\%
        • When D_i=1\rightarrow D_i=0 is (e^{-\beta_2} - 1)\times 100\%
    • This correction should only be applied when you refer to the estimates in the text. Your regression table should still report the estimated coefficient as is; but whenever you reference that coefficient in terms of its magnitude, apply the correction (the third sketch after this list evaluates it).
    • That being said, the differences are almost negligible for coefficient values below 0.1 (at 0.1 the gap is about half a percentage point). Around 0.2 you already face a gap of roughly two percentage points (e^{0.2}-1 \approx 0.221 versus 0.2), and it keeps climbing from there. The point is that if your point estimates are below 0.1 you can quite possibly ignore this, but it is still worth remembering.
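To make point 1 concrete, here is a minimal sketch in Python (using numpy and statsmodels; the simulated data and variable names are my own illustration, not from Gould’s post). It fits a log-linear model and applies the e^{\hat{\sigma}^2/2} correction when transforming predictions back to levels:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 1_000

# Simulate data consistent with log(y) = b0 + b1*x + b2*D + e
x = rng.normal(size=n)
D = rng.integers(0, 2, size=n)
eps = rng.normal(scale=0.5, size=n)
y = np.exp(1.0 + 0.1 * x + 0.3 * D + eps)

X = sm.add_constant(np.column_stack([x, D]))
fit = sm.OLS(np.log(y), X).fit()

# Naive back-transformation: just exponentiate the fitted values
y_hat_naive = np.exp(fit.fittedvalues)

# Corrected: multiply by exp(sigma^2 / 2), where sigma^2 is the
# estimated error variance (the squared RMSE, fit.mse_resid)
y_hat_corrected = y_hat_naive * np.exp(fit.mse_resid / 2)

print(y.mean(), y_hat_naive.mean(), y_hat_corrected.mean())
# The corrected mean sits much closer to the sample mean of y; the
# naive one underestimates it by roughly exp(0.5**2 / 2) - 1, i.e. ~13%.
```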
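For point 2, a similar sketch of the Poisson alternative. The cov_type="HC1" argument asks statsmodels for heteroskedasticity-robust (sandwich) standard errors, which is the “remember to tell them” step:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 1_000

x = rng.normal(size=n)
D = rng.integers(0, 2, size=n)
# Outcome with genuine zeros: a log-linear model would have to drop them
y = rng.poisson(lam=np.exp(0.5 + 0.1 * x + 0.3 * D))

X = sm.add_constant(np.column_stack([x, D]))

# Poisson regression keeps the zero observations; robust standard
# errors relax the mean-equals-variance assumption
pois = sm.GLM(y, X, family=sm.families.Poisson()).fit(cov_type="HC1")
print(pois.summary())
```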
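And for point 3, the correction itself is a one-liner. This snippet just evaluates the two asymmetric effects for a hypothetical estimate of \hat{\beta}_2 = 0.3:

```python
import numpy as np

beta_2 = 0.3  # hypothetical estimated dummy coefficient

# Relative change in y when D goes from 0 to 1, and from 1 to 0
up = (np.exp(beta_2) - 1) * 100     # ~ +34.99%
down = (np.exp(-beta_2) - 1) * 100  # ~ -25.92%

print(f"0 -> 1: {up:+.2f}%   1 -> 0: {down:+.2f}%")
```

Note that the two magnitudes differ, which is exactly the asymmetry the correction captures.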

Mind blown much?