Hand Coding Instumental Variables
In a previous post we discussed the linear model and how to write a function that performs a linear regression. In this post we will use that linear model function to perform a [Two-Stage Least Squares estimation]. This estimation allows us to […]
Recall that we built the follow linear model function.
ols <- function (y, X, intercept=TRUE) {
if (intercept) X <- cbind(1, X)
solve(t(X)%*%X) %*% t(X)%*%y # solve for beta
}We used the following data.
data(iris)
x1 <- iris$Petal.Width
x2 <- iris$Sepal.Length
y <- iris$Petal.LengthThis allowed us to estimate a linear model.
X <- cbind(x1, x2)
ols(y = y, X = X)## [,1]
## -1.5071384
## x1 1.7481029
## x2 0.5422556Which includes the intercept, since the default value it TRUE (see function definition above),
we could estimate it without an intercept using.
ols(y = y, X = X, intercept = FALSE)## [,1]
## x1 1.9891655
## x2 0.2373159Having revisited the above, we can continue with instumental variables.
We will replicate an example from the AER (Applied Econometric Regressions) package.
library(AER)
data("CigarettesSW")
rprice <- with(CigarettesSW, price/cpi)
tdiff <- with(CigarettesSW, (taxs - tax)/cpi)
packs <- CigarettesSW$packsWe first need to obtain our first stage estimate (putting the whole function between parentheses allows us to both write it to the object s1 and print it).
( s1 <- ols(y = rprice, X = tdiff) )## [,1]
## 91.103739
## X 4.163002We can now obtain the predicted (fitted) values
Xhat <- s1[1]*tdiff + s1[2]Using these fitted values, we can finally estimate our second stage.
ols(y = packs, X = Xhat)## [,1]
## 126.89145456
## X -0.04658554Now compare this to the results using ivreg() fucntion from the AER package.
ivreg(packs ~ rprice | tdiff)##
## Call:
## ivreg(formula = packs ~ rprice | tdiff)
##
## Coefficients:
## (Intercept) rprice
## 219.576 -1.019When we compare this estimate to the estimate from the linear model, we find that the effect of the price is significantly overestimated there.
lm(packs ~ rprice)##
## Call:
## lm(formula = packs ~ rprice)
##
## Coefficients:
## (Intercept) rprice
## 222.209 -1.044We can also do this using R’s built in lm function.
# first stage
s1 <- lm(rprice ~ tdiff)
# estimate second stage using fitted values (Xhat)
lm(packs ~ s1$fitted.values)##
## Call:
## lm(formula = packs ~ s1$fitted.values)
##
## Coefficients:
## (Intercept) s1$fitted.values
## 219.576 -1.019As an intermediate form, we can manually calculate Xhat (fitted.values) and estimate using that.
# manually obtain fitted values
Xhat <- s1$coefficients[2]*tdiff + s1$coefficients[1]
# estimate second stage using Xhat
lm(packs ~ Xhat)##
## Call:
## lm(formula = packs ~ Xhat)
##
## Coefficients:
## (Intercept) Xhat
## 219.576 -1.019Note that if you estimate a TSLS using the lm function,
that you can only use the coefficients,
the error terms will be wrong.