Hand Coding a Linear Model function
In yesterday’s post we developed a method for constructing a multivariate linear model with an intercept.
Today we will turn the collection of loose commands into an integrated and easy to use function
.
A small recap from yesterday. We start by loading data and assigning our variables to objects
data(iris)
x1 <- iris$Petal.Width
x2 <- iris$Sepal.Length
y <- iris$Petal.Length
We now construction our linear model, the fastest way of doing this is using the QR decomposition.
XI <- cbind(x1, x2, 1) # tie all the independent variables together
solve(t(XI)%*%XI) %*% t(XI)%*%y # solve for beta
## [,1]
## x1 1.7481029
## x2 0.5422556
## -1.5071384
In R
functions are treated as objects, which means that we ascribe our function to a name, the same way we ascribed our variables to a name (e.g. x1 <- Petal.Length
).
In order to create a function, we use the construction function (a function to construct an object of the type function).
Logically, but perhaps also somewhat confusingly, this constructor function is called function()
.
Lets start with creating a function, called squaring
, that takes one numeric argument (parameter) and returns it squared.
squaring <- function(a) a^2
squaring(a = 2)
## [1] 4
We can also use negatives of course.
squaring(-3)
## [1] 9
Note that in the above call we did not even specify that the input (-3
) had to be assigned to the argument a
,
this was automatically deduced by R
.
We can also construction a function that takes two arguments and multiplies them.
multiplying <- function(a, b) {
a * b
}
multiplying(5, -6)
## [1] -30
Note that because this function was a bit longer, I put the code on a new line.
When doing this, the code has to be enclosed in curly brackets ({
and }
).
It is now time to apply these insights to our linear model. The first step is to define our function which takes the arguments y
and X
.
linear_model <- function (y, X) {
}
We can now simply insert our sollution for beta.
linear_model <- function (y, X) {
solve(t(X)%*%X) %*% t(X)%*%y # solve for beta
}
We can now estimate our model.
linear_model(y = y, X = XI)
## [,1]
## x1 1.7481029
## x2 0.5422556
## -1.5071384
Note that the argument (e.g. X
) and the assigned object (XI
) do not have to have the same name.
At this point we still need to manually include our incept using cbind(x1, x2, 1)
.
We can add a small piece to our code, which will allow the function to take care of this.
linear_model <- function (y, X, intercept=TRUE) {
if (intercept) X <- cbind(X, 1)
solve(t(X)%*%X) %*% t(X)%*%y # solve for beta
}
The first piece intercept=TRUE
creates the argument and sets the default to TRUE
.
That means that, unless me manually specify default=FALSE
, an intercept will be included.
The second piece if (intercept) X <- cbind(X, 1)
does the following.
- check if
intercept == TRUE
- if so, overwrite
X
withcbind(X, 1)
We can now use this function as follows.
X <- cbind(x1, x2)
linear_model(y = y, X = X)
## [,1]
## x1 1.7481029
## x2 0.5422556
## -1.5071384
That gives us a pretty complete linear model.