Notes on Regression - Approximation of the Conditional Expectation Function

The final installment in my ‘Notes on Regression’ series! For a review of the ways to derive the Ordinary Least Squares formula, along with various algebraic and geometric interpretations, check out the previous 5 posts: Part 1 - OLS by way of minimising the sum of squared errors; Part 2 - Projection and Orthogonality; Part 3 - Method of Moments; Part 4 - Maximum Likelihood; Part 5 - Singular Value Decomposition. [Read More]

Notes on Regression - Singular Value Decomposition

Here’s a fun take on OLS that I picked up from The Elements of Statistical Learning. It applies the Singular Value Decomposition (SVD), the method underlying principal component analysis, to the regression framework. Singular Value Decomposition (SVD): First, a little background on the SVD. The SVD can be thought of as a generalisation of the eigendecomposition. An eigenvector \(v\) of a matrix \(\mathbf{A}\) is a vector that is mapped to a scaled version of itself: \[ \mathbf{A}v = \lambda v \] where \(\lambda\) is the corresponding eigenvalue. [Read More]
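As a rough sketch of where this leads, here is the standard route from the SVD to the OLS formula. The notation is an assumption of this sketch rather than something taken from the excerpt: \(\mathbf{X}\) is an \(n \times p\) design matrix with full column rank, \(\mathbf{X} = \mathbf{U}\mathbf{D}\mathbf{V}^{T}\) is its thin SVD with \(\mathbf{U}^{T}\mathbf{U} = \mathbf{I}\), \(\mathbf{V}^{T}\mathbf{V} = \mathbf{V}\mathbf{V}^{T} = \mathbf{I}\), and \(\mathbf{D}\) diagonal with positive entries.

\[
\begin{aligned}
\mathbf{X}^{T}\mathbf{X} &= \mathbf{V}\mathbf{D}\mathbf{U}^{T}\mathbf{U}\mathbf{D}\mathbf{V}^{T} = \mathbf{V}\mathbf{D}^{2}\mathbf{V}^{T} \\
\hat{\beta} &= (\mathbf{X}^{T}\mathbf{X})^{-1}\mathbf{X}^{T}y = \mathbf{V}\mathbf{D}^{-2}\mathbf{V}^{T}\,\mathbf{V}\mathbf{D}\mathbf{U}^{T}y = \mathbf{V}\mathbf{D}^{-1}\mathbf{U}^{T}y \\
\hat{y} &= \mathbf{X}\hat{\beta} = \mathbf{U}\mathbf{D}\mathbf{V}^{T}\,\mathbf{V}\mathbf{D}^{-1}\mathbf{U}^{T}y = \mathbf{U}\mathbf{U}^{T}y
\end{aligned}
\]

Under these assumptions, the fitted values are simply the projection of \(y\) onto the column space spanned by \(\mathbf{U}\).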

Comparing the Population and Group Level Regression

I was planning to write a post that uses region-level data to infer the underlying relationship at the population level. However, after thinking through the issue over the past few days and working out the math (below), I realised that the question I wanted to answer cannot be answered using the aggregate data at hand. Nonetheless, here is a formal description of the problem, outlining the assumptions needed to infer population-level trends from more aggregated data. [Read More]

Notes on Regression - Maximum Likelihood

Part 4 in the series of notes on regression analysis derives the OLS formula through the maximum likelihood approach. Maximum likelihood involves finding the parameter values that maximise the probability of the observed data, assuming a particular functional form for the distribution. Bernoulli example: Take, for example, a dataset consisting of results from a series of coin flips. The coin may be biased and we want to find an estimator for the probability of the coin landing heads. [Read More]
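The coin-flip idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are made up, and the function names `bernoulli_log_likelihood` and `bernoulli_mle` are hypothetical helpers, not anything from the post. The closed-form estimate (the share of heads) is compared against a brute-force search over a grid of candidate probabilities.

```python
import math


def bernoulli_log_likelihood(p, flips):
    """Log-likelihood of heads probability p, where flips holds 1 (heads) or 0 (tails)."""
    heads = sum(flips)
    tails = len(flips) - heads
    return heads * math.log(p) + tails * math.log(1 - p)


def bernoulli_mle(flips):
    """Closed-form maximum likelihood estimate: the sample proportion of heads."""
    return sum(flips) / len(flips)


# Hypothetical data: 7 heads out of 10 flips.
flips = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
p_hat = bernoulli_mle(flips)

# Sanity check: search a grid of candidate values for the likelihood maximiser.
grid = [i / 100 for i in range(1, 100)]
p_grid = max(grid, key=lambda p: bernoulli_log_likelihood(p, flips))

print(p_hat, p_grid)
```

The grid search and the closed form should agree up to the grid resolution, which is a quick way to convince oneself that the sample proportion really does maximise the likelihood.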

Notes on Regression - Method of Moments

Another way of establishing the OLS formula is through the method of moments approach. This method supposedly goes way back to Pearson in 1894. It can be thought of as replacing a population moment with a sample analogue and using it to solve for the parameter of interest. Example 1: To find an estimator for the population mean, \(\mu=E[X]\), one replaces the expected value with a sample analogue, \(\hat{\mu}=\frac{1}{n}\sum_{i=1}^{n} X_{i} = \bar{X}\). [Read More]
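A minimal sketch of the same idea in Python, assuming made-up data and hypothetical helper names (`mom_mean`, `mom_slope_no_intercept`). The second function applies the same logic to a regression with a single regressor and no intercept (a simplification chosen here for brevity): the population moment condition \(E[x(y - x\beta)] = 0\) is replaced by its sample analogue and solved for \(\beta\).

```python
def mom_mean(xs):
    """Method of moments estimator of E[X]: the sample average."""
    return sum(xs) / len(xs)


def mom_slope_no_intercept(xs, ys):
    """Solve the sample analogue of E[x (y - x * b)] = 0 for b.

    (1/n) * sum(x_i * (y_i - x_i * b)) = 0  implies  b = sum(x_i * y_i) / sum(x_i ** 2).
    """
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / sum_xx


# Hypothetical data lying exactly on the line y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

mu_hat = mom_mean(xs)
beta_hat = mom_slope_no_intercept(xs, ys)

print(mu_hat, beta_hat)
```

With an intercept or several regressors the same substitution yields a system of equations rather than a single one, but the principle is unchanged.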

Notes on Regression - Projection

This is one of my favourite ways of establishing the traditional OLS formula. I remember being totally amazed when I first found out how to derive the OLS formula in a class on linear algebra. Understanding regression through the perspective of projections also shows the connection between the least squares method and linear algebra. It also gives a nice way of visualising the geometry of the OLS technique. This set of notes is largely inspired by a section in Gilbert Strang’s course on linear algebra. [Read More]

Notes on Regression - OLS

This post is the first in a series of my study notes on regression techniques. I first learnt about regression as a way of fitting a line through a series of points. Invoke some assumptions and one obtains the relationship between two variables. Simple…or so I thought. Through the course of my study, I developed a deeper appreciation of its nuances, which I hope to elucidate in this set of notes. [Read More]