Using Leaflet in R - Tutorial

Here’s a tutorial on using Leaflet in R. While the leaflet package supports many options, the documentation is not the clearest and I had to do a bit of googling to customise the plot to my liking. This walkthrough documents the key features of the package which I find useful in generating choropleth overlays. Compared to the simple tmap approach documented in the previous post, creating a visualisation using leaflet gives more control over the final outcome. [Read More]

Examining the Changes in Religious Beliefs - Part 2

In a previous post, I took a look at the distribution of religious beliefs in Singapore. Having compiled additional characteristics across 3 time periods (2000, 2010, 2015), I decided to write a follow-up post to examine the changes across time. The dataset that I will be using is aggregated from the 2000 and 2010 Census as well as the 2015 General Household Survey. [Read More]

Notes on Regression - Method of Moments

Another way of establishing the OLS formula is through the method of moments approach. This method supposedly goes way back to Pearson in 1894. It could be thought of as replacing a population moment with a sample analogue and using it to solve for the parameter of interest. Example 1: To find an estimator for the population mean, \(\mu=E[X]\), one replaces the expected value with a sample analogue, \(\hat{\mu}=\frac{1}{n}\sum_{i=1}^{n} X_{i} = \bar{X}\) [Read More]
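
As a sketch of how the same idea can lead to the OLS formula (the notation here is an assumption, since the excerpt does not define it): take the linear model \(y_i = x_i'\beta + \varepsilon_i\) with the population moment condition \(E[x_i \varepsilon_i] = 0\). Replacing the expectation with its sample analogue and solving for the parameter gives

\[ \frac{1}{n}\sum_{i=1}^{n} x_i\,(y_i - x_i'\hat{\beta}) = 0 \quad\Longrightarrow\quad \hat{\beta} = \Big(\sum_{i=1}^{n} x_i x_i'\Big)^{-1}\sum_{i=1}^{n} x_i y_i = (X'X)^{-1}X'y, \]

provided \(X'X\) is invertible.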

Mapping the Distribution of Religious Beliefs in Singapore

Inspired by my thesis, I have been playing around with mapping tools over the past few days. While the maps showing the distribution of migrant groups across the United States did not make it to the final copy of my paper, I had fun toying around with the various mapping packages. In this post, I decided to apply what I have learnt and take a look at the spatial distribution of Singapore’s population. [Read More]

Notes on Regression - Projection

This is one of my favourite ways of establishing the traditional OLS formula. I remember being totally amazed when I first found out how to derive the OLS formula in a class on linear algebra. Understanding regression through the perspective of projections also shows the connection between the least squares method and linear algebra. It also gives a nice way of visualising the geometry of the OLS technique. This set of notes is largely inspired by a section in Gilbert Strang’s course on linear algebra. [Read More]

Thesis Thursday 7 - Conclusion

Finally, the last installment of the Thesis Thursday series! Rather than going through what I have done since the previous post (basically more refinements and robustness checks), I decided to share some miscellaneous thoughts and lessons learnt over the past few months. The completed research paper and accompanying slides can be downloaded from my website. On R and Stata: I decided to code the entire project in R this time round and I have to say that I am quite won over by the capabilities of the various packages. [Read More]

Notes on Regression - OLS

This post is the first in a series of my study notes on regression techniques. I first learnt about regression as a way of fitting a line through a series of points. Invoke some assumptions and one obtains the relationship between two variables. Simple…or so I thought. Through the course of my study, I developed a deeper appreciation of its nuances which I hope to elucidate in this set of notes. [Read More]

Update on the SG Economic Dashboard

I have updated the SG-Dashboard with 2Q 2017 numbers. I also took the opportunity to add in a few new tables and charts. There is a new table that keeps track of value-added (VA) revisions of last quarter’s result. VA for certain industries such as construction is approximated based on early indicators and the actual numbers take a quarter or more to stream in. It is also interesting to see the actual economic performance and whether it matches up to the narrative of last quarter’s release. [Read More]

Thesis Thursday 6 - The Final Stretch

Since my presentation about two weeks ago, I have been working on incorporating some of the suggestions and performing additional robustness tests. The updated version of the slides can be found here. A recap of the key results: I find a positive correlation between the foreign-born and consumption shares within U.S. counties but this result does not hold across Asian countries. In fact, an increase in foreign-born share led to a decline in consumption of Asian-related consumer packaged goods. [Read More]

Thesis Thursday 5 - From recipes to weights

In the previous post, I provided an exploratory analysis of the allrecipe dataset. This post is a continuation and details the construction of product weights from the recipe corpus. TF-IDF: To obtain a measure of how unique a particular word is to a given recipe category, I calculate each word-region score using the TF-IDF approach which is given by the following formula: \[ TF\text{-}IDF_{t,d} =\frac{f_{t,d}}{\sum_{t'\in d}f_{t',d}} \cdot \log \frac{N}{n_{t}+1} \] where \(f_{t,d}\) is the frequency with which a term, \(t\), appears in document \(d\), \(N\) is the total number of documents in the corpus and \(n_{t}\) is the total number of documents where term \(t\) is found. [Read More]
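
To make the formula concrete, here is a minimal Python sketch of the same TF-IDF score. It is only an illustration under stated assumptions: the function name `tf_idf`, the toy `recipes` corpus and the choice of one document per recipe category are all hypothetical and are not taken from the post or its dataset.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Score every term in every document.

    docs: dict mapping a document name to its list of terms.
    Returns {document name: {term: score}} using
    score = (f_td / sum of all term counts in d) * log(N / (n_t + 1)).
    """
    n_docs = len(docs)  # N: number of documents in the corpus

    # n_t: number of documents in which term t appears at least once
    doc_freq = Counter()
    for terms in docs.values():
        doc_freq.update(set(terms))

    scores = {}
    for name, terms in docs.items():
        counts = Counter(terms)       # f_td for each term in this document
        total = sum(counts.values())  # denominator of the term frequency
        if total == 0:                # an empty document has no scores
            scores[name] = {}
            continue
        scores[name] = {
            term: (count / total) * math.log(n_docs / (doc_freq[term] + 1))
            for term, count in counts.items()
        }
    return scores


# Hypothetical toy corpus: one "document" per recipe category.
recipes = {
    "italian": ["basil", "tomato", "garlic", "tomato"],
    "thai": ["basil", "chili", "garlic", "lime"],
    "mexican": ["chili", "lime", "tomato", "cumin"],
    "indian": ["cumin", "garlic", "chili", "turmeric"],
}

scores = tf_idf(recipes)
for category, term_scores in scores.items():
    best = max(term_scores, key=term_scores.get)
    print(category, best, round(term_scores[best], 4))
```

Note that with the \(+1\) in the denominator, a term found in \(N-1\) documents scores exactly zero, and a term found in every document gets a negative score.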