Thesis Thursday 7 - Conclusion

Finally, the last installment of the Thesis Thursday series! Rather than going through what I have done since the previous post (basically more refinements and robustness checks), I decided to share some miscellaneous thoughts and lessons learnt over the past few months. The completed research paper and accompanying slides can be downloaded from my website.

### On R and Stata

I decided to code the entire project in R this time round and I have to say that I am quite won over by the capabilities of the various packages. [Read More]

Update on the SG Economic Dashboard

I have updated the SG-Dashboard with 2Q 2017 numbers. I also took the opportunity to add a few new tables and charts. There is a new table that keeps track of value-added (VA) revisions to last quarter’s results. VA for certain industries such as construction is approximated based on early indicators, and the actual numbers take a quarter or more to stream in. It is also interesting to see the actual economic performance and whether it matches up to the narrative of last quarter’s release. [Read More]

Thesis Thursday 5 - From recipes to weights

In the previous post, I provided an exploratory analysis of the allrecipe dataset. This post is a continuation and details the construction of product weights from the recipe corpus.

### TF-IDF

To obtain a measure of how unique a particular word is to a given recipe category, I calculate each word-region score using the TF-IDF approach, which is given by the following formula:

\[ TF\text{-}IDF_{t,d} = \frac{f_{t,d}}{\sum_{t'\in d} f_{t',d}} \cdot \log \frac{N}{n_{t}+1} \]

where \(f_{t,d}\) is the frequency with which a term, \(t\), appears in document \(d\), \(N\) is the total number of documents in the corpus and \(n_{t}\) is the total number of documents in which term \(t\) is found. [Read More]
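
To make the formula concrete, here is a minimal sketch in Python, written purely for illustration. It is not the code used in the thesis. The function name `tf_idf`, the toy corpus and the choice of the natural logarithm are all assumptions; the formula itself does not fix the base of the log.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Score every term in every document with the formula above.

    docs: dict mapping a document name to its list of tokens.
    Returns a dict of the form {document: {term: score}}.
    """
    n_docs = len(docs)  # N: total number of documents

    # n_t: number of documents in which term t appears at least once
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))

    scores = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)      # f_{t,d}
        total = sum(counts.values())  # sum of f_{t',d} over all terms in d
        scores[name] = {
            term: (freq / total) * math.log(n_docs / (doc_freq[term] + 1))
            for term, freq in counts.items()
        }
    return scores


# Hypothetical toy corpus: one "document" per recipe category.
recipes = {
    "italian": ["basil", "tomato", "garlic", "tomato"],
    "thai": ["lemongrass", "garlic", "chili"],
    "mexican": ["chili", "tomato", "cumin"],
    "indian": ["cumin", "garlic", "turmeric"],
}

scores = tf_idf(recipes)
for term, score in sorted(scores["italian"].items(), key=lambda kv: -kv[1]):
    print(f"{term}: {score:.4f}")
```

In this toy corpus, "basil" appears only in the Italian document and gets the highest Italian score, while "garlic" appears in three of the four documents and scores zero. Because of the +1 in the denominator, a term found in every document would receive a negative score.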

Thesis Thursday 4 - Analysing Recipes

One of the main components of my thesis is a mapping from consumers’ purchases to country-related expenditure shares. This requires a method to associate each available product with a particular country. I briefly discussed the issue in the introductory post, but I have since made significant progress on this front that I think is worth sharing.

### The recipe dataset

This recipe dataset was created by scraping recipes that are tagged to a particular region or country. [Read More]

Binscatter for R

I was trying to find an R package that provides features similar to Stata’s user-written binscatter program, but there do not appear to be any good substitutes around. Hence, I decided to write a function that replicates it in R. It turns out that this took longer than I thought and there are still many bugs to fix, but the development version is worth sharing. It can be downloaded from my GitHub page. [Read More]

Scraping SG's GDP data using SingStat's API

I have been trying to catch up on the latest release of Singapore’s economic results. Unfortunately, the official press release and media reports are not very useful. They contain either too much irrelevant information or not enough detail for my liking. Maybe I just like looking at the numbers and letting the figures speak for themselves. Hence, I decided to obtain the data from SingStat’s official Table Builder website. [Read More]