Thesis Thursday 7 - Conclusion

Finally, the last installment of the Thesis Thursday series! Rather than going through what I have done since the previous post (basically more refinements and robustness checks), I decided to share some miscellaneous thoughts and lessons learnt over the past few months. The completed research paper and accompanying slides can be downloaded from my website.

### On R and Stata

I decided to code the entire project in R this time round and I have to say that I am quite won over by the capabilities of the various packages. [Read More]

Thesis Thursday 6 - The Final Stretch

Since my presentation about two weeks ago, I have been working on incorporating some of the suggestions and performing additional robustness tests. The updated version of the slides can be found here. A recap of the key results: I find a positive correlation between the foreign-born and consumption shares within U.S. counties, but this result does not hold across Asian countries. In fact, an increase in foreign-born share led to a decline in consumption of Asian-related consumer packaged goods. [Read More]

Thesis Thursday 5 - From recipes to weights

In the previous post, I provided an exploratory analysis of the allrecipe dataset. This post is a continuation and details the construction of product weights from the recipe corpus.

TF-IDF

To obtain a measure of how unique a particular word is to a given recipe category, I calculate each word-region score using the TF-IDF approach, which is given by the following formula:

\[ \text{TF-IDF}_{t,d} = \frac{f_{t,d}}{\sum_{t'\in d} f_{t',d}} \cdot \log \frac{N}{n_{t}+1} \]

where \(f_{t,d}\) is the frequency with which a term, \(t\), appears in document \(d\), \(N\) is the total number of documents in the corpus and \(n_{t}\) is the number of documents in which term \(t\) is found. [Read More]
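The TF-IDF formula in this excerpt can be sketched in a few lines of Python. This is only an illustration of the formula as written, not the code used in the thesis (which was written in R). The function name `tf_idf`, the toy region names and the token lists are all made up for the example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Score every term in every document.

    docs: dict mapping a document name (here, a hypothetical recipe
    region) to its list of tokens. Returns {doc: {term: score}}.
    """
    n_docs = len(docs)  # N in the formula

    # n_t: number of documents that contain term t at least once
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))

    scores = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)       # f_{t,d}
        total = sum(counts.values())   # sum of f_{t',d} over terms in d
        scores[name] = {
            term: (count / total) * math.log(n_docs / (doc_freq[term] + 1))
            for term, count in counts.items()
        }
    return scores


# Toy corpus (invented data, for illustration only)
docs = {
    "italian": ["basil", "tomato", "tomato"],
    "thai": ["basil", "lime"],
    "mexican": ["lime", "chili"],
    "indian": ["cumin"],
}
scores = tf_idf(docs)
```

Because of the `+ 1` in the denominator, a term that appears in every document gets a slightly negative score, since \(\log \frac{N}{N+1} < 0\), while a term concentrated in one document (like `tomato` above) scores highest there.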

Thesis Thursday 4 - Analysing Recipes

One of the main components of my thesis is a mapping from consumers’ purchases to country-related expenditure shares. This requires a method to associate each available product with a particular country. I briefly discussed the issue in the introductory post, but I have since made significant progress on this front that I think is worth sharing.

The recipe dataset

This recipe dataset was created by scraping recipes that are tagged to a particular region or country. [Read More]

Thesis Thursday 3 - Model and Methodology

This week’s Thesis Thursday will be a two-part special, partly to make up for last week’s missing post. I also want to take the opportunity to document some thoughts and progress. This post is arguably the more technical of the two and focuses on issues relating to the model and methodology of my paper. In the subsequent post I plan to document some analysis of the recipe dataset, which I think is very interesting (maybe even more so than the actual thesis itself). [Read More]

Thesis Thursday #2

Over the past week I made a few detours and explored other options that yielded little. On the positive side, I managed to merge and clean most of the datasets and started generating some descriptive statistics to get a better understanding of the data.

Migration to the U.S.

Let us take a look at U.S. migration patterns from 1970 to 2012. I use the share of foreign-born residents, as surveyed in the decennial census and the American Community Survey, as a proxy for migration trends. [Read More]

Thesis Thursday - Introduction

I decided to document my progress on my master's thesis as a weekly Thursday special. Hopefully I will have enough material or progress to keep up the weekly posts, and this should also give me some motivation to work on it. I also find the process of writing useful for thinking through some ideas more carefully. This week I will be introducing the idea behind my research project, the data I will be using and some issues I am trying to solve. [Read More]