notes on Quasilinear Musings
https://www.timlrx.com/tags/notes/
Recent content in notes on Quasilinear Musings
Hugo -- gohugo.io
en-us
timothy.lin@alumni.ubc.ca (Timothy Lin)
Sat, 01 Aug 2020 00:00:00 +0000

Schelling's Segregation Model in Julia - Part 1
https://www.timlrx.com/2020/08/01/schelling-segregation-model-in-julia-part-1/
Sat, 01 Aug 2020 00:00:00 +0000 | timothy.lin@alumni.ubc.ca (Timothy Lin)
This article is the first of a series that introduces the Julia programming language by replicating Schelling, Thomas C. “Dynamic models of segregation.” Journal of Mathematical Sociology 1.2 (1971): 143-186.
Follow along by cloning the git repository over here: https://github.com/timlrx/learning-julia
Why Julia?
From the creators themselves:
We are greedy: we want more.
We want a language that’s open source, with a liberal license. We want the speed of C with the dynamism of Ruby.

Benchmark of popular graph/network packages v2
https://www.timlrx.com/2020/05/10/benchmark-of-popular-graph-network-packages-v2/
Sun, 10 May 2020 00:00:00 +0000 | timothy.lin@alumni.ubc.ca (Timothy Lin)
This is an update of a benchmark of popular graph / network packages post. This study aims to serve as a starting point for anyone interested in applied graph or network analysis. The featured network packages offer a convenient and standardised API for modelling data as graphs and extracting network-related insights. Some common use cases include finding the shortest path between entities or calculating a measure of centrality such as the PageRank score.

Efficient Large Graph Propagation Algorithm
https://www.timlrx.com/2020/03/29/efficient-large-graph-propagation-algorithm/
Sun, 29 Mar 2020 00:00:00 +0000 | timothy.lin@alumni.ubc.ca (Timothy Lin)
Cross-posting from my company’s blog: in case you have not checked it out, I wrote an interesting technical piece on how we engineered a large-scale label propagation algorithm.
The accompanying slides can be found here.

Serverless Machine Learning with R on Cloud Run
https://www.timlrx.com/2020/01/22/serverless-machine-learning-with-r-on-cloud-run/
Wed, 22 Jan 2020 00:00:00 +0000 | timothy.lin@alumni.ubc.ca (Timothy Lin)
One of the main challenges that every data scientist faces is model deployment. Unless you are one of the lucky few who have loads of data engineers to help you deploy a model, it’s really an issue in enterprise projects. I am not even implying that the model needs to be production ready, but even the seemingly basic task of making the model and insights accessible to business users is more of a hassle than it needs to be.

Speeding up R Plotly web apps - R x Javascript part I
https://www.timlrx.com/2019/12/17/speeding-up-r-plotly-webapps-r-x-javascript-part-i/
Tue, 17 Dec 2019 00:00:00 +0000 | timothy.lin@alumni.ubc.ca (Timothy Lin)
Back to blogging! Sorry for the long hiatus; I had some personal projects which kept me really occupied over the past few months. I hope to share about them one of these days and potentially even explore open sourcing parts of them, but the idea of this post is to transfer some of my learnings over the past year to an issue in R that has always irritated me: slow-loading web apps.

Benchmark of popular graph/network packages
https://www.timlrx.com/2019/05/05/benchmark-of-popular-graph-network-packages/
Sun, 05 May 2019 00:00:00 +0000 | timothy.lin@alumni.ubc.ca (Timothy Lin)
This post is superseded by an updated benchmark.
In this post I benchmark the performance of 5 popular graph/network packages. This was inspired by two questions I had:
Recently, I have been working with large networks (millions of vertices and edges) and often wonder which currently available package/tool scales best for large-scale network analysis tasks. Having tried out a few (networkx in Python and igraph in R) but on different problems, I thought it would be nice to have a head-to-head comparison.

Binance hackathon - 2nd place solution
https://www.timlrx.com/2019/02/11/binance-hackathon-2nd-place/
Mon, 11 Feb 2019 00:00:00 +0000 | timothy.lin@alumni.ubc.ca (Timothy Lin)
It has been about a month since my team and I placed 2nd in a hackathon organised by Binance. Since it was my first time officially doing front-end development, I thought it would be fun to blog about my experience in the hackathon and document the technical solution which I coded up in React.js.
A massive congratulations to the three winning teams of the #Binance #SAFU Hackathon who shared a prize of $100,000 USD worth of…

Cleaning openstreetmap intersections in python
https://www.timlrx.com/2019/01/05/cleaning-openstreetmap-intersections-in-python/
Sat, 05 Jan 2019 00:00:00 +0000 | timothy.lin@alumni.ubc.ca (Timothy Lin)
Introduction
It has been a while since I have posted anything on Python, so I thought it was time to switch things up and write a Python GIS tutorial. GIS in Python typically revolves around the geopandas and shapely packages. If you are using OpenStreetMap (OSM) in your work, the osmnx package is also very useful and makes downloading and visualising map data straightforward.
In this post, I explore the problem of simplifying route intersections.

Visualising Networks in ASOIAF - Part II
https://www.timlrx.com/2018/10/14/visualising-networks-in-asoiaf-part-ii/
Sun, 14 Oct 2018 00:00:00 +0000 | timothy.lin@alumni.ubc.ca (Timothy Lin)
This is the second post of a character network analysis of George R. R. Martin’s A Song of Ice and Fire (ASOIAF) series as well as my first submission to the R Bloggers community. A warm welcome to all readers out there! In my first post, I touched on the tidygraph package to manipulate dataframes and ggraph for network visualisation, as well as some tricks to fix the position of nodes when plotting multiple graphs containing the same node set and to label nodes based on polar coordinates.

Visualising Networks in ASOIAF
https://www.timlrx.com/2018/09/09/visualising-networks-in-asoiaf/
Sun, 09 Sep 2018 00:00:00 +0000 | timothy.lin@alumni.ubc.ca (Timothy Lin)
While waiting for The Winds of Winter to arrive, there is plenty of time to revisit the 5 books. One of my favourite aspects of the series is the character and world building. As the universe of A Song of Ice and Fire is so big, many characters are mentioned only in passing while the major characters meet each other only occasionally. I thought it would be interesting to see how various characters are connected and how that progresses through the series.

Applications of DAGs in Causal Inference
https://www.timlrx.com/2018/08/09/applications-of-dags-in-causal-inference/
Thu, 09 Aug 2018 00:00:00 +0000 | timothy.lin@alumni.ubc.ca (Timothy Lin)
Introduction
Two years ago I came across Pearl’s work on using directed acyclic graphs (DAGs) to model the problem of causal inference and read the debate between academics on Pearl’s framework vs Rubin’s potential outcomes framework. At the time, I found it quite intriguing from a scientific methods and history perspective how two different formal frameworks could be developed to solve a common problem. I read a few papers on the DAG approach but, without fully understanding how it could be useful to my work, filed it away in the back of my mind (and computer folder).

Notes on Regression - Approximation of the Conditional Expectation Function
https://www.timlrx.com/2018/02/26/notes-on-regression-approximation-of-the-conditional-expectation-function/
Mon, 26 Feb 2018 00:00:00 +0000 | timothy.lin@alumni.ubc.ca (Timothy Lin)
The final installment in my ‘Notes on Regression’ series! For a review of ways to derive the Ordinary Least Squares formula as well as various algebraic and geometric interpretations, check out the previous 5 posts:
Part 1 - OLS by way of minimising the sum of square errors
Part 2 - Projection and Orthogonality
Part 3 - Method of Moments
Part 4 - Maximum Likelihood

Notes on Graphs and Spectral Properties
https://www.timlrx.com/2017/12/25/notes-on-graphs-and-spectral-properties/
Mon, 25 Dec 2017 00:00:00 +0000 | timothy.lin@alumni.ubc.ca (Timothy Lin)
Here is the first in a series of notes which I jotted down over the past 2 months as I tried to make sense of algebraic graph theory. This one focuses on the basic definitions and some properties of matrices related to graphs. Having all the symbols and main properties on a single page is a useful reference as I delve deeper into the applications of the theory. It also saves me the time of googling and checking the relationships between these objects.

Choosing a Control Group in a RCT with Multiple Treatment Periods
https://www.timlrx.com/2017/11/18/choosing-a-control-group-in-a-rct-with-multiple-treatment-periods/
Sat, 18 Nov 2017 00:00:00 +0000 | timothy.lin@alumni.ubc.ca (Timothy Lin)
Came across a fun little problem over the past few weeks that is related to the topic of policy impact evaluation - a long-time interest of mine! Here’s the setting: we have a large population of individuals and a number of treatments whose effectiveness we want to gauge. The treatments are not necessarily the same but are targeted towards certain sub-segments of the population. Examples of such situations include online ad targeting or marketing campaigns.

Notes on Regression - Singular Vector Decomposition
https://www.timlrx.com/2017/10/21/notes-on-regression-singular-vector-decomposition/
Sat, 21 Oct 2017 00:00:00 +0000 | timothy.lin@alumni.ubc.ca (Timothy Lin)
Here’s a fun take on OLS that I picked up from The Elements of Statistical Learning. It applies the Singular Value Decomposition, the matrix factorisation also used in principal component analysis, to the regression framework.
Singular Value Decomposition (SVD)
First, a little background on the SVD. The SVD could be thought of as a generalisation of the eigendecomposition. An eigenvector \(v\) of a matrix \(\mathbf{A}\) is a nonzero vector that is mapped to a scaled version of itself: \[ \mathbf{A}v = \lambda v \] where \(\lambda\) is known as the eigenvalue.

Comparing the Population and Group Level Regression
https://www.timlrx.com/2017/10/01/comparing-the-population-and-group-level-regression/
Sun, 01 Oct 2017 00:00:00 +0000 | timothy.lin@alumni.ubc.ca (Timothy Lin)
I was planning to write a post that uses region-level data to infer the underlying relationship at the population level. However, after thinking through the issue over the past few days and working out the math (below), I realised that the question I wanted to answer could not be solved using the aggregate data at hand. Nonetheless, here is a formal description of the problem outlining the assumptions needed to infer population-level trends from more aggregated data.

Notes on Regression - Maximum Likelihood
https://www.timlrx.com/2017/09/21/notes-on-regression-maximum-likelihood/
Thu, 21 Sep 2017 00:00:00 +0000 | timothy.lin@alumni.ubc.ca (Timothy Lin)
Part 4 in the series of notes on regression analysis derives the OLS formula through the maximum likelihood approach. Maximum likelihood involves assuming a particular distribution for the data and finding the parameter values that maximise the probability of the observed data under that distribution.
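To make the link to OLS concrete, here is a short sketch under an assumption that is mine rather than quoted from the post: a linear model \(y_i = x_i'\beta + \varepsilon_i\) with independent errors \(\varepsilon_i \sim N(0, \sigma^2)\). The symbols \(y_i\), \(x_i\), \(\beta\) and \(\sigma^2\) are introduced here for illustration only. The log-likelihood of \(n\) observations is
\[ \ell(\beta, \sigma^2) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - x_i'\beta)^2 \]
Only the last term depends on \(\beta\), so maximising \(\ell\) over \(\beta\) is the same as minimising the sum of squared residuals, which gives \(\hat{\beta} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'y\) whenever \(\mathbf{X}'\mathbf{X}\) is invertible.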
Bernoulli example
Take for example a dataset consisting of results from a series of coin flips. The coin may be biased and we want to find an estimator for the probability of the coin landing heads.

Using Leaflet in R - Tutorial
https://www.timlrx.com/2017/09/13/using-leaflet-in-r-tutorial/
Wed, 13 Sep 2017 00:00:00 +0000 | timothy.lin@alumni.ubc.ca (Timothy Lin)
Here’s a tutorial on using Leaflet in R. While the leaflet package supports many options, the documentation is not the clearest and I had to do a bit of googling to customise the plot to my liking. This walkthrough documents the key features of the package which I find useful in generating choropleth overlays. Compared to the simple tmap approach documented in the previous post, creating a visualisation using leaflet gives more control over the final outcome.

Notes on Regression - Method of Moments
https://www.timlrx.com/2017/08/31/notes-on-regression-method-of-moments/
Thu, 31 Aug 2017 00:00:00 +0000 | timothy.lin@alumni.ubc.ca (Timothy Lin)
Another way of establishing the OLS formula is through the method of moments approach. This method supposedly goes way back to Pearson in 1894. It could be thought of as replacing a population moment with a sample analogue and using it to solve for the parameter of interest.
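As a rough illustration of that recipe for a one-regressor model, the sketch below assumes the two population moment conditions \(E[y - a - bx] = 0\) and \(E[x(y - a - bx)] = 0\), swaps each expectation for a sample average, and solves for the intercept and slope. The function name `ols_by_moments` and the toy data are hypothetical and not taken from the post.

```python
from statistics import fmean


def ols_by_moments(x, y):
    """Solve the sample moment conditions for (intercept, slope).

    Assumed population conditions: E[y - a - b*x] = 0 and
    E[x * (y - a - b*x)] = 0. Replacing E[.] with sample means
    gives slope = s_xy / s_xx and intercept = y_bar - slope * x_bar.
    """
    x_bar, y_bar = fmean(x), fmean(y)
    # Sample analogues of Cov(x, y) and Var(x)
    s_xy = fmean((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = fmean((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Toy data generated exactly from y = 1 + 2x (illustrative only)
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
intercept, slope = ols_by_moments(x, y)
print(intercept, slope)
```

Because the toy data lie exactly on a line, the recovered coefficients match the ones used to generate it; with noisy data the same two equations yield the usual least squares fit.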
Example 1
To find an estimator for the population mean, \(\mu=E[X]\), one replaces the expected value with its sample analogue, \(\hat{\mu}=\frac{1}{n}\sum_{i=1}^{n} X_{i} = \bar{X}\).

Notes on Regression - Projection
https://www.timlrx.com/2017/08/23/notes-on-regression-projection/
Wed, 23 Aug 2017 00:00:00 +0000 | timothy.lin@alumni.ubc.ca (Timothy Lin)
This is one of my favourite ways of establishing the traditional OLS formula. I remember being totally amazed when I first found out how to derive the OLS formula in a class on linear algebra. Understanding regression through the perspective of projections also shows the connection between the least squares method and linear algebra. It also gives a nice way of visualising the geometry of the OLS technique.
This set of notes is largely inspired by a section in Gilbert Strang’s course on linear algebra.

Notes on Regression - OLS
https://www.timlrx.com/2017/08/16/notes-on-regression-ols/
Wed, 16 Aug 2017 00:00:00 +0000 | timothy.lin@alumni.ubc.ca (Timothy Lin)
This post is the first in a series of my study notes on regression techniques. I first learnt about regression as a way of fitting a line through a series of points. Invoke some assumptions and one obtains the relationship between two variables. Simple…or so I thought. Through the course of my study, I developed a deeper appreciation of its nuances, which I hope to elucidate in this set of notes.