Benchmark of popular graph/network packages

In this post I benchmark the performance of 5 popular graph/network packages. This was inspired by two questions I had: Recently, I have been working with large networks (millions of vertices and edges) and often wonder what is the best currently available package/tool that would scale well and handle large scale network analysis tasks. Having tried out a few (networkx in Python and igraph in R) but on different problems, I thought it would be nice to have a head to head comparison. [Read More]

Binance hackathon - 2nd place solution

Technical overview of the solution and my hackathon experience

It has been about a month since my team and I placed 2nd in a hackathon organised by Binance. Since it was my first time officially doing front-end development, I thought it would be fun to blog about my experience in the hackathon and document the technical solution which I coded up in react.js. A massive congratulations to the three winning teams of the #Binance #SAFU Hackathon who shared a prize of \(100,000 USD worth of <a href="https://twitter. [Read More]

Cleaning openstreetmap intersections in python

Introduction It has been a while since I have posted anything on Python, so I thought it is time to switch things up and write do a Python GIS tutorial. GIS in python typically revolves around the geopandas and shapely packages. If you are using OpenStreetMaps(osm) in your work, the osmnx package is also very useful and makes downloading and visualising map data straightforward. In this post, I explore the problem of simplifying route intersections. [Read More]

Visualising Networks in ASOIAF - Part II

This is the second post of a character network analysis of George R. R. Martin’s A Song Of Ice and Fire (ASOIAF) series as well as my first submission to the R Bloggers community. A warm welcome to all readers out there! In my first post, I touched on the Tidygraph package to manipulate dataframes and ggraph for network visualisation as well as some tricks to fix the position of nodes when ploting multiple graphs containing the same node set and labeling based on polar coordinates. [Read More]

Visualising Networks in ASOIAF

While waiting for the winds of winter to arrive, there is plenty of time to revisit the 5 books. One of my favourite aspects of the series is the character and world building. As the song of ice and fire universe is so big, many characters are mentioned in passing while the major characters meet each other only occasionally. I thought it would be interesting to see how various characters are connected and how that progresses through the series. [Read More]

Applications of DAGs in Causal Inference

Introduction Two years ago I came across Pearl’s work on using directed cyclical graphs (DAGs) to model the problem of causal inference and have read the debate between academics on Pearl’s framework vs Rubin’s potential outcomes framework. Then I found it quite intriguing from a scientific methods and history perspective how two different formal frameworks could be developed to solve a common goal. I read a few papers on the DAG approach but without fully understanding how it could be useful to my work filed it away in the back of my mind (and computer folder). [Read More]

Notes on Regression - Approximation of the Conditional Expectation Function

The final installment in my ‘Notes on Regression’ series! For a review on ways to derive the Ordinary Least Square formula as well as various algebraic and geometric interpretations, check out the previous 5 posts: Part 1 - OLS by way of minimising the sum of square errors Part 2 - Projection and Orthogonality Part 3 - Method of Moments Part 4 - Maximum Likelihood Part 5 - Singular Vector Decomposition [Read More]

Notes on Graphs and Spectral Properties

Here is the first series of a collection of notes which I jotted down over the past 2 months as I tried to make sense of algebraic graph theory. This one focuses on the basic definitions and some properties of matrices related to graphs. Having all the symbols and main properties in a single page is a useful reference as I delve deeper into the applications of the theories. Also, it saves me time from googling and checking the relationship between these objects. [Read More]

Choosing a Control Group in a RCT with Multiple Treatment Periods

Came across a fun little problem over the past few weeks that is related to the topic of policy impact evaluation - a long time interest of mine! Here’s the setting: we have a large population of individuals and a number of treatments that we want to gauge the effectiveness of. The treatments are not necessarily the same but are targeted towards certain sub-segments in the population. Examples of such situations include online ad targeting or marketing campaigns. [Read More]

Notes on Regression - Singular Vector Decomposition

Here’s a fun take on the OLS that I picked up from The Elements of Statistical Learning. It applies the Singular Value Decomposition, also known as the method used in principal component analysis, to the regression framework. Singular Vector Decomposition (SVD) First, a little background on the SVD. The SVD could be thought of as a generalisation of the eigendecomposition. An eigenvector v of matrix \(\mathbf{A}\) is a vector that is mapped to a scaled version of itself: \[ \mathbf{A}v = \lambda v \] where \(\lambda\) is known as the eigenvalue. [Read More]