Posts by Category

AWK

Bioinformatics

Linux

Compiling bcl2fastq v2.20 on Ubuntu 18.04

1 minute read

Unlike the MiSeq, which automatically converts binary base call (BCL) files into FASTQ format using the MiSeq Reporter, output from the NextSeq, HiSeq, and No...

How to Run a Process as a Background Task

1 minute read

To my disappointment, much of my experience in wrangling DNA sequences involves short bacterial genomes. My workstation, which features a 4.0-GHz 8-core proc...

Machine Learning

Linear Regression and Gradient Descent

4 minute read

Some time ago, when I thought I didn’t have anything on my plate (a gross miscalculation as it turns out) during my post-MSc graduation lull, I applied for a fina...

NGS

R

Making Maps with ggplot2

4 minute read

I remember looking at Freedom House’s beautiful (but alarming) set of visualizations on the status of global democracy in 2018 with a burning curiosity abou...

Linear Regression and Gradient Descent

4 minute read

Some time ago, when I thought I didn’t have anything on my plate (a gross miscalculation as it turns out) during my post-MSc graduation lull, I applied for a fina...

Replicating WEGO Plots using ggplot2

3 minute read

WEGO (Web Gene Ontology Annotation Plot) is a tool for visualizing, comparing, and plotting gene ontology (GO) annotation results. WEGO accepts various file ...

R: Some Table Munging Tricks

2 minute read

I’ve been working with huge tables lately (at least 50,000 rows or columns). Sometimes you think you know all the basic commands you need to string together ...

On Hadley Wickham, the Prolific R Developer

2 minute read

I spent all of November writing scripts to generate the figures for my thesis and finding ways to make my data appear lovely. While it sounds leisurely w...

Warning and Error Handling with tryCatch()

1 minute read

A few weeks ago, I worked on an implementation of Fisher’s exact test in R. The script expects a data frame with rows representing the various cases/phenotyp...

Statistics

Linear Regression and Gradient Descent

4 minute read

Some time ago, when I thought I didn’t have anything on my plate (a gross miscalculation as it turns out) during my post-MSc graduation lull, I applied for a fina...

Standard Deviation and Variance

2 minute read

One of the tools that we discussed in our Data Analytics class last week was canonical correlation analysis (CCA). I won’t delve into CCA as I haven’t fully ...