Recent Posts

Replicating WEGO Plots using ggplot2

May 10, 2018

WEGO (Web Gene Ontology Annotation Plot) is a tool for visualizing, comparing, and plotting gene ontology (GO) annotation results. WEGO accepts various file formats, including GAF, XML, and TXT, making it compatible with BLAST2GO. It has an intuitive interface and produces satisfactory images like this:

Although most of WEGO’s features are neat, I found myself yearning for other, subtler features, such as being able to sort the terms by the number or percentage of genes, or to automatically select the top n terms in each domain. Knowing that the only way to get exactly what I wanted was to do the visualization myself, I exported the TSV file from WEGO and let my fingers do all the work. The script can be divided into three parts:

Read more

R: Some Table Munging Tricks

April 18, 2018

I’ve been working with huge tables lately (at least 50,000 rows or columns). Sometimes you think you know all the basic commands you need to string together to get through any trouble, until a seemingly easy problem comes along to break this notion into a million pieces. Here are some handy table munging tricks:

Read more

Phasing Technologies (10x Genomics Chromium and Dovetail Genomics Hi-C)

March 31, 2018

Last January, I had the privilege of attending the Plant and Animal Genome Conference XXVI in San Diego, California. My boss calls it the only essential agrigenomics conference anyone needs to attend, and understandably so, because the big names in genomics (people I had never imagined existing beyond journal article bylines) are consistent attendees. The conference was divided into workshop sessions that tackled updates and recent developments in specific fields of study. I particularly enjoyed the sessions dedicated to bioinformatics, in spite of my episodic inability to grasp the highly computational discussions.

Read more

Ubuntu 14.04 Login Loop and Missing Desktop Icons

February 25, 2018

After restarting my desktop the other day, I found Ubuntu 14.04 stuck in a login loop. It was not the first time this problem had reared its ugly head; luckily, I was able to fix the first instance easily with the following steps:

  1. Log in to tty1 by pressing Ctrl + Alt + F1
  2. Reinstall Ubuntu desktop (i.e. Unity): sudo apt-get install --reinstall ubuntu-desktop
  3. Voila, reboot: sudo reboot
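The two commands from steps 2 and 3 can be collected into a small script to run once logged in at tty1. This is only a sketch, not a tested fix for every login loop: the `run` helper, the `fix_login_loop` function name, and the `DRY_RUN` switch are hypothetical names introduced here for illustration. By default the script only prints what it would do; set `DRY_RUN=0` to actually execute the commands.

```shell
#!/bin/sh
# Sketch of the recovery steps above, meant to be run from a text console (tty1).
# DRY_RUN is a hypothetical safety switch introduced for this example:
# it defaults to 1, so the commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it as given.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

fix_login_loop() {
    # Step 2: reinstall the Ubuntu desktop metapackage (Unity on 14.04).
    # Step 3: reboot, but only if the reinstall succeeded.
    run sudo apt-get install --reinstall ubuntu-desktop &&
        run sudo reboot
}

fix_login_loop
```

Chaining the reboot with `&&` keeps the machine from restarting if the reinstall fails, which leaves the error message on screen for inspection.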

Unfortunately, the second and most recent occurrence of the login loop was much peskier, as the following attempts proved insufficient to solve the problem:

Read more

On Hadley Wickham, the Prolific R Developer

December 18, 2017

I spent all of November writing scripts to generate the figures for my thesis and finding ways to make my data appear lovely. While it sounds leisurely when phrased in that manner, it was actually a month full of scouring Stack Overflow and GitHub pages for answers, trial and error with infrequent successes, misdirected anger springing from frustration, and countless coffee breaks. The packages dplyr, ggplot2, reshape2, and ggtree became my weapons of choice. I am fortunate to have had the opportunity to work with these packages several months earlier in prior engagements. Somehow it seems like the world knew what side quests and minor bosses to give me in preparation for the final boss fight.

I hadn’t given much thought to the big names in R until our Data Analytics professor mentioned this apparently prolific R developer named Hadley Wickham. A quick Google search of his name left me agape. He is the guy who wrote all those R workhorse packages I am using! According to his personal website, his work can be divided into three categories: tools for data science (e.g. ggplot2, dplyr, stringr), tools for data import (e.g. readr, readxl, rvest), and software engineering tools (e.g. devtools, testthat). And so began my fascination with Wickham…

Read more