When starting a new project, designing new methods for data analysis, or even preparing a new publication, data visualization is essential both to explore new hypotheses and to validate our findings.
With this crash talk I aim to spread the word on the existence of a series of packages that have made my life much easier when it comes to exploratory data analysis.
In case you are reading this from this project's repository, check out the webpage for a nicer experience.
Please make sure you have R and RStudio installed, along with the following packages:
- Data Wrangling
  - here: to automate locating your root directory.
  - tidyverse: to read, wrangle and write tables.
- Biology-specific
  - clusterProfiler: to perform gene set enrichment analyses.
  - seqinr: to read and write files with biological sequences.
  - ape: to compute distances between aligned sequences.
  - Biostrings: to wrangle biological sequences.
- Visualization
  - ggplot2: powerful suite for general data visualization in R.
  - ggpubr: ggplot metapackage that wraps many functionalities together.
  - ggtree: to visualize multiple sequence alignments as trees.
  - ComplexHeatmap: to make heatmaps.
  - gridExtra: to arrange multiple plots not created through ggplot.
  - ggplotify: converts any plot into a ggplot.
  - showtext: edit fonts more easily in R graphs.
  - countrycode: get country code names.
Here's the code snippet in case you need to install some of them.
# install packages from CRAN
# wrangling
install.packages("here", dependencies = TRUE)
install.packages("tidyverse", dependencies = TRUE)
install.packages("seqinr", dependencies = TRUE)
install.packages("ape", dependencies = TRUE)
# visualization
install.packages("ggplot2", dependencies = TRUE)
install.packages("ggpubr", dependencies = TRUE) # requires libcurl4 and libnlopt-dev on Ubuntu
install.packages("gridExtra", dependencies = TRUE)
install.packages("ggplotify", dependencies = TRUE)
# extra (optional)
install.packages("showtext", dependencies = TRUE)
install.packages("countrycode", dependencies = TRUE)
# install packages from Bioconductor
if (!requireNamespace("BiocManager", quietly = TRUE)){ install.packages("BiocManager") }
BiocManager::install("clusterProfiler")
BiocManager::install("ggtree")
BiocManager::install("Biostrings")
BiocManager::install("ComplexHeatmap")
BiocManager::install("org.Hs.eg.db") # human gene annotation database
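To see which of the packages listed above are still missing before running the installation commands, a small check along these lines may help. This is only a sketch: `missing_packages` is a hypothetical helper name introduced here, not something from the repository, and the package vectors simply repeat the names used above.

```r
# Packages from the list above, grouped by where they are installed from.
cran_pkgs <- c("here", "tidyverse", "seqinr", "ape", "ggplot2", "ggpubr",
               "gridExtra", "ggplotify", "showtext", "countrycode")
bioc_pkgs <- c("clusterProfiler", "ggtree", "Biostrings", "ComplexHeatmap",
               "org.Hs.eg.db")

# Hypothetical helper: return the names of the packages that cannot be loaded.
missing_packages <- function(pkgs) {
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!is_installed]
}

missing_cran <- missing_packages(cran_pkgs)
missing_bioc <- missing_packages(bioc_pkgs)

# Report what still needs installing, if anything.
if (length(missing_cran) > 0) {
  message("Missing CRAN packages: ", paste(missing_cran, collapse = ", "))
}
if (length(missing_bioc) > 0) {
  message("Missing Bioconductor packages: ", paste(missing_bioc, collapse = ", "))
}
if (length(c(missing_cran, missing_bioc)) == 0) {
  message("All packages are installed.")
}
```

The resulting vectors can be passed straight to `install.packages(missing_cran, dependencies = TRUE)` and `BiocManager::install(missing_bioc)` when they are not empty.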
Then, you can also download the notebooks for the talk with:
git clone https://github.com/MiqG/practical_tools_for_quick_data_visualization.git
or just by clicking here.
In case you'd like to re-create all the outputs from the repository, you will need a GitHub account and to install jupyter-book and ghp-import.
Then, you can run
bash run_all.sh
Enjoy!