Skip to content

snowformatics/gwastic_desktop

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

About GWAStic

GWAStic is a software for Genome-Wide Association Study (GWAS) that combines traditional statistical methods with the power of Artificial Intelligence (AI) for comprehensive genetic analysis. It supports genomic prediction and offers diverse statistical outputs for phenotypic and genotypic data, making it an essential tool for researchers aiming to uncover complex genetic interactions and predict phenotypic outcomes efficiently.

Installation

ALT TEXT

ALT TEXT

ALT TEXT

Table of Contents

Key Features:

  • Cross Platform

  • Comprehensive Genetic Analysis: GWAStic offers a wide range of methods to analyze your genomic data, allowing you to explore the associations between genetic variants and traits of interest comprehensively.

  • AI-Enhanced Data Analysis: Harness the capabilities of machine learning and AI to uncover subtle patterns, interactions, and associations that may be missed by conventional statistical methods.

  • Genomic Prediction: Take your research to the next level by using GWAStic's advanced AI models for genomic prediction. Predict future health outcomes, disease risks, or phenotypic traits based on your genetic data and environmental factors.

  • User-Friendly Interface: GWAStic's intuitive interface makes it accessible to both novice and experienced researchers. Seamlessly navigate through your data, perform analyses, and visualize results with ease.

  • Customizable Workflows: Tailor your analysis to your specific research goals with customizable workflows. Define your parameters, select the appropriate statistical models, and integrate AI components as needed for a personalized analysis experience.

  • Collaborative Research: Collaborate seamlessly with colleagues and share your findings securely within the platform.

  • Frequent Updates: Stay at the forefront of genetic research with regular software updates. GWAStic incorporates the latest advancements in GWAS and AI methodologies to keep your analyses up-to-date.

myfile

1. Installation

GWAStic software was build and successfully tested on Windows operating system (Windows 7 and 10).

Windows OS:

Tip

We recommend to install Anaconda and for managing dependencies, it is often recommended to create a new environment for your project:

Install Anaconda from https://www.anaconda.com/distribution/

Open the Anaconda Prompt

conda create --name gwastic_env python=3.9
conda activate gwastic_env

Important

Install GWAStic via pip:

pip install gwastic_desktop

Important

Run GWAStic:

Type gwastic in the Anaconda command line to start the software.

Linux OS:

Tip

We recommend to install Anaconda and for managing dependencies, it is often recommended to create a new environment for your project:

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x Miniconda3-latest-Linux-x86_64.sh
./Miniconda3-latest-Linux-x86_64.sh
cd /home/username/miniconda3
source ~/miniconda3/bin/activate
conda create --name gwastic_env python=3.9
conda activate gwastic_env

Important

Install GWAStic via pip:

pip install gwastic_desktop

Important

Run GWAStic:

Type gwastic in the command line to start the software.

Mac OS:

Important

We've received feedback from Mac users experiencing installation issues, and we've identified a problem specific to macOS involving a missing library necessary for parallel processing. We strongly recommend installing this dependency via Homebrew before proceeding with the installation of the GWASTic software:

brew install libomp

Tip

We recommend to install Anaconda and for managing dependencies, it is often recommended to create a new environment for your project:

Install Anaconda (https://www.anaconda.com/distribution/)

Open the downloaded .pkg file to launch the installer and follow the on-screen instructions.

Open Terminal. You can do this by pressing Cmd + Space to open Spotlight Search, typing "Terminal", and pressing Enter.

Create a new environment by typing:

conda create --name gwastic_env python=3.9

Activate the environment:

conda activate gwastic_env

Important

Install GWAStic via pip:

pip install gwastic_desktop

Important

Run GWAStic:

Type gwastic in the command line to start the software.

2. Example datasets

Note

VCF file format (including vcf.gz) and Plink BED (binary) format are supported for all GWAS methods. In case of vcf, you first must convert the genotype data to bed file format. VCF example file

Note

Phenotypic data must be three columns (Family ID; Within-family ID; Value) text file delimited by space. Phenotype example file

Tip

We provide to two datasets to test GWASTic and validate the software:

Dataset 1 (Barley with row-type phenotype):

We have used a subset of data from a recent study (Milner et al.2019) focusing on the genetic basis of barley traits. The genotypic data was filtered by applying a genotyping rate cutoff of 0.02 and a minor allele frequency (MAF) threshold of 0.05. This resulted in a curated dataset comprising 949,174 SNPs. A random subset of 147 accessions from the Core 200 collection in the same study with available row-type phenotype data was selected. This phenotype describes the arrangement of kernels on the spike of the barley plant, specifically distinguishing between two-rowed and six-rowed barley - a crucial morphological and agricultural trait. The four distinct methods - XGB, RF, LR, and LMM - were employed to validate the peaks of two rowtype associated barley genes previously identified in (Milner et al. 2019).

Download the zip file containing the datasets from https://zenodo.org/records/11183758
Unpack the zip file
Start GWAStic
Choose the file barley_set\WGS300_005_0020.bed as genotypic file
Choose the file barley_set\bridge_row_type_GWAS.txt as phenotypic file
Select method and press Run GWAS

Dataset 2 (Arabidopsis thaliana with with Pseudomonas syringe):

For a quick testing and short run time, we provide a second dataset is on a hypersensitive response phenotype observed in 58 Arabidopsis thaliana host lines (∼900000 SNPs) when infected with Pseudomonas syringe expressing the avrRpm1 gene. Description of the original experiment can be found at https://arapheno.1001genomes.org/phenotype/17/.

Download the zip file containing the datasets from https://zenodo.org/records/11183758
Unpack the zip file
Start GWAStic
Choose the file small_set\example.bed as genotypic file
Choose the file small_set\pheno_gwas.csv as phenotypic file
Select method and press Run GWAS

3. References

Atwell S, Huang YS, Vilhjálmsson BJ, Willems G, Horton M, Li Y, Meng D, Platt A, Tarone AM, Hu TT, Jiang R, Muliyati NW, Zhang X, Amer MA, Baxter I, Brachi B, Chory J, Dean C, Debieu M, de Meaux J, Ecker JR, Faure N, Kniskern JM, Jones JD, Michael T, Nemri A, Roux F, Salt DE, Tang C, Todesco M, Traw MB, Weigel D, Marjoram P, Borevitz JO, Bergelson J, Nordborg M Nature. 2010 465(7298): 627-31. doi: 10.1038/nature08800

Lippert, C., Listgarten, J., Liu, Y. et al. FaST linear mixed models for genome-wide association studies. Nat Methods 8, 833–835 (2011). https://doi.org/10.1038/nmeth.1681

Milner,S. et al. (2019) Genebank genomics highlights the diversity of a global barley collection. Nature Genetics, 51(2):319-26. doi: 10.1038/s41588-018-0266-x.

Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, de Bakker PIW: Daly MJ & Sham PC (in press) PLINK: a toolset for whole-genome association and population-based linkage analysis. American Journal of Human Genetics.

4. Acknowledgment

Gwastic has incorporated the FaST-LMM library (fastlmm.github.io), to enhance its Linear Mixed Models (LMM) feature. We thank Carl Kadie and David Heckerman for not only creating this exceptional tool but also providing outstanding support and discussions.

About

Software for Genome-Wide Association Study

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages