BMD-SRA: A Boosting Model for Differentiating Sequence Read Archive Files Based on the Context.

The volume of deposited sequence files is increasing dramatically. The submitter of a sequence file is mainly responsible for annotating it. Although submitters and public repositories take care to produce accurate metadata, mistakes can happen, and these mistakes can cause trouble in downstream analysis. BMD-SRA tries to classify a given sequence file into one of four categories:

Metagenomes
Amplicons
Single Amplified Genomes (SAGs)
Isolated Genomes

Developing this model involved the following stages:

Preparing Metadata
Downloading Sequence Files
Feature Extraction
Outlier Detection
Model Development
Model Evaluation

How can you use it?

There are two ways to use the outcomes of this study: generating your own model, or applying the generated model in your project.

Generating your own model

Documentation on preparing the training data is available. You can use the extracted features to generate your own model.

Load the generated model and apply it.

The generated model is accessible here. To create an object of the BMDSRA class, pass just two parameters:

The path of the model.
The path of the scaler.

After creating a BMDSRA object, call its predict function and pass the path of the sequence file.

Note that BMD-SRA needs access to two files, FeatureExtraction and Preprocessing. The xgboost package must also be installed.
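Before loading the model, it can help to confirm that the required package is importable. The following is a minimal sketch using only the Python standard library; the helper name `missing_packages` is hypothetical and not part of BMD-SRA.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be imported here.

    Hypothetical helper for illustration; it is not part of BMD-SRA.
    """
    return [name for name in names if find_spec(name) is None]


# xgboost is the dependency named above; this only reports, it does not install.
missing = missing_packages(["xgboost"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```

If xgboost is reported as missing, it needs to be installed before the BMDSRA class can be used.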

Example:

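# Import the BMDSRA class (it relies on FeatureExtraction, Preprocessing, and xgboost).
from Codes.BMDSRA import BMDSRA

# Paths to the generated model and its scaler.
model_path = "..\\..\\resource\\4-model\\model.json"
scaler_path = "..\\..\\resource\\4-model\\scaler.gz"
model = BMDSRA(model_path, scaler_path)

# Predict the category of a single sequence file.
seq_path = "..\\..\\resource\\2-subsra\\SRR1588386.fastq"
res = model.predict(seq_path)
print(res)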
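The example above uses Windows-style path separators. On other operating systems, a sketch like the following may be more portable. It builds the same relative paths with `pathlib` and converts them to strings, on the assumption that `BMDSRA` expects string paths as in the example above. The `resource_dir` variable and the `resource_path` helper are illustrative and not part of the repository.

```python
from pathlib import Path

# Same relative location as in the example above, without hard-coded separators.
resource_dir = Path("..") / ".." / "resource"


def resource_path(*parts):
    """Join path components under resource_dir and return a plain string.

    Hypothetical helper for illustration; it is not part of BMD-SRA.
    """
    return str(resource_dir.joinpath(*parts))


model_path = resource_path("4-model", "model.json")
scaler_path = resource_path("4-model", "scaler.gz")
seq_path = resource_path("2-subsra", "SRR1588386.fastq")

print(model_path)
```

These strings can then be passed to `BMDSRA(model_path, scaler_path)` and `model.predict(seq_path)` in the same way as the hard-coded paths.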

More examples of running the model can be found here.
