Releases: reneshbedre/bioinfokit

Bioinformatics data analysis and visualization toolkit

06 Jan 22:09
  • analys.gff.gff_to_gtf function updated to handle dot value for phase in CDS features
  • Breast Cancer Wisconsin (Diagnostic) Data Set added
  • visuz.stat.roc function added for visualizing the ROC
  • bartlett and levene functions added to analys.stat class for checking the ANOVA assumptions
    for datasets in stacked format
  • tukey_hsd function updated for grouping order
  • Pandas series added as input for fasta.extract_seq function
  • extract_seq function moved to fasta class
  • extract_seq function deprecated from analys
  • visualization for single and multiple statistical bar charts updated for future releases
  • Tukey HSD test updated for interaction effect. Pairwise comparison for interaction effect can be calculated.
  • gff_to_gtf function updated for the GFF3 file for non-coding RNA transcripts. GFF3 files with non-coding transcripts
    (e.g. from miRBase GFF3) can be converted to GTF
  • genFam enrichment analysis function added (bioinfokit.analys.genfam.fam_enrich)
  • genfam test added
  • Tukey HSD test added to perform multiple pairwise comparisons (bioinfokit.analys.stat.tukey_hsd)
  • new option mrna_feature_name added in analys.gff.gff_to_gtf for cases where the feature name (column 3 of the GFF3 file) of
    protein-coding mRNA is other than 'mRNA' or 'transcript' (e.g. some GFF3 files name this feature
    protein_coding_gene)
  • dim option added to visuz.cluster.screeplot, visuz.cluster.pcaplot and visuz.cluster.biplot to control the
    figure size
  • seqcov moved to fastq class
  • sra_db function added under fastq class for batch download of FASTQ files
    from NCBI SRA database
  • One-sample and paired t-tests added to the t-test function
  • Two sample t-test switched to class based method
  • t-test function name changed to ttest from ttsam
  • programmatic access to chi-squared independence test dataset added
  • boxplot removed from t-test
  • 'adjustText' module added in setup.py (issue #12)
  • In chi-squared test, the sum of probabilities is rounded to 10 decimal places so that floating-point values sum exactly
  • chi-squared goodness of fit test added under the stat.chisq
  • chi-squared independence test updated for output as class attributes and mosaic plot removed
  • mergevcf renamed to concatvcf to keep with conventional naming (issue # 9)
  • marker.vcf_anot function updated for tab-delimited text output
  • The error message for volcano, inverted volcano, and MA plot updated
    when there are no significant or non-significant genes (issue # 7)
  • The vcf_anot function output updated for strand information
  • The Manhattan plot updated to add the labels in sorted order for numerical strings
  • The Manhattan plot updated to add figname option
  • TPM normalization function added
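
For readers unfamiliar with the TPM normalization mentioned above, the standard textbook formula can be sketched as follows. This is a minimal illustration in plain Python, not bioinfokit's implementation; the function name `tpm` and its arguments are hypothetical.

```python
def tpm(counts, lengths_bp):
    """Transcripts per million for one sample (illustrative helper, not bioinfokit API).

    counts: raw read counts per gene
    lengths_bp: gene lengths in base pairs, in the same order as counts
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Step 1: reads per kilobase, which corrects each gene for its length.
    rpk = [c / (length / 1000) for c, length in zip(counts, lengths_bp)]
    total = sum(rpk)
    if total == 0:
        return [0.0] * len(counts)
    # Step 2: scale so that the values of one sample sum to one million.
    return [r / total * 1_000_000 for r in rpk]


# Three genes whose counts are proportional to their lengths get equal TPM.
values = tpm([100, 200, 300], [1000, 2000, 3000])
```

Because length correction happens before scaling, TPM values always sum to one million within a sample, which makes them comparable across samples.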

29 Jul 04:51

v0.9 has the following updates and changes (July 28, 2020)

  • gene expression raw count normalization class added as 'analys.norm'
  • CPM and RPKM normalization function added under 'analys.norm' class
  • Sugarcane gene expression dataset added (Bedre et al., 2019)
  • In volcano, 'ma', and involcano plots, checks for lfc_thr, counts, and pv_thr added
  • legend labels, position, and figname parameters added in volcano plot
  • utility to check the non-numeric values added for ma, volcano and involcano
  • plotlegend parameter added to ma
  • the parameter for log fold change threshold lines added in ma plot
  • legend labels, position, and figname parameters added in ma plot
  • tsneplot added for t-SNE visualization
  • in bardot, dropping of NA values added so that missing values are ignored when plotting dots
  • scRNA-seq dataset added (PBMC and Arabidopsis root cells)
  • fasta_reader and rev_com moved to newly created fasta class
  • tsneplot and vcf_anot initialized for future release
  • more parameters added in biplot (cluster coloring, datapoints)
  • figname added in hmap
  • ma function updated for absolute expression counts
  • svg figures added
  • 2D and 3D loadings plot, biplot and scree plot functions added under the
    cluster class for PCA
  • programmatic access to iris and cotton dataset added
  • pca function will be deprecated in future release
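
As a rough guide to what the CPM and RPKM normalizations listed above compute, here is a minimal sketch of the standard formulas in plain Python. The helper names `cpm` and `rpkm` are hypothetical and do not reflect the signatures of the 'analys.norm' class.

```python
def cpm(counts):
    """Counts per million: scale raw counts by library size (illustrative helper)."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total * 1_000_000 for c in counts]


def rpkm(counts, lengths_bp):
    """Reads per kilobase per million mapped reads (illustrative helper)."""
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    per_million = total / 1_000_000
    # Divide by gene length in kilobases, then by library size in millions.
    return [c / (length / 1000) / per_million for c, length in zip(counts, lengths_bp)]


cpm_values = cpm([10, 30, 60])
rpkm_values = rpkm([10, 30, 60], [1000, 2000, 3000])
```

CPM corrects only for sequencing depth, while RPKM additionally corrects for gene length, so RPKM is the one to use when comparing different genes within a sample.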

24 May 07:39

v0.8 has the following updates and changes

  • GFF3 to GTF file conversion utility added and updated under class gff
  • In Manhattan plot (visuz.marker.mhat), the labeling issue with the markernames parameter corrected (see issue # 4 on GitHub for details)
  • gstyle parameter added in Manhattan plot for box style annotation
  • splitvcf function added for splitting VCF file into individual VCF files for each chromosome
  • mergevcf moved to analys.marker class
  • reg_lin function updated for multiple regression
  • degrees of freedom fixed for t-test for regression coefficients
  • VIF calculation for MLR updated
  • functions fastq_reader and fqreadcounter moved to fastq class
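
The idea behind splitting a VCF file per chromosome, as splitvcf does above, can be sketched in a few lines. This is an illustrative approach under the assumption that header lines start with '#' and that the chromosome is the first tab-separated column; `split_vcf_lines` is a hypothetical helper, not the bioinfokit function.

```python
from collections import defaultdict


def split_vcf_lines(lines):
    """Group VCF records by chromosome, repeating the header for each group.

    Illustrative helper (not bioinfokit API). Returns a dict mapping each
    chromosome name to the list of lines for that chromosome's VCF.
    """
    header = []
    records = defaultdict(list)
    for line in lines:
        if line.startswith("#"):
            # Meta-information and column header lines go to every output.
            header.append(line)
        elif line.strip():
            chrom = line.split("\t", 1)[0]
            records[chrom].append(line)
    return {chrom: header + recs for chrom, recs in records.items()}


example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr2\t200\t.\tC\tT\t60\tPASS\t.",
    "chr1\t300\t.\tG\tA\t70\tPASS\t.",
]
per_chrom = split_vcf_lines(example)
```

Each value in `per_chrom` could then be written to its own file, for example one named after the chromosome.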

17 Apr 03:07

v0.7 has the following updates and changes

  • split_fastq function added for splitting a single interleaved paired-end FASTQ file into individual (left and right)
    paired-end FASTQ files
  • GFF3 to GTF file conversion utility added under class gff
  • two-sample and Welch's t-test updated for CI and alpha parameter added
  • module termcolor removed
  • Programmatic access of dataset for ttsam added
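
To make the interleaved split described above concrete, here is a minimal sketch. It assumes four-line FASTQ records with no blank lines and strictly alternating left/right mates; `split_interleaved` is a hypothetical helper and may differ from how split_fastq works internally.

```python
def split_interleaved(lines):
    """Split interleaved paired-end FASTQ lines into left and right reads.

    Illustrative helper (not bioinfokit API). Assumes 4-line records in the
    order left mate, right mate, left mate, right mate, and so on.
    """
    lines = [line.rstrip("\n") for line in lines]
    if len(lines) % 8 != 0:
        raise ValueError("expected a whole number of read pairs (8 lines each)")
    left, right = [], []
    for i in range(0, len(lines), 8):
        left.extend(lines[i:i + 4])       # first record of the pair
        right.extend(lines[i + 4:i + 8])  # second record of the pair
    return left, right


interleaved = [
    "@read1/1", "ACGT", "+", "IIII",
    "@read1/2", "TGCA", "+", "IIII",
    "@read2/1", "GGCC", "+", "IIII",
    "@read2/2", "CCGG", "+", "IIII",
]
left_reads, right_reads = split_interleaved(interleaved)
```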

10 Apr 17:45

v0.6 has the following updates and changes

  • Programmatic access of dataset added (class get_data)
  • More features for figures added (figtype, axtickfontsize, axtickfontname, axxlabel, axylabel, xlm, ylm,
    yerrlw, yerrcw)
  • In volcano plot, the typo for xlabel corrected (-log2(FoldChange) to log2(FoldChange))
  • help will be deprecated in future release
  • VIF calculation for MLR updated
  • adjustText removed

30 Mar 06:52

v0.5

v0.5 has the following updates and changes

  • Linear regression analysis added in analys.stat class
  • volcano, involcano, ma and heatmap functions moved to new visuz.gen_exp class
  • In volcano, parameters for new box type labeling and threshold grid lines added
  • corr_mat updated for new colormaps and moved to stat class
  • To visualize the graph in the console itself (e.g. Jupyter notebook), show parameter added
  • Pandas dataframe input added for volcano, involcano, corr_mat, ma, ttsam, and chisq
  • ttsam and chisq moved to analys.stat class
  • graph control parameters added for volcano, involcano, ma, and heatmap
  • documentation can also be accessed at https://reneshbedre.github.io/blog/howtoinstall.html
  • help will be deprecated in a future release
  • NumPy-related bug in visuz.stat.bardot fixed: an int cast added when generating the number of samples, because a float
    is not accepted there (see details of the NumPy bug: numpy/numpy#15345)

17 Mar 14:14

v0.4 has the following updates and changes

  • function analys.format.fq_qual_var() added for detecting the FASTQ quality encoding format
  • help module added for command-line help messages
  • class fastq added for FASTQ-related functions
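
Detecting the FASTQ quality encoding usually comes down to inspecting the ASCII range of the quality characters. The sketch below shows one common heuristic and is not necessarily what fq_qual_var() does; `guess_phred_offset` is a hypothetical helper, and the cut-offs (below 59 implies Phred+33, above 74 implies Phred+64) are the conventional ones.

```python
def guess_phred_offset(quality_lines):
    """Guess the quality encoding from FASTQ quality strings.

    Illustrative heuristic (not bioinfokit API): characters below ASCII 59
    only occur with a +33 offset, and characters above ASCII 74 only occur
    with a +64 offset. Anything in between cannot be decided.
    """
    codes = [ord(ch) for line in quality_lines for ch in line]
    if not codes:
        raise ValueError("no quality characters supplied")
    if min(codes) < 59:
        return "Phred+33"
    if max(codes) > 74:
        return "Phred+64"
    return "ambiguous"


sanger_like = guess_phred_offset(["IIII!!II"])    # '!' is ASCII 33
illumina_old = guess_phred_offset(["hhhhgggg"])   # 'h' is ASCII 104
undecided = guess_phred_offset(["@@@@AAAA"])      # all within 59..74
```

A handful of reads can fall entirely in the overlapping range, so in practice it helps to scan many records before concluding.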

05 Mar 17:03

v0.3 has the following updates and changes

  1. bar-dot plot function added
  2. command-line help message class added