Automate case study creation | starting with `rmd` report #27

jananiravi · 2024-10-02T16:01:55Z

Start w/ @the-mayer's rmd report
Identify proteins to run through MolEvolvR
Create case studies
Which additional summarization/visualizations would help with these case studies?


Cateline · 2024-10-02T21:52:46Z

Kindly assign this task to me

jananiravi · 2024-10-03T03:11:47Z

@the-mayer could you pass along your rmd doc to Cateline? @Cateline, you can get started with MolEvolvR web submissions in the meantime to help you understand what the different functions are doing (which summarizations and visualizations they result in).

the-mayer · 2024-10-03T20:55:03Z

I'm attaching the report template and some sample output, for reference.
example_report.zip

Note, this report is parameterized, so figures can be supplied as parameters when rendering. As an example, the MolEvolvR web app renders this report by calling:

    ## List of graphics to include in report
    params <- list(
        ## Results Summary
        ### Domain Architecture
        rs_interproscan_visualization = rs_IprGenes_rx(),
        ### Proximity Network
        proximity_network = rval_rs_network_layout_rx(),
        ## Phylogeny
        ### Sunburst
        sunburst = data()@df,
        ### Data
        data = rs_data_table_rx(),
        ## Query Data
        ### Data Table
        queryDataTable = queryDataTable_rx(),
        ### FASTA
        fastaDataText = fastaDataText_rx(),
        ### Query Heatmap
        heatmap = query_heatmap_rx(),
        ### Domain Architecture
        query_data = query_data(),
        query_domarch_cols = query_domarch_cols(),
        query_iprDatabases = input$query_iprDatabases,
        query_iprVisType = input$query_iprVisType,
        ## Homolog Data
        mainTable = mainTable_rx(),
        ## Domain Architecture
        ### Table
        DALinTable = DALinTable_rx(),
        ### Heatmap
        DALinPlot = DALinPlot_rx(),
        ### Network
        DANetwork = DANetwork_rx(),
        DA_Prot = DA_Prot(),
        domarch_cols = domarch_cols(),
        DA_Col = input$DA_Col,
        DACutoff = DACutoff(),
        ### Interproscan Viz
        da_interproscan_visualization = da_IprGenes_rx(),
        ### Upset Plot
        # uses existing params
        ## Phylogeny
        ### Sunburst
        phylo_sunburst_levels = input$levels,
        phylo_sunburst = phylogeny_prot(),
        ### Tree
        tree_msa_tool = input$tree_msa_tool,
        ### MSA
        rep_accnums = rep_accnums(),
        msa_rep_num = input$msa_rep_num,
        app_data = app_data(),
        PhyloSelect = input$PhyloSelect,
        acc_to_name = acc_to_name(),
        rval_phylo = rval_phylo(),
        query_pin = query_pin(),
        msa_reduce_by = input$msa_reduce_by
    )
    ## Render RMarkdown report, with included graphics
    rmarkdown::render(
        tempReport,
        output_file = file,
        params = params,
        envir = new.env(parent = globalenv())
    )

As you become more familiar with the way reports are generated in the Web App, we can work together to supply the correct parameters to this report in an automated fashion. Let me know if you have any questions in the meantime!
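For anyone trying this outside the web app, here is a minimal sketch of the same kind of call with static values instead of Shiny reactives. It assumes the template is saved locally as `report_template.Rmd` (a hypothetical file name) and that every parameter passed is declared in the template's YAML `params:` block. The two parameter names come from the list above; the values and the helper names `build_report_params()` and `render_case_study()` are placeholders for illustration, not part of MolEvolvR.

```r
# Sketch: render a parameterized R Markdown report without Shiny reactives.
# ASSUMPTIONS: "report_template.Rmd" is a hypothetical local path; the values
# below are placeholders standing in for what the web app's reactives return.

build_report_params <- function(fasta_text, msa_rep_num = 3) {
  # Only parameters declared in the template's YAML `params:` block may be passed.
  list(
    fastaDataText = fasta_text,
    msa_rep_num   = msa_rep_num
  )
}

render_case_study <- function(template, output_file, params) {
  if (!requireNamespace("rmarkdown", quietly = TRUE)) {
    stop("The 'rmarkdown' package is required to render the report.")
  }
  # A fresh environment keeps the render isolated from the calling session,
  # mirroring the `envir` argument used in the web app call.
  rmarkdown::render(
    input       = template,
    output_file = output_file,
    params      = params,
    envir       = new.env(parent = globalenv())
  )
}

params <- build_report_params(">query_1\nMKTAYIAKQR")  # placeholder sequence
# render_case_study("report_template.Rmd", "case_study.html", params)
```

The render call is left commented out because it needs the real template and a pandoc installation; the parameter list itself can be built and inspected on its own.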

Cateline · 2024-10-04T03:46:22Z

Thank you for sharing.


MyleeeA · 2024-10-04T22:26:20Z

Hi @jananiravi
Is this a good first issue to start with?

I'd like to be assigned to this to get started.

jananiravi · 2024-10-05T03:16:38Z

Yes! @Cateline @MyleeeA which proteins are you both starting with, or do you want us to assign? If that's the case, give me a day to add them. Thanks!

Cateline · 2024-10-05T03:21:19Z

Hi @jananiravi, please assign me some proteins I can work with. You can add them today.

jananiravi · 2024-10-05T03:34:57Z

Each of you can start with one of the 6 ESKAPE species in the CARD antibiotic resistance genes database: https://card.mcmaster.ca/download

Enterobacter spp
Staphylococcus aureus
Klebsiella pneumoniae
Acinetobacter baumannii
Pseudomonas aeruginosa
Enterococcus faecium

Start with one drug/drug class at a time before moving on to one drug across species or one species across drugs.
Generate generalizable functions to download the data and filter by species/drug.
Input the datasets into MolEvolvR to generate case study reports.
Download the analysis data systematically, working towards populating knowledgebases.

(Reach out via slack to those interested in bio/bioinfo to take other species -- those interested in bioinfo/r-pkg to work on well-annotated functions.)

Hope this helps!
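As a starting point for the "filter by species/drug" step, here is a minimal sketch of one generalizable filter in base R. The column names `pathogen`, `gene`, and `drug`, the function name `filter_bug_drug()`, and the toy rows are all assumptions made for illustration; they are not taken from CARD or from the MolEvolvR package.

```r
# Sketch: filter a bug-drug table by pathogen and/or drug.
# ASSUMPTION: the columns `pathogen`, `gene`, and `drug` are hypothetical names
# for a table already derived from CARD; the rows are toy values, not real data.

filter_bug_drug <- function(df, pathogen = NULL, drug = NULL) {
  keep <- rep(TRUE, nrow(df))
  # Each filter is optional, so the same function covers one bug and one drug,
  # one drug across species, or one species across drugs.
  if (!is.null(pathogen)) keep <- keep & df$pathogen %in% pathogen
  if (!is.null(drug))     keep <- keep & df$drug %in% drug
  df[keep, , drop = FALSE]
}

toy <- data.frame(
  pathogen = c("Staphylococcus aureus", "Staphylococcus aureus", "Klebsiella pneumoniae"),
  gene     = c("geneA", "geneB", "geneC"),
  drug     = c("daptomycin", "rifampin", "daptomycin"),
  stringsAsFactors = FALSE
)

one_bug_one_drug  <- filter_bug_drug(toy, pathogen = "Staphylococcus aureus", drug = "daptomycin")
one_drug_all_bugs <- filter_bug_drug(toy, drug = "daptomycin")
```

Leaving both arguments optional keeps a single function usable for every stage of the plan above.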

jananiravi · 2024-10-05T03:36:22Z

@AbhirupaGhosh @charmvang @wolfeet1 @klterwelp if you have CARD data workflows or top genes readily available, please share those as starting points as well.

jananiravi · 2024-10-05T03:37:31Z

@KewalinSamart if you have top TB disease/drug genes, create a new spinoff issue for TB gene case study with the same tags as this one. We can request assignees for that, too.

MyleeeA · 2024-10-05T09:52:27Z

Thank you so much @jananiravi

AbhirupaGhosh · 2024-10-09T17:12:32Z

Title: Process CARD Data, Map Short Names, and Run MolEvolvR

Download CARD Data: Retrieve the latest CARD dataset. (DOWNLOAD)
Open ARO_index.tsv: Parse the file (in R).
Map CARD Short Name: Map the CARD Short Name column to shortname_antibiotics.tsv and shortname_pathogens.tsv. The CARD Short Name values follow the format pathogen_gene or pathogen_gene_drug.
Sort and Group the data by pathogens and antibiotics.
Filter Favorite Bug-Drug or Bug for further analysis.
Download FASTA Sequences for the filtered list of protein accessions (use Entrez).
Run MolEvolvR: Run the protein sequences through the MolEvolvR tool for evolutionary analysis.

I hope this helps @Cateline
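The mapping and sorting steps above could look roughly like this in base R. This is a sketch under stated assumptions: fields are separated by `_` and no field contains an underscore itself, the short names are made up, and `pathogen_lookup` is a hypothetical stand-in for `shortname_pathogens.tsv`. The function name `parse_card_short_name()` is also invented for this example.

```r
# Sketch: split CARD Short Names of the form pathogen_gene or pathogen_gene_drug.
# ASSUMPTIONS: fields are separated by "_" and none of the fields contains an
# underscore itself; the example values and lookup table are made up.

parse_card_short_name <- function(short_name) {
  parts <- strsplit(short_name, "_", fixed = TRUE)
  # Pull the i-th field from every name, or NA when that field is absent.
  field <- function(i) {
    vapply(parts, function(p) if (length(p) >= i) p[i] else NA_character_, character(1))
  }
  data.frame(
    short_name = short_name,
    pathogen   = field(1),
    gene       = field(2),
    drug       = field(3),
    stringsAsFactors = FALSE
  )
}

# Hypothetical stand-in for shortname_pathogens.tsv (abbreviation -> full name).
pathogen_lookup <- data.frame(
  pathogen      = c("Saur", "Kpne"),
  pathogen_name = c("Staphylococcus aureus", "Klebsiella pneumoniae"),
  stringsAsFactors = FALSE
)

parsed <- parse_card_short_name(c("Saur_geneA_DAP", "Kpne_geneB"))
mapped <- merge(parsed, pathogen_lookup, by = "pathogen", all.x = TRUE)
# Sort by pathogen, then drug, so bug-drug pairs sit next to each other.
mapped <- mapped[order(mapped$pathogen_name, mapped$drug), ]
```

With the real data, the input vector would come from the CARD Short Name column of `ARO_index.tsv` (read with `read.delim()`), and a second lookup for antibiotics could be merged in the same way.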

Cateline · 2024-10-09T18:15:11Z

Yes, this is now clear. Does it mean that I need to cancel the pull request I had already made?

AbhirupaGhosh · 2024-10-09T19:40:09Z

Yes that would be better.

Cateline · 2024-10-09T19:55:11Z

Great. Thanks for the help


MyleeeA · 2024-10-10T04:09:50Z

@Cateline

I see you have a better understanding now.
Would you be so kind as to guide me a bit?
Thank you 🙏🏾

Cateline · 2024-10-10T08:51:58Z

Yeah, of course. Where are you stuck?


MyleeeA · 2024-10-10T19:40:19Z

First of all, I'm struggling to get the right versions of R and RStudio so that I can clone the MolEvolvR repo and use its functions in R.
I understand these are the steps I need to follow.
I found the information earlier but can't seem to locate it again; I'd appreciate a link to it, @Cateline.

CC: @jananiravi

Cateline · 2024-10-10T20:57:23Z

https://posit.co/download/rstudio-desktop/
Try downloading from this site
I also found this resource useful: https://happygitwithr.com/rstudio-git-github#rstudio-git-github

MyleeeA · 2024-10-10T21:54:35Z

Thank you for helping out every time I need your help, @Cateline.
I'll check it out

jananiravi · 2024-10-16T15:26:44Z

Phase 1: The first PR can be for the set of functions going from CARD SNPs to MolEvolvR input protein sequences (fasta). This commit will add a file to R/ with docstring/roxygen2 documentation as with other R functions in that folder.
Phase 2: Run these proteins through the MolEvolvR web-app and submit the reports for starters.
Phase 3: Create a qmd/rmd markdown (R/Quarto) report file that does everything the web-app report generator does but does so locally with the R-package functions (fully in-house within the MolEvolvR repo).

cc: @falquaddoomi @the-mayer @epbrenner
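To make Phase 1 concrete, here is a rough sketch of what a roxygen2-documented function for the "accessions to FASTA" step might look like. The function names and arguments are hypothetical and not part of MolEvolvR; the only external piece relied on is the public `efetch` endpoint of NCBI E-utilities, and the accessions shown are placeholders.

```r
#' Build an NCBI E-utilities URL for protein FASTA records
#'
#' Sketch only: the function name and arguments are hypothetical, not part of
#' MolEvolvR. It targets the public `efetch` endpoint of NCBI E-utilities.
#'
#' @param accessions Character vector of protein accession numbers.
#' @return A single URL string requesting the records in FASTA format.
#' @examples
#' build_efetch_url(c("ACC_1", "ACC_2"))
build_efetch_url <- function(accessions) {
  stopifnot(is.character(accessions), length(accessions) > 0)
  base <- "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
  paste0(base, "?db=protein&id=", paste(accessions, collapse = ","),
         "&rettype=fasta&retmode=text")
}

#' Download protein sequences and save them as a FASTA file
#'
#' @param accessions Character vector of protein accession numbers.
#' @param out_file Path of the FASTA file to write.
#' @return `out_file`, invisibly.
fetch_fasta <- function(accessions, out_file) {
  fasta_lines <- readLines(build_efetch_url(accessions), warn = FALSE)
  writeLines(fasta_lines, out_file)
  invisible(out_file)
}

url <- build_efetch_url(c("ACC_1", "ACC_2"))  # placeholder accessions
# fetch_fasta(c("ACC_1", "ACC_2"), "staph_dap.fasta")  # requires network access
```

Keeping the URL builder separate from the download makes the first half testable offline. The `rentrez` package's `entrez_fetch()` would be a reasonable alternative to the raw URL if an extra dependency is acceptable.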


jananiravi added the outreachy for outreachy interns label Oct 2, 2024

jananiravi added documentation Improvements or additions to documentation, incl. R docstring/roxygen2 good first issue Good for newcomers bioinfo Bioinformatics related labels Oct 2, 2024

jananiravi assigned Cateline Oct 3, 2024

jananiravi assigned the-mayer Oct 3, 2024

jananiravi assigned MyleeeA Oct 5, 2024

AbhirupaGhosh mentioned this issue Oct 9, 2024

Eskape case studies #82

Closed

5 tasks

jananiravi mentioned this issue Oct 16, 2024

[FEAT] Generate case study report Rmd/Qmd #102

Open

jananiravi assigned AbhirupaGhosh Oct 16, 2024

Cateline added a commit to Cateline/MolEvolvR that referenced this issue Oct 17, 2024

Fixes JRaviLab#27

f8c17b1

Add code for fetching and saving FASTA sequences for Staph-DA

Cateline added a commit to Cateline/MolEvolvR that referenced this issue Oct 17, 2024

Fixes Issue JRaviLab#27 Phase 1-Staph-DAP combination

2d1cd30

Cateline added a commit to Cateline/MolEvolvR that referenced this issue Oct 21, 2024

Fixes Phase 1 of Issue JRaviLab#27 and Issue JRaviLab#103

cfc7bf6

Cateline added a commit to Cateline/MolEvolvR that referenced this issue Oct 21, 2024

Added link to MolEvolvR Case Study report. Fixes Phase 2 of Issue JRa…

9615773

…viLab#27

awasyn self-assigned this Oct 22, 2024

awasyn mentioned this issue Oct 26, 2024

WIP: Case study report automation #111

Draft

13 tasks

jananiravi added this to the Package release milestone Oct 27, 2024

Cateline added a commit to Cateline/MolEvolvR that referenced this issue Nov 24, 2024

Automate Case-Studies Issue JRaviLab#27

1dc5c81

Expanded Bug-Drug.R code to retrieve and save FASTA sequences for ESKAPE pathogens resistant to DAP (Daptomycin)
