From 3c63371bca8806efedaf9af9c8991394bdc7d5d0 Mon Sep 17 00:00:00 2001 From: Fabian Andrade Date: Mon, 19 Aug 2024 17:04:38 +0200 Subject: [PATCH] 1 rst RNA bia updated --- docs/1- Library_preparation.rst | 23 +++++++------- docs/2- Sequencing_technologies.rst | 4 +-- docs/about.rst | 35 +++++---------------- tand the sequencing technology of Illumina. | 25 +++++++++++++++ 4 files changed, 46 insertions(+), 41 deletions(-) create mode 100644 tand the sequencing technology of Illumina. diff --git a/docs/1- Library_preparation.rst b/docs/1- Library_preparation.rst index 5f4fbcf..5cd66d8 100644 --- a/docs/1- Library_preparation.rst +++ b/docs/1- Library_preparation.rst @@ -69,7 +69,7 @@ removal of unwanted products to leave only the nucleic acid fragments. Often is Check if DNA mets the quantity and quality requirements of the sequencing instrument. Assesss the quantity and size distribution of the library. -..note:: +.. note:: RNA library preparation is more complex due to the risk of degradation and requires additional steps respect DNA: - Due that RNA is converted to cDNA, PCR-amplified libraries are necessary for many sequencing instruments. @@ -83,7 +83,7 @@ Library preparation bias Among the different library preparation steps presented earlier, several biases can be introduced during the process. Here are presented the main biases introduced for either DNA or RNA, in each library preparation step and possible solutions to avoid them. -..tabs:: +.. tabs:: .. tab:: DNA library bias @@ -95,27 +95,27 @@ Here are presented the main biases introduced for either DNA or RNA, in each lib #. Fragmentation Chromatin sonication for ChIP-seq has been shown to be non-random, with euchromatin being sheared more efficiently than heterochromatin. - ..tip:: + .. tip:: To solve this it has been developed the double-fragmentation ChIP-seq protocol. #. Size Selection Agarose gel slices by heating to 50 ºC in chaotropic salt buffer decreased the representation of AT-rich sequences. 
- ..tip:: + .. tip:: Simple solution to this problem is to melt the gel slices in the supplied buffer at room temperature (18–22 ºC), considerably reducing GC bias. #. PCR Introduce bias in sample composition, due to the fact that not all fragments in the mixture are amplified with the same efficiency. GC-neutral fragments are amplified more efficiently than GC-rich or AT-rich fragments, and as a result fragments with high AT- or GC content may become underrepresented or are completely lost during library preparation - ..tip:: + .. tip:: - Ligate adapters that contain all necessary elements for bridge amplification on Illumina flowcells are preferred, eliminating the need for PCR to add these sequences afterwards. Nevertheless, requires relatively large quantities (41 mg) of input material. - In the extreme case of small input amount, the single cell,multiple displacement amplification (MDA) may be the preferred amplification method. MDA is an extremely powerful amplification method, allowing microgram quantities of DNA to be obtained from femtograms of starting material. For this reason, MDA has become the method of choice for whole genome amplification (WGA) from single cells - PCR additives have also been reported to reduce bias, such as betaine or tetramethylammonium chloride (TMAC) may help to further improve coverage of extremely GC-rich or AT-rich regions. - The best overall performing polymerase appears to be Kapa HiFi. .. seealso:: - .. _Library_preparation_methods_for_next_generation_sequencing_Tone_down_the_bias: http://dx.doi.org/10.1016/j.yexcr.2014.01.008 + For more information see the publication `Library preparation methods for next generation sequencing Tone down the bias <http://dx.doi.org/10.1016/j.yexcr.2014.01.008>`_. - For more information see the publication Library_preparation_methods_for_next_generation_sequencing_Tone_down_the_bias_ . + .. tab:: RNA library bias @@ -136,11 +136,10 @@ Here are presented the main biases introduced for either DNA or RNA, in each lib #.
**Library Construction** - mRNA enrichment bias: enrich for polyadenylated RNA transcripts with oligo (dT) primers have shown that this method remove all non-poly (A) RNAs, such a reolication-dependant histones and lncRNAs (lacking of polyA), - or incomplete mRNAs. Targeting rRNA as depletion method will not limit to only mRNA molecules (also is more expensive). - - RNA fragmentation bias: can introduce lenght biases or errors (propagated to later cycles), Studies have shown that methods that involve nonspecific restriction endonucle-ases indicate less sequence bias and have been shown to per-form similarly to the physical methods - - Primer bias: deviation due to primer during PCR amplification could be avoid using the Illumina Genome Analyzer, which perform the reverse transcription directly on the flowcells. - - Adapter ligation bias: due to substrate preferencesof T4 RNA ligases, protocols that uses a set of randomnucleotide adapters at the ligation boundary evade the capture of miRNAs. - - Reverse transcription bias: reverse transcriptases tend to produce false second strand cDNA throughDNA-dependent DNA polymerase + or incomplete mRNAs. Using rRNA depletion instead does not restrict the library to mRNA molecules (and is more expensive). Subtractive hybridization using rRNA-specific probes has been reported as the method that introduces the least bias in relative transcript abundance; in contrast, exonuclease treatment tends to be less efficient at rRNA depletion. + - RNA fragmentation bias: can introduce length biases or errors (propagated to later cycles). Studies have shown that methods involving nonspecific restriction endonucleases show less sequence bias and perform similarly to the physical methods. + - Primer bias: the deviation caused by primers during PCR amplification can be avoided by using the Illumina Genome Analyzer, which performs the reverse transcription directly on the flowcells.
Authors have also proposed a bioinformatics tool in the form of a reweighing scheme that adjusts for the bias and makes the distribution of the reads more uniform. + - Adapter ligation bias: due to the substrate preferences of T4 RNA ligases, protocols that use a set of random nucleotide adapters at the ligation boundary evade the capture of miRNAs. As a solution, several groups propose to randomize the 3' end of the 5' adapter and the 5' end of the 3' adapter. The strategy is based on the hypothesis that a population of degenerate adapters would average out the sequencing bias, because the slightly different adapter molecules would form stable secondary structures with a more diverse population of RNA sequences. - Reverse transcription bias: reverse transcriptases tend to produce false second-strand cDNA through DNA-dependent DNA polymerase activity. Actinomycin D, a compound that specifically inhibits DNA-dependent DNA synthesis, has been proposed as an agent to eliminate antisense artifacts. - PCR amplification bias: main source of artifactsand base composition bias in the process of library construc-tion, Extremely AT/GC-Rich, fragments of GC-neutral can be amplified more thanGC-rich or AT-rich fragments. Throughthe use of custom adapters, the samples without amplifica-tion and ligation can be hybridized directly with the oligonu-cleotides on the flowcell surface, thus avoiding the biases andduplicates of PCR. However, diff --git a/docs/2- Sequencing_technologies.rst b/docs/2- Sequencing_technologies.rst index 62a9855..9f145d6 100644 --- a/docs/2- Sequencing_technologies.rst +++ b/docs/2- Sequencing_technologies.rst @@ -41,7 +41,7 @@ Paired end *source: https://systemsbiology.columbia.edu/genome-sequencing-defining-your-experiment#:~:text=Single%2Dend%20vs.&text=In%20single%2Dend%20reading%2C%20the,opposite%20end%20of%20the%20fragment.* During library preparation are incorporated sequencing primers binding site at both ends of the DNA fragments.
-This allows to reading at one read, when it finiches this direction at the specified read lenght, then starts another round od reading from the opposite end of the fragemnt. +This allows reading from one end of the fragment; when the read in this direction reaches the specified read length, another round of reading starts from the opposite end of the fragment. It improves: - The confidence of the sequence read @@ -53,7 +53,7 @@ It improves: .. seealso:: .. _Illumina_sequencing_by_synthesis_workflow: https://www.youtube.com/watch?v=fCd6B5HRaZ8 - See the _Illumina_sequencing_by_synthesis_workflow video by Illumina to visualize the concepts of SBS. + See the Illumina_sequencing_by_synthesis_workflow_ video by Illumina to visualize the concepts of SBS. For more information diff --git a/docs/about.rst b/docs/about.rst index 51ea9c3..748d059 100644 --- a/docs/about.rst +++ b/docs/about.rst @@ -8,7 +8,6 @@ About the course :toctree: generated -This slow-paced hands-on internal course is designed for absolute beginners who want to start using `Nextflow DSL2 `_ to achieve reproducibility of the data analysis. .. |luca| image:: images/lcozzuto.jpg @@ -47,25 +46,6 @@ Dates, time, location Program ------------------------ -*Day 1: Understand and run a basic Nexflow pipeline* - - - -*Day 1: Write, modify, and run a complex pipeline* - - - -*Day 2: Linux containers* - - - -*Day 3: Run a Nextflow pipeline in different environments* - - - -*Day 4: Nextflow modules and Tower* - - .. _home-page-outline: @@ -73,25 +53,26 @@ Program Outline ============ -This NGS Quality Control course will train participants to run FASTQC on short reads obtained with Illumina and interpret the quality control parameters offered by this tool. +This NGS Quality Control (QC) course will train participants to run FASTQC on short reads obtained with Illumina and interpret the quality control parameters offered by this tool. ..
_home-page-learning: Learning objectives ============ -* Execute/Run FASTQC on short reads obtained by illumina sequencing. -* Interpret the Quality Control Parameters offered by FASTQC -* Understand the sequencing technology of Illumina. -* Main factors for choose an appropiate library preparation kit -* Description of the main tools for low quality reads and adapter removal. +* Know the main steps of library preparation in DNA-seq and RNA-seq, the biases introduced in each of them, and solutions to avoid them. +* Learn how the main sequencing technologies work (Illumina for short reads and Nanopore for long reads). +* Execute/Run the main quality control tools for raw data (FastQC/NanoPlot and FastQ Screen) obtained for short and long reads. +* Understand the FASTQ format and interpret the Quality Control reports offered by these tools. +* Aggregate the QC reports of different tools and samples with MultiQC. +* Execute and learn about the preprocessing tools for adapter and low-quality read removal (Trimmomatic, Cutadapt, Sickle, fastp). .. _home-page-prereq: Prerequisite / technical requirements ============ -Being comfortable working with the CLI (command-line interface) in a Linux-based environment. +Being comfortable working with the CLI (command-line interface) in a Linux-based environment (introductory CLI courses may be recommended). diff --git a/tand the sequencing technology of Illumina. b/tand the sequencing technology of Illumina. new file mode 100644 index 0000000..836adb6 --- /dev/null +++ b/tand the sequencing technology of Illumina.
@@ -0,0 +1,25 @@ +* 80ca6f3 - (25 hours ago) 1 and 2 rst information added - Fabian Andrade (HEAD -> main, origin/main) +* 91c5913 - (28 hours ago) index.rst updated - Fabian Andrade +* 165d1c9 - (30 hours ago) 1 library prepration creation and 2 sequenc_technologies - Fabian Andrade +* 2995212 - (8 days ago) Update 1- Library_preparation_key_concepts.rst - Fabian Andrade Lozano +* 4e3daf5 - (8 days ago) Update index.rst - Fabian Andrade Lozano +* 7b2d50e - (8 days ago) Update conf.py - Fabian Andrade Lozano +* 0f3d659 - (8 days ago) Update github-actions.yml - Fabian Andrade Lozano +* faa40a9 - (8 days ago) Update github-actions.yml - Fabian Andrade Lozano +* adf3226 - (8 days ago) Update github-actions.yml - Fabian Andrade Lozano +* 3ecf741 - (8 days ago) Update conf.py - Fabian Andrade Lozano +* 420b06a - (8 days ago) deleted .identifier - Fabian Andrade +* bcd2638 - (8 days ago) deleted nexflow files and index modified - Fabian Andrade +* d8a344a - (8 days ago) Update .gitignore - Fabian Andrade Lozano +* 73ec2f8 - (8 days ago) .gitignore modified - Fabian Andrade +* 6936151 - (8 days ago) .gitignore modified - Fabian Andrade +* 0ee1aea - (8 days ago) .gitignore added - Fabian Andrade +* 27de573 - (9 days ago) added .github - Fabian Andrade +* 1f3695a - (9 days ago) conf.py added - Fabian Andrade +* 326b180 - (9 days ago) .readthedocs.yaml file added - Fabian Andrade +* 466e385 - (9 days ago) NGS-QC creation and library preparation cap - Fabian Andrade +* a017fe8 - (8 days ago) Add changes for 299521219f5443c437b1fa69af7cf570d338b806 - GitHub Action (origin/gh-pages) +* f02f696 - (8 days ago) Add changes for 4e3daf5cafdfb16f4a6ad9c34e9ef08d4072168c - GitHub Action +* 5b08897 - (8 days ago) Add changes for 3ecf7418207461621b2178860f2d873af9686570 - GitHub Action +* f9d44a2 - (8 days ago) Initial commit on orphan branch - Fabian Andrade (gh-pages) +* 189253c - (8 days ago) Initial commit on orphan branch - Fabian Andrade (origin/NGS-QC-Course-web) \ No newline at 
end of file