Add changes for 8d3ef45
actions-user committed Sep 2, 2024
1 parent 092003c commit 24163d1
Showing 19 changed files with 196 additions and 35 deletions.
22 changes: 19 additions & 3 deletions 1- Library preparation.html
@@ -22,7 +22,9 @@
<script src="_static/js/theme.js"></script>
<link rel="author" title="About these documents" href="about.html" />
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="2 Main Sequencing Technologies" href="2-%20Sequencing%20technologies.html" />
<link rel="prev" title="About the course" href="about.html" />
</head>

<body class="wy-body-for-nav">
@@ -44,8 +46,19 @@
</form>
</div>
</div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
<ul>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="about.html">About the course</a></li>
<li class="toctree-l1 current"><a class="current reference internal" href="#">1 Library preparation</a><ul>
<li class="toctree-l2"><a class="reference internal" href="#main-causes-of-poor-quality-data">Main Causes of poor quality data</a></li>
<li class="toctree-l2"><a class="reference internal" href="#id1">Library preparation</a></li>
<li class="toctree-l2"><a class="reference internal" href="#library-preparation-bias">Library preparation bias</a><ul>
<li class="toctree-l3"><a class="reference internal" href="#dna-library-bias">DNA library bias</a></li>
<li class="toctree-l3"><a class="reference internal" href="#rna-library-bias">RNA library bias</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="2-%20Sequencing%20technologies.html">2 Main Sequencing Technologies</a></li>
<li class="toctree-l1"><a class="reference internal" href="3-%20Quality%20Control%20and%20Preprocessing.html">3 Quality Control and Preprocessing</a></li>
<li class="toctree-l1"><a class="reference internal" href="4-%20Quality%20of%20the%20mapping.html">4 Quality of the Mapping</a></li>
</ul>
@@ -228,7 +241,10 @@ <h3>RNA library bias<a class="headerlink" href="#rna-library-bias" title="Link t

</div>
</div>
<footer>
<footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
<a href="about.html" class="btn btn-neutral float-left" title="About the course" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
<a href="2-%20Sequencing%20technologies.html" class="btn btn-neutral float-right" title="2 Main Sequencing Technologies" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
</div>

<hr/>

22 changes: 19 additions & 3 deletions 2- Sequencing technologies.html
@@ -22,7 +22,9 @@
<script src="_static/js/theme.js"></script>
<link rel="author" title="About these documents" href="about.html" />
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="3 Quality Control and Preprocessing" href="3-%20Quality%20Control%20and%20Preprocessing.html" />
<link rel="prev" title="1 Library preparation" href="1-%20Library%20preparation.html" />
</head>

<body class="wy-body-for-nav">
@@ -44,8 +46,19 @@
</form>
</div>
</div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
<ul>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="about.html">About the course</a></li>
<li class="toctree-l1"><a class="reference internal" href="1-%20Library%20preparation.html">1 Library preparation</a></li>
<li class="toctree-l1 current"><a class="current reference internal" href="#">2 Main Sequencing Technologies</a><ul>
<li class="toctree-l2"><a class="reference internal" href="#short-reads-sequencing-illumina">Short Reads sequencing (Illumina)</a><ul>
<li class="toctree-l3"><a class="reference internal" href="#single-end">Single end</a></li>
<li class="toctree-l3"><a class="reference internal" href="#paired-end">Paired end</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="#long-read-sequencing-nanopore">Long read sequencing (Nanopore)</a></li>
<li class="toctree-l2"><a class="reference internal" href="#fastq-format-and-phred-quality-score">FASTQ format and Phred quality score</a></li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="3-%20Quality%20Control%20and%20Preprocessing.html">3 Quality Control and Preprocessing</a></li>
<li class="toctree-l1"><a class="reference internal" href="4-%20Quality%20of%20the%20mapping.html">4 Quality of the Mapping</a></li>
</ul>
@@ -157,7 +170,10 @@ <h2>FASTQ format and Phred quality score<a class="headerlink" href="#fastq-forma

</div>
</div>
<footer>
<footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
<a href="1-%20Library%20preparation.html" class="btn btn-neutral float-left" title="1 Library preparation" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
<a href="3-%20Quality%20Control%20and%20Preprocessing.html" class="btn btn-neutral float-right" title="3 Quality Control and Preprocessing" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
</div>

<hr/>

82 changes: 66 additions & 16 deletions 3- Quality Control and Preprocessing.html
@@ -24,7 +24,7 @@
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="4 Quality of the Mapping" href="4-%20Quality%20of%20the%20mapping.html" />
<link rel="prev" title="About the course" href="about.html" />
<link rel="prev" title="2 Main Sequencing Technologies" href="2-%20Sequencing%20technologies.html" />
</head>

<body class="wy-body-for-nav">
@@ -48,6 +48,8 @@
</div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="about.html">About the course</a></li>
<li class="toctree-l1"><a class="reference internal" href="1-%20Library%20preparation.html">1 Library preparation</a></li>
<li class="toctree-l1"><a class="reference internal" href="2-%20Sequencing%20technologies.html">2 Main Sequencing Technologies</a></li>
<li class="toctree-l1 current"><a class="current reference internal" href="#">3 Quality Control and Preprocessing</a><ul>
<li class="toctree-l2"><a class="reference internal" href="#illumina">Illumina</a><ul>
<li class="toctree-l3"><a class="reference internal" href="#quality-control">Quality Control</a><ul>
@@ -114,27 +116,58 @@ <h4>FASTQC<a class="headerlink" href="#fastqc" title="Link to this heading"><
<blockquote>
<div><ol class="arabic simple">
<li><p><strong>Basic Statistics</strong>: displays information about the file, the number and length of the sequences, and the overall %GC.</p></li>
<li><dl class="simple">
<dt><strong>Per base sequence quality</strong>: shows how the quality score (y axis) varies along the sequence reads (x axis). For each position a box-and-whisker plot is displayed; the red line represents the median and the blue line the mean. The quality score commonly tends to decrease towards the end of the reads, because the polymerase tends to make more mistakes as the read progresses.</dt><dd><p>If the median at any base position is less than 25, a warning is raised.</p>
</dd>
</dl>
</ol>
<p>2. <strong>Per base sequence quality</strong>: shows how the quality score (y axis) varies along the sequence reads (x axis).
For each position a box-and-whisker plot is displayed; the red line represents the median and the blue line the mean.
The quality score commonly tends to decrease towards the end of the reads, because the polymerase tends to make more mistakes as the read progresses.</p>
<blockquote>
<div><p>If the median at any base position is less than 25, a warning is raised.</p>
<a class="reference internal image-reference" href="_images/Per_base_seq_quality.png"><img alt="*Per Base Sequence Quality FASTQC module*" class="align-center" src="_images/Per_base_seq_quality.png" style="width: 400px;" /></a>
</div></blockquote>
<ol class="arabic" start="3">
<li><p><strong>Per tile sequence quality</strong>: shows the quality score distribution for each tile in the flowcell.</p>
<blockquote>
<div><a class="reference internal image-reference" href="_images/Per_tile_seq_quality.png"><img alt="*Per Tile Sequence Quality FASTQC module*" class="align-center" src="_images/Per_tile_seq_quality.png" style="width: 400px;" /></a>
</div></blockquote>
</li>
<li><p><strong>Per tile sequence quality</strong>: shows the quality score distribution for each tile in the flowcell.</p></li>
<li><p><strong>Per sequence quality score</strong>: shows the distribution of the quality scores for all the reads in the file. If a large subset of reads has a poor average quality, this could indicate a systematic problem.</p></li>
<li><p><strong>Per base sequence content</strong>: proportion of each of the four nucleotides at each base position. A strong bias in the nucleotide composition could indicate a problem in the library preparation.</p></li>
<li><p><strong>Per sequence GC content</strong>: GC content distribution for all the reads in the file, compared to a modelled normal distribution of human GC content.</p></li>
</ol>
<p>4. <strong>Per sequence quality score</strong>: shows the distribution of the quality scores for all the reads in the file.
If a large subset of reads has a poor average quality, this could indicate a systematic problem.</p>
<blockquote>
<div><a class="reference internal image-reference" href="docs/images/FASTQC_report_images/Per_seq_quality_scores.png"><img alt="*Per Sequence Quality FASTQC module*" class="align-center" src="docs/images/FASTQC_report_images/Per_seq_quality_scores.png" style="width: 400px;" /></a>
</div></blockquote>
<p>5. <strong>Per base sequence content</strong>: proportion of each of the four nucleotides at each base position.
A strong bias in the nucleotide composition could indicate a problem in the library preparation.</p>
<blockquote>
<div><a class="reference internal image-reference" href="_images/Per_base_seq_content.png"><img alt="*Per Base Sequence Content FASTQC module*" class="align-center" src="_images/Per_base_seq_content.png" style="width: 400px;" /></a>
</div></blockquote>
<ol class="arabic" start="6">
<li><p><strong>Per sequence GC content</strong>: GC content distribution for all the reads in the file, compared to a modelled normal distribution of human GC content.</p>
<blockquote>
<div><a class="reference internal image-reference" href="_images/Per_seq_GC_content.png"><img alt="*Per Sequence GC Content FASTQC module*" class="align-center" src="_images/Per_seq_GC_content.png" style="width: 400px;" /></a>
<div class="admonition danger">
<p class="admonition-title">Danger</p>
<p>If the GC content is not close to the normal distribution, this could indicate contamination or a problem in the library preparation.
Also, the GC content varies between organisms, so it is important to know the GC content of the organism of interest (and to avoid comparing against the reference curve in that case).</p>
</div>
<ol class="arabic simple" start="7">
<li><p><strong>Per Base N content</strong>: If the sequencer is unable to determine the base in a position, it will be represented as an ‘N’. This section shows the distribution of Ns in the reads.</p></li>
<li><p><strong>Sequence Length Distribution</strong>: distribution of fragment sizes; for a fixed read length (number of cycles), a single peak at one size is observed.</p></li>
<li><p><strong>Duplicate Sequences</strong>: shows the number of duplicated sequences in the file. A high level of duplication could indicate an enrichment bias (e.g. PCR amplification). A low level of duplication may indicate a very high level of coverage of the target sequence.</p></li>
</div></blockquote>
</li>
<li><p><strong>Per Base N content</strong>: If the sequencer is unable to determine the base in a position, it will be represented as an ‘N’. This section shows the distribution of Ns in the reads.</p>
<blockquote>
<div><a class="reference internal image-reference" href="_images/Per_base_N_content.png"><img alt="*Per Base N Content FASTQC module*" class="align-center" src="_images/Per_base_N_content.png" style="width: 400px;" /></a>
</div></blockquote>
</li>
</ol>
<ol class="arabic" start="9">
<li><p><strong>Duplicate Sequences</strong>: shows the number of duplicated sequences in the file. A high level of duplication could indicate an enrichment bias (e.g. PCR amplification). A low level of duplication may indicate a very high level of coverage of the target sequence.</p>
<blockquote>
<div><a class="reference internal image-reference" href="_images/Seq_duplication_levels.png"><img alt="*Duplicate Sequences FASTQC module*" class="align-center" src="_images/Seq_duplication_levels.png" style="width: 400px;" /></a>
</div></blockquote>
</li>
<li><p><strong>Overrepresented sequences</strong>: shows whether a single sequence is heavily overrepresented in the file. This could indicate contamination or a problem in the library preparation.</p></li>
<li><p><a href="#id1"><span class="problematic" id="id2">**</span></a>Adapter content <a href="#id3"><span class="problematic" id="id4">**</span></a>: shows the presence of adapter sequences in the reads. If there is presence of adapters, the reads should be trimmed before further analysis.</p></li>
<li><p><a href="#id1"><span class="problematic" id="id2">**</span></a>Adapter content <a href="#id3"><span class="problematic" id="id4">**</span></a>: shows the presence of adapter sequences in the reads. If there is presence of adapters, the reads should be trimmed before further analysis.</p>
<a class="reference internal image-reference" href="_images/Adapter_content.png"><img alt="*Adapter Content FASTQC module*" class="align-center" src="_images/Adapter_content.png" style="width: 400px;" /></a>
</li>
</ol>
</div></blockquote>
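<p>The "Per base sequence quality" check described above can be sketched in a few lines of Python. This is only an illustration, not how FASTQC itself is implemented: it assumes four-line FASTQ records, the Phred+33 encoding, and the median threshold of 25 mentioned in the text. The function names and the three example reads are hypothetical.</p>

```python
import io
from statistics import median


def per_position_median_quality(fastq_handle, offset=33):
    """Return the median Phred score at each read position.

    Assumes four-line FASTQ records and Phred+33 encoding (an assumption,
    not something read from the file).
    """
    columns = []  # columns[i] collects the scores observed at position i
    for line_number, line in enumerate(fastq_handle):
        if line_number % 4 != 3:  # the quality string is the 4th line of a record
            continue
        for position, char in enumerate(line.rstrip("\n")):
            if position == len(columns):
                columns.append([])
            columns[position].append(ord(char) - offset)
    return [median(scores) for scores in columns]


def positions_below(medians, threshold=25):
    """Return the 1-based positions whose median quality is below the threshold."""
    return [i + 1 for i, m in enumerate(medians) if m < threshold]


# Hypothetical reads: 'I' encodes Phred 40 and '5' encodes Phred 20.
example = io.StringIO(
    "@read1\nACGT\n+\nIII5\n"
    "@read2\nACGT\n+\nII55\n"
    "@read3\nACGT\n+\nIIII\n"
)
medians = per_position_median_quality(example)
flagged = positions_below(medians)
print(medians, flagged)
```

<p>In this toy input only the last position would be flagged, which mirrors the usual drop in quality towards the end of the reads.</p>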
<div class="admonition seealso">
@@ -157,15 +190,32 @@ <h4>FASTQ-Screen<a class="headerlink" href="#fastq-screen" title="Link to this h
<li><p>Yeast</p></li>
<li><p>Arabidopsis</p></li>
<li><p>E.coli</p></li>
<li><p>Mitochondrial: in single-nucleus RNA-seq, this is a good control of the nuclear isolation during the DNA extraction.</p></li>
</ul>
</div></blockquote>
<p>Other sources of contaminants can also be checked:</p>
<blockquote>
<div><ul class="simple">
<li><p>PhiX: a control used by Illumina to check the quality of the sequencing run (whether the library is under- or overloaded).</p></li>
<li><p>rRNA: in RNA-seq, this is a good control that rRNA was depleted during library preparation and has not been amplified.</p></li>
<li><p>Mitochondrial: in single-nucleus RNA-seq, this is a good control of the nuclear isolation during the DNA extraction.</p></li>
<li><p>Lambda</p></li>
<li><p>Vectors: to check for vectors used during library preparation.</p></li>
<li><p>Adapters</p></li>
</ul>
</div></blockquote>
<p>Example of a FASTQ-Screen report:</p>
<ul>
<li><p>Mapping result tables with the percentage of reads that map to each reference genome.</p>
<blockquote>
<div><a class="reference internal image-reference" href="_images/Mapping_results_tables.png"><img alt="*Mapping results tables FASTQ-Screen report*" class="align-center" src="_images/Mapping_results_tables.png" style="width: 400px;" /></a>
</div></blockquote>
</li>
<li><p>Mapping results tables values in a plot.</p>
<blockquote>
<div><a class="reference internal image-reference" href="images/FASTQ-Screen/Mapping_results_plots.png"><img alt="*Mapping results plot FASTQ-Screen report*" class="align-center" src="images/FASTQ-Screen/Mapping_results_plots.png" style="width: 400px;" /></a>
</div></blockquote>
</li>
</ul>
<p>When working with several samples and reports, these can be aggregated into a single report using “MultiQC” (<a class="reference external" href="https://multiqc.info/">https://multiqc.info/</a>).</p>
</section>
</section>
@@ -209,7 +259,7 @@ <h3>Pre-processing<a class="headerlink" href="#pre-processing" title="Link to th
</div>
</div>
<footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
<a href="about.html" class="btn btn-neutral float-left" title="About the course" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
<a href="2-%20Sequencing%20technologies.html" class="btn btn-neutral float-left" title="2 Main Sequencing Technologies" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
<a href="4-%20Quality%20of%20the%20mapping.html" class="btn btn-neutral float-right" title="4 Quality of the Mapping" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
</div>

2 changes: 2 additions & 0 deletions 4- Quality of the mapping.html
@@ -48,6 +48,8 @@
</div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="about.html">About the course</a></li>
<li class="toctree-l1"><a class="reference internal" href="1-%20Library%20preparation.html">1 Library preparation</a></li>
<li class="toctree-l1"><a class="reference internal" href="2-%20Sequencing%20technologies.html">2 Main Sequencing Technologies</a></li>
<li class="toctree-l1"><a class="reference internal" href="3-%20Quality%20Control%20and%20Preprocessing.html">3 Quality Control and Preprocessing</a></li>
<li class="toctree-l1 current"><a class="current reference internal" href="#">4 Quality of the Mapping</a><ul>
<li class="toctree-l2"><a class="reference internal" href="#introduction-to-mapping-and-tools">Introduction to Mapping and tools</a><ul>
Binary file added _images/Adapter_content.png
Binary file added _images/Mapping_results_tables.png
Binary file added _images/Per_base_N_content.png
Binary file added _images/Per_base_seq_content.png
Binary file added _images/Per_base_seq_quality.png
Binary file added _images/Per_seq_GC_content.png
Binary file added _images/Per_tile_seq_quality.png
Binary file added _images/Seq_duplication_levels.png