fixed timetable and order

BackofenLab · Apr 23, 2024 · fec3637
1 parent a4469f3
commit fec3637
Showing 3 changed files with 184 additions and 187 deletions.
diff --git a/assets/timetable.json b/assets/timetable.json
@@ -1,16 +1,13 @@
 {
     "exercise-sheet-1": "2024-04-17T09:00:00",
-    "exercise-sheet-2": "2024-12-30T09:00:00",
-    "exercise-sheet-2": "2024-12-30T09:00:00",
-    "exercise-sheet-3": "2024-12-30T09:00:00",
-    "exercise-sheet-4": "2024-12-30T09:00:00",
-    "exercise-sheet-5": "2024-12-30T09:00:00",
-    "exercise-sheet-6": "2024-12-30T09:00:00",
-    "exercise-sheet-7": "2024-12-30T09:00:00",
-    "exercise-sheet-8": "2024-12-30T09:00:00",
-    "exercise-sheet-9": "2024-12-30T09:00:00",
-    "exercise-sheet-10": "2024-12-30T09:00:00",
-    "exercise-sheet-11": "2024-12-30T09:00:00",
-    "exercise-sheet-12": "2024-12-30T09:00:00"
-
+    "exercise-sheet-2": "2024-04-30T09:00:00",
+    "exercise-sheet-3": "2024-05-07T09:00:00",
+    "exercise-sheet-4": "2024-05-14T09:00:00",
+    "exercise-sheet-5": "2024-05-28T09:00:00",
+    "exercise-sheet-6": "2024-06-04T09:00:00",
+    "exercise-sheet-7": "2024-06-11T09:00:00",
+    "exercise-sheet-8": "2024-06-18T09:00:00",
+    "exercise-sheet-9": "2024-06-25T09:00:00",
+    "exercise-sheet-10": "2024-07-02T09:00:00",
+    "exercise-sheet-11": "2024-07-09T09:00:00"
 }
diff --git a/exercise-sheet-8.Rmd b/exercise-sheet-8.Rmd
@@ -6,22 +6,27 @@ library(officer)
 ```
 
 ---
-title: "Exercise sheet 8: Suffix-Trees"
+title: "Exercise sheet 9: Data Driven Life Sciences"
 ---
 
 ---------------------------------
 
 # Exercise 1
 
-You are given the text T=`CAGTAGTAGC`.
 
+### 1a)
+::: {.question data-latex=""}
+Arrange the following terms into their correct order in the Illumina sequencing method and describe each of them briefly:
 
+- bridge amplification
 
-### 1a)
+- deblocking
 
-::: {.question data-latex=""}
+- library preparation
+
+- annealing of template strands to flow cell
 
-Draw the corresponding suffix tree!
+- fluorescence detection
 ::: 
 
 #### {.tabset}
@@ -31,125 +36,79 @@ Draw the corresponding suffix tree!
 ##### Solution
 ::: {.answer data-latex=""}
 
-```{r, echo=FALSE, out.width="100%", fig.align='center'}
-knitr::include_graphics("figures/sheet-8/suffix_tree_1.png")
-```
-::: 
+**1. Library preparation**
 
-#### {-}
+A sequencing *library* gets *prepared* from a sample by fragmenting the original DNA and adding Illumina-specific adapter sequences to both ends of the fragments. The *library* is what gets read during sequencing.
 
+**2. Template strand annealing**
 
-### 1b)
-::: {.question data-latex=""}
+The single-stranded library fragments are used as *template strands* in the sequencing and are *annealed* to primer sequences, which are bound to the *flow cell* and are complementary to the adapter sequences of the fragments.
 
-Describe the steps of a counting query for $P =$ `TAG`.
-::: 
+**3. Bridge amplification**
 
-#### {.tabset}
+After complementary strands have been synthesized and the templates have been washed off, the now flow cell-bound fragments are *amplified* in several cycles of so-called *bridge amplification* to form fragment colonies, or *clusters*, on the flow cell. This guarantees a detectable fluorescence signal during sequencing.
 
-##### Hide
+**4. Fluorescence detection**
 
-##### Solution
-::: {.answer data-latex=""}
+Illumina sequencing is a form of *sequencing-by-synthesis* in which the nucleotides incorporated into the growing strand are detected via attached *fluorophores*. After the first $3$ steps, the following steps are repeated in cycles to sequence the entire read:
 
-* start at root node
-* locate outgoing edge that starts with $T$
-* match subsequent characters of the pattern
-* in the subtree rooted at TAG count the number of leaves $\Rightarrow 2$
-::: 
-#### {-}
+Modified nucleotides, each carrying a fluorescent group, are used to extend the strand. Once the fluorescence signal has been recorded, their blocking groups are cleaved from their 3'-OH groups.
 
+**5. Deblocking**
 
+*Deblocking* is the removal of the fluorophore (blocking group). It is necessary before a new round of elongation by one nucleotide can begin.
 
-### 1c)
-::: {.question data-latex=""}
 
-Describe the steps of a reporting query for $P =$ `AG`.
-::: 
-
-#### {.tabset}
-
-##### Hide
-
-##### Solution
-::: {.answer data-latex=""}
-
-* start at root node
-* locate outgoing edge that start with $A$
-* match subsequent characters of the pattern
-* in the subtree rooted at AG report the labels of all leaves $\Rightarrow \{2, 5, 8\}$
+More information about this topic can be found on the [Illumina Webpage](https://www.illumina.com/science/technology/next-generation-sequencing/sequencing-technology.html).
 ::: 
 #### {-}
 
 # Exercise 2
 
+```{r, echo=FALSE, out.width="75%", fig.align='center'}
+knitr::include_graphics("figures/sheet-9/crossword.png")
+```
+
 ### 2a)
 ::: {.question data-latex=""}
 
-Draw a generalized suffix tree for the sequences $A=$`CCATG` and $B=$ `CATG`.
-::: 
+**Solve the crossword puzzle!**
 
-#### {.tabset}
+Horizontal:
 
-##### Hide
+- 3. Added to DNA fragments during library preparation.
 
-##### Hint 1 
-::: {.answer data-latex=""}
+- 8. Illumina way of determining the order of nucleotides in a DNA strand. (3 words)
 
-Concatenate the two sequences using a unique character for splitting. e.g.
-`CCATG#CATG$`.
+- 9. ChIP-Seq can be used for sequencing DNA regions that are bound by these.
 
-Dont forget to include suffix links!
-::: 
-##### Formulae
-::: {.answer data-latex=""}
+- 11. The alphabet of life.
 
-$sl(v) = w$
+- 12. Formed by bridge amplification on Illumina flow cells.
 
-$\overline{v} = cb$
+- 13. The flow cell surface is covered with these 2 different DNA molecules.
 
-$\overline{w} = b$
+- 15. Measure to assess the quality of the identification of nucleobases generated by automated DNA sequencing. (3 words)
 
-$c: character, b: string$
 
+Vertical:
 
-remember: $\overline{v}$ denotes the concatenation of all path labels from the root to v.
-::: 
-##### Solution
-::: {.answer data-latex=""}
+- 1. Dideoxynucleosidetriphosphates (abbrev.)
 
-```{r, echo=FALSE, out.width="100%", fig.align='center'}
-knitr::include_graphics("figures/sheet-8/suffix_tree_2.png")
-```
-::: 
-#### {-}
+- 2. Process of determining positions of reads on the reference genome.
 
-### 2b)
-::: {.question data-latex=""}
+- 4. Gene expression can be measured using this. (abbrev. hyph.)
 
-Find the Maximal Unique Matches of the sequences $A=$`CCATG` and $B=$`CATG` using 
-the tree from A).
-::: 
-
-#### {.tabset}
+- 5. The process of making many copies of a piece of DNA.
 
-##### Hide
+- 6. Found in pairs in DNA.
 
-##### Solution
-::: {.answer data-latex=""}
+- 7. Chemical group attached to nucleotides to monitor incorporation into DNA.
 
-`CATG` is the only MUM as $\overline{v} =$ `CATG` has no suffix links pointing to
-it
-::: 
-#### {-}
+- 10. File format used to store sequence information.
 
+- 14. Breakthrough sequencing method (abbrev.)
 
-# Exercise 3
-
-### 3a)
-::: {.question data-latex=""}
-
-Draw a generalized suffix tree for the sequence $A=$`ACGCACGCG`.
 ::: 
 
 #### {.tabset}
@@ -158,55 +117,40 @@ Draw a generalized suffix tree for the sequence $A=$`ACGCACGCG`.
 
 ##### Solution
 ::: {.answer data-latex=""}
-
-```{r, echo=FALSE, out.width="100%", fig.align='center'}
-knitr::include_graphics("figures/sheet-8/suffix_tree_3.png")
+```{r, echo=FALSE, out.width="75%", fig.align='center'}
+knitr::include_graphics("figures/sheet-9/crossword_solved.png")
 ```
 ::: 
-
 #### {-}
 
+# Exercise 3
+
+#### {.tabset}
 
-### 3b)
+### 3a)
 ::: {.question data-latex=""}
+You want to determine how many reads $N$ are needed to achieve a coverage depth $C$ of 20X when sequencing *Escherichia coli*.
 
-Find all maximal pairs of length at least 2.
+The length of the reads $L$ is 30 nt and the *E. coli* genome $G$ is approximately 4.6 million bases long.
 ::: 
 
 #### {.tabset}
 
 ##### Hide
 
-##### Solution
+##### Formula
 ::: {.answer data-latex=""}
-
-`ACGC`: $(1,5,4)$
-
-`CG`: $(2,8,2), (6,8,2)$
+$$
+N = \frac{C\times G}{L}
+$$
 ::: 
-#### {-}
-
-
-### 3c)
-::: {.question data-latex=""}
-
-Why is `C`: $(2, 8, 1)$ not a maximal pair?
-
-::: 
-
-#### {.tabset}
-
-##### Hide
 
 ##### Solution
 ::: {.answer data-latex=""}
-
-It is not right maximal.
-This can be seen since `CG`: $(2, 8, 2)$ already includes the indices 2 and 8 with
-a longer match. 
-
+$$
+N = \frac{20\times 4600000}{30} \approx 3066667 \text{ reads}
+$$
 ::: 
-#### {-}
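
Note on the 3a solution: the arithmetic can be checked with a few lines of Python. This is only a sketch and not part of the commit. The function name `reads_needed` is hypothetical, and rounding up to a whole read is an assumption that matches the rounded value given in the solution.

```python
import math


def reads_needed(coverage: float, genome_length: int, read_length: int) -> int:
    """Smallest whole number of reads N such that N * L / G >= C.

    Implements N = C * G / L from the exercise and rounds up,
    because a fractional read cannot be sequenced.
    """
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return math.ceil(coverage * genome_length / read_length)


# Values from exercise 3a: C = 20X, G = 4.6 million bases, L = 30 nt.
n = reads_needed(coverage=20, genome_length=4_600_000, read_length=30)
print(n)  # 3066667
```

With the values from the exercise this gives 3066667 reads, in agreement with the solution block.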