The FASTQ format

The nucleotide sequences and qualities of the short reads produced in a sequencing experiment are commonly stored in a plain text file using the FASTQ format. In the data/raw directory, you will find two fastq files, which contain information about the short reads obtained from one of the samples in the Drosophila melanogaster experiment.

Exercise: Why do we have two fastq files for this given sample? Solution

To confirm that we are working with a fastq file and to get an idea of what this format looks like, we can print the first lines of each file by typing the following into the terminal:

zcat SRR031714_1.fastq.gz | head
zcat SRR031714_2.fastq.gz | head

Exercise: How many lines are used to represent a read in the fastq file? Which information do they contain? Solution

Exercise: How many reads are there in each file? Do both files contain the same number of reads? Is that what we would expect? Solution
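
As a hint for the last exercise, one possible way to count reads is sketched below. It assumes that every read occupies exactly four lines, which holds for the usual unwrapped FASTQ layout but is worth checking against the output of `head` first. The function name `count_reads` is a hypothetical helper introduced here for illustration and is not part of the course material.

```shell
# Hypothetical helper: count the reads in a gzipped FASTQ file.
# Assumption: each record spans exactly four lines (no wrapped sequences).
count_reads() {
    local lines
    # Decompress to stdout and count lines; reading from stdin keeps
    # zcat behaving the same way across platforms.
    lines=$(zcat < "$1" | wc -l)
    # Integer division: four lines per read.
    echo $(( lines / 4 ))
}

# Usage (run from the directory that holds the files):
# count_reads SRR031714_1.fastq.gz
# count_reads SRR031714_2.fastq.gz
```

Running the helper on both files and comparing the two numbers is enough to answer whether the files hold the same number of reads.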