Why LIR and LIRBase?

Long inverted repeats, long hpRNAs and siRNAs

An inverted repeat is a single-stranded nucleotide sequence followed downstream by its reverse complement. The intervening sequence between the initial sequence and its reverse complement can be of any length, including zero. When transcribed, a long inverted repeat (LIR) can form a long hairpin RNA (hpRNA), which is much longer than a typical animal or plant pre-miRNA.
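The definition above can be made concrete with a minimal Python sketch. The helper names are illustrative and not part of LIRBase, and the check requires perfect complementarity between the arms, which is a simplification: the arms of inverted repeats found in real genomes need not match perfectly.

```python
# Illustrative helpers only; these names are not part of LIRBase.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def make_inverted_repeat(arm: str, spacer: str = "") -> str:
    """Build arm + spacer + reverse complement of arm (spacer may be empty)."""
    return arm + spacer + reverse_complement(arm)


def is_perfect_inverted_repeat(seq: str, arm_len: int) -> bool:
    """Check that the last arm_len bases are the reverse complement of the first arm_len bases."""
    if arm_len <= 0 or 2 * arm_len > len(seq):
        return False
    return seq[-arm_len:] == reverse_complement(seq[:arm_len])


example = make_inverted_repeat("GATTACA", spacer="TTT")
print(example)  # GATTACATTTTGTAATC
print(is_perfect_inverted_repeat(example, arm_len=7))  # True
```

When such a sequence is transcribed, the two arms can base-pair with each other and the spacer forms the loop of the hairpin.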

Henderson et al. were the first to report the biogenesis of small interfering RNAs (siRNAs) from a long inverted repeat in Arabidopsis thaliana (Henderson et al. 2006 Nature Genetics). This siRNA biogenesis pathway was soon verified in Drosophila (Czech et al. 2008 Nature). In 2008, Okamura et al. systematically characterized the genes and mechanisms underlying the biogenesis of 21-22-nucleotide siRNAs from long hpRNAs encoded by LIRs in Drosophila (Okamura et al. 2008 Nature). They found that Dicer-2, Hen1 and Argonaute 2 play vital roles in this pathway. The pathway was later characterized further in Arabidopsis (Dunoyer et al. 2010 EMBO J).

LIRs can act as functional genomic elements in eukaryotic genomes.
A typical long inverted repeat and the small RNAs derived from it, as analyzed with LIRBase, are shown in the following image.

<style> .aligncenter { text-align: center; } </style>

siRNAs derived from long inverted repeats play important biological roles

In 2018, Lin et al. identified two long hpRNAs in Drosophila simulans that can be processed into 21-nt siRNAs (Tao et al. 2007a PLOS Biology; Tao et al. 2007b PLOS Biology; Lin et al. 2018 Developmental Cell). These siRNAs repress the expression of the Dox and MDox genes, which promote X chromosome transmission by suppressing Y-bearing sperm. As a result, the two long hpRNAs and the siRNAs derived from them are critical to maintaining a balanced sex ratio in the offspring of Drosophila simulans.

Biological functions of siRNAs derived from long inverted repeats have also been reported in other plants and animals in recent years.

In mouse, siRNAs derived from LIRs were reported to regulate gene expression in oocytes (Tam et al. 2008 Nature; Watanabe et al. 2008 Nature).

In Drosophila, another hpRNA and the derived siRNAs were reported to regulate testis gene expression and control male fertility (Wen et al. 2015 Molecular Cell).

In apple, a long hpRNA and the siRNAs generated from it contribute to resistance to leaf spot disease (Zhang et al. 2018 Plant Cell).

In soybean, a long hpRNA and the derived 22-nt siRNAs regulate seed coat color (Tuteja et al. 2009 Plant Cell; Cho et al. 2013 PLOS ONE; Jia et al. 2020 Plant Cell).

In rice, we previously found that several LIRs were present in one parental genome of an elite hybrid but absent from the other parental genome (Yao et al. 2020 Computational and Structural Biotechnology Journal). As a result, siRNAs derived from these LIRs were detected and expressed in only one parent. The association between the LIRs and the siRNAs was further detected and verified in an F2 population derived from a self-cross of the elite hybrid.


Comprehensive genome-wide identification of LIRs and long hpRNAs in eukaryotic genomes is urgently needed

In 2013, Axtell called for the comprehensive genome-wide identification and annotation of long inverted repeats and long hpRNAs (Axtell et al. 2013 Annual Review of Plant Biology). However, genome-wide identification and annotation of long inverted repeats have been conducted in only a few organisms. To date, no database or web server exists for the annotation and analysis of long inverted repeats and long hpRNAs.

Using Inverted Repeats Finder (IRF) (Warburton et al. 2004 Genome Research), we identified a total of 6,789,791 long inverted repeats in the whole genomes of 424 eukaryotes, including 297,317 LIRs in 77 invertebrate metazoa genomes, 1,902,296 LIRs in 142 plant genomes and 4,590,178 LIRs in 208 vertebrate genomes. We required a minimum length of 400 nt for each arm of the long inverted repeats identified by IRF, in order to exclude potential miniature inverted-repeat transposable elements (MITEs) and Alu elements from the IRF results.
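The arm-length criterion described above can be sketched as a simple filter. The record layout, the field names and the assumption of 1-based inclusive coordinates are hypothetical and do not reflect the actual output format of IRF; only the 400-nt threshold comes from the text.

```python
from dataclasses import dataclass
from typing import Iterable, List

MIN_ARM_LEN = 400  # nt, the threshold stated in the text


@dataclass(frozen=True)
class InvertedRepeat:
    # Hypothetical record layout; coordinates are assumed 1-based and inclusive.
    chrom: str
    left_start: int
    left_end: int
    right_start: int
    right_end: int

    @property
    def left_len(self) -> int:
        return self.left_end - self.left_start + 1

    @property
    def right_len(self) -> int:
        return self.right_end - self.right_start + 1


def is_long_inverted_repeat(ir: InvertedRepeat, min_arm_len: int = MIN_ARM_LEN) -> bool:
    """Keep a repeat only if BOTH arms reach the minimum length."""
    return ir.left_len >= min_arm_len and ir.right_len >= min_arm_len


def filter_lirs(records: Iterable[InvertedRepeat],
                min_arm_len: int = MIN_ARM_LEN) -> List[InvertedRepeat]:
    """Return the records that pass the arm-length criterion, in input order."""
    return [r for r in records if is_long_inverted_repeat(r, min_arm_len)]


candidates = [
    InvertedRepeat("chr1", 1, 400, 501, 900),     # both arms 400 nt: kept
    InvertedRepeat("chr1", 1, 399, 501, 900),     # left arm 399 nt: dropped
    InvertedRepeat("chr2", 100, 700, 800, 1100),  # right arm 301 nt: dropped
]
print(len(filter_lirs(candidates)))  # 1
```

Applying the threshold to each arm separately, rather than to the total length, is what excludes short elements such as MITEs even when the loop between their arms is long.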

Nomenclature of a long inverted repeat in LIRBase

Each long inverted repeat has a unique identifier in LIRBase, determined by the species name and several features of the LIR: the chromosome ID, the start and end coordinates of the left arm, and the start and end coordinates of the right arm.

Please note that the sequence of an LIR in LIRBase is composed of the left arm sequence, the loop sequence and the right arm sequence, as well as two 200-bp sequences flanking the LIR (the left flanking sequence and the right flanking sequence). The genomic coordinates of both arms are reflected in the identifier of the LIR, whereas the flanking sequences are not.
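As a rough illustration of how the identifier fields relate to the stored sequence, the sketch below joins the fields into a string and splits a stored sequence into its five parts. The separator and field order of the identifier are hypothetical (the exact LIRBase format is not specified here), and the code assumes 1-based inclusive coordinates and flanks that are not truncated at chromosome ends.

```python
FLANK_LEN = 200  # bp on each side, as stated in the text


def make_identifier(species, chrom, left_start, left_end,
                    right_start, right_end, sep=":"):
    """Join the identifying fields. Separator and field order are hypothetical."""
    fields = (species, chrom, left_start, left_end, right_start, right_end)
    return sep.join(str(f) for f in fields)


def split_lir_sequence(seq, left_start, left_end, right_start, right_end,
                       flank_len=FLANK_LEN):
    """Split a stored sequence into flanks, arms and loop using the arm coordinates.

    Assumes 1-based inclusive coordinates and full-length flanks on both sides.
    """
    left_arm_len = left_end - left_start + 1
    loop_len = right_start - left_end - 1  # may be zero
    right_arm_len = right_end - right_start + 1
    if min(left_arm_len, right_arm_len) <= 0 or loop_len < 0:
        raise ValueError("inconsistent arm coordinates")

    names = ["left_flank", "left_arm", "loop", "right_arm", "right_flank"]
    lengths = [flank_len, left_arm_len, loop_len, right_arm_len, flank_len]
    if len(seq) != sum(lengths):
        raise ValueError("sequence length does not match coordinates and flanks")

    parts, pos = {}, 0
    for name, n in zip(names, lengths):
        parts[name] = seq[pos:pos + n]
        pos += n
    return parts


# Toy example with 2-bp flanks so the pieces are easy to read.
toy = "NN" + "ACG" + "TT" + "CGT" + "NN"
print(split_lir_sequence(toy, 3, 5, 8, 10, flank_len=2))
```

Because the flanks are not encoded in the identifier, the stored sequence is 400 bp longer than the span between the start of the left arm and the end of the right arm.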


References