Limited evidence for evolutionarily conserved targeting of long non-coding RNAs by microRNAs
© Alaei-Mahabadi and Larsson; licensee BioMed Central Ltd. 2013
Received: 5 June 2013
Accepted: 29 July 2013
Published: 20 August 2013
Long non-coding RNAs (lncRNAs) are emerging as important regulators of cell physiology, but it is yet unknown to what extent lncRNAs have evolved to be targeted by microRNAs. Comparative genomics has previously revealed widespread evolutionarily conserved microRNA targeting of protein-coding mRNAs, and here we applied a similar approach to lncRNAs.
We used a map of putative microRNA target sites in lncRNAs where site conservation was evaluated based on 46 vertebrate species. We compared observed target site frequencies to those obtained with a random model, at variable prediction stringencies. While conserved sites were not present above random expectation in intergenic lncRNAs overall, we observed a marginal over-representation of highly conserved 8-mer sites in a small subset of cytoplasmic lncRNAs (12 sites in 8 lncRNAs at 56% false discovery rate, P = 0.10).
Evolutionary conservation in lncRNAs is generally low but patch-wise high, and these patches could, in principle, harbor conserved target sites. However, while our analysis efficiently detected conserved targeting of mRNAs, it provided only limited and marginally significant support for conserved microRNA-lncRNA interactions. We conclude that conserved microRNA-lncRNA interactions could not be reliably detected with our methodology.
Keywords: Long non-coding RNA; lncRNA; microRNA; Comparative genomics
While small non-coding RNAs, such as microRNAs, have well-established functions in the cell, long non-coding RNAs (lncRNAs) have only recently started to emerge as widespread regulators of cell physiology. Although early examples were discovered decades ago, large-scale transcriptomic studies have since revealed that mammalian genomes encode thousands of long (>200 nt) transcripts that lack coding capacity, but are otherwise mRNA-like [2–4]. Their biological importance has been controversial, but novel functional lncRNAs with roles, for example, in vertebrate development, pluripotency and genome stability are now being described at increasing frequency.
A few recent studies describe interactions between small and long non-coding RNAs, where lncRNAs act either as regulatory targets of microRNA-induced destabilization [8, 9] or as molecular decoys of microRNAs [10–13]. Recent results also show that stable circular lncRNAs can bind and inhibit microRNAs [14, 15]. Importantly, RNAi-based studies, including silencing of 147 lncRNAs with lentiviral shRNAs, show that lncRNAs are, in principle, susceptible to repression by Argonaute-small RNA complexes, despite often localizing to the nucleus. In addition, there are data from crosslinking and immunoprecipitation (CLIP) experiments that support binding of Argonaute proteins to lncRNAs [16, 17].
Comparative genomics has revealed that most protein-coding genes are under conserved microRNA control: conserved microRNA target sites are present in 3’ untranslated regions (UTRs) of protein-coding mRNAs at frequencies considerably higher than randomly expected, clearly demonstrating the impact of microRNAs on mRNA evolution [18, 19]. While lncRNAs in general are weakly conserved, they may have local patches of strong sequence conservation. It was recently shown that developmental defects caused by knockdown of lncRNAs in zebrafish could be rescued by introduction of putative human orthologs identified based on such short patches, supporting the idea that lncRNA functions may be conserved over large evolutionary distances despite limited sequence similarity. It is thus plausible that lncRNAs have also evolved to be targeted by microRNAs despite their overall low conservation, and that this would manifest itself through the presence of target sites in local conserved segments.
Next, we investigated site frequencies in lncRNAs, specifically of the intergenic type to avoid confounding genomic overlaps. In a set of 2,121 intergenic lncRNA genes, we observed no significant enrichment of sites (Figure 2B). Restricting our search to 3’ or 5’ ends of transcripts, or subsets of intergenic lncRNAs previously found to have conserved promoter regions, resulted in a similar lack of enrichment (data not shown).
[Table: Pan-mammalian conserved 8-mer putative microRNA target sites in cytoplasmic intergenic long non-coding RNAs (lncRNAs)]
Conserved targeting of lncRNAs by microRNAs is plausible, given that lncRNAs are susceptible to AGO-mediated repression, and that they show patch-wise strong sequence conservation. However, our analysis indicates that this is not a widespread phenomenon, even though a small subset of cytoplasmic transcripts showed a weak enrichment of conserved sites at marginal statistical significance. LncRNAs are currently defined solely based on length and coding capacity, and are as such likely to represent a highly functionally diverse group. It is thus possible that other, not yet defined, subfamilies have evolved to be microRNA targets, but that this signal is too diluted to be detectable in our current analysis.
It should be noted that the GENCODE annotation used here is one of several published lncRNA sets, and while comprehensive, it does not cover all known transcribed loci. Likewise, there are several approaches to target site prediction and detailed results may vary. Notably, our analysis was designed to capture an overall signature of conserved targeting, and when applied to mRNAs it efficiently recapitulated a strong enrichment signal. Different implementations and annotations could give variable results at the level of individual transcripts and sites, but the main conclusion is unlikely to depend on these parameters.
While some established microRNA-lncRNA interaction sites are conserved to various extents, in principle enabling detection by comparative genomics approaches [8–10], others lack conservation despite having experimentally confirmed functions [12, 13]. This is consistent with data showing that many non-conserved human microRNA sites can mediate targeting. Notably, even well-characterized lncRNAs, such as HOTAIR and XIST, have often evolved rapidly, and may show considerable functional and structural differences within the mammalian lineage [24, 25]. Our comparative genomics methodology therefore does not exclude that non-conserved and recently evolved targeting could be commonplace, and this motivates further computational and experimental studies.
We relied on the GENCODE coding/non-coding classification, and considered as lncRNAs those genes that only produced transcripts of the ‘antisense’, ‘lincRNA’, ‘non_coding’ and ‘processed_transcript’ types. We excluded pseudogenes, as well as any gene producing any splice isoform shorter than 200 nt. Genes with symbols corresponding to any RefSeq coding gene, or to the UCSC browser xenoRefGene set, were removed from the long non-coding set, to control for a small number of cases of obviously incorrect coding/non-coding classification in the GENCODE annotation. This resulted in a set of 13,751/9,122 lncRNA transcripts/genes. A smaller subset of 2,121/2,777 intergenic lncRNA genes/transcripts was stringently defined by requiring a genomic separation of at least 10 kb from any other annotated gene.
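The filtering rules above can be illustrated with a minimal Python sketch. It is not the code used in the study: the `Transcript` record, the function names and the interval representation are hypothetical, strand is ignored, and the coding-symbol set stands in for the RefSeq and xenoRefGene symbol lists.

```python
from dataclasses import dataclass

# Transcript types accepted as lncRNA, as listed in the text.
LNCRNA_TYPES = {"antisense", "lincRNA", "non_coding", "processed_transcript"}
MIN_LENGTH = 200       # nt; genes with any shorter isoform are dropped
MIN_SEPARATION = 10_000  # bp; required distance to any other annotated gene


@dataclass
class Transcript:  # hypothetical record, not a GENCODE data structure
    gene: str
    biotype: str
    length: int


def select_lncrna_genes(transcripts, coding_symbols=frozenset()):
    """Keep genes whose transcripts ALL have an accepted type and length."""
    by_gene = {}
    for t in transcripts:
        by_gene.setdefault(t.gene, []).append(t)
    keep = set()
    for gene, isoforms in by_gene.items():
        if gene in coding_symbols:  # symbol clashes with a known coding gene
            continue
        if all(t.biotype in LNCRNA_TYPES and t.length >= MIN_LENGTH
               for t in isoforms):
            keep.add(gene)
    return keep


def is_intergenic(gene, others, min_gap=MIN_SEPARATION):
    """gene and others are (chrom, start, end) tuples; strand is ignored."""
    chrom, start, end = gene
    for o_chrom, o_start, o_end in others:
        if o_chrom != chrom:
            continue
        # Negative gap means the two intervals overlap.
        gap = max(o_start - end, start - o_end)
        if gap < min_gap:
            return False
    return True
```

Pseudogenes fall out of this sketch automatically because their transcript type is not in the accepted set; a real pipeline would read types, lengths and coordinates from the annotation file.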
MicroRNA target sites in GENCODE v7 genes were mapped as described previously. Random seed sequences were generated under a dinucleotide model that preserved nucleotide frequencies of the actual microRNA family seeds, and were subsequently mapped in the same way as the actual seed sequences. Ratios of observed-to-expected site counts were calculated based on these random seeds, for different conservation level thresholds and seed match types. To assess the statistical significance of these ratios, 20 sets of random seeds were evaluated, each set being of the same size as the set of actual conserved families (n = 87). At least 19/20 cases of ratio >1 were required for significance at the empirical P ≤ 0.05 level, and 18/20 for P = 0.10. MicroRNA family definitions and conservation classifications were derived from TargetScan. We used data from a previous study to define subsets of lncRNAs with conserved regulatory regions. The 500 or 250 most conserved intergenic lncRNAs based on either pan-mammal or pan-vertebrate promoter conservation scores (in total, four sets) were analyzed as described above.
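The random model and the empirical test can be sketched as follows. This is one possible reading of the description, not the authors' implementation: the dinucleotide model is approximated by a first-order Markov chain fitted to the real seeds, the empirical P value is taken as the fraction of random seed sets for which the observed-to-expected ratio is not above 1, and all function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def fit_dinucleotide_model(seeds):
    """Count first nucleotides and nucleotide-to-nucleotide transitions."""
    start = Counter(s[0] for s in seeds)
    trans = defaultdict(Counter)
    for s in seeds:
        for a, b in zip(s, s[1:]):
            trans[a][b] += 1
    return start, trans


def sample_seed(start, trans, length, rng):
    """Draw one random seed of the given length from the fitted model."""
    seq = rng.choices(list(start), weights=list(start.values()))[0]
    while len(seq) < length:
        nxt = trans.get(seq[-1])
        if not nxt:  # context never seen in the real seeds: reuse start counts
            nxt = start
        seq += rng.choices(list(nxt), weights=list(nxt.values()))[0]
    return seq


def empirical_p(observed, expected_counts):
    """Fraction of random seed sets with observed/expected ratio <= 1.

    observed: site count for the real seeds.
    expected_counts: one site count per random seed set (20 in the text).
    Comparing counts directly is equivalent to testing ratio > 1.
    """
    n_above = sum(observed > e for e in expected_counts)
    return (len(expected_counts) - n_above) / len(expected_counts)
```

Under this reading, 19 of 20 random sets with a ratio above 1 gives P = 0.05 and 18 of 20 gives P = 0.10, matching the thresholds stated in the text. With only 20 sets, the smallest non-zero P value that can be resolved is 0.05.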
RNA-seq data (fastq files) produced within the ENCODE project by the Gingeras laboratory (Cold Spring Harbor Laboratories, Cold Spring Harbor, NY, USA) were obtained through the UCSC FTP server. A total of 1.71 billion 76 nt read pairs from polyA+ nuclear and cytoplasmic fractions from seven human cell lines (Gm12878, HelaS3, HepG2, Huvec, H1hesc, Nhek and K562) were aligned to the human hg19 reference genome with TopHat. The aligner was supplied with GENCODE gene models using the -G option. Genes were quantified using the HTSeq-count utility (http://www-huber.embl.de/users/anders/HTSeq). Cytoplasmic transcripts were defined as having a normalized cytoplasm/nucleus ratio >1. A total of at least 20 mapped reads across all conditions was required, to avoid unreliable cytoplasm/nucleus ratios in the low-abundance range.
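The cytoplasmic classification can be illustrated with a short Python sketch. The text does not specify the normalization, so the sketch assumes that counts are pooled across cell lines and scaled by the total number of counted reads in each fraction; the function name and the input dictionaries are hypothetical.

```python
def cytoplasmic_genes(cyto_counts, nuc_counts, min_total_reads=20):
    """Return genes with a normalized cytoplasm/nucleus ratio above 1.

    cyto_counts, nuc_counts: dicts mapping gene -> read count, pooled
    across all cell lines (an assumption of this sketch).
    Normalization: each count is divided by its fraction's total count
    (also an assumption; the original normalization is not described).
    """
    cyto_total = sum(cyto_counts.values())
    nuc_total = sum(nuc_counts.values())
    if cyto_total == 0 or nuc_total == 0:
        raise ValueError("both fractions need at least one counted read")

    selected = set()
    for gene in cyto_counts.keys() | nuc_counts.keys():
        c = cyto_counts.get(gene, 0)
        n = nuc_counts.get(gene, 0)
        if c + n < min_total_reads:  # too few reads for a reliable ratio
            continue
        c_norm = c / cyto_total
        n_norm = n / nuc_total
        # A gene with no nuclear reads counts as cytoplasmic.
        if n_norm == 0 or c_norm / n_norm > 1:
            selected.add(gene)
    return selected
```

The read threshold is applied before the ratio is computed, so low-abundance genes are excluded rather than classified.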
Ethical approval or patient consent was not required for this study.
EL designed the study, analyzed data, and wrote the manuscript. BA analyzed data. Both authors read and approved the final manuscript.
CLIP: Crosslinking and immunoprecipitation
lncRNA: Long non-coding RNA
We would like to acknowledge Drs. Anders Jacobsen and Debora S. Marks for helpful comments and discussions. This work was supported by grants from the Swedish Medical Research Council; the Swedish Cancer Society; the Assar Gabrielsson Foundation; the Magnus Bergvall Foundation; the Åke Wiberg Foundation; and the Lars Hierta Memorial Foundation.
- Wang KC, Chang HY: Molecular mechanisms of long noncoding RNAs. Mol Cell. 2011, 43: 904-914. 10.1016/j.molcel.2011.08.018.
- Carninci P, Kasukawa T, Katayama S, Gough J, Frith MC, Maeda N, Oyama R, Ravasi T, Lenhard B, Wells C, Kodzius R, Shimokawa K, Bajic VB, Brenner SE, Batalov S, Forrest AR, Zavolan M, Davis MJ, Wilming LG, Aidinis V, Allen JE, Ambesi-Impiombato A, Apweiler R, Aturaliya RN, Bailey TL, Bansal M, Baxter L, Beisel KW, Bersano T, Bono H, Chalk AM, et al: The transcriptional landscape of the mammalian genome. Science. 2005, 309: 1559-1563.
- Cabili MN, Trapnell C, Goff L, Koziol M, Tazon-Vega B, Regev A, Rinn JL: Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev. 2011, 25: 1915-1927. 10.1101/gad.17446611.
- Derrien T, Johnson R, Bussotti G, Tanzer A, Djebali S, Tilgner H, Guernec G, Martin D, Merkel A, Knowles DG, Lagarde J, Veeravalli L, Ruan X, Ruan Y, Lassmann T, Carninci P, Brown JB, Lipovich L, Gonzalez JM, Thomas M, Davis CA, Shiekhattar R, Gingeras TR, Hubbard TJ, Notredame C, Harrow J, Guigó R: The GENCODE v7 catalog of human long noncoding RNAs: Analysis of their gene structure, evolution, and expression. Genome Res. 2012, 22: 1775-1789. 10.1101/gr.132159.111.
- Ulitsky I, Shkumatava A, Jan CH, Sive H, Bartel DP: Conserved function of lincRNAs in vertebrate embryonic development despite rapid sequence evolution. Cell. 2011, 147: 1537-1550. 10.1016/j.cell.2011.11.055.
- Guttman M, Donaghey J, Carey BW, Garber M, Grenier JK, Munson G, Young G, Lucas AB, Ach R, Bruhn L, Yang X, Amit I, Meissner A, Regev A, Rinn JL, Root DE, Lander ES: lincRNAs act in the circuitry controlling pluripotency and differentiation. Nature. 2011, 477: 295-300. 10.1038/nature10398.
- Huarte M, Guttman M, Feldser D, Garber M, Koziol MJ, Kenzelmann-Broz D, Khalil AM, Zuk O, Amit I, Rabani M, Attardi LD, Regev A, Lander ES, Jacks T, Rinn JL: A large intergenic noncoding RNA induced by p53 mediates global gene repression in the p53 response. Cell. 2010, 142: 409-419. 10.1016/j.cell.2010.06.040.
- Hansen TB, Wiklund ED, Bramsen JB, Villadsen SB, Statham AL, Clark SJ, Kjems J: miRNA-dependent gene silencing involving Ago2-mediated cleavage of a circular antisense RNA. EMBO J. 2011, 30: 4414-4422. 10.1038/emboj.2011.359.
- Calin GA, Liu CG, Ferracin M, Hyslop T, Spizzo R, Sevignani C, Fabbri M, Cimmino A, Lee EJ, Wojcik SE, Shimizu M, Tili E, Rossi S, Taccioli C, Pichiorri F, Liu X, Zupo S, Herlea V, Gramantieri L, Lanza G, Alder H, Rassenti L, Volinia S, Schmittgen TD, Kipps TJ, Negrini M, Croce CM: Ultraconserved regions encoding ncRNAs are altered in human leukemias and carcinomas. Cancer Cell. 2007, 12: 215-229. 10.1016/j.ccr.2007.07.027.
- Franco-Zorrilla JM, Valli A, Todesco M, Mateos I, Puga MI, Rubio-Somoza I, Leyva A, Weigel D, Garcia JA, Paz-Ares J: Target mimicry provides a new mechanism for regulation of microRNA activity. Nat Genet. 2007, 39: 1033-1037. 10.1038/ng2079.
- Cazalla D, Yario T, Steitz JA: Down-regulation of a host microRNA by a Herpesvirus saimiri noncoding RNA. Science. 2010, 328: 1563-1566. 10.1126/science.1187197.
- Wang J, Liu X, Wu H, Ni P, Gu Z, Qiao Y, Chen N, Sun F, Fan Q: CREB up-regulates long non-coding RNA, HULC expression through interaction with microRNA-372 in liver cancer. Nucleic Acids Res. 2010, 38: 5366-5383. 10.1093/nar/gkq285.
- Wang Y, Xu Z, Jiang J, Xu C, Kang J, Xiao L, Wu M, Xiong J, Guo X, Liu H: Endogenous miRNA sponge lincRNA-RoR regulates Oct4, Nanog, and Sox2 in human embryonic stem cell self-renewal. Dev Cell. 2013, 25: 69-80. 10.1016/j.devcel.2013.03.002.
- Memczak S, Jens M, Elefsinioti A, Torti F, Krueger J, Rybak A, Maier L, Mackowiak SD, Gregersen LH, Munschauer M, Loewer A, Ziebold U, Landthaler M, Kocks C, Le-Noble F, Rajewsky N: Circular RNAs are a large class of animal RNAs with regulatory potency. Nature. 2013, 495: 333-338. 10.1038/nature11928.
- Hansen TB, Jensen TI, Clausen BH, Bramsen JB, Finsen B, Damgaard CK, Kjems J: Natural RNA circles function as efficient microRNA sponges. Nature. 2013, 495: 384-388. 10.1038/nature11993.
- Jalali S, Bhartiya D, Lalwani MK, Sivasubbu S, Scaria V: Systematic transcriptome wide analysis of lncRNA-miRNA interactions. PLoS One. 2013, 8: e53823. 10.1371/journal.pone.0053823.
- Paraskevopoulou MD, Georgakilas G, Kostoulas N, Reczko M, Maragkakis M, Dalamagas TM, Hatzigeorgiou AG: DIANA-LncBase: experimentally verified and computationally predicted microRNA targets on long non-coding RNAs. Nucleic Acids Res. 2013, 41: D239-D245. 10.1093/nar/gks1246.
- Friedman RC, Farh KK, Burge CB, Bartel DP: Most mammalian mRNAs are conserved targets of microRNAs. Genome Res. 2009, 19: 92-105.
- Lewis BP, Burge CB, Bartel DP: Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell. 2005, 120: 15-20. 10.1016/j.cell.2004.12.035.
- Guttman M, Garber M, Levin JZ, Donaghey J, Robinson J, Adiconis X, Fan L, Koziol MJ, Gnirke A, Nusbaum C, Rinn JL, Lander ES, Regev A: Ab initio reconstruction of cell type-specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincRNAs. Nat Biotechnol. 2010, 28: 503-510. 10.1038/nbt.1633.
- Jeggari A, Marks DS, Larsson E: miRcode: a map of putative microRNA target sites in the long non-coding transcriptome. Bioinformatics. 2012, 28: 2062-2063. 10.1093/bioinformatics/bts344.
- Blanchette M, Kent WJ, Riemer C, Elnitski L, Smit AF, Roskin KM, Baertsch R, Rosenbloom K, Clawson H, Green ED, Haussler D, Miller W: Aligning multiple genomic sequences with the threaded blockset aligner. Genome Res. 2004, 14: 708-715. 10.1101/gr.1933104.
- Farh KK, Grimson A, Jan C, Lewis BP, Johnston WK, Lim LP, Burge CB, Bartel DP: The widespread impact of mammalian MicroRNAs on mRNA repression and evolution. Science. 2005, 310: 1817-1821. 10.1126/science.1121158.
- Schorderet P, Duboule D: Structural and functional differences in the long non-coding RNA hotair in mouse and human. PLoS Genet. 2011, 7: e1002071. 10.1371/journal.pgen.1002071.
- Nesterova TB, Slobodyanyuk SY, Elisaphenko EA, Shevchenko AI, Johnston C, Pavlova ME, Rogozin IB, Kolesnikov NN, Brockdorff N, Zakian SM: Characterization of the genomic Xist locus in rodents reveals conservation of overall gene structure and tandem repeats but rapid evolution of unique sequence. Genome Res. 2001, 11: 833-849. 10.1101/gr.174901.
- Myers RM, Stamatoyannopoulos J, Snyder M, Dunham I, Hardison RC, Bernstein BE, Gingeras TR, Kent WJ, Birney E, Wold B, Crawford GE: A user's guide to the encyclopedia of DNA elements (ENCODE). PLoS Biol. 2011, 9: e1001046. 10.1371/journal.pbio.1001046.
- Trapnell C, Pachter L, Salzberg SL: TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009, 25: 1105-1111. 10.1093/bioinformatics/btp120.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.