Strategies for genetic inactivation of long noncoding RNAs in zebrafish

The number of annotated long noncoding RNAs (lncRNAs) continues to grow; however, their functional characterization in model organisms has been hampered by the lack of reliable genetic inactivation strategies. While partial or full deletions of lncRNA loci disrupt lncRNA expression, they do not permit the formal association of a phenotype with the encoded transcript. Here, we examined several alternative strategies for generating lncRNA null alleles in zebrafish and found that they often resulted in unpredicted changes to lncRNA expression. Removal of the transcription start sites (TSSs) of lncRNA genes resulted in hypomorphic mutants, due to the usage of either constitutive or tissue-specific alternative TSSs. Deletions of short, highly conserved lncRNA regions can also lead to overexpression of truncated transcripts. In contrast, knock-in of a polyadenylation signal enabled complete inactivation of malat1, the most abundant vertebrate lncRNA. In summary, lncRNA null alleles require extensive in vivo validation, and we propose insertion of transcription termination sequences as the most reliable approach to generate lncRNA-deficient zebrafish.

Genetic inactivation of lncRNAs is less straightforward than for coding genes, where deletion of an exon or a point mutation in the open reading frame (ORF) often leads to stop codons or frame-shift mutations and subsequent loss of function. Several complementary strategies have been implemented to achieve genetic loss of lncRNA function, including full or partial deletion of the lncRNA locus, deletion and subsequent replacement of the lncRNA locus by a reporter gene (Nakagawa et al. 2012; Sauvageau et al. 2013), deletion of the lncRNA transcription start site (TSS) and upstream regulatory regions (Fitzpatrick et al. 2002; Zhang et al. 2012) and sequence inversions (Fig. 1; Bitetti et al. 2018). Although commonly used, these lncRNA inactivation strategies have several caveats and limitations. Full deletions of lncRNA loci, which often span several kilobases, or lncRNA replacement by a reporter gene are invasive and might lead to phenotypes that are caused by removal of regulatory DNA motifs. Deletions of lncRNA TSS and upstream promoter regions may result in usage of alternative TSSs or cryptic promoters and/or impact the expression of neighboring genes. A less invasive and more accurate approach is to inactivate lncRNAs by integrating a premature polyadenylation [poly(A)] cassette. This strategy has been successfully implemented in several recent mouse lncRNA studies (Fig. 1; Bond et al. 2009; Grote et al. 2013; Anderson et al. 2016; Ballarino et al. 2018). Whereas lncRNA locus deletion and partial lncRNA gene inversion strategies have been applied in zebrafish to genetically inactivate lncRNAs (Kok et al. 2015; Hosono et al. 2017; Bitetti et al. 2018; Goudarzi et al. 2019), analyses of complementary lncRNA silencing approaches including the minimally invasive insertion of the poly(A) sequences have not yet been carried out.
Here, we examined the efficiency of several strategies for CRISPR-Cas9-mediated inactivation of lncRNAs in zebrafish. Careful evaluation of lncRNA zebrafish mutants demonstrated that caution is required when analyzing each individual mutant allele. When genetically manipulating lncRNA loci, we found that usage of constitutive or tissue-specific alternative TSSs and overexpression or destabilization of truncated lncRNA transcripts commonly take place in vivo, minimizing or confounding the effect of the intended genetic intervention. In contrast, our minimally invasive knock-in of a premature polyadenylation signal into the malat1 locus reduced malat1 transcripts to undetectable levels, effectively establishing a malat1 null allele in zebrafish.

RESULTS
Deletion of the conserved region of the lncRNA cyrano leads to overexpression of the truncated transcript
A small fraction of zebrafish lncRNAs are conserved in mammals, representing a promising set of candidates for functional interrogation (Ulitsky et al. 2011; Hezroni et al. 2015). The conserved regions of lncRNAs are usually relatively short, ranging from 50 to 300 nucleotides (nt) (Ulitsky et al. 2011; Hezroni et al. 2015) and can be efficiently targeted for CRISPR-Cas9-mediated deletions in zebrafish, offering a minimally invasive strategy for functional inactivation (Fig. 1). To examine the effect of this strategy on lncRNA expression, we chose the deeply conserved lncRNA cyrano (Ulitsky et al. 2011) for genetic interrogations in zebrafish. We generated a ∼280 base pair (bp) deletion of the most conserved region of the 5.5 kb sequence, hereafter referred to as cyrano ΔCR (Fig. 2A,B; Ulitsky et al. 2011). Interestingly, we detected elevated levels of the residual truncated transcript in homozygous cyrano ΔCR zebrafish embryos and across cyrano ΔCR adult tissues apart from the brain (Fig. 2C,D; Supplemental Fig. 1A). These results suggest that removal of a relatively small region of a lncRNA may have an unexpected effect on the transcript levels, potentially leading to its unintended overexpression.

TSS deletion of the cyrano locus results in hypomorphic zebrafish mutants
Next, we tested if deleting the sequences surrounding and containing lncRNA TSS elements is a reliable alternative strategy for zebrafish lncRNA genetic inactivation. To this end, we generated a minimally invasive cyrano ΔTSS mutant allele by removing sequences containing the cyrano TSS (0 to +84) (Fig. 2E). Although cyrano transcript levels were reduced in cyrano ΔTSS fish, the transcript was still robustly detectable by RNA blot analysis and qRT-PCR, resulting in a hypomorphic cyrano ΔTSS mutant (Fig. 2F,G). The 5′ RACE (rapid amplification of cDNA ends) analysis demonstrated that in the absence of the two main TSSs usually used in WT animals, an alternative upstream TSS maintains cyrano expression in cyrano ΔTSS mutant zebrafish (Supplemental Fig. 1B-D).
Notably, neither the cyrano ΔCR mutant, with removal of the highly conserved miR-7 site (Ulitsky et al. 2011), nor the cyrano ΔTSS mutant fish exhibited obvious morphological defects. This observation is consistent with recent zebrafish and mouse studies (Kleaveland et al. 2018; Goudarzi et al. 2019) and is in contrast to previous studies that used a morpholino-based knockdown approach to inactivate cyrano (Ulitsky et al. 2011; Sarangdhar et al. 2018).

lncRNA TSS removal leads to tissue-specific alternative TSS usage, maintaining lncRNA expression
To test if the usage of alternative TSSs is a prevalent cellular mechanism to maintain lncRNA gene expression, we examined the effect of TSS deletions on additional lncRNAs in zebrafish. We generated a lnc-sox4a ΔTSS mutant allele by removing ∼200 bp surrounding the lnc-sox4a TSS (−43 to +157) (Fig. 3A,B). lnc-sox4a (chr19:29,161,676-29,270,573; Zv9/danRer7) (Ulitsky et al. 2011) is highly expressed in the zebrafish ovary, and its expression was abolished in lnc-sox4a ΔTSS embryos and across lnc-sox4a ΔTSS adult tissues (Fig. 3C,D). However, lnc-sox4a was robustly expressed in the adult lnc-sox4a ΔTSS brain at levels comparable to WT (Fig. 3D). The 5′ RACE analysis confirmed that a tissue-specific alternative TSS, located in an intron 70 kb , was used only in the lnc-sox4a ΔTSS animals and maintained lncRNA expression specifically in the adult brain (Fig. 3D). While homozygous lnc-sox4a ΔTSS fish were viable and fertile, our alternative strategy to eliminate lnc-sox4a expression by deleting the last exon failed to generate homozygous fish (Supplemental Fig. 2C,D).
We generated an additional lncRNA mutant by removing ∼390 bp surrounding the lnc-pou2af1 TSS (−74 to +315) (Fig. 4A,B). Similar to the lnc-sox4a ΔTSS allele, expression of lnc-pou2af1 (chr15:16770170-16773603; Zv9/danRer7) was abolished in lnc-pou2af1 ΔTSS embryos and in a subset of tested lnc-pou2af1 ΔTSS adult tissues (Fig. 4C; Supplemental Fig. 3A). However, in skin, kidney, intestine and testis, expression of lnc-pou2af1 was robustly detected in lnc-pou2af1 ΔTSS fish (Fig. 4D,E). The 5′ RACE analysis showed that several alternative TSSs, located ∼1 kb upstream of the main TSS, were used in the lnc-pou2af1 ΔTSS animals in a tissue-specific manner. Together, our data showed that in the absence of the main TSS, alternative TSSs can be used in a tissue-specific manner, generating hypomorphic mutants and minimizing the effect of the intended gene inactivation.
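The deletion sizes quoted for the three ΔTSS alleles follow from their TSS-relative coordinates. As a minimal sketch, the arithmetic can be checked as below. The helper name, the dictionary keys, and the convention that a deletion from a to b spans b − a bp are assumptions made for illustration; the text does not state whether the endpoints are inclusive.

```python
# Illustrative only: the helper name and the end-minus-start convention are
# assumptions, not taken from the study.

def deletion_span(start: int, end: int) -> int:
    """Length in bp of a deletion given TSS-relative start and end coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start


# TSS-relative deletion coordinates as reported in the text.
deletions = {
    "cyrano_dTSS": (0, 84),
    "lnc_sox4a_dTSS": (-43, 157),
    "lnc_pou2af1_dTSS": (-74, 315),
}

spans = {allele: deletion_span(*coords) for allele, coords in deletions.items()}

for allele, span in spans.items():
    print(f"{allele}: {span} bp")
```

Under this convention the spans come to 84, 200 and 389 bp, in line with the ∼200 bp and ∼390 bp deletions described for lnc-sox4a and lnc-pou2af1.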

Insertion of a polyadenylation signal resulted in a malat1 null allele in zebrafish
Given the evidence that usage of alternative TSSs may be a common cellular mechanism to confer lncRNA expression, we tested if knock-in of a poly(A) signal into a lncRNA locus can be applied in zebrafish as a minimally invasive alternative to generate lncRNA null alleles. This approach has been successfully used to inactivate lncRNAs in mice (Grote et al. 2013; Anderson et al. 2016; Isoda et al. 2017; Ballarino et al. 2018). The malat1 locus produces one of the most abundant lncRNAs in vertebrate genomes (Ulitsky et al. 2011; Hezroni et al. 2015). Because malat1 is a mono-exonic lncRNA of ∼7.5 kb and its locus contains multiple TSSs and clustered enhancers forming a so-called super-enhancer (Pérez-Rico et al. 2017), any deletion strategy of the locus, including TSS removal, has a strong potential to affect cis-regulatory elements (Fig. 5A). Therefore, we applied our improved protocol for the efficient targeted knock-in to insert a 131 bp SV40 poly(A) signal into the malat1 locus in zebrafish ( (Eissmann et al. 2012; Nakagawa et al. 2012; Zhang et al. 2012) and is in contrast to morpholino-based malat1 inactivation in zebrafish (Wu et al. 2018). Taken together, compared to lncRNA deletion strategies, poly(A) signal insertion was the most efficient and least invasive approach in zebrafish.

DISCUSSION
The identification of lncRNAs in model vertebrates, their comparative genomics analyses and recent progress in genome editing technologies have led to the generation of multiple mutant lncRNA alleles. Because common strategies for genetic inactivation of lncRNAs often do not allow distinguishing between functions mediated by the lncRNA transcript and those mediated by overlapping DNA regulatory motifs, the generation and interpretation of lncRNA null alleles can be challenging. Here, we compared zebrafish lncRNA mutant alleles generated using several alternative and commonly applied CRISPR-Cas9 strategies for lncRNA inactivation. We demonstrated that relatively small deletions of conserved regions of lncRNAs, which represent attractive target sequences to eliminate or diminish lncRNA functions (Bitetti et al. 2018; Kleaveland et al. 2018), might result in unexpected changes in lncRNA levels, such as overexpression of the remaining transcript, as demonstrated for cyrano. One possibility is that deletion of the conserved region of cyrano, which removed a highly conserved and extensively paired site to miR-7 (Ulitsky et al. 2011), stabilized the cyrano transcript in zebrafish. Alternatively, deletion of this region of cyrano in zebrafish might have caused transcriptional up-regulation. For example, if deletion of this region abrogated cyrano function, cells might have boosted transcription of the locus in an attempt to restore cyrano activity. Deletion of the conserved region of mouse cyrano does not lead to increased lncRNA levels (Kleaveland et al. 2018), which suggests that cyrano regulation has diverged between fish and mammals. A better understanding of cyrano regulation and function will help identify the source of this ectopic effect on the remaining lncRNA transcript observed in fish and how this effect might complicate interpretation of the deletion results.
Moreover, we showed that the removal of TSS and upstream regulatory regions, a commonly used approach considered to be straightforward to interpret, can result in the presence of either constitutive or tissue-specific alternative TSSs that preclude efficient inactivation of lncRNAs and result in hypomorphic mutant animals. Although not shown in this study, usage of temporal-specific alternative TSSs might also contribute to the maintenance of lncRNA expression at specific developmental stages, complicating the analysis and interpretation of TSS mutant alleles in animal models. Interestingly, a recent study reported that a 326 bp deletion removing cyrano's TSS leads to loss of the lncRNA expression (Goudarzi et al. 2019). The difference observed between the cyrano ΔTSS alleles may be a consequence of the larger deletion used by Goudarzi et al., potentially leading to a more effective down-regulation of cyrano. In addition, the choice of the lncRNA detection method as well as the developmental timing of detection are important. Our data show that in TSS deletion alleles, lncRNA expression is often abolished at early embryonic stages and robustly reestablished later during development by tissue-specific alternative TSSs. These collective observations underscore the necessity to carefully validate TSS deletion alleles. Importantly, our improved protocol for efficient targeted knock-in in zebrafish enabled examination of the effect of a poly(A) signal insertion into the most abundant and enhancer-dense lncRNA locus. We demonstrate that this minimally invasive genome editing strategy, previously shown to be successful for lncRNA inactivation in mice (Grote et al. 2013; Anderson et al. 2016; Isoda et al. 2017; Ballarino et al. 2018), is a highly effective strategy in zebrafish.
Given the ease of our knock-in approach, which combines the use of a single-strand oligo as a template for homologous recombination and inhibition of nonhomologous end joining, we anticipate that the insertion of a poly(A) sequence will become a widespread strategy for generating lncRNA mutant alleles in zebrafish. Furthermore, the knock-in strategy can be used for genetic tagging of lncRNAs with self-cleaving ribozymes, which has been demonstrated to perturb lncRNA expression in mouse embryonic stem cells (Tuck et al. 2018) but has not yet been tested in model organisms.
Taken together, evaluation of several independent lncRNA mutant alleles in zebrafish indicates that a combination of complementary lncRNA inactivation approaches and their careful analyses are required for robust and accurate lncRNA functional interrogation.

MATERIALS AND METHODS
All zebrafish were bred and maintained at Institut Curie, Paris. Animal care and use for this study were performed in accordance with the recommendations of the European Community (2010/63/UE) for the care and use of laboratory animals. Experimental procedures were specifically approved by the ethics committee of Institut Curie CEEA-IC #118 (project CEEA-IC 2017-017) in compliance with the international guidelines. Zebrafish were staged using standard procedures (Kimmel et al. 1995).

Generation of the malat1 poly(A) allele by CRISPR/Cas9-mediated homologous recombination in zebrafish
The CRISPR/Cas9-mediated knock-in protocol was optimized as described in Supplemental Figure 4A. Zebrafish malat1 poly(A) mutant was generated by insertion of a single SV40 poly(A) signal (131 bp) into the malat1 locus. Briefly, one-cell stage embryos were injected with a single guide RNA (100 ng, Supplemental

RNA blots
Total RNA was isolated using TRIzol (Invitrogen), separated on 1% agarose gels containing 0.8% formaldehyde, and transferred to nylon membrane (Nytran SPC, GE Healthcare) by capillary action. Blots were hybridized with [α-32P]UTP-labeled RNA probes at 68°C in ULTRAhyb buffer (Ambion) as recommended by the manufacturer. RNA probe template was amplified from zebrafish brain cDNA by PCR using the primers listed in Supplemental Table 3 (the sequence of the T7 promoter is underlined) and in vitro transcribed (RNA Maxiscript, Ambion) in the presence of [α-32P]UTP. For each replicate, RNA isolated from 30-100 embryos or tissues from three to six adult fish was used. The gel blots and hybridizations in Figure 5C were performed in biological triplicates. The hybridizations in Figures 2F and 5D were performed once.

RNA ligase-mediated and oligo-capping rapid amplification of cDNA ends (5′ RACE)
TSS usage was determined by rapid amplification of cDNA ends (RACE) according to the manufacturer's instructions (GeneRacer kit, Life Technologies). Gene-specific primers listed in Supplemental Table 3 were used to amplify lncRNA 5′ RACE products through PCR and nested PCR; the products were subcloned into the PCR BLUNT II TOPO vector (Invitrogen) and transformed into NEB TOP-10 cells. A minimum of 12 colonies were sequenced, and the sequences were aligned to the corresponding lncRNA genomic locus.
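The clone-level readout of 5′ RACE lends itself to a simple tally. The following is a minimal sketch, not the analysis used in the study: `tally_tss` is a hypothetical helper, and the clone positions are values invented purely for illustration. It counts how many sequenced clones start at each position relative to the annotated TSS and enforces the minimum of 12 colonies stated above.

```python
from collections import Counter

MIN_COLONIES = 12  # minimum number of sequenced colonies stated in the text


def tally_tss(clone_starts: list[int]) -> list[tuple[int, int]]:
    """Return (position, clone count) pairs, most frequently used start first.

    Positions are the 5' ends of aligned clones relative to the annotated TSS.
    """
    if len(clone_starts) < MIN_COLONIES:
        raise ValueError(f"need at least {MIN_COLONIES} sequenced colonies")
    counts = Counter(clone_starts)
    # Sort by descending clone count, then by position for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical 5' ends from 12 clones of one sample (not real data).
example_starts = [-1020] * 7 + [-985] * 3 + [-1100] * 2

usage = tally_tss(example_starts)
for position, n_clones in usage:
    print(f"start at {position:+d}: {n_clones} clones")
```

Comparing such tallies between WT and mutant samples, tissue by tissue, is one way to make tissue-specific alternative TSS usage explicit.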

SUPPLEMENTAL MATERIAL
Supplemental material is available for this article.

ACKNOWLEDGMENTS
We thank all members of the Shkumatava laboratory and Ines Drinnenberg for useful discussions. This work was supported by grants from the European Research Council (FLAME-337440), ATIP-Avenir, La Fondation Bettencourt Schueller, ANR-11-LABX-0044_DEEP, and ANR-10-IDEX-0001-02, as well as PSL and La Ligue Nationale Contre Le Cancer doctoral fellowships to P.L.

Author contributions: P.L. developed the protocol for the targeted knock-in in zebrafish and contributed to the design, generation, and analysis of the lnc-sox4a ΔTSS, lnc-sox4a Δ3'exon, and malat1 poly(A) alleles. H.E. contributed to the design, generation, and analysis of the lnc-pou2af1 ΔTSS and the maintenance and analyses of lncRNA alleles. L.D. and F.C. contributed to lncRNA expression analyses and the maintenance of lncRNA alleles. S.M. contributed to the design, generation, and analyses of the cyrano alleles. A.B. contributed to the design, generation, and analyses of the malat1 poly(A) allele. A.G. contributed to the design and generation of the cyrano alleles. P.L. and A.S. wrote the final version of the manuscript. A.S. conceived and supervised the study.