REVIEW
Long noncoding RNAs are a promising therapeutic target in various diseases
1 Laboratory of Functional Genomics, Research Centre of Medical Genetics, Moscow, Russia
2 Laboratory of Medical Genetic Technologies, Department of Basic Research of MDRI, Yevdokimov Moscow State University of Medicine and Dentistry, Moscow, Russia
3 Genomic Functional Analysis Laboratory, Moscow Institute of Physics and Technology (State University), Dolgoprudny, Russia
Correspondence should be addressed: Alexandra Filatova
ul. Moskvorechie, d. 1, Moscow, Russia, 115478; maacc@yandex.ru
Contribution of the authors to this work: Filatova AYu — planning, literature analysis, drafting of a manuscript; Sparber PA — literature analysis, drafting of a manuscript; Krivosheeva IA — literature analysis, drafting of a manuscript; Skoblov MYu — drafting of a manuscript. All authors participated in editing of the manuscript.
Inspired by the emergence of next-generation sequencing, numerous studies carried out over the past decade have led to a striking discovery: although about 70 % of the human genome is transcribed, only 1.5 % of it encodes proteins. The rest of the transcriptome is represented by noncoding RNAs (ncRNAs), including such well-studied classes as ribosomal (rRNA), transfer (tRNA), micro (miRNA), small nuclear (snRNA), small nucleolar (snoRNA) and other types of RNA. In recent years, another class of ncRNAs, long noncoding RNAs (lncRNAs), has been the focus of active research. lncRNAs are large transcripts (>200 nt) lacking a long open reading frame.
According to the most recent data provided by GENCODE (Encyclopedia of genes and gene variants), the human genome contains 15,787 lncRNA genes [1]. Although fewer than 200 of them have been assigned a function so far [2], it is already clear that lncRNAs constitute a heterogeneous group of functionally diverse transcripts. They can regulate gene expression at the transcriptional level by forming complexes with transcription factors [3, 4] or by recruiting chromatin-modifying complexes, such as the repressive PRC1 [5], PRC2 [4, 6, 7] and LSD1 [8] or the activating TrxG [9]. LncRNAs can also modulate posttranscriptional events. Some act as competing endogenous RNAs (ceRNAs): they sequester microRNAs and thereby regulate the expression level of transcripts that share the same miRNA binding sites [10, 11]. LncRNAs can also form duplexes with target mRNAs and inhibit their translation [12] or disrupt their stability [13, 14]. In addition, some long noncoding RNAs can modulate pre-mRNA splicing [15, 16, 17].
Although few lncRNAs have been functionally characterized so far, it is now clear that they have a role in many diseases. For example, the lncRNADisease database contains entries on 321 lncRNAs associated with 221 diseases from ~500 publications [18], including various cancer types, neurodegenerative disorders, cardiovascular diseases, conditions associated with genomic imprinting, and other pathologies. In a number of cases, lncRNAs have been recognized as key components of the molecular pathways leading to disease; they therefore have the potential to be used as biomarkers and therapeutic targets. Below, we describe several interesting lncRNAs that can be used as therapeutic targets in various diseases and highlight some of the existing gene therapy approaches that can be employed to modulate lncRNA activity.
Gene therapy approaches targeting long noncoding RNA
Disease development and progression can be triggered by both activation of lncRNA expression [3, 19, 20, 21, 22, 23] and its downregulation [24, 25, 26, 27, 28], prompting researchers to seek therapeutic ways to activate or suppress lncRNA expression or inhibit its activity. Expression vectors (plasmids or viral particles) and gene-specific transcriptional activators are used to activate gene expression. Gene-silencing tools include RNA interference, antisense oligonucleotides (ASOs), gene-specific transcriptional repression and genome editing. LncRNA activity can be inhibited by ASOs or small molecules.
The techniques mentioned above were designed to treat diseases caused by alterations in protein-coding genes, but they can also be successfully used to manipulate lncRNAs. The repertoire of tools currently used for manipulating protein-coding genes is much more diverse only because we know quite a lot about the life cycle and functions of proteins, while our knowledge of lncRNAs is limited. We believe that new facts about lncRNA functions in health and disease will stimulate the development of novel, effective and highly specific methods of gene therapy. For example, studies of lncRNA secondary structures may yield results that add to the existing pool of knowledge.
Below we briefly describe some basic strategies of regulating lncRNA expression and activity (see the figure).
Expression vectors
Expression vectors are the most popular tools used in research and clinical practice for gene upregulation. This approach is actively used to compensate for loss-of-function mutations. There are a large number of different vector systems and delivery methods (viral and non-viral), each with its own advantages and disadvantages [29].
Of particular interest are tissue- and tumor-specific promoters that ensure high specificity of target gene expression. For example, the promoter of lncRNA H19 is active in different cancer cells. Therefore, when incorporated into a construct expressing tumor suppressor genes (both protein-coding and noncoding), this promoter will drive their expression only in tumor cells [30].
RNA interference
RNA interference is a mechanism of gene silencing by small RNA molecules, small interfering RNAs (siRNAs) and microRNAs, which are short RNA duplexes 21 to 25 bp in length. Once the duplex is unwound, one of the siRNA/miRNA strands, called the guide strand, is incorporated into the RNA-induced silencing complex (RISC) and guides it to the RNA target, eventually causing its degradation or inhibiting its translation.
RNA interference can be used to silence both protein-coding and long noncoding RNA genes. New methods of gene therapy based on siRNA, microRNA and small hairpin RNA (shRNA) are currently being developed. The latter is an siRNA precursor; shRNA is delivered into the cell encoded in an expression vector. Although RNA interference is a very effective gene silencing tool, issues related to its specificity, immunogenicity and delivery hamper its use as a therapeutic technique. Researchers are attempting to circumvent these issues, one of the solutions being chemical modification of small RNAs.
Antisense oligonucleotides
Antisense oligonucleotides (ASOs) are small synthetic RNA/DNA-based molecules that bind to complementary RNA targets to downregulate their expression or affect their function. Most commonly, ASOs recruit RNase H, which cleaves the RNA strand of an RNA/DNA duplex [31]. Some ASOs are designed to modulate splicing: they interact with pre-mRNA and block its binding to splicing factors (this is called splice-switching) [32]. Splicing-modulating ASOs are produced with chemical modifications (peptide nucleic acid (PNA) or phosphorodiamidate morpholino oligomers (PMO)). The resulting oligonucleotides do not activate RNase H-dependent cleavage [33]. Some ASOs have been shown to block the binding of lncRNA to the chromatin-modifying complex PRC2 [21].
Among other ASO modifications are LNA (locked nucleic acid) and 2’-O-methyl (2’-OMe) modifications. They increase ASO stability, specificity and affinity for the RNA target. Besides, the LNA modification has been shown to have no negative effect on the ability of an RNA molecule to bind to RISC [33]. To sum up, RNA interference and antisense oligonucleotides are still the most popular gene-silencing tools used in clinical practice. As of 2016, 26 siRNA- and ASO-based drugs were in clinical trials, presumably effective against >50 diseases [33].
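The complementarity principle behind ASO binding can be illustrated with a minimal sketch. It only derives the reverse-complement DNA sequence for a window of an RNA target and tiles such windows along a transcript. The function names, the 20-nt window and the 5-nt step are illustrative assumptions and are not taken from the cited studies; real ASO design also has to consider chemistry, target accessibility and off-target matches.

```python
# Hypothetical helpers for illustration only; not a published ASO design tool.
DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_dna(target_rna: str) -> str:
    """Return the DNA oligo (5'->3') complementary to an RNA target given 5'->3'."""
    target = target_rna.upper()
    unknown = set(target) - set(DNA_COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected bases in RNA target: {sorted(unknown)}")
    # Reverse the target, then complement each base (U pairs with A, etc.).
    return "".join(DNA_COMPLEMENT[base] for base in reversed(target))


def tile_asos(transcript: str, length: int = 20, step: int = 5):
    """Yield (start, oligo) for windows tiled along the transcript.

    The default length and step are assumptions chosen for the example.
    """
    for start in range(0, len(transcript) - length + 1, step):
        yield start, antisense_dna(transcript[start:start + length])


if __name__ == "__main__":
    toy_transcript = "AUGGCUAGCUAGGAUCCGAUCGAUCGGAUU"  # made-up 30-nt sequence
    for start, oligo in tile_asos(toy_transcript):
        print(start, oligo)
```

In practice each candidate would then be filtered, for example by checking that it has no close matches elsewhere in the transcriptome.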
Genome editing
Genome editing is instrumental in repairing deleterious mutations and in knocking out genes. Below we provide examples of disorders caused by lncRNA overexpression. In such cases, genome editing can be used to knock out long noncoding RNA genes.
Genome editing techniques gained popularity after sufficient data had been accumulated about zinc finger proteins (ZNF) [34] and transcription activator-like effector nucleases (TALENs) [35], synthetic constructs capable of binding to DNA. Their mechanism of action relies on the ability of certain protein sequences (monomers) to bind to certain nucleotides of the double-stranded DNA. Assembled into a longer construct, these monomers form a protein that can bind to a desired DNA sequence. Each ZNF monomer recognizes a 3 nt long sequence, while TALEN monomers recognize single nucleotides, which makes TALEN systems more flexible.
Unlike TALEN and ZNF, their younger rival CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) employs a single guide RNA (sgRNA) that binds to the target DNA and recruits the Cas9 nuclease to cleave both strands of the DNA molecule. CRISPR/Cas9 is more effective and easier to use than ZNF or TALEN systems [36]. It is now actively used for genome editing and some other tasks [37]. In spite of the ongoing debate about whether CRISPR/Cas9 is ready to be introduced into clinical practice given its insufficient specificity [38, 39, 40], many research teams are attempting to optimize this technology and design new CRISPR/Cas9-based tools that could be used in the clinical setting.
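As a rough illustration of how sgRNA target sites are chosen, the sketch below scans one strand of a DNA sequence for 20-nt protospacers followed by an NGG motif. The NGG protospacer-adjacent motif (PAM) of the commonly used Streptococcus pyogenes Cas9 and the 20-nt spacer length are background assumptions not discussed in this review, and the function name is hypothetical. A real design would also scan the reverse strand and score off-target sites.

```python
import re


def find_protospacers(dna: str, spacer_len: int = 20):
    """Return (start, protospacer, pam) for each NGG PAM on the given strand.

    Assumes an SpCas9-style NGG PAM immediately 3' of the protospacer.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


if __name__ == "__main__":
    toy_locus = "TTGACCTGAAGCTTACGGATCCATTGGCATGCAAGG"  # made-up sequence
    for start, spacer, pam in find_protospacers(toy_locus):
        print(start, spacer, pam)
```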
Transcription regulation
The potential of CRISPR/Cas9 is not limited to genome editing. This technology can also be used to regulate gene expression without altering the DNA sequence. This has been achieved by engineering a mutant, nuclease-deactivated Cas9 protein (dCas9). To regulate gene expression, dCas9 is fused to protein domains that either activate (CRISPRa: VP64, p65, Rta) or repress (CRISPRi: KRAB, ZNF10) transcription [41, 42, 43, 44]. Such systems have proved effective in experiments on cell lines. For example, Gilbert et al. used a dCas9-KRAB system to repress transcription of 5 lncRNAs implicated in carcinogenesis (H19, MALAT1, NEAT1, TERC, and XIST). The researchers reported a >80 % drop in the expression of target RNAs, which means that the system can be no less effective than RNA interference or ASOs [45]. In addition, Perez-Pinera et al. have demonstrated that dCas9-VP64 can enhance transcription of various protein-coding genes 2- to 250-fold [44].
CRISPRa/i-based methods have a number of advantages over RNA interference and expression vectors. First, only CRISPRa/i can be used to modulate RNA function in cis. It is known that noncoding RNAs often exert their function in the transcriptionally active locus, and sometimes lncRNA transcription itself is crucial for the regulation of neighboring genes [21, 28, 46]. Second, activation of the endogenous promoter will trigger normal expression of all lncRNA splice variants [47]. However, because lncRNAs often overlap with one or several protein-coding genes or share a promoter region with them, the use of CRISPR-based systems for regulating lncRNA activity is somewhat limited [47].
Regulation of lncRNA activity by small molecules
Another approach to treating lncRNA-associated diseases is based on the use of small molecules that disrupt interactions between lncRNA and its partner proteins [48, 49]. Molecules capable of blocking the formation of lncRNA-protein complexes can be identified by high-throughput screening [50]. Compared to other therapeutic agents used in gene therapy, small molecules are easy to deliver to their targets and are readily taken up by cells. Interactions between lncRNAs and proteins, such as HOTAIR–PRC2, ANRIL–CBX7, PCAT-1–PRC2 and H19–EZH2, have become attractive targets in screens for small-molecule inhibitors [51]. For example, Zhou et al. have elucidated the role of HOTAIR in glioblastoma using the small molecules DZNep and 2-PCPA, which inhibit HOTAIR interaction with the PRC2 and LSD1 proteins [52].
lncRNAs as potential therapeutic targets in various diseases
The role of lncRNA SAMMSON in melanoma
Melanoma is a malignant tumor originating from melanocytes of the skin. Melanomas are highly metastatic [53]. According to the American Cancer Society, melanomas account for 4–6 % of new cancer cases [54].
In 2016 Leucci et al. [3] carried out an extensive research study on the role of the long noncoding RNA SAMMSON in melanoma and attempted to evaluate its feasibility as a therapeutic target. Drawing on the Cancer Genome Atlas (TCGA) data, the authors demonstrated that this lncRNA is expressed ectopically in more than 90 % of melanoma samples and that its expression is melanoma-specific. Further research revealed that SAMMSON expression is triggered by the well-known melanoma-specific transcription factor SOX10. Knockdown of SAMMSON by LNA-modified oligonucleotides in melanoma cell lines considerably slowed cell growth and stimulated apoptosis. It was also shown that SAMMSON directly interacts with protein p32, and that the mitochondrial fraction of p32 is downregulated after SAMMSON knockdown. This, in turn, disrupts synthesis of proteins involved in the mitochondrial respiratory chain, reduces mitochondrial membrane potential and promotes apoptosis. Besides, low levels of p32 in mitochondria entail accumulation of ‘toxic’ aberrant mitochondrial precursor proteins in the cytosol, which also leads to cell death.
In the light of the above, SAMMSON seemed to be a good therapeutic target in melanoma. To test this, Leucci et al. conducted an in vivo study using a xenograft mouse model; the animals received a modified antisense oligonucleotide against SAMMSON. The antisense oligonucleotide slowed tumor growth 1.5-fold in comparison with the controls. Besides, a combination treatment with the antisense oligonucleotide and dabrafenib (a selective inhibitor of the BRAF V600E mutant) considerably enhanced the therapeutic effect of the latter (the observed effect was two times stronger). The researchers concluded that lncRNA SAMMSON can be used as an early marker of melanoma malignancy and a promising therapeutic target.
The role of lncRNA BCAR4 in breast cancer
Breast cancer is the most common type of cancer in women. It is the second leading cause of cancer death in women (14 %). Xing et al. [19] have demonstrated that lncRNA BCAR4 is not expressed in healthy breast tissue but is detected in more than 50 % of breast tumor samples. BCAR4 expression increases when cancer spreads to the lymph nodes. Besides, increased BCAR4 expression correlates with declining survival rates of breast cancer patients. Earlier studies conducted by other authors demonstrated that BCAR4 expression is triggered in malignant cells in response to tamoxifen, rendering them resistant to antiestrogen therapy [55].
Xing et al. [19] have shown that BCAR4 knockdown in breast cancer cell lines significantly reduces cell migration and invasion but does not affect cell proliferation. Using mass spectrometry and affinity purification of lysates, the researchers established that lncRNA BCAR4 directly interacts with the SNIP1 and PNUTS proteins. It was discovered that SNIP1 mediates formation of the phosphorylated GLI2/BCAR4 complex. GLI2 is a transcription factor: it regulates transcription of genes involved in cell migration and invasion by activating the Hedgehog signaling pathway. The ChIRP assay (Chromatin isolation by RNA purification) localized the BCAR4 transcript to the promoter regions of the GLI2 target genes (PTCH1, IL-6, MUC5AC, TGF-β). Upon BCAR4 knockdown, expression of these genes was downregulated.
It was established that lncRNA BCAR4 interacts with PNUTS, which forms a complex with protein phosphatase 1 (PP1); PP1 in turn dephosphorylates RNA polymerase II to maintain its normal function. Therefore, lncRNA BCAR4 makes an important contribution to activating transcription of GLI2 target genes, which at the cellular level stimulates migration and invasion of malignant cells.
In their work, Xing et al. demonstrated a therapeutic effect of BCAR4 knockdown in vivo using a mouse xenograft model of an aggressively metastasizing breast tumor. In the course of treatment, the animals received intravenous injections of 2 different LNA-modified antisense oligonucleotides against BCAR4; controls received scrambled LNA. The animals treated with therapeutic LNA showed considerable regression of lung metastasis, while the size of the main tumor remained stable.
The researchers also tried another treatment strategy: injections of target shRNAs into adipose breast tissue. The outcome was similar to that of the first experiment (considerable regression of lung metastasis and no change in the size of the main tumor), but the therapeutic effect was more marked. Xing et al. proposed to use BCAR4 expression as a marker of breast cancer progression and as a therapeutic target in patients at high risk of metastasis and with resistance to estrogen antagonists.
lncRNA HOTAIR in various types of cancer
HOTAIR (Hox transcript antisense intergenic RNA) is a noncoding RNA implicated in various cancer types [7]. HOTAIR is transcribed from the antisense strand of the HOXC cluster on chromosome 12 and is capable of recruiting PRC2 (polycomb repressive complex 2) and LSD1 (lysine-specific demethylase 1) [8] to another cluster of homeobox genes called HOXD [7]. PRC2 catalyzes histone H3K27 methylation, while LSD1 catalyzes demethylation of H3K4me2, causing silencing of target genes. Further experiments showed that HOTAIR can recruit PRC2 not only to HOXD genes, but also to a variety of others, including PGR (progesterone receptor gene), genes of the protocadherin family (PCDH10, PCDHB5, PCDH20), EPHA1 and JAM2 involved in tumor angiogenesis [56], and tumor suppressor genes (e. g. PTEN [7]).
It has been shown that HOTAIR expression is dramatically increased in breast cancer metastases [56], while its expression in the primary tumor is quite heterogeneous. Based on the analysis of primary tumors, the researchers concluded that an elevated level of HOTAIR expression is a predictive factor for metastatic growth and poor prognosis. Besides, experiments on cell lines and in vivo (in mice) demonstrated that increased HOTAIR expression promotes invasion and metastasis to the lungs. Gupta et al. modeled metastatic tumors in mice by injecting vector-transduced, HOTAIR-overexpressing MDA-MB-231 cells into the tail vein; an empty vector was used as a control. The engraftment of HOTAIR-overexpressing cells to the mammary fat pad was significantly, though modestly, increased in comparison with the controls, while the engraftment to the lung was 4 times higher [56].
Further studies elucidated the role of HOTAIR in the development of different cancer types, such as esophageal squamous-cell carcinoma, non-small cell lung cancer, gastric cancer, hepatocellular carcinoma, endometrial cancer, prostate cancer, nasopharyngeal carcinoma, laryngeal squamous-cell carcinoma, pancreatic cancer, colorectal cancer, melanoma, glioma, and sarcoma. In most cases, HOTAIR overexpression correlates with aggressive metastatic growth and poor survival rates [57].
Therefore, HOTAIR can be a promising therapeutic target in cancers with poor treatment outcomes. So far there have been a few in vivo experiments on mouse xenograft models reporting considerable inhibition of tumor growth associated with reduced HOTAIR expression. For example, Li et al. conducted experiments on mouse models of laryngeal squamous-cell carcinoma. The animals received subcutaneous injections of Hep-2 cells to induce tumors. Treatment included intratumoral injections of a lentiviral vector containing shRNA against HOTAIR. By the end of the experiment, tumors in the treatment group were significantly smaller than in the controls (1.113 ± 0.209 g vs. 1.960 ± 0.584 g, respectively) [58].
The role of lncRNA MALAT1 in cancer
The long noncoding RNA MALAT1 (Metastasis associated lung adenocarcinoma transcript 1) was first described in 1997, but got its current name only in 2003 when Ji et al. demonstrated the association between its expression and metastasis in patients with non-small cell lung cancer. It is the first lncRNA whose role has been described in the development of cancer [59]. MALAT1 is expressed abundantly in healthy human tissues and is conserved in mammals [59]. It has been shown that MALAT1 localizes to the cell nucleus in the nuclear speckles [15].
MALAT1 can form complexes with SR splicing proteins and change their localization in the nucleus. It is also capable of regulating phosphorylation of SF2/ASF. Tripathi et al. have demonstrated that MALAT1 knockdown in the cervical adenocarcinoma cell line HeLa modulates alternative splicing of many genes [16]. However, other studies conducted on lung cancer cell lines (A549, WT, GFP, KO1-3) do not report any considerable effect of MALAT1 knockdown on alternative splicing [60]. Besides, MALAT1 knockdown in mice does not result in developmental disorders or a pathological phenotype and does not change localization of SR proteins. Perhaps MALAT1 function differs between mice and humans, or special conditions (such as stress) are required for MALAT1 to manifest its activity phenotypically [20, 61].
While it is still debatable whether MALAT1 modulates alternative splicing, its involvement in regulating the expression of a number of genes is well established. For example, Tano et al. have demonstrated that MALAT1 knockdown in A549 lung carcinoma cells results in reduced expression of genes responsible for cell migration (CTHRC1, CCT4, HMMR, ROD1, etc.), which negatively affects cell motility [62]. Another extensive study conducted on several cell lines confirmed the role of MALAT1 in the activation of genes implicated in metastasis (GPC6, LPHN2, CDCP1 and ABCA1). The study demonstrated that expression of migration and invasion inhibitor genes (MIA2, ROBO1) increases following MALAT1 knockdown [60]. A possible mechanism of expression regulation by MALAT1 was proposed by Yang et al., who demonstrated that this lncRNA is capable of forming complexes with the protein Pc2, specifically with its unmethylated fraction; the methylated fraction of Pc2 interacts with another lncRNA, TUG1, and is a component of the PRC1 complex (polycomb repressive complex 1) [5].
The role of MALAT1 in promoting human lung cancer metastasis was shown in the first works dedicated to this lncRNA [59, 62]. Later, the role of its aberrant expression was shown for other cancer types, including bladder cancer, breast cancer, cervical cancer, colorectal cancer, endometrial cancer, esophageal cancer, gastric cancer, hepatocellular carcinoma, melanoma, neuroblastoma, osteosarcoma, ovarian cancer, prostate cancer, pituitary adenoma, multiple myeloma and renal cell carcinoma [20].
Given the above, MALAT1 seems to be a very promising therapeutic target for preventing metastatic growth. Gutschner et al. studied lung cancer metastasis in a mouse model in vivo. The mice received subcutaneous injections of human-derived EBC-1 cells to induce tumors. They were then divided into 2 groups: one group received subcutaneous injections of an antisense oligonucleotide against MALAT1, the other received a control ASO. The size of the primary tumor did not differ between the groups, while the number and size of lung metastases were smaller in the animals treated with the ASO against MALAT1. The researchers concluded that MALAT1 knockdown in the tumor can effectively prevent metastasis [60].
The role of lncRNA BACE1-AS in Alzheimer’s disease
Alzheimer’s disease is the most common form of age-related dementia, a neurodegenerative disorder manifested by memory loss and speech and cognitive impairments as a result of neuronal loss caused by extracellular deposition of β-amyloid plaques that damage the cells [63]. A key role in the formation of amyloid plaques is played by BACE1 (β-site APP-cleaving enzyme 1), the β-secretase that cleaves the APP precursor protein to produce β-amyloid, which assembles into plaques [64].
BACE1-AS is a 2 kb-long noncoding RNA transcribed from the opposite strand of locus 11q23.3. This lncRNA contains a 106 nt-long region fully complementary to exon 6 of BACE1 mRNA. Faghihi et al. studied the involvement of BACE1-AS in the pathogenesis of Alzheimer’s by measuring BACE1 expression [13]. In their experiment, strand-selective knockdown of the BACE1-AS transcript performed in human neuroblastoma cells (SH-SY-5Y) significantly reduced the levels of BACE1-AS, its sense partner BACE1, and β-secretase protein. Conversely, increased expression of BACE1-AS was accompanied by an increase in BACE1 RNA and protein expression. Besides, the researchers showed that BACE1 and BACE1-AS can form an RNA-RNA duplex which enhances stability of BACE1 mRNA. Different cell stress factors, including treatment with amyloid plaques, stimulate overexpression of both BACE1 and BACE1-AS.
The obtained data are consistent with the fact that in patients with Alzheimer’s BACE1-AS expression increases 2- to 6-fold in the affected brain regions, compared to control samples. Cellular stress stimulates expression of lncRNA BACE1-AS, which forms a duplex with BACE1 mRNA, enhancing its stability. As a result, β-secretase levels go up, prompting deposition of amyloid plaques, which in turn stimulates expression of BACE1-AS, forming a vicious circle.
Based on the results of their previous work, the authors hypothesized that siRNAs against BACE1-AS and BACE1 may have therapeutic potential in Alzheimer’s [65]. In their experiment they used Tg-19959 transgenic mice overexpressing human APP. The animals were implanted with osmotic minipumps in the third ventricle and received continuous infusions of LNA-modified siRNAs against BACE1 and BACE1-AS separately or against the overlapping region over a period of 14 days. In all cases BACE1 mRNA levels were significantly reduced following the knockdown, but simultaneous knockdown of both transcripts was the most effective: the BACE1 mRNA level decreased by 60 % relative to its initial value. The authors also investigated the effect of BACE1-AS knockdown on the levels of insoluble β-amyloid in vivo. After a 14-day infusion of siRNA against BACE1-AS, β-amyloid concentrations were measured in the hippocampal tissues of mice. Treatment with siRNA against BACE1-AS led to a considerable reduction in insoluble β-amyloid concentrations in the hippocampus, while the concentrations of soluble amyloid did not change. Faghihi et al. concluded that BACE1 and BACE1-AS can be used as therapeutic targets in Alzheimer’s disease.
lncRNA SMN-AS in the development of spinal muscular atrophy
Spinal muscular atrophy (SMA) is an autosomal recessive inherited disorder characterized by progressive degeneration of neurons in the anterior horns of the spinal cord and manifested by symmetrical muscle weakness and atrophy [66]. SMA is caused by a deletion or mutation in the SMN1 gene (Survival Motor Neuron 1) [67]. A duplication of human SMN1 that occurred in the course of evolution gave rise to the SMN2 gene. Its sequence is almost identical to that of SMN1 except for a single nucleotide substitution in exon 7, which disrupts normal pre-mRNA splicing and causes skipping of exon 7 in the mature mRNA. As a result, an unstable truncated protein is generated. It should be noted, though, that 10–20 % of SMN2 mRNAs are spliced correctly and produce a mature protein identical to that of SMN1 [68, 69].
Human SMN2 is located in an unstable chromosomal region prone to duplication, deletion and gene conversion. Therefore, the number of SMN2 copies in humans varies [70]. In SMA patients with a large number of SMN2 copies the symptoms are mild [71]. Type I SMA (1–2 SMN2 copies) tends to have an early onset; the patient dies before the age of 2 years. Patients with disease types III and IV have 3 or more copies of SMN2; in this case the onset of the disease is either juvenile or adult, and the progression is slow [72].
Methods that increase endogenous SMN2 expression can considerably alleviate the patient’s condition. Woo et al. [21] analyzed publicly available ChIP-seq data (ChIP-seq is chromatin immunoprecipitation with subsequent sequencing) from the ENCODE project and inferred that the repressive PRC2 complex binds to the SMN2 locus. The authors then performed experiments on SMA fibroblast cell lines. It was shown that knockdown of EZH1 and EZH2, components of the PRC2 complex, leads to a >2-fold increase in the level of exon 7-containing full-length SMN mRNA.
Woo et al. also discovered a previously unexplored lncRNA transcribed from the SMN locus, which they termed SMN-AS1 (SMN-Antisense 1). The high homology between SMN1 and SMN2 prompted the authors to hypothesize that SMN-AS1 is transcribed from both loci. Using RT-PCR, they detected a correlation between SMN-AS1 expression and the number of SMN2 copies in the genome. Besides, the researchers demonstrated that SMN-AS1 is capable of recruiting PRC2 to the SMN loci, which means that SMN-AS1 can downregulate the levels of SMN transcripts.
In the light of the above, methods that downregulate SMN-AS1 activity can be used in the management of patients with SMA. The method proposed by Woo et al. is based on interrupting the interaction of the lncRNA with PRC2. The researchers used an LNA-modified ASO complementary to the PRC2-binding region of SMN-AS1. Transfection of such LNA-modified oligos into SMA fibroblasts led to a 6-fold increase in the level of full-length SMN protein. Evidence supplied by the RNA immunoprecipitation (RIP) assay was sufficient to conclude that the introduced ASO blocked PRC2 binding to SMN-AS1. The LNA-modified oligo exhibited high specificity and did not produce any significant off-target effects. Besides, it was shown that upregulation of full-length SMN depended on the LNA concentration. Similar results were obtained in a neuronal cell model derived from patient iPSCs.
Woo et al. also studied the effect of LNA-modified oligos against SMN-AS1 in combination with other previously described ASOs that correct SMN2 splicing [73]. It was established that treating cells with both antisense oligonucleotides, compared with the splicing corrector alone, leads to a 2-fold increase in the amount of full-length SMN2 mRNA. The levels of the functional SMN protein also increase following the treatment. Thus, the maximum therapeutic effect can be obtained using a combination of the two approaches.
lncRNA HTTAS in Huntington's disease
Huntington's disease is an autosomal dominant progressive neurodegenerative disorder with late onset and a distinct phenotype, including chorea, dystonia and cognitive dysfunction [74]. The disease is caused by trinucleotide CAG repeat expansion in the huntingtin gene (HTT). In healthy individuals the number of repeats varies from 9 to 36; expanded to >37 repeats, the gene produces a toxic mutant protein with a long polyglutamine tract [75].
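The repeat-length criterion can be made concrete with a small sketch that counts the longest uninterrupted CAG tract in a DNA sequence and compares it with the ranges quoted above. The function names are hypothetical, the thresholds are simply the figures given in the text (up to 36 repeats in healthy individuals, more than 37 in the expanded allele), and the sketch is not a diagnostic tool; clinical repeat sizing relies on dedicated assays.

```python
import re


def longest_cag_run(dna: str) -> int:
    """Number of CAG units in the longest uninterrupted CAG tract."""
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_repeat_count(n_repeats: int,
                          normal_max: int = 36,
                          expanded_min: int = 38) -> str:
    """Classify a repeat count using the ranges quoted in the text.

    Counts between the two thresholds are left unclassified because the
    text does not assign them to either group.
    """
    if n_repeats <= normal_max:
        return "normal range"
    if n_repeats >= expanded_min:
        return "expanded"
    return "unclassified"


if __name__ == "__main__":
    toy_allele = "ATGGCG" + "CAG" * 42 + "CCGCCA"  # made-up sequence
    n = longest_cag_run(toy_allele)
    print(n, classify_repeat_count(n))
```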
Chung et al. discovered a previously unknown lncRNA referred to as HTTAS (huntingtin antisense). This RNA is transcribed from the antisense strand of the HTT locus [28]. Two isoforms of HTTAS have been described so far, HTTAS_v1 being the more interesting of the two. Its first exon harbors the expanded CAG repeat region. Chung et al. have demonstrated that antisense transcript expression depends on the length of the repeat: the higher the number of repeats, the lower the expression level. Knockdown and overexpression of HTTAS_v1 in the human cell lines HEK293 and SH-SY-5Y provided sufficient evidence of the negative effect of HTTAS_v1 transcript expression on the level of HTT mRNA. Using a genetic construct containing a cytomegalovirus promoter to increase HTTAS_v1 expression, the researchers achieved a 90 % reduction of HTT levels regardless of the repeat length. It was shown that the negative effect of normal HTTAS_v1 expression on HTT is weaker if the repeat is longer. To sum up, HTTAS inhibits HTT expression in healthy individuals; however, in pathology this mechanism is disrupted, leading to excessive accumulation of the toxic protein and disease progression. The authors concluded that HTTAS_v1 overexpression may serve as a therapeutic tool in the treatment of Huntington’s disease.
lncRNA UBE3A-ATS in Angelman syndrome
Angelman syndrome is a genomic imprinting disorder; its signs include mental retardation, seizures, facial dysmorphism and a specific behavioral phenotype [76]. In 60–70 % of cases the condition is caused by deletion of the 15q11-13 region on the maternal chromosome. Other, less common causes include paternal uniparental disomy (2–5 % of cases) and mutations in the UBE3A gene (20 % of cases). All of these lead to a lack of expression of UBE3A, the gene that encodes an E3 ubiquitin ligase and is normally expressed only from the maternal chromosome in neurons.
UBE3A-ATS is a long noncoding RNA transcribed from the antisense strand of UBE3A. It is a part of a larger transcript in humans and mice, whose transcription start site is located next to the imprinting center on the long arm of chromosome 15 [77]. In healthy individuals this lncRNA is exclusively expressed in neurons from the paternal chromosome, while sense UBE3A transcripts are expressed maternally [78].
Meng et al. investigated how expression of UBE3A mRNA is regulated by its antisense partner, lncRNA UBE3A-ATS [46]. Using a mouse model of Angelman syndrome, the researchers showed that deletion of the UBE3A-ATS promoter region activates expression of UBE3A from the paternal chromosome in vivo. To confirm that UBE3A-ATS transcription silences UBE3A, they used mice in which UBE3A-ATS transcription from the paternal chromosome was prematurely terminated. UBE3A-ATS-deficient neurons had elevated levels of UBE3A expression from the paternal chromosome, revealing the role of the noncoding RNA UBE3A-ATS in silencing UBE3A. The authors concluded that induced expression of UBE3A from the paternal chromosome could prevent a pathological phenotype in cases where the 15q11-13 region is deleted from the maternal chromosome.
In 2015 the same team published another work proposing ASO-mediated silencing of UBE3A-ATS, which triggers UBE3A expression from the paternal chromosome, as a possible therapy for Angelman syndrome. The experiment was carried out on a mouse model. The authors obtained a neuronal culture from model animals and treated it with an ASO against UBE3A-ATS, which increased UBE3A expression to 66–90 % of its level in wild-type mice. In addition, the authors confirmed the specificity of the ASO by RT-PCR, demonstrating that expression of the genes neighboring UBE3A did not change.
A series of in vivo experiments was carried out in which single doses of ASOs against UBE3A-ATS were injected into the lateral ventricle of the brain of adult mice. The animals tolerated the injections well: a month after the injection they showed no significant changes in weight or neural cell death rates and no increase in the formation of glial tissue. Four weeks after the injection a considerable drop of 60–70 % in UBE3A-ATS levels was observed, while UBE3A expression increased 2- or 5-fold in various regions of the brain and the spinal cord. The level of UBE3A-ATS remained low for 16 weeks after the injection and then rose gradually, returning to its initial level by week 20. The expression level of UBE3A changed accordingly. In addition, phenotype analysis showed that ASO administration ameliorated cognitive deficits associated with the disease [22].
lncRNA DBE-T in the development of type 1 Landouzy-Dejerine facioscapulohumeral muscular dystrophy
Type 1 Landouzy-Dejerine facioscapulohumeral muscular dystrophy (FSHD1) is an autosomal dominant muscular dystrophy with a progressive loss of strength in the muscles of the face and shoulder girdle [79]. The disease is caused by a deletion in the chromosomal region 4q35, which in healthy people harbors 11 to 110 copies of the 3.3 kb D4Z4 macrosatellite repeat. If the number of repeats drops to <11, the disease develops [23]. When the number of D4Z4 copies is large, the locus is heterochromatic and transcriptionally silent. The repressive PRC2 complex helps to maintain the silent state of 4q35 through its methyltransferase activity towards histone H3 (H3K27me3). Once the number of repeats decreases, H3K27me3 methylation is reduced and transcription of 4q35 genes is derepressed. Among the genes harbored by the 4q35 region is DUX4 [80, 81]. The DUX4 protein is a transcription factor, and its aberrant expression is toxic for cells [82].
Cabianca et al. discovered that in the muscles of patients with FSHD1 the lncRNA DBE-T (D4Z4 Binding Element-Transcript) is transcribed at the D4Z4 locus. Expression of this transcript is not detected in healthy people, in whom, owing to the large number of D4Z4 repeats, PcG (Polycomb Group) proteins actively bind to each repeat and inhibit transcription in the locus. If the number of D4Z4 repeats drops, PcG proteins do not bind to DNA and do not hinder DBE-T transcription. Using chromatin and RNA immunoprecipitation assays (ChIP-qPCR and RIP), the researchers established that the noncoding RNA DBE-T can bind directly to the ASH1L protein and recruit it to the D4Z4 locus. ASH1L is a component of the TrxG complex that mediates transcriptional derepression in locus 4q35, thus activating expression of the DUX4 protein, which is toxic for muscle cells [9].
It is known that other genes in locus 4q35, such as FRG1 (Facioscapulohumeral muscular dystrophy (FSHD) region gene 1), also contribute to FSHD1 development [80]. In earlier work aimed at developing new approaches to FSHD1 therapy, Wallace et al. proposed adeno-associated virus-mediated delivery of a microRNA targeting the FRG1 gene [83]. It is evident, though, that FSHD1 pathogenesis is very complex and involves several genes from the D4Z4 locus, whose expression is triggered by lncRNA DBE-T. This suggests that silencing of lncRNA DBE-T could be a promising therapeutic approach in patients with FSHD1 [9], though no works on its feasibility have been published so far.
CONCLUSION
Until recently it was thought that proteins are the main end product of gene expression. For a long time, therefore, studies of disease pathogenesis and therapeutic methods were bound to rely on protein-coding genes. But once state-of-the-art methods of analysis had been introduced, a whole universe of noncoding transcripts was discovered, no less diverse and abundant than proteins. Exploration of lncRNA functions has only just begun, but it is already clear that lncRNAs are involved in the majority of cellular processes and participate in the pathogenesis of various diseases. It has been shown that lncRNA behavior can be manipulated using standard molecular biological approaches. Success in this field depends on profound and detailed investigation of lncRNA function in health and disease. This reminds us of the importance of fundamental science, which serves as a basis for applied research.