Year: 2017 | Volume: 3 | Issue: 1 | Page: 1-12
Genome-wide transcriptome analysis of prostate cancer tissue identified overexpression of specific members of the human endogenous retrovirus-K family
Department of Molecular Biology, Max Planck Institute for Infection Biology; Department of Microbiology/Molecular Biology, Institute of Biology, Humboldt University of Berlin, Berlin, Germany
Date of Submission: 29-Nov-2016
Date of Acceptance: 18-Jan-2017
Date of Web Publication: 23-Feb-2017
Department of Molecular Biology, Max Planck Institute for Infection Biology, Chariteplatz 1, 10117 Berlin
Source of Support: None, Conflict of Interest: None
Aim: Human endogenous retroviruses (HERVs) are integrated into the human genome and represent 8% of the total genome. A retrovirus is the most complete retroelement and is characterized by three defined gene regions, gag, pol, and env, flanked by long terminal repeats. Among the different HERVs, the K family is one of the most recently integrated into the human genome. Activation and expression of members of this family have been linked to several human diseases, including prostate cancer. Here, we show the global expression pattern of HERV-K (HML-2) in prostate cancer tissue.
Methods: rRNA-depleted samples from 14 patients were subjected to whole-transcriptome sequencing. The data were analyzed with a dedicated bioinformatics method, and the results were confirmed by a series of quantitative polymerase chain reaction and immunoblotting experiments.
Results: For the first time, we showed the expression of gag in prostate cancer tissue both in the sequencing results and by immunoblotting. Expression of the gag protein was higher in the tumor samples than in the benign samples.
Conclusion: Overexpression of gag in tumor tissue may point to a functional role in the tumor, either by acting on neighboring genes or through the activation of promoter transcription factors.
Keywords: Human endogenous retroviruses, prostate cancer, RNA-seq
How to cite this article:
Sayanjali B. Genome-wide transcriptome analysis of prostate cancer tissue identified overexpression of specific members of the human endogenous retrovirus-K family. Cancer Transl Med 2017;3:1-12
How to cite this URL:
Sayanjali B. Genome-wide transcriptome analysis of prostate cancer tissue identified overexpression of specific members of the human endogenous retrovirus-K family. Cancer Transl Med [serial online] 2017 [cited 2018 Mar 20];3:1-12. Available from: http://www.cancertm.com/text.asp?2017/3/1/1/200859
Introduction
About 42% of the human genome consists of retroelements. In addition to the non-long terminal repeat (LTR) retrotransposons of the long interspersed nuclear element and short interspersed nuclear element families, other members derive from distinct exogenous retroviruses and are known as human endogenous retroviruses (HERVs). Several groups can be distinguished, of which the HERV-K group is among the youngest, i.e., its members were integrated into the human genome most recently. The HERV-K family comprises more than 900 solitary LTRs and around 90 intact full-length proviruses. HERV-K expression has been assessed in various human cancers. Studies suggest a remarkable tissue- and cancer-specific activity of individual HERV-K members. For example, HERV-K24 expression was prominent in malignant germ cell tumors and cell lines [Table 1]. Another HERV-K element on chromosome 22q11.23, named ERVK-32, is prostate-specific and highly expressed in most prostate carcinomas.
Table 1: Previously published studies on the association of human endogenous retrovirus expression with prostate cancer
The HERV-K Ch22q11.23 locus consists of a full-length HERV-K type 2 provirus with a solitary LTR (5'-LTR-I) located 700 base pairs (bp) upstream of the 5'-LTR (5'-LTR-II). The two 5'-LTRs are separated by 163 nt of HERV-K leader and 5' gag sequence and 490 nt of unique sequence. In the full-length HERV, the env and pol genes are defective, but a gag open reading frame (ORF) exists and can be expressed. A gag protein encoded by HERV-K Ch22q11.23 is frequently detected in prostate cancer, and the presence of serum antibodies against it correlates with higher tumor stage and worse prognosis. Through alternative splicing, HERV-K Ch22q11.23 also produces the accessory proteins rec and np9 or an unknown one called hel. Rec is a functional homolog of the HIV Rev protein. HERV-K expression is also known to be androgen-dependent, and androgen-regulated proviruses might interact with the androgen receptor network and contribute to its deregulation in prostate cancer.
In this study, we investigated the transcriptional activity of HERV-K proviruses in normal and cancerous prostatic tissues using a bioinformatics strategy based on high-throughput sequencing. Furthermore, the expression of specific HERVs was confirmed by quantitative polymerase chain reaction (qPCR) and immunoblotting. In addition, different splicing variants of the HERV-K at the Ch22q11.23 locus were characterized. The aim of the study was to demonstrate the overexpression of prostate-specific HERV-K in prostate cancer patient material.
Methods
Tissue samples were collected from patients undergoing radical prostatectomy for histologically proven primary prostate cancer at the Urologische Klinik, Charité-Universitätsmedizin Berlin, Germany. The median patient age was 63 years (range 54–69). Serum prostate-specific antigen levels were measured before surgery and ranged from 7.9 to 58 ng/mL (median 11.2 ng/mL) [Supplementary Table 1]. Five of 14 patients (36%) had organ-confined disease (pT2), whereas the remaining 64% had nonorgan-confined disease (pT3 and pT4). Using the Gleason score (GS) system, samples were scored from 3 to 9 with a median of 7 (8 of 14 samples had a GS of 7). Tissue samples were obtained immediately after surgery in a sterile environment, snap-frozen in liquid nitrogen, and stored at −80°C. The samples were then analyzed by a pathologist. After hematoxylin and eosin (H&E) staining, both a cancerous part and an adjacent cancer-free sample were obtained from the frozen tissues. Institutional Review Board approval for this study was obtained, and all patients gave their informed consent before surgery (Charité-Universitätsmedizin Berlin, document no. EA1/134/12).
Frozen tissues were mechanically sliced (average tissue weight around 100 mg) and immediately lysed in RNA-lysis buffer, column-purified, and eluted according to the manufacturer's instructions (QIAamp RNeasy Mini Kit, QIAGEN GmbH, Hilden, Germany). The OD260/280 ratio and nucleic acid concentrations were determined using the Nanodrop-1000 instrument (Peqlab Biotechnologie GmbH, Erlangen, Germany). In addition, the nucleic acid size distribution was evaluated with a Bioanalyzer-2100 (Agilent Technologies, Inc., Santa Clara, CA, USA). Only RNA samples with an RNA integrity number (RIN) above 8.0 were selected for further analysis. Before sequencing, human ribosomal RNA was depleted using a RiboMinus kit according to the manufacturer's protocol (RiboMinus™ Eukaryote System v2, Life Technologies, Carlsbad, USA).
Sequencing, alignment, and data analysis
Sequencing libraries for the Illumina Hiseq2500 were prepared following the NEBNext Ultra™ DNA Library Prep Kit protocol for Illumina library preparation (New England Biolabs, Ipswich, USA). Sequencing was performed at the Max Planck Genome Centre in Cologne, Germany (http://mpgc.mpipz.mpg.de/home/). RNA-seq data were generated for 28 rRNA-depleted samples (14 pairs of tumor and cancer-free tissue samples) at an average expected depth of 10 million 100 bp paired-end reads per sample. The computational pipeline used for data analysis is shown in [Figure 1]. The quality of the raw sequencing data was checked using FASTQC (http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc).
Figure 1. Sequencing, mapping, alignment, and bioinformatics pipeline. Sequences were aligned against the human genome (GRCh37/hg19) augmented by four sequences containing virus variants that do not or only partially exist in the human genome sequence: gi|5802813|gb|AF164611.1|Homo sapiens endogenous retrovirus human endogenous retrovirus-K103; gi|357197676|gb|JN675067.1|Homo sapiens isolate HML-2 12q13.2 endogenous virus human endogenous retrovirus-K; gi|16507981|gb|AY037928.1|Homo sapiens endogenous retrovirus human endogenous retrovirus-K-113 complete genome; gi|390194988|gb|JQ790992.1|Homo sapiens endogenous virus human endogenous retrovirus-K105 gag-pro-pol gene
The sequencing data were then mapped with the GEM mapper against the human genome (GRCh37/hg19) augmented by 4 sequences containing HERV-K variants that do not or only partially exist in the human genome sequence database (gi|5802813|gb|AF164611.1|HERV-K103; gi|357197676|gb|JN675067.1|HML-2_12q13.2; gi|16507981|gb|AY037928.1|HERV-K113; gi|390194988|gb|JQ790992.1|HERV-K105 gag-pro-pol gene). TopHat2 was used for spliced alignments. Splice junctions were identified using Cufflinks and/or the Sashimi Plot tool (http://www.broadinstitute.org/igv/Sashimi). Sequences with low quality and multiple mappings were disregarded. R/Bioconductor and the DESeq package were used to perform differential gene expression analysis between tumor and cancer-free prostate samples. For the isoform analysis, Cufflinks was used. Overlaps between aligned reads and the entries of a custom repeat region database (RepeatMasker regions downloaded from UCSC on June 23, 2014, integrated with ERVK sites from Subramanian et al.) were determined using custom software. Afterward, unique hits, nonunique hits, and hits weighted by the number of possible mappings were computed for each region in the data.
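The three hit counts described above can be illustrated with a minimal Python sketch. It is not the authors' custom software: the function name `count_hits`, the input format (one `(read_id, region_id)` pair per reported mapping), and the region names are all hypothetical, and the sketch assumes that "weighted by the number of possible mappings" means each mapping of a read with k mappings contributes 1/k.

```python
from collections import Counter, defaultdict


def count_hits(alignments):
    """Summarize hits per region.

    alignments: list of (read_id, region_id) pairs, one pair per reported
    mapping of a read (hypothetical input format, not the authors').
    """
    # Number of possible mappings for each read.
    n_mappings = Counter(read_id for read_id, _ in alignments)

    hits = defaultdict(lambda: {"unique": 0, "nonunique": 0, "weighted": 0.0})
    for read_id, region_id in alignments:
        k = n_mappings[read_id]
        if k == 1:
            hits[region_id]["unique"] += 1
        else:
            hits[region_id]["nonunique"] += 1
        # Assumed weighting: each mapping counts 1/k, so a read sums to one hit.
        hits[region_id]["weighted"] += 1.0 / k
    return dict(hits)


# Illustrative reads: read1 maps once, read2 twice, read3 three times.
example = [
    ("read1", "ERVK_A"),
    ("read2", "ERVK_A"), ("read2", "ERVK_B"),
    ("read3", "ERVK_A"), ("read3", "ERVK_B"), ("read3", "ERVK_C"),
]
for region, counts in sorted(count_hits(example).items()):
    print(region, counts)
```

With this weighting the weighted hits summed over all regions equal the number of reads, which makes the weighted count a conservative middle ground between counting only unique hits and counting every mapping in full.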
P values were calculated using an unpaired two-tailed Student's t-test, with P < 0.05 considered statistically significant. For the differential gene expression analysis, an adjusted P < 0.05 was considered statistically significant. qPCR and HERV quantification were performed using the same biological replicates used for RNA-seq analysis. Principal component analysis (PCA) was conducted with the R function prcomp using all unique hits in cancer-free or tumor samples.
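The text does not name the procedure behind the adjusted P values. The sketch below assumes the Benjamini-Hochberg procedure, which is the default adjustment reported by DESeq; the function name and the example P values are illustrative only and are not taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Assumption: the study's "adjusted P" is a BH-adjusted value.
    """
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Illustrative raw P values for four loci (not from the study).
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

The adjustment controls the expected fraction of false positives among the loci called significant, which matters here because thousands of repeat loci are tested at once.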
Quantitative polymerase chain reaction
We evaluated the expression pattern of GAG-HERV-K Ch22q11.23 and HERV-K Ch22q11.23 isoform 9 in normal and cancer tissues by reverse transcription-PCR (RT-PCR). A total of 3 µg of RNA was extracted from the same prostate tissues used for RNA-seq, using the GeneJET RNA Purification Kit according to the manufacturer's protocol (Life Technologies). As the gene of interest harbors no introns, the RNA was treated with DNase (Invitrogen, Carlsbad, USA) to remove gDNA before RT. After Bioanalyzer quality control (Agilent Technologies, Inc., Santa Clara, CA, USA), the resulting RNA was used for RT-PCR validation. Reactions were prepared according to the manufacturer's protocol using SYBR Select Master Mix (Invitrogen, Carlsbad, USA) and cycled on a 7900HT Real-Time PCR System (Applied Biosystems, Foster City, USA). GAPDH was used as an internal control, and all reactions were run in triplicate. mRNA levels were quantified by calculating average $2^{-\Delta\Delta C_t}$ values, where $C_t$ is the cycle number at which the control or target transcript reaches the chosen threshold. $\Delta C_t = C_{t,\mathrm{target}} - C_{t,\mathrm{GAPDH}}$ was calculated by subtracting the average $C_t$ of GAPDH from the average $C_t$ of the target transcript. Primers used in this study were manually designed using the NCBI BLAST tool [Supplementary Table 2]. The relative difference in expression between tumor and cancer-free samples was calculated as fold-change.
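The quantification described above can be sketched as follows. The sketch assumes the standard convention that ΔΔCt is the tumor ΔCt minus the ΔCt of the matching cancer-free sample; the function names and all Ct values are made up for illustration and are not measurements from the study.

```python
from statistics import mean


def delta_ct(target_cts, gapdh_cts):
    """Average Ct of the target minus average Ct of GAPDH (triplicates)."""
    return mean(target_cts) - mean(gapdh_cts)


def fold_change(tumor_target, tumor_gapdh, benign_target, benign_gapdh):
    """Relative expression, tumor versus cancer-free, as 2^(-ddCt).

    Assumption: ddCt = dCt(tumor) - dCt(cancer-free).
    """
    ddct = delta_ct(tumor_target, tumor_gapdh) - delta_ct(benign_target, benign_gapdh)
    return 2 ** (-ddct)


# Hypothetical triplicate Ct values for one tumor/cancer-free pair.
fc = fold_change(
    tumor_target=[24.0, 24.0, 24.0],
    tumor_gapdh=[18.0, 18.0, 18.0],
    benign_target=[27.0, 27.0, 27.0],
    benign_gapdh=[18.0, 18.0, 18.0],
)
print(fc)  # the target crosses the threshold 3 cycles earlier in tumor: 8.0
```

Because each cycle ideally doubles the product, a target that reaches the threshold three cycles earlier in the tumor sample, at equal GAPDH levels, corresponds to an eightfold higher starting amount.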
Samples for Western blotting were homogenized in T-PER Tissue Protein Extraction Reagent (Life Technologies, a proprietary detergent in 25 mM bicine, 150 mM sodium chloride; pH 7.6) containing a protease inhibitor cocktail (cOmplete, Roche, Basel, Switzerland). Proteins in the extracts were separated by 12% sodium dodecyl sulfate-polyacrylamide gel electrophoresis. After protein transfer, the polyvinylidene difluoride membranes were blocked in 5% nonfat dry milk in tris-buffered saline/0.05% Tween-20. The following antibodies were used: the monoclonal antibody HERMA-1, which is specific for the human teratocarcinoma-derived virus (HTDV)/HERV-K gag protein, a mouse monoclonal anti-β-actin antibody (1:5000, Sigma-Aldrich, St. Louis, USA), and an anti-mouse secondary antibody (Amersham), diluted 1:2000. Differentiated human teratocarcinoma cell lines produce the HTDV particles encoded by the HERV sequence HERV-K. After screening 2000 human sera, Bieda et al. and Boller et al. successfully produced the HERMA-1/HTDV antibody, which was provided to us by Boller.
Results
Catalog of prostate-specific human endogenous retroviruses based on their transcriptional activity
Prostate tissue cells express specific HERVs. For the analysis of genome-wide expression of HERVs in 28 prostate tumor/benign samples, high-throughput next generation sequencing (NGS) was performed. The median patient age was 63 years (range 54–69) [Table 2]. Tissue samples were obtained from patients undergoing radical prostatectomy at Charité Hospital in Berlin, Germany. All tumors selected for this study were confirmed by a pathologist to contain cancerous cells based on Gleason pattern, and the matching benign parts lacked any detectable cancerous cells. Tumor tissue samples were categorized according to the GS, with an average GS of 7. Tissue samples were subjected to RNA isolation and then prepared for NGS. Using the Illumina Hiseq2500, an average of 14.3 million 100 bp paired-end reads were generated per sample. After FASTQC quality control and filtering, a total of 402,891,188 reads could be mapped to the human genome (hg19) [Supplementary Table 3]. Applying the stringency described in the Methods, a large portion of the sequence reads for each sample could be mapped unambiguously to specific HERV sequences, especially of the HERV-K family [Supplementary Table 3]. Sequences with low quality and multiple mappings were disregarded. PCA of the unique reads per sample that mapped unambiguously to specific HERVs revealed comprehensive differences between tumor and normal samples [Figure 2]a and [Figure 2]b, with a few exceptions. Similar results were obtained after removing the patient and sample effects, which may otherwise introduce bias into the analysis [Figure 2]c. Afterward, analyses of differentially expressed genomic regions between tumor and cancer-free samples highlighted the transcriptional activity of HERVs in tumor samples.
Thus, we decided to systematically investigate genome-wide transcription of members of the HERV families in tumor and cancer-free tissue samples, applying stringent parameters to exclude sequence reads with low quality or multiple mappings. Within the genome-wide RNA-seq data set, 7975 differentially expressed HERVs were identified (P < 0.05). A total of 6707 of these HERV loci were inserted in a genomic region that contained either an exon or an intron, 960 of them in an exon, and 1268 HERV loci were located in an intergenic region and did not overlap with known genes. Most differentially expressed HERV regions were therefore detected within genes, and only about 16% of the HERVs were located outside of genes and promoters in the data set [Figure 3]a and [Figure 3]b. The repeat region database (http://www.repeatmasker.org/) contains 5,297,900 distinct sites covering all repeat types, of which 1,290,920 sites were detected in this data set with multiple mapping hits and 1,175,349 (22.19%) with at least one unique hit in one sample. Thus, the differential analysis offers valuable insight into the expression of HERVs in prostate tissue and revealed a single specific locus, together with several other loci and isoforms of known or unknown function in prostate cancer, as differentially expressed.
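The proportions quoted in this paragraph can be rechecked from the raw counts with a few lines of Python. The counts are the ones stated in the text; the variable and function names are illustrative.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Counts quoted in the text.
differential_loci = 7975
genic_loci = 6707         # inserted in a region containing an exon or intron
intergenic_loci = 1268    # no overlap with known genes
repeat_sites_total = 5297900
repeat_sites_unique_hit = 1175349

# The genic and intergenic counts together account for all differential loci.
assert genic_loci + intergenic_loci == differential_loci

print(f"genic: {percent(genic_loci, differential_loci):.1f}%")
print(f"intergenic: {percent(intergenic_loci, differential_loci):.1f}%")
print(f"repeat sites with a unique hit: "
      f"{percent(repeat_sites_unique_hit, repeat_sites_total):.2f}%")
```

The intergenic share comes out at 15.9%, so roughly five in six differentially expressed HERV loci lie within genes, and the 22.19% figure for repeat sites with at least one unique hit is reproduced.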
Table 2: Characteristics of the 14 prostate samples included in this study
Figure 2: Principal component analyses on read counts per sample for 28 prostate samples representing 14 benign/tumor pairs. Tumors and the benign matching pair are shown as red and green, respectively. IDs show sample type and patient number. (a) Principal component analyses on read counts per sample (unique hits only); sites with zero variance and a maximum of ≤ 5 unique hits across samples were excluded from the study. Variance was scaled to 1 to give all sites the same importance. Not excluding low-expressed sites (≤ 5 unique hits) shows a clear discrepancy between sample N2 and all other samples caused by widespread low-level expression in many sites. (b) Principal component analyses on human endogenous retrovirus-K loci only. This includes all ERVK sites (from http://www.repeatmasker.org/) without filtering. (c) Principal component analyses on human endogenous retrovirus-K loci after removing patient and sample effects
Figure 3. Distribution of significantly differentially expressed repeat loci across the chromosomes. The plot shows the genomic positions of significantly differentially expressed loci against the fold change. Each chromosome is colored by an alternating color, while white vertical lines indicate the centromeric region for each chromosome. (a) Distribution of repeats across all loci. (b) Distribution of repeats only in intergenic loci
A few members of the human endogenous retrovirus-K family are strongly transcribed and deregulated
Across all HERV repeats, the K family received the most hits across all samples: 5843 HERV-K loci were transcribed according to the RNA-seq data analysis, of which 3792 were covered by at least one unique hit in at least one sample. Top hits (out of 3792 hits) are shown in [Figure 4]. Next, differential expression between prostate cancer and paired benign samples was analyzed using the software package DESeq2. On average, 15 per 10000 reads were mapped to HERV-K loci [Supplementary Table 3]. Two HERV-K members were strongly expressed, namely, the HERVs on Ch17p13.1 and Ch22q11.23. In line with previous reports on HERV expression in prostate cancer, the analyses of the RNA-seq data showed that tumor samples exhibit a higher expression of HERVs than benign samples. Across all prostate samples, transcripts from HERV-K Ch22q11.23 were most frequently identified. In addition, transcripts from up to four other HERV-K proviruses were detected at varying frequencies. Besides the predominantly transcribed HERV-K Ch22q11.23, the analyses identified transcripts from ERVK-17p13.1, Ch22 ERVK-11, and Ch19 ERVK-3, which were expressed differently across samples [Figure 5]a. In agreement with a recently published database of RNA-seq data from different healthy tissues, we confirmed that Ch22q11.23 is one of the most differentially expressed loci [Supplementary Figure 1]. The same analysis was performed with the high-throughput sequencing data, and we identified several differentially expressed HERVs, including the one in Ch22q11.23. Additional HERVs that were less and/or equally differentially expressed compared to Ch22q11.23 were also identified. Notably, 17p13.1 appeared to be a prostate-specific HERV [Figure 5]b. These two regions allowed a separation of tumor and normal samples and might be used as prostate cancer biomarkers in the future, as also suggested by others.
Figure 4: Strongly expressed human endogenous retrovirus-K loci in prostate tissue. Loci with a median number of > 10 reads across all samples are shown. Human endogenous retroviruses on Ch17p13.1 and Ch22q11.23 are the strongest expressed ones. Median-hits show the median number of the reads mapped to each locus
Figure 5: Differential expression of human endogenous retrovirus-K loci in tumor versus cancer-free prostate tissue samples. (a) Expression of human endogenous retrovirus-K elements was assessed by RNA-seq and deregulation determined by DESeq. Ch22q11.23 is predominantly expressed in tumors (upregulation in tumor versus cancer-free samples is shown in red). (b) Deregulation of human endogenous retrovirus-K loci in tumor versus cancer-free prostate tissue samples, shown as log 2 fold change of human endogenous retrovirus-K loci expression in prostate tissue. Ch22q11.23 shows strong upregulation in tumor across all sample pairs. Red shows the statistically significant ones
Quantification of specific human endogenous retrovirus-K transcription in prostate cancer tissues by quantitative reverse transcription-polymerase chain reaction
For the validation of the genome-wide RNA-seq data, qRT-PCR was performed in a set of prostatic tissues using specific primer pairs for the gag region of HERV-K Ch22q11.23. In line with the RNA sequencing data, transcripts from HERV-K Ch22q11.23 were detectable in prostatic tissue samples, and the expression was significantly higher in prostate tumor tissues than in benign prostate samples (fold changes ranging from < 10-fold to > 90-fold in some samples) [Figure 6]a. Overall, RNA-seq and qRT-PCR showed comparable fold changes in selected samples and confirmed previous observations, also made by others, that the HERV-K Ch22q11.23 gag gene is expressed at significantly higher levels in prostate tumors than in normal prostate tissue.
Figure 6: Gag expression in tumor versus tumor-free samples. (a) Upregulation of the expression of Ch22q11.23 gag in prostate tumor samples detected by quantitative reverse transcription-polymerase chain reaction. Relative gene expression level of human endogenous retrovirus-K Ch22q11.23 gag region was measured by quantitative reverse transcription-polymerase chain reaction in tumor versus cancer-free sample. Expression is shown as log 2 fold change, and error bars denote standard error of the mean. (b) Human endogenous retrovirus-K gag protein expression judged from immunoblotting. Four tumor/cancer-free pairs were selected based on quantitative reverse transcription-polymerase chain reaction results. For Western blotting, the HERMA-1/HTDV antibody used was reported to be able to detect the human endogenous retrovirus-K-derived gag protein. Actin served as a tissue lysate loading control
Human endogenous retrovirus-K-encoded gag protein expression in prostate tissue samples
To investigate whether these observations could be confirmed at the protein level, immunoblotting analyses were performed using the same prostatic samples that were used for the RNA-seq approach. The monoclonal antibody HERMA-1/HTDV that was used has been suggested to be specific for the HERV-K-derived gag protein. Consistent with the strong specificity of the antibody, the results showed a clear difference between tumor and cancer-free whole tissue lysates. Several bands were observed, among them a major band with a molecular mass of 80 kDa, likely corresponding to the gag precursor protein, and a prominent band with a molecular mass of around 30 kDa, likely representing the cleaved gag (core protein) of HTDV7/HERV-K [Figure 6]b.
Characterization of human endogenous retrovirus-K Ch22q11.23 alternative transcripts and splicing variants in prostate tissue
In addition to the analysis of differential gene expression, we investigated whether different isoforms of the HERV-K Ch22q11.23 region were differentially expressed. Detecting the different isoforms could help in understanding the function of each transcript, for instance, whether a functional env or np9 protein might be produced. Using the Cufflinks isoform analysis, the transcripts predicted in the region of HERV-K Ch22q11.23 (Chr22:23840000-23891000) were analyzed for each sample, and a total of 11 isoforms were identified in the RNA-seq data [Figure 7]a and [Figure 7]b. Analysis using DNASTAR software showed the presence of a putative gag in transcripts TCONS_00000002, TCONS_00000008, TCONS_00000009, TCONS_000000010, and TCONS_000000011 when they were aligned against HERV-K113, HERV-K115, and HERV-K109, which served as HERV-K models of known complete structure [Figure 7]c. These were the five most expressed transcripts in all samples. All isoforms have an ORF for gag, but for env, there are splice variants resulting in truncated transcripts.
Figure 7: Human endogenous retrovirus-K Ch22q11.23 splicing variants (predicted by Cufflinks). Cufflinks was run for the region near the human endogenous retrovirus-K at 22q11.23 (22:23840000-23891000) for each sample, to characterize possible transcripts from the RNA-seq data. (a) Transcription models of isoforms. Eleven isoforms were identified; they are labeled TCONS00000001 to TCONS000000011. On the left side, tx_id shows the isoforms numbered with TCONS00000001, etc. Position shows the location of isoforms on chromosome 22. (b) Mapping of alternative splice variants/isoforms to human endogenous retrovirus-K Ch22q11.23. Models from individual samples were then merged to a single model (shown with a red arrow). For that unified model, expression levels were estimated by cuffquant and cuffnorm, normalizing to the number of total reads per sample (excluding ribosomal RNA sites). (c) A screenshot of DNASTAR analysis shows the differences between isoforms and K108 and/or K113 human endogenous retroviruses
Isoforms 8 and 9 were the most highly expressed isoforms and showed the strongest difference between tumor and benign samples in the RNA-seq results [Figure 8]a and [Figure 8]b. To further confirm these findings, specific primers for isoform 9 were designed from the sequencing data using the BLAST tool (http://blast.ncbi.nlm.nih.gov) to check the expression of isoform 9 by qRT-PCR. The qRT-PCR with isoform-specific primers confirmed that the full-length isoform (isoform 9) was strongly deregulated and overexpressed in tumor samples compared to cancer-free samples [Figure 8]c.
Figure 8: Expression pattern of isoform 9 from human endogenous retrovirus-K Ch22q11.23. (a) Median expression level across all samples for each transcript isoform is shown as fragments per kilobase of transcript per million mapped reads. TCONS_00000009 (isoform 9) shows the highest expression. (b) The expression level of different isoforms across all samples is shown as fragments per kilobase of transcript per million mapped reads. Each isoform is colored differently (isoform 9 is shown as dark blue/purple and represents the highest expression in almost all samples). (c) Expression of isoform 9 in prostate tissue. Relative gene expression level of human endogenous retrovirus-K Ch22q11.23 isoform 9 was measured in 14 benign and 14 matching prostate cancer tissues by quantitative reverse transcription-polymerase chain reaction. All transcript levels were adjusted to expression of each matching benign section for each sample. Expression is shown as log 2 fold change, and error bars denote standard error of the mean
As mentioned above, isoform 9 was the most abundant. The results show differential expression of this isoform in tumor versus benign tissue in almost all samples. As expected, the two approaches yielded largely overlapping and also complementary information. Future work might address the effects of upregulation of the Ch22q11.23 locus on neighboring genes [Supplementary Table 4 [Additional file 1]] or even the methylation status of the whole region. Comparing the current sequencing data with already published databases, we found similar results in prostate tissue.
Discussion
In the last few years, several studies have demonstrated correlations between HERV expression and human diseases such as multiple sclerosis. Furthermore, there is accumulating evidence that HERV expression may contribute to prostate cancer. Moreover, immune responses to endogenous retrovirus-related particles and proteins have also been correlated with cancer, but no clear clinical significance has been suggested. The HERV-K located on Ch22q11.23 is a complete retrovirus element, with gag, pol, and env genes, flanked by LTR sequences. The pol and env genes have mutations and/or deletions that prevent the proteins from being translated. In contrast, the gag gene has a complete ORF that translates to a protein of 715 amino acids. Restricted expression of gag-HERV-K Ch22q11.23 in prostate cancer tissue suggests this molecule as a suitable biomarker and target for prostate cancer. In this study, HERV expression in prostate cancer and benign tissue was investigated by whole transcriptome RNA-sequencing. A general difficulty in the analysis lies in the nature of the HERV-K repeat sequences, which are very similar to each other.
Primers for qPCR therefore have to be capable of specifically amplifying the selected region. Thus, primer design is critical, and the software DNASTAR and/or the NCBI BLAST tool were used to design specific primers. Moreover, we characterized the gag of the HERV-K located on chromosome 22q11.23 as highly expressed in prostate cancer and as capable of producing protein, as was confirmed in immunoblotting experiments. Since NGS technology has previously been applied to prostate tissue as well as other tissues, direct comparisons with previous studies are possible. The evident tissue-specificity of HERV-K expression raises the question of which factors determine such a high expression in prostate cancer. An in silico analysis identified the distribution of transcription factor binding sites in the 5'-LTR of the HERV, as was also reported previously. For example, it is known that HERV-K Ch22q11.23 is regulated by androgens through androgen receptor binding sites in its 5'-LTR, and its increased expression in prostate cancer may derive from the distorted androgen response. [Table 3] shows the specific transcription factors that bind directly in the genomic region of Ch22q11.23. Recently, Goering et al. showed that the predicted binding sites in the LTRs of expressed proviruses in prostatic tissues were very similar to those of silent proviruses, including those for the androgen receptor. This finding suggests that tissue- and tumor-specific HERV-K expression patterns are determined more by epigenetic factors than by specific transcription factors. Recently, there was a report that the DNA methylation status of the HERV-K LTR contributes to gag-HERV-K expression in cancer cell lines (VCaP, LNCaP, and PC3).
Table 3: Transcription factor binding sites of human endogenous retrovirus-K Ch22q11.23 represented by ENCODE from the UCSC genome browser for the Chr22:23,878,000–23,891,000 genomic region, GRCh37/hg19 2009 human assembly
In another study, HERV-K Ch22q11.23 was observed to be highly expressed in prostate cancer samples, and the expression was dependent on LTR region demethylation. Nevertheless, methylation is just one of the several mechanisms that impact chromatin structure and gene expression, and it is possible that some of the heterogeneity in gag expression observed among cancer cell lines might be related to differences in other issues (i.e., acetylation). In 2007, Tomlins et al. reported a translocation event in the promoter region of HERV-K in the prostate. In that report, the HERV-K ch22q11.23 5'-LTR region was shown to be fused with ETV1, leading to an overexpression of the gene in patients with prostate cancer, but it is not yet known if the LTR region alone is sufficient to activate EVT1 expression in these patients. In addition, there is evidence that the 5'-LTR region of HERV-K is active and also responses to androgen stimulation. These findings emphasize the importance of HERV-K activity in prostate cancer. However, we believe that specific expression of gag in prostate cancer occurs by the combination of several mechanisms such as DNA demethylation or other epigenetic modification as well as androgen stimulation. Whether the expression of HERV-K gag is a natural event or there is a biologic relevance related to prostate cancer progression still needs to be addressed.
It is well known that clinical staging of prostate cancer is a critical step in evaluating the risk of the disease and, therefore, in choosing treatment strategies. The results presented here indicate that HERV-K gag might be a useful biomarker for diagnostic purposes in prostate cancer, as the expression levels were significantly different in tumor and benign samples, bearing in mind that the benign samples were also taken from cancer patients and might not be composed entirely of healthy cells. However, as HERV-K Ch22q11.23 appears to be specifically relevant in prostate cancer, a thorough analysis is necessary. Therefore, the different transcripts of the locus were analyzed. With its two 5'-LTRs, the locus appears to be complex; previous reports indicate that transcription can start from either 5'-LTR, with generally more frequent initiation from the upstream LTR. This may be important for studying the mechanism of HERV activation in the future. The analysis showed a higher expression of some specific full-length isoforms of the transcripts in the cancerous material. These experiments do not, of course, rule out other effects of HERV-K Ch22q11.23 transcripts and proteins in vivo, especially on the interaction of cancer with the immune system. In addition to HERV-K Ch22q11.23, HERV-K Ch17p13.1 on chromosome 17 emerged as broadly expressed in normal prostate and prostate cancer. It is also commonly expressed in Tera-1 cells and melanoma cell lines, as reported previously.
Analysis showed that HERV-K Ch17p13.1 expression is likewise increased in prostate cancer. Another HERV-K locus commonly expressed in the prostate is ERVK-15 at chromosome 7q34. According to the NGS results, its expression is only slightly deregulated in prostate cancer. Furthermore, alternative splice variants were detected, again with differences between the samples. The identified splicing pattern explains why no evidence for a rec protein or for a non-rec ORF was obtained in previous studies. The newly defined np9-like protein-encoding transcript is overexpressed in prostate cancer. Detecting the different isoforms can help in understanding the function of each transcript, as illustrated by the distinct functions of the transcripts of the p63 gene or of env, rec, and np9 of HERVs. In this study, we observed and confirmed by qPCR that a particular isoform, in this case the full-length transcript, is expressed at a higher level than the others. This suggests that this transcript might have a specific function. Future experiments might focus on the splicing variants of this HERV, as it is a prostate-specific endogenous provirus. Recently, Bhardwaj et al. reported strong HERV-K Ch22q11.23 expression in Tera-1 cells and characterized several transcripts. We observed an even wider range of splice variants in prostatic tissues. They also discovered that the long noncoding RNA TCONS_l2_00017644, expressed predominantly in prostate, testes, and ovaries, actually contains the HERV-K Ch22q11.23 gag ORF (70 kDa) and is therefore not a lincRNA. Likewise, PCAT-14, reported as a prostate-specific lincRNA strongly upregulated in prostate cancer, is in fact a coding HERV-K Ch22q11.23 transcript. Accordingly, in a recent RNA-seq study of the germ cell cancer cell line Tera-1, two proviruses contributed more than half of all HERV-K transcripts, namely, ERVK-24 and, interestingly, HERV-K Ch22q11.23.
The expression analysis of the surrounding genes located on chromosome 22q11.23 showed that the expression of gag-HERV-K is not due to unspecific activation of the region in prostate cancer, as was also reported previously. These findings highlight the value of studying the genes neighboring the HERVs as well as the epigenetic mechanisms of activation related to HERV expression [Figure 9].
|Figure 9: Expression plots of human endogenous retrovirus-K Ch22q11.23 and neighboring area. As clearly shown, the expression of human endogenous retrovirus on locus 22q11.23 is significantly higher in tumor versus benign samples, but the neighboring genes on that locus do not show such a strong deregulation|
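The comparison between the provirus and its neighboring genes can be illustrated with a minimal sketch. It is a simplified illustration, not the statistical method used in this study: it computes a log2 fold change of mean expression between tumor and benign samples and lists the features that pass a threshold. The counts, the neighbor gene names, the pseudocount, and the threshold are all hypothetical values chosen for illustration.

```python
import math
from statistics import mean


def log2_fold_change(tumor, benign, pseudocount=1.0):
    """Return log2((mean tumor + pseudocount) / (mean benign + pseudocount)).

    The pseudocount (an assumption of this sketch) avoids division by zero
    for features with no reads in the benign samples.
    """
    return math.log2((mean(tumor) + pseudocount) / (mean(benign) + pseudocount))


def strongly_deregulated(expression, threshold=2.0):
    """List features whose absolute log2 fold change reaches the threshold."""
    return [
        name
        for name, groups in expression.items()
        if abs(log2_fold_change(groups["tumor"], groups["benign"])) >= threshold
    ]


# Hypothetical normalized read counts, invented for illustration only.
# They are NOT data from this study, and the neighbor names are placeholders.
expression = {
    "HERV-K_Ch22q11.23": {"tumor": [60, 63, 66], "benign": [6, 7, 8]},
    "neighbor_gene_A": {"tumor": [30, 31, 32], "benign": [31, 31, 31]},
    "neighbor_gene_B": {"tumor": [15, 15, 15], "benign": [7, 7, 7]},
}

for name, groups in expression.items():
    lfc = log2_fold_change(groups["tumor"], groups["benign"])
    print(f"{name}: log2 fold change = {lfc:.2f}")

print("Strongly deregulated:", strongly_deregulated(expression))
```

With real data, a dedicated differential expression tool that models count dispersion would be preferable to a plain ratio of means; the sketch only shows the shape of the locus-versus-neighbors comparison.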
The in vivo relevance of increased HERV-K Ch22q11.23 expression and/or the methylation status of the LTR region in prostate cancer is of interest. Such differences might reflect different subtypes or stages of prostate cancer, and some of the differences in the induction of gag, which was reported to be androgen dependent, may be related to transformation in cell culture. Another interesting aspect of HERV activation is its relation to the immune system. Specific immune responses to endogenous retrovirus particles have been described before. It was also shown that patients with advanced prostate cancer have a higher incidence of an antibody-mediated (humoral) immune response to the HERV-K gag protein, which may suggest an increase in gene expression with prostate cancer progression.
However, the exact correlation between the immune response and cancer progression is not yet well understood. One possibility is the activation of intracellular RNA immune sensors, which can induce an innate immune response in cells that express high levels of provirus. Gag protein produced in high amounts might disturb autophagy or proteasome processes and might lead to the induction of an adaptive immune response. Future investigations should address whether different isoforms of HERV-K have different roles in cancer progression and/or diagnosis, whether expression or protein production changes with androgen deprivation, and how this affects the immune responses in the tissue.
Acknowledgments
I would like to thank Dr. Hans Krause and Waltraut Jekabsons for preparing the prostate probes; Dr. Munir Al-Zeer and Prof. Holger Brüggemann for help with experimental design and scientific discussion; Dr. Christian F. P. Scholz and Dr. Hilmar Berger for bioinformatics analysis; Prof. Thomas F. Meyer for helpful insights during the work; Dr. Sara Shahnejat-Bushehri for writing, editing, and finalizing the manuscript; and the Max Planck Institute of Genome Center in Cologne, Germany, especially Dr. Lisa Czaja, for preparing the libraries and performing the NGS.
Financial support and sponsorship
Conflict of interest
There are no conflicts of interest.
References
Bannert N, Kurth R. Retroelements and the human genome: new perspectives on an old relation. Proc Natl Acad Sci U S A 2004; 101 Suppl 2: 14572–9.
Mayer J, Meese E, Mueller-Lantzsch N. Human endogenous retrovirus K homologous sequences and their coding capacity in old world primates. J Virol 1998; 72 (3): 1870–5.
Reus K, Mayer J, Sauter M, Zischler H, Müller-Lantzsch N, Meese E. HERV-K (OLD): ancestor sequences of the human endogenous retrovirus family HERV-K (HML-2). J Virol 2001; 75 (19): 8917–26.
Belshaw R, Pereira V, Katzourakis A, Talbot G, Paces J, Burt A, Tristem M. Long-term reinfection of the human genome by endogenous retroviruses. Proc Natl Acad Sci U S A 2004; 101 (14): 4894–9.
Belshaw R, Dawson AL, Woolven-Allen J, Redding J, Burt A, Tristem M. Genomewide screening reveals high levels of insertional polymorphism in the human endogenous retrovirus family HERV-K (HML2): implications for present-day activity. J Virol 2005; 79 (19): 12507–14.
Marchi E, Kanapin A, Magiorkinis G, Belshaw R. Unfixed endogenous retroviral insertions in the human population. J Virol 2014; 88 (17): 9529–37.
Subramanian RP, Wildschutte JH, Russo C, Coffin JM. Identification, characterization, and comparative genomic distribution of the HERV-K (HML-2) group of human endogenous retroviruses. Retrovirology 2011; 8: 90.
Downey RF, Sullivan FJ, Wang-Johanning F, Ambs S, Giles FJ, Glynn SA. Human endogenous retrovirus K and cancer: innocent bystander or tumorigenic accomplice? Int J Cancer 2015; 137 (6): 1249–57.
Macfarlane CM, Badge RM. Genome-wide amplification of proviral sequences reveals new polymorphic HERV-K (HML-2) proviruses in humans and chimpanzees that are absent from genome assemblies. Retrovirology 2015; 12: 35.
Moyes DL, Martin A, Sawcer S, Temperton N, Worthington J, Griffiths DJ, Venables PJ. The distribution of the endogenous retroviruses HERV-K113 and HERV-K115 in health and disease. Genomics 2005; 86 (3): 337–41.
Turner G, Barbulescu M, Su M, Jensen-Seaman MI, Kidd KK, Lenz J. Insertional polymorphisms of full-length endogenous retroviruses in humans. Curr Biol 2001; 11 (19): 1531–5.
Hohn O, Hanke K, Bannert N. HERV-K (HML-2), the best preserved family of HERVs: endogenization, expression, and implications in health and disease. Front Oncol 2013; 3: 246.
Wang-Johanning F, Frost AR, Jian B, Azerou R, Lu DW, Chen DT, Johanning GL. Detecting the expression of human endogenous retrovirus E envelope transcripts in human prostate adenocarcinoma. Cancer 2003; 98 (1): 187–97.
Mayer J, Blomberg J, Seal RL. A revised nomenclature for transcribed human endogenous retroviral loci. Mob DNA 2011; 2 (1): 7.
Tomlins SA, Laxman B, Dhanasekaran SM, Helgeson BE, Cao X, Morris DS, Menon A, Jing X, Cao Q, Han B, Yu J, Wang L, Montie JE, Rubin MA, Pienta KJ, Roulston D, Shah RB, Varambally S, Mehra R, Chinnaiyan AM. Distinct classes of chromosomal rearrangements create oncogenic ETS gene fusions in prostate cancer. Nature 2007; 448 (7153): 595–9.
Sauter M, Schommer S, Kremmer E, Remberger K, Dölken G, Lemm I, Buck M, Best B, Neumann-Haefelin D, Mueller-Lantzsch N. Human endogenous retrovirus K10: expression of Gag protein and detection of antibodies in patients with seminomas. J Virol 1995; 69 (1): 414–21.
Wang-Johanning F, Liu J, Rycaj K, Huang M, Tsai K, Rosen DG, Chen DT, Lu DW, Barnhart KF, Johanning GL. Expression of multiple human endogenous retrovirus surface envelope proteins in ovarian cancer. Int J Cancer 2007; 120 (1): 81–90.
Schmitt K, Reichrath J, Roesch A, Meese E, Mayer J. Transcriptional profiling of human endogenous retrovirus group HERV-K (HML-2) loci in melanoma. Genome Biol Evol 2013; 5 (2): 307–28.
Ishida T, Obata Y, Ohara N, Matsushita H, Sato S, Uenaka A, Saika T, Miyamura T, Chayama K, Nakamura Y, Wada H, Yamashita T, Morishima T, Old LJ, Nakayama E. Identification of the HERV-K gag antigen in prostate cancer by SEREX using autologous patient serum and its immunogenicity. Cancer Immun 2008; 8: 15.
Goering W, Ribarska T, Schulz WA. Selective changes of retroelement expression in human prostate cancer. Carcinogenesis 2011; 32 (10): 1484–92.
Pérot P, Cheynet V, Decaussin-Petrucci M, Oriol G, Mugnier N, Rodriguez-Lafrasse C, Ruffion A, Mallet F. Microarray-based identification of individual HERV loci expression: application to biomarker discovery in prostate cancer. J Vis Exp 2013; (81): e50713.
Marco-Sola S, Sammeth M, Guigó R, Ribeca P. The GEM mapper: fast, accurate and versatile alignment by filtration. Nat Methods 2012; 9 (12): 1185–8.
Reis BS, Jungbluth AA, Frosina D, Holz M, Ritter E, Nakayama E, Ishida T, Obata Y, Carver B, Scher H, Scardino PT, Slovin S, Subudhi SK, Reuter VE, Savage C, Allison JP, Melamed J, Jäger E, Ritter G, Old LJ, Gnjatic S. Prostate cancer progression correlates with increased humoral immune response to a human endogenous retrovirus GAG protein. Clin Cancer Res 2013; 19 (22): 6112–25.
Agoni L, Guha C, Lenz J. Detection of human endogenous retrovirus K (HERV-K) transcripts in human prostate cancer cell lines. Front Oncol 2013; 3: 180.
Bhardwaj N, Montesion M, Roy F, Coffin JM. Differential expression of HERV-K (HML-2) proviruses in cells and virions of the teratocarcinoma cell line Tera-1. Viruses 2015; 7 (3): 939–68.
Hanke K, Chudak C, Kurth R, Bannert N. The Rec protein of HERV-K (HML-2) upregulates androgen receptor activity by binding to the human small glutamine-rich tetratricopeptide repeat protein (hSGT). Int J Cancer 2013; 132 (3): 556–67.
Kreimer U, Schulz WA, Koch A, Niegisch G, Goering W. HERV-K and LINE-1 DNA methylation and reexpression in urothelial carcinoma. Front Oncol 2013; 3: 255.
Cabili MN, Trapnell C, Goff L, Koziol M, Tazon-Vega B, Regev A, Rinn JL. Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev 2011; 25 (18): 1915–27.
Wallace TA, Downey RF, Seufert CJ, Schetter A, Dorsey TH, Johnson CA, Goldman R, Loffredo CA, Yan P, Sullivan FJ, Giles FJ, Wang-Johanning F, Ambs S, Glynn SA. Elevated HERV-K mRNA expression in PBMC is associated with a prostate cancer diagnosis particularly in older men and smokers. Carcinogenesis 2014; 35 (9): 2074–83.
Goering W, Schmitt K, Dostert M, Schaal H, Deenen R, Mayer J, Schulz WA. Human endogenous retrovirus HERV-K (HML-2) activity in prostate cancer is dominated by a few loci. Prostate 2015; 75 (16): 1958–71.
Wang-Johanning F, Li M, Esteva FJ, Hess KR, Yin B, Rycaj K, Plummer JB, Garza JG, Ambs S, Johanning GL. Human endogenous retrovirus type K antibodies and mRNA as serum biomarkers of early-stage breast cancer. Int J Cancer 2014; 134 (3): 587–95.
Prensner JR, Iyer MK, Balbin OA, Dhanasekaran SM, Cao Q, Brenner JC, Laxman B, Asangani IA, Grasso CS, Kominsky HD, Cao X, Jing X, Wang X, Siddiqui J, Wei JT, Robinson D, Iyer HK, Palanisamy N, Maher CA, Chinnaiyan AM. Transcriptome sequencing across a prostate cancer cohort identifies PCAT-1, an unannotated lincRNA implicated in disease progression. Nat Biotechnol 2011; 29 (8): 742–9.
Hahn S, Ugurel S, Hanschmann KM, Strobel H, Tondera C, Schadendorf D, Löwer J, Löwer R. Serological response to human endogenous retrovirus K in melanoma patients correlates with survival probability. AIDS Res Hum Retroviruses 2008; 24 (5): 717–23.
Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol 2013; 14 (4): R36.
Bieda K, Hoffmann A, Boller K. Phenotypic heterogeneity of human endogenous retrovirus particles produced by teratocarcinoma cell lines. J Gen Virol 2001; 82 (Pt 3): 591–6.
Boller K, Janssen O, Schuldes H, Tönjes RR, Kurth R. Characterization of the antibody response specific for the human endogenous retrovirus HTDV/HERV-K. J Virol 1997; 71 (6): 4581–8.
Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 2014; 15 (12): 550.
Rakoff-Nahoum S, Kuebler PJ, Heymann JJ, Sheehy ME, Ortiz GM, Ogg GS, Barbour JD, Lenz J, Steinfeld AD, Nixon DF. Detection of T lymphocytes specific for human endogenous retrovirus K (HERV-K) in patients with seminoma. AIDS Res Hum Retroviruses 2006; 22 (1): 52–6.
Wang-Johanning F, Rycaj K, Plummer JB, Li M, Yin B, Frerich K, Garza JG, Shen J, Lin K, Yan P, Glynn SA, Dorsey TH, Hunt KK, Ambs S, Johanning GL. Immunotherapeutic potential of anti-human endogenous retrovirus-K envelope protein antibodies in targeting breast tumors. J Natl Cancer Inst 2012; 104 (3): 189–210.
Criscione SW, Zhang Y, Thompson W, Sedivy JM, Neretti N. Transcriptional landscape of repetitive elements in normal and cancer human cells. BMC Genomics 2014; 15: 583.