



Molecular Diagnosis in Pathology: The Bridge to the 21st Century
Moderators: Dr. Ricardo Lloyd and Dr. George Kontogeorgos
Section 5 -
Bioinformatics Approach Leads to the Discovery of the
TMPRSS2:ETS Gene Fusion in Prostate Cancer.

Mark A. Rubin, MD
Brigham and Women's Hospital
Harvard Medical School
Boston, MA (U.S.A.)


Microarray experiments generate copious data that can be used to identify genes that are
significantly differentially expressed between known classes of samples. This approach can lead to the
identification of molecular biomarkers. For example, AMACR (α-methylacyl-CoA racemase), hepsin, and
fatty acid synthase are all overexpressed in prostate cancer as compared to benign prostate tissue
[1, 2, 3].
Statistical significance for biomarkers is demonstrated by comparing the mean expression of one
class to that of another. For example, in Figure 1 (left, biomarker profile), AMACR in prostate cancer
(class 2, red) is significantly overexpressed as compared to the reference class,
benign prostate tissue (class 1, blue). These results can be appreciated visually
by ordering the expression of AMACR by class.

Figure 1: Cancer Outlier Profile Analysis (COPA). A cancer biomarker profile (left),
such as that of AMACR, demonstrates significant overexpression in the majority of cancer samples (red) as
compared to benign samples (blue). An oncogene outlier profile (right), such as that of ERG, is
characterized by significant overexpression in a subpopulation of the prostate cancer samples (red). Standard
statistical tests such as Student's t-test are useful for the biomarker profile but fail to identify
profiles with only a few outlier cases. COPA transforms the data (as described in the text) to accentuate
profiles with outliers. These data are from the study by Lapointe et al. [17].

The difference in mean AMACR expression between the two groups is statistically
significant, although expression in some benign tissues is at a level similar to that in some
prostate cancer samples. To rank the best biomarkers for a specific class, one can compare the
results of multiple microarray experiments in a meta-analysis. In a meta-analysis of four cDNA
expression array data sets, AMACR was one of the genes most consistently overexpressed in prostate
cancer [7]. This meta-analysis approach has led to the development of Oncomine (www.oncomine.org),
a publicly available compendium of expression array data that
allows researchers to investigate over 300 expression array data sets [8]. However, one limitation of
this standard biomarker analysis is that it is poorly suited to genes that are significantly
differentially expressed in only a subset of tumors.
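
The limitation can be illustrated with a small numerical sketch. The expression values below are invented for illustration and are not taken from any of the cited data sets, and `welch_t` is a hypothetical helper that computes the unequal-variance (Welch) form of the t statistic from the class means and variances. A profile in which most cancer samples are shifted gives a large statistic, whereas a profile in which only two of ten cancer samples are strongly elevated does not, because the outliers inflate the variance of the cancer class.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(reference, test):
    """Welch's t statistic for the difference in mean (test minus reference)."""
    standard_error = sqrt(variance(reference) / len(reference)
                          + variance(test) / len(test))
    return (mean(test) - mean(reference)) / standard_error


# Hypothetical log-expression values (illustrative only).
benign = [0.0, 0.1, -0.1, 0.2, -0.2, 0.1, -0.1, 0.0, 0.1, -0.1]

# Biomarker-like profile: nearly every cancer sample is shifted upward.
cancer_biomarker = [1.0, 1.2, 0.8, 1.1, 0.9, 1.3, 0.7, 1.0, 1.1, 0.9]

# Outlier-like profile: only 2 of 10 cancer samples are strongly elevated.
cancer_outlier = [0.0, 0.1, -0.1, 0.2, -0.2, 0.1, -0.1, 0.0, 4.0, 4.2]

t_biomarker = welch_t(benign, cancer_biomarker)
t_outlier = welch_t(benign, cancer_outlier)

print(f"biomarker profile: t = {t_biomarker:.2f}")
print(f"outlier profile:   t = {t_outlier:.2f}")
```

With these made-up values the biomarker profile yields a t statistic above 10, while the outlier profile yields one below 2, even though its two elevated samples are far more extreme than anything in the biomarker profile.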

Tumor cells thrive by developing a growth advantage over neighboring benign cells through
a variety of genetic and epigenetic alterations. Overexpression of oncogenes favors this growth
advantage and can occur through gene copy number amplification, activating mutations, or constitutive
promoter activation. HER-2/neu and EGFR are examples of oncogenes that are overexpressed in only a subset
of tumors from patients with breast or lung cancer, respectively. Thus, the expression array profile of an
oncogene may look very different from that of AMACR. In a recent study from our group, a simple
approach was developed to identify oncogene profiles characterized by overexpression in a
small subset of biologically important outlier cases.

The method, called Cancer Outlier Profile Analysis (COPA), was developed based on the idea
that evaluating variation in a data set using the median instead of the mean would preserve the peaks of
outliers. COPA has three steps. First, gene expression values are median centered, setting each gene's
median expression value to zero. Second, the median absolute deviation (MAD) is calculated and scaled to
1 by dividing each gene expression value by its MAD (Figure 1). Median-based centering
was used instead of mean centering because it has less effect on the tails, or outliers.
Third, the 75th, 90th, and 95th percentiles of the transformed
expression values are tabulated for each gene, and genes are then rank-ordered by their percentile scores,
yielding a prioritized list of outlier profiles.
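
The three steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the published implementation: the function names (`percentile`, `copa_transform`, `copa_rank`), the gene labels, and the expression values are all hypothetical, each list is assumed to hold one gene's values across all samples of a single data set, and the percentile is computed by linear interpolation between order statistics, which is one of several common conventions.

```python
from statistics import median


def percentile(values, q):
    """q-th percentile (0-100) by linear interpolation between order statistics."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def copa_transform(values):
    """Steps 1 and 2: median-center one gene's values, then divide by the MAD."""
    med = median(values)
    centered = [v - med for v in values]
    mad = median([abs(c) for c in centered])
    if mad == 0:
        raise ValueError("MAD is zero; this gene cannot be scaled")
    return [c / mad for c in centered]


def copa_rank(expression, q):
    """Step 3: rank genes by the q-th percentile of their transformed values."""
    scores = {gene: percentile(copa_transform(values), q)
              for gene, values in expression.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data: one gene with two outlier samples, one uniformly shifted gene.
expression = {
    "GENE_OUTLIER": [0.0, 0.1, -0.1, 0.2, -0.2, 0.1, -0.1, 0.0, 4.0, 4.2],
    "GENE_SHIFTED": [1.0, 1.2, 0.8, 1.1, 0.9, 1.3, 0.7, 1.0, 1.1, 0.9],
}

ranked = copa_rank(expression, 90)
for gene, score in ranked:
    print(f"{gene}\t{score:.1f}")
```

Because the median and MAD are barely affected by the two extreme samples, the transformed values of the outlier gene remain very large at the 90th percentile, and that gene ranks first. The uniformly shifted gene is centered back to zero and receives a small score.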

By applying COPA, 132 gene expression data sets representing 10,486 microarray
experiments were interrogated for outlier genes [9]. Genes known to be overexpressed in
a subset of a particular tumor type were identified, such as the oncogene HER-2/neu and E-cadherin (CDH1)
(see Table). Interestingly, genes such as RUNX1T1 (ETO) and PBX1 also scored high on
COPA. These two genes are known to be involved in the AML1-ETO and E2A-PBX1 gene translocations in acute
myeloid leukemia and acute lymphoblastic
leukemia, respectively. Both of these translocations occur in only a subset of cases (i.e., outlier
cases). Two genes consistently scored high in prostate cancer microarray experiments: ERG (Figure 1, right)
and ETV1. Both genes are members of the
ETS family of transcription factors. Together they were overexpressed in the majority (50-70%) of prostate
cancers, and their overexpression was mutually exclusive across several independent gene expression data
sets, suggesting that
they may be functionally redundant in prostate cancer development [9]. Because ETS family
transcription factors had previously been found in genomic translocations in the Ewing's family of tumors,
AML, and other rare tumors, the possibility that they were part of a translocation in prostate cancer was
explored. When the ERG cDNA transcript was evaluated exon by exon, overexpression was seen at the
distal portion (3' end) but not the proximal portion (5' end). Sequencing of the cDNA transcripts identified
fusions of the 5'-untranslated region of TMPRSS2 (21q22.3) with the ETS transcription factor family members
ERG (21q22.2) and ETV1 (7p21.2) [9] and, more recently, ETV4 [10],
suggesting a novel mechanism for overexpression of the ETS genes in prostate cancer (Figure 2).
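
The mutual exclusivity of ERG and ETV1 overexpression described above can be checked with simple set operations once each tumor has been flagged as an outlier or not. The sketch below is illustrative only: the transformed values are invented, the `outlier_samples` helper is hypothetical, and the cutoff of 5.0 is an arbitrary assumption rather than a threshold reported in the source.

```python
def outlier_samples(transformed, threshold):
    """Return the indices of samples whose transformed value exceeds the cutoff."""
    return {i for i, value in enumerate(transformed) if value > threshold}


# Hypothetical median/MAD-transformed values for the same 8 tumors, in the same order.
erg = [12.0, 0.3, 15.1, -0.5, 0.1, 11.4, 0.0, -0.2]
etv1 = [0.2, 9.8, -0.1, 0.4, 0.0, 0.3, 10.5, -0.3]

THRESHOLD = 5.0  # arbitrary illustrative cutoff, not a published value

erg_high = outlier_samples(erg, THRESHOLD)
etv1_high = outlier_samples(etv1, THRESHOLD)

both = erg_high & etv1_high      # tumors overexpressing both genes
either = erg_high | etv1_high    # tumors overexpressing at least one gene

print(f"ERG outliers:  {sorted(erg_high)}")
print(f"ETV1 outliers: {sorted(etv1_high)}")
print(f"overlap: {len(both)}; either gene: {len(either)} of {len(erg)} tumors")
```

In this toy example no tumor is an outlier for both genes, while five of the eight tumors are an outlier for one of them, which is the pattern expected if the two genes are alternative ways of reaching the same outcome.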

Figure 2: Anatomy of the TMPRSS2 to ETS Family Gene Fusions Identified in Prostate Cancer. Adapted
from Tomlins et al., Science 310:644 [9].

The identification of gene fusions between the prostate-specific, strongly
androgen-regulated gene TMPRSS2 (21q22.3) and ERG, ETV1, or ETV4 was a
surprising discovery. When these findings are validated by other methods (i.e., RT-PCR and fluorescence in
situ hybridization (FISH)) in human prostate cancer samples, TMPRSS2:ETS
gene fusions are seen in up to 80% of prostate cancers in hospital-based clinical cohorts. TMPRSS2:ETS gene
fusions have not been detected in the precursor lesion high-grade prostatic intraepithelial neoplasia (PIN)
or in prostatic atrophy (PIA). Because TMPRSS2 is regulated by androgens, low levels of androgen may still
be sufficient to drive ETS overexpression, even in
the setting of hormone ablation therapy for metastatic prostate cancer.

The TMPRSS2:ETS gene fusion appears to be one of the earliest events
in prostate cancer invasion and leads to overexpression of the fused ETS gene in an
androgen-regulated manner. There is still much to be learned about this common prostate cancer gene
fusion. The DNA breakpoint(s) have not yet been identified; identifying them would help in the development
of diagnostic tools for prostate cancer. The exact frequency of the TMPRSS2:ETS fusion still needs to be
determined in population-based studies. The
high percentage of prostate cancers with the TMPRSS2:ERG fusion suggests that ERG may be the most common
fusion partner. The hospital-based studies to date
suggest that at least 50% of prostate cancers harbor the TMPRSS2:ERG gene
fusion. With the recent identification of a third molecular subtype (TMPRSS2:ETV4) [10], one can anticipate
finding other translocation partners, such
as FLI1, based on expression array data. This would be similar to
the observation in the Ewing's family of tumors, where approximately 85% of tumors
harbor a tumor-associated t(11;22)(q24;q12) rearrangement resulting in the juxtaposition of the EWS gene
(Ewing's sarcoma gene) on chromosome 22 with the FLI1 gene on chromosome 11. Four other ETS family
members have been identified as translocation partners of EWS. The second
most common ETS translocation partner is ERG,
seen in approximately 10% of cases [11]. Finally, the identification of the TMPRSS2:ETS gene fusion in
prostate cancer suggests that distinct molecular
subtypes may further define risk for disease progression. Future studies will explore associations with
clinical outcome and response to treatment. Perhaps most importantly, therapies that target the gene
fusion(s) are being investigated, which might lead to rational drug development similar to the
development of imatinib (STI571, Gleevec) therapy for CML.

Table. Cancer Outlier Profile Analysis (COPA)*: The 15 Top-Ranked Genes from Tomlins et al., Science 310:644 [9].

| Rank | Percentile | Score | Gene | Cancer | Reference | Evidence |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 90 | 21.9 | CDH1 | Melanoma | Bittner et al. [12] | |
| 1 | 95 | 20.1 | RUNX1T1 | Leukemia | Valk et al. [13] | XX |
| 1 | 95 | 15.4 | PRO1073 | Renal | Vasselli et al. [14] | X |
| 1 | 95 | 14.2 | MYH11 | Sarcoma | Segal et al. [15] | |
| 1 | 90 | 13.0 | PBX1 | Leukemia | Ross et al. [16] | XX |
| 1 | 95 | 10.0 | ETV1 | Prostate | Lapointe et al. [17] | ** |
| 1 | 90 | 7.5 | WHSC1 | Myeloma | Tian et al. [18] | X |
| 1 | 75 | 5.4 | ERG | Prostate | Dhanasekaran et al. [19] | ** |
| 1 | 75 | 5.2 | FOXO3A | Breast | Wang et al. [20] | |
| 1 | 75 | 4.4 | ERG | Prostate | Welsh et al. [21] | ** |
| 1 | 75 | 4.3 | CCND1 | Myeloma | Zhan et al. [22] | X |
| 1 | 75 | 3.7 | PCSK7 | Leukemia | Cheok et al. [23] | |
| 1 | 75 | 3.4 | ERG | Prostate | Lapointe et al. [17] | ** |
| 1 | 75 | 3.4 | ERG | Prostate | Dhanasekaran et al. [2] | ** |
| 1 | 75 | 2.6 | IGH@ | Lung | Wigle et al. [24] | |
X = literature evidence for acquired pathognomonic translocation; XX = translocation was identified in the reference study; ** = ERG and ETV1 outlier profiles in prostate cancer.


References
1. Luo J, Duggan DJ, Chen Y, et al. Human prostate cancer and benign prostatic hyperplasia: molecular dissection by gene expression profiling. Cancer Res 2001;61(12):4683-8.

2. Dhanasekaran SM, Barrette TR, Ghosh D, et al. Delineation of prognostic biomarkers in prostate cancer. Nature 2001;412(6849):822-6.

3. Luo J, Dunn T, Ewing C, et al. Gene expression signature of benign prostatic hyperplasia revealed by cDNA microarray analysis. Prostate 2002;51(3):189-200.

4. Rubin MA, Zhou M, Dhanasekaran SM, et al. alpha-Methylacyl coenzyme A racemase as a tissue biomarker for prostate cancer. JAMA 2002;287(13):1662-70.

5. Luo J, Zha S, Gage WR, et al. alpha-Methylacyl-CoA Racemase: A New Molecular Marker for Prostate Cancer. Cancer Res 2002;62(8):2220-6.

6. Xu J, Stolk JA, Zhang X, et al. Identification of differentially expressed genes in human prostate cancer using subtraction and microarray. Cancer Res 2000;60(6):1677-82.

7. Rhodes DR, Barrette TR, Rubin MA, Ghosh D, Chinnaiyan AM. Meta-analysis of microarrays: interstudy validation of gene expression profiles reveals pathway dysregulation in prostate cancer. Cancer Res 2002;62(15):4427-33.

8. Rhodes DR, Yu J, Shanker K, et al. ONCOMINE: A Cancer Microarray Database and Integrated Data-Mining Platform. Neoplasia 2004;6(1):1-6.

9. Tomlins SA, Rhodes DR, Perner S, et al. Recurrent Fusion of TMPRSS2 and ETS Transcription Factor Genes in Prostate Cancer. Science 2005;310(5748):644-8.

10. Tomlins SA, Mehra R, Rhodes DR, et al. TMPRSS2:ETV4 Gene Fusions define a third molecular subtype of prostate cancer. Cancer Res 2006;66(7):3396-400.

11. Delattre O, Zucman J, Melot T, et al. The Ewing family of tumors--a subgroup of small-round-cell tumors defined by specific chimeric transcripts. N Engl J Med 1994;331(5):294-9.

12. Bittner M, Meltzer P, Chen Y, et al. Molecular classification of cutaneous malignant melanoma by gene expression profiling. Nature 2000;406(6795):536-40.

13. Valk PJ, Verhaak RG, Beijen MA, et al. Prognostically useful gene-expression profiles in acute myeloid leukemia. N Engl J Med 2004;350(16):1617-28.

14. Vasselli JR, Shih JH, Iyengar SR, et al. Predicting survival in patients with metastatic kidney cancer by gene-expression profiling in the primary tumor. Proc Natl Acad Sci U S A 2003;100(12):6958-63.

15. Segal NH, Pavlidis P, Noble WS, et al. Classification of clear-cell sarcoma as a subtype of melanoma by genomic profiling. J Clin Oncol 2003;21(9):1775-81.

16. Ross ME, Zhou X, Song G, et al. Classification of pediatric acute lymphoblastic leukemia by gene expression profiling. Blood 2003;102(8):2951-9.

17. Lapointe J, Li C, Higgins JP, et al. Gene expression profiling identifies clinically relevant subtypes of prostate cancer. Proc Natl Acad Sci U S A 2004;101(3):811-6.

18. Tian E, Zhan F, Walker R, et al. The role of the Wnt-signaling antagonist DKK1 in the development of osteolytic lesions in multiple myeloma. N Engl J Med 2003;349(26):2483-94.

19. Dhanasekaran SM, Dash A, Yu J, et al. Molecular profiling of human prostate tissues: insights into gene expression patterns of prostate development during puberty. FASEB J 2005;19(2):243-5.

20. Wang Y, Klijn JG, Zhang Y, et al. Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet 2005;365(9460):671-9.

21. Welsh JB, Sapinoso LM, Su AI, et al. Analysis of gene expression identifies candidate markers and pharmacological targets in prostate cancer. Cancer Res 2001;61(16):5974-8.

22. Zhan F, Hardin J, Kordsmeier B, et al. Global gene expression profiling of multiple myeloma, monoclonal gammopathy of undetermined significance, and normal bone marrow plasma cells. Blood 2002;99(5):1745-57.

23. Cheok MH, Yang W, Pui CH, et al. Treatment-specific changes in gene expression discriminate in vivo drug response in human leukemia cells. Nat Genet 2003;34(1):85-90.

24. Wigle DA, Jurisica I, Radulovich N, et al. Molecular profiling of non-small cell lung cancer and correlation with disease-free survival. Cancer Res 2002;62(11):3005-8.

