12.3: Oncogenes - Biology

An oncogene is a gene that, when mutated or expressed at abnormally high levels, contributes to converting a normal cell into a cancer cell. Cancer cells are cells that are engaged in uncontrolled mitosis.

The signals for normal mitosis

Normal cells growing in culture will not divide unless they are stimulated by one or more growth factors present in the culture medium (e.g., Epidermal Growth Factor (EGF)). The growth factor binds to its receptor, an integral membrane protein embedded in the plasma membrane with its ligand-binding site exposed at the surface of the cell. Examples:

  • the Epidermal Growth Factor Receptor (EGFR). The gene encoding it, EGFR, is also known as HER1.
  • another growth factor receptor, encoded by the gene ERBB2 (also known as HER2).

Binding of a growth factor to its receptor triggers a cascade of signaling events within the cytosol. Many of these involve

  • kinases: enzymes that attach phosphate groups to other proteins. Examples: the proteins encoded by SRC, RAF, ABL, and the fusion protein encoded by BCR/ABL found in chronic myelogenous leukemia (CML).
  • or molecules that turn on kinases. Example: RAS. RAS molecules reside on the inner surface of the plasma membrane where they serve to link receptor activation to "downstream" kinases like RAF.

In most cases, phosphorylation activates the protein and eventually transfers the signal into the nucleus.

Here phosphorylation activates transcription factors that bind to promoters and enhancers in DNA, turning on their associated genes. Examples: AP-1, a heterodimer of the proteins encoded by jun and fos. Some of the genes turned on by these transcription factors encode other transcription factors (e.g., myc).

Some of the genes turned on by these downstream transcription factors encode cyclins that prepare the cell to undergo mitosis. Genes that participate in any one of the steps above can become oncogenes if they are mutated so that their product becomes constitutively active (that is, active all the time, even in the absence of a positive signal) or if they produce their product in excess. Possible causes of excess production include mutation of their promoter and/or enhancer (e.g., the oncomouse: a transgenic mouse that has both copies of its myc gene under the influence of extra-powerful promoters) or loss (e.g., by a translocation) of the 3'-UTR of their mRNA, so that a microRNA (miRNA) that normally represses translation can no longer do so.

All these oncogenes act as dominants; if the cell has one normal gene (called a proto-oncogene) and one mutated gene (the oncogene) at a pair of loci, the abnormal product takes control. No single oncogene can, by itself, cause cancer. It can, however, increase the rate of mitosis of the cell in which it finds itself. Dividing cells are at increased risk of acquiring mutations, so a clone of actively dividing cells can yield subclones of cells with a second, third, etc. oncogene. When a clone loses all control over its mitosis, it is well on its way to developing into a cancer.

This graph (based on the work of E. Sinn et al, Cell 49:465,1987) shows the synergistic effect of two oncogenes. The fraction (%) of transgenic mice without tumors is shown as a function of age.

Other types of potential cancer-promoting genes

  • Genes that inhibit apoptosis: The suicide of damaged cells — apoptosis — provides an important mechanism for ridding the body of cells that could go on to form a cancer. It is not surprising then that inhibiting apoptosis can promote the formation of a cancer. Example: Bcl-2. The product of this gene inhibits apoptosis. Overexpression of the gene is a hallmark of B-cell cancers.
  • Genes involved in repairing DNA or stopping mitosis if they fail: Mutations arise from an unrepaired error in DNA. So any gene whose product participates in DNA repair probably can also behave as an oncogene when mutated. For example: ATM. ATM (="ataxia telangiectasia mutated") gets its name from a human disease of that name, whose patients — among other things — are at increased risk of cancer. The ATM protein is also involved in detecting DNA damage and interrupting the cell cycle when damage is found. It is estimated that fully 1% of the ~21,000 genes in the human genome are proto-oncogenes.
  • Tumor-Suppressor Genes: The products of some genes inhibit mitosis. These genes are called tumor suppressor genes. In contrast to oncogenes, these behave as recessives — both alleles must be defective to lose their braking effect on mitosis.

Receptor Biology

Michael Roberts is Professor of Biology at Linfield College in McMinnville (Oregon, USA). He has taught biology to students for 40 years, first at Yale University and since 1981 at Linfield College. His scientific focus is cardiovascular physiology and the regulation of animal body temperature.

Anne Kruchten is Associate Professor of Biology at Linfield College in McMinnville (Oregon, USA). A graduate of the University of Minnesota, she joined Linfield College in 2006. Her scientific focus is the regulation of cell migration.

Holland-Frei Cancer Medicine. 6th edition.

Marco A. Pierotti, PhD, Gabriella Sozzi, PhD, and Carlo M. Croce, MD.

The activation of oncogenes involves genetic changes to cellular protooncogenes. The consequence of these genetic alterations is to confer a growth advantage to the cell. Three genetic mechanisms activate oncogenes in human neoplasms: (1) mutation, (2) gene amplification, and (3) chromosome rearrangements. These mechanisms result in either an alteration of protooncogene structure or an increase in protooncogene expression (Figure 6-5). Because neoplasia is a multistep process, more than one of these mechanisms often contribute to the genesis of human tumors by altering a number of cancer-associated genes. Full expression of the neoplastic phenotype, including the capacity for metastasis, usually involves a combination of protooncogene activation and tumor suppressor gene loss or inactivation.

Figure 6-5

Schematic representation of the main mechanisms of oncogene activation (from protooncogenes to oncogenes). The normal gene (protooncogene) is depicted with its transcribed portion (rectangle). In the case of gene amplification, the latter can be duplicated ...


Epigenetics has changed the commonly accepted knowledge of cancer biology. Whereas genetics was recognized in the past as the major component responsible for the tumorigenic process, today we tend to assign a pivotal role to epigenetics in triggering or supporting cancer progression.

Many epigenetic players can cooperate to transform cells into cancer cells, as different classes of epigenetic factors have been found altered in cancers. Prominent among them are histone modifiers [histone acetyltransferases (HATs), histone deacetylases (HDACs) and histone methyltransferases], chromatin remodelers, DNA modifiers (DNA methyl- and hydroxymethyltransferases) and noncoding RNAs [1–4], all of which have a direct or indirect effect on chromatin structure and dynamics.

Genetic factors mainly rely on loss- or gain-of-function of tumor suppressors or oncogenes, respectively, whereas epigenetic-driven tumorigenic alterations are based on (potentially reversible) alteration of enzymatic activities, giving rise to a more globally aberrant phenotype [5].

Typically, epigenetic alterations include aberrant DNA methylation and/or histone modification patterns, leading to altered expression of key regulator genes mainly involved in the control of cell growth and proliferation as well as DNA repair or maintenance of genome stability [2, 6]. This suggests that restoring their expression to the physiological level, e.g. with the use of epigenetic modulators, may contribute to cancer resolution. Based on this, the use of epigenetic modulators as anticancer compounds has been proposed as a new therapeutic strategy for cancer and other diseases [7–9]. As a prime example, HDAC inhibitors (HDACi) have already been introduced in the clinical treatment of cutaneous T-cell lymphoma, and many more epigenetic modulators are currently undergoing screening [10]. In fact, modulation of acetylation levels seems to be the most straightforward and so far most successful way to restore the normal epigenetic pattern and ultimately the correct gene expression profile. This highlights the importance of acetylation in cancer biology.

In this review, we aim to explore the various levels of involvement of acetyltransferases in cancer and the general contribution of impaired acetylation patterns on tumorigenesis.


The first fusion gene [3] was described in cancer cells in the early 1980s. The finding was based on the discovery in 1960 by Peter Nowell and David Hungerford in Philadelphia of a small abnormal marker chromosome in patients with chronic myeloid leukemia—the first consistent chromosome abnormality detected in a human malignancy, later designated the Philadelphia chromosome. [4] In 1973, Janet Rowley in Chicago showed that the Philadelphia chromosome had originated through a translocation between chromosomes 9 and 22, and not through a simple deletion of chromosome 22 as was previously thought. Several investigators in the early 1980s showed that the Philadelphia chromosome translocation led to the formation of a new BCR/ABL1 fusion gene, composed of the 3' part of the ABL1 gene in the breakpoint on chromosome 9 and the 5' part of a gene called BCR in the breakpoint in chromosome 22. In 1985 it was clearly established that the fusion gene on chromosome 22 produced an abnormal chimeric BCR/ABL1 protein with the capacity to induce chronic myeloid leukemia.

It has been known for 30 years that the corresponding gene fusion plays an important role in tumorigenesis. [5] Fusion genes can contribute to tumor formation because they can produce a much more active abnormal protein than non-fusion genes. Often, fusion genes are oncogenes that cause cancer; these include BCR-ABL, [6] TEL-AML1 (ALL with t(12;21)), AML1-ETO (M2 AML with t(8;21)), and TMPRSS2-ERG with an interstitial deletion on chromosome 21, often occurring in prostate cancer. [7] In the case of TMPRSS2-ERG, by disrupting androgen receptor (AR) signaling and inhibiting AR expression by oncogenic ETS transcription factor, the fusion product regulates the prostate cancer. [8] Most fusion genes are found in hematological cancers, sarcomas, and prostate cancer. [9] [10] BCAM-AKT2 is a fusion gene that is specific and unique to high-grade serous ovarian cancer. [11]

Oncogenic fusion genes may lead to a gene product with a new or different function from the two fusion partners. Alternatively, a proto-oncogene is fused to a strong promoter, and the oncogenic function arises from upregulation driven by the strong promoter of the upstream fusion partner. The latter is common in lymphomas, where oncogenes are juxtaposed to the promoters of the immunoglobulin genes. [12] Oncogenic fusion transcripts may also be caused by trans-splicing or read-through events. [13]

Since chromosomal translocations play such a significant role in neoplasia, a specialized database of chromosomal aberrations and gene fusions in cancer has been created: the Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer. [14]

The presence of certain chromosomal aberrations and their resulting fusion genes is commonly used in cancer diagnostics to reach a precise diagnosis. Chromosome banding analysis, fluorescence in situ hybridization (FISH), and reverse transcription polymerase chain reaction (RT-PCR) are common methods employed at diagnostic laboratories. These methods all have their distinct shortcomings due to the very complex nature of cancer genomes. Recent developments such as high-throughput sequencing [15] and custom DNA microarrays promise more efficient methods. [16]

Gene fusion plays a key role in the evolution of gene architecture. We can observe its effect if gene fusion occurs in coding sequences. [17] Duplication, sequence divergence, and recombination are the major contributors at work in gene evolution. [18] These events can probably produce new genes from already existing parts. When gene fusion happens in a non-coding sequence region, it can lead to misregulation of the expression of a gene that is now under the control of the cis-regulatory sequence of another gene. If it happens in coding sequences, gene fusion causes the assembly of a new gene, which allows new functions to appear through the addition of peptide modules to a multi-domain protein. [19] Methods that inventory gene fusion events on a large biological scale can provide insights into the multi-modular architecture of proteins. [20] [21] [22]

Purine biosynthesis

The purines adenine and guanine are two of the four information encoding bases of the universal genetic code. Biosynthesis of these purines occurs by similar, but not identical, pathways in different species of the three domains of life, the Archaea, Bacteria and Eukaryotes. A major distinctive feature of the purine biosynthetic pathways in Bacteria is the prevalence of gene fusions where two or more purine biosynthetic enzymes are encoded by a single gene. [23] Such gene fusions are almost exclusively between genes that encode enzymes that perform sequential steps in the biosynthetic pathway. Eukaryotic species generally exhibit the most common gene fusions seen in the Bacteria, but in addition have new fusions that potentially increase metabolic flux.

In recent years, next-generation sequencing technology has become available to screen for known and novel gene fusion events on a genome-wide scale. However, the precondition for large-scale detection is paired-end sequencing of the cell's transcriptome. Work on fusion gene detection is now directed mainly towards data analysis and visualization. Some researchers have developed a tool called Transcriptome Viewer (TViewer) to directly visualize detected gene fusions at the transcript level. [24]
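Why paired-end data matter can be illustrated with a minimal sketch: when the two mates of one fragment map to different genes, that pair is evidence for a possible fusion. The input format (a list of gene-name pairs), the function name `fusion_candidates` and the `min_support` threshold are all assumptions made for illustration; real fusion callers work from alignments and apply many more filters.

```python
from collections import Counter


def fusion_candidates(read_pairs, min_support=2):
    """Count read pairs whose two mates map to different genes.

    read_pairs: iterable of (gene_of_mate1, gene_of_mate2) tuples, where
    None marks an unmapped mate. This input format is hypothetical.
    """
    support = Counter()
    for gene1, gene2 in read_pairs:
        if gene1 is None or gene2 is None or gene1 == gene2:
            continue  # unmapped or concordant pair: no fusion evidence
        # Sort so that (A, B) and (B, A) support the same candidate.
        support[tuple(sorted((gene1, gene2)))] += 1
    # Keep only candidates supported by enough discordant pairs.
    return {pair: n for pair, n in support.items() if n >= min_support}


# Toy input, invented for illustration.
pairs = [("BCR", "ABL1"), ("ABL1", "BCR"), ("BCR", "BCR"),
         ("TP53", None), ("MYC", "IGH")]
candidates = fusion_candidates(pairs)
print(candidates)
```

In this toy input only the BCR/ABL1 pair reaches the support threshold; the single MYC/IGH pair is dropped as insufficient evidence.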

Biologists may also deliberately create fusion genes for research purposes. The fusion of reporter genes to the regulatory elements of genes of interest allows researchers to study gene expression. Reporter gene fusions can be used to measure activity levels of gene regulators, identify the regulatory sites of genes (including the signals required), identify various genes that are regulated in response to the same stimulus, and artificially control the expression of desired genes in particular cells. [25] For example, by creating a fusion gene of a protein of interest and green fluorescent protein, the protein of interest may be observed in cells or tissue using fluorescence microscopy. [26] The protein synthesized when a fusion gene is expressed is called a fusion protein.

Materials and methods

Plant material

Experiments were carried out with ebi-1 that had been backcrossed four times to the parental transgenic line 6A carrying the CAB2:LUC+ reporter construct (NASC ID N9352).

The T-DNA line SALK_128255.54.50.n was obtained from NASC and plants homozygous for the T-DNA were confirmed by PCR using primers 5'-ttgccgcagtaacaaaggtac-3', 5'-agtttatccggaagcaaatgg-3' (WT band in Col-0, no band in homozygous SALK line). The left border sequence was amplified with 5'-agtttatccggaagcaaatgg-3' and LBb primer. CAB2:LUC+ was introduced using Agrobacterium-mediated transformation and dipping protocol [36].

Screen for circadian clock mutants

The mutagenesis and screening have been described in [18]. Briefly, Arabidopsis Ws-2 transgenic seeds carrying the CAB2:LUC+ transgene (described above) were mutagenized by soaking in 100 mM EMS for 3 h. The resulting M1 population was sown and self-fertilized, and the M2 population was screened for seedlings with altered timing of CAB2:LUC+ expression in constant darkness.

Analysis of circadian rhythms

Seedlings were then sown on Murashige and Skoog medium containing 3% sucrose and 1.5% agar. They were entrained in a growth chamber in light/dark cycles at 22°C for 7 days before transfer to constant light and temperature. Two methods were used to measure CAB2:LUC+ activity. For the initial screen and preliminary characterization of the mutant in constant dark, an automated luminometer was used (Topcount, Perkinelmer, Cambridge, UK) as described [37]. The second method, used for the characterization of the mutant in constant light and the subsequent characterization of backcrossed lines and T-DNA mutants, was a low-light video imaging system as described in [37]. Rhythms in leaf movement were measured in older, 12-day-old seedlings using a method identical to that described in [38].

Sequencing WS-2 and ebi-1

DNA was isolated using a plant DNeasy kit (Qiagen, Crawley, West Sussex, UK). Two read tag libraries were prepared, one for ebi-1 and one for Ws. Emulsion PCR using the standard SOLiD protocol was performed on each library. The libraries were deposited onto separate slides and sequenced in a single run using the SOLiD analyzer version 2 (Life Technologies).

For the 454 genome sequencing, 5 μg of Ws-2 DNA was fragmented by nebulization. Fragmented DNA was analyzed using a Bioanalyzer (Agilent Technologies, Wokingham, Berkshire, UK) to ensure that the majority of the fragments were between 350 and 1,000 bp. The purified fragmented DNA was processed according to the 454 FLX Titanium Library construction kit and protocol (Roche Applied Science, Burgess Hill, East Sussex, UK). Library fragments were added to emulsion PCR beads at a ratio of 1:1 to emPCR at the optimal of 1.5 DNA molecules per bead and amplified according to the manufacturer's instructions (Roche Applied Science), and a full pico-titre plate was sequenced.

The resulting 35-character color-space tags from both sequencing runs were then mapped to the 119.7 Mbp Col-0 reference sequence [39] using the matching pipeline of the off-machine SOLiD data analysis package Corona Lite [40] employing a range of matching schemas, based on the full-length 35-character color-space tags as well as schemas based on tags trimmed to 25 characters to remove the most error-prone positions. Putative SNPs relative to Col-0 were then called for each genome using Corona Lite's SNP detection pipeline.

The resulting SNP list for ebi-1 was then cross-referenced with that of Ws-2 to identify SNPs shared by both genomes, as well as SNPs occurring only in ebi-1 or only in Ws-2. At this stage low-confidence SNPs were filtered out by excluding all SNP loci where coverage was 5 or less, SOLiD SNP scores were less than 0.7, or the SNP was heterozygous, in either genome. To ensure only high-confidence SNPs were considered, a further screening round was undertaken in which only those reported by all matching schemas employed were considered for subsequent analysis.
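The filtering and cross-referencing described above can be sketched as follows. This is a minimal illustration, not the Corona Lite pipeline: the record layout (a dict per locus with `coverage`, `score`, `heterozygous` and the set of `schemas` that reported the SNP), the schema names and the function names are assumptions, and the two filtering rounds are merged into one test for brevity.

```python
def passes(call, all_schemas):
    """True if a SNP call meets every confidence criterion in the text."""
    return (call["coverage"] > 5                 # coverage of 5 or less is excluded
            and call["score"] >= 0.7             # scores below 0.7 are excluded
            and not call["heterozygous"]         # heterozygous calls are excluded
            and call["schemas"] == all_schemas)  # reported by every matching schema


def classify_snps(ebi1, ws2, all_schemas):
    """Split high-confidence SNP loci into shared and genome-specific sets.

    ebi1, ws2: dicts mapping a locus, e.g. (chromosome, position), to a call.
    A locus that fails the filter in either genome is dropped from both.
    """
    bad = {locus
           for calls in (ebi1, ws2)
           for locus, call in calls.items()
           if not passes(call, all_schemas)}
    a = set(ebi1) - bad
    b = set(ws2) - bad
    return {"shared": a & b, "ebi1_only": a - b, "ws2_only": b - a}


# Toy data, invented for illustration; schema names are hypothetical.
schemas = {"full_35", "trimmed_25"}
good = {"coverage": 12, "score": 0.9, "heterozygous": False, "schemas": schemas}
low_cov = dict(good, coverage=5)
one_schema = dict(good, schemas={"full_35"})

ebi1_calls = {("chr5", 100): good, ("chr1", 50): good, ("chr2", 7): low_cov}
ws2_calls = {("chr1", 50): good, ("chr3", 9): one_schema}
result = classify_snps(ebi1_calls, ws2_calls, schemas)
print(result)
```

In the toy data the chr1 locus is shared, the chr5 locus is specific to ebi-1, and the low-coverage and single-schema calls are discarded.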

Using current (TAIR 8) annotations [39] as a guide, high-confidence SNPs were classified and enumerated. The sequence data for Ws-2 are archived at TAIR and available as a track on the Arabidopsis genome hosted at TAIR [SpeciesVariant:393] [41].

SNP validation

To validate the SNPs between ebi-1 and Ws-2, we used a simple PCR-based approach of CAPS and dCAPS analysis. PCR primers for CAPS/dCAPS analysis were designed using dCAPS finder 2.0 [42]. A standard PCR protocol was used to amplify products from ebi-1 and Ws-2, and the PCR products were digested and run on a 4% agarose gel and scored. The primers, restriction sites and product sizes are summarized in Additional file 4. The SNPs in PRR7 and EBI were further validated by standard sequencing methods.
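The principle behind a CAPS marker is that a SNP creates or destroys a restriction site, so the two alleles give different digestion patterns on a gel. The sketch below checks this for a pair of amplicon sequences. The sequences are invented, the function names are hypothetical, and only the given strand is searched (adequate for palindromic sites such as the EcoRI site GAATTC used here); it is not a substitute for dCAPS Finder.

```python
def site_positions(seq, site):
    """Return the 0-based start positions of a recognition site in seq."""
    seq, site = seq.upper(), site.upper()
    return [i for i in range(len(seq) - len(site) + 1)
            if seq[i:i + len(site)] == site]


def is_caps_marker(allele_a, allele_b, site):
    """True if the two alleles differ in where the enzyme can cut."""
    return site_positions(allele_a, site) != site_positions(allele_b, site)


# Invented amplicons differing by a single A-to-G substitution.
allele_1 = "ACGTGAATTCTTAG"   # contains GAATTC
allele_2 = "ACGTGAGTTCTTAG"   # SNP destroys the site
print(site_positions(allele_1, "GAATTC"), site_positions(allele_2, "GAATTC"))
print(is_caps_marker(allele_1, allele_2, "GAATTC"))
```

When no natural site distinguishes the alleles, a dCAPS primer introduces a mismatch next to the SNP so that a site exists in only one allele; that design step is not modeled here.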

Quantification of RNA using real-time PCR

Seedlings were grown under 12-h light/12-h dark cycles for 6 days. Seedlings were harvested directly into liquid nitrogen at 1 h after dawn and 1 h after dusk using a green safety light. The RNA was subsequently extracted using an RNeasy Plant Mini Kit (Qiagen, Hilden, Germany). cDNA was synthesized from 1 μg of total RNA using the iScript™ cDNA synthesis kit (Bio-Rad Laboratories, Inc., Hercules, CA, USA). Real-time PCR was performed with a MyIQ™, ICycler or CFX96 Real-Time PCR Detection System (Bio-Rad Laboratories, Hempstead, Hertfordshire, UK), using iQ SYBR® Green Supermix (Bio-Rad Laboratories). The efficiency of amplification was assessed relative to β-TUBULIN (βTUB) expression. The measurements were repeated at least two times with independent biological material. Expression levels were calculated relative to the reference gene using a comparative threshold cycle method [43]. The results show the mean of four biological replications, each with three technical repeats, and are expressed relative to the mean of the wild-type series after standardization to βTUB. Primers for βTUB have been published previously [44]. The EBI-specific primers were as follows: EBI-F, 5'-TGC GAG AAT ATG CTT AAT TGC-3'; EBI-R, 5'-CCA CAA CAT CAC AAG ACA AG-3'.
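One widely used form of the comparative threshold cycle calculation is the 2^(-ΔΔCt) formula, sketched below. Whether this exact form matches the method of [43] is an assumption; the sketch also assumes roughly 100% amplification efficiency for both genes (a doubling per cycle), and the Ct values are invented.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_calibrator, ct_reference_calibrator):
    """Fold change of a target gene versus a calibrator sample (2^-ddCt).

    Each Ct is first normalized to the reference gene (here that would be
    the tubulin gene), then the sample is compared with the calibrator
    (here that would be the wild type).
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_calibrator - ct_reference_calibrator
    return 2.0 ** -(delta_sample - delta_calibrator)


# Invented Ct values: the target crosses threshold two cycles earlier in the
# mutant than in the wild type, with identical reference gene Ct values.
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold_change)  # 4.0
```

A Ct that is two cycles lower after normalization corresponds to a fourfold higher template abundance under the doubling-per-cycle assumption.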

Mapping ebi-1

An F2 mapping population was made between ebi-1 and Col-0. A set of approximately 20 individuals from this population, which had their ebi-1 phenotype confirmed in the F3, had recombination events in chromosome 5 and placed the ebi-1 mutation on the north arm of chromosome 5. This mapping population was increased and with two individuals we were further able to limit the mapping interval to between CIW18 and nga158.

Acquisition of chromosome instability is a mechanism to evade oncogene addiction

Chromosome instability (CIN) has been associated with therapeutic resistance in many cancers. However, whether tumours become genomically unstable as an evolutionary mechanism to overcome the bottleneck exerted by therapy is not clear. Using a CIN model of Kras-driven breast cancer, we demonstrate that aneuploid tumours acquire genetic modifications that facilitate the development of resistance to targeted therapy faster than euploid tumours. We further show that the few initially chromosomally stable cancers that manage to persist during treatment do so concomitantly with the acquisition of CIN. Whole-genome sequencing analysis revealed that the most predominant genetic alteration in resistant tumours, originated from either euploid or aneuploid primary tumours, was an amplification on chromosome 6 containing the cMet oncogene. We further show that these tumours are dependent on cMet since its pharmacological inhibition leads to reduced growth and increased cell death. Our results highlight that irrespective of the initial CIN levels, cancer genomes are dynamic and the acquisition of a certain level of CIN, either induced or spontaneous, is a mechanism to circumvent oncogene addiction.

Keywords: Chromosome instability; breast cancer; cMet; mouse models; resistance.

Results and Discussion

Estimating fragment bias in existing protocols

Fragment counts in an RNA-Seq experiment are determined by two different phenomena: fragments originating from highly expressed transcripts will appear more often in the data than those originating from lower-expressed transcripts, and library preparations include biases that may preferentially select some potential fragments over others. By fragment bias we mean only the over- or under-representation of fragments due to sequence-specific or positional bias as discussed in the Background. Because expression levels also affect fragment abundances, it is necessary to jointly estimate transcript abundances and bias parameters in order to properly learn the bias directly from RNA-Seq data.

This issue is illustrated by example in Figure 2, where the need for joint estimation of bias parameters and expression values is evidenced by comparison of the raw counts of bases at the starts/ends of fragments (panel a) and the adjusted counts normalized by the abundances of transcripts (panel b). The latter calculation is affected by the bias parameters, so that joint estimation is required. We expanded the likelihood framework described in [6] in order to perform such parameter estimation (see Materials and methods), resulting in 'learned' bias weights (panel d of Figure 2) that were used to adjust expected fragment counts in the computation of abundances using our likelihood model. Figure 3 shows an example of how well these bias estimates capture the over- and under-representations of reads at different positions of a transcript, based on its sequence.

Bias correction within transcripts. An example showing the effect of bias correction on the read counts for human transcript NM_004684. The top panel shows raw read counts (number of 3' ends of fragments at each location), and the bottom panel shows the product of the bias parameters (total bias weight defined in the Supplementary methods in Additional file 3) at the same locations. We correctly identify bias at different positions and can therefore correct for the non-uniformity. Note that the bias parameters were learned from the entire dataset excluding reads mapped to this transcript in order to cross-validate our results. The RNA-Seq for the experiment was performed with the NSR protocol [21], which is why 3' counts were used instead of 5'.

Validation by comparison to alternative expression assays

We emphasize that our goal was not to validate RNA-Seq per se, but rather to show that bias correction improves expression estimation. Therefore, in interpreting the correlations throughout the paper, we focused on improvements in correlation with bias correction and not on the absolute value. In this regard, we report most of our results as fraction discrepancy explained, which we calculated by dividing the change in R² after bias correction by the difference of the initial R² from 1 (a perfect correlation). Selected correlation plots can be found in Figure S3 of Additional file 1 and all raw expression data in Additional file 2. Furthermore, we mention that we observed that correlation results were sensitive to the extent of filtering of low abundance fragments and we therefore attempted to eliminate filtering in the experiments we performed (see Materials and methods for more detail).
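The fraction discrepancy explained defined above is a one-line calculation, sketched here. The function name is hypothetical; the two R² values are the ones reported in the text for the MAQC HBR comparison (0.753 before and 0.807 after correction).

```python
def fraction_discrepancy_explained(r2_before, r2_after):
    """Change in R^2 divided by the gap between the initial R^2 and 1."""
    if r2_before >= 1.0:
        raise ValueError("initial R^2 must be below 1")
    return (r2_after - r2_before) / (1.0 - r2_before)


# R^2 values reported for the MAQC HBR comparison with qRT-PCR.
fde = fraction_discrepancy_explained(0.753, 0.807)
print(round(fde, 3))
```

For these values the correction closes roughly a fifth of the gap between the uncorrected correlation and a perfect one. The measure is useful because the same absolute gain in R² counts for more when the starting correlation is already high.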

A major problem with validating RNA-Seq expression estimates is that there is no clear 'gold standard' for expression estimation. Comparison of RNA-Seq to microarrays has suggested that the former technology is more accurate than the latter [13]. We examined the recently published NanoString nCounter gene expression system [14], but noticed many unexplainable outliers and high variance between technical replicates (see Figure S4 of Additional file 1 and data in Additional file 2). Quantitative reverse transcription PCR (qRT-PCR) has served as a benchmark in numerous studies but it is not a perfect expression measurement assay [15], and it is therefore a priori unclear which technology currently produces the most accurate expression estimates. Nevertheless, at present we believe it to be the best measure of expression aside from, perhaps, RNA-Seq itself. Due to the previously demonstrated superiority of RNA-Seq over microarrays, and the problems with NanoString, we performed all our benchmarking with respect to qRT-PCR.

We began by comparing the expression estimates on the Microarray Quality Control (MAQC) Human Brain Reference (HBR) dataset, which includes 907 transcripts with uniquely mapping TaqMan qRT-PCR probes [16], with RNA-Seq data from the same sample sequenced by Illumina (SRA012427) [17] (Figure 4). We examined the correlation of the Cufflinks output with the qRT-PCR expression data and observed an increase of R² from 0.753 before correction to 0.807 after correction.

Correlation between RNA-Seq and qRT-PCR. (a) Expression estimates before bias correction (tail of arrows) and after correction (points of arrows) for the SRA012427 dataset compared to qRT-PCR values for the same transcripts. Red arrows show decrease in expression after correction and blue an increase. Note that we have zoomed in on lower-expression transcripts (the majority) for clarity. (b) Distribution of log-fold change in expression after bias correction.

We examined the basis for change in correlation by further investigating, for each transcript, whether its expression estimate increased or decreased after bias correction, and by how much. The arrows in Figure 4 show the direction and extent of expression change with correction, and the overall fold-change distribution. Many fragments show large changes in expression with a median absolute fold change of 1.5 (Figure 4b). To establish the significance of the improvement in correlation, we performed a permutation test where we changed the expression estimates of transcripts randomly according to the fold change distribution in Figure 4b. We obtained a P-value of 0.0007, meaning that the improvement in R² our correction accomplishes is highly significant. Together, these results show that bias correction may dramatically affect expression estimates via both increases and decreases of expression values, and that these changes provide an overall improvement in abundance estimates.
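One way such a permutation test could be set up is sketched below. This is an assumed reading of the procedure, not the authors' code: the observed log fold changes are shuffled across transcripts and applied to the uncorrected estimates, and the P-value is the share of permutations whose R² against the reference assay is at least as high as the one obtained with the real correction. All names and the synthetic data are hypothetical, and values are taken to be on a log scale.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def permutation_p_value(uncorrected, corrected, reference,
                        n_perm=2000, seed=0):
    """Permutation P-value for the gain in R^2 produced by a correction."""
    rng = random.Random(seed)
    changes = [c - u for c, u in zip(corrected, uncorrected)]  # log fold changes
    observed = r_squared(corrected, reference)
    hits = 0
    for _ in range(n_perm):
        shuffled = changes[:]
        rng.shuffle(shuffled)  # assign each change to a random transcript
        permuted = [u + d for u, d in zip(uncorrected, shuffled)]
        if r_squared(permuted, reference) >= observed:
            hits += 1
    # Add-one estimate so the P-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Synthetic data: the "correction" removes most of the noise that was added
# to the reference values, so it should beat random reassignment.
data_rng = random.Random(1)
reference = [data_rng.uniform(0, 10) for _ in range(50)]
noise = [data_rng.gauss(0, 1) for _ in range(50)]
uncorrected = [r + e for r, e in zip(reference, noise)]
corrected = [r + 0.1 * e for r, e in zip(reference, noise)]
p_value = permutation_p_value(uncorrected, corrected, reference)
print(p_value)
```

Shuffling keeps the overall fold-change distribution fixed, so the test asks whether the improvement depends on which transcript received which change rather than on the size of the changes alone.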

Comparison with previous methods

In [8], a method for bias correction is proposed that is based on correcting read counts for transcripts according to the bias learned for patterns at the start of reads (normalized using sequences in the interior of reads). This approach uses less information than our method, as it is restricted to learning bias within the read sequence, and cannot capture bias surrounding the start site. Furthermore, count-based methods do not fully exploit the information available in paired-end reads which allow for the determination of fragment length. Fragment length can help in assigning ambiguously mapped fragments to transcripts and our method takes advantage of this. On the other hand, since read counts have been promoted as an acceptable way to measure abundance [18], we compared the method to ours using the MAQC qRT-PCR data from the previous section. Figure 5 shows the results of the method of [8], both before and after bias correction (R² = 0.711 before and R² = 0.715 after correction). To obtain these results we used the software package Genominator [8], following the guidelines in the documentation, with the exception that bias was learned separately for each chromosome, as the software was not able to load an entire genome into memory. More details are provided in the Materials and methods section.

Comparison with previous methods. A comparison of our method (Cufflinks) with Genominator [8] and mseq [10]. The y-axis shows the R² value for the correlation between uncorrected (green) and bias corrected (orange) RNA-Seq expression estimates and qRT-PCR for the three methods. Correlation plots for these data can be found in Figure S3 of Additional file 1.

We also compared our approach to the mseq method in [10]. We again used the MAQC HBR qRT-PCR data and this time prepared the sequences and learned parameters for models following the suggested guidelines in [10]; that is, we trained the parameters of a MART model for bias by learning from the 100 most expressed transcripts in the experiment, and then tested on the set of 907 transcripts with uniquely mapping TaqMan probes. In this case, we observed an uncorrected R² = 0.730 and corrected R² = 0.755. Note that even though the expression was again calculated using counts, the initial correlation of mseq is better than that of Genominator because the implementation in [10] required us to remap the reads directly to the transcript sequences, which is presumably more accurate than relying on spliced mapping.

We suspect that the overall inferior results of both Genominator and mseq in comparison to Cufflinks are due in part to the fact that the bias parameters cannot be learned from raw read counts, but must be normalized by the expression values of the transcripts from which the reads originate (Figure 2). For example, in [10], bias parameters are learned from what are estimated to be the most highly expressed transcripts based on RPKM, but these are likely to also be the most positively biased transcripts, and are therefore not representative in terms of their sequence content. We also believe that, as we argued in [6], it is important to account for fragment lengths in estimating expression, and read count based expression measures do not use such information. Another issue affecting Genominator is that instead of computing the expected read count as is done in Cufflinks and mseq, the observed read counts are adjusted. This means that in positions lacking read alignments, there is no correction of bias. We believe this may partially explain the improved performance of mseq in comparison to Genominator.

Technical replicates

A recurring worry with RNA-Seq has been that repeated experiments, possibly based on different libraries or performed in different laboratories, may be variable due to experimental 'noise'. We investigated these effects starting with an exploration of the correlation between technical replicates before and after bias correction. We define technical replicates to be the sequencing of two different libraries that have been prepared using the same protocol from a single sample. This differs slightly from some previous uses; in particular, technical replication has also referred to two sequencing experiments from the same library. Such replicates have already been shown to exhibit very little variability [18, 19].

We postulated that the differences between expression estimates from two different libraries should be reduced after bias correction. We tested this hypothesis in a series of analyses whose results are shown in Figure 6. First, we examined libraries prepared in two different experiments from the same MAQC Universal Human Reference (UHR) sample. In the first experiment [20], which we will refer to by its accession SRA008403, the sample was sequenced from one library preparation. In the second experiment [19], which we will refer to as SRA010153, the sample was sequenced in four separate library preparations. Although the same protocol was used in all five replicates, the learned bias weights differ somewhat between the data produced by the two labs (see Figure S2 in Additional file 1).

Variable technical replicates. Results of correlation tests showing improvement after bias correction for technical replicates. Fraction Explained Discrepancy was calculated by dividing the change in R² after bias correction by the difference between one (a perfect correlation) and the initial R². Note that when two RNA-Seq datasets are compared, the correction in the legend was applied to both. The pairwise correlations of the four SRA010153 replicates versus qRT-PCR and SRA008403, respectively, were averaged for the figure. Even though the same RH priming protocol was used in both labs, the bias differs slightly between the preps (see Figure S2 of Additional file 1), which is why our correction method was able to improve the correlation.
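Written as a formula, the Fraction Explained Discrepancy defined in the caption reads as follows. The subscripts "initial" and "corrected" are labels introduced here for illustration, not notation from the paper.

```latex
\[
\text{Fraction Explained Discrepancy}
  = \frac{R^2_{\text{corrected}} - R^2_{\text{initial}}}{1 - R^2_{\text{initial}}}
\]
```

A value of 1 would mean that bias correction closed the entire gap to a perfect correlation, and a value of 0 would mean that it left the correlation unchanged.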

Figure 6 shows how correlations of the replicates with qRT-PCR and with each other were affected by bias correction. Although the method does improve the pairwise correlations between different library preparations within SRA010153, the initial correlation is already so high (average R² > 0.96) that we only show the average pairwise correlations against qRT-PCR and the SRA008403 dataset. The greater correlation among the SRA010153 replicates, as compared to the correlation between them and SRA008403, further indicates that bias is more similar when the protocol is carried out by the same lab, presumably by the same person. Bias correction clearly recovers much of the difference in quantification between the replicates that was introduced by sequence and positional bias. Furthermore, as in the initial validation example, the correction brings both sets closer in line with the qRT-PCR standard.

Library preparation methods

In Figure 7 we demonstrate our ability to correct bias specific to libraries prepared using different protocols. For this experiment, we tried our method on several libraries from a study comparing strand-specific protocols (SRA020818) using the same yeast sample [11], as well as a dataset generated using the 'not so random' (NSR) priming protocol on the human MAQC HBR sample [21]. We compared all of these datasets with a standard Random Hexamer (RH) control for the given sample. Note that although the control (RH) and dUTP libraries have the same sequence bias (see Figure S2 in Additional file 1) and near-perfect initial correlation (R² > 0.99), the remaining discrepancy is reduced by positional bias correction.

Variable library preparations. Results of correlation tests showing improvement after bias correction of datasets generated using different library prep methods, all of which are strand-specific. The first four protocols are described in [11] and the final one in [21]. All datasets were compared against a control that was generated using the standard Illumina RH protocol. The first four datasets used the control from [11] with the same yeast sample. The last dataset (NSR) was compared against the HBR dataset from SRA010153 since it also consists of single-end reads.

Because the NSR dataset was sequenced from the MAQC HBR sample, we were also able to compare it to the qRT-PCR standard. We found that our method explained 33.5% of the discrepancy between an initial estimation and qRT-PCR.

Sequencing platforms

Previous studies on bias in RNA-Seq have focused on experiments performed with Illumina sequencers. To investigate whether bias persists with other prep and sequencing technologies, we examined bias in a SOLiD experiment that sequenced both MAQC samples using the standard whole transcriptome (WT) protocol. We saw clear signs of both sequence-specific and positional bias that differed from the other protocols we had examined (see Figure S2 of Additional file 1).

We next compared the expression estimates for the SOLiD dataset with one from Illumina (accession SRA012427) before and after bias correction. In order to illustrate that our improvement in correlation does not come solely from correcting bias in the Illumina dataset, we tested whether there was some improvement from correcting one dataset at a time, as compared to simultaneous correction for both platforms. We found an increase of R² from 0.74 to 0.88 (Illumina correction) and 0.85 (SOLiD correction), compared to 0.94 for both. These results are summarized in Figure 8. While one cannot draw general conclusions based on a single experiment, we note that our approach to quantifying bias should be useful in future studies that aim to quantitatively compare the bias among different sequencing platforms.
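To put the three corrected R² values on a common scale, one can apply the Fraction Explained Discrepancy measure from the Figure 6 caption to them. The sketch below is only arithmetic on the R² values quoted above; the paper does not report these fractions, and the function and variable names are hypothetical.

```python
def fraction_explained(r2_initial, r2_corrected):
    """Share of the gap to a perfect correlation (R^2 = 1) closed by correction."""
    return (r2_corrected - r2_initial) / (1.0 - r2_initial)

# R^2 between SOLiD and Illumina estimates, as quoted in the text.
R2_UNCORRECTED = 0.74
corrected = {"Illumina only": 0.88, "SOLiD only": 0.85, "both": 0.94}

fractions = {
    name: fraction_explained(R2_UNCORRECTED, r2) for name, r2 in corrected.items()
}

for name, frac in fractions.items():
    print(f"{name}: {frac:.1%}")
```

Under this measure, correcting either platform alone closes roughly half of the initial gap (about 54% and 42%), while correcting both closes about 77%, consistent with the statement that the improvement does not come from the Illumina correction alone.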

Bias in different sequencing technologies. Results of correlation tests showing improvement after bias correction of datasets generated using different sequencing technologies. The Illumina dataset is SRA012427 (x-axes) and the SOLiD data is SOLiD4_HBR_PE_50x25 (y-axes). Both used the same MAQC HBR sample. Red axes and lines denote uncorrected FPKM values and blue corrected, while purple regression lines denote a comparison between corrected and uncorrected values. The two datasets are corrected for different biases, which causes their expression estimates to become more correlated. Note that the plot is zoomed in on the lower-abundance transcripts for clarity but captures over 98% of those in the experiment.


In conclusion, agents targeting driver oncogenic mutations in the advanced NSCLC setting have already changed the treatment paradigm. Given the high incidence of KRAS mutations in patients with NSCLC, KRAS is a promising therapeutic target. However, KRAS is a heterogeneous entity, and other coexisting alterations may be crucial for its role and biologic impact. Even though attempts to target the KRAS pathway have yielded limited results so far, new molecules or new therapeutic strategies may revolutionize outcomes in patients with KRAS-driven NSCLC in the near future. Further investigation is needed to better understand the pathways involved, to identify possible synthetic lethal partners and to improve patient selection.

Regulation of Autophagy by p53

The tumor suppressor protein p53 is often inactivated in tumor cells, for instance due to mutations of p53 itself, due to mutations of the kinases that lead to its activation (such as ATM or Chk1) or due to the amplification of MDM2, the E3 ubiquitin ligase that targets p53 for proteasome-mediated destruction [35, 36]. Hence, inactivation of the p53 system is one of the most frequent alterations that occur in cancer. p53 is mainly viewed as a transcription factor that transactivates proapoptotic and cell cycle-arresting genes [36], thereby favoring apoptosis and senescence of cancer cell precursors (which explains its tumor-suppressive effects as a ‘guardian of the genome’) or of cancer cells that respond to chemotherapy or radiotherapy (which explains its positive impact on anticancer treatment). p53 can also transactivate an autophagy-inducing gene, dram, which codes for a lysosomal protein [37], and p53-dependent induction of autophagy has been documented by several groups in response to DNA damage [38], Arf activation [39], or reexpression of p53 in p53-negative tumor cells [40].

Recently, we observed that inactivation of p53 by deletion, depletion or inhibition can also trigger autophagy [41]. Thus, human and mouse cells subjected to knockout, knockdown or pharmacological inhibition of p53 manifest signs of autophagy, such as depletion of p62/SQSTM1, LC3 lipidation (and hence conversion of LC3-I into LC3-II), redistribution of GFP-LC3 into cytoplasmic puncta and electron microscopic evidence of autophagosomes and autolysosomes [42], both in vitro and in vivo [41]. This applies to a variety of methods for p53 inactivation: chemical inhibition with cyclic pifithrin-α (PFT-α), knockdown with siRNAs specific for human p53, mouse p53 or the Caenorhabditis elegans p53 ortholog cep1, or homologous recombination of p53 in human cancer cells, mice or nematodes. p53 inactivation was found to induce autophagy in several nontransformed or malignant human cell lines, namely HFFF2 fibroblasts, HCT116 colon cancer cells, SH-SY5Y neuroblastoma and HeLa cervical cancer cells, in mouse embryonic fibroblasts (MEF), in vivo in multiple mouse tissues (kidney, pancreas, liver, brain and heart), as well as in C. elegans embryos and adult pharyngeal cells [41]. Electron microscopy and immunofluorescence experiments indicate that p53 inactivation induces autophagy of both the ER (reticulophagy) and mitochondria (mitophagy). When p53 was inhibited in an acute fashion by addition of PFT-α, reticulophagy was induced more rapidly than mitophagy, suggesting an intimate relationship between p53 inhibition and ER stress (which also induces preferential reticulophagy). Accordingly, p53 inhibition triggered the phosphorylation of eIF2α, which is a hallmark of ER stress. Moreover, the knockdown or knockout of IRE1, one of the quintessential ER stress effectors, reduced autophagy induction by p53 inactivation [41].

Inhibition of p53 caused autophagy in enucleated cells, indicating that the cytoplasmic, nonnuclear pool of p53 can regulate autophagy. We also observed that retransfection of p53−/− cells with wild-type (WT) p53 suppressed autophagy and that this effect could be mimicked by a p53 mutant that is excluded from the nucleus due to the deletion of the nuclear localization sequence. In contrast, retransfection of p53−/− cells with a nucleus-restricted p53 mutant (in which the nuclear export sequence has been deleted) failed to inhibit autophagy. Hence, cytoplasmic (but not nuclear) p53 is responsible for the inhibition of autophagy. Several distinct autophagy inducers (e.g., nutrient depletion or addition of rapamycin, lithium, tunicamycin or thapsigargin) stimulated the rapid degradation of p53. The p53 protein was depleted both from the nucleus and from the cytoplasm, with similar kinetics. Inhibition of the p53-specific E3 ubiquitin ligase MDM2 (by two distinct pharmacological inhibitors or by knockdown) prevented p53 depletion and simultaneously prevented the activation of autophagy. Finally, a p53 mutant that lacks the MDM2 ubiquitinylation site and hence is more stable than WT p53 was particularly efficient in inhibiting autophagy [41].

In conclusion, p53 has a dual function in the control of autophagy. On the one hand, nuclear p53 can induce autophagy through transcriptional effects. On the other hand, cytoplasmic p53 may act as a master repressor of autophagy. How this latter effect is achieved in mechanistic terms is not clear yet (Figure 1).

Cytoplasmic versus nuclear effects of p53. p53 has a Janus role in the control of autophagy. Nuclear p53 promotes the transcription of proapoptotic and cell cycle-arresting genes, and can also act as an autophagy-inducing transcription factor. In contrast, cytoplasmic p53 exerts an autophagy-inhibitory function. Neither of the p53 ‘faces’ is completely understood at the molecular level.
