WO2010108638A1 - Tumour gene profile - Google Patents

Tumour gene profile

Publication number
WO2010108638A1
Authority
WO
WIPO (PCT)
Prior art keywords
genes
nsclc
set forth
expression levels
samples
Prior art date
Application number
PCT/EP2010/001773
Other languages
French (fr)
Other versions
WO2010108638A9 (en
Inventor
Jun Hou
Joan Gertrudis Jacobus Victor Aerts
Gerardus Grosveldm Franklin
Jacobus Nicholaas Jozef Philipsen
Original Assignee
Erasmus University Medical Center Rotterdam
Priority date
Filing date
Publication date
Application filed by Erasmus University Medical Center Rotterdam
Publication of WO2010108638A1
Publication of WO2010108638A9

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/112: Disease subtyping, staging or classification
    • C12Q2600/118: Prognosis of disease development
    • C12Q2600/158: Expression markers

Definitions

  • the present invention relates to the diagnosis, categorisation and prognosis of and for lung cancers, in particular non-small cell lung carcinomas (NSCLC).
  • NSCLC non-small cell lung carcinomas
  • Lung cancer is the most frequent cause of cancer deaths in Europe. There were 386,300 new lung cancer cases in 2006, with an estimated 334,800 deaths. This accounts for 13.5% of all cancer deaths [1].
  • lung cancer is in clinical practice sub-divided into four major histological subtypes: small cell lung cancer (SCLC), squamous cell carcinoma (SCC), adenocarcinoma (ADC), and large-cell carcinoma (LCC).
  • EGFR epidermal growth factor receptor
  • chemotherapies might depend on the expression of certain proteins, for example thymidylate synthase for Pemetrexed [28]. Therefore the refined classification of subtypes of NSCLC is of increasing interest.
  • the staging of a cancer depends solely on the tumor extension at the moment of diagnosis and does not incorporate the biological behaviour of the tumor.
  • although staging can, to some degree, be a prognostic indicator, it is still far from precise.
  • the increase in the predictive value of staging over the past decades is mostly due to the increase in the sensitivity of imaging techniques and not related to a better understanding of the biology of a tumor.
  • a method for classifying a test tissue sample as a malignant non-small cell lung carcinoma (NSCLC) by analysis of gene expression comprising the steps of: (a) assaying the expression levels of 5 or more genes selected from the genes set forth in Table 1; (b) comparing the expression levels of said 5 or more genes with the expression levels of said 5 or more genes in a known non-cancerous tissue sample; wherein a change in the expression levels of said 5 or more genes indicates that the test tissue sample is a malignant NSCLC sample.
  • the 5 or more genes comprise, consist of or consist essentially of the genes set forth in Table 2.
  • the expression of the 5 or more genes is analysed with a two-dimensional hierarchical clustering of the expression levels of 5 or more genes as set forth in Table 1; wherein a correlation between the expression levels of said 5 or more genes and the pattern of gene expression levels observed in the two-dimensional clustering indicates that the test tissue sample is a malignant NSCLC sample.
  • the two-dimensional hierarchical clustering is performed using OmniViz (BioWisdom) software, for example as set forth below.
  • a method for classifying a test tissue sample of a malignant non-small cell lung carcinoma (NSCLC) into LCC, ADC or SCC subtypes by analysis of gene expression comprising the steps of: (a) assaying the expression levels of 75 or more genes selected from the genes set forth in Table 3; (b) comparing the expression levels of said 75 or more genes with the expression levels of 75 or more genes from NSCLC tumour samples as set forth in Table 3; wherein a correlation between the expression levels of said 75 or more genes and the pattern of gene expression levels observed in Table 3 indicates that the test tissue sample is a malignant LCC, ADC or SCC subtype NSCLC sample.
  • gene expression is analysed by two-dimensional hierarchical clustering, which provides a graphical representation of comparative gene expression and facilitates classification of NSCLC samples into the relevant subtypes.
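Two-dimensional hierarchical clustering groups the samples and the genes at the same time. The following is a minimal sketch of that idea, not the OmniViz procedure referred to in the text: it assumes average linkage and a correlation distance (1 minus the Pearson coefficient), and the expression matrix is made up for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_linkage(vectors, k):
    """Agglomerative clustering (average linkage, distance = 1 - Pearson r).

    Merges the two closest clusters until k clusters remain and returns
    them as lists of row indices into `vectors`.
    """
    clusters = [[i] for i in range(len(vectors))]

    def dist(a, b):
        pairs = [(i, j) for i in a for j in b]
        return sum(1 - pearson(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > k:
        a, b = min(
            ((p, q) for p in range(len(clusters)) for q in range(p + 1, len(clusters))),
            key=lambda pq: dist(clusters[pq[0]], clusters[pq[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Hypothetical log2 expression matrix: rows are genes, columns are samples.
matrix = [
    [2.0, 1.8, -1.0, -1.2],
    [1.5, 1.9, -0.8, -1.1],
    [-1.0, -1.3, 2.1, 1.7],
    [-0.9, -1.4, 1.6, 2.2],
]
samples = [list(column) for column in zip(*matrix)]  # transpose: one vector per sample

sample_clusters = average_linkage(samples, 2)  # first dimension: samples
gene_clusters = average_linkage(matrix, 2)     # second dimension: genes
print(sample_clusters, gene_clusters)
```

Clustering the rows and the columns separately and then reordering the matrix by both results gives the familiar heat-map view in which samples of one subtype sit next to each other.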
  • the 75 or more genes comprise, consist of or consist essentially of the 75 genes set forth in Table 4.
  • the optimised signature set forth in this table allows classification of NSCLC into LCC, ADC or SCC subtypes by genetic analysis using a reduced population of probes.
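As a rough illustration of classifying by correlation with a reference pattern, the sketch below assigns a sample to the subtype whose reference profile it correlates with best. This nearest-profile rule is a simplification swapped in for the clustering described in the text, and the gene count, reference profiles and sample values are all hypothetical rather than taken from Table 3 or Table 4.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def classify_subtype(sample, reference_profiles):
    """Return (best_subtype, scores), scoring each subtype by Pearson correlation."""
    scores = {name: pearson(sample, profile)
              for name, profile in reference_profiles.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical log2 expression values for four signature genes per subtype.
reference_profiles = {
    "ADC": [2.0, -1.0, 0.5, -1.5],
    "SCC": [-1.5, 2.0, -0.5, 1.0],
    "LCC": [0.5, 0.5, -2.0, 1.0],
}
test_sample = [1.8, -0.7, 0.2, -1.1]

subtype, scores = classify_subtype(test_sample, reference_profiles)
print(subtype)
```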
  • a method for predicting the survival time of a patient suffering from a non-small cell lung carcinoma (NSCLC) by analysis of gene expression comprising the steps of: (a) assaying the expression levels of 17 genes selected from the genes set forth in Table 5; (b) either (i) comparing the expression levels of said 17 genes with a two-dimensional hierarchical clustering of the expression levels of 17 genes from NSCLC tumour samples as set forth in Table 5; or (ii) fitting the expression levels of said 17 genes to a survival model to derive a prognostic index.
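The text does not specify the survival model in step (ii). One common choice, assumed here purely for illustration, is a proportional-hazards style linear predictor in which the prognostic index is the weighted sum of the expression levels. The gene names, coefficients and cutoff below are hypothetical.

```python
def prognostic_index(expression, coefficients):
    """Linear predictor: sum over signature genes of coefficient * expression level."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)

# Hypothetical model coefficients for three signature genes (not from Table 5).
coefficients = {"GENE_A": 0.8, "GENE_B": -0.5, "GENE_C": 0.3}
# Hypothetical normalised expression levels for one patient.
patient = {"GENE_A": 1.2, "GENE_B": 0.4, "GENE_C": -1.0}

cutoff = 0.0  # assumed threshold separating high and low risk
pi = prognostic_index(patient, coefficients)
risk_group = "high" if pi > cutoff else "low"
print(round(pi, 2), risk_group)
```

In practice the coefficients would be estimated from a training cohort with known follow-up, and the cutoff chosen on that cohort (for example its median index).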
  • the methods according to the invention are in vitro methods.
  • Figure 1 shows a flowchart of the combinations of diagnostic methodologies, leading to an assessment of clinical risk for each patient. Accordingly, the invention provides a method for assessing clinical risk in a patient suffering from or suspected of suffering from NSCLC, comprising the steps of:
  • kits for assessing the presence, subtype or severity of NSCLC comprise reagents for measuring the presence of mRNA or polypeptides encoded by the genes identified herein.
  • kits may contain instructions as to use.
  • the kits may contain instructions as to the selection of genes to be screened in the diagnosis of NSCLC as set forth herein.
  • the genes are 5 or more of the genes set forth in Table 1, the 5 genes set forth in Table 2, 75 or more of the genes set forth in Table 3, the 75 genes set forth in Table 4 or the 17 genes set forth in Table 5.
  • the kit may contain instructions for the detection of the gene products expressed from said mRNA species.
  • any method for recognising the levels of expression of a gene may be used in the context of the present invention.
  • the genes identified in each gene signature, and the changes in expression levels associated therewith, are identified in the Tables set out below; analysis can be made manually, or using automated means, to compare the expression levels observed in a test sample to those observed in a reference sample.
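An automated comparison of a test sample with a reference sample can be as simple as a per-gene log2 ratio. The sketch below assumes a two-fold change threshold and uses made-up gene names and intensities; the text itself does not prescribe a threshold.

```python
from math import log2

def changed_genes(test, reference, threshold=1.0):
    """Return {gene: log2 fold change} for genes with |log2(test/reference)| >= threshold."""
    result = {}
    for gene, ref_level in reference.items():
        lfc = log2(test[gene] / ref_level)
        if abs(lfc) >= threshold:
            result[gene] = lfc
    return result

# Hypothetical expression intensities (arbitrary units).
reference = {"GENE_A": 100.0, "GENE_B": 200.0, "GENE_C": 50.0}
test = {"GENE_A": 400.0, "GENE_B": 210.0, "GENE_C": 12.5}

changes = changed_genes(test, reference)
print(changes)
```

A positive value marks a gene that is more abundant in the test sample than in the reference, a negative value one that is less abundant.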
  • kits in accordance with the invention may comprise any reagents suitable for measuring gene expression levels.
  • Such reagents comprise reagents for measuring levels of mRNA, or cDNA derived from mRNA, and/or reagents suitable for measuring levels of polypeptide gene products.
  • a kit may comprise nucleic acid probes which hybridise specifically to mRNA or cDNA specific for the appropriate gene signature, under appropriate conditions.
  • the probes may be immobilised onto a solid surface, such as glass slides, membranes of various types, columns or beads, and may be in the form of an addressable array. If the probes are on an array, the identity of each probe is advantageously known as a result of the spatial arrangement on the array itself.
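In an addressable array, the position of a spot is what identifies its probe. A minimal sketch of that lookup, with a hypothetical 2 x 2 layout and made-up signal values:

```python
# Hypothetical array layout: (row, column) -> gene represented by the probe at that spot.
layout = {
    (0, 0): "GENE_A",
    (0, 1): "GENE_B",
    (1, 0): "GENE_C",
    (1, 1): "GENE_D",
}

def decode_spots(signals, layout):
    """Translate {(row, col): intensity} into {gene: intensity} using the array layout."""
    return {layout[position]: intensity for position, intensity in signals.items()}

# Hypothetical scanner output, keyed by spot position.
signals = {(0, 0): 812.0, (0, 1): 95.5, (1, 0): 430.2, (1, 1): 12.1}

by_gene = decode_spots(signals, layout)
print(by_gene)
```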
  • Probes may be used in solution, to probe nucleic acids derived from the sample.
  • labelling means may be provided, to label either the probes or the sample nucleic acids.
  • Primers may also be provided, to prime extension reactions for amplification and/or labelling of sample nucleic acids.
  • the primers are specific for mRNA transcribed from the genes identified in the gene signatures set forth herein, or corresponding cDNA.
  • kits may alternatively, or in addition, comprise reagents such as immunoglobulins, RNA or peptide aptamers and the like which are capable of specifically detecting the polypeptide gene products of the target genes.
  • the present invention provides a diagnostic kit for use in characterising NSCLC tumours, comprising a set of reagents for specifically measuring the abundance of the mRNA species transcribed from the 5 or more of the genes set forth in Table 1, the 5 genes set forth in Table 2, 75 or more of the genes set forth in Table 3, the 75 genes set forth in Table 4 or the 17 genes set forth in Table 5; or the gene products expressed from said mRNA species.
  • the reagents comprise a set of oligonucleotide primers or probes which hybridise specifically to said genes, which may advantageously be attached to a solid phase in the form of an array.
  • the array consists of a library of oligonucleotides affixed to a solid phase, and said library of oligonucleotides consists substantially of oligonucleotides which are specific for the 5 or more of the genes set forth in Table 1, the 5 genes set forth in Table 2, 75 or more of the genes set forth in Table 3, the 75 genes set forth in Table 4 or the 17 genes set forth in Table 5.
  • the reagents are selected from immunoglobulin molecules, RNA aptamers and peptide aptamers.
  • the kit is for use in detecting the presence of NSCLC tumour tissue, comprising a set of nucleic acid probes or primers which recognise the transcripts of the genes set forth in Table 2.
  • the kit is for use in differentiating between LCC, ADC or SCC subtypes of NSCLC, comprising a set of nucleic acid probes or primers which recognises the transcripts of the genes set forth in Table 4.
  • the kit is for use in estimating the prognosis for survival of a patient suffering from NSCLC, comprising a set of nucleic acid probes or primers which recognises the transcripts of the genes set forth in Table 5.
  • kits may further include labelling means.
  • immunoglobulins, RNA or peptide aptamers may be substituted for, or may supplement, the nucleic acid reagents in kits according to the invention.
  • Figure 1 is a table representing the use of assays according to the invention in the assessment of patients diagnosed with or suspected of suffering from NSCLC.
  • Figure 2 shows Kaplan-Meier plots for survival of NSCLC patients, separating the patients with good and poor prognoses as assessed using the gene signature set forth in Table 5. Light grey bars indicate the end of the follow-up.
  • Figure 3 shows survival prediction by published prognostic signatures.
  • Kaplan-Meier curves for the best performing signatures are shown for 82 Erasmus MC patients (left) and 89 Duke University NSCLC patients (right), fitted by their risk assignments. Grey bars indicate patients at last follow-up, still alive. P-values are between brackets if overall survival of the low risk group is actually lower than that of the high risk group.
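For readers unfamiliar with Kaplan-Meier curves, the sketch below computes the product-limit estimate from follow-up times and event flags. Patients still alive at last follow-up are treated as censored. The data are invented and are not the Erasmus MC or Duke cohorts.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with at least one death.

    times  -- follow-up time for each patient
    events -- 1 if death was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Hypothetical follow-up in months; 0 marks a patient alive at last follow-up.
times = [5, 8, 12, 12, 20, 30]
events = [1, 0, 1, 1, 0, 1]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Computing one such curve per risk group and plotting them together gives the kind of comparison shown in Figures 2 and 3.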
  • NSCLC is "classified" by being identified as belonging to the SCC, LCC or ADC subtype, according to the terminology normally used in the art.
  • the present invention provides a further level of classification of NSCLC, as set forth below and in Table 5.
  • NSCLC can also be assessed as to severity by means of the present invention.
  • a prognosis for the patient's survival can be derived by the methods provided herein. Survival prognosis involves providing an estimate of the likelihood that a patient will survive for a given time period, for example 5 years.
  • the expression levels of genes are assayed in accordance with the present invention by measuring the levels of either nucleic acids or proteins encoded by the gene which are present in a sample. Expression levels are considered herein to be the amounts of mRNA or polypeptide which are present in a sample; they may be influenced, therefore, by for instance modulations in levels of transcription, translation, mRNA or protein turnover.
  • Genes whose expression levels are described herein as being useful for identifying, classifying or measuring the severity of NSCLC are referred to as "target" genes; groups of target genes form gene signatures, which can be used to identify, classify or measure the severity of NSCLC.
  • Nucleic acids are nucleic acids as is commonly understood in the art, and include DNA, RNA and artificial nucleic acids. In the context of the present invention, the levels of naturally-occurring nucleic acids will generally be measured using techniques known to those skilled in the art. Probes, primers and other nucleic acid molecules used in the present invention may comprise synthetic nucleotides or other modifications, as is known in the art.
  • Reagents for measuring gene expression levels include nucleic acids and ligands, such as antibodies, which are capable of detecting the RNA or polypeptide products of the target genes described herein. Reagents may be selective, in that they bind to or detect only the RNA or polypeptide products of the target genes, or non-selective, capable of binding to or detecting a wider population of genes, with the selectivity being introduced in a later stage of the assay.
  • assays can be conducted on arrays that comprise many genes in addition to the target genes, and the detection of changes in the expression levels of the target genes will be achieved by selective analysis of the arrays.
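Selective analysis of a general-purpose array amounts to keeping only the signature genes from the full readout. A short sketch, with hypothetical gene names and values:

```python
def select_targets(array_readout, target_genes):
    """Keep only the target genes from a whole-array readout.

    Returns (selected, missing): the expression values found for the target
    genes, and the target genes absent from the readout.
    """
    selected = {g: array_readout[g] for g in target_genes if g in array_readout}
    missing = [g for g in target_genes if g not in array_readout]
    return selected, missing

# Hypothetical whole-array readout and signature.
array_readout = {"GENE_A": 7.1, "GENE_B": 3.4, "GENE_X": 9.9, "GENE_Y": 1.2}
signature = ["GENE_A", "GENE_B", "GENE_C"]

selected, missing = select_targets(array_readout, signature)
print(selected, missing)
```

Reporting the missing genes separately makes it obvious when an array does not cover the whole signature.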
  • the Affymetrix Gene chip analyser is capable of identifying binding to probes on gene chip arrays, thereby measuring the degree of hybridisation to the probe sets representing genes on the array as well as the identity of the probes hybridised to at the same time.
  • primers may be used to selectively detect the RNA gene products of target genes.
  • a "primer" is an oligonucleotide, whether produced naturally as in a purified restriction digest or produced synthetically, that is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product that is complementary to a nucleic acid strand is induced (i.e., in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH).
  • the primer is preferably single stranded for maximum efficiency in the initiation of the reaction, but may alternatively be double stranded.
  • the primer is first treated to separate its strands before being used to prepare extension products.
  • the primer is an oligodeoxyribonucleotide.
  • the primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method.
  • the term "probe" refers to an oligonucleotide (i.e., a sequence of nucleotides), whether occurring naturally as in a purified restriction digest or produced synthetically, recombinantly or by PCR amplification, that is capable of hybridizing to at least a portion of another oligonucleotide of interest.
  • a probe may be single-stranded or double-stranded. Probes are useful in the detection, identification and isolation of particular gene sequences. It is contemplated that any probe used in the present invention can be labelled with a reporter molecule so that it is detectable in any detection system, including, but not limited to, enzyme (e.g., ELISA, as well as enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems. It is not intended that the present invention be limited to any particular detection system or label.
  • sample is used to denote biological samples which may be obtained from animals (including humans) and encompass fluids, solids, tissues, and gases.
  • Biological samples include sputum and blood products, such as plasma, serum and the like.
  • a sample is ordinarily a tissue sample obtained from normal tissue or a NSCLC.
  • "Comparing", as used herein, includes comparison of expression levels of target genes directly with a control, as well as comparison with profiles, as described further herein. In comparisons according to the present invention, a match is sought between the pattern of gene expression seen in the test sample and that seen in a control or in a predefined profile.
  • the term "isolated", when used in relation to a nucleic acid, refers to a nucleic acid sequence that is identified and separated from at least one component or contaminant with which it is ordinarily associated in its natural source. Isolated nucleic acid is present in a form or setting that is different from that in which it is found in nature.
  • isolated polypeptides are polypeptides or proteins separated from at least one component or contaminant with which they are ordinarily associated in their natural source.
  • the sample used for analysis comprises tissue sample, which includes tumour tissue, and in particular human lung cancer tumour tissue.
  • tissue is, but is not limited to, epithelial tissue and connective tissue; other tissue types may be used as and if they occur in a lung tumour.
  • NSCLC are comprised of epithelial tissue.
  • Samples are obtained from surgically resected lungs, or may be obtained from patients by standard biopsy techniques.
  • microdissection is used to ensure that the cell types subjected to analysis are the intended cell type.
  • Normal samples can be obtained from the same patient, adjacent to the tumour, or from patients not suffering from cancer. Typically, normal samples will be of the same tissue type (i.e. epithelial tissue, connective tissue) as the tumour sample.
  • the normal sample is used to establish reference expression profiles to distinguish normal from cancerous tissues, for example as in Table 1. If an analysis model is defined, for example using two-dimensional hierarchical clustering, it is only necessary to analyse a tumor sample from a patient rather than both a tumor sample and a normal sample from the same or different patients.
  • the levels of mRNAs present in a sample which are encoded by the genes identified in Tables 1-5 may be measured directly. Analysis is conveniently carried out by labelling the RNA in cells from the sample and assaying the abundance of the desired mRNA species. To prepare RNA from tumour and/or normal samples, total or poly(A)+ RNA is processed according to any suitable technique, for example as set forth below, to produce cDNA and subsequently cRNA, which is conveniently used in microarray analysis.
  • Copies of the cRNA or cDNA may be amplified, for example by RT-PCR. Fluorescent tags or digoxigenin-dUTP can then be enzymatically incorporated into the newly synthesized cDNA/cRNA or can be chemically attached to the new strands of DNA or RNA.
  • the assessment of expression is performed by gene expression profiling using oligonucleotide-based arrays or cDNA-based arrays of any type; RT-PCR (reverse transcription-Polymerase Chain Reaction), real-time PCR, in-situ hybridisation, Northern blotting, serial analysis of gene expression (SAGE) for example as described by Velculescu et al Science 270 (5235): 484-487, or differential display. Details of these and other methods can be found for example in Sambrook et al, 1989, Molecular Cloning: A Laboratory Manual.
  • the assessment uses a microarray assay.
  • Microarrays can be constructed by a number of available technologies. Array technology and the various techniques and applications associated with it are described generally in numerous textbooks and documents. Gene array technology is particularly suited to the practice of the present invention. Methods for preparing microarrays are well known in the art. These include Lemieux et al., (1998), Molecular Breeding 4,277-289, Schena and Davis. Parallel Analysis with Biological Chips, in PCR Methods Manual (eds. M. Innis, D. Gelfand, J. Sninsky), Schena and Davis, (1999), Genes, Genomes and Chips. In DNA Microarrays : A Practical Approach (ed. M.
  • Major applications for array technology include the identification of sequence (nucleotide sequence/nucleotide sequence mutation) and the determination of expression level (abundance) of nucleotide sequences.
  • Gene expression profiling may make use of array technology, optionally in combination with proteomics techniques (Celis et al., 2000, FEBS Lett, 480 (1): 2-16; Lockhart and Winzeler, 2000, Nature 405 (6788): 827-836; Khan et al., 1999, 20 (2): 223-9).
  • any library may be arranged in an orderly manner into an array, by spatially separating the members of the library.
  • suitable libraries for arraying include nucleic acid libraries (including DNA, RNA, oligonucleotide and other nucleic acid libraries), peptide, polypeptide and protein libraries, as well as libraries comprising other types of molecules, such as ligand libraries.
  • the members of a library are generally fixed or immobilised onto a solid phase, preferably a solid substrate, to limit diffusion and admixing of the samples.
  • the libraries may be immobilised to a substantially planar solid phase, including membranes and non-porous substrates such as plastic and glass.
  • the samples are preferably arranged in such a way that indexing (i.e. reference or access to a particular sample) is facilitated.
  • the samples are applied as spots in a grid formation.
  • Common assay systems may be adapted for this purpose.
  • an array may be immobilised on the surface of a microplate, either with multiple samples in a well, or with a single sample in each well.
  • the solid substrate may be a membrane, such as a nitrocellulose or nylon membrane (for example, membranes used in blotting experiments).
  • Alternative substrates include glass, or silica based substrates.
  • the samples are immobilised by any suitable method known in the art, for example, by charge interactions, or by chemical coupling to the walls or bottom of the wells, or the surface of the membrane.
  • Other means of arranging and fixing may be used, for example, pipetting, drop-touch, piezoelectric means, ink-jet and bubblejet technology, electrostatic application, etc.
  • photolithography may be utilised to arrange and fix the samples on the chip.
  • the samples may be arranged by being "spotted" onto the solid substrate; this may be done by hand or by making use of robotics to deposit the sample.
  • arrays may be described as macroarrays or microarrays, the difference being the size of the sample spots.
  • Macroarrays typically contain sample spot sizes of about 300 microns or larger and may be easily imaged by existing gel and blot scanners.
  • the sample spot sizes in microarrays are typically less than 200 microns in diameter and these arrays usually contain thousands of spots.
  • microarrays may require specialised robotics and imaging equipment, which may need to be custom made. Instrumentation is described generally in a review by Cortese, 2000, The Scientist 14 [11]: 26.
  • targets and probes may be labelled with any readily detectable reporter such as a fluorescent, bioluminescent, phosphorescent, radioactive reporter.
  • kits which comprise microarrays which are specific for a desired set of genes.
  • microarrays according to the invention may consist of a solid phase and, immobilised thereto, a library of nucleic acid oligonucleotides or probes which consists substantially of one or more of the gene signatures identified herein, and listed in the Tables, especially Tables 2, 4, and 5.
  • Such specialised microarrays are less expensive to produce than general purpose microarrays, and less difficult and expensive to analyse.
  • the arrays according to the invention may comprise a library of oligonucleotides which is larger than, though still comprising, one or more of the gene signatures described herein, but still smaller than the set consisting of all known genes.
  • such arrays may comprise gene signatures which are useful for detecting other forms of cancer, or other types of NSCLC, or which may provide different insights into the prognosis for NSCLC patients, or the like.
  • Nucleic acid signatures in accordance with the invention may be detected by nucleic acid analysis which relies on amplification and/or sequencing of sample nucleic acids. Since the invention aims to measure gene expression, the methods used must quantitatively measure transcribed nucleic acid levels. The measured nucleic acids must therefore be mRNA, or nucleic acids derived quantitatively from mRNA such as cDNA.
  • such nucleic acid analysis generally requires nucleic acid amplification.
  • Many amplification methods rely on an enzymatic chain reaction (such as a polymerase chain reaction, a ligase chain reaction, or a self-sustained sequence replication), a linear amplification procedure, or on the replication of all or part of the vector into which the desired sequence has been cloned.
  • the amplification according to the invention is an exponential amplification, as exhibited by for example the polymerase chain reaction.
  • amplification methods can be used in the methods of the present invention, and include polymerase chain reaction (PCR), PCR in situ, ligase amplification reaction (LAR), ligase hybridisation, Qbeta bacteriophage replicase, transcription-based amplification system (TAS), genomic amplification with transcript sequencing (GAWTS), nucleic acid sequence-based amplification (NASBA) and in situ hybridisation.
  • Primers suitable for use in various amplification techniques can be prepared according to methods known in the art.
  • PCR is a nucleic acid amplification method described inter alia in U.S. Pat. Nos. 4,683,195 and 4,683,202.
  • PCR consists of repeated cycles of DNA polymerase generated primer extension reactions.
  • the target DNA is heat denatured and two oligonucleotides, which bracket the target sequence on opposite strands of the DNA to be amplified, are hybridised. These oligonucleotides become primers for use with DNA polymerase.
  • the DNA is copied by primer extension to make a second copy of both strands. By repeating the cycle of heat denaturation, primer hybridisation and extension, the target DNA can be amplified a million fold or more in about two to four hours.
  • PCR is a molecular biology tool, which must be used in conjunction with a detection technique to determine the results of amplification.
  • An advantage of PCR is that it increases sensitivity by amplifying the amount of target DNA by 1 million to 1 billion fold in approximately 4 hours.
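The fold-amplification figures quoted above follow from simple doubling arithmetic. The sketch below assumes an idealised reaction in which every cycle multiplies the target by (1 + efficiency), with an efficiency of 1.0 meaning perfect doubling. Real reactions fall short of this, so the numbers are upper bounds.

```python
from math import ceil, log

def fold_amplification(cycles, efficiency=1.0):
    """Fold increase after a number of cycles, each multiplying the target by (1 + efficiency)."""
    return (1 + efficiency) ** cycles

def cycles_needed(fold, efficiency=1.0):
    """Smallest whole number of cycles that reaches at least the requested fold increase."""
    return ceil(log(fold) / log(1 + efficiency))

print(fold_amplification(20))   # about one million fold
print(fold_amplification(30))   # about one billion fold
print(cycles_needed(1e6), cycles_needed(1e9))
```

Under perfect doubling, roughly 20 cycles give a million-fold increase and roughly 30 cycles a billion-fold increase.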
  • PCR can be used to amplify any known nucleic acid in a diagnostic context (Mok et al, (1994), Gynaecologic Oncology, 52: 247-252).
  • Self-sustained sequence replication is a variation of TAS, which involves the isothermal amplification of a nucleic acid template via sequential rounds of reverse transcriptase (RT), polymerase and nuclease activities that are mediated by an enzyme cocktail and appropriate oligonucleotide primers (Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87: 1874).
  • Enzymatic degradation of the RNA of the RNA/DNA heteroduplex is used instead of heat denaturation.
  • RNase H and all other enzymes are added to the reaction and all steps occur at the same temperature and without further reagent additions.
  • Ligation amplification reaction or ligation amplification system uses DNA ligase and four oligonucleotides, two per target strand. This technique is described by Wu, D. Y. and Wallace, R. B. (1989) Genomics 4:560. The oligonucleotides hybridise to adjacent sequences on the target DNA and are joined by the ligase. The reaction is heat denatured and the cycle repeated.
  • RNA replicase for the bacteriophage Qβ, which replicates single-stranded RNA, is used to amplify the target DNA, as described by Lizardi et al. (1988)
  • the target DNA is hybridised to a primer including a T7 promoter and a Qβ 5' sequence region.
  • the primer including a T7 promoter and a Qβ 5' sequence region is used to generate a cDNA, connecting the primer to its 5' end in the process.
  • the resulting heteroduplex is heat denatured.
  • a second primer containing a Qβ 3' sequence region is used to initiate a second round of cDNA synthesis.
  • T7 RNA polymerase then transcribes the double-stranded DNA into new RNA, which mimics the Qβ. After extensive washing to remove any unhybridised probe, the new RNA is eluted from the target and replicated by Qβ replicase. The latter reaction creates 10 fold amplification in approximately 20 minutes.
  • rolling circle amplification (Lizardi et al., (1998) Nat Genet 19:225) is an amplification technology available commercially (RCAT™) which is driven by DNA polymerase and can replicate circular oligonucleotide probes with either linear or geometric kinetics under isothermal conditions.
  • a geometric amplification occurs via DNA strand displacement and hyperbranching to generate 10^12 or more copies of each circle in 1 hour. If a single primer is used, RCAT generates, in a few minutes, a linear chain of thousands of tandemly linked DNA copies of a target covalently linked to that target.
  • SDA strand displacement amplification
  • SDA comprises both a target generation phase and an exponential amplification phase.
  • in target generation, double-stranded DNA is heat denatured, creating two single-stranded copies.
  • a series of specially manufactured primers combine with DNA polymerase (amplification primers for copying the base sequence and bumper primers for displacing the newly created strands) to form altered targets capable of exponential amplification.
  • the exponential amplification process begins with altered targets (single-stranded partial DNA strands with restriction enzyme recognition sites) from the target generation phase.
  • An amplification primer is bound to each strand at its complementary DNA sequence.
  • DNA polymerase then uses the primer to identify a location to extend the primer from its 3' end, using the altered target as a template for adding individual nucleotides.
  • the extended primer thus forms a double-stranded DNA segment containing a complete restriction enzyme recognition site at each end.
  • a restriction enzyme is then bound to the double stranded DNA segment at its recognition site.
  • the restriction enzyme dissociates from the recognition site after having cleaved only one strand of the double-stranded segment, forming a nick.
  • DNA polymerase recognises the nick and extends the strand from the site, displacing the previously created strand.
  • the recognition site is thus repeatedly nicked and restored by the restriction enzyme and DNA polymerase with continuous displacement of DNA strands containing the target segment.
  • Each displaced strand is then available to anneal with amplification primers as above.
  • the process continues with repeated nicking, extension and displacement of new DNA strands, resulting in exponential amplification of the original DNA target.
  • Identification of nucleic acid sequences can for example be performed by primer extension or sequencing techniques. Such techniques may involve the parallel and/or serial processing of a large number of different template nucleic acid molecules.
  • a library of probes on an array may be employed.
  • a high sensitivity analytical technique may be used to characterize individually nucleic acid molecules which become immobilised on the array, by hybridisation to the probes.
  • primer extension reactions may be used to incorporate labeled nucleotide(s) that can be individually detected in order to sequence individual molecules and/or determine the identity of at least one nucleotide position on individual nucleic acid molecules.
  • Detection may involve labeling one or more of the primers and/or extension nucleotides with a detectable label (e.g., using fluorescent label(s), FRET label(s), enzymatic label(s), radio-label(s), etc.).
  • Detection may involve imaging, for example using a high sensitivity camera and/or microscope (e.g., a super-cooled camera and/or microscope).
  • Suitable techniques may be selected by one of ordinary skill in the art.
  • high-throughput sequencing approaches are listed in K.Y. Chan, Mutation Research 573 (2005) 13-40 and include, but are not limited to, near-term sequencing approaches such as cycle-extension approaches, polymerase reading approaches and exonuclease sequencing, revolutionary sequencing approaches such as DNA scanning and nanopore sequencing and direct linear analysis.
  • Examples of current high-throughput sequencing methods are 454 (pyro)sequencing, Solexa Genome Analysis System, Agencourt SOLiD sequencing method (Applied Biosystems), MS-PET sequencing (Ng et al., 2006, http://nar.oxfordjournals.org/cgi/content/full/34/12/e84).
  • a digital analysis (e.g., a digital amplification and subsequent analysis) may be performed to obtain a statistically significant quantitative result.
  • Certain digital techniques are known in the art, see for example, US Patent No. 6,440,706 and US Patent No. 6,753,147, incorporated herein by reference.
  • an emulsion-based method for amplifying and/or sequencing individual nucleic acid molecules may be used (e.g., BEAMing technology; International Published Application Nos. WO2005/010145, WO00/40712, WO02/22869, WO03/044187, WO99/02671, herein incorporated by reference).
  • a sequencing method that can sequence single molecules in a biological sample may be used. Sequencing methods are known and being developed for high throughput (e.g., parallel) sequencing of complex genomes by sequencing a large number of single molecules (often having overlapping sequences) and compiling the information to obtain the sequence of an entire genome or a significant portion thereof. Suitable sequencing techniques may involve high speed parallel molecular nucleic acid sequencing as described in PCT Application No. WO 01/16375, US Application No. 60/151,580 and U.S. Published Application No. 20050014175, the entire contents of which are incorporated herein by reference. Other sequencing techniques are described in PCT Application No. WO 05/73410, PCT Application No. WO 05/54431, PCT Application No.
  • Sequencing techniques for use in connection with the invention may involve exposing a nucleic acid molecule to an oligonucleotide primer and a polymerase in the presence of a mixture of nucleotides. Changes in the fluorescence of individual nucleic acid molecules in response to polymerase activity may be detected and recorded.
  • the specific labels attached to each nucleic acid and/or nucleotide may provide an emission spectrum allowing for the detection of sequence information for individual template nucleic acid molecules.
  • a label is attached to the primer/template and a different label is attached to each type of nucleotide (e.g., A, T/U, C, or G). Each label emits a distinct signal which is distinguished from the other labels.
  • Useful sequencing methods include high throughput sequencing using the 454 Life Sciences Instrument System (International Published Application No. WO2004/069849, filed January 28, 2004). Briefly, a sample of single stranded DNA is prepared and added to an excess of DNA capture beads which are then emulsified. Clonal amplification is performed to produce a sample of enriched DNA on the capture beads (the beads are enriched with millions of copies of a single clonal fragment). The DNA enriched beads are then transferred into PicoTiterPlate (TM) and enzyme beads and sequencing reagents are added. The samples are then analyzed and the sequence data recorded. Pyrophosphate and luciferin are examples of the labels that can be used to generate the signal.
  • a label includes but is not limited to a fluorophore, for example green fluorescent protein (GFP), a luminescent molecule, for example aequorin or europium chelates, fluorescein, rhodamine green, Oregon green, Texas red, naphthofluorescein, or derivatives thereof.
  • the polynucleotide is linked to a substrate.
  • a substrate includes but is not limited to, streptavidin-biotin, histidine-Ni, S-tag-S-protein, or glutathione-S-transferase (GST).
  • a substrate is pretreated to facilitate attachment of a polynucleotide to a surface
  • the substrate can be glass which is coated with a polyelectrolyte multilayer (PEM), or the polynucleotide is biotinylated and the PEM-coated surface is further coated with streptavidin.
  • PEM polyelectrolyte multilayer
  • single molecule sequencing technology available from US Genomics, Mass., may be used.
  • technology described, at least in part, in one or more of US patents 6,790,671; 6,772,070; 6,762,059; 6,696,022; 6,403,311; 6,355,420; 6,263,286; and 6,210,896 may be used.
  • sequencing methods may be used to analyze DNA and/or RNA according to methods of the invention. It should be appreciated that a sequencing method does not have to be a single molecule sequencing method, since generally nucleic acid material from a substantial sample or biopsy will be available for analysis.
  • the levels of polypeptides encoded by the genes identified in Tables 1-5 can be measured directly, without measuring mRNA levels.
  • polypeptides can be detected by differential mobility on protein gels, or by other size analysis techniques such as mass spectrometry.
  • Peptides derived from the gene signatures identified herein can be differentiated by size analysis.
  • the detection means is sequence-specific, such that a particular gene product can accurately be identified as the product of a member of any given gene signature.
  • polypeptide or RNA molecules can be developed which specifically recognise the desired gene products in vivo or in vitro.
  • immunoglobulin molecules may be used to specifically bind to the target polypeptides, for instance in a western blot or ELISA.
  • the immunoglobulins or the target polypeptides may be labelled, to provide a means of identification and measurement. Ideally, such measurements are carried out on an array of immunoglobulin molecules.
  • An "immunoglobulin" is one of a family of polypeptides which retain the immunoglobulin fold characteristic of immunoglobulin (antibody) molecules, which contains two β sheets and, usually, a conserved disulphide bond.
  • Members of the immunoglobulin superfamily are involved in many aspects of cellular and non-cellular interactions in vivo, including widespread roles in the immune system (for example, antibodies, T-cell receptor molecules and the like), involvement in cell adhesion (for example the ICAM molecules) and intracellular signalling (for example, receptor molecules, such as the PDGF receptor).
  • Preferred immunoglobulins are antibodies, which are capable of binding to target antigens with high specificity.
  • Antibodies can be whole antibodies, or antigen-binding fragments thereof.
  • the invention includes fragments such as Fv and Fab, as well as Fab' and F(ab')2, and antibody variants such as scFv, single domain antibodies, Dab antibodies and other antigen-binding antibody-based molecules.
  • polypeptides encoded by the genes set forth in Tables 1-5, or peptides derived therefrom, can be used to generate antibodies for use in the present invention.
  • the peptides used preferably comprise an epitope which is specific for a polypeptide encoded by a gene in accordance with the invention.
  • Polypeptide fragments which function as epitopes can be produced by any conventional means (see, for example, U.S. Pat. No. 4,631,211).
  • antigenic epitopes preferably contain a sequence of at least 4, at least 5, at least 6, at least 7, more preferably at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 20, at least 25, at least 30, at least 40, at least 50, and, most preferably, between about 15 to about 30 amino acids.
  • Preferred polypeptides comprising immunogenic or antigenic epitopes are at least 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95 or 100 amino acid residues in length.
  • Antibodies can be generated using antigenic epitopes of polypeptides according to the invention by immunising animals, such as rabbits or mice, with either free or carrier-coupled peptides, for instance, by intraperitoneal and/or intradermal injection of emulsions containing about 100 µg of peptide or carrier protein and Freund's adjuvant or any other adjuvant known for stimulating an immune response.
  • Antibodies for use in the present invention can be fused to marker sequences, such as a peptide which facilitates purification of the fused polypeptide.
  • the marker amino acid sequence is a hexa-histidine peptide, such as the tag provided in a pQE vector (QIAGEN, Inc., 9259 Eton Avenue, Chatsworth, Calif., 91311), among others, many of which are commercially available.
  • hexa-histidine provides for convenient purification of the fusion protein.
  • Another peptide tag useful for purification, the "HA" tag, corresponds to an epitope derived from the influenza hemagglutinin protein (Wilson et al. (1984) Cell 37:767).
  • Antibodies as described herein can be altered antibodies comprising an effector protein such as a label.
  • labels which allow the imaging of the distribution of the antibody in vivo.
  • Such labels can be radioactive labels or radioopaque labels, such as metal particles, which are readily visualisable within the body of a patient. This can allow an assessment to be made without the need for tissue biopsies.
  • they can be fluorescent labels or other labels which are visualisable on tissue.
  • the antibody is preferably provided together with means for detecting the antibody, which can be enzymatic, fluorescent, radioisotopic or other means.
  • the antibody and the detection means can be provided for simultaneous, simultaneous separate or sequential use, in a diagnostic kit intended for diagnosis.
  • the antibodies for use in the invention can be assayed for immunospecific binding by any method known in the art.
  • the immunoassays which can be used include but are not limited to competitive and noncompetitive assay systems using techniques such as western blots, radioimmunoassays, ELISA, sandwich immunoassays, immunoprecipitation assays, precipitin reactions, gel diffusion precipitin reactions, immunodiffusion assays, agglutination assays, complement-fixation assays, immunoradiometric assays, fluorescent immunoassays and protein A immunoassays.
  • Such assays are routine in the art (see, for example, Ausubel et al, eds, 1994, Current Protocols in Molecular Biology, Vol. 1, John Wiley & Sons, Inc., New York, which is incorporated by reference herein in its entirety).
  • Immunoprecipitation protocols generally comprise lysing a population of cells in a lysis buffer such as RIPA buffer (1% NP-40 or Triton X-100, 1% sodium deoxycholate, 0.1% SDS, 0.15 M NaCl, 0.01 M sodium phosphate at pH 7.2, 1% Trasylol) supplemented with protein phosphatase and/or protease inhibitors (e.g., EDTA, PMSF, aprotinin, sodium vanadate), adding the antibody of interest to the cell lysate, incubating for a period of time (e.g., 1-4 hours) at 4°C, adding protein A and/or protein G sepharose beads to the cell lysate, incubating for about an hour or more at 4°C, washing the beads in lysis buffer and resuspending the beads in SDS/sample buffer.
  • Western blot analysis generally comprises preparing protein samples, electrophoresis of the protein samples in a polyacrylamide gel (e.g., 8%-20% SDS-PAGE depending on the molecular weight of the antigen), transferring the protein sample from the polyacrylamide gel to a membrane such as nitrocellulose, PVDF or nylon, blocking the membrane in blocking solution (e.g., PBS with 3% BSA or non-fat milk), washing the membrane in washing buffer (e.g., PBS-Tween 20), exposing the membrane to a primary antibody (the antibody of interest) diluted in blocking buffer, washing the membrane in washing buffer, exposing the membrane to a secondary antibody (which recognises the primary antibody, e.g., an anti-human antibody) conjugated to an enzymatic substrate (e.g., horseradish peroxidase or alkaline phosphatase) or radioactive molecule (
  • ELISAs comprise preparing antigen, coating the well of a microtitre plate with the antigen, adding the antibody of interest conjugated to a detectable compound such as an enzymatic substrate (e.g., horseradish peroxidase or alkaline phosphatase) to the well and incubating for a period of time, and detecting the presence of the antigen.
  • a second antibody conjugated to a detectable compound can be added following the addition of the antigen of interest to the coated well.
  • the binding affinity of an antibody to an antigen and the off-rate of an antibody-antigen interaction can be determined by competitive binding assays.
  • a competitive binding assay is a radioimmunoassay comprising the incubation of labelled antigen (e.g., ³H or ¹²⁵I) with the antibody of interest in the presence of increasing amounts of unlabeled antigen, and the detection of the antibody bound to the labelled antigen.
  • the affinity of the antibody of interest for a particular antigen and the binding off-rates can be determined from the data by Scatchard plot analysis.
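As a rough illustration of the Scatchard analysis mentioned above, the sketch below fits a straight line to bound/free versus bound ligand concentrations. For a single class of binding sites the slope of that line is -1/Kd and its x-intercept is Bmax. The function name `scatchard_fit` and the synthetic data are assumptions made for illustration only; they are not taken from the specification.

```python
def scatchard_fit(bound, free):
    """Least-squares line through (bound, bound/free); returns (Kd, Bmax)."""
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals -1/Kd for one binding site
    intercept = mean_y - slope * mean_x    # equals Bmax/Kd
    kd = -1.0 / slope
    bmax = -intercept / slope              # x-intercept of the fitted line
    return kd, bmax


# Hypothetical single-site data generated with Kd = 2.0 and Bmax = 10.0
true_kd, true_bmax = 2.0, 10.0
free = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [true_bmax * f / (true_kd + f) for f in free]
kd, bmax = scatchard_fit(bound, free)
```

Real radioimmunoassay data are noisy, so in practice the fitted values would only approximate the underlying constants.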
  • Competition with a second antibody can also be determined using radioimmunoassays.
  • the antigen is incubated with antibody of interest conjugated to a labelled compound (e.g., ³H or ¹²⁵I) in the presence of increasing amounts of an unlabeled second antibody.
  • Polypeptide levels may be measured using alternative peptide-specific reagents.
  • Such reagents include peptide or RNA aptamers, which can specifically detect a defined polypeptide sequence. Proteins can be detected by protein gel assay, antibody binding assay, or other detection methods known in the art.
  • RNA aptamers can be produced by SELEX.
  • SELEX is a method for the in vitro evolution of nucleic acid molecules with highly specific binding to target molecules. It is described, for example, in U.S. patents 5654151, 5503978, 5567588 and 5270163, as well as PCT publication WO 96/38579, each of which is specifically incorporated herein by reference.
  • the SELEX method involves selection of nucleic acid aptamers, single-stranded nucleic acids capable of binding to a desired target, from a library of oligonucleotides.
  • the SELEX method includes steps of contacting the library with the target under conditions favourable for binding, partitioning unbound nucleic acids from those nucleic acids which have bound specifically to target molecules, dissociating the nucleic acid-target complexes, amplifying the nucleic acids dissociated from the nucleic acid-target complexes to yield a ligand-enriched library of nucleic acids, then reiterating the steps of binding, partitioning, dissociating and amplifying through as many cycles as desired to yield highly specific, high affinity nucleic acid ligands to the target molecule.
  • SELEX is based on the principle that within a nucleic acid library containing a large number of possible sequences and structures there is a wide range of binding affinities for a given target.
  • a nucleic acid library comprising, for example, a 20 nucleotide randomised segment can have 4^20 structural possibilities. Those which have the higher affinity constants for the target are considered to be most likely to bind.
  • the process of partitioning, dissociation and amplification generates a second nucleic acid library, enriched for the higher binding affinity candidates. Additional rounds of selection progressively favour the best ligands until the resulting library is predominantly composed of only one or a few sequences. These can then be cloned, sequenced and individually tested for binding affinity as pure ligands.
  • the iterative selection/amplification method is sensitive enough to allow isolation of a single sequence in a library containing at least 10^14 sequences.
  • the nucleic acids of the library preferably include a randomised sequence portion as well as conserved sequences necessary for efficient amplification.
  • Nucleic acid sequence variants can be produced in a number of ways including synthesis of randomised nucleic acid sequences and size selection from randomly cleaved cellular nucleic acids.
  • the variable sequence portion can contain fully or partially random sequence; it can also contain subportions of conserved sequence incorporated with randomised sequence. Sequence variation in test nucleic acids can be introduced or increased by mutagenesis before or during the selection/amplification iterations and by specific modification of cloned aptamers.
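To make the library sizes quoted for SELEX concrete, the following sketch counts the possible sequences for a fully randomised segment and finds the shortest randomised length whose sequence space reaches a given size. The helper names are hypothetical, and a four-letter alphabet with every position fully random is assumed.

```python
def library_size(n_random_positions, alphabet_size=4):
    """Number of distinct sequences for a fully randomised segment."""
    return alphabet_size ** n_random_positions


def min_random_length(target_size, alphabet_size=4):
    """Shortest randomised segment whose sequence space reaches target_size."""
    n = 0
    while alphabet_size ** n < target_size:
        n += 1
    return n


size_20 = library_size(20)                    # the 20-nucleotide example in the text
length_for_1e14 = min_random_length(10 ** 14)
```

Under these assumptions a 20 nucleotide segment spans roughly 1.1 x 10^12 sequences, so a sequence space of 10^14 members needs a somewhat longer randomised portion.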
  • the expression of genes in a test sample can be compared to the expression of genes in known tumour and normal samples, in order to determine whether tumour cells are present.
  • the pattern of gene expression for the genes identified in the present application differs in the tumour sample from that in a normal sample.
  • the tumour sample contains those cells which show the physiological and morphological characteristics associated with malignancy, including the ability for unrestricted independent growth and proliferation.
  • a normal sample is a sample which does not comprise any cells which show the physiological and morphological characteristics associated with malignancy.
  • a normal sample may be a tissue sample isolated from tissue adjacent to a tumour in a patient suffering from cancer. Alternatively, it may be a sample isolated from an individual not suffering from cancer. The normal sample acts as a control.
  • Expression levels of genes identified herein may be greater or lower in a tumour sample compared to a normal sample.
  • the identification of tumour or normal tissue depends not only on upregulation of certain genes, but on a general pattern of change in gene expression.
  • comparison with a normal sample is no longer necessary.
  • the presence of a tumour may be assessed by comparison with the pattern associated with the tumour.
  • Hierarchical Cluster Analysis is defined as grouping or segmenting a collection of objects into subsets or "clusters".
  • the objects to be clustered can be either the genes or the samples: genes can be clustered by comparing their expression profiles across the set of samples, or the samples can be clustered by comparing their expression profiles across the set of genes. In such a way, the genes (or samples) within each cluster are more closely related to one another than genes (or samples) grouped within different clusters. In a hierarchical clustering analysis, the genes (or samples) are not partitioned into a particular cluster in a single step.
  • NSCLC can be differentiated either in the presence of controls, from other, known NSCLC types, or without controls once the expression patterns of the genes set forth herein have been established.
  • the prognosis of patients with a tumour uses the same procedure as described above for obtaining the samples and expressing the gene sets either as individual genes or as sets.
  • a Cox's proportional hazards regression analysis is then performed for each gene thereby allowing the selection of overall survival associated genes.
  • a risk score is then determined for each individual patient as the sum, over the selected genes, of the regression coefficient of each gene multiplied by the corresponding expression intensity.
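The risk score described above is a weighted sum, which can be sketched as follows. The gene names, coefficients and expression values are invented for illustration; only the form of the calculation (regression coefficient times expression intensity, summed over the selected genes) comes from the text.

```python
def risk_score(coefficients, expression):
    """Sum of Cox regression coefficient x expression intensity over the selected genes."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


# Hypothetical coefficients from per-gene Cox regression and one patient's profile
coefficients = {"geneA": 0.8, "geneB": -0.5, "geneC": 0.3}
patient = {"geneA": 7.0, "geneB": 6.0, "geneC": 9.0, "geneD": 5.0}  # geneD is not selected
score = risk_score(coefficients, patient)
```

Genes with a positive coefficient raise the score as their expression increases, while genes with a negative coefficient lower it.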
  • Cox regression is a method for investigating the effect of several variables upon the time a specified event takes to happen. In the context of an outcome such as death this is known as Cox regression for survival analysis.
  • the method does not assume any particular "survival model" but it is not truly non-parametric because it does assume that the effects of the predictor variables upon survival are constant over time and are additive in one scale.
  • Based on the median risk score, patients are then categorized as high-risk or low-risk with respect to overall survival or relapse-free survival. This is determined by a comparison to the corresponding Kaplan-Meier estimates of overall survival.
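A minimal sketch of the median split and of a Kaplan-Meier estimate is given below. It assumes that patients with a score above the cohort median are labelled high-risk, which the text does not state explicitly, and all patient identifiers, scores and follow-up times are hypothetical.

```python
from statistics import median


def split_by_median(scores):
    """Label a patient 'high' if the risk score exceeds the cohort median, else 'low'."""
    cut = median(scores.values())
    return {pid: ("high" if s > cut else "low") for pid, s in scores.items()}


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one death.

    times: follow-up time per patient; events: 1 = death observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


scores = {"p1": 5.3, "p2": 2.1, "p3": 6.8, "p4": 1.4}
groups = split_by_median(scores)
curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
```

In use, `kaplan_meier` would be run separately on the high-risk and low-risk groups so that the two survival curves can be compared.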
  • NSCLC non-small cell lung carcinoma
  • Identifying a signature for NSCLC. A genome-wide gene expression analysis using Affymetrix U133 Plus 2.0 arrays was performed on the cohort of 91 patients with NSCLC. All tumor samples were reviewed by two independent pathologists to determine their histopathological types, cancer cell contents, and degree of differentiation. Eight LCC samples had a high level of cell type heterogeneity, presenting with acinar differentiation or squamous cell components. Nineteen percent (17 out of 91) of tumor samples had a discrepancy in histopathological classification, including five rare types of NSCLC with a histological composition of multiple cell types. To sketch a precise histological profile, these 17 samples were excluded from creating histology signatures.
  • RNA pellets were washed with 75% ethanol and dissolved in RNase-free water. If applicable, they were stored at -80 °C for further usage.
  • the integrity of the isolated total RNA was determined using the Agilent 2100
  • RNA samples were kept for further processing if the 28S/18S ratio of the RNA was higher than 1.2.
  • concentrations of the RNAs were measured with a NanoDrop ND-1000 UV-VIS spectrophotometer.
  • Double strand (ds) cDNA synthesis was performed according to the standardized protocol for One-Cycle cDNA synthesis from Affymetrix (Santa Clara, CA). Approximately 5 µg of total RNA was first converted to single strand cDNA in a 20 µl First-Strand Reaction Mix, containing poly-A control RNA, 100 µmol T7-Oligo Primer, 1x first strand buffer, 0.2 mol DTT, 10 mmol dNTP mix and Superscript II. In detail, the sample RNA, the poly-A control RNA and the T7-Oligo Primer were mixed and incubated for 10 min at 70 °C.
  • the first strand buffer, the DTT and the dNTP mix were added and incubated for 2 min at 42 °C, followed by adding Superscript II and incubation for 1 hour at 42 °C.
  • the ds cDNA was prepared from the resultant First-Strand Reaction Mix, mixed with 1x second strand reaction buffer, 30 mmol dNTP mix, E. coli DNA ligase, E. coli DNA Polymerase I and RNase H. The mix was incubated for 2 hours at 16 °C, then supplemented with T4 DNA Polymerase and then incubated for a further 5 minutes at 16 °C. The reaction was stopped by the addition of EDTA to a final concentration of 5 µM.
  • the Sample Cleanup Module and GeneChip IVT Labelling Kit from Affymetrix were used to purify the synthesized ds cDNA, which was then used to generate biotin-labelled cRNA in the presence of 1x IVT Labelling buffer, IVT Labelling NTP Mix, IVT Labelling Enzyme Mix and RNase-free water in a total volume of 40 µl. After an incubation of 16 hours at 37 °C, the concentration and quality of the labelled cRNA were checked with a NanoDrop ND-1000 UV-VIS spectrophotometer. An A260/A280 ratio between 1.9 and 2.1 was considered acceptable.
  • Hybridization was conducted following Affymetrix instructions for the GeneChip® Human Genome U133 Plus 2.0 array.
  • the GeneArray scanner 3000 (Affymetrix) was then employed to detect the hybridization signals.
  • Microarrays that did not pass the quality assessment were removed from further analyses.
  • the quality metrics used to exclude microarrays were the summary statistics calculated by the GCOS algorithm during the processing of probe-level data.
  • the primary inclusion criteria include: all arrays had to have comparable noise values (Raw Q, measurement for the pixel-to-pixel variation of probe cells on the chip); background values were within the range of 20 to 100; percent of present calls for probe sets on the array should not be below 45%.
  • the other criteria were: arrays with extremely high or low values for any of these parameters, e.g.
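The two numeric inclusion criteria stated above (background between 20 and 100, and at least 45% present calls) can be expressed as a simple filter, sketched below. The comparability of Raw Q noise values is not encoded because the text gives no threshold for it, and the array names and metric values are hypothetical.

```python
def passes_array_qc(background, percent_present):
    """Apply the two numeric inclusion criteria given in the text."""
    return 20 <= background <= 100 and percent_present >= 45


# Hypothetical GCOS summary statistics for three arrays
arrays = {
    "array_01": {"background": 55.0, "percent_present": 52.1},
    "array_02": {"background": 130.0, "percent_present": 50.0},  # background too high
    "array_03": {"background": 40.0, "percent_present": 38.5},   # too few present calls
}
kept = [
    name
    for name, metrics in arrays.items()
    if passes_array_qc(metrics["background"], metrics["percent_present"])
]
```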
  • RMA Robust Multi-Array average
  • GCOS Global Scaling
  • This algorithm was a summary method embedded in GeneChip Operating Software (GCOS) from Affymetrix, and fully described in the data analysis fundamentals manual.
  • the signal intensity of each probe was firstly corrected by the overall background.
  • the differences between perfect match (PM) and mismatch (MM) probes were examined by using background-adjusted intensities for each probe pair.
  • the significance of the differences between PM and MM probe sets was reflected by a p-value calculated by a one-sided Wilcoxon signed-rank test.
  • the final signal for a probe set was assigned as the one-step biweight estimate of the combined differences of all probe pairs belonging to one probe set.
  • the trimmed mean signal of each array was then scaled to the same Target Intensity (e.g. 250) by a global method to minimize technique-derived discrepancies.
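The global scaling step can be sketched as multiplying every signal on an array by a single factor so that the trimmed mean equals the target intensity. The 2% trim at each end is an assumption of this sketch, since the text does not give the trimming fraction, and the raw signal values are invented.

```python
def trimmed_mean(values, trim_fraction=0.02):
    """Mean after discarding trim_fraction of the values at each end (assumed 2%)."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)
    kept = ordered[k:len(ordered) - k] if k else ordered
    return sum(kept) / len(kept)


def scale_to_target(signals, target=250.0, trim_fraction=0.02):
    """Rescale one array so that its trimmed mean signal equals the target intensity."""
    factor = target / trimmed_mean(signals, trim_fraction)
    return [s * factor for s in signals]


raw = [100.0, 200.0, 300.0, 400.0, 500.0]  # hypothetical probe set signals
scaled = scale_to_target(raw)
```

Because each array receives one multiplicative factor, the relative order of probe set signals within an array is unchanged.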
  • Probe sets were involved in further analysis only if their expression levels deviated from the overall mean in at least one array by a minimum factor of 2.5, because the remaining data were unlikely to be informative. The result was that 43,160 probe sets were eliminated, and 11,515 probe sets remained for further analysis.
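One way to read the filtering rule is sketched below: a probe set is kept if, on at least one array, its expression differs from that probe set's mean across all arrays by a factor of 2.5 or more in either direction. Using the geometric mean across arrays as the "overall mean" is an assumption of this sketch, and the probe names and values are hypothetical.

```python
from math import exp, log


def is_informative(values, min_factor=2.5):
    """True if any array deviates from the geometric mean by at least min_factor.

    values must be positive expression intensities on the linear scale.
    """
    geo_mean = exp(sum(log(v) for v in values) / len(values))
    return any(
        v / geo_mean >= min_factor or geo_mean / v >= min_factor for v in values
    )


expression = {
    "probe_flat": [100.0, 110.0, 95.0, 105.0],
    "probe_variable": [50.0, 60.0, 55.0, 400.0],
}
selected = [probe for probe, values in expression.items() if is_informative(values)]
```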
  • Clustering was performed without taking into account any external information such as histology subtypes and tumor stages, with each of the selected 11,515 probe sets using the
  • the resulting 11,515 probe sets from the filtering step were the starting point for all supervised analyses which, for instance, correlated gene expression with clinical variables such as histological subtype.
  • Class comparison analysis was performed by using Significance Analysis of Microarrays (SAM), integrated in OmniViz version 5.1.
  • Class prediction analysis was performed with the use of Prediction Analysis of Microarrays (PAM) software, integrated in BRBArray version 3.6.
  • Clustering was performed using the Spotfire DecisionSite software (TIBCO, Palo Alto, CA). The samples were clustered with various signatures using the Weighted Pair-Group Method algorithm and similarity measured by Euclidean distance.
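The clustering described above can be illustrated with a small pure-Python sketch of agglomerative clustering. It assumes that the "Weighted Pair-Group Method" refers to WPGMA linkage, where the distance from a merged cluster to any other cluster is the simple average of the two previous distances. This is not the Spotfire implementation, and the sample names and profiles are invented.

```python
from math import dist  # Python 3.8+


def wpgma(profiles):
    """Agglomerative clustering with WPGMA linkage on Euclidean distances.

    profiles: dict mapping sample name -> list of expression values.
    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a tuple of sample names.
    """
    clusters = [(name,) for name in profiles]
    d = {}
    for i, a in enumerate(clusters):
        for b in clusters[i + 1:]:
            d[frozenset((a, b))] = dist(profiles[a[0]], profiles[b[0]])
    merges = []
    while len(clusters) > 1:
        pair = min(d, key=d.get)          # closest pair of clusters
        a, b = sorted(pair)
        merged = a + b
        merges.append((a, b, d[pair]))
        new_d = {}
        for c in clusters:
            if c in (a, b):
                continue
            # WPGMA update: plain average of the two old distances
            new_d[frozenset((merged, c))] = (
                d[frozenset((a, c))] + d[frozenset((b, c))]
            ) / 2
        for key, value in d.items():
            if a not in key and b not in key:
                new_d[key] = value
        clusters = [c for c in clusters if c not in (a, b)] + [merged]
        d = new_d
    return merges


profiles = {
    "s1": [1.0, 1.0],
    "s2": [1.0, 2.0],
    "s3": [8.0, 8.0],
    "s4": [8.0, 9.5],
}
merges = wpgma(profiles)
```

The merge history is what a dendrogram draws: each tuple records which two clusters were joined and at what distance.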
  • SAM discovered differentially expressed genes among different sample classes, e.g. between non-cancerous tissues and tumours (this Example) or among different histology subtypes (Example 2).
  • this algorithm calculated the difference in expression for each gene between classes relative to the variation expected in the mean difference.
  • the false discovery rate (FDR) was controlled by randomly permuting the classes of samples 100 times.
  • Signature probe sets for assigned classes were selected by a change factor of 2 and an FDR of less than 1 percent.
  • the class comparisons were performed with both RMA- and GCOS-processed data.
  • the common probe sets identified by both sets of data were selected as the final signatures.
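The selection and intersection steps can be sketched as follows. The sketch assumes that the change factor of 2 applies in both directions (at least two-fold up or down) and that each analysis yields a fold change and an FDR per probe set; the probe set names and numbers are hypothetical.

```python
def select_signature(results, min_fold=2.0, max_fdr=0.01):
    """Probe sets changed at least min_fold in either direction with FDR below max_fdr.

    results: dict mapping probe set -> (fold change, FDR).
    """
    return {
        probe
        for probe, (fold, fdr) in results.items()
        if (fold >= min_fold or fold <= 1.0 / min_fold) and fdr < max_fdr
    }


# Hypothetical class comparison results from the two preprocessing methods
rma = {"ps1": (3.1, 0.001), "ps2": (0.4, 0.005), "ps3": (2.2, 0.03)}
gcos = {"ps1": (2.8, 0.002), "ps2": (0.7, 0.004), "ps3": (2.5, 0.008)}

# Only probe sets selected with both RMA- and GCOS-processed data are kept
final_signature = select_signature(rma) & select_signature(gcos)
```

Requiring agreement between the two preprocessing methods trades some sensitivity for robustness against method-specific artefacts.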
  • the resultant signatures from Class Comparison were tested by the nearest shrunken centroids algorithm (PAM) to identify subgroups of genes that best characterized the predefined classes.
  • the prediction accuracy of optimized signatures was determined by performing "leave-one-out" cross validation within the training set, with one sample omitted each time and class label being predicted with other samples for the omitted sample.
  • the predictive models generated by the optimal subsets were subsequently applied to make predictions of classes for samples in the validation set, which were not involved in the corresponding class comparisons.
  • the prediction accuracy on validation samples was calculated by comparing predicted class labels with the histopathological diagnoses for those samples; samples without histopathological records were excluded from the calculations.
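Leave-one-out cross validation can be sketched with a plain nearest-centroid classifier. This is a simplification: the nearest shrunken centroids method used by PAM additionally shrinks each class centroid towards the overall centroid, which is omitted here, and the expression profiles and labels are invented.

```python
from math import dist  # Python 3.8+


def centroids(samples, labels):
    """Mean expression profile per class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(x, class_centroids):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(class_centroids, key=lambda y: dist(x, class_centroids[y]))


def leave_one_out_accuracy(samples, labels):
    """Omit each sample in turn, train on the rest, and predict the omitted one."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(samples[i], centroids(train_x, train_y)) == labels[i]:
            correct += 1
    return correct / len(samples)


samples = [[1.0, 1.2], [0.9, 1.0], [1.1, 0.8], [5.0, 5.2], [5.1, 4.9], [4.8, 5.0]]
labels = ["normal", "normal", "normal", "tumour", "tumour", "tumour"]
accuracy = leave_one_out_accuracy(samples, labels)
```

The same `centroids` and `predict` functions, trained on the full training set, would then be applied to an independent validation set.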
  • NSCLC are sub-classified by histology signature genes
  • NSCLC is a class of tumor with a high degree of heterogeneity
  • genes characterizing histological features were identified using strictly selected tumor samples. The samples used had consistent histological diagnoses from two independent pathologists, and displayed no apparent tissue heterogeneity.
  • the association between the prognosis profile and clinical parameters was studied.
  • the prognosis profile was significantly associated with age (p ⁇ 0.023), smoking years (p ⁇ 0.014), gender (p ⁇ 0.012) and Forced Expiratory Volume 1 (p ⁇ 0.009), a parameter reflecting lung function, but not with tumor stage, tumor cell content, tumor histology and tumor size (Table 7).
  • Table 8 shows the Wald statistics and significance for each variable tested.
  • Tumor stage and the 17 probe set prognostic predictor were significantly related to the hazard of death.
  • the prognostic predictor presented the highest importance which was 21.682 compared to 3.797 from tumor stage.
  • the relative hazard ratio predicted by the prognostic predictor was 2.465 (95% confidence interval, 1.686 to 3.604, p ⁇ 1.5E-06), the highest one among all tested risks (Table 6).
  • the inclusion of the prognostic predictor to the predictive model resulted in a change in model performance of 19.5, in terms of -2 log likelihood, with a p-value of 9.8E-06, compared to 24.3 and 2.0E-03 introduced by the model comprising all clinical variables.
  • the multivariate proportional hazard analysis shows that the gene expression profile-derived prognostic predictor of 17 probe sets is the strongest predictor of the likelihood of death (Table 8).
  • T:N ratio: ratio of average expression in NSCLC samples to average expression in normal lung tissue.
  • T mean: 2log transformation of mean expression value in NSCLC samples (average of all NSCLC and normal lung tissue 0).
  • N mean: 2log transformation of mean expression value in normal lung tissue samples (average of all NSCLC and normal lung tissue 0).
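The hazard ratio, confidence interval and Wald statistic quoted above for the 17 probe set prognostic predictor can be cross-checked against each other. The following is a minimal sketch, assuming the interval is a standard Wald interval that is symmetric on the log scale and uses a critical value of 1.96; the function name is hypothetical and is not part of the described analysis.

```python
import math

def wald_from_hazard_ratio(hr, ci_low, ci_high, z_crit=1.96):
    """Recover coefficient, standard error and Wald statistic from a
    hazard ratio and its confidence interval (assumed symmetric in log space)."""
    beta = math.log(hr)  # proportional hazards coefficient = ln(hazard ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    wald = (beta / se) ** 2  # Wald chi-square with 1 degree of freedom
    return beta, se, wald

# Values quoted above: hazard ratio 2.465, 95% CI 1.686 to 3.604.
beta, se, wald = wald_from_hazard_ratio(2.465, 1.686, 3.604)
print(round(beta, 3), round(se, 3), round(wald, 1))
```

Under these assumptions the recovered Wald statistic comes out at about 21.7, which agrees with the reported importance of 21.682 to within rounding of the published interval.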

Abstract

The present invention provides gene signatures, which are useful for characterising NSCLC tumours, and can distinguish cancerous from normal tissue, are useful in classifying NSCLC, and can be used to provide a prognosis for NSCLC patients.

Description

TUMOUR GENE PROFILE
The present invention relates to the diagnosis, categorisation and prognosis of and for lung cancers, in particular non-small cell lung carcinomas (NSCLC). There are provided sets of genes which have been found to be differentially expressed in lung cancers in a statistically significant manner, as well as uses of the gene sets in the diagnosis, categorisation and prognosis of cancers.
Lung cancer is the most frequent cause of cancer deaths in Europe. There were 386,300 new lung cancer cases in 2006, with an estimated 334,800 deaths. This accounts for 13.5% of all cancer deaths [1].
Based on histo-pathological presentation, lung cancer is in clinical practice sub-divided into four major histological subtypes: small cell lung cancer (SCLC), squamous cell carcinoma (SCC), adenocarcinoma (ADC), and large-cell carcinoma (LCC). The latter three, collectively referred to as non-small cell lung carcinomas (NSCLC), account for almost 80% of lung cancers [2]. At present, the choice of treatment in NSCLC is made exclusively based on histo-pathological features and staging. However, according to the recent studies in NSCLC, this subdivision seems to be insufficient, and is not a useful predictor for treatment regimes, especially targeted therapies. It is known that patients with the same pathological classification and/or at the same stage show dramatically different responses to the same therapy.
Common features at the molecular level may be able to explain such outcome discrepancies among patients more reliably. For instance, the efficacy of epidermal growth factor receptor (EGFR) antagonists has been shown to depend on the expression of the target, EGFR, in the tumour [27]. Furthermore, the beneficial effect of chemotherapies might depend on the expression of certain proteins, for example thymidylate synthase for Pemetrexed [28]. Therefore the refined classification of subtypes of NSCLC is becoming of more and more interest. In the same way, the staging of a cancer depends solely on the tumor extension at the moment of diagnosis and does not incorporate the biological behaviour of the tumor. It is therefore known that although staging can, to some degree, be a prognostic indicator, it is still far from precise. The increase in the predictive value of staging over the past decades is mostly due to the increase in the sensitivity of imaging techniques and not related to a better understanding of the biology of a tumor.
Recent advances in microarray technology enable researchers to recapitulate morphological characteristics of NSCLC at the gene level [3-7]. Some studies have suggested that additional subclasses of NSCLC, which may present clinical outcomes different from other subclasses, can be identified by gene expression profiles [8, 9]. However, these sub-classifications vary considerably from study to study, and most of them are limited to a histologically preselected subtype (ADC).
The reproducibility of gene signatures to predict a high risk of relapse or recurrence on new cases is rarely reported. Therefore, it is necessary to identify relevant gene signatures applicable to all histological subtypes, or even beyond common histological subclasses. Similarly it remains a challenge to identify molecular classifiers that can reliably predict specific subgroups of high- and low-risk patients, which should be helpful to select the most appropriate therapy for individual patients.
In the study set out herein, we performed gene expression profiling on early stage NSCLC and simultaneously collected normal lung tissue in order to determine classifier genes and high-risk index genes. As a result, we are able to define medically relevant gene signatures for NSCLC which provide reproducible results in the screening of lung cancer.
SUMMARY OF THE INVENTION
In a first aspect of the present invention, there is provided a method for classifying a test tissue sample as a malignant non-small cell lung carcinoma (NSCLC) by analysis of gene expression, comprising the steps of: (a) assaying the expression levels of 5 or more genes selected from the genes set forth in Table 1; (b) comparing the expression levels of 5 or more genes with the expression levels of said 5 or more genes in a known non-cancerous tissue sample; wherein a change in the expression levels of said 5 or more genes indicates that the test tissue sample is a malignant NSCLC sample. Preferably, the 5 or more genes comprise, consist of or consist essentially of the genes set forth in Table 2.
In an advantageous embodiment, the expression of the 5 or more genes is analysed with a two-dimensional hierarchical clustering of the expression levels of 5 or more genes as set forth in Table 1; wherein a correlation between the expression levels of said 5 or more genes and the pattern of gene expression levels observed in the two-dimensional clustering indicates that the test tissue sample is a malignant NSCLC sample.
We have identified 187 genes which form a signature for a malignant phenotype in NSCLC. A subset of 5 of these genes has been found to form a characteristic signature which can be used to distinguish cancerous from non-cancerous tissues with an accuracy of 93%. Preferably, the two-dimensional hierarchical clustering is performed using OmniViz (BioWisdom) software, for example as set forth below.
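By way of illustration only, two-dimensional hierarchical clustering groups both the genes (rows) and the samples (columns) of an expression matrix. The sketch below is not the OmniViz procedure; it assumes average linkage with a distance of 1 minus the Pearson correlation, and the toy matrix and all function names are hypothetical.

```python
import math

def pearson_distance(a, b):
    """Distance between two profiles: 1 - Pearson correlation."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 1.0  # a flat profile carries no correlation information
    return 1.0 - cov / math.sqrt(var_a * var_b)

def average_linkage(profiles, n_clusters):
    """Agglomerative clustering; returns clusters as sorted lists of indices."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(pearson_distance(profiles[p], profiles[q])
                            for p in clusters[i] for q in clusters[j])
                dist = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return [sorted(c) for c in clusters]

def cluster_two_way(matrix, n_gene_clusters, n_sample_clusters):
    """Cluster rows (genes) and columns (samples) of the same matrix."""
    columns = [list(col) for col in zip(*matrix)]
    return (average_linkage(matrix, n_gene_clusters),
            average_linkage(columns, n_sample_clusters))

# Toy log-ratio matrix: 4 hypothetical genes x 4 samples (2 tumour, 2 normal).
toy = [
    [2.1, 1.8, -1.9, -2.0],
    [1.7, 2.2, -2.1, -1.6],
    [-1.5, -1.9, 1.8, 2.0],
    [-2.0, -1.4, 2.2, 1.7],
]
gene_clusters, sample_clusters = cluster_two_way(toy, 2, 2)
print(gene_clusters, sample_clusters)
```

In this toy input the first two samples and the last two samples fall into separate clusters, which is the kind of separation a tumour signature is expected to produce between tumour and normal tissue.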
In a second aspect, there is provided a method for classifying a test tissue sample of a malignant non-small cell lung carcinoma (NSCLC) into LCC, ADC or SCC subtypes by analysis of gene expression, comprising the steps of: (a) assaying the expression levels of 75 or more genes selected from the genes set forth in Table 3; (b) comparing the expression levels of said 75 or more genes with the expression levels of 75 or more genes from NSCLC tumour samples as set forth in Table 3; wherein a correlation between the expression levels of said 75 or more genes and the pattern of gene expression levels observed in Table 3 indicates that the test tissue sample is a malignant LCC, ADC or SCC subtype NSCLC sample.
Advantageously, gene expression is analysed by two-dimensional hierarchical clustering, which provides a graphical representation of comparative gene expression and facilitates classification of NSCLC samples into the relevant subtypes.
Preferably, the 75 or more genes comprise, consist of or consist essentially of the 75 genes set forth in Table 4. The optimised signature set forth in this table allows classification of NSCLC into LCC, ADC or SCC subtypes by genetic analysis using a reduced population of probes.
In a third aspect, there is provided a method for predicting the survival time of a patient suffering from a non-small cell lung carcinoma (NSCLC) by analysis of gene expression, comprising the steps of: (a) assaying the expression levels of 17 genes selected from the genes set forth in Table 5; (b) either (i) comparing the expression levels of said 17 genes with a two-dimensional hierarchical clustering of the expression levels of 17 genes from NSCLC tumour samples as set forth in Table 5; or (ii) fitting the expression levels of said 17 genes to a survival model to derive a prognostic index.
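Option (ii) can be illustrated with a short sketch. It assumes that the survival model is a proportional hazards model, so that the prognostic index is the weighted sum of expression values; the gene names, coefficients and cut-off below are hypothetical placeholders and are not values from Table 5.

```python
def prognostic_index(expression, coefficients):
    """Linear predictor of a proportional hazards model: sum of beta_g * x_g."""
    missing = set(coefficients) - set(expression)
    if missing:
        raise KeyError(f"no expression value for: {sorted(missing)}")
    return sum(beta * expression[gene] for gene, beta in coefficients.items())

def risk_group(index, cutoff=0.0):
    """Assign a risk group; the cut-off of 0.0 is an assumed placeholder."""
    return "high risk" if index > cutoff else "low risk"

# Hypothetical coefficients and normalised expression values for one patient.
coefficients = {"GENE_A": 0.8, "GENE_B": -0.5, "GENE_C": 0.3}
patient = {"GENE_A": 1.5, "GENE_B": 0.4, "GENE_C": -1.0}

index = prognostic_index(patient, coefficients)
group = risk_group(index)
print(round(index, 2), group)
```

A positive coefficient marks a gene whose higher expression increases the index, and therefore the estimated hazard; a negative coefficient has the opposite effect.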
Preferably, the methods according to the invention are in vitro methods.
Methods according to the present invention can be combined to provide a rapid and universal system for the diagnosis of NSCLC. Figure 1 shows a flowchart of the combinations of diagnostic methodologies, leading to an assessment of clinical risk for each patient. Accordingly, the invention provides a method for assessing clinical risk in a patient suffering from or suspected of suffering from NSCLC, comprising the steps of:
assaying the expression of genes in accordance with the tumour signature set forth in the first aspect of the invention, and determining if the patient is suffering from NSCLC;
predicting the survival time of the patient according to the third aspect of the invention;
classifying the NSCLC according to the second aspect of the invention; and
combining the results of the two previous steps to provide an indication for treatment of the patient on the basis of tumour type and severity.
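The combined method above can be sketched as a simple pipeline. This is only an outline under stated assumptions: the three classifiers are passed in as functions, and the stand-in classifiers, marker names and thresholds shown here are hypothetical stubs rather than the signatures of Tables 2, 4 and 5.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional

Expression = Mapping[str, float]

@dataclass
class Assessment:
    is_nsclc: bool
    risk: Optional[str] = None
    subtype: Optional[str] = None

def assess_patient(
    expression: Expression,
    detect_tumour: Callable[[Expression], bool],
    predict_risk: Callable[[Expression], str],
    classify_subtype: Callable[[Expression], str],
) -> Assessment:
    """Run tumour detection first; only tumour samples are scored further."""
    if not detect_tumour(expression):
        return Assessment(is_nsclc=False)
    risk = predict_risk(expression)         # third aspect: survival prognosis
    subtype = classify_subtype(expression)  # second aspect: LCC, ADC or SCC
    return Assessment(is_nsclc=True, risk=risk, subtype=subtype)

# Hypothetical stand-in classifiers, for demonstration only.
def toy_detect(expr: Expression) -> bool:
    return expr["MARKER_T"] > 1.0

def toy_risk(expr: Expression) -> str:
    return "high" if expr["MARKER_R"] > 0 else "low"

def toy_subtype(expr: Expression) -> str:
    return "SCC" if expr["MARKER_S"] > 0 else "ADC"

sample = {"MARKER_T": 2.3, "MARKER_R": -0.4, "MARKER_S": 1.1}
result = assess_patient(sample, toy_detect, toy_risk, toy_subtype)
print(result)
```

Keeping the three classifiers as separate arguments mirrors the three aspects of the invention, so that each signature can be replaced or validated independently.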
The invention moreover provides diagnostic kits for assessing the presence, subtype or severity of NSCLC. Such kits comprise reagents for measuring the presence of mRNA or polypeptides encoded by the genes identified herein.
In one embodiment, such kits may contain instructions as to use. In particular, the kits may contain instructions as to the selection of genes to be screened in the diagnosis of NSCLC as set forth herein. Preferably, the genes are 5 or more of the genes set forth in Table 1, the 5 genes set forth in Table 2, 75 or more of the genes set forth in Table 3, the 75 genes set forth in Table 4 or the 17 genes set forth in Table 5. Moreover, the kit may contain instructions for the detection of the gene products expressed from said mRNA species. In general, it will be appreciated that any method for recognising the levels of expression of a gene may be used in the context of the present invention. The genes identified in each gene signature, and the changes in expression levels associated therewith, are identified in the Tables set out below; analysis can be made manually, or using automated means, to compare the expression levels observed in a test sample to those observed in a reference sample.
Accordingly, kits in accordance with the invention may comprise any reagents suitable for measuring gene expression levels. Such reagents comprise reagents for measuring levels of mRNA, or cDNA derived from mRNA, and/or reagents suitable for measuring levels of polypeptide gene products. For example, therefore, a kit may comprise nucleic acid probes which hybridise specifically to mRNA or cDNA specific for the appropriate gene signature, under appropriate conditions. The probes may be immobilised onto a solid surface, such as glass slides, membranes of various types, columns or beads, and may be in the form of an addressable array. If the probes are on an array, the identity of each probe is advantageously known as a result of the spatial arrangement on the array itself.
Probes may be used in solution, to probe nucleic acids derived from the sample. Moreover, labelling means may be provided, to label either the probes or the sample nucleic acids.
Primers may also be provided, to prime extension reactions for amplification and/or labelling of sample nucleic acids. The primers are specific for mRNA transcribed from the genes identified in the gene signatures set forth herein, or corresponding cDNA.
The kits may alternatively, or in addition, comprise reagents such as immunoglobulins, RNA or peptide aptamers and the like which are capable of specifically detecting the polypeptide gene products of the target genes.
In particular, the present invention provides a diagnostic kit for use in characterising NSCLC tumours, comprising a set of reagents for specifically measuring the abundance of the mRNA species transcribed from 5 or more of the genes set forth in Table 1, the 5 genes set forth in Table 2, 75 or more of the genes set forth in Table 3, the 75 genes set forth in Table 4 or the 17 genes set forth in Table 5; or the gene products expressed from said mRNA species. Preferably, the reagents comprise a set of oligonucleotide primers or probes which hybridise specifically to said genes, which may advantageously be attached to a solid phase in the form of an array.
Preferably, the array consists of a library of oligonucleotides affixed to a solid phase, and said library of oligonucleotides consists substantially of oligonucleotides which are specific for 5 or more of the genes set forth in Table 1, the 5 genes set forth in Table 2, 75 or more of the genes set forth in Table 3, the 75 genes set forth in Table 4 or the 17 genes set forth in Table 5.
Alternatively, or in addition, the reagents are selected from immunoglobulin molecules, RNA aptamers and peptide aptamers.
Preferably, the kit is for use in detecting the presence of NSCLC tumour tissue, comprising a set of nucleic acid probes or primers which recognise the transcripts of the genes set forth in Table 2.
In a further example, the kit is for use in differentiating between LCC, ADC or SCC subtypes of NSCLC, comprising a set of nucleic acid probes or primers which recognises the transcripts of the genes set forth in Table 4.
In another embodiment, the kit is for use in estimating the prognosis for survival of a patient suffering from NSCLC, comprising a set of nucleic acid probes or primers which recognises the transcripts of the genes set forth in Table 5.
The kits may further include labelling means.
It will be understood that immunoglobulins, RNA or peptide aptamers may be substituted for, or may supplement, the nucleic acid reagents in kits according to the invention.
Brief Description of the Figures
Figure 1 is a table representing the use of assays according to the invention in the assessment of patients diagnosed with or suspected of suffering from NSCLC.
Figure 2 shows Kaplan-Meier plots for survival of NSCLC patients, separating the patients with good and poor prognoses as assessed using the gene signature set forth in Table 5. Light grey bars indicate the end of the follow-up.
Figure 3 shows survival prediction by published prognostic signatures. Kaplan-Meier curves for the best performing signatures (by P-value) are shown for 82 Erasmus MC patients (left) and 89 Duke University NSCLC patients (right), fitted by their risk assignments. Grey bars indicate patients at last follow-up, still alive. P-values are between brackets if overall survival of the low risk group is actually lower than that of the high risk group.
Detailed Description of the Invention
Standard techniques are used for molecular, genetic and biochemical methods. See, generally, Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d ed. (1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N. Y. and Ausubel et al., Short Protocols in Molecular Biology (1999) 4th Ed, John Wiley & Sons, Inc.; as well as Guthrie et al., Guide to Yeast Genetics and Molecular Biology, Methods in Enzymology, Vol. 194, Academic Press, Inc., (1991), PCR Protocols: A Guide to Methods and Applications (Innis, et al. 1990. Academic Press, San Diego, Calif.), McPherson et al., PCR Volume 1, Oxford University Press, (1991), Culture of Animal Cells: A Manual of Basic Technique, 2nd Ed. (R. I. Freshney. 1987. Liss, Inc. New York, N. Y.), and Gene Transfer and Expression Protocols, pp. 109-128, ed. E. J. Murray, The Humana Press Inc., Clifton, N. J. These documents are incorporated herein by reference.
In the context of the present invention as described herein, NSCLC is "classified" by being identified as belonging to the SCC, LCC or ADC subtype, according to the terminology normally used in the art. In addition, the present invention provides a further level of classification of NSCLC, as set forth below and in Table 5.
NSCLC can also be assessed as to severity by means of the present invention. This means that a prognosis for the patient's survival can be derived by the methods provided herein. Survival prognosis involves providing an estimate of the likelihood that a patient will survive for a given time period, for example 5 years. The expression levels of genes are assayed in accordance with the present invention by measuring the levels of either nucleic acids or proteins encoded by the gene which are present in a sample. Expression levels are considered herein to be the amounts of mRNA or polypeptide which are present in a sample; they may be influenced, therefore, by for instance modulations in levels of transcription, translation, mRNA or protein turnover.
As used herein, "measuring" and "assaying", as well as "polypeptide" and "protein", are intended to be interchangeable and equivalent in meaning. Genes whose expression levels are described herein as being useful for identifying, classifying or measuring the severity of NSCLC are referred to as "target" genes; groups of target genes form gene signatures, which can be used to identify, classify or measure the severity of NSCLC.
"Nucleic acids" are nucleic acids as is commonly understood in the art, and include DNA, RNA and artificial nucleic acids. In the context of the present invention, the levels of naturally-occurring nucleic acids will generally be measured using techniques known to those skilled in the art. Probes, primers and other nucleic acid molecules used in the present invention may comprise synthetic nucleotides or other modifications, as is known in the art.
"Reagents" for measuring gene expression levels include nucleic acids and ligands, such as antibodies, which are capable of detecting the RNA or polypeptide products of the target genes described herein. Reagents may be selective, in that they bind to or detect only the RNA or polypeptide products of the target genes, or non-selective, capable of binding to or detecting a wider population of genes, with the selectivity being introduced in a later stage of the assay.
In general, assays can be conducted on arrays that comprise many genes in addition to the target genes, and the detection of changes in the expression levels of the target genes will be achieved by selective analysis of the arrays. For example, the Affymetrix Gene chip analyser is capable of identifying binding to probes on gene chip arrays, thereby simultaneously measuring the degree of hybridisation to the probe sets representing genes on the array and identifying the probes to which hybridisation has occurred.
Alternatively, specific primers may be used to selectively detect the RNA gene products of target genes. As used herein, a "primer" is an oligonucleotide, whether produced naturally as in a purified restriction digest or produced synthetically, that is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product that is complementary to a nucleic acid strand is induced, (i.e., in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH). The primer is preferably single stranded for maximum efficiency in the initiation of the reaction, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products. Preferably, the primer is an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method.
As used herein, the term "probe" refers to an oligonucleotide (i.e., a sequence of nucleotides), whether occurring naturally as in a purified restriction digest or produced synthetically, recombinantly or by PCR amplification, that is capable of hybridizing to at least a portion of another oligonucleotide of interest. A probe may be single-stranded or double-stranded. Probes are useful in the detection, identification and isolation of particular gene sequences. It is contemplated that any probe used in the present invention can be labelled with a reporter molecule so that it is detectable in any detection system, including, but not limited to, enzyme (e.g., ELISA, as well as enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems. It is not intended that the present invention be limited to any particular detection system or label.
As used herein, the term "sample" is used to denote biological samples which may be obtained from animals (including humans) and encompass fluids, solids, tissues, and gases. Biological samples include sputum and blood products, such as plasma, serum and the like. In the context of the present invention, a sample is ordinarily a tissue sample obtained from normal tissue or an NSCLC.
"Comparing", as used herein, includes comparison of expression levels of target genes directly with a control, as well as comparison with profiles, as described further herein. In comparisons according to the present invention, a match is sought between the pattern of gene expression seen in a sample and that seen in a control or in a predefined profile.
The term "isolated", when used in relation to a nucleic acid, refers to a nucleic acid sequence that is identified and separated from at least one component or contaminant with which it is ordinarily associated in its natural source. Isolated nucleic acid is present in a form or setting that is different from that in which it is found in nature. Similarly, isolated polypeptides are polypeptides or proteins separated from at least one component or contaminant with which they are ordinarily associated in their natural source.
Tumour Samples
The sample used for analysis comprises a tissue sample, which includes tumour tissue, and in particular human lung cancer tumour tissue. Typically, such tissue is, but is not limited to, epithelial tissue and connective tissue; other tissue types may be used as and if they occur in a lung tumour. Typically, however, NSCLC are comprised of epithelial tissue.
Samples are obtained from surgically resected lungs, or may be obtained from patients by standard biopsy techniques. Advantageously, microdissection is used to ensure that the cell types subjected to analysis are the intended cell type.
Normal samples can be obtained from the same patient, adjacent to the tumour, or from patients not suffering from cancer. Typically, normal samples will be of the same tissue type (i.e. epithelial tissue, connective tissue) as the tumour sample.
The normal sample is used to establish reference expression profiles to distinguish normal from cancerous tissues, for example as in Table 1. If an analysis model is defined, for example using two-dimensional hierarchical clustering, it is only necessary to analyse a tumor sample from a patient rather than both a tumor sample and a normal sample from the same or different patients.
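As an illustration of how a normal reference profile can be used, the sketch below computes a log2 tumour-to-normal (T:N) ratio per gene and keeps genes that change by at least a factor of 2 in either direction. It assumes positive, linear-scale expression values; the gene names, values and function names are hypothetical.

```python
import math
from statistics import mean

def log2_tn_ratio(tumour_values, normal_values):
    """log2 of mean tumour expression over mean normal expression."""
    return math.log2(mean(tumour_values) / mean(normal_values))

def changed_genes(tumour, normal, min_fold=2.0):
    """Genes whose mean expression differs from normal by at least min_fold."""
    threshold = math.log2(min_fold)
    selected = {}
    for gene, values in tumour.items():
        ratio = log2_tn_ratio(values, normal[gene])
        if abs(ratio) >= threshold:
            selected[gene] = ratio
    return selected

# Hypothetical intensities for three genes in two tumour and two normal samples.
tumour = {"GENE_X": [400, 440], "GENE_Y": [100, 120], "GENE_Z": [50, 50]}
normal = {"GENE_X": [100, 110], "GENE_Y": [110, 100], "GENE_Z": [200, 200]}

signature = changed_genes(tumour, normal)
print(signature)
```

Positive values indicate genes that are more highly expressed in the tumour samples, and negative values indicate genes that are more highly expressed in normal tissue.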
Nucleic acid measurement
In the method according to the present invention, the levels of mRNAs present in a sample which are encoded by the genes identified in Tables 1-5 may be measured directly. Analysis is conveniently carried out by labelling the RNA in cells from the sample and assaying the abundance of the desired mRNA species. To prepare RNA from tumour and/or normal samples, total or poly(A)+ RNA is processed according to any suitable technique, for example as set forth below, to produce cDNA and subsequently cRNA, which is conveniently used in microarray analysis.
Copies of the cRNA or cDNA may be amplified, for example by RT-PCR. Fluorescent tags or digoxigenin-dUTP can then be enzymatically incorporated into the newly synthesized cDNA/cRNA or can be chemically attached to the new strands of DNA or RNA.
Preferably the assessment of expression is performed by gene expression profiling using oligonucleotide-based arrays or cDNA-based arrays of any type; RT-PCR (reverse transcription-Polymerase Chain Reaction), real-time PCR, in-situ hybridisation, Northern blotting, serial analysis of gene expression (SAGE) for example as described by Velculescu et al Science 270 (5235): 484-487, or differential display. Details of these and other methods can be found for example in Sambrook et al, 1989, Molecular Cloning: A Laboratory Manual. Preferably the assessment uses a microarray assay.
Arrays
Microarrays (or arrays) can be constructed by a number of available technologies. Array technology and the various techniques and applications associated with it are described generally in numerous textbooks and documents. Gene array technology is particularly suited to the practice of the present invention. Methods for preparing microarrays are well known in the art. These include Lemieux et al., (1998), Molecular Breeding 4, 277-289; Schena and Davis, Parallel Analysis with Biological Chips, in PCR Methods Manual (eds. M. Innis, D. Gelfand, J. Sninsky); Schena and Davis, (1999), Genes, Genomes and Chips, in DNA Microarrays: A Practical Approach (ed. M. Schena), Oxford University Press, Oxford, UK, 1999; The Chipping Forecast (Nature Genetics special issue; January 1999 Supplement); Mark Schena (Ed.), Microarray Biochip Technology (Eaton Publishing Company); Cortese, 2000, The Scientist 14 [17]: 25; Gwynne and Page, Microarray analysis: the next revolution in molecular biology, Science, 1999 August 6; and Eakins and Chu, 1999, Trends in Biotechnology, 17, 217-218.
The technology is described in PCT/US01/10063 and US 2002 090979 and references therein. Commercial suppliers include Affymetrix (California) and Clontech Laboratories (California). Alternatives to solid phase arrays include addressable microbead technologies such as VeraBead from Illumina (California).
Major applications for array technology include the identification of sequence (nucleotide sequence/nucleotide sequence mutation) and the determination of expression level (abundance) of nucleotide sequences. Gene expression profiling may make use of array technology, optionally in combination with proteomics techniques (Celis et al., 2000, FEBS Lett, 480 (1): 2-16; Lockhart and Winzeler, 2000, Nature 405 (6788): 827-836; Khan et al., 1999, 20 (2): 223-9).
In general, any library may be arranged in an orderly manner into an array, by spatially separating the members of the library. Examples of suitable libraries for arraying include nucleic acid libraries (including DNA, RNA, oligonucleotide and other nucleic acid libraries), peptide, polypeptide and protein libraries, as well as libraries comprising other types of molecules, such as ligand libraries.
Accordingly, where reference is made to a "library" such reference includes reference to a library in the form of an array.
The members of a library are generally fixed or immobilised onto a solid phase, preferably a solid substrate, to limit diffusion and admixing of the samples. In particular, the libraries may be immobilised to a substantially planar solid phase, including membranes and non-porous substrates such as plastic and glass.
Furthermore, the samples are preferably arranged in such a way that indexing (i.e. reference or access to a particular sample) is facilitated. Typically the samples are applied as spots in a grid formation. Common assay systems may be adapted for this purpose. For example, an array may be immobilised on the surface of a microplate, either with multiple samples in a well, or with a single sample in each well.
Furthermore, the solid substrate may be a membrane, such as a nitrocellulose or nylon membrane (for example, membranes used in blotting experiments). Alternative substrates include glass, or silica based substrates. Thus, the samples are immobilised by any suitable method known in the art, for example, by charge interactions, or by chemical coupling to the walls or bottom of the wells, or the surface of the membrane. Other means of arranging and fixing may be used, for example, pipetting, drop-touch, piezoelectric means, ink-jet and bubblejet technology, electrostatic application, etc. In the case of silicon-based chips, photolithography may be utilised to arrange and fix the samples on the chip.
The samples may be arranged by being "spotted" onto the solid substrate; this may be done by hand or by making use of robotics to deposit the sample. In general, arrays may be described as macroarrays or microarrays, the difference being the size of the sample spots. Macroarrays typically contain sample spot sizes of about 300 microns or larger and may be easily imaged by existing gel and blot scanners. The sample spot sizes in microarrays are typically less than 200 microns in diameter and these arrays usually contain thousands of spots.
Thus, microarrays may require specialised robotics and imaging equipment, which may need to be custom made. Instrumentation is described generally in a review by Cortese, 2000, The Scientist 14 [11]: 26.
Techniques for producing immobilised libraries of DNA molecules have been described in the art. Generally, most prior art methods describe how to prepare single-stranded nucleic acid molecule libraries, using for example masking techniques to build up various permutations of sequences at the various discrete positions on the solid substrate. US 5,837,832 describes an improved method for producing DNA arrays immobilised to silicon substrates based on very large scale integration technology. In particular, US 5,837,832 describes a strategy called "tiling" to prepare specific sets of probes at spatially-defined locations on a substrate which may be used to produce the immobilised DNA libraries of the present invention. US 5,837,832 also provides references for earlier techniques that may also be used.
To aid detection, targets and probes may be labelled with any readily detectable reporter such as a fluorescent, bioluminescent, phosphorescent or radioactive reporter.
Labelling of probes and targets is disclosed in Shalon et al., 1996, Genome Res 6 (7): 639-45.
The materials for use in the methods of the present invention are ideally suited for preparation of kits. A set of instructions will typically be included. The invention provides kits which comprise microarrays which are specific for a desired set of genes. For example, microarrays according to the invention may consist of a solid phase and, immobilised thereto, a library of nucleic acid oligonucleotides or probes which consists substantially of one or more of the gene signatures identified herein, and listed in the Tables, especially Tables 2, 4, and 5. Such specialised microarrays are less expensive to produce than general purpose microarrays, and less difficult and expensive to analyse. In a further embodiment, the arrays according to the invention may comprise a library of oligonucleotides which is larger than, though still comprising, one or more of the gene signatures described herein, but still smaller than the set consisting of all known genes. For instance, such arrays may comprise gene signatures which are useful for detecting other forms of cancer, or other types of NSCLC, or which may provide different insights into the prognosis for NSCLC patients, or the like.
Amplification and sequencing
Nucleic acid signatures in accordance with the invention may be detected by nucleic acid analysis which relies on amplification and/or sequencing of sample nucleic acids. Since the invention aims to measure gene expression, the methods used must quantitatively measure transcribed nucleic acid levels. The measured nucleic acids must therefore be mRNA, or nucleic acids derived quantitatively from mRNA such as cDNA.
Generation of nucleic acids for analysis from samples generally, but not universally, requires nucleic acid amplification. Many amplification methods rely on an enzymatic chain reaction (such as a polymerase chain reaction, a ligase chain reaction, or a self-sustained sequence replication), a linear amplification procedure, or on the replication of all or part of the vector into which the desired sequence has been cloned. Preferably, the amplification according to the invention is an exponential amplification, as exhibited by for example the polymerase chain reaction.
Many target and signal amplification methods have been described in the literature. See, for example, general reviews of these methods in Landegren, U., et al., Science 242:229-237 (1988) and Lewis, R., Genetic Engineering News 10:1, 54-55 (1990). These amplification methods can be used in the methods of the present invention, and include polymerase chain reaction (PCR), PCR in situ, ligase amplification reaction (LAR), ligase hybridisation, Qbeta bacteriophage replicase, transcription-based amplification system (TAS), genomic amplification with transcript sequencing (GAWTS), nucleic acid sequence-based amplification (NASBA) and in situ hybridisation. Primers suitable for use in various amplification techniques can be prepared according to methods known in the art.
Polymerase Chain Reaction (PCR)
PCR is a nucleic acid amplification method described inter alia in U.S. Pat. Nos. 4,683,195 and 4,683,202. PCR consists of repeated cycles of DNA polymerase generated primer extension reactions. The target DNA is heat denatured and two oligonucleotides, which bracket the target sequence on opposite strands of the DNA to be amplified, are hybridised. These oligonucleotides become primers for use with DNA polymerase. The DNA is copied by primer extension to make a second copy of both strands. By repeating the cycle of heat denaturation, primer hybridisation and extension, the target DNA can be amplified a million fold or more in about two to four hours. PCR is a molecular biology tool, which must be used in conjunction with a detection technique to determine the results of amplification. An advantage of PCR is that it increases sensitivity by amplifying the amount of target DNA by 1 million to 1 billion fold in approximately 4 hours. PCR can be used to amplify any known nucleic acid in a diagnostic context (Mok et al, (1994), Gynaecologic Oncology, 52: 247-252).
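The fold figures quoted above can be related to cycle number with simple arithmetic. The following is a minimal sketch, assuming that each cycle multiplies the amount of target by (1 + efficiency), with an efficiency of 1.0 representing perfect doubling. The function names are illustrative and are not part of the disclosure; real reactions fall short of this ideal, particularly in later cycles.

```python
import math


def fold_amplification(cycles, efficiency=1.0):
    """Theoretical fold increase after `cycles` rounds of amplification."""
    return (1.0 + efficiency) ** cycles


def cycles_needed(fold, efficiency=1.0):
    """Smallest whole number of cycles giving at least `fold` amplification."""
    return math.ceil(math.log(fold) / math.log(1.0 + efficiency))


print(cycles_needed(1e6))  # 20 cycles for a million fold
print(cycles_needed(1e9))  # 30 cycles for a billion fold
```

Under the idealised doubling assumption, a million-fold to billion-fold amplification therefore corresponds to roughly 20 to 30 cycles.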
Self-Sustained Sequence Replication (3SR)
Self-sustained sequence replication (3SR) is a variation of TAS, which involves the isothermal amplification of a nucleic acid template via sequential rounds of reverse transcriptase (RT), polymerase and nuclease activities that are mediated by an enzyme cocktail and appropriate oligonucleotide primers (Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87:1874). Enzymatic degradation of the RNA of the RNA/DNA heteroduplex is used instead of heat denaturation. RNase H and all other enzymes are added to the reaction and all steps occur at the same temperature and without further reagent additions.
Following this process, amplifications of 10¹⁰ have been achieved in one hour at 42°C.
Ligation Amplification (LAR/LAS)
Ligation amplification reaction or ligation amplification system uses DNA ligase and four oligonucleotides, two per target strand. This technique is described by Wu, D. Y. and Wallace, R. B. (1989) Genomics 4:560. The oligonucleotides hybridise to adjacent sequences on the target DNA and are joined by the ligase. The reaction is heat denatured and the cycle repeated.
Qβ Replicase
In this technique, RNA replicase for the bacteriophage Qβ, which replicates single-stranded RNA, is used to amplify the target DNA, as described by Lizardi et al. (1988) Bio/Technology 6:1197. First, the target DNA is hybridised to a primer including a T7 promoter and a Qβ 5' sequence region. Using this primer, reverse transcriptase generates a cDNA connecting the primer to its 5' end in the process. These two steps are similar to the TAS protocol. The resulting heteroduplex is heat denatured. Next, a second primer containing a Qβ 3' sequence region is used to initiate a second round of cDNA synthesis. This results in a double stranded DNA containing both 5' and 3' ends of the Qβ bacteriophage as well as an active T7 RNA polymerase binding site. T7 RNA polymerase then transcribes the double-stranded DNA into new RNA, which mimics the Qβ. After extensive washing to remove any unhybridised probe, the new RNA is eluted from the target and replicated by Qβ replicase. The latter reaction creates 10 fold amplification in approximately 20 minutes.
Alternative amplification technology can be exploited in the present invention. For example, rolling circle amplification (Lizardi et al., (1998) Nat Genet 19:225) is an amplification technology available commercially (RCAT™) which is driven by DNA polymerase and can replicate circular oligonucleotide probes with either linear or geometric kinetics under isothermal conditions.
In the presence of two suitably designed primers, a geometric amplification occurs via DNA strand displacement and hyperbranching to generate 10¹² or more copies of each circle in 1 hour. If a single primer is used, RCAT generates, in a few minutes, a linear chain of thousands of tandemly linked DNA copies of a target covalently linked to that target.
A further technique, strand displacement amplification (SDA; Walker et al., (1992) PNAS (USA) 80:392) begins with a specifically defined sequence unique to a specific target. But unlike other techniques which rely on thermal cycling, SDA is an isothermal process that utilises a series of primers, DNA polymerase and a restriction enzyme to exponentially amplify the unique nucleic acid sequence.
SDA comprises both a target generation phase and an exponential amplification phase. In target generation, double-stranded DNA is heat denatured creating two single-stranded copies. A series of specially manufactured primers combine with DNA polymerase (amplification primers for copying the base sequence and bumper primers for displacing the newly created strands) to form altered targets capable of exponential amplification.
The exponential amplification process begins with altered targets (single-stranded partial DNA strands with restriction enzyme recognition sites) from the target generation phase. An amplification primer is bound to each strand at its complementary DNA sequence. DNA polymerase then uses the primer to identify a location to extend the primer from its 3' end, using the altered target as a template for adding individual nucleotides. The extended primer thus forms a double-stranded DNA segment containing a complete restriction enzyme recognition site at each end.
A restriction enzyme is then bound to the double-stranded DNA segment at its recognition site. The restriction enzyme dissociates from the recognition site after having cleaved only one strand of the double-stranded segment, forming a nick. DNA polymerase recognises the nick and extends the strand from the site, displacing the previously created strand. The recognition site is thus repeatedly nicked and restored by the restriction enzyme and DNA polymerase with continuous displacement of DNA strands containing the target segment.
Each displaced strand is then available to anneal with amplification primers as above. The process continues with repeated nicking, extension and displacement of new DNA strands, resulting in exponential amplification of the original DNA target.
Identification of nucleic acid sequences, for example after amplification, can for example be performed by primer extension or sequencing techniques. Such techniques may involve the parallel and/or serial processing of a large number of different template nucleic acid molecules. In one aspect, a library of probes on an array may be employed. A high sensitivity analytical technique may be used to characterize individually nucleic acid molecules which become immobilised on the array, by hybridisation to the probes. For example, primer extension reactions may be used to incorporate labeled nucleotide(s) that can be individually detected in order to sequence individual molecules and/or determine the identity of at least one nucleotide position on individual nucleic acid molecules. Detection may involve labeling one or more of the primers and/or extension nucleotides with a detectable label (e.g., using fluorescent label(s), FRET label(s), enzymatic label(s), radio-label(s), etc.). Detection may involve imaging, for example using a high sensitivity camera and/or microscope (e.g., a super-cooled camera and/or microscope).
Suitable techniques may be selected by one of ordinary skill in the art. Examples of high-throughput sequencing approaches are listed in KY. Chan, Mutation Research 573 (2005) 13-40 and include, but are not limited to, near-term sequencing approaches such as cycle-extension approaches, polymerase reading approaches and exonuclease sequencing, revolutionary sequencing approaches such as DNA scanning and nanopore sequencing and direct linear analysis. Examples of current high-throughput sequencing methods are 454 (pyro)sequencing, Solexa Genome Analysis System, Agencourt SOLiD sequencing method (Applied Biosystems), MS-PET sequencing (Ng et al., 2006, http://nar.oxfordjournals.org/cgi/content/full/34/12/e84). In one embodiment, a digital analysis (e.g., a digital amplification and subsequent analysis) may be performed to obtain a statistically significant quantitative result. Certain digital techniques are known in the art, see for example, US Patent No. 6,440,706 and US Patent No. 6,753,147, incorporated herein by reference. Similarly, an emulsion-based method for amplifying and/or sequencing individual nucleic acid molecules may be used (e.g., BEAMing technology; International Published Application Nos. WO2005/010145, WO00/40712, WO02/22869, WO03/044187, WO99/02671, herein incorporated by reference).
In one embodiment, a sequencing method that can sequence single molecules in a biological sample may be used. Sequencing methods are known and being developed for high throughput (e.g., parallel) sequencing of complex genomes by sequencing a large number of single molecules (often having overlapping sequences) and compiling the information to obtain the sequence of an entire genome or a significant portion thereof. Suitable sequencing techniques may involve high speed parallel molecular nucleic acid sequencing as described in PCT Application No. WO 01/16375, US Application No. 60/151,580 and U.S. Published Application No. 20050014175, the entire contents of which are incorporated herein by reference. Other sequencing techniques are described in PCT Application No. WO 05/73410, PCT Application No. WO 05/54431, PCT Application No. WO 05/39389, PCT Application No. WO 05/03375, PCT Application No. WO 05/010145, PCT Application No. WO 04/069849, PCT Application No. WO 04/70005, PCT Application No. WO 04/69849, PCT Application No. WO 04/70007, and US Published Application No. 20050100932, the entire contents of which are incorporated herein by reference.
Sequencing techniques for use in connection with the invention may involve exposing a nucleic acid molecule to an oligonucleotide primer and a polymerase in the presence of a mixture of nucleotides. Changes in the fluorescence of individual nucleic acid molecules in response to polymerase activity may be detected and recorded. The specific labels attached to each nucleic acid and/or nucleotide may provide an emission spectrum allowing for the detection of sequence information for individual template nucleic acid molecules. In certain embodiments, a label is attached to the primer/template and a different label is attached to each type of nucleotide (e.g., A, T/U, C, or G). Each label emits a distinct signal which is distinguished from the other labels.
Useful sequencing methods include high throughput sequencing using the 454 Life Sciences Instrument System (International Published Application No. WO2004/069849, filed January 28, 2004). Briefly, a sample of single stranded DNA is prepared and added to an excess of DNA capture beads which are then emulsified. Clonal amplification is performed to produce a sample of enriched DNA on the capture beads (the beads are enriched with millions of copies of a single clonal fragment). The DNA enriched beads are then transferred into PicoTiterPlate (TM) and enzyme beads and sequencing reagents are added. The samples are then analyzed and the sequence data recorded. Pyrophosphate and luciferin are examples of the labels that can be used to generate the signal. A label includes but is not limited to a fluorophore, for example green fluorescent protein (GFP), a luminescent molecule, for example aequorin or europium chelates, fluorescein, rhodamine green, Oregon green, Texas red, naphthofluorescein, or derivatives thereof. In some embodiments, the polynucleotide is linked to a substrate. A substrate includes but is not limited to, streptavidin-biotin, histidine-Ni, S-tag-S-protein, or glutathione-S-transferase (GST). In some embodiments, a substrate is pretreated to facilitate attachment of a polynucleotide to a surface, for example the substrate can be glass which is coated with a polyelectrolyte multilayer (PEM), or the polynucleotide is biotinylated and the PEM-coated surface is further coated with streptavidin.
In other embodiments, single molecule sequencing technology available from US Genomics, Mass., may be used. For example, technology described, at least in part, in one or more of US patents 6,790,671; 6,772,070; 6,762,059; 6,696,022; 6,403,311; 6,355,420; 6,263,286; and 6,210,896 may be used.
Other sequencing methods may be used to analyze DNA and/or RNA according to methods of the invention. It should be appreciated that a sequencing method does not have to be a single molecule sequencing method, since generally nucleic acid material from a substantial sample or biopsy will be available for analysis.
Measurement of polypeptide expression
In an alternative embodiment, the levels of polypeptides encoded by the genes identified in Tables 1-5 can be measured directly, without measuring mRNA levels. For example, polypeptides can be detected by differential mobility on protein gels, or by other size analysis techniques such as mass spectrometry. Peptides derived from the gene signatures identified herein can be differentiated by size analysis. Advantageously, the detection means is sequence-specific, such that a particular gene product can accurately be identified as the product of a member of any given gene signature. For example, polypeptide or RNA molecules can be developed which specifically recognise the desired gene products in vivo or in vitro.
For example, immunoglobulin molecules may be used to specifically bind to the target polypeptides, for instance in a western blot or ELISA. The immunoglobulins or the target polypeptides may be labelled, to provide a means of identification and measurement. Ideally, such measurements are carried out on an array of immunoglobulin molecules. An "immunoglobulin" is one of a family of polypeptides which retain the immunoglobulin fold characteristic of immunoglobulin (antibody) molecules, which contains two β sheets and, usually, a conserved disulphide bond. Members of the immunoglobulin superfamily are involved in many aspects of cellular and non-cellular interactions in vivo, including widespread roles in the immune system (for example, antibodies, T-cell receptor molecules and the like), involvement in cell adhesion (for example the ICAM molecules) and intracellular signalling (for example, receptor molecules, such as the PDGF receptor).
Preferred immunoglobulins are antibodies, which are capable of binding to target antigens with high specificity. "Antibodies" can be whole antibodies, or antigen-binding fragments thereof. For example, the invention includes fragments such as Fv and Fab, as well as Fab' and F(ab')2, and antibody variants such as scFv, single domain antibodies, Dab antibodies and other antigen-binding antibody-based molecules.
The polypeptides encoded by the genes set forth in Tables 1-5, or peptides derived therefrom, can be used to generate antibodies for use in the present invention. The peptides used preferably comprise an epitope which is specific for a polypeptide encoded by a gene in accordance with the invention. Polypeptide fragments which function as epitopes can be produced by any conventional means (see, for example, U.S. Pat. No. 4,631,211). In the present invention, antigenic epitopes preferably contain a sequence of at least 4, at least 5, at least 6, at least 7, more preferably at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 20, at least 25, at least 30, at least 40, at least 50, and, most preferably, between about 15 to about 30 amino acids. Preferred polypeptides comprising immunogenic or antigenic epitopes are at least 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95 or 100 amino acid residues in length. Antibodies can be generated using antigenic epitopes of polypeptides according to the invention by immunising animals, such as rabbits or mice, with either free or carrier-coupled peptides, for instance, by intraperitoneal and/or intradermal injection of emulsions containing about 100 µg of peptide or carrier protein and Freund's adjuvant or any other adjuvant known for stimulating an immune response.
Antibodies for use in the present invention can be fused to marker sequences, such as a peptide which facilitates purification of the fused polypeptide. In preferred embodiments, the marker amino acid sequence is a hexa-histidine peptide, such as the tag provided in a pQE vector (QIAGEN, Inc., 9259 Eton Avenue, Chatsworth, Calif., 91311), among others, many of which are commercially available. As described in Gentz et al., Proc. Natl. Acad. Sci. USA 86: 821-824 (1989), for instance, hexa-histidine provides for convenient purification of the fusion protein. Another peptide tag useful for purification, the "HA" tag, corresponds to an epitope derived from the influenza hemagglutinin protein (Wilson et al., (1984) Cell 37: 767).
Antibodies as described herein can be altered antibodies comprising an effector protein such as a label. Especially preferred are labels which allow the imaging of the distribution of the antibody in vivo. Such labels can be radioactive labels or radioopaque labels, such as metal particles, which are readily visualisable within the body of a patient. This can allow an assessment to be made without the need for tissue biopsies. Moreover, they can be fluorescent labels or other labels which are visualisable on tissue.
The antibody is preferably provided together with means for detecting the antibody, which can be enzymatic, fluorescent, radioisotopic or other means. The antibody and the detection means can be provided for simultaneous, separate or sequential use, in a diagnostic kit intended for diagnosis.
The antibodies for use in the invention can be assayed for immunospecific binding by any method known in the art. The immunoassays which can be used include but are not limited to competitive and noncompetitive assay systems using techniques such as western blots, radioimmunoassays, ELISA, sandwich immunoassays, immunoprecipitation assays, precipitin reactions, gel diffusion precipitin reactions, immunodiffusion assays, agglutination assays, complement-fixation assays, immunoradiometric assays, fluorescent immunoassays and protein A immunoassays. Such assays are routine in the art (see, for example, Ausubel et al., eds, 1994, Current Protocols in Molecular Biology, Vol. 1, John Wiley & Sons, Inc., New York, which is incorporated by reference herein in its entirety). Exemplary immunoassays are described briefly below.
Immunoprecipitation protocols generally comprise lysing a population of cells in a lysis buffer such as RIPA buffer (1% NP-40 or Triton X-100, 1% sodium deoxycholate, 0.1% SDS, 0.15 M NaCl, 0.01 M sodium phosphate at pH 7.2, 1% Trasylol) supplemented with protein phosphatase and/or protease inhibitors (e.g., EDTA, PMSF, aprotinin, sodium vanadate), adding the antibody of interest to the cell lysate, incubating for a period of time (e.g., 1-4 hours) at 4°C, adding protein A and/or protein G sepharose beads to the cell lysate, incubating for about an hour or more at 4°C, washing the beads in lysis buffer and resuspending the beads in SDS/sample buffer. The ability of the antibody of interest to immunoprecipitate a particular antigen can be assessed by, e.g., western blot analysis.
Western blot analysis generally comprises preparing protein samples, electrophoresis of the protein samples in a polyacrylamide gel (e.g., 8%-20% SDS-PAGE depending on the molecular weight of the antigen), transferring the protein sample from the polyacrylamide gel to a membrane such as nitrocellulose, PVDF or nylon, blocking the membrane in blocking solution (e.g., PBS with 3% BSA or non-fat milk), washing the membrane in washing buffer (e.g., PBS-Tween 20), exposing the membrane to a primary antibody (the antibody of interest) diluted in blocking buffer, washing the membrane in washing buffer, exposing the membrane to a secondary antibody (which recognises the primary antibody, e.g., an anti-human antibody) conjugated to an enzymatic substrate (e.g., horseradish peroxidase or alkaline phosphatase) or radioactive molecule (e.g., ³²P or ¹²⁵I) diluted in blocking buffer, washing the membrane in wash buffer, and detecting the presence of the antigen.
ELISAs comprise preparing antigen, coating the well of a microtitre plate with the antigen, adding the antibody of interest conjugated to a detectable compound such as an enzymatic substrate (e.g., horseradish peroxidase or alkaline phosphatase) to the well and incubating for a period of time, and detecting the presence of the antigen. In ELISAs the antibody of interest does not have to be conjugated to a detectable compound; instead, a second antibody (which recognises the antibody of interest) conjugated to a detectable compound can be added to the well. Further, instead of coating the well with the antigen, the antibody can be coated to the well. In this case, a second antibody conjugated to a detectable compound can be added following the addition of the antigen of interest to the coated well.
The binding affinity of an antibody to an antigen and the off-rate of an antibody-antigen interaction can be determined by competitive binding assays. One example of a competitive binding assay is a radioimmunoassay comprising the incubation of labelled antigen (e.g., ³H or ¹²⁵I) with the antibody of interest in the presence of increasing amounts of unlabelled antigen, and the detection of the antibody bound to the labelled antigen. The affinity of the antibody of interest for a particular antigen and the binding off-rates can be determined from the data by Scatchard plot analysis. Competition with a second antibody can also be determined using radioimmunoassays. In this case, the antigen is incubated with antibody of interest conjugated to a labelled compound (e.g., ³H or ¹²⁵I) in the presence of increasing amounts of an unlabelled second antibody.
Polypeptide levels may be measured using alternative peptide-specific reagents. Such reagents include peptide or RNA aptamers, which can specifically detect a defined polypeptide sequence. Proteins can be detected by protein gel assay, antibody binding assay, or other detection methods known in the art.
For example, RNA aptamers can be produced by SELEX. SELEX is a method for the in vitro evolution of nucleic acid molecules with highly specific binding to target molecules. It is described, for example, in U.S. patents 5654151, 5503978, 5567588 and 5270163, as well as PCT publication WO 96/38579, each of which is specifically incorporated herein by reference.
The SELEX method involves selection of nucleic acid aptamers, single-stranded nucleic acids capable of binding to a desired target, from a library of oligonucleotides. Starting from a library of nucleic acids, preferably comprising a segment of randomised sequence, the SELEX method includes steps of contacting the library with the target under conditions favourable for binding, partitioning unbound nucleic acids from those nucleic acids which have bound specifically to target molecules, dissociating the nucleic acid-target complexes, amplifying the nucleic acids dissociated from the nucleic acid-target complexes to yield a ligand-enriched library of nucleic acids, then reiterating the steps of binding, partitioning, dissociating and amplifying through as many cycles as desired to yield highly specific, high affinity nucleic acid ligands to the target molecule.
SELEX is based on the principle that within a nucleic acid library containing a large number of possible sequences and structures there is a wide range of binding affinities for a given target. A nucleic acid library comprising, for example a 20 nucleotide randomised segment can have 4²⁰ structural possibilities. Those which have the higher affinity constants for the target are considered to be most likely to bind. The process of partitioning, dissociation and amplification generates a second nucleic acid library, enriched for the higher binding affinity candidates. Additional rounds of selection progressively favour the best ligands until the resulting library is predominantly composed of only one or a few sequences. These can then be cloned, sequenced and individually tested for binding affinity as pure ligands. Cycles of selection and amplification are repeated until a desired goal is achieved. In the most general case, selection/amplification is continued until no significant improvement in binding strength is achieved on repetition of the cycle. The iterative selection/amplification method is sensitive enough to allow isolation of a single sequence in a library containing at least 10¹⁴ sequences. The nucleic acids of the library preferably include a randomised sequence portion as well as conserved sequences necessary for efficient amplification. Nucleic acid sequence variants can be produced in a number of ways including synthesis of randomised nucleic acid sequences and size selection from randomly cleaved cellular nucleic acids. The variable sequence portion can contain fully or partially random sequence; it can also contain subportions of conserved sequence incorporated with randomised sequence. Sequence variation in test nucleic acids can be introduced or increased by mutagenesis before or during the selection/amplification iterations and by specific modification of cloned aptamers.
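The size of a SELEX library with a randomised segment follows from the four possible bases at each position. The short sketch below works through that arithmetic for the 20-nucleotide example given above; the function names are illustrative only and are not part of the disclosure.

```python
def library_size(random_length, alphabet=4):
    """Number of distinct sequences for a fully randomised segment."""
    return alphabet ** random_length


def min_random_length(target_sequences, alphabet=4):
    """Shortest randomised segment giving at least `target_sequences` variants."""
    length = 0
    while alphabet ** length < target_sequences:
        length += 1
    return length


print(library_size(20))            # 1099511627776, i.e. about 1.1 x 10^12
print(min_random_length(10 ** 14))  # 24
```

On this count, a library of at least 10¹⁴ distinct sequences requires a fully randomised segment of 24 or more nucleotides.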
Gene expression profiles
The results of the analysis of gene expression, whether by measurement of nucleic acids or polypeptides, require interpretation. Whilst smaller groups of genes can be analysed manually, larger sets of genes will almost certainly require the assistance of computational methods in order properly to interpret any results.
A number of mathematical techniques are available in the art for analysis of gene expression results, including hierarchical clustering techniques used herein. See The analysis of gene expression data: methods and software, edited by Giovanni Parmigiani, Elizabeth S Garrett, Rafael A Irizarry, Scott L Zeger, 2003, Springer, NY, incorporated herein by reference.
In one embodiment, the expression of genes in a test sample can be compared to the expression of genes in known tumour and normal samples, in order to determine whether tumour cells are present. The pattern of gene expression for the genes identified in the present application differs in the tumour sample from that in a normal sample. In general, the tumour sample contains those cells which show the physiological and morphological characteristics associated with malignancy, including the ability for unrestricted independent growth and proliferation.
A normal sample is a sample which does not comprise any cells which show the physiological and morphological characteristics associated with malignancy. For example, a normal sample may be a tissue sample isolated from tissue adjacent to a tumour in a patient suffering from cancer. Alternatively, it may be a sample isolated from an individual not suffering from cancer. The normal sample acts as a control.
Expression levels of genes identified herein may be greater or lower in a tumour sample compared to a normal sample. The identification of tumour or normal tissue depends not only on upregulation of certain genes, but on a general pattern of change in gene expression. In a preferred embodiment, once a pattern of gene expression associated with a tumour has been established, comparison with a normal sample is no longer necessary. The presence of a tumour may be assessed by comparison with the pattern associated with the tumour.
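Comparison of a test sample against established expression patterns can be illustrated with a simple correlation-based assignment. This is a minimal sketch under stated assumptions: the choice of Pearson correlation, the function names and all expression values are hypothetical and are not taken from the disclosure, which does not prescribe a particular similarity measure for this step.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def classify(test_profile, reference_patterns):
    """Return the label of the reference pattern best correlated with the test profile."""
    return max(reference_patterns,
               key=lambda label: pearson(test_profile, reference_patterns[label]))


# Hypothetical log-scale expression values for four genes (illustration only).
reference_patterns = {
    "tumour": [8.1, 2.3, 7.5, 3.0],
    "normal": [3.2, 6.8, 3.9, 7.1],
}
print(classify([7.6, 2.9, 7.0, 3.4], reference_patterns))  # tumour
```

Because the assignment uses the whole pattern, it reflects both upregulated and downregulated genes rather than any single gene.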
In analyzing the genes identified in the subject application, a hierarchical clustering analysis can be applied to construct gene profiles for the identification of tumor tissue. Hierarchical Cluster Analysis is defined as grouping or segmenting a collection of objects into subsets or "clusters". The objects to be clustered can be either the genes or the samples: genes can be clustered by comparing their expression profiles across the set of samples, or the samples can be clustered by comparing their expression profiles across the set of genes. In such a way, the genes (or samples) within each cluster are more closely related to one another than genes (or samples) grouped within different clusters. In a hierarchical clustering analysis, the genes (or samples) are not partitioned into a particular cluster in a single step. Instead, a sequential merging of the genes (or samples), from low level to high level, takes place depending on the measurements of pair-wise similarity between expression profiles. At the highest level, there may be a single cluster containing all genes (or samples), while at the lowest level the clusters each consist of singleton genes (or samples).
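The sequential merging described above can be sketched in a few lines. The example below is an illustrative agglomerative procedure, assuming Euclidean distance and average linkage; neither choice, nor the sample names and values, is taken from the disclosure. It starts from singleton clusters and repeatedly merges the two closest clusters until a single cluster remains.

```python
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(profiles[i], profiles[j]) for i in c1 for j in c2]
    return sum(dists) / len(dists)


def hierarchical_clustering(profiles):
    """Return the list of merges, from the lowest level to the highest."""
    clusters = [(name,) for name in profiles]  # lowest level: singletons
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_linkage(pair[0], pair[1], profiles))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
        merges.append((a, b))
    return merges


# Hypothetical expression profiles of four samples across three genes.
profiles = {
    "S1": [8.0, 2.0, 7.5],
    "S2": [7.8, 2.2, 7.7],
    "S3": [3.0, 6.9, 3.1],
    "S4": [3.4, 7.1, 2.9],
}
for left, right in hierarchical_clustering(profiles):
    print(left, "+", right)
```

The same function clusters genes instead of samples if each key is a gene and each value is that gene's expression across the samples.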
In differentiating between two tumor types, a similar procedure as described above is performed after the samples are obtained and the selected gene sets are expressed either as individual genes or as sets. However, the focus will be on whether the gene sets represent LCC, SCC or ADC NSCLC types. The expression patterns associated with each type of NSCLC can be differentiated either in the presence of controls, from other, known NSCLC types, or without controls once the expression patterns of the genes set forth herein have been established.
The prognosis of patients with a tumour uses the same procedure as described above for obtaining the samples and expressing the gene sets either as individual genes or as sets. A Cox's proportional hazards regression analysis is then performed for each gene, thereby allowing the selection of genes associated with overall survival. A risk score is then determined for each individual patient as the sum, over the selected genes, of the regression coefficient of each gene multiplied by the corresponding expression intensity.
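The risk score described above is a weighted sum and can be written out directly. In the sketch below the gene names, coefficients and expression intensities are hypothetical placeholders chosen for illustration; they are not values from the Tables.

```python
def risk_score(expression, coefficients):
    """Sum of (Cox regression coefficient x expression intensity) over the selected genes."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


# Hypothetical coefficients for two selected genes, and one patient's intensities.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
patient = {"GENE_A": 8.0, "GENE_B": 4.0, "GENE_C": 6.0}

print(risk_score(patient, coefficients))  # 3.0
```

Genes that were not selected by the regression step (GENE_C here) do not contribute to the score.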
Cox regression (or proportional hazards regression) is a method for investigating the effect of several variables upon the time a specified event takes to happen. In the context of an outcome such as death this is known as Cox regression for survival analysis. The method does not assume any particular "survival model" but it is not truly non-parametric because it does assume that the effects of the predictor variables upon survival are constant over time and are additive in one scale. Based on the median risk score, patients are then categorised as high-risk or low-risk with respect to overall survival or relapse-free survival. This is determined by a comparison to the corresponding Kaplan-Meier estimates of overall survival.
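The median split and the Kaplan-Meier estimate can be sketched as follows. This is an illustrative implementation under stated assumptions: patients with a score strictly above the cohort median are labelled high-risk, and all patient identifiers, scores and follow-up times are hypothetical.

```python
from statistics import median


def split_by_median(scores):
    """Label each patient 'high' if the risk score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


def kaplan_meier(times, events):
    """Kaplan-Meier estimate of survival.

    events[i] is True if the event (e.g. death) was observed at times[i],
    False if the patient was censored. Returns (time, survival) at each event time.
    """
    curve, survival = [], 1.0
    for t in sorted(set(t for t, e in zip(times, events) if e)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


scores = {"P1": 3.0, "P2": 1.2, "P3": 4.1, "P4": 0.7}
print(split_by_median(scores))
# {'P1': 'high', 'P2': 'low', 'P3': 'high', 'P4': 'low'}

# Follow-up in months; the second and fourth patients are censored.
print(kaplan_meier([5, 8, 12, 20], [True, False, True, False]))
# [(5, 0.75), (12, 0.375)]
```

In practice a separate curve would be estimated for the high-risk and the low-risk group and the two curves compared.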
Examples
Example 1
Patient enrolment
Samples from patients recruited in this study were obtained from two Erasmus MC collections: the Tissue Bank and the Department of Internal Oncology. All lung tumor samples and adjacent non-cancerous specimens were collected from patients who had undergone curative surgical resection between 1992 and 1998 (Internal Oncology), or between 1996 and 2004 (Tissue Bank) at the Erasmus MC.
There were 91 patients with NSCLC included in our analysis, including, based on the first pathology review, 32 adenocarcinomas (ADC), 27 squamous cell carcinomas (SCC), and 13 large cell carcinomas (LCC). The remaining patients presented with rarer types of lung tumours, such as bronchioloalveolar (BAC), carcinoid (CAR), mixed adeno-squamous, or unknown. In the cohort of patients, over 57 percent had a known smoking history, with an average of 36.7 pack years. Of the 91 NSCLC patients, 50 were at stage I, 22 were at stage II, and 10 were at either stage III or IV. Three patients displayed distal metastases at the time of diagnosis. In addition, eight patients developed multiple primary tumours at different sites originating from the same cell type or different cell types, either synchronously or non-synchronously. Three had undergone neo-adjuvant radiation or chemotherapy before the surgery. For about 30% of patients the recorded cause of death was lung cancer progression. Tumor samples were independently reviewed by two pathologists.
Identifying a signature for NSCLC
A genome-wide gene expression analysis using Affymetrix U133 Plus 2.0 arrays was performed on the cohort of 91 patients with NSCLC. All tumor samples were reviewed by two independent pathologists to determine their histopathological types, cancer cell contents, and degree of differentiation. Eight LCC samples had a high level of cell type heterogeneity, presenting with acinar differentiation or squamous cell components. 19 percent (17 out of 91) of tumor samples had a discrepancy in histopathological classification, including five of rare types of NSCLC with a histological composition of multiple cell types. To sketch a precise histological profile, these 17 samples were excluded from creating histology signatures.
Total RNA isolation
The samples used in this study were fresh frozen tissues. Specimens were sectioned with a cryostat into slices 25 μm thick for RNA extraction. For each specimen, two 6 μm thick slices were prepared from tissue immediately adjacent to the sections used for RNA extraction and were reviewed by pathologists. Samples were homogenized with a mortar and pestle in TRI Reagent, and then incubated at room temperature for 5 minutes before adding 0.2 μl of chloroform for each 1 ml of sample. After centrifuging at full speed (12000 rpm) for 20 minutes, the RNA in the supernatant was precipitated with isopropanol and pelleted by centrifugation. The resultant RNA pellets were washed with 75% ethanol and dissolved in RNase-free water. If applicable, they were stored at -80 °C for further use.
Assessment of RNA quality and concentration
The integrity of the isolated total RNA was determined using the Agilent 2100 BioAnalyzer. Samples were kept for further processing if the 28S/18S ratio of the RNA was higher than 1.2. The concentrations of the RNAs were measured with a NanoDrop ND-1000 UV-VIS spectrophotometer.
cRNA amplification and labelling
Double strand (ds) cDNA synthesis was performed according to the standardized protocol for One-Cycle cDNA synthesis from Affymetrix (Santa Clara, CA). Approximately 5 μg of total RNA was first converted to single strand cDNA in a 20 μl First-Strand Reaction Mix, containing poly-A control RNA, 100 μmol T7-Oligo Primer, 1x first strand buffer, 0.2 mol DTT, 10 mmol dNTP mix and Superscript II. In detail, the sample RNA, the poly-A control RNA and the T7-Oligo Primer were mixed and incubated for 10 min at 70 °C. Secondly, the first strand buffer, the DTT and the dNTP mix were added and incubated for 2 min at 42 °C, followed by adding Superscript II and incubation for 1 hour at 42 °C. The ds cDNA was prepared from the resultant First-Strand Reaction Mix, mixed with 1x second strand reaction buffer, 30 mmol dNTP mix, E. coli DNA ligase, E. coli DNA Polymerase I and RNase H. The mix was incubated for 2 hours at 16 °C, then supplemented with T4 DNA Polymerase and incubated for a further 5 minutes at 16 °C. The reaction was stopped by the addition of EDTA to a final concentration of 5 μM.
The Sample Cleanup Module and GeneChip IVT Labelling Kit from Affymetrix were used to purify the synthesized ds cDNA, which was then used to generate biotin-labelled cRNA in the presence of 1x IVT Labelling buffer, IVT Labelling NTP Mix, IVT Labelling Enzyme Mix and RNase-free water in a total volume of 40 μl. After an incubation of 16 hours at 37 °C, the concentration and quality of the labelled cRNA were checked with a NanoDrop ND-1000 UV-VIS spectrophotometer. An A260/A280 ratio between 1.9 and 2.1 was considered acceptable. Approximately 20 μg cRNA per array was fragmented to an average size of 35-200 nucleotides by heating at 94 °C for 35 min, in the presence of 1x Fragmentation Buffer in a total volume of 40 μl. The undiluted, fragmented samples were stored at -20 °C before being subjected to hybridization.
Hybridization
Hybridization was conducted following Affymetrix instructions for the GeneChip® Human Genome U133 Plus 2.0 array. The GeneArray scanner 3000 (Affymetrix) was then employed to detect the hybridization signals.
Preprocessing microarray data
Array Quality Control
Microarrays that did not pass the quality assessment were removed from further analyses.
The quality metrics used to exclude microarrays were the summary statistics calculated by the GCOS algorithm during the processing of probe-level data. The primary inclusion criteria were: all arrays had to have comparable noise values (Raw Q, a measure of the pixel-to-pixel variation of probe cells on the chip); background values had to be within the range of 20 to 100; and the percentage of present calls for probe sets on the array could not be below 45%. The other criteria were: arrays with extremely high or low values for any of these parameters, e.g. values outside the range of the median ± standard deviation, were excluded; a signal ratio of <3 for the 3'/5' probe sets of GAPDH and Actin was used as a cut-off; labelling and hybridization were controlled using standard spiked-in controls according to the Affymetrix protocol; and, if global scaling was applied, the scaling factors for all arrays had to be within a three-fold range.
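The numeric thresholds above can be expressed as a simple filter. This is a minimal sketch: the dictionary keys are hypothetical field names rather than GCOS report fields, the array values are invented, and the criteria without a stated numeric cut-off (comparable Raw Q, spiked-in controls) are not modelled.

```python
def passes_qc(array):
    """Apply the numeric per-array thresholds stated in the text."""
    return (
        20 <= array["background"] <= 100       # background within 20 to 100
        and array["percent_present"] >= 45     # present calls not below 45%
        and array["gapdh_3_5_ratio"] < 3       # 3'/5' ratio for GAPDH
        and array["actin_3_5_ratio"] < 3       # 3'/5' ratio for Actin
    )

def scaling_within_threefold(arrays):
    """Check that the scaling factors of all arrays span at most three-fold."""
    factors = [a["scaling_factor"] for a in arrays]
    return max(factors) / min(factors) <= 3

# Hypothetical summary statistics for two arrays
good = {"background": 55, "percent_present": 52.0, "gapdh_3_5_ratio": 1.1,
        "actin_3_5_ratio": 1.4, "scaling_factor": 1.2}
bad = {"background": 140, "percent_present": 38.0, "gapdh_3_5_ratio": 4.2,
       "actin_3_5_ratio": 1.9, "scaling_factor": 5.0}

kept = [a for a in (good, bad) if passes_qc(a)]
```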
Array Data analysis
Microarray data was processed at two levels: probe level and probe set level.
At probe level by quantile normalization (RMAExpress)
RMA (Robust Multi-Array Average) is an integrated algorithm comprising background adjustment, quantile normalization, and expression summarization by median polish. The intensities of mismatch probes were entirely ignored due to their spurious estimation of non-specific binding. The intensities were background-corrected in such a way that all corrected values must be positive. The RMA algorithm utilized quantile normalization, in which the signal value of each individual probe was replaced by the average, across all chips/arrays, of the probes with the same intensity rank. Finally, Tukey's median polish algorithm was used to obtain the estimates of expression for normalized probe intensities.
At probe set level by Global Scaling (GCOS v1.4)
This algorithm was a summary method embedded in GeneChip Operating Software (GCOS) from Affymetrix, and is fully described in the data analysis fundamentals manual. The signal intensity of each probe was first corrected for the overall background. The differences between perfect match (PM) and mismatch (MM) probes were examined using background-adjusted intensities for each probe pair. The significance of the differences between PM and MM probes was reflected by a p-value calculated by a one-sided Wilcoxon signed-rank test. The final signal for a probe set was assigned as the one-step biweight estimate of the combined differences of all probe pairs belonging to that probe set. The trimmed mean signal of each array was then scaled to the same Target Intensity (e.g. 250) by a global method to minimize technique-derived discrepancies.
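The two normalization ideas described above, quantile normalization at probe level and scaling of the trimmed mean to a target intensity, can be sketched as follows. This is a simplified illustration and not the RMAExpress or GCOS code: ties are not handled specially, the 2% trim fraction is an assumption, and the intensities are hypothetical.

```python
def quantile_normalize(arrays):
    """Replace each value by the mean, across arrays, of the values at its rank.

    arrays: list of equal-length intensity lists, one list per array.
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [sum(s[r] for s in sorted_arrays) / len(arrays)
                  for r in range(n)]
    result = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # probe indices by rank
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        result.append(out)
    return result

def scale_to_target(signals, target=250.0, trim=0.02):
    """Scale one array so that its trimmed mean equals the target intensity."""
    s = sorted(signals)
    k = int(len(s) * trim)                  # values dropped at each end
    core = s[k:len(s) - k] if k else s
    factor = target / (sum(core) / len(core))
    return [x * factor for x in signals]

# Hypothetical probe intensities for two arrays
normalized = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
# Hypothetical probe set signals for one array
scaled = scale_to_target([100.0, 200.0, 300.0])
```

After quantile normalization every array has the same distribution of values; only the assignment of values to probes differs.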
Other transformations
Intensities of probe sets lower than 30 were reset to 30. The geometric mean for each probe set was calculated either across all samples directly, or first within each subgroup of samples and then across all samples (OmniViz). The intensity values of individual probe sets in each sample were then expressed as the log2 of the deviation from the calculated geometric mean.
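A minimal sketch of this transformation for one probe set, using hypothetical intensities and the variant in which the geometric mean is taken directly across all samples:

```python
import math

def log2_deviation(values, floor=30.0):
    """Floor intensities, then express each as log2(value / geometric mean)."""
    floored = [max(v, floor) for v in values]      # values below 30 become 30
    logs = [math.log2(v) for v in floored]
    log_geometric_mean = sum(logs) / len(logs)     # log2 of the geometric mean
    return [lv - log_geometric_mean for lv in logs]

# Hypothetical intensities of one probe set in three samples
deviations = log2_deviation([10.0, 120.0, 480.0])
```

Working in log space avoids computing the geometric mean explicitly: the mean of the log2 values equals the log2 of the geometric mean.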
(a) Probe sets filtering
Probe sets were included in further analysis only if their expression levels deviated from the overall mean in at least one array by a minimum factor of 2.5, because the remaining data were unlikely to be informative. The result was that 43,160 probe sets were eliminated, and 11,515 probe sets remained for further analysis.
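The filter can be sketched on log2 deviations such as those produced by the preceding transformation. It is assumed here that a deviation in either direction (up or down) by a factor of 2.5 counts; the probe set names and values are hypothetical.

```python
import math

THRESHOLD = math.log2(2.5)  # a factor of 2.5 expressed on the log2 scale

def is_informative(log2_deviations, threshold=THRESHOLD):
    """True if at least one array deviates from the mean by the given factor."""
    return any(abs(d) >= threshold for d in log2_deviations)

# Hypothetical log2 deviations per probe set across four arrays
probe_sets = {
    "probe_set_1": [0.1, -0.2, 0.3, 0.0],    # never reaches the threshold
    "probe_set_2": [0.2, 1.6, -0.1, 0.4],    # 2^1.6 is above a factor of 2.5
    "probe_set_3": [-1.5, 0.0, 0.2, 0.1],    # 2^-1.5 is below a factor of 1/2.5
}
retained = [name for name, devs in probe_sets.items() if is_informative(devs)]
```

The two counts reported above sum to 54,675 probe sets in total.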
(b) Unsupervised clustering and visualization of gene/sample similarity
Clustering was performed without taking into account any external information such as histology subtypes and tumor stages, with each of the selected 11,515 probe sets, using the K-means algorithm (OmniViz). Similarities were measured by magnitude and shape (Euclidean distance). Pairwise similarities between samples were sorted and visualized by the Pearson Correlation Matrix (OmniViz). The order of clusters and of individual samples within each cluster was sorted according to the Pearson Correlation Coefficient.
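The pairwise sample similarities underlying such a correlation matrix can be sketched as below. This is not the OmniViz implementation; the sample names and expression values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def correlation_matrix(samples):
    """Pairwise correlations between all samples, as a nested dictionary."""
    names = list(samples)
    return {a: {b: pearson(samples[a], samples[b]) for b in names}
            for a in names}

# Hypothetical expression values of three probe sets in three samples
samples = {"s1": [1.0, 2.0, 3.0], "s2": [2.0, 4.0, 6.0], "s3": [3.0, 2.0, 1.0]}
matrix = correlation_matrix(samples)
```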
(c) Bioinformatics analysis
The resulting 11,515 probe sets from the filtering step were the starting point for all supervised analyses which, for instance, correlated gene expression with clinical variables such as histological subtype. Class comparison analysis was performed using Significance Analysis of Microarrays (SAM), integrated in OmniViz version 5.1. Class prediction analysis was performed using Prediction Analysis of Microarrays (PAM) software, integrated in BRB-ArrayTools version 3.6. Clustering was performed using the Spotfire DecisionSite software (TIBCO, Palo Alto, CA). The samples were clustered with various signatures using the Weighted Pair-Group Method algorithm, with similarity measured by Euclidean distance.
(d) Class comparison
SAM discovered differentially expressed genes among different sample classes, e.g. between non-cancerous tissues and tumours (this Example) or among different histology subtypes (Example 2). First, this algorithm calculated the difference in expression for each gene between classes relative to the variation expected in the mean difference. To correct for multiple testing, the false discovery rate (FDR) was controlled by randomly permuting the classes of samples 100 times. Signature probe sets for assigned classes were selected by a change factor of 2 and an FDR of less than 1 percent. The class comparisons were performed with both RMA- and GCOS-processed data. The common probe sets identified by both sets of data were selected as the final signatures.
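The idea of a relative difference combined with a permutation-based FDR estimate can be sketched as follows. This is a simplified illustration in the spirit of SAM, not the SAM software: the fudge factor s0, the fixed cut-off, the use of the mean number of permuted calls and all data values are assumptions made for the example.

```python
import math
import random

def d_stat(x, y, s0=0.1):
    """Difference in class means relative to its pooled standard error plus s0."""
    n1, n2 = len(x), len(y)
    m1, m2 = sum(x) / n1, sum(y) / n2
    ss = sum((v - m1) ** 2 for v in x) + sum((v - m2) ** 2 for v in y)
    s = math.sqrt((1 / n1 + 1 / n2) * ss / (n1 + n2 - 2))
    return (m1 - m2) / (s + s0)

def permutation_fdr(data, labels, cutoff, n_perm=100, seed=0):
    """Return (number of genes called, estimated FDR) at the given cutoff.

    data: {gene: [value per sample]}; labels: list of 0/1, one per sample.
    """
    def count_called(lab):
        called = 0
        for values in data.values():
            x = [v for v, l in zip(values, lab) if l == 1]
            y = [v for v, l in zip(values, lab) if l == 0]
            if abs(d_stat(x, y)) >= cutoff:
                called += 1
        return called

    observed = count_called(labels)
    rng = random.Random(seed)
    permuted_calls = []
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)            # random reassignment of class labels
        permuted_calls.append(count_called(shuffled))
    expected_false = sum(permuted_calls) / n_perm
    fdr = min(1.0, expected_false / observed) if observed else 0.0
    return observed, fdr

# Hypothetical log2 expression: four tumour (1) and four normal (0) samples
labels = [1, 1, 1, 1, 0, 0, 0, 0]
data = {
    "gene_up": [8.0, 8.2, 7.9, 8.1, 5.0, 5.1, 4.9, 5.0],
    "gene_flat": [6.0, 6.1, 5.9, 6.0, 6.05, 5.95, 6.0, 6.1],
}
observed, fdr = permutation_fdr(data, labels, cutoff=5.0)
```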
(e) Class prediction
The resultant signatures from Class Comparison were tested by the nearest shrunken centroids algorithm (PAM) to identify subgroups of genes that best characterized the predefined classes. The prediction accuracy of optimized signatures was determined by performing "leave-one-out" cross validation within the training set: each sample was omitted in turn, and its class label was predicted from a model built with the remaining samples. The predictive models generated by the optimal subsets were subsequently applied to predict the classes of samples in the validation set, which were not involved in the corresponding class comparisons. The prediction accuracy on validation samples was calculated by comparing predicted class labels with the histopathological diagnoses for those samples; samples without histopathological records were excluded from the calculations.
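The combination of shrunken centroids and leave-one-out cross validation can be sketched as below. This is a simplified variant and not the PAM software: class centroids are shrunk towards the overall centroid by soft thresholding, but the per-gene standardization used by PAM is omitted. The shrinkage amount delta and all expression values are hypothetical.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length expression vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def soft_threshold(x, delta):
    """Move x towards zero by delta, stopping at zero."""
    sign = 1 if x > 0 else -1
    return sign * max(abs(x) - delta, 0.0)

def shrunken_centroids(samples, labels, delta):
    """Class centroids shrunk towards the overall centroid."""
    overall = mean_vector(samples)
    centroids = {}
    for cls in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == cls]
        centroid = mean_vector(rows)
        centroids[cls] = [o + soft_threshold(c - o, delta)
                          for c, o in zip(centroid, overall)]
    return centroids

def predict(centroids, sample):
    """Assign the class whose shrunken centroid is nearest to the sample."""
    def squared_distance(c):
        return sum((a - b) ** 2 for a, b in zip(sample, c))
    return min(centroids, key=lambda cls: squared_distance(centroids[cls]))

def loocv_accuracy(samples, labels, delta):
    """Omit each sample in turn and predict it from the remaining samples."""
    correct = 0
    for i in range(len(samples)):
        train = samples[:i] + samples[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        centroids = shrunken_centroids(train, train_labels, delta)
        if predict(centroids, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)

# Hypothetical expression of three genes in three tumour and three normal samples
samples = [[5.0, 1.0, 3.0], [5.2, 0.9, 3.1], [4.8, 1.1, 2.9],
           [1.0, 4.0, 3.0], [1.2, 4.1, 3.0], [0.9, 3.9, 3.1]]
labels = ["tumour"] * 3 + ["normal"] * 3
accuracy = loocv_accuracy(samples, labels, delta=0.5)
```

In this toy data the third gene barely differs between classes, so its centroid difference is shrunk to zero and it stops contributing to the classification, which is how shrinkage selects a subset of genes.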
Signature genes distinguish NSCLC from normal tissues
To identify genes involved in NSCLC carcinogenesis, we compared gene expression profiles from 44 NSCLC tumours with those from 36 noncancerous lung tissues. Samples were left out of the comparison analysis if 1) donor patients received chemotherapy or radiotherapy prior to the surgery; or 2) the content of cancer cells was <60% in the case of cancer samples. Only core samples determined by correlation analysis were involved. By using supervised analyses, we identified 187 genes that were differentially expressed in NSCLC samples compared to healthy lung samples. These genes are set forth in Table 1. A subset of these genes, 5 out of 187, was able to distinguish non-cancerous lung tissues from malignant NSCLC in the validation set with an accuracy of 98%. These genes are set forth in Table 2.
Five training and validation samples, including two tumor samples and three noncancerous samples, were incorrectly classified by the optimized tumor signature. Of these, one presented with an uncertain histological diagnosis, and two were from patients who had developed multiple primary tumours.
Survival analysis
To evaluate the prognostic value of the prognosis predictor relative to other clinical parameters, we used proportional hazard regression analysis with the defined survival time as the dependent variable, death as the event, and the last follow-up visit as the censoring time. The risk factors studied included age, tumor cell content (%), tumor size (diameter of tumor), smoking years, Forced Expiratory Volume 1, gender, tumor histology and tumor grade, as well as the computed prognosis predictor. The relation between each of these and the relative hazard ratio was tested with the Wald test. The 95% confidence intervals for the relative hazard ratios and the p-values are listed in supplementary Table 8. To compare performance in predicting overall survival (OS), the proportional hazard regression model was built either with or without a specific parameter. The contribution of each parameter to the model was evaluated by a chi-square test, with the P-value derived from the likelihood ratio test (Table 6) [29].
The correlation between the survival signature and clinical parameters was evaluated using predicted risk as the grouping variable, with an independent samples t-test for continuous variables, or non-parametric tests (Mann-Whitney and maximum possibility Wald-Wolfowitz tests) for categorical variables and scalar variables (Table 7). Statistical analyses were performed with SPSS 15.0 (SPSS, Chicago, IL). For each tumor from the NSCLC validation cohort, we calculated a prognosis predictor by fitting the predetermined predictive model with the expression of the 17 probe sets. Patients were predicted to be at high risk of death if their prognosis predictor percentile ranking was above the 60th, as determined in the procedure of identifying the prognosis signature using training samples.
Comparison with published signatures
(a) Prognostic signatures
If the original prognostic predictors were provided as gene symbols ([30-35] and EP1980627A1), we retrieved gene expression for the Erasmus MC and Duke University cohorts as follows. First, genes were mapped to the Affymetrix U133 Plus 2.0 chip, and the corresponding expression data from all relevant probe sets were extracted (Tables 9 and 10). Next, probe set level data were converted to gene level data by averaging probe sets targeting the same gene. Due to the variation between platforms, 4 genes from the Roepman et al [33] signature were missing from the Affymetrix U133 Plus 2.0 chip; we used the remaining 68 genes.
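The conversion from probe set level to gene level can be sketched as a grouped average. The probe set identifiers, gene symbols and values below are hypothetical placeholders, not entries from the U133 Plus 2.0 annotation.

```python
from collections import defaultdict

def gene_level(probe_values, probe_to_gene):
    """Average the values of all probe sets that target the same gene."""
    grouped = defaultdict(list)
    for probe, value in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:            # probe sets without a mapping are skipped
            grouped[gene].append(value)
    return {gene: sum(vals) / len(vals) for gene, vals in grouped.items()}

# Hypothetical mapping and expression values for one sample
probe_to_gene = {"ps_1": "GENE_A", "ps_2": "GENE_A", "ps_3": "GENE_B"}
probe_values = {"ps_1": 4.0, "ps_2": 6.0, "ps_3": 7.5, "ps_4": 9.0}

genes = gene_level(probe_values, probe_to_gene)
```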
When the original prognostic predictors were supplied as probe sets ([8, 9] and WO2007034221A), either from Affymetrix U133A or U133 Plus 2.0 arrays, the data were kept at probe set level. The Affymetrix HuGeneFL chip used by Beer et al / Guo et al [36, 37] deviates too much from the U133 Plus 2.0 chip and we therefore used gene symbols to re-map the data to the U133 Plus 2.0 chip. Some studies provide multiple signature sets [31, 36, 37], in which case each signature set was tested. For all re-evaluations, a cut-off at the 50th and 60th percentile was used for dividing the two risk groups. We only show the results for the best stratification obtained (Fig. 3 and Table 11).
(b) Histology signatures
The retrieval of expression data of signature genes was the same as described in the previous section.
The reproducibility of previously published histology signatures was assessed using both the Erasmus MC and Duke NSCLC cohorts ([38-40] and US20040241725A1). Different predictive algorithms were used. For Erasmus MC NSCLC samples, the signatures were applied to separate the three major subtypes, ADC, SCC, and LCC. Where a signature is devoted to a specific subtype, such as ADC [38, 40], it was instead applied to cluster NSCLC into two classes, ADC and non-ADC. All signatures were applied to cluster Duke NSCLC samples into two classes, ADC and SCC. The correct prediction rate was calculated by comparing the predicted histology to the gene-assigned histology (Erasmus MC) or the pathological review (Duke). The results from the 1-Nearest Neighbour algorithm, which successfully classified all tumor samples in the Erasmus MC and Duke cohorts, are shown in Table 11.
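A 1-Nearest Neighbour prediction and the correct prediction rate can be sketched as follows. Euclidean distance is an assumption, since the text does not state the metric used, and the expression values and labels are hypothetical.

```python
import math

def one_nearest_neighbour(train, train_labels, sample):
    """Return the label of the training sample closest to the query sample."""
    distances = [math.dist(t, sample) for t in train]
    return train_labels[distances.index(min(distances))]

def correct_prediction_rate(predicted, reference):
    """Fraction of samples whose predicted class matches the reference class."""
    matches = sum(1 for p, r in zip(predicted, reference) if p == r)
    return matches / len(reference)

# Hypothetical signature expression for three labelled training samples
train = [[5.0, 1.0], [1.0, 5.0], [3.0, 3.2]]
train_labels = ["ADC", "SCC", "LCC"]

# Hypothetical query samples with their reference histology
queries = [[4.8, 1.3], [0.7, 4.6], [4.5, 1.0]]
reference = ["ADC", "SCC", "SCC"]

predicted = [one_nearest_neighbour(train, train_labels, q) for q in queries]
rate = correct_prediction_rate(predicted, reference)
```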
Example 2
NSCLC are sub-classified by histology signature genes
As NSCLC is a class of tumor with a high degree of heterogeneity, genes characterizing histological features were identified using strictly selected tumor samples. The samples used had consistent histological diagnoses from two independent pathologists, and displayed no apparent tissue heterogeneity. By comparing each type of NSCLC (ADC, SCC or LCC) with the other two subtypes, we identified in total 518 genes representing the three major subtypes of NSCLC: ADC, SCC, and LCC (Table 3). The percentage of correct classification from cross-validation was 96% in the training samples.
When this signature was applied to histological classification of validation samples, we found that one LCC sample and three carcinoid (CAR) samples, which were not involved in deriving the signature, were totally separated from the other tumours by clustering, existing as a unique group. This LCC sample was classified as CAR by the second pathology review. An optimized histological signature consists of 75 genes (Table 4). This optimised signature classified the training samples with 100% accuracy.
The expression profile of signature genes was applied to make cell type predictions on the samples with conflicting diagnoses (n=18). With three exceptions, all the ambiguously classified LCCs (n=11) in the validation set were determined as ADC or SCC by the expression signature, consistent with the primary diagnosis. Of the 18 samples, one had an ambiguous diagnosis due to unsatisfactory histological staining, and three had a tumor cell content of less than 20%. Over 60% (11 out of 18) of these samples presented with apparent cell type heterogeneity.
Example 3
Survival risk prediction by expression profiles
To derive prognostic information from gene expression profiling, we first attempted to classify NSCLC patients into groups either with a poor or a good prognosis. Comparing the profiles of patients who died within 2 years of surgery with those of patients with long survival (> 5 years) failed to identify any significant differences in gene expression with an FDR less than 20%. Similar negative results were obtained when the analyses were restricted to ADC or LCC cases. A set of 29 probe-sets was identified having differential expression at a FDR of 10% between SCC patients with short- and long-survival (data not included).
Starting with 11,515 probe-sets remaining from the data filtering process, we performed a correlation analysis and identified 17 informative probe sets associated with survival time. The expression of these genes was then used to build a model for predicting a prognostic probability for each tumor case by fitting to a Cox proportional hazard model. A risk percentile of 60% was used for defining two risk groups which were distinguished at the significance of P value < 0.001 (Figure 2A).
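The assignment of risk groups at the 60th percentile can be sketched as below. The percentile rank definition (share of patients with a score less than or equal to the patient's own) is an assumption, and the patient identifiers and predictor values are hypothetical.

```python
def percentile_rank(scores):
    """Percentage of patients whose score is at or below each patient's score."""
    n = len(scores)
    return {p: 100.0 * sum(1 for v in scores.values() if v <= s) / n
            for p, s in scores.items()}

def assign_risk(scores, cutoff=60.0):
    """Label patients ranked above the cutoff percentile as high risk."""
    ranks = percentile_rank(scores)
    return {p: ("high" if r > cutoff else "low") for p, r in ranks.items()}

# Hypothetical prognosis predictor values for five patients
scores = {"a": 0.1, "b": 0.4, "c": 0.7, "d": 1.2, "e": 2.0}
risk = assign_risk(scores)
```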
Example 4
Validation of gene signatures
We studied the expression patterns of all signatures in two independent sets of microarray data from 96 NSCLC [10] and 6 normal lung specimens (GSE3526) collected in the United States (US validation set). The expression of the optimized tumor signature, consisting of 5 probe sets, was able to determine the tissue types for the US validation set with an accuracy of over 98%, 93 out of 96 NSCLC being classified as 'Tumor' and all normal lung specimens being classified as 'Healthy'. Since there were no LCC or other type of NSCLC in the validation set, we only used ADC and SCC signatures to predict the histology types for those cases. Predictions for up to 84% of tumor samples were consistent with their initial histology diagnoses on the basis of the expression profile of the optimized ADC and SCC signature. When the LCC signature was included in the prediction analysis, the correct percentage decreased to 83% (2 out of 96 NSCLC samples classified as LCC).
There were follow-up data available for 89 of 96 validation patients, and they were used to validate our survival signature. Among these patients, 5-year overall survival was 20% among patients with predicted low-risk and 3% among predicted high-risk. A Kaplan-Meier curve of overall survival stratified with expression of 17 probe-sets for selected patients is shown in Figure 2B. Patients in whom a high risk of death was predicted were separated from those at a low risk with statistical significance (p value < 0.003). If our own Erasmus MC patient cohort is combined with data from the cohorts recruited at Duke University, the p value reduces to <0.001.
Example 5
Survival risk prediction by expression profiles
The association between the prognosis profile and clinical parameters was studied. The prognosis profile was significantly associated with age (p<0.023), smoking years (p<0.014), gender (p<0.012) and Forced Expiratory Volume 1 (p<0.009), a parameter reflecting lung function, but not with tumor stage, tumor cell content, tumor histology and tumor size (Table 7). We performed multivariate proportional hazard regression analysis to evaluate the predictive value of the prognostic predictor for patient outcome in comparison with other clinical parameters. No evidence of a relation was found between the relative hazard ratio and age, gender, smoking years, tumor cell content, Forced Expiratory Volume 1, tumor histology and tumor size. Table 8 shows the Wald statistics and significance for each variable tested. Tumor stage and the 17 probe set prognostic predictor were significantly related to the hazard of death. However, the prognostic predictor had the highest importance, with a Wald statistic of 21.682 compared to 3.797 for tumor stage. Moreover, the relative hazard ratio predicted by the prognostic predictor was 2.465 (95% confidence interval, 1.686 to 3.604, p<1.5E-06), the highest among all tested risks (Table 6). Similarly, the inclusion of the prognostic predictor in the predictive model resulted in a change in model performance of 19.5, in terms of -2 log likelihood, with a p-value of 9.8E-06, compared to 24.3 and 2.0E-03 introduced by the model comprising all clinical variables. Thus, the multivariate proportional hazard analysis shows that the gene expression profile-derived prognostic predictor of 17 probe sets is the strongest predictor of the likelihood of death (Table 8).
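The relation between a Cox regression coefficient, its hazard ratio, the 95% confidence interval and the Wald statistic can be sketched as follows. The coefficient and standard error are not given in the text; here they are back-calculated from the reported hazard ratio and confidence interval, assuming a symmetric interval on the log scale with z = 1.96.

```python
import math

def hazard_ratio_summary(beta, se, z=1.96):
    """Hazard ratio, confidence limits and Wald statistic for one coefficient."""
    return {
        "hazard_ratio": math.exp(beta),
        "ci_lower": math.exp(beta - z * se),
        "ci_upper": math.exp(beta + z * se),
        "wald": (beta / se) ** 2,
    }

# Back-calculated inputs (assumption): beta from the reported hazard ratio,
# standard error from the width of the reported 95% confidence interval
beta = math.log(2.465)
se = (math.log(3.604) - math.log(1.686)) / (2 * 1.96)

summary = hazard_ratio_summary(beta, se)
```

Under these assumptions the recomputed interval and Wald statistic land close to the values reported for the prognostic predictor, which indicates that the reported figures are internally consistent.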
Example 6
Comparison with other prognostic gene expression signatures
A number of gene expression profiling-derived prognostic predictors have been previously reported for NSCLC [30-34, 36, 37]. These signatures were derived from a wide variety of platforms and technological approaches (Table 9). We assessed the performance of these previously reported prognostic signatures on the Erasmus MC and Duke University data sets. A total of 14 signatures from 6 different publications were tested. For each report, the results obtained with the signature yielding the best stratification in low- and high-risk groups are displayed in Kaplan-Meier curves (Fig. 3). We find that performance of the 6-gene signature of Boutros et al [30] was reasonable on the Duke University cohort (p-value 0.016) but not on the Erasmus MC cohort (p-value 0.69). The 41-gene signature reported by Shedden et al was developed for ADC samples [37]. Performance of this signature on the complete Erasmus MC and Duke University cohorts was unsatisfactory (p-values 0.113 and 0.158 respectively). However, if the analysis was limited to samples classified as ADC by our histology signature, this was the only prognostic signature that performed well on both cohorts (Erasmus MC p-value 0.016, Duke University p-value 0.019).
The observation that our 17 probe set signature performs well on independent cohorts comprising different types of NSCLC indicates that it is robust.
Example 7
Comparison with published NSCLC signatures
The signatures identified in this study were also compared to other published NSCLC signatures (Table 10) ([35, 38-40], US20040241725A1, WO2007034221A and EP1980627A1). The largest overlap was found between our histology signature and the histology signature identified by Garber et al [39]: 12 out of 370 genes overlapped with our 75-gene signature.
The correct prediction rate of the published histology signatures ranged from 56% to 93% (Erasmus MC), and from 71% to 83% (Duke). Compared with these signatures, the Erasmus MC histology signature produced the highest accuracy (100%). The best performance from published histology signatures on Duke was 83%, comparable to that produced by the Erasmus MC histology signature (84%). ADC-specific signatures performed better when the intended classification was limited to ADC versus non-ADC.
Legend to Tables
Tables 1 and 2
T:N ratio Ratio of average expression in NSCLC samples / normal lung tissue
T mean log2 transformation of mean expression value in NSCLC samples (average of all NSCLC and normal lung tissue = 0).
N mean log2 transformation of mean expression value in normal lung tissue samples (average of all NSCLC and normal lung tissue = 0).
T SD Standard deviation of mean expression value in NSCLC samples
N SD Standard deviation of mean expression value in normal lung tissue samples
Tables 3 and 4
ADC:OT ratio Ratio of average expression in ADC samples / the other NSCLC samples (SCC and LCC)
SCC:OT ratio Ratio of average expression in SCC samples / the other NSCLC samples (ADC and LCC)
LCC:OT ratio Ratio of average expression in LCC samples / the other NSCLC samples (ADC and SCC)
ADC mean log2 transformation of mean expression value in ADC samples (average of all NSCLC samples = 0).
SCC mean log2 transformation of mean expression value in SCC samples (average of all NSCLC samples = 0).
LCC mean log2 transformation of mean expression value in LCC samples (average of all NSCLC samples = 0).
ADC SD Standard deviation of mean expression value in ADC samples
SCC SD Standard deviation of mean expression value in SCC samples
LCC SD Standard deviation of mean expression value in LCC samples
All publications mentioned in the above specification are herein incorporated by reference. Various modifications and variations of the described methods of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention which are obvious to those skilled in molecular biology or related fields are intended to be covered by the present invention.
References
1. Ferlay, J., et al., Estimates of the cancer incidence and mortality in Europe in 2006. Ann Oncol, 2007. 18(3): p. 581-92.
2. Pretreatment evaluation of non-small-cell lung cancer. The American Thoracic Society and The European Respiratory Society. Am J Respir Crit Care Med, 1997. 156(1): p. 320-32.
3. Fujii, T., et al., A preliminary transcriptome map of non-small cell lung cancer. Cancer Res, 2002. 62(12): p. 3340-6.
4. Kikuchi, T., et al., Expression profiles of non-small cell lung cancers on cDNA microarrays: identification of genes for prediction of lymph-node metastasis and sensitivity to anti-cancer drugs. Oncogene, 2003. 22(14): p. 2192-205.
5. Yao, R., et al., Differentially expressed genes associated with mouse lung tumor progression. Oncogene, 2002. 21(37): p. 5814-21.
6. Jones, M. H., et al., Two prognostically significant subtypes of high-grade lung neuroendocrine tumours independent of small-cell and large-cell neuroendocrine carcinomas identified by gene expression profiles. Lancet, 2004. 363(9411): p. 775-81.
7. Kobayashi, K., et al., Identification of genes whose expression is upregulated in lung adenocarcinoma cells in comparison with type II alveolar cells and bronchiolar epithelial cells in vivo. Oncogene, 2004. 23(17): p. 3089-96.
8. Bhattacharjee, A., et al., Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc Natl Acad Sci U S A, 2001. 98(24): p. 13790-5.
9. Garber, M.E., et al., Diversity of gene expression in adenocarcinoma of the lung. Proc Natl Acad Sci U S A, 2001. 98(24): p. 13784-9.
10. Potti, A., et al., A genomic strategy to refine prognosis in early-stage non-small-cell lung cancer. N Engl J Med, 2006. 355(6): p. 570-80.
11. Sun, Z., D.A. Wigle, and P. Yang, Non-overlapping and non-cell-type-specific gene expression signatures predict lung cancer survival. J Clin Oncol, 2008. 26(6): p. 877-83.
12. Lau, S.K., et al., Three-gene prognostic classifier for early-stage non small-cell lung cancer. J Clin Oncol, 2007. 25(35): p. 5562-9.
13. Raz, D.J., et al., A multigene assay is prognostic of survival in patients with early-stage lung adenocarcinoma. Clin Cancer Res, 2008. 14(17): p. 5565-70.
14. Skrzypski, M., et al., Three-gene expression signature predicts survival in early-stage squamous cell carcinoma of the lung. Clin Cancer Res, 2008. 14(15): p. 4794-9.
15. Taguchi, A., et al., Blockade of RAGE-amphoterin signalling suppresses tumour growth and metastases. Nature, 2000. 405(6784): p. 354-60.
16. Franklin, W. A., RAGE in lung tumors. Am J Respir Crit Care Med, 2007. 175(2): p. 106-7.
17. Yuan, B.Z., M.E. Durkin, and N.C. Popescu, Promoter hypermethylation of DLC-1, a candidate tumor suppressor gene, in several common human cancers. Cancer Genet Cytogenet, 2003. 140(2): p. 113-7.
18. Yuan, B.Z., et al., DLC-1 gene inhibits human breast cancer cell growth and in vivo tumorigenicity. Oncogene, 2003. 22(3): p. 445-50.
19. Kim, T. Y., et al., DLC-1, a GTPase-activating protein for Rho, is associated with cell proliferation, morphology, and migration in human hepatocellular carcinoma. Biochem Biophys Res Commun, 2007. 355(1): p. 72-7.
20. Syed, V., et al., Identification of ATF-3, caveolin-1, DLC-1, and NM23-H2 as putative antitumorigenic, progesterone-regulated genes for ovarian cancer cells by gene profiling. Oncogene, 2005. 24(10): p. 1774-87.
21. Ullmannova, V. and N.C. Popescu, Inhibition of cell proliferation, induction of apoptosis, reactivation of DLC1, and modulation of other gene expression by dietary flavone in breast cancer cell lines. Cancer Detect Prev, 2007. 31(2): p. 110-8.
22. Gontan, C., et al., Sox2 is important for two crucial processes in lung development: branching morphogenesis and epithelial cell differentiation. Dev Biol, 2008. 317(1): p. 296-309.
23. Park, K.S., et al., Transdifferentiation of ciliated cells during repair of the respiratory epithelium. Am J Respir Cell Mol Biol, 2006. 34(2): p. 151-7.
24. Kwei, K.A., et al., Genomic profiling identifies TITF1 as a lineage-specific oncogene amplified in lung cancer. Oncogene, 2008. 27(25): p. 3635-40.
25. Tanaka, H., et al., Lineage-specific dependency of lung adenocarcinomas on the lung development regulator TTF-1. Cancer Res, 2007. 67(13): p. 6007-11.
26. Yesner, R., Large cell carcinoma of the lung. Semin Diagn Pathol, 1985. 2(4): p. 255-69.
27. Sequist, L.V., et al., Molecular predictors of response to epidermal growth factor receptor antagonists in non-small-cell lung cancer. J Clin Oncol, 2007. 25(5): p. 587-95.
28. Scagliotti, G. V., et al., Phase III study comparing cisplatin plus gemcitabine with cisplatin plus pemetrexed in chemotherapy-naive patients with advanced-stage non-small-cell lung cancer. J Clin Oncol, 2008. 26(21): p. 3543-51.
29. Cox, D.R., Regression models and life-tables. J R Stat Soc, 1972. 34: p. 187-220.
30. Boutros, P.C., et al., Prognostic gene signatures for non-small-cell lung cancer. Proc Natl Acad Sci U S A, 2009. 106(8): p. 2824-2828.
31. Chen, H.Y., et al., A five-gene signature and clinical outcome in non-small-cell lung cancer. N Engl J Med, 2007. 356(1): p. 11-20.
32. Guo, N.L., et al., Confirmation of Gene Expression-Based Prediction of Survival in Non-Small Cell Lung Cancer. Clin Cancer Res, 2008. 14(24): p. 8213-8220.
33. Roepman, P., et al., An immune response enriched 72-gene prognostic profile for early-stage non-small-cell lung cancer. Clin Cancer Res, 2009. 15(1): p. 284-90.
34. Beer, D. G., et al., Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nat Med, 2002. 8(8): p. 816-24.
35. Kikuchi, T., et al., Expression profiles of non-small cell lung cancers on cDNA microarrays: identification of genes for prediction of lymph-node metastasis and sensitivity to anti-cancer drugs. Oncogene, 2003. 22(14): p. 2192-205.
36. Lee, E. S., et al., Prediction of recurrence-free survival in postoperative non-small cell lung cancer patients by using an integrated model of clinical information and gene expression. Clin Cancer Res, 2008. 14(22): p. 7397-404.
37. Shedden, K., et al., Gene expression-based survival prediction in lung adenocarcinoma: a multi-site, blinded validation study. Nat Med, 2008. 14(8): p. 822-7.
38. Bhattacharjee, A., et al., Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc Natl Acad Sci U S A, 2001. 98(24): p. 13790-5.
39. Garber, M.E., et al., Diversity of gene expression in adenocarcinoma of the lung. Proc Natl Acad Sci U S A, 2001. 98(24): p. 13784-9.
40. Kobayashi, K., et al., Identification of genes whose expression is upregulated in lung adenocarcinoma cells in comparison with type II alveolar cells and bronchiolar epithelial cells in vivo. Oncogene, 2004. 23(17): p. 3089-96.

Claims

1. A method for classifying a test tissue sample as a malignant non-small cell lung carcinoma (NSCLC) by analysis of gene expression, comprising the steps of:
(a) assaying the expression levels of 5 or more genes selected from the genes set forth in Table 1;
(b) comparing the expression levels of 5 or more genes with the expression levels of said 5 or more genes in a known non-cancerous tissue sample;
wherein a change in the expression levels of said 5 or more genes indicates that the test tissue sample is a malignant NSCLC sample.
2. A method according to claim 1, wherein the comparison step in (b) comprises comparing the expression levels of said 5 or more genes with a two-dimensional hierarchical clustering of the expression levels of 5 or more genes as set forth in Table 1; wherein a correlation between the expression levels of said 5 or more genes and the pattern of gene expression levels observed in the two-dimensional clustering indicates that the test tissue sample is a malignant NSCLC sample.
3. A method according to claim 1 or claim 2, wherein the 5 or more genes comprise the genes set forth in Table 2.
4. A method according to claim 3, wherein the 5 or more genes consist of the genes set forth in Table 2.
5. A method for classifying a test tissue sample of a malignant non-small cell lung carcinoma (NSCLC) into LCC, ADC or SCC subtypes by analysis of gene expression, comprising the steps of:
(a) assaying the expression levels of 75 or more genes selected from the genes set forth in Table 3;
(b) comparing the expression levels of 75 or more genes with the expression levels of said 75 or more genes in known LCC, ADC or SCC NSCLC samples;
wherein a correlation between the expression levels of said 75 or more genes and the pattern of gene expression levels observed in Table 3 indicates that the test tissue sample is a malignant LCC, ADC or SCC subtype of NSCLC.
6. A method according to claim 5, wherein step (b) comprises comparing the expression levels of said 75 or more genes with a two-dimensional hierarchical clustering of the expression levels of 75 or more genes from NSCLC tumour samples as set forth in Table 3, wherein a correlation between the expression levels of said 75 or more genes and the pattern of gene expression levels observed in the two-dimensional clustering indicates that the test tissue sample is a malignant LCC, ADC or SCC subtype of NSCLC.
7. A method according to claim 5 or claim 6, wherein the 75 or more genes comprise the genes set forth in Table 4.
8. A method according to claim 7, wherein the 75 or more genes consist of the genes set forth in Table 4.
9. A method for predicting the survival time of a patient suffering from a non-small cell lung carcinoma (NSCLC) by analysis of gene expression, comprising the steps of:
(a) assaying the expression levels of 17 genes as set forth in Table 5;
(b) either (i) comparing the expression levels of said 17 genes with a two-dimensional hierarchical clustering of the expression levels of 17 genes from NSCLC tumour samples as set forth in Table 5; or (ii) fitting the expression levels of said 17 genes to a survival model to derive a prognostic index.
10. A method according to any preceding claim, wherein the gene expression levels are assessed by measuring the levels of nucleic acid transcripts from the genes.
11. A method according to claim 10, wherein the levels of the nucleic acid transcripts are measured by hybridisation, sequencing or selective amplification.
12. A method according to claim 10 or claim 11, wherein mRNA levels are assayed using one or more sets of nucleic acid probes which are specific for the target mRNA species, or copies thereof, under the assay conditions used.
13. A method according to any one of claims 1 to 9, wherein the gene expression levels are assessed by measuring the levels of the polypeptide gene products of the genes.
14. A method according to claim 13, wherein the levels of polypeptide gene products are measured by immunological analysis or aptamer-based molecular recognition.
15. A method according to claim 10 or claim 13, wherein the levels of nucleic acid transcripts are measured using microarray analysis.
16. A diagnostic kit for use in characterising NSCLC tumours, comprising a set of reagents for specifically measuring the abundance of the mRNA species transcribed from 5 or more of the genes set forth in Table 1, the 5 genes set forth in Table 2, 75 or more of the genes set forth in Table 3, the 75 genes set forth in Table 4 or the 17 genes set forth in Table 5; or the gene products expressed from said mRNA species.
17. A kit according to claim 16, wherein the reagents comprise a set of oligonucleotide primers or probes which hybridise specifically to said genes.
18. A kit according to claim 17, wherein the reagents comprise a plurality of oligonucleotides attached to a solid phase in the form of an array.
19. A kit according to claim 17, wherein the array consists of a library of oligonucleotides affixed to a solid phase, and said library of oligonucleotides consists substantially of oligonucleotides which are specific for the genes set forth in Table 2 and/or Table 4 and/or Table 5.
20. A kit according to claim 16, wherein said reagents are selected from immunoglobulin molecules, RNA aptamers and peptide aptamers.
21. A kit according to claim 16, for use in detecting the presence of NSCLC tumour tissue, comprising a set of nucleic acid probes or primers which recognises the transcripts of the genes set forth in Table 2.
22. A kit according to claim 16, for use in differentiating between LCC, ADC or SCC subtypes of NSCLC, comprising a set of nucleic acid probes or primers which recognises the transcripts of the genes set forth in Table 4.
23. A kit according to claim 16, for use in estimating the prognosis for survival of a patient suffering from NSCLC, comprising a set of nucleic acid probes or primers which recognises the transcripts of the genes set forth in Table 5.
24. A kit according to any one of claims 21 to 23, further including labelling means.
25. A kit according to claim 16, for use in detecting the presence of NSCLC tumour tissue, comprising a set of reagents selected from immunoglobulin molecules, RNA aptamers and peptide aptamers which recognises the polypeptides encoded by the genes set forth in Table 2.
26. A kit according to claim 16, for use in differentiating between LCC, ADC or SCC subtypes of NSCLC, comprising a set of reagents selected from immunoglobulin molecules, RNA aptamers and peptide aptamers which recognises the polypeptides encoded by the genes set forth in Table 4.
27. A kit according to claim 16, for use in estimating the prognosis for survival of a patient suffering from NSCLC, comprising a set of reagents selected from immunoglobulin molecules, RNA aptamers and peptide aptamers which recognises the polypeptides encoded by the genes set forth in Table 5.
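
By way of illustration only, the comparison steps recited in claims 5(b) and 9(b)(ii) might be sketched as follows. This is a minimal sketch under stated assumptions, not the method of the application: the subtype call is made by Pearson correlation against one reference profile per subtype (a nearest-profile rule substituted here for the two-dimensional hierarchical clustering), and the prognostic index is taken to be the linear predictor of a Cox-type proportional hazards model (reference 29). All gene names, expression values, coefficients and function names are hypothetical placeholders and are not taken from Tables 3 to 5.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (sd_x * sd_y)


def classify_subtype(test_profile, reference_profiles):
    """Return the subtype whose reference profile correlates best with the
    test profile, together with the correlation score for every subtype."""
    scores = {label: pearson(test_profile, ref)
              for label, ref in reference_profiles.items()}
    best = max(scores, key=scores.get)
    return best, scores


def prognostic_index(expression, coefficients):
    """Linear predictor of a Cox-type model: sum of coefficient * expression."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


# Hypothetical reference profiles over four placeholder genes
# (the claims recite 75 or more genes from Table 3).
REFERENCE = {
    "ADC": [8.0, 2.0, 3.0, 7.0],
    "SCC": [2.0, 9.0, 6.0, 1.0],
    "LCC": [5.0, 5.0, 9.0, 2.0],
}

test_sample = [7.5, 2.5, 3.5, 6.0]  # hypothetical test tissue profile
label, scores = classify_subtype(test_sample, REFERENCE)

# Hypothetical coefficients for two placeholder genes
# (claim 9 recites the 17 genes of Table 5).
COEFFICIENTS = {"GENE_A": 0.5, "GENE_B": -0.25}
index = prognostic_index({"GENE_A": 4.0, "GENE_B": 2.0}, COEFFICIENTS)
```

In such a sketch a higher prognostic index would correspond to a higher modelled hazard; any cut-off separating risk groups would have to be derived from a training cohort and is not implied here.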
PCT/EP2010/001773 2009-03-23 2010-03-22 Tumour gene profile WO2010108638A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0904957A GB0904957D0 (en) 2009-03-23 2009-03-23 Tumour gene profile
GB0904957.8 2009-03-23

Publications (2)

Publication Number Publication Date
WO2010108638A1 true WO2010108638A1 (en) 2010-09-30
WO2010108638A9 WO2010108638A9 (en) 2011-04-28

Family

ID=40640001

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2010/001773 WO2010108638A1 (en) 2009-03-23 2010-03-22 Tumour gene profile

Country Status (2)

Country Link
GB (1) GB0904957D0 (en)
WO (1) WO2010108638A1 (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012053200A1 (en) * 2010-10-21 2012-04-26 Oncotherapy Science, Inc. C18orf54 peptides and vaccines including the same
WO2013002750A2 (en) * 2011-06-29 2013-01-03 Biotheranostics, Inc. Determining tumor origin
US9212228B2 (en) 2005-11-24 2015-12-15 Ganymed Pharmaceuticals Ag Monoclonal antibodies against claudin-18 for treatment of cancer
US9512232B2 (en) 2012-05-09 2016-12-06 Ganymed Pharmaceuticals Ag Antibodies against Claudin 18.2 useful in cancer diagnosis
US9670553B2 (en) 2004-06-04 2017-06-06 Biotheranostics, Inc. Determining tumor origin
US9775785B2 (en) 2004-05-18 2017-10-03 Ganymed Pharmaceuticals Ag Antibody to genetic products differentially expressed in tumors and the use thereof
US10414824B2 (en) 2002-11-22 2019-09-17 Ganymed Pharmaceuticals Ag Genetic products differentially expressed in tumors and the use thereof
US10538816B2 (en) 2004-06-04 2020-01-21 Biotheranostics, Inc. Identification of tumors
EP3630293A4 (en) * 2017-05-22 2021-06-02 The National Institute for Biotechnology in the Negev Ltd. Biomarkers for diagnosis of lung cancer
US11430544B2 (en) 2005-06-03 2022-08-30 Biotheranostics, Inc. Identification of tumors and tissues

Patent Citations (43)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4631211A (en) 1985-03-25 1986-12-23 Scripps Clinic & Research Foundation Means for sequential solid phase organic synthesis and methods using the same
US4683202A (en) 1985-03-28 1987-07-28 Cetus Corporation Process for amplifying nucleic acid sequences
US4683202B1 (en) 1985-03-28 1990-11-27 Cetus Corp
US4683195A (en) 1986-01-30 1987-07-28 Cetus Corporation Process for amplifying, detecting, and/or-cloning nucleic acid sequences
US4683195B1 (en) 1986-01-30 1990-11-27 Cetus Corp
US5654151A (en) 1990-06-11 1997-08-05 Nexstar Pharmaceuticals, Inc. High affinity HIV Nucleocapsid nucleic acid ligands
US5270163A (en) 1990-06-11 1993-12-14 University Research Corporation Methods for identifying nucleic acid ligands
US5503978A (en) 1990-06-11 1996-04-02 University Research Corporation Method for identification of high affinity DNA ligands of HIV-1 reverse transcriptase
US5567588A (en) 1990-06-11 1996-10-22 University Research Corporation Systematic evolution of ligands by exponential enrichment: Solution SELEX
US5837832A (en) 1993-06-25 1998-11-17 Affymetrix, Inc. Arrays of nucleic acid probes on biological chips
WO1996038579A1 (en) 1995-06-02 1996-12-05 Nexstar Pharmaceuticals, Inc. High-affinity oligonucleotide ligands to growth factors
US6355420B1 (en) 1997-02-12 2002-03-12 Us Genomics Methods and products for analyzing polymers
US6403311B1 (en) 1997-02-12 2002-06-11 Us Genomics Methods of analyzing polymers using ordered label strategies
WO1999002671A1 (en) 1997-07-07 1999-01-21 Medical Research Council In vitro sorting method
US6210896B1 (en) 1998-08-13 2001-04-03 Us Genomics Molecular motors
US6263286B1 (en) 1998-08-13 2001-07-17 U.S. Genomics, Inc. Methods of analyzing polymers using a spatial network of fluorophores and fluorescence resonance energy transfer
US6790671B1 (en) 1998-08-13 2004-09-14 Princeton University Optically characterizing polymers
US6772070B2 (en) 1998-08-13 2004-08-03 U.S. Genomics, Inc. Methods of analyzing polymers using a spatial network of fluorophores and fluorescence resonance energy transfer
WO2000040712A1 (en) 1999-01-07 2000-07-13 Medical Research Council Optical sorting method
US20050014175A1 (en) 1999-06-28 2005-01-20 California Institute Of Technology Methods and apparatuses for analyzing polynucleotide sequences
US6440706B1 (en) 1999-08-02 2002-08-27 Johns Hopkins University Digital amplification
US6753147B2 (en) 1999-08-02 2004-06-22 The Johns Hopkins University Digital amplification
US6696022B1 (en) 1999-08-13 2004-02-24 U.S. Genomics, Inc. Methods and apparatuses for stretching polymers
US6762059B2 (en) 1999-08-13 2004-07-13 U.S. Genomics, Inc. Methods and apparatuses for characterization of single polymers
WO2001016375A2 (en) 1999-08-30 2001-03-08 The Government Of The United States Of America, As Represented By The Secretary, Department Of Health And Human Services High speed parallel molecular nucleic acid sequencing
WO2002022869A2 (en) 2000-09-13 2002-03-21 Medical Research Council Directed evolution method
US20020090979A1 (en) 2000-10-30 2002-07-11 Sydor John T. Method and wireless communication hub for data communications
WO2003044187A2 (en) 2001-11-16 2003-05-30 Medical Research Council Emulsion compositions
WO2005003375A2 (en) 2003-01-29 2005-01-13 454 Corporation Methods of amplifying and sequencing nucleic acids
WO2004069849A2 (en) 2003-01-29 2004-08-19 454 Corporation Bead emulsion nucleic acid amplification
WO2004070005A2 (en) 2003-01-29 2004-08-19 454 Corporation Double ended sequencing
WO2004070007A2 (en) 2003-01-29 2004-08-19 454 Corporation Method for preparing single-stranded dna libraries
US20040241725A1 (en) 2003-03-25 2004-12-02 Wenming Xiao Lung cancer detection
WO2005049829A1 (en) * 2003-05-30 2005-06-02 Astrazeneca Uk Limited Process
WO2005010145A2 (en) 2003-07-05 2005-02-03 The Johns Hopkins University Method and compositions for detection and enumeration of genetic variations
WO2005039389A2 (en) 2003-10-22 2005-05-06 454 Corporation Sequence-based karyotyping
US20050100932A1 (en) 2003-11-12 2005-05-12 Helicos Biosciences Corporation Short cycle methods for sequencing polynucleotides
WO2005054431A2 (en) 2003-12-01 2005-06-16 454 Corporation Method for isolation of independent, parallel chemical micro-reactions using a porous filter
WO2005073410A2 (en) 2004-01-28 2005-08-11 454 Corporation Nucleic acid amplification with continuous flow emulsion
US20070026424A1 (en) * 2005-04-15 2007-02-01 Powell Charles A Gene profiles correlating with histology and prognosis
WO2007034221A2 (en) 2005-09-23 2007-03-29 University Court Of The University Of Aberdeen Non small cell lung cancer therapy prognosis and target
EP1980627A1 (en) 2007-04-13 2008-10-15 Pangaea Biotech, S.A. Method of determining a chemotherapeutic regime and survival expectancy for non small cell lung cancer based on EGFR/CSF-1/CA IX expression
KR20090025898A (en) * 2007-09-07 2009-03-11 삼성전자주식회사 Marker, kit, microarray and method for predicting the risk of lung cancer recurrence

Non-Patent Citations (76)

* Cited by examiner, † Cited by third party
Title
"Pretreatment evaluation of non-small-cell lung cancer. The American Thoracic Society and The European Respiratory Society", AM J RESPIR CRIT CARE MED, vol. 156, no. 1, 1997, pages 320 - 32
AUSUBEL ET AL.: "Current Protocols in Molecular Biology", vol. 1, 1994, JOHN WILEY & SONS, INC.
AUSUBEL ET AL.: "Short Protocols in Molecular Biology, 4th Ed.", 1999, JOHN WILEY & SONS, INC.
BEER, D.G. ET AL.: "Gene-expression profiles predict survival of patients with lung adenocarcinoma", NAT MED, vol. 8, no. 8, 2002, pages 816 - 24
BHATTACHARJEE, A. ET AL.: "Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses", PROC NATL ACAD SCI U S A, vol. 98, no. 24, 2001, pages 13790 - 5
BOUTROS, P.C. ET AL.: "Prognostic gene signatures for non-small-cell lung cancer", PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES, vol. 106, no. 8, 2009, pages 2824 - 2828
CELIS ET AL., FEBS LETT, vol. 480, no. 1, 2000, pages 2 - 16
CHEN, H.Y. ET AL.: "A five-gene signature and clinical outcome in non-small-cell lung cancer", N ENGL J MED, vol. 356, no. 1, 2007, pages 11 - 20
CORTES, THE SCIENTIST, vol. 14, no. 17, 2000, pages 25
CORTESE, THE SCIENTIST, vol. 14, no. 11, 2000, pages 26
COX, D.R.: "Regression models and life-tables", J R STAT SOC, vol. 34, 1972, pages 187 - 220
E. J. MURRAY,: "Gene Transfer and Expression Protocols", THE HUMANA PRESS INC., pages: 109 - 128
EAKINS; CHU, TRENDS IN BIOTECHNOLOGY, vol. 17, 1999, pages 217 - 218
FERLAY, J. ET AL.: "Estimates of the cancer incidence and mortality in Europe in 2006", ANN ONCOL, vol. 18, no. 3, 2007, pages 581 - 92
FINGER ELIZABETH C ET AL: "TbetaRIII suppresses non-small cell lung cancer invasiveness and tumorigenicity.", CARCINOGENESIS MAR 2008 LNKD- PUBMED:18174241, vol. 29, no. 3, March 2008 (2008-03-01), pages 528 - 535, XP002581318, ISSN: 1460-2180 *
FRANKLIN, W.A.: "RAGE in lung tumors", AM J RESPIR CRIT CARE MED, vol. 175, no. 2, 2007, pages 106 - 7
FUJII, T. ET AL.: "A preliminary transcriptome map of non-small cell lung cancer", CANCER RES, vol. 62, no. 12, 2002, pages 3340 - 6
GARBER, M.E. ET AL.: "Diversity of gene expression in adenocarcinoma of the lung", PROC NATL ACAD SCI U S A, vol. 98, no. 24, 2001, pages 13784 - 9
GARBER, M.E. ET AL.: "Diversity of gene expression in adenocarcinoma of the lung", PROC NATL ACAD SCI USA, vol. 98, no. 24, 2001, pages 13784 - 9
GENTZ ET AL., PROC. NATL. ACAD. SCI. USA, vol. 86, 1989, pages 821 - 824
GIOVANNI PARMIGIANI, ELIZABETH S GARRETT, RAFAEL A IRIZARRY, SCOTT L ZEGER,: "The analysis of gene expression data: methods and software", 2003, SPRINGER
GONTAN, C. ET AL.: "Sox2 is important for two crucial processes in lung development: branching morphogenesis and epithelial cell differentiation", DEV BIOL, vol. 317, no. 1, 2008, pages 296 - 309
GUATELLI ET AL., PROC. NATL. ACAD. SCI. USA, vol. 87, 1990, pages 1874
GUO, N.L. ET AL.: "Confirmation of Gene Expression-Based Prediction of Survival in Non-Small Cell Lung Cancer", CLIN CANCER RES, vol. 14, no. 24, 2008, pages 8213 - 8220
GUTHRIE ET AL.: "Methods in Enzymology", vol. 194, 1991, ACADEMIC PRESS, INC., article "Guide to Yeast Genetics and Molecular Biology"
GWYNNE; PAGE: "Microarray analysis: the next revolution in molecular biology", SCIENCE, 6 August 1999 (1999-08-06)
HOU JUN ET AL: "Gene expression-based classification of non-small cell lung carcinomas and survival prediction.", PLOS ONE 2010 LNKD- PUBMED:20421987, vol. 5, no. 4, 2010, pages E10312, XP002581317, ISSN: 1932-6203 *
INNIS ET AL.: "PCR Protocols: A Guide to Methods and Applications", 1990, ACADEMIC PRESS
JONES, M.H. ET AL.: "Two prognostically significant subtypes of high-grade lung neuroendocrine tumours independent of small-cell and large-cell neuroendocrine carcinomas identified by gene expression profiles", LANCET, vol. 363, no. 9411, 2004, pages 775 - 81
KIKUCHI, T. ET AL.: "Expression profiles of non-small cell lung cancers on cDNA microarrays: identification of genes for prediction of lymph-node metastasis and sensitivity to anti-cancer drugs", ONCOGENE, vol. 22, no. 14, 2003, pages 2192 - 205
KIM, T.Y. ET AL.: "DLC-1, a GTPase-activating protein for Rho, is associated with cell proliferation, morphology, and migration in human hepatocellular carcinoma", BIOCHEM BIOPHYS RES COMMUN, vol. 355, no. 1, 2007, pages 72 - 7
KOBAYASHI, K. ET AL.: "Identification of genes whose expression is upregulated in lung adenocarcinoma cells in comparison with type II alveolar cells and bronchiolar epithelial cells in vivo", ONCOGENE, vol. 23, no. 17, 2004, pages 3089 - 96
KWEI, K.A. ET AL.: "Genomic profiling identifies TITF1 as a lineage-specific oncogene amplified in lung cancer", ONCOGENE, vol. 27, no. 25, 2008, pages 3635 - 40
KY. CHAN, MUTATION RESEARCH, vol. 573, 2005, pages 13 - 40
LANDEGREN, U. ET AL., SCIENCE, vol. 242, 1988, pages 229 - 237
LAU, S.K. ET AL.: "Three-gene prognostic classifier for early-stage non small-cell lung cancer", J CLIN ONCOL, vol. 25, no. 35, 2007, pages 5562 - 9
LEE, E.S. ET AL.: "Prediction of recurrence free survival in postoperative non-small cell lung cancer patients by using an integrated model of clinical information and gene expression", CLIN CANCER RES, vol. 14, no. 22, 2008, pages 7397 - 404
LEMIEUX ET AL., MOLECULAR BREEDING, vol. 4, 1998, pages 277 - 289
LEWIS, R., GENETIC ENGINEERING NEWS, vol. 10, no. 1, 1990, pages 54 - 55
LIZARDI ET AL., BIO/TECHNOLOGY, vol. 6, 1988, pages 1197
LIZARDI ET AL., NAT GENET, vol. 19, 1998, pages 225
LOCKHART; WINZELER, NATURE, vol. 405, no. 6788, 2000, pages 827 - 836
MARK SCHENA: "Microarray Biochip Technology", EATON PUBLISHING COMPANY
MCPHERSON ET AL.: "PCR", vol. 1, 1991, OXFORD UNIVERSITY PRESS
MOK ET AL., GYNAECOLOGIC ONCOLOGY, vol. 52, 1994, pages 247 - 252
NATURE GENETICS, January 1999 (1999-01-01)
PARK, K.S. ET AL.: "Transdifferentiation of ciliated cells during repair of the respiratory epithelium", AM J RESPIR CELL MOL BIOL, vol. 34, no. 2, 2006, pages 151 - 7
POTTI, A. ET AL.: "A genomic strategy to refine prognosis in early-stage non-small-cell lung cancer", N ENGL J MED, vol. 355, no. 6, 2006, pages 570 - 80
R. I. FRESHNEY: "Culture of Animal Cells: A Manual of Basic Technique, 2nd Ed.", 1987, LISS, INC.
RAZ, D.J. ET AL.: "A multigene assay is prognostic of survival in patients with early-stage lung adenocarcinoma", CLIN CANCER RES, vol. 14, no. 17, 2008, pages 5565 - 70
ROEPMAN, P. ET AL.: "An immune response enriched 72-gene prognostic profile for early-stage non-small-cell lung cancer", CLIN CANCER RES, vol. 15, no. 1, 2009, pages 284 - 90
SAMBROOK ET AL.: "Molecular Cloning: A Laboratory Manual", 1989
SAMBROOK ET AL.: "Molecular Cloning: A Laboratory Manual, 2d ed.", COLD SPRING HARBOR LABORATORY PRESS
SCAGLIOTTI, G.V. ET AL.: "Phase III study comparing cisplatin plus gemcitabine with cisplatin plus pemetrexed in chemotherapy-naive patients with advanced-stage non-small-cell lung cancer", J CLIN ONCOL, vol. 26, no. 21, 2008, pages 3543 - 51
SCHENA; DAVIS: "DNA Microarrays: A Practical Approach", 1999, OXFORD UNIVERSITY PRESS, article "Genes, Genomes and Chips"
SCHENA; DAVIS: "PCR Methods Manual", article "Parallel Analysis with Biological Chips"
SEQUIST, L.V. ET AL.: "Molecular predictors of response to epidermal growth factor receptor antagonists in non-small-cell lung cancer", J CLIN ONCOL, vol. 25, no. 5, 2007, pages 587 - 95
SHALON ET AL., GENOME RES, vol. 6, no. 7, 1996, pages 639 - 45
SHEDDEN, K. ET AL.: "Gene expression-based survival prediction in lung adenocarcinoma: a multi-site, blinded validation study", NAT MED, vol. 14, no. 8, 2008, pages 822 - 7
SINGHAL SUNIL ET AL: "Gene expression profiling of non-small cell lung cancer.", LUNG CANCER (AMSTERDAM, NETHERLANDS) JUN 2008 LNKD- PUBMED:18440087, vol. 60, no. 3, June 2008 (2008-06-01), pages 313 - 324, XP002581316, ISSN: 0169-5002 *
SKRZYPSKI, M. ET AL.: "Three-gene expression signature predicts survival in early-stage squamous cell carcinoma of the lung", CLIN CANCER RES, vol. 14, no. 15, 2008, pages 4794 - 9
STRAZISAR MOJCA ET AL: "The expression of COX-2, hTERT, MDM2, LATS2 and S100A2 in different types of non-small cell lung cancer (NSCLC).", CELLULAR & MOLECULAR BIOLOGY LETTERS 2009 LNKD- PUBMED:19238334, vol. 14, no. 3, 23 February 2009 (2009-02-23), pages 442 - 456, XP002594118, ISSN: 1689-1392 *
SUN, Z.; D.A. WIGLE; P. YANG: "Non-overlapping and non-cell-type-specific gene expression signatures predict lung cancer survival", J CLIN ONCOL, vol. 26, no. 6, 2008, pages 877 - 83
SYED, V. ET AL.: "Identification of ATF-3, caveolin-1, DLC-1, and NM23-H2 as putative antitumorigenic, progesterone-regulated genes for ovarian cancer cells by gene profiling", ONCOGENE, vol. 24, no. 10, 2005, pages 1774 - 87
TAGUCHI, A. ET AL.: "Blockade of RAGE-amphoterin signalling suppresses tumour growth and metastases", NATURE, vol. 405, no. 6784, 2000, pages 354 - 60
TANAKA, H. ET AL.: "Lineage-specific dependency of lung adenocarcinomas on the lung development regulator TTF-1", CANCER RES, vol. 67, no. 13, 2007, pages 6007 - 11
ULLMANNOVA, V.; N.C. POPESCU: "Inhibition of cell proliferation, induction of apoptosis, reactivation of DLC1, and modulation of other gene expression by dietary flavone in breast cancer cell lines", CANCER DETECT PREV, vol. 31, no. 2, 2007, pages 110 - 8
VELCULESCU ET AL., SCIENCE, vol. 270, no. 5235, pages 484 - 487
WALKER ET AL., PNAS (USA), vol. 80, 1992, pages 392
WILSON ET AL., CELL, vol. 37, 1984, pages 767
WU, D. Y.; WALLACE, R. B., GENOMICS, vol. 4, 1989, pages 560
YAO, R. ET AL.: "Differentially expressed genes associated with mouse lung tumor progression", ONCOGENE, vol. 21, no. 37, 2002, pages 5814 - 21
YESNER, R.: "Large cell carcinoma of the lung", SEMIN DIAGN PATHOL, vol. 2, no. 4, 1985, pages 255 - 69
YUAN, B.Z. ET AL.: "DLC-1 gene inhibits human breast cancer cell growth and in vivo tumorigenicity", ONCOGENE, vol. 22, no. 3, 2003, pages 445 - 50
YUAN, B.Z.; M.E. DURKIN; N.C. POPESCU: "Promoter hypermethylation of DLC-1, a candidate tumor suppressor gene, in several common human cancers", CANCER GENET CYTOGENET, vol. 140, no. 2, 2003, pages 113 - 7
ZHANG XU ET AL: "Expression profiles of early esophageal squamous cell carcinoma by cDNA microarray.", CANCER GENETICS AND CYTOGENETICS OCT 2009 LNKD- PUBMED:19737650, vol. 194, no. 1, October 2009 (2009-10-01), pages 23 - 29, XP002581315, ISSN: 1873-4456 *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10414824B2 (en) 2002-11-22 2019-09-17 Ganymed Pharmaceuticals Ag Genetic products differentially expressed in tumors and the use thereof
US9775785B2 (en) 2004-05-18 2017-10-03 Ganymed Pharmaceuticals Ag Antibody to genetic products differentially expressed in tumors and the use thereof
US10538816B2 (en) 2004-06-04 2020-01-21 Biotheranostics, Inc. Identification of tumors
US9670553B2 (en) 2004-06-04 2017-06-06 Biotheranostics, Inc. Determining tumor origin
US11430544B2 (en) 2005-06-03 2022-08-30 Biotheranostics, Inc. Identification of tumors and tissues
US10174104B2 (en) 2005-11-24 2019-01-08 Ganymed Pharmaceuticals Gmbh Monoclonal antibodies against claudin-18 for treatment of cancer
US11739139B2 (en) 2005-11-24 2023-08-29 Astellas Pharma Inc. Monoclonal antibodies against Claudin-18 for treatment of cancer
US9212228B2 (en) 2005-11-24 2015-12-15 Ganymed Pharmaceuticals Ag Monoclonal antibodies against claudin-18 for treatment of cancer
US9499609B2 (en) 2005-11-24 2016-11-22 Ganymed Pharmaceuticals Ag Monoclonal antibodies against claudin-18 for treatment of cancer
US9751934B2 (en) 2005-11-24 2017-09-05 Ganymed Pharmaceuticals Ag Monoclonal antibodies against claudin-18 for treatment of cancer
US10738108B2 (en) 2005-11-24 2020-08-11 Astellas Pharma Inc. Monoclonal antibodies against claudin-18 for treatment of cancer
US10017564B2 (en) 2005-11-24 2018-07-10 Ganymed Pharmaceuticals Gmbh Monoclonal antibodies against claudin-18 for treatment of cancer
WO2012053200A1 (en) * 2010-10-21 2012-04-26 Oncotherapy Science, Inc. C18orf54 peptides and vaccines including the same
CN103282494A (en) * 2010-10-21 2013-09-04 肿瘤疗法科学股份有限公司 C18orf54 peptides and vaccines including the same
CN103282494B (en) * 2010-10-21 2015-06-17 肿瘤疗法科学股份有限公司 C18orf54 peptides and vaccines including the same
WO2013002750A2 (en) * 2011-06-29 2013-01-03 Biotheranostics, Inc. Determining tumor origin
WO2013002750A3 (en) * 2011-06-29 2013-05-10 Biotheranostics, Inc. Determining tumor origin
US10053512B2 (en) 2012-05-09 2018-08-21 Ganymed Pharmaceuticals Ag Antibodies against claudin 18.2 useful in cancer diagnosis
US9512232B2 (en) 2012-05-09 2016-12-06 Ganymed Pharmaceuticals Ag Antibodies against Claudin 18.2 useful in cancer diagnosis
EP3630293A4 (en) * 2017-05-22 2021-06-02 The National Institute for Biotechnology in the Negev Ltd. Biomarkers for diagnosis of lung cancer
US11408887B2 (en) 2017-05-22 2022-08-09 The National Institute for Biotechnology in the Negev Ltd. Biomarkers for diagnosis of lung cancer

Also Published As

Publication number Publication date
GB0904957D0 (en) 2009-05-06
WO2010108638A9 (en) 2011-04-28

Similar Documents

Publication Publication Date Title
WO2010108638A1 (en) Tumour gene profile
US8026055B2 (en) Materials and methods for prognosing lung cancer survival
JP6404304B2 (en) Prognosis prediction of melanoma cancer
JP5745848B2 (en) Signs of growth and prognosis in gastrointestinal cancer
JP2008521383A (en) Methods, systems, and arrays for classifying cancer, predicting prognosis, and diagnosing based on association between p53 status and gene expression profile
WO2008089577A1 (en) Breast cancer gene array
EP3090265B1 (en) Prostate cancer gene profiles and methods of using the same
CA2588253A1 (en) Methods and systems for prognosis and treatment of solid tumors
WO2012158780A2 (en) Lung cancer signature
WO2010063121A1 (en) Methods for biomarker identification and biomarker for non-small cell lung cancer
US9347088B2 (en) Molecular signature of liver tumor grade and use to evaluate prognosis and therapeutic regimen
WO2006089185A2 (en) Pharmacogenomic markers for prognosis of solid tumors
WO2008157277A1 (en) Methods for evaluating breast cancer prognosis
WO2006037485A2 (en) Methods and kits for the prediction of therapeutic success and recurrence free survival in cancer therapy
US20210381057A1 (en) Recurrence gene signature across multiple cancer types
WO2013079215A1 (en) Method for classifying tumour cells
WO2010000796A1 (en) A method for predicting clinical outcome of patients with non-small cell lung carcinoma
US20130303400A1 (en) Multimarker panel
Binder et al. Molecular characterization of long-term survivors of glioblastoma using genome-and transcriptome-wide profiling

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application Ref document number: 10709983; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase Ref country code: DE
122 Ep: pct application non-entry in european phase Ref document number: 10709983; Country of ref document: EP; Kind code of ref document: A1