© 2004 European Society for Medical Oncology
Clinical trial design for microarray predictive marker discovery and assessment
Departments of 1 Breast Medical Oncology and 2 Biostatistics, The University of Texas M. D. Anderson Cancer Center, Houston, TX, USA
* Correspondence to: Dr L. Pusztai, Department of Breast Medical Oncology, Unit 424, The University of Texas M. D. Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, TX 77030-4009, USA. Tel: +1-713-792-2817; Fax: +1-713-794-4385; Email: lpusztai{at}mdanderson.org
| Abstract |
|---|
Transcriptional profiling technologies that simultaneously measure the expression of thousands of mRNA species represent a powerful new clinical research tool. Similar to previous laboratory analytical methods including immunohistochemistry, PCR and in situ hybridization, this new technology may also find its niche in routine diagnostics. Outcome predictors discovered by these methods may be quite different from previous single-gene markers. These novel tests will probably combine the information embedded in the expression of multiple genes with mathematical prediction algorithms to formulate classification rules and predict outcome. The performance of diagnostic tests based on machine-learning algorithms may improve as they are trained on larger and larger sets of samples, and several generations of tests with improving accuracy may be introduced sequentially. Several gene-expression profiling technology platforms are mature enough for clinical testing. The most important next step needed for further progress is the development and validation of multigene predictors in prospectively designed clinical trials to determine the true accuracy and clinical value of this new technology. This manuscript reviews methodological and statistical issues relevant to clinical trial design to discover and validate multigene predictors of response to therapy.
Key words: clinical trials, microarrays, gene-expression profiling, predictive markers, multigene predictors
| Introduction |
|---|
Decades of extensive research have yielded few clinically useful single molecular markers predictive of response to chemotherapy in patients with cancer [1
It may be useful to think of marker discovery studies as conceptually similar to clinical trials that lead to the introduction of new drugs. The hallmark of clinical drug development is the multistage trial process. A similar focused, prospective, multistage evaluation of genomic markers could facilitate the introduction of new diagnostic markers into the clinic [8]. Phase I–II marker discovery studies would be expected to show that a technology can be reliably and reproducibly applied to clinical specimens and that the estimated predictive accuracy of the proposed test falls within a range that is considered clinically useful. Phase III marker validation studies would then evaluate the predictor in a larger number of cases to demonstrate that clinical outcome is better when the new marker is used for decision making compared with the current standard, which may be a recommendation based on another marker or on no marker at all.
| Tissue sampling |
|---|
Gene expression profiling with DNA microarrays is best performed on fresh or frozen tissues, because the accuracy of the results depends on good-quality RNA. Many investigators use excisional biopsy specimens; however, core needle biopsies or fine needle aspiration of cancer can also yield sufficient amounts of RNA for microarray experiments [9
| Gene expression data |
|---|
Reproducible measurements are key to developing reliable predictors of outcome. Measurement error will lead to underestimation of the effect of a predictive marker and may also lower the power to detect any predictive effects [17
Gene-expression profiling, particularly if performed by a high-volume central laboratory, may in fact lend itself to greater quality control than can be achieved with many of the current molecular diagnostic tests, which are most commonly performed by end users. RNA quality and quantity can be measured, and the efficiency of the RT reaction, probe labeling and hybridization process can be monitored. Arrays routinely contain several types of positive and negative controls embedded in their array matrix. Global statistics can be applied to the gene-expression data to compare any new result with existing profiles in a previous dataset, enabling investigators to flag results that appear to be beyond acceptable limits of variation.
A quality-assured, properly normalized gene-expression dataset is a data matrix with a row for each sample and a column for each gene. Several standard algorithms and mathematical methods are available for analyzing such datasets (see sections on gene selection and class prediction below). A unique challenge for investigators using microarray data is how to apply observations made on one platform to data generated on another platform. Different arrays contain different sets of genes. Even for the same set of genes, different oligonucleotide sequences may be used as probes, which could result in variable signal intensity. Investigators who present their gene-expression results as the ratio of the test samples to a common reference sample often use a reference that differs from laboratory to laboratory. Furthermore, different normalization methods and signal detection techniques yield different numeric results, even for the same raw data. Not surprisingly, cross-platform validation of results has proved difficult [24].
| Gene selection for outcome prediction |
|---|
The fundamental goal of gene-expression profile-based prediction is to identify a set of informative genes and develop a class-prediction algorithm that formulates a rule, based on the expression level of the informative genes, by which to categorize cases into outcome groups [25
A common and erroneous assumption is that a quick scrutiny of these genes will reveal the underlying biological differences between the two groups of samples (for example, that examination of the list of differentially expressed genes between chemotherapy-sensitive and -resistant tumors will reveal why one group is sensitive to treatment and the other is not). This biological information is partly embedded in the gene list, but may be difficult to recognize for both analytical and biological reasons. Differentially expressed gene lists are unstable, particularly when they are generated from small sets of samples and when each gene has only limited discriminating value. Different statistical methods applied to the same data yield distinct but overlapping gene lists, and the rank orders of genes are particularly unstable.
Even when truly differentially expressed genes are identified, these genes may or may not contribute to the most important biological differences between the two sets of samples. Many of these genes may represent bystanders rather than playing a causative role in the biological differences between the groups. Differentially expressed gene lists are thus best considered to be hypothesis generating.
Despite the widespread use of univariate significance screening methods to select gene sets for class prediction, these methods have not been rigorously compared with the more conventional optimal feature-selection methods that form part of the classification analysis. Gene sets composed of the few most individually significant genes need not necessarily have better prediction value than other sets, particularly if the data do not contain several individually strong predictors [28, 29]. In other words, genes that are not individually predictive may predict well when used in combination with others. However, the search for such combinations in array data is a formidable computational challenge, given the astronomical number of potential combinations to be examined and the large number of spurious associations that may be found.
| Validation of differentially expressed genes |
|---|
The appropriate validation of a microarray experiment depends on the hypothesis being tested. Gene-expression profiling methods are commonly used for three different purposes: the first involves using microarrays as a screening tool to identify individual genes of interest that might contribute to an important biological function; the second is to obtain insight into complex biological processes by examining thousands of genes simultaneously; and the third is to use this technology as a classification tool to sort cases into clinically important categories.
With regard to the experimental data, the first two applications assume that a particular bright spot on the microarray indicates that one particular gene, corresponding to the spot, is up- (or down-) regulated. The third application assumes that a spot on the microarray lights up differently between different groups of samples. For this latter classification application, it is of secondary importance whether the signal originates from a single gene or results from a composite signal, as long as it is consistently associated with one outcome group. A composite signal could arise from non-specific or cross-hybridization of the probe sequence with genes other than the one corresponding to the spot on the array. Consequently, validation of the expression of particular genes by using different methods, such as RT-PCR, is critical for the first two applications, whereas the most appropriate validation for the third application is testing the predictor on independent sets of cases.
An inherent limitation of DNA microarrays is that not all of the thousands of genes included on the array are expected to generate a perfectly specific and linear signal. A single hybridization condition is applied to tens of thousands of distinct nucleic acid hybridization reactions, resulting in suboptimal conditions for some reactions. Some degree of non-specific and cross-hybridization for a substantial number of probes is unavoidable. In general, the reported RT-PCR confirmation rate of microarray data for individual genes is 70% [30, 31].
| Class prediction |
|---|
Class-prediction analysis typically involves supervised classification, in which statistical learning algorithms formulate the classification rules that connect gene expression profiles to observed patient outcomes [26
Because there are hundreds or thousands of times more genes than samples, the selection of a gene subset that yields maximal prediction is challenging. Several sets of genes can be selected that predict well on the original data but fail to predict accurately on independent data. The explanation for this phenomenon, called overfitting, is that artifacts or noise that are particular to the original sample are being fit to yield more accurate predictions for the original sample, but these features are not present in other data and their inclusion leads to loss in accuracy on independent data.
One way to minimize overfitting is repeated cross-validation within the training data, both to determine the best classifier and to estimate the predictive accuracy that could be expected when the predictor is applied to independent cases. During this process, a subset of cases is omitted from the training set; discriminating genes are identified from the remaining cases, and a class predictor is constructed and then tested on the held-out cases [33]. This process is repeated many times, with different sets of cases left out on each occasion, and the predictive accuracies are averaged to yield estimated error rates for each classifier. Typically, the classifier requiring the fewest genes to achieve an acceptable rate of correct classification is considered the best. To determine the true predictive accuracy of this optimized classifier, it must be tested on an independent set of data. What counts as acceptable predictive accuracy depends on the clinical outcome to be predicted. For example, high (>90%) accuracy would be required for any clinically useful predictor of prognosis for breast cancer, because misclassification could lead to recommending against potentially life-saving adjuvant therapy. On the other hand, lower accuracy may be sufficient for a response predictor developed to select one drug over another. Most patients would select a drug that gives them a 60% chance of benefit over another drug that gives only a 30% chance of benefit, particularly if toxicities are comparable.
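The cross-validation loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of any cited study: the gene-ranking score (absolute difference in class means), the nearest-centroid classifier, the toy data generator and all function names are hypothetical choices made for the example. The point it illustrates is that gene selection is repeated inside every fold, using the training cases only, so that the held-out cases play no part in choosing the genes.

```python
import random
from statistics import mean


def make_toy_data(n_per_class=20, n_genes=25, shift=5.0, seed=1):
    """Hypothetical data: gene 0 is shifted in responders, the rest is noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            row[0] += shift * label
            X.append(row)
            y.append(label)
    return X, y


def select_genes(X, y, k):
    """Rank genes by absolute difference in class means; keep the top k."""
    scores = []
    for g in range(len(X[0])):
        pos = [row[g] for row, label in zip(X, y) if label == 1]
        neg = [row[g] for row, label in zip(X, y) if label == 0]
        scores.append((abs(mean(pos) - mean(neg)), g))
    return [g for _, g in sorted(scores, reverse=True)[:k]]


def nearest_centroid_error(X_train, y_train, X_test, y_test, genes):
    """Fit class centroids on the training cases, score the held-out cases."""
    centroids = {}
    for label in (0, 1):
        rows = [row for row, lab in zip(X_train, y_train) if lab == label]
        centroids[label] = [mean(r[g] for r in rows) for g in genes]
    wrong = 0
    for row, label in zip(X_test, y_test):
        dist = {lab: sum((row[g] - c) ** 2 for g, c in zip(genes, cen))
                for lab, cen in centroids.items()}
        wrong += min(dist, key=dist.get) != label
    return wrong / len(X_test)


def repeated_cv_error(X, y, k_genes, n_folds=5, n_repeats=10, seed=0):
    """Average held-out error over repeated k-fold splits.

    Assumes every training fold contains cases from both outcome groups.
    """
    rng = random.Random(seed)
    fold_errors = []
    for _ in range(n_repeats):
        order = list(range(len(X)))
        rng.shuffle(order)
        for f in range(n_folds):
            held_out = order[f::n_folds]
            held_out_set = set(held_out)
            train = [i for i in order if i not in held_out_set]
            X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
            X_te, y_te = [X[i] for i in held_out], [y[i] for i in held_out]
            # Genes are re-selected in every fold from the training cases only.
            genes = select_genes(X_tr, y_tr, k_genes)
            fold_errors.append(
                nearest_centroid_error(X_tr, y_tr, X_te, y_te, genes))
    return mean(fold_errors)
```

Calling `repeated_cv_error(X, y, k_genes=k)` for several values of `k` gives one estimated error rate per candidate classifier, from which the smallest gene set with acceptable error could be chosen.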
With any number of genes randomly included in a classifier, chance could produce a certain number of correct predictions. To assess whether the classification error rates differ significantly from what chance alone could produce, a random-label permutation test is performed [34]. During this process, the outcome labels of the cases (i.e. responder versus non-responder) are randomly reassigned. Hundreds of such datasets with randomly permuted labels are created, and the classifier generated from the true data is applied to the randomly permuted data. The observed error rate for the true dataset is compared with the distribution of the error rates observed with the randomly permuted datasets to calculate a permutation P value. A model whose predictions are correct significantly more often than expected by chance is taken forward for further validation on independent data to determine its true predictive accuracy.
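The permutation test described above might be sketched as follows. The function names are hypothetical, and the "+1" in the numerator and denominator (which keeps the P value from being exactly zero) is a common convention assumed here rather than something stated in the text.

```python
import random


def error_rate(predicted, labels):
    """Fraction of cases whose predicted class differs from the label."""
    return sum(p != t for p, t in zip(predicted, labels)) / len(labels)


def permutation_p_value(predicted, labels, n_perm=1000, seed=0):
    """Compare the observed error rate with error rates under shuffled labels.

    `predicted` holds the classifier's calls on the true data; the outcome
    labels are shuffled n_perm times and the error is recomputed each time.
    """
    rng = random.Random(seed)
    observed = error_rate(predicted, labels)
    shuffled = list(labels)
    as_good = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if error_rate(predicted, shuffled) <= observed:
            as_good += 1
    # Assumed convention: count the observed labeling as one permutation.
    return (as_good + 1) / (n_perm + 1)
```

A small P value indicates that few random relabelings are predicted as well as the true labels; a classifier that calls every case a responder, by contrast, does equally well on every relabeling and receives a P value of 1.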
An important feature of learning-based clinical outcome predictors is that the performance of the predictor may improve as the training sample size increases until it reaches a plateau. Figure 1 illustrates computer-simulated learning curves of support vector machine (SVM)-based classifiers for different gene expression datasets [34]. It shows how the estimated cross-validation classification error rates decrease as the training sample size increases. For some prediction problems where the difference in gene-expression profiles between the groups to be separated is substantial, the learning curves are steep, and relatively few samples may yield good predictors. For other prediction problems, large training samples are needed to yield a predictor that operates close to its plateau of accuracy. The simulation used existing, publicly available datasets as starting data and projected how the predictor would improve if it were trained on increasingly larger sets of cases [35].
| Sample size calculations for multigene predictive marker discovery |
|---|
To discover a predictive marker for a given treatment, a single-arm study design may be sufficient. A simple strategy is to base sample-size calculations on the number needed to ensure adequate power for the univariate screening of discriminating genes; in other words, how many training samples are needed to identify reliably an individually predictive gene? If we can assume that the array data are approximately normally distributed on some scale, then we can use standard two-sample testing methods to perform sample-size calculations [36
) and type II (β) error rates, the level of inter-patient variability, the size of the difference in the mean expression values and the prevalence of response among patients. Differences in the mean expression value of a gene between responders and non-responders can be specified in terms of standardized effect size (SES), which is the mean difference of expression values between the two groups divided by the standard deviation (SD). One can perform sample-size calculations by assuming a given prevalence of response (i.e. response rate) and by specifying acceptable α and β error rates. For example, for two-sided α = 1% and β = 10%, and assuming a 10% response rate, we would need a total of 96 patients to detect an SES of 1 or greater for any particular gene, 170 patients to detect an SES of 0.75 and 381 patients to detect an SES of 0.5 (Table 1). If we wanted to adjust for multiple comparisons by using a smaller α value, the sample size would increase.
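How these inputs combine can be illustrated with a textbook normal-approximation formula for comparing two group means when the groups are of unequal size, N = (z_{1-α/2} + z_{1-β})² / (π(1 − π) · SES²), where π is the response rate. This is a sketch under that assumption only; the function name is hypothetical, and because the method behind Table 1 is not specified here, the function is not expected to reproduce the table's values.

```python
from math import ceil
from statistics import NormalDist


def total_sample_size(ses, response_rate, alpha=0.01, beta=0.10):
    """Total patients needed to detect a standardized effect size `ses`.

    Normal approximation for a two-sided, two-sample comparison of means
    in which a fraction `response_rate` of patients are responders.
    """
    z = NormalDist().inv_cdf
    z_sum = z(1 - alpha / 2) + z(1 - beta)
    return ceil(z_sum ** 2 / (response_rate * (1 - response_rate) * ses ** 2))
```

Under this approximation, the required sample size grows with 1/SES² (halving the effect size roughly quadruples the number of patients), grows as the response rate moves away from 50%, and grows as α is made smaller.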
Using this approach to determine the sample size needed for discovery is problematic on at least two counts: (i) it requires specification of the SD, which can vary from gene to gene; and (ii) it can lead to a high false discovery rate (FDR; the proportion of genes identified as being significantly different when, in fact, they are not), because there are many times more variables (genes) than samples in each group. Preliminary gene expression data from 15–20 cases in each outcome group can be used to estimate values for the SD for the genes. Several methods have been developed to control the FDR within pre-specified levels [37
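One widely used FDR-controlling method is the Benjamini-Hochberg step-up procedure (reference 37). A minimal sketch is given below; the function name and the default FDR level `q` are assumptions made for illustration.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return indices of genes declared significant at FDR level q.

    Step-up rule: find the largest rank k such that the k-th smallest
    P value is <= k * q / m, then reject the k smallest P values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])
```

Because the rule looks for the largest qualifying rank, a gene whose P value misses its own threshold can still be selected when a gene ranked below it passes.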
A few other methods for sample-size calculation for supervised classification have also been reported. The approaches of Hwang et al. [40] and Fisher and van Belle [41] are not related to gene selection, but rather are based on global tests of significance between outcome groups. In both approaches, the global test is based on Fisher's linear discriminant analysis. The method recently proposed by Mukherjee et al. [35] estimates optimal sample size for discovery on the basis of assessing the statistical significance of classification performance and using preliminary results to extrapolate classification results for larger sample sizes. This is an appealing strategy which exploits the assumption that the predictive accuracy of learning algorithm-based predictors will improve as they are trained on larger and larger sets of samples until they reach a plateau of accuracy. These authors fit inverse power-law models to the preliminary data to predict, for a given classifier, how fast the classification performance increases with the sample size (Figure 1).
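To make the idea of an inverse power-law learning curve concrete, the sketch below assumes the form e(n) = a · n^(−α) + b, where e(n) is the error rate at training size n and b is the plateau error. For simplicity it treats b as known and fits a and α by least squares on the log scale; this is a simplification chosen for illustration and not necessarily the fitting procedure of the cited authors. All names are hypothetical.

```python
from math import exp, log


def fit_inverse_power_law(sizes, errors, plateau=0.0):
    """Fit e(n) = a * n**(-alpha) + plateau with the plateau held fixed.

    Linear least squares on log(e - plateau) versus log(n); every error
    must exceed the assumed plateau.
    """
    xs = [log(n) for n in sizes]
    ys = [log(e - plateau) for e in errors]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return exp(intercept), -slope  # (a, alpha)


def predicted_error(n, a, alpha, plateau=0.0):
    """Extrapolated error rate for a training set of n cases."""
    return a * n ** (-alpha) + plateau
```

Once a and α have been estimated from cross-validation error rates at a few small training sizes, `predicted_error` can be evaluated at larger n to judge how many additional cases would bring the classifier close to its plateau.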
| Study design for predictive marker validation |
|---|
Once a candidate predictor has been identified and its predictive accuracy has been estimated, the goal of an independent validation study is to: (i) define the sensitivity, specificity and the positive (PPV) and negative predictive values (NPV) with greater precision; and (ii) prove the clinical utility of the test. Different trial designs may be needed for different clinical situations, but there may not be a single best design for any particular clinical scenario. Several designs could yield complementary information (Figure 2). An important question for a predictive marker validation study is whether the response rate is higher (and how much higher) in the group that is predicted to respond compared with unselected patients, who may represent the current standard of care (in the case of chemotherapy, for example). Single-arm validation studies may be designed to address this issue, and sample size can be computed based on the known response rate in unselected patients and the estimated sensitivity and PPV of the proposed test. For example, we may assume that the response rate in unselected patients is 30% and that the PPV of the test is 60%, which indicates a two-fold greater chance of response in patients who test positive than in unselected patients. To prove that marker-positive patients have better response rates than unselected patients, the lower boundary of a two-sided 95% confidence interval should not be less than 0.3 (expected response rate in unselected patients); this requires that the standard error of the PPV be <0.1, which in turn means that the study needs to include at least 24 patients who test positive for the marker. The proportion of patients who respond is expected to be 30%, so an accurate (and sensitive) predictor should generate a similar 20–30% proportion of individuals who test positive for the marker, which would require a total sample size of 80 to 200 patients for validation.
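The arithmetic behind this example can be sketched with the usual binomial standard error, SE = sqrt(PPV · (1 − PPV) / n), where n is the number of test-positive patients. The function names are hypothetical, and the normal-approximation confidence interval is an assumption made for illustration.

```python
from math import ceil, sqrt


def ppv_standard_error(ppv, n_positive):
    """Binomial standard error of a PPV estimated from n_positive patients."""
    return sqrt(ppv * (1 - ppv) / n_positive)


def ci_lower_bound(ppv, n_positive, z=1.96):
    """Lower limit of a two-sided 95% normal-approximation interval."""
    return ppv - z * ppv_standard_error(ppv, n_positive)


def total_to_enroll(n_positive, fraction_positive):
    """Patients to screen so that n_positive are expected to test positive."""
    # Rounding first guards against floating-point noise before ceil().
    return ceil(round(n_positive / fraction_positive, 9))
```

With a PPV of 60% and 24 test-positive patients, the standard error is 0.1 and the lower confidence limit stays above the 30% response rate of unselected patients; if 30% of patients test positive, 80 patients would need to be enrolled to obtain 24 test-positive cases.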
Single-arm validation trials have several limitations. If, in the single-arm study, treatment is restricted to test-positive cases, this represents only partial validation of the test, because the response rates can be compared only with historical results and the NPV of the test is not evaluated (clinically important response rates may also be observed in marker-negative cases). If all patients receive the same treatment and response is analyzed in the context of marker results, then a study could become rather inefficient if the overall response rate is low due to low prevalence of marker-positive cases, particularly if the test has modest predictive values.
A higher level of evidence for the utility of a new predictive test may be generated through randomized trials that could simultaneously address response rates in selected and unselected patients, and could also assess the treatment specificity of the predictor. A definitive validation study for a response predictor may be to randomize patients to either receive treatment only if the marker is positive or to receive therapy regardless of marker status [42]. In the case of a marker that predicts response to an existing chemotherapy, the unselected use of the treatment represents the current standard of care. This two-arm design could examine clinical utility directly, since comparison between the marker-selected and unselected trial arms could determine to what extent patient selection improves outcome compared with unselected use of the same treatment. Sample size for this design would be similar to conventional phase III clinical trials. Early stopping rules to halt the trial if the predictor performs too well or too poorly could be incorporated.
A different design may be necessary to assess whether the predictor is treatment-specific or it is only a general marker of response to (any) cytotoxic therapy. The primary trial objective is to assess interaction between a marker and the response to two or more different types of chemotherapies. Stratified randomization into two or more treatment arms on the basis of marker results is an appealing strategy. A study with two treatment arms may be designed as follows: the response marker is determined at the time of the patients' entry into the study, and each patient is assigned an expected outcome (e.g. response or no response to drug A, for which the marker was developed). The primary objective of the study would be to establish whether patients who test positive for the marker are significantly more likely to experience a response to drug A than to drug B. A secondary objective would be to show that patients who test negative for the marker will not benefit from drug A as much as the marker-positive patients do, and that for these patients the alternative therapy with drug B may be the preferred treatment. Power and sample size calculations can be based on assumptions about the prevalence of the marker among the patients and the rates of response of their tumors to drugs A and B, respectively. For example, if the prevalence of marker positivity is 25% and the rate of tumor response to drug A is 60%, and we assume that the rate of response to treatment B in the same patients is only 20%, then a total sample size of 210 patients would be needed for 98% power to demonstrate that treatment A is superior to treatment B for marker-positive patients. This study could also be conducted by limiting treatment to only marker-positive patients; however, with that design, the NPV of the test would not be evaluated, and the comparative efficacy of the alternative therapy (i.e. drug B) could not be evaluated in the patients who responded poorly. 
Including early discontinuation rules may be the preferred option to minimize the exposure of marker-negative patients to ineffective drugs.
Gene expression profiling has demonstrated in several proof-of-principle studies that multigene signatures can predict important clinical outcomes and therefore have the potential to evolve into true diagnostic tests. Some are concerned that moving these observations into a validation phase may be premature because the technology itself is constantly evolving and new, improved gene sets to predict a particular outcome are identified at regular intervals [43–46]. Technological development will never stop, but this need not discourage attempts to validate markers already discovered. The true clinical utility of any proposed gene marker set can only be established through independent validation. It is possible, and in fact probable, that several distinct gene sets measured by different profiling platforms may predict a given clinical outcome equally well [43–46]. It is also possible that as larger and larger patient populations are used for discovery and training of multigene predictors, second-, third- and fourth-generation predictors will emerge with steadily increasing predictive accuracy. Although this process will lead to increased competition among aspiring diagnostic companies (and academic laboratories), if truly useful predictors are discovered, patients will ultimately benefit. High-throughput genomic (and proteomic) technologies represent perhaps one of the most exciting opportunities in diagnostic medicine since the discovery of monoclonal antibodies. However, the true diagnostic value of this technology can only be established expeditiously through a series of well-designed marker discovery and validation studies.
Received for publication March 23, 2004. Revision received June 15, 2004. Accepted for publication June 18, 2004.
| References |
|---|
1. Bast RC Jr, Ravdin P, Hayes DF et al. American Society of Clinical Oncology Tumor Markers Expert Panel. 2000 Update of recommendations for the use of tumor markers in breast and colorectal cancer: clinical practice guidelines of the American Society of Clinical Oncology. J Clin Oncol 2001; 19: 1865–1878.
2. Hortobagyi GN, Hayes D, Pusztai L. Integrating newer science into breast cancer prognosis and treatment. Molecular predictors and profiles. ASCO Annual Meeting Summaries. Alexandria (VA): American Society of Clinical Oncology 2002; 191–202.
3. Ramaswamy S, Golub TR. DNA microarrays in clinical oncology. J Clin Oncol 2002; 20: 1932–1941.
4. de Bolle X, Bayliss CD. Gene expression technology. Methods Mol Med 2003; 71: 135–146.
5. Ali TR, Li MS, Langford PR. Monitoring gene expression using DNA arrays. Methods Mol Med 2003; 71: 119–134.
6. Walker SJ, Worst TJ, Vrana KE. Semiquantitative real-time PCR for analysis of mRNA levels. Methods Mol Med 2003; 79: 211–227.
7. Paik S. Incorporating genomics into the cancer clinical trial process. Semin Oncol 2001; 28: 305–309.
8. Simon R, Altman DG. Statistical aspects of prognostic factor studies in oncology. Br J Cancer 1994; 69: 979–985.
9. Sotiriou C, Powles TJ, Dowsett M et al. Gene expression profiles derived from fine needle aspiration correlate with response to systemic chemotherapy in breast cancer. Breast Cancer Res 2002; 4: R3.
10. Assersohn L, Gangi L, Zhao Y et al. The feasibility of using fine needle aspiration from primary breast cancers for cDNA microarray analyses. Clin Cancer Res 2002; 8: 794–801.
11. Pusztai L, Ayers M, Stec J et al. Gene expression profiles obtained from single passage fine needle aspirations (FNA) of breast cancer reliably identify prognostic/predictive markers such as estrogen (ER) and HER-2 receptor status and reveal large scale molecular differences between ER-negative and ER-positive tumors. Clin Cancer Res 2003; 9: 2406–2415.
12. Ma X-J, Wang W, Salunga R et al. Gene expression signatures associated with clinical outcome in breast cancer via laser capture microdissection. Breast Cancer Res Treat 2003; 82 (Suppl 1): S15 (Abstr 29).
13. Baunoch D, Moore M, Reyes M et al. Microarray analysis of formalin fixed paraffin-embedded tissue: the development of a gene expression staging system for breast carcinoma. Breast Cancer Res Treat 2003; 82 (Suppl 1): S116 (Abstr 474).
14. Paik S, Shak S, Tang G et al. Multi-gene RT-PCR assay for predicting recurrence in node negative breast cancer patients - NSABP studies B-20 and B-14. Breast Cancer Res Treat 2003; 82 (Suppl 1): S10 (Abstr 16).
15. Esteva FJ, Sahin AA, Coombes K et al. Multi-gene RT-PCR assay for predicting recurrence in node negative breast cancer patients - M.D. Anderson Clinical Validation Study. Breast Cancer Res Treat 2003; 82 (Suppl 1): S11 (Abstr 17).
16. Symmans WF, Ayers M, Clark E et al. Fine needle aspiration and core needle biopsy samples of breast cancer provide similar total RNA yield, but different stromal gene expression profiles cancer. Cancer 2003; 97: 2960–2971.
17. Altman DG, Lyman GH. Methodological challenges in the evaluation of prognostic factors in breast cancer. Breast Cancer Res Treat 1998; 52: 289–303.
18. King HC, Sinha AA. Gene expression profile analysis by DNA microarrays: promise and pitfalls. JAMA 2001; 286: 2280–2288.
19. Miller LD, Long PM, Wong L et al. Optimal gene expression analysis by microarrays. Cancer Cell 2002; 2: 353–361.
20. Rhodes A, Jasani B, Anderson E et al. Evaluation of HER-2/neu immunohistochemical assay sensitivity and scoring on formalin-fixed and paraffin-processed cell lines and breast tumors: a comparative study involving results from laboratories in 21 countries. Am J Clin Pathol 2002; 118: 408–417.
21. Rhodes A, Jasani B, Barnes DM et al. Reliability of immunohistochemical demonstration of estrogen receptors in routine practice: interlaboratory variance in the sensitivity of detection and evaluation of scoring systems. J Clin Pathol 2000; 53: 125–130.
22. Ambros IM, Benard J, Boavida M et al. Quality assessment of genetic markers used for therapy stratification. J Clin Oncol 2003; 21: 2077–2084.
23. Liu ET. Molecular oncodiagnostics: where we are and where we need to go. J Clin Oncol 2003; 21: 2052–2055.
24. Kuo WP, Jenssen TK, Butte AJ et al. Analysis of matched mRNA measurements from two different microarray technologies. Bioinformatics 2002; 18: 405–412.
25. Ringner M, Peterson C, Khan J. Analyzing array data using supervised methods. Pharmacogenomics 2002; 3: 403–415.
26. Simon R, Radmacher MD, Dobbin K et al. Pitfalls in the use of DNA microarray data for diagnostic and prognostic classification. J Natl Cancer Inst 2003; 95: 14–18.
27. Cui X, Churchill GA. Statistical tests for differential expression in cDNA microarray experiments. Genome Biol 2003; 4: 210.
28. Goldberg DA. Genetic Algorithms in Search, Optimization and Machine Learning. New York: Addison-Wesley 1989.
29. Li L, Weinberg CR, Darden TA, Pedersen LG. Gene selection for sample classification based on gene expression data: study of sensitivity to choice of parameters of the GA/KNN method. Bioinformatics 2001; 17: 1131–1142.
30. Rajeevan MS, Vernon SD, Taysavang N et al. Validation of array-based gene expression profiles by real-time (kinetic) RT-PCR. J Mol Diagn 2001; 3: 26–31.
31. Taniguchi M, Miura K, Iwao H et al. Quantitative assessment of DNA microarrays-comparison with Northern blot analyses. Genomics 2001; 71: 34–39.
32. Radmacher MD, McShane LM, Simon R. A paradigm for class prediction using gene expression profiles. J Comput Biol 2002; 9: 505–511.
33. Shao J. Linear model selection by cross-validation. J Am Stat Assoc 1993; 88: 422.
34. Good PI. Permutations Tests for Testing Hypotheses. New York: Springer-Verlag 1994.
35. Mukherjee S, Tamayo P, Rogers S et al. Estimating dataset size requirements for classifying DNA microarray data. J Comput Biol 2003; 10: 119–142.
36. Simon R, Radmacher MD, Dobbin K. Design of studies using DNA microarrays. Genet Epidemiol 2002; 23: 21–36.
37. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc (Ser B) 1995; 57: 289–300.
38. Hatfield GW, Hung S, Baldi P. Differential analysis of DNA microarray gene expression data. Mol Microbiol 2003; 47: 871–877.
39. Pounds S, Morris SW. Estimating the occurrence of false positives and false negatives in microarray studies by approximating and partitioning the empirical distribution of P-values. Bioinformatics 2003; 19: 1236–1242.
40. Hwang D, Schmitt WA, Stephanopoulos G et al. Determination of minimum sample size and discriminatory expression patterns in microarray data. Bioinformatics 2002; 18: 1184–1193.
41. Fisher LD, van Belle G. Sample size calculations in selecting continuous variables to discriminate between populations. In Fisher LD, van Belle G (eds): Biostatistics: A Methodology for the Health Sciences. New York: Wiley 1993; 851–858.
42. Sargent D, Allegra C. Issues in clinical trial design for tumor marker studies. Semin Oncol 2002; 3: 222–230.
43. Lossos IS, Czerwinski DK, Alizadeh AA et al. Prediction of survival in diffuse large-B-cell lymphoma based on the expression of six genes. N Engl J Med 2004; 350: 1828–1837.
44. Alizadeh AA, Eisen MB, Davis RE et al. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 2000; 403: 503–511.
45. Rosenwald A, Wright G, Chan WC et al. The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. N Engl J Med 2002; 346: 1937–1947.
46. Shipp MA, Ross KN, Tamayo P et al. Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning. Nat Med 2002; 8: 68–74.