Published datasets are available here for users to practice implementing statistical techniques. We welcome contributions of additional datasets to this resource.
| Study | Reference | Stata file | ASCII file |
| CASS | Leisenring et al. (2000), Weiner et al. (1979) | est1.dta | est1.csv, est1_desc.txt |
| Pancreatic Ca biomarkers | Wieand et al. (1989) | wiedat2b.dta | wiedat2b.csv, wiedat2b_desc.txt |
| Ultrasound for hepatic mets | Tosteson and Begg (1988) | tostbegg2.dta | tostbegg2.csv, tostbegg2_desc.txt |
| CARET PSA | Etzioni et al. (1999) | psa2b.dta | psa2b.csv, psa2b_desc.txt |
| Gene expression array | Pepe et al. (2003) | orchratio2.dta | orchratio2.csv, orchratio2_desc.txt |
| Norton neonatal audiology | Norton et al. (2000) | nnhs2.dta | nnhs2.csv, nnhs2_desc.txt |
| Leisenring neonatal audiology | Leisenring et al. (1997) | lplaudio_b.dta | lplaudio_b.csv, lplaudio_b_desc.txt |
| Prostate Ca - St. Louis | Smith et al. (1997) | psa_dre_v2.dta | psa_dre_v2.csv, psa_dre_desc_v2_.txt |
| Stover audiology | Stover et al. (1996) | dp2.dta | dp2.csv, dp2_desc.txt |
| Scintigraphy study | Muller et al. (1989) | mlt1.dta | mlt1.csv, mlt1_desc.txt |
| 59 Pap screen studies | Fahey et al. (1995) | fim.dta | fim.csv, fim_desc.txt |
| Prenatal screen data (hypothetical) | | hpns.dta | hpns.csv, hpns_desc.txt |
| Ovarian Ca markers (hypothetical) | | ocdata_b.dta | ocdata_b.csv, ocdata_b_desc.txt |
| Covariate adjustment datasets | Janes et al. (2009) | Figure 1, scenario 1 | .csv file and .txt file |
| Covariate adjustment datasets | Janes et al. (2009) | Figure 1, scenario 2 | .csv file and .txt file |
| ROC regression dataset | Janes et al. (2009) | Figure 4 | .csv file and .txt file |
| Simulated AKI data | Pepe et al. (2007, 2008) | aki_sim.dta | aki_sim.csv file, aki_sim_desc.txt file |
| Two frameworks for ordinal ratings | Morris et al. (2009) | two_marker_sim.dta | two_marker_sim.csv file, two_marker_sim_desc.txt file |
Etzioni R, Pepe M, Longton G, Hu C, Goodman G (1999). Incorporating the time dimension in receiver operating characteristic curves: A case study of prostate cancer. Medical Decision Making 19:242-51.
Fahey MT, Irwig LM, Macaskill P (1995). Meta-analysis of Pap test accuracy. American Journal of Epidemiology 141:680-9.
Janes H, Longton G, Pepe MS (2009). Accommodating covariates in ROC analysis. Stata Journal 9(1):17-39.
Leisenring W, Alonzo T, Pepe MS (2000). Comparisons of predictive values of binary medical diagnostic tests for paired designs. Biometrics 56:345-51.
Leisenring W, Pepe MS, Longton G (1997). A marginal regression modelling framework for evaluating medical diagnostic tests. Statistics in Medicine 16:1263-81.
Morris DE, Pepe MS, Barlow WE. Contrasting two frameworks for ROC analysis of ordinal ratings. Cancer Epidemiology Biomarkers and Prevention (in press).
Muller C, Wasserman HJ, Erlank P, Klopper JF, Morkel HR, Ellmann A (1989). Optimisation of density and contrast yielded by multiformat photographic images used for scintigraphy. Physics in Medicine and Biology 34:473-81.
Norton SJ, Gorga MP, Widen JE, Folsom RC, Sininger Y, Cone-Wesson B, Vohr BR, Mascher K, Fletcher K (2000). Identification of neonatal hearing impairment: Evaluation of transient evoked otoacoustic emission, distortion product otoacoustic emission, and auditory brain stem response test performance. Ear and Hearing 21:508-28.
Pepe MS (2003). The Statistical Evaluation of Medical Tests for Classification and Prediction (Oxford Statistical Science Series). Oxford University Press.
Pepe MS, Longton G, Anderson G, Schummer M (2003). Selecting differentially expressed genes from microarray experiments. Biometrics 59:133-42.
Pepe M, Zheng Y, Jin Y, Huang Y, Parikh C, Levy W (2008). Evaluating the ROC performance of markers for future events. Lifetime Data Analysis 14(1):86-113.
Smith DS, Bullock AD, Catalona WJ (1997). Racial differences in operating characteristics of prostate cancer screening tests. The Journal of Urology 158:1861-66.
Stover L, Gorga MP, Neely T (1996). Towards optimizing the clinical utility of distortion product otoacoustic emission measurements. Journal of the Acoustical Society of America 100:956-967.
Tosteson AN, Begg CB (1988). A general regression methodology for ROC curve estimation. Medical Decision Making 8:204-15.
Weiner DA, Ryan TJ, McCabe CH, Kennedy JW, Schloss M, Tristani F, Chaitman BR, Fisher LD (1979). Exercise stress testing. Correlations among history of angina, ST-segment response and prevalence of coronary-artery disease in the Coronary Artery Surgery Study (CASS). New England Journal of Medicine 301(5):230-5.
Wieand S, Gail MH, James BR, James KL (1989). A family of nonparametric statistics for comparing diagnostic markers with paired or unpaired data. Biometrika 76:585-92.