5. Our hypothesis
Certain demographic, diagnostic, and treatment parameters
can reliably predict survival time for women with breast
cancer. Using this knowledge, we could build a “calculator” to
estimate survival time for individuals.
7. Data exploration and variable selection
• Survival time (months)
• Age at diagnosis
• Year of birth
• Race
• Origin (Hispanic recode)
• Stage
• Histology
• Tumor extent
• Number of primary tumors
• Laterality
• ER Status
• PR Status
• Radiation therapy
146 variables in SEER database, narrowed to 13 variables of interest
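
A minimal sketch of the selection step, assuming the SEER extract has been loaded as one dictionary per patient record. The column names in VARIABLES_OF_INTEREST and the helper select_variables are hypothetical stand-ins for the 13 variables listed on this slide; the actual SEER field names differ, and the sample record holds placeholder values only.

```python
import csv
import io

# Hypothetical column names for the 13 variables of interest.
# The real SEER field names are different; these are illustrative only.
VARIABLES_OF_INTEREST = [
    "survival_months",
    "age_at_diagnosis",
    "year_of_birth",
    "race",
    "hispanic_origin",
    "stage",
    "histology",
    "tumor_extent",
    "num_primary_tumors",
    "laterality",
    "er_status",
    "pr_status",
    "radiation_therapy",
]


def select_variables(rows, keep=VARIABLES_OF_INTEREST):
    """Return a copy of each record restricted to the columns in `keep`."""
    selected = []
    for row in rows:
        missing = [name for name in keep if name not in row]
        if missing:
            raise KeyError(f"record is missing columns: {missing}")
        selected.append({name: row[name] for name in keep})
    return selected


# Placeholder input: the 13 columns plus two unrelated ones, one dummy record.
header = VARIABLES_OF_INTEREST + ["extra_a", "extra_b"]
sample_csv = ",".join(header) + "\n" + ",".join(["x"] * len(header)) + "\n"

records = list(csv.DictReader(io.StringIO(sample_csv)))
subset = select_variables(records)
```

Listing the columns to keep explicitly, rather than dropping unwanted ones, makes the selection easy to check against the list on this slide.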