Teaching Medical Research Methodology: All modern medical and public health research now requires a considerable amount of biostatistics, computer science, data processing, and machine learning, collectively data science.
Nicholas Jewell MedicReS World Congress 2014
1. Teaching Medical Research Methodology
Nicholas P. Jewell
Departments of Statistics &
School of Public Health (Biostatistics)
University of California, Berkeley
October 16, 2014
26. Even Simple Questions . . .
(Nicholas Chamandy and others at Google)
• Data streams
– Interact with one record at a time
– Records are not guaranteed to be sorted in a meaningful way
– One record is not necessarily one i.i.d. statistical observation
Suppose $X_i$ is the number of queries (per day, say) for a user
Easy to compute $\sum_{i=1}^{n} X_i$ but not $\sum_{i=1}^{n} X_i^2$
How about the median or
mode of the distribution?
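A minimal Python sketch of the point on this slide, assuming (hypothetically) that each stream record is a single query tagged with a user ID. The function names and sample data are illustrative and not from the talk. Because the total $\sum X_i$ equals the number of records, a running counter is enough, whereas $\sum X_i^2$ needs each user's count first, so memory grows with the number of distinct users:

```python
from collections import Counter


def total_queries(stream):
    """Sum of X_i over all users: one pass, constant memory, no grouping."""
    total = 0
    for _user_id in stream:  # assumption: one record = one query by one user
        total += 1
    return total


def sum_of_squares(stream):
    """Sum of X_i^2: requires per-user counts, so state grows with the user base."""
    per_user = Counter()
    for user_id in stream:
        per_user[user_id] += 1  # X_i is only known after grouping by user
    return sum(count * count for count in per_user.values())


# Hypothetical unsorted stream: user "a" has 3 queries, "b" has 2, "c" has 1.
records = ["a", "b", "a", "c", "a", "b"]
print(total_queries(records))   # 6
print(sum_of_squares(records))  # 3**2 + 2**2 + 1**2 = 14
```

The median and mode appear to raise the same difficulty, since both depend on the per-user counts that a record-at-a-time pass does not expose without extra state.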
27. Curriculum Evolution
cs (then): Han, J. and Kamber, M. (2000). Data Mining: Concepts and Techniques, 1st edition [2006, 2nd edition]
stat (then): Hastie, T., Tibshirani, R., and Friedman, J. (2001). The Elements of Statistical Learning, 1st edition [2009, 2nd edition]
cs (now): Rajaraman, A., Leskovec, J., and Ullman, J.D. (2012+). Mining of Massive Datasets, manuscript
Then                                                      Now
cs                              stat                      cs                        stat
data warehouse                  regression models         distributed system
OLAP (online analytical proc)   lasso, ridge, PCA, PLS    Map-Reduce, Hadoop        ?
association rules               splines, kernel smooth    association, freq items   ?
classification                  CART, MARS, GAM           PageRank, link analysis   ?
clustering                      boosting                  boosting
prediction model                classification, SVM       SVD, dim reduction
text mining                     neural networks           machine learning
multimedia mining               clustering                online advertisement
transactional db∗               p >> n                    Recommendation sys
social networks                 network models            social networks