Nonclinical Statistics, Pfizer
Sunday, April 3, 2016 in San Francisco
Room: Salon 5 & 6
Full-day: 9:00am - 4:30pm
R for Predictive Modeling:
A Hands-On Introduction
Intended Audience: Practitioners who wish to learn how to execute on predictive analytics by way of the R language; anyone who wants "to turn ideas into software, quickly and faithfully."
Knowledge Level: Participants should have either hands-on experience with predictive modeling (without R), or hands-on familiarity with any programming language (other than R) together with basic conceptual knowledge of predictive modeling. Either background is sufficient preparation for this workshop.
This one-day session provides a hands-on introduction to R, the well-known open-source platform for data analysis. Real examples are used to methodically expose attendees to best practices for driving R and its rich set of predictive modeling packages, providing hands-on experience and know-how. R is compared to other data analysis platforms, and common strengths and pitfalls in using R are discussed.
The instructor, a leading R developer and the creator of caret, an R package that streamlines the process for creating predictive models, will guide attendees on hands-on execution with R, covering:
- A working knowledge of the R system
- The strengths and limitations of the R language
- Preparing data with R, including splitting, resampling and variable creation
- Developing predictive models with R, including decision trees, support vector machines and ensemble methods
- Visualization: Exploratory Data Analysis (EDA), and tools that persuade
- Evaluating predictive models, including viewing lift curves, characterizing overfitting and other topics as time allows.
Each participant will receive a copy of Max Kuhn's book, Applied Predictive Modeling.
Hardware: Bring Your Own Laptop
Each workshop participant is required to bring their own laptop running Windows or OS X. The software used during this training program, R, is free and readily available for download.
Attendees receive an electronic copy of the course materials and related R code at the conclusion of the workshop.
- Workshop program starts at 9:00am
- Morning Coffee Break at 10:30 - 11:00am
- Lunch provided at 12:30 - 1:15pm
- Afternoon Coffee Break at 2:30 - 3:00pm
- End of the Workshop: 4:30pm
Max Kuhn, Director, Nonclinical Statistics, Pfizer
Max Kuhn is a Director of Nonclinical Statistics at Pfizer Global R&D in Connecticut. He has been applying models in the medical diagnostic and pharmaceutical industries for over 15 years.
He is a leading R developer and the author of several R packages, including the caret package, which provides a simple and consistent interface to over 140 predictive models available in R.
Mr. Kuhn has taught courses on modeling within Pfizer and externally, including a class for the India Ministry of Information Technology.
His book, Applied Predictive Modeling, was published in 2013.