Monday, May 15, 2017 in San Francisco
Supercharging Prediction with
- Practitioners: Analysts who would like to learn the theoretical principles behind model ensembles and practical tips for building them.
- Technical Managers: Project leaders and managers who are responsible for developing predictive analytics solutions and want to understand the potential value and limitations of model ensembles.
Knowledge Level: Beginning to intermediate understanding of statistical methods or predictive modeling algorithms.
Once you know the basics of predictive analytics, including data exploration, data preparation, model building, and model evaluation, what can be done to improve model accuracy? One key technique is the use of model ensembles, which combine several or even thousands of models into a single, new model score. It turns out that model ensembles are usually more accurate than any single model, and they are typically more fault tolerant than single models.
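To make the idea of combining many model scores into a single score concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not material from the workshop: each "model" is simulated as the true target plus independent random noise, and the names `noisy_model_scores`, `mse`, and `n_models` are hypothetical. It compares the average error of the individual models with the error of their averaged score.

```python
import random

random.seed(0)

# Assumption for illustration: the true target is known, and each "model"
# is simulated as the truth plus independent Gaussian noise.
truth = [x / 10 for x in range(50)]
n_models = 25


def noisy_model_scores(targets, noise_sd):
    """Simulate one model's scores as target + noise (hypothetical stand-in)."""
    return [t + random.gauss(0, noise_sd) for t in targets]


def mse(preds, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


model_scores = [noisy_model_scores(truth, 1.0) for _ in range(n_models)]

# The ensemble score for each record is the average of the individual scores.
ensemble_scores = [sum(col) / n_models for col in zip(*model_scores)]

single_mses = [mse(scores, truth) for scores in model_scores]
mean_single_mse = sum(single_mses) / n_models
ensemble_mse = mse(ensemble_scores, truth)

print(f"mean single-model MSE: {mean_single_mse:.3f}")
print(f"ensemble MSE:          {ensemble_mse:.3f}")
```

For squared error, the averaged score can never do worse than the average of the individual models' errors. How much it improves depends on how independent the models' errors are, which this simulation assumes rather than demonstrates.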
Are model ensembles an algorithm or an approach? How can one understand the influence of key variables in the ensembles? Which options affect the ensembles most? This workshop dives into the key ensemble approaches, including Bagging, Random Forests, and Stochastic Gradient Boosting. Attendees will learn "best practices" and will see firsthand how various options influence ensemble models, gaining a deeper qualitative understanding of how the algorithms work and how to interpret the resulting models. Attendees will also learn how to automate the building of ensembles by changing key parameters.
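As a rough illustration of bagging and of automating ensemble building by varying a key parameter, the sketch below fits one-split regression "stumps" on bootstrap resamples and averages their predictions, then loops over the number of models. It is a simplified stand-in written in plain Python, not the SPM or KNIME workflow demonstrated in the workshop. The synthetic data and the helper names (`fit_stump`, `bagged_stumps`, `predict_bagged`) are assumptions made for this example.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression stump by minimizing squared error."""
    best = None
    for thr in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        if not left or not right:
            continue
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    if best is None:
        # Every x in the resample is identical: fall back to the mean.
        m = sum(ys) / len(ys)
        return (float("inf"), m, m)
    return best[1:]


def predict_stump(stump, x):
    thr, left_mean, right_mean = stump
    return left_mean if x <= thr else right_mean


def bagged_stumps(xs, ys, n_models, rng):
    """Bagging: fit each stump on a bootstrap resample (drawn with replacement)."""
    n = len(xs)
    stumps = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def predict_bagged(stumps, x):
    """The ensemble prediction is the average of the stump predictions."""
    return sum(predict_stump(s, x) for s in stumps) / len(stumps)


# Synthetic data, assumed for illustration: y = x^2 plus Gaussian noise.
data_rng = random.Random(0)
xs = [i / 20 for i in range(40)]
ys = [x * x + data_rng.gauss(0, 0.2) for x in xs]

# Automate ensemble building by varying one key parameter: the ensemble size.
for n_models in (1, 10, 100):
    stumps = bagged_stumps(xs, ys, n_models, random.Random(42))
    train_mse = sum((predict_bagged(stumps, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    print(f"n_models={n_models:3d}  training MSE={train_mse:.4f}")
```

The same loop could sweep other options, such as the resample size or the base model's complexity. In practice the comparison would be made on held-out data rather than on the training set used here.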
Participants are expected to know the principles of predictive analytics and how the most important algorithms in predictive analytics work (like decision trees, neural networks, regression, etc.).
Course Notes and Free Textbook
All data referenced in the workshop will be provided on a USB drive and will also be made available via an internet link. Electronic copies of the workshop notebook will be distributed to attendees upon arrival on the USB drive. All attendees will also receive a paperback copy of Dean's book, Applied Predictive Analytics.
The key concepts covered during this workshop can be applied to many predictive analytics projects regardless of the software used. Live demonstrations using Salford Systems SPM and KNIME will be included in the workshop. Participants will receive an evaluation copy of SPM as part of the registration. KNIME is open source.
Laptops are not required for this course but are recommended for viewing the course slides and taking notes. Additionally, all participants who would like to experiment with ensembles during the demonstrations may do so with the software provided.
- Software installation (if not already installed): 8:30am
- Workshop program starts at 9:00am
- Morning Coffee Break at 10:30 - 11:00am
- Lunch provided at 12:30 - 1:15pm
- Afternoon Coffee Break at 2:30 - 3:00pm
- End of the Workshop: 4:30pm
Dean Abbott, President, Abbott Analytics
Dean Abbott is President of Abbott Analytics in San Diego, California. Mr. Abbott has over 21 years of experience applying advanced data mining, data preparation, and data visualization methods in real-world data intensive problems, including fraud detection, risk modeling, text mining, response modeling, survey analysis, planned giving, and predictive toxicology. In addition, Mr. Abbott serves as chief technology officer and mentor for start-up companies focused on applying advanced analytics in their consulting practices.
Mr. Abbott is a seasoned instructor, having taught a wide range of data mining tutorials and seminars for a decade to audiences of up to 400, including PAW, KDD, AAAI, IEEE and several data mining software users conferences. He is the instructor of well-regarded data mining courses, explaining concepts in language readily understood by a wide range of audiences, including analytics novices, data analysts, statisticians, and business professionals. Mr. Abbott also has taught applied data mining courses for major software vendors, including SPSS-IBM Modeler (formerly Clementine), Unica PredictiveInsight (formerly Affinium Model), Enterprise Miner (SAS), Model 1 (Group1 Software), and hands-on courses using Statistica (Statsoft), Tibco Spotfire Miner (formerly Insightful Miner), and CART (Salford Systems).