Conference Day 1: Monday, April 15, 2013
Registration & Networking Breakfast
Conference Chair Welcome Remarks
9:05-9:20am • Room: Golden Gate A
Diamond Sponsor Presentation
Measure Right, Manage Forward, Make a Difference
Modern consumers are everywhere, all of the time. As this new generation of customer continues to evolve, so must the analytics used to measure the experiences they have with companies and organizations. Eric Feinberg, Senior Director at ForeSee, will discuss Next Generation Customer Experience Analytics as a system of metrics that goes beyond single-number measurements and eliminates outdated metric silos to better support today's multi-channel, multi-device world. He will explain what this new generation of predictive analytics needs to be and how it can help you create an analytics platform that allows you to measure right, manage forward, and make a difference in your business.
9:20-10:10am • Room: Golden Gate A
The $3m Heritage Health Prize: Results and Conclusions
The Heritage Health Prize is the largest ever predictive modeling competition. It required data scientists to build algorithms that predict who will go to hospital in the next year, so that preventive action can be taken. Two years on, the prize has just ended. This talk will cover the competition and some of the lessons that can be learned from it.
Speaker: Anthony Goldbloom, CEO, Kaggle
10:10-10:40am • Room: Exhibit Hall
Exhibits & Morning Coffee Break
Gold Sponsor Presentation
The Future of Commerce is Here
Retailers are using real-time predictive analytics to drive profitability through top line growth and bottom line cost reduction by assimilating data from across the enterprise and blending that with real-time transactional data to personalize every customer interaction. Learn how the Inkiru Predictive Intelligence™ Platform provides an end-to-end solution for data scientists.
Gold Sponsor Presentation
Application of innovative analytics in business
"71% of CMOs feel unprepared to deal with the data explosion over the next five years. 68% feel unprepared to deal with social media. 65% feel unprepared to deal with the growing number of channel and tech device choices." - InformationWeek
During the session, we will showcase innovations in business applications of analytics designed to help CXOs deal with the increasingly complex customer environment.
Track 1: Social Data
Mapping Social Media to Predict Influence and Measure Propagation
Hidden within social media streams are structures that identify the most influential voices on any topic. Social network analysis and visualization can take millions of messages and reveal the shape of the crowd and the people at the center of it. Using the free and open NodeXL application, this talk demonstrates the tools and methods needed to create detailed maps of any social media topic. Learn to map and analyze social networks extracted from email, Facebook, Twitter, YouTube, message boards, and the WWW. No coding or prior experience needed!
Speaker: Marc Smith, Director, Social Media Research Foundation
10:50-11:35am • Room: Golden Gate A
Track 2: Financial Services
Case Study: Scotiabank
Mortgage Liquidation Model Building and Application
The purpose of developing a mortgage liquidation model is to enable Group Treasury and Asset Liability Management to reduce cash flow uncertainty and improve budgeting and hedge effectiveness. A multinomial logistic regression model was built to predict two mortgage events: full payment and early renewal. The model was vetted by the validation team and applied to cash flow analysis and gap reporting.
Wenlei Shi, Data Mining Analyst, Scotiabank
11:40am-12:00pm • Room: Salon 5 & 6
Track 1: Survey Analysis
Case Study: Cox Communications
Using Analytics to Guide the Creation of Creative for Segment-Targeted Campaigns
This session extends a relatively new area of analytics - the use of predictive techniques to generate success-elevating insights into consumer decision psychology. We'll begin by reviewing the qualitative technique of laddering interviews and its resulting maps of consumer thought that have been a proven source of marketing strategy and creative guidance. Its problem has been that it is too expensive to do enough research to fully understand the nuanced motivational differences between segments. We'll continue by showing how specialized survey data can be analyzed in a way that can extend laddering insights down to the segment level.
Track 2: Uplift Modeling
Case Study: Intuit
Uplift Modeling - Direct Marketing Case Studies
Several million customers use Quicken and QuickBooks. Understanding the effectiveness of marketing campaigns is essential to Intuit for customer retention. In this case study, we describe how we implemented uplift models to discover the "incremental impact" attributable solely to the campaigns. The key takeaways from the presentation are:
- how we mine customer data to derive predictors of response
- how traditional response models differ from uplift models
- how effective uplift modeling is in retaining customers
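As a rough illustration of the "incremental impact" idea in the Intuit abstract above, the sketch below compares response rates between treated and control customers within each segment. It is not Intuit's method: the function name `uplift_by_segment`, the segment labels, and the records are all hypothetical, and a production uplift model would estimate this difference from customer-level predictors rather than from raw segment counts.

```python
from collections import defaultdict

def uplift_by_segment(records):
    """Return treated-minus-control response rate for each segment.

    records: iterable of (segment, treated, responded) tuples.
    """
    # For each segment keep [responders, total] for treated (True) and control (False).
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, treated, responded in records:
        cell = counts[segment][bool(treated)]
        cell[0] += int(responded)
        cell[1] += 1

    uplift = {}
    for segment, groups in counts.items():
        (t_resp, t_n), (c_resp, c_n) = groups[True], groups[False]
        if t_n == 0 or c_n == 0:
            continue  # uplift is undefined without both a treated and a control group
        uplift[segment] = t_resp / t_n - c_resp / c_n
    return uplift

# Hypothetical campaign data: (segment, received campaign?, responded?)
records = (
    [("A", True, True)] * 3 + [("A", True, False)] * 1
    + [("A", False, True)] * 1 + [("A", False, False)] * 3
    + [("B", True, True)] * 2 + [("B", True, False)] * 2
    + [("B", False, True)] * 2 + [("B", False, False)] * 2
)
result = uplift_by_segment(records)
```

In this toy data, segment B responds at the same rate with or without the campaign, so its uplift is zero. A traditional response model would still score B as a reasonably responsive segment, which is the distinction the abstract draws.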
12:05-12:25pm • Room: Salon 5 & 6
Track 1: Spam Detection
Case Study: MailChimp.com
Monkeys & Math: How MailChimp Catches Bad Guys
Hear from MailChimp's Chief Scientist John Foreman as he dishes on dirty data and demonstrates the latest in MailChimp's anti-abuse artificial intelligence. MailChimp sends 3 billion emails a month for their millions of users, and they can't afford to let a drop of spam go out. Learn how the company is using cutting edge NoSQL solutions and predictive models to leave the bad guys out in the cold.
Track 2: Uplift Modeling
Case Study: Hewlett-Packard
A Generic Uplift Modeling Framework to Calculate ROI - Application in Promotion Effectiveness
Various consumer electronics retail chains run discount promotions on HP products year-round, with the promotion spend shared between the chains and HP. In this context, the business wanted a data-driven measure of how useful these promotions are, which can be leveraged in planning, executing and optimizing this spend. As an essential part of the objective, a generic uplift modeling framework was built to calculate ROI for different discount promotions using the statistical technique called ANCOVA in an unconventional way. The successful implementation of this solution led to a huge dollar impact.
12:25-1:30pm • Room: Golden Gate A
Lunch & Learn
Industrialization of Analytics – Enjoy the Journey
The explosion of Big Data, with exponential growth in the volume and variety of data, has both helped and complicated the process of generating insights for business. Through an exchange of ideas, we will discuss an Inside: Outside perspective on industrializing the process of converting data into insights, and insights into smart business decisions and action. We will take you through this journey via a series of anecdotes and case studies from two different perspectives. David Kreutter, VP Global Business Analytics and Insights at Pfizer, will present an expert view of his journey and how you can take your organization through it successfully. Pankaj Kulshreshtha, Business Leader, Analytics and Research at Genpact, will bring in his experience of analytics evolution journeys from partnering with clients across industries.
David Kreutter, VP Business Analytics and Insights, Pfizer
Analytics and the Presidential Elections
This talk will describe how the Obama Campaign used analytics to improve decision making in virtually every function within the organization. We'll talk about how data from a variety of sources was used to improve fundraising, volunteer recruiting and mobilization, and media targeting, and to optimize voter contacts. We will cover what kind of data was available to the campaign, what technologies were developed and/or used, and how the resulting products were adopted by the campaign in order to help win the presidential elections. Although the focus will be on the elections and politics, we'll also talk about lessons learned during the campaign and how some of the same techniques can be applied to other industries and organizations.
Vendor Elevator Pitches
Case Study: Minted.com
Mining Customer Behavior for Targeted Marketing
Modern marketing tools enable very targeted messaging and offers. The challenge is to understand our customers and prospects well enough to come up with the offer and surrounding message that will resonate in a timely way with each individual. There are two steps in meeting the challenge. First, we need to pick strategies that are practical and have real economic benefits. Second, we need to invent the tactics that enable these strategies.
In this case study, you will learn how Minted mines our "customer initiated actions" from logs of interactions with our site, responses to our email and print campaigns, and other customer touch points to target customers with optimal offers and engaging messages.
Anne-Elise Lansdown, Marketing Manager, Minted
Case Study: Netflix
Building a Data Science Team from Scratch
In this session, Netflix analytical leader Chris Pouliot shares his experience building a large team of data scientists at Netflix. He formed a central, horizontal team for the company, which spans all business verticals. Chris shares many interesting insights and stories, covering the pitfalls and successes experienced as he built the team, as well as the positive impact achieved at Netflix.
2:35-3:20pm • Room: Golden Gate A
Track 2: Advanced Methods
Case Study: Qualcomm
M.A.R.S. - an Underused Modeling Method
The flexibility and power of using the Multivariate Adaptive Regression Splines (M.A.R.S.) approach to predict the demand for a product or to find the optimal performance characteristics of a semiconductor chip will be discussed. Real-world examples will be given demonstrating the capture of trends such as weekly, daily, hourly, and holiday effects in a statistical model. The ease of using both numeric and text data will be illustrated. The approach will be compared to other approaches such as ARIMA time series, neural networks and multivariate regression.
Exhibits & Afternoon Break
Track 1: Search Engine Marketing
Case Study: Kelley Blue Book
Driving Search Engine Marketing with Deep Analytics
This session provides a deep dive into a case study about Search Engine Marketing (SEM) profit maximization and the techniques used to identify the best time to stop spending on cost-per-click advertising. We will share a simple user-friendly model, in addition to discussing the need for a more robust approach, to identify diminishing returns in SEM spend, while addressing core challenges of both approaches. This session will provide analysts and analytics leadership with an effective framework to improve SEM spend efficiency, accuracy and applicability to meet business objectives.
Track 1: Budgeting & Macroeconomics
Case Study: Hewlett-Packard
An Innovative Approach to Hedge Against Macroeconomic Uncertainties Affecting Businesses
Businesses continuously face margin pressures due to macroeconomic headwinds caused by inflation, exchange rates, interest rates and GDP. An innovative two-step prediction framework was developed to hedge against these externalities. In the first step, income statement variables (i.e., revenue, COGS, SG&A) are forecast, factoring in the impact of macroeconomic indicators. If there is a gap between the budgeted and forecast numbers, the second step establishes the relationship between cost drivers and income statement variables through multiple simulations. This enables leadership to run scenarios and achieve more realistic forecasts based on prevalent and forecast macroeconomic uncertainties.
Keshav Loomba, Manager - Global Analytics Finance, Hewlett-Packard
Ashish Kumar Singh, Global Analytics, Strategic Consulting and Business Planning, Hewlett-Packard
Case Study: Selective Insurance Group
Strategies and Considerations for Fraud Detection in Insurance Claims
By embedding analytics into its claim handling, Selective Insurance Group, Inc. (NASDAQ: SIGI), a top 40 P&C insurance carrier, has significantly increased its SIU activities as measured by the number and proportion of claims it classifies as "fraudulent." In general, insurance fraud can be very difficult to detect. Common challenges include text mining, unstructured data, censored data, small and biased samples, and measuring success. This presentation will focus on some of the lessons learned in the identification of insurance fraud and the challenges of deploying automated analytic tools in insurance claims handling.
Case Study: Orbitz
Delivering on Expectations: Core Competencies for Data Scientists
The McKinsey Global Institute predicts a need for 1.5 million additional managers and analysts in the United States who can ask the right questions and consume the results of the analysis of big data effectively. This session is geared for statistical modelers and advanced analytics professionals. You will learn about the changing landscape and the skill sets required for data miners in the new era of Big Data.
- While statistical modeling is not going away, analytics groups are advised to leverage machine-learning approaches as well.
- While traditional statistical modeling software packages are not going away, analytics groups need to actively embrace new skill sets in emerging software such as open-source tools (e.g., R, MongoDB) and Big Data tools (e.g., Hadoop).
Wenqing Lu, Director, Statistical Modeling and Analytics, Orbitz Worldwide
Case Study: Broadspire
To Sue or Not to Sue: Predicting Litigation Risk
Litigation is a major cost factor in handling casualty claims. Follow the development and testing of a "double barreled" litigation prediction application for our claims system and our parallel e-Triage system, which provides a richer data environment for certain types of insurance claims. This is a major enhancement of a robust predictive system now in use for over six years and an expansion of predictive know-how to control claim costs. See how we apply our continuous improvement philosophy to making predictive analytics a core competency inside an industry leading claims service.
Bangalore Gunashakar, Senior Technical Consultant, Broadspire
Sergo Grigalashvili, VP Architecture, Analytics, GSR, Crawford & Company
LOCF Programming in Clinical Trial Analysis
Pharmaceutical and biotech companies often conduct longitudinal studies on human subjects that span several visits (weeks or months). There are many situations where subjects do not follow instructions, skip scheduled visits, or drop out of the study altogether. These result in missing data in the datasets.
LOCF, which stands for "Last Observation Carried Forward," is a common imputation method used to fill in missing values and missing visits in Clinical Trial Analysis.
This paper will explain the LOCF concept, demonstrate the programming framework, and introduce sample SAS code to accomplish LOCF.
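The LOCF idea described above can be sketched in a few lines. The paper itself presents SAS code; the Python below is only an illustrative stand-in, and the function name `locf`, the subject IDs, and the measurements are hypothetical. Missing visits are represented as `None`, and a value is carried forward only within the same subject.

```python
from typing import Optional

def locf(values: list[Optional[float]]) -> list[Optional[float]]:
    """Carry the last observed value forward over missing (None) entries."""
    filled = []
    last = None  # nothing to carry until the first real observation
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

# Hypothetical subjects; each list holds one measurement per scheduled visit.
visits = {
    "subj_01": [5.1, None, 4.8, None, None],
    "subj_02": [None, 6.0, None, 6.3, None],
}
# Impute each subject separately so values never leak across subjects.
imputed = {subject: locf(series) for subject, series in visits.items()}
```

Note that a missing first visit stays missing, since there is no earlier observation to carry forward.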
SAS – Integrated Object Model
Kenneth M. Lin
Fun with SAS Integrated Object Model (IOM)
Creating Interactive SAS Driven Report Infrastructure using MS Excel and VBA
For more information on the event please go to http://www.basas.com/
Bay Area useR Group Meeting
BARUG will be holding this meeting jointly with the SVForum. So, please be on the lookout to introduce yourself to SVForum members.
Our main speaker for the evening will be Bryan Lewis, author of several R packages and Chief Data Scientist at Paradigm4. Bryan will demonstrate the scidb package for R.
Tess Nesbitt of Upstream will lead off with a lightning talk about time-to-event statistical models she builds with R to help customize marketing strategies.
For more details see the BARUG meetup page.
Conference Day 2: Tuesday, April 16, 2013
Registration & Networking Breakfast
Conference Chair Welcome Remarks
Diamond Sponsor Presentations
Transform Your Future with Predictive Analytics
"The measure of intelligence is the ability to change." – Albert Einstein. Never has this statement been truer than in today's world of real-time data, where an organization or an individual can either prosper or perish by the quality of a decision. The data has always been out there. Finally, now we possess the technology to truly harness it. Imagine what might happen if you could run "what-if" scenarios at lightning speed. Or empower everyone in your organization to visualize the possibilities. Now is the time, and you are the person to decide to seize that opportunity. What would you do tomorrow if there were no limitations? Join John MacGregor, VP and Head of the Centre of Predictive Analytics at SAP, to discover how new technologies can help you transform your future.
Putting IBM Watson to Work
IBM's Watson captured the imagination of millions when it beat the all-time champions of the US game show Jeopardy! To do so, it overcame traditional limitations of computers by communicating in natural human language, churning through 200 million pages of unstructured data to find answers in three seconds, and learning from each experience to improve performance over time. But as impressive as this accomplishment was, it was only the beginning. IBM is working closely with leading organizations in a variety of industries to put Watson to work. The possibilities are endless! Join Edward Nazarko, a leading IBM Architect, in an engaging discussion of ways that Watson is using predictive models to revolutionize expectations of how computers can help organizations in all industries live and work better.
Exhibits & Morning Coffee Break
Gold Sponsor Presentation
Applying Predictive Analytics in Real-time
This product demonstration illustrates how events are processed through customized models hosted on the Inkiru Predictive Intelligence™ Platform. Every time an event occurs on a merchant's website, it gets logged in our graphing and metric database. Attendees will learn how we traverse our graph, map it to corresponding metrics, run data through the customized models, and return a highly accurate score in less than 500 milliseconds. Once the merchant is provided with the score, they can personalize their interaction for each customer.
Track 1: Industry News
Using Analytics to Build Your Analytics Bench: Announcing 2012 Analytics Professionals Study Results
Many innovative businesses and IT organizations appreciate the competitive advantage analytics capabilities can provide and have ambitions to reach increasing levels of analytics maturity. However, the well-documented shortage of analytic talent leaves many firms without a strong analytic talent bench and little knowledge about how and where to find analytics professionals needed to get there. In this presentation, Greta Roberts will discuss results from a major 2012 Study of Analytics Professionals that crosses industries, experience and skills. Practical insights shared include key best practices, trends and correlations that lend unexpected insight into building a strong and scalable analytic talent bench.
Predictive Analytics (PA) has become increasingly mature as a technical discipline over the past decade in part because it stands on the shoulders of the related disciplines of data mining and machine learning. However, there are recurring themes that permeate discussion boards and conferences that have become my personal pet peeves. This talk examines five of them and why they matter to practitioners, including why we must have humility in how far data science and algorithms can take us, and the value of business objectives, measuring "success," and measuring "significance."
Track 1: Cross-Enterprise Analytics
Case Study: Monster Worldwide
Win With Advanced Analytics
Monster was the pioneer in the online recruitment industry. To maintain its competitive advantage, it has taken the data-driven road using research, business intelligence, predictive analytics and text analytics. Join this session to hear how Monster went from good to great using business analytics to support its overall decision-making process across all regions. Jean-Paul Isson will provide highlights from his new book, "Major Steps to Win with Analytics with the Big Data." He will also discuss Monster's success with increasing customer retention, market share and customer profitability, while managing competition from paid sites, free sites and social networks.
Customer loyalty continues to be a challenge for any marketer, and more so in the retail trading business. WB Mason, an office supplies provider, found this when trying to gauge the stickiness of its loyal SMB customer base. A 20% loss in each successive purchase cycle called for a model that captures not only when these customers are likely to come back for a repurchase but also the offer that WB Mason should make at that point. We use a repeat purchase model and eventually a multinomial logit model to understand customer purchase behavior.
11:40–12:00pm • Room: Salon 5 & 6
Track 1: Lead Management
Case Study: Citrix
How Predictive Analytics Changes the Game by Front-Ending the Funnel
In owning the front end of the revenue funnel, marketing has a unique vantage point. In this session, learn how Citrix leveraged the data and insights acquired to build a finely tuned marketing and lead management strategy for their market-leading cloud, collaboration, networking and virtualization technologies. By applying predictive analytics to their model, Citrix was able to increase campaign effectiveness and the lead-to-opportunity conversion rate. Get guidance from Citrix on how you can apply these methods to increase marketing's contribution to the pipeline.
12:05–12:25pm • Room: Salon 5 & 6
Track 1: Churn Modeling
Case Study: Paychex
Customer Retention: Pulling the Needle from the Haystack
In these economic times, it is critical for businesses to have a firm hold on client retention, with businesses excelling in this arena better positioned for long-term success. To optimize the value of retention efforts, it's essential to understand which clients are the best fit for retention campaigns. In this session, we will review how Paychex leveraged two existing models, the Paychex Attrition Model and a custom-built Lifetime Value Model, to create a Retention Tracking System (RTS). Since being deployed across the entire branch network, the RTS has become an invaluable resource as offices nationwide strive to meet, and exceed, retention goals.
11:40–12:05pm • Room: Golden Gate A
Track 2: Brand Analytics
Case Study: Dell
The Illusive Brand: How to Measure Brand and the Communications Focused On It
Measuring brand health is very difficult and can be convoluted. Often, if you have multiple metrics such as NPS or survey results, they will not agree on how your brand health is changing. Helping business leaders understand how they can impact brand health is even more difficult. Natalie will present ideas on how to model marketing's impact on the brand, how to measure the long-term impact of an overriding campaign, and how to handle differing trends from various brand health metrics. In addition, we will discuss how to explain these models and their errors to decision makers.
Platinum Sponsor Presentation
Predicting a fraudulent payment – is it possible?
Investigating disbursement fraud typically happens ex post facto. The deed is done; the money has left the account. On the principle that prevention is better than detection, organizations need ways to stop the leakage of their funds before it is too late. The challenge in prevention of this nature is determining which disbursements may be fraudulent or inappropriate, in real time, without throttling operational effectiveness or hurting the overall bottom line.
Given the rapid evolution of predictive analytic capabilities, perhaps an innovative way forward would be to consider an in-line, predictive, risk-driven continuous monitoring mechanism, which filters pending disbursements based on changing risk indications from events preceding the actual payment.
This presentation walks through an innovative approach and architecture, showing how this might be practically achieved in a corporate setting with today's techniques.
Special Plenary Session
General Lessons We Can Learn from Blackbox Trading
Beating the market with skill, rather than luck, is so hard that it's arguably impossible. A strong working approximation is that markets are efficient - that prices reflect available information almost instantaneously. Accordingly, we have failed often. But our success building quantitative investment systems has been great - most notably with a hedge fund that beat the S&P 500 every year for a decade, with only two-thirds of the risk (volatility). This talk will highlight key lessons learned from the long battle, and how those insights have helped solve many other predictive analytics challenges.
Big Data for Predictive Analytics
Moderator: Eric Siegel, Ph.D., Predictive Analytics World
If Big Data begs the question, "What to do with all this data?" predictive analytics answers, "Learn from it to predict behavior." But just how much predictive payoff comes with going so big? This expert panel will address the new demands on predictive analytics solutions and best practices as data grows to enormity, and will recommend tactics to fully leverage data's growing magnitude to improve the business performance of predictive analytics initiatives.
Anil Kaul, CEO and Co-founder, AbsolutData
Eric Feinberg, Senior Director of Mobile, Media and Entertainment, ForeSee
Exhibits & Afternoon Break
3:50–4:35pm • Room: Salon 5 & 6
Track 1: Data Visualization
Case Study: Wells Fargo
Data Visualization Design Using Shneiderman's Mantra: Overview First, Zoom and Filter, Then Details-On-Demand
This session explores applications of Shneiderman's mantra for visual data analysis (overview first, zoom and filter, then details-on-demand) as a framework in the context of three complex analytical applications at Wells Fargo:
- Analytics Process
- Interactive Meeting Facilitation
- Dashboard Design
Dana Zuber, Strategy and Analytics Executive, Wells Fargo
3:50–4:10pm • Room: Golden Gate A
Track 2: Likelihood-to-Recommend
Case Study: AAA
Multicollinearity and Sparse Data in Key Driver Analysis: Challenges and Solutions
AAA-NCNU is a membership organization with multiple products: ERS, insurance, travel, and car care. Determining key drivers of Likelihood-to-Recommend is complicated by the multicollinearity among some attributes and by other attributes being filtered based on experience. The first challenge was addressed through Shapley Regression, while the second was handled through bivariate linear regressions adjusted for incidence. With these two methods, we were able to estimate the relative impact of the drivers expressed in percentages. The company used the results for prioritizing decisions and allocating resources.
Raymond Reno, Senior Vice President, Market Strategies International
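For readers unfamiliar with Shapley Regression as named in the AAA abstract above, the sketch below shows the usual idea: each driver's importance is its average marginal gain in R² over every order in which drivers could enter the model, which splits the fit fairly among correlated drivers. This is a generic illustration, not the presenters' implementation. The driver names and the R² values per subset are made up; in practice each R² would come from fitting a regression on that subset of drivers.

```python
from itertools import permutations

def shapley_shares(r2_by_subset, predictors):
    """Average each predictor's marginal R^2 gain over all entry orders."""
    totals = {p: 0.0 for p in predictors}
    orders = list(permutations(predictors))
    for order in orders:
        included = frozenset()
        for p in order:
            with_p = included | {p}
            # Gain in fit from adding p to the predictors already included.
            totals[p] += r2_by_subset[with_p] - r2_by_subset[included]
            included = with_p
    return {p: totals[p] / len(orders) for p in predictors}

# Hypothetical drivers and hypothetical R^2 for every subset of them.
predictors = ["price", "service"]
r2_by_subset = {
    frozenset(): 0.0,
    frozenset({"price"}): 0.30,
    frozenset({"service"}): 0.40,
    frozenset({"price", "service"}): 0.50,
}
shares = shapley_shares(r2_by_subset, predictors)
# Express each driver's share as a percentage of the full-model R^2.
percent = {p: 100 * s / r2_by_subset[frozenset(predictors)] for p, s in shares.items()}
```

The shares always sum to the full-model R², which is what allows drivers to be reported as percentages. The number of entry orders grows factorially with the number of drivers, so this brute-force form only suits a small driver set.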
4:15–4:35pm • Room: Golden Gate A
Track 2: Insurance & HR Analytics
Case Study: Crawford Global Technical Services
Using Predictive Analytics for Strategic Planning at Crawford GTS
When the nature of the business depends heavily on natural events, market conditions, and individual professional relations, how does Crawford & Company optimally manage a global work force of 500+ executive general adjusters? We mine proprietary data for the most complex insurance claims to forecast demand by geography, industry, insurer, and peril. We also analyze the work force profile and combine the forecasted demand with supply for strategic planning for each region and industry. This presentation covers the approach we use to manage hundreds of models cost effectively for three objectives:
- Managing Global Work Force
- Optimizing Business Operations
- Improving Client Relations
Dr. Andries Willemse, SVP, Crawford Global Technical Services, Crawford & Company
4:40-5:30pm • Room: Salon 5 & 6
Track 1: Data Visualization
Case Study: Blue Shield of California
No Country for Fat Men - Investigating Obesity with Visual Analytics
Visual analytics is gaining importance due to the explosion of data availability and processing capabilities. In this example, we demonstrated the power of visual analytics to investigate various aspects of obesity using a readily available commercial product called Tableau on the CHIS (California Health Interview Survey). A recent JAMA article claimed that there was no time to waste in doing obesity research and a broad-based effort was needed. Since CHIS tracked responses to hundreds of questions, our demonstration provided an excellent example of how visual analytic tools could empower end-users to find interesting relationships within a morass of data.
4:40-5:30pm • Room: Golden Gate A
Track 2: HR Analytics
Case Study: ConAgra Foods
Aging of the Baby Boomer Generation and the Upcoming Talent Tsunami
Over the next 10 years, approximately 40% of ConAgra Foods' workforce will become retirement eligible. With 25,000 employees across seven business units and almost 90 locations, the task of understanding, analyzing and solving for the demand was monumental. As a $14 billion consumer packaged goods company, ConAgra Foods' "strategy" or "Recipe for Growth" would be fueled by transforming the organization's culture, winning in talent acquisition, and accelerating development of existing employees. During this session, we will discuss how to leverage existing data and predictive analytics to transform the organization's approach to talent.
Sara Roberts, Leader, Advanced Analytics, ConAgra Foods, Inc.