Predictive Analytics World for Business New York 2017

Oct 29-Nov 2, 2017 – Jacob Javits Convention Center


Workshops - Sunday, October 29th, 2017

8:00 am
Full-day Workshop

"Big Data" is everywhere. The topic is impacting every industry and institution. Big excitement about big data comes from the intersection of dramatic increases in computing power and data storage with growing streams of data coming from almost every person and process on Earth. The pressing question is, how do we best make value of all this data - what should we do with it?

Session description
Instructor
Vladimir Barash, Chief Scientist, Graphika
Morning Session Workshop

This morning workshop launches your tenure as a user of R, the well-known open-source platform for data science and machine learning. The workshop stands alone as the perfect way to get started with R, or may serve to prepare for the more advanced full-day hands-on workshop, “R for Predictive Modeling”.

Session description
Instructor
Max Kuhn, Software Engineer, RStudio
12:00 pm
Full-day Workshop

This one-day session provides a hands-on introduction to R, the well-known open-source platform for data analysis. Real examples are employed in order to methodically expose attendees to best practices driving R and its rich set of predictive modeling (machine learning) packages, providing hands-on experience and know-how. R is compared to other data analysis platforms, and common pitfalls in using R are addressed.

Session description
Instructor
Max Kuhn, Software Engineer, RStudio
4:30 pm
End of Full-day Workshop with Vladimir Barash
7:30 pm
End

Predictive Analytics World for Business - New York - Day 1 - Monday, October 30th, 2017

(PAW Financial & PAW Healthcare run in parallel on this day - dual registration required)
8:00 am
Registration & Networking Breakfast
8:45 am
Conference Chair Welcome
Eric Siegel, Conference Founder, Machine Learning Week
8:50 am
Room: 1E10
Keynote:

Welcome to the Analytics Explosion! Despite speculation that the need for analytics would begin to level off, evidence suggests it continues to be at an all-time high. Trends show the establishment of more and more in-house analytics teams, allowing the luxury of predictive and prescriptive analytics to be applied across all levels of an organization. However, many factors should be considered when evaluating an analytics undertaking, such as the complexity of the problem, the precision necessary in the solution, and the timeliness required for the response. With so many variables, how do you choose the right analytics tool for the job? What else is required for an analytics effort to be successful? Leveraging a dynamic analytical approach will achieve the greatest value for your business.

Session description
Speaker
Anne G. Robinson, Chief Strategy Officer, Kinaxis
9:40 am
Diamond Sponsor Presentation:
The Session Description will be available shortly.
Session description
Sponsored by
DataRobot
10:00 am
Exhibits & Morning Coffee Break
10:30 am
Track 1—BUSINESS: Analytics strategy & operationalization
Crisis response; analytics management
Lessons from: NYC Mayor's Office

Predictive analytics has proven to be a highly useful tool in the public sector, but what happens when an emergency strikes and we have to build an entire analytics infrastructure from scratch? In this case study, the NYC Mayor's Office of Data Analytics (MODA) will walk you through how the City of New York built a system to collect, monitor, and predict the presence of potentially disease-carrying cooling towers (Legionnaires' Disease) among New York's one million-plus buildings in less than a week.

Session description
Speaker
Simon Rimmele, Associate, Analytics, NYC Mayor's Office of Data Analytics
Track 2—TECH: Predictive modeling & machine learning methods
Hand-labeled training data
Case Study: Bloomberg L.P.

Machine learning models often depend on large amounts of training data for supervised learning tasks. This data may be expensive to collect, especially if it requires human labeling (as in document or image classification), and it raises some particular quality issues. For example, how do we ensure that human agreement is high, and what do we do in the event that it is not? Also, when your data is expensive to tag, how do you ensure that you have the smallest set possible that is representative of all your features? This talk will address these and other issues associated with gathering crowd-sourced, hand-coded data sets for supervised machine learning models.

Session description
Speaker
Leslie Barrett, Senior Software Engineer, Bloomberg LP
Track 3—MARKETING: Marketing & market research analytics
Churn modeling
Case Study: Paychex

Small businesses are susceptible to even minor changes in economic conditions. When strategically allocating retention efforts across half a million businesses, we need to account for said changes as well as maximize our resource allocations. Traditional modeling techniques can fail over time in the presence of concept drift. We devised an innovative method to account for unknown changes by using a seasonal model and a trend model to probabilistically assign retention efforts. Additionally, we built in functionality that allows new variables to be considered for development as they become relevant. This new methodology removes the necessity for annual retools and stabilizes performance.
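
The abstract does not give implementation details. Purely as an illustration of blending a seasonal score with a trend score to probabilistically assign retention effort, here is a minimal sketch on hypothetical data (not Paychex's models or numbers):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical per-account churn scores from two separate models (stand-ins only):
# a seasonal model capturing time-of-year patterns and a trend model capturing drift.
seasonal_score = rng.uniform(0, 1, size=10)
trend_score = rng.uniform(0, 1, size=10)

# Blend the two scores; the weight could be tuned on a holdout window.
w = 0.5
blended_risk = w * seasonal_score + (1 - w) * trend_score

# Probabilistically assign retention effort in proportion to blended risk,
# rather than with a hard cutoff, so the allocation adapts as the scores drift.
assign_retention = rng.random(10) < blended_risk
print(np.column_stack([blended_risk.round(2), assign_retention]))
```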

Session description
Speaker
Rob Rolleston, Manager, Data Science, Paychex
11:20 am
Track 1—BUSINESS: Analytics strategy & operationalization
Education and team building
Lessons from: LinkedIn

LinkedIn Learning's mission includes training the enormous number of people who need data science and business analytics skills. But what's the best way to assess market demand and develop tools for validation in stack-ranking skills coverage? How do you go from a handful of data science courses to over 100 in less than a year, and make them as effective as possible? How do you find the best instructors? How and why does this approach contrast with standard classroom education and with alternative learning formats like MOOCs? LinkedIn, a data company, uses analytics to answer these questions and to guide our strategy.

Session description
Speaker
Steve Weiss, Content Manager, Data Science and Business Analytics, LinkedIn
Track 2—TECH: Predictive modeling & machine learning methods
Time series modeling

The problem of predicting, for each type of crime, the crime frequency in a specific area on a specific day can be framed as a regression problem on crime frequencies and Twitter data: given (1) the last 31 days of Twitter activity geo-tagged in the immediate area, (2) the last 31 days of Twitter activity in the general area, and (3) the historical crime frequencies of the general area for the past year, predict the crime frequency for the next day in that location. Inherent in this problem description is the following hypothesis: the taxonomy used in tweets from around the area (such as a large fraction of restaurants with low ratings, or lots of tweets about how unsafe people feel in that area) contains information that could be used to build predictors of future crime frequencies for different types of crime. Assuming that the time series can be modeled as a deviation from a periodic function, and incorporating this assumption into the model, may produce better crime frequency estimates than directly predicting crime frequencies. The proposed research has implications for decision makers concerned with geographic spaces occupied by Twitter users. This session will cover these analytical results, which were produced by an extended group of graduate students and researchers at New York University.
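
As a rough illustration of the "deviation from a periodic function" assumption, the sketch below fits a day-of-week baseline to synthetic crime counts and regresses the residual on stand-in Twitter-activity features. It is only a simplified illustration, not the NYU team's actual model or data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_days = 365
day = np.arange(n_days)

# Hypothetical daily crime counts for one area (stand-in data).
crime = 20 + 5 * np.sin(2 * np.pi * day / 7) + rng.normal(0, 2, n_days)

# Hypothetical Twitter-activity features: tweet volume in the immediate and general area.
tweets_local = rng.poisson(50, n_days).astype(float)
tweets_area = rng.poisson(500, n_days).astype(float)

# Step 1: fit a periodic (day-of-week) baseline, per the periodic-function assumption.
dow = day % 7
baseline = np.array([crime[dow == d].mean() for d in range(7)])[dow]

# Step 2: regress the deviation from the baseline on the Twitter features.
X = np.column_stack([tweets_local, tweets_area])
resid_model = LinearRegression().fit(X, crime - baseline)

# Next-day forecast = periodic baseline + predicted deviation.
next_dow = n_days % 7
next_baseline = crime[dow == next_dow].mean()
next_features = np.array([[tweets_local[-1], tweets_area[-1]]])
forecast = next_baseline + resid_model.predict(next_features)[0]
print(round(forecast, 1))
```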

Session description
Speakers
Anasse Bari Ph.D., Professor of Computer Science - Director of the AI and Predictive Analytics Lab, New York University
Chuan-Heng Lin, Machine-Learning Engineer, Pienso
Aaron McKinstry, Computer Scientist, Courant Institute of Mathematical Sciences of New York University
Gen Xiang, Software Engineer, Trinnacle Capital Management
Track 3—MARKETING: Marketing & market research analytics
Churn modeling
Case Study: Atlassian

Measuring customer churn is a key aspect of marketing data science regardless of the type of product a company is selling. In fact, identifying the most predictive features is just as important as identifying the users at risk of churning, because it can help marketing and sales teams alike adopt the most appropriate strategy to retain their customers, improve their product, and identify new opportunities. This presentation will describe a decision tree-based methodology to compute churn likelihood, and will discuss which attributes, behavioral features, or character traits are the most beneficial to our model.

Session description
Speaker
Jennifer Prendki, VP of Machine Learning, Figure Eight
11:40 am
Track 3—MARKETING: Marketing & market research analytics
Market research and analytics
Case Study: Verizon Wireless

Using robust data anonymization, safeguarding, and security, Verizon linked thousands of brand health survey participants to their actual customer database records to see how behaviors like increased data usage or phone upgrades could predict changes in surveyed ratings of the brand. This presentation will discuss the business value of analytics around the brand then delve into the analytic methods and selected results.

Session description
Speakers
Michael E. Gooch-Breault, Director, Consumer and Marketplace Insights, Verizon Wireless
Jade Xi, Consultant, Predictive/Prescriptive Analytics, Verizon
12:05 pm
Lunch in Exhibit Hall
12:25 pm
Lunch & Learn
Sponsored by
DataRobot
1:15 pm
Lunch in Exhibit Hall
1:30 pm
Keynote

In the context of building predictive models, predictability is usually considered a blessing. After all - that is the goal... to build the model that has the highest predictive performance. The rise of 'big data' has, in fact, vastly improved our ability to predict human behavior, thanks to the introduction of much more informative features.

However, in practice, the target variable is often more differentiated than accounted for in the data. For example, some customers churn (from a telecom provider) because they are moving, others because they got a better offer in the mail, and still others because their home is in a location with terrible reception. These are all positives for a model that learns to predict churn, but the predicted outcome has occurred for very different reasons. In many applications, such mixed scenarios mean the model will automatically gravitate to the scenario that is easiest to predict at the expense of the others. This holds even if the predictable scenario is far less common or relevant. In the worst case, predictive models can introduce biases NOT even present in the training data.

In this talk, we will cover a number of applications where this takes place: clicks on ads being performed 'intentionally' vs. 'accidentally', consumers visiting store locations vs. their phones pretending to be there, and finally customers filling out online forms vs. bots defrauding the advertising industry. In conclusion, the combination of different and highly informative features can have a significantly negative impact on the usefulness and ethics of predictive modeling.

Session description
Speaker
Claudia Perlich, Chief Scientist, Dstillery
2:15 pm
Diamond Sponsor Presentation
The Session Description will be available shortly.
Session description
Sponsored by
diwo
2:40 pm
Track 1—BUSINESS: Analytics strategy & operationalization
Analytics strategy
Lessons from: The Clorox Company

The need to be more consumer (data) centric, the availability of disparate sources of granular data, and rapidly advancing technology, techniques, and skills make it both necessary and feasible for consumer packaged goods (CPG) companies to embed data science into their analytics strategy to further drive growth and innovation. In this session, we will discuss factors that make data science (in CPG) difficult, the organizational maturity curve of CPG analytics, data and techniques that can be used for driving the consumer journey, some techniques and use cases to get started, some in-house examples at Clorox, and some key learnings in the journey so far.

Session description
Speaker
Payel Chowdhury, Associate Director - Data Science, The Clorox Company
Track 2—TECH: Predictive modeling & machine learning methods
Analytical methods

The use of machine learning is a common theme in organizations today, yet most people still struggle with its definition given its many different levels. In this session, we attempt to eliminate this confusion by exploring a number of machine learning algorithms ranging from the simple to the more complex. We observe the use of these algorithms across a variety of industries as well as different behaviours such as customer response and customer risk. Alongside the comparison of machine learning algorithms, we also look at the impact of the data and how feature engineering impacts a given solution.

Session description
Speaker
Richard Boire, President, Boire Analytics
Track 3—MARKETING: Marketing & market research analytics
Marketing applications

Lookalike Audience is a way to reach new people who are likely to be interested in your business because they're similar to your best existing customers. By implementing Facebook Big Data best practices you will be able to create value-based Lookalike Audiences to help you reach more people who resemble your current high-value customers and to showcase products they are most likely to purchase.

Prospecting strategies based on recent case studies with Intelligent Blends, Lenny Lemons, Daily Fast Deal, Gearvilla and The Gadget Mole have already shown a 3x-5x CTR increase, 69% lower cost per click, and a 4.5x positive return on ad spend, along with conversion increases across other ad campaigns.

Session description
Speaker
Kristina Pototska, Growth Product Manager, RetargetApp
3:05 pm
Track 3—MARKETING: Marketing & market research analytics
Case Study: Becker College

Predictive modeling has gained popularity in studying college enrollment due to fierce competition in higher education. To make informed decisions and allocate limited resources to improve enrollment, predictive modeling has been applied to challenge and change the traditional recruitment process. This session is intended for two learning outcomes: Participants who are not familiar with predictive modeling will learn how to lay out a plan to collect and build a comprehensive data infrastructure and conduct predictive modeling. Participants who have run predictive modeling will learn how to critically examine the quality of their predictive analyses.

Session description
Speaker
Feyzi Bagirov, Senior Machine Learning Engineer, Booz Allen Hamilton
3:25 pm
Exhibits & Afternoon Coffee Break
3:55 pm
Track 1—BUSINESS: Analytics strategy & operationalization
Analytics strategy
Lessons from: Prudential Financial

Data analytics has been one of the hottest areas of investment for companies betting on their future, hoping to keep up with and improve their competitive standing in the industry. Yet many C-level executives have doubts about the return and the value these investments will bring. In this talk, we will first review the challenges companies face in planning and executing analytics projects. Then we will examine various types of analytics innovations and the different value outcomes these innovations may bring.

Session description
Speaker
Wayne Huang, Director of Analytics, Prudential Financial Inc.
Track 2—TECH: Predictive modeling & machine learning methods
Analytical methods
Case Study: Citigroup

The model variable selection process is a key component of predictive analytics. Whereas logistic regression depends on feature selection/discovery being done beforehand, tree-based machine learning approaches such as Random Forests and GBM intrinsically perform feature selection. However, the default options for Random Forests and GBM are biased toward continuous variables and less favorable to categorical and binary variables, and unbiased solutions are very computationally intensive. We propose a new approach that uses the intrinsic feature selection of Random Forests and GBM to detect non-linear relationships, enact scaling, and find interactions, and then integrates these non-linear transformations into a traditional approach. A 7-10% lift is observed.
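
The abstract describes combining tree-ensemble feature selection with a traditional model. The sketch below is a minimal, generic illustration of that pattern on synthetic scikit-learn data; the actual Citigroup approach, transformations, and data are not public:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic stand-in data.
X, y = make_classification(n_samples=5000, n_features=40, n_informative=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Step 1: let a tree ensemble do intrinsic feature selection / non-linearity detection.
gbm = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
top = np.argsort(gbm.feature_importances_)[::-1][:10]   # keep the 10 most important inputs

# Step 2: feed the selected (and, in practice, transformed) features into a traditional
# logistic regression, the "integration" step described in the abstract.
logit = LogisticRegression(max_iter=1000).fit(X_tr[:, top], y_tr)

print("GBM AUC:   ", round(roc_auc_score(y_te, gbm.predict_proba(X_te)[:, 1]), 3))
print("Hybrid AUC:", round(roc_auc_score(y_te, logit.predict_proba(X_te[:, top])[:, 1]), 3))
```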

Session description
Speaker
Yulin Ning, Senior Director in Global Decision Management, Citigroup
Track 3—MARKETING: Marketing & market research analytics
Churn modeling; uplift modeling
Case Study: The Co-operators

A wide variety of models exist for predicting customer retention, each with different data requirements and model outputs. These diverse modeling options make it difficult to identify the approach with the highest potential predictive power for increasing retention. This case study explains how a survival analysis approach to predicting household retention was replaced by a more complicated but more precise "true-lift" model. This model was used for targeted marketing campaigns undertaken to increase household retention for a Canadian insurer, with performance monitored over one year. The "true-lift" model performed better than the retention model.
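
One common way to build a "true-lift" (uplift) model is the two-model approach: score each household under treatment and under control, and target by the difference. The sketch below is a minimal illustration of that approach on synthetic data; it is not necessarily the method used in this case study, and all column names and figures are hypothetical:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 10000

# Hypothetical campaign data: 'treated' marks households that received the retention
# offer, 'retained' marks whether they renewed (stand-in data).
df = pd.DataFrame({
    "tenure_years": rng.integers(0, 20, n),
    "num_policies": rng.integers(1, 5, n),
    "treated": rng.integers(0, 2, n),
})
base = 0.6 + 0.01 * df.tenure_years
df["retained"] = (rng.random(n) < base + 0.05 * df.treated).astype(int)

features = ["tenure_years", "num_policies"]

# Two-model uplift approach: one model on treated households, one on controls.
m_t = RandomForestClassifier(n_estimators=100, random_state=0).fit(
    df.loc[df.treated == 1, features], df.loc[df.treated == 1, "retained"])
m_c = RandomForestClassifier(n_estimators=100, random_state=0).fit(
    df.loc[df.treated == 0, features], df.loc[df.treated == 0, "retained"])

# Uplift = predicted retention if contacted minus predicted retention if left alone;
# households are targeted in descending uplift order.
df["uplift"] = m_t.predict_proba(df[features])[:, 1] - m_c.predict_proba(df[features])[:, 1]
print(df.nlargest(5, "uplift")[["tenure_years", "num_policies", "uplift"]])
```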

Session description
Speaker
Emilie Lavoie-Charland, Research & Innovation Analyst, The Co-operators
4:45 pm
Track 1—BUSINESS: Analytics strategy & operationalization
Building Data Science Teams
Lessons from: Comcast

As more and more companies strive to recruit and build data science teams, they must also identify how to structure those teams within the organization to drive innovative data science applications that will propel their business. This session will cover key considerations in structuring a data science organization capable of generating the big analytical ideas that will have the biggest impact on pushing the business forward.

Session description
Speaker
Bob Bress, Head of Data Science, Freewheel, A Comcast Company
Track 2—TECH: Predictive modeling & machine learning methods
Forecasting analytical methods
Case Study: Micron Technology

Accurate forecasting of customer demand for products is critical to success in the semiconductor industry. A diverse product portfolio, detailed customer mappings, dynamic market conditions, and extended production cycle times all sharpen the need for a reliable and responsive automated forecasting solution. I will describe a custom forecasting model recently developed by and now in use at Micron Technology Inc. that combines machine learning algorithms, established time series modeling techniques, and human expertise in leveraging existing forecasting infrastructure to achieve significant improvements in forecast accuracy across multiple levels of the product and customer hierarchy.

Session description
Speaker
Colin Ard, Senior Enterprise Data Scientist, Micron Technology, Inc.
Track 3—MARKETING: Marketing & market research analytics
Optimizing outreach; uplift modeling

In the current environment, media consumption is fragmenting, cord cutters are an increasingly large segment of the population, and "digital" is no longer a ubiquitous, single medium. As such, large companies and other organizations looking to do outreach at scale to change individuals' behavior have an overwhelming number of choices for how to deploy their outreach resources. In this talk, Daniel Porter, co-founder and Chief Analytics Officer of BlueLabs, will discuss how current tools that combine uplift models with state-of-the-art allocation algorithms make it possible for organizations ranging from Fortune 100 companies to presidential campaigns to large government agencies to optimize these decisions at the individual level, ensuring delivery of the right message to the right person at the right time, through the media channels where each individual is most likely to engage positively with the content.

Session description
Speaker
Daniel Porter, Co-Founder, BlueLabs
5:30 pm
Networking Reception
Sponsored by
DataRobot

Predictive Analytics World for Business - New York - Day 2 - Tuesday, October 31st, 2017

(PAW Financial & PAW Healthcare run in parallel on this day - dual registration required)
8:00 am
Registration & Networking over Coffee
8:35 am
Eric Siegel, Conference Founder, Machine Learning Week
8:40 am
Special Plenary Session

Every analytics challenge reduces, at its technical core, to optimizing a metric. Product recommendation engines push items to maximize a customer's purchases; fraud detection algorithms flag transactions to minimize losses; and so forth. As modeling and classification (optimization) algorithms improve over time, one could imagine obtaining a solution merely by defining the guiding metric. But are our tools that good? More importantly, are we aiming them in the right direction? I think, too often, the answer is no. I'll argue for clear thinking about what exactly it is we ask our computer assistant to do for us, and recount some illustrative war stories. (Analytic heresy guaranteed.)

Session description
Speaker
John Elder Ph.D., Founder & Chair, Elder Research
9:25 am
Plenary Session

In the spring of 2017, over a thousand analytic professionals from around the world participated in the 8th Rexer Analytics Data Miner Survey. In this PAW session, Karl Rexer will unveil the highlights of this year's survey results. Highlights will include:

  • key algorithms
  • challenges of Big Data Analytics, and steps being taken to overcome them
  • trends in analytic computing environments & tools
  • analytic project deployment
  • job satisfaction
Session description
Speaker
Karl Rexer, President, Rexer Analytics
9:40 am
Diamond Sponsor Presentation

In this session, we will discuss how using data and analytics within a Machine Learning environment has become the latest trend in using analytics for sales & marketing. This approach is set to become one of the most widely accepted means of improving campaign effectiveness through use of propensity and response predictive modeling.

Session description
Sponsored by
Dun & Bradstreet
Speaker
Kelley Gazdak, Global Vice President Data & Analytic Solutions, Dun & Bradstreet
10:00 am
Track 1—BUSINESS: Analytics strategy & operationalization
Getting it deployed
Lessons from: Honeywell

How often do analytic projects fail to drive value or monetization? How often does great analytic work go to waste? The primary reason for failure to realize value from analytic investment is the lack of ability to deploy to market. Advances in Big Data technology have enabled companies to deploy analytics in a consistent and expedient way to begin realizing the value of their investment.

So what does all this change really mean to you as a data scientist? How do you not only master the shifting environment - but use it to thrive?

In this talk, Bill Groves, Chief Data Scientist & Analytics Officer at Honeywell International, will explore how to operationalize analytics.

  • Why the "last mile" is critical and so often forgotten
  • Organizing to optimize return on analytics
  • Leveraging emerging technologies in data and analytics to increase speed to market and deployment

There's never been a better time to be at the forefront of data science. Ride the wave of change today - and into tomorrow.

Session description
Speaker
William Groves, Chief Data & Analytic Officer, Honeywell International
Track 2—TECH: Predictive modeling & machine learning methods
Data quality

Three Steps for Improving Data Quality for Predictive Analytics

Bad data is the enemy of predictive analytics. From confusion about units of measure, to missing definitions, to data values that are just plain wrong, bad data gets in the way or, worse, leads predictions awry. Unfortunately, there is no "silver bullet" solution for all the problems that can arise.

Fortunately, practitioners can take steps to alleviate the bad data problems:

  1. Understand quality levels and evaluate whether the data are fit-for-use in their analyses.
  2. Use "rinse, wash, and scrub" routines to clean some bad data, and
  3. Address the root causes of poor quality data longer-term.

This presentation briefly summarizes the issues, puts the steps above in context, and shows how they contribute to better predictive analytics.
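
As a minimal illustration of steps 1 and 2 on hypothetical data, the snippet below profiles a small table for missing, duplicate, and out-of-range values and applies simple scrubbing. It is only a sketch of the kinds of checks the talk describes, not the speaker's material:

```python
import numpy as np
import pandas as pd

# Hypothetical input table with typical quality problems (stand-in data).
df = pd.DataFrame({
    "customer_id": [101, 102, 102, 104],
    "age": [34, -5, 34, np.nan],          # -5 is "just plain wrong"; NaN is missing
    "revenue_usd": [1200.0, 900.0, 900.0, 15.0],
})

# Step 1: understand quality levels before modeling.
print("Missing rate per column:\n", df.isna().mean())
print("Duplicate rows:", df.duplicated().sum())
print("Out-of-range ages:", (df.age < 0).sum())

# Step 2: simple "rinse, wash, and scrub" fixes for the problems found above.
clean = df.drop_duplicates().copy()
clean.loc[clean.age < 0, "age"] = np.nan          # flag impossible values as missing
clean["age"] = clean.age.fillna(clean.age.median())
print(clean)
```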

Session description
Speaker
Tom Redman, Data Quality Solutions
Track 3—MORE CASE STUDIES: Varied business applications
Data storytelling

Everybody Lies author Seth Stephens-Davidowitz will discuss how to use Google searches to get new insights into people. He will discuss public tools available for researchers, how to find important insights, and potential pitfalls in using search data. His talk will address a range of topics, including marketing, psychology, and economics.

Session description
Speaker
Seth Stephens-Davidowitz, Author and NYTimes Opinion Writer
10:45 am
Exhibits & Morning Coffee Break
Book Signing with Seth Stephens-Davidowitz, Author, Everybody Lies and former Google data scientist
11:15 am
Track 1—BUSINESS: Analytics strategy & operationalization
Workforce analytics
Lessons from: Intel

As Intel is transforming to become a data-center company and entering new fields such as AI and autonomous cars, the need to integrate HR data into decision making has become more relevant than ever. In this session, we will review new methods and capabilities Intel's Talent intelligence organization developed in order to help its business leverage its internal talent and attract external candidates to fill critical positions. We will show real case studies of how these new analytical capabilities are adopted by Intel leaders to win the right talent in the marketplace.

Session description
Speaker
Hai Harari, Director, Talent Intelligence and Analytics, Intel
Track 2—TECH: Predictive modeling & machine learning methods
Best practices

Preeminent consultant, author and instructor Dean Abbott, along with Rexer Analytics president Karl Rexer, field questions from an audience of predictive analytics practitioners about their work, best practices, and other tips and pointers.

Session description
Speakers
Dean Abbott, Chief Data Scientist, Abbott Analytics
Karl Rexer, President, Rexer Analytics
Track 3—MORE CASE STUDIES: Varied business applications
Legal applications

Is the market for legal services a lemons market? Consumers have no means of determining ex ante whether their lawyer is good or mediocre. Legal services consumption often defaults to guesswork or over-reliance on referrals. Equally, the pricing of legal services bears little relationship with quality - mediocre lawyers often charge as much as good lawyers, and clients end up paying very high fees for lawyers with losing records. On the flip side, good lawyers might be underpaid relative to poor ones, because the market prices all services at an average quality level due to a lack of transparency and trust. In this milieu, analysis of win-loss records, citation practices of particular judges, biases in favour of plaintiffs or defendants, etc. has the potential to bring much-needed evidence-based decision-making into the legal system. Analytics and machine learning also offer the potential to automate highly important legal decisions such as parole - where idiosyncratic and opaque methods have yielded wildly varying outcomes in similar circumstances, imperiling society. For governments, analysis of productivity and efficiency offers the potential to tailor justice spending in more optimal ways to address the justice gap.

Session description
Speaker
Sandeep Gopalan, Pro Vice-Chancellor (Academic Innovation), Deakin University, Melbourne, Australia
11:40 am
Track 3—MORE CASE STUDIES: Varied business applications
Industry-leading case studies

What if you could accurately assess the value - and risk - of a potential customer, right from the very first interaction? Detailed customer data is now more widely available, across all the touchpoints that prospects and customers have with your organization. By creating an Experience Platform, you can use advanced analytics to tie together the whole journey, and each customer's individual pathway can now be optimized.

In this session, you'll learn:

  • Analytic techniques to better understand the customer experience
  • The data, technology, and processes that will help you transform customer experience
  • How to get started in using advanced analytics to improve your organization's customer experience.
Session description
Speaker
Steven Ramirez, CEO, Beyond the Arc
12:00 pm
Lunch in Exhibit Hall
12:10 pm
Lunch & Learn

All markets are large organizational elements made up of smaller elements called products. The session begins by showing that the only markets that express the law of supply and demand are those for scarce, highly valued products such as gold, silver, and platinum. All other markets demonstrate action in the dual states of value and demand across four mathematical axes, creating a 4D position, modulated by other forces, as the buyers deem appropriate. Adding another variable, time, to a 4D state yields a 5D state. This session continues with an examination of the nature of 4D and 5D states in MEE4D™ software, showing users how to find over- and underpriced products already in the market; additionally, it displays what markets want, do not have, and can afford. Working with common data, it provides actionable, statistically significant business intelligence to buyers, sellers and new market entrants alike.

Session description
Speaker
Doug Howarth, CEO, MEE Inc
1:00 pm
Lunch in Exhibit Hall
1:10 pm
Keynote

Using advanced analytics, UPS was able to reduce miles driven by 185 million annually. The latest tool, ORION (On Road Integrated Optimization and Navigation), completed deployment in 2016 and accounts for $300M to $400M in cost reduction annually.

Session description
Speaker
Jack Levis, Formerly UPS (retired), now Chief Product Strategist, ESP Logistics Technology
1:55 pm
The Session Description will be available shortly.
Session description
Sponsored by
The Trade Desk Inc.
Speaker
Mark Davenport, Senior Director of Analytics, The Trade Desk
2:15 pm
Expert Panel

Across fields of science and engineering, the track record of contributions made by women continues to grow - a fact that helps pave the way for future female scientists. Predictive analytics and data science are no exception. In this session, our expert panelists will address questions such as:

  • How to increase the number of women on your analytics team
  • Differences from other science and engineering fields in terms of being male dominated
  • How to "survive" as a woman in analytics
  • The next generation - encouraging girls and newcomers in STEM (science, technology, engineering, and mathematics)
  • Balancing work and personal life

See also this Predictive Analytics Times article for related reading on the topic.

Session description
Moderator
Anne G. Robinson, Chief Strategy Officer, Kinaxis
Panelists
Tracie Coker Kambies, Principal | Retail Technology and Analytics, Deloitte
Julia Minkowski, Product Lead, Walmart Global Tech
Pallavi Yerramilli, Senior Product Manager, The Trade Desk
3:00 pm
Exhibits & Afternoon Coffee Break
3:30 pm
Track 1—BUSINESS: Analytics strategy & operationalization
Analytics management
Lessons from: Vanguard

As data scientists enter the enterprise work environment at a rapid pace, delivering immediate business impact is often a challenge for new teams. This session will provide an overview of best practices in project management that can be applied to maximize value from your data science team, including Agile/Scrum methodology, team communication, and client expectation management, in order to quickly grow the data-driven practice at your organization and deliver ongoing business value.

Session description
Speaker
Wanda Wang, Data Scientist - Investment Management Fintech Strategies, Vanguard
Track 2—TECH: Predictive modeling & machine learning methods
Model interpretation
Case Study: SmarterHQ

For some of us, predictive accuracy is paramount when we assess our models: PCC, ROC AUC, Type I and Type II errors, etc. However, in other applications, the interpretation of predictive models is paramount so we understand why the model behaves the way it behaves. For this reason, many practitioners end up building models that are easier to interpret rather than models that are more accurate: regression or decision trees in particular. Neural Networks and Model Ensembles fall out of favor in these applications because they are perceived to be "black boxes".

This talk will describe an approach to determine the relative influence of each input variable in any predictive model using input shuffling, no matter how simple or complex the model. Interpretation of linear regression, logistic regression, neural networks, and Random Forest models will be compared and contrasted.
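
The input-shuffling idea described above is commonly known as permutation importance: shuffle one input at a time and measure how much model performance drops. A minimal sketch on synthetic data (not the speaker's code) looks like this:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; the technique itself is model-agnostic.
X, y = make_classification(n_samples=3000, n_features=8, n_informative=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
baseline = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])

rng = np.random.default_rng(0)
for j in range(X_te.shape[1]):
    X_shuf = X_te.copy()
    X_shuf[:, j] = rng.permutation(X_shuf[:, j])   # break the link between feature j and the target
    drop = baseline - roc_auc_score(y_te, model.predict_proba(X_shuf)[:, 1])
    print(f"feature {j}: AUC drop = {drop:.3f}")   # bigger drop = more influential input
```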

Session description
Speaker
Dean Abbott, Chief Data Scientist, Abbott Analytics
Track 3—MORE CASE STUDIES: Varied business applications
PA adoption in a new industry
Case Study: RightShip

Over 90% of world trade is carried by sea, and RightShip's highly regarded online vetting system provides data and risk evaluations on over 75,000 vessels in the world fleet. In this unique, ground-breaking case study of predictive model deployment in the maritime industry, Bryan Guenther will cover:

  • Model development - the right team and expertise
  • CHAID model - pros and cons
  • How the model highlighted a huge problem in applying a fair rating across different types of vessels
  • Issues with "flip-flopping" near the cutoff between two ratings
  • Industry socialization and training
  • Industry acceptance (or not)
  • How much do you show the customer without causing confusion?
  • How do you train up your own internal staff to answer customer questions?
  • When to retrain the model - How to handle the fallout of changes when updating
  • The impact of this endeavor on the shipping industry
Session description
Speaker
Bryan Guenther, Qi Program Manager, RightShip
3:55 pm
Track 3—MORE CASE STUDIES: Varied business applications
Agriculture analytics
Case Study: Circle A Farms

In this session we'll review a case study of smart hydroponics - how we created a connected farm, the data we collected, and the analysis performed to improve yields and make a better product.

Session description
Speaker
Steve Fowler, CEO, Jivoo
4:15 pm
Track 1—BUSINESS: Analytics strategy & operationalization
Model deployment
Lessons from: John Hancock

A model is only as valuable as its adoption. Speed to value, repeatability, and low-cost solutions can dramatically reduce software and services budgets and free up valuable dollars for other activities. Open-source tools such as Shiny (R) and Flask (Python) have made the creation and deployment of data science web applications convenient and manageable. At John Hancock, the Advanced Analytics function routinely wraps sophisticated modeling code into such web-based, point-and-click solutions. In this session you will see and learn from real-life examples of how one can rapidly operationalize both model building and maintenance.
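
As a minimal illustration of the Flask pattern mentioned above, the sketch below wraps a stand-in scikit-learn model in a prediction endpoint. The payload format and model are hypothetical, not John Hancock's application:

```python
from flask import Flask, request, jsonify
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# Stand-in model fit at startup; in practice you would load a serialized model instead.
X, y = make_regression(n_samples=200, n_features=3, random_state=0)
model = LinearRegression().fit(X, y)

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()                      # e.g. {"features": [0.1, 2.3, -1.0]}
    pred = model.predict([payload["features"]])[0]
    return jsonify({"prediction": float(pred)})

if __name__ == "__main__":
    app.run(port=5000)                                # POST JSON to http://localhost:5000/predict
```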

Session description
Speaker
Shatrunjai Singh, Senior Data Scientist, John Hancock
Track 2—TECH: Predictive modeling & machine learning methods
Data policy

Software and analytics have been eating the world for a long time, and law and government are next. Businesses are increasingly transcending physical boundaries into new, unregulated virtual domains, forcing companies, developers and regulators to take a hard look at how data is collected, stored, and used. And new laws around the world are beginning to force the technology industry to rethink how it approaches the law. This talk will explain how and why the worlds of law and technology are colliding, and what this means for data-driven companies, the technology industry, and governments and citizens around the world.

Session description
Speaker
Andrew Burt, Chief Privacy Officer & Legal Engineer, Immuta
Track 3—MORE CASE STUDIES: Varied business applications
Logistics analytics
Case Study: Cargonexx

Automated pricing is delicate when prices change dynamically and there is no real current market price. We present challenges and approaches in this pricing area, from the development of a pricing engine for a logistics platform to deriving the current price for each transport request in real time (< 15 ms). The solution is implemented by a platform acting as an intermediary between transport contractors and freight carriers. To solve this stochastic problem, we use "fuzzyfication" and machine learning to build probability distributions for price acceptance. In this session, general steps are presented that apply to broader cases with an underlying stochastic problem, where distributions must be extracted from data to derive optimal control actions under uncertainty.
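
Purely as an illustration of pricing against an acceptance-probability curve, the sketch below fits a logistic acceptance model to hypothetical offer data and picks the price that maximizes expected margin. It is not Cargonexx's actual engine, features, or figures:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

# Hypothetical historical offers: price quoted for a transport and whether the carrier
# accepted (stand-in data only).
price = rng.uniform(400, 1200, 2000)
accepted = (rng.random(2000) < 1 / (1 + np.exp((price - 800) / 80))).astype(int)

# Acceptance probability as a function of price, one building block of the distributions
# the abstract describes.
model = LogisticRegression().fit(price.reshape(-1, 1), accepted)

# Choose the price that maximizes expected margin = (price - cost) * P(accept | price).
cost = 600.0
grid = np.linspace(cost, 1200, 200)
p_accept = model.predict_proba(grid.reshape(-1, 1))[:, 1]
best = grid[np.argmax((grid - cost) * p_accept)]
print(f"price maximizing expected margin: {best:.0f}")
```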

Session description
Speaker
Alwin Haensel, Founder and Managing Director, Haensel AMS
5:00 pm
End of Day 2

Post-Conference Workshops: Wednesday, November 1, 2017

Full-day Workshop

As crucial as it is, data preparation is perhaps the most under-taught part of the predictive analytics (machine learning) process, even though we spend 60%, 70%, even up to 90% of our time doing data preparation steps. This workshop will cover the most important aspects of data preparation. Each of these topics will be described and connected to specific modeling algorithms that benefit from the data preparation step, including: 

  • Data cleaning: outlier detection and "fixing", and which algorithms care about outliers
  • Missing value imputation: the simple approaches and more complex and complete methods (a brief sketch follows this list)
  • Feature creation: why we do it, which algorithms are helped most by which kinds of features, and how to automate building different kinds of continuous-valued and categorical features
  • Feature selection: why it's important to many algorithms
  • Sampling: what kind of sampling we should do, how large the samples should be, should we (ever) stratify samples, and how to sample small data sets to improve model robustness
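
A minimal sketch of the missing-value step on hypothetical data, combining simple median imputation with missingness indicator columns (only an illustration of one of the workshop topics, not its actual exercises):

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical modeling table with missing values (stand-in data).
df = pd.DataFrame({
    "income": [52000, np.nan, 61000, np.nan, 48000],
    "age": [34, 29, np.nan, 45, 52],
})

# Simple approach: median imputation, plus missingness indicators so algorithms that
# can exploit "was missing" information (e.g. tree ensembles) still see it.
indicators = df.isna().add_suffix("_was_missing").astype(int)
imputed = pd.DataFrame(
    SimpleImputer(strategy="median").fit_transform(df),
    columns=df.columns,
)
prepared = pd.concat([imputed, indicators], axis=1)
print(prepared)
```
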
Session description
Instructor
Dean Abbott, Chief Data Scientist, Abbott Analytics
Full-day Workshop

Predictive analytics has proven capable of enormous returns across industries – but, with so many core methods for predictive modeling (machine learning), there are some tough questions that need answering:

  • How do you pick the right one to deliver the greatest impact for your business, as applied over your data?
  • What are the best practices along the way?
  • And how do you avoid the most treacherous pitfalls?
Session description
Instructor
John Elder Ph.D., Founder & Chair, Elder Research
Full-day Workshop

Why Machine Learning Needs Spark and Hadoop

Standard machine learning platforms need to catch up. As data grows bigger, faster, more varied, and more widely distributed, storing, transforming, and analyzing it doesn't scale using traditional tools. Instead, today's best practice is to maintain and even process data in its distributed form rather than centralizing it. Apache Hadoop and Apache Spark provide a powerful platform and mature ecosystem with which to both manage and analyze distributed data.

Machine learning projects can and must accommodate these challenges, i.e., the classic "3 V's" of big data: volume, variety, and velocity. In this hands-on workshop, leading big data educator and technology leader James Casaletto will show you how to:

  • Build and deploy models with Spark. Create predictive models over enterprise-scale big data using the modeling libraries built into the standard, open-source Spark platform (a minimal PySpark sketch follows this list).
  • Model both batch and streaming data. Implement predictive modeling using both batch and streaming data to gain insights in near real-time.
  • Do it yourself. Gain the power to extract signals from big data on your own, without relying on data engineers, DBAs, and Hadoop specialists for each and every request.
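
As a minimal PySpark sketch of the first bullet, the example below fits a model with Spark's built-in ML Pipeline on tiny stand-in data; the column names and data are hypothetical, not the workshop's exercises:

```python
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("paw-sketch").getOrCreate()

# Stand-in training data; in a real setting this would be a distributed dataset
# read from HDFS or object storage, e.g. spark.read.parquet(...).
df = spark.createDataFrame(
    [(0.0, 1.2, 0.0), (1.5, 0.3, 1.0), (0.2, 2.1, 0.0), (2.2, 0.1, 1.0)],
    ["feature_a", "feature_b", "label"],
)

# Assemble raw columns into a feature vector and fit a model inside a Pipeline,
# so the same fitted object can later be applied to batch or streaming data.
assembler = VectorAssembler(inputCols=["feature_a", "feature_b"], outputCol="features")
lr = LogisticRegression(featuresCol="features", labelCol="label")
model = Pipeline(stages=[assembler, lr]).fit(df)

model.transform(df).select("feature_a", "feature_b", "probability", "prediction").show()
```
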
Session description
Instructor
James Casaletto, PhD Candidate, UC Santa Cruz Genomics Institute and former Senior Solutions Architect, MapR

Post-Conference Workshop: Thursday, November 2, 2017

Full-day Workshop

Once you know the basics of predictive analytics and machine learning—including data exploration, data preparation, model building, and model evaluation—what can be done to improve model accuracy? One key technique is the use of model ensembles, which combine several or even thousands of models into a single, new model score. It turns out that model ensembles are usually more accurate than any single model, and they are typically more fault tolerant than single models.

Are model ensembles an algorithm or an approach? How can one understand the influence of key variables in the ensembles? Which options affect the ensembles most? This workshop dives into the key ensemble approaches, including Bagging, Random Forests, and Stochastic Gradient Boosting. Attendees will learn best practices, with attention paid to the influence various options have on ensemble models, so that they gain a deeper understanding of how the algorithms work qualitatively and how to interpret the resulting models. Attendees will also learn how to automate the building of ensembles by changing key parameters.
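
As a small illustration of why ensembles help, the sketch below compares a single decision tree with bagging, a random forest, and gradient boosting on synthetic data. It is only a generic scikit-learn example, not the workshop's material:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier, GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data to compare a single model against the ensemble approaches
# named above (bagging, random forests, gradient boosting).
X, y = make_classification(n_samples=2000, n_features=20, n_informative=6, random_state=0)

models = {
    "single tree": DecisionTreeClassifier(random_state=0),
    "bagging": BaggingClassifier(DecisionTreeClassifier(), n_estimators=100, random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}
for name, m in models.items():
    acc = cross_val_score(m, X, y, cv=5).mean()
    print(f"{name:18s} accuracy = {acc:.3f}")
```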

Session description
Instructor
Dean Abbott, Chief Data Scientist, Abbott Analytics