Predictive Analytics World for Financial 2023
June 18-22, 2023 | Red Rock Casino Resort & Spa, Las Vegas
The 2023 agenda will be announced soon.
Review – Agenda 2022
Blue circle sessions are for All Levels
Red triangle sessions are Expert/Practitioner Level
Workshops - Sunday, June 19th, 2022
Full-Day 8:30 am - 4:30pm
Python leads as a top machine learning solution – thanks largely to its extensive battery of powerful open source machine learning libraries. It’s also one of the most important, powerful programming languages in general.
Workshops - Monday, June 20th, 2022
Full-Day 8:30 am - 4:30pm
This one-day session surveys standard and advanced methods for predictive modeling (aka machine learning).
Full-Day 8:30 am - 4:30pm
Machine learning improves operations only when its predictive models are deployed, integrated and acted upon – that is, only when you operationalize it.
Full-Day 8:30 am - 4:30pm
This one-day introductory workshop dives deep. You will explore deep neural classification, LSTM time series analysis, convolutional image classification, advanced data clustering, bandit algorithms, and reinforcement learning.
Predictive Analytics World for Financial - Las Vegas - Day 1 - Tuesday, June 21st, 2022
Nvidia's Siddha Ganju has gained a unique perspective on machine learning's cross-sector deployment. In her current role, she works on a range of applications, from self-driving vehicles to healthcare, and she previously led NASA's Long-Period Comets team, applying ML to develop meteor detectors. Deep learning impacts the masses, so it demands mass, interdisciplinary collaboration. In this keynote session, Siddha will describe the very particular interdisciplinary effort -- driven by established joint directives -- required to successfully deploy deep learning across a variety of domains, including climate, planetary defense, healthcare, and self-driving cars. The format of this session will be a "fireside chat," with PAW Founder Eric Siegel interviewing Siddha in order to dig deep into the lessons she's learned.
Machine learning and robotics are dramatically shifting our industrial capabilities and are opening new doors to our functional understanding and ways to support the natural world. Together, these advances can enable something far beyond simply limiting our damage to the planet -- they create the possibility of building a new relationship to nature wherein our industrial footprint can be radically reduced and nature's capability to support itself and all life on Earth (including us!) can be amplified.
As the world of Machine Learning (ML) has advanced, the biggest challenge that still faces data science organizations is the need for insightful, valuable, predictive attributes, aka “features” that can be applied to ML models. The process of building features is so tedious and costly that the “feature store” was invented to make re-building features a thing of the past.
The problem is that traditional means of building features to feed feature stores have been manual, labor-intensive efforts that involve data engineers, subject matter experts, data scientists, and your IT department. But what if there were a faster and more scalable way? Join dotData’s VP of Data Science, Dr. Aaron Cheng, as he presents the concept of the automated Feature Factory, and see how your organization can take a process that today takes months and do it in a few days.
In today's hyper-digital world, the data contained in documents often represent significant business value. The application of machine learning to extracting information from these sources is becoming big business. However, each use case represents different challenges in data extraction. This talk examines the application of intelligent document processing in the health insurance space. BCBS Tennessee Director of Data Science & AI Brandon Cosley will discuss how their Data Science Center of Excellence deployed an ensemble of machine learning techniques (e.g. DL, open-source, and NLP libraries) to extract information from documents in different business contexts. He will highlight the successes and challenges of each implementation while focusing on key findings associated with business success.
Business partners are inundated with a non-stop barrage of messages about how they need to capitalize on data and its uses in order to stay relevant. As such, data scientists and business partners spend ample time discussing the value of predictive modeling, use cases and potential ROI. Engaging and aligned as those early conversations can be, it is often a single conversation once the model is built and ready for deployment that can be the most problematic. In short... Just how accepting are business partners of the deployment changes necessary to a process for a predictive model to deliver the promised quantifiable value? Resistance to these changes ultimately turns into a self-fulfilling prophecy, giving many modeling efforts the appearance of failure. That is why ensuring that model deployment acceptance conversations occur throughout the model delivery life cycle is critical to overcoming the distrust, disinformation and general dislike of change a model deployment can create. Let's discuss several examples from the call center predictive modeling efforts of a Fortune 500 financial services company, where model deployment acceptance impacted process integration, model development and benefit realization, along with the areas of opportunity that could have driven a different result.
When dealing with fraud in real-time payments, the reaction needs to be fast. The cost of mistakes in fraud analytics is very high, yet it is important to preserve a good customer experience and to reduce user friction. This presentation focuses on best practices to establish analytics as a meaningful business resource and how to make it more effective and pervasive. Learn how to communicate with quants, how to obtain buy-in from decision-makers and technology partners, and how to select models based on the impact on business metrics and revenues.
Crystal Quota is a predictive machine learning framework that enables Google Cloud Quota operations to automatically grant or deny Quota Increase Requests for 100 million Cloud users. This framework improves the rate of quota approvals, reduces manual toil, and proactively provides a robust defense pillar against scaled abuse on the platform.
The size of the credit market in US Dollars is in the tens of trillions, providing credit to people, businesses, and governments. Traditional credit scoring has been the primary tool for assessing the risk of default and is an indispensable tool for lenders. Due to the decentralized, pseudonymous nature of cryptocurrency, the same credit scoring models used by traditional lenders aren't useful. There is, however, a wealth of data available, and an opportunity to leverage that data to better assess the risk of borrowers. This talk will explore that data, the challenges, and the opportunities for both lenders and borrowers.
In this session, an overview of Statistics and Machine Learning Algorithms with Supervised Learning (Logistic Regression, GLM Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, Neural Networks) and Unsupervised Learning (Z-Score, IQR, DBSCAN Clustering, Principal Component Analysis, etc.) will be provided. Then, we will have a holistic comparison of each method at main analytics stages via two use cases. An insurance predictive analytics use case will be employed for the supervised machine learning comparison and a banking outlier detection analytics use case will be employed for the unsupervised machine learning comparison. Finally, the analytics result validation, implementation, and interpretability will be discussed and compared. Sample Python code will be shared.
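As a rough illustration of two of the unsupervised outlier checks named above (Z-Score and IQR), here is a minimal sketch using only the Python standard library. It is not the sample code from the session. The function names, the thresholds (3.0 and 1.5), and the transaction amounts are assumptions chosen for illustration.

```python
import statistics

def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical, nothing can stand out
    return [v for v in values if abs((v - mean) / stdev) > threshold]

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Made-up transaction amounts with one obviously unusual value.
amounts = [52.0, 48.5, 50.1, 49.9, 51.3, 47.8, 50.6, 49.2, 500.0]

print(iqr_outliers(amounts))
print(zscore_outliers(amounts))
print(zscore_outliers(amounts, threshold=2.5))
```

On a sample this small the two rules can disagree: with nine points and a population standard deviation, no z-score can exceed the square root of 8 (about 2.83), so the common threshold of 3 flags nothing, while the IQR rule still isolates the large amount.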
Certain types of insurance that are legally required for consumers, such as auto and home insurance, typically face a high amount of regulatory scrutiny. Insurers in the United States often must obtain state regulatory approval of their pricing models before being able to sell new insurance products or change pricing of existing products. Beyond that, model users must feel assured that the variables and the relationships between variables used in the models are logical and intuitive, both to themselves, as well as to stakeholders affected by the model results. In this session, we’ll discuss best practices and case studies for analyzing and building regulator and user understanding and trust in machine learning models developed for insurance applications. We will also discuss how the data science industry is leveraging the rating and advisory organization model to obtain streamlined regulatory review and approval of complex models for use by insurance carriers.
Predictive Analytics World for Financial - Las Vegas - Day 2 - Wednesday, June 22nd, 2022
Financial institutions generate a significant volume of data that is complex and varied. Such datasets originate independently in separate business units for various reasons including regulatory requirements and business needs. As a result, data sharing between business units as well as outside the organization (e.g. to the research community) is constrained. Furthermore, data containing personal information must be protected. Accordingly, it is difficult to develop and test new algorithms on original data. One solution is to synthesize financial datasets that follow the same properties as the real data while respecting the need for privacy of the parties involved. In this presentation, J.P. Morgan's Tucker Balch will review the firm's approach to synthetic data. He will highlight three main areas: 1) Generating realistic synthetic datasets. 2) Measuring the similarities between real and generated datasets. 3) Ensuring the generative process satisfies any privacy constraints.
As Financial Services increasingly embrace digitization, AI presents many opportunities for efficiency gains and automation across the entirety of a bank’s operations. However, a lot of these efforts to develop and operate AI applications have been bottlenecked by the data not being AI ready. Join Ajun Prakash, Snorkel AI’s Director of Solutions, to learn how Snorkel helps Financial Services companies solve their data challenges, and discuss a few case studies of operational efficiencies this has unlocked.
Can AutoML be used to forecast the price of S&P 500 futures? Can various stock technical analysis indicators like moving averages, exponential moving averages, Bollinger bands, relative strength index, etc. be used to forecast the S&P futures price for the next day? In this session, Jiwani will share the results of a case study that used various AutoML tools like H2O AutoML, H2O Driverless AI, RapidMiner, and other AutoML software packages. He will examine the results from them all and go over the performance of each one. He will also reveal that some of these are overfitting.
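For readers unfamiliar with the indicators mentioned above, the following minimal sketch computes three of them (simple moving average, exponential moving average, Bollinger bands) in plain Python. It is not taken from the case study. The window lengths, the EMA seeding with the first price, the use of the population standard deviation, and the price series are all assumptions, and conventions differ between charting packages. The relative strength index is left out for brevity.

```python
import statistics

def sma(prices, window):
    """Simple moving average: one value per full window."""
    return [statistics.fmean(prices[i - window:i])
            for i in range(window, len(prices) + 1)]

def ema(prices, span):
    """Exponential moving average with alpha = 2 / (span + 1).

    Seeded with the first price (one common convention, assumed here).
    """
    alpha = 2.0 / (span + 1)
    out = [prices[0]]
    for price in prices[1:]:
        out.append(alpha * price + (1 - alpha) * out[-1])
    return out

def bollinger(prices, window=20, k=2.0):
    """Return (lower, middle, upper) bands for each full window.

    Middle band is the SMA; the outer bands sit k population
    standard deviations away from it.
    """
    bands = []
    for i in range(window, len(prices) + 1):
        chunk = prices[i - window:i]
        mid = statistics.fmean(chunk)
        sd = statistics.pstdev(chunk)
        bands.append((mid - k * sd, mid, mid + k * sd))
    return bands

# Made-up closing prices, for illustration only.
prices = [10.0, 11.0, 12.0, 13.0, 14.0]

print(sma(prices, 3))
print(ema(prices, 3))
print(bollinger(prices, window=3))
```

Values like these would typically be computed per trading day and supplied as input columns to a forecasting model, with the next day's price as the target.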
Safety National Casualty Corporation is the leader in Excess Workers' Compensation Insurance. Premium audit of Excess Workers' Comp policies requires considerable resources and time at the end of each policy period, especially when the audit is conducted physically. To optimize the premium audit process, in collaboration with the audit and underwriting departments, our data analytics team developed a set of predictive models, which leverage historical audit data and account information to predict future premium audit results. The prediction results have been applied to optimize the ordering of audits to collect more premium faster, selectively waive audits based on expected additional premium, and more efficiently allocate premium audit resources.
I introduce a new concept and propose a way to estimate it: How much model search power can a given dataset endure before its confessions are spurious? I’ll explain “complexity capacity” with some simple controlled experiments and explore it during the search for a working investment timing strategy. By measuring the search power of an algorithm and the complementary search capacity of a dataset, we can avoid mismatches -- the disappointment of under-fit or under-search, yes, but mostly the disaster of over-search, where training results look great but out-of-sample predictions are worthless.
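The over-search failure described above can be reproduced with a toy experiment. The sketch below is not the speaker's method or his "complexity capacity" estimate. It is a hypothetical demonstration in which every candidate "strategy" is a random coin flip on a pure-noise market, so any in-sample edge is spurious by construction. The function names, sample sizes, and seed are assumptions.

```python
import random

def hit_rate(signals, moves):
    """Fraction of days on which the signal matches the market direction."""
    return sum(s == m for s, m in zip(signals, moves)) / len(moves)

def over_search_demo(n_strategies=500, n_train=60, n_test=60, seed=0):
    """Pick the best of many random strategies in-sample, then score it out-of-sample."""
    rng = random.Random(seed)
    n_days = n_train + n_test
    # Pure-noise "market": each day is up (+1) or down (-1) at random.
    moves = [rng.choice((-1, 1)) for _ in range(n_days)]
    best_in, best_out = -1.0, None
    for _ in range(n_strategies):
        # A random strategy with no real skill.
        signals = [rng.choice((-1, 1)) for _ in range(n_days)]
        in_rate = hit_rate(signals[:n_train], moves[:n_train])
        if in_rate > best_in:
            best_in = in_rate
            best_out = hit_rate(signals[n_train:], moves[n_train:])
    return best_in, best_out

best_in, best_out = over_search_demo()
print(f"best in-sample hit rate: {best_in:.2f}")
print(f"same strategy out-of-sample: {best_out:.2f}")
```

Because the winner is chosen from hundreds of candidates, its in-sample hit rate tends to sit well above 50%, while its expected out-of-sample hit rate is exactly 50%. Raising `n_strategies` or lowering `n_train` widens that gap.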
The latest poll reconfirms today's dire industry buzz: Very few machine learning models actually get deployed. This pervasive failure of ML projects comes from a lack of prudent leadership as well as various technical challenges. In this panel session, industry experts will weigh in to define which factors and practices contribute most to successful machine learning deployment. What are the most important organizational and technological ingredients? Come to this session to find out!
Transaction data has immense potential to go beyond traditional data aggregation by banks, to connecting the dots and providing valuable customer insights across industries. By acquiring financial data and then cleansing and enriching it, organizations can derive insights into customer needs and behavior to provide more meaningful interactions, identify lending opportunities, uncover current risks, provide competitive analysis, improve marketing efforts of a retail giant, and identify growth opportunities for clients. In this session, we will explore the importance of utilizing transaction data and applying machine learning algorithms to datasets to clarify and categorize the transactional data. Institutions can leverage this customer data to provide personalized experiences and advice.
In today’s social environment, where responsibility to justice and fairness is being reconsidered vigorously, the insurance industry finds itself in the middle of the debate. For those who say insurers should do more to help balance society’s scales, the industry’s reply has been an insistence that actuarial science is colorblind. Underwriting and pricing are built only on socially appropriate factors that are predictive of loss. Factors such as race, ethnicity, religious practice, sexual preference, or national origin are directly excluded from consideration. This industry position has been challenged for many years as the use of territory and credit have been directly criticized as proxies for race and/or income. Factors such as credit-based insurance score, gender, occupation and education, all very predictive of loss, have likewise found challenges and restrictions growing. Adding to the intensity of today’s debate is a growing insistence by many that being “colorblind” on social justice issues is no longer enough. Some critics insist that insurers, along with the rest of society, must be actively involved in promoting justice and fairness.
We have reached a point in this debate where the status quo will no longer suffice without significant support. A new law in Colorado will require insurance companies to demonstrate that their prices and practices are not unfairly discriminatory, and more regulatory action is expected in other states. Given this is a new requirement, the discussions have begun to move from “should we do this” to “how do we do this.” This presentation will discuss various definitions of bias and discrimination in rating and insurance practices, methods and techniques that have been developed in data science to uncover unintentional bias, and how these techniques can be applied to insurance industry practices.
Workshops - Thursday, June 23rd, 2022
Full-Day 8:30 am - 4:30pm
This one-day session reveals the subtle mistakes analytics practitioners often make when facing a new challenge (the “deadly dozen”), and clearly explains the advanced methods seasoned experts use to avoid those pitfalls and build accurate and reliable models.
Full-Day 8:30 am - 4:30pm
This one-day workshop reviews major big data success stories that have transformed businesses and created new markets.
Full-Day 8:30 am - 4:30pm
This workshop dives into the key ensemble approaches, including Bagging, Random Forests, and Stochastic Gradient Boosting.
3-hour workshop: 5:30-8:30pm
This 3-hour workshop launches your tenure as a user of R, the well-known open-source platform for data analysis.
Workshops - Friday, June 24th, 2022
Full-Day 8:30 am - 4:30pm
Gain experience driving R for predictive modeling across real examples and data sets. Survey the pertinent modeling packages.