Originally published in Uber Engineering, June 9, 2017
At Uber, event forecasting enables us to future-proof our services based on anticipated user demand. The goal is to accurately predict where, when, and how many ride requests Uber will receive at any given time.
Extreme events—peak travel times such as holidays, concerts, inclement weather, and sporting events—only heighten the importance of forecasting for operations planning. Calculating demand time series forecasting during extreme events is a critical component of anomaly detection, optimal resource allocation, and budgeting.
Although extreme event forecasting is a crucial piece of Uber operations, data sparsity makes accurate prediction challenging. Consider New Year’s Eve (NYE), one of the busiest dates for Uber. We only have a handful of NYEs to work with, and each instance might have a different cohort of users. In addition to historical data, extreme event prediction also depends on numerous external factors, including weather, population growth, and marketing changes such as driver incentives.
A combination of classical time series models, such as those found in the standard R forecast package, and machine learning methods are often used to forecast special events. These approaches, however, are neither flexible nor scalable enough for Uber.
In this article, we introduce an Uber forecasting model that combines historical data and external factors to more precisely predict extreme events, highlighting its new architecture and how it compares to our previous model.
Creating Uber’s new extreme event forecasting model
Over time, we realized that in order to grow at scale we needed to upgrade our forecasting model to accurately predict extreme events across Uber markets.