Explore how to unlock business value by applying time series forecasting to real-life data instead of keeping it at the level of academic research.
When it comes to predicting the future from historical events or from information already known about the future, time series forecasting has established itself as a go-to strategy. Whether it uses simple statistical approaches or state-of-the-art deep learning models, it requires a thorough understanding of time-dependent factors, both from the past and, where available, from the future. These methods have been very successful in academic research. But how do we move beyond the comfort of the datasets that I refer to as “Toy Datasets”? Real-life business data is far more complex and heterogeneous, and we need to overcome that to create business value.
The key to mastering time series forecasting lies in learning the rhythm that drives the data. Benchmark datasets often provide a simplification of this rhythm, showcasing clear seasonal patterns and trends without strong influences from external factors. From air passenger numbers to Boston house prices, typical benchmark datasets in academia are united by simple underlying problem dynamics.
However, when operations move from this controlled environment into the chaos of real-world data, traditional methods can falter, giving rise to a wide array of challenges.
In the business landscape, data spikes are a common occurrence. Promotions and sales days can cause drastic jumps in demand that easily throw off a prediction model, with negative consequences for business planning. Companies also have to grapple with the so-called “cold start problem”: a newly launched product has no historical data yet, but accurate predictions for it are still required to steer the business effectively.
Intermittent time series, characterised mainly by a large share of zero values, are common in e-commerce and clothing retail. They further destabilize common prediction approaches. However, creative application of statistical methods such as Croston's method or ADIDA (Aggregate-Disaggregate Intermittent Demand Approach) can lend stability to such series.
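To make the idea behind Croston's method more concrete, here is a minimal sketch in plain Python. It smooths the size of non-zero demands and the interval between them separately, then forecasts their ratio as the expected demand per period. The function name `croston_forecast`, the default `alpha=0.1` and the initialisation from the first observed demand are illustrative assumptions, not a description of any particular library or of the paretos implementation.

```python
def croston_forecast(demand, alpha=0.1):
    """Return a flat per-period forecast using classic Croston's method.

    demand: sequence of non-negative numbers, mostly zeros (intermittent).
    alpha:  smoothing factor in (0, 1]; 0.1 is an illustrative default.
    """
    size = None               # smoothed size of non-zero demands
    interval = None           # smoothed number of periods between demands
    periods_since_demand = 1  # counts periods up to and including the current one

    for y in demand:
        if y > 0:
            if size is None:
                # Assumption: initialise both estimates from the first demand.
                size = float(y)
                interval = float(periods_since_demand)
            else:
                # Update only when demand occurs; zeros leave the estimates unchanged.
                size = alpha * y + (1 - alpha) * size
                interval = alpha * periods_since_demand + (1 - alpha) * interval
            periods_since_demand = 1
        else:
            periods_since_demand += 1

    if size is None:
        return 0.0  # no demand observed at all
    return size / interval


# Example: three units sold every third day gives roughly one unit per day.
forecast = croston_forecast([0, 0, 3, 0, 0, 3, 0, 0, 3])
```

Because the estimates only change in periods with demand, long runs of zeros do not drag the forecast towards zero the way simple exponential smoothing would.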
Because such datasets can exceed millions of predicted items and include the complexities described above, we need to find model approaches that match them. No single model handles all of these complexities best. For low-volume time series with little variance and no external drivers, statistical models empirically perform well. For high-volume products whose high variance is driven by external factors, machine learning or deep learning models are best practice.
Our tech stack at paretos lets us build ensembles of these approaches effectively and therefore provide optimal results for a wide range of problem characteristics.
But how do we know whether a prediction model's performance is actually good, bad, superb or just mediocre?
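As a rough illustration of this rule of thumb, the sketch below routes a series to a model family based on its volume, variance and external drivers. Everything in it is hypothetical: the `SeriesProfile` fields, the threshold values and the `"compare_both"` fallback are assumptions chosen for the example, and suitable cut-offs would have to be tuned on the actual data.

```python
from dataclasses import dataclass


@dataclass
class SeriesProfile:
    mean_volume: float               # average units per period
    coefficient_of_variation: float  # standard deviation divided by mean
    has_external_drivers: bool       # e.g. promotions, prices, weather


def choose_model_family(profile, volume_threshold=100.0, cv_threshold=0.5):
    """Pick a model family for one series (thresholds are illustrative)."""
    low_volume = profile.mean_volume < volume_threshold
    low_variance = profile.coefficient_of_variation < cv_threshold

    if low_volume and low_variance and not profile.has_external_drivers:
        return "statistical"
    if not low_volume and not low_variance:
        return "machine_learning"
    # Mixed cases: try both families and keep whichever beats the baseline.
    return "compare_both"


family = choose_model_family(SeriesProfile(5.0, 0.2, False))  # "statistical"
```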
The first and most crucial step in time series forecasting is running baselines. They provide a lot of context for the performance of the final prediction model and are essential for steering the machine learning training iterations.
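Two common baselines are the naive forecast (repeat the last observed value) and the seasonal naive forecast (repeat the last full season). The sketch below shows both together with a mean absolute error, so that any candidate model can be judged against them. The function names and the toy numbers are illustrative assumptions.

```python
def naive_forecast(history, horizon):
    """Repeat the last observed value for every future period."""
    return [history[-1]] * horizon


def seasonal_naive_forecast(history, horizon, season_length):
    """Repeat the most recent full season (needs at least one season of history)."""
    last_season = history[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Toy data with a repeating three-period pattern.
history = [10, 20, 30, 10, 20, 30]
actual = [10, 20, 30]

naive_mae = mean_absolute_error(actual, naive_forecast(history, 3))
seasonal_mae = mean_absolute_error(actual, seasonal_naive_forecast(history, 3, 3))
```

A trained model that cannot beat the better of these two errors on held-out data is not yet adding value over a rule that costs nothing to run.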
Analysing model performance by data cluster (for example, clustered by volume, variance or forecasting error) further helps in honing the predictions. This becomes essential in datasets with more than ~100k predicted items, where the decision maker needs to be guided straight to the most important items in order to review them quickly and effectively.
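One simple way to set up such an analysis is to label each item by volume and variance and then average the forecast error per label. The sketch below uses only the Python standard library; the thresholds, the label names and the `items` structure (item id mapped to a pair of actuals and forecasts) are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def cluster_label(actuals, volume_threshold=100.0, cv_threshold=0.5):
    """Label one item by average volume and coefficient of variation."""
    avg = mean(actuals)
    cv = pstdev(actuals) / avg if avg > 0 else 0.0
    volume = "high_volume" if avg >= volume_threshold else "low_volume"
    variance = "high_variance" if cv >= cv_threshold else "low_variance"
    return f"{volume}/{variance}"


def error_by_cluster(items):
    """items maps item id -> (actuals, forecasts); returns mean MAE per cluster."""
    errors = defaultdict(list)
    for actuals, forecasts in items.values():
        mae = mean([abs(a - f) for a, f in zip(actuals, forecasts)])
        errors[cluster_label(actuals)].append(mae)
    return {label: mean(values) for label, values in errors.items()}


items = {
    "steady_seller": ([200, 200, 200], [190, 210, 200]),
    "sporadic_item": ([1, 0, 5], [2, 2, 2]),
}
report = error_by_cluster(items)
```

Sorting the resulting report by error, or weighting it by volume, is one way to point a reviewer at the clusters that matter most first.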
In the world of time series forecasting, paretos offers data-informed insights and optimized forecasts, ensuring that your business stays ahead of the curve. Navigate through the complexities of real-world data and turn every challenge into an opportunity. Begin forecasting with paretos today.
Would you like to know if paretos is the right solution for you? Feel free to schedule a non-binding consultation appointment.