Sales forecasting is a key element of any business. Accurate forecasts help optimize strategic plans and manage different aspects of the business more effectively. One of our customers, in the animal nutrition and feed industry, struggled to forecast sales volumes accurately because of the complexity, volatility and dynamic nature of the business. By harnessing data science and machine learning, they were able to drastically improve sales forecast accuracy.
The Challenge
Animal feed production depends heavily on weather, animal feeding cycles and the availability of the supplies needed to manufacture the feed (which in turn follows seasonal trends). These factors made sales forecasting complicated and created the need for an advanced machine learning solution.
Our Solution
Our solution followed the data science process diligently, from the data collection stage through to the model deployment stage.
Figure 1: The data science process
As part of the process itself, the following steps were undertaken:
1. Data Collection – The data came from several sources, which had to be collected and collated.
2. Data Pre-processing – This step mainly involved cleansing the data to get it ready for machine learning.
3. Feature Engineering – This is one of the most important steps in the process. It uses domain knowledge to engineer features from the given data and to find their relative importance with respect to sales, in order to better understand the data and make the machine learning model more accurate. For this solution, we derived features such as ‘Month-over-Month Sales Difference’, ‘Quarterly Average Sales’ and ‘Average Seasonal Sales’.
A sample of how the importance is derived and can be used to make decisions is shown below.
Figure 2: Feature Engineering for Sales Forecasting
Figure 2 shows that sales-based features are the most important, followed by location-based features and lastly the kind of product. This process ensures that the right data goes into building the model.
The feature engineering itself combines several methods, including correlation analysis and random forests.
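To make step 3 more concrete, the sketch below derives the three features named above from a monthly series and ranks them by correlation with the following month's sales. It is only an illustration: the sales figures, the function names and the use of a lagged Pearson correlation as an importance score are all assumptions, and the random forest part of the actual solution is not shown.

```python
from math import sqrt
from statistics import mean

# Hypothetical monthly sales volumes for one product, two years (Jan..Dec twice).
monthly_sales = [120, 115, 130, 150, 170, 160, 155, 165, 180, 200, 190, 175,
                 125, 118, 138, 158, 176, 170, 160, 172, 188, 210, 198, 182]

def month_over_month_diff(sales):
    """Change from the previous month; None for the first month."""
    return [None] + [curr - prev for prev, curr in zip(sales, sales[1:])]

def quarterly_average(sales):
    """Average of each three-month block, repeated for every month in it."""
    out = []
    for start in range(0, len(sales), 3):
        quarter = sales[start:start + 3]
        out.extend([mean(quarter)] * len(quarter))
    return out

def seasonal_average(sales, period=12):
    """Average of all observations at the same position in the season."""
    by_position = [mean(sales[pos::period]) for pos in range(period)]
    return [by_position[i % period] for i in range(len(sales))]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def lagged_correlation(feature, target):
    """Correlate last month's feature value with this month's sales."""
    pairs = [(f, t) for f, t in zip(feature[:-1], target[1:]) if f is not None]
    xs, ys = zip(*pairs)
    return pearson(xs, ys)

mom = month_over_month_diff(monthly_sales)
qavg = quarterly_average(monthly_sales)
savg = seasonal_average(monthly_sales)

# Absolute correlation used as a crude importance score (an assumption).
importance = {
    "month_over_month_diff": abs(lagged_correlation(mom, monthly_sales)),
    "quarterly_average": abs(lagged_correlation(qavg, monthly_sales)),
    "seasonal_average": abs(lagged_correlation(savg, monthly_sales)),
}
ranked = sorted(importance, key=importance.get, reverse=True)
print(ranked)
```

Note that the quarterly and seasonal averages here are computed over the whole history, so in a real pipeline they would have to be restricted to past data to avoid leaking future sales into the features.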
4. Machine Learning – The machine learning process was iterative and involved many rounds of tuning and optimization to arrive at the best possible model, one that takes all the data and business rules into consideration. Various regression and time series models were fitted to the data, but the Advanced Multi-Seasonal-Multi Variate ARIMA (AutoRegressive Integrated Moving Average) model was best suited to this problem. It is an ensemble of traditional ARIMA and custom machine learning elements.
Why Multi-Seasonal-Multi Variate ARIMA?
Once the candidate model was chosen and optimized, it was deployed for use.
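As a rough illustration of the seasonal time series idea behind step 4, here is a heavily simplified sketch: one round of seasonal differencing followed by a single autoregressive term fitted by least squares, which corresponds to a seasonal ARIMA(1,0,0)(0,1,0) model without an intercept. This is not the customer's model. It handles one seasonal period only, uses no external variables such as weather, and contains none of the custom machine learning elements. All function names are hypothetical.

```python
def seasonal_difference(series, period):
    """Remove the repeating seasonal pattern: d[t] = y[t] - y[t - period]."""
    return [series[t] - series[t - period] for t in range(period, len(series))]

def fit_ar1(diffs):
    """Least-squares estimate of phi in d[t] = phi * d[t-1] + noise."""
    num = sum(prev * curr for prev, curr in zip(diffs, diffs[1:]))
    den = sum(prev * prev for prev in diffs[:-1])
    return num / den if den else 0.0

def forecast(series, period, steps):
    """Forecast `steps` values ahead.

    Each forecast is the value one season earlier plus the predicted
    seasonal difference, which decays by a factor of phi at every step.
    Needs at least period + 2 observations.
    """
    diffs = seasonal_difference(series, period)
    phi = fit_ar1(diffs)
    history = list(series)
    last_diff = diffs[-1]
    for _ in range(steps):
        last_diff = phi * last_diff
        history.append(history[-period] + last_diff)
    return history[len(series):]

# Hypothetical quarterly volumes growing by 2 units per year.
quarterly = [10, 20, 30, 40, 12, 22, 32, 42, 14, 24, 34, 44]
print(forecast(quarterly, period=4, steps=4))
```

In practice a library implementation with proper parameter estimation, support for exogenous regressors and diagnostics would be used instead of hand-rolled code like this.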
The Output
The output of the machine learning forecasting model is a forecast of product sales volumes at different levels of granularity: region, factory and salesperson. On average, the forecast is within 5% of the actuals, which is a very low margin of error.
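One common way to express "within an average of 5% of the actuals" is the mean absolute percentage error (MAPE). The source does not state which metric was used, so the sketch below is an assumption, and the records in it are made-up numbers used purely to show the calculation overall and per region.

```python
from collections import defaultdict

def mape(pairs):
    """Mean absolute percentage error over (actual, forecast) pairs, in percent.

    Pairs with a zero actual are skipped because the percentage is undefined;
    at least one non-zero actual is required.
    """
    errors = [abs(actual - predicted) / abs(actual)
              for actual, predicted in pairs if actual != 0]
    return 100 * sum(errors) / len(errors)

def mape_by_level(rows):
    """Group (level, actual, forecast) rows by level and compute MAPE for each."""
    groups = defaultdict(list)
    for level, actual, predicted in rows:
        groups[level].append((actual, predicted))
    return {level: mape(pairs) for level, pairs in groups.items()}

# Hypothetical records: (region, actual volume, forecast volume).
records = [
    ("North", 200, 190),
    ("North", 150, 156),
    ("South", 100, 104),
    ("South", 250, 240),
]

overall = mape([(actual, predicted) for _, actual, predicted in records])
per_region = mape_by_level(records)
print(round(overall, 2), {k: round(v, 2) for k, v in per_region.items()})
```

The same grouping can be applied to factories or salespeople by changing the first field of each record.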
Sales managers and salespeople now have a more reliable view of the forecast from different dimensions and can make the appropriate business decisions.
Conclusion
The solution was very successful in addressing the problem and benefiting the business. Similarly accurate and scalable models can be applied to other sales scenarios that involve complexity and large amounts of data.