Developing a New Data Pipeline to Improve Supply & Demand Forecasting and Increase Productivity by 25%
Our client, an American multinational food manufacturing company, partnered with Wizeline to improve data capabilities for business analysis, leveraging technology, data science, cloud, and automation to enhance supply & demand forecasting and increase productivity by 25%.
Read more to see how the Wizeline team implemented an optimized, automated workflow that streamlined data processing and cut the end-to-end automated prediction process from an unpredictable duration to an average of 1 hour.
The Challenge: Maximizing Data for Strategic Business Decisions
Data is only helpful if it is transformed into timely insights that support business decisions. Our client is focused on maximizing its data for strategic decision-making to advance its digital evolution. However, the data used in the decision-making process came from diverse, unconnected sources and a wide range of contributors, and it relied on manual processes, jeopardizing quality and accuracy. In addition, the software engineering processes were not optimized: knowledge was siloed and methods were not properly documented. All of this impeded our client's ability to quickly provide business forecasts to the company's decision-makers.
Our Solution: Leveraging an Optimized, Automated Workflow to Create a Data Pipeline
The Wizeline team created a new data pipeline, a crucial element to enable our customer to achieve its digital evolution by leveraging data analytics for decision-making. The pipeline aims to periodically show supply & demand forecasts to the company’s decision-makers. These forecasts help shape commercial strategies and inform essential business decisions.
To develop the pipeline, Wizeline implemented an optimized, automated workflow to streamline data processing by:
- Implementing Infrastructure as Code (IaC) to reduce maintenance and ease replication between AWS environments, making it possible to include the solution in the Disaster Recovery plan
- Using GitHub Actions to build automated CI/CD processes
- Improving security through Docker image scanning
- Reducing the duration and cost of machine learning (ML) model training by replacing SageMaker notebooks with training jobs, following SageMaker's best practices
- Reducing the complexity of ML model training from n³ to n²
- Boosting code quality and robustness by introducing a code linter and refactoring the code
- Accelerating new feature development by adding continuous, agile testing to the code pipeline
- Sending email alerts, allowing immediate visibility of issues
- Documenting every step of the process, including the use of containerized environments to speed up onboarding, knowledge transfer, and scalability
The solution was built with the following technologies:
- AWS (CloudFormation, Glue, SageMaker, Lambda, Athena, S3, ECR)
- GitHub and GitHub Actions
- Python (Boto3, scikit-learn, Diagrams (mingrammer Diagrams as Code), Pandas, Profiling, Pdoc, Pytest, Pylint, Great Expectations 0.13.20)
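To illustrate the move from SageMaker notebooks to training jobs mentioned above, here is a minimal sketch in Python with Boto3. It is not the project's actual code: the helper names (`build_training_job_request`, `submit_training_job`), the instance type, the volume size, and every ARN, image URI, and bucket path are hypothetical placeholders. Only the `create_training_job` call and its request fields come from the SageMaker API.

```python
from typing import Any, Dict


def build_training_job_request(
    job_name: str,
    image_uri: str,
    role_arn: str,
    output_s3_uri: str,
    instance_type: str = "ml.m5.xlarge",  # assumed instance type
    max_runtime_hours: int = 12,
) -> Dict[str, Any]:
    """Build the keyword arguments for SageMaker's CreateTrainingJob call."""
    if max_runtime_hours <= 0:
        raise ValueError("max_runtime_hours must be positive")
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            # Docker image stored in ECR that contains the training code
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": 1,
            "VolumeSizeInGB": 50,  # assumed volume size
        },
        # Hard cap on runtime so a stuck job cannot run (and bill) forever
        "StoppingCondition": {"MaxRuntimeInSeconds": max_runtime_hours * 3600},
    }


def submit_training_job(request: Dict[str, Any]) -> str:
    """Submit the job and return its ARN. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works offline

    client = boto3.client("sagemaker")
    response = client.create_training_job(**request)
    return response["TrainingJobArn"]


if __name__ == "__main__":
    # All identifiers below are placeholders, not real resources.
    request = build_training_job_request(
        job_name="demand-forecast-example",
        image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/forecast:latest",
        role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
        output_s3_uri="s3://example-bucket/models/",
        max_runtime_hours=9,
    )
    print(request["StoppingCondition"])
```

A training job provisions its instance only for the duration of the run and then releases it, whereas a notebook instance keeps running until someone stops it. That difference is one plausible reason the switch reduces cost.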
Results: Reducing the Time to Train Predictive Machine Learning Models From 12 to 9 Hours
The data pipeline evolved from a manual deployment process, previously managed by our customer using spreadsheets, to a fully automated one, helping the food manufacturing company advance in its digital transformation journey.
The Wizeline team accomplished critical engineering and business milestones, including a 25% decrease in training time for predictive machine learning models, from 12 to 9 hours. The end-to-end automated predictions process was also cut to an average of 1 hour, moving from a process of unknown length and timing to one that is fully visible, scheduled, and automated.
The data pipeline project increased our customer's overall data quality, ensured information accuracy, and improved the ability to act on issues in real time thanks to the implementation of email alerts. To learn more about how Wizeline could help you achieve similar results, contact firstname.lastname@example.org.