If innovation is an engine, data can best be described as the oil that fuels it. However, like regular oil, data can’t create much value in its raw form; it needs to be collected, cleaned, organized, analyzed, and then converted into insights.
Over the past few months, several companies and organizations have recognized this reality and turned to Wizeline to explore and implement solutions with artificial intelligence (AI) and big data. While these technologies are not always the answer, organizations that evolve their processes and improve the quality of their data can accommodate these applications, ultimately helping them accomplish their business objectives.
As a company, and as engineers, we are always looking for new challenges. Even so, we’ve had to pause and understand what fundamental problems our customers are trying to solve before analyzing how these technological advancements could really help them.
Collecting and Managing Data the Right Way
Many of the organizations that have approached us have a substantial amount of data, much of which is structured, cleaned, and organized, mainly because it’s used in their day-to-day operations.
For example, a supermarket chain holds large amounts of data and important information about its customers, transactions, vendors, and stores. Real-time and historical data is usually stored in one or more OLTP (online transaction processing) databases, sometimes transformed and loaded into data warehouse solutions, and eventually sent to long-term or offline storage.
In parallel, most companies produce new forms of data all the time, absorbing external reports, manipulating data from internal systems, and generating new pieces of information. Excel is the preferred tool for this type of work, producing tons of independent files, shared across all levels of the organization.
It’s great that companies are using and generating data, but without a robust analytics solution, they are sure to encounter problems.
Eventually, a store manager will want to answer more complex questions. Which products will see increasing demand in the upcoming holiday season? Will demand for these products increase or decrease this year? Which products are purchased together? Who are my most loyal customers, and which customers might leave us for the competition? How can I target a specific segment of buyers, or increase the average ticket per customer?
These are all valid and important questions, but even if the store manager is a SQL expert and could query the databases directly, it’s unlikely they will get the right answers.
The organization in question has the oil but can’t convert it into the fuel that will power innovation and market value.
The fundamental problem is that we have different use cases for the data, with entirely different requirements. The analytics use case requires the capability to traverse the whole dataset quickly in order to group information, find specific answers, and extract insights.
We need an analytical database (OLAP) to perform this task, alongside the production database (OLTP), which needs enough safeguards in place to prevent a complex query from bringing it to its knees. Having both databases is not redundant because they serve different purposes.
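To make the difference between the two access patterns concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `line_items` table, its columns, and the sample rows are assumptions made up for illustration, and SQLite only stands in for both kinds of database. The first function touches the rows of a single ticket, as a production system would; the second answers "which products are purchased together?" and has to join and group the entire table.

```python
import sqlite3

# Hypothetical schema: one row per product on each ticket.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE line_items (ticket_id INTEGER, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO line_items VALUES (?, ?, ?)",
    [
        (1, "bread", 2.0), (1, "butter", 3.5),
        (2, "bread", 2.0), (2, "butter", 3.5), (2, "milk", 1.2),
        (3, "milk", 1.2), (3, "bread", 2.0),
    ],
)


def ticket_total(ticket_id):
    """Transactional (OLTP-style) access: read the rows of one ticket only."""
    row = conn.execute(
        "SELECT SUM(amount) FROM line_items WHERE ticket_id = ?", (ticket_id,)
    ).fetchone()
    return row[0]


def products_bought_together():
    """Analytical (OLAP-style) access: self-join and group the whole table."""
    return conn.execute(
        """
        SELECT a.product, b.product, COUNT(*) AS tickets
        FROM line_items a
        JOIN line_items b
          ON a.ticket_id = b.ticket_id AND a.product < b.product
        GROUP BY a.product, b.product
        ORDER BY tickets DESC, a.product, b.product
        """
    ).fetchall()


print(ticket_total(1))
print(products_bought_together())
```

On a table with millions of rows, the second query is the kind of workload that belongs on the analytical copy rather than on the database serving the checkout lanes.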
The Architecture for Success
To maximize the value of data, companies have to provide their teams and executives with tools that allow them to move from observations (data) to insights (analytics) and then to decisions (predictive tools).
With the power of cloud platforms and their flexibility, companies can move faster and become much leaner. Even so, while major cloud providers offer analytical databases under the PaaS (platform-as-a-service) model, the traditional approach for IT projects—selecting a few vendors, buying licenses, and waiting for complicated implementation projects—is not the path to success here.
At Wizeline, we have found that the solution is to replicate data from production databases into an analytical database that can power business insight extraction. We have implemented these solutions with many clients successfully and want to share the architecture with our audience.
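The replication idea can be sketched in a few lines. This is an illustration under stated assumptions, not the tooling used in real projects: two in-memory SQLite databases stand in for the production and analytical systems, the `sales` table is hypothetical, and the hypothetical helper `replicate_new_sales` copies only rows whose id is higher than anything the analytical side has already seen.

```python
import sqlite3

# Two in-memory SQLite databases stand in for the production (OLTP) system
# and the analytical database; a real deployment would use different engines.
production = sqlite3.connect(":memory:")
analytics = sqlite3.connect(":memory:")

SCHEMA = "CREATE TABLE sales (id INTEGER PRIMARY KEY, store TEXT, amount REAL)"
production.execute(SCHEMA)
analytics.execute(SCHEMA)


def replicate_new_sales(source, target):
    """Copy rows the target has not seen yet, using its highest id as a watermark."""
    watermark = target.execute("SELECT COALESCE(MAX(id), 0) FROM sales").fetchone()[0]
    rows = source.execute(
        "SELECT id, store, amount FROM sales WHERE id > ? ORDER BY id", (watermark,)
    ).fetchall()
    target.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    target.commit()
    return len(rows)


production.executemany(
    "INSERT INTO sales (store, amount) VALUES (?, ?)",
    [("north", 10.0), ("south", 20.0)],
)
first_batch = replicate_new_sales(production, analytics)

production.execute("INSERT INTO sales (store, amount) VALUES (?, ?)", ("north", 5.0))
second_batch = replicate_new_sales(production, analytics)

# The heavy aggregation runs against the copy, never against production.
totals = analytics.execute(
    "SELECT store, SUM(amount) FROM sales GROUP BY store ORDER BY store"
).fetchall()
print(first_batch, second_batch, totals)
```

A watermark on an increasing id only picks up newly inserted rows. Updates and deletes in production would need a different mechanism, such as change data capture or periodic full reloads.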
Wizeline Advanced Analytics Reference Architecture
The core component of this architecture is an analytical database that allows processing with SQL, which can be used to query the data to produce insights or to transform the data in preparation for further analytics. The architecture also includes Apache Superset, a business intelligence application, and Apache Airflow, which triggers processing steps to create pipelines and prepare the data for visualization.
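The notion of a pipeline made of dependent processing steps can be illustrated without any scheduler installed. The sketch below deliberately does not use the Airflow API. It relies only on Python's standard graphlib module, and the step names and the `run_pipeline` helper are hypothetical. It shows the core idea an orchestrator builds on: each step declares what it depends on, and steps run only after their dependencies.

```python
from graphlib import TopologicalSorter

# Hypothetical step names; each one maps to the set of steps it depends on.
PIPELINE = {
    "extract_sales": set(),
    "load_into_analytics": {"extract_sales"},
    "build_daily_totals": {"load_into_analytics"},
    "refresh_dashboard_table": {"build_daily_totals"},
}


def run_pipeline(steps, actions):
    """Run every step after its dependencies and return the execution order."""
    order = list(TopologicalSorter(steps).static_order())
    for name in order:
        actions[name]()
    return order


# Stand-in actions that only record that they ran.
log = []
actions = {name: (lambda n=name: log.append(n)) for name in PIPELINE}
order = run_pipeline(PIPELINE, actions)
print(order)
```

A real orchestrator adds scheduling, retries, and monitoring on top of this ordering, which is why the architecture delegates that work to Airflow.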
This architecture has several advantages.
- Low risk – The infrastructure is isolated from the production databases, so analytical workloads cannot disrupt day-to-day operations.
- Low cost – The infrastructure costs very little while idle and can scale up resources to generate insights quickly, making it almost serverless. It also allows for full control because every component can be replaced or updated as needed, and the IaC (infrastructure as code) piece means companies can manage their infrastructure simply and quickly.
- Open source – The architecture is based on open-source solutions, so no licenses need to be purchased, and it’s multiplatform: we highlighted AWS and GCP, but Azure is also an option.
This type of architecture is also simple to use, as business experts only need to work with a single web application, Superset, with no infrastructure knowledge required. It is also highly secure, with strong encryption, reducing the need for additional security measures.
By taking advantage of cloud platforms in this way, companies can focus on their core business and generate value from their data rather than investing too much time and money in complex projects and expensive software vendors. This paves the way for AI and data to become core competencies for the organization.
For more information on Wizeline’s Advanced Analytics solutions, contact email@example.com or firstname.lastname@example.org.