Data as Tech Debt: Simplicity Scales, Complexity Doesn’t

In the current economic landscape, many businesses are missing out on the significant value “stored” in their legacy data strategy, data architecture, and overall enterprise data inventory. After extensive research on ‘Data as Tech Debt’ and consultations with large companies, we know that many enterprises are still struggling with the accumulation of tech debt across IT systems, organizations, and applications.

To address this issue, we have developed a three-step framework that tackles data tech debt head-on: assessing the existing data landscape, building efficient data pipelines, and consolidating data for intelligent insights and monetization. This blog will explore how this framework helps businesses overcome data tech debt and unlock hidden value in their data assets.

Recognize and Understand the Problem by Assessing Data Tech Debt

A data tech debt assessment is necessary to unpack the hidden costs across the enterprise. Once uncovered, this legacy spend can be redirected for high transformational impact: reduced maintenance and operating expenses free up budget for transformative work on an organization’s data footprint. Here are some possible sources of data debt:

  • Data centers 
    • Contracts with third-party (3P) providers (hosted, owned property, outsourced management)
  • Software licensing (Informatica, Hadoop, Cloudera, etc.) 
  • Legacy platforms (database and mainframe transformation) 

The following can help guide the data debt assessment process:

  • How many databases are there in total? What is the total IT spend on database platforms?
  • What Relational Database Management Systems (RDBMS) are being used? Are any licensed?
  • What NoSQL (Key-Value, Document, Graph, etc.) databases are used? Are any licensed?
  • What database(s) infrastructure supports the products and services?
  • How can you scale your back-end DBMS? How far along the growth path do you estimate you have gone? (Asked another way: if you had to grow your database capability by a factor of 10, how hard would that be to achieve?)
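The questions above boil down to building an inventory of database assets and aggregating spend per platform. As a minimal sketch, the structure below uses a hypothetical `DatabaseAsset` record and `summarize_debt` helper (both invented for illustration, with made-up figures) to tally count, annual cost, and licensing exposure per engine:

```python
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class DatabaseAsset:
    name: str
    engine: str          # e.g. "oracle", "postgres", "mongodb"
    category: str        # "rdbms" or "nosql"
    licensed: bool
    annual_cost: float   # licensing + hosting spend, in USD

def summarize_debt(assets):
    """Aggregate count and annual spend per engine, flagging licensed platforms."""
    summary = defaultdict(lambda: {"count": 0, "annual_cost": 0.0, "licensed": False})
    for a in assets:
        row = summary[a.engine]
        row["count"] += 1
        row["annual_cost"] += a.annual_cost
        row["licensed"] = row["licensed"] or a.licensed
    return dict(summary)

# Illustrative inventory; real figures would come from the assessment.
inventory = [
    DatabaseAsset("orders",  "oracle",  "rdbms", True,  120_000),
    DatabaseAsset("billing", "oracle",  "rdbms", True,   95_000),
    DatabaseAsset("catalog", "mongodb", "nosql", False,  30_000),
]
print(summarize_debt(inventory))
```

Even a spreadsheet-level summary like this makes it obvious where licensed, high-cost platforms concentrate, which is usually where the first modernization conversations start.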

The goal of the assessment is to determine what, from a data perspective, is negatively affecting the business. We align ourselves with the customer’s strategy and work from there to make recommendations on better leveraging the existing data assets.

After Data Tech Debt Is Identified, Move On to Building Data Pipelines That Aren’t Dead Ends

In the past, you’ve built data pipelines that lead to multiple dead ends in various systems, leaving your data siloed and difficult to access and extract value from. Now that you’ve identified all your data tech debt, it’s time to build the right data architecture to support you across every stage of the data life cycle.

The software characteristics embraced by the Reactive Manifesto are just as desirable in the data engineering field, since their adoption enables responsive, elastic, resilient, and message-driven ETL pipelines, as explained below.

  1. Responsive ETL/ELT provides high availability and low latency through a cloud-based infrastructure and well-designed data lifecycle management and data aggregation tactics.
  2. Elastic ETL/ELT provides a variable processing capability that can scale up during high-demand peaks and scale down during idle periods.
  3. Resilient ETL/ELT allows keeping the loading and transformation processes running even when one of the components becomes unavailable; this can be achieved through parallel/distributed processing and redundancy.
  4. Event-driven ETL/ELT allows handling asynchronous and parallel processing typical of lambda architectures where a fast lane must be enabled for real-time processing while keeping a slow lane for batch processing and time-consuming analytics.
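The four properties above can be sketched in miniature with Python’s `asyncio`. This is a toy illustration, not a production pipeline: `extract`, `transform_with_retry`, `load`, and `run_pipeline` are all hypothetical names, and the transformation is a stand-in for real work. The bounded queue models message-driven flow with back-pressure, and the retry loop models resilience to transient failures:

```python
import asyncio

async def extract(queue, records):
    # Message-driven: each record is published as an event onto the queue.
    for r in records:
        await queue.put(r)
    await queue.put(None)  # sentinel signalling end of stream

async def transform_with_retry(record, attempts=3):
    # Resilient: transient failures are retried with exponential backoff
    # instead of killing the whole pipeline.
    for attempt in range(attempts):
        try:
            return record.upper()  # stand-in for a real transformation
        except Exception:
            await asyncio.sleep(2 ** attempt * 0.01)
    raise RuntimeError(f"gave up on {record!r}")

async def load(queue, sink):
    # Consumer runs concurrently with the producer; a bounded queue
    # provides back-pressure so a slow loader throttles extraction.
    while (record := await queue.get()) is not None:
        sink.append(await transform_with_retry(record))

async def run_pipeline(records):
    queue = asyncio.Queue(maxsize=100)
    sink = []
    await asyncio.gather(extract(queue, records), load(queue, sink))
    return sink

result = asyncio.run(run_pipeline(["a", "b", "c"]))
print(result)
```

In a real deployment the queue would typically be an external broker (Kafka, Pub/Sub, etc.) and elasticity would come from scaling the number of consumers, but the shape of the design is the same.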

The Wize Journey to Becoming an Insights-Driven Organization

Modernizing your ETL pipeline is only the first step toward optimizing your organization’s data TCO. Using technology to visualize, analyze, and enrich data with ever more sophisticated insights is the best way to increase its value. And if you can deliver that value to internal users, you can also monetize it. Stay tuned for more insights on monetization.

Posted by Tajma Brown on June 12, 2023