20 Expert Tips for Managing Your Data

Roughly 2.5 quintillion bytes of data are created every day. At this rate, it's no wonder businesses are prioritizing their data strategies and looking for the most efficient ways to unlock their data insights. This past month, Wizeline's data engineers, analysts, and scientists banded together to share their best practices for managing data.

“Keep your data tidy. Each row should be a unique observation, each column an attribute that describes it, and each table should contain a single type of observation.” – Team
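To make the tidy layout a bit more concrete, here is a minimal sketch that reshapes a small "wide" table into one row per observation. The store names, column names, and values are invented for illustration, and `to_tidy` is a hypothetical helper rather than anything from a particular library.

```python
# Hypothetical "wide" table: one row per store, one column per month.
wide_rows = [
    {"store": "A", "jan_sales": 100, "feb_sales": 120},
    {"store": "B", "jan_sales": 80, "feb_sales": 95},
]


def to_tidy(rows, id_column, value_name):
    """Turn each (row, measurement column) pair into its own observation."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue
            month = column.split("_")[0]  # "jan_sales" -> "jan"
            tidy.append({id_column: row[id_column], "month": month, value_name: value})
    return tidy


tidy_rows = to_tidy(wide_rows, id_column="store", value_name="sales")
for row in tidy_rows:
    print(row)
```

Each row of the result now describes a single store-month observation, which tends to make filtering and grouping simpler.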

“Never modify the raw data. Store transformations that are common to more than one analysis or that help cut response time in your frequent reports or analyses. Be mindful of the memory vs. CPU trade-off while keeping your deliverables in mind at all times.” – Said Montiel, Data Engineer
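One way to follow this tip, sketched under some assumptions: the raw file is only ever read, and the transformed result is stored in a separate file so later reports can reuse it instead of recomputing it. The file names, columns, and the Fahrenheit-to-Celsius step are made up for the example.

```python
import csv
import tempfile
from pathlib import Path


def build_derived(raw_path: Path, derived_path: Path) -> None:
    """Read the raw file (never written to) and store a cleaned copy elsewhere."""
    with raw_path.open(newline="") as src, derived_path.open("w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=["city", "temp_c"])
        writer.writeheader()
        for row in reader:
            # The transformation: trim whitespace and convert Fahrenheit to Celsius.
            temp_c = round((float(row["temp_f"]) - 32) * 5 / 9, 1)
            writer.writerow({"city": row["city"].strip(), "temp_c": temp_c})


# Hypothetical sample data written to a temporary folder.
workdir = Path(tempfile.mkdtemp())
raw_file = workdir / "raw_readings.csv"
raw_file.write_text("city,temp_f\n Guadalajara ,86\nMexico City,68\n")
raw_before = raw_file.read_text()

derived_file = workdir / "derived_readings.csv"
build_derived(raw_file, derived_file)
```

Storing the derived file spends disk space to save CPU time on repeated reports, which is the kind of trade-off the tip asks you to weigh.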

“Understanding the difference between big data and data pipelines is the key to success. Choosing the right technology will deliver a balance between cost, time and efficiency. Always keep in mind that the most important part of a script is done by thinking, not typing.” – Rodrigo Chaparro, Data Scientist

“Automate and document the transformations your data goes through from raw to results. This way your work is easy to track and reproduce. Documentation is vital to ramp up new members and to find improvements over time.” – Saul Flores, Data Engineer
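As a rough sketch of what automated and documented transformations could look like, the example below runs a list of small, named steps and logs what each one did. The step names, the record fields, and the `run_pipeline` helper are all hypothetical.

```python
def drop_missing(rows):
    """Remove records with no amount."""
    return [r for r in rows if r.get("amount") is not None]


def to_cents(rows):
    """Convert amounts from units to integer cents."""
    return [{**r, "amount": round(r["amount"] * 100)} for r in rows]


# The ordered list of steps doubles as documentation of the pipeline.
PIPELINE = [drop_missing, to_cents]


def run_pipeline(rows, steps):
    """Apply each step in order and keep a log of what was done."""
    log = []
    for step in steps:
        before = len(rows)
        rows = step(rows)
        log.append(f"{step.__name__}: {step.__doc__} ({before} -> {len(rows)} rows)")
    return rows, log


# Made-up raw records; the raw list itself is never modified.
raw = [{"id": 1, "amount": 12.5}, {"id": 2, "amount": None}, {"id": 3, "amount": 3.0}]
result, log = run_pipeline(raw, PIPELINE)
for entry in log:
    print(entry)
```

Because every step carries a docstring and the log is generated from it, a new team member can read the run log and see how raw data became results.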

“When working with Big Data you might reach the limits of a particular technology or implementation. Instead of worrying, keep investigating. Maybe the fix for your issue lies somewhere between a configuration change and a technology switchover.” – Abraham Alcantara, Data Engineer

“You might not need Artificial Intelligence straight away. You can start with statistical analysis and automation. Get to know the data you have and how it is currently being used. Once you’re done with that, keep doing it and see where it takes you next.” – Tania Reyes, Data Engineer

“When optimizing your automated data pipeline, treat the trade-off between memory and CPU as being just as important as the trade-off between cost and time.” – Team

“Machine Learning is not a magical plug-in box. It takes work and patience to find and validate a pattern in the data. Maintain focus, be patient and always keep your eyes on the goal.” – Juan Orozco, Data Scientist

“The quality of the model produced depends heavily on the quality of the data provided to it. Keep an eye on the amount, frequency and type of data collected.” – Ana Costilla, Data Scientist

“Whenever possible, aim to present only one idea per plot or number when constructing the story with your analysis.” – Team

“When building business intelligence solutions, if the report you are making is for monitoring, ensure you also provide information on how to take action in the known scenarios.” – Ana Costilla

“Build compelling stories with your data! Numbers mean nothing without context, insights, and purpose.” – Team

“Ensure all the information portrayed in a report can trigger an action or help build the case for a story. Don’t clutter it with irrelevant numbers that don’t really add value.” – Rosa Maria Muñoz, Data Engineer

“Never add complexity to a model unless you’re sure it will pay off. Start simple: it’s better to build an elementary baseline to serve as a benchmark and iterate from there.” – Edgar Arenas, Data Scientist
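A minimal sketch of the baseline idea, assuming a simple regression setting: the baseline always predicts the training mean, and a more complex candidate is only accepted if it beats that benchmark by a chosen margin. The numbers, the 10% margin, and the `beats_baseline` helper are illustrative assumptions.

```python
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Made-up target values for illustration.
train_y = [10, 12, 11, 13, 14]
test_y = [12, 15]

# Elementary baseline: always predict the mean of the training data.
baseline_prediction = mean(train_y)
baseline_mae = mean_absolute_error(test_y, [baseline_prediction] * len(test_y))


def beats_baseline(candidate_mae, baseline_mae, min_gain=0.1):
    """Accept added complexity only if error drops by at least min_gain (relative)."""
    return candidate_mae <= baseline_mae * (1 - min_gain)


print(f"Baseline MAE: {baseline_mae}")
```

Any fancier model can then be judged with `beats_baseline(candidate_mae, baseline_mae)` before its extra complexity is allowed into the project.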

“Always track and measure metrics of your data pipelines. This way you can have quality control to ensure your pipeline is operating efficiently.” – Anselmo Rangel, Data Engineer
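As one possible way to track pipeline metrics, the sketch below wraps a step so that each run records its duration and its input and output row counts. The `tracked` decorator, the in-memory `metrics` list, and the sample step are hypothetical; a real pipeline would more likely send these numbers to a monitoring system.

```python
import time

metrics = []  # stand-in for a real metrics store (an assumption of this sketch)


def tracked(step):
    """Wrap a pipeline step to record its duration and input/output row counts."""
    def wrapper(rows):
        start = time.perf_counter()
        result = step(rows)
        metrics.append({
            "step": step.__name__,
            "rows_in": len(rows),
            "rows_out": len(result),
            "seconds": time.perf_counter() - start,
        })
        return result
    return wrapper


@tracked
def remove_duplicates(rows):
    # dict.fromkeys keeps the first occurrence of each value, in order.
    return list(dict.fromkeys(rows))


clean = remove_duplicates(["a", "b", "a", "c"])
print(metrics[0])
```

A sudden change in `rows_out` or `seconds` between runs is a cheap early signal that something upstream has changed.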

“Governance: Define a source of truth for your data! Good data governance ensures quality control.” – Team

“The tech world is so full of buzzwords and fraudsters that it is really common for traditional industries to think that a data-driven transformation will be easy, cheap and fast. On the contrary, it will take time, it will be painful and it will be costly but, believe me, the end result will be worth it.” – Luis de Alba, Head of Data Practice

“I’d like to share a quote by Sir Arthur Conan Doyle: ‘It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.’” – Luis de Alba

“Section your data lake into zones that make sense for your project. You’ll typically have raw, trusted and production versions of your data.” – Daniel Morales, Data Engineer
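Here is a small sketch of the zone idea using plain folders. The zone names come from the tip above; the folder layout, the `promote` helper, and the sample file are assumptions, and a real promotion step would normally validate the data before copying it.

```python
import shutil
import tempfile
from pathlib import Path

ZONES = ("raw", "trusted", "production")  # zone names taken from the tip above


def init_lake(root: Path) -> None:
    """Create one folder per zone under the lake root."""
    for zone in ZONES:
        (root / zone).mkdir(parents=True, exist_ok=True)


def promote(root: Path, filename: str, source: str, target: str) -> Path:
    """Copy a file to the next zone, leaving the source copy in place."""
    if ZONES.index(target) != ZONES.index(source) + 1:
        raise ValueError(f"cannot promote from {source} to {target}")
    destination = root / target / filename
    shutil.copy2(root / source / filename, destination)
    return destination


# Hypothetical lake in a temporary folder with one made-up file.
lake = Path(tempfile.mkdtemp())
init_lake(lake)
(lake / "raw" / "orders.csv").write_text("id,total\n1,9.99\n")
trusted_copy = promote(lake, "orders.csv", "raw", "trusted")
print(trusted_copy)
```

Copying rather than moving keeps the raw version intact, so every later version can still be traced back to where it came from.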

“Be conscious of data storage checkpoints. You need to always be able to trace everything back in between transformations.” – Daniel Morales

Thanks for following along this month and stay tuned for more content from our data experts. If you would like to learn more about working with our data team, visit or contact us at

Posted by Nellie Luna on September 3, 2019