Helping Etsy Migrate Its Data Warehouse From On-Premises to Google Cloud Platform With No Downtime
Etsy is an American e-commerce company building a global marketplace for unique and creative goods. It is home to a universe of special, extraordinary items, from unique handcrafted pieces to vintage treasures. Etsy helps its community of sellers turn their ideas into successful businesses.
Etsy has a multi-year partnership with Wizeline to build core product design and development talent across their business. Through this partnership, Wizeline has helped Etsy enhance its technology and overall business.
In this case study, we explore how Wizeline helped Etsy migrate its data warehouse from on-premises Vertica to BigQuery on Google Cloud Platform (GCP), using the Google Cloud SDK, libraries, and Google Cloud Storage (GCS) integrations, with zero business impact.
The Challenge: Migrating an On-Premises Data Analytics Warehouse to the Cloud
Etsy’s on-premises data analytics warehouse was due to reach end-of-life within the year. Etsy therefore needed to migrate its Vertica data warehouse, with 450 TB of data, 10k+ tables, 1k data sources, 500+ rollups, 1k+ dashboards, and 200 data outputs, all owned by different teams, to a cloud-native solution. All of this had to be accomplished while keeping Etsy’s data security and culture a top priority.
The architecture was highly interconnected. For this migration, Etsy needed to refactor ten internal systems and to build and migrate 3k pipelines, 2k views, and 1k SQL scripts, which made the work complex to coordinate and execute across multiple teams. In addition, licensing costs were restricting Etsy’s growth.
Our Solution: Collaborating With Etsy to Optimize Migration
The Wizeline team collaborated with Etsy on the architecture, design, and execution of the migration processes needed to shut down Vertica. Zero downtime for Etsy’s daily operations was a must, so during the pre-migration phase, the team helped make architectural decisions and built tools that reviewed all current uses of Vertica to identify what needed to be migrated.
We led the migration of all data, including related controls, from Vertica to BigQuery, working closely with teams across Etsy. We also contributed to creating an inventory of all data and jobs, implementing the PHP client for BigQuery, exporting data through Dataflow, and modernizing Etsy’s rollup execution process.
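To make the inventory step more concrete, here is a minimal Python sketch of how a review of Vertica usage might start: scanning SQL scripts for the tables they read, so that each table can be traced to the jobs that depend on it. This is an illustration only, not the tooling built for Etsy. The script names, table names, and the build_inventory helper are all hypothetical, and the regular expression is a deliberate simplification that a real tool would replace with a proper SQL parser (it would, for example, count CTE names as tables).

```python
import re
from collections import defaultdict

# Simplified pattern: captures the identifier that follows FROM or JOIN,
# optionally schema-qualified (e.g. analytics.orders). Subqueries such as
# "FROM (SELECT ...)" are skipped because "(" is not an identifier character.
TABLE_REF = re.compile(
    r"\b(?:FROM|JOIN)\s+([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)?)",
    re.IGNORECASE,
)


def build_inventory(scripts):
    """Map each referenced table to the sorted list of scripts that read it.

    `scripts` is a dict of {script_name: sql_text}; both are hypothetical
    stand-ins for real warehouse jobs.
    """
    usage = defaultdict(set)
    for name, sql in scripts.items():
        for table in TABLE_REF.findall(sql):
            usage[table.lower()].add(name)
    return {table: sorted(names) for table, names in usage.items()}


# Hypothetical example scripts, invented for illustration.
scripts = {
    "daily_sales_rollup.sql": (
        "SELECT o.shop_id, SUM(o.total) "
        "FROM analytics.orders o "
        "JOIN analytics.listings l ON o.listing_id = l.id "
        "GROUP BY o.shop_id"
    ),
    "top_shops.sql": "SELECT shop_id FROM analytics.orders WHERE total > 100",
}

inventory = build_inventory(scripts)
for table, readers in sorted(inventory.items()):
    print(f"{table}: {', '.join(readers)}")
```

An inventory of this shape shows which tables are shared by several jobs, which is the kind of information needed to coordinate a migration across teams that own different parts of an interconnected warehouse.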
Results: Improving System Performance and Increasing the Data Warehouse From 750 Terabytes to 1.5 Petabytes
In summary, 15 Wizeline experts worked with Etsy’s engineering team to migrate their Vertica data warehouse to BigQuery. The code of over ten internal systems was modified and refactored during the migration.
The migration to Google Cloud reduced maintenance costs and improved the performance of the entire system. The data warehouse grew from 750 terabytes to the current 1.5 petabytes. In addition, the Wizeline team reviewed, upgraded, and automated Etsy’s existing data access policies. The Etsy team can now spend more time implementing systems and solutions on the cloud, optimizing the overall business process.