Have you ever thought about what it takes to move 97 petabytes to the cloud? Now imagine those 97 petabytes of stored data alongside 300 terabytes of memory and 90,000 CPU cores, and you have the exact size of our Hadoop environment. Over the next year, our engineering team will be migrating this environment and several others to Google Cloud Platform (GCP), a major change to the way we work today that will set us up for greater success in the future.
Why are we doing this?
In the 10+ years that I have worked here, this is easily one of the largest team-wide initiatives we have ever seen. It’s a huge and daunting project, but also an exciting challenge. Here’s why we’re moving to the cloud:
- Sleep better with stronger Business Continuity / Disaster Recovery (BCDR). LiveRamp currently maintains much of its own hardware at a San Francisco-based colocation facility. Creating a comprehensive backup and recovery strategy has proven a difficult task, and having our crown jewels resting in an earthquake zone makes us nervous. With our move to GCP, we look forward to multiregional data replication being a breeze compared to what we have today.
- Move faster. It’s dramatically easier to try new architectures or prototype new products in the cloud. What takes a couple of clicks in the GCP console could take weeks or months of planning and budgeting at the colo.
- Focus on what we’re good at. The less time we spend doing things we can reasonably pay someone else to do for us, the more time we’ll have to focus on problems important to the business.
As our GCP migration will involve every member of our team pitching in at some point, we made sure to clearly communicate the value of this migration and evaluate cloud providers together. During the evaluation process, we engaged stakeholders across the engineering team. In particular, we had detailed conversations with members of our pixel-serving team, since many of their systems are already cloud-based.
Along the way, we documented what we liked about the providers we met with. We wanted to be sure that everyone on the team was aware of this massive project, and that we understood stakeholders’ perspectives and earned their buy-in before making our final decision.
Why did we choose GCP?
We chose GCP based on three main criteria:
- The awesome GCP team. We worked with amazing people at Google throughout the RFP process. They took every opportunity to go above and beyond to meet our needs, an attitude that will serve us well throughout the migration and beyond. We think of Googlers as LiveRampers, and we’ve heard from several people at Google that they call us Googlers. We feel like one team with similar cultures and values, and look forward to exploring new opportunities together in the future.
- Mature managed Kubernetes offering. We make heavy use of Kubernetes to run our applications. While other cloud providers have recently added managed Kubernetes services, Google’s GKE has been in production since 2014. Considering we expect to see our Kubernetes footprint expand dramatically over the next few years, we wanted a rock-solid platform to manage Kubernetes for us. This alone will save our DevOps team a lot of time, directly addressing some of their current pain points.
- Google’s Global Network. Google’s world-class fiber network was a major selling point. We benefit significantly from the massive infrastructure that Google has built for themselves, ensuring low latency for our publisher integrations from around the world. We no longer have to worry about scaling up bandwidth in order to grow our business.
What’s our GCP migration game plan?
We’re very early in the process, so we expect a lot of surprises. To minimize unpleasant ones while we’re working with teams, we’re doing as much as we can to plan ahead and front-load as many of the highest-risk tasks as we can. Because we have very little “cold” data, any strategy involving a very slow, asynchronous copy of data would be extraordinarily difficult to manage. We therefore plan to migrate team by team: whichever team is on deck modifies its system to first copy its input data from our colo into GCP, then run its normal processing. Once the upstream team has also migrated, it can remove this copy step and read its data directly from GCP.
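The handoff pattern above can be sketched roughly as follows. This is a minimal illustration, not our actual pipeline code: the function, paths, and the `upstream_migrated` flag are hypothetical, and local files stand in for colo and GCS storage.

```python
import shutil
from pathlib import Path


def run_pipeline(colo_input: Path, gcp_input: Path, upstream_migrated: bool) -> str:
    """Sketch of one team's job during the migration window.

    Until the upstream team has migrated, the job first copies its
    input from the colo into GCP, then processes as usual. Once the
    upstream team writes directly to GCP, the copy step is dropped.
    """
    if not upstream_migrated:
        # Temporary migration step: pull input from the colo into GCP.
        gcp_input.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(colo_input, gcp_input)
    # Normal processing always reads from the GCP copy, so removing
    # the copy step later requires no changes downstream of this line.
    return gcp_input.read_text().upper()  # stand-in for real processing
```

The point of the design is that the processing logic reads from the GCP location from day one; only the temporary copy step gets deleted once the upstream team finishes its own migration.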
We are preparing each team for migration by giving them access to the resources and expertise of our infrastructure team and Google’s professional services engineers. We will also be holding training sessions, both in our offices and on-site at Google, focused on GCP offerings and other topics that will help teams proactively prepare their applications for migration.
This is a significant undertaking, not just for our engineering team but for our company as a whole. It’s a great opportunity to share what we’re up to with all LiveRampers and how it impacts their work. We hope it will open up new avenues of discussion and collaboration.
We’re still close to the starting line, but the GCP team has been great to work with, showing that they’re just as committed to meeting our deadlines as we are. They’re proactive about scheduling technical deep dives and game to work with us on our customization ideas. This is right in line with LiveRamp’s culture, and we couldn’t be happier to learn from and work with Google.
We’ve hit the ground running and look forward to sharing our progress here. If you’re interested in joining our team and working on challenging projects like our GCP migration, check out LiveRamp’s current job openings!