As part of the company's regular engagement with the wider coding community, Etsy engineers Maggie Zhou and Melissa Santos recently told an audience at O'Reilly's OSCON open-source programming conference in Amsterdam exactly how Etsy successfully updates its technology to meet growing data demands.
The talk was one of OSCON's sessions on collaboration in software teams. Entitled 'Surviving technology transitions: Adding and (more importantly) removing tools from an existing stack', it highlighted a few case studies from Etsy that showed how the team approaches software reliability testing and data migration.
The Etsy team uses open-source software and is committed to keeping its coding practices transparent.
Zhou is a software engineer on Etsy's core platform team and has previously held similar posts at Google and IBM. She maintains the company's infrastructure.
Santos, a data engineering manager at Etsy with a PhD in applied math, teaches engineers and non-technical staff there how to retrieve the data they need for analytics. She has over 10 years' experience with data modeling and with extract, transform, and load (ETL) processes.
The most important tool that a data engineering team should have, Zhou and Santos argue, is a process for both upgrading to new software and removing old packages.
Devising and adhering to a defined tech transformation workflow helps outsiders understand and accept the engineering team's decisions.
They also say that better communication helps overcome the technical and political problems that come with adding and removing technologies in the workplace.
Here are the Etsy team's other guidelines for successfully transitioning technology in an existing data stack:
Evaluate the upgrade's significance carefully
First, list the requirements of the new software. Next, list its benefits to the existing data operation, and then assess other options.
Finally, review the stack's architecture and expected operation. As an example, Zhou and Santos cited a new ETL process that the Etsy team built on top of an open-source messaging system.
The team listed several failure scenarios and wrote down the data stack's expected behavior, which formed the basis for validation testing. Because the new process has a distributed architecture, Etsy can now analyze its data events in near real time.
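To make the idea of turning written-down failure scenarios into validation tests concrete, here is a minimal sketch in Python. It is not Etsy's code: the `load_events` function, the scenario names, and the expected counts are all hypothetical stand-ins for a real pipeline stage and its documented behavior.

```python
import json

def load_events(raw_messages):
    """Hypothetical loader: parse JSON messages, skip bad ones, de-duplicate by id."""
    seen = {}
    rejected = 0
    for raw in raw_messages:
        try:
            event = json.loads(raw)
            event_id = event["id"]
        except (ValueError, KeyError, TypeError):
            # Malformed JSON, a missing "id" field, or a non-object payload.
            rejected += 1
            continue
        # Keep the first copy of each event; later duplicates are ignored.
        seen.setdefault(event_id, event)
    return list(seen.values()), rejected

# Each scenario pairs a failure mode with the behavior the team expects.
# Fields: name, input messages, expected loaded count, expected rejected count.
SCENARIOS = [
    ("duplicate delivery", ['{"id": 1}', '{"id": 1}'], 1, 0),
    ("malformed message", ['{"id": 1}', "not json"], 1, 1),
    ("missing id field", ['{"name": "x"}'], 0, 1),
    ("empty batch", [], 0, 0),
]

def run_scenarios():
    """Return the names of scenarios whose observed behavior differs from the expectation."""
    failures = []
    for name, messages, want_loaded, want_rejected in SCENARIOS:
        loaded, rejected = load_events(messages)
        if (len(loaded), rejected) != (want_loaded, want_rejected):
            failures.append(name)
    return failures

if __name__ == "__main__":
    print(run_scenarios() or "all scenarios behave as documented")
```

Keeping the scenarios in a plain table means the written expectations and the tests stay in one place, so a new failure mode is added by appending one row.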
Upgrade comfortably by extensively testing the new technology
Last year, Etsy wanted to increase the amount of traffic its API could support 20-fold. So when it heard Facebook's engineering team had successfully sped up its back-end performance with its HipHop Virtual Machine (HHVM) software, Etsy decided to give it a try, too. The key was to "gain confidence" in HHVM's ability to run correctly. So Etsy rolled it out slowly, only running HHVM with Etsy's internal APIs at first.
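A slow rollout of this kind is often implemented as a routing rule that sends only selected traffic to the new runtime. The sketch below is an illustration under stated assumptions, not a description of Etsy's deployment: the function name `use_new_runtime`, the `percent_external` knob, and the hash-based bucketing are all hypothetical. The talk only states that HHVM first ran with internal APIs.

```python
import hashlib

def use_new_runtime(request_id, is_internal, percent_external=0):
    """Decide whether a request should be served by the new runtime.

    Internal traffic always goes to the new runtime. External traffic is
    admitted gradually: each request id hashes to a stable bucket in 0..99,
    and buckets below percent_external are routed to the new runtime.
    """
    if is_internal:
        return True
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent_external

if __name__ == "__main__":
    # Stage 1: internal only. Stage 2 (assumed): 10% of external requests.
    for stage, percent in (("internal only", 0), ("10% external", 10)):
        routed = sum(
            use_new_runtime(f"req-{i}", is_internal=False, percent_external=percent)
            for i in range(1000)
        )
        print(stage, "->", routed, "of 1000 external requests on the new runtime")
```

Hashing the request id rather than choosing at random keeps the decision stable for a given id, which makes it easier to compare behavior between the old and new runtimes while confidence builds.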
Retiring old tech brings up political issues in the team
Etsy wanted to port part of its data-stack code from one language to another. The change required an extensive data migration and surfaced even more data-stack consolidation tasks. In the end, the language change became part of a much larger two-year infrastructure overhaul.
Perhaps the largest takeaway from Zhou and Santos's talk was that no transition is immune to outages and postmortems.
All of Etsy's upgrades and retirements draw attention to lesser-used parts of its data stack as well as to limitations in its backbone software.
As Etsy navigates its successes and failures, it will continue to share its know-how with the coding community. Etsy's open-source spirit may also help improve its programming craft.