Building software at Etsy

Feb 25, 2016 • Lauren Sperber

Last week I had the pleasure of traveling to Etsy’s Dublin office to speak at CareerZoo, which is a large career fair. I gave two presentations about Etsy’s work culture: One called “Crafting an Empowering Environment,” which I’ll publish next week, and this one, which I titled “Building Software at Etsy.”

Huge thanks to Lara Hogan, Moishe Lettvin, Ian Malpass, Jeremy Pharo, Russ Posluszny, and Stefanie Schirmer for their feedback on early drafts of this talk, and to Jennifer Butler, Tara Hayes, Lara Hogan (again!), and Michael Rembetsy for providing the opportunities and support I needed to give it.

I illustrated my slides with photos from the shops of Irish Etsy sellers, whose work you’ll see in the transcript below with links to their shops.

See the slides

Transcript

The opinions expressed are mine and do not necessarily reflect those of Etsy. All stats referenced are valid as of September 30, 2015.

Hi everyone! My name is Lauren Sperber and I’m a senior software engineer at Etsy. If you’re not familiar with Etsy, we’re an online marketplace where people connect to buy and sell unique handmade and vintage goods. We help 1.5 million active sellers make a living by providing a platform that helps them run their creative small businesses.

I’m going to tell you about the culture of engineering at Etsy and shared values that enable our engineers to work together productively and happily. I’ll start by giving you an overview of my daily work life and then sharing the principles of Etsy engineering that help me work this way.

How I Work

working in the Etsy office

There are two main types of programming work at Etsy. Product engineering directly affects the features that buyers and sellers see on Etsy.com and our mobile apps, while infrastructure engineering creates the underlying architecture that product engineers work on. I mainly do product engineering work, so I’ll give you a sense of what the daily work of a product engineer looks like.

by CelticValleyCeramics

Every programmer at Etsy has a “Virtual Machine,” which is our own personal version of Etsy.com. It has all the same code as the real thing, but uses test data instead of the real data that our buyers and sellers are using to do business. I edit code on my Virtual Machine and make sure it works the way I’m hoping. Once I have finished the smallest part of the feature I’m making, such as an API endpoint to supply the data, or a controller to call the endpoint and send the data to views, I make a small commit.

When I’ve made a few small commits that together create a feature or resolve a bug, I make a pull request on GitHub and email it to the programmers on my team. My coworkers comment on the pull request with suggestions or questions and I continue pushing commits based on their comments to the review branch until we all feel good about the changes.

Next, I run our automated tests to make sure my changes didn’t mistakenly affect other areas of the site. Testing your code isn’t formally required at Etsy, but there is cultural motivation to do so. These tests aren’t comprehensive, but they ensure that the main features of Etsy still work after my change. They also verify I haven’t made any syntax errors that my coworkers might not have spotted in the code review.

from ooakie

A little background: Etsy has more features on it than anyone, even people who work for Etsy, can see at a given time. We have a configuration system in which we keep an array of all the features whose availability we want to be able to control easily, including features that are still under development.

My new code will be hidden behind a feature flag, so it’s unlikely that any buyers or sellers will be affected by my change. This configuration setup allows us to deploy our code as soon as it’s written and tested—a working style commonly called “continuous deployment.” Since we have about 200 programmers continuously deploying, the website gets updated more than 40 times a day!
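To make the idea of a feature flag concrete, here is a minimal sketch in Python. It is not Etsy's actual configuration system: the flag names, the `FEATURE_FLAGS` structure, and the `is_enabled` helper are all hypothetical and exist only for illustration.

```python
# Hypothetical flag names and settings, for illustration only.
FEATURE_FLAGS = {
    "new_checkout_button": {"enabled": "off"},  # still under development
    "shop_stats_redesign": {"enabled": "on"},
}


def is_enabled(flag_name, flags=FEATURE_FLAGS):
    """Return True only if the flag exists and is switched on."""
    setting = flags.get(flag_name)
    # Unknown flags default to off, so unfinished code stays hidden.
    return setting is not None and setting["enabled"] == "on"


def render_checkout(flags=FEATURE_FLAGS):
    # The new code is deployed, but it only runs when the flag is on.
    if is_enabled("new_checkout_button", flags):
        return "new checkout button"
    return "existing checkout button"
```

With a setup like this, turning a feature on is a one-line configuration change rather than a new round of feature code, which is what makes it safe to deploy unfinished work.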

To organize the many programmers trying to deploy at any given time, we use IRC, which is a group chat program. We have a channel called “Push” that creates a queue for us to deploy in groups. The first person in each deploy group is in charge of driving their push. If that person is you, you’ll simply click a button to deploy the code to our staging environment, which is called “Princess.” Princess has the same code and data as Etsy.com, but is hosted by a separate server, so we can test our new code there without affecting all of Etsy’s users.

Once we’ve all confirmed our changes on Princess and the automated tests pass, the push driver presses another button to deploy to the production site. We check our code there and watch the error logs and graphs until we’re sure that we haven’t caused any unexpected issues. Once the driver declares the push complete, the next group can deploy.
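The queueing behavior described above can be modeled with a toy sketch like the one below. This is only an illustration of "deploy in groups, first person drives," not the real tooling: the `PushQueue` class, its method names, and the `group_size` parameter are assumptions made up for the example.

```python
from collections import deque


class PushQueue:
    """Toy model of a deploy queue; group_size is a made-up parameter."""

    def __init__(self, group_size=3):
        self.group_size = group_size
        self.waiting = deque()

    def join(self, engineer):
        # Engineers are served in the order they joined.
        self.waiting.append(engineer)

    def next_group(self):
        """Pop the next deploy group; its first member drives the push."""
        group = []
        while self.waiting and len(group) < self.group_size:
            group.append(self.waiting.popleft())
        driver = group[0] if group else None
        return driver, group
```

For example, if three engineers join a queue with a group size of two, the first call to `next_group()` returns the first two engineers with the first as driver, and the second call returns the third engineer alone.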

by TheCuriousNeedle

Once my team has developed a new feature far enough to put it in front of our sellers, we have two main options for finding out how it works. One is creating a “prototype group” of sellers who we ask to test out the new feature and let us know how it works for them. This is helpful for getting qualitative feedback from our top sellers, who can leave the group if the revised feature isn’t working for them. Developing a new feature collaboratively with sellers in a prototype group helps us deepen their engagement with Etsy, and they often help us evangelize the new feature to other sellers as it gets released to more people.

The other option is to turn on new features for a small percentage of users to see how they affect rates of viewing, favoriting, purchasing, and other important metrics. This is helpful for getting quantitative feedback on how the changes affect users’ behavior on the site.
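One common way to turn a feature on for a small percentage of users is to hash each user into a stable bucket. The sketch below shows that general technique in Python. It is an assumption about how such a rollout could work, not a description of Etsy's system, and the function names `bucket_for` and `in_rollout` are hypothetical.

```python
import hashlib


def bucket_for(user_id, flag_name):
    """Map a user to a stable bucket from 0 to 99 for a given flag."""
    # Including the flag name means each experiment gets its own split.
    key = f"{flag_name}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest, 16) % 100


def in_rollout(user_id, flag_name, percent):
    """True if this user falls inside the first `percent` buckets."""
    return bucket_for(user_id, flag_name) < percent
```

Because the bucket depends only on the user and the flag, the same user sees the same version of the site on every visit, which keeps the metrics for each group comparable.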

After validating that a new feature has improved the experience for our buyers or sellers, we’ll make it available for all users, but continue to improve on it based on feedback and data.

Etsy’s Engineering Practices

Now that I’ve given you a sense of how product engineers build and release new features for Etsy, I want to talk about some of the major practices for building software at Etsy. The first two are deeper looks at some daily practices I mentioned in my overview of my daily work life.

Code Reviews

from Victorianaprint

I send almost every line of code I write out to my coworkers in a pull request on GitHub. Even if I make a one-line change to a feature flag configuration, I try to get at least one set of eyes on it before deploying so that there’s a chance for someone to catch a non-fatal typo or logical error. This is super easy to do by creating a GitHub gist of the change and pasting it in our team IRC channel.

For more significant change sets, such as creating a new API endpoint or a new set of controllers and views, the code review is even more important.

gif of a woman laughing evilly at a pink laptop

Reduce Ego

There’s no single correct solution to any programming problem, especially within an environment as complex as Etsy’s very large ten-year-old code base. By inviting my teammates to collaborate on my code, I’m working to ensure that the code I push out is the best overall solution for the problem I’m trying to solve, not just the one that I thought of first, or the one that I thought was the most clever.

Reduce Cleverness

Code reviews have an enormous impact in reducing excessive cleverness in code. One of the issues I look for when doing a code review for a coworker is any time I have to stop, scratch my head, and re-read the code to figure out what it’s doing. This is a possible indicator that the code could be simplified, or could re-use existing tools in our codebase.

gif of Scully from the X-Files reading a book about aliens

Increase Readability

Code gets written once and read many, many times over by other programmers looking to track down the source of a bug or add functionality to an existing feature. Code reviews help to ensure that classes, methods, and variables are named with an eye to long-term clarity. If the reviewer doesn’t understand what I meant with a variable name, there’s no way that the next person to read it will either—even if that next person is myself in a few months. Reviewers also often suggest opportunities to introduce type hints and comments to further clarify what the code is intending to do.
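As a small, made-up illustration of the kind of change a reviewer might suggest, compare the two functions below. Python is used here only for illustration, and both function names and the pricing logic are hypothetical. They compute the same discount, but the second one uses descriptive names, type hints, and a comment on what the arguments mean.

```python
def calc(a, b):
    # Before review: the reader has to guess what a and b stand for.
    return a * (1 - b)


def discounted_price(price_cents: int, discount_rate: float) -> int:
    """Return the price in cents after applying a fractional discount.

    discount_rate is a fraction between 0 and 1, so 0.25 means 25% off.
    """
    if not 0 <= discount_rate <= 1:
        raise ValueError("discount_rate must be between 0 and 1")
    return round(price_cents * (1 - discount_rate))
```

The second version is longer, but the next reader can tell what it expects and what it returns without tracing every caller.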

Increase Awareness

Perhaps the most important benefit of code reviews is increasing communal awareness of the code. If I’ve read the code that my teammates wrote carefully before they push it, I have more context to debug it if something goes wrong later, or to work with it to add new features.

Reading my colleagues’ code consistently also helps me learn how other programmers approach problems and expands the breadth of ideas at my disposal.

Continuous Deployment

by Kellan Elliott-McCrea

As I mentioned earlier, I deploy code as soon as I’ve gotten a code review and run integration tests. The live site is usually deployed more than 40 times a day.

If you’ve never worked this way, updating a site with millions of dollars in sales per day that frequently might sound a bit terrifying, but deploying so often actually helps us keep the website stable.

animated gif of Moss from The IT Crowd working while his desk is on fire

Reduce Fear

I personally update Etsy.com between 1 and 5 times per day, so it’s a normal part of my work day. Since I do it so frequently, I don’t feel nearly as nervous as I might when deploying another website that changes less frequently.

Reduce Risk

If something goes wrong, either due to a deployment or an unexpected situation, we can push a fix out for it almost as soon as we have it ready. Since the changesets we deploy are relatively small and contained, it’s easier to detect if something’s gone wrong without QAing the entire site.

animated gif of Beyonce flipping her hair

Increase Experience

Since all of our programmers deploy frequently, there are a lot of people who know how to help if something goes wrong.

Increase Confidence

By pushing out the smallest possible changes as frequently as possible, we ensure that our changes have a limited range of impact and are easy to test.

Architecture Reviews

by DublinStreet

The next two engineering practices I’ll discuss are higher-level practices that take place less frequently than code reviews and deployment. The first is Architecture Reviews. When a group of engineers feels that they need to implement a new architecture that departs in a significant way from existing technology or processes, they document their proposed design, highlighting the goals, trade-offs, and rejected alternatives, then hold workshops with other engineers from across the company to get feedback on the proposal. After the submitting engineers feel confident in their proposal, they hold an Architecture Review meeting that is open to all engineers at the company. Those proposing the new architecture present for only 10 minutes and allow 50 minutes of questions and answers from other engineers.

animated gif of a cat chasing a laser toy

Reduce Novelty

By going through this process for every new architectural decision, we ensure that we use the simplest and most well-known solution unless there is no other option. We avoid switching technologies purely because they are interesting or exciting.