Mike McGarr is an engineering manager at Netflix, overseeing a team responsible for the tools that Netflix engineers use to build and deploy code. It’s a critical function at the streaming media company — ensuring that engineers are as productive as possible, and armed with the automation tools they need, while also helping to manage the dependencies between different teams at the company.
It’s also a unique challenge, given the way teams work at Netflix, in a “highly aligned, loosely coupled” manner, with different cadences for deployment, and different priorities in the network.
McGarr will be sharing insights from his experience at Netflix as one of the speakers at the upcoming ALM Forum, a conference taking place from May 18-22 at the Bell Harbor Conference Center in Seattle. In an interview with GeekWire, he offered a preview. Continue reading for excerpts from his comments.
The Netflix Way of deploying code: The (product) teams take on the responsibility for managing the service and deploying the service and delivering a product to production. With that responsibility they have the freedom to choose the production platform that they want. Our responsibility then is to provide a compelling set of tools and services that allows them to do their job more effectively, and focus on the business logic and the functionality and not have to worry about the build and delivery pipeline of those services.
What’s unique about Netflix is that we’re very much a free market economy, internally, as far as tooling goes. A lot of organizations will say, ‘You’re going to use this tool and everybody is going to conform to this way of doing things.’ We don’t have that challenge. It’s unique.
The importance of ‘distributed innovation’: Putting one team in charge of innovation will bottleneck your innovation. Teams may say that they have the bandwidth to experiment with this new way of deploying. Some teams have failed trying to do that and have come back to the paved road (working with tools from the internal Netflix team). Some teams have been very successful, and we’ll say, “That’s a great lesson learned, let’s incorporate that.”
Working with Amazon Web Services: Almost every single open-source tool that we have has been built on top of AWS. Our experience has been that AWS works great. Any tools we build on top of AWS have been to make sure that we can get AWS to work the way we want it to work for Netflix.
An example of this is Asgard. We built Asgard because AWS does not have a definition of an application. We wanted two sets of ASGs (Auto Scaling Groups) to be defined as two different versions of the same application, and there’s no way to do this within AWS. So we devised a system on top of AWS, to provide this construct for our engineers.
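To make the "application" construct concrete, here is a minimal sketch of how several ASGs might be grouped as versions of one application. It is not Asgard's code: the `<application>-v<NNN>` naming convention, the function name `group_asgs_by_application`, and the sample group names are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical naming convention (an assumption, not taken from the interview):
# an ASG is named "<application>-v<NNN>", e.g. "api-v002".
_VERSIONED_NAME = re.compile(r"^(?P<app>.+)-v(?P<version>\d+)$")


def group_asgs_by_application(asg_names):
    """Map each application name to its ASG names, ordered by version number."""
    groups = defaultdict(list)
    for name in asg_names:
        match = _VERSIONED_NAME.match(name)
        if match is None:
            # No version suffix: treat the whole name as the application, version 0.
            groups[name].append((0, name))
        else:
            groups[match.group("app")].append((int(match.group("version")), name))
    # Sort each application's groups by version and keep only the names.
    return {app: [n for _, n in sorted(entries)] for app, entries in groups.items()}


if __name__ == "__main__":
    print(group_asgs_by_application(["api-v002", "api-v001", "search"]))
```

With a grouping like this, two ASGs can be presented to engineers as two versions of the same application, even though the underlying platform only knows about individual groups.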
Another example is the process we use for baking. We’ve developed a tool called aminator which allows us to take application code and do what we call baking — which means taking that code and putting it into an AMI (Amazon Machine Image) and storing that in Amazon. You can do that through scripting, but we built this tool to enable our engineers, every time they deploy code, to take that application code into an AMI, store that AMI, and every deployment is not deploying new code, per se, but it’s deploying a whole new OS image into an ASG and scaling that ASG up.
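The bake-then-deploy flow described above can be sketched as a small in-memory model. This is an illustration only and does not use aminator or any AWS API: `MachineImage`, `ScalingGroup`, `bake`, and `deploy` are hypothetical names, and the image ID is just a hash invented for the example.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineImage:
    """Hypothetical stand-in for an AMI: a base OS plus one application version."""
    image_id: str
    base_os: str
    app_version: str


@dataclass
class ScalingGroup:
    """Hypothetical stand-in for an ASG launched from a single image."""
    name: str
    image: MachineImage
    desired_instances: int = 0


def bake(base_os, app_name, app_version):
    """'Bake' the application into an image; the ID here is a made-up hash."""
    digest = hashlib.sha256(f"{base_os}:{app_name}:{app_version}".encode()).hexdigest()[:8]
    return MachineImage(image_id=f"img-{digest}", base_os=base_os, app_version=app_version)


def deploy(app_name, image, sequence, instances):
    """Deploy by creating a new group from the baked image and scaling it up."""
    return ScalingGroup(name=f"{app_name}-v{sequence:03d}", image=image,
                        desired_instances=instances)


if __name__ == "__main__":
    image = bake("linux-base", "api", "1.4.2")
    group = deploy("api", image, sequence=2, instances=3)
    print(group.name, group.image.app_version, group.desired_instances)
```

The point of the model is that a deployment never modifies running machines: each release produces a new image, and a new group is started from that image.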
Keys to working with product teams: It really is about breaking down the walls, and making sure that someone on your team is working with your customers closely. We’re exploring the idea of how we embed with our customers. We have a lot of collaboration and a lot of meetings, but we’re looking into taking that further and putting one of our engineers into a service team and saying, ‘Deliver a feature, but work with them.’ You get a really good insight into how that team operates and the challenges that they have using your own code, that they wouldn’t have communicated. You see it in day-to-day interaction. That’s an example of something we’re moving towards.
How to succeed in this environment: The biggest part is, if you don’t consistently deliver, or if you deliver the wrong thing, your customers won’t trust you, they won’t come to you — it will be a downward spiral of not being able to get the right information from your customers, because they don’t believe you’ll be able to deliver. It’s about communication, it’s about outreach, but it’s also about building trust.
Editor’s Note: ALM Forum is a GeekWire advertiser.