Thursday, December 7 • 4:35pm - 5:10pm
101 Ways to Crash Your Cluster [I] - Marius Grigoriu & Emmanuel Gomez, Nordstrom

Running a Kubernetes cluster requires operating many components. One must be good at running and scaling etcd, multiple control plane components, a monitoring system, a logging pipeline, Docker, rkt, and Linux itself. And this list isn't even close to being complete. With such a long list of technologies comes the potential to make a mistake that brings the whole cluster down. Come hear war stories from Nordstrom's Kubernetes cluster admins. Each is a true story of how the cluster melted down, how they recovered, and what they did to prevent it from happening again. Don't let any of these happen to you...

Emmanuel Gomez

Principal Engineer, Nordstrom
Emmanuel initiated and served as tech lead on the Kubernetes platform efforts at Nordstrom for the last three years. He was working with and advocating for containers before the Kubernetes 1.0 release and has continuously (and tirelessly) developed, operated, educated, and led containerization...

Marius Grigoriu

Sr Manager, Nordstrom
Marius Grigoriu leads the teams responsible for all of the major tools along the software delivery pipeline: issue tracking, version control, continuous integration and deployment, and production through the use of Kubernetes. His focus is to help teams ship high-quality systems on...

Thursday December 7, 2017 4:35pm - 5:10pm CST
Ballroom B, Level 1