Senior Site Reliability Engineer / SRE (m/f)
Wayfair is a leader in the e-commerce space for all things home. We live and breathe modern technologies. We are a “move fast, break things, rethink old standards” team with a startup feel that works with platforms at massive scale.
We’re looking for smart, logical thinkers who produce and advocate for performant and scalable architecture. We care about thought leadership, community involvement, and the ever-changing SRE landscape. We’re particularly interested in engineers who can help us develop our platform scaling and configuration management strategy, and who can help us adopt, implement, and support mainstream configuration management platforms such as HashiCorp Consul, Puppet, and HashiCorp Vault in our existing infrastructure, improving automation and ease of use for both internal and external stakeholders.
As a Senior Site Reliability Engineer working on configuration management in the Platform Scaling team, you’ll have a multitude of opportunities to flex your strengths and learn new things while directly assisting our internal customers. We contribute to (and create) bleeding-edge open source projects and continuously push the envelope to explore the future of e-commerce and modern infrastructure systems. Our current scale is 20,000+ systems comprising 50+ platforms and services (and growing fast!) across multiple global geo locales and GCP regions.
What You’ll Do:
- Manage central platforms as a service, built for rapid growth and scale, that enable a developer community of 2,000 to write and deploy code multiple times a day
- Develop monitoring and define SLAs, SLOs, and error budgets for mission-critical platforms while helping coordinate product launches and reliability exercises
- Write clean, high-performance, and well-tested infrastructure code with a focus on reusability and automation (Shell, Python, Go, Puppet)
- Help determine the future roadmap of platforms and services in service discovery, configuration orchestration, and secret management
- Create and maintain detailed documentation for both self-service and onboarding
- Help build out our team by mentoring junior engineers and developing their skills while assisting them on projects
What You’ll Need:
- Experience in systems and/or software engineering and with SRE and DevOps practices
- Experience in one or more programming languages used in modern infrastructure work (Ruby, Python, Go, PHP, etc.), as well as familiarity with version control systems such as Git
- Experience working with configuration and orchestration management tools (Puppet, Ansible, HashiCorp Consul and HashiCorp Vault)
- Experience deploying and managing infrastructure within a public cloud provider as part of a hybrid environment with high availability requirements
- Expertise in performance testing tools and SRE best practices
Wayfair believes everyone should live in a home they love. Through technology and innovation, Wayfair makes it possible for shoppers to quickly and easily find exactly what they want from a selection of more than 10 million items across home furnishings, décor, home improvement, housewares and more. Committed to delighting its customers every step of the way, Wayfair is reinventing the way people shop for their homes - from product discovery to final delivery.
Wayfair generated $5.7 billion in net revenue for the twelve months ended June 30, 2018. Headquartered in Boston, Massachusetts with operations throughout North America and Europe, the company employs more than 9,700 people.