The Machine Learning Platform (MLP) provides the foundation for all of this innovation. It offers ML/AI practitioners across Netflix the means to achieve the highest possible impact with their work by making it easy to develop, deploy and improve their machine learning models.
As part of our mission to support the infrastructure for machine learning across the company, we are hiring for a Machine Learning Platform Reliability Engineer to join our team to define and lead practices for reliability, build & release productivity and platform observability. In this role, you will:
- Rapidly onboard and take ownership of ML Platform repositories, build processes and delivery integrations with critical partners who ship member-scale Netflix personalization systems daily.
- Identify, plan, design and execute important migrations across build & release tools, source control management and platform observability solutions.
- Establish platform SLOs and enable them through instrumentation, logging and dashboarding to detect sources of developer productivity friction and identify investments that improve efficiency.
- Develop strategic plans, requirements and solution designs for consolidating build & release infrastructure across multiple related machine learning teams.
- Participate in and improve ML Platform incident management and support workflows.
What we offer
Responsibility. Netflix offers true transparency and autonomy. Our culture is unique and is key to how we innovate. From day one, your expertise and opinion will be respected and valued by the team and you’ll be given autonomy in deciding the best direction to set for build & release initiatives in the team.
Learning. You will be creating ML developer tools and user experiences that have never been done before. You will have the opportunity to work with stunning colleagues who value collaboration and have a wealth of experience you can tap into.
A work environment where you can grow your career. ML Platform offers a wide variety of projects that can help you find the areas you are passionate about.
Who will be successful in this role?
- You are highly customer-driven / developer-driven and empathic. You strive to always focus on delivering customer / user value with an excellent customer service mentality.
- You have a track record with detailed experience in delivering solutions for large-scale multi-language monorepo build, release and productivity tooling. You create solutions that your stakeholders love and you drive development success from planning to implementation to delivery.
- You are eager to both go deep and wide on ML-facing projects. When a project needs deep technical expertise in a domain area you are able to get up to speed quickly. When projects require breadth of focus you are eager to do what’s needed to deliver value even if it means going outside of your comfort zone.
Skills
- Demonstrated industry-leading experience in large-scale build, release, CI/CD and observability techniques, with particular emphasis on multi-language environments including Scala, Java, and Python.
- Strong understanding of modern monorepo tooling and practices, including familiarity with Bazel, Gradle, Jenkins, and GitHub.
- Strong technical writing ability (RFCs, memos, technical presentations). Ability to motivate shared buy-in and understanding. Ability to drive pragmatic decisions and disagree & commit when necessary.
- Advocacy for operational best practices and experience establishing necessary observability, logging, reporting and on-call processes to support engineering excellence.
- Experience with (or strong interest in) designing developer tools, CLIs and APIs that interact with complex user workflows including data and auth systems, notebooks & IDEs, compute orchestration, scientific computing or other resource-intensive backend applications.
Netflix provides comprehensive benefits including Health Plans, Mental Health support, a 401(k) Retirement Plan with employer match, Stock Option Program, Disability Programs, Health Savings and Flexible Spending Accounts, Family-forming benefits, and Life and Serious Injury Benefits. We also offer paid leave of absence programs. Full-time hourly employees accrue 35 days annually for paid time off to be used for vacation, holidays, and sick paid time off. Full-time salaried employees are immediately entitled to flexible time off. See more detail about our Benefits here.
Netflix is a unique culture and environment. Learn more here.
We are an equal-opportunity employer and celebrate diversity, recognizing that diversity of thought and background builds stronger teams. We approach diversity and inclusion seriously and thoughtfully. We do not discriminate on the basis of race, religion, color, ancestry, national origin, caste, sex, sexual orientation, gender, gender identity or expression, age, disability, medical condition, pregnancy, genetic makeup, marital status, or military service.