Software Engineer L4/L5, Model Serving Systems, Machine Learning Platform

Netflix

USA, Remote
Fully remote
High-traffic distributed services experience
Object-oriented programming proficiency in Java
ML model deployment using Triton or TensorRT
Netflix is seeking a Software Engineer L4/L5 for its Model Serving Systems team within the Machine Learning Platform, focusing on developing scalable infrastructure to support AI and ML applications. The ideal candidate should have experience in building high-traffic distributed services, particularly for online ML model inference, and a strong background in object-oriented programming.

Job Summary

  • The role involves building highly scalable compute infrastructure to support Netflix's growing AI needs and model serving systems.
  • Engineers will partner with product managers, machine learning engineers, and data scientists to drive innovation in personalization and content delivery.
  • Netflix offers a compensation structure in which employees choose each year how to split their pay between salary and stock options, along with comprehensive benefits including flexible time off.

Matching Summary

Match Score: 85

Salary

  • Base: $466,000.00 - $750,000.00
  • Bonus/Equity: No bonuses; employees choose their annual salary vs. stock option ratio
  • Benefits: Comprehensive health plans, 401(k) match, stock options, flexible time off

Skills & Requirements

Must-have

  • High-traffic distributed services experience
  • Object-oriented programming proficiency in Java
  • ML model deployment using Triton or TensorRT
  • Public cloud platform expertise (AWS/Azure/GCP)
  • Latency reduction and cost optimization skills

Nice-to-have

  • Experience with generative models and LLMs
  • Strong cross-functional collaboration abilities
  • Proactive communication on observability practices
  • Interest in pushing AI algorithm boundaries
  • Ability to work across global time zones

Key Requirements

  • BS/MS in Computer Science or related field
  • Experience with online ML model inference at scale
  • Proficiency in performance tuning and capacity planning

Work Rights

Not specified
