Principal AI/ML Engineer, Reliability

Roblox

San Mateo, CA, United States
Onsite (Tuesday, Wednesday, Thursday; optional presence on Monday and Friday)
Roblox is seeking a Principal AI/ML Engineer, Reliability, to improve the platform's performance and reliability through machine learning. The role calls for a strategic thinker with machine learning expertise, particularly in improving the reliability of production systems, and involves collaboration across teams.

Job Summary

  • This role involves setting the 3-5 year technical strategy to leverage machine learning for improving platform reliability.
  • The engineer will own the roadmap for detecting issues proactively using massive data streams like logs, traces, and metrics.
  • Candidates must be comfortable working in undefined problem spaces while providing structure and decisive direction to teams.

Matching Summary

Match Score: 85

Salary

Base: $295,250 - $345,040 USD; Bonus/Equity: Equity compensation eligible; Benefits: Full-time benefits as described on careers page

Skills & Requirements

Must-have

  • Machine Learning Engineering expertise
  • Distributed systems fundamentals
  • Real-time anomaly detection capabilities
  • Time-series modeling for capacity
  • Root cause reasoning layer development

Nice-to-have

  • Comfortable with ambiguity
  • Pragmatic builder mindset
  • Inspiring leadership skills
  • Executive communication abilities
  • Curious and creative problem solver

Key Requirements

  • Expert knowledge of various modeling techniques beyond off-the-shelf solutions
  • Ability to architect infrastructure for systems that learn from feedback
  • Understanding of large-scale high-throughput distributed systems

Work Rights

Not specified
