Job Summary
Partner with Research Scientists to turn research ideas into working systems, building the data, tooling, and infrastructure that enable rapid iteration, trustworthy evaluation, and a smooth path from prototype to production.
Focus on two research areas: World Models for Observability and Trained Agents for Observability, tackling high-risk, high-reward problems grounded in real-world challenges.
Collaborate with Research Scientists, Product, and Engineering to integrate capabilities into Datadog's products and contribute to research publications at top-tier conferences.
Skills & Requirements
Must-have
multimodal data pipelines
training and evaluation infrastructure
distributed training with Ray
large-scale model training
Python and systems language proficiency
Nice-to-have
strong software engineering skills
passion for pushing AI boundaries
customer impact and scalable deployment
bridging research prototypes to products
Key Requirements
depth in distributed computing, RL infrastructure, and ML systems
practical experience implementing and operating ML systems
experience with large-scale model training and fine-tuning
experience supporting or contributing to research publications