Job Summary
GEICO is seeking an experienced Engineer with a passion for building high-performance, low-maintenance, zero-downtime platforms and applications.
The Senior Staff Engineer in Availability and Incident Management will design and deploy machine learning systems that enable intelligent incident detection, automated root cause analysis, and predictive reliability improvements across the platform.
We offer compensation and benefits built to enhance your physical well-being, mental and emotional health, and financial future.
Salary
$110,000.00 - $230,000.00
Skills & Requirements
Must-have
multi-agent AI platform
LLMs and agentic frameworks
vector database solutions
end-to-end ML pipelines
site reliability engineering
large-scale distributed systems
Nice-to-have
psychological safety and continuous improvement
proactive reliability improvements
agent-to-agent collaboration
explainability, governance, and human-in-the-loop controls
Key Requirements
10+ years of professional platform development
8+ years of experience with architecture and design
6+ years of experience building and deploying machine learning systems
6+ years of experience in open-source frameworks
4+ years of experience with AWS, GCP, Azure
2+ years of experience with LLMs and agentic AI frameworks
Bachelor’s degree in Computer Science, Information Systems, or equivalent