Site Reliability Engineer

Nightwing

Sterling, VA, United States

Job Summary

  • The SRE works closely with contract leadership, Platform teams, and the Sponsor to refine the operational and technical strategy for automating key portions of IT operations.
  • Monitor and track metrics, logs, and traces across all services in the system/network, and provide the context needed to identify root causes when an incident, performance degradation, or availability issue occurs.
  • You’ll have the opportunity to work alongside talented individuals who are passionate about what they do.

Skills & Requirements

Must-have

  • Automate IT operations
  • Log analysis and performance tuning
  • Linux/Unix systems administration
  • Network protocols and infrastructure
  • Automation tools like Ansible
  • Monitoring and logging systems

Nice-to-have

  • Cloud platform experience
  • Containerization technologies
  • DevOps principles and practices
  • Data analysis and visualization

Key Requirements

  • TS/SCI Poly clearance required
  • Proficiency in Python, Go, Java, or JavaScript
  • Experience with configuration management
  • Experience with monitoring tools
  • Experience with cloud platforms
  • Experience with containerization technologies

Work Rights

TS/SCI Poly clearance
