Senior Engineer - AI Inference

Bank of America (GHR)

Not specified; not specified; not specified
In-office with flexibility based on role-specific considerations
8+ years software development experience
Python development on Linux environment
vLLM or Triton Inference Server expertise
Bank of America is seeking a Senior Engineer for AI Inference to join its team building a next-generation Gen AI platform. The role emphasizes collaborative design, implementation, and delivery of complex AI features, and requires extensive experience in AI/ML technologies and strong problem-solving skills.

Job Summary

  • This role is responsible for defining and leading the engineering approach for complex features to deliver significant business outcomes within a groundbreaking Gen AI platform team.
  • The position requires hands-on experience deploying and performance-tuning models using tools like vLLM and Triton Inference Server to ensure high throughput and scalability.
  • Candidates must possess strong analytical skills to challenge conventions, solve problems, and manage multiple priorities while engaging with global teams and business stakeholders.

Matching Summary

Match Score: 85

Salary

Not specified; Not specified; Not specified

Skills & Requirements

Must-have

  • 8+ years software development experience
  • Python development on Linux environment
  • vLLM or Triton Inference Server expertise
  • Vector store platforms such as Redis and FAISS
  • CI/CD pipeline implementation and management
  • Model monitoring for drift and KPIs
  • RAG framework design and implementation

Nice-to-have

  • Experience with open-source Gen AI models
  • Strong stakeholder management skills
  • Ability to mentor and coach team members
  • Knowledge of Policy as Code systems
  • Background in secure multi-tenant architecture
  • Familiarity with MCP modules for enterprise data

Key Requirements

  • Minimum 8 years of relevant experience required
  • Hands-on Python development on Linux
  • Experience with Model Ops and AI/ML delivery
  • Proficiency in modern open-source data science platforms
  • Understanding of fundamental algorithms and code optimization

Work Rights

Not specified
