Senior Engineer - AI Inference

Bank of America (GHR)

Multiple Locations
Gen AI inferencing capabilities
Python development on Linux
vLLM/Triton Inference Server deployment

Job Summary

  • Join a groundbreaking team at Bank of America, at the forefront of innovation in AI, building the next-generation Gen AI platform.
  • This position focuses on designing, building, and serving Gen AI inferencing capabilities, and is responsible for defining and leading the engineering approach for complex features that deliver significant business outcomes.
  • We value curiosity, collaboration, and a passion for pushing the boundaries of what’s possible with AI, offering opportunities to learn, grow, and make an impact.

Skills & Requirements

Must-have

  • Gen AI inferencing capabilities
  • Python development on Linux
  • vLLM/Triton Inference Server deployment
  • Model Ops and design
  • RAG for knowledge bases

Nice-to-have

  • Curiosity and collaboration
  • Pushing boundaries of AI
  • Challenging conventions
  • Agile practices
  • Stakeholder management

Key Requirements

  • Minimum 8 years of relevant experience
  • Experience in Model Ops and design
  • Hands-on experience in Python development on Linux
  • Experience deploying models using vLLM/Triton Inference Server
  • Experience serving model endpoints to multiple tenants/clients

Work Rights

Not specified
