AI Inference Engineer

Not specified
Proficiency in Python, C++, Rust, or Golang
Experience with vLLM, TGI, NVIDIA Triton
Hardware optimization for NVIDIA GPUs, TPUs
F5 is seeking an AI Inference Engineer to optimize Large Language Models (LLMs) and enhance AI capabilities in various environments, focusing on performance and scalability. The ideal candidate will have expertise in high-performance AI workflows, infrastructure technologies, and hardware optimization.

Job Summary

  • The AI Inference Engineer plays a critical role by bridging the gap between high-performance model development and optimized deployment environments.
  • This position focuses on optimizing Large Language Models for inference across diverse environments ranging from GPU-rich data centers to resource-constrained edge devices.
  • Success is measured by the ability to deliver low-latency, scalable, and high-performing AI prediction systems that align with strategic objectives.

Matching Summary

Match Score: 85

Skills & Requirements

Must-have

  • Proficiency in Python, C++, Rust, or Golang
  • Experience with vLLM, TGI, NVIDIA Triton
  • Hardware optimization for NVIDIA GPUs, TPUs
  • Kubernetes, Docker, and cloud infrastructure expertise
  • Designing auto-scaling inference architectures

Nice-to-have

  • Experience with speculative decoding, PagedAttention
  • Contributions to open-source inference libraries
  • Background in MLOps or SRE roles
  • CUDA or Triton kernel development skills
  • Apple Silicon and Core ML optimization experience

Key Requirements

  • Proficiency in Python, C++, Rust, or Golang
  • Hands-on experience with vLLM, TensorRT, llama.cpp, Ollama
  • Strong familiarity with Docker, Kubernetes, AWS, GCP, Azure
  • Comprehensive understanding of GPU and AI hardware profiling

Work Rights

Not specified
