Join the Cerebras Inference Team to help develop a unique combination of software and hardware that delivers the best inference characteristics on the market while running the largest models available.
Job Summary
Join the Cerebras Inference Team to help develop a unique combination of software and hardware that delivers the best inference characteristics on the market while running the largest models available.
Your responsibilities will include working on the model representation, optimization, and compilation stack to produce the best results on Cerebras' current and future platforms.
Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs, offering industry-leading training and inference speeds.
Skills & Requirements
Must-have
Python and PyTorch internals
Computational graphs and tensor operations
Compiler or ML graph optimization frameworks
PyTorch and HuggingFace Transformers
Large Language Models (LLMs)
C++ programming skills
MLIR-based compilation stack
Nice-to-have
PyTorch, TensorFlow XLA, TVM, ONNX Runtime
Hardware accelerators and quantization
Multi-target inference compilation
Numerical precision trade-offs
Open-source ML compiler contributions
Key Requirements
Degree in Engineering, Computer Science, or equivalent experience
Strong Python programming skills
In-depth experience with PyTorch internals
Solid understanding of computational graphs
Experience building or extending compilers
Experience working with PyTorch and HuggingFace Transformers
Knowledge and experience working with Large Language Models