NVIDIA is seeking an Inference Optimization Architect for its Speech AI team. The role focuses on improving the performance of Speech AI models by reducing inference latency and improving resource utilization. The ideal candidate will have extensive experience in deep learning model optimization, particularly CUDA development and model serving.
Must-have
Nice-to-have
Not specified