Research Scientist – Speech and Audio Understanding (Large Models & Multimodal Systems)

Tencent

Bellevue, Washington, US

Job Summary

  • The role involves building large-scale, native multimodal model systems for comprehensive perception and understanding.
  • You will contribute to key research areas including multilingual automatic speech recognition and speech synthesis.
  • Tencent fosters an environment where diverse voices fuel innovation and support individual and common goals.

Salary

Base: $122,500.00 to $229,700.00 per year; Bonus/Equity: Not specified; Benefits: Medical, dental, vision, life and disability benefits

Skills & Requirements

Must-have

  • large speech models development
  • speech representation learning
  • deep learning frameworks proficiency

Nice-to-have

  • multimodal alignment experience
  • state-of-the-art performance
  • experience with large-scale training

Key Requirements

  • Ph.D. or Master's degree with experience
  • solid understanding of speech signal processing
  • experience with ASR, TTS, or speech translation

Work Rights

Not specified
