Machine Learning - Speech

Treeleaf
Full-time, ASR, Speaker, TTS

JOB RESPONSIBILITIES

  • Work in a high-growth company.
  • Build, improve and extend speech models, which can include speech-to-text, text-to-speech or speech analysis models.
  • Understand and implement state-of-the-art academic research papers and apply novel algorithms to large volumes of real-life data.
  • Work closely on the product delivery roadmap, taking it from development to production in collaboration with engineers, researchers, technical leads and architects.
  • Help the team to improve upon current methods and models.
  • Have a practical mindset and be able to bring these models into a production environment.

REQUIREMENTS AND QUALIFICATIONS

  • Master’s degree or PhD in computer science, mathematics, engineering or computational linguistics, or a minimum of 2-3 years of work experience.
  • Demonstrable experience in deep learning, speech processing and NLP (e.g. Kaggle competitions or spare-time projects).
  • Understanding of signal processing with application to speech and audio processing.
  • Experience with acoustic modeling and language modeling.
  • Good knowledge of and experience with Python and/or C/C++.
  • Strong linguistic background and analytical mindset.
  • Practical mindset: willing to get your hands dirty, with an understanding of the difference between fundamental research and data-driven development.
  • Ability to work independently and take matters into your own hands.
  • The ability to quickly learn new technologies and successfully implement them is essential.

PREFERRED

  • Working knowledge of TensorFlow or Keras.
  • Experience building or working with an automatic speech recognition (ASR) toolkit such as Kaldi or DeepSpeech is considered a strong plus.
  • Expertise in some of the following speech tasks: speech-to-text, text-to-speech, emotion recognition, personality recognition or speaker diarization.
  • Good understanding of, or hands-on experience with, speech preprocessing, noise-robust speech processing, normalization techniques and speech-related techniques (e.g. HMM, weighted FST, Viterbi, ...).
  • Fluency in phonetics and in making phonetic transcriptions.
  • Git.
  • JIRA or similar agile tools.
  • Comfortable working in a Linux environment.