Fine-grained multi-faceted control of prosodic features for TTS systems – Internship

France
NAVER LABS Europe

DESCRIPTION

Recent work on neural TTS systems has produced various techniques for fine-grained control of generated speech features such as prosody, emotion, and speaker identity. However, most of these controls are learned on specific datasets with different methods, which makes it nontrivial to build a single TTS system that can control all of these features at once.

The objective of this internship is to create a single learning procedure that covers various speech features, so that one TTS model can control speaker identity, emotions, prosodic focus, etc.

The intern will work in collaboration with a team of NLP and Speech experts to design and build the model. The intern will contribute to the design of the model, the selection of relevant datasets, and the definition of the experimental settings. He or she will implement the unified model and conduct experiments to evaluate its performance.

REQUIRED SKILLS

  • PhD or research master's student in speech processing, NLP or machine learning with an interest in language technologies
  • Expertise in deep learning applied to speech processing and/or NLP
  • Strong knowledge of PyTorch and strong programming skills
  • Knowledge of the SpeechBrain speech processing toolkit is a plus

APPLICATION INSTRUCTIONS

Please note that applicants must be registered students at a university or other academic institution and that this establishment will need to sign an 'Internship Convention' with NAVER LABS Europe before the student is accepted.

You can apply for this position online. Don't forget to upload your CV and cover letter before you submit. Incomplete applications will not be accepted.

ABOUT NAVER LABS

NAVER LABS is a world-class team of self-motivated and highly engaged researchers, engineers and interface designers who collaborate to create next-generation ambient intelligence technology and services that are rich with the organic understanding they have of users, their contexts and situations.

Since 2013, NAVER LABS has led NAVER’s innovation in technology through products such as the AI-based translation app ‘Papago’, the omni-tasking web browser ‘Whale’, the virtual AI assistant ‘WAVE’, the in-vehicle infotainment system ‘AWAY’ and M1, the 3D indoor mapping robot.

The team in Europe is multidisciplinary and highly multicultural, specializing in artificial intelligence, machine learning, computer vision, natural language processing, UX and ethnography. We collaborate with many partners in the European scientific community on R&D projects.

NAVER LABS Europe is located in Grenoble, in the southeast of France. Grenoble is known for its exceptional natural environment and its scientific ecosystem, with 21,000 jobs in public and private research. It is home to MIAI (Multidisciplinary Institute in Artificial Intelligence), one of the four French national institutes in AI. It has a large student community (over 62,000 students) and is a lively and cosmopolitan place, offering a host of leisure opportunities. Grenoble is close to both the Swiss and Italian borders and is the ideal place for skiing, hiking, climbing, hang gliding and all types of mountain sports.