AI Content Creation

Our mission is to empower every person and every organization on the planet to create more! We aim to develop artificial intelligence and deep learning technologies that create a variety of content, such as text, speech, sound, music, and digital humans.

Our efforts include developing fundamental methodologies for content creation and designing dedicated models for each creation task:

Fundamental Methodologies for Content Creation
- Regeneration Learning, a learning paradigm for data generation and the counterpart to representation learning for data understanding.
- Transformer Architecture, enhancing the Transformer with advanced architecture designs or memory/retrieval mechanisms to ease data generation.
- Diffusion Models, focusing on inference speedup and on application to discrete data generation.
Speech Generation
- FastSpeech & FastSpeech 2: Widely adopted text-to-speech synthesis models.
- NaturalSpeech: Human-level quality on text-to-speech synthesis.
- NaturalSpeech 2: Zero-shot speech and singing synthesis using continuous codec and latent diffusion model.
- TTS Survey, TTS Tutorials, and the NeuralSpeech GitHub repo.
- A book on Neural TTS.
Music Generation
- Muzic: A research project on AI music understanding and generation.
- DeepRapper, TeleMelody: Songwriting, including lyric generation and lyric-to-melody generation.
- MeloForm, Museformer: Melody generation with structure modeling.
- HiFiSinger: High-fidelity singing voice synthesis.
Digital Human Generation
- GAIA: A research project on Generative AI for Avatar.
- HiFace: 3D face reconstruction.
- StableFace: Talking-face video generation without jittering.
- MemFace: Alleviating the one-to-many mapping problem using memories.
- DAE-Talker: High-fidelity talking-face generation using a diffusion autoencoder.
Text Generation
- MASS: The first pre-trained language model for sequence-to-sequence generation.
- Human-Parity on Machine Translation: Human-level quality on Chinese-English news translation.

We are hiring both FTE researchers and interns at Microsoft Research Asia! If you are interested in working with us on AI content creation, machine learning, machine translation, NLP, speech, music, or digital human creation, please contact Xu Tan at xuta@microsoft.com.