Research
My research focuses on speech synthesis and deep learning, with a particular emphasis on analyzing and manipulating speech. Some of my notable papers are highlighted.
FS-NCSR: Increasing Diversity of the Super-Resolution Space via Frequency Separation and Noise-Conditioned Normalizing Flow
Ki-Ung Song*, Dongseok Shim*, Kang-wook Kim*, Jae-young Lee, Younggeun Kim
NTIRE CVPRW, 2022
arXiv
2nd place on the NTIRE Learning Super-Resolution Space Challenge 4X track and 1st place on the 8X track.
Talking Face Generation with Multilingual TTS
Hyoung-Kyu Song*, Sang Hoon Woo*, Junhyeok Lee, Seungmin Yang, Hyunjae Cho, Youseong Lee, Dongho Choi, Kang-wook Kim
CVPR Demo Track (Round 1), 2022
arXiv / Demo
Assem-VC: Realistic Voice Conversion by Assembling Modern Speech Synthesis Techniques
Kang-wook Kim, Seung-won Park, Junhyeok Lee, Myun-chul Joe
To appear in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022
project page / arXiv / github
Controllable and Interpretable Singing Voice Decomposition via Assem-VC
Kang-wook Kim, Junhyeok Lee
NeurIPS Workshop on ML for Creativity and Design, 2021 (Oral Presentation [top 6.2%])
project page / arXiv / github / bibtex
We propose a controllable singing decomposition system that encodes time-aligned linguistic content, pitch, and source speaker identity via Assem-VC.