Gene-Ping Yang

Research Scientist at Meta.

I am currently a Research Scientist at Meta.

I completed my PhD in Informatics at the University of Edinburgh, where I spent a wonderful time at the Centre for Speech Technology Research (CSTR) with Prof. Hao Tang and Prof. Peter Bell. My research focuses on self-supervised pre-training, speech tokenization, and speech-text alignment to uncover the underlying patterns and geometry of speech representations.

I received my master's degree in Computer Science and my undergraduate degree in Electrical Engineering from National Taiwan University, where I built my foundation in speech and had the pleasure of working with Prof. Lin-shan Lee and Prof. Hung-yi Lee on speech separation and enhancement.

Self-Supervised Pre-Training: Cross-modal alignment of self-supervised speech and text representations.
Speech Tokenization: Joint segmentation and discretization of continuous speech into discrete tokens.
Automatic Speech Recognition: Optimizing implicit speech-text alignment for robust recognition systems.

Selected Publications

  • Distributed Asynchronous Device Speech Enhancement via Windowed Cross-Attention
    Gene-Ping Yang, Sebastian Braun. WASPAA 2025.
    [bib] [abstract]

  • A Simple HMM with Self-Supervised Representations for Phone Segmentation
    Gene-Ping Yang, Hao Tang. SLT 2024.
    [bib] [abstract]

  • On-Device Constrained Self-Supervised Learning for Keyword Spotting via Quantization Aware Pre-Training and Fine-tuning
    Gene-Ping Yang, Yue Gu, Sashank Macha, Qingming Tang, Yuzong Liu. ICASSP 2024.
    [bib] [abstract]

  • Towards Matching Phones and Speech Representations
    Gene-Ping Yang and Hao Tang. ASRU 2023.
    [bib] [abstract]

  • On-device Constrained Self-Supervised Speech Representation Learning for Keyword Spotting via Knowledge Distillation
    Gene-Ping Yang, Yue Gu, Qingming Tang, Dongsu Du, Yuzong Liu. Interspeech 2023.
    [bib] [abstract]

  • Autoregressive Predictive Coding: A Comprehensive Study
    Gene-Ping Yang, Sung-Lin Yeh, Yu-An Chung, James Glass and Hao Tang. JSTSP 2022.
    [bib] [abstract]

  • Supervised Attention In Sequence-to-Sequence Models for Speech Recognition
    Gene-Ping Yang and Hao Tang. ICASSP 2022.
    [bib] [abstract]

  • Supervised Attention In Sequence-to-Sequence Models for Speech Recognition
    Gene-Ping Yang and Hao Tang. MLSLP 2021.
    [bib] [abstract]

  • Stabilizing Label Assignment for Speech Separation by Self-supervised Pre-training
    Sung-Feng Huang, Shun-Po Chuang, Da-Rong Liu, Yi-Chen Chen, Gene-Ping Yang, Hung-yi Lee. Interspeech 2021.
    [bib] [abstract]

  • Interrupted and Cascaded Permutation Invariant Training for Speech Separation
    Gene-Ping Yang, Szu-Lin Wu, Yao-Wen Mao, Hung-yi Lee, Lin-shan Lee. ICASSP 2020.
    [bib] [abstract]

  • Improved Speech Separation with Time-and-Frequency Cross-domain Joint Embedding and Clustering
    Gene-Ping Yang, Chao-I Tuan, Hung-yi Lee, Lin-shan Lee. Interspeech 2019.
    [bib] [abstract]