Hao Tang 唐顥

Papers

Identifying the minimal and maximal phonetic subspace of speech representations
Xingwen Han, Hao Tang
ICASSP, 2026
A framework for analyzing concept representations in neural models
Burin Naowarat, Hao Tang, Sharon Goldwater
CoNLL, 2026
Learning speech representations with variational predictive coding
Sung-Lin Yeh, Peter Bell, Hao Tang
Transactions of the ACL, 2026
Speech-FT: Merging pre-trained and fine-tuned speech representation models for cross-task generalization
Tzu-Quan Lin, Wei-Ping Huang, Hao Tang, Hung-yi Lee
IEEE Transactions on Audio, Speech, and Language Processing, 2025
Identifying speaker information in feed-forward layers of self-supervised speech Transformers
Tzu-Quan Lin, Hsi-Chun Cheng, Hung-yi Lee, Hao Tang
APSIPA, 2025
End-to-end long document summarization using gradient caching
Rohit Saxena, Hao Tang, Frank Keller
Transactions of the ACL, 2025
Whisper has an internal word aligner
Sung-Lin Yeh, Yen Meng, Hao Tang
ASRU, 2025
Is smaller always faster? Tradeoffs in compressing self-supervised speech Transformers
Tzu-Quan Lin, Tsung-Huan Yang, Chun-Yao Chang, Kuang-Ming Chen, Tzu-hsun Feng, Hung-yi Lee, Hao Tang
ASRU, 2025
Effective context in neural speech models
Yen Meng, Sharon Goldwater, Hao Tang
Interspeech, 2025
Estimating the completeness of discrete speech units
Sung-Lin Yeh, Hao Tang
SLT, 2024
A simple HMM with self-supervised representations for phone segmentation
Gene-Ping Yang, Hao Tang
SLT, 2024
Property neurons in self-supervised speech Transformers
Tzu-Quan Lin, Guan-Ting Lin, Hung-yi Lee, Hao Tang
SLT, 2024
DAISY: Data adaptive self-supervised early exit for speech representation models
Tzu-Quan Lin, Hung-yi Lee, Hao Tang
Interspeech, 2024
Orthogonality and isotropy of speaker and phonetic information in self-supervised speech representations
Mukhtar Mohamed, Oli Liu, Hao Tang, Sharon Goldwater
Interspeech, 2024
A predictive learning model can simulate temporal dynamics and context effects found in neural representations of continuous speech
Oli Liu, Hao Tang, Naomi Feldman, Sharon Goldwater
CogSci, 2024
(Computational Modeling Prize for Perception & Action)
Hierarchical indexing for retrieval-augmented opinion summarization
Tom Hosking, Hao Tang, Mirella Lapata
Transactions of the ACL, 2024
Spell4TTS: Acoustically-informed spellings for improving text-to-speech pronunciations
Jason Fong, Hao Tang, Simon King
Speech Synthesis Workshop (SSW), 2023
Towards matching phones and speech representations
Gene-Ping Yang, Hao Tang
ASRU, 2023
MelHuBERT: A simplified HuBERT on Mel spectrograms
Tzu-Quan Lin, Hung-yi Lee, Hao Tang
ASRU, 2023
Acoustic word embeddings for untranscribed target languages with continued pretraining and learned pooling
Ramon Sanabria, Hao Tang, Sharon Goldwater
Interspeech, 2023
Self-supervised predictive coding models encode speaker and phonetic information in orthogonal subspaces
Oli Liu, Hao Tang, Sharon Goldwater
Interspeech, 2023
Improving Seq2Seq TTS frontends with transcribed speech audio
Siqi Sun, Korin Richmond, Hao Tang
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023
Attributable and scalable opinion summarization
Tom Hosking, Hao Tang, Mirella Lapata
ACL, 2023
Learning dependencies of discrete speech representations with neural hidden Markov models
Sung-Lin Yeh, Hao Tang
ICASSP, 2023
Analyzing acoustic word embeddings from pre-trained self-supervised speech models
Ramon Sanabria, Hao Tang, Sharon Goldwater
ICASSP, 2023
Conditioning and sampling in variational diffusion models for speech super-resolution
Chin-Yun Yu, Sung-Lin Yeh, Gyorgy Fazekas, Hao Tang
ICASSP, 2023
On compressing sequences for self-supervised speech models
Yen Meng, Hsuan-Jui Chen, Jiatong Shi, Shinji Watanabe, Paola Garcia, Hung-yi Lee, Hao Tang
SLT, 2022
Autoregressive predictive coding: A comprehensive study
Gene-Ping Yang, Sung-Lin Yeh, Yu-An Chung, James Glass, Hao Tang
IEEE Journal of Selected Topics in Signal Processing, 2022
Phonetic analysis of self-supervised representations of English speech
Dan Wells, Hao Tang, Korin Richmond
Interspeech, 2022
Speech audio corrector: Using speech from non-target speakers for one-off correction of mispronunciations in grapheme-input text-to-speech
Jason Fong, Daniel Lyth, Gustav Eje Henter, Hao Tang, Simon King
Interspeech, 2022
Autoregressive co-training for learning discrete speech representation
Sung-Lin Yeh, Hao Tang
Interspeech, 2022
Hierarchical sketch induction for paraphrase generation
Tom Hosking, Hao Tang, Mirella Lapata
ACL, 2022
Supervised attention in sequence-to-sequence models for speech recognition
Gene-Ping Yang, Hao Tang
ICASSP, 2022
On the difficulty of segmenting words with attention
Ramon Sanabria, Hao Tang, Sharon Goldwater
Workshop on Insights from Negative Results in NLP, 2021
Vector-quantized autoregressive predictive coding
Yu-An Chung, Hao Tang, James Glass
Interspeech, 2020
(best student paper award)
Audio-visual calibration with polynomial regression for 2-D projection using SVD-PHAT
Francois Grondin, Hao Tang, James Glass
ICASSP, 2020
A deep residual network for large-scale acoustic scene analysis
Logan Ford, Hao Tang, Francois Grondin, James Glass
Interspeech, 2019
An unsupervised autoregressive model for speech representation learning
Yu-An Chung, Wei-Ning Hsu, Hao Tang, James Glass
Interspeech, 2019
VoiceID loss: speech enhancement for speaker verification
Suwon Shon, Hao Tang, James Glass
Interspeech, 2019
Time-contrastive learning based deep bottleneck features for text-dependent speaker verification
Achintya Kr. Sarkar, Zheng-Hua Tan, Hao Tang, Suwon Shon, James Glass
IEEE Transactions on Audio, Speech, and Language Processing, 2019
On the inductive bias of words in acoustics-to-word models
Hao Tang, James Glass
arXiv:1810.13407
On training recurrent networks with truncated backpropagation through time in speech recognition
Hao Tang, James Glass
SLT, 2018
Frame-level speaker embeddings for text-independent speaker recognition and analysis of end-to-end model
Suwon Shon, Hao Tang, James Glass
SLT, 2018
A study of enhancement, augmentation, and autoencoder methods for domain adaptation in distant speech recognition
Hao Tang, Wei-Ning Hsu, Francois Grondin, James Glass
Interspeech, 2018
Unsupervised adaptation with interpretable disentangled representations for distant conversational speech recognition
Wei-Ning Hsu, Hao Tang, James Glass
Interspeech, 2018
End-to-end neural segmental models for speech recognition
Hao Tang, Liang Lu, Lingpeng Kong, Kevin Gimpel, Karen Livescu, Chris Dyer, Noah A. Smith, Steve Renals
IEEE Journal of Selected Topics in Signal Processing, 2017
Lexicon-free fingerspelling recognition from video: data, models, and signer adaptation
Taehwan Kim, Jonathan Keane, Weiran Wang, Hao Tang, Jason Riggle, Gregory Shakhnarovich, Diane Brentari, Karen Livescu
Computer Speech and Language, 2017
Multitask learning with low-level auxiliary tasks for encoder-decoder based speech recognition
Shubham Toshniwal, Hao Tang, Liang Lu, Karen Livescu
Interspeech, 2017
ASR for under-resourced languages from probabilistic transcription
Mark Hasegawa-Johnson, Preethi Jyothi, Daniel McCloy, Majid Mirbagheri, Giovanni di Liberto, Amit Das, Bradley Ekin, Chunxi Liu, Vimal Manohar, Hao Tang, Edmund C. Lalor, Nancy Chen, Paul Hager, Tyler Kekona, Rose Sloan, Adrian KC Lee
IEEE Transactions on Audio, Speech, and Language Processing, 2017
End-to-end training approaches for discriminative segmental models
Hao Tang, Weiran Wang, Kevin Gimpel, Karen Livescu
SLT, 2016
Efficient segmental cascades for speech recognition
Hao Tang, Weiran Wang, Kevin Gimpel, Karen Livescu
Interspeech, 2016
Triphone state-tying via deep canonical correlation analysis
Weiran Wang, Hao Tang, Karen Livescu
Interspeech, 2016
Adapting ASR for under-resourced languages using mismatched transcriptions
Chunxi Liu, Preethi Jyothi, Hao Tang, Vimal Manohar, Rose Sloan, Tyler Kekona, Mark Hasegawa-Johnson, Sanjeev Khudanpur
ICASSP, 2016
(speech and language processing student paper award)
Signer-independent fingerspelling recognition with deep neural network adaptation
Taehwan Kim, Weiran Wang, Hao Tang, Karen Livescu
ICASSP, 2016
(best student paper of speech and language processing)
Discriminative segmental cascades for feature-rich phone recognition
Hao Tang, Weiran Wang, Kevin Gimpel, Karen Livescu
ASRU, 2015
(best paper nominee)
A comparison of training approaches for discriminative segmental models
Hao Tang, Kevin Gimpel, Karen Livescu
Interspeech, 2014
Log-linear dialog manager
Hao Tang, Shinji Watanabe, Tim K. Marks, John Hershey
ICASSP, 2014
Discriminative pronunciation modeling: a large-margin feature-rich approach
Hao Tang, Joseph Keshet, Karen Livescu
ACL, 2012
An initial attempt for phoneme recognition using structured SVM
Hao Tang, Chao-Hong Meng, Lin-Shan Lee
ICASSP, 2010
Spoken term detection from bilingual spontaneous speech using code-switched lattice-based structure for words and subword units
Hung-yi Lee, Yueh-Lien Tang, Hao Tang, Lin-Shan Lee
ASRU, 2009
Query term selection strategies for web-based Chinese factoid question answering
Hao Tang, Cheng-Wei Lee, Tian-Jian Jiang, Wen-Lian Hsu
TAAI, 2006