This page, while far from beautiful at present, collects several tools that I developed at the University of Edinburgh and EURECOM. All tools are free for research use under the GPL unless explicitly stated otherwise.
Any feedback is very welcome, and I can always be reached at one of the following email addresses:
Time-domain Gammatone cepstral coefficients (GFCC), provided by Jun Qi, Dept. of EE, Tsinghua Univ., China.
"Auditory feature based on Gammatone filters for robust speech recognition", ISCAS 2013.
n-gram FST indexing v0.1
This FSTi toolkit contains a set of indexing tools developed at CSLT, Tsinghua University. The main purpose of FSTi is to provide a quick and easy way to construct an entire spoken term detection (STD) system when combined with some standard tools, including:
HTK from Cambridge: http://htk.eng.cam.ac.uk/
lattice-tool from SRI: http://www-speech.sri.com/projects/srilm/manpages/lattice-tool.1.html
FSTi provides three integration approaches, any of which can be used to construct a complete, practical STD system:
1. HTK + lat2fst: standard FST-based indexing [2,3] (liblse from BUT required).
2. HTK + lattice-tool + ridx: standard n-gram indexing.
3. HTK + lattice-tool + ngram2fst: n-gram-based FST indexing.
For more details, please refer to the following Interspeech paper:
Chao Liu, Dong Wang, "N-gram FST indexing for spoken term detection", Interspeech 2012.
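To give a feel for what n-gram indexing means in an STD setting, here is a minimal toy sketch. It is not the FSTi implementation: it assumes each utterance is a single phone sequence, whereas a real system indexes n-grams extracted from lattices together with times and posteriors. The names `build_ngram_index` and `search` and the example phone strings are hypothetical.

```python
from collections import defaultdict


def build_ngram_index(utterances, n=3):
    """Map each phone n-gram to the set of (utterance id, start position) pairs."""
    index = defaultdict(set)
    for utt_id, phones in utterances.items():
        for i in range(len(phones) - n + 1):
            index[tuple(phones[i:i + n])].add((utt_id, i))
    return index


def search(index, query, n=3):
    """Return sorted (utterance id, start position) pairs where the query occurs contiguously."""
    if len(query) < n:
        raise ValueError("query must contain at least n phones")
    grams = [tuple(query[i:i + n]) for i in range(len(query) - n + 1)]
    hits = set()
    # Start from occurrences of the first n-gram, then require every
    # following n-gram to occur at the matching offset in the same utterance.
    for utt_id, pos in index.get(grams[0], ()):
        if all((utt_id, pos + k) in index.get(g, ()) for k, g in enumerate(grams)):
            hits.add((utt_id, pos))
    return sorted(hits)


# Toy data (assumed for illustration only).
utterances = {
    "utt1": "s p iy ch r eh k".split(),
    "utt2": "t er m d ih t eh k t".split(),
}
index = build_ngram_index(utterances)
print(search(index, "d ih t eh k".split()))
```

The query is matched by chaining overlapping n-grams, so only the index is consulted at search time and the original sequences are never rescanned.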
We publish the heterogeneous CNSC code together with an example speech separation task. For details, please refer to:
Heterogeneous convolutive non-negative sparse coding. Dong Wang, Javier Tejedor, submitted to Interspeech.
This folder contains the following directories:
1. olcnsc. The core of online convolutive non-negative sparse coding, extended with heterogeneous learning.
2. utest. An example speech separation task. It includes a basic invocation example and two scripts that demonstrate how to search for optimal base distributions.
3. util. Utility scripts that assist utest.
You can use and distribute this code freely for research purposes. The authors take no responsibility for any damage caused by running the code.
Comments, questions, and bug reports are particularly welcome.
This package contains my MATLAB code for the online learning approach to convolutive non-negative sparse coding (OLCNSC). Refer to the following Interspeech 2011 paper:
Dong Wang, Nicholas Evans, "Online Pattern Learning for Convolutive Non-negative Sparse Coding"
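For readers new to the convolutive model, the sketch below shows the reconstruction step that convolutive non-negative factorization is generally built on: the signal matrix is approximated by a sum of basis slices, each multiplied by a time-shifted copy of the activations. This is a plain Python illustration of the generic model only. It is not the MATLAB code in this package, it contains no learning or sparsity term, and the function names and toy matrices are assumptions.

```python
def shift_right(H, p):
    """Shift every row of H by p columns to the right, zero-filling on the left."""
    return [[0.0] * min(p, len(row)) + row[:max(len(row) - p, 0)] for row in H]


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def reconstruct(W, H):
    """Compute V_hat = sum over p of W[p] @ shift_right(H, p).

    W is a list of P basis slices, each F x R; H is the R x T activation matrix.
    """
    n_freq, n_time = len(W[0]), len(H[0])
    V = [[0.0] * n_time for _ in range(n_freq)]
    for p, W_p in enumerate(W):
        term = matmul(W_p, shift_right(H, p))
        for f in range(n_freq):
            for t in range(n_time):
                V[f][t] += term[f][t]
    return V


# Toy example: one basis pattern spanning two frames (P=2, F=2, R=1, T=3).
W = [[[1.0], [0.0]], [[0.0], [2.0]]]
H = [[1.0, 0.0, 3.0]]
V_hat = reconstruct(W, H)
print(V_hat)
```

Each activation in `H` triggers the whole multi-frame pattern stored across the slices of `W`, which is what distinguishes the convolutive model from ordinary non-negative factorization.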
Lattice search tool for STD
This is the STD search tool that I used for my PhD thesis. It reads HTK lattices and outputs detections in MLF format. The tool was based on the lattice search tool provided by BUT; however, we rewrote the framework and extended it to support stochastic pronunciation modelling (SPM), discriminative confidence (MLP and SVM), and direct posterior confidence estimation.
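As a rough illustration of the output format, the sketch below writes a few detections as an HTK master label file (MLF). It follows the general HTK label conventions (a `#!MLF!#` header, a quoted file pattern per utterance, times in 100 ns units, and a lone `.` closing each entry). The exact fields this tool emits are not documented here, so the `.rec` extension, the score column, and the function names are assumptions.

```python
def to_htk_time(seconds):
    """Convert seconds to HTK time units of 100 ns."""
    return int(round(seconds * 1e7))


def format_mlf(detections):
    """Render detections as MLF text.

    detections maps an utterance id to a list of
    (start_seconds, end_seconds, term, score) tuples.
    """
    lines = ["#!MLF!#"]
    for utt_id in sorted(detections):
        lines.append(f'"*/{utt_id}.rec"')
        for start, end, term, score in sorted(detections[utt_id]):
            lines.append(f"{to_htk_time(start)} {to_htk_time(end)} {term} {score:.4f}")
        lines.append(".")  # a single period ends each utterance entry
    return "\n".join(lines) + "\n"


# Hypothetical detection, for illustration only.
example = {"utt1": [(0.30, 0.55, "edinburgh", 0.8731)]}
mlf_text = format_mlf(example)
print(mlf_text, end="")
```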
Refer to the following papers:
Dong Wang, Simon King, Joe Frankel, "Stochastic Pronunciation Modelling for Out-of-Vocabulary Spoken Term Detection", IEEE Transactions on Audio, Speech, and Language Processing, 2011.
Dong Wang, Javier Tejedor, Simon King, Joe Frankel, "Term-dependent Confidence Normalization for Out-of-Vocabulary Spoken Term Detection", Journal of Computer Science and Technology, 2012.
Dong Wang, Simon King, Joe Frankel, et al., "Direct Posterior Confidence Estimation for Spoken Term Detection", ACM Transactions on Information Systems, 2012.