Author:	Ivan A. Uemlianin
Contact:	i.uemlianin@bangor.ac.uk
Copyright:	2006, University of Wales, Bangor

Introduction

What is HTK? What is PyHTK?: TODO

Requirements

The only requirements are Python, HTK and SpeechCluster. Note that HTK requires registration, but the software is free of charge and models built with HTK can be commercialised.

Download

pyhtk can be downloaded from here. The individual components can be downloaded from here.

Further Work

A list of useful features to add, in no particular order:

Integrate iteration into pyhtk (i.e. instead of using the horrible hack utils/pyhtkIter.sh).
Test pyhtk in more challenging contexts (e.g. recognition beyond forced alignment, ATK, HTS).

Usage

Getting Help

Typing pyhtk.py -h will get the following usage summary:

pyhtk Usage summary
===================

pyhtk.py [options]

Option        Function
===========   ========
-s            Sets up for building an acoustic model
-b            Sets up and builds an acoustic model
-a <hmmdir>   Forced alignment using the hmm in <hmmdir>
-c1           Clean up log and model files ready for another build
-c2           Clean up everything except model files

Building an Acoustic Model

Make a directory for your HTK Project, and copy pyhtk.py and your wav and lab files into it. For example, using the bash shell:

$ mkdir <myHTKProject>
$ cp pyhtk.py <myHTKProject>
$ cd <myHTKProject>
$ mkdir wav
$ cp /path/to/wav/files wav
$ mkdir lab
$ cp /path/to/lab/files lab
n.b.:

The wav files can use any sample rate you like, as long as they all use the same one.

The lab files can be in any format supported by SpeechCluster (this includes esps, TextGrid and htk-lab formats; see the SpeechCluster docs for details), and they don't all have to be in the same format.
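
The sample-rate condition can be checked before running pyhtk.py. The following is a minimal sketch, not part of pyhtk: it uses Python's standard wave module, assumes the files are uncompressed PCM with a .wav extension, and the function names sample_rates and check_consistent are made up for this example.

```python
import wave
from pathlib import Path


def sample_rates(wav_dir):
    """Map each .wav file name in wav_dir to its sample rate in Hz."""
    rates = {}
    for path in sorted(Path(wav_dir).glob("*.wav")):
        # wave.open reads the header of uncompressed PCM wav files
        with wave.open(str(path), "rb") as wav_file:
            rates[path.name] = wav_file.getframerate()
    return rates


def check_consistent(wav_dir):
    """Return the shared sample rate, or raise ValueError if files differ."""
    distinct = set(sample_rates(wav_dir).values())
    if len(distinct) > 1:
        raise ValueError(f"mixed sample rates in {wav_dir}: {sorted(distinct)}")
    # None means the directory held no .wav files at all
    return distinct.pop() if distinct else None
```

Calling check_consistent("wav") from the project directory returns the common rate, and raises an error naming the rates found if the files disagree.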

Run pyhtk.py. These are the options:

pyhtk.py -s

This will set things up, but not actually build the acoustic model (AM). This is useful if you want to check your setup.

pyhtk.py -b

This will set things up and build the model.
Log files in HTML format are saved in log/. There are tables of contents, and errors are highlighted in red.
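
If you want to drive these steps from another Python script, something like the sketch below may help. It only uses the options listed in the usage summary; the helper names pyhtk_command and run_pyhtk, and the interpreter parameter, are assumptions made for this example and are not part of pyhtk. Since pyhtk dates from 2006, you may need to point interpreter at a Python 2 executable.

```python
import subprocess

# Options taken from the usage summary printed by `pyhtk.py -h`
KNOWN_OPTIONS = {"-s", "-b", "-a", "-c1", "-c2"}


def pyhtk_command(option, hmmdir=None, interpreter="python", script="pyhtk.py"):
    """Build the argument list for one pyhtk.py invocation."""
    if option not in KNOWN_OPTIONS:
        raise ValueError(f"unknown pyhtk option: {option}")
    # Only -a takes an argument (the hmm directory)
    if (option == "-a") != (hmmdir is not None):
        raise ValueError("hmmdir is required for -a and only for -a")
    command = [interpreter, script, option]
    if hmmdir is not None:
        command.append(hmmdir)
    return command


def run_pyhtk(option, hmmdir=None, project_dir=".", interpreter="python"):
    """Run pyhtk.py inside project_dir and return its exit code."""
    command = pyhtk_command(option, hmmdir, interpreter)
    return subprocess.run(command, cwd=project_dir).returncode
```

For example, run_pyhtk("-s") followed by run_pyhtk("-b") mirrors the two commands above. Whether pyhtk.py reports failure through its exit code is not documented here, so check the log files in either case.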

Forced Alignment

TODO: What is forced alignment?

Given a fresh, clean HTK AM, as built by pyhtk.py, copy pyhtk.py and your wav and lab files into its directory. For example, using the bash shell:

$ cp pyhtk.py <myHTK_AM>
$ cd <myHTK_AM>
$ mkdir wav
$ cp /path/to/wav/files wav
$ mkdir lab
$ cp /path/to/lab/files lab
n.b.:

The wav files can use any sample rate you like, as long as they all use the same one.

The lab files can be in any format supported by SpeechCluster (this includes esps, TextGrid and htk-lab formats; see the SpeechCluster docs for details), and they don't all have to be in the same format.
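
Before aligning, it can be worth checking that the wav and lab directories line up. The sketch below rests on an assumption this document does not confirm: that each wav file is paired with a lab file of the same base name. Only base names are compared, because lab files may carry different extensions depending on their format. The function name unmatched_files is made up for this example.

```python
from pathlib import Path


def unmatched_files(wav_dir="wav", lab_dir="lab"):
    """Return (wav base names with no lab file, lab base names with no wav file)."""
    # Path.stem drops the final extension, so a.wav and a.TextGrid both give "a"
    wav_stems = {p.stem for p in Path(wav_dir).iterdir() if p.is_file()}
    lab_stems = {p.stem for p in Path(lab_dir).iterdir() if p.is_file()}
    return sorted(wav_stems - lab_stems), sorted(lab_stems - wav_stems)
```

Two empty lists mean every file has a partner under that assumption.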

Run pyhtk.py with the -a option:

pyhtk.py -a <hmmdir>

Here <hmmdir> is the directory of the hmm you want to use. This will set things up and do the alignment.

Log files in HTML format are saved in log/. There are tables of contents, and errors are highlighted in red.
Results are saved in results/.
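
To see what a run has produced, you can list the contents of the two output directories. This is a small sketch rather than anything pyhtk provides; the function name list_outputs is made up for this example, and the only things taken from this document are the directory names log and results.

```python
from pathlib import Path


def list_outputs(project_dir="."):
    """Return the names of the entries under log/ and results/ in project_dir."""
    outputs = {}
    for name in ("log", "results"):
        directory = Path(project_dir) / name
        if directory.is_dir():
            outputs[name] = sorted(p.name for p in directory.iterdir())
        else:
            # A missing directory just means nothing has been written there yet
            outputs[name] = []
    return outputs
```

Running list_outputs() from the project directory after an alignment gives a quick overview of the log and result files without opening a file browser.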

Recognition: TODO

Advanced: TODO

Cleaning up afterwards

pyhtk.py -c1: Cleans up everything apart from the wav and lab files, ready to have another go. n.b.: will remove a model if one has been built.
pyhtk.py -c2: Cleans up everything apart from the model itself (i.e., including wavs and labs).

pyhtk: A python wrapper for HTK