Tutorial
PyHHMM
implements different versions of the hidden Markov models (HMMs).
HMMs are probabilistic generative models, in which a sequence of observable
variables \(\mathbf{O}\) is generated by a sequence of internal hidden
states \(\mathbf{Z}\). The hidden states are not observed directly.
The transitions between hidden states are assumed to have the form of a
(first-order) Markov chain. They can be specified by the start probability
vector \(\boldsymbol{\pi}\) and a transition probability matrix
\(\mathbf{A}\). The emission probability of an observable can be any
distribution with parameters \(\mathbf{B}\) conditioned on the
current hidden state. The HMM is completely determined by \(\boldsymbol{\lambda} = (\boldsymbol{\pi}, \mathbf{A}, \mathbf{B})\).
In order to have a useful HMM model for a real application, there are three basic problems that must be solved:
Problem 1 (Likelihood): Given an HMM \(\boldsymbol{\lambda} = (\boldsymbol{\pi}, \mathbf{A}, \mathbf{B})\) and an observation sequence \(\mathbf{O}\), determine the likelihood \(P(\mathbf{O}|\boldsymbol{\lambda})\).
Problem 2 (Decoding): Given an observation sequence \(\mathbf{O}\) and an HMM \(\boldsymbol{\lambda} = (\boldsymbol{\pi}, \mathbf{A}, \mathbf{B})\), discover the best hidden state sequence \(\mathbf{Z}\).
Problem 3 (Learning): Given an observation sequence \(\mathbf{O}\) and the set of states in the HMM, learn the HMM parameters \(\boldsymbol{\lambda}\)
The first and the second problem can be solved by the dynamic programming
algorithms known as the Viterbi algorithm and the Forward-Backward algorithm, respectively. The last one can be solved by an iterative Expectation-Maximization (EM) algorithm, known as the Baum-Welch algorithm.
The state-space of the HMMs is discrete. At the same time, the observations can be discrete from a multinomial distribution, continuous from a Gaussian distribution or heterogeneous, hence both discrete and continuous. The models can deal with missing data without requiring imputation before training.
Building HMM and generating samples
You can build a HMM instance by passing the parameters described above to the
constructor. Then, you can generate samples from the HMM by calling
sample()
.
>>> import numpy as np
>>> from pyhhmm.gaussian import GaussianHMM
>>>
>>> model = GaussianHMM(n_states=3, n_emissions=2, covariance_type='diagonal')
>>> model.pi = np.array([0.6, 0.3, 0.1])
>>> model.A = np.array([[0.7, 0.2, 0.1],
... [0.3, 0.5, 0.2],
... [0.3, 0.3, 0.4]])
>>> model.means = np.array([[0.0, 0.0], [3.0, -3.0], [5.0, 10.0]])
>>> model.covars = 0.5 * np.ones((3, 2))
>>>
>>> lengths = np.random.randint(25, 150, size=25)
>>> X = [ model.sample(n_sequences=1, n_samples=10)[0] for n_samples in lengths]
Training HMM parameters and inferring the hidden states
You can train an HMM by calling the train()
method. The input is a list of observation sequences (aka samples).
Note, since the EM algorithm is a gradient-based optimization method, it will
generally get stuck in local optima. You should in general try to run the training
with various initialisations and select the highest scored model. This can be done by setting
the n_init parameter.
The inferred optimal hidden states can be obtained by calling
predict()
method. The predict
method can be
specified with a decoder algorithm. Currently the Viterbi algorithm
(viterbi
), and maximum a posteriori estimation (map
) are supported.
>>> remodel = GaussianHMM(n_states=3, n_emissions=2, covariance_type='diagonal')
>>> remodel, log_likelihood = remodel.train(X, n_init=3, n_iter=50, conv_thresh=0.001, conv_iter=5)
>>> Z2 = remodel.predict(X)
Some utility functions are provided to print the HMM in tabular form:
>>> from pyhhmm.utils import pretty_print_hmm
>>> pretty_print_hmm(remodel, hmm_type='Gaussian')
Fixing parameters
Each HMM parameter has a character code which can be used to customise its
initialisation and estimation. The EM algorithm needs a starting point to
proceed, thus prior to training each parameter is assigned a value
either random or computed from the data. It is possible to hook into this
process and provide a starting point explicitly. To do so
ensure that the character code for the parameter is missing from
init_params
and then
set the parameter to the desired value.
For example, consider a HMM with an explicitly initialized transition
probability matrix:
>>> model = hmm.GaussianHMM(n_components=3, n_iter=100, init_params='smc')
>>> model.A = np.array([[0.7, 0.2, 0.1],
... [0.3, 0.5, 0.2],
... [0.3, 0.3, 0.4]])
A similar trick applies to parameter estimation. If you want to fix some
parameter at a specific value, remove the corresponding character from
tr_params
and set the parameter value before training.
Semi-supervised training
Using the HeterogenousHMM it is possible to fix the emission probabilities of the discrete features. To do so, two parameters of the initialisation must be taken into account:
nr_no_train_de
: indicates the number of discrete features we don´t want to be trainned by the model but the keep fixed to an original value set by the user. For example:
If nr_no_train_de=1 and n_d_emissions=1, the model would just have one discrete feature whose emission probabilities would be fixed (not trained by the EM algorithm).
If nr_no_train_de=1 but n_d_emissions=3, the model would train the emission probabilities for the two first discrete features but would keep the value of the last emission probabilities matrix unchanged. In this case the emission probabilities of the corresponding feature need to be initialised upfront by the user.
state_no_train_de
can be used to fix just some of the states of the nr_no_train_de
features while training the emission probabilities of the others (the last state_no_train_de
states will be fixed). By default it is set to None, which means that the entire emission probability matrix for that discrete emission will be kept unchanged during training.
For example, if nr_no_train_de=1, n_d_emissions=2, n_states=5 and state_no_train_de=2, the model would train the complete emission probabilities matrix for the first discrete feature. For the second discrete feature, the emission probabilities for the 3 first states would be trained, but the emission probabilities for the last 2 states would be fixed to the values provided by the user.
This is extremely helpful if we have to have a Semi-Supervised HMM because we can associate certain states to certain labels/discrete values.
Saving and loading HMM
After training, a HMM can be easily persisted for future use with the save_model()
function, and later loaded using
load_model()
.
>>> from pyhhmm.utils import save_model, load_model
>>> save_model(model,'filename.pkl')
>>> load_model('filename.pkl')
GaussianHMM(...