Tutorial

PyHHMM implements several variants of hidden Markov models (HMMs).

HMMs are probabilistic generative models, in which a sequence of observable variables \(\mathbf{O}\) is generated by a sequence of internal hidden states \(\mathbf{Z}\). The hidden states are not observed directly. The transitions between hidden states are assumed to have the form of a (first-order) Markov chain. They can be specified by the start probability vector \(\boldsymbol{\pi}\) and a transition probability matrix \(\mathbf{A}\). The emission probability of an observable can be any distribution with parameters \(\mathbf{B}\) conditioned on the current hidden state. The HMM is completely determined by \(\boldsymbol{\lambda} = (\boldsymbol{\pi}, \mathbf{A}, \mathbf{B})\).
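
In the usual notation, for a hidden state sequence \(\mathbf{Z} = (z_1, \dots, z_T)\) and an observation sequence \(\mathbf{O} = (o_1, \dots, o_T)\), the joint probability under the model factorises as

\[
P(\mathbf{O}, \mathbf{Z}|\boldsymbol{\lambda}) = \pi_{z_1}\, b_{z_1}(o_1) \prod_{t=2}^{T} a_{z_{t-1} z_t}\, b_{z_t}(o_t),
\]

where \(a_{ij}\) is an entry of \(\mathbf{A}\) and \(b_j(o_t)\) is the probability of emitting \(o_t\) in state \(j\); the likelihood \(P(\mathbf{O}|\boldsymbol{\lambda})\) is obtained by summing this expression over all possible state sequences.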

In order to apply an HMM to a real application, three basic problems must be solved:

  • Problem 1 (Likelihood): Given an HMM \(\boldsymbol{\lambda} = (\boldsymbol{\pi}, \mathbf{A}, \mathbf{B})\) and an observation sequence \(\mathbf{O}\), determine the likelihood \(P(\mathbf{O}|\boldsymbol{\lambda})\).

  • Problem 2 (Decoding): Given an observation sequence \(\mathbf{O}\) and an HMM \(\boldsymbol{\lambda} = (\boldsymbol{\pi}, \mathbf{A}, \mathbf{B})\), discover the best hidden state sequence \(\mathbf{Z}\).

  • Problem 3 (Learning): Given an observation sequence \(\mathbf{O}\) and the set of states in the HMM, learn the HMM parameters \(\boldsymbol{\lambda}\).

The first and the second problem can be solved by dynamic programming algorithms known as the Forward-Backward algorithm and the Viterbi algorithm, respectively. The last one can be solved by an iterative Expectation-Maximization (EM) algorithm, known as the Baum-Welch algorithm.
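
To make Problem 1 concrete, below is a minimal NumPy sketch of the scaled forward recursion. It operates on plain arrays (the start vector pi, the transition matrix A, and a matrix of per-state emission probabilities for each observation) and is independent of PyHHMM's own implementation.

>>> import numpy as np
>>>
>>> def forward_log_likelihood(pi, A, B_obs):
...     """Scaled forward recursion: log P(O|lambda) for one sequence.
...     B_obs[t, j] is the emission probability of observation t in state j
...     (for Gaussian emissions, the Gaussian density under state j)."""
...     alpha = pi * B_obs[0]                       # initialisation
...     log_likelihood = 0.0
...     for t in range(1, B_obs.shape[0]):
...         scale = alpha.sum()                     # rescale to avoid underflow
...         log_likelihood += np.log(scale)
...         alpha = (alpha / scale) @ A * B_obs[t]  # induction step
...     return log_likelihood + np.log(alpha.sum())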

The state space of the HMMs is discrete, while the observations can be discrete (drawn from a multinomial distribution), continuous (drawn from a Gaussian distribution), or heterogeneous, i.e. a mix of discrete and continuous features. The models can deal with missing data without requiring imputation before training.
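
For instance, a Gaussian observation sequence containing gaps can be handed to training as-is. Note that encoding missing entries as np.nan is an assumption in this sketch; check the API reference for the exact marker the library expects.

>>> import numpy as np
>>>
>>> # one 2-dimensional observation sequence with two missing entries
>>> # (np.nan as the missing-value marker is an assumption, see above)
>>> seq = np.array([[0.1, 0.3],
...                 [np.nan, 0.2],    # first feature missing at t=1
...                 [0.5, np.nan]])   # second feature missing at t=2
>>> # such a sequence could then be passed to train() without prior imputation,
>>> # e.g. model.train([seq])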

Available models

gaussian.GaussianHMM

Hidden Markov Model with Gaussian emissions.

heterogeneous.HeterogeneousHMM

Hidden Markov Model with both Gaussian (continuous) and multinomial (discrete) emissions; the discrete emissions can also be used as labels for semi-supervised training.

multinomial.MultinomialHMM

Hidden Markov Model with multiple multinomial (discrete) emissions.
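
Each model lives in its own submodule and can be imported directly:

>>> from pyhhmm.gaussian import GaussianHMM
>>> from pyhhmm.heterogeneous import HeterogeneousHMM
>>> from pyhhmm.multinomial import MultinomialHMM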

Building an HMM and generating samples

You can build an HMM instance by passing the parameters described above to the constructor. Then, you can generate samples from the HMM by calling sample().

>>> import numpy as np
>>> from pyhhmm.gaussian import GaussianHMM
>>>
>>> model = GaussianHMM(n_states=3, n_emissions=2, covariance_type='diagonal')
>>> model.pi = np.array([0.6, 0.3, 0.1])
>>> model.A = np.array([[0.7, 0.2, 0.1],
...                     [0.3, 0.5, 0.2],
...                     [0.3, 0.3, 0.4]])
>>> model.means = np.array([[0.0, 0.0], [3.0, -3.0], [5.0, 10.0]])
>>> model.covars = 0.5 * np.ones((3, 2))
>>>
>>> lengths = np.random.randint(25, 150, size=25)
>>> X = [model.sample(n_sequences=1, n_samples=n)[0] for n in lengths]

Training HMM parameters and inferring the hidden states

You can train an HMM by calling the train() method. The input is a list of observation sequences (aka samples).

Note that, since the EM algorithm is an iterative local optimisation method, it can get stuck in local optima. You should therefore run the training with several initialisations and select the highest-scoring model; this can be done by setting the n_init parameter.

The inferred optimal hidden states can be obtained by calling the predict() method, which accepts a choice of decoding algorithm. Currently the Viterbi algorithm (viterbi) and maximum a posteriori estimation (map) are supported.

>>> remodel = GaussianHMM(n_states=3, n_emissions=2, covariance_type='diagonal')
>>> remodel, log_likelihood = remodel.train(X, n_init=3, n_iter=50, conv_thresh=0.001, conv_iter=5)
>>> Z2 = remodel.predict(X)
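
For example, to decode with MAP estimation instead of the default Viterbi algorithm (the keyword name algorithm used below is an assumption; check the signature of predict() in the API reference):

>>> Z_map = remodel.predict(X, algorithm='map')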

Some utility functions are provided to print the HMM in tabular form:

>>> from pyhhmm.utils import pretty_print_hmm
>>> pretty_print_hmm(remodel, hmm_type='Gaussian')

Fixing parameters

Each HMM parameter has a character code which can be used to customise its initialisation and estimation. The EM algorithm needs a starting point to proceed, thus prior to training each parameter is assigned a value, either at random or computed from the data. It is possible to hook into this process and provide a starting point explicitly. To do so:

  1. ensure that the character code for the parameter is missing from init_params and then

  2. set the parameter to the desired value.

For example, consider an HMM with an explicitly initialised transition probability matrix:

>>> model = GaussianHMM(n_states=3, n_emissions=2, init_params='smc')
>>> model.A = np.array([[0.7, 0.2, 0.1],
...                     [0.3, 0.5, 0.2],
...                     [0.3, 0.3, 0.4]])

A similar trick applies to parameter estimation. If you want to fix some parameter at a specific value, remove the corresponding character from tr_params and set the parameter value before training.
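
As a sketch, the start probabilities could be kept fixed during training as follows, assuming (by analogy with the init_params example above) that 's' is the character code of the start probability vector:

>>> # no 's' in init_params/tr_params: pi is neither re-initialised nor re-estimated
>>> model = GaussianHMM(n_states=3, n_emissions=2,
...                     init_params='tmc', tr_params='tmc')
>>> model.pi = np.array([0.6, 0.3, 0.1])   # fixed start probabilities
>>> model, log_likelihood = model.train(X, n_init=1, n_iter=50)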

Semi-supervised training

Using the HeterogeneousHMM it is possible to fix the emission probabilities of the discrete features. To do so, two initialisation parameters must be taken into account:

  • nr_no_train_de: indicates the number of discrete features whose emission probabilities we don't want the model to train, but rather keep fixed at values set by the user. For example:

    • If nr_no_train_de=1 and n_d_emissions=1, the model would just have one discrete feature whose emission probabilities would be fixed (not trained by the EM algorithm).

    • If nr_no_train_de=1 but n_d_emissions=3, the model would train the emission probabilities of the first two discrete features but would keep the emission probability matrix of the last one unchanged. In this case, the emission probabilities of that feature need to be initialised upfront by the user.

  • state_no_train_de: can be used to fix the emission probabilities of just some of the states of the nr_no_train_de features while training those of the remaining states (the last state_no_train_de states are kept fixed). By default it is set to None, which means that the entire emission probability matrix of that discrete feature is kept unchanged during training.

    • For example, if nr_no_train_de=1, n_d_emissions=2, n_states=5 and state_no_train_de=2, the model would train the complete emission probability matrix of the first discrete feature. For the second discrete feature, the emission probabilities of the first 3 states would be trained, but the emission probabilities of the last 2 states would be kept fixed at the values provided by the user.

This is extremely helpful when we need a semi-supervised HMM, because it allows us to associate certain states with certain labels/discrete values.
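
A minimal sketch of such a semi-supervised setup is shown below: a model with one Gaussian feature and one fixed discrete feature whose values act as labels, one label per state. The parameter names n_g_emissions and n_d_features, as well as setting the fixed probabilities through the B attribute, are assumptions here; consult the API reference for the exact names.

>>> import numpy as np
>>> from pyhhmm.heterogeneous import HeterogeneousHMM
>>>
>>> # 3 states, 1 Gaussian feature and 1 discrete feature with 3 possible values;
>>> # the discrete feature's emission probabilities are excluded from training
>>> # (n_g_emissions, n_d_features and B are assumed names, see above)
>>> sup_model = HeterogeneousHMM(n_states=3, n_g_emissions=1, n_d_emissions=1,
...                              n_d_features=[3], nr_no_train_de=1)
>>> # fixed discrete emission probabilities: state i always emits label i
>>> sup_model.B = [np.eye(3)]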

Saving and loading HMM

After training, an HMM can be easily persisted for future use with the save_model() function, and later loaded using load_model().

>>> from pyhhmm.utils import save_model, load_model
>>> save_model(model, 'filename.pkl')
>>> load_model('filename.pkl')
GaussianHMM(...