chronostar.mixture package

Submodules

chronostar.mixture.componentmixture module

class chronostar.mixture.componentmixture.ComponentMixture(init_weights, init_components)

Bases: BaseMixture

A mixture model of arbitrary components

Parameters:
  • init_weights (NDArray[float64] of shape (n_components) or (n_samples, n_components)) – Initial weights of the components. If the array is 2-dimensional, init_weights is taken to be the initial membership probabilities; in that case you must set init_params=‘init_resp’

  • init_components (list[BaseComponent]) – Component objects which will be maximised to the data, optionally with pre-initialised parameters
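The two accepted shapes of init_weights can be told apart by their dimensionality. The following is a minimal sketch of that rule only; the helper name `weights_kind` is hypothetical and not part of chronostar, and plain nested lists stand in for NumPy arrays.

```python
def weights_kind(init_weights):
    """Classify init_weights by dimensionality (hypothetical helper)."""
    first = init_weights[0]
    # A nested sequence corresponds to shape (n_samples, n_components):
    # per-sample membership probabilities, which need init_params='init_resp'.
    if isinstance(first, (list, tuple)):
        return "memberships"
    # A flat sequence corresponds to shape (n_components): component weights.
    return "weights"


component_weights = [0.5, 0.5]
memberships = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]

kind_1d = weights_kind(component_weights)
kind_2d = weights_kind(memberships)
```

With the 2-dimensional form, each row describes one sample and would normally sum to 1.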

tol

Convergence threshold for sklearn’s EM algorithm. The fit is considered converged when the change in the average log probability per sample between successive EM iterations is less than tol. Configurable.

Type:

float, default 1e-3
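The convergence rule described for tol can be sketched as follows. This is an illustration under the assumption that the absolute change is compared with tol, as scikit-learn's mixture models do; the function name and the sample values are made up for the example.

```python
def has_converged(prev_avg_log_prob, curr_avg_log_prob, tol=1e-3):
    """Return True when the change between EM iterations is below tol."""
    change = curr_avg_log_prob - prev_avg_log_prob
    return abs(change) < tol


# Hypothetical average log probabilities after successive EM iterations
history = [-12.40, -11.10, -10.95, -10.9496]

converged_at = None
for i in range(1, len(history)):
    if has_converged(history[i - 1], history[i]):
        converged_at = i
        break
```

A smaller tol demands a flatter log-probability curve before stopping, at the cost of more iterations.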

reg_covar

A regularization constant added to the diagonals of covariance matrices, configurable

Type:

float, default 1e-6
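A small sketch of what adding reg_covar to the diagonal achieves, using plain nested lists and hypothetical helper names. It only illustrates the idea that a singular covariance matrix becomes invertible once regularised; it is not chronostar's implementation.

```python
def add_reg_covar(cov, reg_covar=1e-6):
    """Return a copy of a square matrix with reg_covar added to its diagonal."""
    n = len(cov)
    return [
        [cov[i][j] + (reg_covar if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Perfectly correlated features give a singular covariance matrix
singular = [[1.0, 1.0], [1.0, 1.0]]
regularised = add_reg_covar(singular)
```

Here `det2(singular)` is zero, while `det2(regularised)` is small but positive, so the regularised matrix can be inverted.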

max_iter

The maximum number of iterations for sklearn’s EM algorithm, configurable

Type:

int, default 100

n_init

(included only for sklearn API compatibility, ignored)

Type:

int, default 1

init_params

The initialization approach used by sklearn if component parameters are not pre-set, configurable. Must be one of

  • ‘init_resp’ : responsibilities are taken from input

  • ‘kmeans’ : responsibilities are initialized using kmeans.

  • ‘k-means++’ : use the k-means++ method to initialize.

  • ‘random’ : responsibilities are initialized randomly.

  • ‘random_from_data’ : initial means are randomly selected data points.

Type:

str, default ‘random’
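As an illustration of the ‘random’ option, the sketch below draws a random responsibility for every sample and component, then normalises each row to sum to 1. It assumes uniform draws, which is how scikit-learn's mixture models handle ‘random’; chronostar's behaviour may differ, and the function name is hypothetical.

```python
import random


def random_responsibilities(n_samples, n_components, seed=None):
    """Draw random responsibilities and normalise each sample's row to sum to 1."""
    rng = random.Random(seed)  # seeded generator for reproducible output
    resp = []
    for _ in range(n_samples):
        row = [rng.random() for _ in range(n_components)]
        total = sum(row)
        resp.append([r / total for r in row])
    return resp


resp_a = random_responsibilities(4, 3, seed=42)
resp_b = random_responsibilities(4, 3, seed=42)
```

Passing the same seed twice yields identical responsibilities, which is the role random_state plays for reproducible fits.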

random_state

Controls the random seed given to the method chosen to initialize the parameters (see init_params). In addition, it controls the generation of random samples from the fitted distribution. Pass an int for reproducible output across multiple function calls, configurable.

Type:

int, default None

warm_start

(leave True for correct interactions between self and self.sklmixture)

Type:

bool, default True

verbose

Whether to print sklearn statements:

  • 0 : no output

  • 1 : prints current initialization and each iteration step

  • 2 : same as 1 but also prints log probability and execution time

Type:

int, default 0

verbose_interval

If verbose > 0, the number of iterations between print statements

Type:

int, default 10

bic(X)

Calculate the Bayesian Information Criterion

Parameters:

X (NDArray[float64] of shape (n_samples, n_features)) – Input data

Returns:

The calculated BIC

Return type:

float
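For reference, the standard definition is BIC = k ln(n) - 2 ln(L), where k is the number of free parameters, n the number of samples and L the maximised likelihood. The sketch below evaluates that formula directly. How chronostar counts the free parameters of its components is not stated here, so `n_params` is simply an input, and all numbers are made up.

```python
import math


def bic(total_log_likelihood, n_params, n_samples):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_samples) - 2.0 * total_log_likelihood


# Hypothetical fits to the same 50 samples
bic_simple = bic(total_log_likelihood=-100.0, n_params=5, n_samples=50)
bic_complex = bic(total_log_likelihood=-98.0, n_params=11, n_samples=50)

# The model with the lower BIC is preferred
preferred = "simple" if bic_simple < bic_complex else "complex"
```

In this example the extra parameters of the larger model do not buy enough likelihood to offset the penalty, so the simpler model is preferred.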

estimate_weighted_log_prob(X)

Estimate the weighted log-probabilities, log P(X | Z) + log weights.

Parameters:

X (array-like of shape (n_samples, n_features)) – Input data

Returns:

weighted_log_prob

Return type:

array, shape (n_samples, n_components)
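The quantity log P(X | Z) + log weights can be illustrated with a deliberately simplified case: one-dimensional data and Gaussian components. Both simplifications are assumptions made for this sketch, since the real components are arbitrary BaseComponent objects, and the function names are hypothetical.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log density of a 1D Gaussian at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def weighted_log_prob(xs, weights, means, variances):
    """Return a (n_samples, n_components) table of log P(x | z) + log weight."""
    return [
        [
            gaussian_log_pdf(x, m, v) + math.log(w)
            for w, m, v in zip(weights, means, variances)
        ]
        for x in xs
    ]


# Two samples, two equally weighted components centred on 0 and 5
wlp = weighted_log_prob(
    xs=[0.0, 5.0], weights=[0.5, 0.5], means=[0.0, 5.0], variances=[1.0, 1.0]
)
```

Each row holds one sample and each column one component; the largest entry in a row marks the component most likely to have produced that sample.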

fit(X)

Fit the mixture model to the input data

Parameters:

X (NDArray[float64] of shape (n_samples, n_features)) – Input data

Return type:

None

get_components()

Get the list of components fitted to the data

Returns:

The list of components

Return type:

list[BaseComponent]

get_parameters()

Get the parameters that characterise the mixture

Returns:

The weights of the components and the component objects

Return type:

tuple[NDArray[float64], list[BaseComponent]]

set_parameters(params)

Set the parameters that characterise the mixture

Parameters:

params (tuple[NDArray[float64], list[BaseComponent]]) – The weights of the components and the component objects

Return type:

None

chronostar.mixture.sklmixture module

class chronostar.mixture.sklmixture.SKLComponentMixture(weights_init, components_init, *, tol=0.001, reg_covar=1e-06, max_iter=100, n_init=1, init_params='random', random_state=None, warm_start=True, verbose=0, verbose_interval=10, **kwargs)

Bases: BaseMixture

A derived class that relies heavily on scikit-learn to fit a Gaussian Mixture Model

Parameters:
  • weights_init (NDArray[float64] of shape (n_components) or (n_samples, n_components)) – Initial weights of each component, ideally normalized to sum to 1. If the array is 2-dimensional, weights_init is taken to be the initial membership probabilities; in that case you must set init_params=‘init_resp’

  • components_init (list[BaseComponent]) – Component objects which will be maximised to the data, optionally with pre-initialised parameters

  • tol (float, optional) – The tolerance on convergence detection of an EM fit. If the average log likelihood of all samples differs by less than tol at the end of an EM iteration, convergence has been reached, by default 1e-3

  • reg_covar (float, optional) – Regularisation factor added to diagonal elements of covariance matrices, by default 1e-6

  • max_iter (int, optional) – Maximum iterations of EM algorithm, by default 100

  • n_init (int, optional) – sklearn parameter we don’t use, by default 1

  • init_params (str, optional) –

    How to initialise the components if their parameters are not already set, by default ‘random’. Options are:

    • ‘random’: each responsibility is randomly assigned, then the components are maximised

    • ‘init_resp’: an initial responsibility array is provided

  • random_state (Any, optional) – sklearn parameter used as the random seed, by default None

  • warm_start (bool, optional) – sklearn parameter that we don’t use, by default True

  • verbose (int, optional) – sklearn parameter, by default 0

  • verbose_interval (int, optional) – sklearn parameter, by default 10

aic(X)

Calculate the Akaike Information Criterion

Parameters:

X (NDArray[float64] of shape (n_samples, n_features)) – Input data

Returns:

The calculated AIC

Return type:

float
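For reference, the standard definition is AIC = 2k - 2 ln(L), where k is the number of free parameters and L the maximised likelihood. The sketch below evaluates that formula directly; how chronostar counts free parameters is not stated here, so `n_params` is an input and the numbers are made up.

```python
def aic(total_log_likelihood, n_params):
    """Akaike Information Criterion: 2 * k - 2 * ln(L). Lower is better."""
    return 2.0 * n_params - 2.0 * total_log_likelihood


# Hypothetical fits to the same data set
aic_simple = aic(total_log_likelihood=-100.0, n_params=5)
aic_complex = aic(total_log_likelihood=-98.0, n_params=11)
```

Unlike the BIC, the AIC penalty per parameter does not grow with the number of samples, so for large data sets the AIC tends to tolerate more complex models.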

bic(X)

Calculate the Bayesian Information Criterion

Parameters:

X (NDArray[float64] of shape (n_samples, n_features)) – Input data

Returns:

The calculated BIC

Return type:

float

Module contents