tvo.models

class tvo.models.BSC(H, D, W_init=None, sigma2_init=None, pies_init=None, individual_priors=True, precision=torch.float64)[source]

Shallow Binary Sparse Coding (BSC) model.

Parameters:

H (int) – Number of hidden units.
D (int) – Number of observables.
W_init (Tensor) – Tensor with shape (D,H), initializes BSC weights.
pies_init (Tensor) – Tensor with shape (H,), initializes BSC priors.
individual_priors (bool) – Whether to use a Bernoulli prior with H individual prior probabilities. If False, the same prior probability will be used for all latents.
precision (dtype) – Floating point precision required. Must be one of torch.float32 or torch.float64.

data_estimator(idx, batch, states)[source]

Estimator used for data reconstruction. Data reconstruction can only be supported by a model if it implements this method. The estimator to be implemented is defined as follows: \(\\langle \langle y_d \rangle_{p(y_d|\vec{s},\Theta)} \rangle_{q(\vec{s}|\mathcal{K},\Theta)}\) # noqa

Return type:: Tensor

generate_data(N=None, hidden_state=None)[source]

Sample N datapoints from this model. At least one of N or hidden_state must be provided.

Parameters:

N (int) – number of data points to be generated.
hidden_state (Tensor) – Tensor with shape (N,H) where H is the number of units in the first latent layer.

Return type:

Union[Tensor, Tuple[Tensor, Tensor]]

Returns:

if hidden_state was not provided, a tuple (data, hidden_state) where data is a Tensor with shape (N, D) where D is the number of observables for this model and hidden_state is the corresponding tensor of hidden variables with shape (N, H) where H is the number of hidden variables for this model.

log_joint(data, states, lpj=None)[source]

Evaluate log-joints for BSC.

Return type:: Tensor

log_pseudo_joint(data, states)[source]

Evaluate log-pseudo-joints for BSC

Return type:: Tensor

property shape: Tuple[int, ...]

The model shape, i.e. number of observables D and latents H as tuple (D,H)

Returns:: the model shape: observable layer size followed by hidden layer size, e.g. (D, H)

The default implementation returns self._shape if present, otherwise it tries to infer the model’s shape from the parameters self.theta: the number of latents is assumed to be equal to the first dimension of the first tensor in self.theta, and the number of observables is assumed to be equal to the last dimension of the last parameter in self.theta.

update_param_batch(idx, batch, states)[source]

Execute batch-wise M-step or batch-wise section of an M-step computation.

Parameters:

idx (Tensor) – indexes of the datapoints that compose the batch within the dataset
batch (Tensor) – batch of datapoints, Tensor with shape (N,D)
states (TVOVariationalStates) – all variational states for this dataset
mstep_factors – optional dictionary containing the Tensors that were evaluated by the lpj_fn function returned by get_lpj_func during this batch’s E-step.

Return type:

None

If the model allows it, as an optimization this method can return this batch’s free energy evaluated _before_ the model parameter update. If the batch’s free energy is returned here, Trainers will skip a direct per-batch call to the free_energy method.

update_param_epoch()[source]

Execute epoch-wise M-step or epoch-wise section of an M-step computation.

This method is called at the end of each training epoch. Implementing this method is optional: models can leave the body empty (just a pass) or even not implement it at all.

Return type:: None

class tvo.models.BernoulliTVAE(shape=None, precision=torch.float64, min_lr=0.001, max_lr=0.01, cycliclr_step_size_up=400, pi_init=None, W_init=None, b_init=None, analytical_pi_updates=True, activation=None, external_model=None, optimizer=None)[source]

Create a TVAE model with Bernoulli observables.

Parameters:

shape (Sequence[int]) – Network shape, from observable to most hidden: (D,…,H1,H0). Can be None if W_init is not None.
precision (dtype) – One of to.float32 or to.float64, indicates the floating point precision of model parameters.
min_lr (float) – See docs of tvo.utils.CyclicLR
max_lr (float) – See docs of tvo.utils.CyclicLR
cycliclr_step_size_up – See docs of tvo.utils.CyclicLR
pi_init (Tensor) – Optional tensor with initial prior values
W_init (Sequence[Tensor]) – Optional list of tensors with initial weight values. Weight matrices must be ordered from most hidden to observable layer. If this parameter is not None, the shape parameter can be omitted.
b_init (Sequence[Tensor]) – Optional list of tensors with initial.
analytical_pi_updates (bool) – Whether priors should be updated via the analytical max-likelihood solution rather than gradient descent.
activation (Callable) – Decoder activation function used if external_model is not specified. Defaults to ReLU if not specified and external_model not used.
external_model (Optional[Module]) – Optional decoder neural network. One of shape, (W_init, b_init), external_model must be specified exclusively.
optimizer (Optional[Optimizer]) – Gradient optimizer (defaults to Adam if not specified)

forward(x)[source]

Forward application of TVAE’s MLP to the specified input.

Return type:: Tensor

generate_data(N=None, hidden_state=None)[source]

Sample N datapoints from this model. At least one of N or hidden_state must be provided.

Parameters:

N (int) – number of data points to be generated.
hidden_state (Tensor) – Tensor with shape (N,H) where H is the number of units in the first latent layer.

Return type:

Union[Tensor, Tuple[Tensor, Tensor]]

Returns:

if hidden_state was not provided, a tuple (data, hidden_state) where data is a Tensor with shape (N, D) where D is the number of observables for this model and hidden_state is the corresponding tensor of hidden variables with shape (N, H) where H is the number of hidden variables for this model.

log_joint(data, states, lpj=None)[source]

Evaluate log-joint probabilities for this model.

Parameters:

data – shape is (N,D)
states – shape is (N,S,H)

Returns:

log-joints for data and states - shape is (N,S)

update_param_epoch()[source]

Execute epoch-wise M-step or epoch-wise section of an M-step computation.

This method is called at the end of each training epoch. Implementing this method is optional: models can leave the body empty (just a pass) or even not implement it at all.

Return type:: None

class tvo.models.GMM(H, D, W_init=None, sigma2_init=None, pies_init=None, precision=torch.float64)[source]

Gaussian Mixture model (GMM).

Parameters:

H (int) – Number of hidden units.
D (int) – Number of observables.
W_init (Tensor) – Tensor with shape (D,H), initializes GM weights.
pies_init (Tensor) – Tensor with shape (H,), initializes GM priors.
precision (dtype) – Floating point precision required. Must be one of torch.float32 or torch.float64.

data_estimator(idx, batch, states)[source]

Estimator used for data reconstruction. Data reconstruction can only be supported by a model if it implements this method. The estimator to be implemented is defined as follows: \(\\langle \langle y_d \rangle_{p(y_d|\vec{s},\Theta)} \rangle_{q(\vec{s}|\mathcal{K},\Theta)}\) # noqa

Return type:: Tensor

generate_data(N=None, hidden_state=None)[source]

Sample N datapoints from this model. At least one of N or hidden_state must be provided.

Parameters:

N (int) – number of data points to be generated.
hidden_state (Tensor) – Tensor with shape (N,H) where H is the number of units in the first latent layer.

Return type:

Union[Tensor, Tuple[Tensor, Tensor]]

Returns:

if hidden_state was not provided, a tuple (data, hidden_state) where data is a Tensor with shape (N, D) where D is the number of observables for this model and hidden_state is the corresponding tensor of hidden variables with shape (N, H) where H is the number of hidden variables for this model.

log_joint(data, states, lpj=None)[source]

Evaluate log-joints for GMM.

Return type:: Tensor

log_pseudo_joint(data, states)[source]

Evaluate log-pseudo-joints for GMM.

Return type:: Tensor

property shape: Tuple[int, ...]

The model shape, i.e. number of observables D and latents H as tuple (D,H)

Returns:: the model shape: observable layer size followed by hidden layer size, e.g. (D, H)

The default implementation returns self._shape if present, otherwise it tries to infer the model’s shape from the parameters self.theta: the number of latents is assumed to be equal to the first dimension of the first tensor in self.theta, and the number of observables is assumed to be equal to the last dimension of the last parameter in self.theta.

update_param_batch(idx, batch, states)[source]

Execute batch-wise M-step or batch-wise section of an M-step computation.

Parameters:

idx (Tensor) – indexes of the datapoints that compose the batch within the dataset
batch (Tensor) – batch of datapoints, Tensor with shape (N,D)
states (Tensor) – all variational states for this dataset
mstep_factors – optional dictionary containing the Tensors that were evaluated by the lpj_fn function returned by get_lpj_func during this batch’s E-step.

Return type:

None

If the model allows it, as an optimization this method can return this batch’s free energy evaluated _before_ the model parameter update. If the batch’s free energy is returned here, Trainers will skip a direct per-batch call to the free_energy method.

update_param_epoch()[source]

Execute epoch-wise M-step or epoch-wise section of an M-step computation.

This method is called at the end of each training epoch. Implementing this method is optional: models can leave the body empty (just a pass) or even not implement it at all.

Return type:: None

class tvo.models.GaussianTVAE(shape=None, precision=torch.float64, min_lr=0.001, max_lr=0.01, cycliclr_step_size_up=400, pi_init=None, W_init=None, b_init=None, sigma2_init=None, analytical_sigma_updates=True, analytical_pi_updates=True, clamp_sigma_updates=False, activation=None, external_model=None, optimizer=None)[source]

Create a TVAE model with Gaussian observables.

Parameters:

shape (Sequence[int]) – Network shape, from observable to most hidden: (D,…,H1,H0). One of shape, (W_init, b_init), external_model must be specified exclusively.
precision (dtype) – One of to.float32 or to.float64, indicates the floating point precision of model parameters.
min_lr (float) – See docs of tvo.utils.CyclicLR
max_lr (float) – See docs of tvo.utils.CyclicLR
cycliclr_step_size_up – See docs of tvo.utils.CyclicLR
pi_init (Tensor) – Optional tensor with initial prior values
W_init (Sequence[Tensor]) – Optional list of tensors with initial weight values. Weight matrices must be ordered from most hidden to observable layer. One of shape, (W_init, b_init), external_model must be specified exclusively.
b_init (Sequence[Tensor]) – Optional list of tensors with initial bias. One of shape, (W_init, b_init), external_model must be specified exclusively.
sigma2_init (float) – Optional initial value for model variance.
analytical_sigma_updates (bool) – Whether sigmas should be updated via the analytical max-likelihood solution rather than gradient descent.
analytical_pi_updates (bool) – Whether priors should be updated via the analytical max-likelihood solution rather than gradient descent.
clamp_sigma_updates (bool) – Whether to limit the rate at which sigma can be updated.
activation (Callable) – Decoder activation function used if external_model is not specified. Defaults to ReLU if not specified and external_model not used.
external_model (Optional[Module]) – Optional decoder neural network. One of shape, (W_init, b_init), external_model must be specified exclusively.
optimizer (Optional[Optimizer]) – Gradient optimizer (defaults to Adam if not specified)

forward(x)[source]

Forward application of TVAE’s MLP to the specified input.

Return type:: Tensor

generate_data(N=None, hidden_state=None)[source]

Sample N datapoints from this model. At least one of N or hidden_state must be provided.

Parameters:

N (int) – number of data points to be generated.
hidden_state (Tensor) – Tensor with shape (N,H) where H is the number of units in the first latent layer.

Return type:

Union[Tensor, Tuple[Tensor, Tensor]]

Returns:

if hidden_state was not provided, a tuple (data, hidden_state) where data is a Tensor with shape (N, D) where D is the number of observables for this model and hidden_state is the corresponding tensor of hidden variables with shape (N, H) where H is the number of hidden variables for this model.

log_joint(data, states, lpj=None)[source]

Evaluate log-joint probabilities for this model.

Parameters:

data – shape is (N,D)
states – shape is (N,S,H)

Returns:

log-joints for data and states - shape is (N,S)

update_param_epoch()[source]

Execute epoch-wise M-step or epoch-wise section of an M-step computation.

This method is called at the end of each training epoch. Implementing this method is optional: models can leave the body empty (just a pass) or even not implement it at all.

Return type:: None

class tvo.models.NoisyOR(H, D, W_init=None, pi_init=None, precision=torch.float64)[source]

Shallow NoisyOR model.

Parameters:

H (int) – Number of hidden units.
D (int) – Number of observables.
W_init (Tensor) – Tensor with shape (D,H), initializes NoisyOR weights.
pi_init (Tensor) – Tensor with shape (H,), initializes NoisyOR priors.
precision (dtype) – Floating point precision required. Must be one of torch.float32 or torch.float64.

generate_data(N=None, hidden_state=None)[source]

Use hidden states to sample datapoints according to the NoisyOR generative model.

Parameters:: hidden_state (Tensor) – a tensor with shape (N, H) where H is the number of hidden units.
Return type:: Union[Tensor, Tuple[Tensor, Tensor]]
Returns:: the datapoints, as a tensor with shape (N, D) where D is the number of observables.

log_joint(data, states, lpj=None)[source]

Evaluate log-joint probabilities for this model.

Parameters:

data – shape is (N,D)
states – shape is (N,S,H)
lpj – shape is (N,S). When lpj is not None it must contain pre-evaluated log-pseudo joints for the given data and states. The implementation can take advantage of the extra argument to save computation.

Returns:

log-joints for data and states - shape is (N,S)

log_pseudo_joint(data, states)[source]

Evaluate log-pseudo-joints for NoisyOR.

Return type:: Tensor

update_param_batch(idx, batch, states, mstep_factors=None)[source]

Execute batch-wise M-step or batch-wise section of an M-step computation.

Parameters:

idx (Tensor) – indexes of the datapoints that compose the batch within the dataset
batch (Tensor) – batch of datapoints, Tensor with shape (N,D)
states (TVOVariationalStates) – all variational states for this dataset
mstep_factors (Dict[str, Tensor]) – optional dictionary containing the Tensors that were evaluated by the lpj_fn function returned by get_lpj_func during this batch’s E-step.

Return type:

Optional[float]

If the model allows it, as an optimization this method can return this batch’s free energy evaluated _before_ the model parameter update. If the batch’s free energy is returned here, Trainers will skip a direct per-batch call to the free_energy method.

update_param_epoch()[source]

Execute epoch-wise M-step or epoch-wise section of an M-step computation.

This method is called at the end of each training epoch. Implementing this method is optional: models can leave the body empty (just a pass) or even not implement it at all.

Return type:: None

class tvo.models.PMM(H, D, W_init=None, pies_init=None, precision=torch.float64)[source]

Poisson Mixture Model (PMM).

Parameters:

H (int) – Number of hidden units.
D (int) – Number of observables.
W_init (Tensor) – Tensor with shape (D,H), initializes PMM weights.
pies_init (Tensor) – Tensor with shape (H,), initializes PMM priors.
precision (dtype) – Floating point precision required. Must be one of torch.float32 or torch.float64.

data_estimator(idx, batch, states)[source]

Estimator used for data reconstruction. Data reconstruction can only be supported by a model if it implements this method. The estimator to be implemented is defined as follows: \(\\langle \langle y_d \rangle_{p(y_d|\vec{s},\Theta)} \rangle_{q(\vec{s}|\mathcal{K},\Theta)}\) # noqa

Return type:: Tensor

generate_data(N=None, hidden_state=None)[source]

Sample N datapoints from this model. At least one of N or hidden_state must be provided.

Parameters:

N (int) – number of data points to be generated.
hidden_state (Tensor) – Tensor with shape (N,H) where H is the number of units in the first latent layer.

Return type:

Union[Tensor, Tuple[Tensor, Tensor]]

Returns:

if hidden_state was not provided, a tuple (data, hidden_state) where data is a Tensor with shape (N, D) where D is the number of observables for this model and hidden_state is the corresponding tensor of hidden variables with shape (N, H) where H is the number of hidden variables for this model.

log_joint(data, states, lpj=None)[source]

Evaluate log-joints for PMM.

Return type:: Tensor

log_pseudo_joint(data, states)[source]

Evaluate log-pseudo-joints for GMM.

Return type:: Tensor

property shape: Tuple[int, ...]

The model shape, i.e. number of observables D and latents H as tuple (D,H)

Returns:: the model shape: observable layer size followed by hidden layer size, e.g. (D, H)

The default implementation returns self._shape if present, otherwise it tries to infer the model’s shape from the parameters self.theta: the number of latents is assumed to be equal to the first dimension of the first tensor in self.theta, and the number of observables is assumed to be equal to the last dimension of the last parameter in self.theta.

update_param_batch(idx, batch, states)[source]

Execute batch-wise M-step or batch-wise section of an M-step computation.

Parameters:

idx (Tensor) – indexes of the datapoints that compose the batch within the dataset
batch (Tensor) – batch of datapoints, Tensor with shape (N,D)
states (Tensor) – all variational states for this dataset
mstep_factors – optional dictionary containing the Tensors that were evaluated by the lpj_fn function returned by get_lpj_func during this batch’s E-step.

Return type:

None

If the model allows it, as an optimization this method can return this batch’s free energy evaluated _before_ the model parameter update. If the batch’s free energy is returned here, Trainers will skip a direct per-batch call to the free_energy method.

update_param_epoch()[source]

Execute epoch-wise M-step or epoch-wise section of an M-step computation.

This method is called at the end of each training epoch. Implementing this method is optional: models can leave the body empty (just a pass) or even not implement it at all.

Return type:: None

class tvo.models.SSSC(H, D, W_init=None, sigma2_init=None, mus_init=None, Psi_init=None, pies_init=None, reformulated_lpj=True, use_storage=True, reformulated_psi_update=False, precision=torch.float32)[source]

Spike-And-Slab Sparse Coding (SSSC) model.

Parameters:

H (int) – Number of hidden units.
D (int) – Number of observables.
W_init (Tensor) – Tensor with shape (H, D), initializes SSSC weights.
sigma2_init (Tensor) – Tensor initializing SSSC observable variance.
mus_init (Tensor) – Tensor with shape (H,), initializes SSSC latent means.
Psi_init (Tensor) – Tensor with shape (H, H), initializes SSSC latent variance.
pies_init (Tensor) – Tensor with shape (H,), initializes SSSC priors.
reformulated_lpj (bool) – Use looped instead of batchified E-step and mathematically reformulated form of the log-pseudo-joint formula (exploiting matrix determinant lemma and Woodbury matrix identity). Yields more accurate solutions in large dimensions (i.e. large D and H).
use_storage (bool) – Whether to memorize state vector-dependent and datapoint independent- terms computed in the E-step. Terms will be looked-up rather than re- computed if a datapoint evaluates a state that has been evaluated for another datapoint before. The storage will be cleared after each epoch.
reformulated_psi_update (bool) – Whether to update Psi using reformulated form of the update equation.
precision (dtype) – Floating point precision required. Must be one of torch.float32 or torch.float64.

data_estimator(idx, batch, states)[source]

Estimator used for data reconstruction. Data reconstruction can only be supported by a model if it implements this method. The estimator to be implemented is defined as follows: \(\\langle \langle y_d \rangle_{p(y_d|\vec{s},\Theta)} \rangle_{q(\vec{s}|\mathcal{K},\Theta)}\) # noqa

Return type:: Tensor

generate_data(N=None, hidden_state=None)[source]

Sample N datapoints from this model. At least one of N or hidden_state must be provided.

Parameters:

N (int) – number of data points to be generated.
hidden_state (Tensor) – Tensor with shape (N,H) where H is the number of units in the first latent layer.

Return type:

Union[Tensor, Tuple[Tensor, Tensor]]

Returns:

if hidden_state was not provided, a tuple (data, hidden_state) where data is a Tensor with shape (N, D) where D is the number of observables for this model and hidden_state is the corresponding tensor of hidden variables with shape (N, H) where H is the number of hidden variables for this model.

log_joint(data, states, lpj=None)[source]

Evaluate log-joints for SSSC.

Return type:: Tensor

log_pseudo_joint(data, states)[source]

Evaluate log-pseudo-joints for SSSC.

Return type:: Tensor

update_param_batch(idx, batch, states, **kwargs)[source]

Execute batch-wise M-step or batch-wise section of an M-step computation.

Parameters:

idx (Tensor) – indexes of the datapoints that compose the batch within the dataset
batch (Tensor) – batch of datapoints, Tensor with shape (N,D)
states (TVOVariationalStates) – all variational states for this dataset
mstep_factors – optional dictionary containing the Tensors that were evaluated by the lpj_fn function returned by get_lpj_func during this batch’s E-step.

Return type:

None

If the model allows it, as an optimization this method can return this batch’s free energy evaluated _before_ the model parameter update. If the batch’s free energy is returned here, Trainers will skip a direct per-batch call to the free_energy method.

update_param_epoch()[source]

Execute epoch-wise M-step or epoch-wise section of an M-step computation.

This method is called at the end of each training epoch. Implementing this method is optional: models can leave the body empty (just a pass) or even not implement it at all.

Return type:: None