Baum-Welch

class jajapy.BW

Class for the Baum-Welch algorithm.

fit(training_set, initial_model=None, nb_states: Optional[int] = None, random_initial_state: bool = True, min_exit_rate_time: Optional[int] = None, max_exit_rate_time: Optional[int] = None, self_loop: Optional[bool] = None, nb_distributions: Optional[int] = None, output_file: Optional[str] = None, output_file_prism: Optional[str] = None, epsilon: float = 0.01, max_it: int = inf, pp: str = '', verbose: int = 2, return_data: bool = False, stormpy_output: bool = True, fixed_parameters: ndarray = False, update_constant: bool = True, min_val: Optional[float] = None, max_val: Optional[float] = None, processes: Optional[int] = None)

Fits any model according to traces. This method determines which type of Markov model to use from the training set and the initial model (if given).

Parameters

training_set: Set, list or ndarray.

The training set.

initial_model: Model or stormpy.sparse model, optional.

The first hypothesis. If not set, a random Model with nb_states states is created.

nb_states: int, optional.

Number of states of the random Model created when initial_model is not set. Must be set if initial_model is not set.

random_initial_state: bool, optional.

Used only if initial_model is not set: whether the random Model is created with a random initial state. Default is True.

min_exit_rate_time: int, optional

For CTMC learning only. Minimum exit rate for the states in the first hypothesis. Must be set if initial_model is not set, and if the SUL is a CTMC or PCTMC.

max_exit_rate_time: int, optional

For CTMC learning only. Maximum exit rate for the states in the first hypothesis. Must be set if initial_model is not set, and if the SUL is a CTMC or PCTMC.

self_loop: bool, optional

For CTMC/PCTMC learning only. Whether or not there will be self-loops in the first hypothesis. Must be set if initial_model is not set, and if the SUL is a CTMC or PCTMC.

nb_distributions: int, optional.

For GoHMM learning only. Number of distributions in each state in the initial hypothesis. Must be set if initial_model is not set, and if the SUL is a GoHMM.
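The "must be set if initial_model is not set" rules described above can be summarized as a small check. This is only a sketch of the documented rules, not jajapy's own validation code: the helper name `missing_required` and the model-type strings are hypothetical.

```python
# Sketch of the documented "must be set if initial_model is not set" rules.
# `missing_required` and the model-type strings are illustrative names,
# not part of jajapy.

# Required for every kind of model when no initial model is given.
ALWAYS = ("nb_states",)

# Extra arguments required depending on the system under learning (SUL).
EXTRA = {
    "CTMC": ("min_exit_rate_time", "max_exit_rate_time", "self_loop"),
    "PCTMC": ("min_exit_rate_time", "max_exit_rate_time", "self_loop"),
    "GoHMM": ("nb_distributions",),
}


def missing_required(model_type, initial_model=None, **kwargs):
    """Return the names of required arguments that were left unset (None)."""
    if initial_model is not None:
        # With an initial model, none of these arguments is required.
        return []
    required = ALWAYS + EXTRA.get(model_type, ())
    return [name for name in required if kwargs.get(name) is None]


print(missing_required("CTMC", nb_states=4, self_loop=False))
print(missing_required("GoHMM", nb_states=3, nb_distributions=2))
```

Once the list is empty, the same keyword arguments could be passed on to fit, for instance as BW().fit(training_set, **kwargs).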

output_file: str, optional

If set, the output model will be saved at this location. Otherwise the output model will not be saved.

output_file_prism: str, optional

If set, the output model will be saved in a PRISM file at this location. Otherwise the output model will not be saved. This parameter is ignored if the model under learning is an HMM or a GoHMM.

epsilon: float, optional

The learning process stops when the difference between the log-likelihoods of the training set under the last two hypotheses is lower than epsilon. The lower this value, the better the output, but the longer the running time. Default is 0.01.

max_it: int, optional

Maximal number of iterations. The algorithm will stop after max_it iterations. Default is infinity.
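The interplay between epsilon and max_it can be sketched as a simple loop. This is an illustration of the stopping rule as described here, not jajapy's implementation: `run_until_converged` and `step` are hypothetical names, and the exact way jajapy counts iterations is an assumption.

```python
import math


def run_until_converged(step, epsilon=0.01, max_it=math.inf):
    """Call step() until the log-likelihood change drops below epsilon.

    `step` is a hypothetical callable that performs one update of the
    hypothesis and returns the log-likelihood of the training set under it.
    Returns the number of iterations performed.
    """
    previous = step()
    iterations = 1
    while iterations < max_it:
        current = step()
        iterations += 1
        if abs(current - previous) < epsilon:
            break  # the last two hypotheses are close enough
        previous = current
    return iterations


# Made-up log-likelihood values, one per iteration.
values = iter([-100.0, -60.0, -50.5, -50.495, -50.494])
print(run_until_converged(lambda: next(values)))  # 4
```

With the default max_it (infinity), only epsilon can stop the loop, so a very small epsilon may lead to long running times.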

pp: str, optional

Will be printed at each iteration. Default is an empty string.

verbose: int, optional.

Defines the level of information printed during the learning:
0 - nothing (no warnings, no progress bar, no recap at the end)
1 - minimal (warnings only)
2 - default (warnings and progress bar, no recap at the end)
3 - maximal (warnings, progress bar and recap)
Default is 2.
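The four verbose levels described above are cumulative, which the following sketch makes explicit. The function `verbosity_flags` is a hypothetical helper written for this illustration; raising ValueError on other values is an assumption, not documented behavior.

```python
def verbosity_flags(verbose=2):
    """Map a verbose level (0 to 3) to what is printed during the learning."""
    if verbose not in (0, 1, 2, 3):
        # Assumption: this sketch rejects values outside the documented range.
        raise ValueError("verbose must be 0, 1, 2 or 3")
    return {
        "warnings": verbose >= 1,      # levels 1, 2 and 3
        "progress_bar": verbose >= 2,  # levels 2 and 3
        "recap": verbose >= 3,         # level 3 only
    }


print(verbosity_flags())   # default level 2
print(verbosity_flags(0))  # silent
```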

return_data: bool, optional.

If set to True, a dictionary containing the following values will be returned alongside the output model once the learning is done: ‘learning_rounds’, ‘learning_time’, ‘training_set_loglikelihood’. Default is False.

stormpy_output: bool, optional.

If set to True, the output model will be a Stormpy sparse model. Doesn’t work for HMM and GoHMM. Default is True.

fixed_parameters: ndarray of bool, optional

For CTMC/PCTMC learning only. ndarray of bool with the same shape as the transition matrix (i.e. nb_states x nb_states). If fixed_parameters[s1,s2] == True, the transition parameter from s1 to s2 is fixed and will not be changed during the learning. By default no parameters are fixed.
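The effect of a fixed_parameters mask can be illustrated on a single update step. This is a sketch of the semantics described above, not jajapy's update code: `apply_update` is a hypothetical helper, and nested lists stand in for the ndarray that jajapy expects so that the example has no dependencies.

```python
def apply_update(old, new, fixed):
    """Return the updated matrix, keeping the entries marked as fixed.

    old, new: nb_states x nb_states nested lists of transition parameters.
    fixed:    same shape, True where the parameter must not change.
    """
    return [
        [o if f else n for o, n, f in zip(old_row, new_row, fixed_row)]
        for old_row, new_row, fixed_row in zip(old, new, fixed)
    ]


old = [[0.0, 2.0], [1.0, 0.0]]
new = [[0.0, 3.5], [0.4, 0.0]]
fixed = [[False, True], [False, False]]  # freeze the s0 -> s1 parameter
print(apply_update(old, new, fixed))  # [[0.0, 2.0], [0.4, 0.0]]
```

The s0 -> s1 entry keeps its old value 2.0, while the other entries take their newly estimated values.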

update_constant: bool, optional

For PCTMC learning only. If set to False, the constant transitions (i.e. the transitions that don’t depend on any parameter) will not be updated. Default is True.

min_val: float, optional

For PCTMC learning only. Minimal value for the randomly instantiated parameters. If not set and if the model has at least two instantiated parameters, this value is equal to the smallest instantiation among these parameters. If not set and if the model has fewer than two instantiated parameters, this value is equal to 0.1.

max_val: float, optional

For PCTMC learning only. Maximal value for the randomly instantiated parameters. If not set and if the model has at least two instantiated parameters, this value is equal to the highest instantiation among these parameters. If not set and if the model has fewer than two instantiated parameters, this value is equal to 5.0.
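The default rules for min_val and max_val can be written down as a short function. This is a sketch of the rules as documented here, assuming the instantiated parameter values are available as a plain list; `default_bounds` is a hypothetical helper, not a jajapy function.

```python
def default_bounds(instantiated, min_val=None, max_val=None):
    """Resolve (min_val, max_val) following the documented defaults.

    instantiated: values of the parameters already instantiated in the model.
    """
    enough = len(instantiated) >= 2  # "at least two instantiated parameters"
    if min_val is None:
        min_val = min(instantiated) if enough else 0.1
    if max_val is None:
        max_val = max(instantiated) if enough else 5.0
    return min_val, max_val


print(default_bounds([0.5, 3.0]))               # (0.5, 3.0)
print(default_bounds([2.0]))                    # (0.1, 5.0)
print(default_bounds([0.5, 3.0], min_val=1.0))  # (1.0, 3.0)
```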

processes: int, optional

Number of processes used during the learning. Multiple processes are only used on Linux; on Windows and Mac OS this value is 1. Default is cpu_count()-1.
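A rough sketch of how such a default could be computed is given below. It is not taken from jajapy's source: `default_processes` is a hypothetical helper, `os.cpu_count()` stands in for the cpu_count() mentioned above, and clamping the result to at least 1 is an assumption made for single-core machines.

```python
import os
import platform


def default_processes(system=None, cpus=None):
    """Return the number of processes following the documented default."""
    system = platform.system() if system is None else system
    cpus = (os.cpu_count() or 1) if cpus is None else cpus
    if system != "Linux":
        # Windows ("Windows") and Mac OS ("Darwin") use a single process.
        return 1
    # Assumption: never go below one process, even on a single-core machine.
    return max(1, cpus - 1)


print(default_processes("Linux", 8))    # 7
print(default_processes("Windows", 8))  # 1
```

The `system` and `cpus` arguments exist only to make the sketch easy to try on any machine; calling `default_processes()` without arguments uses the current platform.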

Returns

Model

fitted model.

fit_nonInstantiatedParameters(traces, initial_model, epsilon: float = 0.01, max_it: int = inf, pp: str = '', verbose: bool = True, return_data: bool = False, min_val: Optional[float] = None, max_val: Optional[float] = None) -> dict

For PCTMC learning only. Fits only the non-instantiated parameters in the initial model according to traces.

Parameters

traces: Set or list or numpy.ndarray

training set.

initial_model: PCTMC

first hypothesis.

epsilon: float, optional

The learning process stops when the difference between the log-likelihoods of the training set under the last two hypotheses is lower than epsilon. The lower this value, the better the output, but the longer the running time. Default is 0.01.

max_it: int, optional

Maximal number of iterations. The algorithm will stop after max_it iterations. Default is infinity.

pp: str, optional

Will be printed at each iteration. Default is an empty string.

verbose: bool, optional

Whether or not to print a short recap at the end of the learning. Default is True.

return_data: bool, optional

If set to True, a dictionary containing the following values will be returned alongside the hypothesis once the learning is done: ‘learning_rounds’, ‘learning_time’, ‘training_set_loglikelihood’. Default is False.

min_val: float, optional

Minimal value for the randomly instantiated parameters. If not set and if the model has at least two instantiated parameters, this value is equal to the smallest instantiation among these parameters. If not set and if the model has fewer than two instantiated parameters, this value is equal to 0.1.

max_val: float, optional

Maximal value for the randomly instantiated parameters. If not set and if the model has at least two instantiated parameters, this value is equal to the highest instantiation among these parameters. If not set and if the model has fewer than two instantiated parameters, this value is equal to 5.0.

Returns

dict or list

Dictionary containing the estimated values for the non-instantiated parameters. If return_data is set to True, returns a list containing:
- the dictionary described above,
- the returned data (see the description of the return_data parameter).
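Because the return type depends on return_data, calling code may want to normalize it. The helper below is a minimal sketch under the assumption that the list holds exactly two items in the order given above; `split_result` is a hypothetical name, and the parameter names and values in the example are made up.

```python
def split_result(result, return_data=False):
    """Return (parameters, data); data is None when return_data is False."""
    if return_data:
        # Assumed layout: [parameter dictionary, returned data dictionary].
        parameters, data = result
        return parameters, data
    return result, None


# Made-up values standing in for what the method could return.
with_data = [
    {"p": 0.7, "q": 1.3},
    {"learning_rounds": 12, "learning_time": 3.4,
     "training_set_loglikelihood": -50.5},
]
parameters, data = split_result(with_data, return_data=True)
print(parameters["p"], data["learning_rounds"])
```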

fit_parameters(traces, initial_model, to_update: list, epsilon: float = 0.01, max_it: int = inf, pp: str = '', verbose: bool = True, return_data: bool = False, min_val: Optional[float] = None, max_val: Optional[float] = None) -> dict

For PCTMC learning only. Fits only the parameters listed in to_update in the initial model according to traces.

Parameters

traces: Set or list or numpy.ndarray

training set.

initial_model: PCTMC

first hypothesis.

to_update: list of str

List of the names of the parameters to update.

epsilon: float, optional

The learning process stops when the difference between the log-likelihoods of the training set under the last two hypotheses is lower than epsilon. The lower this value, the better the output, but the longer the running time. Default is 0.01.

max_it: int, optional

Maximal number of iterations. The algorithm will stop after max_it iterations. Default is infinity.

pp: str, optional

Will be printed at each iteration. Default is an empty string.

verbose: bool, optional

Whether or not to print a short recap at the end of the learning. Default is True.

return_data: bool, optional

If set to True, a dictionary containing the following values will be returned alongside the hypothesis once the learning is done: ‘learning_rounds’, ‘learning_time’, ‘training_set_loglikelihood’. Default is False.

min_val: float, optional

Minimal value for the randomly instantiated parameters. If not set and if the model has at least two instantiated parameters, this value is equal to the smallest instantiation among these parameters. If not set and if the model has fewer than two instantiated parameters, this value is equal to 0.1.

max_val: float, optional

Maximal value for the randomly instantiated parameters. If not set and if the model has at least two instantiated parameters, this value is equal to the highest instantiation among these parameters. If not set and if the model has fewer than two instantiated parameters, this value is equal to 5.0.

Returns

dict or list

Dictionary containing the estimated values for the parameters listed in to_update. If return_data is set to True, returns a list containing:
- the dictionary described above,
- the returned data (see the description of the return_data parameter).