Baum-Welch

class jajapy.BW

Class for the Baum-Welch algorithm.

fit(training_set, initial_model=None, nb_states: Optional[int] = None, random_initial_state: bool = True, min_exit_rate_time: Optional[int] = None, max_exit_rate_time: Optional[int] = None, self_loop: Optional[bool] = None, nb_distributions: Optional[int] = None, output_file: Optional[str] = None, output_file_prism: Optional[str] = None, epsilon: float = 0.01, max_it: int = inf, pp: str = '', verbose: int = 2, return_data: bool = False, stormpy_output: bool = True, fixed_parameters: ndarray = False, update_constant: bool = True, min_val: Optional[float] = None, max_val: Optional[float] = None, processes: Optional[int] = None)

Fits any model according to traces. This method determines which type of Markov model should be used, based on the training set and on the initial model (if given).

Parameters

training_set: Set, list or ndarray.: The training set.
initial_model: Model or stormpy.sparse model, optional.: The first hypothesis. If not set, a random Model with nb_states states is created.
nb_states: int, optional.: If initial_model is not set it will create a random Model with nb_states states. Must be set if initial_model is not set.
random_initial_state: bool, optional.: Used only if initial_model is not set. If True, the random Model is created with a random initial state. Default is True.
min_exit_rate_time: int, optional: For CTMC learning only. Minimum exit rate for the states in the first hypothesis. Must be set if initial_model is not set, and if the SUL is a CTMC or PCTMC.
max_exit_rate_time: int, optional: For CTMC learning only. Maximum exit rate for the states in the first hypothesis. Must be set if initial_model is not set, and if the SUL is a CTMC or PCTMC.
self_loop: bool, optional: For CTMC/PCTMC learning only. Whether or not there will be self-loops in the first hypothesis. Must be set if initial_model is not set, and if the SUL is a CTMC or PCTMC.
nb_distributions: int, optional.: For GoHMM learning only. Number of distributions in each state in the initial hypothesis. Must be set if initial_model is not set, and if the SUL is a GoHMM.
output_file: str, optional: If set, the output model will be saved at this location. Otherwise the output model will not be saved.
output_file_prism: str, optional: If set, the output model will be saved in a prism file at this location. Otherwise the output model will not be saved. This parameter is ignored if the model under learning is a HMM or a GoHMM.
epsilon: float, optional: The learning process stops when the difference between the loglikelihood of the training set under the last two hypotheses is lower than epsilon. The lower this value the better the output, but the longer the running time. Default is 0.01.
max_it: int, optional: Maximal number of iterations. The algorithm will stop after max_it iterations. Default is infinity.
pp: str, optional: Will be printed at each iteration. Default is an empty string.
verbose: int, optional.: Level of information printed during the learning: 0 - nothing (no warnings, no progress bar, no recap at the end); 1 - minimal (warnings only); 2 - default (warnings and progress bar, no recap at the end); 3 - maximal (warnings, progress bar and recap). Default is 2.
return_data: bool, optional.: If set to True, a dictionary containing the following values will be returned alongside the output model once the learning is done: ‘learning_rounds’, ‘learning_time’, ‘training_set_loglikelihood’. Default is False.
stormpy_output: bool, optional.: If set to True the output model will be a Stormpy sparse model. Doesn’t work for HMM and GoHMM. Default is True.
fixed_parameters: ndarray of bool, optional: For CTMC/PCTMC learning only. ndarray of bool with the same shape as the transition matrix (i.e. nb_states x nb_states). If fixed_parameters[s1,s2] == True, the transition parameter from s1 to s2 will not be changed during the learning (it’s a fixed parameter). By default no parameters will be fixed.
update_constant: bool, optional: For PCTMC learning only. If set to False, the constant transitions (i.e. the transitions that don’t depend on any parameter) will not be updated. Default is True.
min_val: float, optional: For PCTMC learning only. Minimal value for the randomly instantiated parameters. If not set and if the model has at least two instantiated parameters, this value is equal to the smallest value among the instantiated parameters. If not set and if the model has less than two instantiated parameters, this value is equal to 0.1.
max_val: float, optional: For PCTMC learning only. Maximal value for the randomly instantiated parameters. If not set and if the model has at least two instantiated parameters, this value is equal to the highest value among the instantiated parameters. If not set and if the model has less than two instantiated parameters, this value is equal to 5.0.
processes: int, optional: Number of processes used during the learning. Linux only: on Windows and Mac OS it is always 1. Default is cpu_count()-1.
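
The effect of fixed_parameters can be pictured with a small sketch. It is only an illustration of the masking rule described above, not jajapy's implementation: the helper name apply_update and the numeric values are hypothetical, and nested lists stand in for the ndarray so that the example needs no dependency.

```python
def apply_update(current, estimate, fixed):
    """Return the new transition matrix, keeping fixed entries unchanged.

    current, estimate and fixed are nb_states x nb_states nested lists;
    fixed[s1][s2] == True means the s1 -> s2 parameter must not change.
    """
    n = len(current)
    return [
        [current[s1][s2] if fixed[s1][s2] else estimate[s1][s2]
         for s2 in range(n)]
        for s1 in range(n)
    ]


# Illustrative values only: a 2-state model whose 0 -> 1 rate is fixed.
current = [[0.0, 2.0], [1.0, 0.0]]
estimate = [[0.0, 2.6], [0.7, 0.0]]
fixed = [[False, True], [False, False]]

updated = apply_update(current, estimate, fixed)
print(updated)
```

Here the 0 -> 1 entry keeps its original value while the 1 -> 0 entry takes the new estimate.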

Returns

Model: fitted model.
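
The interplay between epsilon and max_it can be sketched as a plain loop. This is a minimal sketch of the stopping rule as described in the parameter list, not the library's code: run_until_converged and step are hypothetical names, and step stands for one re-estimation round that returns the training-set loglikelihood.

```python
import math


def run_until_converged(step, epsilon=0.01, max_it=math.inf):
    """Call step() until the loglikelihood gain is lower than epsilon.

    step is a hypothetical callable: it performs one learning round and
    returns the loglikelihood of the training set under the new hypothesis.
    Returns the number of rounds performed and the last loglikelihood.
    """
    previous = -math.inf
    current = previous
    rounds = 0
    while rounds < max_it:
        current = step()
        rounds += 1
        if current - previous < epsilon:
            break  # the last two hypotheses are almost equally likely
        previous = current
    return rounds, current


# Illustrative loglikelihood sequence, not real learning output.
values = iter([-120.0, -100.0, -99.5, -99.495, -99.49])
rounds, loglikelihood = run_until_converged(lambda: next(values))
print(rounds, loglikelihood)
```

With the default epsilon the loop stops at the fourth round, where the gain falls under 0.01. Passing max_it=2 would stop it after two rounds regardless of the gain.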

fit_nonInstantiatedParameters(traces, initial_model, epsilon: float = 0.01, max_it: int = inf, pp: str = '', verbose: bool = True, return_data: bool = False, min_val: Optional[float] = None, max_val: Optional[float] = None) → dict

For PCTMC learning only. Fits only the non-instantiated parameters in the initial model according to traces.

Parameters

traces: Set or list or numpy.ndarray: training set.
initial_model: PCTMC: first hypothesis.
epsilon: float, optional: The learning process stops when the difference between the loglikelihood of the training set under the last two hypotheses is lower than epsilon. The lower this value the better the output, but the longer the running time. Default is 0.01.
max_it: int: Maximal number of iterations. The algorithm will stop after max_it iterations. Default is infinity.
pp: str, optional: Will be printed at each iteration. Default is an empty string.
verbose: bool, optional: Whether to print a short recap at the end of the learning. Default is True.
return_data: bool, optional: If set to True, a dictionary containing the following values will be returned alongside the hypothesis once the learning is done: ‘learning_rounds’, ‘learning_time’, ‘training_set_loglikelihood’. Default is False.
min_val: float, optional: Minimal value for the randomly instantiated parameters. If not set and if the model has at least two instantiated parameters, this value is equal to the smallest value among the instantiated parameters. If not set and if the model has less than two instantiated parameters, this value is equal to 0.1.
max_val: float, optional: Maximal value for the randomly instantiated parameters. If not set and if the model has at least two instantiated parameters, this value is equal to the highest value among the instantiated parameters. If not set and if the model has less than two instantiated parameters, this value is equal to 5.0.
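
The default rule for min_val and max_val can be written out as a short function. This is a sketch of the rule as stated in the two descriptions above, under the assumption that the two defaults are resolved independently; default_bounds is a hypothetical helper, not part of jajapy.

```python
def default_bounds(instantiated, min_val=None, max_val=None):
    """Resolve the bounds used for randomly instantiated parameters.

    instantiated is the list of values of the already instantiated
    parameters. Explicit min_val / max_val always take precedence.
    """
    if len(instantiated) >= 2:
        low, high = min(instantiated), max(instantiated)
    else:
        # Fewer than two instantiated parameters: documented fallbacks.
        low, high = 0.1, 5.0
    return (low if min_val is None else min_val,
            high if max_val is None else max_val)


print(default_bounds([0.5, 2.0, 1.2]))    # bounds taken from the model
print(default_bounds([3.0]))              # fallback values
print(default_bounds([], min_val=0.2))    # explicit value wins
```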

Returns

dict or list: Dictionary containing the estimated values for the non-instantiated parameters. If return_data is set to True, returns a list containing: - the dictionary described above, - the returned_data (see parameter description).
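
Because the return type depends on return_data, calling code may want to normalize it. The helper below is a hypothetical convenience wrapper and assumes the two-element list layout described above. The sample values are invented for illustration.

```python
def split_result(result, return_data):
    """Normalize the documented return value to (estimates, data).

    When return_data is False, result is the dictionary of estimates and
    data is None. When it is True, result is a two-element list holding
    the estimates followed by the learning data.
    """
    if return_data:
        estimates, data = result
        return estimates, data
    return result, None


# Invented sample shaped like the documented return value.
sample = [
    {"p": 0.8},
    {"learning_rounds": 12,
     "learning_time": 0.4,
     "training_set_loglikelihood": -310.2},
]
estimates, data = split_result(sample, return_data=True)
print(estimates["p"], data["learning_rounds"])
```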

fit_parameters(traces, initial_model, to_update: list, epsilon: float = 0.01, max_it: int = inf, pp: str = '', verbose: bool = True, return_data: bool = False, min_val: Optional[float] = None, max_val: Optional[float] = None) → dict

For PCTMC learning only. Fits only the parameters listed in to_update in the initial model according to traces.

Parameters

traces: Set or list or numpy.ndarray: training set.
initial_model: PCTMC: first hypothesis.
to_update: list of str: list of the names of the parameters to update.
epsilon: float, optional: The learning process stops when the difference between the loglikelihood of the training set under the last two hypotheses is lower than epsilon. The lower this value the better the output, but the longer the running time. Default is 0.01.
max_it: int: Maximal number of iterations. The algorithm will stop after max_it iterations. Default is infinity.
pp: str, optional: Will be printed at each iteration. Default is an empty string.
verbose: bool, optional: Whether to print a short recap at the end of the learning. Default is True.
return_data: bool, optional: If set to True, a dictionary containing the following values will be returned alongside the hypothesis once the learning is done: ‘learning_rounds’, ‘learning_time’, ‘training_set_loglikelihood’. Default is False.
min_val: float, optional: Minimal value for the randomly instantiated parameters. If not set and if the model has at least two instantiated parameters, this value is equal to the smallest value among the instantiated parameters. If not set and if the model has less than two instantiated parameters, this value is equal to 0.1.
max_val: float, optional: Maximal value for the randomly instantiated parameters. If not set and if the model has at least two instantiated parameters, this value is equal to the highest value among the instantiated parameters. If not set and if the model has less than two instantiated parameters, this value is equal to 5.0.

Returns

dict or list: Dictionary containing the estimated values for the non-instantiated parameters. If return_data is set to True, returns a list containing: - the dictionary described above, - the returned_data (see parameter description).
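
The role of to_update can be illustrated by a name-based selection. This is only a sketch of the idea of updating a chosen subset of parameters. update_selected, the parameter names and the values are all hypothetical, and nothing here reflects how jajapy performs the estimation itself.

```python
def update_selected(values, estimates, to_update):
    """Return a copy of values where only the names in to_update change.

    values maps parameter names to their current values and estimates
    maps names to newly estimated values. Unknown names are rejected.
    """
    unknown = [name for name in to_update if name not in values]
    if unknown:
        raise KeyError("unknown parameter(s): " + ", ".join(unknown))
    selected = set(to_update)
    return {
        name: estimates[name] if name in selected else value
        for name, value in values.items()
    }


# Hypothetical parameter names and values.
values = {"p": 1.0, "q": 2.0}
estimates = {"p": 1.5, "q": 2.5}
print(update_selected(values, estimates, ["p"]))
```

Only p takes its new estimate, while q keeps its current value.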