gEconpy.model.statespace.DSGEStateSpace.build_statespace_graph
- DSGEStateSpace.build_statespace_graph(data, register_data=True, missing_fill_value=None, cov_jitter=1e-08, save_kalman_filter_outputs_in_idata=False, add_norm_check=True, add_bk_check=False, add_solver_success_check=False, add_steady_state_penalty=True, resid_penalty=1.0)
Given a parameter vector theta, constructs the full computational graph describing the state space model and the associated log probability of the data. Hidden states and log probabilities are computed via the Kalman Filter.
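To make the role of the Kalman Filter concrete, the following is a minimal NumPy sketch of how a filter accumulates the log probability of the data one observation at a time. It is an illustration only and not gEconpy's PyTensor implementation; the function `kalman_loglik` and the local-level example model are assumptions introduced here.

```python
import numpy as np


def kalman_loglik(y, T, Z, Q, H, x0, P0):
    """Log-likelihood of y (n_steps, n_obs) under a linear Gaussian state space model.

    Hypothetical helper for illustration; x0 and P0 describe the state before
    the first observation.
    """
    x, P = x0, P0
    loglik = 0.0
    for obs in y:
        # Predict the hidden state and its covariance one step ahead
        x = T @ x
        P = T @ P @ T.T + Q
        # Innovation (one-step forecast error) and its covariance
        v = obs - Z @ x
        F = Z @ P @ Z.T + H
        # Add the Gaussian log-density of the innovation
        _, logdet = np.linalg.slogdet(F)
        loglik += -0.5 * (v.size * np.log(2 * np.pi) + logdet + v @ np.linalg.solve(F, v))
        # Update the state estimate with the new observation
        K = P @ Z.T @ np.linalg.inv(F)
        x = x + K @ v
        P = P - K @ Z @ P
    return loglik


# Local-level model: x_t = x_{t-1} + w_t, y_t = x_t + e_t
T = Z = np.eye(1)
Q = np.array([[0.5]])  # state innovation variance
H = np.array([[0.2]])  # measurement error variance
x0 = np.zeros(1)
P0 = np.eye(1)
y = np.array([[0.3], [-0.1]])

loglik = kalman_loglik(y, T, Z, Q, H, x0, P0)
```

The sum of innovation log-densities equals the joint Gaussian log-density of all observations, which is why the filter can stand in for a direct likelihood evaluation.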
- Parameters:
- data
Union[np.ndarray,pd.DataFrame,pt.TensorVariable] The observed data used to fit the state space model. It can be a NumPy array, a Pandas DataFrame, or a Pytensor tensor variable.
- register_data: bool, optional, default=True
If True, the observed data will be registered with PyMC as a pm.Data variable. In addition, a “time” dim will be created and added to the model’s coords.
- mode
Optional[str], optional, default=None The Pytensor mode used for the computation graph construction. If None, the default mode will be used. Other options include “JAX” and “NUMBA”.
- missing_fill_value: float, optional, default=-9999
A value used to mask missing values. NaN values in the data need to be filled with an arbitrary value to avoid triggering PyMC’s automatic imputation machinery (missing values are instead filled by treating them as hidden states during Kalman filtering).
In general this never needs to be set. But if by a wild coincidence your data includes the value -9999.0, you will need to change the missing_fill_value to something else, to avoid incorrectly marking data as missing.
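The masking idea can be sketched in plain NumPy as follows. This is an illustration under stated assumptions, not the library's internal code: the helper `fill_missing` is hypothetical, and the collision check simply mirrors the warning above about data that already contains the fill value.

```python
import numpy as np

# Sentinel matching the default described above
MISSING_FILL_VALUE = -9999.0


def fill_missing(data, fill_value=MISSING_FILL_VALUE):
    """Return a filled copy of `data` plus a boolean mask of the missing entries.

    Hypothetical helper for illustration only.
    """
    data = np.asarray(data, dtype=float)
    # A genuine observation equal to the sentinel would be mistaken for missing
    if np.any(data == fill_value):
        raise ValueError("data contains the fill value; choose a different one")
    nan_mask = np.isnan(data)
    filled = np.where(nan_mask, fill_value, data)
    return filled, nan_mask


observed = np.array([[1.0, np.nan], [0.5, 2.0]])
filled, nan_mask = fill_missing(observed)
```

If the check raises, passing a different `missing_fill_value` (any value that does not occur in the data) resolves the collision.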
- cov_jitter: float, default 1e-8 or 1e-6 if pytensor.config.floatX is float32
The Kalman filter is known to be numerically unstable, especially at single precision (float32). This value is added to the diagonal of every covariance matrix – predicted, filtered, and smoothed – at every step, to ensure all matrices are strictly positive semi-definite.
Obviously, if this can be zero, that’s best. In general:
  - Having measurement error makes Kalman Filters more robust. A large source of numerical errors comes from the Filtered and Smoothed covariance matrices having a zero in the (0, 0) position, which always occurs when there is no measurement error. You can lower this value in the presence of measurement error.
  - The Univariate Filter is more robust than other filters, and can tolerate a lower jitter value.
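The effect of the jitter can be seen in a small NumPy sketch. The helpers `add_jitter` and `cholesky_succeeds` are hypothetical names introduced for illustration; the example only shows why a covariance matrix with a zero in the (0, 0) position breaks a Cholesky factorization and how a small diagonal term repairs it.

```python
import numpy as np


def add_jitter(cov, jitter=1e-8):
    """Add a small constant to the diagonal of a covariance matrix."""
    return cov + jitter * np.eye(cov.shape[0])


def cholesky_succeeds(cov):
    """Return True if `cov` admits a Cholesky factorization."""
    try:
        np.linalg.cholesky(cov)
        return True
    except np.linalg.LinAlgError:
        return False


# A covariance with a zero in the (0, 0) position, as arises without measurement error
P = np.array([[0.0, 0.0],
              [0.0, 1.0]])

ok_without_jitter = cholesky_succeeds(P)
ok_with_jitter = cholesky_succeeds(add_jitter(P))
```

Here the factorization fails on `P` itself and succeeds once the jitter is added, at the cost of a small perturbation to every diagonal entry.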
- mvn_method: str, default “svd”
Method used to invert the covariance matrix when calculating the pdf of a multivariate normal (or when generating samples). One of “cholesky”, “eigh”, or “svd”. “cholesky” is fastest, but least robust to ill-conditioned matrices, while “svd” is slow but extremely robust.
In general, if your model has measurement error, “cholesky” will be safe to use. Otherwise, “svd” is recommended. “eigh” can also be tried if sampling with “svd” is very slow, but it is not as robust as “svd”.
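The three options correspond to three ways of factorizing the covariance matrix. The NumPy sketch below shows how each one can be used to evaluate a multivariate normal log-density; it is an illustration, not the library's implementation, and the function `mvn_logpdf` is a hypothetical name.

```python
import numpy as np


def mvn_logpdf(x, mu, cov, method="svd"):
    """Multivariate normal log-density using the chosen factorization of `cov`."""
    d = x - mu
    if method == "cholesky":
        # cov = L L^T; fails if cov is not positive definite
        L = np.linalg.cholesky(cov)
        z = np.linalg.solve(L, d)
        quad = z @ z
        logdet = 2.0 * np.sum(np.log(np.diag(L)))
    elif method == "eigh":
        # cov = V diag(w) V^T for a symmetric matrix
        w, V = np.linalg.eigh(cov)
        z = V.T @ d
        quad = np.sum(z**2 / w)
        logdet = np.sum(np.log(w))
    elif method == "svd":
        # For a symmetric positive definite matrix, cov = U diag(s) U^T
        U, s, _ = np.linalg.svd(cov)
        z = U.T @ d
        quad = np.sum(z**2 / s)
        logdet = np.sum(np.log(s))
    else:
        raise ValueError(f"unknown method: {method}")
    return -0.5 * (d.size * np.log(2 * np.pi) + logdet + quad)


x = np.array([0.3, -0.1])
mu = np.zeros(2)
cov = np.array([[2.0, 0.5],
                [0.5, 1.0]])

results = {m: mvn_logpdf(x, mu, cov, method=m) for m in ("cholesky", "eigh", "svd")}
```

On a well-conditioned matrix such as this one the three methods agree to floating-point precision; they differ in speed and in how they behave as the matrix approaches singularity.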
- save_kalman_filter_outputs_in_idata: bool, optional, default=False
If True, Kalman Filter outputs will be saved in the model as deterministics. Useful for debugging, but should not be necessary for the majority of users.
- mode: str, optional
Pytensor mode to use when compiling the graph. This will be saved as a model attribute and used when compiling sampling functions (e.g. sample_conditional_prior).
Deprecated since version 0.2.5: The mode argument is deprecated and will be removed in a future version. Pass mode to the model constructor, or manually specify compile_kwargs in sampling functions instead.
- data