crnGP
- class crnGP[source]
Methods
dloglik(X0, S0, Z, theta, g, rho, stype[, ...])Gradient of log-likelihood under correlated noise
get(key)General get item (retrives key from self.__dict__)
loglik(X0, S0, Z, theta, g, rho, stype[, ...])Log Likelihood under correlated noise
mlecrnGP(X, Z[, T0, rhotype, stype, lower, ...])Gaussian process modeling with correlated noise.
predict(x[, xprime, t0, interval, ...])Prediction under correlated noise
dloglikT
loglikT
pairwise_rho
- dloglik(X0, S0, Z, theta, g, rho, stype, beta0=None, covtype='Gaussian', eps=np.float64(1.4901161193847656e-08), components=('theta', 'g', 'rho'))[source]
Gradient of log-likelihood under correlated noise
- Parameters:
X0 (ndarray_like) – matrix of designs (one point per row)
S0 (ndarray_like) – integer vector containing seed information
Z (ndarray_like) – vector of outputs
theta (ndarray_like) – scalar (isotropic) or vector (anisotropic) lengthscale values
g (float) – noise variance
rho (float) – seed correlation
stype (str) – type of kronecker structure of the data
beta0 (float) – trend
covtype (str) – covariance kernel to use
eps (float) – amount of jitter on diagonal of covariance matrix
components (tuple) – directions for partial derivatives, one or several of ‘theta’ (lengthscales) or ‘g’ (for noise) or ‘rho’ (seed correlation strength)
- Return type:
gradient with respect to hyperparameters (specified via
components)
- loglik(X0, S0, Z, theta, g, rho, stype, beta0=None, covtype='Gaussian', eps=np.float64(1.4901161193847656e-08))[source]
Log Likelihood under correlated noise
- Parameters:
X0 (ndarray_like) – unique design matrix (of size nxd)
S0 (ndarray_like) – vector of seed information (size nx1)
Z (ndarray_like) – observation (output) vector (size N)
theta (ndarray_like) – lengthscale
g (float) – noise variance
rho (float) – seed correlation strength
stype (str) – type of kronecker structure of the data
beta0 (float) – trend
covtype (str) – covariance kernel to use
eps (float) – amount of jitter on diagonal of covariance matrix
- Returns:
loglik – log-likelihood
- Return type:
float
- mlecrnGP(X, Z, T0=None, rhotype='simple', stype='none', lower=None, upper=None, known={}, noiseControl={'g_bounds': array([1.49011612e-07, 1.00000000e+02]), 'rho_bounds': None}, init={}, covtype='Gaussian', maxit=100, eps=np.float64(1.4901161193847656e-08), settings={'factr': 10000000.0, 'return_Ki': True})[source]
Gaussian process modeling with correlated noise.
You may also call this function as crnGP.mle
Gaussian process regression under correlated noise based on maximum likelihood estimation of the hyperparameters.
- Parameters:
X (ndarray_like) – matrix of all designs, one per row, or list with elements. The last column is assumed to contain the integer seed value.
Z (ndarray_like) – Z vector of all observations.
T0 (ndarray_like:) – T0 optional vector of times (same for all
X's). Not currently supported.lower (ndarray_like) – optional bounds for the
thetaparameter (see :func: covariance_functions.cov_gen for the exact parameterization). In the multivariate case, it is possible to give vectors for bounds (resp. scalars) for anisotropy (resp. isotropy)upper (ndarray_like) – optional bounds for the
thetaparameter (see :func: covariance_functions.cov_gen for the exact parameterization). In the multivariate case, it is possible to give vectors for bounds (resp. scalars) for anisotropy (resp. isotropy)noiseControl (dict) –
- dict with element:
g_boundsvector providing minimal and maximal noise to signal ratio (default to(sqrt(MACHINE_DOUBLE_EPS), 100)).
settings (dict) –
- dict for options about the general modeling procedure, with elements:
return_Kiboolean to include the inverse covariance matrix in the object for further use (e.g., prediction).factr(default to 1e7) andpgtolare available to be passed to options for L-BFGS-B in :func:scipy.optimize.minimize.
eps (float) – jitter used in the inversion of the covariance matrix for numerical stability
known (dict) – optional dict of known parameters (e.g.
beta0,theta,g)init (dict) –
- optional lists of starting values for mle optimization:
theta_initinitial value of the theta parameters to be optimized over (default to 10% of the range determined withlowerandupper)g_initvector of nugget parameter to be optimized over
covtype (str) – covariance kernel type, either
'Gaussian','Matern5_2'or'Matern3_2', see :func:~covariance_functions.cov_genmaxit (int) – maximum number of iterations for L-BFGS-B of :func:
scipy.optimize.minimizededicated to maximum likelihood optimization
Notes
The global covariance matrix of the model is parameterized as
nu_hat * (Cx + g Id) * Cs = nu_hat * K, withCxthe spatial correlation matrix between unique designs, depending on the family of kernel used (see~covariance_functions.cov_genfor available choices) and values of lengthscale parameters. Cs is the correlation matrix between seed values, equal to 1 if the seeds are equal,rhootherwise.nu_hatis the plugin estimator of the variance of the process.It is generally recommended to rescale the inputs to the unit cube and to normalize the outputs.
- Returns:
theta: unless given, maximum likelihood estimate (mle) of the lengthscale parameter(s),nu_hat: plugin estimator of the variance,g: unless given, mle of the nugget of the noise/log-noise process,trendtype: either"SK"ifbeta0is provided, else"OK",beta0: constant trend of the mean process, plugin-estimator unless given,ll: log-likelihood value,nit_opt,msg: counts and message returned by :func:scipy.optimize.minimizeused_args: list with arguments provided in the call to the function,Ki,Kgi: inverse of the covariance matrices of the mean and noise processes (not scaled bynu_hatandnu_hat_var),X0,Z0,Z,eps,logN,covtype: values given in input,
time: time to train the model, in seconds.
- Return type:
self, with the following attributes
References
A Fadikar, M Binois, N Collier, A Stevens, KB Toh, J Ozik. Trajectory-oriented optimization of stochastic epidemiological models. 2023 Winter Simulation Conference (WSC), San Antonio, TX, USA, 2023, pp. 1244-1255, doi: 10.1109/WSC60868.2023.10408258
Examples
>>> from hetgpy import crnGP >>> import numpy as np >>> rng = np.random.default_rng(1) >>> pps = 10 # points per seed >>> x = np.linspace(0,2*np.pi,pps).reshape(-1,1) >>> X = np.vstack([x,x]) >>> seeds = ([1] * pps) + ([5] * pps) >>> X = np.hstack([X,np.array(seeds).reshape(-1,1)]) >>> # amplitude and phase shift >>> Z = np.sin(X[:,0] + (np.pi/2)*(X[:,-1]==5)) + X[:,-1] >>> GP = crnGP() >>> GP.mle(X,Z,covtype="Matern5_2")
- predict(x, xprime=None, t0=None, interval=None, interval_lower=None, interval_upper=None, **kw)[source]
Prediction under correlated noise
- Parameters:
x (ndarray_like) – matrix of designs locations to predict at (one point per row)
xprime (ndarray_like) – optional second matrix of predictive locations to obtain the predictive covariance matrix between
xandxprimet0 (ndarray_like) – vector of times (not currently supported)
interval (str) – one of ‘confidence’ or ‘predictive’ which is a convenience method to return confidence/predictive intervals corresponding to interval_lower and interval_upper
interval_lower (float) – lower of confidence/predictive interval
interval_upper (float) – upper of confidence/predictive interval
nugs_only (bool (default False)) – if
True, only return noise variance predictionkwargs (dict) – optional additional elements (only used for nugs_only)
- Returns:
mean: kriging mean;sd2: kriging variance (filtered, e.g. without the nugget values)nugs: noise variance predictioncov: (returned ifxprimeis given) predictive covariance matrix betweenxandxprimeconfidence_interval: prediction with kriging variance onlypredictive_interval: prediction with kriging and noise variance
- Return type:
dict with elements