hetGP

Methods

`LOO_preds_nugs`(i)	Leave-out-out predictions for the nugget
`copy`()	Make a copy of the model, which is useful in tandem with the update function
`dlogLikHet`(X0, Z0, Z, mult, Delta, theta, g)	derivative of log-likelihood for logLikHet_Wood with respect to theta and Lambda with all observations Model: K = nu2 * (C + Lambda) = nu using all observations using the replicates information nu2 is replaced by its plugin estimator in the likelihood
`logLikHet`(X0, Z0, Z, mult, Delta, theta, g)	log-likelihood in the anisotropic case - one lengthscale by variable Model: K = nu2 * (C + Lambda) = nu using all observations using the replicates information nu2 is replaced by its plugin estimator in the likelihood
`mleHetGP`(X, Z[, lower, upper, known, ...])	Gaussian process modeling with heteroskedastic noise
`plot`([type])	Plot model behavior.
`predict`(x[, noise_var, xprime, nugs_only, ...])	Gaussian process predictions using a heterogeneous noise GP object (of `hetGP`)
`rebuild`([robust])	Rebuilds inverse covariance matrix in homGP object (usually after saving).
`strip`()	Removes larger model objects
`summary`()	Print summary of model object and hyperparameter optimization
`update`(Xnew, Znew[, ginit, lower, upper, ...])	Update model object with new observations

get

LOO_preds_nugs(i)[source]

Leave-out-out predictions for the nugget

Parameters:

self (hetGP object)
i (int) – index of the point to remove

Return type:

LOO preds

copy()[source]

Make a copy of the model, which is useful in tandem with the update function

Parameters:: None
Returns:: newmodel
Return type:: (deep) copy of model

dlogLikHet(X0, Z0, Z, mult, Delta, theta, g, k_theta_g=None, theta_g=None, beta0=None, pX=None, logN=True, SiNK=False, components=None, eps=np.float64(1.4901161193847656e-08), covtype='Gaussian', SiNK_eps=0.0001, penalty=True, hom_ll=None)[source]

derivative of log-likelihood for logLikHet_Wood with respect to theta and Lambda with all observations Model: K = nu2 * (C + Lambda) = nu using all observations using the replicates information nu2 is replaced by its plugin estimator in the likelihood

Parameters:

X0 (ndarray_like) – unique designs
Z0 (ndarray_like) – averaged observations
Z (ndarray_like) – replicated observations (sorted with respect to X0)
mult (ndarray_like) – number of replicates at each Xi
Delta (ndarray_like) – vector of nuggets corresponding to each X0i or pXi, that are smoothed to give Lambda
logN (bool) – should exponentiated variance be used
SiNK (bool) – should the smoothing come from the SiNK predictor instead of the kriging one
theta (ndarray_like) – scale parameter for the mean process, either one value (isotropic) or a vector (anistropic)
k_theta_g (ndarray_like) – constant used for linking nuggets lengthscale to mean process lengthscale, i.e., theta_g[k] = k_theta_g * theta[k], alternatively theta_g can be used
theta_g (ndarray_like) – either one value (isotropic) or a vector (anistropic), alternative to using k_theta_g
g (ndarray_like) – nugget of the nugget process
pX (ndarray_like) – matrix of pseudo inputs locations of the noise process for Delta (could be replaced by a vector to avoid double loop)
components (ndarray_like) – components to determine which variable are to be taken in the derivation: None for all, otherwise list with elements from ‘theta’, ‘Delta’, ‘theta_g’, ‘k_theta_g’, ‘pX’ and ‘g’.
beta0 (float) – mean, if not provided, the MLE estimator is used
eps (float) – minimal value of elements of Lambda
covtype (str) – covariance kernel type
penalty (bool) – should a penalty term on Delta be used?
hom_ll (float) – reference homoskedastic likelihood
SiNK_eps (float)

Return type:

ArrayLike

logLikHet(X0, Z0, Z, mult, Delta, theta, g, k_theta_g=None, theta_g=None, logN=None, SiNK=False, beta0=None, pX=None, eps=np.float64(1.4901161193847656e-08), covtype='Gaussian', SiNK_eps=0.0001, penalty=True, hom_ll=None, trace=0)[source]

log-likelihood in the anisotropic case - one lengthscale by variable Model: K = nu2 * (C + Lambda) = nu using all observations using the replicates information nu2 is replaced by its plugin estimator in the likelihood

Parameters:

X0 (ndarray_like) – unique designs
Z0 (ndarray_like) – averaged observations
Z (ndarray_like) – replicated observations (sorted with respect to X0)
mult (ndarray_like) – number of replicates at each Xi
Delta (ndarray_like) – vector of nuggets corresponding to each X0i or pXi, that are smoothed to give Lambda
logN (bool) – should exponentiated variance be used
SiNK (bool) – should the smoothing come from the SiNK predictor instead of the kriging one
theta (ndarray_like) – scale parameter for the mean process, either one value (isotropic) or a vector (anistropic)
k_theta_g (ndarray_like) – constant used for linking nuggets lengthscale to mean process lengthscale, i.e., theta_g[k] = k_theta_g * theta[k], alternatively theta_g can be used
theta_g (ndarray_like) – either one value (isotropic) or a vector (anistropic), alternative to using k_theta_g
g (ndarray_like) – nugget of the nugget process
pX (ndarray_like) – matrix of pseudo inputs locations of the noise process for Delta (could be replaced by a vector to avoid double loop)
beta0 (float) – mean, if not provided, the MLE estimator is used
eps (float) – minimal value of elements of Lambda
covtype (str) – covariance kernel type
penalty (bool) – should a penalty term on Delta be used?
hom_ll (float) – reference homoskedastic likelihood
SiNK_eps (float)
trace (int)

Return type:

float

mleHetGP(X, Z, lower=None, upper=None, known={}, noiseControl={'g_bounds': (1e-06, 1), 'g_max': 100, 'k_theta_g_bounds': (1, 100)}, init={}, covtype='Gaussian', maxit=100, eps=np.float64(1.4901161193847656e-08), settings={'factr': 1000000000.0, 'ignore_MLE_divide_invalid': True, 'returnKi': True}, use_torch=False)[source]

Gaussian process modeling with heteroskedastic noise

You may also call this function as model.mle

Gaussian process regression under input dependent noise based on maximum likelihood estimation of the hyperparameters. A second GP is used to model latent (log-) variances. This function is enhanced to deal with replicated observations.

Parameters:

X (ndarray_like) – matrix of all designs, one per row, or list with elements: - X0 matrix of unique design locations, one point per row - Z0 vector of averaged observations, of length len(X0) - mult number of replicates at designs in X0, of length len(X0)
Z (ndarray_like) – Z vector of all observations. If using a list with X, Z has to be ordered with respect to X0, and of length sum(mult)
lower (ndarray_like) – optional bounds for the theta parameter (see :func: covariance_functions.cov_gen for the exact parameterization). In the multivariate case, it is possible to give vectors for bounds (resp. scalars) for anisotropy (resp. isotropy)
upper (ndarray_like) – optional bounds for the theta parameter (see :func: covariance_functions.cov_gen for the exact parameterization). In the multivariate case, it is possible to give vectors for bounds (resp. scalars) for anisotropy (resp. isotropy)
noiseControl (dict) –
dict with elements related to optimization of the noise process parameters:
- g_min, g_max minimal and maximal noise to signal ratio (of the mean process)
- lowerDelta, upperDelta optional vectors (or scalars) of bounds on Delta, of length len(X0) (default to np.repeat(eps, X0.shape[0]) and np.repeat(noiseControl["g_max"], X0.shape[0]) resp., or their log)
- lowerpX, upperpX optional vectors of bounds of the input domain if pX is used.
- lowerTheta_g, upperTheta_g optional vectors of bounds for the lengthscales of the noise process if linkThetas == 'none'. Same as for theta if not provided.
- k_theta_g_bounds if linkThetas == 'joint', vector with minimal and maximal values for k_theta_g (default to (1, 100)). See Notes.
- g_bounds vector for minimal and maximal noise to signal ratios for the noise of the noise process, i.e., the smoothing parameter for the noise process. (default to (1e-6, 1)).
settings (dict) –
dict for options about the general modeling procedure, with elements:
- linkThetas defines the relation between lengthscales of the mean and noise processes. Either 'none', 'joint'``(default) or ``'constr', see Notes.
- logN, when True (default), the log-noise process is modeled.
- initStrategy one of 'simple', 'residuals' (default) and 'smoothed' to obtain starting values for Delta, see Notes
- penalty when True, the penalized version of the likelihood is used (i.e., the sum of the log-likelihoods of the mean and variance processes, see References).
- hardpenalty is True, the log-likelihood from the noise GP is taken into account only if negative (default if maxit > 1000).
- checkHom when True, if the log-likelihood with a homoskedastic model is better, then return it.
- trace optional scalar (default to 0). If negative, fit silently. If 0, only high level information is given. If 1, information is given about the result of the heterogeneous model optimization. Level 2 gives more details. Level 3 additionaly displays all details about initialization of hyperparameters.
- return_matrices boolean to include the inverse covariance matrix in the object for further use (e.g., prediction).
- return_hom boolean to include homoskedastic GP models used for initialization (i.e., modHom and modNugs).
- factr (default to 1e9) and pgtol are available to be passed to options for L-BFGS-B in :func: scipy.optimize.minimize.
eps (float) – jitter used in the inversion of the covariance matrix for numerical stability
init (dict) –
optional lists of starting values for mle optimization or that should not be optimized over, respectively. Values in known are not modified, while it can happen to these of init, see Notes. One can set one or several of the following:
- theta lengthscale parameter(s) for the mean process either one value (isotropic) or a vector (anistropic)
- Delta vector of nuggets corresponding to each design in X0, that are smoothed to give Lambda (as the global covariance matrix depends on Delta and nu_hat, it is recommended to also pass values for theta)
- beta0 constant trend of the mean process
- k_theta_g constant used for link mean and noise processes lengthscales, when settings['linkThetas'] == 'joint'
- theta_g either one value (isotropic) or a vector (anistropic) for lengthscale parameter(s) of the noise process, when settings['linkThetas'] != 'joint'
- g scalar nugget of the noise process
- g_H scalar homoskedastic nugget for the initialisation with a :func: homGP.mleHomGP. See Notes.
- pX matrix of fixed pseudo inputs locations of the noise process corresponding to Delta
known (dict) –
optional lists of starting values for mle optimization or that should not be optimized over, respectively. Values in known are not modified, while it can happen to these of init, see Notes. One can set one or several of the following:
- theta lengthscale parameter(s) for the mean process either one value (isotropic) or a vector (anistropic)
- Delta vector of nuggets corresponding to each design in X0, that are smoothed to give Lambda (as the global covariance matrix depends on Delta and nu_hat, it is recommended to also pass values for theta)
- beta0 constant trend of the mean process
- k_theta_g constant used for link mean and noise processes lengthscales, when settings['linkThetas'] == 'joint'
- theta_g either one value (isotropic) or a vector (anistropic) for lengthscale parameter(s) of the noise process, when settings['linkThetas'] != 'joint'
- g scalar nugget of the noise process
- g_H scalar homoskedastic nugget for the initialisation with a :func: homGP.mleHomGP. See Notes.
- pX matrix of fixed pseudo inputs locations of the noise process corresponding to Delta
covtype (str) – covariance kernel type, either 'Gaussian', 'Matern5_2' or 'Matern3_2', see :func: ~covariance_functions.cov_gen
maxit (int) – maximum number of iterations for L-BFGS-B of :func: scipy.optimize.minimize dedicated to maximum likelihood optimization
use_torch (bool)

Return type:

None

Notes

The global covariance matrix of the model is parameterized as nu_hat * (C + Lambda * np.diag(1/mult)) = nu_hat * K, with C the correlation matrix between unique designs, depending on the family of kernel used (see :func: ~covariance_functions.cov_gen for available choices) and values of lengthscale parameters. nu_hat is the plugin estimator of the variance of the process. Lambda is the prediction on the noise level given by a second (homoskedastic) GP: .. math:: Lambda = C_g(C_g + mathrm{diag}(g/mathrm{mult}))^{-1} Delta} with C_g the correlation matrix between unique designs for this second GP, with lengthscales hyperparameters theta_g and nugget g and Delta the variance level at X0 that are estimated.

It is generally recommended to use :func: ~find_reps.find_reps to pre-process the data, to rescale the inputs to the unit cube and to normalize the outputs.

The noise process lengthscales can be set in several ways:

using k_theta_g (settings['linkThetas'] == 'joint'), supposed to be greater than one by default. In this case lengthscales of the noise process are multiples of those of the mean process.
if settings['linkThetas'] == 'constr, then the lower bound on theta_g correspond to estimated values of a homoskedastic GP fit.
else lengthscales between the mean and noise process are independent (both either anisotropic or not).

When no starting nor fixed parameter values are provided with init or known, the initialization process consists of fitting first an homoskedastic model of the data, called modHom. init['theta'], initial lengthscales are taken at 10% of the range determined with lower and upper, while init['g_H'] may be use to pass an initial nugget value. The resulting lengthscales provide initial values for theta (or update them if given in init).

If necessary, a second homoskedastic model, modNugs, is fitted to the empirical residual variance between the prediction given by modHom at X0 and Z (up to modHom['nu_hat]'). Note that when specifying settings['linkThetas'] == 'joint', then this second homoskedastic model has fixed lengthscale parameters. Starting values for ``theta_g and g are extracted from modNugs

Three initialization schemes for Delta are available with settings['initStrategy']:

for settings['initStrategy'] == 'simple', Delta is simply initialized to the estimated g value of modHom.
Note that this procedure may fail when settings['penalty'] == True.
for settings['initStrategy'] == 'residuals', Delta is initialized to the estimated residual variance from the homoskedastic mean prediction.
for settings['initStrategy'] == 'smoothed', Delta takes the values predicted by modNugs at X0.

Notice that lower and upper bounds cannot be equal for :func: scipy.optimize.minimize. To use pseudo-input locations for the noise process, one can either provide pX if they are not to be optimized. Otherwise, initial values are given with pXinit, and optimization bounds with lowerpX, upperpX in init. Automatic initialization of the other parameters without restriction is available for now only with method ‘simple’, otherwise it is assumed that pXinit points are a subset of X0.

Returns:

theta: unless given, maximum likelihood estimate (mle) of the lengthscale parameter(s),
Delta: unless given, mle of the nugget vector (non-smoothed),
Lambda: predicted input noise variance at X0,
nu_hat: plugin estimator of the variance,
theta_g: unless given, mle of the lengthscale(s) of the noise/log-noise process,
k_theta_g: if settings['linkThetas'] == 'joint', mle for the constant by which lengthscale parameters of theta are multiplied to get theta_g,
g: unless given, mle of the nugget of the noise/log-noise process,
trendtype: either "SK" if beta0 is provided, else "OK",
beta0: constant trend of the mean process, plugin-estimator unless given,
nmean: plugin estimator for the constant noise/log-noise process mean,
pX: if used, matrix of pseudo-inputs locations for the noise/log-noise process,
ll: log-likelihood value, (ll_non_pen) is the value without the penalty,
nit_opt, msg: counts and message returned by :func:scipy.optimize.minimize
modHom: homoskedastic GP model of class homGP used for initialization of the mean process,
modNugs: homoskedastic GP model of class homGP used for initialization of the noise/log-noise process,
nu_hat_var: variance of the noise process,
used_args: list with arguments provided in the call to the function,
- Ki, Kgi: inverse of the covariance matrices of the mean and noise processes (not scaled by nu_hat and nu_hat_var),
- X0, Z0, Z, eps, logN, covtype: values given in input,
time: time to train the model, in seconds.

See also ~hetgpy.hetGP.hetGP.predict for predictions, ~hetgpy.hetGP.update for updating an existing model. summary and plot functions are available as well. ~hetTP.mleHetTP provide a Student-t equivalent.

Return type:

self, with the following attributes

Parameters:

X (ArrayLike)
Z (ArrayLike)
lower (TypeAliasForwardRef('ArrayLike') | None)
upper (TypeAliasForwardRef('ArrayLike') | None)
known (dict)
noiseControl (dict)
init (dict)
covtype (str)
maxit (int)
eps (float)
settings (dict)
use_torch (bool)

Examples

>>> from hetgpy import hetGP
>>> from hetgpy.example_data import mcycle
>>> m = mcycle()
>>> model = hetGP()
>>> model.mle(m['times'],m['accel'],lower=[1.0],upper=[10.0],covtype="Matern5_2")

References

M. Binois, Robert B. Gramacy, M. Ludkovski (2018), Practical heteroskedastic Gaussian process modeling for large simulation experiments, Journal of Computational and Graphical Statistics, 27(4), 808–821. Preprint available on arXiv:1611.05902.

plot(type='diagnostics', **kwargs)[source]

Plot model behavior. See hetgpy.plot for more details on types of plots

Parameters:: type (str) – type of plot: currently supported are “diagnostics” and “iterates”
Returns:: fig, ax
Return type:: figure and axis object

predict(x, noise_var=False, xprime=None, nugs_only=False, interval=None, interval_lower=None, interval_upper=None, **kwargs)[source]

Gaussian process predictions using a heterogeneous noise GP object (of hetGP)

Parameters:

x (ndarray_like) – matrix of designs locations to predict at (one point per row)
noise_var (bool (default False)) – should the variance of the latent variance process be returned?
xprime (ndarray_like) – optional second matrix of predictive locations to obtain the predictive covariance matrix between x and xprime
nugs_only (bool (default False)) – if True, only return noise variance prediction
kwargs (dict) – optional additional elements (not used)
interval (str | None)
interval_lower (float | None)
interval_upper (float | None)

Returns:

mean: kriging mean;
sd2: kriging variance (filtered, e.g. without the nugget values)
nugs: noise variance prediction
sd2_var: (returned if noise_var = True) kriging variance of the noise process (i.e., on log-variances if logN = TRUE)
cov: (returned if xprime is given) predictive covariance matrix between x and xprime

Return type:

dict with elements

Notes

The full predictive variance corresponds to the sum of sd2 and nugs. See :func: ~hetgpy.hetGP.mleHetGP for examples.

rebuild(robust=False)[source]

Rebuilds inverse covariance matrix in homGP object (usually after saving). Works in tandem with strip

Parameters:: robust (bool) – use np.linalg.pinv for covariance matrix inversion. Otherwise use cholesky
Return type:: self with rebuilt inverse covariance matrix

strip()[source]

Removes larger model objects

Can be rebuilt with rebuild

summary()[source]: Print summary of model object and hyperparameter optimization

update(Xnew, Znew, ginit=0.01, lower=None, upper=None, noiseControl=None, settings=None, known={}, maxit=100, method='quick')[source]

Update model object with new observations

Parameters:

Xnew (ndarray_like) – new inputs (one observation per row)
Znew (ndarray_like) – new observations
lower (ndarray_like) – lower bound for lengthscales. If not provided extracted from self
upper (ndarray_like) – upper bound for lengthscales. If not provided extracted from self
noiseControl (dict) – noise bounds. If not provided extracted from self
settings (dict) – options for optimization. If not provided extracted from self
known (dict) – known hyperparameters to fix at values
maxit (int) – maximum number of iterations for hyperparameter optimization
ginit (float)
method (str)

Return type:

self, potentially with updated hyperparameters

Notes

If hyperparameters do not need to be updated, maxit can be set to 0. In this case it is possible to pass NAs in Znew, then the model can still be used to provide updated variance predictions

Example

>>> import numpy as np, from hetgpy import hetGP
>>> X = np.array([1, 2, 3, 4, 12]).reshape(-1,1)
>>> Y = np.array([0, -1.75, -2, -0.5, 5])
>>> model = hetGP().mle(X,Y,known={'theta':np.array([10.0]),'g':1e-8})
>>> Xnew  = np.array([2.5]).reshape(-1,1)
>>> Znew  = model.predict(Xnew)['mean']
>>> model.update(Xnew,Znew)