hetGPy API reference

docstring for find_reps

find_reps(X, Z, return_Zlist=True, rescale=False, normalize=False, inputBounds=None, use_torch=False)[source]

Prepare data for use with mleHomGP and mleHetGP

In particular to find replicated observations

Parameters:
  • X (array_like) – matrix of design locations, one point per row

  • Z (array_like) – vector of observations at X

  • return_Zlist (bool) – to return Zlist, see below

  • rescale (bool) – if True, the inputs are rescaled to the unit hypercube

  • normalize (bool) – if True, the outputs are centered and normalized

  • inputBounds (array_like) – optional matrix of known boundaries in original input space, of size 2 times X.shape[1]. If not provided, and rescale == True, it is estimated from the data.

Returns:

  • dict – dictionary of outputs

  • type – explain types

  • out

    dictionary of

    • X0 matrix with unique designs locations, one point per row

    • Z0 vector of averaged observations at X0

    • mult number of replicates at X0

    • Z vector with all observations, sorted according to X0

    • Zlist optional list, each element corresponds to observations at a design in X0

    • inputBounds optional matrix, to rescale back to the original input space

    • outputStats optional vector, with mean and variance of the original outputs.

References

Binois et. al (2018)

Examples

>>> from hetgpy.data import mcycle
>>> from hetgpy.hetGP import hetGP
>>> X = mcycle['times']
>>> Z = mcycle['accel']
>>> model = hetGP()
>>> out = model.find_reps(X, Z)
>>> print(out)
class hetGP[source]

Methods

LOO_preds_nugs(i)

Leave-out-out predictions for the nugget

copy()

Make a copy of the model, which is useful in tandem with the update function

dlogLikHet(X0, Z0, Z, mult, Delta, theta, g)

derivative of log-likelihood for logLikHet_Wood with respect to theta and Lambda with all observations Model: K = nu2 * (C + Lambda) = nu using all observations using the replicates information nu2 is replaced by its plugin estimator in the likelihood

logLikHet(X0, Z0, Z, mult, Delta, theta, g)

log-likelihood in the anisotropic case - one lengthscale by variable Model: K = nu2 * (C + Lambda) = nu using all observations using the replicates information nu2 is replaced by its plugin estimator in the likelihood

mleHetGP(X, Z[, lower, upper, known, ...])

Gaussian process modeling with heteroskedastic noise

plot([type])

Plot model behavior.

predict(x[, noise_var, xprime, nugs_only, ...])

Gaussian process predictions using a heterogeneous noise GP object (of hetGP)

rebuild([robust])

Rebuilds inverse covariance matrix in homGP object (usually after saving).

strip()

Removes larger model objects

summary()

Print summary of model object and hyperparameter optimization

update(Xnew, Znew[, ginit, lower, upper, ...])

Update model object with new observations

get

LOO_preds_nugs(i)[source]

Leave-out-out predictions for the nugget

Parameters:
  • self (hetGP object)

  • i (int) – index of the point to remove

Return type:

LOO preds

copy()[source]

Make a copy of the model, which is useful in tandem with the update function

Parameters:

None

Returns:

newmodel

Return type:

(deep) copy of model

dlogLikHet(X0, Z0, Z, mult, Delta, theta, g, k_theta_g=None, theta_g=None, beta0=None, pX=None, logN=True, SiNK=False, components=None, eps=np.float64(1.4901161193847656e-08), covtype='Gaussian', SiNK_eps=0.0001, penalty=True, hom_ll=None)[source]

derivative of log-likelihood for logLikHet_Wood with respect to theta and Lambda with all observations Model: K = nu2 * (C + Lambda) = nu using all observations using the replicates information nu2 is replaced by its plugin estimator in the likelihood

Parameters:
  • X0 (ndarray_like) – unique designs

  • Z0 (ndarray_like) – averaged observations

  • Z (ndarray_like) – replicated observations (sorted with respect to X0)

  • mult (ndarray_like) – number of replicates at each Xi

  • Delta (ndarray_like) – vector of nuggets corresponding to each X0i or pXi, that are smoothed to give Lambda

  • logN (bool) – should exponentiated variance be used

  • SiNK (bool) – should the smoothing come from the SiNK predictor instead of the kriging one

  • theta (ndarray_like) – scale parameter for the mean process, either one value (isotropic) or a vector (anistropic)

  • k_theta_g (ndarray_like) – constant used for linking nuggets lengthscale to mean process lengthscale, i.e., theta_g[k] = k_theta_g * theta[k], alternatively theta_g can be used

  • theta_g (ndarray_like) – either one value (isotropic) or a vector (anistropic), alternative to using k_theta_g

  • g (ndarray_like) – nugget of the nugget process

  • pX (ndarray_like) – matrix of pseudo inputs locations of the noise process for Delta (could be replaced by a vector to avoid double loop)

  • components (ndarray_like) – components to determine which variable are to be taken in the derivation: None for all, otherwise list with elements from ‘theta’, ‘Delta’, ‘theta_g’, ‘k_theta_g’, ‘pX’ and ‘g’.

  • beta0 (float) – mean, if not provided, the MLE estimator is used

  • eps (float) – minimal value of elements of Lambda

  • covtype (str) – covariance kernel type

  • penalty (bool) – should a penalty term on Delta be used?

  • hom_ll (float) – reference homoskedastic likelihood

  • SiNK_eps (float)

Return type:

ArrayLike

logLikHet(X0, Z0, Z, mult, Delta, theta, g, k_theta_g=None, theta_g=None, logN=None, SiNK=False, beta0=None, pX=None, eps=np.float64(1.4901161193847656e-08), covtype='Gaussian', SiNK_eps=0.0001, penalty=True, hom_ll=None, trace=0)[source]

log-likelihood in the anisotropic case - one lengthscale by variable Model: K = nu2 * (C + Lambda) = nu using all observations using the replicates information nu2 is replaced by its plugin estimator in the likelihood

Parameters:
  • X0 (ndarray_like) – unique designs

  • Z0 (ndarray_like) – averaged observations

  • Z (ndarray_like) – replicated observations (sorted with respect to X0)

  • mult (ndarray_like) – number of replicates at each Xi

  • Delta (ndarray_like) – vector of nuggets corresponding to each X0i or pXi, that are smoothed to give Lambda

  • logN (bool) – should exponentiated variance be used

  • SiNK (bool) – should the smoothing come from the SiNK predictor instead of the kriging one

  • theta (ndarray_like) – scale parameter for the mean process, either one value (isotropic) or a vector (anistropic)

  • k_theta_g (ndarray_like) – constant used for linking nuggets lengthscale to mean process lengthscale, i.e., theta_g[k] = k_theta_g * theta[k], alternatively theta_g can be used

  • theta_g (ndarray_like) – either one value (isotropic) or a vector (anistropic), alternative to using k_theta_g

  • g (ndarray_like) – nugget of the nugget process

  • pX (ndarray_like) – matrix of pseudo inputs locations of the noise process for Delta (could be replaced by a vector to avoid double loop)

  • beta0 (float) – mean, if not provided, the MLE estimator is used

  • eps (float) – minimal value of elements of Lambda

  • covtype (str) – covariance kernel type

  • penalty (bool) – should a penalty term on Delta be used?

  • hom_ll (float) – reference homoskedastic likelihood

  • SiNK_eps (float)

  • trace (int)

Return type:

float

mleHetGP(X, Z, lower=None, upper=None, known={}, noiseControl={'g_bounds': (1e-06, 1), 'g_max': 100, 'k_theta_g_bounds': (1, 100)}, init={}, covtype='Gaussian', maxit=100, eps=np.float64(1.4901161193847656e-08), settings={'factr': 1000000000.0, 'ignore_MLE_divide_invalid': True, 'returnKi': True}, use_torch=False)[source]

Gaussian process modeling with heteroskedastic noise

You may also call this function as model.mle

Gaussian process regression under input dependent noise based on maximum likelihood estimation of the hyperparameters. A second GP is used to model latent (log-) variances. This function is enhanced to deal with replicated observations.

Parameters:
  • X (ndarray_like) – matrix of all designs, one per row, or list with elements: - X0 matrix of unique design locations, one point per row - Z0 vector of averaged observations, of length len(X0) - mult number of replicates at designs in X0, of length len(X0)

  • Z (ndarray_like) – Z vector of all observations. If using a list with X, Z has to be ordered with respect to X0, and of length sum(mult)

  • lower (ndarray_like) – optional bounds for the theta parameter (see :func: covariance_functions.cov_gen for the exact parameterization). In the multivariate case, it is possible to give vectors for bounds (resp. scalars) for anisotropy (resp. isotropy)

  • upper (ndarray_like) – optional bounds for the theta parameter (see :func: covariance_functions.cov_gen for the exact parameterization). In the multivariate case, it is possible to give vectors for bounds (resp. scalars) for anisotropy (resp. isotropy)

  • noiseControl (dict) –

    dict with elements related to optimization of the noise process parameters:
    • g_min, g_max minimal and maximal noise to signal ratio (of the mean process)

    • lowerDelta, upperDelta optional vectors (or scalars) of bounds on Delta, of length len(X0) (default to np.repeat(eps, X0.shape[0]) and np.repeat(noiseControl["g_max"], X0.shape[0]) resp., or their log)

    • lowerpX, upperpX optional vectors of bounds of the input domain if pX is used.

    • lowerTheta_g, upperTheta_g optional vectors of bounds for the lengthscales of the noise process if linkThetas == 'none'. Same as for theta if not provided.

    • k_theta_g_bounds if linkThetas == 'joint', vector with minimal and maximal values for k_theta_g (default to (1, 100)). See Notes.

    • g_bounds vector for minimal and maximal noise to signal ratios for the noise of the noise process, i.e., the smoothing parameter for the noise process. (default to (1e-6, 1)).

  • settings (dict) –

    dict for options about the general modeling procedure, with elements:
    • linkThetas defines the relation between lengthscales of the mean and noise processes. Either 'none', 'joint'``(default) or ``'constr', see Notes.

    • logN, when True (default), the log-noise process is modeled.

    • initStrategy one of 'simple', 'residuals' (default) and 'smoothed' to obtain starting values for Delta, see Notes

    • penalty when True, the penalized version of the likelihood is used (i.e., the sum of the log-likelihoods of the mean and variance processes, see References).

    • hardpenalty is True, the log-likelihood from the noise GP is taken into account only if negative (default if maxit > 1000).

    • checkHom when True, if the log-likelihood with a homoskedastic model is better, then return it.

    • trace optional scalar (default to 0). If positive, tracing information on the fitting process. If 1, information is given about the result of the heterogeneous model optimization. Level 2 gives more details. Level 3 additionaly displays all details about initialization of hyperparameters.

    • return_matrices boolean to include the inverse covariance matrix in the object for further use (e.g., prediction).

    • return_hom boolean to include homoskedastic GP models used for initialization (i.e., modHom and modNugs).

    • factr (default to 1e9) and pgtol are available to be passed to options for L-BFGS-B in :func: scipy.optimize.minimize.

  • eps (float) – jitter used in the inversion of the covariance matrix for numerical stability

  • init (dict) –

    optional lists of starting values for mle optimization or that should not be optimized over, respectively. Values in known are not modified, while it can happen to these of init, see Notes. One can set one or several of the following:

    • theta lengthscale parameter(s) for the mean process either one value (isotropic) or a vector (anistropic)

    • Delta vector of nuggets corresponding to each design in X0, that are smoothed to give Lambda (as the global covariance matrix depends on Delta and nu_hat, it is recommended to also pass values for theta)

    • beta0 constant trend of the mean process

    • k_theta_g constant used for link mean and noise processes lengthscales, when settings['linkThetas'] == 'joint'

    • theta_g either one value (isotropic) or a vector (anistropic) for lengthscale parameter(s) of the noise process, when settings['linkThetas'] != 'joint'

    • g scalar nugget of the noise process

    • g_H scalar homoskedastic nugget for the initialisation with a :func: homGP.mleHomGP. See Notes.

    • pX matrix of fixed pseudo inputs locations of the noise process corresponding to Delta

  • known (dict) –

    optional lists of starting values for mle optimization or that should not be optimized over, respectively. Values in known are not modified, while it can happen to these of init, see Notes. One can set one or several of the following:

    • theta lengthscale parameter(s) for the mean process either one value (isotropic) or a vector (anistropic)

    • Delta vector of nuggets corresponding to each design in X0, that are smoothed to give Lambda (as the global covariance matrix depends on Delta and nu_hat, it is recommended to also pass values for theta)

    • beta0 constant trend of the mean process

    • k_theta_g constant used for link mean and noise processes lengthscales, when settings['linkThetas'] == 'joint'

    • theta_g either one value (isotropic) or a vector (anistropic) for lengthscale parameter(s) of the noise process, when settings['linkThetas'] != 'joint'

    • g scalar nugget of the noise process

    • g_H scalar homoskedastic nugget for the initialisation with a :func: homGP.mleHomGP. See Notes.

    • pX matrix of fixed pseudo inputs locations of the noise process corresponding to Delta

  • covtype (str) – covariance kernel type, either 'Gaussian', 'Matern5_2' or 'Matern3_2', see :func: ~covariance_functions.cov_gen

  • maxit (int) – maximum number of iterations for L-BFGS-B of :func: scipy.optimize.minimize dedicated to maximum likelihood optimization

  • use_torch (bool)

Return type:

None

Notes

The global covariance matrix of the model is parameterized as nu_hat * (C + Lambda * np.diag(1/mult)) = nu_hat * K, with C the correlation matrix between unique designs, depending on the family of kernel used (see :func: ~covariance_functions.cov_gen for available choices) and values of lengthscale parameters. nu_hat is the plugin estimator of the variance of the process. Lambda is the prediction on the noise level given by a second (homoskedastic) GP: .. math:: Lambda = C_g(C_g + mathrm{diag}(g/mathrm{mult}))^{-1} Delta} with C_g the correlation matrix between unique designs for this second GP, with lengthscales hyperparameters theta_g and nugget g and Delta the variance level at X0 that are estimated.

It is generally recommended to use :func: ~find_reps.find_reps to pre-process the data, to rescale the inputs to the unit cube and to normalize the outputs.

The noise process lengthscales can be set in several ways:
  • using k_theta_g (settings['linkThetas'] == 'joint'), supposed to be greater than one by default. In this case lengthscales of the noise process are multiples of those of the mean process.

  • if settings['linkThetas'] == 'constr, then the lower bound on theta_g correspond to estimated values of a homoskedastic GP fit.

  • else lengthscales between the mean and noise process are independent (both either anisotropic or not).

When no starting nor fixed parameter values are provided with init or known, the initialization process consists of fitting first an homoskedastic model of the data, called modHom. init['theta'], initial lengthscales are taken at 10% of the range determined with lower and upper, while init['g_H'] may be use to pass an initial nugget value. The resulting lengthscales provide initial values for theta (or update them if given in init).

If necessary, a second homoskedastic model, modNugs, is fitted to the empirical residual variance between the prediction given by modHom at X0 and Z (up to modHom['nu_hat]'). Note that when specifying settings['linkThetas'] == 'joint', then this second homoskedastic model has fixed lengthscale parameters. Starting values for ``theta_g and g are extracted from modNugs

Three initialization schemes for Delta are available with settings['initStrategy']:
  • for settings['initStrategy'] == 'simple', Delta is simply initialized to the estimated g value of modHom.

  • Note that this procedure may fail when settings['penalty'] == True.

  • for settings['initStrategy'] == 'residuals', Delta is initialized to the estimated residual variance from the homoskedastic mean prediction.

  • for settings['initStrategy'] == 'smoothed', Delta takes the values predicted by modNugs at X0.

Notice that lower and upper bounds cannot be equal for :func: scipy.optimize.minimize. To use pseudo-input locations for the noise process, one can either provide pX if they are not to be optimized. Otherwise, initial values are given with pXinit, and optimization bounds with lowerpX, upperpX in init. Automatic initialization of the other parameters without restriction is available for now only with method ‘simple’, otherwise it is assumed that pXinit points are a subset of X0.

Returns:

  • theta: unless given, maximum likelihood estimate (mle) of the lengthscale parameter(s),

  • Delta: unless given, mle of the nugget vector (non-smoothed),

  • Lambda: predicted input noise variance at X0,

  • nu_hat: plugin estimator of the variance,

  • theta_g: unless given, mle of the lengthscale(s) of the noise/log-noise process,

  • k_theta_g: if settings['linkThetas'] == 'joint', mle for the constant by which lengthscale parameters of theta are multiplied to get theta_g,

  • g: unless given, mle of the nugget of the noise/log-noise process,

  • trendtype: either "SK" if beta0 is provided, else "OK",

  • beta0: constant trend of the mean process, plugin-estimator unless given,

  • nmean: plugin estimator for the constant noise/log-noise process mean,

  • pX: if used, matrix of pseudo-inputs locations for the noise/log-noise process,

  • ll: log-likelihood value, (ll_non_pen) is the value without the penalty,

  • nit_opt, msg: counts and message returned by :func:scipy.optimize.minimize

  • modHom: homoskedastic GP model of class homGP used for initialization of the mean process,

  • modNugs: homoskedastic GP model of class homGP used for initialization of the noise/log-noise process,

  • nu_hat_var: variance of the noise process,

  • used_args: list with arguments provided in the call to the function,
    • Ki, Kgi: inverse of the covariance matrices of the mean and noise processes (not scaled by nu_hat and nu_hat_var),

    • X0, Z0, Z, eps, logN, covtype: values given in input,

  • time: time to train the model, in seconds.

See also ~hetgpy.hetGP.hetGP.predict for predictions, ~hetgpy.hetGP.update for updating an existing model. summary and plot functions are available as well. ~hetTP.mleHetTP provide a Student-t equivalent.

Return type:

self, with the following attributes

Parameters:
  • X (ArrayLike)

  • Z (ArrayLike)

  • lower (ArrayLike | None)

  • upper (ArrayLike | None)

  • known (dict)

  • noiseControl (dict)

  • init (dict)

  • covtype (str)

  • maxit (int)

  • eps (float)

  • settings (dict)

  • use_torch (bool)

Examples

>>> from hetgpy import hetGP
>>> from hetgpy.example_data import mcycle
>>> m = mcycle()
>>> model = hetGP()
>>> model.mle(m['times'],m['accel'],lower=[1.0],upper=[10.0],covtype="Matern5_2")

References

M. Binois, Robert B. Gramacy, M. Ludkovski (2018), Practical heteroskedastic Gaussian process modeling for large simulation experiments, Journal of Computational and Graphical Statistics, 27(4), 808–821. Preprint available on arXiv:1611.05902.

plot(type='diagnostics', **kwargs)[source]

Plot model behavior. See hetgpy.plot for more details on types of plots

Parameters:

type (str) – type of plot: currently supported are “diagnostics” and “iterates”

Returns:

fig, ax

Return type:

figure and axis object

predict(x, noise_var=False, xprime=None, nugs_only=False, interval=None, interval_lower=None, interval_upper=None, **kwargs)[source]

Gaussian process predictions using a heterogeneous noise GP object (of hetGP)

Parameters:
  • x (ndarray_like) – matrix of designs locations to predict at (one point per row)

  • noise_var (bool (default False)) – should the variance of the latent variance process be returned?

  • xprime (ndarray_like) – optional second matrix of predictive locations to obtain the predictive covariance matrix between x and xprime

  • nugs_only (bool (default False)) – if True, only return noise variance prediction

  • kwargs (dict) – optional additional elements (not used)

  • interval (str | None)

  • interval_lower (float | None)

  • interval_upper (float | None)

Returns:

  • mean: kriging mean;

  • sd2: kriging variance (filtered, e.g. without the nugget values)

  • nugs: noise variance prediction

  • sd2_var: (returned if noise_var = True) kriging variance of the noise process (i.e., on log-variances if logN = TRUE)

  • cov: (returned if xprime is given) predictive covariance matrix between x and xprime

Return type:

dict with elements

Notes

The full predictive variance corresponds to the sum of sd2 and nugs. See :func: ~hetgpy.hetGP.mleHetGP for examples.

rebuild(robust=False)[source]

Rebuilds inverse covariance matrix in homGP object (usually after saving). Works in tandem with strip

Parameters:

robust (bool) – use np.linalg.pinv for covariance matrix inversion. Otherwise use cholesky

Return type:

self with rebuilt inverse covariance matrix

strip()[source]

Removes larger model objects

Can be rebuilt with rebuild

summary()[source]

Print summary of model object and hyperparameter optimization

update(Xnew, Znew, ginit=0.01, lower=None, upper=None, noiseControl=None, settings=None, known={}, maxit=100, method='quick')[source]

Update model object with new observations

Parameters:
  • Xnew (ndarray_like) – new inputs (one observation per row)

  • Znew (ndarray_like) – new observations

  • lower (ndarray_like) – lower bound for lengthscales. If not provided extracted from self

  • upper (ndarray_like) – upper bound for lengthscales. If not provided extracted from self

  • noiseControl (dict) – noise bounds. If not provided extracted from self

  • settings (dict) – options for optimization. If not provided extracted from self

  • known (dict) – known hyperparameters to fix at values

  • maxit (int) – maximum number of iterations for hyperparameter optimization

  • ginit (float)

  • method (str)

Return type:

self, potentially with updated hyperparameters

Notes

If hyperparameters do not need to be updated, maxit can be set to 0. In this case it is possible to pass NAs in Znew, then the model can still be used to provide updated variance predictions

Example

>>> import numpy as np, from hetgpy import hetGP
>>> X = np.array([1, 2, 3, 4, 12]).reshape(-1,1)
>>> Y = np.array([0, -1.75, -2, -0.5, 5])
>>> model = hetGP().mle(X,Y,known={'theta':np.array([10.0]),'g':1e-8})
>>> Xnew  = np.array([2.5]).reshape(-1,1)
>>> Znew  = model.predict(Xnew)['mean']
>>> model.update(Xnew,Znew)
class homGP[source]

Methods

copy()

Make a copy of the model, which is useful in tandem with the update function

dlogLikHom(X0, Z0, Z, mult, theta, g[, ...])

Log likelihood gradient under homoskedastic noise

get(key)

General get item (retrives key from self.__dict__)

logLikHom(X0, Z0, Z, mult, theta, g[, ...])

Log Likelihood under homoskedastic noise

mleHomGP(X, Z[, lower, upper, known, ...])

Gaussian process modeling with homoskedastic noise.

plot([type])

Plot model behavior.

predict(x[, xprime, interval, ...])

Prediction under homoskedastic noise

rebuild([robust])

Rebuilds inverse covariance matrix in homGP object (usually after saving).

strip()

Removes larger model objects

summary()

Print a summary of the model object and the MLE routine

update(Xnew, Znew[, ginit, lower, upper, ...])

Update model object with new observations

copy()[source]

Make a copy of the model, which is useful in tandem with the update function

Parameters:

None

Returns:

newmodel

Return type:

(deep) copy of model

dlogLikHom(X0, Z0, Z, mult, theta, g, beta0=None, covtype='Gaussian', eps=np.float64(1.4901161193847656e-08), components=('theta', 'g'))[source]

Log likelihood gradient under homoskedastic noise

Parameters:
  • X0 (ndarray_like) – unique design matrix (of size nxd)

  • Z0 (ndarray_like) – averaged responses (size nx1)

  • Z (ndarray_like) – observation (output) vector (size N)

  • mult (ndarray_like) – number of replicates as each design location. Note that mult.sum()==N

  • theta (ndarray_like) – lengthscale

  • g (float) – noise variance

  • beta0 (float) – trend

  • covtype (str) – covariance kernel to use

  • eps (float) – amount of jitter on diagonal of covariance matrix

  • components (list) – directions for partial derivatives, one or both of ‘theta’ (lengthscales) or ‘g’ (for noise)

Returns:

out – gradient with respect to hyperparameters

Return type:

ndarray_like

get(key)[source]

General get item (retrives key from self.__dict__)

logLikHom(X0, Z0, Z, mult, theta, g, beta0=None, covtype='Gaussian', eps=np.float64(1.4901161193847656e-08))[source]

Log Likelihood under homoskedastic noise

Parameters:
  • X0 (ndarray_like) – unique design matrix (of size nxd)

  • Z0 (ndarray_like) – averaged responses (size nx1)

  • Z (ndarray_like) – observation (output) vector (size N)

  • mult (ndarray_like) – number of replicates as each design location. Note that mult.sum()==N

  • theta (ndarray_like) – lengthscale

  • g (float) – noise variance

  • beta0 (float) – trend

  • covtype (str) – covariance kernel to use

  • eps (float) – amount of jitter on diagonal of covariance matrix

Returns:

loglik – log-likelihood

Return type:

float

mleHomGP(X, Z, lower=None, upper=None, known={}, noiseControl={'g_bounds': (np.float64(1.4901161193847656e-08), 100.0)}, init={}, covtype='Gaussian', maxit=100, eps=np.float64(1.4901161193847656e-08), settings={'factr': 10000000.0, 'returnKi': True})[source]

Gaussian process modeling with homoskedastic noise.

You may also call this function as model.mle

Gaussian process regression under homoskedastic noise based on maximum likelihood estimation of the hyperparameters. This function is enhanced to deal with replicated observations.

Parameters:
  • X (ndarray_like) – matrix of all designs, one per row, or list with elements: - X0 matrix of unique design locations, one point per row - Z0 vector of averaged observations, of length len(X0) - mult number of replicates at designs in X0, of length len(X0)

  • Z (ndarray_like) – Z vector of all observations. If using a list with X, Z has to be ordered with respect to X0, and of length sum(mult)

  • lower (ndarray_like) – optional bounds for the theta parameter (see :func: covariance_functions.cov_gen for the exact parameterization). In the multivariate case, it is possible to give vectors for bounds (resp. scalars) for anisotropy (resp. isotropy)

  • upper (ndarray_like) – optional bounds for the theta parameter (see :func: covariance_functions.cov_gen for the exact parameterization). In the multivariate case, it is possible to give vectors for bounds (resp. scalars) for anisotropy (resp. isotropy)

  • noiseControl (dict) –

    dict with element:
    • g_bounds vector providing minimal and maximal noise to signal ratio (default to (sqrt(MACHINE_DOUBLE_EPS), 100)).

  • settings (dict) –

    dict for options about the general modeling procedure, with elements:
    • return_Ki boolean to include the inverse covariance matrix in the object for further use (e.g., prediction).

    • factr (default to 1e7) and pgtol are available to be passed to options for L-BFGS-B in :func: scipy.optimize.minimize.

  • eps (float) – jitter used in the inversion of the covariance matrix for numerical stability

  • known (dict) – optional dict of known parameters (e.g. beta0, theta, g)

  • init (dict) –

    optional lists of starting values for mle optimization:
    • theta_init initial value of the theta parameters to be optimized over (default to 10% of the range determined with lower and upper)

    • g_init vector of nugget parameter to be optimized over

  • covtype (str) – covariance kernel type, either 'Gaussian', 'Matern5_2' or 'Matern3_2', see :func: ~covariance_functions.cov_gen

  • maxit (int) – maximum number of iterations for L-BFGS-B of :func: scipy.optimize.minimize dedicated to maximum likelihood optimization

Return type:

None

Notes

The global covariance matrix of the model is parameterized as nu_hat * (C + g * diag(1/mult)) = nu_hat * K, with C the correlation matrix between unique designs, depending on the family of kernel used (see :func: ~hetgpy.covariance_functions.cov_gen for available choices) and values of lengthscale parameters. nu_hat is the plugin estimator of the variance of the process.

It is generally recommended to use :func: ~find_reps.find_reps to pre-process the data, to rescale the inputs to the unit cube and to normalize the outputs.

Returns:

  • theta: unless given, maximum likelihood estimate (mle) of the lengthscale parameter(s),

  • nu_hat: plugin estimator of the variance,

  • g: unless given, mle of the nugget of the noise/log-noise process,

  • trendtype: either "SK" if beta0 is provided, else "OK",

  • beta0: constant trend of the mean process, plugin-estimator unless given,

  • ll: log-likelihood value, (ll_non_pen) is the value without the penalty,

  • nit_opt, msg: counts and message returned by :func:scipy.optimize.minimize

  • used_args: list with arguments provided in the call to the function,
    • Ki, Kgi: inverse of the covariance matrices of the mean and noise processes (not scaled by nu_hat and nu_hat_var),

    • X0, Z0, Z, eps, logN, covtype: values given in input,

  • time: time to train the model, in seconds.

See also ~hetgpy.homGP.homGP.predict for predictions, ~hetgpy.homGP.update for updating an existing model. summary and plot functions are available as well. ~homTP.mleHomTP provide a Student-t equivalent.

Return type:

self, with the following attributes

Parameters:
  • X (ArrayLike)

  • Z (ArrayLike)

  • lower (ArrayLike | None)

  • upper (ArrayLike | None)

  • known (dict)

  • noiseControl (dict)

  • init (dict)

  • covtype (str)

  • maxit (int)

  • eps (float)

  • settings (dict)

Examples

>>> from hetgpy import homGP
>>> from hetgpy.example_data import mcycle
>>> m = mcycle()
>>> model = homGP()
>>> model.mle(m['times'],m['accel'],lower=[1.0],upper=[10.0],covtype="Matern5_2")

References

M. Binois, Robert B. Gramacy, M. Ludkovski (2018), Practical heteroskedastic Gaussian process modeling for large simulation experiments, Journal of Computational and Graphical Statistics, 27(4), 808–821. Preprint available on arXiv:1611.05902.

plot(type='diagnostics')[source]

Plot model behavior. See hetgpy.plot for more details on types of plots

Parameters:

type (str) – type of plot: currently supported are “diagnostics” and “iterates”

Returns:

fig, ax

Return type:

figure and axis object

predict(x, xprime=None, interval=None, interval_lower=None, interval_upper=None, **kw)[source]

Prediction under homoskedastic noise

Parameters:
  • x (ndarray_like) – matrix of designs locations to predict at (one point per row)

  • xprime (ndarray_like) – optional second matrix of predictive locations to obtain the predictive covariance matrix between x and xprime

  • interval (str) – one of ‘confidence’ or ‘predictive’ which is a convenience method to return confidence/predictive intervals corresponding to interval_lower and interval_upper

  • interval_lower (float) – lower of confidence/predictive interval

  • interval_upper (float) – upper of confidence/predictive interval

  • nugs_only (bool (default False)) – if True, only return noise variance prediction

  • kwargs (dict) – optional additional elements (only used for nugs_only)

Returns:

  • mean: kriging mean;

  • sd2: kriging variance (filtered, e.g. without the nugget values)

  • nugs: noise variance prediction

  • cov: (returned if xprime is given) predictive covariance matrix between x and xprime

  • confidence_interval: prediction with kriging variance only

  • predictive_interval: prediction with kriging and noise variance

Return type:

dict with elements

rebuild(robust=False)[source]

Rebuilds inverse covariance matrix in homGP object (usually after saving). Works in tandem with strip

Parameters:

robust (bool) – use np.linalg.pinv for covariance matrix inversion. Otherwise use cholesky

Return type:

self with rebuilt inverse covariance matrix

strip()[source]

Removes larger model objects

Can be rebuilt with rebuild

summary()[source]

Print a summary of the model object and the MLE routine

update(Xnew, Znew, ginit=0.01, lower=None, upper=None, noiseControl=None, settings=None, known={}, maxit=100)[source]

Update model object with new observations

Parameters:
  • Xnew (ndarray_like) – new inputs (one observation per row)

  • Znew (ndarray_like) – new observations

  • lower (ndarray_like) – lower bound for lengthscales. If not provided extracted from self

  • upper (ndarray_like) – upper bound for lengthscales. If not provided extracted from self

  • noiseControl (dict) – noise bounds. If not provided extracted from self

  • settings (dict) – options for optimization. If not provided extracted from self

  • known (dict) – known hyperparameters to fix at values

  • maxit (int) – maximum number of iterations for hyperparameter optimization

  • ginit (float)

Return type:

self, potentially with updated hyperparameters

Notes

If hyperparameters do not need to be updated, maxit can be set to 0. In this case it is possible to pass NAs in Znew, then the model can still be used to provide updated variance predictions

Example

>>> import numpy as np, from hetgpy import homGP
>>> X = np.array([1, 2, 3, 4, 12]).reshape(-1,1)
>>> Y = np.array([0, -1.75, -2, -0.5, 5])
>>> model = homGP().mle(X,Y,known={'theta':np.array([10.0]),'g':1e-8})
>>> Xnew  = np.array([2.5]).reshape(-1,1)
>>> Znew  = model.predict(Xnew)['mean']
>>> model.update(Xnew,Znew)
crit_EI(x, model, cst=None, preds=None)[source]

Expected Improvement (EI) criteria

Parameters:
  • x (nd_arraylike) – model designs, one point per row

  • model (hetgpy.hetGP) – hetGP or homGP model

  • cst (float) – optional plugin value of the mean

  • preds (Dict) – model predictions (optional)

References

Mockus, J.; Tiesis, V. & Zilinskas, A. (1978). The application of Bayesian methods for seeking the extremum Towards Global Optimization, Amsterdam: Elsevier, 2, 2. Vazquez E, Villemonteix J, Sidorkiewicz M, Walter E (2008). Global Optimization Based on Noisy Evaluations: An Empirical Study of Two Statistical Approaches, Journal of Physics: Conference Series, 135, IOP Publishing.

Examples

>>> from hetgpy.test_functions import f1d
>>> from hetgpy.homGP import homGP
>>> from hetgpy.optim import crit_EI
>>> import numpy as np
>>> ftest = f1d
>>> n_init = 5 # number of unique designs
>>> X = np.linspace(0, 1, n_init).reshape(-1,1)
>>> Z = ftest(X)
>>> xgrid = np.linspace(0,1,51).reshape(-1,1)
>>> model = homGP()
>>> model.mle(X = X, Z = Z, lower = np.array([0.01]), upper = np.array([1]), known = dict(g = 2e-8))
>>> EI = crit_EI(xgrid, model, cst = model.Z0.min())
crit_logEI(x, model, cst=None, preds=None)[source]

Logarithm of Expected Improvement (EI) criteria

Parameters:
  • x (nd_arraylike) – model designs, one point per row

  • model (hetgpy.hetGP) – hetGP or homGP model

  • cst (float) – optional plugin value of the mean

  • preds (Dict) – model predictions (optional)

References

Ament, S., Daulton, S., Eriksson, D., Balandat, M., & Bakshy, E. (2024). Unexpected improvements to expected improvement for Bayesian optimization. Advances in Neural Information Processing Systems, 36.

Examples

>>> from hetgpy.test_functions import f1d
>>> from hetgpy.homGP import homGP
>>> from hetgpy.optim import crit_logEI
>>> import numpy as np
>>> ftest = f1d
>>> n_init = 5 # number of unique designs
>>> X = np.linspace(0, 1, n_init).reshape(-1,1)
>>> Z = ftest(X)
>>> xgrid = np.linspace(0,1,51).reshape(-1,1)
>>> model = homGP()
>>> model.mle(X = X, Z = Z, lower = np.array([0.01]), upper = np.array([1]), known = dict(g = 2e-8))
>>> logEI = crit_logEI(xgrid, model, cst = model.Z0.min())
crit_optim(model, crit, h=2, Xcand=None, control={'maxit': 100, 'multi_start': 10}, seed=None, ncores=1, **kwargs)[source]
crit_qEI(x, model, cst=None, preds=None)[source]

Fast approximated batch-Expected Improvement criterion (for minimization) Parallel Expected improvement

Parameters:
  • x (nd_array) – matrix of new designs representing the batch of q points, one point per row (q x d)

  • model (homGP/hetGP) – model object including inverve matrices

  • cst (nd_array) – optional optional plugin value used in the EI, see details

  • preds (dict) – optional predictions at x to avoid recomputing if already done (must include the predictive covariance, i.e., the cov slot)

Returns:

  • qEI_cpp

  • Details

  • ——-

  • cst is classically the observed minimum in the deterministic case. In the noisy case, the min of the predictive mean works fine. This is a beta version at this point. It may work for for TP models as well.

References

M. Binois (2015), Uncertainty quantification on Pareto fronts and high-dimensional strategies in Bayesian optimization, with applications in multi-objective automotive design. Ecole Nationale Superieure des Mines de Saint-Etienne, PhD thesis.

Examples

>>> from hetgpy.test_functions import f1d
>>> from hetgpy.homGP import homGP
>>> from hetgpy.optim import crit_qEI
>>> import numpy as np
>>> ftest = f1d
>>> n_init = 5 # number of unique designs
>>> X = np.linspace(0, 1, n_init).reshape(-1,1)
>>> Z = ftest(X)
>>> xgrid = np.linspace(0,1,51).reshape(-1,1)
>>> model = homGP()
>>> model.mleHomGP(X = X, Z = Z, lower = np.array([0.01]), upper = np.array([1]), known = dict(g = 2e-8))
>>> xbatch = np.array((0.37, 0.17, 0.7)).reshape(3, 1)
>>> fqEI = crit_qEI(xbatch, model, cst = model.Z0.min())
deriv_crit_EI(x, model, cst=None, preds=None)[source]

Derivative for crit_EI, used in crit_optim

Parameters:
  • x (nd_arraylike) – model designs, one point per row

  • model (hetgpy.hetGP) – hetGP or homGP model

  • cst (float) – optional plugin value of the mean

  • preds (Dict) – model predictions (optional)

References

Ginsbourger, D. Multiples metamodeles pour l’approximation et l’optimisation de fonctions numeriques multivariables Ecole Nationale Superieure des Mines de Saint-Etienne, Ecole Nationale Superieure des Mines de Saint-Etienne, 2009 Roustant, O., Ginsbourger, D., DiceKriging, DiceOptim: Two R packages for the analysis of computer experiments by kriging-based metamodeling and optimization, Journal of Statistical Software, 2012

deriv_crit_logEI(x, model, cst=None, preds=None)[source]

Derivative of the logarithm of Expected Improvement (EI) criteria

Parameters:
  • x (nd_arraylike) – model designs, one point per row

  • model (hetgpy.hetGP) – hetGP or homGP model

  • cst (float) – optional plugin value of the mean

  • preds (Dict) – model predictions (optional)

References

Ament, S., Daulton, S., Eriksson, D., Balandat, M., & Bakshy, E. (2024). Unexpected improvements to expected improvement for Bayesian optimization. Advances in Neural Information Processing Systems, 36.

Examples

dlambda(z, a)[source]

Derivative of the student-t pdf

Parameters:
  • z (float) – input location

  • a (float) – degree of freedom parameter

Return type:

derivative of student-t pdf

Examples

>>> dlambda(0.55, 3.6) # -0.2005827
dlog_h(z, eps=np.float64(2.220446049250313e-16))[source]

References

Ament, S., Daulton, S., Eriksson, D., Balandat, M., & Bakshy, E. (2024). Unexpected improvements to expected improvement for Bayesian optimization. Advances in Neural Information Processing Systems, 36.

log1mexp(x)[source]

References

Maechler, Martin (2012). Accurately Computing log(1-exp(-|a|)). Assessed from the Rmpfr package.

log_h(z, eps=np.float64(2.220446049250313e-16))[source]

References

Ament, S., Daulton, S., Eriksson, D., Balandat, M., & Bakshy, E. (2024). Unexpected improvements to expected improvement for Bayesian optimization. Advances in Neural Information Processing Systems, 36.

predict_gr(model, x)[source]

Gradient of the prediction given a model

IMSPE(model, theta=None, Lambda=None, mult=None, covtype=None, nu=None, eps=np.float64(1.4901161193847656e-08))[source]

IMSPE of a given design.

Integrated Mean Square Prediction Error

Parameters:
  • X (hetgpy.hetGP.hetGP or hetgpy.homGP.homGP model.) – Alternatively, one can provide a matrix of unique designs considered

  • theta (ndarray_like) – lengthscales

  • Lambda (ndarray_like) – diagonal matrix for the noise

  • mult (ndarray_like) – number of replicates at each design

  • covtype (str) – either “Gaussian”, “Matern3_2” or “Matern5_2”

  • nu (float) – variance parameter

  • eps (float) – numerical nugget

  • Details

  • -------

  • homGP (One can provide directly a model of class hetGP or)

  • arguments (or provide design locations X and all other)

Wij(mu1, mu2=None, theta=None, type='Gaussian')[source]

Compute double integral of the covariance kernel over a [0,1]^d domain

Parameters:
  • mu1 (ndarray)like) – input locations considered

  • mu2 (ndarray)like) – input locations considered

  • theta (ndarray_like) – lengthscale hyperparameter of the kernel

  • type (str) – kernel type, one of "Gaussian", "Matern5_2" or "Matern3_2"

References

M. Binois, J. Huang, R. B. Gramacy, M. Ludkovski (2019), Replication or exploration? Sequential design for stochastic simulation experiments, Technometrics, 61(1), 7-23. Preprint available on arXiv:1710.03206.

allocate_mult(model=None, N=None, Wijs=None, use_Ki=False)[source]

Allocation of replicates on existing design locations, based on (29) from (Ankenman et al, 2010)

Parameters:
  • model (hetGP model) – hetGP model

  • N (int) – total budget of replication to allocate

  • Wijs (ndarray) – optional previously computed matrix of Wijs, see hetgpy.IMSE.Wij

  • use_Ki (bool) – should Ki from model be used?

Return type:

vector with approximated best number of replicates per design

References

  1. Ankenman, B. Nelson, J. Staum (2010), Stochastic kriging for simulation metamodeling, Operations research, pp. 371–382, 58

deriv_crit_IMSPE(x, model, id=None, Wijs=None)[source]

Derivative of crit_IMSPE

Parameters:
  • x (ndarray_like) – matrix for the news design (size 1 x d)

  • model (hetGP or homGP) – model

  • id (None) – None (but included for compatibility with crit_IMSPE input structure so it can be used in minimize)

  • Wijs (ndarray_like) – optional previously computed matrix of Wijs, see Wij

Return type:

Derivative of the sequential IMSPE with respect to x

horizon(model, current_horizon=None, previous_ratio=None, target=None, Wijs=None, seed=None)[source]

Adapt the look-ahead horizon depending on the replicate allocation or a target ratio

Parameters:
  • model (hetGP or homGP model) – hetGP or homGP model

  • current_horizon (int) – horizon used for the previous iteration, see details

  • previous_ratio (float) – ratio before adding the previous new design

  • target (float) – scalar in [0,1] for desired n/N

  • Wijs (nd_array) – optional previously computed matrix of Wijs, see hetgpy.IMSE.Wij

Returns:

  • Randomly selected horizon for next iteration (adpative) if no target is provided,

  • otherwise returns the update horizon value.

  • Details

  • ——-

  • If target is provided, along with previous_ratio and current_horizon

  • itemize{ – item the horizon is increased by one if more replicates are needed but a new ppint has been added at the previous iteration, item the horizon is decreased by one if new points are needed but a replicate has been added at the previous iteration, item otherwise it is unchanged.

  • }

  • If no target is provided, allocate_mult is used to obtain the best allocation of the existing replicates,

  • then the new horizon is sampled from the difference between the actual allocation and the best one, bounded below by 0.

  • See (Binois et al. 2017).

References

M. Binois, J. Huang, R. B. Gramacy, M. Ludkovski (2019), Replication or exploration? Sequential design for stochastic simulation experiments, Technometrics, 61(1), 7-23.cr Preprint available on arXiv:1710.03206.

lhs_EP(m, seed=None)[source]

From DiceDesign: FUNCTION PERFORMING ELEMENTARY PERMUTATION (EP) IN LHD USED IN SA ALGORITHMS

Parameters:

m (nd_arraylike) – the design

Returns:

out – list including design after EP, ligns and columns defining EP

Return type:

tuple

maximinSA_LHS(design, T0=10, c=0.95, it=2000, p=50, profile='GEOM', Imax=100, seed=None)[source]

Implementation of maximinSA_LHS from DiceDesign Only profile=”GEOM” is implemented (like in hetGP)

#####maximinSA_LHS##### #####Maximin LHS VIA SIMULATED ANNEALING OPTIMIZATION#####

#—————————————————————————| #args : m : the design | # T0 : the initial temperature | # c : parameter regulating the temperature | # it : the number of iterations | # p : power required in phiP criterion | # profile : temperature down profile | # “GEOM” or “GEOM_MORRIS” or “LINEAR”. By default : “GEOM” | #output : a list containing all the input arguments plus: | # a mindist optimized design | # vector of criterion values along the iterations | # vector of temperature values along the iterations | # vector of acceptation probability values along the iterations | #depends : phiP,lhs_EP | #—————————————————————————|

mi(mu1, theta, type)[source]

Compute integral of the covariance kernel over a [0,1]^d domain

Parameters:
  • mu1 (ndarray_like) – input locations considered

  • theta (ndarray_like) – lengthscale hyperparameter of the kernel

  • type (str) – kernel type, one of "Gaussian", "Matern5_2" or "Matern3_2"

References

Replication or exploration? Sequential design for stochastic simulation experiments, Technometrics, 61(1), 7-23. Preprint available on arXiv:1710.03206.

phiP(design, p=50)[source]

Implementation of phiP.R from DiceDesign (necessary for maximinSA_LHS from DiceDesign which is used by IMSPE_search)

From DiceDesign: Compute the phiP criterion (Lp norm of the sum of the inverses of the design inter-point distances) Reference: Pronzato, L. and Muller, W.,2012, Design of computer experiments: space filling and beyond, Statistics and Computing, 22:681-701. A higher phiP corresponds to a more regular scaterring of design points

Parameters:
  • design (nd_arraylike) – design for a computer experiment

  • p (int) – the “p” in the Lp norm which is taken (default=50)

Returns:

fi_p – the phiP criterion

Return type:

np.float

LOO_preds(model, ids=None)[source]

Provide leave one out predictions, e.g., for model testing and diagnostics. This is used in the method plot available on GP and TP models.

Parameters:
  • model (homGP or hetGP) – hetGPy model to evaluate. TPs not considered at this point.

  • ids (vector of indices of the unique design point considered (defaults to all))

Return type:

dict with mean and variance predictions at x_i assuming this point has not been evaluated

Notes

For TP models, psi is considered fixed

References

O. Dubrule (1983), Cross validation of Kriging in a unique neighborhood, Mathematical Geology 15, 687–699. cr cr F. Bachoc (2013), Cross Validation and Maximum Likelihood estimations of hyper-parameters of Gaussian processes

Examples

>>> from hetgpy import homGP
>>> from hetgpy.LOO import LOO_preds
>>> from hetgpy.find_reps import find_reps
>>> from hetgpy.example_data import mcycle
>>> import numpy as np
>>> m = mcycle()
>>> X, Z = m['times'], m['accel']
>>> model = homGP()
>>> model.mle(X = X, Z = Z, lower = [0.1], upper = [10],
>>>          covtype = "Matern5_2", known = dict(beta0 = 0))
>>> LOO_p = LOO_preds(model)
>>> d = find_reps(X,Z)
>>> LOO_ref = np.zeros(shape=(d['X0'].shape[0],2))
>>> rows = np.arange(d['X0'].shape[0])
>>> for i in rows:
>>>     idxs = np.delete(rows,i)
>>>     model_i = homGP()
>>>     model_i.mle(X = dict(X0 = d['X0'][idxs,:], Z0 = d['Z0'][idxs],
>>>                 mult = d['mult'][idxs]),
>>>                 Z = np.concatenate([d['Zlist'][j] for j in idxs]),
>>>                        lower = [0.1], upper = [50], covtype = "Matern5_2",
>>>                        known = dict(theta = model.theta, g = model.g,
>>>                                    beta0 = 0))
>>>     model_i.nu_hat = model.nu_hat
>>>     p_i = model_i.predict(d['X0'][i:i+1,:])
>>>     LOO_ref[i,:] = np.array([p_i['mean'], p_i['sd2']]).squeeze()
>>> print(np.ptp(LOO_ref[:,0] - LOO_p['mean']))
>>> print(np.ptp(LOO_ref[:,1] - LOO_p['sd2']))
>>> model.plot()