hetGPy API reference
docstring for find_reps
- find_reps(X, Z, return_Zlist=True, rescale=False, normalize=False, inputBounds=None, use_torch=False)[source]
Prepare data for use with
mleHomGPandmleHetGPIn particular to find replicated observations
- Parameters:
X (array_like) – matrix of design locations, one point per row
Z (array_like) – vector of observations at
Xreturn_Zlist (bool) – to return Zlist, see below
rescale (bool) – if
True, the inputs are rescaled to the unit hypercubenormalize (bool) – if
True, the outputs are centered and normalizedinputBounds (array_like) – optional matrix of known boundaries in original input space, of size 2 times X.shape[1]. If not provided, and
rescale == True, it is estimated from the data.
- Returns:
dict – dictionary of outputs
type – explain types
out –
dictionary of
X0matrix with unique designs locations, one point per rowZ0vector of averaged observations atX0multnumber of replicates atX0Zvector with all observations, sorted according toX0Zlistoptional list, each element corresponds to observations at a design inX0inputBoundsoptional matrix, to rescale back to the original input spaceoutputStatsoptional vector, with mean and variance of the original outputs.
References
Binois et. al (2018)
Examples
>>> from hetgpy.data import mcycle >>> from hetgpy.hetGP import hetGP >>> X = mcycle['times'] >>> Z = mcycle['accel']
>>> model = hetGP() >>> out = model.find_reps(X, Z) >>> print(out)
- class hetGP[source]
Methods
Leave-out-out predictions for the nugget
copy()Make a copy of the model, which is useful in tandem with the update function
dlogLikHet(X0, Z0, Z, mult, Delta, theta, g)derivative of log-likelihood for logLikHet_Wood with respect to theta and Lambda with all observations Model: K = nu2 * (C + Lambda) = nu using all observations using the replicates information nu2 is replaced by its plugin estimator in the likelihood
logLikHet(X0, Z0, Z, mult, Delta, theta, g)log-likelihood in the anisotropic case - one lengthscale by variable Model: K = nu2 * (C + Lambda) = nu using all observations using the replicates information nu2 is replaced by its plugin estimator in the likelihood
mleHetGP(X, Z[, lower, upper, known, ...])Gaussian process modeling with heteroskedastic noise
plot([type])Plot model behavior.
predict(x[, noise_var, xprime, nugs_only, ...])Gaussian process predictions using a heterogeneous noise GP object (of
hetGP)rebuild([robust])Rebuilds inverse covariance matrix in homGP object (usually after saving).
strip()Removes larger model objects
summary()Print summary of model object and hyperparameter optimization
update(Xnew, Znew[, ginit, lower, upper, ...])Update model object with new observations
get
- LOO_preds_nugs(i)[source]
Leave-out-out predictions for the nugget
- Parameters:
self (hetGP object)
i (int) – index of the point to remove
- Return type:
LOO preds
- copy()[source]
Make a copy of the model, which is useful in tandem with the update function
- Parameters:
None
- Returns:
newmodel
- Return type:
(deep) copy of model
- dlogLikHet(X0, Z0, Z, mult, Delta, theta, g, k_theta_g=None, theta_g=None, beta0=None, pX=None, logN=True, SiNK=False, components=None, eps=np.float64(1.4901161193847656e-08), covtype='Gaussian', SiNK_eps=0.0001, penalty=True, hom_ll=None)[source]
derivative of log-likelihood for logLikHet_Wood with respect to theta and Lambda with all observations Model: K = nu2 * (C + Lambda) = nu using all observations using the replicates information nu2 is replaced by its plugin estimator in the likelihood
- Parameters:
X0 (ndarray_like) – unique designs
Z0 (ndarray_like) – averaged observations
Z (ndarray_like) – replicated observations (sorted with respect to X0)
mult (ndarray_like) – number of replicates at each Xi
Delta (ndarray_like) – vector of nuggets corresponding to each X0i or pXi, that are smoothed to give Lambda
logN (bool) – should exponentiated variance be used
SiNK (bool) – should the smoothing come from the SiNK predictor instead of the kriging one
theta (ndarray_like) – scale parameter for the mean process, either one value (isotropic) or a vector (anistropic)
k_theta_g (ndarray_like) – constant used for linking nuggets lengthscale to mean process lengthscale, i.e., theta_g[k] = k_theta_g * theta[k], alternatively theta_g can be used
theta_g (ndarray_like) – either one value (isotropic) or a vector (anistropic), alternative to using k_theta_g
g (ndarray_like) – nugget of the nugget process
pX (ndarray_like) – matrix of pseudo inputs locations of the noise process for Delta (could be replaced by a vector to avoid double loop)
components (ndarray_like) – components to determine which variable are to be taken in the derivation: None for all, otherwise list with elements from ‘theta’, ‘Delta’, ‘theta_g’, ‘k_theta_g’, ‘pX’ and ‘g’.
beta0 (float) – mean, if not provided, the MLE estimator is used
eps (float) – minimal value of elements of Lambda
covtype (str) – covariance kernel type
penalty (bool) – should a penalty term on Delta be used?
hom_ll (float) – reference homoskedastic likelihood
SiNK_eps (float)
- Return type:
ArrayLike
- logLikHet(X0, Z0, Z, mult, Delta, theta, g, k_theta_g=None, theta_g=None, logN=None, SiNK=False, beta0=None, pX=None, eps=np.float64(1.4901161193847656e-08), covtype='Gaussian', SiNK_eps=0.0001, penalty=True, hom_ll=None, trace=0)[source]
log-likelihood in the anisotropic case - one lengthscale by variable Model: K = nu2 * (C + Lambda) = nu using all observations using the replicates information nu2 is replaced by its plugin estimator in the likelihood
- Parameters:
X0 (ndarray_like) – unique designs
Z0 (ndarray_like) – averaged observations
Z (ndarray_like) – replicated observations (sorted with respect to X0)
mult (ndarray_like) – number of replicates at each Xi
Delta (ndarray_like) – vector of nuggets corresponding to each X0i or pXi, that are smoothed to give Lambda
logN (bool) – should exponentiated variance be used
SiNK (bool) – should the smoothing come from the SiNK predictor instead of the kriging one
theta (ndarray_like) – scale parameter for the mean process, either one value (isotropic) or a vector (anistropic)
k_theta_g (ndarray_like) – constant used for linking nuggets lengthscale to mean process lengthscale, i.e., theta_g[k] = k_theta_g * theta[k], alternatively theta_g can be used
theta_g (ndarray_like) – either one value (isotropic) or a vector (anistropic), alternative to using k_theta_g
g (ndarray_like) – nugget of the nugget process
pX (ndarray_like) – matrix of pseudo inputs locations of the noise process for Delta (could be replaced by a vector to avoid double loop)
beta0 (float) – mean, if not provided, the MLE estimator is used
eps (float) – minimal value of elements of Lambda
covtype (str) – covariance kernel type
penalty (bool) – should a penalty term on Delta be used?
hom_ll (float) – reference homoskedastic likelihood
SiNK_eps (float)
trace (int)
- Return type:
float
- mleHetGP(X, Z, lower=None, upper=None, known={}, noiseControl={'g_bounds': (1e-06, 1), 'g_max': 100, 'k_theta_g_bounds': (1, 100)}, init={}, covtype='Gaussian', maxit=100, eps=np.float64(1.4901161193847656e-08), settings={'factr': 1000000000.0, 'ignore_MLE_divide_invalid': True, 'returnKi': True}, use_torch=False)[source]
Gaussian process modeling with heteroskedastic noise
You may also call this function as model.mle
Gaussian process regression under input dependent noise based on maximum likelihood estimation of the hyperparameters. A second GP is used to model latent (log-) variances. This function is enhanced to deal with replicated observations.
- Parameters:
X (ndarray_like) – matrix of all designs, one per row, or list with elements: -
X0matrix of unique design locations, one point per row -Z0vector of averaged observations, of lengthlen(X0)-multnumber of replicates at designs inX0, of lengthlen(X0)Z (ndarray_like) – Z vector of all observations. If using a list with
X,Zhas to be ordered with respect toX0, and of lengthsum(mult)lower (ndarray_like) – optional bounds for the
thetaparameter (see :func: covariance_functions.cov_gen for the exact parameterization). In the multivariate case, it is possible to give vectors for bounds (resp. scalars) for anisotropy (resp. isotropy)upper (ndarray_like) – optional bounds for the
thetaparameter (see :func: covariance_functions.cov_gen for the exact parameterization). In the multivariate case, it is possible to give vectors for bounds (resp. scalars) for anisotropy (resp. isotropy)noiseControl (dict) –
- dict with elements related to optimization of the noise process parameters:
g_min,g_maxminimal and maximal noise to signal ratio (of the mean process)lowerDelta,upperDeltaoptional vectors (or scalars) of bounds onDelta, of lengthlen(X0)(default tonp.repeat(eps, X0.shape[0])andnp.repeat(noiseControl["g_max"], X0.shape[0])resp., or theirlog)lowerpX,upperpXoptional vectors of bounds of the input domain if pX is used.lowerTheta_g,upperTheta_goptional vectors of bounds for the lengthscales of the noise process iflinkThetas == 'none'. Same as forthetaif not provided.k_theta_g_boundsiflinkThetas == 'joint', vector with minimal and maximal values fork_theta_g(default to(1, 100)). See Notes.g_boundsvector for minimal and maximal noise to signal ratios for the noise of the noise process, i.e., the smoothing parameter for the noise process. (default to(1e-6, 1)).
settings (dict) –
- dict for options about the general modeling procedure, with elements:
linkThetasdefines the relation between lengthscales of the mean and noise processes. Either'none','joint'``(default) or ``'constr', see Notes.logN, whenTrue(default), the log-noise process is modeled.initStrategyone of'simple','residuals'(default) and'smoothed'to obtain starting values forDelta, see NotespenaltywhenTrue, the penalized version of the likelihood is used (i.e., the sum of the log-likelihoods of the mean and variance processes, see References).hardpenaltyisTrue, the log-likelihood from the noise GP is taken into account only if negative (default ifmaxit > 1000).checkHomwhenTrue, if the log-likelihood with a homoskedastic model is better, then return it.traceoptional scalar (default to0). If negative, fit silently. If0, only high level information is given. If1, information is given about the result of the heterogeneous model optimization. Level2gives more details. Level3additionaly displays all details about initialization of hyperparameters.return_matricesboolean to include the inverse covariance matrix in the object for further use (e.g., prediction).return_homboolean to include homoskedastic GP models used for initialization (i.e.,modHomandmodNugs).factr(default to 1e9) andpgtolare available to be passed to options for L-BFGS-B in :func:scipy.optimize.minimize.
eps (float) – jitter used in the inversion of the covariance matrix for numerical stability
init (dict) –
optional lists of starting values for mle optimization or that should not be optimized over, respectively. Values in
knownare not modified, while it can happen to these ofinit, see Notes. One can set one or several of the following:thetalengthscale parameter(s) for the mean process either one value (isotropic) or a vector (anistropic)Deltavector of nuggets corresponding to each design inX0, that are smoothed to giveLambda(as the global covariance matrix depends onDeltaandnu_hat, it is recommended to also pass values fortheta)beta0constant trend of the mean processk_theta_gconstant used for link mean and noise processes lengthscales, whensettings['linkThetas'] == 'joint'theta_geither one value (isotropic) or a vector (anistropic) for lengthscale parameter(s) of the noise process, whensettings['linkThetas'] != 'joint'gscalar nugget of the noise processg_Hscalar homoskedastic nugget for the initialisation with a :func: homGP.mleHomGP. See Notes.pXmatrix of fixed pseudo inputs locations of the noise process corresponding to Delta
known (dict) –
optional lists of starting values for mle optimization or that should not be optimized over, respectively. Values in
knownare not modified, while it can happen to these ofinit, see Notes. One can set one or several of the following:thetalengthscale parameter(s) for the mean process either one value (isotropic) or a vector (anistropic)Deltavector of nuggets corresponding to each design inX0, that are smoothed to giveLambda(as the global covariance matrix depends onDeltaandnu_hat, it is recommended to also pass values fortheta)beta0constant trend of the mean processk_theta_gconstant used for link mean and noise processes lengthscales, whensettings['linkThetas'] == 'joint'theta_geither one value (isotropic) or a vector (anistropic) for lengthscale parameter(s) of the noise process, whensettings['linkThetas'] != 'joint'gscalar nugget of the noise processg_Hscalar homoskedastic nugget for the initialisation with a :func: homGP.mleHomGP. See Notes.pXmatrix of fixed pseudo inputs locations of the noise process corresponding to Delta
covtype (str) – covariance kernel type, either
'Gaussian','Matern5_2'or'Matern3_2', see :func:~covariance_functions.cov_genmaxit (int) – maximum number of iterations for L-BFGS-B of :func:
scipy.optimize.minimizededicated to maximum likelihood optimizationuse_torch (bool)
- Return type:
None
Notes
The global covariance matrix of the model is parameterized as
nu_hat * (C + Lambda * np.diag(1/mult)) = nu_hat * K, withCthe correlation matrix between unique designs, depending on the family of kernel used (see :func: ~covariance_functions.cov_gen for available choices) and values of lengthscale parameters.nu_hatis the plugin estimator of the variance of the process.Lambdais the prediction on the noise level given by a second (homoskedastic) GP: .. math:: Lambda = C_g(C_g + mathrm{diag}(g/mathrm{mult}))^{-1} Delta} withC_gthe correlation matrix between unique designs for this second GP, with lengthscales hyperparameterstheta_gand nuggetgandDeltathe variance level atX0that are estimated.It is generally recommended to use :func:
~find_reps.find_repsto pre-process the data, to rescale the inputs to the unit cube and to normalize the outputs.- The noise process lengthscales can be set in several ways:
using
k_theta_g(settings['linkThetas'] == 'joint'), supposed to be greater than one by default. In this case lengthscales of the noise process are multiples of those of the mean process.if
settings['linkThetas'] == 'constr, then the lower bound ontheta_gcorrespond to estimated values of a homoskedastic GP fit.else lengthscales between the mean and noise process are independent (both either anisotropic or not).
When no starting nor fixed parameter values are provided with
initorknown, the initialization process consists of fitting first an homoskedastic model of the data, calledmodHom.init['theta'], initial lengthscales are taken at 10% of the range determined withlowerandupper, whileinit['g_H']may be use to pass an initial nugget value. The resulting lengthscales provide initial values fortheta(or update them if given ininit).If necessary, a second homoskedastic model,
modNugs, is fitted to the empirical residual variance between the prediction given bymodHomatX0andZ(up tomodHom['nu_hat]'). Note that when specifyingsettings['linkThetas'] == 'joint', then this second homoskedastic model has fixed lengthscale parameters. Starting values for ``theta_gandgare extracted frommodNugs- Three initialization schemes for
Deltaare available withsettings['initStrategy']: for
settings['initStrategy'] == 'simple',Deltais simply initialized to the estimatedgvalue ofmodHom.Note that this procedure may fail when
settings['penalty'] == True.for
settings['initStrategy'] == 'residuals',Deltais initialized to the estimated residual variance from the homoskedastic mean prediction.for
settings['initStrategy'] == 'smoothed',Deltatakes the values predicted bymodNugsatX0.
Notice that
lowerandupperbounds cannot be equal for:func: scipy.optimize.minimize. To use pseudo-input locations for the noise process, one can either providepXif they are not to be optimized. Otherwise, initial values are given withpXinit, and optimization bounds withlowerpX,upperpXininit. Automatic initialization of the other parameters without restriction is available for now only with method ‘simple’, otherwise it is assumed that pXinit points are a subset of X0.- Returns:
theta: unless given, maximum likelihood estimate (mle) of the lengthscale parameter(s),Delta: unless given, mle of the nugget vector (non-smoothed),Lambda: predicted input noise variance atX0,nu_hat: plugin estimator of the variance,theta_g: unless given, mle of the lengthscale(s) of the noise/log-noise process,k_theta_g: ifsettings['linkThetas'] == 'joint', mle for the constant by which lengthscale parameters ofthetaare multiplied to gettheta_g,g: unless given, mle of the nugget of the noise/log-noise process,trendtype: either"SK"ifbeta0is provided, else"OK",beta0: constant trend of the mean process, plugin-estimator unless given,nmean: plugin estimator for the constant noise/log-noise process mean,pX: if used, matrix of pseudo-inputs locations for the noise/log-noise process,ll: log-likelihood value, (ll_non_pen) is the value without the penalty,nit_opt,msg: counts and message returned by :func:scipy.optimize.minimizemodHom: homoskedastic GP model of classhomGPused for initialization of the mean process,modNugs: homoskedastic GP model of classhomGPused for initialization of the noise/log-noise process,nu_hat_var: variance of the noise process,used_args: list with arguments provided in the call to the function,Ki,Kgi: inverse of the covariance matrices of the mean and noise processes (not scaled bynu_hatandnu_hat_var),X0,Z0,Z,eps,logN,covtype: values given in input,
time: time to train the model, in seconds.
See also ~hetgpy.hetGP.hetGP.predict for predictions, ~hetgpy.hetGP.update for updating an existing model.
summaryandplotfunctions are available as well. ~hetTP.mleHetTP provide a Student-t equivalent.- Return type:
self, with the following attributes
- Parameters:
X (ArrayLike)
Z (ArrayLike)
lower (TypeAliasForwardRef('ArrayLike') | None)
upper (TypeAliasForwardRef('ArrayLike') | None)
known (dict)
noiseControl (dict)
init (dict)
covtype (str)
maxit (int)
eps (float)
settings (dict)
use_torch (bool)
Examples
>>> from hetgpy import hetGP >>> from hetgpy.example_data import mcycle >>> m = mcycle() >>> model = hetGP() >>> model.mle(m['times'],m['accel'],lower=[1.0],upper=[10.0],covtype="Matern5_2")
References
M. Binois, Robert B. Gramacy, M. Ludkovski (2018), Practical heteroskedastic Gaussian process modeling for large simulation experiments, Journal of Computational and Graphical Statistics, 27(4), 808–821. Preprint available on arXiv:1611.05902.
- plot(type='diagnostics', **kwargs)[source]
Plot model behavior. See hetgpy.plot for more details on types of plots
- Parameters:
type (str) – type of plot: currently supported are “diagnostics” and “iterates”
- Returns:
fig, ax
- Return type:
figure and axis object
- predict(x, noise_var=False, xprime=None, nugs_only=False, interval=None, interval_lower=None, interval_upper=None, **kwargs)[source]
Gaussian process predictions using a heterogeneous noise GP object (of
hetGP)- Parameters:
x (ndarray_like) – matrix of designs locations to predict at (one point per row)
noise_var (bool (default False)) – should the variance of the latent variance process be returned?
xprime (ndarray_like) – optional second matrix of predictive locations to obtain the predictive covariance matrix between
xandxprimenugs_only (bool (default False)) – if
True, only return noise variance predictionkwargs (dict) – optional additional elements (not used)
interval (str | None)
interval_lower (float | None)
interval_upper (float | None)
- Returns:
mean: kriging mean;sd2: kriging variance (filtered, e.g. without the nugget values)nugs: noise variance predictionsd2_var: (returned ifnoise_var = True) kriging variance of the noise process (i.e., on log-variances iflogN = TRUE)cov: (returned ifxprimeis given) predictive covariance matrix betweenxandxprime
- Return type:
dict with elements
Notes
The full predictive variance corresponds to the sum of
sd2andnugs. See :func: ~hetgpy.hetGP.mleHetGP for examples.
- rebuild(robust=False)[source]
Rebuilds inverse covariance matrix in homGP object (usually after saving). Works in tandem with strip
- Parameters:
robust (bool) – use np.linalg.pinv for covariance matrix inversion. Otherwise use cholesky
- Return type:
self with rebuilt inverse covariance matrix
- update(Xnew, Znew, ginit=0.01, lower=None, upper=None, noiseControl=None, settings=None, known={}, maxit=100, method='quick')[source]
Update model object with new observations
- Parameters:
Xnew (ndarray_like) – new inputs (one observation per row)
Znew (ndarray_like) – new observations
lower (ndarray_like) – lower bound for lengthscales. If not provided extracted from self
upper (ndarray_like) – upper bound for lengthscales. If not provided extracted from self
noiseControl (dict) – noise bounds. If not provided extracted from self
settings (dict) – options for optimization. If not provided extracted from self
known (dict) – known hyperparameters to fix at values
maxit (int) – maximum number of iterations for hyperparameter optimization
ginit (float)
method (str)
- Return type:
self, potentially with updated hyperparameters
Notes
If hyperparameters do not need to be updated, maxit can be set to 0. In this case it is possible to pass NAs in Znew, then the model can still be used to provide updated variance predictions
Example
>>> import numpy as np, from hetgpy import hetGP >>> X = np.array([1, 2, 3, 4, 12]).reshape(-1,1) >>> Y = np.array([0, -1.75, -2, -0.5, 5]) >>> model = hetGP().mle(X,Y,known={'theta':np.array([10.0]),'g':1e-8}) >>> Xnew = np.array([2.5]).reshape(-1,1) >>> Znew = model.predict(Xnew)['mean'] >>> model.update(Xnew,Znew)
- class homGP[source]
Methods
copy()Make a copy of the model, which is useful in tandem with the update function
dlogLikHom(X0, Z0, Z, mult, theta, g[, ...])Log likelihood gradient under homoskedastic noise
get(key)General get item (retrives key from self.__dict__)
logLikHom(X0, Z0, Z, mult, theta, g[, ...])Log Likelihood under homoskedastic noise
mleHomGP(X, Z[, lower, upper, known, ...])Gaussian process modeling with homoskedastic noise.
plot([type])Plot model behavior.
predict(x[, xprime, interval, ...])Prediction under homoskedastic noise
rebuild([robust])Rebuilds inverse covariance matrix in homGP object (usually after saving).
strip()Removes larger model objects
summary()Print a summary of the model object and the MLE routine
update(Xnew, Znew[, ginit, lower, upper, ...])Update model object with new observations
- copy()[source]
Make a copy of the model, which is useful in tandem with the update function
- Parameters:
None
- Returns:
newmodel
- Return type:
(deep) copy of model
- dlogLikHom(X0, Z0, Z, mult, theta, g, beta0=None, covtype='Gaussian', eps=np.float64(1.4901161193847656e-08), components=('theta', 'g'))[source]
Log likelihood gradient under homoskedastic noise
- Parameters:
X0 (ndarray_like) – unique design matrix (of size nxd)
Z0 (ndarray_like) – averaged responses (size nx1)
Z (ndarray_like) – observation (output) vector (size N)
mult (ndarray_like) – number of replicates as each design location. Note that mult.sum()==N
theta (ndarray_like) – lengthscale
g (float) – noise variance
beta0 (float) – trend
covtype (str) – covariance kernel to use
eps (float) – amount of jitter on diagonal of covariance matrix
components (list) – directions for partial derivatives, one or both of ‘theta’ (lengthscales) or ‘g’ (for noise)
- Returns:
out – gradient with respect to hyperparameters
- Return type:
ndarray_like
- logLikHom(X0, Z0, Z, mult, theta, g, beta0=None, covtype='Gaussian', eps=np.float64(1.4901161193847656e-08))[source]
Log Likelihood under homoskedastic noise
- Parameters:
X0 (ndarray_like) – unique design matrix (of size nxd)
Z0 (ndarray_like) – averaged responses (size nx1)
Z (ndarray_like) – observation (output) vector (size N)
mult (ndarray_like) – number of replicates as each design location. Note that mult.sum()==N
theta (ndarray_like) – lengthscale
g (float) – noise variance
beta0 (float) – trend
covtype (str) – covariance kernel to use
eps (float) – amount of jitter on diagonal of covariance matrix
- Returns:
loglik – log-likelihood
- Return type:
float
- mleHomGP(X, Z, lower=None, upper=None, known={}, noiseControl={'g_bounds': (np.float64(1.4901161193847656e-08), 100.0)}, init={}, covtype='Gaussian', maxit=100, eps=np.float64(1.4901161193847656e-08), settings={'factr': 10000000.0, 'returnKi': True})[source]
Gaussian process modeling with homoskedastic noise.
You may also call this function as model.mle
Gaussian process regression under homoskedastic noise based on maximum likelihood estimation of the hyperparameters. This function is enhanced to deal with replicated observations.
- Parameters:
X (ndarray_like) – matrix of all designs, one per row, or list with elements: -
X0matrix of unique design locations, one point per row -Z0vector of averaged observations, of lengthlen(X0)-multnumber of replicates at designs inX0, of lengthlen(X0)Z (ndarray_like) – Z vector of all observations. If using a list with
X,Zhas to be ordered with respect toX0, and of lengthsum(mult)lower (ndarray_like) – optional bounds for the
thetaparameter (see :func: covariance_functions.cov_gen for the exact parameterization). In the multivariate case, it is possible to give vectors for bounds (resp. scalars) for anisotropy (resp. isotropy)upper (ndarray_like) – optional bounds for the
thetaparameter (see :func: covariance_functions.cov_gen for the exact parameterization). In the multivariate case, it is possible to give vectors for bounds (resp. scalars) for anisotropy (resp. isotropy)noiseControl (dict) –
- dict with element:
g_boundsvector providing minimal and maximal noise to signal ratio (default to(sqrt(MACHINE_DOUBLE_EPS), 100)).
settings (dict) –
- dict for options about the general modeling procedure, with elements:
return_Kiboolean to include the inverse covariance matrix in the object for further use (e.g., prediction).factr(default to 1e7) andpgtolare available to be passed to options for L-BFGS-B in :func:scipy.optimize.minimize.
eps (float) – jitter used in the inversion of the covariance matrix for numerical stability
known (dict) – optional dict of known parameters (e.g.
beta0,theta,g)init (dict) –
- optional lists of starting values for mle optimization:
theta_initinitial value of the theta parameters to be optimized over (default to 10% of the range determined withlowerandupper)g_initvector of nugget parameter to be optimized over
covtype (str) – covariance kernel type, either
'Gaussian','Matern5_2'or'Matern3_2', see :func:~covariance_functions.cov_genmaxit (int) – maximum number of iterations for L-BFGS-B of :func:
scipy.optimize.minimizededicated to maximum likelihood optimization
- Return type:
None
Notes
The global covariance matrix of the model is parameterized as
nu_hat * (C + g * diag(1/mult)) = nu_hat * K, withCthe correlation matrix between unique designs, depending on the family of kernel used (see :func: ~hetgpy.covariance_functions.cov_gen for available choices) and values of lengthscale parameters.nu_hatis the plugin estimator of the variance of the process.It is generally recommended to use :func:
~find_reps.find_repsto pre-process the data, to rescale the inputs to the unit cube and to normalize the outputs.- Returns:
theta: unless given, maximum likelihood estimate (mle) of the lengthscale parameter(s),nu_hat: plugin estimator of the variance,g: unless given, mle of the nugget of the noise/log-noise process,trendtype: either"SK"ifbeta0is provided, else"OK",beta0: constant trend of the mean process, plugin-estimator unless given,ll: log-likelihood value, (ll_non_pen) is the value without the penalty,nit_opt,msg: counts and message returned by :func:scipy.optimize.minimizeused_args: list with arguments provided in the call to the function,Ki,Kgi: inverse of the covariance matrices of the mean and noise processes (not scaled bynu_hatandnu_hat_var),X0,Z0,Z,eps,logN,covtype: values given in input,
time: time to train the model, in seconds.
See also ~hetgpy.homGP.homGP.predict for predictions, ~hetgpy.homGP.update for updating an existing model.
summaryandplotfunctions are available as well. ~homTP.mleHomTP provide a Student-t equivalent.- Return type:
self, with the following attributes
- Parameters:
X (ArrayLike)
Z (ArrayLike)
lower (TypeAliasForwardRef('ArrayLike') | None)
upper (TypeAliasForwardRef('ArrayLike') | None)
known (dict)
noiseControl (dict)
init (dict)
covtype (str)
maxit (int)
eps (float)
settings (dict)
Examples
>>> from hetgpy import homGP >>> from hetgpy.example_data import mcycle >>> m = mcycle() >>> model = homGP() >>> model.mle(m['times'],m['accel'],lower=[1.0],upper=[10.0],covtype="Matern5_2")
References
M. Binois, Robert B. Gramacy, M. Ludkovski (2018), Practical heteroskedastic Gaussian process modeling for large simulation experiments, Journal of Computational and Graphical Statistics, 27(4), 808–821. Preprint available on arXiv:1611.05902.
- plot(type='diagnostics')[source]
Plot model behavior. See hetgpy.plot for more details on types of plots
- Parameters:
type (str) – type of plot: currently supported are “diagnostics” and “iterates”
- Returns:
fig, ax
- Return type:
figure and axis object
- predict(x, xprime=None, interval=None, interval_lower=None, interval_upper=None, **kw)[source]
Prediction under homoskedastic noise
- Parameters:
x (ndarray_like) – matrix of designs locations to predict at (one point per row)
xprime (ndarray_like) – optional second matrix of predictive locations to obtain the predictive covariance matrix between
xandxprimeinterval (str) – one of ‘confidence’ or ‘predictive’ which is a convenience method to return confidence/predictive intervals corresponding to interval_lower and interval_upper
interval_lower (float) – lower of confidence/predictive interval
interval_upper (float) – upper of confidence/predictive interval
nugs_only (bool (default False)) – if
True, only return noise variance predictionkwargs (dict) – optional additional elements (only used for nugs_only)
- Returns:
mean: kriging mean;sd2: kriging variance (filtered, e.g. without the nugget values)nugs: noise variance predictioncov: (returned ifxprimeis given) predictive covariance matrix betweenxandxprimeconfidence_interval: prediction with kriging variance onlypredictive_interval: prediction with kriging and noise variance
- Return type:
dict with elements
- rebuild(robust=False)[source]
Rebuilds inverse covariance matrix in homGP object (usually after saving). Works in tandem with strip
- Parameters:
robust (bool) – use np.linalg.pinv for covariance matrix inversion. Otherwise use cholesky
- Return type:
self with rebuilt inverse covariance matrix
- update(Xnew, Znew, ginit=0.01, lower=None, upper=None, noiseControl=None, settings=None, known={}, maxit=100)[source]
Update model object with new observations
- Parameters:
Xnew (ndarray_like) – new inputs (one observation per row)
Znew (ndarray_like) – new observations
lower (ndarray_like) – lower bound for lengthscales. If not provided extracted from self
upper (ndarray_like) – upper bound for lengthscales. If not provided extracted from self
noiseControl (dict) – noise bounds. If not provided extracted from self
settings (dict) – options for optimization. If not provided extracted from self
known (dict) – known hyperparameters to fix at values
maxit (int) – maximum number of iterations for hyperparameter optimization
ginit (float)
- Return type:
self, potentially with updated hyperparameters
Notes
If hyperparameters do not need to be updated, maxit can be set to 0. In this case it is possible to pass NAs in Znew, then the model can still be used to provide updated variance predictions
Example
>>> import numpy as np, from hetgpy import homGP >>> X = np.array([1, 2, 3, 4, 12]).reshape(-1,1) >>> Y = np.array([0, -1.75, -2, -0.5, 5]) >>> model = homGP().mle(X,Y,known={'theta':np.array([10.0]),'g':1e-8}) >>> Xnew = np.array([2.5]).reshape(-1,1) >>> Znew = model.predict(Xnew)['mean'] >>> model.update(Xnew,Znew)
- crit_EI(x, model, cst=None, preds=None)[source]
Expected Improvement (EI) criteria
- Parameters:
x (nd_arraylike) – model designs, one point per row
model (hetgpy.hetGP) – hetGP or homGP model
cst (float) – optional plugin value of the mean
preds (Dict) – model predictions (optional)
References
Mockus, J.; Tiesis, V. & Zilinskas, A. (1978). The application of Bayesian methods for seeking the extremum Towards Global Optimization, Amsterdam: Elsevier, 2, 2. Vazquez E, Villemonteix J, Sidorkiewicz M, Walter E (2008). Global Optimization Based on Noisy Evaluations: An Empirical Study of Two Statistical Approaches, Journal of Physics: Conference Series, 135, IOP Publishing.
Examples
>>> from hetgpy.test_functions import f1d >>> from hetgpy.homGP import homGP >>> from hetgpy.optim import crit_EI >>> import numpy as np >>> ftest = f1d >>> n_init = 5 # number of unique designs >>> X = np.linspace(0, 1, n_init).reshape(-1,1) >>> Z = ftest(X) >>> xgrid = np.linspace(0,1,51).reshape(-1,1) >>> model = homGP() >>> model.mle(X = X, Z = Z, lower = np.array([0.01]), upper = np.array([1]), known = dict(g = 2e-8)) >>> EI = crit_EI(xgrid, model, cst = model.Z0.min())
- crit_logEI(x, model, cst=None, preds=None)[source]
Logarithm of Expected Improvement (EI) criteria
- Parameters:
x (nd_arraylike) – model designs, one point per row
model (hetgpy.hetGP) – hetGP or homGP model
cst (float) – optional plugin value of the mean
preds (Dict) – model predictions (optional)
References
Ament, S., Daulton, S., Eriksson, D., Balandat, M., & Bakshy, E. (2024). Unexpected improvements to expected improvement for Bayesian optimization. Advances in Neural Information Processing Systems, 36.
Examples
>>> from hetgpy.test_functions import f1d >>> from hetgpy.homGP import homGP >>> from hetgpy.optim import crit_logEI >>> import numpy as np >>> ftest = f1d >>> n_init = 5 # number of unique designs >>> X = np.linspace(0, 1, n_init).reshape(-1,1) >>> Z = ftest(X) >>> xgrid = np.linspace(0,1,51).reshape(-1,1) >>> model = homGP() >>> model.mle(X = X, Z = Z, lower = np.array([0.01]), upper = np.array([1]), known = dict(g = 2e-8)) >>> logEI = crit_logEI(xgrid, model, cst = model.Z0.min())
- crit_qEI(x, model, cst=None, preds=None)[source]
Fast approximated batch-Expected Improvement criterion (for minimization) Parallel Expected improvement
- Parameters:
x (nd_array) – matrix of new designs representing the batch of q points, one point per row (q x d)
model (homGP/hetGP) – model object including inverve matrices
cst (nd_array) – optional optional plugin value used in the EI, see details
preds (dict) – optional predictions at x to avoid recomputing if already done (must include the predictive covariance, i.e., the cov slot)
- Returns:
qEI_cpp
Details
——-
cst is classically the observed minimum in the deterministic case. In the noisy case, the min of the predictive mean works fine. This is a beta version at this point. It may work for for TP models as well.
References
M. Binois (2015), Uncertainty quantification on Pareto fronts and high-dimensional strategies in Bayesian optimization, with applications in multi-objective automotive design. Ecole Nationale Superieure des Mines de Saint-Etienne, PhD thesis.
Examples
>>> from hetgpy.test_functions import f1d >>> from hetgpy.homGP import homGP >>> from hetgpy.optim import crit_qEI >>> import numpy as np >>> ftest = f1d >>> n_init = 5 # number of unique designs >>> X = np.linspace(0, 1, n_init).reshape(-1,1) >>> Z = ftest(X) >>> xgrid = np.linspace(0,1,51).reshape(-1,1) >>> model = homGP() >>> model.mleHomGP(X = X, Z = Z, lower = np.array([0.01]), upper = np.array([1]), known = dict(g = 2e-8)) >>> xbatch = np.array((0.37, 0.17, 0.7)).reshape(3, 1) >>> fqEI = crit_qEI(xbatch, model, cst = model.Z0.min())
- deriv_crit_EI(x, model, cst=None, preds=None)[source]
Derivative for crit_EI, used in crit_optim
- Parameters:
x (nd_arraylike) – model designs, one point per row
model (hetgpy.hetGP) – hetGP or homGP model
cst (float) – optional plugin value of the mean
preds (Dict) – model predictions (optional)
References
Ginsbourger, D. Multiples metamodeles pour l’approximation et l’optimisation de fonctions numeriques multivariables Ecole Nationale Superieure des Mines de Saint-Etienne, Ecole Nationale Superieure des Mines de Saint-Etienne, 2009 Roustant, O., Ginsbourger, D., DiceKriging, DiceOptim: Two R packages for the analysis of computer experiments by kriging-based metamodeling and optimization, Journal of Statistical Software, 2012
- deriv_crit_logEI(x, model, cst=None, preds=None)[source]
Derivative of the logarithm of Expected Improvement (EI) criteria
- Parameters:
x (nd_arraylike) – model designs, one point per row
model (hetgpy.hetGP) – hetGP or homGP model
cst (float) – optional plugin value of the mean
preds (Dict) – model predictions (optional)
References
Ament, S., Daulton, S., Eriksson, D., Balandat, M., & Bakshy, E. (2024). Unexpected improvements to expected improvement for Bayesian optimization. Advances in Neural Information Processing Systems, 36.
Examples
- dlambda(z, a)[source]
Derivative of the student-t pdf
- Parameters:
z (float) – input location
a (float) – degree of freedom parameter
- Return type:
derivative of student-t pdf
Examples
>>> dlambda(0.55, 3.6) # -0.2005827
- dlog_h(z, eps=np.float64(2.220446049250313e-16))[source]
References
Ament, S., Daulton, S., Eriksson, D., Balandat, M., & Bakshy, E. (2024). Unexpected improvements to expected improvement for Bayesian optimization. Advances in Neural Information Processing Systems, 36.
- log1mexp(x)[source]
References
Maechler, Martin (2012). Accurately Computing log(1-exp(-|a|)). Assessed from the Rmpfr package.
- log_h(z, eps=np.float64(2.220446049250313e-16))[source]
References
Ament, S., Daulton, S., Eriksson, D., Balandat, M., & Bakshy, E. (2024). Unexpected improvements to expected improvement for Bayesian optimization. Advances in Neural Information Processing Systems, 36.
- IMSPE(model, theta=None, Lambda=None, mult=None, covtype=None, nu=None, eps=np.float64(1.4901161193847656e-08))[source]
IMSPE of a given design.
Integrated Mean Square Prediction Error
- Parameters:
X (
hetgpy.hetGP.hetGPorhetgpy.homGP.homGPmodel.) – Alternatively, one can provide a matrix of unique designs consideredtheta (ndarray_like) – lengthscales
Lambda (ndarray_like) – diagonal matrix for the noise
mult (ndarray_like) – number of replicates at each design
covtype (str) – either “Gaussian”, “Matern3_2” or “Matern5_2”
nu (float) – variance parameter
eps (float) – numerical nugget
Details
-------
homGP (One can provide directly a model of class hetGP or)
arguments (or provide design locations X and all other)
- Wij(mu1, mu2=None, theta=None, type='Gaussian')[source]
Compute double integral of the covariance kernel over a [0,1]^d domain
- Parameters:
mu1 (ndarray)like) – input locations considered
mu2 (ndarray)like) – input locations considered
theta (ndarray_like) – lengthscale hyperparameter of the kernel
type (str) – kernel type, one of
"Gaussian","Matern5_2"or"Matern3_2"
References
M. Binois, J. Huang, R. B. Gramacy, M. Ludkovski (2019), Replication or exploration? Sequential design for stochastic simulation experiments, Technometrics, 61(1), 7-23. Preprint available on arXiv:1710.03206.
- allocate_mult(model=None, N=None, Wijs=None, use_Ki=False)[source]
Allocation of replicates on existing design locations, based on (29) from (Ankenman et al, 2010)
- Parameters:
model (hetGP model) – hetGP model
N (int) – total budget of replication to allocate
Wijs (ndarray) – optional previously computed matrix of Wijs, see hetgpy.IMSE.Wij
use_Ki (bool) – should Ki from model be used?
- Return type:
vector with approximated best number of replicates per design
References
Ankenman, B. Nelson, J. Staum (2010), Stochastic kriging for simulation metamodeling, Operations research, pp. 371–382, 58
- deriv_crit_IMSPE(x, model, id=None, Wijs=None)[source]
Derivative of crit_IMSPE
- Parameters:
- Return type:
Derivative of the sequential IMSPE with respect to x
- horizon(model, current_horizon=None, previous_ratio=None, target=None, Wijs=None, seed=None)[source]
Adapt the look-ahead horizon depending on the replicate allocation or a target ratio
- Parameters:
model (hetGP or homGP model) – hetGP or homGP model
current_horizon (int) – horizon used for the previous iteration, see details
previous_ratio (float) – ratio before adding the previous new design
target (float) – scalar in [0,1] for desired n/N
Wijs (nd_array) – optional previously computed matrix of Wijs, see hetgpy.IMSE.Wij
- Returns:
Randomly selected horizon for next iteration (adpative) if no target is provided,
otherwise returns the update horizon value.
Details
——-
If target is provided, along with previous_ratio and current_horizon
itemize{ – item the horizon is increased by one if more replicates are needed but a new ppint has been added at the previous iteration, item the horizon is decreased by one if new points are needed but a replicate has been added at the previous iteration, item otherwise it is unchanged.
}
If no target is provided, allocate_mult is used to obtain the best allocation of the existing replicates,
then the new horizon is sampled from the difference between the actual allocation and the best one, bounded below by 0.
See (Binois et al. 2017).
References
M. Binois, J. Huang, R. B. Gramacy, M. Ludkovski (2019), Replication or exploration? Sequential design for stochastic simulation experiments, Technometrics, 61(1), 7-23.cr Preprint available on arXiv:1710.03206.
- lhs_EP(m, seed=None)[source]
From DiceDesign: FUNCTION PERFORMING ELEMENTARY PERMUTATION (EP) IN LHD USED IN SA ALGORITHMS
- Parameters:
m (nd_arraylike) – the design
- Returns:
out – list including design after EP, ligns and columns defining EP
- Return type:
tuple
- maximinSA_LHS(design, T0=10, c=0.95, it=2000, p=50, profile='GEOM', Imax=100, seed=None)[source]
Implementation of maximinSA_LHS from DiceDesign Only profile=”GEOM” is implemented (like in hetGP)
#####maximinSA_LHS##### #####Maximin LHS VIA SIMULATED ANNEALING OPTIMIZATION#####
#—————————————————————————| #args : m : the design | # T0 : the initial temperature | # c : parameter regulating the temperature | # it : the number of iterations | # p : power required in phiP criterion | # profile : temperature down profile | # “GEOM” or “GEOM_MORRIS” or “LINEAR”. By default : “GEOM” | #output : a list containing all the input arguments plus: | # a mindist optimized design | # vector of criterion values along the iterations | # vector of temperature values along the iterations | # vector of acceptation probability values along the iterations | #depends : phiP,lhs_EP | #—————————————————————————|
- mi(mu1, theta, type)[source]
Compute integral of the covariance kernel over a [0,1]^d domain
- Parameters:
mu1 (ndarray_like) – input locations considered
theta (ndarray_like) – lengthscale hyperparameter of the kernel
type (str) – kernel type, one of
"Gaussian","Matern5_2"or"Matern3_2"
References
Replication or exploration? Sequential design for stochastic simulation experiments, Technometrics, 61(1), 7-23. Preprint available on arXiv:1710.03206.
- phiP(design, p=50)[source]
Implementation of phiP.R from DiceDesign (necessary for maximinSA_LHS from DiceDesign which is used by IMSPE_search)
From DiceDesign: Compute the phiP criterion (Lp norm of the sum of the inverses of the design inter-point distances) Reference: Pronzato, L. and Muller, W.,2012, Design of computer experiments: space filling and beyond, Statistics and Computing, 22:681-701. A higher phiP corresponds to a more regular scaterring of design points
- Parameters:
design (nd_arraylike) – design for a computer experiment
p (int) – the “p” in the Lp norm which is taken (default=50)
- Returns:
fi_p – the phiP criterion
- Return type:
np.float
- LOO_preds(model, ids=None)[source]
Provide leave one out predictions, e.g., for model testing and diagnostics. This is used in the method plot available on GP and TP models.
- Parameters:
- Return type:
dict with mean and variance predictions at x_i assuming this point has not been evaluated
Notes
For TP models, psi is considered fixed
References
O. Dubrule (1983), Cross validation of Kriging in a unique neighborhood, Mathematical Geology 15, 687–699. cr cr F. Bachoc (2013), Cross Validation and Maximum Likelihood estimations of hyper-parameters of Gaussian processes
Examples
>>> from hetgpy import homGP >>> from hetgpy.LOO import LOO_preds >>> from hetgpy.find_reps import find_reps >>> from hetgpy.example_data import mcycle >>> import numpy as np >>> m = mcycle() >>> X, Z = m['times'], m['accel'] >>> model = homGP() >>> model.mle(X = X, Z = Z, lower = [0.1], upper = [10], >>> covtype = "Matern5_2", known = dict(beta0 = 0)) >>> LOO_p = LOO_preds(model) >>> d = find_reps(X,Z) >>> LOO_ref = np.zeros(shape=(d['X0'].shape[0],2)) >>> rows = np.arange(d['X0'].shape[0]) >>> for i in rows: >>> idxs = np.delete(rows,i) >>> model_i = homGP() >>> model_i.mle(X = dict(X0 = d['X0'][idxs,:], Z0 = d['Z0'][idxs], >>> mult = d['mult'][idxs]), >>> Z = np.concatenate([d['Zlist'][j] for j in idxs]), >>> lower = [0.1], upper = [50], covtype = "Matern5_2", >>> known = dict(theta = model.theta, g = model.g, >>> beta0 = 0)) >>> model_i.nu_hat = model.nu_hat >>> p_i = model_i.predict(d['X0'][i:i+1,:]) >>> LOO_ref[i,:] = np.array([p_i['mean'], p_i['sd2']]).squeeze() >>> print(np.ptp(LOO_ref[:,0] - LOO_p['mean'])) >>> print(np.ptp(LOO_ref[:,1] - LOO_p['sd2'])) >>> model.plot()