hetGP
- class hetGP[source]
Methods
Leave-out-out predictions for the nugget
copy()Make a copy of the model, which is useful in tandem with the update function
dlogLikHet(X0, Z0, Z, mult, Delta, theta, g)derivative of log-likelihood for logLikHet_Wood with respect to theta and Lambda with all observations Model: K = nu2 * (C + Lambda) = nu using all observations using the replicates information nu2 is replaced by its plugin estimator in the likelihood
logLikHet(X0, Z0, Z, mult, Delta, theta, g)log-likelihood in the anisotropic case - one lengthscale by variable Model: K = nu2 * (C + Lambda) = nu using all observations using the replicates information nu2 is replaced by its plugin estimator in the likelihood
mleHetGP(X, Z[, lower, upper, known, ...])Gaussian process modeling with heteroskedastic noise
plot([type])Plot model behavior.
predict(x[, noise_var, xprime, nugs_only, ...])Gaussian process predictions using a heterogeneous noise GP object (of
hetGP)rebuild([robust])Rebuilds inverse covariance matrix in homGP object (usually after saving).
strip()Removes larger model objects
summary()Print summary of model object and hyperparameter optimization
update(Xnew, Znew[, ginit, lower, upper, ...])Update model object with new observations
get
- LOO_preds_nugs(i)[source]
Leave-out-out predictions for the nugget
- Parameters:
self (hetGP object)
i (int) – index of the point to remove
- Return type:
LOO preds
- copy()[source]
Make a copy of the model, which is useful in tandem with the update function
- Parameters:
None
- Returns:
newmodel
- Return type:
(deep) copy of model
- dlogLikHet(X0, Z0, Z, mult, Delta, theta, g, k_theta_g=None, theta_g=None, beta0=None, pX=None, logN=True, SiNK=False, components=None, eps=np.float64(1.4901161193847656e-08), covtype='Gaussian', SiNK_eps=0.0001, penalty=True, hom_ll=None)[source]
derivative of log-likelihood for logLikHet_Wood with respect to theta and Lambda with all observations Model: K = nu2 * (C + Lambda) = nu using all observations using the replicates information nu2 is replaced by its plugin estimator in the likelihood
- Parameters:
X0 (ndarray_like) – unique designs
Z0 (ndarray_like) – averaged observations
Z (ndarray_like) – replicated observations (sorted with respect to X0)
mult (ndarray_like) – number of replicates at each Xi
Delta (ndarray_like) – vector of nuggets corresponding to each X0i or pXi, that are smoothed to give Lambda
logN (bool) – should exponentiated variance be used
SiNK (bool) – should the smoothing come from the SiNK predictor instead of the kriging one
theta (ndarray_like) – scale parameter for the mean process, either one value (isotropic) or a vector (anistropic)
k_theta_g (ndarray_like) – constant used for linking nuggets lengthscale to mean process lengthscale, i.e., theta_g[k] = k_theta_g * theta[k], alternatively theta_g can be used
theta_g (ndarray_like) – either one value (isotropic) or a vector (anistropic), alternative to using k_theta_g
g (ndarray_like) – nugget of the nugget process
pX (ndarray_like) – matrix of pseudo inputs locations of the noise process for Delta (could be replaced by a vector to avoid double loop)
components (ndarray_like) – components to determine which variable are to be taken in the derivation: None for all, otherwise list with elements from ‘theta’, ‘Delta’, ‘theta_g’, ‘k_theta_g’, ‘pX’ and ‘g’.
beta0 (float) – mean, if not provided, the MLE estimator is used
eps (float) – minimal value of elements of Lambda
covtype (str) – covariance kernel type
penalty (bool) – should a penalty term on Delta be used?
hom_ll (float) – reference homoskedastic likelihood
SiNK_eps (float)
- Return type:
ArrayLike
- logLikHet(X0, Z0, Z, mult, Delta, theta, g, k_theta_g=None, theta_g=None, logN=None, SiNK=False, beta0=None, pX=None, eps=np.float64(1.4901161193847656e-08), covtype='Gaussian', SiNK_eps=0.0001, penalty=True, hom_ll=None, trace=0)[source]
log-likelihood in the anisotropic case - one lengthscale by variable Model: K = nu2 * (C + Lambda) = nu using all observations using the replicates information nu2 is replaced by its plugin estimator in the likelihood
- Parameters:
X0 (ndarray_like) – unique designs
Z0 (ndarray_like) – averaged observations
Z (ndarray_like) – replicated observations (sorted with respect to X0)
mult (ndarray_like) – number of replicates at each Xi
Delta (ndarray_like) – vector of nuggets corresponding to each X0i or pXi, that are smoothed to give Lambda
logN (bool) – should exponentiated variance be used
SiNK (bool) – should the smoothing come from the SiNK predictor instead of the kriging one
theta (ndarray_like) – scale parameter for the mean process, either one value (isotropic) or a vector (anistropic)
k_theta_g (ndarray_like) – constant used for linking nuggets lengthscale to mean process lengthscale, i.e., theta_g[k] = k_theta_g * theta[k], alternatively theta_g can be used
theta_g (ndarray_like) – either one value (isotropic) or a vector (anistropic), alternative to using k_theta_g
g (ndarray_like) – nugget of the nugget process
pX (ndarray_like) – matrix of pseudo inputs locations of the noise process for Delta (could be replaced by a vector to avoid double loop)
beta0 (float) – mean, if not provided, the MLE estimator is used
eps (float) – minimal value of elements of Lambda
covtype (str) – covariance kernel type
penalty (bool) – should a penalty term on Delta be used?
hom_ll (float) – reference homoskedastic likelihood
SiNK_eps (float)
trace (int)
- Return type:
float
- mleHetGP(X, Z, lower=None, upper=None, known={}, noiseControl={'g_bounds': (1e-06, 1), 'g_max': 100, 'k_theta_g_bounds': (1, 100)}, init={}, covtype='Gaussian', maxit=100, eps=np.float64(1.4901161193847656e-08), settings={'factr': 1000000000.0, 'ignore_MLE_divide_invalid': True, 'returnKi': True}, use_torch=False)[source]
Gaussian process modeling with heteroskedastic noise
You may also call this function as model.mle
Gaussian process regression under input dependent noise based on maximum likelihood estimation of the hyperparameters. A second GP is used to model latent (log-) variances. This function is enhanced to deal with replicated observations.
- Parameters:
X (ndarray_like) – matrix of all designs, one per row, or list with elements: -
X0matrix of unique design locations, one point per row -Z0vector of averaged observations, of lengthlen(X0)-multnumber of replicates at designs inX0, of lengthlen(X0)Z (ndarray_like) – Z vector of all observations. If using a list with
X,Zhas to be ordered with respect toX0, and of lengthsum(mult)lower (ndarray_like) – optional bounds for the
thetaparameter (see :func: covariance_functions.cov_gen for the exact parameterization). In the multivariate case, it is possible to give vectors for bounds (resp. scalars) for anisotropy (resp. isotropy)upper (ndarray_like) – optional bounds for the
thetaparameter (see :func: covariance_functions.cov_gen for the exact parameterization). In the multivariate case, it is possible to give vectors for bounds (resp. scalars) for anisotropy (resp. isotropy)noiseControl (dict) –
- dict with elements related to optimization of the noise process parameters:
g_min,g_maxminimal and maximal noise to signal ratio (of the mean process)lowerDelta,upperDeltaoptional vectors (or scalars) of bounds onDelta, of lengthlen(X0)(default tonp.repeat(eps, X0.shape[0])andnp.repeat(noiseControl["g_max"], X0.shape[0])resp., or theirlog)lowerpX,upperpXoptional vectors of bounds of the input domain if pX is used.lowerTheta_g,upperTheta_goptional vectors of bounds for the lengthscales of the noise process iflinkThetas == 'none'. Same as forthetaif not provided.k_theta_g_boundsiflinkThetas == 'joint', vector with minimal and maximal values fork_theta_g(default to(1, 100)). See Notes.g_boundsvector for minimal and maximal noise to signal ratios for the noise of the noise process, i.e., the smoothing parameter for the noise process. (default to(1e-6, 1)).
settings (dict) –
- dict for options about the general modeling procedure, with elements:
linkThetasdefines the relation between lengthscales of the mean and noise processes. Either'none','joint'``(default) or ``'constr', see Notes.logN, whenTrue(default), the log-noise process is modeled.initStrategyone of'simple','residuals'(default) and'smoothed'to obtain starting values forDelta, see NotespenaltywhenTrue, the penalized version of the likelihood is used (i.e., the sum of the log-likelihoods of the mean and variance processes, see References).hardpenaltyisTrue, the log-likelihood from the noise GP is taken into account only if negative (default ifmaxit > 1000).checkHomwhenTrue, if the log-likelihood with a homoskedastic model is better, then return it.traceoptional scalar (default to0). If negative, fit silently. If0, only high level information is given. If1, information is given about the result of the heterogeneous model optimization. Level2gives more details. Level3additionaly displays all details about initialization of hyperparameters.return_matricesboolean to include the inverse covariance matrix in the object for further use (e.g., prediction).return_homboolean to include homoskedastic GP models used for initialization (i.e.,modHomandmodNugs).factr(default to 1e9) andpgtolare available to be passed to options for L-BFGS-B in :func:scipy.optimize.minimize.
eps (float) – jitter used in the inversion of the covariance matrix for numerical stability
init (dict) –
optional lists of starting values for mle optimization or that should not be optimized over, respectively. Values in
knownare not modified, while it can happen to these ofinit, see Notes. One can set one or several of the following:thetalengthscale parameter(s) for the mean process either one value (isotropic) or a vector (anistropic)Deltavector of nuggets corresponding to each design inX0, that are smoothed to giveLambda(as the global covariance matrix depends onDeltaandnu_hat, it is recommended to also pass values fortheta)beta0constant trend of the mean processk_theta_gconstant used for link mean and noise processes lengthscales, whensettings['linkThetas'] == 'joint'theta_geither one value (isotropic) or a vector (anistropic) for lengthscale parameter(s) of the noise process, whensettings['linkThetas'] != 'joint'gscalar nugget of the noise processg_Hscalar homoskedastic nugget for the initialisation with a :func: homGP.mleHomGP. See Notes.pXmatrix of fixed pseudo inputs locations of the noise process corresponding to Delta
known (dict) –
optional lists of starting values for mle optimization or that should not be optimized over, respectively. Values in
knownare not modified, while it can happen to these ofinit, see Notes. One can set one or several of the following:thetalengthscale parameter(s) for the mean process either one value (isotropic) or a vector (anistropic)Deltavector of nuggets corresponding to each design inX0, that are smoothed to giveLambda(as the global covariance matrix depends onDeltaandnu_hat, it is recommended to also pass values fortheta)beta0constant trend of the mean processk_theta_gconstant used for link mean and noise processes lengthscales, whensettings['linkThetas'] == 'joint'theta_geither one value (isotropic) or a vector (anistropic) for lengthscale parameter(s) of the noise process, whensettings['linkThetas'] != 'joint'gscalar nugget of the noise processg_Hscalar homoskedastic nugget for the initialisation with a :func: homGP.mleHomGP. See Notes.pXmatrix of fixed pseudo inputs locations of the noise process corresponding to Delta
covtype (str) – covariance kernel type, either
'Gaussian','Matern5_2'or'Matern3_2', see :func:~covariance_functions.cov_genmaxit (int) – maximum number of iterations for L-BFGS-B of :func:
scipy.optimize.minimizededicated to maximum likelihood optimizationuse_torch (bool)
- Return type:
None
Notes
The global covariance matrix of the model is parameterized as
nu_hat * (C + Lambda * np.diag(1/mult)) = nu_hat * K, withCthe correlation matrix between unique designs, depending on the family of kernel used (see :func: ~covariance_functions.cov_gen for available choices) and values of lengthscale parameters.nu_hatis the plugin estimator of the variance of the process.Lambdais the prediction on the noise level given by a second (homoskedastic) GP: .. math:: Lambda = C_g(C_g + mathrm{diag}(g/mathrm{mult}))^{-1} Delta} withC_gthe correlation matrix between unique designs for this second GP, with lengthscales hyperparameterstheta_gand nuggetgandDeltathe variance level atX0that are estimated.It is generally recommended to use :func:
~find_reps.find_repsto pre-process the data, to rescale the inputs to the unit cube and to normalize the outputs.- The noise process lengthscales can be set in several ways:
using
k_theta_g(settings['linkThetas'] == 'joint'), supposed to be greater than one by default. In this case lengthscales of the noise process are multiples of those of the mean process.if
settings['linkThetas'] == 'constr, then the lower bound ontheta_gcorrespond to estimated values of a homoskedastic GP fit.else lengthscales between the mean and noise process are independent (both either anisotropic or not).
When no starting nor fixed parameter values are provided with
initorknown, the initialization process consists of fitting first an homoskedastic model of the data, calledmodHom.init['theta'], initial lengthscales are taken at 10% of the range determined withlowerandupper, whileinit['g_H']may be use to pass an initial nugget value. The resulting lengthscales provide initial values fortheta(or update them if given ininit).If necessary, a second homoskedastic model,
modNugs, is fitted to the empirical residual variance between the prediction given bymodHomatX0andZ(up tomodHom['nu_hat]'). Note that when specifyingsettings['linkThetas'] == 'joint', then this second homoskedastic model has fixed lengthscale parameters. Starting values for ``theta_gandgare extracted frommodNugs- Three initialization schemes for
Deltaare available withsettings['initStrategy']: for
settings['initStrategy'] == 'simple',Deltais simply initialized to the estimatedgvalue ofmodHom.Note that this procedure may fail when
settings['penalty'] == True.for
settings['initStrategy'] == 'residuals',Deltais initialized to the estimated residual variance from the homoskedastic mean prediction.for
settings['initStrategy'] == 'smoothed',Deltatakes the values predicted bymodNugsatX0.
Notice that
lowerandupperbounds cannot be equal for:func: scipy.optimize.minimize. To use pseudo-input locations for the noise process, one can either providepXif they are not to be optimized. Otherwise, initial values are given withpXinit, and optimization bounds withlowerpX,upperpXininit. Automatic initialization of the other parameters without restriction is available for now only with method ‘simple’, otherwise it is assumed that pXinit points are a subset of X0.- Returns:
theta: unless given, maximum likelihood estimate (mle) of the lengthscale parameter(s),Delta: unless given, mle of the nugget vector (non-smoothed),Lambda: predicted input noise variance atX0,nu_hat: plugin estimator of the variance,theta_g: unless given, mle of the lengthscale(s) of the noise/log-noise process,k_theta_g: ifsettings['linkThetas'] == 'joint', mle for the constant by which lengthscale parameters ofthetaare multiplied to gettheta_g,g: unless given, mle of the nugget of the noise/log-noise process,trendtype: either"SK"ifbeta0is provided, else"OK",beta0: constant trend of the mean process, plugin-estimator unless given,nmean: plugin estimator for the constant noise/log-noise process mean,pX: if used, matrix of pseudo-inputs locations for the noise/log-noise process,ll: log-likelihood value, (ll_non_pen) is the value without the penalty,nit_opt,msg: counts and message returned by :func:scipy.optimize.minimizemodHom: homoskedastic GP model of classhomGPused for initialization of the mean process,modNugs: homoskedastic GP model of classhomGPused for initialization of the noise/log-noise process,nu_hat_var: variance of the noise process,used_args: list with arguments provided in the call to the function,Ki,Kgi: inverse of the covariance matrices of the mean and noise processes (not scaled bynu_hatandnu_hat_var),X0,Z0,Z,eps,logN,covtype: values given in input,
time: time to train the model, in seconds.
See also ~hetgpy.hetGP.hetGP.predict for predictions, ~hetgpy.hetGP.update for updating an existing model.
summaryandplotfunctions are available as well. ~hetTP.mleHetTP provide a Student-t equivalent.- Return type:
self, with the following attributes
- Parameters:
X (ArrayLike)
Z (ArrayLike)
lower (TypeAliasForwardRef('ArrayLike') | None)
upper (TypeAliasForwardRef('ArrayLike') | None)
known (dict)
noiseControl (dict)
init (dict)
covtype (str)
maxit (int)
eps (float)
settings (dict)
use_torch (bool)
Examples
>>> from hetgpy import hetGP >>> from hetgpy.example_data import mcycle >>> m = mcycle() >>> model = hetGP() >>> model.mle(m['times'],m['accel'],lower=[1.0],upper=[10.0],covtype="Matern5_2")
References
M. Binois, Robert B. Gramacy, M. Ludkovski (2018), Practical heteroskedastic Gaussian process modeling for large simulation experiments, Journal of Computational and Graphical Statistics, 27(4), 808–821. Preprint available on arXiv:1611.05902.
- plot(type='diagnostics', **kwargs)[source]
Plot model behavior. See hetgpy.plot for more details on types of plots
- Parameters:
type (str) – type of plot: currently supported are “diagnostics” and “iterates”
- Returns:
fig, ax
- Return type:
figure and axis object
- predict(x, noise_var=False, xprime=None, nugs_only=False, interval=None, interval_lower=None, interval_upper=None, **kwargs)[source]
Gaussian process predictions using a heterogeneous noise GP object (of
hetGP)- Parameters:
x (ndarray_like) – matrix of designs locations to predict at (one point per row)
noise_var (bool (default False)) – should the variance of the latent variance process be returned?
xprime (ndarray_like) – optional second matrix of predictive locations to obtain the predictive covariance matrix between
xandxprimenugs_only (bool (default False)) – if
True, only return noise variance predictionkwargs (dict) – optional additional elements (not used)
interval (str | None)
interval_lower (float | None)
interval_upper (float | None)
- Returns:
mean: kriging mean;sd2: kriging variance (filtered, e.g. without the nugget values)nugs: noise variance predictionsd2_var: (returned ifnoise_var = True) kriging variance of the noise process (i.e., on log-variances iflogN = TRUE)cov: (returned ifxprimeis given) predictive covariance matrix betweenxandxprime
- Return type:
dict with elements
Notes
The full predictive variance corresponds to the sum of
sd2andnugs. See :func: ~hetgpy.hetGP.mleHetGP for examples.
- rebuild(robust=False)[source]
Rebuilds inverse covariance matrix in homGP object (usually after saving). Works in tandem with strip
- Parameters:
robust (bool) – use np.linalg.pinv for covariance matrix inversion. Otherwise use cholesky
- Return type:
self with rebuilt inverse covariance matrix
- update(Xnew, Znew, ginit=0.01, lower=None, upper=None, noiseControl=None, settings=None, known={}, maxit=100, method='quick')[source]
Update model object with new observations
- Parameters:
Xnew (ndarray_like) – new inputs (one observation per row)
Znew (ndarray_like) – new observations
lower (ndarray_like) – lower bound for lengthscales. If not provided extracted from self
upper (ndarray_like) – upper bound for lengthscales. If not provided extracted from self
noiseControl (dict) – noise bounds. If not provided extracted from self
settings (dict) – options for optimization. If not provided extracted from self
known (dict) – known hyperparameters to fix at values
maxit (int) – maximum number of iterations for hyperparameter optimization
ginit (float)
method (str)
- Return type:
self, potentially with updated hyperparameters
Notes
If hyperparameters do not need to be updated, maxit can be set to 0. In this case it is possible to pass NAs in Znew, then the model can still be used to provide updated variance predictions
Example
>>> import numpy as np, from hetgpy import hetGP >>> X = np.array([1, 2, 3, 4, 12]).reshape(-1,1) >>> Y = np.array([0, -1.75, -2, -0.5, 5]) >>> model = hetGP().mle(X,Y,known={'theta':np.array([10.0]),'g':1e-8}) >>> Xnew = np.array([2.5]).reshape(-1,1) >>> Znew = model.predict(Xnew)['mean'] >>> model.update(Xnew,Znew)