
This function selects the bandwidth h of the Gaussian kernel used in the normality, two-sample and k-sample kernel-based quadratic distance (KBQD) tests.

Usage

select_h(
  x,
  y = NULL,
  alternative = NULL,
  method = "subsampling",
  b = 0.8,
  B = 100,
  delta_dim = 1,
  delta = NULL,
  h_values = NULL,
  Nrep = 50,
  n_cores = 2,
  Quantile = 0.95,
  power.plot = TRUE
)

Arguments

x

Data set of observations from X.

y

Numeric matrix or vector of data values. Depending on the input y, the selection of h is performed for the corresponding test.

  • if y = NULL, the function performs the test for normality on x.

  • if y is a data matrix with the same dimensions as x, the function performs the two-sample test between x and y.

  • if y is a numeric or factor vector, indicating the group memberships for each observation, the function performs the k-sample test.

alternative

Family of alternatives used for selecting h; one of "location", "scale" or "skewness".

method

The method used for critical value estimation ("subsampling", "bootstrap", or "permutation").

b

The size of the subsamples used in the subsampling algorithm (default: 0.8).

B

The number of iterations to use for critical value estimation (default: 100, as in the Usage signature).

delta_dim

Vector of coefficients of the alternative, one for each dimension.

delta

Vector of parameter values indicating the chosen alternatives.

h_values

Values of the tuning parameter h considered in the selection.

Nrep

Number of bootstrap/permutation/subsampling replications.

n_cores

Number of cores used to parallelize the h selection algorithm (default: 2).

Quantile

The quantile to use for critical value estimation, 0.95 is the default value.

power.plot

Logical. If TRUE, the plot of power for the values in h_values and delta is displayed.
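The sketch below shows how the arguments might be combined for the three supported tests. It is illustrative only: the simulated data, the object names (y_mat, x_all, grp) and the particular delta and h_values grids are assumptions made for the example, and select_h is assumed to be provided by the QuadratiK package. The calls are placed behind a flag because the selection is simulation-based and can take a while.

```r
# Simulated inputs (illustrative only)
n <- 50
x <- matrix(rnorm(n * 2), ncol = 2)

# Two-sample case: y is a matrix with the same dimensions as x
y_mat <- matrix(rnorm(n * 2), ncol = 2)

# k-sample case: x holds all observations, y gives the group of each row
x_all <- rbind(x, y_mat, matrix(rnorm(n * 2), ncol = 2))
grp <- factor(rep(1:3, each = n))

run_selection <- FALSE  # set to TRUE to actually run the (slow) selection
if (run_selection && requireNamespace("QuadratiK", quietly = TRUE)) {
  # y = NULL: selection of h for the normality test
  h_norm <- QuadratiK::select_h(x, alternative = "location")

  # matrix y: two-sample test, with a user-chosen grid (assumed values)
  h_two <- QuadratiK::select_h(
    x, y_mat,
    alternative = "location",
    delta = c(0.2, 0.4),
    h_values = seq(0.5, 3, by = 0.5),
    power.plot = FALSE
  )

  # vector y: k-sample test
  h_k <- QuadratiK::select_h(x_all, grp, alternative = "location")
}
```

Naming the arguments after x and y keeps the call readable, since the signature has many optional parameters.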

Value

A list with the following components:

  • h_sel: the selected value of the tuning parameter h;

  • power: table of power values computed for the considered values of delta and h_values;

  • power.plot: power plots (if power.plot is TRUE).
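A minimal sketch of working with the returned list. The object res is a hand-built stand-in that mimics the structure printed in the Examples (the rows for delta = 0.6 are copied from that output); it is not produced by running select_h here, and the names at_sel and power_wide are hypothetical.

```r
# Stand-in for a select_h() result; values copied from the Examples output
res <- list(
  h_sel = 2.4,
  power = data.frame(
    h     = c(0.4, 0.8, 1.2, 1.6, 2.0, 2.4, 2.8, 3.2),
    delta = 0.6,
    power = c(0.20, 0.28, 0.42, 0.42, 0.42, 0.44, 0.38, 0.44)
  )
)

# Selected bandwidth
res$h_sel

# Rows of the power table at the selected bandwidth
# (compared with a tolerance, since h values are floating-point numbers)
at_sel <- res$power[abs(res$power$h - res$h_sel) < 1e-8, ]

# Rearrange into a table with one row per h and one column per delta
power_wide <- xtabs(power ~ h + delta, data = res$power)
```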

Details

The function selects the optimal value of the tuning parameter \(h\) of the normal kernel function for the normality, two-sample and k-sample KBQD tests. It performs a small simulation study, generating samples according to the specified family of alternatives, for the chosen values of h_values and delta.
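The idea of a small simulation study over a grid of h and delta can be sketched as follows. This is not the package's algorithm: kernel_stat is a simplified stand-in Gaussian-kernel discrepancy rather than the KBQD statistic, the alternative is assumed to be a plain location shift of size delta, the critical value is obtained by permutation, and all function names and default sizes are hypothetical.

```r
set.seed(123)

# Stand-in statistic (an assumption, not the KBQD statistic):
# Gaussian-kernel discrepancy between two samples with bandwidth h
kernel_stat <- function(a, b, h) {
  z <- rbind(a, b)
  d2 <- as.matrix(dist(z))^2      # squared Euclidean distances
  K <- exp(-d2 / (2 * h^2))       # Gaussian kernel matrix
  ia <- seq_len(nrow(a))
  mean(K[ia, ia]) + mean(K[-ia, -ia]) - 2 * mean(K[ia, -ia])
}

# Permutation critical value: empirical quantile of B relabelled statistics
crit_value <- function(a, b, h, B, Quantile) {
  z <- rbind(a, b)
  n_a <- nrow(a)
  stats <- replicate(B, {
    idx <- sample(nrow(z))
    kernel_stat(z[idx[1:n_a], , drop = FALSE],
                z[idx[-(1:n_a)], , drop = FALSE], h)
  })
  unname(quantile(stats, probs = Quantile))
}

# Estimated power for one (h, delta) pair: proportion of rejections
# over Nrep samples generated under the shifted alternative
power_one <- function(h, delta, n = 30, d = 2, Nrep = 20, B = 30,
                      Quantile = 0.95) {
  rejections <- replicate(Nrep, {
    a <- matrix(rnorm(n * d), ncol = d)
    b <- matrix(rnorm(n * d, mean = delta), ncol = d)  # shifted sample
    kernel_stat(a, b, h) > crit_value(a, b, h, B, Quantile)
  })
  mean(rejections)
}

# Power over a small grid of bandwidths and alternatives
grid <- expand.grid(h = c(0.5, 1, 2), delta = c(0.2, 0.6))
grid$power <- mapply(power_one, grid$h, grid$delta)
grid
```

The resulting table has the same layout as the power component shown in the Examples (one row per combination of h and delta), which makes it easy to compare power across bandwidths for a fixed alternative.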

References

Markatou, M., Saraceno, G., Chen, Y. (2023). “Two- and k-Sample Tests Based on Quadratic Distances.” Manuscript, Department of Biostatistics, University at Buffalo.

Examples

# Select the value of h using the mid-power algorithm
# \donttest{
x <- matrix(rnorm(100), ncol = 2)
y <- matrix(rnorm(100), ncol = 2)
h_sel <- select_h(x, y, alternative = "skewness")

h_sel
#> $h_sel
#> [1] 2.4
#> 
#> $power
#>      h delta power
#> 1  0.4   0.2  0.04
#> 2  0.8   0.2  0.08
#> 3  1.2   0.2  0.04
#> 4  1.6   0.2  0.12
#> 5  2.0   0.2  0.14
#> 6  2.4   0.2  0.04
#> 7  2.8   0.2  0.10
#> 8  3.2   0.2  0.22
#> 9  0.4   0.3  0.10
#> 10 0.8   0.3  0.02
#> 11 1.2   0.3  0.12
#> 12 1.6   0.3  0.08
#> 13 2.0   0.3  0.14
#> 14 2.4   0.3  0.24
#> 15 2.8   0.3  0.18
#> 16 3.2   0.3  0.24
#> 17 0.4   0.6  0.20
#> 18 0.8   0.6  0.28
#> 19 1.2   0.6  0.42
#> 20 1.6   0.6  0.42
#> 21 2.0   0.6  0.42
#> 22 2.4   0.6  0.44
#> 23 2.8   0.6  0.38
#> 24 3.2   0.6  0.44
#> 
#> $power.plot

#> 
# }