Calculates the Kernel Density Estimation (KDE) of the sample data.
Syntax
NxKDE(X, Lo, Hi, Transform, $\lambda$, Kernel, H, Optimization, Return, Target)
The NxKDE function syntax has the following arguments:
 X
 is the input data series (a one- or two-dimensional array of cells (e.g., rows or columns)).
 Lo
 is the x-domain lower bound. If missing, no lower bound is assumed ($-\infty$).
 Hi
 is the x-domain upper bound. If missing, no upper bound is assumed ($+\infty$).
 Transform
 is a switch to select the data prior-transform method (0 = none (default), 1 = logit, 2 = probit, 3 = complementary log-log, 4 = log, 5 = power).
Value  Method
0  None / reflective (Silverman)
1  Logit transform
2  Probit (a.k.a. normit) transform
3  Complementary log-log transform
4  Log transform
5  Power (i.e., Box-Cox) transform
 $\lambda$
 is the power transform smoothing parameter.
 Kernel
 is a switch to select the kernel function (0 = Gaussian (default), 1 = uniform, 2 = triangular, 3 = biweight (quartic), 4 = triweight, 5 = Epanechnikov, 6 = cosine).
Value  Description
0  Gaussian kernel function (default)
1  Uniform kernel
2  Triangular kernel
3  Biweight or quartic kernel
4  Triweight kernel
5  Epanechnikov kernel
6  Cosine kernel
 H
 is the smoothing parameter (bandwidth) of the kernel density estimator. If missing, and Optimization is not 0 (none), NxKDE calculates an optimal value.
 Optimization
 is a switch to select the kernel bandwidth optimization method (0 = none (default), 1 = Silverman, 2 = direct plug-in, 3 = unbiased cross-validation).
Value  Method
0  None (default)
1  Silverman's rule of thumb
2  Direct plug-in (Sheather & Jones)
3  Unbiased cross-validation
 Return
 is a number that determines the type of return value: 0 (or missing) = PDF, 1 = CDF, 2 = inverse CDF, 3 = bandwidth.
Value  Description
0 or omitted  Probability Density Function (PDF)
1  Cumulative Distribution Function (CDF)
2  Inverse Cumulative Distribution Function (inverse CDF)
3  Bandwidth
 Target
 is the desired x-value(s) to calculate for (a single value or a one-dimensional array of cells (e.g., rows or columns)).
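The kernel codes in the Kernel argument can be illustrated outside Excel with a short Python sketch. The formulas below are the standard textbook definitions on the support $|u| \le 1$ (the Gaussian kernel is unbounded). The function names, the `KERNELS` lookup, and the `integrate` helper are illustrative assumptions; the source does not state the exact scaling NxKDE uses internally.

```python
import math

# Standard textbook kernel definitions; assumed, not taken from NxKDE's internals.
def gaussian(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def uniform(u):
    return 0.5 if abs(u) <= 1.0 else 0.0

def triangular(u):
    return 1.0 - abs(u) if abs(u) <= 1.0 else 0.0

def biweight(u):
    return 15.0 / 16.0 * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0

def triweight(u):
    return 35.0 / 32.0 * (1.0 - u * u) ** 3 if abs(u) <= 1.0 else 0.0

def epanechnikov(u):
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def cosine(u):
    return math.pi / 4.0 * math.cos(math.pi * u / 2.0) if abs(u) <= 1.0 else 0.0

# Keys mirror the Kernel argument values listed in the table above.
KERNELS = {
    0: gaussian,
    1: uniform,
    2: triangular,
    3: biweight,
    4: triweight,
    5: epanechnikov,
    6: cosine,
}

def integrate(fn, lo=-8.0, hi=8.0, steps=16000):
    """Midpoint-rule integral of fn over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(fn(lo + (i + 0.5) * width) for i in range(steps)) * width

if __name__ == "__main__":
    # Each kernel should integrate to approximately one.
    for code, kernel in KERNELS.items():
        print(code, kernel.__name__, round(integrate(kernel), 4))
```

Every kernel here is symmetric about zero and integrates to one, which is the property the estimator relies on.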
Remarks
 In statistics, kernel density estimation (KDE) is a nonparametric way to estimate the probability density function of a random variable.
 Let $\{x_i\}$ be an independent and identically distributed (i.i.d.) sample drawn from some distribution with an unknown density $f(\cdot)$. The kernel density estimator is defined as follows:
$$\hat f(x)=\frac{1}{nh}\sum_{i=1}^{n} K\left(\frac{x-x_i}{h}\right)$$ Where:
 $n$ is the number of observations in the sample.
 $K(\cdot)$ is the kernel function, a symmetric (but not necessarily positive) function that integrates to one.
 $h$ is the smoothing parameter called the bandwidth.
 The bandwidth of the kernel is a free parameter that exhibits a strong influence on the resulting estimate.
 The domain lower and upper bound arguments are optional, but if they are given, the input data is checked against those bounds. The error #NUM! is returned if any data point violates the bounds.
 The log and power transforms work with a single bound, while the remaining transforms work with two bounds.
 If both lower and upper bounds are specified, but the transform function is either log or power, NxKDE(.) returns #VALUE!.
 The none/reflection method does not transform the input data, but rather applies a special treatment to the x-values near the domain endpoints.
 NxKDE(.) returns a PDF of zero for any x-value outside the specified x-domain.
 NxKDE(.) returns a CDF of zero (0) for any x-value smaller than the x-domain lower bound and one (1) for any x-value greater than the x-domain upper bound.
 For the inverse CDF return type, NxKDE(.) returns #VALUE! if the target value is not in the $(0, 1)$ interval.
 NxKDE supports a fixed bandwidth throughout the sample.
 The input data series may include missing values (e.g., #N/A, #VALUE!, #NUM!, or empty cells). NxKDE(.) excludes all those values from the calculations.
 NxKDE(.) supports three bandwidth optimization methods. Except for the direct plug-in (DPI) method, the user can use any supported kernel function.
 The direct plug-in (DPI) method requires a kernel function with at least six (6) non-zero derivatives that are continuous and square-integrable. This excludes the uniform, triangular, biweight (quartic), and Epanechnikov kernels.
 NxKDE(.) returns #VALUE! if the DPI optimization is turned on and one of the following kernels is selected: uniform, triangular, biweight (quartic), or Epanechnikov.
 For performance reasons, we recommend calculating the optimal bandwidth (optimization on) in a separate step. After that, use the computed optimal bandwidth in all subsequent NxKDE(.) calls, but with the optimization off.
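The estimator defined in the remarks and the two-step workflow recommended above (compute the bandwidth once, then reuse it) can be sketched in Python as follows. This is a minimal illustration with a Gaussian kernel and no bounds or transforms. The sample values and all function names are hypothetical, and the Silverman rule shown, $0.9 \cdot \min(s, \mathrm{IQR}/1.34) \cdot n^{-1/5}$, is one common textbook variant; the source does not state which variant NxKDE implements, so results need not match the add-in.

```python
import math
import statistics

def gaussian_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(u):
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def silverman_bandwidth(sample):
    """Rule-of-thumb bandwidth: 0.9 * min(sd, IQR / 1.34) * n^(-1/5) (assumed variant)."""
    n = len(sample)
    sd = statistics.stdev(sample)
    q1, _, q3 = statistics.quantiles(sample, n=4)
    iqr = q3 - q1
    # Fall back to the standard deviation if the IQR collapses to zero.
    spread = min(sd, iqr / 1.34) if iqr > 0 else sd
    return 0.9 * spread * n ** (-0.2)

def kde_pdf(x, sample, h):
    """f_hat(x) = 1/(n*h) * sum of K((x - x_i) / h)."""
    n = len(sample)
    return sum(gaussian_kernel((x - xi) / h) for xi in sample) / (n * h)

def kde_cdf(x, sample, h):
    """Integral of f_hat up to x; for a Gaussian kernel this is the mean of Phi((x - x_i) / h)."""
    n = len(sample)
    return sum(std_normal_cdf((x - xi) / h) for xi in sample) / n

# Hypothetical sample data, for illustration only.
sample = [2.1, 2.9, 3.4, 3.8, 4.0, 4.3, 4.7, 5.2, 5.9, 6.8]

# Step 1: compute the bandwidth once (the "optimization on" step).
h = silverman_bandwidth(sample)

# Step 2: reuse the fixed bandwidth for every target value ("optimization off").
targets = [3.0, 4.0, 5.0]
pdf_values = [kde_pdf(t, sample, h) for t in targets]
cdf_values = [kde_cdf(t, sample, h) for t in targets]

if __name__ == "__main__":
    print("bandwidth:", h)
    for t, p, c in zip(targets, pdf_values, cdf_values):
        print(t, p, c)
```

Separating the bandwidth computation from the evaluation mirrors the performance recommendation: the bandwidth search runs once over the sample instead of once per target value.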
Status
The NxKDE(.) function is available starting with version 1.68 CAMEL.
Examples
Example 1:
Formula  Description (Result)
=NxKDE(\$B\$2:\$B\$29,0.5,,1)  NxKDE (0.165)
References
 Park, B.U.; Marron, J.S. (1990). "Comparison of data-driven bandwidth selectors". Journal of the American Statistical Association. 85 (409): 66–72.
 Silverman, B.W. (1986). Density Estimation for Statistics and Data Analysis. London: Chapman & Hall/CRC. p. 45. ISBN 9780412246203.
 Jones, M.C.; Marron, J.S.; Sheather, S.J. (1996). "A brief survey of bandwidth selection for density estimation". Journal of the American Statistical Association. 91 (433): 401–407.
 Sheather, S.J.; Jones, M.C. (1991). "A reliable data-based bandwidth selection method for kernel density estimation". Journal of the Royal Statistical Society, Series B. 53: 683–690.
 Zucchini, W. (2003). Applied Smoothing Techniques, Part 1: Kernel Density Estimation.