HISTBINS - Number of Histogram Bins

Returns the number of histogram bins using a given method.

 

Syntax

HISTBINS(X, Method)

X is the input data series (one/two dimensional array of cells (e.g. rows or columns)).

Method is a switch to select the calculation method (1=Sturges's formula, 2=Square-root, 3=Scott's Choice, 4=Freedman-Diaconis choice, 5=Optimal (default)).

Method Description
1 Sturges's Formula
2 Square-root Choice
3 Scott's Choice
4 Freedman-Diaconis choice
5 Optimal (Loss-Function min) (default)
 

Remarks

  1. The input data series may include missing values (e.g. #N/A, #VALUE!, #NUM!, empty cell), but they will not be included in the calculations.
  2. The number of bins, $h$, can be assigned directly or calculated from a suggested bin width. $h$
  3. $h$ is defined in terms of $h$ as follows:

    $$k=\left \lceil \frac{\mathrm{max}(X)-\mathrm{min}(x)}{h} \right \rceil$$

    Where:
    • $h$ is the input data series
  4. Sturges's formula for the number of bins, $h$, is:

    $$k = \lceil \log_2 n + 1 \rceil$$

    Where:
    • $n$ is the number of non-missing values in the input time series data


    Which:
    • implicitly bases the number of bins on the range
    • and can perform poorly for $n \lt 30$
  5. The square-root choice for the number of bins, $h$, is:

    $$k = \sqrt{n}$$

    Where:
    • $n$ is the number of non-missing values in the input time series data


    (This is the bin calculating method that Excel uses for its native histogram)
  6. Scott's choice for the bin width, $h$, is:

    $$h = \frac{3.5 \sigma}{n^{\frac{1}{3}}}$$

    Where:
    • $\sigma$ is the standard deviation of the input data series
    • $n$ is the number of non-missing values in the input time series data
  7. The Freedman–Diaconis choice for the bin width, $h$, is:

    $$h = 2 \dfrac{\operatorname{IQR}(X)}{n^{\frac{1}{3}}}$$

    Where:
    • IQR is the interquartile range of the input data series
    • $h$ is the input data series
    • $n$ is the number of non-missing values in the input time series data

Examples

Example 1:

 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
A B
Date Data
1/1/2008 #N/A
1/2/2008 -1.28
1/3/2008 0.24
1/4/2008 1.28
1/5/2008 1.20
1/6/2008 1.73
1/7/2008 -2.18
1/8/2008 -0.23
1/9/2008 1.10
1/10/2008 -1.09
1/11/2008 -0.69
1/12/2008 -1.69
1/13/2008 -1.85
1/14/2008 -0.98
1/15/2008 -0.77
1/16/2008 -0.30
1/17/2008 -1.28
1/18/2008 0.24
1/19/2008 1.28
1/20/2008 1.20
1/21/2008 1.73
1/22/2008 -2.18
1/23/2008 -0.23
1/24/2008 1.10
1/25/2008 -1.09
1/26/2008 -0.69
1/27/2008 -1.69
1/28/2008 -1.85
1/29/2008 -0.98


  Formula Description (Result)
  =HISTBINS($B$2:$B$30,0) Sturges's Formula (5)
  =HISTBINS($B$2:$B$30,1) Square-root Choice (5)
  =HISTBINS($B$2:$B$30,2) Scott's Choice (3)
  =HISTBINS($B$2:$B$30,3) Freedman-Diaconis Choice (3)

Files Examples

References

  • Balakrishnan, N., Exponential Distribution: Theory, Methods and Applications, CRC, P 18 1996.
Have more questions? Submit a request

0 Comments