statistics-0.16.1.2: A library of statistical types, data, and functions
Copyright (c) 2009 Bryan O'Sullivan
License BSD3
Maintainer bos@serpentine.com
Stability experimental
Portability portable
Safe Haskell None
Language Haskell2010

Statistics.Quantile

Description

Functions for approximating quantiles, i.e. points taken at regular intervals from the cumulative distribution function of a random variable.

The number of quantiles is described below by the variable q, so with q=4, a 4-quantile (also known as a quartile) has 4 intervals and contains 5 points. The parameter k describes the desired point, where 0 ≤ k ≤ q.

Synopsis

Quantile estimation functions

Below is a family of functions that use the same algorithm to estimate sample quantiles. The algorithm approximates the empirical CDF as a continuous piecewise-linear function that interpolates between the points \((X_k,p_k)\), where \(X_k\) is the k-th order statistic (the k-th smallest element) and \(p_k\) is the probability corresponding to it. ContParam determines how \(p_k\) is chosen. For a more detailed explanation, see [Hyndman1996].

This is the method used by most statistical software, such as R, Mathematica, SPSS, and S.
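As an illustration of the interpolation described above, here is a minimal sketch written against plain lists and base only. It is not the library's implementation: Param and quantileSketch are hypothetical names, and the position formula \(t = \alpha + p\,(n + 1 - \alpha - \beta)\) with \(p = k/q\) is the usual inverse of the plotting position \(p_k = (k - \alpha)/(n + 1 - \alpha - \beta)\), assumed here to be what ContParam encodes.

```haskell
import Data.List (sort)

-- | Hypothetical stand-in for ContParam: the pair (alpha, beta).
data Param = Param Double Double

-- | Sketch: estimate the k-th q-quantile by linear interpolation
-- between neighbouring order statistics.
quantileSketch :: Param -> Int -> Int -> [Double] -> Double
quantileSketch (Param a b) k q xs
  | null xs        = error "quantileSketch: empty sample"
  | any isNaN xs   = error "quantileSketch: sample contains NaN"
  | q < 1          = error "quantileSketch: q must be positive"
  | k < 0 || k > q = error "quantileSketch: need 0 <= k <= q"
  | otherwise      = (1 - g) * item (j - 1) + g * item j
  where
    sorted = sort xs
    n      = length xs
    p      = fromIntegral k / fromIntegral q
    -- Fractional 1-based position of the requested quantile.
    t      = a + p * (fromIntegral n + 1 - a - b)
    j      = floor t :: Int
    g      = t - fromIntegral j
    -- Zero-based lookup, clamped to the ends of the sample.
    item i = sorted !! min (n - 1) (max 0 i)
```

With Param (1/3) (1/3) and the sample [1,1,2,2,3], the sketch gives about 1.0 for k=1, q=4 and about 2.3333 for k=3, q=4. The difference of roughly 1.3333 agrees with the midspread example given in this module's documentation.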

data ContParam Source #

Parameters α and β to the continuousBy function. The exact meaning of the parameters is described in [Hyndman1996], in the section "Piecewise linear functions".

Instances

Instances details
Eq ContParam Source #
Instance details

Defined in Statistics.Quantile

Data ContParam Source #
Instance details

Defined in Statistics.Quantile

Methods

gfoldl :: ( forall d b. Data d => c (d -> b) -> d -> c b) -> ( forall g. g -> c g) -> ContParam -> c ContParam Source #

gunfold :: ( forall b r. Data b => c (b -> r) -> c r) -> ( forall r. r -> c r) -> Constr -> c ContParam Source #

toConstr :: ContParam -> Constr Source #

dataTypeOf :: ContParam -> DataType Source #

dataCast1 :: Typeable t => ( forall d. Data d => c (t d)) -> Maybe (c ContParam ) Source #

dataCast2 :: Typeable t => ( forall d e. ( Data d, Data e) => c (t d e)) -> Maybe (c ContParam ) Source #

gmapT :: ( forall b. Data b => b -> b) -> ContParam -> ContParam Source #

gmapQl :: (r -> r' -> r) -> r -> ( forall d. Data d => d -> r') -> ContParam -> r Source #

gmapQr :: forall r r'. (r' -> r -> r) -> r -> ( forall d. Data d => d -> r') -> ContParam -> r Source #

gmapQ :: ( forall d. Data d => d -> u) -> ContParam -> [u] Source #

gmapQi :: Int -> ( forall d. Data d => d -> u) -> ContParam -> u Source #

gmapM :: Monad m => ( forall d. Data d => d -> m d) -> ContParam -> m ContParam Source #

gmapMp :: MonadPlus m => ( forall d. Data d => d -> m d) -> ContParam -> m ContParam Source #

gmapMo :: MonadPlus m => ( forall d. Data d => d -> m d) -> ContParam -> m ContParam Source #

Ord ContParam Source #
Instance details

Defined in Statistics.Quantile

Show ContParam Source #
Instance details

Defined in Statistics.Quantile

Generic ContParam Source #
Instance details

Defined in Statistics.Quantile

ToJSON ContParam Source #
Instance details

Defined in Statistics.Quantile

FromJSON ContParam Source #
Instance details

Defined in Statistics.Quantile

Binary ContParam Source #
Instance details

Defined in Statistics.Quantile

Default ContParam Source #

We use s as the default value, which is the same as R's default.

Instance details

Defined in Statistics.Quantile

type Rep ContParam Source #
Instance details

Defined in Statistics.Quantile

class Default a where Source #

A class for types with a default value.

Minimal complete definition

Nothing

Methods

def :: a Source #

The default value for this type.

Instances

Instances details
Default Double
Instance details

Defined in Data.Default.Class

Default Float
Instance details

Defined in Data.Default.Class

Default Int
Instance details

Defined in Data.Default.Class

Default Int8
Instance details

Defined in Data.Default.Class

Default Int16
Instance details

Defined in Data.Default.Class

Default Int32
Instance details

Defined in Data.Default.Class

Default Int64
Instance details

Defined in Data.Default.Class

Default Integer
Instance details

Defined in Data.Default.Class

Default Ordering
Instance details

Defined in Data.Default.Class

Default Word
Instance details

Defined in Data.Default.Class

Default Word8
Instance details

Defined in Data.Default.Class

Default Word16
Instance details

Defined in Data.Default.Class

Default Word32
Instance details

Defined in Data.Default.Class

Default Word64
Instance details

Defined in Data.Default.Class

Default ()
Instance details

Defined in Data.Default.Class

Methods

def :: () Source #

Default All
Instance details

Defined in Data.Default.Class

Default Any
Instance details

Defined in Data.Default.Class

Default CShort
Instance details

Defined in Data.Default.Class

Default CUShort
Instance details

Defined in Data.Default.Class

Default CInt
Instance details

Defined in Data.Default.Class

Default CUInt
Instance details

Defined in Data.Default.Class

Default CLong
Instance details

Defined in Data.Default.Class

Default CULong
Instance details

Defined in Data.Default.Class

Default CLLong
Instance details

Defined in Data.Default.Class

Default CULLong
Instance details

Defined in Data.Default.Class

Default CFloat
Instance details

Defined in Data.Default.Class

Default CDouble
Instance details

Defined in Data.Default.Class

Default CPtrdiff
Instance details

Defined in Data.Default.Class

Default CSize
Instance details

Defined in Data.Default.Class

Default CSigAtomic
Instance details

Defined in Data.Default.Class

Default CClock
Instance details

Defined in Data.Default.Class

Default CTime
Instance details

Defined in Data.Default.Class

Default CUSeconds
Instance details

Defined in Data.Default.Class

Default CSUSeconds
Instance details

Defined in Data.Default.Class

Default CIntPtr
Instance details

Defined in Data.Default.Class

Default CUIntPtr
Instance details

Defined in Data.Default.Class

Default CIntMax
Instance details

Defined in Data.Default.Class

Default CUIntMax
Instance details

Defined in Data.Default.Class

Default RiddersParam
Instance details

Defined in Numeric.RootFinding

Default NewtonParam
Instance details

Defined in Numeric.RootFinding

Default ContParam Source #

We use s as the default value, which is the same as R's default.

Instance details

Defined in Statistics.Quantile

Default [a]
Instance details

Defined in Data.Default.Class

Methods

def :: [a] Source #

Default ( Maybe a)
Instance details

Defined in Data.Default.Class

Integral a => Default ( Ratio a)
Instance details

Defined in Data.Default.Class

Default a => Default ( IO a)
Instance details

Defined in Data.Default.Class

( Default a, RealFloat a) => Default ( Complex a)
Instance details

Defined in Data.Default.Class

Default ( First a)
Instance details

Defined in Data.Default.Class

Default ( Last a)
Instance details

Defined in Data.Default.Class

Default a => Default ( Dual a)
Instance details

Defined in Data.Default.Class

Default ( Endo a)
Instance details

Defined in Data.Default.Class

Num a => Default ( Sum a)
Instance details

Defined in Data.Default.Class

Num a => Default ( Product a)
Instance details

Defined in Data.Default.Class

Default r => Default (e -> r)
Instance details

Defined in Data.Default.Class

Methods

def :: e -> r Source #

( Default a, Default b) => Default (a, b)
Instance details

Defined in Data.Default.Class

Methods

def :: (a, b) Source #

( Default a, Default b, Default c) => Default (a, b, c)
Instance details

Defined in Data.Default.Class

Methods

def :: (a, b, c) Source #

( Default a, Default b, Default c, Default d) => Default (a, b, c, d)
Instance details

Defined in Data.Default.Class

Methods

def :: (a, b, c, d) Source #

( Default a, Default b, Default c, Default d, Default e) => Default (a, b, c, d, e)
Instance details

Defined in Data.Default.Class

Methods

def :: (a, b, c, d, e) Source #

( Default a, Default b, Default c, Default d, Default e, Default f) => Default (a, b, c, d, e, f)
Instance details

Defined in Data.Default.Class

Methods

def :: (a, b, c, d, e, f) Source #

( Default a, Default b, Default c, Default d, Default e, Default f, Default g) => Default (a, b, c, d, e, f, g)
Instance details

Defined in Data.Default.Class

Methods

def :: (a, b, c, d, e, f, g) Source #

quantile Source #

Arguments

:: Vector v Double
=> ContParam

Parameters α and β.

-> Int

k, the desired quantile.

-> Int

q, the number of quantiles.

-> v Double

x, the sample data.

-> Double

O(n·log n). Estimate the kth q-quantile of a sample x, using the continuous sample method with the given parameters.

The following properties must hold; otherwise an error will be thrown.

  • input sample must be nonempty
  • the input does not contain NaN
  • 0 ≤ k ≤ q

quantiles :: ( Vector v Double , Foldable f, Functor f) => ContParam -> f Int -> Int -> v Double -> f Double Source #

O(k·n·log n). Estimate a set of kth q-quantiles of a sample x, using the continuous sample method with the given parameters. This is faster than calling quantile repeatedly, since the sample needs to be sorted only once.

The following properties must hold; otherwise an error will be thrown.

  • input sample must be nonempty
  • the input does not contain NaN
  • for every k in the set of quantiles, 0 ≤ k ≤ q

quantilesVec :: ( Vector v Double , Vector v Int ) => ContParam -> v Int -> Int -> v Double -> v Double Source #

O(k·n·log n). Same as quantiles, but uses a Vector container instead of a Foldable one.

Parameters for the continuous sample method

cadpw :: ContParam Source #

California Department of Public Works definition, α=0, β=1. Gives a linear interpolation of the empirical CDF. This corresponds to method 4 in R and Mathematica.

hazen :: ContParam Source #

Hazen's definition, α=0.5, β=0.5. This is claimed to be popular among hydrologists. This corresponds to method 5 in R and Mathematica.

spss :: ContParam Source #

Definition used by the SPSS statistics application, with α=0, β=0 (also known as Weibull's definition). This corresponds to method 6 in R and Mathematica.

s :: ContParam Source #

Definition used by the S statistics application, with α=1, β=1. The interpolation points divide the sample range into n-1 intervals. This corresponds to method 7 in R and Mathematica and is the default in R.

medianUnbiased :: ContParam Source #

Median unbiased definition, α=1/3, β=1/3. The resulting quantile estimates are approximately median unbiased regardless of the distribution of x. This corresponds to method 8 in R and Mathematica.

normalUnbiased :: ContParam Source #

Normal unbiased definition, α=3/8, β=3/8. An approximately unbiased estimate if the empirical distribution approximates the normal distribution. This corresponds to method 9 in R and Mathematica.
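To make the effect of α and β concrete, the sketch below tabulates the probabilities \(p_k\) that each definition assigns to the order statistics of an n-point sample. It assumes the standard plotting-position form \(p_k = (k - \alpha)/(n + 1 - \alpha - \beta)\) for 1 ≤ k ≤ n; the names plottingPosition, definitions and positionsFor are hypothetical and are not part of this module.

```haskell
-- | Probability assigned to the k-th order statistic (1-based) of n points.
plottingPosition :: Double -> Double -> Int -> Int -> Double
plottingPosition a b n k =
  (fromIntegral k - a) / (fromIntegral n + 1 - a - b)

-- | (name, alpha, beta) for each parameter set documented above.
definitions :: [(String, Double, Double)]
definitions =
  [ ("cadpw", 0, 1)
  , ("hazen", 0.5, 0.5)
  , ("spss", 0, 0)
  , ("s", 1, 1)
  , ("medianUnbiased", 1/3, 1/3)
  , ("normalUnbiased", 3/8, 3/8)
  ]

-- | Plotting positions of all n order statistics under every definition.
-- Assumes n >= 2, since alpha = beta = 1 divides by n - 1.
positionsFor :: Int -> [(String, [Double])]
positionsFor n =
  [ (name, [plottingPosition a b n k | k <- [1 .. n]])
  | (name, a, b) <- definitions
  ]
```

For n=5, the "s" entry is [0, 0.25, 0.5, 0.75, 1], which shows the interpolation points dividing the sample range into n-1 equal intervals, while "cadpw" gives k/n and "spss" gives k/(n+1).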

Other algorithms

weightedAvg Source #

Arguments

:: Vector v Double
=> Int

k, the desired quantile.

-> Int

q, the number of quantiles.

-> v Double

x, the sample data.

-> Double

O(n·log n). Estimate the kth q-quantile of a sample, using the weighted average method. Up to rounding errors, the result is the same as that of quantile s.

The following properties must hold; otherwise an error will be thrown.

  • the length of the input is greater than 0
  • the input does not contain NaN
  • k ≥ 0 and k ≤ q
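The weighted average method can be sketched as follows, using plain lists and base only. This is an illustrative reconstruction, not the library's code, and weightedAvgSketch is a hypothetical name. It assumes the zero-based fractional index \((n-1)\,k/q\), which is what the α=1, β=1 position \(1 + (k/q)(n-1)\) reduces to once shifted to zero-based indexing.

```haskell
import Data.List (sort)

-- | Sketch: k-th q-quantile as a weighted average of two neighbouring
-- order statistics.
weightedAvgSketch :: Int -> Int -> [Double] -> Double
weightedAvgSketch k q xs
  | null xs          = error "weightedAvgSketch: empty sample"
  | any isNaN xs     = error "weightedAvgSketch: sample contains NaN"
  | q < 1            = error "weightedAvgSketch: q must be positive"
  | k < 0 || k > q   = error "weightedAvgSketch: need 0 <= k <= q"
  | k == q || n == 1 = last sorted          -- top quantile, or a single point
  | otherwise        = xj + g * (xj1 - xj)
  where
    sorted = sort xs
    n      = length xs
    -- Zero-based fractional index into the sorted sample.
    idx    = fromIntegral (n - 1) * fromIntegral k / fromIntegral q :: Double
    j      = floor idx :: Int
    g      = idx - fromIntegral j
    xj     = sorted !! j
    xj1    = sorted !! (j + 1)
```

For the sample [1,2,3,4] with k=1 and q=2, the index is 1.5, so the result is halfway between 2 and 3, that is 2.5.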

Median & other specializations

median Source #

Arguments

:: Vector v Double
=> ContParam

Parameters α and β.

-> v Double

x, the sample data.

-> Double

O(n·log n). Estimate the median of a sample.

mad Source #

Arguments

:: Vector v Double
=> ContParam

Parameters α and β.

-> v Double

x, the sample data.

-> Double

O(n·log n). Estimate the median absolute deviation (MAD) of a sample x using continuousBy. It is a robust estimate of the variability in a sample and is defined as:

\[ MAD = \operatorname{median}(| X_i - \operatorname{median}(X) |) \]
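The definition above translates directly into code. The sketch below uses base only and the conventional sample median (middle element, or the mean of the two middle elements) instead of a ContParam, so it illustrates the formula and does not reproduce the library's mad for every parameter choice. The names medianSketch and madSketch are hypothetical.

```haskell
import Data.List (sort)

-- | Conventional sample median of a non-empty list.
medianSketch :: [Double] -> Double
medianSketch xs
  | null xs   = error "medianSketch: empty sample"
  | odd n     = sorted !! mid
  | otherwise = (sorted !! (mid - 1) + sorted !! mid) / 2
  where
    sorted = sort xs
    n      = length xs
    mid    = n `div` 2

-- | Median of the absolute deviations from the sample median.
madSketch :: [Double] -> Double
madSketch xs = medianSketch [abs (x - m) | x <- xs]
  where
    m = medianSketch xs
```

For the sample [1,2,3,4,100] the median is 3, the absolute deviations are [2,1,0,1,97], and their median is 1, so the single outlier has almost no effect on the result.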

midspread Source #

Arguments

:: Vector v Double
=> ContParam

Parameters α and β.

-> Int

q, the number of quantiles.

-> v Double

x, the sample data.

-> Double

O(n·log n). Estimate the range between q-quantiles 1 and q-1 of a sample x, using the continuous sample method with the given parameters.

For instance, the interquartile range (IQR) can be estimated as follows:

midspread medianUnbiased 4 (U.fromList [1,1,2,2,3])
==> 1.333333

Deprecated

continuousBy Source #

Arguments

:: Vector v Double
=> ContParam

Parameters α and β.

-> Int

k, the desired quantile.

-> Int

q, the number of quantiles.

-> v Double

x, the sample data.

-> Double

Deprecated: Use quantile instead

References