Title: | Fitting Single and Mixture of Generalised Lambda Distributions |
---|---|
Description: | The fitting algorithms considered in this package have two major objectives. One is to provide a smoothing device to fit distributions to data using the weighted and unweighted discretised approach based on the bin width of the histogram. The other is to provide a definitive fit to the data set using the maximum likelihood and quantile matching estimation. Other methods such as moment matching, the starship method and L-moment matching are also provided. Diagnostics on goodness of fit can be done via qqplots, KS-resample tests and comparing mean, variance, skewness and kurtosis of the data with the fitted distribution. References include the following: Karvanen and Nuutinen (2008) "Characterizing the generalized lambda distribution by L-moments" <doi:10.1016/j.csda.2007.06.021>, King and MacGillivray (1999) "A starship method for fitting the generalised lambda distributions" <doi:10.1111/1467-842X.00089>, Su (2005) "A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data" <doi:10.22237/jmasm/1130803560>, Su (2007) "Numerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions" <doi:10.1016/j.csda.2006.06.008>, Su (2007) "Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R" <doi:10.18637/jss.v021.i09>, Su (2009) "Confidence Intervals for Quantiles Using Generalized Lambda Distributions" <doi:10.1016/j.csda.2009.02.014>, Su (2010) "Chapter 14: Fitting GLDs and Mixture of GLDs to Data using Quantile Matching Method" <doi:10.1201/b10159>, Su (2010) "Chapter 15: Fitting GLD to data using GLDEX 1.0.4 in R" <doi:10.1201/b10159>, Su (2015) "Flexible Parametric Quantile Regression Model" <doi:10.1007/s11222-014-9457-1>, Su (2021) "Flexible parametric accelerated failure time model" <doi:10.1080/10543406.2021.1934854>. |
Authors: | Steve Su [aut, cre, cph] , Martin Maechler [aut], Juha Karvanen [aut], Robert King [aut], Benjamin Dean [ctb], R Core Team [aut] |
Maintainer: | Steve Su <[email protected]> |
License: | GPL (>= 3) |
Version: | 2.0.0.9.3 |
Built: | 2024-10-11 04:44:11 UTC |
Source: | https://github.com/cran/GLDEX |
The fitting algorithms considered in this package have two major objectives. One is to provide a smoothing device to fit distributions to data using the weighted and unweighted discretised approach based on the bin width of the histogram. The other is to provide a definitive fit to the data set using maximum likelihood estimation.
Copyright Information: To ensure the stability of this package, functions from other open-source packages are ported directly, so that changes in those packages will not cause this package to malfunction.
All functions obtained from other sources have been acknowledged by the author in the authorship or the description sections of the help files and they are freely available online for all to use. Please contact the author for any copyright issues.
Specifically the following functions have been modified from R:
hist.su, ks.gof, pretty.su
The following functions are taken from other open source packages in R:
runif.pseudo, rnorm.pseudo, runif.halton, rnorm.halton, runif.sobol, rnorm.sobol by Diethelm Wuertz distributed under GPL.
digitsBase, QUnif and sHalton written by Martin Maechler distributed under GPL.
dgl, pgl, rgl, qgl, starship.adaptivegrid, starship.obj and starship written by Robert King and some functions modified by Steve Su distributed under GPL.
Lmoments and t1lmoments written by Juha Karvanen distributed under GPL.
This package allows distributions to be fitted directly to a data set using fun.RMFMKL.ml, fun.RMFMKL.ml.m, fun.RMFMKL.hs, fun.RMFMKL.hs.nw, fun.RPRS.ml, fun.RPRS.ml.m, fun.RPRS.hs, fun.RPRS.hs.nw, fun.RMFMKL.qs, fun.RPRS.qs, fun.RMFMKL.mm, fun.RPRS.mm, fun.RMFMKL.lm and fun.RPRS.lm and, in the case of bimodal data sets, the fun.auto.bimodal.qs, fun.auto.bimodal.ml and fun.auto.bimodal.pml functions.
The resulting fits can be gauged graphically with fun.plot.fit or fun.plot.fit.bm (for bimodal data), or examined numerically using the Kolmogorov-Smirnov resample tests fun.diag.ks.g and fun.diag.ks.g.bimodal. For unimodal data fits, it is also possible to compare the mean, variance, skewness and kurtosis of the fitted distribution with those of the data set using the fun.comp.moments.ml and fun.comp.moments.ml.2 functions. Similarly, for bimodal data fits, this is done via fun.theo.bi.mv.gld and fun.moments.r.
Additionally, L-moments for a single generalised lambda distribution can be obtained using fun.lm.theo.gld. For a graphical display of goodness of fit, quantile plots can be produced with qqplot.gld and qqplot.gld.bi for univariate and bimodal generalised lambda distribution fits respectively.
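As a minimal, illustrative sketch (a hedged example, assuming default settings and simulated data rather than a recipe from the package authors), a typical unimodal workflow might look as follows:
# Minimal unimodal workflow sketch (assumed data and default settings)
library(GLDEX)
set.seed(100)
x <- rweibull(500, 3, 2)                       # example data
fit <- fun.data.fit.ml(x)                      # RPRS, RMFMKL and starship ML fits
fun.plot.fit(fit, x, nclass = 30, param = c("rs", "fmkl", "fmkl"), xlab = "x")
fun.comp.moments.ml(fit, x)                    # compare the first four moments with the data
fun.diag.ks.g(fit[, 1], x, param = "rs")       # resample KS diagnostic for the RS fit
qqplot.gld(x, fit[, 1], param = "rs", name = "RS ML fit")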
Steve Su <[email protected]>
Asquith, W. (2007), L-moments and TL-moments of the generalized lambda distribution, Computational Statistics and Data Analysis 51(9): 4484-4496.
Freimer, M., Mudholkar, G. S., Kollia, G. & Lin, C. T. (1988), A study of the generalized Tukey lambda family, Communications in Statistics - Theory and Methods *17*, 3547-3567.
Gilchrist, Warren G. (2000), Statistical Modelling with Quantile Functions, Chapman & Hall
Karian, Z.A., Dudewicz, E.J., and McDonald, P. (1996), The extended generalized lambda distribution system for fitting distributions to data: history, completion of theory, tables, applications, the “Final Word” on Moment fits, Communications in Statistics - Simulation and Computation *25*, 611-642.
Karian, Zaven A. and Dudewicz, Edward J. (2000), Fitting statistical distributions: the Generalized Lambda Distribution and Generalized Bootstrap methods, Chapman & Hall
Karvanen, J. and A. Nuutinen (2008), Characterizing the generalized lambda distribution by L-moments, Computational Statistics and Data Analysis 52(4): 1971-1983. doi:10.1016/j.csda.2007.06.021
King, R.A.R. & MacGillivray, H. L. (1999), A starship method for fitting the generalised lambda distributions, Australian and New Zealand Journal of Statistics, 41, 353-374 doi:10.1111/1467-842X.00089
Ramberg, J. S. & Schmeiser, B. W. (1974), An approximate method for generating asymmetric random variables, Communications of the ACM *17*, 78-82.
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424. doi:10.22237/jmasm/1130803560
Su, S. (2007). Numerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions. Computational Statistics and Data Analysis: *51*, 8, 3983-3998. doi:10.1016/j.csda.2006.06.008
Su, S. (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9. doi:10.18637/jss.v021.i09
Su, S. (2009). Confidence Intervals for Quantiles Using Generalized Lambda Distributions. Computational Statistics and Data Analysis *53*, 9, 3324-3333. doi:10.1016/j.csda.2009.02.014
Su, S. (2010). Chapter 14: Fitting GLDs and Mixture of GLDs to Data using Quantile Matching Method. Handbook of Distribution Fitting Methods with R. (Karian and Dudewicz) 557-583. doi:10.1201/b10159-3
Su, S. (2010). Chapter 15: Fitting GLD to data using GLDEX 1.0.4 in R. Handbook of Distribution Fitting Methods with R. (Karian and Dudewicz) 585-608. doi:10.1201/b10159-3
Su, S. (2015). Flexible Parametric Quantile Regression Model. Statistics and Computing, 25(3), 635-650. doi:10.1007/s11222-014-9457-1
Su, S. (2021). Flexible Parametric Accelerated Failure Time Model. Journal of Biopharmaceutical Statistics, 31(5), 650-667. doi:10.1080/10543406.2021.1934854
GLDreg package in R for GLD regression models.
###Univariate distributions example:
set.seed(1000)
junk<-rweibull(300,3,2)
##A faster ML estimation
junk.fit1<-fun.RMFMKL.ml.m(junk)
junk.fit2<-fun.RPRS.ml.m(junk)
qqplot.gld(junk.fit1,data=junk,param="fmkl")
qqplot.gld(junk.fit2,data=junk,param="rs")
##Using discretised approach, with 50 classes
#Using discretised method with weights
obj.fit1.hs<-fun.data.fit.hs(junk)
#Plot the resulting fit. The fun.plot.fit function also works for singular fits
#such as those by fun.plot.fit(obj.fit1.ml,junk,nclass=50,
#param=c("rs","fmkl","fmkl"),xlab="x")
fun.plot.fit(obj.fit1.hs,junk,nclass=50,param=c("rs","fmkl"),xlab="x")
#Compare the mean, variance, skewness and kurtosis of the fitted distribution
#with actual data
fun.theo.mv.gld(obj.fit1.hs[1,1],obj.fit1.hs[2,1],obj.fit1.hs[3,1],
obj.fit1.hs[4,1],param="rs")
fun.theo.mv.gld(obj.fit1.hs[1,2],obj.fit1.hs[2,2],obj.fit1.hs[3,2],
obj.fit1.hs[4,2],param="fmkl")
fun.moments.r(junk)
#Conduct resample KS tests
fun.diag.ks.g(obj.fit1.hs[,1],junk,param="rs")
fun.diag.ks.g(obj.fit1.hs[,2],junk,param="fmkl")
##Try another fit, say 15 classes
obj.fit2.hs<-fun.data.fit.hs(junk,rs.default="N",fmkl.default="N",no.c.rs = 15,
no.c.fmkl = 15)
fun.plot.fit(obj.fit2.hs,junk,nclass=50,param=c("rs","fmkl"),xlab="x")
fun.theo.mv.gld(obj.fit2.hs[1,1],obj.fit2.hs[2,1],obj.fit2.hs[3,1],
obj.fit2.hs[4,1],param="rs")
fun.theo.mv.gld(obj.fit2.hs[1,2],obj.fit2.hs[2,2],obj.fit2.hs[3,2],
obj.fit2.hs[4,2],param="fmkl")
fun.moments.r(junk)
fun.diag.ks.g(obj.fit2.hs[,1],junk,param="rs")
fun.diag.ks.g(obj.fit2.hs[,2],junk,param="fmkl")
##Uses the maximum likelihood estimation method
#Fit the function using fun.data.fit.ml
obj.fit1.ml<-fun.data.fit.ml(junk)
#Plot the resulting fit
fun.plot.fit(obj.fit1.ml,junk,nclass=50,param=c("rs","fmkl","fmkl"),xlab="x",
name=".ML")
#Compare the mean, variance, skewness and kurtosis of the fitted distribution
#with actual data
fun.comp.moments.ml(obj.fit1.ml,junk)
#Do a quantile plot
qqplot.gld(junk,obj.fit1.ml[,1],param="rs",name="RS ML fit")
#Run a KS resample test on the resulting fit
fun.diag2(obj.fit1.ml,junk,1000)
#It is possible to use say fun.data.fit.ml(junk,rs.leap=409,fmkl.leap=409,
#FUN="QUnif") to find solutions under a different set of low discrepancy number
#generators.
###Bimodal distributions example:
#Fitting mixture of generalised lambda distributions on the data set using both
#the maximum likelihood and partition maximum likelihood and plot the resulting
#fits
opar <- par()
par(mfrow=c(2,1))
junk<-fun.auto.bimodal.ml(faithful[,1],per.of.mix=0.01,clustering.m=clara,
init1.sel="rprs",init2.sel="rmfmkl",init1=c(-1.5,1.5),init2=c(-0.25,1.5),
leap1=3,leap2=3)
fun.plot.fit.bm(nclass=50,fit.obj=junk,data=faithful[,1],
name="Maximum likelihood using",xlab="faithful1",param.vec=c("rs","fmkl"))
#Do a quantile plot
qqplot.gld.bi(faithful[,1],junk$par,param1="rs",param2="fmkl",
name="RS FMKL ML fit",range=c(0.001,0.999))
par(opar)
junk<-fun.auto.bimodal.pml(faithful[,1],clustering.m=clara,init1.sel="rprs",
init2.sel="rmfmkl",init1=c(-1.5,1.5),init2=c(-0.25,1.5),leap1=3,leap2=3)
fun.plot.fit.bm(nclass=50,fit.obj=junk,data=faithful[,1],
name="Partition maximum likelihood using",xlab="faithful1",
param.vec=c("rs","fmkl"))
#Fit the faithful[,1] data from the dataset library using the sobol sequence
#generator for the first distribution fit and the halton sequence for the second
#distribution fit.
fit1<-fun.auto.bimodal.ml(faithful[,1],init1.sel="rmfmkl",init2.sel="rmfmkl",
init1=c(-0.25,1.5),init2=c(-0.25,1.5),leap1=3,leap2=3,fun1="runif.sobol",
fun2="runif.halton")
#Run diagnostic KS tests
fun.diag.ks.g.bimodal(fit1$par[1:4],fit1$par[5:8],prop1=fit1$par[9],
data=faithful[,1],param1="fmkl",param2="fmkl")
#Find the theoretical moments of the fit
fun.theo.bi.mv.gld(fit1$par[1],fit1$par[2],fit1$par[3],fit1$par[4],"fmkl",
fit1$par[5],fit1$par[6],fit1$par[7],fit1$par[8],"fmkl",fit1$par[9])
#Compare this with the empirical moments from the data set.
fun.moments.r(faithful[,1])
#Do a quantile plot
qqplot.gld.bi(faithful[,1],fit1$par,param1="fmkl",param2="fmkl",
name="FMKL FMKL ML fit")
#Quantile matching method
#Fitting normal(3,2) distribution using the default setting
junk<-rnorm(50,3,2)
fun.data.fit.qs(junk)
#Fitting faithful data from the dataset library, with the clara clustering
#regime. The first distribution is RS and the second distribution is fmkl.
#The percentage of data mix is 1%.
fun.auto.bimodal.qs(faithful[,1],per.of.mix=0.01,clustering.m=clara,
init1.sel="rprs",init2.sel="rmfmkl",init1=c(-1.5,1.5),init2=c(-0.25,1.5),
leap1=3,leap2=3)
#L moment matching
#Fitting normal(3,2) distribution using the default setting
junk<-rnorm(50,3,2)
fun.data.fit.lm(junk)
# Moment matching method
#Fitting normal(3,2) distribution using the default setting
junk<-rnorm(50,3,2)
fun.data.fit.mm(junk)
# Example on fitting mixture of normal distributions
data1<-c(rnorm(1500,-1,2/3),rnorm(1500,1,2/3))
junk<-fun.auto.bimodal.ml(data1,per.of.mix=0.01,clustering.m=data1>0,
init1.sel="rprs",init2.sel="rmfmkl",init1=c(-1.5,1.5),init2=c(-0.25,1.5),
leap1=3,leap2=3)
fun.plot.fit.bm(nclass=50,fit.obj=junk,data=data1,
name="Maximum likelihood using",xlab="data1",param.vec=c("rs","fmkl"))
qqplot.gld.bi(data1,junk$par,param1="rs",param2="fmkl",
name="RS FMKL ML fit",range=c(0.001,0.999))
# Generate random observations from FMKL generalised lambda distributions with
# parameters (1,2,3,4) and (4,3,2,1) with 50% of data from each distribution.
fun.simu.bimodal(c(1,2,3,4),c(4,3,2,1),prop1=0.5,param1="fmkl",param2="fmkl")
Integer number representations in other Bases.
Formally, for every element N = x[i], compute the (vector of) "digits" A of the base b representation of the number N, that is N = A[1]*b^(M-1) + A[2]*b^(M-2) + ... + A[M], where M = ndigits and A[1] is the most significant digit. Revert such a representation to integers.
digitsBase(x, base = 2, ndigits = 1 + floor(log(max(x), base)))
x |
For digitsBase(): a non-negative integer (vector) whose digits in the given base are wanted. |
base |
integer, at least 2 specifying the base for representation. |
ndigits |
number of bits/digits to use. |
For digitsBase(), an object, say m, of class "basedInt", which is basically an (ndigits x n) matrix where m[,i] corresponds to x[i], n <- length(x), and attr(m, "base") is the input base.
as.intBase() and the as.integer method for basedInt objects return an integer vector.
digits and digits.v are now deprecated and will be removed from the sfsmisc package.
Martin Maechler, Dec 4, 1991 (for S-plus; then called digits.v).
digitsBase(0:12, 8) #-- octal representation
## This may be handy for just one number (and default decimal):
digits <- function(n, base = 10) as.vector(digitsBase(n, base = base))
digits(128, base = 8) # 2 0 0
## one way of pretty printing (base <= 10!)
b2ch <- function(db)
    noquote(gsub("^0+(.{1,})$", " \\1", apply(db, 2, paste, collapse = "")))
b2ch(digitsBase(0:33, 2))  #-> 0 1 10 11 100 101 ... 100001
b2ch(digitsBase(0:33, 4))  #-> 0 1 2 3 10 11 12 13 20 ... 200 201
## Hexadecimal:
i <- c(1:20, 100:106)
M <- digitsBase(i, 16)
hexdig <- c(0:9, LETTERS[1:6])
cM <- hexdig[1 + M]; dim(cM) <- dim(M)
b2ch(cM) #-> 1 2 3 4 5 6 7 8 9 A B C D E F 10 11 ... 6A
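As a small additional check (a sketch only, not part of the original examples), the digit/integer relation described above can be verified directly for a single number:
## Sketch: verify N = A[1]*b^(M-1) + ... + A[M] for one value
m <- digitsBase(1991, base = 8)   # octal digits of 1991, most significant digit first
A <- as.vector(m)
M <- length(A)
sum(A * 8^(M - seq_len(M)))       # recovers 1991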
This function fits a mixture of two generalised lambda distributions to a data set; it is restricted to two generalised lambda distributions. The method of fitting is maximum likelihood via the EM algorithm. It is a two-step optimisation procedure: each unimodal part of the bimodal distribution is first modelled using the maximum likelihood method or the starship method (FMKL GLD only), and these initial values are then used to maximise the likelihood of the entire bimodal distribution via the EM algorithm. It fits a mixture of the form p*f1 + (1-p)*f2, where f1 and f2 are the probability density functions of the two generalised lambda distributions.
fun.auto.bimodal.ml(data, per.of.mix = 0.01, clustering.m = clara, init1.sel = "rprs", init2.sel = "rprs", init1=c(-1.5, 1.5), init2=c(-1.5, 1.5), leap1=3, leap2=3,fun1="runif.sobol",fun2="runif.sobol",no=10000, max.it=5000, optim.further="Y")
data |
A numerical vector representing the dataset. |
per.of.mix |
Level of mix between two parts of the distribution, usually 1-2% of cross mix is sufficient. |
clustering.m |
Clustering method used in classifying the dataset into two parts. Valid arguments include clara, fanny and pam from the cluster library. Default is clara. Or a logical vector specifying how data should be split. |
init1.sel |
This can be "rprs", "rmfmkl" or "star", the method used to obtain initial values for the first generalised lambda distribution. |
init2.sel |
This can be "rprs", "rmfmkl" or "star", the method used to obtain initial values for the second generalised lambda distribution. |
init1 |
Initial values (lambda3 and lambda4) for the first generalised lambda distribution. |
init2 |
Initial values (lambda3 and lambda4) for the second generalised lambda distribution. |
leap1 |
See the scrambling argument in runif.sobol; applies to the first distribution fit. |
fun1 |
A character string naming the quasi-random number generator used to find initial values for the first distribution fit, e.g. "runif.sobol" (default), "runif.halton" or "QUnif". |
leap2 |
See the scrambling argument in runif.sobol; applies to the second distribution fit. |
fun2 |
A character string naming the quasi-random number generator used to find initial values for the second distribution fit, e.g. "runif.sobol" (default), "runif.halton" or "QUnif". |
no |
Number of initial random values to find the best initial values for optimisation. |
max.it |
Maximum number of iterations for numerical optimisation. |
optim.further |
Whether to optimise the fit further using the full maximum likelihood method; the recommended setting is "Y". |
The initial values that work well for RPRS are c(-1.5,1.5) and for RMFMKL are c(-0.25,1.5). For scrambling, if 1, 2 or 3 the sequence is scrambled, otherwise it is not. If 1, Owen type scrambling is applied; if 2, Faure-Tezuka type scrambling is applied; and if 3, both Owen and Faure-Tezuka type scrambling is applied. The star method uses the same initial values as rmfmkl since it uses the FMKL generalised lambda distribution. The Nelder-Mead simplex algorithm is used in the numerical optimisation. rprs stands for revised percentile method for the RS generalised lambda distribution and rmfmkl stands for revised method of moment for the FMKL generalised lambda distribution. These acronyms represent the initial optimisation algorithm used to obtain a reasonable set of initial values for the subsequent optimisation procedure.
This function is an improvement on the corresponding function in Su (2007), Journal of Statistical Software.
par |
The best set of parameters found; the first four values correspond to the first distribution fit, the next four to the second distribution fit, and the last value to p, the mixing proportion of the first distribution. |
value |
The value of -ML for the parameters obtained. |
counts |
A two-element integer vector giving the number of calls to fn and gr respectively. This excludes those calls needed to compute the Hessian, if requested, and any calls to fn to compute a finite-difference approximation to the gradient. |
convergence |
An integer code indicating convergence status, as returned by the optimiser. |
message |
A character string giving any additional information returned by the optimizer, or NULL. |
If the number of observations is small, rprs can sometimes fail as the percentiles may not exist for the data. Also, if the initial values do not span a valid generalised lambda distribution, try another set of initial values.
Steve Su
Bratley P. and Fox B.L. (1988) Algorithm 659: Implementing Sobol's quasi random sequence generator, ACM Transactions on Mathematical Software 14, 88-100.
Joe S. and Kuo F.Y. (1998) Remark on Algorithm 659: Implementing Sobol's quasi random Sequence Generator.
Nelder, J. A. and Mead, R. (1965) A simplex algorithm for function minimization. Computer Journal *7*, 308-313.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
fun.auto.bimodal.pml, fun.plot.fit.bm, fun.diag.ks.g.bimodal
# Fitting faithful data from the dataset library, with the clara clustering
# regime. The first distribution is RS and the second distribution is fmkl.
# The percentage of data mix is 1%.
fun.auto.bimodal.ml(faithful[,1],per.of.mix=0.01,clustering.m=clara,
init1.sel="rprs",init2.sel="rmfmkl",init1=c(-1.5,1.5),init2=c(-0.25,1.5),
leap1=3,leap2=3)
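The fitted object can also be used directly with dgl to evaluate the mixture density p*f1 + (1-p)*f2. The sketch below is illustrative only and assumes the fit is stored in an object called fit, with an RS first component and an FMKL second component as in the example above:
# Sketch: overlay the fitted mixture density on a histogram of the data
fit <- fun.auto.bimodal.ml(faithful[,1], per.of.mix = 0.01, clustering.m = clara,
  init1.sel = "rprs", init2.sel = "rmfmkl", init1 = c(-1.5, 1.5),
  init2 = c(-0.25, 1.5), leap1 = 3, leap2 = 3)
p <- fit$par[9]
xg <- seq(min(faithful[,1]), max(faithful[,1]), length = 200)
dens <- p * dgl(xg, fit$par[1], fit$par[2], fit$par[3], fit$par[4], param = "rs") +
  (1 - p) * dgl(xg, fit$par[5], fit$par[6], fit$par[7], fit$par[8], param = "fmkl")
hist(faithful[,1], freq = FALSE, nclass = 20, xlab = "faithful1", main = "Fitted mixture")
lines(xg, dens)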
This function fits a mixture of two generalised lambda distributions to a data set; it is restricted to two generalised lambda distributions. The method of fitting is partition maximum likelihood. It is a two-step optimisation procedure: each unimodal part of the bimodal distribution is first modelled using the maximum likelihood method or the starship method (FMKL GLD only). These initial values are then used to "maximise" the complete log likelihood of the entire bimodal distribution. It fits a mixture of the form p*f1 + (1-p)*f2, where f1 and f2 are the probability density functions of the two generalised lambda distributions.
fun.auto.bimodal.pml(data, clustering.m = clara, init1.sel = "rprs", init2.sel = "rprs", init1=c(-1.5, 1.5), init2=c(-1.5, 1.5), leap1=3, leap2=3, fun1="runif.sobol", fun2="runif.sobol",no=10000,max.it=5000, optim.further="Y")
data |
A numerical vector representing the dataset. |
clustering.m |
Clustering method used in classifying the dataset into two parts. Valid arguments include clara, fanny and pam from the cluster library. Default is clara. Or a logical vector specifying how data should be split. |
init1.sel |
This can be "rprs", "rmfmkl" or "star", the method used to obtain initial values for the first generalised lambda distribution. |
init2.sel |
This can be "rprs", "rmfmkl" or "star", the method used to obtain initial values for the second generalised lambda distribution. |
init1 |
Initial values (lambda3 and lambda4) for the first generalised lambda distribution. |
init2 |
Initial values (lambda3 and lambda4) for the second generalised lambda distribution. |
leap1 |
See the scrambling argument in runif.sobol; applies to the first distribution fit. |
fun1 |
A character string naming the quasi-random number generator used to find initial values for the first distribution fit, e.g. "runif.sobol" (default), "runif.halton" or "QUnif". |
leap2 |
See the scrambling argument in runif.sobol; applies to the second distribution fit. |
fun2 |
A character string naming the quasi-random number generator used to find initial values for the second distribution fit, e.g. "runif.sobol" (default), "runif.halton" or "QUnif". |
no |
Number of initial random values to find the best initial values for optimisation. |
max.it |
Maximum number of iterations for numerical optimisation. |
optim.further |
Whether to optimise the fit further using the full maximum likelihood method; the recommended setting is "Y". |
The initial values that work well for RPRS are c(-1.5,1.5) and for RMFMKL are c(-0.25,1.5). For scrambling, if 1, 2 or 3 the sequence is scrambled, otherwise it is not. If 1, Owen type scrambling is applied; if 2, Faure-Tezuka type scrambling is applied; and if 3, both Owen and Faure-Tezuka type scrambling is applied. The star method uses the same initial values as rmfmkl since it uses the FMKL generalised lambda distribution. The Nelder-Mead simplex algorithm is used in the numerical optimisation. rprs stands for revised percentile method for the RS generalised lambda distribution and rmfmkl stands for revised method of moment for the FMKL generalised lambda distribution. These acronyms represent the initial optimisation algorithm used to obtain a reasonable set of initial values for the subsequent optimisation procedure.
This function is an improvement on the corresponding function in Su (2007), Journal of Statistical Software.
par |
The best set of parameters found; the first four values correspond to the first distribution fit, the next four to the second distribution fit, and the last value to p, the mixing proportion of the first distribution. |
value |
The value of -PML for the parameters obtained. |
counts |
A two-element integer vector giving the number of calls to fn and gr respectively. This excludes those calls needed to compute the Hessian, if requested, and any calls to fn to compute a finite-difference approximation to the gradient. |
convergence |
An integer code indicating convergence status, as returned by the optimiser. |
message |
A character string giving any additional information returned by the optimizer, or NULL. |
If the number of observations is small, rprs can sometimes fail as the percentiles may not exist for the data. Also, if the initial values do not result in a valid generalised lambda distribution, try another set of initial values.
Bratley P. and Fox B.L. (1988) Algorithm 659: Implementing Sobol's quasi random sequence generator, ACM Transactions on Mathematical Software 14, 88-100.
Joe S. and Kuo F.Y. (1998) Remark on Algorithm 659: Implementing Sobol's quasi random Sequence Generator.
Nelder, J. A. and Mead, R. (1965) A simplex algorithm for function minimization. Computer Journal *7*, 308-313.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
fun.auto.bimodal.ml, fun.plot.fit.bm, fun.diag.ks.g.bimodal
# Fitting faithful data from the dataset library, with the clara clustering
# regime. The first distribution is RS and the second distribution is fmkl.
fun.auto.bimodal.pml(faithful[,1],clustering.m=clara,init1.sel="rprs",
init2.sel="rmfmkl",init1=c(-1.5,1.5),init2=c(-0.25,1.5),leap1=3,leap2=3)
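Similarly, pgl can be used to compare the fitted mixture CDF p*F1 + (1-p)*F2 with the empirical CDF. A hedged sketch, assuming the fit above is stored in an object called fit:
# Sketch: fitted mixture CDF versus the empirical CDF
fit <- fun.auto.bimodal.pml(faithful[,1], clustering.m = clara, init1.sel = "rprs",
  init2.sel = "rmfmkl", init1 = c(-1.5, 1.5), init2 = c(-0.25, 1.5),
  leap1 = 3, leap2 = 3)
p <- fit$par[9]
xg <- sort(faithful[,1])
theo <- p * pgl(xg, fit$par[1], fit$par[2], fit$par[3], fit$par[4], param = "rs") +
  (1 - p) * pgl(xg, fit$par[5], fit$par[6], fit$par[7], fit$par[8], param = "fmkl")
plot(ecdf(faithful[,1]), main = "Empirical vs fitted mixture CDF")
lines(xg, theo, col = 2)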
This function fits a mixture of two generalised lambda distributions to a data set; it is restricted to two generalised lambda distributions. The method of fitting is the quantile matching method. It is a two-step optimisation procedure: each unimodal part of the bimodal distribution is first modelled using the quantile matching method, and the initial values obtained are then used to match the theoretical and empirical quantiles of the entire bimodal distribution. It fits a mixture of the form p*f1 + (1-p)*f2, where f1 and f2 are the probability density functions of the two generalised lambda distributions.
fun.auto.bimodal.qs(data, per.of.mix = 0.01, clustering.m = clara, init1.sel = "rprs", init2.sel = "rprs", init1=c(-1.5, 1.5), init2=c(-1.5, 1.5), leap1=3, leap2=3, fun1 = "runif.sobol", fun2 = "runif.sobol", trial.n = 100, len = 1000, type = 7, no = 10000, maxit = 5000)
data |
A numerical vector representing the dataset. |
per.of.mix |
Level of mix between two parts of the distribution, usually 1-2% of cross mix is sufficient. |
clustering.m |
Clustering method used in classifying the dataset into two parts. Valid arguments include clara, fanny and pam from the cluster library. Default is clara. Or a logical vector specifying how data should be split. |
init1.sel |
This can be "rprs", "rmfmkl" or "star", the method used to obtain initial values for the first generalised lambda distribution. |
init2.sel |
This can be "rprs", "rmfmkl" or "star", the method used to obtain initial values for the second generalised lambda distribution. |
init1 |
Initial values (lambda3 and lambda4) for the first generalised lambda distribution. |
init2 |
Initial values (lambda3 and lambda4) for the second generalised lambda distribution. |
leap1 |
See the scrambling argument in runif.sobol; applies to the first distribution fit. |
fun1 |
A character string naming the quasi-random number generator used to find initial values for the first distribution fit, e.g. "runif.sobol" (default), "runif.halton" or "QUnif". |
leap2 |
See the scrambling argument in runif.sobol; applies to the second distribution fit. |
fun2 |
A character string naming the quasi-random number generator used to find initial values for the second distribution fit, e.g. "runif.sobol" (default), "runif.halton" or "QUnif". |
trial.n |
Number of evenly spaced quantiles ranging from 0 to 1 to be used in the checking phase to find the best set of initial values for optimisation; this is intended to be lower than len. |
len |
Number of evenly spaced quantile ranging from 0 to 1 to be used, default is 1000 |
type |
Type of quantile to be used, default is 7; see the quantile function in R. |
no |
Number of initial random values to find the best initial values for optimisation. |
maxit |
Maximum number of iterations for numerical optimisation. Default is 5000. |
The initial values that work well for RPRS are c(-1.5,1.5) and for RMFMKL are c(-0.25,1.5). For scrambling, if 1, 2 or 3 the sequence is scrambled, otherwise it is not. If 1, Owen type scrambling is applied; if 2, Faure-Tezuka type scrambling is applied; and if 3, both Owen and Faure-Tezuka type scrambling is applied. The star method uses the same initial values as rmfmkl since it uses the FMKL generalised lambda distribution. The Nelder-Mead simplex algorithm is used in the numerical optimisation. rprs stands for revised percentile method for the RS generalised lambda distribution and rmfmkl stands for revised method of moment for the FMKL generalised lambda distribution. These acronyms represent the initial optimisation algorithm used to obtain a reasonable set of initial values for the subsequent optimisation procedure.
par |
The best set of parameters found; the first four values correspond to the first distribution fit, the next four to the second distribution fit, and the last value to p, the mixing proportion of the first distribution. |
value |
The value of the objective function for the parameters obtained. |
counts |
A two-element integer vector giving the number of calls to fn and gr respectively. This excludes those calls needed to compute the Hessian, if requested, and any calls to fn to compute a finite-difference approximation to the gradient. |
convergence |
An integer code indicating convergence status, as returned by the optimiser. |
message |
A character string giving any additional information returned by the optimizer, or NULL. |
If the number of observations is small, rprs can sometimes fail as the percentiles may not exist for the data. Also, if the initial values do not span a valid generalised lambda distribution, try another set of initial values.
Steve Su
Bratley P. and Fox B.L. (1988) Algorithm 659: Implementing Sobol's quasi random sequence generator, ACM Transactions on Mathematical Software 14, 88-100.
Joe S. and Kuo F.Y. (1998) Remark on Algorithm 659: Implementing Sobol's quasi random Sequence Generator.
Nelder, J. A. and Mead, R. (1965) A simplex algorithm for function minimization. Computer Journal *7*, 308-313.
Su (2008). Fitting GLD to data via quantile matching method. (Book chapter to appear)
fun.auto.bimodal.pml, fun.auto.bimodal.ml, fun.plot.fit.bm, fun.diag.ks.g.bimodal
# Fitting faithful data from the dataset library, with the clara clustering
# regime. The first distribution is RS and the second distribution is fmkl.
# The percentage of data mix is 1%.
fun.auto.bimodal.qs(faithful[,1],per.of.mix=0.01,clustering.m=clara,
init1.sel="rprs",init2.sel="rmfmkl",init1=c(-1.5,1.5),init2=c(-0.25,1.5),
leap1=3,leap2=3)
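The same diagnostics used for the maximum likelihood mixture fits apply here. A hedged sketch, assuming the quantile matching fit above is stored in an object called fit:
# Sketch: diagnostics for a quantile matching mixture fit
fit <- fun.auto.bimodal.qs(faithful[,1], per.of.mix = 0.01, clustering.m = clara,
  init1.sel = "rprs", init2.sel = "rmfmkl", init1 = c(-1.5, 1.5),
  init2 = c(-0.25, 1.5), leap1 = 3, leap2 = 3)
qqplot.gld.bi(faithful[,1], fit$par, param1 = "rs", param2 = "fmkl",
  name = "RS FMKL QS fit", range = c(0.001, 0.999))
fun.diag.ks.g.bimodal(fit$par[1:4], fit$par[5:8], prop1 = fit$par[9],
  data = faithful[,1], param1 = "rs", param2 = "fmkl")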
This is the secondary optimisation procedure for evaluating the final bimodal distribution fit using maximum likelihood. It usually relies on initial values found by the fun.bimodal.init function.
fun.bimodal.fit.ml(data, first.fit, second.fit, prop, param1, param2, selc1, selc2)
data |
Dataset to be fitted. |
first.fit |
The distribution parameters or the initial values of the first distribution fit. |
second.fit |
The distribution parameters or the initial values of the second distribution fit. |
prop |
The proportion of the data in the first distribution, usually obtained from fun.bimodal.init. |
param1 |
Can be either "rs" or "fmkl", the type of the first generalised lambda distribution. |
param2 |
Can be either "rs" or "fmkl", the type of the second generalised lambda distribution. |
selc1 |
Selection of initial values for the first distribution; can be either "rs", "fmkl" or "star". |
selc2 |
Selection of initial values for the second distribution; can be either "rs", "fmkl" or "star". |
This function should be used in tandem with fun.bimodal.init.
par |
The first four numbers are the parameters of the first generalised lambda distribution, the second four numbers are the parameters of the second generalised lambda distribution and the last value is the proportion of the first generalised lambda distribution. |
value |
The objective value of negative likelihood obtained using the par above. |
counts |
A two-element integer vector giving the number of calls to functions. Gradient is not used in this case. |
convergence |
An integer code indicating convergence status, as returned by the optimiser. |
message |
A character string giving any additional information returned by the optimizer, or NULL. |
There is currently no guarantee of global convergence.
Steve Su
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
fun.bimodal.fit.pml, fun.bimodal.init
# Extract faithful[,2] into faithful2
faithful2<-faithful[,2]
# Uses clara clustering method
clara.faithful2<-fun.class.regime.bi(faithful2, 0.01, clara)
# Save into two different objects
qqqq1.faithful2.cc<-clara.faithful2$data.a
qqqq2.faithful2.cc<-clara.faithful2$data.b
# Find the initial values
result.faithful2.init<-fun.bimodal.init(data1=qqqq1.faithful2.cc,
data2=qqqq2.faithful2.cc,
rs.leap1=3,fmkl.leap1=3,rs.init1 = c(-1.5, 1.5), fmkl.init1 = c(-0.25, 1.5),
rs.leap2=3,fmkl.leap2=3,rs.init2 = c(-1.5, 1.5), fmkl.init2 = c(-0.25, 1.5))
# Find the final fits
result.faithful2.rsrs<-fun.bimodal.fit.ml(data=faithful2,
result.faithful2.init[[2]],result.faithful2.init[[3]],
result.faithful2.init[[1]],
param1="rs",param2="rs",selc1="rs",selc2="rs")
# Output
result.faithful2.rsrs
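The resulting fit can then be inspected graphically. A sketch only, assuming the result.faithful2.rsrs object created above and that fun.plot.fit.bm accepts this fit object in the same way as the fun.auto.bimodal.* outputs:
# Sketch: plot the final RS-RS bimodal fit obtained above
fun.plot.fit.bm(nclass = 50, fit.obj = result.faithful2.rsrs, data = faithful2,
  name = "Maximum likelihood using", xlab = "faithful2", param.vec = c("rs", "rs"))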
This is the secondary optimisation procedure for evaluating the final bimodal distribution fit using the partition maximum likelihood. It usually relies on initial values found by the fun.bimodal.init function.
fun.bimodal.fit.pml(data1, data2, first.fit, second.fit, prop, param1, param2, selc1, selc2)
data1 |
First data set, usually obtained from fun.class.regime.bi. |
data2 |
Second data set, usually obtained from fun.class.regime.bi. |
first.fit |
The distribution parameters or the initial values of the first distribution fit. |
second.fit |
The distribution parameters or the initial values of the second distribution fit. |
prop |
The proportion of the data in the first distribution, usually obtained from fun.bimodal.init. |
param1 |
Can be either "rs" or "fmkl", the type of the first generalised lambda distribution. |
param2 |
Can be either "rs" or "fmkl", the type of the second generalised lambda distribution. |
selc1 |
Selection of initial values for the first distribution; can be either "rs", "fmkl" or "star". |
selc2 |
Selection of initial values for the second distribution; can be either "rs", "fmkl" or "star". |
This function should be used in tandem with the fun.bimodal.init function.
par |
The first four numbers are the parameters of the first generalised lambda distribution, the second four numbers are the parameters of the second generalised lambda distribution and the last value is the proportion of the first generalised lambda distribution. |
value |
The objective value of negative likelihood obtained. |
counts |
A two-element integer vector giving the number of calls to functions. Gradient is not used in this case. |
convergence |
An integer code indicating convergence status, as returned by the optimiser. |
message |
A character string giving any additional information returned by the optimizer, or NULL. |
There is currently no guarantee of global convergence.
Steve Su
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
fun.bimodal.fit.ml, fun.bimodal.init
# Extract faithful[,2] into faithful2
faithful2<-faithful[,2]
# Uses clara clustering method
clara.faithful2<-clara(faithful2,2)$clustering
# Save into two different objects
qqqq1.faithful2.cc<-faithful2[clara.faithful2==1]
qqqq2.faithful2.cc<-faithful2[clara.faithful2==2]
# Find the initial values
result.faithful2.init<-fun.bimodal.init(data1=qqqq1.faithful2.cc,
data2=qqqq2.faithful2.cc,
rs.leap1=3,fmkl.leap1=3,rs.init1 = c(-1.5, 1.5), fmkl.init1 = c(-0.25, 1.5),
rs.leap2=3,fmkl.leap2=3,rs.init2 = c(-1.5, 1.5), fmkl.init2 = c(-0.25, 1.5))
# Find the final fits
result.faithful2.rsrs<-fun.bimodal.fit.pml(data1=qqqq1.faithful2.cc,
data2=qqqq2.faithful2.cc, result.faithful2.init[[2]],
result.faithful2.init[[3]], result.faithful2.init[[1]],param1="rs",
param2="rs",selc1="rs",selc2="rs")
# Output
result.faithful2.rsrs
After classifying the data using fun.class.regime.bi, this function evaluates temporary or initial solutions by estimating each part of the bimodal distribution using maximum likelihood estimation and the starship method. These initial solutions are then passed on to fun.bimodal.fit.ml or fun.bimodal.fit.pml to obtain the final fits.
fun.bimodal.init(data1, data2, rs.leap1, fmkl.leap1, rs.init1, fmkl.init1, rs.leap2, fmkl.leap2, rs.init2, fmkl.init2,fun1="runif.sobol", fun2="runif.sobol",no=10000)
data1 |
The first data obtained by the clustering algorithm. |
data2 |
The second data obtained by the clustering algorithm. |
rs.leap1 |
See the scrambling argument in runif.sobol; applies to the RS fit of the first data set. |
fmkl.leap1 |
See the scrambling argument in runif.sobol; applies to the FMKL fit of the first data set. |
rs.init1 |
Initial values (lambda3 and lambda4) for the first RS generalised lambda distribution. |
fmkl.init1 |
Initial values (lambda3 and lambda4) for the first FMKL generalised lambda distribution. |
rs.leap2 |
See the scrambling argument in runif.sobol; applies to the RS fit of the second data set. |
fmkl.leap2 |
See the scrambling argument in runif.sobol; applies to the FMKL fit of the second data set. |
rs.init2 |
Initial values (lambda3 and lambda4) for the second RS generalised lambda distribution. |
fmkl.init2 |
Initial values (lambda3 and lambda4) for the second FMKL generalised lambda distribution. |
fun1 |
A character string naming the quasi-random number generator used for the first data set, e.g. "runif.sobol" (default), "runif.halton" or "QUnif". |
fun2 |
A character string naming the quasi-random number generator used for the second data set, e.g. "runif.sobol" (default), "runif.halton" or "QUnif". |
no |
Number of initial random values to find the best initial values for optimisation. |
All three methods of fitting (RPRS, RMFMKL and STAR) will be given for each part of the bimodal distribution.
prop |
Proportion of the number of observations in the first data in relation to the entire data. |
first.fit |
A matrix comprising the parameters of GLD obtained from RPRS, RMFMKL and STAR for the first dataset. |
second.fit |
A matrix comprising the parameters of GLD obtained from RPRS, RMFMKL and STAR for the second dataset. |
This function is not designed to be called explicitly by the end user; users should note the difficulties associated with the RPRS parameterisation.
Steve Su
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
fun.class.regime.bi, fun.bimodal.fit.pml, fun.bimodal.fit.ml
# Split the first column of the faithful data into two using fun.class.regime.bi
faithful1.mod<-fun.class.regime.bi(faithful[,1], 0.1, clara)
# Save the datasets
qqqq1.faithful1.cc1<-faithful1.mod$data.a
qqqq2.faithful1.cc1<-faithful1.mod$data.b
# Find the initial values for secondary optimisation.
result.faithful1.init1<-fun.bimodal.init(data1=qqqq1.faithful1.cc1,
data2=qqqq2.faithful1.cc1,
rs.leap1=3,fmkl.leap1=3,rs.init1 = c(-1.5, 1.5), fmkl.init1 = c(-0.25, 1.5),
rs.leap2=3,fmkl.leap2=3,rs.init2 = c(-1.5, 1.5), fmkl.init2 = c(-0.25, 1.5))
# These initial values are then passed onto fun.bimodal.fit.ml to obtain the
# final fits.
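A hedged sketch of that final step, mirroring the fun.bimodal.fit.ml example earlier and assuming the objects created above:
# Sketch: pass the initial values above to fun.bimodal.fit.ml for the final RS-RS fit
result.faithful1.rsrs <- fun.bimodal.fit.ml(data = faithful[,1],
  result.faithful1.init1[[2]], result.faithful1.init1[[3]],
  result.faithful1.init1[[1]], param1 = "rs", param2 = "rs",
  selc1 = "rs", selc2 = "rs")
result.faithful1.rsrs$par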
This function will return a single logical value showing whether a combination of L1, L2, L3 and L4 is a valid GLD.
fun.check.gld(lambda1, lambda2, lambda3, lambda4, param)
lambda1 |
A numerical vector for L1 of GLD |
lambda2 |
A numerical vector for L2 of GLD |
lambda3 |
A numerical vector for L3 of GLD |
lambda4 |
A numerical vector for L4 of GLD |
param |
Can be "rs", "fmkl", or "fkml" |
A single logical value indicating whether the specified GLD is a valid probability density function
Steve Su
fun.check.gld(1,4,3,2,"rs") fun.check.gld(1,4,3,2,"fkml") fun.check.gld(1,4,3,-2,"rs")
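One possible use (a sketch only, with hypothetical parameter values not taken from the package) is to screen a candidate parameter set before evaluating its density with dgl:
# Sketch: screen hypothetical RS parameters before using them with dgl()
cand <- c(0, 1, 0.5, -2)   # hypothetical candidate parameters
if (fun.check.gld(cand[1], cand[2], cand[3], cand[4], param = "rs")) {
  dgl(0, cand[1], cand[2], cand[3], cand[4], param = "rs")
} else {
  cat("Not a valid RS GLD - try other parameter values\n")
}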
This function will return a logical vector showing whether vector combinations of L1, L2, L3 and L4 are valid GLDs.
fun.check.gld.multi(l1, l2, l3, l4, param)
l1 |
A numerical vector for L1 of GLD |
l2 |
A numerical vector for L2 of GLD |
l3 |
A numerical vector for L3 of GLD |
l4 |
A numerical vector for L4 of GLD |
param |
Can be "rs", "fmkl", or "fkml" |
This is an extension to fun.check.gld
A logical vector indicating whether the specified parameters are valid GLDs
Steve Su
fun.check.gld.multi(c(1,2,3),c(4,5,6),c(7,8,9),c(10,11,12),param="rs")
fun.check.gld.multi(c(1,2,3),c(4,5,6),c(7,8,9),c(10,11,-12),param="rs")
This function is primarily designed to split a bimodal data vector into two groups to allow the fitting of mixture generalised lambda distributions.
fun.class.regime.bi(data, perc.cross, fun.cross)
data |
Data to be classified into two groups. |
perc.cross |
Percentage of cross over from one data to the other, usually set at 1% |
fun.cross |
Any clustering function such as clara, fanny or pam from the cluster package, or a logical vector specifying how the data should be split. |
This function is part of the routine mixture fitting procedure provided in this package. The perc.cross argument, or percentage of cross over, is designed to allow the use of maximum likelihood estimation via the EM algorithm for fitting bimodal data. When this is invoked, it ensures that both parts of the data contain the minimum and the maximum of the data set, as well as a proportion (specified by the perc.cross argument) of observations from each other. If 1% is required, then data.a will contain 1% of data.b and vice versa, after the full data set has been classified into data.a and data.b by the fun.cross classification regime.
data.a |
First group of data obtained by the classification algorithm. |
data.b |
Second group of data obtained by the classification algorithm. |
Steve Su
Kaufman, L. and Rousseeuw, P. J. (1990). Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, New York.
Su (2006) Maximum Log Likelihood Estimation using EM Algorithm and Partition Maximum Log Likelihood Estimation for Mixtures of Generalized Lambda Distributions. Working Paper.
# Classify the faithful[,1] data into two categories with 10% cross over mix.
fun.class.regime.bi(faithful[,1],0.1,clara)
# Classify the faithful[,1] data into two categories with no mixing:
fun.class.regime.bi(faithful[,1],0,clara)
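A small sketch of how the returned split might be inspected, assuming the data.a and data.b components described below:
# Sketch: inspect the two groups returned by the classification
split.obj <- fun.class.regime.bi(faithful[,1], 0.01, clara)
length(split.obj$data.a)
length(split.obj$data.b)
range(split.obj$data.a)   # with cross-over > 0, both groups span the full data range
range(split.obj$data.b)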
After fitting the distribution, it is often desirable to see whether the moments of the data match those of the fitted distribution. This function computes the theoretical and actual moments, especially for the output of the fun.data.fit.ml function.
fun.comp.moments.ml(theo.obj, data, name = "ML")
theo.obj |
Fitted distribution parameters, usually output from fun.data.fit.ml. |
data |
Data set used |
name |
Naming the method used in fitting the distribution, by default this is "ML". |
r.mat |
A matrix showing the mean, variance, skewness and kurtosis of the fitted distribution in comparison to the data set. |
eval.mat |
Absolute difference in each of the four moments from the data under each of the distributional fits. |
Sometimes it is difficult to find RPRS type fits to a data set, so fun.comp.moments.ml.2 is used instead to compare the theoretical moments of the RMFMKL.ML and STAR methods with respect to the fitted data set.
Steve Su
# Generate random normally distributed observations.
junk<-rnorm(1000,3,2)
# Fit the dataset using fun.data.fit.ml
fit<-fun.data.fit.ml(junk)
# Compare the resulting fits. It is usually the case that the maximum likelihood
# method provides better estimation of the moments than the starship method.
fun.comp.moments.ml(fit,junk)
After fitting the distribution, it is often desirable to see whether the moments of the data match those of the fitted distribution. This function computes the theoretical and actual moments for the FMKL GLD maximum likelihood estimation and starship method.
fun.comp.moments.ml.2(theo.obj, data, name = "ML")
theo.obj |
Fitted distribution parameters, there should be two sets, both FMKL GLD. |
data |
Data set used |
name |
Naming the method used in fitting the distribution, by default this is "ML". |
r.mat |
A matrix showing the mean, variance, skewness and kurtosis of the fitted distribution in comparison to the data set. |
eval.mat |
Absolute difference in each of the four moments from the data under each of the distributional fits. |
To compare all three fits under fun.data.fit.ml, see the fun.comp.moments.ml function.
Steve Su
## Generate random normally distributed observations.
junk<-rnorm(1000,3,2)
## Fit the dataset using fun.RMFMKL.ml and the starship method
fit<-cbind(fun.RMFMKL.ml(junk),starship(junk)$lambda)
## Compare the resulting fits. It is usually the case that the maximum likelihood
## method provides better estimation of the moments than the starship method.
fun.comp.moments.ml.2(fit,junk)
This function fits RS and FMKL generalised lambda distributions to data using the discretised approach with weights. It is designed to act as a smoothing device rather than a definitive fit.
fun.data.fit.hs(data, rs.default = "Y", fmkl.default = "Y", rs.leap = 3, fmkl.leap = 3, rs.init = c(-1.5, 1.5), fmkl.init = c(-0.25, 1.5), no.c.rs = 50, no.c.fmkl = 50,FUN="runif.sobol", no=10000)
data |
Dataset to be fitted |
rs.default |
If "Y", this function uses the default method of selecting the number of classes (see Details) for the RS GLD fit. |
fmkl.default |
If "Y", this function uses the default method of selecting the number of classes (see Details) for the FMKL GLD fit. |
rs.leap |
See the scrambling argument in runif.sobol; applies to the RS GLD fit. |
fmkl.leap |
See the scrambling argument in runif.sobol; applies to the FMKL GLD fit. |
rs.init |
Initial values (lambda3 and lambda4) for the RS distribution optimisation; the default is c(-1.5, 1.5). |
fmkl.init |
Initial values (lambda3 and lambda4) for the FMKL distribution optimisation; the default is c(-0.25, 1.5). |
no.c.rs |
Number of classes or bins of the histogram to be optimised over for the RS GLD. This argument is ineffective if rs.default = "Y". |
no.c.fmkl |
Number of classes or bins of the histogram to be optimised over for the FMKL GLD. This argument is ineffective if fmkl.default = "Y". |
FUN |
A character string naming the quasi-random number generator used to find initial values, e.g. "runif.sobol" (default) or "QUnif". |
no |
Number of initial random values to find the best initial values for optimisation. |
This function optimises the deviations of the bin frequencies from the theoretical frequencies, so it has the effect of "fitting clothes" onto the data set. The user can decide the frequency of the bins they want the distribution to smooth over. The resulting fit may or may not be adequate from a formal statistical point of view (such as satisfying a goodness of fit test), but it can be useful in suggesting the range of different distributions exhibited by the data set. The default number of classes calculates the mean and variance after categorising the data into different bins and uses the number of classes that best matches the mean and variance of the original, ungrouped data. The weighting is designed to accentuate the peak or the dense part of the distribution and suppress the tails.
A matrix showing the four parameters of the RS and FMKL distribution fit.
In some cases the resulting fit may not converge; there is currently no checking mechanism in place to ensure global convergence. The RPRS method can sometimes fail if there are no valid percentiles in the data set or if the initial values do not give a valid distribution.
Steve Su
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
fun.RPRS.hs.nw, fun.RMFMKL.hs.nw, fun.RMFMKL.hs, fun.RPRS.hs, fun.data.fit.hs.nw, fun.data.fit.ml
# Fitting normal(3,2) distribution using the default setting
junk<-rnorm(1000,3,2)
fun.data.fit.hs(junk)
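A sketch comparing the weighted and unweighted discretised fits side by side, assuming simulated data as in the example above:
# Sketch: weighted versus unweighted discretised fits on the same data
junk <- rnorm(1000, 3, 2)
fit.w  <- fun.data.fit.hs(junk)       # weighted discretised fit
fit.nw <- fun.data.fit.hs.nw(junk)    # unweighted discretised fit
fun.plot.fit(fit.w, junk, nclass = 50, param = c("rs", "fmkl"), xlab = "x", name = ".HS")
fun.plot.fit(fit.nw, junk, nclass = 50, param = c("rs", "fmkl"), xlab = "x", name = ".HS.NW")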
This function fits RS and FMKL generalised lambda distributions to data using the discretised approach without weights. It is designed to act as a smoothing device rather than a definitive fit.
fun.data.fit.hs.nw(data, rs.default = "Y", fmkl.default = "Y", rs.leap = 3, fmkl.leap = 3, rs.init = c(-1.5, 1.5), fmkl.init = c(-0.25, 1.5), no.c.rs = 50, no.c.fmkl = 50,FUN="runif.sobol",no=10000)
data |
Dataset to be fitted |
rs.default |
If yes, this function uses the default method
|
fmkl.default |
If yes, this function uses the default method
|
rs.leap |
See scrambling argument in |
fmkl.leap |
See scrambling argument in |
rs.init |
Initial values for RS distribution optimization,
|
fmkl.init |
Initial values for FMKL distribution optimization,
|
no.c.rs |
Number of classes or bins of histogram to be optimized over
for the RS GLD. This argument is ineffective if |
no.c.fmkl |
Number of classes or bins of histogram to be optimized over
for the FMKL GLD. This argument is ineffective if |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
This function minimises the deviations between the bin frequencies of the data and those of the theoretical distribution, so it has the effect of "fitting clothes" onto the data set. The user can decide the frequency of the bins they want the distribution to smooth over. The resulting fit may or may not be adequate from a formal statistical point of view (for example, in terms of goodness of fit), but it can be useful to suggest the range of different distributions exhibited by the data set. The default number of classes calculates the mean and variance after categorising the data into different bins and uses the number of classes that best matches the mean and variance of the original, ungrouped data.
A matrix showing the four parameters of the RS and FMKL distribution fit.
In some cases the resulting fit may not converge; there is currently no checking mechanism in place to ensure global convergence. The RPRS method can sometimes fail if there are no valid percentiles in the data set or if the initial values do not give a valid distribution.
Steve Su
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
fun.RPRS.hs, fun.RMFMKL.hs, fun.RMFMKL.hs.nw, fun.RPRS.hs.nw, fun.data.fit.hs, fun.data.fit.ml
# Fitting normal(3,2) distribution using the default setting
junk<-rnorm(1000,3,2)
fun.data.fit.hs.nw(junk)
This function fits generalised lambda distributions to data using L moment matching method
fun.data.fit.lm(data, rs.leap = 3, fmkl.leap = 3, rs.init = c(-1.5, 1.5), fmkl.init = c(-0.25, 1.5), FUN = "runif.sobol", no = 10000)
data |
Dataset to be fitted. |
rs.leap |
See scrambling argument in |
fmkl.leap |
See scrambling argument in |
rs.init |
Inititial values (lambda3 and lambda4) for the RS generalised lambda distribution. |
fmkl.init |
Inititial values (lambda3 and lambda4) for the FMKL generalised lambda distribution. |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
This function consolidates fun.RPRS.lm and fun.RMFMKL.lm and gives all the fits in one output.
A matrix showing the parameters of RS and FMKL generalised lambda distributions.
Steve Su
Asquith, W. (2007). "L-moments and TL-moments of the generalized lambda distribution." Computational Statistics and Data Analysis 51(9): 4484-4496.
Karvanen, J. and A. Nuutinen (2008). "Characterizing the generalized lambda distribution by L-moments." Computational Statistics and Data Analysis 52(4): 1971-1983.
fun.RPRS.qs, fun.RMFMKL.qs, fun.data.fit.ml, fun.data.fit.hs, fun.data.fit.hs.nw, fun.data.fit.qs, fun.data.fit.mm
# Fitting normal(3,2) distribution using the default setting
junk<-rnorm(50,3,2)
fun.data.fit.lm(junk)
This function fits generalised lambda distributions to data using RPRS, RMFMKL and starship methods.
fun.data.fit.ml(data, rs.leap = 3, fmkl.leap = 3, rs.init = c(-1.5, 1.5), fmkl.init = c(-0.25, 1.5),FUN="runif.sobol",no=10000)
data |
Dataset to be fitted. |
rs.leap |
See scrambling argument in |
fmkl.leap |
See scrambling argument in |
rs.init |
Inititial values (lambda3 and lambda4) for the RS generalised lambda distribution. |
fmkl.init |
Inititial values (lambda3 and lambda4) for the FMKL generalised lambda distribution. |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
This function consolidates fun.RPRS.ml, fun.RMFMKL.ml and starship and gives all the fits in one output.
A matrix showing the parameters of the generalised lambda distribution for the RPRS, RMFMKL and STAR methods.
RPRS can sometimes fail if it is not possible to calculate the percentiles of the data set. This usually happens when the number of data points is small.
Steve Su
King, R.A.R. & MacGillivray, H. L. (1999), A starship method for fitting the generalised lambda distributions, Australian and New Zealand Journal of Statistics, 41, 353-374
Su, S. (2007). Numerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions. Computational statistics and data analysis 51(8) 3983-3998.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
fun.RPRS.ml, fun.RMFMKL.ml, starship, fun.data.fit.hs, fun.data.fit.hs.nw, fun.data.fit.qs, fun.data.fit.mm, fun.data.fit.lm
# Fitting normal(3,2) distribution using the default setting
junk<-rnorm(50,3,2)
fun.data.fit.ml(junk)
This function fits generalised lambda distributions to data using moment matching method
fun.data.fit.mm(data, rs.leap = 3, fmkl.leap = 3, rs.init = c(-1.5, 1.5), fmkl.init = c(-0.25, 1.5), FUN = "runif.sobol", no = 10000)
data |
Dataset to be fitted. |
rs.leap |
See scrambling argument in |
fmkl.leap |
See scrambling argument in |
rs.init |
Inititial values (lambda3 and lambda4) for the RS generalised lambda distribution. |
fmkl.init |
Inititial values (lambda3 and lambda4) for the FMKL generalised lambda distribution. |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
This function consolidates fun.RPRS.mm and fun.RMFMKL.mm and gives all the fits in one output.
A matrix showing the parameters of RS and FMKL generalised lambda distributions.
RPRS can sometimes fail if it is not possible to calculate the percentiles of the data set. This usually happens when the number of data points is small.
Karian, Z. and E. Dudewicz (2000). Fitting Statistical Distributions: The Generalized Lambda Distribution and Generalised Bootstrap Methods. New York, Chapman and Hall.
fun.RPRS.qs, fun.RMFMKL.qs, fun.data.fit.ml, fun.data.fit.hs, fun.data.fit.hs.nw, fun.data.fit.qs, fun.data.fit.lm
# Fitting normal(3,2) distribution using the default setting
junk<-rnorm(50,3,2)
fun.data.fit.mm(junk)
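Moment matching targets the first four sample moments, so it can be useful to keep them at hand for comparison; a minimal sketch using fun.moments.r (documented later in this manual):
# Sample mean, variance, skewness and kurtosis that the moment matching fit aims to reproduce.
junk<-rnorm(50,3,2)
fit.mm<-fun.data.fit.mm(junk)
fun.moments.r(junk)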
This function fits generalised lambda distributions to data using quantile matching method
fun.data.fit.qs(data, rs.leap = 3, fmkl.leap = 3, rs.init = c(-1.5, 1.5), fmkl.init = c(-0.25, 1.5), FUN = "runif.sobol", trial.n = 100, len = 1000, type = 7, no = 10000)
data |
Dataset to be fitted. |
rs.leap |
See scrambling argument in |
fmkl.leap |
See scrambling argument in |
rs.init |
Inititial values (lambda3 and lambda4) for the RS generalised lambda distribution. |
fmkl.init |
Inititial values (lambda3 and lambda4) for the FMKL generalised lambda distribution. |
FUN |
A character string of either |
trial.n |
Number of evenly spaced quantile ranging from 0 to 1 to be
used in the checking phase, to find the best set of initial values for
optimisation, this is intended to be lower than |
len |
Number of evenly spaced quantile ranging from 0 to 1 to be used, default is 1000 |
type |
Type of quantile to be used, default is 7, see |
no |
Number of initial random values to find the best initial values for optimisation. |
This function consolidates fun.RPRS.qs and fun.RMFMKL.qs and gives all the fits in one output.
A matrix showing the parameters of RS and FMKL generalised lambda distributions.
RPRS can sometimes fail if it is not possible to calculate the percentiles of the data set. This usually happens when the number of data points is small.
Steve Su
Su (2008). Fitting GLD to data via quantile matching method. (Book chapter to appear)
fun.RPRS.qs, fun.RMFMKL.qs, fun.data.fit.ml, fun.data.fit.hs, fun.data.fit.hs.nw, fun.data.fit.mm, fun.data.fit.lm
# Fitting normal(3,2) distribution using the default setting
junk<-rnorm(50,3,2)
fun.data.fit.qs(junk)
This function counts the number of times the p-value exceeds 0.05 for the null hypothesis that observations simulated from the fitted distribution and observations sampled from the unimodal data set come from the same distribution.
fun.diag.ks.g(result, data, no.test = 1000, len = floor(0.9 * length(data)), param, alpha = 0.05)
result |
A vector representing the four parameters of the generalised lambda distribution. |
data |
The unimodal dataset. |
no.test |
Total number of tests required. |
len |
Number of data to sample. |
param |
Type of the generalised lambda distribution, |
alpha |
Significance level of KS test. |
A numerical value representing the number of times the p-value exceeds alpha.
If there are ties, jittering is used in ks.gof.
Steve Su
Stephens, M. A. (1986). Tests based on EDF statistics. In Goodness-of-Fit Techniques. D'Agostino, R. B. and Stevens, M. A., eds. New York: Marcel Dekker.
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424.
Su (2007). Numerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions. Computational Statistics and Data Analysis: *51*, 8, 3983-3998.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
# Generate 1000 random observations from Normal distribution with mean=100,
# standard deviation=10. Save this as junk
junk<-rnorm(1000,100,10)
# Fit junk using the RPRS method via maximum likelihood.
fit1<-fun.RPRS.ml(junk, c(-1.5, 1.5), leap = 3)
# Calculate the simulated KS test result:
fun.diag.ks.g(fit1,junk,param="rs")
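The returned count is out of no.test resampling tests (1000 by default), so it can be read as an approximate pass rate; a minimal sketch reusing fit1 and junk from the example above:
# Hedged sketch: express the number of non-significant KS tests as a
# proportion of the default no.test = 1000 resampling tests.
ks.count<-fun.diag.ks.g(fit1,junk,param="rs")
ks.count/1000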
This function counts the number of times the p-value exceeds 0.05 for the null hypothesis that observations simulated from the fitted distribution and observations sampled from the bimodal data set come from the same distribution.
fun.diag.ks.g.bimodal(result1, result2, prop1, prop2, data, no.test = 1000, len = floor(0.9 * length(data)), param1, param2, alpha = 0.05)
result1 |
A vector representing the four parameters of the first generalised lambda distribution. |
result2 |
A vector representing the four parameters of the second generalised lambda distribution. |
prop1 |
Proportion of the first distribution fitted to the bimodal dataset. |
prop2 |
Proportion of the second distribution fitted to the bimodal dataset. |
data |
The bimodal dataset. |
no.test |
Total number of tests required. |
len |
Number of data to sample. |
param1 |
Type of first generalised lambda distribution, can be
|
param2 |
Type of second generalised lambda distribution, can be
|
alpha |
Significance level of KS test. |
A numerical value representing the number of times the p-value exceeds alpha.
If there are ties, jittering is used in ks.gof.
Steve Su
Stephens, M. A. (1986). Tests based on EDF statistics. In Goodness-of-Fit Techniques. D'Agostino, R. B. and Stevens, M. A., eds. New York: Marcel Dekker.
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424.
Su (2007). Numerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions. Computational Statistics and Data Analysis: *51*, 8, 3983-3998.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
# Fit the faithful[,1] column of the Old Faithful geyser data
fit1<-fun.auto.bimodal.ml(faithful[,1],init1.sel="rprs",init2.sel="rmfmkl",
init1=c(-1.5,1.5),init2=c(-0.25,1.5),leap1=3,leap2=3)
# Run diagnostic KS tests
fun.diag.ks.g.bimodal(fit1$par[1:4],fit1$par[5:8],prop1=fit1$par[9],
data=faithful[,1],param1="rs",param2="fmkl")
This function is primarily designed to be used for testing the fitted distribution with reference to a theoretical distribution. It is also tailored for output obtained from the fun.data.fit.ml function.
fun.diag1(result, test, no.test = 1000, alpha = 0.05)
result |
Output from |
test |
Simulated observations from theoretical distribution, the length
should be no.test |
no.test |
Number of times to do the KS tests. |
alpha |
Significance level of KS test. |
A vector showing the number of times the KS p-value is greater than alpha for each of the distribution fit strategy.
If there are ties, jittering is used in ks.gof.
Steve Su
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424.
Su, S. (2007). Numerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions. Journal of Computational statistics and data analysis 51(8) 3983-3998.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
fun.diag2, fun.diag.ks.g, fun.diag.ks.g.bimodal
# Fits a Weibull(5,2) distribution:
weibull.approx.ml<-fun.data.fit.ml(rweibull(1000,5,2))
# Compute the resample K-S test results.
fun.diag1(weibull.approx.ml, rweibull(100000, 5, 2))
This function is primarily designed to be used for testing the fitted distribution with reference to empirical data. It is also tailored for output obtained from the fun.data.fit.ml function.
fun.diag2(result, data, no.test = 1000, len=100, alpha = 0.05)
result |
Output from |
data |
Observations in which the distribution was fitted upon. |
no.test |
Number of times to do the KS tests. |
len |
Number of observations to sample from the data. This is also the number of observations sampled from the fitted distribution in each KS test. |
alpha |
Significance level of KS test. |
A vector showing the number of times the KS p-value is greater than alpha for each of the distribution fit strategy.
If there are ties, jittering is used in ks.gof.
Steve Su
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424.
Su, S. (2007). Numerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions. Journal of Computational statistics and data analysis 51(8) 3983-3998.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
fun.diag1, fun.diag.ks.g, fun.diag.ks.g.bimodal
# Fits a Normal(3,2) distribution:
junk<-rnorm(1000,3,2)
fit<-fun.data.fit.ml(junk)
# Compute the resample K-S test results.
fun.diag2(fit,junk)
This function supplements fun.nclass.e and it is not intended to be used by the users directly.
fun.disc.estimation(x, nint)
x |
A vector of observations. |
nint |
Number of intervals to cut the vectors into. |
The function cuts the vector into evenly spaced categories and estimates the mean and variance of the actual data based on this categorisation.
Two numerical values, the first being the mean and the second being the variance.
Steve Su
## Cut up randomly normally distributed observations into 5 evenly spaced
## categories and estimate the mean and variance based on this categorisation.
junk<-rnorm(1000,3,2)
fun.disc.estimation(junk,5)
This function calls runif.sobol, runif.sobol.owen and runif.halton, essentially from the spacefillr package.
fun.gen.qrn(n, dimension, scrambling, FUN = "runif.sobol")
n |
Number to generate. |
dimension |
Number of dimensions. |
scrambling |
seed used, or leap as in the case of
|
FUN |
This can be |
A vector of values if dimension=1, otherwise a matrix of values between 0 and 1.
Steve Su
fun.gen.qrn(1000,5,3,"runif.sobol")
fun.gen.qrn(1000,5,409,"QUnif")
This function computes the first four L moments for both RS and FMKL generalised lambda distributions.
fun.lm.theo.gld(L1, L2, L3, L4, param)
L1 |
Lambda 1. Or c(Lambda 1,Lambda 2,Lambda 3,Lambda 4). |
L2 |
Lambda 2. |
L3 |
Lambda 3. |
L4 |
Lambda 4. |
param |
"rs" or "fmkl" or "fkml" |
A vector listing the first four L moments
Sometimes the theoretical moments may not exist; in those cases, NA is returned.
Steve Su
Asquith, W. (2007). "L-moments and TL-moments of the generalized lambda distribution." Computational Statistics and Data Analysis 51(9): 4484-4496.
Karvanen, J. and A. Nuutinen (2008). "Characterizing the generalized lambda distribution by L-moments." Computational Statistics and Data Analysis 52(4): 1971-1983.
fun.lm.theo.gld(1, 2, 3, 4, "rs")
fun.lm.theo.gld(1, 2, 3, 4, "fmkl")
This is a generic function that can be used to find mean, variance, sum or other operations according to some index imposed on the matrix or vector.
fun.mApply(X, INDEX, FUN = NULL, ..., simplify = TRUE)
X |
Matrix with n rows. |
INDEX |
Vector or list of vectors of length n. |
FUN |
Function to operate on submatrices of |
... |
Arguments to function. |
simplify |
Set as |
If FUN returns more than one number, fun.mApply returns a matrix with rows corresponding to unique values of INDEX.
Tony Plate
# Finding the row medians of a matrix (matrix(1:20,nrow=5))
fun.mApply(matrix(1:20,nrow=5),list(1:5),median)
This function checks the lowest and highest quantiles of the specified GLDs against the specified dataset
fun.minmax.check.gld(data, lambdas, param, lessequalmin = 1, greaterequalmax = 1)
data |
A vector of numerical dataset |
lambdas |
A matrix of four columns representing lambda 1 to lambda 4 of the GLD |
param |
Can be "rs", "fkml" or "fmkl" |
lessequalmin |
Can be 0 or 1 |
greaterequalmax |
Can be 0 or 1 |
lessequalmin==1 means the lowest value of the GLD <= the minimum value of the data
lessequalmin==0 means the lowest value of the GLD < the minimum value of the data
greaterequalmax==1 means the highest value of the GLD >= the maximum value of the data
greaterequalmax==0 means the highest value of the GLD > the maximum value of the data
A vector of logical values indicating whether the specified GLDs cover the minimum and maximum values of the specified dataset.
Steve Su
fun.minmax.check.gld(runif(100,.9,1),matrix(1:12,ncol=4),param="rs",0,0)
fun.minmax.check.gld(runif(100,.98,1),matrix(1:12,ncol=4),param="fkml",1,1)
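A hedged sketch of a more typical use, assuming the vector returned by fun.RPRS.ml is ordered lambda 1 to lambda 4 (as in its Value section later in this manual): check whether a fitted RS GLD covers the range of the data it was fitted to, binding the fitted vector into the required four-column matrix.
# Does the fitted RS GLD span the minimum and maximum of the observed data?
junk<-rnorm(1000,3,2)
fit.rs<-fun.RPRS.ml(junk,rs.init=c(-1.5,1.5),leap=3)
fun.minmax.check.gld(junk,rbind(fit.rs),param="rs")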
This function computes the mean, variance, skewness and kurtosis of the fitted generalised lambda distribution mixtures using Monte Carlo simulation.
fun.moments.bimodal(result1, result2, prop1, prop2, len = 1000, no.test = 1000, param1, param2)
result1 |
A vector comprising four values for the first generalised lambda distribution. |
result2 |
A vector comprising four values for the second generalised lambda distribution. |
prop1 |
Proportion of the first generalised lambda distribution |
prop2 |
1-prop1, this can be left unspecified. |
len |
Length of object for each simulation run. |
no.test |
Number of simulation run. |
param1 |
This can be |
param2 |
This can be |
There is also a theoretical computation of the moments in fun.theo.bi.mv.gld; it should be noted that the theoretical moments may not exist. The len argument specifies how many observations should be generated in each simulation run, with the number of simulation runs governed by no.test.
A matrix with four columns showing the mean, variance, skewness and kurtosis of the fitted generalised lambda distribution mixtures using Monte Carlo simulation. Each row represents a simulation run.
Steve Su
fun.theo.bi.mv.gld, fun.simu.bimodal, fun.rawmoments
# Fitting the first column of the Old Faithful Geyser data
fit1<-fun.auto.bimodal.ml(faithful[,1],init1.sel="rmfmkl",init2.sel="rmfmkl",
init1=c(-0.25,1.5),init2=c(-0.25,1.5),leap1=3,leap2=3)
# After fitting, compute the Monte Carlo moments using fun.moments.bimodal
fun.moments.bimodal(fit1$par[1:4],fit1$par[5:8],prop1=fit1$par[9],
param1="fmkl",param2="fmkl")
# It is also possible to compare this with the moments of the original dataset:
fun.moments(faithful[,1])
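Each row of the returned matrix is one simulation run, so the Monte Carlo estimates can be summarised by averaging over rows; a minimal sketch reusing fit1 from the example above:
# Average the mean, variance, skewness and kurtosis over all simulation runs.
sim.moments<-fun.moments.bimodal(fit1$par[1:4],fit1$par[5:8],prop1=fit1$par[9],
param1="fmkl",param2="fmkl")
colMeans(sim.moments)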
This function evaluates the mean, variance, skewness and kurtosis of a numerical vector. Missing values are automatically removed.
fun.moments.r(x, normalise = "N")
x |
A numeric vector |
normalise |
"Y" if you want kurtosis to be calculated with reference to kurtosis = 0 under Normal distribution. Default is "N". |
A vector of mean, variance, skewness and kurtosis.
Please contact the author directly if you find a bug!
Steve Su
fun.moments.r(rnorm(1000))
fun.moments.r(rnorm(1000),normalise="Y")
Support function for discretised method of fitting distribution to data.
fun.nclass.e(x)
x |
Vector of data. |
This function calculates the mean and variance of the discretised data for candidate numbers of classes ranging from 1 up to the number of observations, and chooses the number of classes that best represents the mean and variance of the actual data set through the criterion of squared deviations.
A numerical value suggesting the best number of classes that can be used to represent the mean and variance of the original data set.
This is not designed to be called directly by the end user.
Steve Su
fun.nclass.e(rnorm(100,3,2))
This function is designed for univariate generalised lambda distribution fits only.
fun.plot.fit(fit.obj, data, nclass = 50, xlab = "", name = "", param.vec, ylab="Density", main="")
fit.obj |
Fitted object from |
data |
Dataset to be plotted. |
nclass |
Number of class of histogram, the default is 50. |
xlab |
Label on the x axis. |
name |
Naming the type of distribution fits. |
param.vec |
A vector describing the type of generalised lambda
distribution used in the |
ylab |
Label on the y axis. |
main |
Title of the graph. |
A graphical output showing the data and the resulting distributional fits.
If the distribution fit over-fits the peak of the distribution, it can be difficult to see the actual data set.
Steve Su
fun.plot.fit.bm, fun.data.fit.ml, fun.data.fit.hs, fun.data.fit.hs.nw, fun.RPRS.ml, fun.RMFMKL.ml, fun.RPRS.hs, fun.RMFMKL.hs, fun.RPRS.hs.nw, fun.RMFMKL.hs.nw
# Generate normally distributed random numbers as the dataset
junk<-rnorm(1000,3,2)
# Fit the data set using fun.data.fit.ml.
# Also, fun.data.fit.hs or fun.data.fit.hs.nw can be used.
obj.fit<-fun.data.fit.ml(junk)
# Plot the resulting fits
fun.plot.fit(obj.fit,junk,xlab="x",name=".ML",param.vec=c("rs","fmkl","fmkl"))
# This function also works for singular fits such as those by fun.RPRS.ml,
# fun.RMFMKL.ml, fun.RPRS.hs, fun.RMFMKL.hs, fun.RPRS.hs.nw, fun.RMFMKL.hs.nw
junk<-rnorm(1000,3,2)
obj.fit<-fun.RPRS.ml(junk)
fun.plot.fit(obj.fit,junk,xlab="x",name="RPRS.ML",param.vec=c("rs"))
This function is designed for mixture of two generalised lambda distributions only.
fun.plot.fit.bm(fit.obj, data, nclass = 50, xlab = "", name = "", main="", param.vec, ylab="Density")
fit.obj |
Fitted object from |
data |
Dataset to be plotted. |
nclass |
Number of class of histogram, the default is 50. |
xlab |
Label on the x axis. |
name |
Legend, usually used to identify type of GLD used if |
main |
Title of the graph. |
param.vec |
A vector describing the type of generalised lambda
distribution used in the |
ylab |
Label on the y axis. |
A graphical output showing the data and the resulting distributional fits.
If the distribution fit over-fits the peak of the distribution, it can be difficult to see the actual data set.
Steve Su
fun.auto.bimodal.ml, fun.auto.bimodal.pml, fun.plot.fit
opar <- par()
par(mfrow=c(2,1))
# Fitting mixture of generalised lambda distributions on the data set using
# both the maximum likelihood and partition maximum likelihood and plot
# the resulting fits
junk<-fun.auto.bimodal.ml(faithful[,1],per.of.mix=0.1,clustering.m=clara,
init1.sel="rprs",init2.sel="rmfmkl",init1=c(-1.5,1.5),init2=c(-0.25,1.5),
leap1=3,leap2=3)
fun.plot.fit.bm(nclass=50,fit.obj=junk,data=faithful[,1],
name="Maximum likelihood using",xlab="faithful1",param.vec=c("rs","fmkl"))
junk<-fun.auto.bimodal.pml(faithful[,1],clustering.m=clara,init1.sel="rprs",
init2.sel="rmfmkl",init1=c(-1.5,1.5),init2=c(-0.25,1.5),leap1=3,leap2=3)
fun.plot.fit.bm(nclass=50,fit.obj=junk,data=faithful[,1],
name="Partition maximum likelihood using",xlab="faithful1",
param.vec=c("rs","fmkl"))
junk<-fun.auto.bimodal.ml(faithful[,1],per.of.mix=0.1,clustering.m=clara,
init1.sel="rprs",init2.sel="rmfmkl",init1=c(-1.5,1.5),init2=c(-0.25,1.5),
leap1=3,leap2=3)
fun.plot.fit.bm(nclass=50,fit.obj=junk,data=faithful[,1],
main="Mixture distribution fit",
name="RS and FMKL GLD",xlab="faithful1",param.vec=c("rs","fmkl"))
par(opar)
This is a variant of the fun.plot.fit function.
fun.plot.many.gld(fit.obj, data, xlab="", ylab="Density", main="", legd="", param.vec)
fit.obj |
A matrix of generalised lambda distribution parameters
from |
data |
Dataset to be plotted or two values showing the ranges of value to be compared. |
xlab |
X-axis labels. |
ylab |
Y-axis labels. |
main |
Title for the plot. |
legd |
Legend for the plot. |
param.vec |
A vector showing the types of generalised lambda distributions. This can be "rs" or "fmkl", only needed if you want to put your own parameters for generalised lambda distributions which are not generated from a fitting algorithm in this package. |
A graph showing the different distributions on the same page.
The data part of the function is not plotted; to see the dataset, use the fun.plot.fit function.
Steve Su
# Fit the dataset
junk<-rnorm(1000,3,2)
result.hs<-fun.data.fit.hs(junk,rs.default = "Y", fmkl.default = "Y",
rs.leap=3, fmkl.leap=3,rs.init = c(-1.5, 1.5), fmkl.init = c(-0.25, 1.5),
no.c.rs=50,no.c.fmkl=50)
opar <- par()
par(mfrow=c(2,2))
# Plot the entire data range
fun.plot.many.gld(result.hs,junk,"x","density","",
legd=c("RPRS.hs", "RMFMKL.hs"))
# Plot and compare parts of the distributions
fun.plot.many.gld(result.hs,c(1,2),"x","density","",
legd=c("RPRS.hs", "RMFMKL.hs"))
fun.plot.many.gld(result.hs,c(0.1,0.2),"x","density","",
legd=c("RPRS.hs", "RMFMKL.hs"))
fun.plot.many.gld(result.hs,c(3,4),"x","density","",
legd=c("RPRS.hs", "RMFMKL.hs"))
par(opar)
This function is of theoretical interest only.
fun.rawmoments(L1, L2, L3, L4, param = "fmkl")
L1 |
Location parameter of the generalised lambda distribution. |
L2 |
Scale parameter of the generalised lambda distribution. |
L3 |
First shape parameter of the generalised lambda distribution. |
L4 |
Second shape parameter of the generalised lambda distribution. |
param |
|
This function is the building block for fun.theo.bi.mv.gld.
A vector showing the raw moments of the specified generalised lambda distribution up to the fourth order.
Steve Su
Freimer, M., Mudholkar, G. S., Kollia, G. & Lin, C. T. (1988), A study of the generalized tukey lambda family, Communications in Statistics - Theory and Methods *17*, 3547-3567.
Karian, Zaven A. and Dudewicz, Edward J. (2000), Fitting statistical distributions: the Generalized Lambda Distribution and Generalized Bootstrap methods, Chapman & Hall
Ramberg, J. S. & Schmeiser, B. W. (1974), An approximate method for generating asymmetric random variables, Communications of the ACM *17*, 78-82.
## Generate some random numbers using FMKL and RS generalised lambda
## distributions and then compute the empirical and theoretical
## E(X), E(X^2), E(X^3), E(X^4)
junk<-rgl(100000,1,2,3,4)
mean(junk)
mean(junk^2)
mean(junk^3)
mean(junk^4)
junk<-rgl(100000,1,2,3,4,"rs")
mean(junk)
mean(junk^2)
mean(junk^3)
mean(junk^4)
fun.rawmoments(1,2,3,4)
fun.rawmoments(1,2,3,4,"rs")
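Since the output lists the raw moments E(X), E(X^2), E(X^3), E(X^4) in that order (as in the example comment above), central moments follow from the usual identities; a hedged sketch:
# Variance from raw moments: Var(X) = E(X^2) - E(X)^2
# (assumes the first two elements of the output are E(X) and E(X^2)).
rm.fmkl<-fun.rawmoments(1,2,3,4)
rm.fmkl[2]-rm.fmkl[1]^2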
This function fits the FMKL generalised lambda distribution to data using a discretised approach with weights. It is designed to act as a smoothing device rather than as a definitive fit.
fun.RMFMKL.hs(data, default = "Y", fmkl.init = c(-0.25, 1.5), no.c.fmkl = 50, leap = 3,FUN="runif.sobol",no=10000)
data |
Dataset to be fitted |
default |
If yes, this function uses the default method
|
fmkl.init |
Initial values for FMKL distribution optimization,
|
no.c.fmkl |
Number of classes or bins of histogram to be optimized over.
This argument is ineffective if |
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
This function minimises the deviations between the bin frequencies of the data and those of the theoretical distribution, so it has the effect of "fitting clothes" onto the data set. The user can decide the frequency of the bins they want the distribution to smooth over. The resulting fit may or may not be adequate from a formal statistical point of view (for example, in terms of goodness of fit), but it can be useful to suggest the range of different distributions exhibited by the data set. The default number of classes calculates the mean and variance after categorising the data into different bins and uses the number of classes that best matches the mean and variance of the original, ungrouped data. The weighting is designed to accentuate the peak or the dense part of the distribution and suppress the tails.
A vector representing the four parameters of the FMKL generalised lambda distribution.
In some cases the resulting fit may not converge; there is currently no checking mechanism in place to ensure global convergence.
Steve Su
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
fun.RMFMKL.hs.nw, fun.RPRS.hs.nw, fun.RPRS.hs, fun.data.fit.hs, fun.data.fit.hs.nw
# Using the default number of classes
fun.RMFMKL.hs(data=rnorm(1000,3,2),default="Y",fmkl.init=c(-0.25,1.5),leap=3)
# Using 20 classes
fun.RMFMKL.hs(data=rnorm(1000,3,2),default="N",fmkl.init=c(-0.25,1.5),
no.c.fmkl=20,leap=3)
This function fits the FMKL generalised lambda distribution to data using a discretised approach without weights. It is designed to act as a smoothing device rather than as a definitive fit.
fun.RMFMKL.hs.nw(data, default = "Y", fmkl.init = c(-0.25, 1.5), no.c.fmkl = 50, leap = 3,FUN="runif.sobol",no=10000)
data |
Dataset to be fitted |
default |
If yes, this function uses the default method
|
fmkl.init |
Initial values for FMKL distribution optimization,
|
no.c.fmkl |
Number of classes or bins of histogram to be optimized over.
This argument is ineffective if |
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
This function minimises the deviations between the bin frequencies of the data and those of the theoretical distribution, so it has the effect of "fitting clothes" onto the data set. The user can decide the frequency of the bins they want the distribution to smooth over. The resulting fit may or may not be adequate from a formal statistical point of view (for example, in terms of goodness of fit), but it can be useful to suggest the range of different distributions exhibited by the data set. The default number of classes calculates the mean and variance after categorising the data into different bins and uses the number of classes that best matches the mean and variance of the original, ungrouped data.
A vector representing four parameters of the FMKL generalised lambda distribution.
In some cases the resulting fit may not converge; there is currently no checking mechanism in place to ensure global convergence.
Steve Su
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
fun.RPRS.hs.nw, fun.RMFMKL.hs, fun.RPRS.hs, fun.data.fit.hs, fun.data.fit.hs.nw
# Using the default number of classes
fun.RMFMKL.hs.nw(data=rnorm(1000,3,2),default="Y",
fmkl.init=c(-0.25,1.5),leap=3)
# Using 20 classes
fun.RMFMKL.hs.nw(data=rnorm(1000,6,5),default="N",fmkl.init=c(-0.25,1.5),
no.c.fmkl=20,leap=3)
This function fits FMKL generalised lambda distribution to data set using L moment matching
fun.RMFMKL.lm(data, fmkl.init = c(-0.25, 1.5), leap = 3, FUN = "runif.sobol", no = 10000)
data |
Dataset to be fitted |
fmkl.init |
Initial values for FMKL distribution optimization,
|
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
This function provides an L moment matching scheme for the FMKL GLD. Note this function can fail if there are no defined percentiles from the data set or if the initial values do not lead to a valid FMKL generalised lambda distribution.
This function is based on scheme detailed in the literature below but it has been modified by the author (Steve Su).
A vector representing the four parameters of the FMKL generalised lambda distribution.
Steve Su
Asquith, W. (2007). "L-moments and TL-moments of the generalized lambda distribution." Computational Statistics and Data Analysis 51(9): 4484-4496.
Karvanen, J. and A. Nuutinen (2008). "Characterizing the generalized lambda distribution by L-moments." Computational Statistics and Data Analysis 52(4): 1971-1983.
fun.RMFMKL.ml, fun.RMFMKL.mm, fun.RMFMKL.qs, fun.data.fit.ml, fun.data.fit.lm, fun.data.fit.qs, fun.data.fit.mm
# Fitting the normal distribution
fun.RMFMKL.lm(data=rnorm(1000,2,3),fmkl.init=c(-0.25,1.5),leap=3)
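A hedged check of the L moment fit: feed the fitted parameter vector back into fun.lm.theo.gld (which, per its documentation earlier in this manual, accepts the full lambda vector as its first argument) to recover the theoretical L moments implied by the fit.
# Fit by L moment matching, then compute the theoretical L moments of the fit.
junk<-rnorm(1000,2,3)
fit.lm<-fun.RMFMKL.lm(junk,fmkl.init=c(-0.25,1.5),leap=3)
fun.lm.theo.gld(fit.lm,param="fmkl")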
This function fits FMKL generalised lambda distribution to data set using maximum likelihood estimation.
fun.RMFMKL.ml(data, fmkl.init = c(-0.25, 1.5), leap = 3,FUN="runif.sobol", no=10000)
data |
Dataset to be fitted |
fmkl.init |
Initial values for FMKL distribution optimization,
|
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
This function provides one of the definitive fits to the data set using generalised lambda distributions.
A vector representing four parameters of the FMKL generalised lambda distribution.
Steve Su
Su, S. (2007). Numerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions. Journal of Computational statistics and data analysis 51(8) 3983-3998.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
fun.RPRS.ml, fun.data.fit.ml, fun.data.fit.qs
# Fitting the normal distribution
fun.RMFMKL.ml(data=rnorm(1000,2,3),fmkl.init=c(-0.25,1.5),leap=3)
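The maximum likelihood fit can also be inspected visually with fun.plot.fit (documented earlier in this manual), which accepts single fits; a minimal sketch:
# Fit by maximum likelihood and overlay the fitted FMKL GLD on a histogram of the data.
junk<-rnorm(1000,2,3)
fit.ml<-fun.RMFMKL.ml(junk,fmkl.init=c(-0.25,1.5),leap=3)
fun.plot.fit(fit.ml,junk,xlab="x",name="RMFMKL.ML",param.vec=c("fmkl"))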
This function fits the FMKL generalised lambda distribution to a data set using maximum likelihood estimation, with a faster implementation through C programming.
fun.RMFMKL.ml.m(data, fmkl.init = c(-0.25, 1.5), leap = 3, FUN = "runif.sobol", no = 10000)
data |
Dataset to be fitted |
fmkl.init |
Initial values for FMKL distribution optimization,
|
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
This function provides one of the definitive fits to the data set using generalised lambda distributions.
A vector representing four parameters of the FMKL generalised lambda distribution.
Steve Su
Su, S. (2007). Numerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions. Journal of Computational statistics and data analysis 51(8) 3983-3998.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
# Fitting the normal distribution
fun.RMFMKL.ml.m(data=rnorm(1000,2,3),fmkl.init=c(-0.25,1.5),leap=3)
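Since this routine is described as a faster C implementation of fun.RMFMKL.ml, a hedged sketch of a simple timing comparison on the same simulated data:
# Compare run times of the R-based and C-based maximum likelihood fits.
junk<-rnorm(1000,2,3)
system.time(fun.RMFMKL.ml(junk,fmkl.init=c(-0.25,1.5),leap=3))
system.time(fun.RMFMKL.ml.m(junk,fmkl.init=c(-0.25,1.5),leap=3))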
This function fits FMKL generalised lambda distribution to data set using moment matching
fun.RMFMKL.mm(data, fmkl.init = c(-0.25, 1.5), leap = 3, FUN = "runif.sobol", no = 10000)
data |
Dataset to be fitted |
fmkl.init |
Initial values for FMKL distribution optimization,
|
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
This function provides a method of moments fitting scheme for the FMKL GLD. Note this function can fail if there are no defined percentiles from the data set or if the initial values do not lead to a valid FMKL generalised lambda distribution.
This function is based on scheme detailed in the literature below but it has been modified by the author (Steve Su).
A vector representing the four parameters of the FMKL generalised lambda distribution.
Steve Su
Karian, Z. and E. Dudewicz (2000). Fitting Statistical Distributions: The Generalized Lambda Distribution and Generalised Bootstrap Methods. New York, Chapman and Hall.
fun.RMFMKL.ml, fun.RMFMKL.lm, fun.RMFMKL.qs, fun.data.fit.ml, fun.data.fit.lm, fun.data.fit.qs, fun.data.fit.mm
# Fitting the normal distribution
fun.RMFMKL.mm(data=rnorm(1000,2,3),fmkl.init=c(-0.25,1.5),leap=3)
This function fits FMKL generalised lambda distribution to data set using quantile matching
fun.RMFMKL.qs(data, fmkl.init = c(-0.25, 1.5), leap = 3, FUN = "runif.sobol", trial.n = 100, len = 1000, type = 7, no = 10000)
data |
Dataset to be fitted |
fmkl.init |
Initial values for FMKL distribution optimization,
|
leap |
See scrambling argument in |
FUN |
A character string of either |
trial.n |
Number of evenly spaced quantile ranging from 0 to 1 to be
used in the checking phase, to find the best set of initial values for
optimisation, this is intended to be lower than |
len |
Number of evenly spaced quantile ranging from 0 to 1 to be used, default is 1000 |
type |
Type of quantile to be used, default is 7, see |
no |
Number of initial random values to find the best initial values for optimisation. |
This function provides a quantile matching scheme for the FMKL GLD. Note this function can fail if there are no defined percentiles from the data set or if the initial values do not lead to a valid FMKL generalised lambda distribution.
A vector representing the four parameters of the FMKL generalised lambda distribution.
Steve Su
Su (2008). Fitting GLD to data via quantile matching method. (Book chapter to appear)
fun.RMFMKL.ml, fun.RMFMKL.lm, fun.RMFMKL.mm, fun.data.fit.ml, fun.data.fit.lm, fun.data.fit.qs, fun.data.fit.mm
# Fitting the normal distribution
fun.RMFMKL.qs(data=rnorm(1000,2,3),fmkl.init=c(-0.25,1.5),leap=3)
This function fits the RS generalised lambda distribution to data using a discretised approach with weights. It is designed to act as a smoothing device rather than as a definitive fit.
fun.RPRS.hs(data, default = "Y", rs.init = c(-1.5, 1.5), no.c.rs = 50, leap = 3,FUN="runif.sobol",no=10000)
data |
Dataset to be fitted |
default |
If yes, this function uses the default method
|
rs.init |
Initial values for RS distribution optimization,
|
no.c.rs |
Number of classes or bins of histogram to be optimized over.
This argument is ineffective if |
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
This function minimises the deviations between the bin frequencies of the data and those of the theoretical distribution, so it has the effect of "fitting clothes" onto the data set. The user can decide the frequency of the bins they want the distribution to smooth over. The resulting fit may or may not be adequate from a formal statistical point of view (for example, in terms of goodness of fit), but it can be useful to suggest the range of different distributions exhibited by the data set. The default number of classes calculates the mean and variance after categorising the data into different bins and uses the number of classes that best matches the mean and variance of the original, ungrouped data. The weighting is designed to accentuate the peak or the dense part of the distribution and suppress the tails.
A vector representing four parameters of the RS generalised lambda distribution.
In some cases the resulting fit may not converge; there is currently no checking mechanism in place to ensure global convergence. The RPRS method can sometimes fail if there are no valid percentiles in the data set or if the initial values do not give a valid distribution.
Steve Su
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
fun.RPRS.hs.nw, fun.RMFMKL.hs.nw, fun.RMFMKL.hs, fun.data.fit.hs, fun.data.fit.hs.nw
# Using the default number of classes
fun.RPRS.hs(data=rnorm(1000,2,3),default="Y",rs.init=c(-1.5,1.5),leap=3)
# Using 20 classes
fun.RPRS.hs(data=rnorm(1000,2,3),default="N",rs.init=c(-1.5,1.5),
no.c.rs=20,leap=3)
This function fits the RS generalised lambda distribution to data using a discretised approach without weights. It is designed to act as a smoothing device rather than as a definitive fit.
fun.RPRS.hs.nw(data, default = "Y", rs.init = c(-1.5, 1.5), no.c.rs = 50, leap = 3,FUN="runif.sobol",no=10000)
data |
Dataset to be fitted |
default |
If yes, this function uses the default method
|
rs.init |
Initial values for RS distribution optimization,
|
no.c.rs |
Number of classes or bins of histogram to be optimized over.
This argument is ineffective if |
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
This function minimises the deviations between the bin frequencies of the data and those of the theoretical distribution, so it has the effect of "fitting clothes" onto the data set. The user can decide the frequency of the bins they want the distribution to smooth over. The resulting fit may or may not be adequate from a formal statistical point of view (for example, in terms of goodness of fit), but it can be useful to suggest the range of different distributions exhibited by the data set. The default number of classes calculates the mean and variance after categorising the data into different bins and uses the number of classes that best matches the mean and variance of the original, ungrouped data.
A vector representing four parameters of the RS generalised lambda distribution.
In some cases the resulting fit may not converge; there is currently no checking mechanism in place to ensure global convergence. The RPRS method can sometimes fail if there are no valid percentiles in the data set or if the initial values do not give a valid distribution.
Steve Su
Su, S. (2005). A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data. Journal of Modern Applied Statistical Methods (November): 408-424.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
fun.RPRS.hs.nw, fun.RPRS.hs, fun.RMFMKL.hs, fun.data.fit.hs, fun.data.fit.hs.nw
# Using the default number of classes
fun.RPRS.hs.nw(data=rnorm(1000,3,2),default="Y",rs.init=c(-1.5,1.5),leap=3)
# Using 20 classes
fun.RPRS.hs.nw(data=rnorm(1000,3,2),default="N",rs.init=c(-1.5,1.5),
no.c.rs=20,leap=3)
This function fits RS generalised lambda distribution to data set using L moment matching
fun.RPRS.lm(data, rs.init = c(-1.5, 1.5), leap = 3, FUN = "runif.sobol", no = 10000)
data |
Dataset to be fitted |
rs.init |
Initial values for RS distribution optimization,
|
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
This function provides an L moment matching scheme for the RS GLD. Note this function can fail if there are no defined percentiles from the data set or if the initial values do not lead to a valid RS generalised lambda distribution.
This function is based on scheme detailed in the literature below but it has been modified by the author (Steve Su).
A vector representing four parameters of the RS generalised lambda distribution.
Steve Su
Asquith, W. (2007). "L-moments and TL-moments of the generalized lambda distribution." Computational Statistics and Data Analysis 51(9): 4484-4496.
Karvanen, J. and A. Nuutinen (2008). "Characterizing the generalized lambda distribution by L-moments." Computational Statistics and Data Analysis 52(4): 1971-1983.
fun.RPRS.ml, fun.RPRS.mm, fun.RPRS.qs, fun.data.fit.ml, fun.data.fit.lm, fun.data.fit.qs, fun.data.fit.mm
# Fitting the normal distribution
fun.RPRS.lm(data=rnorm(1000,2,3),rs.init=c(-1.5,1.5),leap=3)
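As with the FMKL version, a hedged check is to feed the fitted RS parameter vector back into fun.lm.theo.gld to obtain the theoretical L moments implied by the fit.
# Fit by L moment matching, then compute the theoretical L moments of the RS fit.
junk<-rnorm(1000,2,3)
fit.lm<-fun.RPRS.lm(junk,rs.init=c(-1.5,1.5),leap=3)
fun.lm.theo.gld(fit.lm,param="rs")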
This function fits RS generalised lambda distribution to data set using maximum likelihood estimation.
fun.RPRS.ml(data, rs.init = c(-1.5, 1.5), leap = 3,FUN="runif.sobol",no=10000)
data |
Dataset to be fitted |
rs.init |
Initial values for RS distribution optimization,
|
leap |
See scrambling argument in |
FUN |
A character string of either |
no |
Number of initial random values to find the best initial values for optimisation. |
This function provides one of the definitive fits to the data set using generalised lambda distributions. Note this function can fail if there are no defined percentiles from the data set or if the initial values do not lead to a valid RS generalised lambda distribution.
A vector representing four parameters of the RS generalised lambda distribution.
Steve Su
Su, S. (2007). Numerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions. Computational statistics and data analysis 51(8) 3983-3998.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
fun.RMFMKL.ml, fun.data.fit.ml, fun.data.fit.qs
# Fitting the normal distribution
fun.RPRS.ml(data=rnorm(1000,2,3),rs.init=c(-1.5,1.5),leap=3)
This function fits the RS generalised lambda distribution to a data set using maximum likelihood estimation, with a faster implementation through C programming.
fun.RPRS.ml.m(data, rs.init = c(-1.5, 1.5), leap = 3, FUN = "runif.sobol", no = 10000)
data: Dataset to be fitted.
rs.init: Initial values for the RS distribution optimisation; the default is c(-1.5, 1.5).
leap: See the scrambling argument of runif.sobol and runif.halton.
FUN: A character string, either "runif.sobol" (the default) or "runif.halton".
no: Number of initial random values used to find the best initial values for the optimisation.
This function provides one of the definitive fits to the data set using generalised lambda distributions. Note that this function can fail if there are no defined percentiles from the data set or if the initial values do not lead to a valid RS generalised lambda distribution.
A vector representing four parameters of the RS generalised lambda distribution.
Steve Su
Su, S. (2007). Numerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions. Computational statistics and data analysis 51(8) 3983-3998.
Su (2007). Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R. Journal of Statistical Software: *21* 9.
# Fitting the normal distribution
fun.RPRS.ml.m(data=rnorm(1000,2,3),rs.init=c(-1.5,1.5),leap=3)
This function fits the RS generalised lambda distribution to a data set using moment matching.
fun.RPRS.mm(data, rs.init = c(-1.5, 1.5), leap = 3, FUN = "runif.sobol", no = 10000)
data: Dataset to be fitted.
rs.init: Initial values for the RS distribution optimisation; the default is c(-1.5, 1.5).
leap: See the scrambling argument of runif.sobol and runif.halton.
FUN: A character string, either "runif.sobol" (the default) or "runif.halton".
no: Number of initial random values used to find the best initial values for the optimisation.
This function provides a method of moments fitting scheme for the RS GLD. Note that this function can fail if there are no defined percentiles from the data set or if the initial values do not lead to a valid RS generalised lambda distribution.
This function is based on the scheme detailed in the literature below, but it has been modified by the author (Steve Su).
A vector representing four parameters of the RS generalised lambda distribution.
Steve Su
Karian, Z. and E. Dudewicz (2000). Fitting Statistical Distributions: The Generalized Lambda Distribution and Generalised Bootstrap Methods. New York, Chapman and Hall.
fun.RPRS.ml, fun.RPRS.lm, fun.RPRS.qs, fun.data.fit.ml, fun.data.fit.lm, fun.data.fit.qs, fun.data.fit.mm
# Fitting the normal distribution
fun.RPRS.mm(data=rnorm(1000,2,3),rs.init=c(-1.5,1.5),leap=3)
This function fits the RS generalised lambda distribution to a data set using quantile matching.
fun.RPRS.qs(data, rs.init = c(-1.5, 1.5), leap = 3, FUN = "runif.sobol", trial.n = 100, len = 1000, type = 7, no = 10000)
data: Dataset to be fitted.
rs.init: Initial values for the RS distribution optimisation; the default is c(-1.5, 1.5).
leap: See the scrambling argument of runif.sobol and runif.halton.
FUN: A character string, either "runif.sobol" (the default) or "runif.halton".
trial.n: Number of evenly spaced quantiles ranging from 0 to 1 used in the checking phase to find the best set of initial values for the optimisation; this is intended to be lower than len. Default is 100.
len: Number of evenly spaced quantiles ranging from 0 to 1 to be used; default is 1000.
type: Type of quantile to be used; default is 7. See the quantile function in R.
no: Number of initial random values used to find the best initial values for the optimisation.
This function provides a quantile matching fitting scheme for the RS GLD. Note that this function can fail if there are no defined percentiles from the data set or if the initial values do not lead to a valid RS generalised lambda distribution.
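To see how well a quantile matching fit tracks the empirical quantiles, the fitted parameters can be fed to qgl and compared with the sample quantiles; a minimal sketch (assuming GLDEX is loaded) using the qgl function documented later in this manual.

# Compare empirical and fitted quantiles after quantile matching
set.seed(789)
x <- rnorm(1000, 2, 3)
fit.qs <- fun.RPRS.qs(data = x, rs.init = c(-1.5, 1.5), leap = 3)
probs <- c(0.05, 0.25, 0.5, 0.75, 0.95)
cbind(empirical = quantile(x, probs),
      fitted    = qgl(probs, fit.qs[1], fit.qs[2], fit.qs[3], fit.qs[4], param = "rs"))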
A vector representing four parameters of the RS generalised lambda distribution.
Steve Su
Su (2008). Fitting GLD to data via quantile matching method. (Book chapter to appear)
fun.RPRS.ml, fun.RPRS.lm, fun.RPRS.mm, fun.data.fit.ml, fun.data.fit.lm, fun.data.fit.qs, fun.data.fit.mm
# Fitting the normal distribution
fun.RPRS.qs(data=rnorm(1000,2,3),rs.init=c(-1.5,1.5),leap=3)
This function allows the user to simulate observations from a mixture of two generalised lambda distributions. It can be very useful for sensitivity analysis.
fun.simu.bimodal(result1, result2, prop1, prop2, len = 1000, no.test = 1000, param1, param2)
result1: A vector comprising four parameter values for the first generalised lambda distribution.
result2: A vector comprising four parameter values for the second generalised lambda distribution.
prop1: Proportion of the first generalised lambda distribution.
prop2: 1 - prop1; this can be left unspecified.
len: Number of observations generated in each simulation run.
no.test: Number of simulation runs.
param1: Parameterisation of the first distribution, "rs" or "fmkl".
param2: Parameterisation of the second distribution, "rs" or "fmkl".
len specifies how many observations are generated in each simulation run, and no.test governs the number of simulation runs.
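For a sensitivity analysis one typically summarises each simulated sample and compares the resulting sampling distribution with the corresponding theoretical value; a minimal sketch (assuming GLDEX is loaded), using the fun.theo.bi.mv.gld function documented below.

# Sampling distribution of the mean from a mixture of two FMKL GLDs
sim <- fun.simu.bimodal(c(1,2,3,4), c(4,3,2,1), prop1 = 0.5,
                        param1 = "fmkl", param2 = "fmkl",
                        len = 1000, no.test = 200)
# Mean of each simulated run
run.means <- sapply(sim, mean)
summary(run.means)
# Theoretical moments of the mixture (first element is the mean) for comparison
fun.theo.bi.mv.gld(1, 2, 3, 4, "fmkl", 4, 3, 2, 1, "fmkl", p1 = 0.5)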
A list with length equal to the number of simulation runs. Each element of the list contains the number of random observations specified in len.
Steve Su
fun.theo.bi.mv.gld, fun.moments.bimodal, fun.rawmoments
# Generate random observations from FMKL generalised lambda distributions with
# parameters (1,2,3,4) and (4,3,2,1) with 50% of data from each distribution.
junk<-fun.simu.bimodal(c(1,2,3,4),c(4,3,2,1),prop1=0.5,param1="fmkl",param2="fmkl")
# Calculate the maximum number from each simulation run
sapply(junk,max)
# Calculate the median from each simulation run
sapply(junk,median)
This is the bimodal counterpart for fun.comp.moments.ml.2 and fun.comp.moments.ml.
fun.theo.bi.mv.gld(L1, L2, L3, L4, param1, M1, M2, M3, M4, param2, p1, normalise="N")
L1: Location parameter of the first generalised lambda distribution. Alternatively, all the parameters of the mixture distribution in the form c(L1,L2,L3,L4,M1,M2,M3,M4,p); you must still specify param1 and param2.
L2: Scale parameter of the first generalised lambda distribution.
L3: First shape parameter of the first generalised lambda distribution.
L4: Second shape parameter of the first generalised lambda distribution.
param1: Parameterisation of the first distribution, "rs" or "fmkl".
M1: Location parameter of the second generalised lambda distribution.
M2: Scale parameter of the second generalised lambda distribution.
M3: First shape parameter of the second generalised lambda distribution.
M4: Second shape parameter of the second generalised lambda distribution.
param2: Parameterisation of the second distribution, "rs" or "fmkl".
p1: Proportion of the first generalised lambda distribution.
normalise: "Y" if you want kurtosis to be calculated with reference to kurtosis = 0 under the Normal distribution.
A vector showing the theoretical mean, variance, skewness and kurtosis for a mixture of two generalised lambda distributions.
The theoretical moments may not always exist for generalised lambda distributions.
Steve Su
Freimer, M., Mudholkar, G. S., Kollia, G. & Lin, C. T. (1988), A study of the generalized tukey lambda family, Communications in Statistics - Theory and Methods *17*, 3547-3567.
Karian, Zaven A. and Dudewicz, Edward J. (2000), Fitting statistical distributions: the Generalized Lambda Distribution and Generalized Bootstrap methods, Chapman & Hall
Ramberg, J. S. & Schmeiser, B. W. (1974), An approximate method for generating asymmetric random variables, Communications of the ACM *17*, 78-82.
fun.moments.bimodal, fun.simu.bimodal, fun.rawmoments
# Fits the Old Faithful geyser data (first column) using the maximum likelihood.
fit1<-fun.auto.bimodal.ml(faithful[,1],init1.sel="rmfmkl",init2.sel="rmfmkl",
init1=c(-0.25,1.5),init2=c(-0.25,1.5),leap1=3,leap2=3)
# Find the theoretical moments of the fit
fun.theo.bi.mv.gld(fit1$par[1],fit1$par[2],fit1$par[3],fit1$par[4],"fmkl",
fit1$par[5],fit1$par[6],fit1$par[7],fit1$par[8],"fmkl",fit1$par[9])
# Compare this with the empirical moments from the data set.
fun.moments(faithful[,1])
Computes the "mean","variance","skewness","kurtosis" statistics from a given generalised lambda distribution.
fun.theo.mv.gld(L1, L2, L3, L4, param, normalise="N")
L1: Lambda 1, or c(Lambda 1, Lambda 2, Lambda 3, Lambda 4).
L2: Lambda 2.
L3: Lambda 3.
L4: Lambda 4.
param: "rs", "fmkl" or "fkml".
normalise: "Y" if you want kurtosis to be calculated with reference to kurtosis = 0 under the Normal distribution.
A vector listing the values of mean, variance, skewness and kurtosis. Sometimes the theoretical moments may not exist; in those cases, NA is returned.
Steve Su
Freimer, M., Mudholkar, G. S., Kollia, G. & Lin, C. T. (1988), A study of the generalized tukey lambda family, Communications in Statistics - Theory and Methods *17*, 3547-3567.
Gilchrist, Warren G. (2000), Statistical Modelling with Quantile Functions, Chapman & Hall
Karian, Z.A., Dudewicz, E.J., and McDonald, P. (1996), The extended generalized lambda distribution system for fitting distributions to data: history, completion of theory, tables, applications, the “Final Word” on Moment fits, Communications in Statistics - Simulation and Computation *25*, 611-642.
Karian, Zaven A. and Dudewicz, Edward J. (2000), Fitting statistical distributions: the Generalized Lambda Distribution and Generalized Bootstrap methods, Chapman & Hall
fun.comp.moments.ml, fun.comp.moments.ml.2, fun.lm.theo.gld
fun.theo.mv.gld(1, 2, 3, 4, "rs")
fun.theo.mv.gld(1, 2, 3, 4, "fmkl")
Returns an integer vector showing the position of zero values in the data.
fun.which.zero(data)
data |
A vector of data. |
An integer vector showing the position of zero values in the data.
Any missing values will be returned as missing.
Steve Su
# Finding where the zeros are in this vector: c(0,1,2,3,4,0,2)
fun.which.zero(c(0,1,2,3,4,0,2))
# Finding where the zeros are in this vector: c(0,1,2,3,NA,0,2)
fun.which.zero(c(0,1,2,3,NA,0,2))
This function returns a vector after removing all the zeros.
fun.zero.omit(object)
object |
A vector of data. |
Returns a vector after removing zeros, and also gives information on the number of zeros removed from the data.
Missing values and Inf values are not removed in this zero-removal process.
Steve Su
# Removing zero entries from the vector c(0,1,2,3,4,0,2)
fun.zero.omit(c(0,1,2,3,4,0,2))
An alternative to the gl.check.lambda function in the gld package.
gl.check.lambda.alt(l1, l2, l3, l4, param = "fmkl")
l1: Lambda 1.
l2: Lambda 2.
l3: Lambda 3.
l4: Lambda 4.
param: "rs" or "fmkl".
This version differs from gl.check.lambda in the gld library.
A logical value, TRUE or FALSE. TRUE indicates that the parameters given specify a valid probability distribution.
Steve Su
Freimer, M., Mudholkar, G. S., Kollia, G. & Lin, C. T. (1988), A study of the generalized tukey lambda family, Communications in Statistics - Theory and Methods *17*, 3547-3567.
Karian, Z.E., Dudewicz, E.J., and McDonald, P. (1996), The extended generalized lambda distribution system for fitting distributions to data: history, completion of theory, tables, applications, the “Final Word” on Moment fits, Communications in Statistics - Simulation and Computation *25*, 611-642.
Ramberg, J. S. & Schmeiser, B. W. (1974), An approximate method for generating asymmetric random variables, Communications of the ACM *17*, 78-82.
gl.check.lambda.alt(0,1,.23,4.5,param="fmkl")   ## TRUE
gl.check.lambda.alt(0,-1,.23,4.5,param="fmkl")  ## FALSE
gl.check.lambda.alt(0,1,0.5,-0.5,param="rs")    ## FALSE
A replacement for the gl.check.lambda function in the gld package.
gl.check.lambda.alt1(l1, l2 = NULL, l3 = NULL, l4 = NULL, param = "fmkl", vect = FALSE)
l1: Lambda 1, or a vector of all the parameter values if vect = TRUE.
l2: Lambda 2.
l3: Lambda 3.
l4: Lambda 4.
param: "rs" or "fmkl".
vect: A logical; set this to TRUE if the parameter values are given as a single vector in l1.
This is a modified gl.check.lambda function, provided in place of the gld library's gl.check.lambda function, to allow for five-parameter FMKL distributions and vector input of parameter values.
A logical value, TRUE or FALSE. TRUE indicates that the parameters given specify a valid probability distribution.
Steve Su
Freimer, M., Mudholkar, G. S., Kollia, G. & Lin, C. T. (1988), A study of the generalized tukey lambda family, Communications in Statistics - Theory and Methods *17*, 3547-3567.
Karian, Z.E., Dudewicz, E.J., and McDonald, P. (1996), The extended generalized lambda distribution system for fitting distributions to data: history, completion of theory, tables, applications, the “Final Word” on Moment fits, Communications in Statistics - Simulation and Computation *25*, 611-642.
Ramberg, J. S. & Schmeiser, B. W. (1974), An approximate method for generating asymmetric random variables, Communications of the ACM *17*, 78-82.
gl.check.lambda.alt1(c(0,1,.23,4.5),param="fmkl",vect=TRUE)  ## TRUE, using vector input of parameter values.
gl.check.lambda.alt1(0,-1,.23,4.5,param="fmkl")  ## FALSE
gl.check.lambda.alt1(0,1,0.5,-0.5,param="rs")    ## FALSE
Density, quantile density, distribution function, quantile function and random generation for the generalised lambda distribution (also known as the asymmetric lambda, or Tukey lambda). Works for both the "fmkl" and "rs" parameterisations. These functions originate from the gld library by Robert King and have been modified in this package to allow greater functionality and adaptability to new fitting methods. They do not give an error message for invalid distributions but return NAs instead. To allow comparability with the gld package, this package uses the same notation and description as those written by Robert King.
dgl(x, lambda1 = 0, lambda2 = NULL, lambda3 = NULL, lambda4 = NULL, param = "fmkl", inverse.eps = 1e-08, max.iterations = 500)
pgl(q, lambda1 = 0, lambda2 = NULL, lambda3 = NULL, lambda4 = NULL, param = "fmkl", inverse.eps = 1e-08, max.iterations = 500)
qgl(p, lambda1, lambda2 = NULL, lambda3 = NULL, lambda4 = NULL, param = "fmkl")
rgl(n, lambda1 = 0, lambda2 = NULL, lambda3 = NULL, lambda4 = NULL, param = "fmkl")
x: Vector of values (for dgl) at which the density is evaluated.
p: Vector of probabilities (for qgl).
q: Vector of quantiles (for pgl).
n: Number of observations to be generated (for rgl).
lambda1: This can be either a single numeric value or a vector. If it is a vector, it must contain all the parameters of the distribution (length 4 for the "fmkl" and "rs" parameterisations), in which case lambda2 to lambda4 are left unspecified.
lambda2: Scale parameter.
lambda3: First shape parameter.
lambda4: Second shape parameter.
param: "rs" or "fmkl".
inverse.eps: Accuracy of calculation for the numerical determination of F(x); defaults to 1e-8.
max.iterations: Maximum number of iterations in the numerical determination of F(x); defaults to 500.
The generalised lambda distribution, also known as the asymmetric lambda, or Tukey lambda distribution, is a distribution with a wide range of shapes. The distribution is defined by its quantile function, the inverse of the distribution function. This package implements the FMKL and RS parameterisations of the distribution. The default parameterisation (the FMKL) is that due to Freimer, Mudholkar, Kollia and Lin (1988) (see references below), with quantile function:

F^(-1)(u) = lambda1 + ( (u^lambda3 - 1)/lambda3 - ((1-u)^lambda4 - 1)/lambda4 ) / lambda2

for lambda2 > 0.

A second parameterisation, the RS, chosen by setting param="rs", is that due to Ramberg and Schmeiser (1974), with quantile function:

F^(-1)(u) = lambda1 + ( u^lambda3 - (1-u)^lambda4 ) / lambda2

This parameterisation has a complex series of rules determining which values of the parameters produce valid statistical distributions. See gl.check.lambda for details.
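As a quick numerical illustration of the FMKL quantile function above, the value returned by qgl can be reproduced directly from the formula; a minimal sketch with arbitrary parameter values.

# Check qgl against the FMKL quantile function written out by hand
u <- 0.5
l1 <- 0; l2 <- 1; l3 <- 0.2; l4 <- 0.2
manual <- l1 + ((u^l3 - 1)/l3 - ((1 - u)^l4 - 1)/l4)/l2
manual
qgl(u, l1, l2, l3, l4, param = "fmkl")  # should agree with 'manual'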
dgl: gives the density (based on the quantile density and a numerical inversion of the quantile function).
qdgl: gives the quantile density.
pgl: gives the distribution function (based on a numerical inversion of the quantile function).
qgl: gives the quantile function.
rgl: generates random observations.
Freimer, M., Kollia, G., Mudholkar, G. S. & Lin, C. T. (1988), A study of the generalized tukey lambda family, Communications in Statistics - Theory and Methods *17*, 3547-3567.
Gilchrist, Warren G. (2000), Statistical Modelling with Quantile Functions, Chapman & Hall
Karian, Z.A., Dudewicz, E.J., and McDonald, P. (1996), The extended generalized lambda distribution system for fitting distributions to data: history, completion of theory, tables, applications, the “Final Word” on Moment fits, Communications in Statistics - Simulation and Computation *25*, 611-642.
Karian, Zaven A. and Dudewicz, Edward J. (2000), Fitting statistical distributions: the Generalized Lambda Distribution and Generalized Bootstrap methods, Chapman & Hall
Ramberg, J. S. & Schmeiser, B. W. (1974), An approximate method for generating asymmetric random variables, Communications of the ACM *17*, 78-82.
qgl(seq(0,1,0.02),0,1,0.123,-4.3)
pgl(seq(-2,2,0.2),0,1,-.1,-.2,param="fmkl",inverse.eps=.Machine$double.eps)
The generic function histsu computes a histogram of the given data values.
histsu(x, breaks = "Sturges", freq = NULL, probability = !freq, include.lowest = TRUE, right = TRUE, density = NULL, angle = 45, col = NULL, border = NULL, main = paste("Histogram of", xname), xlim = range(breaks), ylim = NULL, xlab = xname, ylab, axes = TRUE, plot = TRUE, labels = FALSE, nclass = NULL, ...)
histsu(x, breaks = "Sturges", freq = NULL, probability = !freq, include.lowest = TRUE, right = TRUE, density = NULL, angle = 45, col = NULL, border = NULL, main = paste("Histogram of", xname), xlim = range(breaks), ylim = NULL, xlab = xname, ylab, axes = TRUE, plot = TRUE, labels = FALSE, nclass = NULL, ...)
x: A vector of values for which the histogram is desired.
breaks: Either: 1) a vector giving the breakpoints between histogram cells, 2) a single number giving the number of cells for the histogram, 3) a character string naming an algorithm to compute the number of cells (see Details), or 4) a function to compute the number of cells.
freq: Logical; if TRUE, the histogram represents frequencies (counts); if FALSE, probability densities are plotted.
probability: A logical value, the opposite of freq (i.e. !freq).
include.lowest: If TRUE, a value equal to the lowest (or highest, for right = FALSE) breaks value is included in the first (or last) cell.
right: If TRUE, the histogram cells are right-closed (left-open) intervals.
density: The density of shading lines, in lines per inch. The default value of NULL means that no shading lines are drawn.
angle: The slope of shading lines, given as an angle in degrees (counter-clockwise).
col: A colour to be used to fill the bars. The default of NULL yields unfilled bars.
border: The colour of the border around the bars. The default is to use the standard foreground colour.
main: Title of the graph.
xlim: A two valued vector specifying the lower and upper limits of the x axis.
ylim: A two valued vector specifying the lower and upper limits of the y axis.
xlab: X axis label.
ylab: Y axis label.
axes: Logical value; if TRUE, axes are drawn.
plot: Logical value; if TRUE, the histogram is plotted.
labels: Logical or character. Additionally draw labels on top of bars, if not FALSE.
nclass: Number of bins of the histogram.
...: Other graphical parameters, see par for details.
See the hist help file. This function forces the number of classes of the histogram to that specified by the user.
An object of class "histogram"
which is a list with components:
breaks: The n+1 cell boundaries (equal to breaks if that was a vector).
counts: n integers; for each cell, the number of x values falling within the cell.
density: Values of the estimated density. If all(diff(breaks) == 1), these are the relative frequencies counts/n.
intensities: Same as density, deprecated.
mids: The n cell midpoints.
xname: A character string with the actual x argument name.
equidist: Logical, indicating if the distances between breaks are all the same.
Please see the hist help file.
R development team with modifications by Steve Su
Venables, W. N. and Ripley. B. D. (2002) Modern Applied Statistics with S. Springer.
# See hist for extended example:
junk<-rgamma(1000,5)
# Forcing the number of bins to be 10:
histsu(junk,nclass=10)
This function works in a similar fashion to is.na and is.notinf.
is.inf(x)
x |
A numerical value or a vector of data. |
A logical vector: TRUE if the value is Inf or -Inf.
In the presence of missing values, the function will return a missing value.
Steve Su
is.inf(c(Inf,2,2,1,-Inf))
This function works in a similar fashion to is.na and is.inf.
is.notinf(x)
x |
A numerical value or a vector of data. |
A logical vector: TRUE if the value is not Inf or -Inf.
In the presence of missing values, the function will return a missing value.
Steve Su
is.notinf(c(Inf,2,2,1,-Inf))
Performs one or two sample Kolmogorov-Smirnov tests.
ks.gof(x, y, ..., alternative = c("two.sided", "less", "greater"), exact = NULL)
x: A numeric vector of data values.
y: Either a numeric vector of data values, or a character string naming a distribution function.
...: Parameters of the distribution specified (as a character string) by y.
alternative: Indicates the alternative hypothesis; must be one of "two.sided" (default), "less" or "greater". You can specify just the initial letter.
exact: NULL or a logical value indicating whether an exact p-value should be computed; see Details.
If y
is numeric, a two-sample test of the null hypothesis that
x
and y
were drawn from the same continuous distribution is
performed.
Alternatively, y
can be a character string naming a continuous
distribution function. In this case, a one-sample test is carried
out of the null that the distribution function which generated x
is distribution y
with parameters specified by "...".
The possible values "two.sided"
(default),"less"
and "greater"
of alternative
specify the null hypothesis that the
true distribution function of x
is equal to, not less than or not
greater than the hypothesized distribution function (one-sample case) or the
distribution function of y
(two-sample case),
respectively.
Exact p-values are not available for the one-sided two-sample case, or in the case of ties. With exact = NULL (the default), an exact p-value is computed if the sample size is less than 100 in the one-sample case, and if the product of the sample sizes is less than 10000 in the two-sample case. Otherwise, asymptotic distributions are used whose approximations may be inaccurate in small samples. In the one-sample two-sided case, exact p-values are obtained as described in Marsaglia, Tsang & Wang (2003). The formula of Birnbaum & Tingey (1951) is used for the one-sample one-sided case.
If a single-sample test is used, the parameters specified in "..." must be
pre-specified and not estimated from the data. There is some more refined
distribution theory for the KS test with estimated parameters
(see Durbin, 1973), but that is not implemented in ks.gof
.
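In the context of this package, ks.gof is often used as a rough goodness-of-fit check of a fitted GLD by naming pgl as the distribution function; a minimal sketch (assuming GLDEX is loaded), bearing in mind the caveat above that the parameters here are estimated from the data rather than pre-specified.

# Rough KS check of a maximum likelihood GLD fit
set.seed(100)
x <- rnorm(200)
fit <- fun.RPRS.ml(data = x, rs.init = c(-1.5, 1.5), leap = 3)
# One-sample test against the fitted RS GLD via its distribution function pgl
ks.gof(x, "pgl", lambda1 = fit[1], lambda2 = fit[2],
       lambda3 = fit[3], lambda4 = fit[4], param = "rs")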
A list with class "htest"
containing the following components:
statistic |
Value of test statistics. |
p.value |
P-value. |
alternative |
Character string describing the alternative hypothesis. |
method |
Character string indicating what type of test was performed. |
data.name |
Character string giving the name(s) of the data. |
This function handles ties by jittering: a very small uniform random number, generated between the minimum value of the data set divided by 1e+08 and the minimum value divided by 1e+07, is added to the data.
R
Z. W. Birnbaum & Fred H. Tingey (1951), One-sided confidence contours for probability distribution functions. The Annals of Mathematical Statistics, *22*/4, 592-596.
William J. Conover (1971), Practical nonparametric statistics. New York: John Wiley & Sons. Pages 295-301 (one-sample "Kolmogorov" test), 309-314 (two-sample "Smirnov" test).
Durbin, J. (1973) Distribution theory for tests based on the sample distribution function. SIAM.
George Marsaglia, Wai Wan Tsang & Jingbo Wang (2003), Evaluating Kolmogorov's distribution. Journal of Statistical Software, *8*/18. <URL: http://www.jstatsoft.org/v08/i18/>.
x <- rnorm(50)
y <- runif(30)
# Do x and y come from the same distribution?
ks.gof(x, y)
# Does x come from a shifted gamma distribution with shape 3 and rate 2?
ks.gof(x+2, "pgamma", 3, 2)  # two-sided, exact
ks.gof(x+2, "pgamma", 3, 2, exact = FALSE)
ks.gof(x+2, "pgamma", 3, 2, alternative = "gr")
Calculates sample L-moments, L-coefficients and covariance matrix of L-moments.
Lmoments(data,rmax=4,na.rm=FALSE,returnobject=FALSE,trim=c(0,0))
Lcoefs(data,rmax=4,na.rm=FALSE,trim=c(0,0))
Lmomcov(data,rmax=4,na.rm=FALSE)
Lmoments_calc(data,rmax=4)
Lmomcov_calc(data,rmax=4)
data |
matrix or data frame. |
rmax |
maximum order of L-moments. |
na.rm |
a logical value indicating whether 'NA' values should be removed before the computation proceeds. |
returnobject |
a logical value indicating whether a list object should be returned instead of an array of L-moments. |
trim |
c(0,0) for ordinary L-moments and c(1,1) for trimmed (t=1) L-moments |
Lmoments |
returns an array of L-moments containing a row for each variable in data, or if returnobject=TRUE, a list containing the following: |
lambdas |
an array of L-moments |
ratios |
an array of mean, L-scale and L-moment ratios |
trim |
the value of the parameter 'trim' |
source |
a string with value "Lmoments" or "t1lmoments" |
Lcoefs |
returns an array of L-coefficients (mean, L-scale, L-skewness, L-kurtosis, ...) |
Lmomcov |
returns the covariance matrix of L-moments or a list of covariance matrices if the input has multiple columns. |
Lmoments_calc |
is internal function. |
Lmomcov_calc |
is internal function. |
Functions Lmoments
and Lcoefs
calculate trimmed L-moments
if you specify trim=c(1,1)
.
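For example, ordinary and trimmed (t = 1) L-coefficients can be compared on the same sample; a minimal sketch using the calls documented above.

# Ordinary versus trimmed (t = 1) sample L-coefficients
x <- rnorm(500)
Lcoefs(x)                  # mean, L-scale, L-skewness, L-kurtosis
Lcoefs(x, trim = c(1, 1))  # trimmed counterparts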
Juha Karvanen <[email protected]>
Karvanen, J. and A. Nuutinen (2008). "Characterizing the generalized lambda distribution by L-moments." Computational Statistics and Data Analysis 52(4): 1971-1983.
Asquith, W. (2007). "L-moments and TL-moments of the generalized lambda distribution." Computational Statistics and Data Analysis 51(9): 4484-4496.
Elamir, E. A., Seheult, A. H. 2004. "Exact variance structure of sample L-moments" Journal of Statistical Planning and Inference 124 (2) 337-359.
Hosking, J. 1990. "L-moments: Analysis and estimation distributions using linear combinations of order statistics", Journal of Royal Statistical Society B 52, 105-124.
t1lmoments
for trimmed L-moments
x<-rnorm(500)
Lmoments(x)
Divides a range of values into equally spaced divisions; the end points are given as output.
pretty.su(x, nint = 5)
x |
A vector of values. |
nint |
Number of intervals required. |
This is also used for the plotting of the histogram in the histsu function.
Steve Su
# Generate random numbers from normal distribution:
junk<-rnorm(1000,2,3)
# Cut them into 7 regions, 8 endpoints.
pretty.su(junk,7)
This plots the theoretical and actual data quantiles to allow the user to examine the adequacy of a single gld distribution fit.
qqplot.gld(data, fit, param, len = 10000, name = "", corner = "topleft",type="",range=c(0,1),xlab="",main="")
data: Data fitted.
fit: Parameters of the distribution fit.
param: Can be either "rs" or "fmkl".
len: Precision of the quantile calculations. Default is 10000, meaning 10000 points are taken from 0 to 1.
name: Name of the data set, added to the default title of the plot when main is left blank.
corner: Position of the legend on the plot, e.g. "topleft"; default is "topleft".
type: This can be "" or "str.qqplot"; the first produces the raw quantiles and the second plots them on a straight line. Default is "".
range: The range of quantiles to be plotted. Default is c(0,1), i.e. the full range.
xlab: X axis label; if left blank, the default is "Data".
main: Title of the plot; if left blank, a default title will be added.
A plot is given.
Steve Su
set.seed(1000)
junk<-rweibull(300,3,2)
# Fit the function using fun.data.fit.ml
obj.fit1.ml<-fun.data.fit.ml(junk)
# Do a quantile plot on the raw quantiles
qqplot.gld(junk,obj.fit1.ml[,1],param="rs",name="RS ML fit")
# Or a qq plot to examine deviation from straight line
qqplot.gld(junk,obj.fit1.ml[,1],param="rs",name="RS ML fit",type="str.qqplot")
This plots the theoretical and actual data quantiles to allow the user to examine the adequacy of a mixture fit of two gld distributions.
qqplot.gld.bi(data, fit, param1, param2, len = 10000, name = "", corner = "topleft",type="",range=c(0,1),xlab="",main="")
data: Data fitted.
fit: Parameters of the distribution fit.
param1: Can be either "rs" or "fmkl", for the first distribution.
param2: Can be either "rs" or "fmkl", for the second distribution.
len: Precision of the quantile calculations. Default is 10000, meaning 10000 points are taken from 0 to 1.
name: Name of the data set, added to the default title of the plot when main is left blank.
corner: Position of the legend on the plot, e.g. "topleft"; default is "topleft".
type: This can be "" or "str.qqplot"; the first produces the raw quantiles and the second plots them on a straight line. Default is "".
range: The range of quantiles to be plotted. Default is c(0,1), i.e. the full range.
xlab: X axis label; if left blank, the default is "Data".
main: Title of the plot; if left blank, a default title will be added.
A plot is given.
Steve Su
set.seed(1000)
junk<-rweibull(300,3,2)
## Fitting mixture of generalised lambda distributions on the data set using
## both the maximum likelihood and partition maximum likelihood and plot the
## resulting fits
junk<-fun.auto.bimodal.ml(faithful[,1],per.of.mix=0.1,clustering.m=clara,
init1.sel="rprs",init2.sel="rmfmkl",init1=c(-1.5,1.5),init2=c(-0.25,1.5),
leap1=3,leap2=3)
fun.plot.fit.bm(nclass=50,fit.obj=junk,data=faithful[,1],
name="Maximum likelihood using",xlab="faithful1",param.vec=c("rs","fmkl"))
## Do a quantile plot on the raw quantiles
qqplot.gld.bi(faithful[,1],junk$par,param1="rs",param2="fmkl",
name="RS FMKL ML fit")
## Or a qq plot to examine deviation from straight line
qqplot.gld.bi(faithful[,1],junk$par,param1="rs",param2="fmkl",
name="RS FMKL ML fit",type="str.qqplot")
These functions provide quasi random numbers, or space filling or low discrepancy sequences, in the p-dimensional unit cube.
sHalton(n.max, n.min = 1, base = 2, leap = 1)
QUnif(n, min = 0, max = 1, n.min = 1, p, leap = 1)
n.max: Maximal (sequence) number.
n.min: Minimal sequence number.
n: Number of p-dimensional points to generate.
base: Integer, the base of the Halton sequence.
min, max: Lower and upper limits of the univariate intervals. Must be of length 1 or p.
p: Dimensionality of the space (the unit cube) in which points are generated.
leap: Integer indicating (if greater than 1) that only every leap-th number of the underlying sequence is used.
sHalton(n,m) returns a numeric vector of length n-m+1 of values in [0,1].
QUnif(n, min, max, n.min, p=p) generates n-n.min+1 p-dimensional points in [min,max]^p, returning a numeric matrix with p columns.
For leap, Kocis and Whiten recommend a few particular values, notably leap = 409 (as used in the example below), especially for dimensions up to 400.
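Low-discrepancy uniforms from QUnif can be pushed through a quantile function to obtain quasi-random variates from other distributions, which is essentially how the rnorm.halton and rnorm.sobol style generators in this package operate; a minimal sketch, assuming qnorm as the target quantile function.

# Quasi-random standard normal points in 2 dimensions via the Halton sequence
u <- QUnif(n = 512, p = 2, leap = 409)  # 512 points in [0,1]^2
z <- qnorm(u)                           # map each coordinate through the normal quantile function
colMeans(z)                             # both close to 0
apply(z, 2, sd)                         # both close to 1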
Martin Maechler
James Gentle (1998) Random Number Generation and Monte Carlo Simulation; sec. 6.3. Springer.
Kocis, L. and Whiten, W.J. (1997) Computational Investigations of Low-Discrepancy Sequences. ACM Transactions on Mathematical Software 23, 2, 266-294.
32*sHalton(20, base=2)
QUnif(n=10,p=2,leap=409)
Computes the sample skewness and kurtosis. This uses the S+ version directly.
skewness(x, na.rm = FALSE, method = "fisher")
kurtosis(x, na.rm = FALSE, method = "fisher")
x: Any numerical object. Missing values (NAs) are allowed.
na.rm: Logical flag: if na.rm=TRUE, missing values are removed from x before doing the computations. If na.rm=FALSE (the default) and x contains missing values, the result is NA.
method: Character string specifying the computation method. The two possible values are "fisher" (the default) and "moment".
The moment
forms are based on the definitions of skewness and kurtosis
for distributions; these forms should be used when resampling (bootstrap or
jackknife). The "fisher" forms correspond to the usual "unbiased" definition of
sample variance, though in the case of skewness and kurtosis exact unbiasedness
is not possible.
A single value of skewness or kurtosis.
If y = x - mean(x), then the "moment" method computes the skewness value as mean(y^3)/mean(y^2)^1.5 and the kurtosis value as mean(y^4)/mean(y^2)^2 - 3. To see the "fisher" calculations, print out the functions.
Splus
var
x <- runif(30)
skewness(x)
skewness(x, method="moment")
kurtosis(x)
kurtosis(x, method="moment")
Calculates estimates for the FMKL parameterisation of the generalised lambda
distribution on the basis of data, using the starship method.
The starship method is built on the fact that the
generalised lambda distribution
is a transformation of the uniform distribution. This method finds the
parameters that transform the data closest to the uniform distribution.
This function uses a grid-based search to find a suitable starting point (using
starship.adaptivegrid
) then uses optim
to find
the parameters that do this.
starship(data, optim.method = "Nelder-Mead", initgrid = NULL, param="FMKL", optim.control=NULL)
data: Data to be fitted, as a vector.
optim.method: Optimisation method passed to optim for the optimisation step; default is "Nelder-Mead".
initgrid: Grid of shape parameter values to try before the optimisation step. If it is left as NULL, a default grid is used; the default depends on the parameterisation.
param: Choose the parameterisation: "FMKL" (the default) or "rs".
optim.control: List of options for the optimisation step; see the control argument of optim.
The starship method is described in King and MacGillivray, 1999 (see references). It is built on the fact that the generalised lambda distribution is a transformation of the uniform distribution. Thus the inverse of this transformation is the distribution function for the gld. The starship method applies different values of the parameters of the distribution to the distribution function, calculates the depths q corresponding to the data and chooses the parameters that make the depths closest to a uniform distribution.
The closeness to the uniform is assessed by calculating the Anderson-Darling
goodness-of-fit test on the transformed data against the uniform, for a
sample of size length(data)
.
This is implemented in 2 stages in this function. First a grid search is
carried out, over a small number of possible parameter values
(see starship.adaptivegrid
for details). Then the minimum from
this search is given as a starting point for an optimisation of the
Anderson-Darling value using optim, with method given by optim.method.
See references for details on parameterisations.
Returns a list, with
lambda: A vector of length 4, giving the estimated parameters, in order, lambda1 (location), lambda2 (scale), lambda3 and lambda4 (shape).
grid.results: Output from the grid search; see starship.adaptivegrid.
optim: Output from the optim search; see optim.
Robert King, Darren Wraith
Freimer, M., Mudholkar, G. S., Kollia, G. & Lin, C. T. (1988), A study of the generalized tukey lambda family, Communications in Statistics - Theory and Methods 17, 3547–3567.
Ramberg, J. S. & Schmeiser, B. W. (1974), An approximate method for generating asymmetric random variables, Communications of the ACM 17, 78–82.
King, R.A.R. & MacGillivray, H. L. (1999), A starship method for
fitting the generalised distributions,
Australian and New Zealand Journal of
Statistics 41, 353–374
Owen, D. B. (1988), The starship, Communications in Statistics - Computation and Simulation 17, 315–323.
starship.adaptivegrid, starship.obj
data <- rgl(100,0,1,.2,.2)
starship(data,optim.method="Nelder-Mead",initgrid=list(lcvect=(0:4)/10,
ldvect=(0:4)/10))
Calculates estimates for the generalised lambda distribution on the basis of data, using the starship method. The starship method is built on the fact that the generalised lambda distribution is a transformation of the uniform distribution. This method finds the parameters that transform the data closest to the uniform distribution. This function uses a grid-based search.
starship.adaptivegrid(data, initgrid=list( lcvect = c(-1.5, -1, -0.5, -0.1, 0, 0.1, 0.2, 0.4, 0.8, 1, 1.5), ldvect = c(-1.5, -1, -0.5, -0.1, 0, 0.1, 0.2, 0.4, 0.8, 1, 1.5), levect = c(-0.5,-0.25,0,0.25,0.5)),param="FMKL")
data: Data to be fitted, as a vector.
initgrid: A list with elements lcvect, ldvect and levect, giving the grid of shape parameter values over which the search is carried out; see the defaults in the usage above.
param: Choose the parameterisation: "FMKL" (the default) or "rs".
The starship method is described in King and MacGillivray, 1999 (see references). It is built on the fact that the generalised lambda distribution is a transformation of the uniform distribution. Thus the inverse of this transformation is the distribution function for the gld. The starship method applies different values of the parameters of the distribution to the distribution function, calculates the depths q corresponding to the data and chooses the parameters that make the depths closest to a uniform distribution.
The closeness to the uniform is assessed by calculating the Anderson-Darling
goodness-of-fit test on the transformed data against the uniform, for a
sample of size length(data)
.
This function carries out a grid-based search. This was the original method
of King and MacGillivray, 1999, but you are advised to instead use
starship
which uses a grid-based search together with an
optimisation based search.
See references for details on parameterisations.
response: The minimum "response value", the result of the internal goodness-of-fit measure. This is the return value of starship.obj; see King and MacGillivray, 1999 for more details.
lambda: A vector of length 4 giving the values of lambda1 to lambda4 that produce this minimum response.
Robert King, Darren Wraith
Freimer, M., Mudholkar, G. S., Kollia, G. & Lin, C. T. (1988), A study of the generalized tukey lambda family, Communications in Statistics - Theory and Methods 17, 3547–3567.
Ramberg, J. S. & Schmeiser, B. W. (1974), An approximate method for generating asymmetric random variables, Communications of the ACM 17, 78–82.
King, R.A.R. & MacGillivray, H. L. (1999), A starship method for
fitting the generalised distributions,
Australian and New Zealand Journal of
Statistics 41, 353–374
Owen, D. B. (1988), The starship, Communications in Statistics - Computation and Simulation 17, 315–323.
data <- rgl(100,0,1,.2,.2)
starship.adaptivegrid(data,list(lcvect=(0:4)/10,ldvect=(0:4)/10))
The starship is a method for fitting the generalised lambda distribution.
See starship
for more details.
This function is the objective function minimised in the starship methods. It is a goodness-of-fit measure computed on the depths of the data.
starship.obj(par, data, param = "fmkl")
par: Parameters of the generalised lambda distribution, a vector of length 4 giving lambda1 to lambda4.
data: Data, as a vector.
param: Choose the parameterisation: "fmkl" (the default) or "rs".
The starship method is described in King and MacGillivray, 1999 (see references). It is built on the fact that the generalised lambda distribution is a transformation of the uniform distribution. Thus the inverse of this transformation is the distribution function for the gld. The starship method applies different values of the parameters of the distribution to the distribution function, calculates the depths q corresponding to the data and chooses the parameters that make the depths closest to a uniform distribution.
The closeness to the uniform is assessed by calculating the Anderson-Darling
goodness-of-fit test on the transformed data against the uniform, for a
sample of size length(data)
.
This function returns that objective function. It is provided as a separate function to allow users to carry out minimisations using optim or other methods. The recommended method is to use the starship function.
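Because starship.obj takes the parameter vector as its first argument, it can be handed directly to optim for a custom minimisation; a minimal sketch (the starting values are arbitrary).

# Minimise the starship objective directly with optim
data <- rgl(100, 0, 1, .2, .2)
fit <- optim(par = c(0, 1, 0.2, 0.2),  # starting values for lambda1..lambda4
             fn = starship.obj,
             data = data, param = "fmkl")
fit$par    # parameter estimates
fit$value  # Anderson-Darling objective at the minimum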
The Anderson-Darling goodness of fit measure, computed on the transformed data, compared to a uniform distribution. Note that this is NOT the goodness-of-fit measure of the generalised lambda distribution with the given parameter values to the data.
Robert King, Darren Wraith
Freimer, M., Mudholkar, G. S., Kollia, G. & Lin, C. T. (1988), A study of the generalized tukey lambda family, Communications in Statistics - Theory and Methods 17, 3547–3567.
Ramberg, J. S. & Schmeiser, B. W. (1974), An approximate method for generating asymmetric random variables, Communications of the ACM 17, 78–82.
King, R.A.R. & MacGillivray, H. L. (1999), A starship method for
fitting the generalised distributions,
Australian and New Zealand Journal of
Statistics 41, 353–374
Owen, D. B. (1988), The starship, Communications in Statistics - Computation and Simulation 17, 315–323.
starship, starship.adaptivegrid
data <- rgl(100,0,1,.2,.2)
starship.obj(c(0,1,.2,.2),data,"fmkl")
Calculates sample trimmed L-moments with trimming parameter 1.
t1lmoments(data,rmax=4)
data |
matrix or data frame. |
rmax |
maximum order of trimmed L-moments. |
array of trimmed L-moments (trimming parameter = 1) up to order 4 containing a row for each variable in data.
Functions Lmoments and Lcoefs calculate trimmed L-moments if you specify trim=c(1,1).
Juha Karvanen [email protected]
Karvanen, J. and A. Nuutinen (2008). "Characterizing the generalized lambda distribution by L-moments." Computational Statistics and Data Analysis 52(4): 1971-1983.
Asquith, W. (2007). "L-moments and TL-moments of the generalized lambda distribution." Computational Statistics and Data Analysis 51(9): 4484-4496.
Elamir, E. A., Seheult, A. H. 2004. "Exact variance structure of sample L-moments" Journal of Statistical Planning and Inference 124 (2) 337-359.
Hosking, J. 1990. "L-moments: Analysis and estimation distributions using linear combinations of order statistics", Journal of Royal Statistical Society B 52, 105-124.
Lmoments
for L-moments
x<-rnorm(500)
t1lmoments(x)
Returns a vector showing the position of missing values in a vector.
which.na(x)
x: An object, typically a numeric vector; missing values and NaNs are allowed.
This returns the indices of values in x
which are missing or
"Not a Number".
# A non-zero number divided by zero creates infinity,
# zero over zero creates a NaN
weird.values <- c(1/0, -2.9/0, 0/0, NA)
which.na(weird.values)
# Produces:
# [1] 3 4