The Complementary Generalized Transmuted Poisson-G Family of Distributions

We introduce a new class of continuous distributions called the complementary generalized transmuted Poisson-G family, which extends the transmuted class pioneered by Shaw and Buckley (2007). We provide some special models and derive general mathematical properties including quantile function, explicit expressions for the ordinary and incomplete moments, generating function, Rényi and Shannon entropies and order statistics. The estimation of the model parameters is performed by maximum likelihood. The flexibility of the new family is illustrated by means of two applications to real data sets.

Let $p(t)$ be the pdf of a random variable $T \in [a, b]$ for $-\infty \le a < b \le \infty$, and let $W[G(x)]$ be a function of the cdf of a random variable $X$ such that the following conditions hold: (i) $W[G(x)] \in [a, b]$; (ii) $W[G(x)]$ is differentiable and monotonically non-decreasing; and (iii) $W[G(x)] \to a$ as $x \to -\infty$ and $W[G(x)] \to b$ as $x \to \infty$. (3)
Recently, Alzaatreh, Lee, and Famoye (2013) defined the cdf of the T-X family of distributions by
$$F(x) = \int_a^{W[G(x)]} p(t)\,dt, \qquad (4)$$
where $W[G(x)]$ satisfies the conditions (3). The pdf corresponding to (4) is given by
$$f(x) = \left\{\frac{d}{dx} W[G(x)]\right\} p\{W[G(x)]\}.$$
In the CGTP-G family proposed here, $G(x;\xi)$ is the baseline cdf depending on a parameter vector $\xi$, $\bar{G}(x;\xi) = 1 - G(x;\xi)$, and $\theta > 0$ and $|\lambda| \le 1$ are two additional shape parameters. The CGTP-G family is a wider class of continuous distributions. It includes the transmuted-G (TG) family as the limiting case $\theta \to 0$. The main advantage of the new family is that it gives practitioners a quite flexible two-parameter generator to fit real data from several fields. We provide a comprehensive account of some of its mathematical properties.
The rest of the paper is organized as follows. In Section 2, we define the CGTP-G family. In Section 3, we present two special models and plots of their pdfs and hazard rate functions (hrfs). We give a very useful linear representation for the family density function in Section 4. In Section 5, we derive some of its general mathematical properties including asymptotics, ordinary and incomplete moments, quantile and generating functions, residual life and reversed residual life functions and entropies. In Section 6, we investigate the order statistics and their moments. Maximum likelihood estimation of the model parameters is addressed in Section 7. Simulation results to assess the performance of the maximum likelihood estimation method are reported in Section 8. In Section 9, we provide two applications to real data to illustrate the flexibility of the new family. Finally, we offer some concluding remarks in Section 10.
Further, let $Y_1$ and $Y_2$ be iid random variables with cdf $\Pi(x)$. Expanding the quantities $\exp[\theta G(x)]$ and $\exp[2\theta G(x)]$ in power series, applying the binomial expansion to $[1 - G(x)]^i$ and interchanging the sums over the indices $k$ and $i$, the pdf of the CGTP-G family reduces to
$$f(x) = \sum_{k=0}^{\infty} b_k\, h_{k+1}(x), \qquad (9)$$
where $h_{k+1}(x) = (k+1)\, g(x)\, G(x)^{k}$ is the Exp-G pdf of a random variable $Y_{k+1}$ with power parameter $k+1$ and the $b_k$ are constant coefficients depending on $\lambda$ and $\theta$. Equation (9) reveals that the CGTP-G density function is a linear combination of Exp-G densities. Thus, some mathematical properties of the new family can be derived from those of the Exp-G class.
By integrating (9), we obtain the same linear representation for the cdf of $X$,
$$F(x) = \sum_{k=0}^{\infty} b_k\, H_{k+1}(x),$$
where $H_{k+1}(x) = G(x)^{k+1}$ is the cdf of the Exp-G family with power parameter $k+1$.

Mathematical properties
The formulae derived throughout the paper can be easily handled in most symbolic computation platforms such as Maple, Mathematica and Matlab. These platforms currently have the ability to deal with analytic expressions of formidable size and complexity. Established explicit expressions to obtain some statistical measures can be more efficient than computing them directly by numerical integration.

Asymptotics
Let $a = \inf\{x \mid G(x) > 0\}$. The asymptotic forms of $F(x)$, $f(x)$ and the hrf $\tau(x)$ can be obtained as $x \to a$.

The asymptotics of $F(x)$, $f(x)$ and $\tau(x)$ can likewise be obtained as $x \to \infty$. These expressions show the effect of the parameters on the tails of the distribution.

Moments
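The $n$th ordinary moment of $X$, say $\mu'_n$, can be determined from (9) as
$$\mu'_n = E(X^n) = \sum_{k=0}^{\infty} b_k\, E(Y_{k+1}^n). \qquad (10)$$
Setting $n = 1$ in (10), we have the mean of $X$. The central moments ($\mu_s$) and cumulants ($\kappa_s$) of $X$ follow from (10) as
$$\mu_s = \sum_{j=0}^{s} (-1)^j \binom{s}{j} \mu_1'^{\,j}\, \mu'_{s-j} \quad \text{and} \quad \kappa_s = \mu'_s - \sum_{j=1}^{s-1} \binom{s-1}{j-1} \kappa_j\, \mu'_{s-j},$$
respectively, where $\kappa_1 = \mu'_1$. The skewness and kurtosis of $X$ are the third and fourth standardized cumulants given by $\gamma_1 = \kappa_3/\kappa_2^{3/2}$ and $\gamma_2 = \kappa_4/\kappa_2^{2}$, respectively.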
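As a small illustration of how the skewness and kurtosis follow from the ordinary moments, the sketch below converts the first four raw moments into cumulants using the standard moment-cumulant identities. The function names are hypothetical, and the raw moments are assumed to be supplied by the user (for example, from a truncated version of the series for the moments); nothing here is specific to the CGTP-G family.

```python
def cumulants_from_raw(m1, m2, m3, m4):
    """Return the first four cumulants from the first four raw moments."""
    k1 = m1
    k2 = m2 - m1 ** 2
    k3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    k4 = m4 - 4 * m3 * m1 - 3 * m2 ** 2 + 12 * m2 * m1 ** 2 - 6 * m1 ** 4
    return k1, k2, k3, k4


def skewness_kurtosis(m1, m2, m3, m4):
    """Standardized third and fourth cumulants (gamma_1, gamma_2)."""
    _, k2, k3, k4 = cumulants_from_raw(m1, m2, m3, m4)
    return k3 / k2 ** 1.5, k4 / k2 ** 2


# Check on a distribution with known answers: the unit exponential has
# raw moments n!, skewness 2 and fourth standardized cumulant 6.
gamma1, gamma2 = skewness_kurtosis(1.0, 2.0, 6.0, 24.0)
```

Note that $\gamma_2$ defined through cumulants is the excess kurtosis, so it equals zero for the normal distribution.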

Incomplete moments
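Here, we determine the $n$th incomplete moment of $X$ defined by $m_n(y) = \int_{-\infty}^{y} x^n f(x)\,dx$. From (9), we have
$$m_n(y) = \sum_{k=0}^{\infty} b_k\, m_{n,k+1}(y), \qquad (11)$$
where
$$m_{n,\delta}(y) = \int_{-\infty}^{y} x^n h_{\delta}(x)\,dx = \delta \int_0^{G(y;\xi)} Q_G(u;\xi)^n\, u^{\delta-1}\,du$$
and $Q_G(u;\xi)$ is the baseline qf. The integral $m_{n,\delta}(y)$ can be obtained analytically for special models with closed-form expressions for $Q_G(u;\xi)$, or at least evaluated numerically for most baseline distributions.
An important application of the first incomplete moment of $X$ in (11), say $m_1(y)$, refers to the Bonferroni and Lorenz curves. These curves are very useful in economics, reliability, demography, insurance and medicine.
For a given probability $\pi$, the Bonferroni and Lorenz curves are given by $B(\pi) = m_1(q)/(\pi \mu'_1)$ and $L(\pi) = m_1(q)/\mu'_1$, respectively, where $q = Q(\pi)$ is the qf of $X$ at $\pi$. Another application is related to the mean deviations about the mean ($\delta_1 = E(|X - \mu'_1|)$) and about the median ($\delta_2 = E(|X - M|)$) of $X$, given by $\delta_1 = 2\mu'_1 F(\mu'_1) - 2 m_1(\mu'_1)$ and $\delta_2 = \mu'_1 - 2 m_1(M)$, respectively, where $M = Q(0.5)$ is the median of $X$, $\mu'_1 = E(X)$ comes from equation (10), $F(\mu'_1)$ is easily evaluated from (8) and $m_1(z)$ is obtained from (11) with $n = 1$.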
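The role of the first incomplete moment can be made concrete with a short sketch. It evaluates the Lorenz and Bonferroni curves, together with the mean deviations via the standard identities $\delta_1 = 2\mu F(\mu) - 2 m_1(\mu)$ and $\delta_2 = \mu - 2 m_1(M)$, for any model whose cdf, qf and first incomplete moment are available as functions. The unit exponential distribution is used purely as a stand-in with closed forms; it is an assumption for illustration and not a member of the fitted models in this paper, and all function names are hypothetical.

```python
import math


def lorenz(pi, quantile, m1, mean):
    """Lorenz curve L(pi) = m1(Q(pi)) / mean."""
    return m1(quantile(pi)) / mean


def bonferroni(pi, quantile, m1, mean):
    """Bonferroni curve B(pi) = L(pi) / pi."""
    return lorenz(pi, quantile, m1, mean) / pi


def mean_deviations(mean, median, cdf, m1):
    """Mean deviations about the mean and about the median."""
    d1 = 2.0 * mean * cdf(mean) - 2.0 * m1(mean)
    d2 = mean - 2.0 * m1(median)
    return d1, d2


# Stand-in model: unit exponential, where everything is closed form.
def exp_cdf(x):
    return 1.0 - math.exp(-x)


def exp_quantile(u):
    return -math.log1p(-u)


def exp_m1(y):
    # First incomplete moment: integral of x * exp(-x) over (0, y).
    return 1.0 - (1.0 + y) * math.exp(-y)


d1, d2 = mean_deviations(1.0, exp_quantile(0.5), exp_cdf, exp_m1)
```

For another model, only the three callables and the mean need to be replaced; $m_1$ may come from a truncated series or from numerical integration.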

Residual and reversed residual life functions
For $n = 1, 2, \ldots$, the $n$th moment of the residual life of $X$, $v_n(t) = E[(X - t)^n \mid X > t]$ for $t > 0$, uniquely determines $F(x)$ and is given by
$$v_n(t) = \frac{1}{1 - F(t)} \int_t^{\infty} (x - t)^n\, dF(x).$$
Another interesting function is the mean residual life (MRL) function, or the life expectation at age $t$, given by $v_1(t) = E[(X - t) \mid X > t]$, which represents the expected additional life length for a unit that is alive at age $t$. The MRL of $X$ can be obtained by setting $n = 1$ in the last equation.
The $n$th moment of the reversed residual life, $M_n(t) = E[(t - X)^n \mid X \le t]$ for $t > 0$ and $n = 1, 2, \ldots$, uniquely determines $F(x)$ and is given by
$$M_n(t) = \frac{1}{F(t)} \int_0^{t} (t - x)^n\, dF(x).$$
The mean inactivity time (MIT) of $X$, given by $M_1(t)$, represents the waiting time elapsed since the failure of an item, on condition that this failure occurred in $(0, t)$.

Quantile and generating functions
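The qf of $X$, say $Q(u)$, is obtained by inverting the cdf $F(x)$. If $U$ is a uniform variate on the unit interval $(0, 1)$, then the random variable $X = Q(U)$ has density (7).
We can also simulate the CGTP-G distribution as follows: if $u \sim U(0, 1)$, the solution $x_u$ of the nonlinear equation $F(x_u) = u$ has a closed form, which is treated separately for $\lambda \ne 0$ and for $\lambda = 0$.
The moment generating function (mgf) of $X$, say $M(t) = E(e^{tX})$, can be obtained from (9) as
$$M(t) = \sum_{k=0}^{\infty} b_k\, M_{k+1}(t;\xi),$$
where $M_{\delta}(t;\xi)$ is the mgf of $Y_{\delta}$ given by
$$M_{\delta}(t;\xi) = \delta \int_{-\infty}^{\infty} e^{tx}\, g(x;\xi)\, G(x;\xi)^{\delta-1}\,dx = \delta \int_0^1 \exp[t\, Q_G(u;\xi)]\, u^{\delta-1}\,du.$$
The last two integrals can be computed numerically for most parent distributions.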
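The inversion method can be sketched for the limiting TG case $\theta \to 0$, where the cdf is the transmuted form $F(x) = (1 + \lambda) G(x) - \lambda G(x)^2$ of Shaw and Buckley (2007). This is deliberately not the CGTP-G quantile function itself; it only illustrates the same case split between $\lambda \ne 0$ and $\lambda = 0$. The exponential baseline and all function names are assumptions made for illustration.

```python
import math
import random


def transmuted_cdf(x, lam, base_cdf):
    """Transmuted-G cdf: (1 + lam) * G(x) - lam * G(x)^2."""
    g = base_cdf(x)
    return (1.0 + lam) * g - lam * g * g


def transmuted_quantile(u, lam, base_quantile):
    """Invert the transmuted-G cdf; the root taken lies in [0, 1]."""
    if lam == 0.0:
        p = u  # the family collapses to the baseline
    else:
        disc = max((1.0 + lam) ** 2 - 4.0 * lam * u, 0.0)
        p = ((1.0 + lam) - math.sqrt(disc)) / (2.0 * lam)
    return base_quantile(p)


# Assumed baseline: exponential with unit rate.
def exp_cdf(x):
    return 1.0 - math.exp(-x)


def exp_quantile(p):
    return -math.log1p(-p)


def simulate(n, lam, base_quantile, seed=0):
    """Draw n variates by applying the qf to uniform random numbers."""
    rng = random.Random(seed)
    return [transmuted_quantile(rng.random(), lam, base_quantile)
            for _ in range(n)]


sample = simulate(1000, 0.5, exp_quantile)
```

For very small $|\lambda|$ the quadratic-root formula loses precision through cancellation; the algebraically equivalent form $2u / [(1 + \lambda) + \sqrt{(1 + \lambda)^2 - 4\lambda u}]$ avoids this and also covers $\lambda = 0$.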

Entropies
The Rényi entropy of a random variable $X$ represents a measure of variation of the uncertainty. It is defined by
$$I_{\delta}(X) = \frac{1}{1 - \delta} \log \int_{-\infty}^{\infty} f(x)^{\delta}\,dx, \quad \delta > 0 \text{ and } \delta \ne 1.$$
Using the pdf in (7), $f(x)^{\delta}$ can be expanded in a power series with the help of the falling factorial $(\delta)_n = \Gamma(\delta)/\Gamma(\delta - n)$. The Rényi entropy of the CGTP-G family then reduces to an expression in which the remaining integral can be determined numerically for any parent distribution.
The $\delta$-entropy, say $H_{\delta}(X)$, is defined for $\delta > 0$ and $\delta \ne 1$, and it follows from the same expansion of $f(x)^{\delta}$. The Shannon entropy, say $SI$, of a random variable $X$ is given by $SI = E\{-\log f(X)\}$. It is the special case of the Rényi entropy when $\delta \uparrow 1$.
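When the entropy integrals have to be determined numerically, a direct quadrature of the standard definitions $I_{\delta}(X) = (1 - \delta)^{-1} \log \int f(x)^{\delta} dx$ and $SI = -\int f(x) \log f(x)\,dx$ is often sufficient. The sketch below uses a composite Simpson rule on a finite interval; the truncation bound, the number of subintervals and the unit exponential test density are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def simpson(func, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def renyi_entropy(pdf, delta, a, b):
    """Renyi entropy of order delta (delta > 0, delta != 1) on [a, b]."""
    if delta <= 0 or delta == 1:
        raise ValueError("delta must be positive and different from 1")
    integral = simpson(lambda x: pdf(x) ** delta, a, b)
    return math.log(integral) / (1.0 - delta)


def shannon_entropy(pdf, a, b):
    """Shannon entropy, -integral of f * log(f), on [a, b]."""
    def integrand(x):
        fx = pdf(x)
        return -fx * math.log(fx) if fx > 0.0 else 0.0
    return simpson(integrand, a, b)


# Test density: unit exponential, truncated at 50 (negligible tail mass).
def exp_pdf(x):
    return math.exp(-x)


renyi_2 = renyi_entropy(exp_pdf, 2.0, 0.0, 50.0)
shannon = shannon_entropy(exp_pdf, 0.0, 50.0)
```

For the unit exponential the exact values are $\log 2$ for $\delta = 2$ and $1$ for the Shannon entropy, which gives a quick check on the quadrature settings before applying them to another density.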

Order statistics
Order statistics make their appearance in many areas of statistical theory and practice. Let $X_1, \ldots, X_n$ be a random sample from the CGTP-G family. The pdf of the $i$th order statistic, say $X_{i:n}$, is given by
$$f_{i:n}(x) = \frac{f(x)}{B(i, n-i+1)} \sum_{j=0}^{n-i} (-1)^j \binom{n-i}{j} F(x)^{j+i-1},$$
where $B(\cdot,\cdot)$ is the beta function.
After some algebra, $F(x)^{j+i-1}$ can be expanded as a power series whose coefficients $\phi_{j+i-1,k}$ are given by $\phi_{j+i-1,0} = b_0^{\,j+i-1}$ and, for $k \ge 1$, recursively in terms of the $b_k$. Combining this expansion with (9) gives equation (12), which is the main result of this section. Thus, the density function of the CGTP-G order statistics is a triple linear combination of Exp-G densities. Based on equation (12), we can obtain some structural properties of $X_{i:n}$ from those of the Exp-G model.
The $q$th moment of $X_{i:n}$ follows from equation (12) in terms of the $q$th moments of Exp-G random variables (equation (13)). Based upon the moments in equation (13), we can derive explicit expressions for the L-moments of $X$ as infinite weighted linear combinations of the means of suitable Exp-G densities. These moments are analogous to the ordinary moments but can be estimated by linear combinations of order statistics. They are given as linear functions of expected order statistics, namely
$$\lambda_r = \frac{1}{r} \sum_{d=0}^{r-1} (-1)^d \binom{r-1}{d} E(X_{r-d:r}), \quad r \ge 1.$$
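The two standard building blocks of this section can be sketched generically: the density of the $i$th order statistic in the closed form $\frac{n!}{(i-1)!(n-i)!} f(x) F(x)^{i-1} [1 - F(x)]^{n-i}$, and the L-moments as linear functions of expected order statistics, $\lambda_r = r^{-1} \sum_{d=0}^{r-1} (-1)^d \binom{r-1}{d} E(X_{r-d:r})$. The uniform distribution on $(0, 1)$, for which $E(X_{j:r}) = j/(r+1)$, is an assumed test case and the function names are hypothetical.

```python
import math


def order_stat_pdf(x, i, n, pdf, cdf):
    """Density of the i-th order statistic in a sample of size n."""
    const = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    big_f = cdf(x)
    return const * pdf(x) * big_f ** (i - 1) * (1.0 - big_f) ** (n - i)


def l_moment(r, expected_order_stat):
    """r-th L-moment; expected_order_stat(j, m) must return E(X_{j:m})."""
    total = sum((-1) ** d * math.comb(r - 1, d) * expected_order_stat(r - d, r)
                for d in range(r))
    return total / r


# Test case: uniform(0, 1), where E(X_{j:m}) = j / (m + 1).
def uniform_expected_order_stat(j, m):
    return j / (m + 1.0)


median_density = order_stat_pdf(0.5, 2, 3, lambda x: 1.0, lambda x: x)
l2 = l_moment(2, uniform_expected_order_stat)
```

For another model, `expected_order_stat` would be replaced by a routine that evaluates the first moment of $X_{j:m}$, for instance by numerical integration of `order_stat_pdf`.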

Maximum likelihood estimation
Several approaches for parameter estimation were proposed in the literature, but the maximum likelihood method is the most commonly employed. We consider the estimation of the unknown parameters of the proposed family from complete samples only by maximum likelihood, although we also provide a summary of the least squares method. Let $x_1, \ldots, x_n$ be a random sample from the CGTP-G family with parameters $\lambda$, $\theta$ and $\xi$. Let $\zeta = (\lambda, \theta, \xi^{\top})^{\top}$ be the $p \times 1$ parameter vector. Then, the log-likelihood function for $\zeta$, say $\ell(\zeta)$, is written in terms of the quantities $q_i = 1 - \lambda p_i/(e^{\theta} - 1) - \lambda e^{\theta} s_i/(e^{\theta} - 1)$, $p_i = 1 - e^{\theta G(x_i;\xi)}$ and $s_i = 1 - e^{-\theta G(x_i;\xi)}$. The log-likelihood $\ell(\zeta)$ can be maximized either directly by using the MATH-CAD program, SAS (PROC NLMIXED), R (optim function) or the Ox program (sub-routine MaxBFGS), or by solving the nonlinear likelihood equations obtained by differentiating it.

The components of the score vector $U(\zeta) = \partial \ell/\partial \zeta$ are $U_{\lambda}$, $U_{\theta}$ and $U_{\xi_k}$. Setting the nonlinear system of equations $U_{\lambda} = U_{\theta} = 0$ and $U_{\xi_k} = 0$ (for the components of $\xi$) and solving them simultaneously yields the maximum likelihood estimate (MLE) $\hat{\zeta} = (\hat{\lambda}, \hat{\theta}, \hat{\xi}^{\top})^{\top}$. It is usually more convenient to adopt nonlinear optimization methods such as the quasi-Newton algorithm to maximize $\ell(\zeta)$ numerically. For interval estimation of the parameters, we obtain the $p \times p$ observed information matrix $J(\zeta) = \{-\partial^2 \ell/\partial r\, \partial s\}$ (for $r, s = \lambda, \theta$ and varying over the components of $\xi$), whose elements can be evaluated numerically.
Under standard regularity conditions when $n \to \infty$, the distribution of $\hat{\zeta}$ can be approximated by a multivariate normal $N_p(\zeta, J(\hat{\zeta})^{-1})$ distribution to construct confidence intervals for the parameters. Here, $J(\hat{\zeta})$ is the total observed information matrix evaluated at $\hat{\zeta}$, whose elements are given in Appendix A.
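The construction of a confidence interval from the observed information can be sketched in the simplest possible setting. The example below uses a one-parameter exponential model, chosen only because its MLE and information are known in closed form; it is an assumption for illustration and not the CGTP-G likelihood. The second derivative of the log-likelihood is approximated by central finite differences, which mirrors the remark that the elements of the information matrix can be evaluated numerically. All function names and the data are hypothetical.

```python
import math


def exp_loglik(rate, data):
    """Log-likelihood of an exponential model with the given rate."""
    return len(data) * math.log(rate) - rate * sum(data)


def observed_information(loglik, theta_hat, h=1e-4):
    """Minus the second derivative of loglik at theta_hat (central differences)."""
    second = (loglik(theta_hat + h) - 2.0 * loglik(theta_hat)
              + loglik(theta_hat - h)) / h ** 2
    return -second


def wald_interval(theta_hat, info, z=1.959963984540054):
    """Approximate 95% interval theta_hat +/- z / sqrt(info)."""
    se = 1.0 / math.sqrt(info)
    return theta_hat - z * se, theta_hat + z * se


data = [0.5, 1.5, 2.0, 1.0]          # illustrative values only
rate_hat = len(data) / sum(data)     # closed-form MLE of the rate
info = observed_information(lambda r: exp_loglik(r, data), rate_hat)
lower, upper = wald_interval(rate_hat, info)
```

In a multi-parameter model the same idea applies with the full matrix of second derivatives: the standard errors are the square roots of the diagonal of its inverse.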
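An alternative approach to the maximum likelihood method is least squares estimation. For the CGTP-G model, the least squares estimates (LSEs) $\hat{\lambda}$, $\hat{\theta}$ and $\hat{\xi}$ of $\lambda$, $\theta$ and $\xi$ are defined as those arguments that minimize the objective function
$$S(\lambda, \theta, \xi) = \sum_{i=1}^{n} \left[ F(x_{i:n}; \lambda, \theta, \xi) - \frac{i}{n+1} \right]^2,$$
where $x_{i:n}$ is a possible outcome of the $i$th order statistic based on a random sample of size $n$. The minimum point $(\hat{\lambda}, \hat{\theta}, \hat{\xi})$ can also be obtained as a solution of the system of nonlinear equations $\partial S/\partial \lambda = \partial S/\partial \theta = \partial S/\partial \xi_k = 0$, where $k$ varies over the components of $\xi$.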

Applications
In this section, we provide two applications to real data to illustrate empirically the flexibility of the CGTPLi model introduced in Section 3.2. The goodness-of-fit statistics for this model are compared with those of other competitive models, and the MLEs of the model parameters are determined numerically. For the two real data sets, we compare the fits of the CGTPLi distribution with those of the following models, whose densities are defined for $x > 0$:
• Kumaraswamy Lindley (KwLi) (Çakmakyapan and Kadilar (2014));
• beta Lindley (BLi) (Merovci and Sharma (2014));
• McDonald modified Weibull (McMW) (Merovci and Elbatal (2013));
• Kumaraswamy-transmuted exponentiated modified Weibull (KwTEMW) (Al-Babtain, Fattah, Ahmed, and Merovci (2017));
• transmuted modified Weibull (TMW) (Khan and King (2013));
• power Lindley (PoLi) (Ghitany, Al-Mutairi, Balakrishnan, and Al-Enezi (2013)).
The parameters of the above densities are all positive real numbers except for $\lambda$ in the KwTEMW and TMW distributions, for which $|\lambda| \le 1$.
In order to compare the fitted models, we consider some goodness-of-fit statistics, namely the Akaike information criterion (AIC), consistent Akaike information criterion (CAIC), Hannan-Quinn information criterion (HQIC), Bayesian information criterion (BIC) and $-2\hat{\ell}$, where $\hat{\ell}$ is the maximized log-likelihood. Moreover, we use the Anderson-Darling ($A^*$) and the Cramér-von Mises ($W^*$) statistics in order to compare the fit of the new model with those of other nested and non-nested models. These statistics are widely used to determine how closely a specific cdf fits the empirical distribution of the data set. The smaller these statistics are, the better the fit is. Upper tail percentiles of the asymptotic distributions of these goodness-of-fit statistics were tabulated by Nichols and Padgett (2006).
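The information criteria can be computed directly from the maximized log-likelihood, the number of parameters $p$ and the sample size $n$. The formulas are not printed in this section, so the sketch below assumes the forms commonly used in this literature: AIC $= -2\hat{\ell} + 2p$, CAIC $= -2\hat{\ell} + 2pn/(n - p - 1)$, BIC $= -2\hat{\ell} + p \log n$ and HQIC $= -2\hat{\ell} + 2p \log(\log n)$. The CAIC form in particular is an assumption, since other definitions exist. The function name and the numerical inputs are hypothetical.

```python
import math


def information_criteria(loglik_hat, p, n):
    """Return AIC, CAIC, BIC and HQIC under the assumed definitions.

    loglik_hat: maximized log-likelihood
    p:          number of estimated parameters
    n:          sample size (must exceed p + 1 for CAIC)
    """
    if n <= p + 1:
        raise ValueError("n must be larger than p + 1")
    minus2l = -2.0 * loglik_hat
    return {
        "-2l": minus2l,
        "AIC": minus2l + 2.0 * p,
        "CAIC": minus2l + 2.0 * p * n / (n - p - 1),
        "BIC": minus2l + p * math.log(n),
        "HQIC": minus2l + 2.0 * p * math.log(math.log(n)),
    }


# Illustrative inputs only, not values from the tables of this paper.
crit = information_criteria(loglik_hat=-100.0, p=3, n=40)
```

Smaller values indicate a better trade-off between fit and number of parameters, so models are ranked by sorting on any one of the returned entries.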
The following numerical results are obtained using the MATH-CAD program. Furthermore, the R codes to compute the cdf, pdf and maximum likelihood estimates for the CGTPLi distribution are given in Appendix B.

Active repair times data
The first data set represents the active repair times (in hours) for an airborne communication transceiver reported by Jorgensen (2012). These data consist of 40 observations (see Appendix C).
Tables 2 and 3 provide the MLEs and their standard errors (in parentheses) of the model parameters for some distributions and the goodness-of-fit statistics for the current data, respectively.The plots of the fitted CGTPLi pdf and other fitted pdfs defined before, for the two data sets, are displayed in Figure 3.

Cancer patient data
The second data set refers to the remission times (in months) of a random sample of 128 bladder cancer patients given in Appendix C (Lee and Wang, 2003). These data have been used by Nofal et al. (2016) and Mead and Afify (2016) to fit the generalized transmuted log-logistic and Kumaraswamy exponentiated Burr XII distributions, respectively. The MLEs and their corresponding standard errors (in parentheses) of the model parameters and the values of $-2\hat{\ell}$, AIC, CAIC, HQIC, BIC, $W^*$ and $A^*$ are given in Tables 4 and 5, respectively.
In Tables 4 and 5, we compare the fits of the CGTPLi model with those of the TMW, McMW, KwTEMW, PoLi, KwLi and BLi models. The CGTPLi model has the lowest values of the $-2\hat{\ell}$, AIC, CAIC, HQIC, BIC, $W^*$ and $A^*$ statistics among the fitted models for both real data sets (Tables 3 and 5), so it could be chosen as the best model for these data. The plots in Figure 3 also indicate that the CGTPLi distribution provides the closest fits. These results suggest empirically that it can be a very competitive alternative to other distributions with positive support.

Conclusions
The idea of generating new extended models from classic ones has been of great interest among researchers in the past decade. We propose a new complementary generalized transmuted Poisson-G (CGTP-G) family of distributions, which extends the transmuted class (Shaw and Buckley (2007)) by adding one extra shape parameter. Many well-known distributions emerge as special cases of the proposed family. We provide some mathematical properties of the new family including explicit expressions for the ordinary and incomplete moments, mean deviations, generating function, Rényi and δ-entropies and order statistics. The maximum likelihood estimation of the model parameters is investigated and the observed information matrix is determined. By means of two real data sets, we verify that a special case of the CGTP-G family can provide better fits than other models generated from well-known families.

Figure 1: Plots of the CGTPW density and hazard rate.

Figure 2: Plots of the CGTPLi density and hazard rate.

Figure 3: The estimated CGTPLi pdf and other estimated pdfs for active repair times (left panel) and cancer data (right panel).

Table 1: Biases and MSEs for simulation data

Table 2: MLEs and their standard errors (in parentheses) for active repair times

Table 3: Goodness-of-fit statistics for active repair times

Table 4: MLEs and their standard errors (in parentheses) for the cancer patient data

Table 5: Goodness-of-fit statistics for the cancer patient data