On Distribution Characteristics of a Fuzzy Random Variable

The concept of fuzzy random variable combines two types of uncertainty, randomness and vagueness, and was introduced in order to integrate fuzzy set theory into a branch of statistical analysis called “statistics with vague data”. In this paper, a concept of fuzzy random variable is presented. Using classical techniques in Probability Theory, several notions associated with a random variable (including expectation, variance, covariance, correlation coefficient, and the fuzzy (empirical) cumulative distribution function) are extended to this notion of fuzzy random variable. This notion provides a useful framework for extending statistical analysis to situations in which the outcomes of a random experiment are fuzzy sets.


Introduction
Statistical data are frequently associated with an underlying imprecision due, for instance, to inexactitude in the measuring process, vagueness of the involved concepts or a certain degree of ignorance about the real values. In many cases, such an imprecision can be modeled by means of fuzzy sets in a more efficient way than considering only a single value or category (Zadeh 1965). Thus, these kinds of data are jointly affected by two sources of uncertainty: fuzziness (due to imprecision, vagueness, partial ignorance) and randomness (due to sampling or measurement errors of stochastic nature). Randomness models the stochastic variability of all possible outcomes of a situation, and fuzziness relates to the unsharp boundaries of the parameters of the model. Since, as Zadeh (1995) states, "Probability Theory and Fuzzy Logic are complementary rather than competitive", a natural question is how fuzzy variables could interact with the type of random variables found in association with many real-life random experiments from different fields. By combining ideas, concepts and results from both theories, this article focuses on one important dimension of this issue, namely fuzzy random variables.
The concept of fuzzy random variable (frv) (also called "random fuzzy set" (Blanco-Fernández, Casals, Colubi, Corral, García-Bárzana, Gil, González-Rodríguez, López, Lubiano, Montenegro, Ramos-Guajardo, De La Rosa De Sá, and Sinova 2013)) was introduced in order to deal with situations where the outcomes of a random experiment are modeled by fuzzy sets (Colubi, Domínguez-Menchero, López-Díaz, and Ralescu 2001; Colubi, Fernández-García, and Gil 2002; Colubi and Gil 2007; Colubi and González-Rodríguez 2007; Couso and Sánchez 2008; Feng 2000; Gil 2001; Gil, López-Díaz, and Ralescu 2006; González-Rodríguez, Colubi, and Gil 2006a; Krätschmer 2001; Kruse and Meyer 1987; Kwakernaak 1978, 1979; Liu and Liu 2003; Puri and Ralescu 1985, 1986; Shapiro 2009). An frv is a mapping that associates a fuzzy set of the final space to each possible result of a random experiment in a provided probability space structure. Thus, this concept generalizes the definitions of random variable and random set. These generalizations are not unique in the literature, but they can be formalized in equivalent ways. Each definition differs from the others in the structure of the final space and the way the measurability condition is transferred to this context. For instance, Krätschmer (2001), Kruse and Meyer (1987) and Puri and Ralescu (1985, 1986) focused on the properties of the multi-valued mappings associated to the α-cuts. Kwakernaak (1978, 1979) assumes that the outcomes of the frv are fuzzy real subsets and the extreme points of their α-cuts are classical random variables. Puri and Ralescu (1985, 1986) require the α-cuts to be measurable (different conditions for measurability of multi-valued mappings can also be formulated). On the other hand, Klement, Puri, and Ralescu (1986) and Diamond and Kloeden (1994) define frvs as classical measurable mappings. Couso and Sánchez (2008) present three different higher order possibility models that represent the imprecise information provided by an frv.
In the literature on frvs, there are only a few references to modeling the distribution of these random elements. These models are theoretically well stated, but they are not soundly supported by empirical evidence, since they correspond to restrictive random mechanisms and hence they are not realistic in practice (González-Rodríguez, Colubi, Gil, and Coppi 2006; Möller, Graf, M., and Sickert 2002). This motivated us to present in this paper another model that represents the imprecise information provided by an frv. Within this framework, we use the tools of general Probability Theory (Billingsley 1995) to define the fuzzy cumulative distribution function and the fuzzy empirical cumulative distribution function of an frv. We also extend the concepts of expectation, variance, covariance and correlation coefficient of an frv by reproducing classical techniques. For instance, when the images of the frv are convex fuzzy subsets of R, we can use fuzzy arithmetic to derive a method of construction of the fuzzy expectation. On the other hand, we can make a parallel construction of the variance: let us consider a particular metric defined over the class of fuzzy subsets of the final space. In this setting, we define the variance of an frv as the mean (classical expectation of a random variable) of the squares of the distances from the images of the frv to the (fuzzy) expectation. In this context the variance of an frv is a (precise) number that quantifies the degree of dispersion of the images of the frv. Extending these results is not just a matter of motivation; the main issue is that the concepts of fuzzy cumulative distribution function and fuzzy empirical cumulative distribution function for an frv strongly affect the aim of the statistics to be developed around them (Hesamian and Chachi 2013). Although in the literature distributions and parameters could be defined in some senses in connection with the frv through Zadeh's extension principle (Zadeh 1965), the objective of statistical developments usually refers to the distribution and parameters of the underlying original real-valued random variable (Wu 1999). When the distribution of an frv can be defined, the objective of statistical developments will only refer to the distribution and parameters of the frv, since either there is no underlying real-valued random variable behind the process (as happens when we deal with judgments, valuations, ratings, and so on) or the interest is just focused on the fuzzy perception (Blanco-Fernández et al. 2013). Therefore, the aim of inferential statistical developments with fuzzy data based on frvs will be to draw conclusions about the distribution of the involved frvs over populations on the basis of the information supplied by samples of (fuzzy) observations from these frvs. One of the relevant inferential problems is to estimate the parameters or measures associated with the distribution of an frv on the basis of the information provided by a sample of independent data from it. Furthermore, when statistics are based on the concept of frv, some additional problems arise (see also the Conclusions), such as: 1. the lack of realistic general "parametric" families of probability distribution models for frvs (Blanco-Fernández et al. 2013); 2. the lack of Central Limit Theorems (CLTs) for frvs which are directly applicable for inferential purposes (Wu 2000; Krätschmer 2002a,b).
The first item above is addressed in this paper for the proposed frv by defining the concepts of fuzzy cumulative distribution function and fuzzy empirical cumulative distribution function. The second item (and also some other items in the Conclusions) can be addressed in future research.
The paper is organized as follows. The next section provides the necessary technical background used for convenience of explaining general concepts concerned with fuzzy sets. In Section 3, we propose a new definition of frv. In Section 4, using classical techniques in Probability Theory, we extend some common characteristics of frvs including expectation, variance, covariance, correlation coefficient. In Section 5, we generalize the concept of fuzzy cumulative distribution function and fuzzy empirical cumulative distribution function for an frv. We end the paper with some general concluding remarks and open problems.

Preliminary concepts
In this section, we first review the basic definitions and terminology of fuzzy set theory and uncertainty theory which are necessary for our paper (for further details, the reader is referred to Liu (2002, 2016); Peng and Liu (2004); Viertl (2011); Zimmermann (2001)). Then, a new distance measure between fuzzy numbers is defined.

Fuzzy numbers
A fuzzy set A of the universal set X is defined by its membership function A : X → [0, 1]. In this paper, we consider R (the real line) as the universal set. We denote by A[α] = {x ∈ R : A(x) ≥ α} the α-level set (α-cut) of the fuzzy set A of R, for every α ∈ (0, 1]. A fuzzy set A of R is called a fuzzy number if, for every α, the set A[α] is a non-empty compact interval. We denote by F(R) the set of all fuzzy numbers of R.
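A specific type of fuzzy number, which is rich and flexible enough to cover most applications, is the so-called LR-fuzzy number. Typically, the LR-fuzzy number $N = (n, l, r)_{LR}$ with central value n ∈ R, left and right spreads $l \in \mathbb{R}^+$, $r \in \mathbb{R}^+$, and decreasing left and right shape functions $L, R : \mathbb{R}^+ \to [0, 1]$ with $L(0) = R(0) = 1$, has the membership function
$$N(x) = \begin{cases} L\left(\dfrac{n - x}{l}\right), & x \le n, \\[2mm] R\left(\dfrac{x - n}{r}\right), & x > n. \end{cases}$$
We can easily obtain the α-cut of N as follows:
$$N[\alpha] = \left[\, n - L^{-1}(\alpha)\, l,\; n + R^{-1}(\alpha)\, r \,\right], \quad \alpha \in (0, 1].$$
For the algebraic operations of LR-fuzzy numbers, we have the following result on the basis of Zadeh's extension principle. Let $A = (a, l_1, r_1)_{LR}$ and $B = (b, l_2, r_2)_{LR}$ be two LR-fuzzy numbers and λ ∈ R − {0} be a real number. Then
$$\lambda \otimes A = \begin{cases} (\lambda a, \lambda l_1, \lambda r_1)_{LR}, & \lambda > 0, \\ (\lambda a, |\lambda| r_1, |\lambda| l_1)_{RL}, & \lambda < 0, \end{cases} \qquad A \oplus B = (a + b,\; l_1 + l_2,\; r_1 + r_2)_{LR}.$$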
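As a hedged illustration of LR-arithmetic, the following Python sketch restricts attention to triangular fuzzy numbers, that is, it assumes the shape functions L(u) = R(u) = max(0, 1 − u). The class name `Triangular` and its methods are hypothetical names introduced only for this example, and the addition and scalar multiplication rules are assumed to be the usual ones obtained from Zadeh's extension principle.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Triangular:
    """Triangular fuzzy number (n, l, r)_T, assuming L(u) = R(u) = max(0, 1 - u)."""
    n: float  # central value
    l: float  # left spread (positive)
    r: float  # right spread (positive)

    def membership(self, x: float) -> float:
        # Left branch uses L((n - x) / l), right branch uses R((x - n) / r).
        if x <= self.n:
            return max(0.0, 1.0 - (self.n - x) / self.l)
        return max(0.0, 1.0 - (x - self.n) / self.r)

    def alpha_cut(self, alpha: float) -> Tuple[float, float]:
        # For the triangular shape, L^{-1}(alpha) = R^{-1}(alpha) = 1 - alpha.
        return (self.n - (1.0 - alpha) * self.l,
                self.n + (1.0 - alpha) * self.r)

    def __add__(self, other: "Triangular") -> "Triangular":
        # Extended addition: centres and spreads add componentwise.
        return Triangular(self.n + other.n, self.l + other.l, self.r + other.r)

    def scale(self, lam: float) -> "Triangular":
        # Extended scalar multiplication; a negative factor swaps the spreads.
        if lam > 0:
            return Triangular(lam * self.n, lam * self.l, lam * self.r)
        if lam < 0:
            return Triangular(lam * self.n, -lam * self.r, -lam * self.l)
        raise ValueError("lam must be non-zero")
```

For instance, `Triangular(2, 1, 3) + Triangular(1, 2, 1)` gives the triangular number with centre 3 and spreads 3 and 4, while scaling by a negative factor exchanges the roles of the left and right spreads.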

Some notions from uncertainty theory
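In the following, we introduce an index to compare a fuzzy number A ∈ F(R) with a crisp value x ∈ R. The index is used for defining a new notion of frv.
Definition 1 (Liu and Liu (2002)). Let A ∈ F(R) and x ∈ R. The index defined by
$$C\{A \le x\} = \frac{1}{2}\left( \sup_{y \le x} A(y) + 1 - \sup_{y > x} A(y) \right)$$
shows the credibility degree that A is less than or equal to x. Similarly, C{A > x} = 1 − C{A ≤ x} shows the credibility degree that A is greater than x (see also Liu (2016)).
Definition 2. Let A ∈ F(R) and α ∈ (0, 1]. The α-pessimistic value of A is defined as
$$A_\alpha = \inf\{x \in \mathbb{R} : C\{A \le x\} \ge \alpha\}.$$
Lemma 1. Let A, B ∈ F(R) and λ be a real number. Then
$$(A \oplus B)_\alpha = A_\alpha + B_\alpha, \qquad (\lambda \otimes A)_\alpha = \begin{cases} \lambda A_\alpha, & \lambda > 0, \\ \lambda A_{1-\alpha}, & \lambda < 0. \end{cases}$$
Example 1. Suppose that $A = (a, l, r)_{LR}$ is an LR-fuzzy number, and let x ∈ R; then
$$C\{A \le x\} = \begin{cases} \dfrac{1}{2} L\left(\dfrac{a - x}{l}\right), & x \le a, \\[2mm] 1 - \dfrac{1}{2} R\left(\dfrac{x - a}{r}\right), & x > a. \end{cases}$$
We can easily obtain the α-pessimistic values of A as follows:
$$A_\alpha = \begin{cases} a - l\, L^{-1}(2\alpha), & \alpha \le 0.5, \\ a + r\, R^{-1}(2(1 - \alpha)), & \alpha > 0.5. \end{cases}$$
As an example, consider the triangular fuzzy number $A = (a, l, r)_T$; then
$$A_\alpha = \begin{cases} a - l(1 - 2\alpha), & \alpha \le 0.5, \\ a + r(2\alpha - 1), & \alpha > 0.5. \end{cases}$$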
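The credibility index and the α-pessimistic value can be evaluated in closed form for a triangular fuzzy number. The sketch below is only illustrative: it assumes the usual form of the credibility index of Liu and Liu (2002), C{A ≤ x} = (sup_{y ≤ x} A(y) + 1 − sup_{y > x} A(y))/2, and it takes the α-pessimistic value to be inf{x : C{A ≤ x} ≥ α}. The function names `credibility_leq` and `pessimistic_value` are hypothetical.

```python
def credibility_leq(a: float, l: float, r: float, x: float) -> float:
    """Credibility degree C{A <= x} for the triangular number A = (a, l, r)_T.

    Assumes C{A <= x} = (sup_{y<=x} A(y) + 1 - sup_{y>x} A(y)) / 2.
    """
    if x <= a:
        # Left of the centre: sup_{y<=x} A(y) = A(x) and sup_{y>x} A(y) = 1.
        return 0.5 * max(0.0, 1.0 - (a - x) / l)
    # Right of the centre: sup_{y<=x} A(y) = 1 and sup_{y>x} A(y) = A(x).
    return 1.0 - 0.5 * max(0.0, 1.0 - (x - a) / r)


def pessimistic_value(a: float, l: float, r: float, alpha: float) -> float:
    """alpha-pessimistic value of A = (a, l, r)_T, assumed to be
    inf{x : C{A <= x} >= alpha}, obtained by inverting credibility_leq."""
    if alpha <= 0.5:
        return a - l * (1.0 - 2.0 * alpha)
    return a + r * (2.0 * alpha - 1.0)
```

Under these assumptions the two functions are inverse to each other on the support of A, so `credibility_leq(a, l, r, pessimistic_value(a, l, r, alpha))` returns `alpha` again; the value at α = 0.5 is the centre a.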

A new distance measure between fuzzy numbers
In the literature one can find many useful metrics between fuzzy numbers. Valuable references on this topic can be found in Blanco-Fernández et al. (2013); Feng and Liu (2006); Liu and Liu (2002). In the following, a new metric between fuzzy numbers is defined.
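Definition 3. The distance measure is defined as the mapping D : F(R) ⊗ F(R) → [0, ∞) that associates with two fuzzy numbers A, B ∈ F(R) the value
$$D(A, B) = \left( \int_0^1 (A_\alpha - B_\alpha)^2 \, d\alpha \right)^{1/2},$$
where $A_\alpha$ and $B_\alpha$ are the α-pessimistic values of A and B. One can conclude that the mapping D : F(R) ⊗ F(R) → [0, ∞) satisfies the following conditions:
1. For any A, B ∈ F(R), D(A, B) = 0 if and only if A = B.
2. For any A, B ∈ F(R), D(A, B) = D(B, A).
3. For any A, B, C ∈ F(R), D(A, B) ≤ D(A, C) + D(C, B).
As an example, we can easily obtain the distance between two LR-fuzzy numbers $A = (a, l_1, r_1)_{LR}$ and $B = (b, l_2, r_2)_{LR}$ as follows:
$$D^2(A, B) = \int_0^{0.5} \left( a - b - (l_1 - l_2) L^{-1}(2\alpha) \right)^2 d\alpha + \int_{0.5}^{1} \left( a - b + (r_1 - r_2) R^{-1}(2(1 - \alpha)) \right)^2 d\alpha.$$
For symmetric fuzzy numbers $A = (a, l, l)_L$ and $B = (b, r, r)_L$, we have
$$D^2(A, B) = (a - b)^2 + (l - r)^2 \int_0^1 \left( L^{-1}(\alpha) \right)^2 d\alpha.$$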
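A distance of this kind can be checked numerically. The sketch below is an illustration under explicit assumptions, not a definitive implementation: it assumes that D(A, B) is the L2-type distance between α-pessimistic values, D(A, B) = (∫₀¹ (A_α − B_α)² dα)^{1/2}, it works with triangular fuzzy numbers written as tuples (a, l, r), and it approximates the integral with a midpoint rule. The names `pessimistic` and `distance` are hypothetical.

```python
import math
from typing import Tuple

Tri = Tuple[float, float, float]  # (centre a, left spread l, right spread r)


def pessimistic(t: Tri, alpha: float) -> float:
    """alpha-pessimistic value of a triangular fuzzy number (assumed form)."""
    a, l, r = t
    if alpha <= 0.5:
        return a - l * (1.0 - 2.0 * alpha)
    return a + r * (2.0 * alpha - 1.0)


def distance(s: Tri, t: Tri, steps: int = 10_000) -> float:
    """Midpoint-rule approximation of sqrt(int_0^1 (s_alpha - t_alpha)^2 d alpha)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        alpha = (k + 0.5) * h
        diff = pessimistic(s, alpha) - pessimistic(t, alpha)
        total += diff * diff
    return math.sqrt(h * total)
```

For symmetric triangular numbers A = (a, l, l)_T and B = (b, r, r)_T, the integral of (1 − α)² over [0, 1] equals 1/3, so under the stated assumption the numerical value should agree with the closed form ((a − b)² + (l − r)²/3)^{1/2}.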

Fuzzy random variables
In the context of random experiments whose outcomes are not numbers (or vectors in $\mathbb{R}^p$) but are expressed in inexact terms, the concept of frv turns out to be useful. Random fuzzy numbers (or, more generally, random fuzzy sets (Blanco-Fernández et al. 2013)) are a well-stated and supported model within the probabilistic setting for the random mechanisms generating fuzzy data. They integrate randomness and fuzziness, so that the first one affects the generation of experimental data, whereas the second one affects the nature of experimental data, which are assumed to be intrinsically imprecise. The notion of random fuzzy set can be formalized in several equivalent ways. In this regard, different notions of frv have been introduced and investigated in the literature (Colubi et al. 2001; Couso and Sánchez 2008; Feng 2000; Gil et al. 2006; González-Rodríguez et al. 2006a; Hesamian and Chachi 2013; Krätschmer 2001; Kruse and Meyer 1987; Kwakernaak 1978, 1979; Liu and Liu 2003; Puri and Ralescu 1985, 1986; Shapiro 2009).
Definition 4. Suppose that a random experiment is described by a probability space (Ω, A, P), where Ω is the set of all possible outcomes of the experiment, A is a σ-algebra of subsets of Ω and P is a probability measure on the measurable space (Ω, A). The fuzzy-valued mapping X : Ω → F(R) is called an frv if, for any α ∈ [0, 1], the real-valued mapping $X_\alpha$ : Ω → R, which assigns to each ω the α-pessimistic value of X(ω), is a real-valued random variable on (Ω, A, P). Throughout this paper, we assume that all random variables have the same probability space (Ω, A, P).
Kwakernaak (1978, 1979) introduced the notion of frvs, which was later formalized in a clear way by Kruse and Meyer (1987) as follows: given a probability space (Ω, A, P), a mapping X : Ω → F(R) is said to be an frv if, for all α ∈ (0, 1], the two real-valued mappings $X^L_\alpha$ : Ω → R and $X^U_\alpha$ : Ω → R (the lower and upper end points of the α-cut of X) are real-valued random variables. It can easily be verified that the following relationships hold between the notion of frv proposed in Definition 4 and Kwakernaak and Kruse's definition of frv (see also Example 1):
$$X_\alpha = \begin{cases} X^L_{2\alpha}, & \alpha \le 0.5, \\ X^U_{2(1-\alpha)}, & \alpha > 0.5, \end{cases} \qquad X^L_\alpha = X_{\alpha/2}, \quad X^U_\alpha = X_{1 - \alpha/2}.$$
The first relation shows that the information contained in the two-dimensional variable $(X^L_\alpha, X^U_\alpha)$ is summarized in the one-dimensional variable $X_\alpha$, which makes the computational procedures easier.
Definition 5. Two frvs X and Y are said to be independent if X α and Y α are independent, for all α ∈ [0, 1].In addition, we say that two frvs X and Y are identically distributed if X α and Y α are identically distributed, for all α ∈ [0, 1].Similar arguments can be used for more than two frvs.We also say that X 1 , . . ., X n is a fuzzy random sample if X i 's are independent and identically distributed frvs.We denote by x 1 , . . ., x n the observed values of fuzzy random sample X 1 , . . ., X n .

Fuzzy expected value, variance and covariance of an frv
In analyzing fuzzy data two main types of summary measures/parameters may be distinguished: 1. fuzzy-valued summary measures, like the mean value of an frv or the median of an frv as measures for the central tendency of their distributions; 2. real-valued summary measures, like the variance of an frv as a measure for the mean error/dispersion of the distributions of the frv, or the covariance and correlation coefficient as measures of the (absolute) linear dependence/association of an frv.
Definition 6. Let (Ω, A, P) be a probability space and X : Ω → R be a real-valued random variable. We say that X has finite mean and write $X \in L^1(\Omega, A, P)$ if and only if $E(X) = \int_\Omega X \, dP < M$, for some constant M < ∞.
Definition 7. Given a probability space (Ω, A, P) and an associated frv X : Ω → F(R) such that, for any α ∈ [0, 1], the real-valued random variable $X_\alpha$ : Ω → R on (Ω, A, P) has finite mean, the mean value of X is the fuzzy value E(X) ∈ F(R) such that, for all α ∈ [0, 1],
$$(E(X))_\alpha = E(X_\alpha).$$
The mean value of an frv satisfies the usual properties of linearity and it is the Fréchet expectation w.r.t. D, which corroborates the fact that it is a central tendency measure (Näther 2001), as stated in the following two propositions.
Proposition 1. E is additive (i.e., equivariant under the sum of frvs), that is, for frvs X and Y associated with the same probability space (Ω, A, P) and such that $X_\alpha, Y_\alpha \in L^1(\Omega, A, P)$, we have that
1. E(λ ⊕ X) = λ ⊕ E(X), for any constant number λ ∈ R.
2. E(X ⊕ Y) = E(X) ⊕ E(Y).
Proposition 2. E is the Fréchet expectation of X w.r.t. D, that is,
$$E(X) = \arg\min_{U \in F(R)} E\left[ D^2(X, U) \right],$$
so that the mean is the fuzzy value leading to the lowest mean squared D-distance (or error) with respect to the frv distribution, and this corroborates the fact that it is a central tendency measure.
Definition 8. The variance of an frv X is defined as
$$\nu(X) = E\left[ D^2(X, E(X)) \right].$$
The situation with the usual random variable is a special case of the proposed procedure: by using the indicator function $I_{\{X\}}$ as the membership function for the frv, ν(X) coincides with the variance of the crisp random variable X, that is, ν(X) = Var(X).
Now, if we define the scalar product between frvs X and Y as
$$\langle X, Y \rangle = \int_0^1 X_\alpha Y_\alpha \, d\alpha,$$
then it is easy to conclude that $\nu(X) = E\langle X, X \rangle - \langle E(X), E(X) \rangle$.
Proposition 3. Let $X = (X_1, X_2, \ldots, X_n)$ be a fuzzy random sample, and be the crisp variance value of the fuzzy sample X, where $\bar{X} = \frac{1}{n} \bigoplus_{i=1}^{n} X_i$ is the fuzzy sample mean value. Then the following properties hold: 3. $\nu(\lambda \otimes X) = \lambda^2 \nu(X)$, for any constant number λ ∈ R.
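Definition 9. The covariance and correlation coefficient of frvs X and Y are defined as follows, respectively,
$$Cov(X, Y) = \int_0^1 Cov(X_\alpha, Y_\alpha) \, d\alpha, \qquad \rho(X, Y) = \frac{Cov(X, Y)}{\sqrt{\nu(X)\, \nu(Y)}}.$$
We can easily show that 1. Cov(X, λ) = 0 for any constant number λ ∈ R.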
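The summary measures of this section can be illustrated on an frv with finitely many triangular outcomes. The sketch below rests on several assumptions that are not confirmed by the text: the scalar product is taken as ⟨X, Y⟩ = ∫₀¹ X_α Y_α dα, the squared distance as D²(X, Y) = ⟨X, X⟩ − 2⟨X, Y⟩ + ⟨Y, Y⟩, and the covariance as E⟨X, Y⟩ − ⟨E(X), E(Y)⟩. All function names are hypothetical, and outcomes are tuples (a, l, r) with given probabilities.

```python
from typing import List, Sequence, Tuple

Tri = Tuple[float, float, float]  # (centre a, left spread l, right spread r)


def pessimistic(t: Tri, alpha: float) -> float:
    """alpha-pessimistic value of a triangular fuzzy number (assumed form)."""
    a, l, r = t
    return a - l * (1 - 2 * alpha) if alpha <= 0.5 else a + r * (2 * alpha - 1)


def inner(s: Tri, t: Tri, steps: int = 2000) -> float:
    """Midpoint-rule approximation of int_0^1 s_alpha * t_alpha d alpha."""
    h = 1.0 / steps
    return h * sum(pessimistic(s, (k + 0.5) * h) * pessimistic(t, (k + 0.5) * h)
                   for k in range(steps))


def fuzzy_mean(values: Sequence[Tri], probs: Sequence[float]) -> Tri:
    """E(X) with (E(X))_alpha = E(X_alpha); for triangular outcomes the
    pessimistic values are linear in (a, l, r), so the mean is componentwise."""
    a, l, r = (sum(p * v[i] for v, p in zip(values, probs)) for i in range(3))
    return (a, l, r)


def fuzzy_variance(values: Sequence[Tri], probs: Sequence[float]) -> float:
    """nu(X) = E[D^2(X, E(X))], a crisp number."""
    m = fuzzy_mean(values, probs)
    return sum(p * (inner(v, v) - 2 * inner(v, m) + inner(m, m))
               for v, p in zip(values, probs))


def fuzzy_covariance(xs: List[Tri], ys: List[Tri], probs: Sequence[float]) -> float:
    """Assumed covariance E<X, Y> - <E(X), E(Y)> for frvs on the same finite space."""
    mx, my = fuzzy_mean(xs, probs), fuzzy_mean(ys, probs)
    return sum(p * inner(x, y) for x, y, p in zip(xs, ys, probs)) - inner(mx, my)
```

With zero spreads the outcomes are crisp numbers, and `fuzzy_variance` then reduces to the classical variance, in line with the special case noted after Definition 8.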

Fuzzy cumulative distribution function
In this section, we extend the concepts of Fuzzy Cumulative Distribution Function (F.C.D.F.) and Fuzzy Empirical Cumulative Distribution Function (F.E.C.D.F.) for an frv.
Definition 10. The F.C.D.F. of frv X at x ∈ R is defined as fuzzy set $F_X(x)$ with the following membership function
Definition 11. We say that the F.C.D.F. $F_X(x)$ is continuous at x ∈ R if, for every α ∈ [0, 1], the function $(F_X(x))^U_\alpha$ is continuous at x (or equivalently, for every α ∈ [0, 1], the crisp random variable $X_\alpha$ is continuous).
Definition 12. Suppose that $X_1, X_2, \ldots, X_n$ is a fuzzy random sample. The F.E.C.D.F. of the fuzzy random sample $X_1, X_2, \ldots, X_n$ at x ∈ R is defined to be the fuzzy set $F_n(x)$ with the following membership function
Example 2. Suppose that, based on a fuzzy random sample of size n = 30, we observe the triangular fuzzy numbers given in Table 1 (Hesamian and Chachi 2013; Viertl 2011). According to Definition 12, the F.E.C.D.F. of this fuzzy random sample is obtained and the 3-dimensional curve of its membership function is shown in Fig. 1, for every x ∈ [0, 3]. Moreover, in order to make the 3-dimensional curve of the membership function in Fig. 1 clearer, the α-cut of this membership function is shown in Fig. 2, for α = 0.3.
Example 3. Let X = Θ ⊕ Ξ, where Ξ is a (usual) normal random variable with mean 0 and variance σ², i.e. Ξ ∼ N(0, σ²), and Θ is a constant fuzzy set. For example, suppose Θ is an LR-fuzzy number, i.e. $Θ = (θ, l, r)_{LR}$ with known θ, l, r, and fixed functions L and R. Therefore, $X = (Ξ + θ, l, r)_{LR}$ and, for each ω, $X(ω) = (Ξ(ω) + θ, l, r)_{LR}$ is an observation of X. Now, we have (see also Example 1)
$$X_\alpha = \begin{cases} \Xi + \theta - l\, L^{-1}(2\alpha), & \alpha \le 0.5, \\ \Xi + \theta + r\, R^{-1}(2(1 - \alpha)), & \alpha > 0.5. \end{cases}$$
Since Ξ is a normal random variable, it is clear that $X_\alpha$ is a normal random variable for each α ∈ [0, 1], i.e. $X_\alpha \sim N(\Theta_\alpha, \sigma^2)$, where $\Theta_\alpha$ is the α-pessimistic value of Θ. So, according to Definition 4, X is an frv. We can easily show that E(X) = Θ and ν(X) = σ². Now, we are going to obtain the membership function of the fuzzy set $F_X(x)$, i.e. the F.C.D.F. of the frv X at x ∈ R. Its membership function is defined as in Definition 10, in which
$$P(X_\alpha \le x) = \Phi\left( \frac{x - \Theta_\alpha}{\sigma} \right),$$
where Φ is the cumulative distribution function of the standard normal random variable Z, i.e. if Z ∼ N(0, 1) then P(Z ≤ z) = Φ(z), z ∈ R.
We consider a simplification of the parameters Θ and σ²; therefore, we take $Θ = (0, 1, 1)_T$ and σ = 1 as special cases. Substituting these values in the above equations, we can easily obtain $\Theta_\alpha = 2\alpha - 1$ and
$$P(X_\alpha \le x) = \Phi(x + 1 - 2\alpha), \quad \alpha \in [0, 1].$$
Thus, the membership function of the fuzzy set $F_X(x)$ is given as follows for any y ∈ [0, 1] Note that the function Φ(x + 1 − 2α) is strictly decreasing with respect to α ∈ [0, 1], for any fixed x ∈ R (see Fig. 3). Therefore, for any y ∈ (0, 1), the equation y = Φ(x + 1 − 2α) has the unique solution
$$\alpha = \frac{x + 1 - \Phi^{-1}(y)}{2}.$$
The above obtained α must be between 0 and 1, so Φ(x − 1) ≤ y ≤ Φ(x + 1). Finally, according to the above equations, the membership function of $F_X(x)$ at x ∈ R is given by The membership function $F_X(x)$ is depicted in Fig. 4.
This notion of frv is the definition of normality for frvs and X = Θ ⊕ Ξ, (Ξ ∼ N (0, σ 2 ), and Θ is a constant fuzzy set) is called the normal (Gaussian) frv in the literature (Feng 2000;Puri and Ralescu 1985).
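The distributional claims of Example 3 can be checked by simulation. The following sketch is a hedged illustration for the special case Θ = (0, 1, 1)_T and σ = 1: it assumes that the α-pessimistic value of Θ is 2α − 1, so that X_α = Ξ + 2α − 1 and P(X_α ≤ x) = Φ(x + 1 − 2α). It compares this value with the ordinary empirical distribution function of simulated values of X_α; it does not implement the F.C.D.F. or F.E.C.D.F. membership functions themselves. The function names are hypothetical.

```python
import math
import random
from typing import List


def phi(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def cdf_alpha(x: float, alpha: float) -> float:
    """P(X_alpha <= x) for Theta = (0, 1, 1)_T and sigma = 1 (assumed form)."""
    return phi(x + 1.0 - 2.0 * alpha)


def simulate_centres(n: int, seed: int = 0) -> List[float]:
    """Draw n realisations of the crisp normal component Xi ~ N(0, 1)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def ecdf_alpha(centres: List[float], x: float, alpha: float) -> float:
    """Empirical distribution function of X_alpha = Xi + (2 * alpha - 1) at x."""
    shift = 2.0 * alpha - 1.0
    return sum(1 for c in centres if c + shift <= x) / len(centres)


def sample_variance_alpha(centres: List[float], alpha: float) -> float:
    """Sample variance of the simulated values of X_alpha."""
    values = [c + 2.0 * alpha - 1.0 for c in centres]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)
```

For a large simulated sample, `ecdf_alpha(centres, x, alpha)` should be close to `cdf_alpha(x, alpha)`, and the sample variance of X_α should be close to σ² = 1 for every α, which is consistent with ν(X) = σ² when the variance is an average of Var(X_α) over α.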

Conclusions
In this paper, a concept of fuzzy random variable is presented for dealing with situations where the outcomes of a random experiment are modeled by fuzzy sets. In order to model the imprecise information of random experiments, the notions of fuzzy cumulative distribution function and fuzzy empirical cumulative distribution function are considered (Möller et al. 2002). To achieve suitable statistical methods dealing with imprecise data and to extend the usual approaches to imprecise environments, several probabilistic definitions have been obtained in connection with this random element, some of them having immediate statistical implications. Fuzzy set theory seems to have suitable tools for modeling the imprecise information of random experiments and provides appropriate statistical methods based on them (see, for instance, Bandemer and Näther (1992); Chachi and Taheri (2011); Chachi, Taheri, and Viertl (2012); Colubi (2009); Colubi and Gil (2007); Colubi and González-Rodríguez (2007); Colubi, González-Rodríguez, Lubiano, and Montenegro (2006); Coppi, Gil, and Kiers (2006); Gebhardt, Gil, and Kruse (1998); González-Rodríguez, Montenegro, Colubi, and Gil (2006b); Hesamian and Chachi (2013); Kruse and Meyer (1987); Taheri and Hesamian (2011)). As a consequence, different approaches can also be provided for developing fuzzy statistical methods using the new concept of frv proposed in this paper. We end the paper with some general concluding remarks and open problems.
1- The new concept of frv proposed in this paper can be used to develop some kind of linear estimation theory. An attempt can be made to develop a certain kind of linear theory for frvs with respect to extended addition and scalar multiplication. In addition, the classical estimation problem in a linear regression model in view of fuzzy data can be a potential topic for further research (see, for instance, Wünsche and Näther (2002)).

2- The new concept of frv can be studied for limit theorems, and can be applied to asymptotic statistics with vague data (see, for instance, Klement et al. (1986)). Notice that there is a lack of Central Limit Theorems (CLTs) for frvs which are directly applicable for inferential purposes (actually, there exist some CLTs for frvs according to which the normalized distance sample-population fuzzy mean converges in law to the norm of a Gaussian random element but with values often out of the cone) (Wu 2000; Krätschmer 2002a,b). Also, the essential large sample properties of the fuzzy empirical distribution function (like Cantelli-Glivenko's Lemma (Govindarajulu 2003)) can be stated and proved.

3- From a statistical point of view, fuzzy expected value and fuzzy median play important roles as central summary measures. The point estimation of these measures can be one of the first statistical analyses concerning frvs. Later, the initial hypothesis testing procedures can be studied, although they need some theoretical/practical constraints (see, for instance, Colubi (2009)).

4- The bootstrap techniques have empirically been shown to be efficient and powerful in hypothesis testing. Furthermore, analogous two-sample tests and, in general, multi-sample tests for the equality of fuzzy expected values can also be obtained (see, for instance, González-Rodríguez et al. (2006b)).

5- As for the real/vectorial-valued case, hypotheses could either concern parameters/measures of the distribution of the frv(s) (see items 3 and 4 above) or concern the distribution itself (parametric/non-parametric). Therefore, testing hypotheses related to the distribution(s) of one-sample or multi-sample observations can be considered. In this regard, non-parametric tests (like goodness-of-fit tests) can be developed to determine whether two (or several) underlying one-dimensional distributions are the same or not. Here, based on the definition of fuzzy empirical cumulative distribution functions, test statistics and test functions can be defined (see, for instance, Lin, Wu, and Watada (2010); Hesamian and Chachi (2013); Hryniewicz (2006); Taheri and Hesamian (2011)).

6- It has been shown that the distribution of any real-valued random variable can be represented by means of a fuzzy set. The characterizing fuzzy sets correspond to the expected value of a certain frv based on a family of fuzzy-valued transformations of the original real-valued ones (González-Rodríguez et al. 2006a). They can be used for descriptive/exploratory or inferential purposes. This fact adds an extra value to the fuzzy expected value and the preceding statistical procedures, which can be used in statistics about real distributions.

Figure 1: The plot of the membership function of the F.E.C.D.F. of the fuzzy observations in Table 1 for values of x ∈ [0, 3]

Figure 4: The membership function of the fuzzy cumulative distribution function F X (x) in Example 3