Estimating the Ratio of a Crisp Variable and a Neutrosophic Variable

The estimation of the ratio of two means is studied within the neutrosophic theory framework. The variable of interest Y is measured in a sample of units, and the auxiliary variable X is available for all units from records or predictions. The two variables are correlated, and the sample is selected using simple random sampling. The indeterminacy of the auxiliary variable is modeled by treating it as a neutrosophic variable. The bias and variance of the proposed estimator are derived.


Introduction
Earlier work considered how soft set theory constitutes a general mathematical tool for modeling uncertainty and imprecision. This approach overcomes the need for parameterization that arises in the theories of probability, fuzzy sets and rough sets. The neutrosophic theory is based on a new conceptualization of sets, whose roots may be found in Smarandache (2002, 2003). The neutrosophic set is a generalization of the so-called intuitionistic fuzzy set.
The behavior of a statistic is evaluated by how it performs in the long run, that is, over a long sequence of samples $s_1, \dots, s_t$ as $t \to \infty$. An extension of classical statistics may be developed for neutrosophic statistics using similar principles. When the statistician considers that the data are known only approximately, he or she cannot be confident in inferences generated from crisp numbers. For example, if a question is sensitive, a respondent who carries the stigma will tend to falsify the true value of Y, and the data are then expected to be indeterminate. This situation arises in different neutrosophic studies, such as the problems analyzed in Cacuango et al. (2020).
The neutrosophic statistics framework admits that the information provided by the data is not crisp but ambiguous, vague, imprecise and/or incomplete. When the indeterminacy is zero, the analysis made from a neutrosophic point of view coincides with the results derived by classical statistics. The practitioner using neutrosophic statistical methods interprets and organizes the data taking these indeterminacies into account, in order to obtain clues about the underlying patterns.
Some basic ideas on estimation within a neutrosophic theory framework are developed in the next section.
Section 3 is concerned with the development of some aspects of estimation theory in a neutrosophic context.
Section 4 develops a theory of ratio estimation. Numerical experiments are discussed in Section 5.

Estimation in a Neutrosophic context
Smarandache (2014, 2016) discussed how common statistical equations and formulas may be better handled, when the data of some of the involved variables are indeterminate, by considering that those variables take values in a fixed set. The usual notation replaces the crisp variable $Z$ by its neutrosophic counterpart $Z_N$, where N identifies the variable as "neutrosophic". The imprecision about the true value of $Z$ is modeled by considering not a single value but a set that includes it. In applications of statistics, decision makers frequently deal with imprecise data. A convenient model is to consider $Z_{Ni} = Z_i + A_i I_Z$ instead of $Z_i$. The statistician measures $Z_i$ but feels that it is imprecise: it is subject to a basic error, which belongs to an interval $I_Z$ and is "tuned" by $A_i$ for each $i$. For example, a promoter obtains on the web a value of the achievement index of a singer, say $Z_i$, but doubts that it is correct because public preferences change constantly. Suppose that changes in the singers' indexes lie in $I_Z = (-2.5, 2.5)$, but for a particular singer the decision maker considers that the change may be 3.5 times larger. Then $Z_i + 3.5 I_Z$ is recorded. The decision maker is now able to implement decision rules in which imprecision is modeled.
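The singer example can be sketched numerically. The following is a minimal illustration, assuming that $I_Z$ is treated as a plain interval and that $Z_i + A_i I_Z$ is evaluated with ordinary interval arithmetic; the class `Interval`, the function `neutrosophic_value` and the index value 60 are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] standing in for the indeterminacy I_Z."""
    lo: float
    hi: float

    def scale(self, a: float) -> "Interval":
        # Multiplying by a negative scalar swaps the end points.
        lo, hi = a * self.lo, a * self.hi
        return Interval(min(lo, hi), max(lo, hi))

    def shift(self, z: float) -> "Interval":
        return Interval(z + self.lo, z + self.hi)


def neutrosophic_value(z: float, a: float, i_z: Interval) -> Interval:
    """Range of values covered by z + a * I_Z (interval reading, an assumption)."""
    return i_z.scale(a).shift(z)


I_Z = Interval(-2.5, 2.5)   # interval of index changes from the example
z_i = 60.0                  # hypothetical measured index of one singer
recorded = neutrosophic_value(z_i, 3.5, I_Z)
print(recorded)             # Interval(lo=51.25, hi=68.75)
```

With a tuning factor of zero the interval collapses to the crisp value, which matches the remark that zero indeterminacy recovers the classical analysis.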
Consider a sample of size m. The observations determine the sequence $\{(Z_i + A_i I_Z),\ i = 1, \dots, m\}$. From Smarandache (2014, 2016) it is easily deduced that the sample mean is $\bar{Z}_N = \bar{Z} + \bar{A} I_Z$, where $\bar{Z} = \frac{1}{m}\sum_{i=1}^{m} Z_i$ and $\bar{A} = \frac{1}{m}\sum_{i=1}^{m} A_i$. The deviation of each observation is $(Z_i - \bar{Z}) + (A_i - \bar{A}) I_Z$ and its square is given by [...]. Therefore, the sample variance is [...].
The sampling experiments are described as usual. For the population U there is a sample space S. The sampler selects a sampling design d(s), which assigns a probability to each element of the sample space and determines the probability of selecting a given unit of U by computing $P(u_i) = \sum_{s \ni u_i} d(s)$. The variable of interest Y is measured in each selected unit; it is a random variable that provides a result $Y(u_i)$.
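As a small numerical check of the sample mean of neutrosophic observations, the sketch below computes the mean of the crisp parts and the mean of the tuning factors, and then evaluates the resulting interval. It assumes non-negative tuning factors and an interval reading of $I_Z$; the function name and the data are hypothetical.

```python
from statistics import fmean


def neutrosophic_sample_mean(z, a, i_z):
    """Mean of the observations z_i + a_i * I_Z.

    Returns (z_bar, a_bar, bounds), where bounds is the interval
    z_bar + a_bar * I_Z. Assumes every a_i >= 0 (an assumption of this sketch).
    """
    z_bar, a_bar = fmean(z), fmean(a)
    lo, hi = sorted((a_bar * i_z[0], a_bar * i_z[1]))
    return z_bar, a_bar, (z_bar + lo, z_bar + hi)


# Hypothetical sample of m = 4 indexes with their tuning factors.
z = [60.0, 72.0, 48.0, 80.0]
a = [3.5, 1.0, 2.0, 1.5]
i_z = (-2.5, 2.5)

z_bar, a_bar, bounds = neutrosophic_sample_mean(z, a, i_z)
print(z_bar, a_bar, bounds)   # 65.0 2.0 (60.0, 70.0)

# Averaging the lower end points of each observation gives the same lower bound.
lower_ends = [zi + ai * i_z[0] for zi, ai in zip(z, a)]
print(fmean(lower_ends))      # 60.0
```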
The sampler looks up the collection of recorded values of X, $x_1, \dots, x_M$, and obtains the corresponding random value $x_i$. Because some indeterminacy exists, it is treated as the neutrosophic random variable $x_{Ni} = x_i + A_i I_X$. The study should therefore acknowledge the indeterminacy in the records and work within a neutrosophic framework. Hence, for an individual $u_i$ of the population, $u_i \rightsquigarrow (y_i,\ x_i + A_i I_X)$ is obtainable. As the unit is selected with probability $P(u_i)$, the gathered information is random. In the rest of the paper it is assumed, without loss of generality, that $A_j = A$ for any $j = 1, \dots, M$.
The sampling design considered in this paper is simple random sampling (SRS) without replacement. It is defined, see Singh (2003), as $d(s) = 1/\binom{M}{m}$ for every sample s of size m. In that case [...]. This result also holds if the selection is made with replacement. When M is sufficiently large, the difference between selecting with or without replacement is negligible in terms of the inferential processes, when [...] ≅ 1. Using (2.6), it is easily derived that the variance of [...] is given by [...]. Y is a crisp variable and its variance is [...].
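The way a design assigns probabilities to samples and, through them, to units can be checked by brute force on a tiny population. The sketch below enumerates every sample of an SRS without replacement design and sums the design probabilities of the samples containing each unit. The population values and function names are hypothetical; the two properties verified (inclusion probability m/M and an unbiased sample mean) are standard facts about this design and are not quoted from the paper.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def srswor_design(M, m):
    """Map each sample (tuple of unit labels) to its design probability d(s)."""
    d = Fraction(1, comb(M, m))
    return {s: d for s in combinations(range(M), m)}


def inclusion_probability(design, unit):
    """P(u_i): sum of d(s) over the samples s that contain the unit."""
    return sum(p for s, p in design.items() if unit in s)


M, m = 6, 2
design = srswor_design(M, m)
probs = [inclusion_probability(design, u) for u in range(M)]
print(probs[0])            # 1/3, that is m/M

# Design expectation of the sample mean for hypothetical crisp values y.
y = [3, 5, 8, 13, 21, 34]
expected_mean = sum(p * Fraction(sum(y[i] for i in s), m)
                    for s, p in design.items())
print(expected_mean)       # 14, the population mean
```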

A ratio estimator
Statistical research frequently must deal with the estimation of a ratio, or use a ratio to derive an estimator of the mean or the total of a variable of interest Y. A concomitant, or auxiliary, variable X, correlated with Y, is known. The population ratio is $R = \bar{Y}/\bar{X}$. Consider that an SRS sample is drawn from the population. A naive estimator is $r = \bar{y}/\bar{x}$, where $\bar{y}$ and $\bar{x}$ are the sample means of Y and X. The auxiliary variable is commonly obtained from records or predictions, which are usually outdated and/or imprecise. This imprecision may be modeled adequately in the context of neutrosophic theory.
Consider the neutrosophic number $X + A I_X$, where $I_X$ ranges over an interval. The sampler may model the imprecise knowledge of X by fixing a measurement error interval for every individual of the population. For example, if the decision maker considers that the percentage of tax under-reporting is between 0% and 20%, this fixes $I_X \in (0,\ 0.2\,X)$.
The variable Y is measured by directly interviewing the individuals of the population. It is a non-neutrosophic number. In the previous examples, R may be:
- The ratio of the mean public preference for a song to the monthly mean of downloads.
- The ratio of the reported taxes to the payments on the previous occasion.
- The ratio of the fuel consumption reports of a transport enterprise in two consecutive months.
The alternative neutrosophic population ratio is [...]. The operator [...] is used for modeling the indeterminacy arising from the lack of knowledge of the true value of [...].
Values of [...] close to zero model the decision maker's confidence that [...] is the true expectation of X. On many occasions it is adequate to use [...]. The decision maker usually fixes $x_0 = 0$.
Once a sample is selected, the sample mean of $X_N$ is computed, $\bar{x}_N = \bar{x} + A I_X$, as well as [...]. The need to estimate a ratio of two variables in this context suggests using $r_N = \bar{y}/\bar{x}_N$. Then, the neutrosophic estimator is [...].
Take the first term and analyze
$r - R = \bar{y}/\bar{x} - R$ (4.8)
Consider $\Delta_{\bar{z}} = (\bar{z} - \bar{Z})/\bar{Z}$, $z = x, y$. As $|\Delta_{\bar{z}}| < 1$ when $m \to \infty$, it is valid to develop the expression in $\Delta_{\bar{x}}$ in a Taylor series [...]. The expected values of the summands are $E(\Delta_{\bar{z}}) = 0$, $z = x, y$, $E(\Delta_{\bar{z}}^2) = V(\bar{z})/\bar{Z}^2$ and $E(\Delta_{\bar{x}} \Delta_{\bar{y}}) = \mathrm{Cov}(\bar{x}, \bar{y})/(\bar{X}\bar{Y})$, and it is derived that [...].
For the second term [...]. The approximation is valid assuming that $\Delta_{\bar{x}} \cong 0$ for a sufficiently large sample size. The first term of the expected value is zero. Note that [...]. Its design expectation is [...].
On the other hand, [...]. Developing the first term in a Taylor series gives [...]. The second term is [...] and its expectation is given by [...]. Developing the third term gives [...], because only the terms of order t ≤ 2 in the Taylor series are considered significant. Therefore [...].
Note that, if the indeterminacy term is zero, the classical sampling results on ratio estimators are obtained.
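The effect of the indeterminacy on a ratio estimate can be illustrated with a short sketch. It assumes that the estimator has the form $\bar{y}/(\bar{x} + A I_X)$ with $I_X$ ranging over an interval $(x_0, x_1)$, and it simply evaluates the ratio at both ends of that interval. The function name, the upper bound $x_1$ and the data are hypothetical; this is not the bias or variance derivation of the paper.

```python
from statistics import fmean


def neutrosophic_ratio(y, x, a, i_x):
    """Classical ratio y_bar / x_bar and the range of y_bar / (x_bar + a * I_X).

    i_x is the pair (x0, x1) bounding the indeterminacy (assumed form).
    """
    y_bar, x_bar = fmean(y), fmean(x)
    denominators = (x_bar + a * i_x[0], x_bar + a * i_x[1])
    if min(denominators) <= 0:
        raise ValueError("the denominator interval must be strictly positive")
    ratios = sorted(y_bar / d for d in denominators)
    return y_bar / x_bar, (ratios[0], ratios[1])


# Hypothetical sample: y measured directly, x taken from imprecise records.
y = [12.0, 18.0, 30.0]
x = [8.0, 10.0, 12.0]

classical, bounds = neutrosophic_ratio(y, x, a=1.0, i_x=(0.0, 2.5))
print(classical, bounds)    # 2.0 (1.6, 2.0)

# With a = 0 the indeterminacy vanishes and the classical estimate is recovered.
_, crisp_bounds = neutrosophic_ratio(y, x, a=0.0, i_x=(0.0, 2.5))
print(crisp_bounds)         # (2.0, 2.0)
```

Because the lower bound of the interval is fixed at zero in this example, the classical estimate sits at one end of the range and the indeterminacy can only pull the ratio downward.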

Numerical studies
Only one of the multiple theoretical challenges of developing neutrosophic counterparts of sampling models is [...].
P2. The farmers' tax report was the variable of interest Y and X was the previous month's tax payment. The population size was 450. The consensus of the specialists of the state office played the role of the decision maker.
P3. The fuel consumption report of a fleet of 76 vehicles in two consecutive months was measured. Y was the consumption in the current month and X the report for the previous month. The owner of the enterprise acted as decision maker.
P4. The monthly incomes of 117 employees of an enterprise were Y, and X was the total amount of operations with their credit cards. The owner of the enterprise acted as decision maker.
The researchers had complete knowledge of Y and X. Hence it was possible to compute the values of the involved [...]. See the results of the study in Table 1. A reading of the rows of the table suggests that the samples averaged an absolute difference between the estimate and the true value, which takes the values in the corresponding fifth column. The MSE of the methods appears in the eighth column. The decision makers fixed the corresponding $I_X$. They considered that a good description of the imprecision of the estimates was obtained, as judged by their own appreciations.
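The two accuracy measures mentioned above, the average absolute difference and the MSE, can be approximated by repeated sampling when Y and X are fully known. The sketch below does this for the classical ratio estimator under SRS without replacement. The population is synthetic (its size of 76 only mirrors P3), and the sample size, number of replications and function names are assumptions of this illustration, not values from the study.

```python
import random
from statistics import fmean


def simulate_ratio(pop_y, pop_x, m, reps, seed=0):
    """Monte Carlo accuracy of r = y_bar / x_bar under SRS without replacement.

    Returns (R, mean absolute difference, mean squared error).
    """
    rng = random.Random(seed)
    true_ratio = fmean(pop_y) / fmean(pop_x)
    units = range(len(pop_y))
    errors = []
    for _ in range(reps):
        s = rng.sample(units, m)                 # one SRSWOR sample of size m
        r = fmean(pop_y[i] for i in s) / fmean(pop_x[i] for i in s)
        errors.append(r - true_ratio)
    mad = fmean(abs(e) for e in errors)
    mse = fmean(e * e for e in errors)
    return true_ratio, mad, mse


# Synthetic population: x is last month's report, y is roughly proportional to x.
gen = random.Random(1)
pop_x = [gen.uniform(50.0, 150.0) for _ in range(76)]
pop_y = [1.1 * xi + gen.gauss(0.0, 5.0) for xi in pop_x]

R, mad, mse = simulate_ratio(pop_y, pop_x, m=20, reps=2000)
print(f"R = {R:.4f}, mean |r - R| = {mad:.4f}, MSE = {mse:.6f}")
```

The same loop could evaluate the end points of a neutrosophic estimate by replacing the denominator with the sample mean of X plus A times each bound of $I_X$.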