Goodness-of-fit criteria (criteria of agreement)
To test the hypothesis that an empirical distribution corresponds to a theoretical distribution law, special statistical indicators are used: goodness-of-fit criteria (criteria of agreement). These include the criteria of Pearson, Kolmogorov, Romanovsky, Yastremsky, and others. Most goodness-of-fit criteria are based on the deviations of the empirical frequencies from the theoretical ones. Obviously, the smaller these deviations, the better the theoretical distribution corresponds to (or describes) the empirical one.
Goodness-of-fit criteria are criteria for testing hypotheses that an empirical distribution corresponds to a theoretical probability distribution. Such criteria are divided into two classes: general and special. General goodness-of-fit tests apply to the most general formulation of a hypothesis, namely the hypothesis that the observed results agree with any a priori assumed probability distribution. Special goodness-of-fit tests involve special null hypotheses that state agreement with a particular form of probability distribution.
Goodness-of-fit criteria, based on an established distribution law, make it possible to establish when the discrepancies between theoretical and empirical frequencies should be considered insignificant (random), and when significant (non-random). It follows that these criteria allow one to reject or confirm the hypothesis put forward, when fitting the empirical series, about the nature of the distribution, and to answer whether a model expressed by some theoretical distribution law can be accepted for a given empirical distribution.
Pearson's χ² (chi-square) goodness-of-fit test is one of the main goodness-of-fit tests. It was proposed by the English mathematician Karl Pearson (1857-1936) to assess the randomness (significance) of the discrepancies between the frequencies of the empirical and theoretical distributions:

χ² = Σ (fᵢ − fᵢᵀ)² / fᵢᵀ (the sum is taken over the k groups, i = 1, …, k),

where k is the number of groups into which the empirical distribution is divided; fᵢ is the empirical frequency of the trait in the i-th group; fᵢᵀ is the theoretical frequency of the trait in the i-th group.
The scheme for applying the χ² criterion to assess the consistency of a theoretical and an empirical distribution comes down to the following.
- 1. The calculated measure of discrepancy χ²calc is determined.
- 2. The number of degrees of freedom is determined.
- 3. Based on the number of degrees of freedom ν and the chosen significance level α, the tabulated value χ²tab is determined from a special table.
- 4. If χ²calc > χ²tab, then for the given significance level α and number of degrees of freedom ν, the hypothesis about the insignificance (randomness) of the discrepancies is rejected. Otherwise, the hypothesis can be recognized as not contradicting the experimental data obtained, and with probability (1 − α) it can be argued that the discrepancies between the theoretical and empirical frequencies are random.
Significance level is the probability of erroneously rejecting the hypothesis put forward, i.e. the probability that a correct hypothesis will be rejected. In statistical studies, depending on the importance and responsibility of the problems being solved, the following three significance levels are used:
- 1) α = 0.1, then P = 0.9;
- 2) α = 0.05, then P = 0.95;
- 3) α = 0.01, then P = 0.99.
When using the χ² goodness-of-fit criterion, the following conditions must be met.
- 1. The volume of the population under study must satisfy n > 50, and the frequency (size) of each group must be at least 5. If this condition is violated, the small frequencies (less than 5) must first be combined.
- 2. The empirical distribution must consist of data obtained as a result of random sampling, i.e. they must be independent.
The disadvantage of the Pearson goodness-of-fit criterion is the loss of part of the original information, associated with the need to group the observation results into intervals and to combine individual intervals with a small number of observations. In this regard, it is recommended to supplement the χ² check of distribution correspondence with other criteria. This is especially necessary when the sample size is moderate (n ≈ 100).
In statistics, the Kolmogorov goodness-of-fit test (also known as the Kolmogorov-Smirnov goodness-of-fit test) is used to determine whether two empirical distributions obey the same law, or whether a resulting distribution obeys an assumed model. The Kolmogorov criterion is based on determining the maximum discrepancy between the accumulated frequencies or relative frequencies of the empirical and theoretical distributions. It is calculated using the following formulas:
λ = D / √N,  λ = d·√N,

where D and d are, respectively, the maximum difference between the accumulated frequencies (f − f′) and between the accumulated relative frequencies (p − p′) of the empirical and theoretical distribution series, and N is the number of units in the population.
Having calculated λ, a special table is used to determine the probability with which one can state that the deviations of the empirical frequencies from the theoretical ones are random. If λ takes values up to 0.3, this means the frequencies practically coincide. With a large number of observations, the Kolmogorov test is able to detect any deviation from the hypothesis: any difference of the sample distribution from the theoretical one will be detected if there is a sufficiently large number of observations. The practical significance of this property is small, since in most cases it is difficult to count on obtaining a large number of observations under constant conditions, the theoretical idea of the distribution law that the sample should obey is always approximate, and the accuracy of statistical tests should not exceed the accuracy of the selected model.
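As a sketch, λ can be computed from two frequency series; all numbers below are hypothetical:

```python
# Kolmogorov's lambda from accumulated (cumulative) frequencies.
f_emp = [5, 12, 25, 30, 18, 10]     # empirical frequencies (invented)
f_theor = [6, 13, 24, 28, 19, 10]   # theoretical frequencies (invented)
N = sum(f_emp)                      # number of units in the population

# Accumulated frequencies of both series
cum_e, cum_t, s_e, s_t = [], [], 0, 0
for fe, ft in zip(f_emp, f_theor):
    s_e += fe
    s_t += ft
    cum_e.append(s_e)
    cum_t.append(s_t)

# D = max |f - f'| over the accumulated series; lambda = D / sqrt(N)
D = max(abs(a - b) for a, b in zip(cum_e, cum_t))
lam = D / N ** 0.5
print(D, lam)
```

Here λ = 0.2 ≤ 0.3, which by the rule above would indicate practically complete coincidence of the frequencies.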
The Romanovsky goodness-of-fit test is based on the Pearson criterion, i.e. on the already found value of χ² and the number of degrees of freedom:

K_R = (χ² − ν) / √(2ν),

where ν is the number of degrees of freedom of variation.

The Romanovsky criterion is convenient in the absence of tables for χ². If K_R < 3, the discrepancies between the theoretical and empirical distributions are considered random; if K_R > 3, they are non-random, and the theoretical distribution cannot serve as a model for the empirical distribution being studied.
B. S. Yastremsky used in his goodness-of-fit criterion not the number of degrees of freedom but the number of groups (k), a special quantity θ depending on the number of groups, and the chi-square value. The Yastremsky criterion has the same meaning as the Romanovsky criterion and is expressed by the formula

J = |χ² − k| / √(2k + 4θ),

where χ² is Pearson's goodness-of-fit statistic; k is the number of groups; θ is a coefficient equal to 0.6 when the number of groups is less than 20.

If J > 3, the discrepancies between the theoretical and empirical distributions are non-random, i.e. the empirical distribution does not meet the requirements of a normal distribution. If J < 3, the discrepancies are considered random and the empirical distribution consistent with the assumed one.
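Both criteria are easy to compute once χ² is known; the following sketch uses assumed values of χ², ν and k:

```python
# Romanovsky and Yastremsky criteria from an already-found chi-square value.
chi2 = 4.8   # Pearson chi-square, assumed already calculated (hypothetical)
v = 4        # degrees of freedom (hypothetical)
k = 7        # number of groups (hypothetical)
theta = 0.6  # Yastremsky's coefficient for fewer than 20 groups

# Romanovsky: K_R = (chi2 - v) / sqrt(2 v)
K_R = (chi2 - v) / (2 * v) ** 0.5

# Yastremsky: J = |chi2 - k| / sqrt(2 k + 4 theta)
J = abs(chi2 - k) / (2 * k + 4 * theta) ** 0.5

# Both values below 3 -> the discrepancies may be treated as random
print(round(K_R, 3), round(J, 3))
```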
Theoretical and empirical frequencies. Checking for normal distribution
When analyzing variation (distribution) series, it is of great importance how closely the empirical distribution of a trait corresponds to the normal one. To assess this, the frequencies of the actual distribution must be compared with the theoretical frequencies characteristic of a normal distribution. This means that, based on the actual data, one must calculate the theoretical frequencies of the normal distribution curve, which are a function of the normalized deviations.
In other words, the empirical distribution curve needs to be aligned with the normal distribution curve.
Objective characteristics of the correspondence between theoretical and empirical frequencies can be obtained using special statistical indicators called goodness-of-fit criteria.

A goodness-of-fit criterion is a criterion that allows one to determine whether the discrepancy between the empirical and theoretical distributions is random or significant, i.e. whether the observational data agree with the statistical hypothesis put forward or not. The distribution of the population which it has by virtue of the hypothesis put forward is called theoretical.
There is thus a need to establish a criterion (rule) that would allow one to judge whether the discrepancy between the empirical and theoretical distributions is random or significant. If the discrepancy is random, the observational data (sample) are considered consistent with the hypothesis put forward about the distribution law of the general population, and the hypothesis is accepted; if the discrepancy turns out to be significant, the observational data do not agree with the hypothesis and it is rejected.
Empirical and theoretical frequencies usually differ for one of two reasons:
the discrepancy is random and due to a limited number of observations;
the discrepancy is not accidental and is explained by the fact that the statistical hypothesis that the population is normally distributed is erroneous.
Thus, goodness-of-fit criteria make it possible to reject or confirm the hypothesis put forward, when fitting the empirical series, about the nature of the distribution.
Empirical frequencies are obtained as a result of observation; theoretical frequencies are calculated by formulas.
For the normal distribution law they can be found as follows:

f_T = (Σfᵢ · h / σ) · φ(t),

where
Σfᵢ is the sum of the empirical frequencies (the total number of observations);
h is the difference between two neighbouring variants;
σ is the sample standard deviation;
t = (xᵢ − x̄)/σ is the normalized (standardized) deviation;
φ(t) is the probability density function of the normal distribution (found from the table of values of the local Laplace function for the corresponding value of t).
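A minimal sketch of this calculation, assuming a hypothetical variation series (the variants x and frequencies f are invented for illustration):

```python
import math

# Theoretical frequencies of a normal curve: f_T = (n * h / sigma) * phi(t).
x = [10, 12, 14, 16, 18]   # variants (hypothetical)
f = [4, 14, 30, 13, 6]     # empirical frequencies (hypothetical)
n = sum(f)                 # total number of observations
h = x[1] - x[0]            # difference between neighbouring variants

# Weighted mean and standard deviation of the series
mean = sum(xi * fi for xi, fi in zip(x, f)) / n
sigma = math.sqrt(sum((xi - mean) ** 2 * fi for xi, fi in zip(x, f)) / n)

def phi(t):
    # Standard normal density, i.e. the tabulated "local Laplace function"
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

f_theor = [n * h / sigma * phi((xi - mean) / sigma) for xi in x]
print([round(ft, 1) for ft in f_theor])
```

The theoretical frequencies sum approximately to n and peak at the modal variant, as expected for a fitted normal curve.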
There are several goodness-of-fit tests, the most common of which are: chi-square test (Pearson), Kolmogorov test, Romanovsky test.
The Pearson χ² goodness-of-fit test is one of the main ones. It can be represented as the sum of the ratios of the squared differences between the theoretical (f_T) and empirical (f) frequencies to the theoretical frequencies:

χ² = Σ (fᵢ − f_T)² / f_T,

where
k is the number of groups into which the empirical distribution is divided,
f i – observed frequency of the trait in the i-th group,
f T – theoretical frequency.
For the χ² distribution, tables have been compiled that give the critical value of the χ² goodness-of-fit criterion for a selected significance level α and degrees of freedom df (or ν). The significance level α is the probability of erroneously rejecting the hypothesis put forward, i.e. the probability that a correct hypothesis will be rejected; P is the probability of accepting a correct hypothesis. In statistics, three significance levels are most often used:
α = 0.10, then P = 0.90 (in 10 cases out of 100 the correct hypothesis may be rejected);
α = 0.05, then P = 0.95 (in 5 cases out of 100);
α = 0.01, then P = 0.99 (in 1 case out of 100).
The number of degrees of freedom df is defined as the number of groups in the distribution series minus the number of connections: df = k − z. The number of connections is the number of indicators of the empirical series used in calculating the theoretical frequencies, i.e. indicators connecting the empirical and theoretical frequencies. For example, when fitting a normal curve there are three connections (the number of observations, the mean and the standard deviation), so in that case the number of degrees of freedom is df = k − 3. To assess significance, the calculated value χ²calc is compared with the tabulated value χ²tab.
If the theoretical and empirical distributions coincide completely, χ² = 0; otherwise χ² > 0. If χ²calc > χ²tab, then for the given significance level and number of degrees of freedom we reject the hypothesis about the insignificance (randomness) of the discrepancies. If χ²calc < χ²tab, the hypothesis is accepted, and with probability P = (1 − α) it can be argued that the discrepancy between the theoretical and empirical frequencies is random. Consequently, there are grounds to assert that the empirical distribution obeys the normal distribution. Pearson's goodness-of-fit test is used if the population size is large enough (N > 50) and the frequency of each group is at least 5.
The Kolmogorov goodness-of-fit test is based on determining the maximum discrepancy between the accumulated empirical and theoretical frequencies:

λ = D / √N = d·√N,

where D and d are, respectively, the maximum difference between the accumulated frequencies and between the accumulated relative frequencies of the empirical and theoretical distributions. Using the distribution table of the Kolmogorov statistic, the probability P(λ) is determined, which can vary from 0 to 1. At P(λ) = 1 there is complete coincidence of the frequencies, at P(λ) = 0 a complete divergence. If the probability P is significant for the found value of λ, then the discrepancies between the theoretical and empirical distributions can be considered insignificant, i.e. random. The main condition for using the Kolmogorov criterion is a sufficiently large number of observations.
Kolmogorov goodness-of-fit test
Let us consider how the Kolmogorov criterion (λ) is applied when testing the hypothesis that the general population is normally distributed. Fitting the actual distribution to the normal curve consists of several steps:
Compare actual and theoretical frequencies.
Based on actual data, the theoretical frequencies of the normal distribution curve, which is a function of the normalized deviation, are determined.
They check to what extent the distribution of the characteristic corresponds to normal.
For column IV of the table:
In MS Excel, the normalized deviation (t) is calculated with the STANDARDIZE function (НОРМАЛИЗАЦИЯ in the Russian version). Select a range of free cells equal in number to the variants (spreadsheet rows). Without removing the selection, call the STANDARDIZE function and, in the dialog box that appears, specify the cells containing, respectively, the observed values (Xᵢ), the mean (X̄) and the standard deviation σ. The operation must be completed by simultaneously pressing Ctrl+Shift+Enter (entering it as an array formula).
For column V of the table:
The probability density function of the normal distribution φ(t) is found from the table of values of the local Laplace function for the corresponding value of the normalized deviation (t)
For column VI of the table:
The Kolmogorov goodness-of-fit statistic (λ) is determined by dividing the maximum absolute difference between the empirical and theoretical cumulative frequencies by the square root of the number of observations:
Using a special probability table for the criterion λ, we determine that the value λ = 0.59 corresponds to a probability P(λ) ≈ 0.88. Since this probability is high, the discrepancies between the empirical and theoretical frequencies can be considered random, and the hypothesis of normality is not rejected.
Distribution of empirical and theoretical frequencies, probability density of theoretical distribution
When applying goodness-of-fit tests to check whether the observed (empirical) distribution corresponds to the theoretical one, one should distinguish between testing simple and complex hypotheses.
The one-sample Kolmogorov-Smirnov normality test is based on the maximum difference between the cumulative empirical distribution of the sample and the estimated (theoretical) cumulative distribution. If the Kolmogorov-Smirnov D statistic is significant, the hypothesis that the distribution is normal should be rejected.
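A sketch of the D statistic for a fitted normal law, in pure Python with hypothetical data (in practice a library routine such as scipy.stats.kstest would normally be used):

```python
import math

# One-sample Kolmogorov-Smirnov D against a normal law fitted to the sample.
data = sorted([4.1, 4.7, 5.0, 5.3, 5.4, 5.8, 6.0, 6.4, 6.9, 7.2])  # invented
n = len(data)
mean = sum(data) / n
sd = (sum((x - mean) ** 2 for x in data) / (n - 1)) ** 0.5

def norm_cdf(x):
    # Estimated (theoretical) cumulative distribution of the fitted normal law
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

# D = maximum distance between the empirical step function and the fitted CDF,
# checked on both sides of each jump of the empirical distribution
D = max(max(abs((i + 1) / n - norm_cdf(x)), abs(i / n - norm_cdf(x)))
        for i, x in enumerate(data))
print(round(D, 3))
```

For a decision one would compare D with the critical value of the Kolmogorov-Smirnov distribution (for a fitted normal law, the Lilliefors correction applies).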
Test: Using Goodness-of-Fit Criteria

Introduction
In the practice of statistical analysis of experimental data, the main interest lies not in the calculation of particular statistics itself but in answers to questions of this type: Is the population mean really equal to a certain number? Is the correlation coefficient significantly different from zero? Are the variances of two samples equal? Many such questions may arise, depending on the specific research problem. Accordingly, many criteria have been developed for testing the proposed statistical hypotheses. We will consider some of the most common ones, relating mainly to means, variances, correlation coefficients and frequency distributions.
All criteria for testing statistical hypotheses are divided into two large groups: parametric and non-parametric. Parametric tests are based on the assumption that the sample data are drawn from a population with a known distribution, and the main task is to estimate the parameters of this distribution. Nonparametric tests do not require any assumptions about the nature of the distribution, other than the assumption that it is continuous.
Let us look at the parametric criteria first. The testing sequence will include: formulating the null and alternative hypotheses; stating the assumptions to be made; determining the sample statistic used in the test and forming its sampling distribution; determining the critical regions for the chosen criterion; and constructing a confidence interval for the sample statistic.
1 Goodness-of-fit criteria for means
Let the hypothesis being tested be that the population mean μ is equal to a given value a. The need for such a check may arise, for example, in the following situation. Suppose that, on the basis of extensive research, the mean shell diameter of a fossil mollusk in sediments from some fixed location has been established. Let us also have at our disposal a certain number of shells found in another place, and let us make the assumption that the specific place does not affect the diameter of the shell, i.e. that the mean shell diameter for the entire population of mollusks that once lived in the new place is equal to the known value obtained earlier when studying this species of mollusk in the first habitat.

If this known value is equal to a, then the null and alternative hypotheses are written as follows: H₀: μ = a, H₁: μ ≠ a. Let us assume that the variable x in the population under consideration has a normal distribution and that the population variance is unknown.
We will test the hypothesis using the statistic

t = (x̄ − a) / (s / √n), (1)

where x̄ is the sample mean and s is the sample standard deviation.
It has been shown that if H₀ is true, then t in expression (1) has a Student t-distribution with n − 1 degrees of freedom. If the significance level (the probability of rejecting a correct hypothesis) is chosen equal to α, then, in accordance with what was discussed in the previous chapter, the critical values for testing H₀ can be determined.

In this case, since the Student distribution is symmetric, a (1 − α) share of the area under the curve of this distribution with n − 1 degrees of freedom is contained between two points that are equal in absolute value, −t_α and +t_α. Therefore, all values less than −t_α and greater than +t_α of the t-distribution with the given number of degrees of freedom at the chosen significance level constitute the critical region. If the sample value of t falls within this region, the alternative hypothesis is accepted.
The confidence interval for μ is constructed by the previously described method and is determined from the expression

x̄ − t_α·s/√n ≤ μ ≤ x̄ + t_α·s/√n. (2)
So, suppose that in our case the known shell diameter of the fossil mollusk is 18.2 mm, and that we have at our disposal a sample of 50 newly found shells, for which the sample mean x̄ (in mm) and s = 2.18 mm were obtained. We test H₀: μ = 18.2 against H₁: μ ≠ 18.2 using (1).

If the significance level is chosen as α = 0.05, then the critical value for 49 degrees of freedom is t₀.₀₅ ≈ 2.01, and H₀ can be rejected in favor of H₁ at the significance level α = 0.05. Thus, for our hypothetical example it can be argued (with some probability, of course) that the shell diameter of fossil mollusks of this species depends on the places in which they lived.

Because the t-distribution is symmetric, only the positive t values of this distribution are given in tables for the selected significance levels and numbers of degrees of freedom. Moreover, not only the share of the area under the distribution curve to the right of +t is taken into account, but also that to the left of −t. This is because in most cases we are interested in the significance of the deviations as such, regardless of their direction, i.e. we test H₀: μ = a against H₁: μ ≠ a, and not against H₁: μ > a or H₁: μ < a. Returning to our example, the 100(1 − α)% confidence interval for μ is x̄ ± 2.01·s/√n.
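A sketch of this test with the shell-diameter numbers from the text (n = 50, a = 18.2 mm, s = 2.18 mm); the sample mean 18.9 mm is an assumed value for illustration, since it is not preserved in this copy of the text:

```python
import math

# One-sample t-test of H0: mu = a, statistic (1), two-sided alternative.
n = 50
x_bar = 18.9     # sample mean in mm -- ASSUMED value, not from the text
s = 2.18         # sample standard deviation, mm
a = 18.2         # hypothesised population mean, mm

t = (x_bar - a) / (s / math.sqrt(n))
t_crit = 2.01    # two-sided t-table value for alpha = 0.05, df = 49

reject = abs(t) > t_crit

# 95% confidence interval (2) for mu: x_bar +/- t_crit * s / sqrt(n)
half = t_crit * s / math.sqrt(n)
ci = (x_bar - half, x_bar + half)
print(round(t, 2), reject, [round(c, 2) for c in ci])
```

With the assumed mean, t ≈ 2.27 > 2.01, so H₀ would be rejected, matching the conclusion in the text.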
Let us now consider the case when it is necessary to compare the means of two general populations. The hypothesis being tested looks like this: H₀: μ₁ − μ₂ = 0, H₁: μ₁ − μ₂ ≠ 0. It is assumed that x₁ has a normal distribution with mean μ₁ and variance σ², and x₂ a normal distribution with mean μ₂ and the same variance σ². We also assume that the samples by which the populations are estimated are drawn independently of each other, with sizes n₁ and n₂ respectively. From the independence of the samples it follows that if we take a large number of them and calculate the pair of means for each, then the set of these pairs of means will be completely uncorrelated. The null hypothesis is tested using the statistic

t = (x̄₁ − x̄₂) / √( [((n₁ − 1)s₁² + (n₂ − 1)s₂²)/(n₁ + n₂ − 2)] · (1/n₁ + 1/n₂) ), (3)

where s₁² and s₂² are the variance estimates for the first and second samples, respectively. It is easy to see that (3) is a generalization of (1). It has been shown that statistic (3) has a Student t-distribution with n₁ + n₂ − 2 degrees of freedom. If n₁ and n₂ are equal, i.e. n₁ = n₂ = n, formula (3) simplifies and takes the form

t = (x̄₁ − x̄₂) / √((s₁² + s₂²)/n). (4)
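A numeric sketch of statistic (3) with a pooled variance; all sample numbers are hypothetical:

```python
import math

# Two-sample Student t-test with a pooled variance, statistic (3).
n1, mean1, s1_sq = 25, 14.2, 3.1   # first sample: size, mean, variance (invented)
n2, mean2, s2_sq = 30, 12.9, 2.8   # second sample (invented)

# Pooled variance: (n-1)*s^2 equals the sum of squared deviations,
# so this is the same pooled estimate written via the sample variances
s_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

# Statistic (3): difference of means over its standard error
t = (mean1 - mean2) / math.sqrt(s_sq * (1 / n1 + 1 / n2))
df = n1 + n2 - 2
print(round(t, 2), df)
```

The computed t would then be compared with the two-sided t-table value for n₁ + n₂ − 2 = 53 degrees of freedom.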
Let us look at an example. Suppose that measurements of the stem leaves of the same plant population over two seasons gave the results shown. We assume that the conditions for using Student's t-test are satisfied: the populations from which the samples are drawn are normal, they have an unknown but common variance, and the samples are independent. Let us test at the significance level α = 0.01. The tabulated value is t = 2.58; therefore the hypothesis of equality of the mean stem-leaf lengths for the plant population over the two seasons should be rejected at the chosen significance level.

Attention! The null hypothesis in mathematical statistics is the hypothesis that there are no significant differences between the compared indicators, whether we are speaking of means, variances or other statistics. In all these cases, if the empirical (calculated) value of the criterion is greater than the theoretical (tabulated) one, the null hypothesis is rejected; if the empirical value is less than the tabulated one, it is accepted.

To construct a confidence interval for the difference between the means of the two populations, note that the Student test, as can be seen from formula (3), evaluates the significance of the difference between the means relative to the standard error of this difference; it is easy to verify, using the previously discussed relationships and the assumptions made, that the denominator in (3) is exactly this standard error. Indeed, we know that in general the variance of a difference of independent quantities x and y is the sum of their variances. Taking the sample means x̄₁ and x̄₂ instead of x and y, and recalling the assumption that both populations have the same variance σ², we obtain

σ²(x̄₁ − x̄₂) = σ²·(1/n₁ + 1/n₂). (5)
The variance estimate s² can be obtained from the relation

s² = [Σ(x₁ᵢ − x̄₁)² + Σ(x₂ᵢ − x̄₂)²] / (n₁ + n₂ − 2). (6)
(We divide by n₁ + n₂ − 2 because two quantities, the two sample means, are estimated from the samples, and therefore the number of degrees of freedom must be reduced by two.) If we now substitute (6) into (5) and take the square root, we obtain the denominator of expression (3). After this digression, let us return to constructing the confidence interval for μ₁ − μ₂: it is given by (x̄₁ − x̄₂) ± t_α·√(s²(1/n₁ + 1/n₂)).

Let us make some comments about the assumptions used in constructing the t-test. First of all, it has been shown that violations of the normality assumption have an insignificant effect on the significance level and power of the test for n ≥ 30. Violations of the assumption of homogeneity of the variances of the two populations from which the samples are taken are also insignificant, but only when the sample sizes are equal. If the variances of the two populations differ from each other, then the probabilities of errors of the first and second kind will differ significantly from the expected ones. In this case, the hypothesis should be tested using the statistic

t = (x̄₁ − x̄₂) / √(s₁²/n₁ + s₂²/n₂), (7)
with the number of degrees of freedom

ν = (s₁²/n₁ + s₂²/n₂)² / [ (s₁²/n₁)²/(n₁ − 1) + (s₂²/n₂)²/(n₂ − 1) ]. (8)
As a rule, ν turns out to be a fractional number; therefore, when using t-distribution tables, one takes the tabulated values for the nearest integer values and interpolates to find the t corresponding to the obtained ν.

Let us look at an example. In a study of two subspecies of the marsh frog, the ratio of body length to tibia length was calculated. Two samples were taken, of sizes n₁ = 49 and n₂ = 27. The means and variances of the ratio of interest turned out to be x̄₁ = 2.34, x̄₂ = 2.08, s₁² = 0.21, s₂² = 0.35. If we test the hypothesis using formula (3), we find that at a significance level of α = 0.05 we must reject the null hypothesis (the tabulated value is t = 1.995) and assume that there are statistically significant differences, at the chosen significance level, between the mean values of the measured parameter for the two frog subspecies. When formulas (7) and (8) are used instead, then for the same significance level α = 0.05 the tabulated value is t = 2.015, and the null hypothesis is accepted.

This example clearly shows that neglect of the conditions adopted in deriving a criterion can lead to results directly opposite to those that actually hold. Of course, in this case, having samples of different sizes and no previously established fact that the variances of the measured indicator are statistically equal in the two populations, formulas (7) and (8) should have been used, and they showed the absence of statistically significant differences. It must be repeated once more that verifying compliance with all the assumptions made in deriving a particular criterion is an absolutely necessary condition for its correct use.

The constant requirement in both of the above modifications of the t-test was that the samples be independent of each other. In practice, however, there are often situations when this requirement cannot be met for objective reasons.
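The unequal-variance statistic (7) with the fractional degrees of freedom (8) can be sketched using the frog-ratio numbers given above:

```python
import math

# Welch-type two-sample t-test, statistics (7) and (8),
# with the frog-ratio values from the text.
n1, mean1, v1 = 49, 2.34, 0.21   # first subspecies: size, mean, variance
n2, mean2, v2 = 27, 2.08, 0.35   # second subspecies

se1, se2 = v1 / n1, v2 / n2      # squared standard errors of the two means
t = (mean1 - mean2) / math.sqrt(se1 + se2)        # statistic (7)

# Fractional degrees of freedom, formula (8)
df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
print(round(t, 2), round(df, 1))
```

Here t ≈ 1.98 is below the tabulated value 2.015 quoted in the text, so the null hypothesis is accepted, in agreement with the conclusion above.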
For example, some indicators are measured on the same animal or the same area of territory before and after the action of an external factor, etc. In these cases we may be interested in testing the hypothesis H₀: μ_d = 0 against H₁: μ_d ≠ 0, where μ_d is the mean of the paired differences. We continue to assume that both samples are drawn from normal populations with the same variance. In this case we can take advantage of the fact that differences of normally distributed quantities are also normally distributed, and therefore Student's t-test can be used in form (1). Thus, the hypothesis tested is that the n differences are a sample from a normally distributed population with mean zero. Denoting the i-th difference by dᵢ, we have

t = d̄ / (s_d / √n), (9)

where d̄ is the mean of the differences and s_d is their standard deviation.

Let us look at an example. Suppose we have data on the number of impulses of an individual nerve cell during a certain time interval before and after the action of a stimulus. Keeping in mind that (9) has a t-distribution, and choosing a significance level of α = 0.01, we find from the corresponding table in the Appendix that the critical value of t for n − 1 = 10 − 1 = 9 degrees of freedom is 3.25. Comparison of the theoretical and empirical t values shows that the null hypothesis of no statistically significant differences between the firing rates before and after the stimulus should be rejected. It can be concluded that the stimulus used statistically significantly changes the frequency of impulses.

In experimental studies, as mentioned above, dependent samples appear quite often. Sometimes this fact is ignored and the t-test is incorrectly used in form (3). That this is invalid can be seen by considering the standard errors of the difference between uncorrelated and correlated means.
In the first case the variance of the difference of the means is σ²(x̄₁) + σ²(x̄₂), while in the second, for correlated means, the covariance term must be subtracted: σ²(x̄₁) + σ²(x̄₂) − 2r·σ(x̄₁)·σ(x̄₂), where r is the correlation between the samples; this quantity is what underlies the standard error of the difference d̄, and hence the denominator in (9). Note also that the numerators of expressions (4) and (9) coincide, so the difference between the t values they give depends on the denominators.

Thus, if formula (3) is used in a problem with dependent samples and the samples are positively correlated, the resulting t values will be smaller than they should be under formula (9), and a situation may arise in which the null hypothesis is accepted although it is false. The opposite situation may arise when there is a negative correlation between the samples: in that case, differences will be recognized as significant that in fact are not.

Let us return to the example with impulse activity and calculate t for the given data using formula (3), ignoring the fact that the samples are related. For 18 degrees of freedom and a significance level of α = 0.01, the tabulated value is t = 2.88, and at first glance it seems that nothing bad happened even with a formula unsuited to the given conditions: the calculated t again leads to rejection of the null hypothesis, i.e. to the same conclusion as was reached with formula (9), which is the correct one in this situation.

However, let us rearrange the existing data and present them in a different pairing. These are the same values, and they could well have been obtained in one of the experiments. Since all the values in both samples are preserved, using Student's t-test in form (3) gives the previously obtained value t = 3.32 and leads to the same conclusion as before. Now let us calculate t using formula (9), which is the one that should be used here. The critical value of t at the chosen significance level and nine degrees of freedom is 3.25.
Consequently, we have no reason to reject the null hypothesis; we accept it, and this conclusion is directly opposite to the one reached with formula (3). This example shows once again how important it is to comply strictly with all the requirements underlying a particular criterion in order to obtain correct conclusions when analyzing experimental data.

The considered modifications of Student's test are intended for testing hypotheses about the means of two samples. When it becomes necessary to draw conclusions about the equality of k means simultaneously, a separate statistical procedure has been developed, which will be discussed later in connection with the analysis of variance.

2 Goodness-of-fit tests for variances

Statistical hypotheses about the variances of general populations are tested in the same sequence as for means. Let us briefly recall this sequence.
- 1. A null hypothesis is formulated (about the absence of statistically significant differences between the compared variances).
- 2. Assumptions are made about the sampling distribution of the statistic with which the parameter in the hypothesis is to be estimated.
- 3. The significance level for testing the hypothesis is selected.
- 4. The value of the statistic of interest is calculated, and a decision is made about the truth of the null hypothesis.

Let us start by testing the hypothesis that the population variance σ² equals a given value a, i.e. H₀: σ² = a against H₁: σ² ≠ a. If we assume that the variable x has a normal distribution and that a sample of size n is randomly drawn from the population, then the null hypothesis is tested using the statistic

χ² = (n − 1)s² / a. (10)
Recalling the formula for the variance, we can rewrite (10) as follows:
χ² = Σ(xi − x̄)²/σ0². (11)
From this expression it is clear that the numerator is the sum of squared deviations of normally distributed values from their mean, and each of these deviations is itself normally distributed. Therefore, in accordance with the distribution already familiar to us, the sums of squares in statistics (10) and (11) have a χ²-distribution with n − 1 degrees of freedom. By analogy with the use of the t-distribution, for the chosen significance level α the critical points χ²(α/2) and χ²(1 − α/2), cutting off upper-tail probabilities α/2 and 1 − α/2, are taken from the χ² table. The confidence interval for σ² at the chosen α is constructed as follows:
(n − 1)s²/χ²(α/2) < σ² < (n − 1)s²/χ²(1 − α/2). (12)
Let us look at an example. Suppose that, on the basis of extensive experimental research, the variance of the alkaloid content of one plant species from a certain area is known to be σ0² = 4.37 conventional units. A specialist has at his disposal a sample of n = 28 such plants, presumably from the same area. Analysis showed that for this sample s² = 5.01, and it must be verified that this variance and the previously known one are statistically indistinguishable at significance level α = 0.1. By formula (10), χ² = 27 · 5.01/4.37 = 30.95. This value must be compared with the critical points for upper-tail probabilities α/2 = 0.05 and 1 − α/2 = 0.95. From the appendix table for 27 degrees of freedom these are 40.1 and 16.2, respectively; since 16.2 < 30.95 < 40.1, the null hypothesis can be accepted. The corresponding confidence interval for σ² is 3.37 < σ² < 8.35.
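The alkaloid example can be reproduced directly; a sketch assuming scipy, with the critical points taken from `chi2.ppf` rather than a printed table:

```python
from scipy.stats import chi2

# Data from the worked example: sigma0^2 = 4.37, n = 28, s^2 = 5.01, alpha = 0.1.
sigma0_sq, n, s_sq, alpha = 4.37, 28, 5.01, 0.1
df = n - 1

# Statistic (10): chi^2 = (n - 1) * s^2 / sigma0^2, about 30.95.
chi_sq = df * s_sq / sigma0_sq

# Critical points cutting off upper-tail probabilities alpha/2 and 1 - alpha/2
# (about 40.1 and 16.2 for 27 degrees of freedom).
upper = chi2.ppf(1 - alpha / 2, df)
lower = chi2.ppf(alpha / 2, df)
accept_h0 = lower < chi_sq < upper  # True: H0 is accepted

# Confidence interval (12) for sigma^2: about (3.37, 8.38); the text's 8.35
# comes from the rounded table value 16.2.
ci = (df * s_sq / upper, df * s_sq / lower)
```

The small discrepancy in the upper confidence limit is only table rounding; the conclusion is unchanged.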
In contrast to testing hypotheses about sample means with Student's test, where errors of the first and second kind change little when the assumption of normality of the populations is violated, in the case of hypotheses about variances the errors change substantially when the normality conditions are not met. The problem considered above, the equality of a variance to some fixed value, is of limited interest, since situations in which the population variance is known are quite rare. Of much greater interest is the case in which one must check whether the variances of two populations are equal, i.e. test H0: σ1² = σ2² against the alternative H1: σ1² ≠ σ2². It is assumed that samples of sizes n1 and n2 are drawn at random from general populations with variances σ1² and σ2². To test the null hypothesis, Fisher's variance-ratio test is used:
F = s1²/s2². (13)
Since the sums of squared deviations of normally distributed random variables from their means have a χ²-distribution, both the numerator and the denominator of (13) are χ²-distributed quantities divided by n1 − 1 and n2 − 1 respectively, and therefore their ratio has an F-distribution with n1 − 1 and n2 − 1 degrees of freedom. It is generally accepted, and F-distribution tables are constructed accordingly, that the larger of the two variances is taken as the numerator in (13), so only one critical point needs to be determined, corresponding to the chosen significance level.
Suppose we have at our disposal two samples of sizes n1 = 11 and n2 = 28 from populations of common and oval pond snails, for which the height-to-width ratios have variances s1² = 0.59 and s2² = 0.38. We must test the hypothesis that these variances are equal for the populations under study at significance level α = 0.05. We have F = 0.59/0.38 = 1.55, which is below the critical value of the F-distribution for 10 and 27 degrees of freedom, so the null hypothesis is accepted.
In the literature one sometimes finds the claim that testing the hypothesis of equality of means with Student's test should be preceded by testing the hypothesis of equality of variances. This is a poor recommendation; moreover, it can lead to mistakes that are avoidable if it is not followed. Indeed, the result of testing the equality of variances with Fisher's test depends heavily on the assumption that the samples are drawn from normally distributed populations, whereas Student's test is insensitive to violations of normality, and if samples of equal size can be obtained, the assumption of equal variances is not essential either. In the case of unequal n, formulas (7) and (8) should be used for verification.
When testing hypotheses about the equality of variances of dependent samples, some special features arise in the calculations. In this case H0: σ1² = σ2² is tested against H1: σ1² ≠ σ2² with the statistic
t = (s1² − s2²)·√(n − 2) / (2·s1·s2·√(1 − r²)). (14)
If the null hypothesis is true, statistic (14) has Student's t-distribution with n − 2 degrees of freedom. When the gloss of 35 coating samples was measured, a variance s1² = 134.5 was obtained; repeated measurements two weeks later gave s2² = 199.1, and the correlation coefficient between the paired measurements turned out to be r = 0.876. If we ignore the fact that the samples are dependent and use Fisher's test, we get F = 199.1/134.5 = 1.48. At significance level α = 0.05 the null hypothesis would then be accepted, since the critical value of the F-distribution for 35 − 1 = 34 and 35 − 1 = 34 degrees of freedom is 1.79. If instead we apply formula (14), which is appropriate here, we obtain t = 2.35, whereas the critical value of t for 33 degrees of freedom at the chosen significance level α = 0.05 is 2.03. Hence the null hypothesis of equal variances in the two samples must be rejected. This example again shows that, as with testing the equality of means, using a criterion that ignores the specifics of the experimental data leads to error.
In the recommended literature one can find Bartlett's test, used to test the hypothesis of the simultaneous equality of k variances. Apart from the fact that computing its statistic is rather laborious, the main drawback of this test is its extreme sensitivity to departures from the assumption of normality in the populations from which the samples are drawn. When using it, one can therefore never be sure that the null hypothesis was rejected because the variances really differ significantly, and not because the samples are non-normal. So if the problem of comparing several variances arises, one should look for a formulation in which Fisher's test or its modifications can be applied.
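The gloss example can be checked end to end; a sketch assuming scipy, contrasting the naive F-test with statistic (14) for dependent samples:

```python
import math
from scipy.stats import f, t

# Data from the text: n = 35 paired measurements,
# s1^2 = 134.5, s2^2 = 199.1, correlation r = 0.876, alpha = 0.05.
n, s1_sq, s2_sq, r, alpha = 35, 134.5, 199.1, 0.876, 0.05

# Naive Fisher test ignoring the dependence: F = 1.48, below the critical
# value (about 1.77 here; the text's table gives 1.79), so H0 is accepted.
F = max(s1_sq, s2_sq) / min(s1_sq, s2_sq)
F_crit = f.ppf(1 - alpha, n - 1, n - 1)

# Statistic (14) for dependent samples:
# t = (s1^2 - s2^2) * sqrt(n - 2) / (2 * s1 * s2 * sqrt(1 - r^2)), about 2.35.
t_stat = abs(s1_sq - s2_sq) * math.sqrt(n - 2) / (
    2 * math.sqrt(s1_sq * s2_sq) * math.sqrt(1 - r ** 2)
)
t_crit = t.ppf(1 - alpha / 2, n - 2)  # about 2.03 for 33 degrees of freedom
reject_h0 = t_stat > t_crit           # True: the dependent-sample test rejects H0
```

The two tests disagree exactly as the text describes: the strong positive correlation between the paired measurements is what makes the difference in variances detectable.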
3 Goodness-of-fit tests for proportions
Quite often one has to analyze populations in which objects fall into one of two categories: for example, by sex in a given population, by the presence of a certain trace element in the soil, by the dark or light color of eggs in some bird species, and so on. Let us denote the proportion of elements possessing the trait of interest by P, the ratio of the number of objects with that trait to the total number of objects in the population.
P = m/n, where m is the number of objects possessing the trait in question and n is the total number of objects.
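The proportion P is estimated from a sample in the obvious way; a minimal sketch with made-up numbers (m, n, and the reference value P0 are all hypothetical), anticipating the kind of normal-approximation statistic such criteria rest on:

```python
import math

# Hypothetical sample: m objects out of n possess the trait of interest.
m, n = 46, 100
p_hat = m / n  # sample estimate of the proportion P

# Standard error of the sample proportion under the binomial model.
se = math.sqrt(p_hat * (1 - p_hat) / n)

# Normal-approximation z statistic for H0: P = P0 (P0 = 0.5 is made up).
p0 = 0.5
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
print(round(p_hat, 2), round(z, 2))
```

As with the tests above, z would then be compared with the critical point of its sampling distribution at the chosen significance level.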