Completely Randomized Design: A Brief Literature Review

An experimental design is characterized by the way the plots are allocated in a given experiment. The completely randomized design is the most commonly used among experimental designs because the principles of randomization and replication lend validity to the conclusions: they guarantee that the experimental units (plots), even if distinct, have equal probability of being assigned to each treatment group. It is widely used in experiments with uniform conditions, since homogeneous experimental conditions are critical to obtaining a good design. It offers wide experimental applicability, but attention must be paid to the trial, which, even when apparently homogeneous, can present experimental conditions that harm the experiment. Therefore, in order to obtain a good design, an early collection of information to evaluate the homogeneity of the experimental conditions is essential. This study is a literature review of the CRD, covering its main characteristics, mathematical model, the analysis of variance (ANOVA) technique, and the analysis of the assumptions underlying ANOVA.


INTRODUCTION
In statistical analysis we use the variables, relevant to the object of study, that are observed on the units of a sample or population; when they are obtained from previously planned experiments, they are known as experimental data (BERGAMASCHI et al., 2011). The presence of factors that may or may not be controllable during the experiment necessitates the use of statistical methods of analysis, to verify their prominence in introducing random variation, or experimental error (ANDRADE & OGLIARI, 2007).
Among the factors that cannot be controlled, environmental heterogeneity not foreseen by the experimenter and the variation inherent to the experimental material stand out. Thus, in order to minimize chance variation, the experimenter must define the design so that it is possible to isolate the effects of the factors that can, in fact, be controlled. The design of the experiment therefore comprises the set of rules that determines the definition of the treatments, the arrangement of the experimental plots and their assignment to treatments, and how to analyze the data from the experiment (DUARTE, 1996).
The completely randomized design (CRD) is the simplest of all experimental designs, since it involves only the principles of randomization and replication. It requires homogeneity of the experimental material and of the environmental conditions, because the treatments are assigned to the plots entirely at random. The statistical model of the CRD is given by Equation 1 (SILVA, 2007).

yij = μ + αi + eij (Eq. 1)

where yij is the value observed in the experimental plot that received the i-th treatment in the j-th repetition; μ is a general constant associated with this random variable; αi is the effect of the i-th treatment; and eij is the error associated with observation yij, assumed to be normally distributed.
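The model of Eq. 1 can be illustrated with a short simulation. The treatment count, effects, and error variance below are hypothetical, chosen only to show that in a balanced CRD the estimate of μ + αi is simply the i-th treatment mean:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical CRD: t = 3 treatments, r = 4 replicates,
# y_ij = mu + alpha_i + e_ij with e_ij ~ N(0, 1).
mu = 10.0
alpha = np.array([0.0, 2.0, -1.0])        # assumed treatment effects
r = 4
errors = rng.normal(0.0, 1.0, size=(3, r))
y = mu + alpha[:, None] + errors           # one row per treatment

# In a balanced CRD, the least-squares estimate of mu + alpha_i is the
# treatment mean, and the grand mean equals the mean of the treatment means.
treatment_means = y.mean(axis=1)
grand_mean = y.mean()
```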
The model indicates that the biological response of an experimental unit subjected to a treatment takes the form: Biological Response = Treatment Mean + Random Error, as described in Equation 2.

yij = μi + eij (i = 1, ..., k and j = 1, ..., r) (Eq. 2)

where μi = μ + αi is the mean of the i-th treatment, i is the index referring to the treatment, and j to the experimental unit within the treatment.

III. STATISTICAL PROCEDURE: ANALYSIS OF VARIANCE

Statistical inference in the analysis of variance (ANOVA) is obtained from Snedecor's F distribution, considering two independent random variables, one due to the treatments and the other due to the experimental residual (PADOVANI, 2014).
According to Duarte (1996), if we consider an experiment aimed at testing t treatments with r repetitions each, the model determines the partition of the degrees of freedom and of the sum of squares for the total observed variation, according to Equation 3.

yij = m + ti + eij (Eq. 3)

where yij is the datum collected in the experimental unit that received a given treatment in a given repetition; m is the constant inherent in the overall mean; ti is the effect provided by the i-th treatment; and eij is the error of the experimental unit.
If the data meet the assumptions of the analysis of variance, then the proposed model can be summarized as shown in Table 1. Duarte (1996) describes some assumptions that must hold for the application of ANOVA to be valid, because the experimental error and the response's adherence to the mathematical model guarantee the effectiveness and quality of a particular experiment.
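The partition of the sums of squares summarized in Table 1 can be sketched numerically. The data below are hypothetical (t = 3 treatments, r = 4 replicates), and the resulting F is checked against scipy's one-way ANOVA:

```python
import numpy as np
from scipy import stats

# Hypothetical balanced CRD: t = 3 treatments, r = 4 replicates each.
y = np.array([
    [19.0, 22.0, 20.0, 21.0],   # treatment 1
    [25.0, 24.0, 26.0, 27.0],   # treatment 2
    [18.0, 17.0, 19.0, 18.0],   # treatment 3
])
t, r = y.shape
N = t * r
grand_mean = y.mean()

# Partition of the total sum of squares: SSTot = SSTreat + SSRes.
ss_total = ((y - grand_mean) ** 2).sum()
ss_treat = r * ((y.mean(axis=1) - grand_mean) ** 2).sum()
ss_res = ss_total - ss_treat

# Mean squares and the F statistic (df: t - 1 and N - t).
ms_treat = ss_treat / (t - 1)
ms_res = ss_res / (N - t)
F = ms_treat / ms_res
p_value = stats.f.sf(F, t - 1, N - t)
```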

IV. ANOVA TO TEST HYPOTHESES
These assumptions are: a) additivity: the effects of the factors that appear in the mathematical model must be additive, so that there are no interactions; b) independence of the errors; c) homoscedasticity (homogeneity of variances); and d) normality of the errors (BARBIN, 2003).
Carvalho et al. (2010) describe that tests should be used to confirm whether the assumptions of the mathematical model are being met. Verification of these hypotheses should be performed prior to any analysis, and the distributions involved in testing the assumptions include Student's t, Snedecor's F, and the chi-square.
The main tests are: Tukey's test for non-additivity, to assess additivity; a randomness test, to verify the randomness of the errors over the experimental map; the Lilliefors test, to verify the normality of the distribution of the errors; and the Bartlett test, to analyze the homogeneity of the errors between treatments (CONAGIN et al., 1993).

ANOVA Applicability
The main objective of a trial is to analyze alternatives (treatments) in order to identify, among them, those of greater biological, agronomic, or even economic return (DUARTE, 1996). In this sense, all experiments that aim for transparent and clear results in the environmental field need efficient statistical tools.
As an example of the applicability of the completely randomized design in scientific experiments, Angels (2005) presents the following question: "Consider the following experiment, conducted under a completely randomized design: nine strains of fungi were compared by measuring growth rates in microns/hour." Once all these values are obtained, the calculated F is compared with the tabulated F (at 1%, as adopted in this example, the tabulated value is 2.9475).
Since the F test indicated a significant difference between treatments, this calculation allows us to reject the null hypothesis (H0). This means that at least one of the fungal strains differs with respect to growth rate, and this is the basis for the next steps of the statistical analysis, through the use of a means comparison test or contrasts.
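The decision rule (calculated F versus tabulated F) can be reproduced with scipy's F distribution. The degrees of freedom below assume 9 strains with 6 replicates each, and F_calc is an arbitrary illustrative value; with df = (8, 45), the 1% quantile comes out close to the 2.9475 quoted above:

```python
from scipy import stats

# Hypothetical CRD with t = 9 fungal strains and r = 6 replicates
# (54 plots): df_treat = 8, df_res = 45.
df_treat, df_res = 8, 45
F_calc = 5.2            # assumed calculated F, for illustration only
alpha = 0.01

F_crit = stats.f.ppf(1 - alpha, df_treat, df_res)  # "tabulated" F
reject_h0 = F_calc > F_crit
```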

Independence of errors
Padovani (2014) describes that the independence of the errors is guaranteed by the principle of randomization. If the assumption of independence of the errors is satisfied, then, in a plot of the standardized residuals versus the order of data collection, the residuals must be randomly distributed around zero, without following any pattern.

Homogeneity of variances

The homogeneity of variances is formulated through two hypotheses about the groups of data obtained from a given experiment, as shown below, where δi² is the variance of the i-th group (LIMA, 2014).

H0: δ1² = δ2² = ... = δk²
H1: at least one of the δi² is different from the others

Box (1953) recommends that, for the results of an ANOVA to be considered valid, the largest variance should not exceed four times the smallest. Dean et al. (1999) indicate that, for a more analytical decision, the Cochran, Hartley, Bartlett, and Levene tests stand out for checking the homogeneity of variances.
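Box's four-to-one rule of thumb and the Bartlett and Levene tests can be sketched with scipy; the residual groups below are invented for illustration:

```python
import numpy as np
from scipy import stats

# Invented residual groups, one per treatment.
g1 = np.array([1.2, -0.8, 0.5, -0.9, 0.1])
g2 = np.array([0.5, -0.6, 0.7, -0.5, 0.3])
g3 = np.array([1.0, -1.1, 0.9, -0.7, 0.4])

bartlett_stat, bartlett_p = stats.bartlett(g1, g2, g3)
levene_stat, levene_p = stats.levene(g1, g2, g3)

# Box's rule of thumb: largest variance at most four times the smallest.
variances = [g.var(ddof=1) for g in (g1, g2, g3)]
box_ok = max(variances) / min(variances) <= 4
```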
Building further on the example cited by Angels (2005), regarding homoscedasticity, the following figure (Figure 1) shows that there is heteroscedasticity between treatments, because some of them exhibit different behavior regarding the distribution of the errors.

Normality of the errors
The normal probability plot is a graphical technique for assessing whether a data set is approximately normally distributed, and is a special case of the probability plot. The data are plotted against a theoretical normal distribution in such a way that the points should form an approximate straight line. Departures from this straight line indicate departures from normality (CHAMBERS et al., 1983).
The probability plot is formed with the ordered response values on the vertical axis and the order statistic medians of the given distribution on the horizontal axis. According to Filliben (1975), the order statistic medians may be approximated according to Equation 6.

Ni = G(Ui) (Eq. 6)

where Ui are the uniform order statistic medians (defined below) and G is the percent point function of the desired distribution.
The percent point function is the inverse of the cumulative distribution function (the probability that x is less than or equal to some value); that is, given a probability, it returns the corresponding x of the cumulative distribution function. The uniform order statistic medians are defined as:

mi = 1 − mn, for i = 1
mi = (i − 0.3175) / (n + 0.365), for i = 2, 3, ..., n − 1
mi = 0.5^(1/n), for i = n

Furthermore, a straight line may be fitted to the points and added as a reference line. The more the points vary from this line, the greater the indication of a departure from the specified distribution. This definition implies that a probability plot can easily be generated for any distribution for which the percent point function can be calculated (ANSCOMBE, 1973).
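Filliben's approximation for the uniform order statistic medians, together with the percent point function G, is enough to build the theoretical axis of a normal probability plot. A minimal sketch (the sample is simulated, and the helper function name is our own):

```python
import numpy as np
from scipy import stats

def filliben_medians(n):
    """Approximate medians of the uniform order statistics (Filliben, 1975)."""
    i = np.arange(1, n + 1)
    m = (i - 0.3175) / (n + 0.365)   # interior points i = 2, ..., n - 1
    m[-1] = 0.5 ** (1.0 / n)         # i = n
    m[0] = 1.0 - m[-1]               # i = 1
    return m

# Theoretical axis of a normal probability plot: Ni = G(Ui), with G the
# percent point function (inverse CDF) of the normal distribution.
n = 10
theoretical = stats.norm.ppf(filliben_medians(n))

# Simulated sample; points close to a straight line suggest normality.
rng = np.random.default_rng(0)
sample = np.sort(rng.normal(size=n))
r = np.corrcoef(theoretical, sample)[0, 1]   # probability-plot correlation
```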
A disadvantage of this method of constructing probability plots is that the intercept and slope of the fitted line are, in fact, estimates of the location and scale parameters of the distribution. Although this is not very important for the normal distribution, since the location and scale are estimated by the mean and standard deviation, respectively, it can be useful for many other distributions (WILK et al., 1968).
In addition to the graphical methods just considered for assessing residual normality, we can perform a hypothesis test in which the null hypothesis is that the errors have a normal distribution. A large p-value, which fails to reject the null hypothesis, is therefore a good result: it means that it is reasonable to assume that the errors are normally distributed. Normally, assessment of the appropriate residual plots is sufficient to diagnose deviations from normality; however, a more rigorous and formal quantification of normality may be required (TUFTE, 1983). To this end, several common normality test procedures can be applied.

Anderson-Darling and Shapiro-Wilk tests

The Anderson-Darling test measures the area between a fitted line (based on the chosen distribution) and the empirical, non-parametric step function (based on the plotted points) (Eq. 8). The Shapiro-Wilk test, in turn, is based on a statistic W. If n is even, let m = n/2; if n is odd, let m = (n − 1)/2, in which case the median data value is not used in the calculation. One computes

b = Σ(i=1..m) ai (x(n+1−i) − x(i)) (Eq. 9)

where the coefficients ai are calculated from the means, variances, and covariances of the order statistics x(i), and W = b² / Σ(xi − x̄)². W is compared with tabulated values of the distribution of this statistic; smaller values of W lead to rejection of the null hypothesis of normality (SHAPIRO & WILK, 1965).
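Both tests are available in scipy; this sketch runs them on simulated residuals (the data are arbitrary). For Anderson-Darling, scipy returns critical values rather than a p-value:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
residuals = rng.normal(0.0, 1.0, size=50)   # simulated ANOVA residuals

# Anderson-Darling: scipy returns the statistic plus critical values
# for the 15%, 10%, 5%, 2.5% and 1% levels (no p-value directly).
ad = stats.anderson(residuals, dist='norm')
ad_reject_5pct = ad.statistic > ad.critical_values[2]

# Shapiro-Wilk: small W (and small p) leads to rejecting normality.
w_stat, p_value = stats.shapiro(residuals)
```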

Kolmogorov-Smirnov test
The Kolmogorov-Smirnov test (known, in its version with parameters estimated from the sample, as the Lilliefors test) compares the empirical cumulative distribution function of the sample data with the distribution expected if the data were normal. If this observed difference is sufficiently large, the test rejects the null hypothesis of normality of the population (CALLEGARI-JACQUES, 2003). The test statistic is based on the quantities Z(i) = F(X(i)), where F(x) is the cumulative distribution function of the normal distribution, X(i) is the i-th order statistic of the random sample, 1 ≤ i ≤ n, and n is the sample size.
The test statistic is then compared with tabulated critical values to determine the p-value.
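A sketch of both variants: the classical K-S test against a fully specified N(0, 1), and the Lilliefors situation, in which mean and standard deviation are estimated from the sample (the data are simulated):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
residuals = rng.normal(size=40)   # simulated residuals

# Classical K-S test against a fully specified N(0, 1).
d_stat, p_ks = stats.kstest(residuals, 'norm')

# Lilliefors situation: mean and sd estimated from the sample.  The
# K-S p-value is then too liberal; statsmodels offers a corrected one
# (from statsmodels.stats.diagnostic import lilliefors), mentioned here
# only as a note to keep this sketch within numpy/scipy.
z = (residuals - residuals.mean()) / residuals.std(ddof=1)
d_lillie, _ = stats.kstest(z, 'norm')
```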

Chi-square test
The chi-square test is used to test whether a data sample came from a population with a specific distribution.
An attractive feature of the chi-square goodness-of-fit test is that it can be applied to any univariate distribution for which the cumulative distribution function can be calculated. The chi-square test is applied to binned data (i.e., data placed into classes). In practice this is not a restriction, because one can simply compute the histogram or frequency table before generating the chi-square statistic. However, the value of the chi-square statistic depends on how the data are binned. Another disadvantage of the chi-square test is that it requires a sufficient sample size for the chi-square approximation to be valid (SNEDECOR & COCHRAN, 1989). In the test statistic, the expected frequency for class i is N(F(Yu) − F(Yl)), where F is the cumulative distribution function of the distribution being tested, Yu is the upper limit of class i, Yl is the lower limit of class i, and N is the sample size.
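The class expectations N·[F(Yu) − F(Yl)] can be computed directly from the normal CDF. A minimal sketch with simulated data and arbitrary class limits:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
data = rng.normal(loc=0.0, scale=1.0, size=200)

# Arbitrary class limits; expected frequency per class i is
# N * [F(Yu) - F(Yl)], with F the CDF of the tested distribution.
edges = np.array([-np.inf, -1.0, -0.5, 0.0, 0.5, 1.0, np.inf])
observed, _ = np.histogram(data, bins=edges)
expected = len(data) * np.diff(stats.norm.cdf(edges))

chi2_stat, p_value = stats.chisquare(observed, f_exp=expected)
```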

V. THE ANOVA TECHNIQUE

ANOVA is a statistical technique for analyzing the variation in a response variable (a continuous random variable) measured under conditions defined by discrete factors (classification variables, often with nominal levels). ANOVA is frequently used to test the equality of several means by comparing the variance between groups with the variance within groups (random error). Sir Ronald Fisher pioneered the development of ANOVA to analyze the results of agricultural experiments (FISHER, 1925).
Today, ANOVA is included in almost all statistical packages, which makes it accessible to researchers in all experimental sciences. It is easy to enter a data set and perform a simple ANOVA, but it is challenging to choose the ANOVA suitable for each experimental design, to examine whether the data adhere to the modeling assumptions, and to interpret the results correctly (STEEL et al., 1980).
International Journal of Advanced Engineering Research and Science (IJAERS), Vol-5, Issue-7, July-2018. https://dx.doi.org/10.22161/ijaers.5.7.14, ISSN: 2349-6495(P) | 2456-1908(O)

To determine the appropriate ANOVA model, we must know the relationships between the factors and the experimental units. Statisticians distinguish two types of factors in experimental design and ANOVA: "fixed factors" and "random factors". A "fixed factor" is one for which the specific levels are of interest; a researcher could repeat the experiment twice using identical factor levels (SCHEFFE, 1959). Conceptually, each level of a fixed factor is a distinct population with a single mean response. When a researcher deliberately arranges or modifies the levels of a fixed factor, these levels are called treatments. The primary objective of the ANOVA is then to test whether the mean responses are identical across the levels of the factor. In contrast to a fixed factor, the levels of a "random factor" represent a random sample from a potentially infinite number of levels; different factor levels would be chosen randomly if the experiment were redone. With random factors, the objective of ANOVA is to make an inference about the random variation within a population. When a factor level is applied to two or more independent experimental units, it is "replicated"; if the replicates are equal in number at each factor level, the experimental design is "balanced" (LEVENE, 1960).
The ANOVA framework provides two common models. The first, the one-way fixed-effects ANOVA, is an extension of the independent two-sample Student's t test that allows the means of several independent samples to be compared simultaneously. The second, the two-way fixed-effects ANOVA, has two factors, A and B, with each level of factor A appearing in combination with each level of factor B. This model allows us to compare the means across the levels of factor A and across the levels of factor B; moreover, we can examine whether the combined factors induce interaction effects (synergistic or antagonistic) on the response (SCHLOTZHAUER et al., 1987).

VI. COEFFICIENTS OF DETERMINATION AND VARIATION OF AN EXPERIMENT
In addition to hypothesis tests and confidence intervals, another way to analyze whether the model adopted in a given experiment is efficient is through the coefficient of determination (or explanation) and the coefficient of variation.
The coefficient of determination, or explanation, is represented by the symbol R². This indicator determines what percentage of the total variation is explained by the regression (VIALI, 2018).
It is given by the ratio between SQTreat (the treatment sum of squares) and SQTot (the total sum of squares of the observed values), indicating the proportion of the total variance explained by the variation due to treatments (0 ≤ R² ≤ 1) (PADOVANI, 2014).
The coefficient of variation of an experiment, represented by CV, estimates the precision of the experiment, expressing the standard deviation as a percentage of the mean (MOHALLEM et al., 2008).
According to Snedecor (1980), the distribution of coefficients of variation allows the establishment of ranges of values that guide researchers on the validity and reliability of their experiments.
It is given by the ratio between the standard deviation (in the ANOVA, the positive square root of QMRes, the residual mean square) and the overall mean of the data, indicating how the data behave in relation to the general mean. The magnitude of the CV is inversely related to the precision of the experimental data (PADOVANI, 2014).
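Both coefficients follow directly from the ANOVA table. The figures below are hypothetical (a balanced CRD with SQTreat ≈ 116.67 and SQRes = 12 on 9 residual degrees of freedom):

```python
import numpy as np

# Hypothetical ANOVA-table figures for a balanced CRD.
sq_treat = 116.667   # treatment sum of squares (SQTreat)
sq_res = 12.0        # residual sum of squares
df_res = 9           # residual degrees of freedom
grand_mean = 21.333  # overall mean of the data

sq_tot = sq_treat + sq_res
r2 = sq_treat / sq_tot                      # 0 <= R^2 <= 1
qm_res = sq_res / df_res                    # QMRes
cv = 100.0 * np.sqrt(qm_res) / grand_mean   # CV as a percentage
```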

VII. MULTIPLE COMPARISONS

Multiple comparison procedures are used when the analysis of variance detects, at a certain significance level, a significant effect of some treatment in an experiment, that is, when the null hypothesis is rejected (SOUSA et al., 2012). Their theory is based on the normality of the residuals of the linear model used to fit the data (BORGES & FERREIRA, 2003).
Multiple comparison tests of means are of great importance in applied research (CONAGIN et al., 2008) when one seeks to compare qualitative treatments.
In this sense, several tests are used for this purpose, usually named after their authors, the main ones being: Tukey, Student-Newman-Keuls (SNK), Student's t test (LSD), and Duncan, among others (BORGES & FERREIRA, 2003).
The choice of the test to be used should be based on the statistical qualities the study aims for, always checking for non-violation of the basic assumptions for its application, such as normality, homoscedasticity, and independence of the errors (EAX et al., 2005).

Tukey test
Tukey's test is based on the studentized range distribution and can be used to compare any and all contrasts between two treatment means, being exact when the number of repetitions is the same for all treatments. When the numbers of repetitions differ, the Tukey test can still be used, but the result will be approximate (GOMES, 2000).
The minimum significant difference for the Tukey test is calculated with the formula described in Equation 12:

Δ = q √(QMR / r) (Eq. 12)

If a contrast is greater than the value of Δ, then the corresponding means differ at the ɑ level of significance.
In addition to the Tukey test, it is also possible to carry out other multiple comparison tests, such as the Duncan and SNK tests, among others.
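With scipy (version 1.8 or later), Tukey's procedure and the minimum significant difference Δ = q √(QMR / r) can be sketched as follows; the three treatment groups are hypothetical, with equal replication:

```python
import numpy as np
from scipy import stats

# Hypothetical treatment groups (equal replication, r = 4).
t1 = np.array([19.0, 22.0, 20.0, 21.0])
t2 = np.array([25.0, 24.0, 26.0, 27.0])
t3 = np.array([18.0, 17.0, 19.0, 18.0])

res = stats.tukey_hsd(t1, t2, t3)          # all pairwise comparisons

# Minimum significant difference: Delta = q * sqrt(QMR / r),
# with q from the studentized range (k = 3 means, 9 residual df).
groups = [t1, t2, t3]
qmr = sum(((g - g.mean()) ** 2).sum() for g in groups) / 9   # residual MS
q = stats.studentized_range.ppf(0.95, 3, 9)
delta = q * np.sqrt(qmr / 4)
```

A contrast such as |mean(t1) − mean(t2)| is declared significant at 5% when it exceeds delta, which agrees with the p-values reported by tukey_hsd.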

Student-Newman-Keuls test (SNK)
The SNK test is performed in the same way as the Tukey test, with the exception that the critical value in SNK depends not on the total number of treatments, but on the number of means spanned by the range between the means being tested (CALLEGARI-JACQUES, 2003).
One of the advantages of the SNK test is that it allows means to be separated into discrete groups, without overlap between groups (CANTERI et al., 2001), where q refers to the value of the studentized range at 5% probability; s refers to the square root of QMR (the error mean square), which corresponds to the estimate of the standard deviation of the experimental error; and r refers to the number of repetitions per mean in the experiment (FERREIRA, 2011).
According to Sampaio (2002), when average compared feature different numbers of repetitions, the formula will be shown below (Equation 14): Q = SNK √ 2 2 ( 1 + 1 )(Eq. 14) At where, Ra: refers to the number of repetitions of treatment experiment "A"; RBb: refers to the number of repetitions of the experiment Treatment "B".

Student's t test
The Student's t test, or simply t test, seeks to reject or not a null hypothesis when the test statistic (t) follows a Student's t distribution. It can be conducted to compare a sample with a population, or to compare two paired or two independent samples (LOPES et al., 2015). For a single sample compared against a reference mean, the t statistic is given by Equation 15:

t = (X̄ − μ) / (S / √n) (Eq. 15)

where X̄ refers to the mean of the sample; μ refers to the population (or reference) mean; S refers to the standard deviation; and n refers to the number of subjects (JUNIOR, 2012).
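Equation 15 can be checked against scipy's implementation; the sample and the reference mean below are invented:

```python
import numpy as np
from scipy import stats

# Invented sample compared against a reference mean mu.
sample = np.array([10.2, 9.8, 10.5, 10.1, 9.9, 10.4])
mu = 10.0

t_stat, p_value = stats.ttest_1samp(sample, mu)

# The same statistic by hand: t = (xbar - mu) / (s / sqrt(n)).
xbar, s, n = sample.mean(), sample.std(ddof=1), len(sample)
t_manual = (xbar - mu) / (s / np.sqrt(n))
```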

Duncan test
The Duncan test is regarded as a newer method for the comparison of means, more laborious to apply than the Tukey test, but much more efficient with regard to discriminating results and separating treatments. It requires that all treatments have the same number of repetitions for its results to be exact (OLIVEIRA, 2008).
Typically, it is applied at 5% probability and, despite being more laborious, it is less rigorous than the Tukey test (VIANA, 2012). It should be pointed out that, when three or more means are compared, Duncan's procedure does not maintain the global significance level (BANZATTO & KRONKA, 2006).
According to Gomes (2000), when the number of means is very large (greater than 10, for example), the application of this test becomes very cumbersome.
According to Vieira and Hoffmann (1989), the minimum significant difference (d.m.s.) is obtained with the following formula (Equation 16):

d.m.s. = z √(QMR / r) (Eq. 16)

where z refers to a standardized value at the given significance level and number of means covered by the range delimited by the means being compared; QMR refers to the residual mean square of the ANOVA; and r refers to the number of repetitions.