the r book simulation models-

26 Spatial Statistics

There are three kinds of problems that you might tackle with spatial statistics:

• point processes (locations and spatial patterns of individuals);
• maps of a continuous response variable (kriging);
• spatially explicit responses affected by the identity, size and proximity of neighbours.

26.1 Point processes

There are three broad classes of spatial pattern on a continuum from complete regularity (evenly spaced hexagons where every individual is the same distance from its nearest neighbour) to complete aggregation (all the individuals clustered into a single clump): we call these regular, random and aggregated patterns and they look like this:

[Figure: three panels showing regular, random and aggregated patterns]

In their simplest form, the data consist of sets of x and y coordinates within some sampling frame such as a square or a circle in which the individuals have been mapped. The first question is often whether there is any evidence to allow rejection of the null hypothesis of complete spatial randomness (CSR). In a random pattern the distribution of each individual is completely independent of the distribution of every

The R Book, Second Edition. Michael J. Crawley. 2013 John Wiley

this needs an estimate of the distance at which the semivariogram is (1 + nugget)/2 = 1.2/2 = 0.6, as well as an estimate of the nugget.
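The arithmetic behind these starting values can be checked with a minimal sketch. It assumes the rational quadratic semivariogram has the form nugget + (1 - nugget)(d/range)^2 / (1 + (d/range)^2), which is my reading of the corRatio documentation in nlme; the function name ratio_semivariogram and the variables range_guess and nugget_guess are hypothetical, introduced only for illustration.

```r
# Rational quadratic semivariogram with a nugget effect.
# Assumed form: nugget + (1 - nugget) * s2 / (1 + s2), where s2 = (dist / range)^2.
ratio_semivariogram <- function(dist, range, nugget = 0) {
  s2 <- (dist / range)^2
  nugget + (1 - nugget) * s2 / (1 + s2)
}

range_guess  <- 12.5  # distance read off the sample variogram
nugget_guess <- 0.2   # nugget read off the sample variogram

# At dist == range the ratio s2 / (1 + s2) is 1/2, so the semivariogram
# equals nugget + (1 - nugget) / 2 = (1 + nugget) / 2.
gamma_at_range <- ratio_semivariogram(range_guess, range_guess, nugget_guess)
print(gamma_at_range)

# At distance zero the semivariogram equals the nugget itself.
gamma_at_zero <- ratio_semivariogram(0, range_guess, nugget_guess)
print(gamma_at_zero)
```

Under that assumed form, the range parameter is exactly the distance at which the semivariogram reaches (1 + nugget)/2, which is why that point is the one to read off the plot.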
Inspection gives a distance of about 12.5, so we write:

    model6 <- update(model4,corr=corRatio(c(12.5,0.2),form=~latitude+longitude,nugget=TRUE))

We can use anova to compare the two spatial models:

    anova(model5,model6)

           Model df      AIC      BIC    logLik
    model5     1 59 1185.863 1370.177 -533.9315
    model6     2 59 1183.278 1367.592 -532.6389

The rational quadratic model (model6) has the lower AIC and is therefore preferred to the spherical model.

To test for the significance of the spatial correlation parameters we need to compare the preferred spatial model6 with the non-spatial model4 (which assumed spatially independent errors):

    anova(model4,model6)

           Model df      AIC      BIC    logLik   Test L.Ratio p-value
    model4     1 57 1354.742 1532.808 -620.3709
    model6     2 59 1183.278 1367.592 -532.6389 1 vs 2 175.464  <.0001

The two extra degrees of freedom used up in accounting for the spatial structure are clearly justified. We need to check the adequacy of the corRatio model. This is done by inspection of the sample variogram for the normalized residuals of model6:

    plot(Variogram(model6,resType="n"))

[Figure: sample semivariogram of the normalized residuals plotted against distance]

There is no pattern in the plot of the sample variogram, so we conclude that the rational quadratic is adequate.

To check for constancy of variance, we can plot the normalized residuals against the fitted values like this:

    plot(model6,resid(.,type="n")~fitted(.),abline=0)

[Figure: normalized residuals plotted against fitted values]

and the normal plot is obtained in the usual way:

    qqnorm(model6,~resid(.,type="n"))

[Figure: quantiles of standard normal plotted against normalized residuals]

The model looks fine.

The next step is to investigate the significance of any differences between the varieties. Use update to change the structure of the model from yield~variety-1 to yield~variety:

    model7 <- update(model6,model=yield~variety)
    anova(model7)

    Denom. DF: 168
                numDF   F-value p-value
    (Intercept)     1 30.399419  <.0001
    variety        55  1.850939  0.0015

The differences between the varieties now appear to be highly significant (recall that they were only marginally significant with our linear model3 using analysis of covariance to take account of the latitude and longitude effects). Specific contrasts between varieties can be carried out using the L argument to anova. Suppose that we want to compare the mean yields of the first and third varieties. To do this, we set up a vector of contrast coefficients c(-1,0,1) and apply the contrast like this:

    anova(model6,L=c(-1,0,1))

    Denom. DF: 168
     F-test for linear combination(s)
    varietyARAPAHOE varietyBUCKSKIN
                 -1               1
      numDF  F-value p-value
    1     1 7.696728  0.0062

Note that we use model6 (with all the variety means), not model7 (with an intercept and Helmert contrasts). The specified varieties, Arapahoe and Buckskin, exhibit highly significant differences in mean yield.

26.7 Creating a dot-distribution map from a relational database

Here is an example of extracting a relatively small subset of data from a large relational database, and using the information to produce a dot-distribution map. The Access database contains two related tables:

• sites contains information on 2628 locations;
• records contains lists of species found at each site (43 001 in total).

The two tables are related by a variable called site number. The task is to extract eastings and northings for each record of a named species, and use these to produce a dot-distribution map, with one dot for each site at which that particular species was recorded. Instructions on how to make an open database connection are on p. 154. I assume that you have downloaded the Access database called berks.accdb from this book's website (see p. iii) and created an ODBC channel called berks on your computer. Open the channel to connect R to the Access databas