Environmental science: 1500 ArcGIS assessment and report; bid if you can do it. ADVANCED SOFTWARE

Description

YOU MUST HAVE EXPERIENCE IN ARCGIS AND ARCMAP SOFTWARE

Please use the Harvard referencing system.

This assessment component focuses on the use of GIS to assess spatial patterns of both disease and environmental pollution. It is built around PC-based GIS practical exercises that will allow you to assess the spatial dimensions of environmental risk. During these practical exercises two workbooks will be provided to assist you. For the purpose of the unit assessment, you are required to complete these workbooks and submit them as part of the portfolio. The report should consider and include the following:

• Introduction, including clear statement of aims and objectives.

• Summary of methods.

• Data synthesis, analysis, interpretation and discussion, including evidence of integration with published reports and scientific journal articles.

• Use of supporting figures and tables.

• Conclusions.

When creating maps, please export each one as a PDF in a separate file. Do not forget to include:

north arrow

title

scale bar

legend

and my name

ARCGIS DATA FILES

https://1drv.ms/f/s!AkaVelnDaX_lguJJHgs8_IEeXFVMjg

Unformatted Attachment Preview

Polygon Pattern Analysis (Assessed Practical)
6F6Z2002 Environmental Risk Management
Tutor: Elias Symeonakis e.symeonakis@mmu.ac.uk
2 February 2018

Introduction

Much of the health data readily accessible in the public domain is grouped or aggregated so that individuals cannot be recognised (for reasons of privacy and confidentiality). While this can pose significant problems for researchers, associated with the ecological fallacy and the modifiable areal unit problem (Cromley & McLafferty 2002), areal data can still prove highly valuable, especially when one considers the potential influence of the wider social and physical setting on the existence and possible diffusion of a particular health event. When working with areal data it is important that the data be standardised to prevent any visualisations simply reflecting population distribution (Gatrell 2002).

There are many levels of aggregation available, with different units often employed by the different agencies involved in the data gathering process. Thankfully, in the UK we can access health data aggregated (and standardised) to the national Census units. This makes analysis and interpretation much easier, particularly when trying to establish socioeconomic determinants of health.

Using GIS the analyst can explore the evidence for spatial patterning within health data (Gatrell 2002): there may be wider structural problems or factors influencing people’s health which can be seen within aggregated data. Indeed, Gatrell (2002:60) goes on to describe how one can “look at the disease rate in one location and rates in neighbouring locations, up to a specified distance. This helps [to] identify disease hotspots”.
The presence of such spatial patterns is common in health data (and in geographical data much more widely), which led Waldo Tobler to comment, in what is often described as the first law of geography: everything is related to everything else, but near things are more related than distant things.

Different measures of spatial autocorrelation are available for spatial analysis, including Moran’s I, the Getis and Ord G-statistic and Anselin’s Local Indicators of Spatial Association (LISA). Increasingly these measures are available within standard desktop GIS. They can determine the substance (if any) of the apparent spatial pattern present within the geographical data (Figure 1). This matters because, as Slocum et al. (2005) note, there is a chance that the pattern is truly random rather than clustered, as its appearance on screen would suggest.

Figure 1. Types of spatial patterns (after Boots & Getis 1988)

In today’s exercise you will employ two different measures of spatial autocorrelation:

1. Global – Moran’s I (or coefficient)
2. Local – Local Indicator of Spatial Association

Learning outcomes

On completion of this practical you should be able to:

• Implement a range of areal pattern descriptors in ArcGIS for the analysis of health data
• Understand the basic characteristics of the different pattern descriptors and recognise their suitability for different health data analysis scenarios

Data

In this practical exercise you will be working with real health data for the 3104 counties of the continental United States (the USA without Alaska, Hawaii and Puerto Rico). The data are presented in shapefile format and can be found on the S:\ drive:

S:\Faculty of Science & Engineering\Environmental & Geographical Sciences\6F6Z2002\PolygonPatternAnalysis\

NOTE: Copy the ENTIRE \Data folder to your drive, not just separate files from it!
Moran’s I: a global measure of spatial autocorrelation

All measures of spatial autocorrelation attempt to quantify Tobler’s law of geography: that features in close geographical space are highly likely to share attribute properties. The Moran coefficient is widely used because of its perceived statistical robustness (Slocum et al. 2005). The method can be thought of as a spatial extension of the correlation technique used in standard descriptive statistics, but in this instance it looks at the spatial arrangement of the values of the variable under investigation (O’Sullivan & Unwin 2003; Slocum et al. 2005). Figure 2, below, taken from Slocum et al. (2005:52), provides a brief explanation of how Moran’s I is calculated:

Figure 2. Calculating Moran’s I (Source: Slocum et al. 2005:52)

With the Moran’s I statistic the values of neighbouring areal units are compared against the overall mean (this is the covariance measurement). Where neighbours share similar values in comparison to this mean they will show positive covariance values. Dissimilar values will present negative values (see O’Sullivan & Unwin 2003:197). These values are also weighted to identify neighbouring units, with a value of 1 for adjacent units and 0 for non-neighbouring units. The computed Moran’s I value ranges from -1 (perfect negative spatial autocorrelation, such as you would see on a chessboard) to +1 (perfect positive spatial autocorrelation, with large sections of homogeneous areas). A value of 0 is indicative of a random pattern or distribution.

A note of caution

Please take the utmost care when using these spatial statistics (indeed when using any statistics for that matter). All that these measures suggest is the likelihood of a spatial pattern being present (or not) and the possibility that there may be some underlying factor or process responsible.
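To make the description of Moran’s I above more concrete, the following is a minimal sketch in plain Python, not the ArcGIS implementation. It assumes a small square grid of cells, binary weights (1 for cells sharing an edge, 0 otherwise) and the usual textbook form of the coefficient; the function names `morans_i` and `rook_weights` and the two toy grids are hypothetical and chosen only for illustration.

```python
def rook_weights(rows, cols):
    """Binary weights for a grid: 1 for cells sharing an edge, 0 otherwise."""
    n = rows * cols
    w = [[0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    w[i][rr * cols + cc] = 1
    return w


def morans_i(values, weights):
    """Global Moran's I: weighted covariance of neighbours about the mean."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Products of deviations, counted only where the weight marks a neighbour
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    total = sum(d * d for d in dev)
    return (n * cross) / (w_sum * total)


w = rook_weights(4, 4)
# Toy grid 1: alternating 0/1 values, like a chessboard
chessboard = [(r + c) % 2 for r in range(4) for c in range(4)]
# Toy grid 2: left half all 0, right half all 1 (two homogeneous blocks)
halves = [0 if c < 2 else 1 for r in range(4) for c in range(4)]

print(morans_i(chessboard, w))  # negative: every neighbour is dissimilar
print(morans_i(halves, w))      # positive: most neighbours are similar
```

With these weights the chessboard grid gives -1 and the two-block grid gives a positive value of about 0.67. ArcGIS offers other ways of defining neighbours (the practical uses Inverse Distance), so its figures will differ from this simplified version. Whatever value is returned, the caution above applies to these statistics.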
These statistics are not explanatory in their own right, and using their results should be only one of the steps undertaken in your exploration of health data, not the end point! The next stage would be to consider dataset reliability, whether the particular statistical techniques are appropriate, and what further testing and analysis is necessary.

1. Start up ArcMap (ArcGIS) and create a new map file. Add the US Health Data shapefile (US_Pov_HD.shp) to your display.

2. Using the Symbology tab, create a graduated colour map display for heart disease (HeartDisea; deaths per 100,000 persons, men and women) (Figure 2b).

Figure 2b. Heart Disease Deaths per 100,000 persons in the continental USA

3. Spend a few minutes examining the data closely. Can you see any obvious spatial patterns present within the dataset?

4. To test the robustness of any spatial patterning you will calculate the Moran’s I measure of spatial autocorrelation for the heart disease dataset. First, you must make sure that the ArcToolbox is up and running. Click on the red toolbox icon on the toolbar to open it within your display. You will find that there are many different functions available; you will be using the Spatial Statistics Tools at the bottom of the list.

5. Double click on Spatial Statistics Tools, then on Analyzing Patterns, and then on Spatial Autocorrelation (Moran’s I). This will bring up a new dialogue box as shown in Figure 3.

6. Select US_Pov_HD.shp as the Input Feature Class and HeartDisea as the Input Field. You should also make sure that Generate Report (optional) is ticked and that you use the Inverse Distance option for determining the spatial relationship (Figure 3). Then click OK.

Figure 3. Running Moran’s I in ArcGIS

7. This process could take a few minutes. You can access the results from the main menu, by selecting Geoprocessing > Results.
The numerical results and the graphic from the Moran’s I analysis can be visualised by double-clicking on the ‘Report File’ in the results panel:

Figure 4 (a and b). Displaying the Spatial Autocorrelation (global) Moran’s I Results

LISA: a local measure of spatial autocorrelation

While the global measure of spatial autocorrelation described above is a robust and useful statistic, researchers have begun to highlight that it is very unlikely that any kind of geographical data (representing health events or otherwise) will be spatially homogeneous across the entire study region (Lee & Wong 2001). Using a single global measure is therefore rather crude, and a more localised measure would be much more appropriate (O’Sullivan and Unwin, 2011). Such a local measure enables researchers to identify pockets of variable spatial autocorrelation across the region of study. This is particularly useful in the search for health hotspots, e.g. the clustering of people affected by similar illnesses or other health complaints.

Luc Anselin, an internationally recognised researcher in spatial science and the creator of the GeoDa software (http://geodacenter.asu.edu/), has developed a local measure of spatial autocorrelation known as the Local Indicator of Spatial Association, or LISA (Anselin 1995). This procedure works in the same way as the global measure, but instead of obtaining just a single number you can create a variety of outputs including: (1) a significance map, (2) a cluster map and (3) a scatterplot. Each of the maps is a thematic display which highlights areas of local spatial autocorrelation, as opposed to the global summary, or average, statistic provided by the standard Moran’s I. The cluster map, for example, displays data in several colours to signify the four quadrants (high-high, low-low = positive spatial autocorrelation; high-low, low-high = negative spatial autocorrelation).

8. Continuing with the analysis of heart disease (i.e. field HeartDisea), select Spatial Statistics Tools > Mapping Clusters from the ArcToolbox menu and then choose Cluster and Outlier Analysis (Anselin Local Morans I) from the list of options.

Figure 5.

After some processing you should see a new map display in ArcGIS. This is colour coded to help identify High-high spatial autocorrelation (SA) (hotspots) in black, High-low SA in yellow, Low-high in white and Low-low in blue.

Figure 6.

The output of a LISA analysis is not only the map of the areas with HH, LL, LH and HL local spatial autocorrelation. LISA also adds attributes to the attribute table: the cluster type shown on the map and three further values. Altogether, it adds the following 4 attributes:

• The local Moran’s I (attribute: LMiIndex)
• The respective z-score (attribute: LMiZScore)
• The respective p-value (attribute: LMiPValue)
• The Cluster Type (attribute: COType)

To view the attribute table, right click on the layer in the table of contents on the left side of the screen, and select Open Attribute Table:

Figure 7.

This will bring up the following table:

Figure 8.

You can visualise these new attributes by double-clicking on the layer in the table of contents and selecting how you wish to visualise them in the Symbology tab. Select a graduated colour display to visualise the p-values (LMiPValue), and edit the range of values used so that you can clearly see areas where the value is less than 0.05.

Figure 9.

Figure 10: The p-value output from LISA for the Heart disease attribute (mapped here with 5 classes). The first class is modified to show all counties with p-values below 0.05. Why? (HINT: use your statistical testing knowledge and check Figure 4b above.)

Figure 11: The respective z-value output from LISA for the Heart disease attribute, mapped here with classes that are meaningful in order to interpret them. Question: why is the choice of these classes considered meaningful? (HINT: use your statistical testing knowledge and check Figure 4b above.)
What information does this give us? Compare this map with the heart disease deaths maps.

ASSESSED TASKS

Repeat this task, generating the same types of output (global Moran’s I and local LISA), for the following variable in the US_Pov_HD.shp shapefile located in S:\Faculty of Science & Engineering\Environmental & Geographical Sciences\6F6Z2002\PolygonPatternAnalysis\:

• Poverty (% of residents living in poverty at last census)

Produce maps for the following:

– Heart Disease Deaths: a choropleth map (with appropriately chosen symbology, north arrow, scale bar, title and your name)
– Heart Disease LISA outputs, i.e.:
  • local Moran’s I
  • Cluster Type
  • p-value
  • z-score
– Poverty (%): a choropleth map (with appropriately chosen symbology, north arrow, scale bar, title and your name)
– Poverty LISA outputs, i.e.:
  • local Moran’s I
  • Cluster Type
  • p-value
  • z-score

Briefly discuss (in no more than one side of A4) the global and local patterns of spatial autocorrelation shown in your results. Are there any areas of the US which are particularly unhealthy in terms of heart disease? Can you see any potential links between poorer health outcomes (in terms of heart disease deaths) and socioeconomic status (poverty)? You should support your commentary with some links to the academic literature. Should you wish to comment on any additional data, such as race or education status, you will find the “Interactive Atlas of Disease” produced by the Centers for Disease Control and Prevention (CDC) a useful resource.

http://www.cdc.gov/dhdsp/maps/atlas/

Environmental Risk Management
Spatial Epidemiology
Assessed practical: Point Pattern Analysis
Tutor: Dr Elias Symeonakis (E410a) e.symeonakis@mmu.ac.uk
26 January 2018

Introduction

In GIS we are able to utilise and analyse a variety of spatially-referenced datasets. Typically these spatial or geographical datasets are represented by what we term spatial entity data models* or entities for short.
Entities are essentially graphical components used by the computer to represent the different phenomena of interest within the chosen study area. There are several types of entity, described below (after Chang (2003) and Heywood et al. (2006)):

Point – a zero dimensional feature represented by a single coordinate XY pair or an individual pixel

Line – a one dimensional feature which represents length and is encoded either as a string of coordinate XY pairs or a linear series of contiguous pixels

Polygon – a two dimensional feature which has both an area and a perimeter. It is represented either using a series of connected coordinate XY pairs with the same start and end point coordinate or a cluster of contiguous pixels

Surface – this is a special form of entity which represents continuous phenomena either using a raster grid or a Triangulated Irregular Network (TIN)

Network – this is another specialist entity representation which recognises the interconnection of line features

*You can find a fuller explanation of the spatial data modelling process (including entity selection) in Heywood et al. (2006:71-107).

In this practical exercise we will focus entirely on the point entity data model. Points are used to represent the spatial location of events or activities known to have occurred in a defined geographical area (Bailey & Gatrell 1995, Boots & Getis 1988). These are typically individual events, or features, such as the centroid location or address point of a person (or persons) affected by a particular illness or disease. This type of analysis is very commonly used in spatial analysis, particularly in the areas of health, crime and ecology, with a myriad of academic papers available on the subject and several textbooks which focus on this particular aspect of spatial analysis alone. Point pattern analysis is a common procedure where centroid (or point location) data form the primary dataset (Birkin et al. 1996).
Researchers then employ a series of statistical methods in an attempt to determine whether any patterns exist in the spatial or geographical distribution of points (i.e. events) in the study area. Spatial point patterns specifically include “a set of locations, irregularly distributed within a designated region and presumed to have been generated by some form of [random or other] mechanism” (Diggle 2003:vii). Rather than rely upon simple visual interpretation of the point distribution(s), which may suggest specific patterns where none truly exists, specialist statistical methods are employed to help identify whether any discernible point patterns exist and to help establish the possible underlying causes of any evident spatial behaviours and patterns.

Types of point pattern

In undertaking point pattern analysis the user is exploring the dataset for evidence of specific spatial or geographic properties. From this, the user can then begin to establish whether there are specific processes which have generated the observed point pattern. Typically this involves the study of the dispersion of points (the location of point patterns with respect to the geographical study area) or alternatively the arrangement of points with respect to each other (Boots and Getis 1988). To understand these properties more clearly it is important to define the possible patterns expected in a point pattern map display (Figure 1).

Figure 1. Point patterns (after Boots & Getis 1988)

The point pattern conditions include clustering (or aggregation), regularity, and randomness, and are defined below.

Clustering (Aggregation)
A concentration of events or objects (O’Sullivan & Unwin 2003), where the points are more tightly grouped together than would be expected from a completely random pattern.

Dispersion (Evenly spaced)
The events or objects appear to be uniformly, or evenly, spaced.
The observed average distance between the points is also greater than that found within a completely random pattern.

Randomness
Diggle (1983) describes the pattern of Complete Spatial Randomness (CSR) where the points are characterised by uniformity and independence. More simply, the pattern of points occurs by chance, with no variation in intensity across the study area. Boots & Getis (1988) note that CSR is doubtful in real world situations where the likelihood is that no single process (acting upon the points) is dominant, giving the appearance of a CSR pattern.

When studying an area of interest, it is useful to adopt point pattern analysis methods. Very often data are collected at a number of discrete locations. Usually, we attempt to extrapolate from the limited data to obtain information about the wider population or region. The analysis of these points can allow us to identify whether there is any definable spatial component in their behaviour.

Take the following crime-based example: your hometown has recently suffered a spate of break-ins, and the local police authority want to obtain further information to help them catch the criminals and reduce the incidence of burglary. It is highly likely that the police workers will record the burglaries in point form, employing the household location as the unit of observation. Using spatial statistics and GIS the police can begin to piece together criminal activity in the area. First of all the police will be very interested in determining whether there is any pattern to these burglaries. For instance, are there any localities that are more affected than others by the criminal activity, i.e. hotspot areas where a greater number of burglaries have been recorded? Obviously the distribution of points is likely to be affected by the type of built environment and population numbers and dynamics.
Once this factor has been accounted for, the researcher or police worker can begin to examine the distribution of point data to see whether there are any discernible patterns. In this examination the police worker can start to determine whether there are clusters of criminal activity. Using this evidence we can begin to hypothesise about the nature of the burglaries, and potentially establish the reasons for such increased activity. For example, are any areas affected in particular? And if so, is there any additional evidence that might be able to help explain this? The following bullet points highlight some potential lines of inquiry:

• Higher incidence of burglaries in areas of socio-economic deprivation, potentially as a result of poorer home security
• Higher incidence in student areas, where multiple occupancy (e.g. flats in renovated houses) is common, offering greater opportunity for criminals

The use of such data exploration techniques to help develop hypotheses is a fairly typical aspect of spatial data analysis. This type of technique can be used to help build spatial process models and improve our understanding of the phenomenon under observation. Importantly, this is often an iterative process, with many steps involved in the development of the spatial model. Furthermore, there are a variety of different point pattern analysis methods available to the GIS user, and some of the key techniques are discussed below with example exercises for you to complete later.

Distance Measures

One of the most common ways to detect any pattern within a point distribution is to examine the distances, or spaces, between points (Gatrell et al. 1996), and compare these to another, typically random, arrangement. Although relatively straightforward to calculate, such measures are particularly effective in demonstrating what are known as second order effects, described by O’Sullivan and Unwin (2003: 79) as indicative of some form of “interaction between locations”.
In other words, second order effects demonstrate local patterning or variation which is distinct from the global pattern or first order effects (Bailey and Gatrell 1995). Two such distance measures commonly used in point pattern analysis are described below. Nearest neighbour analysis is explained first, followed by Ripley’s K statistic.

Nearest Neighbour Analysis

Nearest neighbour analysis is based upon a solid geographical principle: that objects or phenomena located in close proximity to one another are likely to share similar properties. This procedure describes the point pattern by calculating the mean distance from each point to its nearest neighbour (Kitchin and Tate 2000). Then, using relatively simple statistical analyses that compare the average distance(s) between closest neighbouring point observations with those of a previously known pattern (typically the analyst would select a random pattern in this type of analysis), it is possible to establish whether there are clustered or dispersed patterns within the point distribution. Cluster patterns are defined by the short distances between proximal neighbouring points, while dispersed point patterns display greater observed average distances between points when compared to a random distribution.

To calculate the expected average distance between nearest neighbours the following equation is used:

$r_{exp} = \frac{1}{2\sqrt{n/A}}$

where $A$ is the area of the study location and $n$ is the number of points in the particular distribution. Lee and Wong (2001) identify another useful statistic based upon the average distance information. This is the randomness statistic, a simple ratio between the observed and expected distance between point locations:

$R = \frac{r_{obs}}{r_{exp}}$

where $r_{obs}$ is the observed average distance between nearest neighbours and $r_{exp}$ is the expected average distance between nearest neighbours on the basis of the theoretical pattern.
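The two equations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: distances are straight-line (Euclidean), no edge correction is applied, and the function name `nearest_neighbour_ratio` and both sets of coordinates are hypothetical (they are not the settlements from the handout’s Table 1).

```python
import math


def nearest_neighbour_ratio(points, area):
    """Return (r_obs, r_exp, R) for a list of (x, y) points in a study area."""
    n = len(points)
    nearest = []
    for i, (x1, y1) in enumerate(points):
        # Distance from this point to its closest other point
        nearest.append(min(math.hypot(x1 - x2, y1 - y2)
                           for j, (x2, y2) in enumerate(points) if j != i))
    r_obs = sum(nearest) / n                # observed mean nearest distance
    r_exp = 1 / (2 * math.sqrt(n / area))   # expected under a random pattern
    return r_obs, r_exp, r_obs / r_exp


# Hypothetical coordinates (km) in a 100 x 100 km study area
tight_groups = [(10, 10), (12, 11), (11, 14), (14, 12),
                (80, 80), (82, 83), (85, 81), (83, 78)]
regular_grid = [(x, y) for y in (25, 75) for x in (12.5, 37.5, 62.5, 87.5)]

for name, pts in (("tight groups", tight_groups), ("regular grid", regular_grid)):
    r_obs, r_exp, ratio = nearest_neighbour_ratio(pts, 100 * 100)
    print(name, round(r_obs, 2), round(r_exp, 2), round(ratio, 2))
```

For 8 points in 10,000 square kilometres the expected distance is about 17.68 km in both cases. The tightly grouped points give a ratio R well below 1, while the evenly spaced grid gives a ratio R above 1.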
Employing this statistic, it is relatively easy to determine whether point distributions follow clustered, random or dispersed distributions. Where R is less than 1 the data set is characterised by an increasing cluster tendency; in contrast, R values greater than 1 indicate dispersed spatial behaviour (evenly spaced events).

Nearest Neighbour Analysis: A Worked Example

Taking a hypothetical example of a study of town and city locations within a 100 x 100 kilometre study area, we can begin to establish the mean distances of the different events and compare this to the expected average distance between nearest neighbours. The region of interest contains 8 major towns and cities across its 10,000 square kilometre study area, as shown in Figure 2. The location of each settlement is provided in Table 1, as are the details of distance to closest neighbour. The calculation of the nearest neighbour index is given after the table.

Figure 2. Point display of settlement distribution in hypothetical study area

Ripley’s K Function

One of the problems associated with the nearest neighbour statistic is that it only considers the closest neighbour and does not consider other spatial scale effects (O’Sullivan and Unwin 2003, Mehrer and Westcott 2006). The K function, originally developed by Ripley (1976), provides an opportunity to explore spatial patterning at different spatial scales within the chosen study area. To calculate K we must visit every event or point in the study area and then establish the mean number of other points falling within a set distance of the start point (Bailey and Gatrell 1995). Typically this distance is defined as a circle of radius d, and the count is repeated for different radius values (O’Sullivan and Unwin 2003) (Figure 3). The mean counts for each circle are then divided by what is known as the mean intensity of the process, which is in effect the total number of events or points divided by the study area (Fotheringham et al. 2000).

Figure 3. Determining the K function (source: Bailey & Gatrell (1995:93))

The results of the K function can be presented graphically and help to show at what spatial scales different pattern behaviours (such as clustering) may occur (Figure 4). When the observed K value is larger than the expected K value for a particular distance, the distribution is more clustered than a random distribution at that distance (scale of analysis). When the observed K value is smaller than the expected K value, the distribution is more dispersed than a random distribution at that distance. When the observed K value is larger than the Higher Confidence Envelope value, spatial clustering for that distance is statistically significant. When the observed K value is smaller than the Lower Confidence Envelope value, spatial dispersion for that distance is statistically significant.

Figure 4. Point pattern behaviour at different spatial scales (Source: ArcGIS 10.1 help pages)

Intensity measures

Alternative approaches to measuring point patterns have moved away from basic measures of distance to the intensity (or density) of points in a given area. One such method is quadrat analysis, where the number of events (points) that occur within each of a set of, typically square, sampling frames is counted. This is used to establish a frequency distribution, which records the number of events in each individual quadrat. This distribution can then be compared against another distribution, commonly a random pattern. In a random pattern, the mean number of points in each quadrat would approximate the variance of the number of points per quadrat. This can be assessed using the Variance Mean Ratio (VMR), the variance divided by the mean, which equals 1 for a random distribution. Where the VMR is greater than 1 a cluster pattern is identified. Dispersed patterns are shown by a VMR of less than 1.
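The quadrat counting and Variance Mean Ratio described above can be illustrated with a short Python sketch. It rests on several assumptions that are not in the handout: the study area is a rectangle with its origin at (0, 0), the quadrats are an `nx` by `ny` grid of equal cells, and the sample variance (`statistics.variance`) is used. The function names and both point sets are hypothetical.

```python
from statistics import mean, variance


def quadrat_counts(points, width, height, nx, ny):
    """Count points in each cell of an nx-by-ny grid laid over the study area."""
    counts = [0] * (nx * ny)
    for x, y in points:
        # Points on the far edge are placed in the last column or row
        col = min(int(x / width * nx), nx - 1)
        row = min(int(y / height * ny), ny - 1)
        counts[row * nx + col] += 1
    return counts


def variance_mean_ratio(counts):
    """VMR: sample variance of the quadrat counts divided by their mean."""
    return variance(counts) / mean(counts)


# Hypothetical coordinates (km) in a 100 x 100 km study area
tight_groups = [(10, 10), (12, 11), (11, 14), (14, 12),
                (80, 80), (82, 83), (85, 81), (83, 78)]
regular_grid = [(x, y) for y in (25, 75) for x in (12.5, 37.5, 62.5, 87.5)]

for name, pts in (("tight groups", tight_groups), ("regular grid", regular_grid)):
    counts = quadrat_counts(pts, 100, 100, 2, 2)
    print(name, counts, variance_mean_ratio(counts))
```

With four quadrats the grouped points fall into only two cells and give a VMR above 1, while the evenly spaced points give identical counts and a VMR of 0. Changing `nx` and `ny` changes the counts, and therefore the ratio, for the same points under quadrat analysis.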
This type of method has significant problems, however, most notably concerning the choice of quadrat size and the fact that it does not consider local density: it measures only the number of points and not their spatial distribution within a single quadrat. Thankfully there are a number of other intensity-based measures, the most significant of which is the Kernel Density Estimator described next.

Kernel Density Estimation

The kernel density estimation technique involves the creation of a continuous (raster) surface which represents the variation in the density of point events in a given study area (Chainey & Ratcliffe 2005). Specifically, the analysis involves the estimation of the density of points across geographical space using kernels which have a defined search radius (Figure 5).

Figure 5. The kernel function (Source: Chang (2003:282))

The appearance of the resultant raster-based output is strongly influenced by the choice of kernel bandwidth, the radius used to search for other points around each event (O’Sullivan & Unwin 2003).

Software environments for point pattern analysis

There is a great deal of specialist software available for all kinds of spatial analysis, including point pattern detection. Many of the standard desktop GIS packages, such as ArcGIS and IDRISI, include some (admittedly rather limited) point pattern analysis functionality, although you will find standalone specialist packages such as CrimeStat and R more capable for the task, with a greater range of point pattern analysis options available.

ArcGIS

ESRI’s ArcGIS software environment offers users the ability to undertake nearest neighbour, Ripley’s K and kernel density estimation. This is primarily carried out through the ArcToolbox function in ArcGIS desktop. You should see the ArcToolbox as a small icon on the main toolbar.

CrimeStat

CrimeStat is a standalone spatial statistical package for the analysis of point-based crime data.
It was created by Ned Levine for the analysis of US crime data and is freely available for download for educational and research use. It offers a range of measures from basic centrographic analysis through to complex spatio-temporal modelling. http://www.icpsr.umich.edu/CRIMESTAT/ R – Statistical Computing R is a statistical computing environment created by the academic community and freely available for non-commercial use. Although it is used for many different statistical tasks it has a very strong spatial statistical component based around a series of additional packages which can be downloaded and added to the main R GUI interface. http://www.r-project.org/ Working in ArcGIS The practical exercise is to be completed using the desktop GIS package ArcGIS available in the computer labs. Please note that you may not finish this task within the hour or so available and therefore may need to work on this outside of the GIS lab class. You are required to submit the map and table outputs from the different analyses and answer the questions set out below. You should aim to write this up (including any figures) in 2 or 3 sides of A4. Point Pattern Analysis with ArcGIS The dataset provided for this practical exercise is: Lancashire Lung Cancer data – this is a shapefile with the locations of reported lung cancer incidences in southern Lanchasire (Source: Bailey and Gatrell (1995)). The data are available on S:\Faculty of Science & Engineering\Environmental & Geographical Sciences\6F6Z2002\Point Pattern Analysis\lung_cancer_lancs Copy the folder onto your own drive space or alternatively onto a USB flash drive. Nearest neighbour analysis 1. Open up ArcGIS and connect to the lung_cancer_lancs data folder in your personal drive space (or USB stick). 2. Add the lung_cancer_lancashire.shp file to your display. 10 3. 
You should now see a display like that shown in Figure 6 – you will see that it contains point data that represent the incidence of lung cancer among the local population. You may want to change the symbology properties of the data if the default symbol and colour is not to your liking. Figure 6. Lung Cancer data from Lancashire (Source: Bailey and Gatrell (1995)) 4. Visually examine the lung cancer point dataset for Lancashire. Can you see any pattern emerge? How might you describe this set of points – clustered, dispersed, random? 5. Once you have decided upon how to describe this pattern visually the next step is to see whether there is any statistical basis to this assumption. Here you will employ the nearest neighbour index. Select ArcToolbox from the toolbar – this is the little red tool box icon on the toolbar – you should now see a new menu display on your screen next to Layers. Within ArcToolbox there are numerous different modules and operations. 11 6. Find Spatial Statistics Tools and from its submenu select Analyzing Patterns. Here you will find the option to perform the Average Nearest Neighbour technique (Figure 7). Figure 7. Average Nearest Neighbour statistic 7. Select lung_cancer_lancashire as the Input Feature Class. Check the Generate Report box and click OK (accept all other defaults). 8. ArcGIS should now start processing your data and calculating its nearest neighbour index. Don’t worry if this takes the computer a minute or two to complete. When it does finish you should ask to see the results (from the main menu: Geoprocessing > Results). What is the value of the Nearest Neighbour Ratio? Remember, you should compare it with unity. What pattern does the lung cancer events data show? Double-click on the HTML Report File: Nearest Neighbor_Result.html to open up the graphical output of the results (Figure 7b). Make sure you keep a record of the results. 12 Figure 7b. Graphical output of NN results using ArcGIS Kernel Density Estimation (KDE) 9. 
To estimate a surface that describes the density of cancer incidence in Lancashire using the KDE approach, go to ArcToolbox > Spatial Analyst Tools > Density and double-click on Kernel Density. Select lung_cancer_lancashire as the input point features and choose an appropriate output name and location for the raster that the KDE will create. Leave the rest of the defaults as they are and click OK. When the KDE algorithm finishes, the raster will be displayed automatically.

You can modify the colour scheme as you see fit (e.g. Figure 7c). To do so, double-click on the Kernel Density layer, click on the Symbology tab and then click on the Classify button to change the number of classes and the method of classification. You can also use the Symbology of the lung cancer locations layer to change the point symbol from the default dots to x’s (Figure 7c) so that the points do not cover too large an area of the map. Finally, you can click on the 0 density value class in the table of contents to change its colour to transparent (Figure 7d).

Figure 7c. Density of lung cancer occurrences in Lancashire estimated using the Kernel Density Estimation tool (number of classes: 9, method: Quantile)

Figure 7d. Changing the colour of individual classes

Are the locations of lung cancer in Lancashire clustered? To assist in the discussion, click on the little arrow next to the Add Data button in the top menu of ArcMap and select Add Basemap (Figure 7e).

Figure 7e. Adding a basemap

Using the basemap, can you find out which urban areas, if any, are linked to these clusters? Use the zoom in tool if you need to. What happens if you modify the search radius option in the Kernel Density window? Try a larger and a smaller radius and visualise the resulting rasters to compare them.

K statistical analysis of the lung cancer data

10.
Using the lung_cancer_lancashire.shp data, select Multi-Distance Spatial Cluster Analysis (Ripley’s K Function) from the Analyzing Patterns submenu of Spatial Statistics Tools. You should be prompted with a dialogue box similar to that shown in Figure 8.

Figure 8. Ripley’s K Function in ArcGIS

11. Select lung_cancer_lancashire as the Input Feature Class and then choose a suitable name and location for the Output Table. For Compute Confidence Envelope select 99 Permutations. Check the Display Results Graphically box and click OK.

12. After several minutes of processing you should be presented with a dialogue box showing how the data are clustered (or dispersed) and how these patterns change with spatial scale.

13. Are the locations of lung cancer in Lancashire clustered? Is there any variation with spatial scale? Keep a copy of the output graph for inclusion in your submission.

TASKS

Write up this practical in report format with (i) an introduction, (ii) a description of methods, (iii) the visual presentation of any maps/result outputs, and (iv) your answers/written discussions to any set questions below.

Include a map display for the Lancashire lung cancer data. Your map should be presented separately and include a north arrow, legend and your name (clearly labelled).

Write a brief commentary describing the pattern of lung cancer events (is any clustering or other pattern present?) and link this discussion with the results of your nearest neighbour analysis, kernel density estimation and Ripley’s K analysis. You should also include the output graphical results and KDE map.

Make sure that your write-up makes appropriate use of the supporting academic literature.

References

Bailey, T.C. & Gatrell, A.C. (1995) Interactive Spatial Data Analysis. Harlow: Prentice Hall.
Boots, B. & Getis, A. (1988) Point Pattern Analysis. London: Sage.
Birkin, M., Clarke, G.P., Clarke, M. & Wilson, A.G. (1996) Intelligent GIS: Location Decisions and Strategic Planning.
Cambridge: Geoinformation.
Chainey, S. & Ratcliffe, J. (2005) GIS and Crime Mapping. London: Wiley.
Chang, K-T. (2003) Introduction to Geographic Information Systems. Second Edition. Boston: McGraw Hill.
Diggle, P.J. (2003) Statistical Analysis of Spatial Point Patterns. Second Edition. London: Arnold.
Fotheringham, A.S., Brunsdon, C. & Charlton, M. (2000) Quantitative Geography: Perspectives on Spatial Data Analysis. London: Sage.
Gatrell, A.C., Bailey, T.C., Diggle, P.J. & Rowlingson, B.S. (1996) Spatial point pattern analysis and its application in geographical epidemiology. Transactions of the Institute of British Geographers 21: 256-274.
Heywood, I., Cornelius, S. & Carver, S. (2006) An Introduction to Geographical Information Systems. Third Edition. Harlow: Prentice Hall.
Kitchin, R. & Tate, N.J. (2000) Conducting Research in Human Geography: Theory, Methodology and Practice. Harlow: Prentice Hall.
Lee, J. & Wong, D.W.S. (2001) Statistical Analysis with ArcView GIS. New York: Wiley.
Levine, N. (2007) CrimeStat: A Spatial Statistics Program for the Analysis of Crime Incident Locations (v3.1). Ned Levine & Associates, Houston, TX, and the National Institute of Justice, Washington, DC.
Mehrer, M. & Westcott, K. (2006) GIS and Archaeological Site Location Modeling. CRC Press.
O’Sullivan, D. & Unwin, D. (2003) Geographic Information Analysis. New Jersey: Wiley.
Ripley, B.D. (1976) The second-order analysis of stationary point processes. Journal of Applied Probability 13: 255-266.

Acknowledgements: The data for this exercise have been created by other researchers and are included in Bailey and Gatrell (1995).
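
Supplementary sketch: the nearest neighbour ratio from step 8 and the L(d) transformation of Ripley’s K from steps 10 to 13 can be reproduced by hand on a small point set, which may help when interpreting the ArcGIS reports. The Python below is a minimal illustration under stated assumptions, not the ArcGIS implementation: the function names are hypothetical, the study area is supplied by the caller, distances are planar, and the K function sketch applies no edge correction and computes no confidence envelope.

```python
import math


def nearest_neighbour_ratio(points, area):
    """Return (ratio, z_score) for the average nearest neighbour statistic.

    points: list of (x, y) tuples in planar units; area: study area in the
    same units squared (an assumption supplied by the caller).
    """
    n = len(points)
    # Observed mean distance from each point to its closest other point.
    observed = sum(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / n
    density = n / area
    # Expected mean distance for a random pattern of the same density.
    expected = 0.5 / math.sqrt(density)
    # Standard error of the mean distance; n * density equals n^2 / area.
    std_error = 0.26136 / math.sqrt(n * density)
    return observed / expected, (observed - expected) / std_error


def ripley_l(points, area, distance):
    """Return L(d) with no edge correction.

    L(d) above d suggests clustering at that scale, below d dispersion.
    """
    n = len(points)
    # Count ordered pairs of distinct points no further apart than `distance`.
    pairs = sum(
        1
        for i, p in enumerate(points)
        for j, q in enumerate(points)
        if i != j and math.dist(p, q) <= distance
    )
    return math.sqrt(area * pairs / (math.pi * n * (n - 1)))


# Illustrative data only: a regular 4 x 4 grid in a 4 x 4 study area.
grid = [(0.5 + i, 0.5 + j) for i in range(4) for j in range(4)]
ratio, z_score = nearest_neighbour_ratio(grid, area=16.0)
print(f"NN ratio = {ratio:.2f}, z = {z_score:.2f}")
print(f"L(1.1) = {ripley_l(grid, area=16.0, distance=1.1):.2f}")
```

A ratio above 1 with a positive z-score, as the regular grid produces here, points to dispersion; the lung cancer data should be judged the same way against unity. ArcGIS results may differ from this sketch because of the area used and the boundary correction chosen in the tool dialogue.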