Geographically Weighted Regression
Geographically weighted regression (GWR) is a spatial analysis technique that takes non-stationary variables into consideration (e.g., climate; demographic factors; physical environment characteristics) and models the local relationships between these predictors and an outcome of interest.
GWR is an outgrowth of ordinary least squares (OLS) regression and adds a level of modeling sophistication by allowing the relationships between the independent and dependent variables to vary by locality. GWR is useful as an exploratory technique (its usefulness as a prediction tool is controversial): it allows visualization of stimulus-response relationships and of whether and how those relationships vary in space. It also accounts for spatial autocorrelation of variables. Briefly, GWR constructs a separate OLS equation for every location in the dataset, incorporating the dependent and explanatory variables of the locations that fall within the bandwidth of each target location. The bandwidth can be entered manually by the user (based on previous literature, for example) or determined by the statistical software. In R, for example, a first set of OLS models is run to determine the bandwidth. If the investigator does not enter the bandwidth manually, most software programs allow the investigator to select the default or “adaptive” bandwidth, which is recommended in the literature.
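To make the idea of an adaptive bandwidth concrete, here is a minimal sketch in Python with NumPy. It assumes one common definition, which the text above does not spell out: the bandwidth at each location is the distance to its k-th nearest neighbor, so sparse areas get a wider window than dense ones. The function name `adaptive_bandwidths` and the toy coordinates are hypothetical, and individual software packages may define their adaptive option differently.

```python
import numpy as np

def adaptive_bandwidths(coords, k):
    """Distance from each location to its k-th nearest neighbor (itself excluded)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # After sorting each row, column 0 is the zero distance from a point to itself,
    # so column k holds the distance to the k-th nearest other point.
    return np.sort(dist, axis=1)[:, k]

# Hypothetical toy layout: three clustered points and one isolated point.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0]])
bw = adaptive_bandwidths(coords, k=2)
print(bw)
```

In this layout the isolated point receives a much larger bandwidth than the clustered points. A fixed bandwidth would instead leave it with almost no neighbors inside its window.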
GWR was originally developed for the analysis of spatial point data and allows for the interpolation of values that are not included in the data set. It is applied under the assumption that the strength and direction of the relationship between a dependent variable and its predictors may be modified by contextual factors. GWR has high utility in epidemiology, particularly for infectious disease research and evaluations of health policies or programs. Limitations of GWR include problems of multicollinearity and the approaches to calculating goodness of fit statistics. We have included two articles that specifically address these concerns.
How GWR works:
1. OLS models are run to determine the global regression coefficients (β) for the independent variables:
$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_n x_{ni} + \varepsilon_i$$
with the estimator:
$$\hat{\beta} = (X^T X)^{-1} X^T Y$$
2. Once the independent variables that you wish to retain in the model are identified, and there is a theoretical basis for thinking that the relationships may vary over space, GWR may be an appropriate next step. The regression model that underlies GWR has the same form, but its coefficients are estimated separately for each location i:
$$y_i = \beta_0(i) + \beta_1(i) x_{1i} + \beta_2(i) x_{2i} + \dots + \beta_n(i) x_{ni} + \varepsilon_i$$
with the estimator:
$$\hat{\beta}(i) = (X^T W(i) X)^{-1} X^T W(i) Y$$
where W(i) is a matrix of weights specific to location i such that observations nearer to i are given greater weight than observations farther away.
3. A number of software packages will run GWR (ArcGIS, R, GWR 4.0), and we have attached links to the documentation for running GWR in these programs. Be aware that GWR models can take a long time to run: remember that you are running many small regression models.
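The estimators in steps 1 and 2 can be sketched in a few lines of Python with NumPy. This is an illustrative sketch under stated assumptions, not the implementation used by any of the packages named above: it assumes a Gaussian distance-decay kernel and a single fixed bandwidth, and it runs on synthetic data. The function names (`ols_fit`, `gaussian_weights`, `gwr_fit`) are hypothetical.

```python
import numpy as np

def ols_fit(X, y):
    """Global OLS estimate: solves (X^T X) beta = X^T y."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def gaussian_weights(coords, target, bandwidth):
    """Diagonal of W(i): weights decay with distance from the target location.

    The Gaussian kernel is an assumption of this sketch.
    """
    dist = np.linalg.norm(coords - target, axis=1)
    return np.exp(-0.5 * (dist / bandwidth) ** 2)

def gwr_fit(coords, X, y, bandwidth):
    """One weighted least squares fit per location; returns an (n, p) array."""
    n, p = X.shape
    betas = np.empty((n, p))
    for i in range(n):
        w = gaussian_weights(coords, coords[i], bandwidth)
        XtW = X.T * w  # same as X^T W(i) when W(i) is diagonal
        betas[i] = np.linalg.solve(XtW @ X, XtW @ y)
    return betas

# Synthetic data (hypothetical): the slope on x increases from west to east.
rng = np.random.default_rng(0)
n = 200
coords = rng.uniform(0, 10, size=(n, 2))
x = rng.normal(size=n)
true_slope = 1.0 + 0.5 * coords[:, 0]
y = 2.0 + true_slope * x
X = np.column_stack([np.ones(n), x])

global_beta = ols_fit(X, y)                          # one coefficient vector
local_betas = gwr_fit(coords, X, y, bandwidth=1.5)   # one vector per location
# With a very wide bandwidth every weight is close to 1,
# so each local fit approaches the global OLS fit.
wide_betas = gwr_fit(coords, X, y, bandwidth=1e6)
```

The loop in `gwr_fit` solves one small regression per location, which is why run time grows quickly with the number of locations. Mapping the second column of `local_betas` against the coordinates is the kind of exploratory visualization described above.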
Textbooks & Chapters
Bivand R, Pebesma EJ, Gómez-Rubio V. (2008). Applied spatial data analysis with R. Heidelberg: Springer. http://gis.humboldt.edu/OLM/r/Spatial%20Analysis%20With%20R.pdf
Fotheringham AS, Brunsdon C, and Charlton M. (2002). Geographically weighted regression: The analysis of spatially varying relationships. West Sussex, England: John Wiley and Sons, Ltd.
Goovaerts P. (2008). Geostatistical Analysis of Health Data: State-of-the-Art and Perspectives. Soares A, Pereira MJ, & Dimitrakopoulos R (Eds.) Proceedings of the Sixth European Conference on Geostatistics for Environmental Applications (pp. 3-22). Heidelberg: Springer. Accessed directly through SpringerLink: https://link.springer.com/book/10.1007/978-1-4020-6448-7
Mitchell A. (2012). ESRI Guide to GIS Analysis, Volume 2: Spatial Measurements and Statistics. New York: ESRI Press.
Leung Y. (2009). Discovery of Spatial Relationships in Spatial Data. In Knowledge Discovery in Spatial Data (pp. 223-276). Tokyo: Springer. Accessed directly through SpringerLink: http://link.springer.com/chapter/10.1007/978-3-642-02664-5_5
Thapa RB, Estoque RC (2012). Geographically Weighted Regression in Geospatial Analysis. Y Murayama (Ed.) Progress in Geospatial Analysis (pp. 85-96). Tokyo: Springer. Accessed directly through SpringerLink: http://link.springer.com/chapter/10.1007/978-4-431-54000-7_6
Wheeler DC, Páez A. (2010). Geographically Weighted Regression. MM Fischer & A Getis (Eds.) Handbook of Applied Spatial Analysis: Software Tools, Methods and Applications (pp. 461-486). Heidelberg: Springer. Accessed directly through SpringerLink: http://link.springer.com/chapter/10.1007/978-3-642-03647-7_22
Background/logic for GWR and demonstration of its application:
Brunsdon C, Fotheringham AS, and Charlton ME. (1996). Geographically weighted regression: A method for exploring spatial nonstationarity. Geographical Analysis 28(4):281-298.
Brunsdon C, Fotheringham S, and Charlton M. (1998). Geographically weighted regression—modelling spatial non-stationarity. The Statistician 47(3):431-443. https://www.jstor.org/stable/2988625
Mennis J. (2006). Mapping the Results of Geographically Weighted Regression. The Cartographic Journal 43(2): 171-179.
Brief overview of OLS regression, regression with spatial data, and GWR:
Charlton M and Fotheringham AS. (2009). Geographically weighted regression. [White Paper]. https://www.geos.ed.ac.uk/~gisteac/fspat/gwr/gwr_arcgis/GWR_WhitePaper.pdf
Description of methods for testing goodness of fit and model validity in GWR models:
Leung Y, Mei CL, and Zhang WX. (2000). Statistical tests for spatial nonstationarity based on the geographically weighted regression model. Environment and Planning A 32(1): 9-32. http://www.envplan.com/abstract.cgi?id=a3162
Description of GWR multicollinearity problem:
Wheeler D and Tiefelsdorf M. (2005). Multicollinearity and correlation among local regression coefficients in geographically weighted regression. Journal of Geographical Systems 7(2):161-187. http://link.springer.com/content/pdf/10.1007%2Fs10109-005-0155-6
Description of SAS macro for GWR:
Chen VYJ and Yang TX. (2012). SAS macro programs for geographically weighted generalized linear modeling with spatial point data: Applications to health research. Computer Methods and Programs in Biomedicine 107(2):262-273.
GWR in Nutritional Epidemiology:
Yoo D. (2012). Height and death in the Antebellum United States: a view through the lens of geographically weighted regression. Economics and Human Biology 10(1):43-53. https://pubmed.ncbi.nlm.nih.gov/22036017/
GWR in Population and Family Health Research:
Shoff C and Yang TC. (2012). Spatially varying predictors of teenage birth rates among counties in the United States. Demographic Research 27(14):377-418. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3493119/pdf/nihms409513.pdf
GWR in Infectious Disease Epidemiology:
Liu Y, Jiang S, Liu Y, Wang R, Li X, Yuan Z, Wang L, and Xue F. (2011). Spatial epidemiology and spatial ecology study of worldwide drug-resistant tuberculosis. International Journal of Health Geographics 10:50.
GWR in R:
This is the documentation for a ZIP code package in R that has a database of city, state, latitude, and longitude coordinates for United States ZIP Codes.
GWR in SAS – method used in the Chen and Yang (2012) paper, including a link to the data file:
GWR in STATA:
GWR in ArcGIS:
Places to download shapefiles:
Tiger/Line Shapefiles maintained by the US Census:
Diva-GIS: Shapefiles downloadable by country:
BYTES of the BIG APPLE – local NYC files maintained by the NYC Department of City Planning:
StatSilk – clearinghouse of links to pages that maintain shapefiles
EPIC offers courses that briefly introduce spatial epidemiology.
ESRI Training/Workshop (ArcGIS) – Beyond Where: Using Regression Analysis to Explain Why. Free, 1-hour web-based training seminar.