Thursday, November 6, 2014

Geocoding

Goals and Objectives

The goal of this lab was to connect to one of the databases on the departmental server and geocode data stored in an Excel file. A second goal was to learn how to gather, normalize, and geocode data we obtained from the Wisconsin DNR. This is one of the first parts of a larger project on sand mining in western Wisconsin that we will build on throughout the semester. Our study area will consist of Trempealeau County and several others.

Methods

The first part of this lab was to connect to a geodatabase on the geography department's server. This geodatabase included an Excel sheet with information about a couple of hundred frac sand mines in western Wisconsin, given to us by the Wisconsin DNR. The sheet did not have all the data we needed and was not organized in a way that we could geocode right away, so we had to normalize the table. In other words, I added 5 new columns to the Excel file: State, City, Street, Zip Code, and PLSS. Adding these fields makes the table readable by the geocode tool.

Once the table was normalized we could geocode the data, that is, find the addresses and locations of these mines. We did this because not all the mines came with an address. There are two ways of geocoding, manual and automatic. For the automatic way you put the normalized table into the geocoder, and it matches each mine's address to a location if it can find it. The automatic way is obviously easier, but a number of my mines had to be geocoded manually.

Manual geocoding can be done with a wide variety of tools. I used Google Maps and Google Earth along with the PLSS finder. The Public Land Survey System (PLSS) is a grid system used to keep track of land parcel ownership. If I knew the PLSS address for a mine, I would put it into the PLSS finder and it would take me to the grid square where that address is. This narrowed down my search area. I would then switch to Google Earth or Maps and look around that area for a sand mine with dimensions similar to those described in my Excel sheet. On some occasions I could find the mine and get a street address off of Google Earth or Maps, but a lot of the time the imagery was not up to date enough to show these mines because most of them are fairly new.

Another method I used was to just Google the name of the mine or the county it was in. I found a couple of my mines this way because some counties list all their mines in a document with addresses, owner information, and mine names. This was a challenging and pretty time-consuming process. I was only responsible for 16 mines, and of those I had to manually find 5. This took me a couple of hours.
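The normalization step described above can be sketched in Python. This is only an illustration, not what we did in the lab (I edited the Excel file by hand). It assumes the combined address column looks like "Street, City, ST ZIP", which is a hypothetical format; the function name and the sample address are made up.

```python
import re

# Assumed (hypothetical) raw format: "Street, City, ST ZIP" in one column.
ADDRESS_RE = re.compile(
    r"^\s*(?P<street>[^,]+),\s*(?P<city>[^,]+),\s*"
    r"(?P<state>[A-Za-z]{2})\s+(?P<zip>\d{5})(?:-\d{4})?\s*$"
)


def normalize_address(raw):
    """Split one combined address string into separate fields.

    Returns None when the string does not match the assumed pattern,
    which flags the record for manual geocoding.
    """
    match = ADDRESS_RE.match(raw)
    if match is None:
        return None
    return {
        "Street": match.group("street").strip(),
        "City": match.group("city").strip(),
        "State": match.group("state").upper(),
        "Zip Code": match.group("zip"),
    }


# Made-up example record, not taken from the DNR data.
print(normalize_address("123 Example Rd, Sometown, WI 54000"))
# A PLSS-only description does not match and comes back as None.
print(normalize_address("Section 12, T21N R9W"))
```

Records that come back as None are the ones that would still need the manual search with the PLSS finder and imagery.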

Results

Below is a sample of the non-normalized table and the table after I cleaned it up. You can see in the non-normalized table that all the address information is in just one column. In that format the geocoder cannot differentiate between the street address, the zip code, and the other parts of the address, so I normalized the table. I took that address information and spread it out into separate columns that the geocoder could read.

Non-Normalized Table



Normalized Table

After all of our mines were geocoded and placed into a shapefile in ArcMap, each of us compared our mine locations with those of 4 other people who had geocoded the same mines. We did this with the Point Distance tool, which I used to find the mine closest to each of my mines and measure the distance between them. My table is measured in meters, and as you can see below there was a lot of variation between where other people placed a mine and where I placed it.
Distance Table in Meters
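To show the idea behind this comparison, here is a minimal Python sketch of a nearest-point distance calculation. It is not the ArcMap tool itself. It assumes the points are already in a projected coordinate system with units of meters, and the function name and coordinates are made up for illustration.

```python
import math


def nearest_distance(point, others):
    """Return (index, distance) of the point in `others` closest to `point`.

    Coordinates are assumed to be projected (x, y) pairs in meters, so
    straight-line distance can be used directly.
    """
    distances = [math.hypot(point[0] - o[0], point[1] - o[1]) for o in others]
    index = min(range(len(distances)), key=distances.__getitem__)
    return index, distances[index]


# Made-up coordinates: my placement of a mine and two classmates' placements.
my_mine = (0.0, 0.0)
classmates = [(3.0, 4.0), (10.0, 0.0)]
print(nearest_distance(my_mine, classmates))
```

With latitude and longitude in degrees this would not give meters, so the data would need to be projected first.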

Discussion

A couple of different kinds of error could occur during this lab. The major and most obvious one during the geocoding process was operational error, which is user error caused by human mistakes. When everyone had their mines geocoded and we brought them all together into one map, the same mine was placed in multiple different places. In a perfect world each geocoded mine would have been in the same spot as the others with the same ID. The differences come from mistakes in image analysis or simply from each person's opinion of where the mine should be.
Another type of error is inherent error. This error occurred when we measured the distance between the mines and got different measurements. It is caused by the limitations of the computer or technology used to make the measurements. Operator accuracy is limited by the technology and also by the resolution of the data: the lower the resolution, the less accurate the measurements will be.

With these errors occurring, the locations of these mines are not very accurate in most cases. If we needed very accurate locations, we could compare the locations we got to a higher-resolution data set with more accurate mine locations and calculate the planimetric root mean square error (RMSE).
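As a rough sketch of that calculation, the planimetric RMSE is the square root of the average squared horizontal distance between each geocoded point and its reference point. We did not do this in the lab; the function name and the coordinates below are made up, and the coordinates are assumed to be projected (x, y) pairs in meters.

```python
import math


def planimetric_rmse(measured, reference):
    """Root mean square of the horizontal offsets between paired points.

    `measured` and `reference` are equal-length lists of (x, y) pairs,
    where measured[i] and reference[i] describe the same mine.
    """
    if len(measured) != len(reference) or not measured:
        raise ValueError("need two non-empty lists of the same length")
    squared = [
        (mx - rx) ** 2 + (my - ry) ** 2
        for (mx, my), (rx, ry) in zip(measured, reference)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Made-up example: both geocoded points are 5 m away from their reference.
geocoded = [(0.0, 0.0), (100.0, 100.0)]
accurate = [(3.0, 4.0), (103.0, 104.0)]
print(planimetric_rmse(geocoded, accurate))
```

A smaller RMSE means the geocoded locations sit closer to the reference locations on average.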

Conclusion

Data that is not normalized is hard to work with. This lab showed the importance of normalized data: when data isn't normalized, performing other tasks with it, such as geocoding, is nearly impossible. Knowing how to normalize data is a good skill to have because you will get data in all kinds of formats and from all kinds of places that cannot be used until it is normalized. Geocoding is also a very valuable skill that can be applied in many practical situations. The result of this process is the map below, which shows all the geocoded mines in an easy-to-understand way. It includes the locations of my 16 mines as well as the locations of those same mines as placed by other students in the class. The differences in the locations illustrate the errors that can occur during the geocoding process.
Final Map of Mines

Data Source: Wisconsin DNR


