COR777

TO:    WLSCORS.LIS
FROM:  Eric Lewerenz
SUBJ:  Creation of datasets using variables from GDT (Geographic Data
       Technology) and explanation of some variables
DATE:  December 13, 2002

*******************************************************************************

[1] Description of data set & variable creation

    See the SAS program gis.sas and related documentation in
    \hauserproj\archive\users\eleweren\Project\GIS.

[2] Explanation of variables (from GDT Geocoding Services "User Manual")

    [a] GDTXIN (p.28-29)

        Centroid Type (US): A 1-character field identifying the type of
        centroid that was assigned.

        Centroid Type Codes Include

            0      Not a centroid (street address match)
            2      ZIP+2 centroid
            4      ZIP+4 centroid
            X      5-digit ZIP Code centroid
            Blank  No centroid available

        A ZIP+4 centroid coordinate is the latitude/longitude point that
        falls at the middle address of the address range associated with
        the ZIP+4 according to the USPS.

        A ZIP+2 centroid coordinate is the latitude/longitude point that
        is the mathematical average of the coordinates associated with
        all ZIP+4s within a ZIP+2. A ZIP+2 is also known as a sector.

        A 5-digit ZIP Code centroid is the delivery-weighted centerpoint
        latitude/longitude of the polygon formed by the 5-digit ZIP Code
        boundaries. These 5-digit ZIP Code centroids are also described
        as "delivery-based centroids."

    [b] GDTSTAT (p.30-31)

        Match Status Codes: A 2-character code appended to each record
        indicating the type of match or failure. Match Status codes are
        assigned automatically. "B" codes are the most common and
        indicate that the match was made through the Batch process. The
        presence of a "B" code means that all input components met the
        criteria needed to achieve an "address level" or quality match.

        NOTE: In the Batch process, addresses are geocoded to the most
        accurate latitude/longitude coordinates possible. When the
        address cannot be located, geocoding is done to the appropriate
        postal code centroid: ZIP+4, ZIP+2 or 5-digit ZIP.
        Recd  Code  Description

          1   B1    Street segment match in Batch.

          2   B2    Intersection match in Batch.

          5   B5    Alternate name in Batch. Address match to an
                    alternate or "secondary" name of a street in GDT's
                    database. A common example: when a US or State
                    highway passes through a town, becoming "Main St."

          6   B6    Placeholder match in Batch. Matched to a point in
                    GDT's database that had been placed earlier as a
                    "best estimate until acquisition of better
                    resources."

         10         Not a valid 2-digit state abbreviation. Either a
                    typo or non-covered area (Guam, Virgin Islands,
                    etc.).

         11         Locality not found in list of valid localities. The
                    Postal Service, and therefore GDT, does not list
                    this as a serviceable locality.

         12         Street address parse error. Incomplete or poorly
                    formatted addresses such as blank fields. The
                    geocoding software was unable to break the address
                    down into prefix, street name, type, suffix,
                    directional, etc. properly.

         14         Street name could not be found. The street name
                    given is not found in GDT's street database. This is
                    either due to missing data in GDT's database or an
                    invalid address.

         15         Address range did not exist. The address given is
                    not found in GDT's street database on the street
                    given. This is either due to missing address ranges
                    on streets in GDT's database or an invalid address
                    number.

         16         More than one segment with address range. Ambiguity:
                    Either due to the address occurring multiple times
                    in GDT's street database, or the address isn't
                    specific enough (e.g., "100 Main St" when GDT's
                    database contains both "100 N Main St" and "100 S
                    Main St").

         17         Unable to match intersection. There may be two valid
                    streets that do not intersect or one or both streets
                    could not be matched. Another possibility is that
                    the two streets intersect in more than one place.

    [c] Other variables...?

[3] Understanding ZIP Codes

    A ZIP code is a number of up to 9 digits used by the U.S. Postal
    Service to identify an area where mail is delivered.
    There are three levels of ZIP codes:

    [a] 5-digit ZIP: The first digit of a 5-digit ZIP divides the
        country into 10 large areas numbered from 0 in the northeast to
        9 in the far west. Within these areas, each state is divided
        into an average of 10 smaller geographical areas, identified by
        digits 2 and 3. These digits, in conjunction with the first
        digit, represent a sectional center facility or a mail
        processing facility area. Digits 4 and 5 identify a post office,
        station, branch or local delivery area.

    [b] ZIP+2: A 5-digit ZIP area is divided into ZIP+2 delivery
        sectors, each of which may be a group of post office boxes,
        several office buildings, a single high-rise building with
        multiple firms or apartments, a small geographic area, or
        several blocks.

    [c] ZIP+4: ZIP+2 sectors can be further subdivided into ZIP+4
        sections identified by attaching another set of two digits. A
        ZIP+4 might be one floor of an office building, specific
        departments in a firm, a group of post office boxes, or one side
        of a street (block face) between intersecting streets.

    Single-site ZIP+4s: Although ZIP+4s usually define a single side
    (block face) of a single city block in urban areas (or both sides of
    a longer roadway in rural areas), they can also define a single
    building, or even a single floor in a building. For example, a ZIP+4
    might be assigned to a high-rise apartment building in mid-town
    Manhattan.
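
    The three ZIP levels described above, and the GDTXIN centroid type
    that goes with each, can be illustrated with a short Python sketch.
    This is not taken from gis.sas or from the GDT manual. The function
    and key names (split_zip, describe_centroid, "region", "scf",
    "delivery_area", "sector", "section") are hypothetical and chosen
    only for this example, and 12345-6789 is an arbitrary input, not a
    ZIP drawn from the data.

```python
import re

# Centroid type codes as listed for GDTXIN in [2][a] of this memo.
# A blank value is stored here as the empty string.
CENTROID_TYPES = {
    "0": "Not a centroid (street address match)",
    "2": "ZIP+2 centroid",
    "4": "ZIP+4 centroid",
    "X": "5-digit ZIP Code centroid",
    "": "No centroid available",
}


def describe_centroid(gdtxin):
    """Return the memo's description for a GDTXIN value (hypothetical helper)."""
    key = (gdtxin or "").strip().upper()
    return CENTROID_TYPES.get(key, "Unknown centroid type")


def split_zip(zip_code):
    """Split a 5-, 7- or 9-digit ZIP into the levels described in [3].

    Hyphens and spaces are ignored. The key names are assumptions made
    for this sketch, not USPS or GDT field names.
    """
    digits = re.sub(r"\D", "", zip_code)
    if len(digits) not in (5, 7, 9):
        raise ValueError("expected 5, 7 or 9 digits, got %r" % zip_code)
    return {
        "zip5": digits[:5],
        "region": digits[0],            # digit 1: one of 10 large areas
        "scf": digits[:3],              # digits 1-3: sectional center area
        "delivery_area": digits[3:5],   # digits 4-5: post office or local area
        "sector": digits[5:7] if len(digits) >= 7 else None,    # ZIP+2
        "section": digits[7:9] if len(digits) == 9 else None,   # ZIP+4
    }


if __name__ == "__main__":
    print(split_zip("12345-6789"))
    print(describe_centroid("4"))
```

    Returning None for a missing sector or section keeps a 5-digit ZIP
    distinguishable from a ZIP+2 or ZIP+4, which mirrors the way GDTXIN
    records how precise the assigned centroid was.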