1. What is Geocoding?

Geocoding is an attempt to provide the geographic location (latitude, longitude) of an address by matching the address to an address range. The address ranges used in the geocoder are the same address ranges that can be found in the TIGER/Line Shapefiles, which are derived from the Master Address File (MAF). The address ranges are potential address ranges, not actual address ranges. Potential ranges include the full range of possible structure numbers even though the actual structures might not exist. The majority of the address ranges we have are for residential areas. There are limited address ranges available in commercial areas. Our address ranges are regularly updated with the most current information we have available to us.

The hypothetical graphic below may help customers understand the concept of geocoding and Census Geography (addresses displayed in this document are fictitious and shown for example only). If we look at Block 1001 in the example below, the address range in red, 101-199, is the range of numbers that covers the actual individual house numbers associated with the blue circles (e.g. 103, 117, 135 and 151 Main St) on that side of the street (i.e. the left side; note the arrow is pointing to the right on Main Street). Based on this logic, the from address would be 101 and the to address would be 199 for this address range. Besides providing a user with the geographic location of an address, the Census Geocoder can also provide all of the additional Census geographic information associated with a location, for example a Block, Tract, County, and State.
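The idea of placing an address along a potential address range can be sketched in a few lines of Python. This is only an illustration of simple linear interpolation, not the Census Bureau's actual algorithm; the function name, the segment end points and the coordinates are all made up for the example.

```python
def interpolate_along_range(house_number, from_address, to_address,
                            start_point, end_point):
    """Estimate (longitude, latitude) for a house number on a street segment.

    Assumption: the segment is a straight line from start_point to end_point
    and house numbers are spread evenly between from_address and to_address.
    """
    if from_address == to_address:
        fraction = 0.5  # single-number range: place the point mid-segment
    else:
        fraction = (house_number - from_address) / (to_address - from_address)
    fraction = min(max(fraction, 0.0), 1.0)  # keep the point on the segment
    lon = start_point[0] + fraction * (end_point[0] - start_point[0])
    lat = start_point[1] + fraction * (end_point[1] - start_point[1])
    return lon, lat


# Fictitious segment carrying the 101-199 range on Main St.
segment_start = (-77.0000, 38.9000)   # (longitude, latitude) at number 101
segment_end = (-76.9990, 38.9000)     # (longitude, latitude) at number 199

# House number 150 sits halfway between 101 and 199.
lon, lat = interpolate_along_range(150, 101, 199, segment_start, segment_end)
print(lon, lat)
```

Because the range is a potential range, a number such as 150 receives a location even if no structure with that number exists.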

For a definition of many of the Census terms discussed in this document please consult the Census Bureau’s Geography Reference page at (http://www.census.gov/geo/reference/terms.html).

2. How do I create a digital file so that I can submit my addresses for batch geocoding?

An important first step to preparing your file for batch geocoding is to format your spreadsheet of addresses into 5 columns like the example below.

Note: City, State and Zip Code fields can be left blank. See example below.

The geocoder accepts input files in text (.csv, .txt, .dat) and Excel format (.xls, .xlsx). The output file is provided in the same format as the input file. Most database software products have the option of saving your file in these formats.
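As a rough sketch, a five-column .csv file could be written with Python's standard csv module. The column order used here (unique ID, street address, city, state, ZIP code) and the absence of a header row are assumptions, since the example graphic is not reproduced in this text; the addresses are fictitious.

```python
import csv
import io

# Fictitious records: (unique ID, street address, city, state, ZIP code).
# City, state and ZIP code may be left blank, as in the second record.
rows = [
    ("1", "101 Main St", "Anytown", "MD", "20700"),
    ("2", "117 Main St", "", "", ""),
]


def write_batch_file(records, handle):
    """Write address records as five comma-separated columns, one per line."""
    writer = csv.writer(handle, lineterminator="\n")
    for record in records:
        if len(record) != 5:
            raise ValueError(f"expected 5 columns, got {len(record)}: {record!r}")
        writer.writerow(record)


# An in-memory buffer is used here; open("addresses.csv", "w", newline="")
# would write the same content to disk.
buffer = io.StringIO()
write_batch_file(rows, buffer)
batch_text = buffer.getvalue()
print(batch_text)
```

Blank trailing fields are still written as empty columns, so every line keeps the same five-column shape.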

3. What is the Census Geocoder?

The Census Geocoder (https://geocoding.geo.census.gov/) allows customers to submit one or many addresses (10,000 addresses is the limit) to determine their location (interpolated latitude and longitude) if they fall within a census address range. The latitude and longitude coordinate system is NAD83. There are seven geocoding options to choose from on the home page: three options under “Find Locations Using” and four options under “Find Geographies Using” (see example below).

We will discuss each of these below and group them based on the similarity of their output product.

A. Find Locations Using… One Line.

For this selection the user types one address in the text box, using commas to separate the different address components (note example below).

The user can choose from three different address range (AR) benchmarks in the pulldown menu.

Benchmark refers to the time period when the address range was captured in TIGER. Public_AR_Current is the most current benchmark. More Benchmark information is discussed in Section #4, What does Vintage and Benchmark mean?

A discussion of the output is provided in Appendix A.
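For readers who prefer to script a one-line lookup rather than use the web form, the request can be sketched as a URL. The path /geocoder/locations/onelineaddress and the parameter names address, benchmark and format are assumptions not stated in this document and should be checked against the geocoder's own API documentation; the address is fictitious. The sketch only builds the URL and does not contact the service.

```python
from urllib.parse import urlencode

# Host taken from this document; the path below it is an assumption.
BASE_URL = "https://geocoding.geo.census.gov/geocoder"


def one_line_locations_url(address, benchmark="Public_AR_Current"):
    """Build a lookup URL for a single comma-separated address."""
    query = urlencode({
        "address": address,      # address components separated by commas
        "benchmark": benchmark,  # address range benchmark to match against
        "format": "json",        # assumed parameter for a machine-readable reply
    })
    return f"{BASE_URL}/locations/onelineaddress?{query}"


url = one_line_locations_url("101 Main St, Anytown, MD 20700")
print(url)
```

urlencode takes care of escaping the spaces and commas in the address.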

B. Find Locations Using… Address.

This option allows the user to input an address through a series of text boxes to get a geocode (see example below).

A discussion of the output is provided in Appendix A.

C. Find Locations Using … Address Batch.

This option allows the user to submit multiple addresses in a digital file formatted in text (.csv, .txt, .dat) or Excel (.xls, .xlsx).

To get started the user selects their digital address file for input with the browse button, next chooses a Benchmark type, and finally clicks the Get Results button to receive the output information. After the processing is complete the user will be prompted by their web browser to open or save the GeocodeResults.csv file (note screen shot below).

The output information is shown in Appendix B.
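Because a batch submission is limited to 10,000 addresses, a longer list has to be split into several files before upload. A minimal sketch of that splitting step follows; the function name and the generated records are hypothetical.

```python
BATCH_LIMIT = 10_000  # maximum number of addresses per submission


def split_into_batches(records, limit=BATCH_LIMIT):
    """Split a list of address records into consecutive chunks of at most `limit`."""
    return [records[i:i + limit] for i in range(0, len(records), limit)]


# 25,000 fictitious five-column records (ID, street, city, state, ZIP code).
addresses = [(str(n), f"{n} Main St", "", "", "") for n in range(1, 25_001)]

batches = split_into_batches(addresses)
batch_sizes = [len(batch) for batch in batches]
print(batch_sizes)
```

Each chunk can then be saved as its own file and submitted separately; the unique ID in the first column lets the results be recombined afterwards.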

D. Find Geographies Using …

There are four options under the Find Geographies Using … method, which are discussed below.

I. Find Geographies Using … One Line.

For this option the user types one address in the text box, using commas to separate the different address components (see example below).

This option allows the user to choose a different address range (AR) benchmark (discussed above) and different Geography Vintage types (note image below.)

Vintage is the date when the geography information was captured. More Vintage information is discussed in Section #4, What does Vintage and Benchmark mean?

The output information is shown in Appendix C.

II. Find Geographies Using … Address

This option allows the user to input an address through a series of text boxes, see example below.

Again the user has the option of choosing a different Benchmark and Vintage. The output information is shown in Appendix C.

III. Find Geographies Using … Address Batch.

This option allows the user to submit multiple addresses in a digital file formatted in text (.csv, .txt, .dat) or Excel (.xls, .xlsx) to obtain the Census Geography information associated with the address.

To get started the user selects their digital address file for input with the browse button, next chooses a Benchmark and Vintage type, and finally clicks the Get Results button to receive the output information. After the processing is complete the user will be prompted by their web browser to open or save the GeocodeResults.csv file (note screen shot below).

The output information is shown in Appendix D.

IV. Find Geographies Using … Geographic Coordinates.

This option allows the user to submit Longitude (X) and Latitude (Y) coordinate values to determine the geography associated with that location.

The output information is shown in Appendix C. Note that this option does not output any address range information.

4. What does Vintage and Benchmark mean?

Benchmark refers to the date or time frame when the address range repository was last updated. Vintage refers to the date or time frame that the geography is from. If you choose an address range from an earlier time frame you will only be able to choose geography from that time or earlier. So for example if you choose a 2010 Benchmark Address Range you will only have the option of choosing Vintage geography from either 2000 or 2010. If you choose a current Benchmark Address Range you will have the option of choosing Vintage geography from Current, ACS2015, ACS2014, ACS2013 or Census 2010.
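The rule that a benchmark limits the available vintages can be written as a small lookup table. The keys and vintage labels below are shorthand taken from the paragraph above, not the exact strings shown in the pulldown menus, and the function name is hypothetical.

```python
# Shorthand labels based on the text above; the real menu strings may differ.
ALLOWED_VINTAGES = {
    "Current": ["Current", "ACS2015", "ACS2014", "ACS2013", "Census2010"],
    "2010": ["2010", "2000"],
}


def is_valid_combination(benchmark, vintage):
    """Return True if the vintage may be chosen with the given benchmark."""
    return vintage in ALLOWED_VINTAGES.get(benchmark, [])


print(is_valid_combination("2010", "2000"))
print(is_valid_combination("2010", "ACS2015"))
```

A 2010 benchmark combined with an ACS2015 vintage is rejected because the geography is newer than the address ranges.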

5. Possible reasons your address did not geocode.

There are several possible reasons why an address you submit may not geocode to a Census address range.

• The street and address that you submitted truly does not exist or has not been built.
• The address is a newly constructed home that has not yet been captured by our address capturing techniques.
• The address may have existed at one time but now does not exist (i.e. the housing unit may have been demolished or destroyed by natural or man-made causes).
• The address may have existed as a residential address at one time but no longer does (i.e. the housing unit address may have been changed to a non-residential one).
• The house number or street name may have changed because of renaming and/or renumbering due to E911 activities.
• The address submitted matches to a single address range street segment. Because of the Census Bureau’s commitment to Title 13, individual address information is considered confidential and thus cannot be released to the public. A single address range street segment essentially identifies the location and name of a single address, which is prohibited and cannot be released.

Our address ranges consist mainly of residential addresses. If you do not get a result and you know the approximate location of the address, we recommend you use our TIGERweb interactive map viewer. If you do not know the approximate location we recommend you use outside sources to determine the approximate location. We are continually improving our addresses and address ranges. We release updated geography and address ranges at least once per year. Please send any questions or comments to [email protected].

6. LUCA participants

Entities submitting address lists for geocoding to the LUCA Geocoder will submit their addresses to the Census Bureau via the SWIM application. The address list file submitted to the LUCA geocoder can exceed 10,000 records and must be in a .csv format similar to the formats discussed above for the public geocoder. All files submitted to the LUCA geocoder must be formatted into 5 columns like the example shown in Section 2 above.

The output products that are returned to the LUCA participants are provided to assist them with their LUCA submission. We shall discuss these specific data products next.

Output file format for the Address Count List.

Address Geocoding Output File Layout for the Address Count List

Column Name      Description
MATCHED = ‘M’    The total number of addresses that matched to an address range.
MATCHED = ‘T’    The total number of addresses that tied between two or more Census address ranges. A Tie indicates multiple possible results for that address.
MATCHED = ‘U’    The total number of addresses that did not match to an address range.
TABSTATE         Tabulation State FIPS Code
TABCOUNTY        Tabulation County FIPS Code
TABTRACT         Census Tabulation Tract Code
TABBLOCK         Census Tabulation Block Code
ADDRESS COUNT    The number of addresses that matched to a census address range in the particular block.

The output file format for the Address List:

Output Fields         Definition
LINE NUMBER           The unique number for each address.
INPUT ADDRESS         The address submitted by the participant.
MATCH INDICATOR       Indicates whether the address matched a census address range (Match), did not match (No_Match), or was a tie between two or more Census address ranges (Tie). A Tie indicates multiple possible results for that address.
MATCH TYPE            Indicates if the match that occurred was exact (Exact), or equivocated (Equivocate).
TIGER OUTPUT ADDRESS  The equivocated address that matched the TIGER address range.
INTERPOLATED LONGITUDE, INTERPOLATED LATITUDE  The interpolated Latitude and Longitude value based on the address range location.
TIGERLINE ID          The specific TIGER Line Identifier number.
TIGERLINE SIDE        The side of the address range that the input address matched to, left or right.
STATE                 The two digit state FIPS code.
COUNTY                The three digit county FIPS code.
TRACT                 The six digit Census Tract code.
BLOCK                 The Census Block code (4-6 digits).

Appendix A

The output data when using Find Locations Using… One Line and Address.

The output information is defined in the table below:

Output Fields     Definition
Matched Address   The address that was used in the geocoding process.
Coordinates       The Longitude (X) and Latitude (Y) values based on an interpolation of where the address falls along the address range.
Tiger Line Id     The unique Tiger Line Id of the street segment.
Side              The side of the street the address range lies on, either L (Left) or R (Right).
From Address      The from address value.
To Address        The to address value.
PreQualifier      A word or phrase that precedes all other elements of the street name and modifies it, but is separated from the street name by a street name pre-directional and/or pre-type.*
PreDirection      A word preceding the street name that indicates the direction taken by the thoroughfare from an arbitrary starting point, or the sector where it is located.*
PreType           The element of the complete street name preceding the street name element that indicates the type of street.*
Street Name       The official name of the street.
SuffixType        The element of the complete street name following the street name element that indicates the type of street.*
Suffix Direction  A word following the street name that indicates the direction taken by the thoroughfare from an arbitrary starting point, or the sector where it is located.*
SuffixQualifier   A word or phrase that follows all other elements of the street name and modifies it, but is separated from the street name by a street name suffix-type and/or suffix-directional.*
City              The city the address is located in.
State             The State the address is located in.
Zip               The Zip Code the address is located in.

(* Note documentation for this is available at the following URL https://www.fgdc.gov/standards/projects/address-data/05-11.2ndDraft.CompleteDoc.pdf).

Appendix B

The output data when using Find Locations Using… Address Batch.

The output information is defined in the table below:

Output Fields                        Definition
Record ID Number                     The unique number for each address submitted. Note the output file may not return the records in the same order as that submitted.
Input Address                        The address submitted by the customer.
TIGER Address Range Match Indicator  Values in the Match Results column indicate whether the address matched a census address range (Match), did not match (No_Match), or was a tie between two or more Census address ranges (Tie). A Tie indicates multiple possible results for that address.
TIGER Match Type                     Indicates if the match that occurred was exact (Exact), non-exact (Non-Exact), tied with another address range (Tie) or no match (No Match).
TIGER Output Address                 Standardized version of the input address that was used to match to the TIGER Address range.
Interpolated Latitude and Longitude  The Longitude (X) and Latitude (Y) values based on an interpolation of where the address falls along the address range.
TIGERLine ID                         The unique Tiger Line Id of the street segment.
TIGERLine ID Side                    The side of the street the address range lies on, either L (Left) or R (Right).
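A results file with the fields listed above could be read back with Python's csv module, as in the sketch below. Several details are assumptions rather than facts from this document: that the columns appear in the order of the table, that the coordinates arrive in one field as "longitude,latitude", and that unmatched rows stop after the match indicator. The sample rows and field names are invented for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical field names, in the order of the table above.
FIELDS = ["record_id", "input_address", "match_indicator", "match_type",
          "output_address", "coordinates", "tigerline_id", "side"]

# Invented sample content standing in for a downloaded results file.
SAMPLE_OUTPUT = '''"1","101 Main St, Anytown, MD, 20700","Match","Exact","101 MAIN ST, ANYTOWN, MD, 20700","-76.9995,38.9","123456789","L"
"2","999 Nowhere Rd, , , ","No_Match"
'''


def parse_results(text):
    """Turn result rows into dictionaries, splitting coordinates when present."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        record = dict(zip(FIELDS, row))  # short rows simply omit later fields
        if record.get("coordinates"):
            lon, lat = record["coordinates"].split(",")
            record["longitude"] = float(lon)
            record["latitude"] = float(lat)
        records.append(record)
    return records


records = parse_results(SAMPLE_OUTPUT)
match_counts = Counter(record["match_indicator"] for record in records)
print(dict(match_counts))
```

Since the output may not preserve the input order, joining the results back to the original list is best done on the record ID rather than on row position.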

Appendix C

Results from using Find Geographies Using … One Line, Address or Geographic Coordinates.

For each successful match using any of these options the software returns the following information (note the four example screen shots below). (Note the examples below have been cut up to provide a natural break to help describe the output data.)

This first section of the output data is similar to the output data given for the Find Locations using either One Line or Address (please note Appendix A.)

Output screen shot #2 (Geographic information for the county that the address range is located in.)

The output information is defined in the table below:

Geography Output County Information  Definition
OID                                  MAF/TIGER Object Identifier
STATE                                State FIPS Code
FUNCSTAT                             Functional Status
ST.GEOMETRY.LEN                      The Length of the State Boundary
AREAWATER                            Water Area in Square Miles for the County
NAME                                 The Name of the County
LSADC                                Legal Statistical Area Description Code
CENTLON                              Centroid Longitude (of the county)
ST.GEOMETRY.AREA                     The Geometric Area of the State
BASENAME                             Base name portion of the Standardized Name
INTPTLAT                             Internal Point Latitude
COUNTYCC                             County FIPS Class Code
MTFCC                                MAF/TIGER Feature Classification Code
COUNTY                               County FIPS Code
GEOID                                Geographic Identifier – Fully Concatenated Geographic Code
CENTLAT                              Centroid Latitude
INTPTLON                             Internal Point Longitude
AREALAND                             Land Area in Square Miles (of the County)
COUNTYNS                             County National Standard Code
OBJECTID                             MAF/TIGER Object Identifier?

Output screen shot #3 (Geographic information for the tract that the address range is located in.)

The output information is defined in the table below

Geography Output Tract Information   Definition
OID                                  MAF/TIGER Object Identifier
STATE                                State FIPS Code
FUNCSTAT                             Functional Status
NAME                                 Census Tract Number
AREAWATER                            Water Area in Square Miles for the Tract
LSADC                                Legal Statistical Area Description Code
CENTLON                              Centroid Longitude (of the tract)
BASENAME                             Base name portion of the Standardized Name
INTPTLAT                             Internal Point Latitude
MTFCC                                MAF/TIGER Feature Classification Code
COUNTY                               County FIPS Code
GEOID                                Geographic Identifier – Fully Concatenated Geographic Code
CENTLAT                              Centroid Latitude
INTPTLON                             Internal Point Longitude
AREALAND                             Land Area in Square Miles of the Tract
OBJECTID                             MAF/TIGER Object Identifier
TRACT                                Tract Number

Output screen shot #4 (Geographic information for the block that the address range is located in.)

The output information is defined in the table below:

Geography Output Block Information   Definition
BLKGRP                               Census Block Group Number
OID                                  MAF/TIGER Object Identifier
FUNCSTAT                             Functional Status
STATE                                State FIPS Code
AREAWATER                            Water Area in Square Miles for the Block
NAME                                 The Name of the Block
SUFFIX                               The Suffix of the Block
LSADC                                Legal Statistical Area Description Code
CENTLON                              Centroid Longitude (of the block)
LWBLKTYP                             Land/Water Block Type
BASENAME                             Base name portion of the Standardized Name
BLOCK                                Block Number
INTPTLAT                             Internal Point Latitude
MTFCC                                MAF/TIGER Feature Classification Code
COUNTY                               County FIPS Code
GEOID                                Geographic Identifier – Fully Concatenated Geographic Code
CENTLAT                              Centroid Latitude
INTPTLON                             Internal Point Longitude
AREALAND                             Land Area in Square Miles for the Block
OBJECTID                             MAF/TIGER Object Identifier?
TRACT                                Tract Number

Output screen shot #5 (Geographic information for the State that the address range is located in.)

The output information is defined in the table below:

Geography Output State Information   Definition
OID                                  MAF/TIGER Object Identifier
STATE                                State FIPS Code
FUNCSTAT                             Functional Status
NAME                                 State Name
AREAWATER                            Water Area in Square Miles for the State
LSADC                                Legal Statistical Area Description Code
CENTLON                              Centroid Longitude (of the State)
STUSAB                               USPS State Abbreviation
BASENAME                             Base name portion of the Standardized Name
INTPTLAT                             Internal Point Latitude
DIVISION                             Census Division Code
MTFCC                                MAF/TIGER Feature Classification Code
STATENS                              State National Standard Code
GEOID                                Geographic Identifier – Fully Concatenated Geographic Code
CENTLAT                              Centroid Latitude
INTPTLON                             Internal Point Longitude
REGION                               Census Region Code
AREALAND                             Land Area in Square Miles for the State
OBJECTID                             MAF/TIGER Object Identifier?

Appendix D

Results from using Find Geographies Using… Address Batch

The output information is defined in the table below.

Output Fields                        Definition
Record ID Number                     The unique number for each address submitted. Note the output file may not return the records in the same order as that submitted.
Input Address                        The address submitted by the customer.
TIGER Address Range Match Indicator  Values in the Match Results column indicate whether the address matched a census address range (Match), did not match (No_Match), or was a tie between two or more Census address ranges (Tie). A Tie indicates multiple possible results for that address.
TIGER Match Type                     Indicates if the match that occurred was exact (Exact), or equivocated (Equivocate).
TIGER Output Address                 Standardized version of the input address that was used to match to the TIGER Address range.
Interpolated Latitude and Longitude  The Longitude (X) and Latitude (Y) values based on an interpolation of where the address falls along the address range.
TIGERLine ID                         The unique Tiger Line Id of the street segment.
TIGERLine ID Side                    The side of the street the address range lies on, either L (Left) or R (Right).
State Code                           The State FIPS Code Identifier
County Code                          The County FIPS Code Identifier
Tract Code                           Census Tract Number
Block Code                           Census Block Number
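The four code columns at the end of this table can be joined into a single block-level identifier of the kind described as GEOID in Appendix C. The sketch below assumes the standard widths of 2 digits for state, 3 for county, 6 for tract and 4 for block, and pads values that have lost their leading zeros (which can happen when a file is opened in spreadsheet software). The function name and the example codes are illustrative only.

```python
# Assumed code widths: state, county, tract, block.
FIELD_WIDTHS = (2, 3, 6, 4)


def build_block_geoid(state, county, tract, block):
    """Concatenate zero-padded FIPS and census codes into a 15-digit block GEOID."""
    parts = []
    for code, width in zip((state, county, tract, block), FIELD_WIDTHS):
        text = str(code).strip().zfill(width)  # restore dropped leading zeros
        if not text.isdigit() or len(text) != width:
            raise ValueError(f"invalid code {code!r} for width {width}")
        parts.append(text)
    return "".join(parts)


geoid = build_block_geoid("24", "033", "800100", "1001")
print(geoid)
```

Keeping the codes as text rather than numbers when reading the results file avoids the leading-zero problem in the first place.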
