The 2017 Commodity Flow Survey- Subarea Estimates

Authors (in alphabetical order): Olivia Brozek (U.S. Census Bureau) Scot Dahl (U.S. Census Bureau) Steven Riesz (U.S. Census Bureau)

May 2021

Table of Contents

• Feedback from the data users
• Information about the Commodity Flow Survey
• Definitions of the Subareas and the origin-destination pairs
• Description of the files containing the estimates
• Quality of the estimates
• Disclosure avoidance
• Citing the estimates, and citing this paper

Feedback from the data users

This is the first time these estimates have been published on the internet, and comments from the data users about the content and usefulness of these estimates are appreciated. Please contact the CFS staff by e-mail ([email protected]) or phone (301-763-2108), and give them your comments.

Information about the Commodity Flow Survey

The Commodity Flow Survey (CFS) is a joint effort by the Bureau of Transportation Statistics (BTS), U.S. Department of Transportation, and the U.S. Census Bureau, U.S. Department of Commerce. The survey is the primary source of national and subnational level data on domestic freight shipments by establishments in mining, manufacturing, wholesale, auxiliaries, and selected retail and services trade industries located in the 50 states and the District of Columbia. Data are provided on the type, origin and destination, value, weight, modes of transportation, distance shipped, and ton-miles of commodities shipped. The CFS is conducted every 5 years as part of the Economic Census. It provides estimates of national freight flows for all modes of transportation, and is the only publicly available source of commodity flow data for the truck mode. The CFS was conducted in 1993, 1997, 2002, 2007, 2012, and 2017.

For more information about the 2017 Commodity Flow Survey (CFS), see https://www.census.gov/programs-surveys/cfs.html

For information about the methodology of the 2017 CFS, including sample design, sampling error, nonsampling error, disclosure avoidance, and definitions, see https://www2.census.gov/programs-surveys/cfs/technical-documentation/methodology/2017cfsmethodology.pdf

Definitions of the Subareas and the origin-destination pairs

In what follows, the phrase “metropolitan areas” means micropolitan statistical areas (MiSAs), metropolitan statistical areas (MeSAs), or combined statistical areas (CSAs).

The Subareas are groups of counties, and were defined by the Bureau of Transportation Statistics (BTS), by dividing the existing 132 CFS Areas into 329 Subareas, primarily based on the sample size of shipments (more precisely, the number of shipments that were used to compute estimates) and the definitions of metropolitan areas. Note that 33 of the 132 CFS Areas were not split into Subareas. Each Subarea consists of at least one county and, in general, each Subarea contains a sample size of at least 10,000 shipments, though there were many exceptions.

In more detail, we used the following criteria to define Subareas:

• Subareas were groups of counties.
• Subareas were contained within CFS Areas, i.e. no Subarea crosses CFS Area boundaries.
• Within rest-of-state CFS Areas, where there were metropolitan areas that did not make up too large a proportion of the rest-of-state CFS Area, Subareas were created from metropolitan areas, e.g. Toledo, Little Rock, Des Moines.
• Within rest-of-state CFS Areas, where there were metropolitan areas that made up a very large proportion of the rest-of-state CFS Area, Subareas were not created from metropolitan areas, e.g. Albuquerque, Columbia.
• No metropolitan areas that were contained within a CFS Area were split into Subareas.
• We retained as much logical geographic contiguity as possible.
• Subareas contained 10,000 ± 2,000 shipments, if possible.
  o In some cases, this target was reduced to about 5,000 shipments.
  o Some Subareas had many more than 10,000 shipments.

Estimates were not made for every possible origin-destination pair due to data limitations and the lack of industry exchange. BTS determined the particular origin-destination pairs for which estimates were made, as follows:

After defining the Subareas, the 2012 CFS Public Use Microdata (PUM) file was used to identify origin-destination pairs with enough likely shipment flow between them to produce useful estimates. The pairs were determined by the following procedure:

• Each origin Subarea in a state was paired with itself, all other Subareas in its state, all four Census regions, and the US.

In addition:

• If there were roughly 300 (or more) shipments going from an origin Subarea to all of the Subareas of a neighboring state, then all of those Subareas (as well as the CFS Areas and the state) were paired with the origin Subarea.
• Otherwise, if there were roughly 300 (or more) shipments going from the origin Subarea to all of the CFS Areas of a neighboring state, then all of those CFS Areas (and the state) were paired with the origin Subarea.
• Otherwise, if there were roughly 300 (or more) shipments going from the origin Subarea to a state, then that state was paired with the origin Subarea.
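The cascading rule for a neighboring state can be sketched in a few lines of Python. This is only an illustration, not the procedure BTS actually ran: the function name, the dictionary inputs, and the reading of "to all of the Subareas" as "each Subarea receives at least the threshold" are assumptions, and the text itself says the threshold was only "roughly" 300.

```python
from typing import Dict, List

THRESHOLD = 300  # "roughly 300 (or more) shipments"; treated here as a hard cutoff (assumption)


def pair_with_neighbor_state(
    subarea_counts: Dict[str, int],   # shipments from the origin Subarea to each Subarea of the state
    cfs_area_counts: Dict[str, int],  # shipments from the origin Subarea to each CFS Area of the state
    state: str,
    state_count: int,                 # shipments from the origin Subarea to the state as a whole
    threshold: int = THRESHOLD,
) -> List[str]:
    """Return the destinations in a neighboring state to pair with one origin Subarea."""
    # Finest level first: every Subarea of the neighboring state has enough shipments.
    if subarea_counts and all(n >= threshold for n in subarea_counts.values()):
        return list(subarea_counts) + list(cfs_area_counts) + [state]
    # Otherwise fall back to CFS Areas, plus the state.
    if cfs_area_counts and all(n >= threshold for n in cfs_area_counts.values()):
        return list(cfs_area_counts) + [state]
    # Otherwise fall back to the state alone.
    if state_count >= threshold:
        return [state]
    return []
```

The order of the checks matters: each fallback is tried only when the finer level fails, which mirrors the "Otherwise" wording of the bullets.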

The estimates were made to capture commodities (in four broad groups) transported by trucks using approximately 5,734,000 truck or ground parcel shipments contained in the 2017 CFS data, out of approximately 5,979,000 total shipments. Commodities transported by other modes are not included in this special tabulation of CFS 2017.

Description of the files containing the estimates

There are two files that contain the estimates.

1. 2017 CFS Subarea Tab - Table O (PUBLIC).csv.

This file contains estimates of the total value, total weight, total ton-miles, average miles per shipment, and total number of shipments, for truck or ground parcel shipments. The areas for which the estimates are made are: Origin (Subarea) x Destination (Census Region, State, CFS Area, or Subarea) x Commodity Group.

2. 2017 CFS Subarea Tab - Table D (PUBLIC).csv.

This file contains estimates of the total value, total weight, total ton-miles, average miles per shipment, and total number of shipments, for truck or ground parcel shipments. The areas for which the estimates are made are: Origin (Census Region, State, CFS Area, or Subarea) x Destination (Subarea) x Commodity Group.

The origin-destination (OD) pairs on this file are the reverse of those on the first file. That is, if there is an OD pair on the first file of Origin = geographic area 1 (a Subarea) and Destination = geographic area 2 (a Subarea, or a higher level of geography), then on the second file, there is an OD pair of Origin = geographic area 2 (a Subarea, or a higher level of geography) and Destination = geographic area 1 (a Subarea).

Because the estimates are rounded, the sum of the estimates for smaller areas may not exactly equal the estimate for a larger area.
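A small arithmetic illustration of this rounding effect follows. The numbers are made up for the example and do not come from the files.

```python
# Hypothetical unrounded estimates for three smaller areas (not real CFS values).
parts = [1.4, 1.4, 1.4]

# Each smaller-area estimate is rounded on its own: 1 + 1 + 1.
rounded_parts_sum = sum(round(p) for p in parts)

# The larger-area estimate is rounded once, after summing: 4.2 rounds to 4.
rounded_total = round(sum(parts))

print(rounded_parts_sum, rounded_total)
```

Here the rounded parts add up to 3 while the rounded total is 4, so a difference of this kind between a published total and the sum of its published components is expected and is not an error in the files.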

Table A below describes the variables on these files. The variable names are on the first row of each file.

Table A: Variables on the files containing the estimates

Abbreviations in Table A:

# = An integer between 0 and 9, inclusive.
Z = The number rounds to zero.
S = Suppressed, i.e. not published, because the estimate did not meet publication standards.
N = Not computed.

| Variable | Description | Valid values | Comment |
|---|---|---|---|
| OREG | Origin Census region code | M (Midwest), N (Northeast), S (South), W (West) | See Note 1. |
| OST | Origin FIPS state code | 1 – 56 | See Note 1. |
| OAREA | Origin CFS Area code | ##### or ### | See Note 1. |
| OSUB | Origin Subarea code | A – F, Z | See Note 1. |
| OFAF | Origin FAF (Freight Analysis Framework) Subarea code. OFAF is of the form RFFAS, where R=Census region (M, N, S, W), FF=FIPS state code (1-56), A=FAF CFS Area code (1-9), and S=Subarea code (A-F, Z). Examples: M, M17, M171, M171A | RFFAS (See the box to the left for the meaning of RFFAS.) | See Note 1. |
| DREG | Destination Census region code | M (Midwest), N (Northeast), S (South), W (West) | See Note 1. |
| DST | Destination FIPS state code | 1 – 56 | See Note 1. |
| DAREA | Destination CFS Area code | ##### or ### | See Note 1. |
| DSUB | Destination Subarea code | A – F, Z | See Note 1. |
| DFAF | Destination FAF Subarea code. DFAF is of the form RFFAS, where R=Census region (M, N, S, W), FF=FIPS state code (1-56), A=FAF CFS Area code (1-9), and S=Subarea code (A-F, Z). Examples: M, M17, M171, M171A | RFFAS (See the box to the left for the meaning of RFFAS.) | See Note 1. |
| DMODE | Mode of domestic transportation | “99” | “99” is defined as truck; this includes ground parcel shipments. |
| SCTG | Commodity Group code. SCTG stands for Standard Classification of Transported Goods. | “00”, “01-09”, “10-19”, “20-34”, “35-43” | |
| VAL | Total value, in millions of dollars | A number, S, Z | |
| TON | Total weight, in thousands of tons | A number, S, Z | |
| TMILE | Total ton-miles, in millions | A number, S, Z | |
| AVGMILES | Average miles per shipment | A number, S, Z | |
| WGHTSHPCNT | Total number of shipments | A number, S | |
| VAL_S | CV (coefficient of variation) of the total value, expressed as a percent | A number, S | See Note 2. |
| TON_S | CV of the total weight | A number, S | See Note 2. |
| TMILE_S | CV of the total ton-miles | A number, S | See Note 2. |
| AVGMILES_S | CV of the average miles per shipment | A number, S | See Note 2. |
| WGHTSHPCNT_S | CV of the total number of shipments | N, S | The CV was not computed. |

Note 1: OREG, OST, OAREA, OSUB, DREG, DST, DAREA, and DSUB are blank when they are not needed to characterize the geographic area. When all of the destination variables are blank, the destination is the US. Similarly, OFAF and DFAF may not be five characters long, as shown in the examples in column 2 of this table.

Note 2: The CVs were computed using the 2017 Public Use File GVFs (Generalized Variance Functions).
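As an illustration of the RFFAS form described in Table A, the following Python sketch splits an OFAF or DFAF value into its parts and reports the geographic level it identifies. The function and class names are hypothetical. The sketch also assumes that the state part is always written with two digits, as in the examples M17, M171, and M171A; Table A does not say how one-digit FIPS state codes appear in these variables, so check the files before relying on this.

```python
import re
from typing import NamedTuple, Optional

# R = region letter, FF = two state digits (assumption), A = area digit 1-9, S = Subarea letter.
FAF_PATTERN = re.compile(r"^([MNSW])(?:(\d{2})(?:([1-9])([A-FZ])?)?)?$")


class FafCode(NamedTuple):
    region: str
    state: Optional[str]
    area: Optional[str]
    subarea: Optional[str]
    level: str


def parse_faf(code: str) -> FafCode:
    """Split a code such as 'M171A' into region, state, area, and Subarea parts."""
    match = FAF_PATTERN.match(code.strip())
    if match is None:
        raise ValueError(f"not a recognizable RFFAS code: {code!r}")
    region, state, area, subarea = match.groups()
    # The level is determined by the last part that is present.
    if subarea:
        level = "Subarea"
    elif area:
        level = "CFS Area"
    elif state:
        level = "State"
    else:
        level = "Census region"
    return FafCode(region, state, area, subarea, level)
```

For example, `parse_faf("M17")` reports the level "State", and `parse_faf("M171A")` reports "Subarea". A blank value is rejected here; per Table A, blank destination variables mean the destination is the US, so a caller would handle that case separately.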

For DMODE and SCTG, the quotation marks are part of the csv file output. For example, 99 appears as “99” in the csv file. This prevents two of the commodity groups from being displayed as dates (e.g. 01-09 as 1/9/2019) when the file is opened in Excel.
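When the files are read by a program rather than opened in Excel, the quoted codes and the S, Z, and N flags need explicit handling. The sketch below uses Python's standard csv module on a tiny made-up sample (the rows and the column subset are hypothetical, not taken from the files) and assumes the quotation marks are ordinary CSV quoting. Treating Z as 0.0 is a choice made for this example; the files only say that the number rounds to zero.

```python
import csv
import io

# Hypothetical sample with a subset of the Table A columns; the values are invented.
SAMPLE = (
    "DMODE,SCTG,VAL,TON,WGHTSHPCNT_S\n"
    '"99","01-09",12.5,S,N\n'
    '"99","10-19",Z,0.3,N\n'
)

FLAGS = {"S", "Z", "N"}  # Suppressed, rounds to zero, not computed


def parse_estimate(cell):
    """Return (value, flag): a float with no flag, or None/0.0 together with the flag."""
    text = cell.strip()
    if text in FLAGS:
        return (0.0 if text == "Z" else None, text)
    return (float(text), None)


# csv.DictReader removes the CSV quoting, so codes such as 01-09 stay text and are never dates.
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
parsed = [
    {"SCTG": r["SCTG"], "VAL": parse_estimate(r["VAL"]), "TON": parse_estimate(r["TON"])}
    for r in rows
]
```

Keeping the flag next to the value lets later steps distinguish a suppressed estimate from one that merely rounds to zero.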

Table B below shows examples of the Origin Subarea x Destination x Commodity Group areas for which the estimates are made.

Table B: Examples of Origin Subarea x Destination x Commodity Group Areas

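| Type of Origin x Destination Pair | Origin Subarea (OREG-OST-OAREA-OSUB) | Destination (DREG-DST-DAREA-DSUB; the variable can be blank) | Commodity Group |
|---|---|---|---|
| Subarea x Subarea | M-17-176-A | M-17-176-A | 10-19 |
| Subarea x Subarea | M-17-176-A | M-17-99999-D | 35-43 |
| Subarea x CFS Area | M-17-176-D | M-17-176 | 10-19 |
| Subarea x CFS Area | M-17-176-D | M-18-176 | 20-34 |
| Subarea x State | M-17-176-B | M-17 | 20-34 |
| Subarea x State | M-17-176-B | M-18 | 01-09 |
| Subarea x Region | M-17-99999-A | M | 01-09 |
| Subarea x Region | M-17-99999-A | S | 20-34 |
| Subarea x US | M-17-476-B | (blank) | 01-09 |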

Quality of the estimates

Estimates in this Experimental Data Product release have not been reviewed by the Census Bureau in as much detail as the other published estimates. Thus, caution should be exercised in their use and interpretation.

The estimates may have large sampling errors. Sampling error is a general term. One measure of sampling error is the coefficient of variation (CV), which is also called the relative standard error. The CV is the square root of the variance of the estimate, divided by the expected value of the estimate. The CV was estimated using the method of generalized variance functions, which was developed for the 2017 CFS Public Use File. This method may not estimate the CV as accurately as the method of random groups, which is used for other published CFS estimates. For information about the method of generalized variance functions, see Section 6, pp. 5-6 of the 2017 Commodity Flow Survey (CFS) Public Use File (PUF) Data Users Guide, at https://www2.census.gov/programs-surveys/cfs/datasets/2017/cfs_2017_puf_users_guide.pdf.
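The definition of the CV above can be written out as a short Python sketch. The function names are hypothetical, and the published estimate stands in for the expected value, which is unknown in practice. This shows only the definition, not the generalized variance function method used to produce the CVs on the files.

```python
import math


def cv_percent(variance: float, estimate: float) -> float:
    """CV as a percent: square root of the variance, divided by the estimate, times 100."""
    if estimate == 0:
        raise ValueError("the CV is undefined for an estimate of zero")
    return 100.0 * math.sqrt(variance) / estimate


def standard_error(estimate: float, cv_pct: float) -> float:
    """Recover the standard error from an estimate and its CV (in percent)."""
    return estimate * cv_pct / 100.0
```

For instance, an estimate of 50 with a variance of 4 has a standard error of 2 and therefore a CV of 4 percent; going the other way, a published value of 50 with a CV column of 4 implies a standard error of 2.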

The 2017 CFS first stage sample was stratified by origin CFS Areas, not by origin Subareas. Therefore, the true variance of the Subarea estimates is larger than it would have been, had we stratified by origin Subareas. This is because the sample sizes of the establishments in the origin Subareas vary from sample to sample, when one does not stratify by origin Subareas.

Many of the estimates are computed from a very small number of shipments. Estimates computed from one or two shipments have been suppressed. Some of these suppressed estimates can be derived directly from these tables by subtracting unsuppressed estimates from their respective totals. However, the estimates derived by such subtraction may have large CVs, be subject to poor response, or be affected by other factors that make them potentially misleading. Estimates derived this way should not be attributed to the Bureau of Transportation Statistics or the U.S. Census Bureau.

In addition to sampling error, the estimates may contain high levels of nonsampling error. Nonsampling error can be attributed to many sources: error in coverage of the universe of businesses, nonresponse, differences in the interpretation of questions, mistakes in the recording and coding of data, and other errors in collection, processing, and tabulation of the data. Although no direct measures of nonsampling error are available, steps have been taken in all survey processes to minimize their influence.

Data users who create their own estimates using data from these tables should cite the Bureau of Transportation Statistics and the U.S. Census Bureau as the source of the original data only. The Bureau of Transportation Statistics and the U.S. Census Bureau have not sanctioned, conducted, or reviewed any analysis performed using these estimates. Conclusions drawn from any analysis of these data are the sole responsibility of the performing party.

Disclosure avoidance

For the 2017 CFS, the primary method of disclosure avoidance is noise infusion, in which shipment-level quantities are perturbed before computing the estimates, by applying a random noise multiplier to the quantitative data, such as the shipment value and shipment weight. Disclosure avoidance is accomplished in a manner that causes the vast majority of the estimates in our standard published tables to be perturbed by at most a few percentage points.
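The general idea of a noise multiplier can be illustrated with a toy Python sketch. Everything specific in it is an assumption made for illustration: the function name, the uniform distribution, the 2 percent maximum shift, and the use of one multiplier per shipment. The actual noise distribution and the level at which the Census Bureau applies it are not described here, and this sketch should not be read as the Bureau's method.

```python
import random
from typing import Dict


def perturb_shipment(
    shipment: Dict[str, float],
    rng: random.Random,
    max_shift: float = 0.02,  # hypothetical: perturb by at most 2 percent
) -> Dict[str, float]:
    """Multiply every quantitative field of one shipment by the same random factor."""
    # Hypothetical noise model: multiplier drawn uniformly from [1 - max_shift, 1 + max_shift].
    multiplier = rng.uniform(1.0 - max_shift, 1.0 + max_shift)
    return {name: amount * multiplier for name, amount in shipment.items()}


rng = random.Random(0)  # fixed seed so the illustration is repeatable
noisy = perturb_shipment({"value": 1000.0, "weight": 200.0}, rng)
```

Because the perturbation happens at the shipment level before tabulation, every estimate built from those shipments inherits the noise, and no separate adjustment of the published totals is needed in this toy model.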

Title 13, United States Code (U.S.C.), Sections 8(b), 131, and 182; and Title 49 U.S.C., Section 6302 authorize the Census Bureau to conduct this survey. Title 13 U.S.C., Sections 224 and 225 require businesses and other organizations that receive the survey to respond to the Census Bureau. Section 9 of the same law ensures that responses are confidential and will only be used for statistical purposes. No estimates are provided that would disclose the operations of an individual business.

The Census Bureau has reviewed this data product for unauthorized disclosure of confidential information, and has approved the disclosure avoidance practices applied. (Approval ID: CBDRB-FY21-082.)

Citing the estimates, and citing this paper

If these estimates, or any results based on these estimates, are released to others, you may cite the Bureau of Transportation Statistics and the U.S. Census Bureau as the source of the original data. However, the cautionary statements listed above concerning the data limitations, and the potential influences of sampling and nonsampling errors, should be included with the release.

Source: U.S. Department of Transportation, Bureau of Transportation Statistics; and, U.S. Department of Commerce, U.S. Census Bureau. (Experimental Data Product – 2021-05-21). [Shipment Characteristics, by Origin Geography by Destination Geography by Commodity Group, for Truck or Ground Parcel Shipments] [Experimental Data Product]. 2017 Commodity Flow Survey.

This paper may be cited as follows:

U.S. Department of Transportation, Bureau of Transportation Statistics; and, U.S. Department of Commerce, U.S. Census Bureau. The 2017 Commodity Flow Survey- Subarea Estimates.
https://www.census.gov/data/experimental-data-products/commodity-flow-survey-subarea-estimates.html
https://www2.census.gov/data/experimental-data-products/commodity-flow-survey-subarea-estimates/2017-commodity-flow-survey-subarea-estimates-table-o.xlsx
https://www2.census.gov/data/experimental-data-products/commodity-flow-survey-subarea-estimates/2017-commodity-flow-survey-subarea-estimates-table-d.xlsx