
The Geography of Trade in the European Single Market

Shawn W. Tan∗

The World Bank Draft (October 2016)

Abstract

This paper uses a unique dataset of freight shipments between 270 regions in 28 countries of the European Single Market to examine how trade responds to spatial frictions. We find that aggregate trade falls rapidly over short distances as distance to destination increases. The sharp reduction in trade is driven by the fall in the extensive margins (the total and average number of shipments) and less so by the intensive margins: average value and average price decline gradually, and average quantity does not change as distance increases. Border effects are present and affect aggregate trade largely through the extensive margins. We show that trade in intermediate inputs and the co-location of firms can explain why trade falls rapidly at such short distances. We find that when the industrial demand of a sector at the destination is higher, the regions are more likely to trade with each other.

JEL Classification: F10, F15, R10, R12, R40. Keywords: Distance, border effects, intermediate inputs, firm co-location.

∗Shawn Tan is an economist in the Trade and Competitiveness Global Practice at the World Bank and can be contacted at swtan [at] worldbank.org. He thanks Jelena Kmezic for her research assistance.

1 Introduction

Trade should move freely within the European Single Market (ESM), unimpeded by tariffs and trade barriers. Yet internal trade flows are still impeded by spatial frictions. These spatial frictions, associated with distance and borders, can reduce trade flows between countries in Europe and between the regions in each country. While many papers demonstrate how distance and borders reduce trade in Europe, they do not examine how and why these spatial frictions reduce trade. This paper demonstrates not only that spatial frictions reduce trade within the ESM, but also which components of trade are reduced most by spatial frictions. In addition, this paper shows how trade in intermediate goods and the co-location of firms can explain the strong negative effect of spatial frictions on trade.

This paper uses a unique dataset that captures freight shipments in 28 countries within the ESM. The data provides origin-destination detail about shipments between 270 European regions at the NUTS-2 level for 13 sectors.1 To our knowledge, this is the best available data documenting sector-level inter-regional trade in Europe: it captures not only international trade flows between 28 countries but also inter-regional trade flows between the 270 regions. The detailed data allow us to decompose trade flows into the intensive and extensive margins of trade and to examine how spatial frictions reduce these components of trade.

We find that spatial frictions exert a strong negative effect on trade in the ESM. Aggregate trade falls rapidly over short distances as the distance to destination increases. The sharp reduction in trade is driven largely by the fall in the extensive margins, where the total and average number of shipments drop after 250 km and remain flat thereafter. In contrast, the intensive margins are less affected by distance: average value and average price decrease gradually, and average quantity does not change as distance increases.
Thus the relationship between aggregate trade and distance within the ESM over short distances is driven largely by shipments to nearby customers (the extensive margin) and not the value of shipments to customers (the intensive margin). Border effects are also present in the ESM. Trade flows within national borders and

1The NUTS (Nomenclature of Territorial Units for Statistics) is the standard used by the EU to refer to different subdivisions of countries for statistical purposes. There are three NUTS levels: NUTS-1 refers to major economic regions with a population of at least 3 million people; NUTS-2 refers to basic regions with a population of 800,000 to 3 million people; and NUTS-3 refers to small regions with about 150,000 to 800,000 people.

within regional borders are higher than trade flows that cross these borders. Aggregate trade flows are 5.75 times higher for shipments within the same country and 10 times higher for shipments within the same region, compared to shipments that cross the national or regional borders. A large proportion (66 to 92 percent) of the own-region and own-country border effects occurs through the extensive margins. In contrast, the intensive margins play a smaller role in how spatial frictions affect trade.

We explore the hypothesis that trade in intermediate inputs and the co-location of firms explain why aggregate trade and the extensive margins fall at such short distances. To lower trade costs, firms are more likely to locate near firms that produce their inputs. Producing and consuming firms may choose to co-locate in the same region or cluster spatially in neighboring regions. Regions with matching production structures, where a region consumes the products of the other region, are more likely to trade with each other. Using probits to test this hypothesis for each sector, we find that when the industrial demand of a sector at the destination is higher, the probability of the two regions trading is higher, even after controlling for the supply and demand for the goods. The results are robust to different measurements of industrial demand at the destination.

The paper contributes to two strands of literature. The first strand relates to the estimation of border effects, and in particular the border effects in Europe. The literature started with McCallum (1995), who looked at U.S.-Canada trade flows and showed that there is a ‘border puzzle’, where internal trade flows are higher than trade flows that cross the national border. Many papers have estimated the border effects in Europe using two types of trade flows: international (country to country) trade flows and regional (region to region/country) trade flows.
Head and Mayer (2000), Nitsch (2000), Chen (2004), and Minondo (2007) use international trade flows and find that border effects exist for a variety of European countries. However, the border effects estimated in these papers may suffer from poor identification, as each country has only one internal trade flow (trade with itself) to identify the border variable, whereas an estimation using regional trade flows has more than one internal trade flow. As a result, the border effects estimated using international trade flows may be an artifact of the geographical aggregation of the data, and the border effects may shrink when finer geographical data is used (Wolf, 2000; Hillberry, 2002; Hillberry and Hummels, 2003). The set of papers that use regional trade data to estimate border effects in Europe is small due to the unavailability of regional trade flows data.2 Many papers have used

2This issue is not limited to Europe as data on trade flows within a country is limited.

the same dataset that covers Spanish regions trading with each other and with other European countries to estimate the border effects for Spanish regions (Gil-Pareja et al., 2005; Ghemawat et al., 2010; Requena and Llano, 2010; Llano-Verduras et al., 2011; Garmendia et al., 2012). These papers use freight data to construct regional trade flows but the country coverage is limited: while the data contains trade flows between Spanish regions, trade flows between these regions and other European countries are at the national level. Helble (2007) uses freight data between regions in France and Germany, and Kashiha et al. (2016) use data on shipments of wine from eight European countries to the U.S. to estimate the border effects in Europe. This paper will be the first to estimate the border effects in Europe using a comprehensive dataset that covers trade flows between 270 regions in 28 European countries.

This paper is closest to the work by Hillberry and Hummels (2008), who use data on freight shipments from the U.S. Commodity Flow Survey. They show that aggregate trade falls dramatically over a short distance, which is largely driven by the extensive margins. They also show that state borders impede cross-border trade. More importantly, the authors show that border effects exist for trade that crosses the finest geographical unit, a five-digit zip code, highlighting that the geographical aggregation of data cannot fully explain the presence of border effects. One can imagine a reductio ad absurdum scenario where border effects may be present for trade between city blocks.

The second strand of literature relates to the examination of the reasons behind the border effects and what drives the relationship between aggregate trade and spatial frictions.
One explanation is the role of intermediate inputs and the co-location of firms.3 Firms locate close to each other and the input-output linkages between these firms can increase the trade of intermediate inputs, thereby increasing the estimated border effects. Hillberry and Hummels (2008) demonstrate that the trade in intermediate inputs can explain this relationship. In her estimation of border effects for seven EU countries, Chen (2004) also finds some evidence that the spatial co-location of firms increases border effects. While Hillberry and Hummels (2008) have data at a finer geographical level within the U.S., this paper can examine the issue over a wider geographical scope with data that has trade flows within and between countries.

3There are other possible reasons to explain the border effects. Anderson and Van Wincoop (2003) show that large border effects may be an artifact of omitted variable bias, but recent studies that find border effects in Europe have accounted for this estimation bias. Another reason can be the presence of smaller plants, as Holmes and Stevens (2012) show that larger plants are able to invest in distribution channels and are more likely to ship over longer distances.

The paper proceeds in five sections. Section 2 presents the decomposition of aggregate trade and the empirical strategy to examine how spatial frictions affect trade. Section 3 describes the data. Section 4 presents the empirical results and shows how trade is affected by spatial frictions. Section 5 provides an explanation for the results. Section 6 concludes.

2 Empirical Framework

Analyses of spatial frictions on trade are usually conducted with a gravity model, where aggregate trade values are regressed on a set of gravity control variables. The assumptions behind the gravity model are that all firms are symmetric, all varieties are traded due to the constant elasticity of substitution (CES) consumption function, and trade costs are represented in the iceberg form where the ‘melt’ is proportional to the value and distance traveled. However, as Hillberry and Hummels (2008) show, prices, quantities and the number of varieties can vary across destinations and co-vary with spatial frictions. Thus we should examine the effects of spatial frictions not only on aggregate trade values, but also on the components of aggregate trade.

We decompose aggregate trade into six components following Hummels and Klenow (2005) and Hillberry and Hummels (2008). Aggregate trade ($T_{ij}$) is the sum of the value of all shipments from region i to region j. Region i is the origin and region j is the destination, each defined as a NUTS-2 region. Aggregate trade can be decomposed into the intensive and extensive margins:

$$T_{ij} = N_{ij} \times \overline{PQ}_{ij}, \tag{1}$$

where the extensive margin is the total number of shipments ($N_{ij}$) and the intensive margin is the average value of each shipment ($\overline{PQ}_{ij}$) between region i and region j. Both the intensive and extensive margins can be further decomposed into their components. The total number of shipments can be decomposed into the number of unique sectors k shipped ($N^{k}_{ij}$) and the average number of shipments per sector between i and j ($N^{f}_{ij}$). Average value can be decomposed into the average price ($\overline{P}_{ij}$) and average quantity ($\overline{Q}_{ij}$) per shipment between i and j. Thus the second decomposition of aggregate trade is as follows:

$$T_{ij} = \underbrace{N^{k}_{ij} \times N^{f}_{ij}}_{=N_{ij}} \times \underbrace{\overline{P}_{ij} \times \overline{Q}_{ij}}_{=\overline{PQ}_{ij}}. \tag{2}$$
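The decompositions in equations (1) and (2) can be illustrated with a minimal Python sketch. The record layout, the function name `decompose` and the example values are hypothetical. The sketch also assumes that the average price is quantity-weighted (total value over total quantity), a choice the text does not spell out but one that makes the product of average price and average quantity equal the average shipment value exactly.

```python
from collections import defaultdict


def decompose(shipments):
    """Decompose trade for each origin-destination pair.

    `shipments` is an iterable of hypothetical records:
    (origin, destination, sector, quantity_kg, unit_price).
    """
    pairs = defaultdict(list)
    for origin, dest, sector, qty, price in shipments:
        pairs[(origin, dest)].append((sector, qty, price))

    result = {}
    for pair, rows in pairs.items():
        total_value = sum(q * p for _, q, p in rows)   # T_ij
        n_shipments = len(rows)                        # N_ij
        n_sectors = len({s for s, _, _ in rows})       # N^k_ij
        total_qty = sum(q for _, q, _ in rows)
        result[pair] = {
            "T": total_value,
            "N": n_shipments,
            "PQ_bar": total_value / n_shipments,       # average value
            "N_k": n_sectors,
            "N_f": n_shipments / n_sectors,            # shipments per sector
            # Assumption: quantity-weighted average price, so that
            # P_bar * Q_bar reproduces PQ_bar exactly.
            "P_bar": total_value / total_qty,
            "Q_bar": total_qty / n_shipments,
        }
    return result


# Illustrative records only; region codes, quantities and prices are made up.
example = [
    ("AT11", "AT12", "01", 1000.0, 2.0),
    ("AT11", "AT12", "01", 500.0, 2.0),
    ("AT11", "AT12", "04", 2000.0, 1.5),
]
parts = decompose(example)[("AT11", "AT12")]
```

Multiplying `N_k`, `N_f`, `P_bar` and `Q_bar` for a pair recovers `T`, which is the identity in equation (2).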

First we use a non-parametric technique, a kernel regression estimator, to uncover the relationship between distance and the trade components.4 The advantage of using kernel regressions is that they provide a picture of how trade reacts to distance, especially at very short distances. The disadvantage is that the kernel regressions do not control for other factors that may confound the relationship between distance and trade.

Next we use linear regressions to estimate the relationship between the trade variables and distance, which is similar to other gravity model estimations. The linear regressions control for other effects besides distance and can estimate the importance of region and country borders. The logs of aggregate trade and its six components are used as dependent variables in separate regressions. Taking logs of equations (1) and (2), the dependent variables are:

$$\ln T_{ij} = \ln N_{ij} + \ln \overline{PQ}_{ij}, \tag{3}$$
$$\ln T_{ij} = \ln N^{k}_{ij} + \ln N^{f}_{ij} + \ln \overline{P}_{ij} + \ln \overline{Q}_{ij}. \tag{4}$$

Each dependent variable is regressed on the log of bilateral distance ($\ln D_{ij}$), the square of log bilateral distance ($(\ln D_{ij})^2$), and dummy variables that equal one if the origin and destination are in the same region ($\text{region}_{ij}$) and if they are in the same country ($\text{country}_{ij}$).5 The square of log distance is included to account for the non-linearity in the relationship between trade and distance. Origin and destination fixed effects are included in the regression to account for the multilateral resistance terms.6 The linear regressions also provide the relative magnitude of the regression coefficients

4The kernel regression uses the Gaussian kernel estimator in Stata, evaluated at n=50 points, and allows the estimator to determine the optimal bandwidth.

5While including a variable for similar languages between the trading pairs is common in gravity model estimation, the variable is excluded from the regression as it may be collinear with the country dummy variable. Many of the European countries have languages that are unique to their country.

6The inclusion of origin and destination fixed effects accounts for the omitted variable bias highlighted by Anderson and Van Wincoop (2003).

to calculate the importance of the components of total value. As OLS is a linear operator, the regression coefficients of aggregate trade will equal the sum of the coefficients of the same variables in the regressions of its trade components. Thus the contribution of each component's distance coefficient to the total gives the relative importance of the intensive and extensive margins in the effects spatial frictions have on trade. For example, when $\ln T_{ij}$, $\ln N_{ij}$ and $\ln \overline{PQ}_{ij}$ are regressed on distance, the distance coefficients ($\beta$) in the latter two regressions will sum to the distance coefficient in the first regression: $\beta_{N_{ij}} + \beta_{\overline{PQ}_{ij}} = \beta_{T_{ij}}$, where the subscripts are the dependent variables. If $\beta_{\overline{PQ}_{ij}} / \beta_{T_{ij}} = 0.6$, then 60 percent of the reduction of total value by spatial frictions is explained by the reduction in the intensive margins.
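The additivity of the OLS coefficients can be checked numerically. The following is a simplified sketch on synthetic data, not the paper's estimation: it uses a single regressor (log distance) with an intercept and omits the quadratic term, border dummies and fixed effects. The same additivity holds whenever all regressions share the same set of regressors. The helper `ols_slope` and all numbers are hypothetical.

```python
import math
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


random.seed(0)

# Synthetic log distance and log margins, for illustration only.
ln_d = [math.log(random.uniform(1.0, 4500.0)) for _ in range(200)]
ln_n = [5.0 - 1.0 * d + random.gauss(0.0, 0.5) for d in ln_d]
ln_pq = [8.0 - 0.3 * d + random.gauss(0.0, 0.5) for d in ln_d]
ln_t = [a + b for a, b in zip(ln_n, ln_pq)]  # equation (3)

beta_n = ols_slope(ln_d, ln_n)
beta_pq = ols_slope(ln_d, ln_pq)
beta_t = ols_slope(ln_d, ln_t)

# Shares of the distance effect attributed to each margin.
extensive_share = beta_n / beta_t
intensive_share = beta_pq / beta_t
```

Up to floating-point error, `beta_n + beta_pq` equals `beta_t`, so the two shares sum to one.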

3 Data

The paper uses a unique dataset, the European Road Freight Transport Survey, that captures the commodity freight movement between 270 regions in 28 countries within the ESM in 2011.7 There are over 2.6 million shipments between these regions in 2011. Each shipment records the quantity or weight (in kilograms or kg), the goods classification, the regions of the origin and destination, and the actual distance traveled. The origin is the place of loading and the destination is the place of unloading of the shipments. Both origin and destination are reported at the NUTS-2 region level: there are 270 origin regions and 272 destination regions in the data. There are 20 goods classifications in the data that are broad in nature and can be defined as sectors. The analysis will use 13 of these goods classifications covering the agriculture and manufacturing sectors. The goods classifications are based on the NST 2007 classification and the full descriptions are presented in the appendix.8

The data is representative of the freight flows in Europe. It is based on a random stratified survey of road vehicles and collected by each European country for its jurisdiction over a period of one week for each quarter in 2011. Data collection follows the same survey methodology set out by Eurostat. The data is adjusted with survey weights and provided to Eurostat, which compiles and anonymizes the dataset. Since the data is collected for each

7The countries included in the estimation are: Austria, , Bulgaria, Cyprus, Czech Republic, , , France, Germany, Greece, Hungary, Ireland, Italy, Latvia, , , , , Slovakia, Slovenia, , , , and the . The data includes Croatia but the country is dropped from the estimation because Croatia was not part of the ESM in 2011 and would have tariff barriers on shipments in and out of the country.

8The analysis excludes the other non-agriculture or non-manufactured goods, such as municipal waste, mail and parcels, and goods moved in the course of household and office moves.

quarter in 2011, shipments between an origin-destination pair of a goods classification in different quarters are regarded as separate shipments. In addition, a weighted average distance is calculated for each origin-destination pair using the quantity of each shipment of the origin-destination pair as weights.

The main drawback of this data is the lack of price information. The survey only collects shipment quantities, not values. Although the quantities are expressed in kg and allow us to easily aggregate the data, the shipments are not comparable without their values. We use the 2011 unit prices from the CEPII unit price dataset, which is calculated from trade flows data (Berthou and Emlinger, 2011). Unit prices are available between reporters and partners at the country level for products at the HS 6-digit code level, so unit prices can be calculated for each origin-destination-sector triad.9 The concordance between the goods classification and the HS codes is presented in the appendix.

The calculation of the bilateral unit prices for each sector is done in three steps. First, the countries of origin and destination regions are matched to the reporters and partners in the unit price data. Second, a bilateral unit price for each sector is constructed by calculating the average unit price of all the products in that sector traded between the two countries. Third, for sectors without unit prices, such as shipments within countries, the unit price for that country-sector is calculated as the average of the bilateral unit prices for all European countries it ships to.10 Following these steps, each origin-destination-sector in the freight data has a corresponding unit price and the value of the shipments can be calculated.

There are three advantages to this dataset. First, to the best of our knowledge, this is the best available data documenting regional trade in Europe. While this dataset cannot compare with the richness of the commodity flows survey in the U.S.
used by Hillberry and Hummels (2008), there is no other dataset that covers trade flows between European countries at the sub-national level and is comprehensive enough to cover most European countries. Second, the key feature of the data is the distance of the shipment between the origin and destination. The distance of each shipment is the actual distance covered by the goods as they are transported on the vehicle and is distributed continuously

9Unit prices are only available for bilateral pairs that are trading in 2011. While each origin-destination pair that has positive shipments is likely to have data on unit prices, some pairs do not. This is because the unit price data is based on official exports and the freight data captures movements of goods, which can include transshipment.

10Cyprus is the only country that does not ship goods from two sectors (goods classification 02 and 07) to other European countries. But it has shipments from these sectors that are within Cyprus. The unit prices for these sectors are calculated as the average unit price of all products in that sector that are traded with all countries.

from 1 to 4,477 km. Unlike other border effect studies, the bilateral distance is not just a measurement of distance between the central locations of two regions. The data allows us to distinguish the effects of small distances and examine the non-linear effects of distance on trade. Third, the micro data allows us to decompose the aggregate flows between origin and destination. Total value can be decomposed into the average value, total number of shipments, number of unique sectors, average number of shipments, average quantity and average price of shipments.

The summary statistics for the aggregate value and its components are presented in Table 1. There is a large variation in the value of trade, ranging from US$760 to US$1,400 million, and in the distance that shipments travel, between 1 and 4,500 km. The percentage of cross-border trade to total trade of each region is presented in Figure 1. Figure 1(a) shows the percentage of the region’s trade flows within itself to its total trade. Many regions have high within-region trade flows, with percentages above 62 percent. Figure 1(b) shows the percentage of the region’s trade within the country, including itself, to its total trade. Many regions trade more than 86 percent within the country. It is notable that many German regions have low within-region trade flows and low within-country trade flows, indicating that these regions trade more with other countries than with other German regions.
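The three-step construction of unit prices described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure as implemented for the paper: the function names, the dictionary layout, the product identifiers and the prices are all hypothetical, and the country of a region is taken from the first two characters of its NUTS code.

```python
from collections import defaultdict
from statistics import mean


def country_of(region_code):
    """Step 1: map a NUTS-2 region to its country (two-letter code prefix)."""
    return region_code[:2]


def bilateral_sector_prices(product_prices, product_to_sector):
    """Step 2: average product-level unit prices within each sector.

    product_prices maps (reporter, partner, product) -> unit price.
    product_to_sector is a hypothetical product-to-sector concordance.
    """
    grouped = defaultdict(list)
    for (reporter, partner, product), price in product_prices.items():
        sector = product_to_sector.get(product)
        if sector is not None:
            grouped[(reporter, partner, sector)].append(price)
    return {key: mean(values) for key, values in grouped.items()}


def unit_price(origin_region, dest_region, sector, bilateral):
    """Step 3: use the country-pair price if present; otherwise fall back to
    the average of the origin country's bilateral prices in that sector."""
    origin, dest = country_of(origin_region), country_of(dest_region)
    key = (origin, dest, sector)
    if key in bilateral:
        return bilateral[key]
    fallback = [price for (rep, _, sec), price in bilateral.items()
                if rep == origin and sec == sector]
    return mean(fallback) if fallback else None


# Made-up prices and placeholder product identifiers, for illustration only.
product_prices = {
    ("AT", "DE", "product_a"): 2.0,
    ("AT", "DE", "product_b"): 4.0,
    ("AT", "IT", "product_a"): 6.0,
}
product_to_sector = {"product_a": "01", "product_b": "01"}
bilateral = bilateral_sector_prices(product_prices, product_to_sector)

cross_border_price = unit_price("AT11", "DE21", "01", bilateral)
within_country_price = unit_price("AT11", "AT12", "01", bilateral)
```

In this example the within-country shipment has no bilateral price of its own, so it receives the average of the two bilateral sector prices for the origin country.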

[Table 1 about here.]

[Figure 1 about here.]

We will also use data on regional population, gross output of a sector, and the industrial demand of a sector for the empirical estimation in Section 5 that examines the reason behind the effects of spatial frictions on trade. The population data for each region in 2011 is obtained from Eurostat. Data on gross output levels of the origin and destination is not available, so it is calculated using regional GDP and sectoral employment provided by Eurostat. The gross output level of each sector is the proportion of regional GDP determined by the labor share of the sector in 2011. Labor shares are the share of employment in the sector in the total of all manufacturing and services sectors.11 The

11The sectors are defined by the NACE classification. The denominator in the labor shares covers sectors B (Mining and quarrying), C (Manufacturing), D (Electricity, gas, steam and air conditioning supply), E (Water supply; sewerage, waste management and remediation activities), F (Construction), G (Wholesale and retail), H (Transportation and storage), I (Accommodation and food service activities), J (Information and communication), L (Real estate activities), M (Professional, scientific and technical activities), and N (Administrative and support service activities).

concordance between the goods classification and the NACE sector is presented in the appendix. Lastly, industrial demand is the expenditure of a sector in the destination, measured by the sum of all shipments in the sector at the destination. Two alternative measures of industrial demand will also be used: the number of firms and the number of employees in a region’s sector in 2011, obtained from Eurostat. Some regions are missing data on the number of firms and employees in 2011 despite the data being available in 2010 and 2012, so averages over these two years are used to fill in the missing data in 2011.12
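The gross output proxy and the treatment of missing 2011 values described above can be written down in a few lines. This is a sketch under stated assumptions: the function names and arguments are hypothetical, and `None` stands in for a missing observation.

```python
def gross_output_proxy(regional_gdp, sector_employment, total_employment):
    """Regional GDP scaled by the sector's labor share."""
    if total_employment <= 0:
        raise ValueError("total employment must be positive")
    return regional_gdp * sector_employment / total_employment


def fill_2011(value_2010, value_2011, value_2012):
    """Return the 2011 value, or the 2010/2012 average if 2011 is missing.

    The fill is applied only when both neighboring years are observed.
    """
    if value_2011 is not None:
        return value_2011
    if value_2010 is not None and value_2012 is not None:
        return (value_2010 + value_2012) / 2
    return None
```

For example, `gross_output_proxy(1000.0, 20, 200)` assigns one tenth of regional GDP to a sector that employs 20 of 200 workers, and `fill_2011(10, None, 14)` returns the two-year average.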

4 Decomposing Trade Responses to Spatial Frictions

We begin with the kernel regression of aggregate trade ($T_{ij}$) on distance, which is presented in Figure 2. The value of trade between regions drops drastically with distance, falling sharply within the first 250 km and remaining flat thereafter. Many regional pairs that are less than 250 km apart are within the same country. This result emphasizes how distance affects trade even at very short distances and the importance of examining the effects of distance at a finer geographical level. Other gravity studies examine how distance affects the trade between countries and do not capture the effects within countries and at short distances.
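To make the kernel regression concrete, the following minimal Python sketch implements a Gaussian-kernel (Nadaraya-Watson) estimator evaluated on an even grid of 50 points. It is an illustration rather than the Stata routine used for the figures: the bandwidth here is a fixed, user-supplied value instead of one chosen by the estimator, and the function names and data are hypothetical.

```python
import math


def gaussian_kernel_regression(x, y, grid, bandwidth):
    """Nadaraya-Watson estimate of E[y | x] at each grid point."""
    estimates = []
    for point in grid:
        weights = [math.exp(-0.5 * ((xi - point) / bandwidth) ** 2) for xi in x]
        total = sum(weights)
        if total == 0.0:
            # All weights underflowed: no observations near this grid point.
            estimates.append(float("nan"))
        else:
            estimates.append(sum(w * yi for w, yi in zip(weights, y)) / total)
    return estimates


def even_grid(lo, hi, n=50):
    """n evenly spaced evaluation points between lo and hi."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


# Synthetic data: a trade value that declines with distance (made up).
distances = [10.0 * i for i in range(1, 101)]
values = [1000.0 / d for d in distances]
grid = even_grid(10.0, 1000.0)
fitted = gaussian_kernel_regression(distances, values, grid, bandwidth=100.0)
```

Each fitted value is a weighted average of nearby observations, so the curve traces how the outcome changes with distance without imposing a functional form.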

[Figure 2 about here.]

[Figure 3 about here.]

Aggregate trade can be decomposed into two components, total shipments and average value, as described in equation (1). The kernel regressions of these two components on distance are presented in Figure 3. Total shipments ($N_{ij}$) exhibit a similar relationship to distance as aggregate trade: total shipments fall rapidly over the first 250 km and remain flat thereafter. In contrast, while average value ($\overline{PQ}_{ij}$) has a negative relationship with distance, it does not fall as rapidly as distance increases. This result suggests that the sharp fall in aggregate trade over distance is driven by decreases in the extensive margin and not the intensive margin. We will examine these relationships further in the linear regressions below. Total shipments and average value can be further decomposed into the four components described in equation (2). The kernel regressions of these four components on distance

12The missing data on the number of firms in 2011 is only filled in for regions with data in 2010 and 2012. The missing data on the number of employees in 2011 is only filled in for regions if there are firms in that sector in 2011.

are presented in Figure 4. The average number of shipments exhibits the same response to distance as aggregate trade and total shipments: it falls rapidly over short distances, further emphasizing that the relationship between aggregate trade and distance is driven by the extensive margin. The number of unique sectors falls, albeit more slowly, from over eight sectors to just above one sector as distance increases above 1,500 km. The average price also falls over distance and exhibits a similar relationship to distance as average value. The average quantity, however, does not change much over distance.

[Figure 4 about here.]

The relationships between aggregate trade, its components and distance in the kernel regressions are similar to the ones in the U.S., with the exception of average price and quantity. Hillberry and Hummels (2008) show that in the U.S. average price increases and average quantity decreases as distances between regions increase. In Europe, the pattern differs: average price decreases while average quantity is not affected as distances between regions increase. A possible explanation for the different result is that the U.S. data is a survey of establishments on their shipments, which covers intermediate and final goods, whereas the European data is a survey of freight transport companies, which includes a broader range of sectors including raw materials and agricultural goods. Since freight trucks are used to transport a large proportion of bulky goods in Europe, heavier items are likelier to be transported over long distances. The kernel regressions of average price and average quantity on distance reflect this fact: less valuable goods are transported over longer distances in Europe.

[Table 2 about here.]

Next we examine the relationship between distance and trade using a linear regression model. The linear regression results are presented in Table 2 and we can make four observations. First, the spatial frictions are significantly correlated with the trade variables, and aggregate trade and its components reduce as distance increases. The quadratic distance term prevents an easy interpretation of the distance elasticity for each dependent variable. We evaluate the distance elasticity at the sample mean of 978 km and present it in the last row of Table 2. The distance elasticity of aggregate trade is -1.712, which is high but within the range of distance elasticities estimated in the literature.13 The

13Disdier and Head (2008) find that 90% of the 1,400 estimates they examined in the literature lie between -0.28 and -1.55.

distance elasticities of the extensive margins are larger, in absolute terms, than those of the intensive margins, reflecting the kernel regression results where the extensive margins are more sensitive to distance. Second, except for average quantity, aggregate trade, the extensive margins and the intensive margins are higher between regions that are trading with themselves or with regions in the same country. For example, aggregate trade is exp(1.749) = 5.75 times higher for shipments within the same country than for shipments that cross national borders, and 10.0 times higher for shipments within the same region than for shipments outside of the region. Hillberry and Hummels (2008) show that shipments within a U.S. state are 1.73 times higher than those outside the state, compared with our result of 10 times within a European region. The larger border effects in Europe may be because Europeans have strong preferences for their domestic goods. Third, unlike the other dependent variables, average quantity falls when shipments are within the same region or between regions in the same country. The result indicates that the average quantity of shipments is larger when the destination is outside of the country. This is consistent with our hypothesis from the kernel regressions that freight in Europe is used to transport bulky and heavier items over longer distances and across national borders.14 Lastly, the extensive margins account for a large proportion of the spatial effects on total trade, confirming the results in the kernel regressions. Using the decomposition in equation (1), 66 percent of the aggregate own-country border effect and 92 percent of the own-region border effect come from the total shipments, while the remainder comes from average value. The contribution of the extensive margins can be further decomposed using equation (2). The number of unique sectors and the average number of shipments each account for half of the 66 percent of the aggregate own-country border effect.
The average number of shipments accounts for most (88 of the 92 percentage points) of the aggregate own-region border effect.
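The arithmetic behind these interpretations can be reproduced with a short Python sketch. The function names are hypothetical. The only number taken from the text is the own-country coefficient of 1.749; the elasticity helper simply differentiates a specification that is quadratic in log distance and is shown without the estimated coefficients, which are reported in Table 2.

```python
import math


def border_multiplier(coefficient):
    """Trade ratio implied by a dummy coefficient in a log-linear regression."""
    return math.exp(coefficient)


def distance_elasticity(beta_ln_d, beta_ln_d_sq, distance_km):
    """Elasticity d ln T / d ln D = b1 + 2 * b2 * ln D at a given distance."""
    return beta_ln_d + 2.0 * beta_ln_d_sq * math.log(distance_km)


def contribution_share(component_coefficient, total_coefficient):
    """Share of an aggregate effect attributed to one trade component."""
    return component_coefficient / total_coefficient


# Own-country coefficient quoted in the text.
own_country = border_multiplier(1.749)
```

Here `own_country` rounds to 5.75, matching the figure reported for shipments within the same country. Evaluating `distance_elasticity` at the sample mean of 978 km with the Table 2 coefficients would give the elasticity in the last row of that table.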

5 Trade in Intermediate Goods and Firm Co-location

Spatial frictions have a strong negative effect on trade flows, which fall sharply at short distances. Border effects are also present in the ESM: intra-regional and intra-national trade flows are higher than trade flows that cross regional and national borders. The

14This result is not driven by the presence of dry-bulk commodities and agricultural products. We find similar results when the regression is restricted to manufactured goods (goods classification 02 to 13).

kernel regressions and the linear regressions highlight that the effect of spatial frictions on trade happens primarily through the extensive margins, specifically the total and average number of shipments. The intensive margins, whether the average value, price or quantity, account for smaller portions of this effect. The sensitivity of the extensive margins to trade costs is supported by the literature on heterogeneous firms and the discussion of how firms’ export behavior responds to trade costs. Melitz (2003) shows that in a world where there is a fixed cost to exporting on top of the usual variable costs of trade, only a set of productive firms can afford to export. As each firm is producing a unique heterogeneous good, the set of exported goods will be limited by the fixed cost of exporting. Chaney (2008) extends the discussion and shows that when the elasticity of substitution is low (i.e. the good is more differentiated), the extensive margin is more affected by trade costs. These theoretical results are supported by empirical work. For instance, Bernard et al. (2007) demonstrate that extensive margins play a large role in how distance reduces aggregate trade flows.

There are two explanations in the literature for the strong effect that spatial frictions have on trade flows. The first explanation is the presence of home bias, where consumers prefer their home good as it is more suitable to their tastes. When goods are differentiated by origin, consumers are more likely to consume their local goods (Evans, 2003). The presence of information barriers can also prevent firms from selling to foreign customers.15 There is anecdotal evidence in Europe that consumers have a strong preference for their national goods. For instance, Italian consumers prefer to consume locally produced cheeses like a Raschera or Castelmagno instead of a Roquefort or Comté from France because the local cheeses pair better with local food and are more suitable in regional dishes.
It is then not surprising that the estimated border effects in Europe are high. The problem with the home bias explanation, however, is that there is a limit to how localized these preferences are. While home bias can explain why intra-regional trade flows are higher within a region, home bias is less likely to explain the presence of border effects at a smaller spatial unit. With data at a finer geographical level, Hillberry and Hummels (2008) show that border effects are present even at the 5-digit zip code level in the U.S. In addition, home bias is more likely to explain the strong local preferences for consumption goods such as food, beverages and textiles products that are differentiated, and less likely to explain local preferences for industrial inputs that will be more homogeneous.

15Garmendia et al. (2012) and Combes et al. (2005) show that social and business networks can account for the home bias in intra-national trade.

The second explanation, and the one we explore in this section, is the trade in intermediate inputs resulting in the co-location of firms. The input-output linkages between firms can generate trade in intermediate goods between regions with matching production structures. A region can produce an intermediate good, say the turbine engine of an Airbus A380, that is a specific input of a final good (an Airbus A380 aircraft). The intermediate good will only be transported to regions that produce the final good, and will be useless to regions that do not. In addition, vertical specialization and increasing fragmentation of production can magnify small trade costs and cause firms to locate closer to each other (Puga, 1999; Amiti, 2005; Rossi-Hansberg, 2005; Yi, 2010). Bernard et al. (2015) show that geographical proximity matters in matching suppliers and customers in Japan, and that a majority of connections between them are formed locally, with a median distance of 30 km. The spatial clustering of firms will increase the internal border effects as firms within a region trade more intermediate goods among themselves (Hillberry and Hummels, 2003; Chen, 2004; Wolf, 2000). Thus, regions that have matching production structures, where region i’s products are consumed by region j’s industries, are more likely to trade with each other. Conversely, regions that do not have matching production structures are less likely to trade with each other.

We test whether industrial demand affects the likelihood of shipments between two regions using probit regressions that control for other factors that can influence shipments, such as distance, regional gross outputs and consumer demand. The empirical model is specified as:

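\[
\Pr\left(I^k_{ij} = 1\right) = \Phi\Big(\beta^k_0 + \beta^k_1 \ln D_{ij} + \beta^k_2 (\ln D_{ij})^2 + \beta^k_3 \mathit{Region}_{ij} + \beta^k_4 \mathit{Country}_{ij} + \beta^k_5 \ln GO^k_i + \beta^k_6 \ln GO^k_j + \beta^k_7 \ln \mathit{Pop}_j + \beta^k_8 \ln E^k_j\Big) \quad \forall k = 2, \ldots, 13, \tag{5}
\]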

where $I^k_{ij}$ is the indicator variable that equals one if region $i$ ships goods from sector $k$ to region $j$. $D_{ij}$, $\mathit{Region}_{ij}$ and $\mathit{Country}_{ij}$ are defined in the above section. $\ln GO^k_i$ is the log of the gross output of sector $k$ in region $i$, which controls for the supply of the industry in the origin. Two variables are included to control for consumer demand in the destination. First, the log of the gross output of sector $k$ in the destination ($\ln GO^k_j$) captures the sector’s supply in region $j$. A negative coefficient on $\ln GO^k_j$ indicates that consumers substitute towards the local varieties, so that region $i$ is less likely to ship goods from sector $k$ to the destination. Second, the log of population in the destination ($\ln \mathit{Pop}_j$) captures the size of the domestic consumer market and controls for the overall size of demand in the destination.

The log of the industrial expenditure in sector $k$ in the destination ($\ln E^k_j$) is the primary variable of interest in the specification. The variable informs us about how the industrial demand of sector $k$ in the destination affects the probability of shipments between regions. Since the sectors are broadly defined and include both the input and the output, firms in a sector will use inputs from the same sector. Controlling for the effects of distance, regional output and consumer demand, a positive coefficient indicates that a higher industrial demand at the destination increases the probability of regional shipments.

One difference between the data used for the probit regressions and for the linear regressions is the distance variable. The linear regressions use the average of the actual distance traveled by the shipments, but the probit regressions need bilateral distances for all origin-destination pairs and not just for pairs with positive shipments. The bilateral distances are therefore calculated as the distance between the centroids of the origin and the destination. For pairs where the origin and destination are in the same region, the internal distance within a region is calculated as the weighted average of the distance traveled for all shipments within the region.

Eight regions (seven in Switzerland and one in Liechtenstein) do not have sectoral employment data for the construction of gross output variables and are omitted from the probit regressions.16 We focus on the manufacturing sectors (classification 02 to 13) as employment data for the agricultural sector (classification 01) are not available for the construction of gross outputs. The data form a square matrix of 262 regions with 68,644 (262 × 262) possible pairs of trading regions.
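The construction of the distance matrix described above can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes great-circle (haversine) distance between centroids, which the text does not specify, and the region coordinates and internal distances are placeholder values chosen only for the example.

```python
import math
from itertools import product

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def pair_distances(centroids, internal_km):
    """Return {(origin, destination): km} for every ordered pair of regions.

    Off-diagonal pairs use the centroid-to-centroid distance. Diagonal pairs
    use a supplied internal distance (in the paper, a weighted average of the
    distance traveled by within-region shipments).
    """
    distances = {}
    for origin, dest in product(centroids, repeat=2):
        if origin == dest:
            distances[(origin, dest)] = internal_km[origin]
        else:
            distances[(origin, dest)] = haversine_km(*centroids[origin], *centroids[dest])
    return distances


# Placeholder centroids (lat, lon) and internal distances: illustrative only.
centroids = {"DE30": (52.52, 13.405), "FR10": (48.8566, 2.3522), "AT13": (48.2082, 16.3738)}
internal_km = {"DE30": 15.0, "FR10": 40.0, "AT13": 10.0}

dist = pair_distances(centroids, internal_km)
print(len(dist))  # n regions give n * n ordered pairs
```

With 262 regions the same construction yields the 262 × 262 = 68,644 origin-destination pairs used in the probit regressions, including pairs with no observed shipments.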

[Table 3 about here.]

The summary statistics for the regions with positive shipments in each sector are presented in Table 3. The sector with the highest number of regional pairs with positive shipments is the food, beverages and tobacco sector, which tends to produce final goods. Many regions also trade intermediate goods, such as those from the basic metal products, rubber and plastic products, and machinery and electrical equipment sectors. The sector with the lowest number of region pairs trading with each other is the coal, petroleum and gas sector, as goods in this sector are more likely to be transported via pipelines than by truck.

[Table 4 about here.]

16The French region, FR93, is dropped because it does not have any shipments within itself and the internal distance cannot be calculated.

The estimated coefficients for the eight variables in equation (5) are reported in Table 4. The coefficients for $\ln GO^k_i$ and $\ln \mathit{Pop}_j$ are positive and statistically significant for all sectors, indicating that the sector’s size in the origin and the consumer base in the destination have positive effects on the probability of shipments between two regions. For seven sectors, the coefficients for $\ln GO^k_j$ are positive and significant, which suggests that the sector’s gross output in the destination has a positive effect on the probability of shipments and that consumers do not substitute towards the local varieties. There are two sectors, metal ores and other mining products, and chemicals and chemical products, where the sector’s gross output in the destination decreases the probability of shipments. Goods from these sectors tend to be heavy or hazardous to transport, so firms that use these goods as inputs are more likely to use the local varieties, thereby decreasing shipments.

The coefficients of the industrial demand variable are positive and statistically significant for 11 of 12 sectors. As expected, the positive coefficients confirm that higher industrial demand increases the probability of regional shipments. The marginal effects of the industrial demand variable are computed at the mean and presented in Table 4. A one percent increase in industrial demand increases the probability of regional trade by 0.29 to 0.61 percent. The largest increases in the probability of regional trade occur when industrial demand increases in the natural resource sectors, namely the coke and refined petroleum products sector (07) and the metal ores and mining products sector (03), and in sectors relying on intermediate goods, such as the transport equipment sector (12) and the textiles and leather sector (05).
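The marginal effects in Table 4 are reported as elasticities, $\frac{d\Pr}{dE^k_j}\frac{E^k_j}{\Pr}$. Because industrial expenditure enters equation (5) in logs, this elasticity reduces to $\beta^k_8\,\phi(\bar{x}'\beta)/\Phi(\bar{x}'\beta)$ for a probit, where $\phi$ and $\Phi$ are the standard normal density and distribution function. The sketch below illustrates that calculation under the assumption that this is how the elasticities were obtained. The index value used in the example is hypothetical, since the paper does not report the constant or the variable means.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_elasticity(beta_log_x, index_at_mean):
    """Elasticity of Pr(I = 1) with respect to X when ln X enters the probit index.

    beta_log_x    : coefficient on ln X (here, the coefficient on ln E)
    index_at_mean : fitted linear index x'beta evaluated at the sample means
    """
    return beta_log_x * norm_pdf(index_at_mean) / norm_cdf(index_at_mean)


# Coefficient on ln E for sector 04 from Table 4; the index value is hypothetical.
beta_e = 0.197
hypothetical_index = -1.4
print(round(probit_elasticity(beta_e, hypothetical_index), 3))
```

The ratio $\phi/\Phi$ grows as the predicted probability falls, so for a given coefficient the elasticity is larger in sectors where shipments between a typical pair of regions are rare.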
We check the robustness of this result using two alternative measures of industrial demand at the destination: the number of firms and the number of employees in the sector at the destination. Both measures capture the size of the sector and are positively related to the industrial demand of the sector. Each measure is used in place of industrial expenditure in a separate estimation of equation (5), and the estimated coefficients for $\ln E^k_j$ are presented in Table 5, where the other coefficients in the full specification are omitted for brevity. All the coefficients of the industrial demand variable are positive and statistically significant. These results support our hypothesis that trade in intermediate goods and the co-location of firms can explain the effects of spatial frictions on trade flows, which operate mainly through the extensive margins.

[Table 5 about here.]

6 Conclusion

Using a unique dataset of freight shipments in the ESM, we find that aggregate trade responds sharply to spatial frictions. Trade falls sharply over short distances, dropping after 250 km and remaining flat thereafter. The rapid fall in aggregate trade is largely attributed to the sharp fall in the number of shipments, whereas the average value of trade falls more gradually with distance. The sharp negative relationship between aggregate trade and distance over short distances is thus driven largely by the number of shipments to nearby customers and not by the value of those shipments. Further examination of the trade components shows that the relationship is driven by the average number of shipments, not by average value, average price, or average quantity. The linear regression results support the role of the extensive margins in explaining the effect of spatial frictions on trade. The extensive margins account for 70 to 90 percent of the border effects, far more than the intensive margins.

The trade responses to spatial frictions and the role of the extensive margin can be explained by the trade in intermediate goods. Firms connected by input-output linkages may cluster spatially, choosing to locate close to each other to avoid trade costs. As a result, regions are more likely to ship products to other regions that have a demand for these inputs and not to regions without any industrial demand for them. We show that, controlling for the supply of and consumer demand for the products, the industrial demand for a sector’s products at the destination increases the probability that a region will ship the product to the destination.

The paper provides policy guidance for the discussion on trade integration within the ESM. Many studies show that despite the lack of trade barriers in the ESM, national and regional borders still play a role in reducing trade flows. Our results show that while border effects exist within the ESM, they can be explained by the trade in intermediate goods and the co-location of firms. Policy makers who are concerned about the lack of trade integration in the European Union should also examine the role of value chains and be aware of how the trade in intermediate goods may bias the border effect estimates upwards. Border effects may capture some aspects of non-tariff barriers, and policy makers may be interested in the welfare implications of removing these non-tariff barriers. We leave this question for future research.

References

Amiti, M. (2005) “Location of vertically linked industries: agglomeration versus comparative advantage,” European Economic Review, 49, 809–832.

Anderson, J. E. and E. Van Wincoop (2003) “Gravity with Gravitas: A Solution to the Border Puzzle,” American Economic Review, 93, 170–192.

Bernard, A. B., J. B. Jensen, S. J. Redding, and P. K. Schott (2007) “Firms in international trade,” The Journal of Economic Perspectives, 21, 105–130.

Bernard, A. B., A. Moxnes, and Y. U. Saito (2015) “Production networks, geography and firm performance,” NBER Working Paper, No. 21082.

Berthou, A. and C. Emlinger (2011) “The Trade Unit Values Database,” CEPII Working Paper, No. 2011-10.

Chaney, T. (2008) “Distorted gravity: the intensive and extensive margins of international trade,” The American Economic Review, 98, 1707–1721.

Chen, N. (2004) “Intra-national versus international trade in the European Union: why do national borders matter?” Journal of International Economics, 63, 93–118.

Combes, P.-P., M. Lafourcade, and T. Mayer (2005) “The trade-creating effects of business and social networks: evidence from France,” Journal of International Economics, 66, 1–29.

Disdier, A. and K. Head (2008) “The Puzzling Persistence of the Distance Effect on Bilateral Trade,” The Review of Economics and Statistics, 90, 37–48.

Evans, C. L. (2003) “The economic significance of national border effects,” The American Economic Review, 93, 1291–1312.

Garmendia, A., C. Llano, A. Minondo, and F. Requena (2012) “Networks and the disappearance of the intranational home bias,” Economics Letters, 116, 178–182.

Ghemawat, P., C. Llano, and F. Requena (2010) “Competitiveness and interregional as well as international trade: The case of Catalonia,” International Journal of Industrial Organization, 28, 415–422.

Gil-Pareja, S., R. Llorca-Vivero, J. A. Martínez-Serrano, and J. Oliver-Alonso (2005) “The border effect in Spain,” The World Economy, 28, 1617–1631.

Head, K. and T. Mayer (2000) “Non-Europe: the magnitude and causes of market fragmentation in the EU,” Weltwirtschaftliches Archiv, 136, 284–314.

Helble, M. (2007) “Border effect estimates for France and Germany combining international trade and intranational transport flows,” Review of World Economics, 143, 433–463.

Hillberry, R. (2002) “Aggregation bias, compositional change, and the border effect,” Canadian Journal of Economics/Revue canadienne d’économique, 35, 517–530.

Hillberry, R. and D. Hummels (2003) “Intranational home bias: some explanations,” Review of Economics and Statistics, 85, 1089–1092.

——— (2008) “Trade responses to geographic frictions: A decomposition using micro-data,” European Economic Review, 52, 527–550.

Holmes, T. J. and J. J. Stevens (2012) “Exports, borders, distance, and plant size,” Journal of International Economics, 88, 91–103.

Hummels, D. and P. J. Klenow (2005) “The Variety and Quality of a Nation’s Exports,” American Economic Review, 95, 704–723.

Kashiha, M., C. Depken, and J.-C. Thill (2016) “Border effects in a free-trade zone: Evidence from European wine shipments,” Journal of Economic Geography, lbw017.

Llano-Verduras, C., A. Minondo, and F. Requena-Silvente (2011) “Is the border effect an artefact of geographical aggregation?” The World Economy, 34, 1771–1787.

McCallum, J. (1995) “National Borders Matter: Canada-U.S. Regional Trade Patterns,” American Economic Review, 85, 615–623.

Melitz, M. J. (2003) “The Impact of Trade on Intra-industry Reallocations and Aggregate Industry Productivity,” Econometrica, 71, 1695–1725.

Minondo, A. (2007) “The disappearance of the border barrier in some European Union countries’ bilateral trade,” Applied Economics, 39, 119–124.

Nitsch, V. (2000) “National borders and international trade: evidence from the European Union,” Canadian Journal of Economics/Revue canadienne d’économique, 33, 1091–1105.

Puga, D. (1999) “The rise and fall of regional inequalities,” European Economic Review, 43, 303–334.

Requena, F. and C. Llano (2010) “The border effects in Spain: an industry-level analysis,” Empirica, 37, 455–476.

Rossi-Hansberg, E. (2005) “A spatial theory of trade,” The American Economic Review, 95, 1464–1491.

Wolf, H. C. (2000) “Intranational home bias in trade,” Review of Economics and Statistics, 82, 555–563.

Yi, K.-M. (2010) “Can multistage production explain the home bias in trade?” The American Economic Review, 100, 364–393.

Appendix: List of Goods Classifications

| Classification | Description | HS Codes (2002 ver.) | NACE Sectors |
|---|---|---|---|
| 01 | Products of agriculture, hunting, and forestry; fish and other fishing products | HS 01 to HS 15 | – |
| 02 | Coal and lignite; crude petroleum and natural gas | HS 2701, 2702, 2709, and 2711 | B05, B06 |
| 03 | Metal ores and other mining and quarrying products; peat; uranium and thorium | HS 25 and HS 26 | B07, B08, B09 |
| 04 | Food products, beverage and tobacco | HS 16 to HS 24 | C10, C11, C12 |
| 05 | Textiles and textile products; leather and leather products | HS 50 to HS 63 | C13, C14, C15 |
| 06 | Wood and products of wood and cork (except furniture); articles of straw and plaiting materials; pulp, paper and paper products; printed matter and recorded media | HS 44 to HS 49 | C16, C17, C18 |
| 07 | Coke and refined petroleum products | HS 2703 to HS 2708, HS 2710, HS 2712 to HS 2715 | C19 |
| 08 | Chemicals, chemical products, and man-made fibers; rubber and plastic products; nuclear fuel | HS 28 to HS 38 | C20, C21, C22 |
| 09 | Other non-metallic mineral products | HS 68 to HS 71 | C23 |
| 10 | Basic metals; fabricated metal products, except machinery and equipment | HS 72 to HS 83 | C24, C25 |
| 11 | Machinery and equipment, n.e.c.; office machinery and computers; electrical machinery and apparatus n.e.c.; radio, television and communication equipment and apparatus; medical, precision and optical instruments; watches and clocks | HS 84 and HS 85 | C26, C27, C28 |
| 12 | Transport equipment | HS 86 to HS 89 | C29, C30 |
| 13 | Furniture; other manufactured goods n.e.c. | HS 90 to HS 92, HS 94 to HS 96 | C31, C32 |

Figure 1: Maps of Within-Region and Within-Country Trade

(a) Percentage of Within-region Trade
[Map of NUTS-2 regions shaded by percentage = 100 × within-region trade / total trade. Legend classes: 0% to 37.7%, 37.7% to 61.8%, 61.8% to 73.2%, 73.2% to 86.5%, 86.5% to 100%.]

(b) Percentage of Within-country Trade
[Map of NUTS-2 regions shaded by percentage = 100 × within-country trade / total trade. Legend classes: 0% to 37.7%, 37.7% to 61.8%, 61.8% to 73.2%, 73.2% to 86.5%, 86.5% to 100%.]

Note: The NUTS-2 region labels are included in the figures. The percentages are calculated out of the region’s total trade.

Figure 2: Kernel Regression: Aggregate Trade ($T_{ij}$) on Distance
[Plot of total value (million USD) against distance (kilometers).]

Figure 3: Kernel Regressions
[Two panels, each plotted against distance (kilometers): Total Shipments ($N_{ij}$) on Distance (number); Average Value ($\overline{PQ}_{ij}$) on Distance (million USD).]

Figure 4: Kernel Regressions
[Four panels, each plotted against distance (kilometers): Number of Unique Sectors ($N^k_{ij}$) on Distance (number); Average Number of Shipments ($N^f_{ij}$) on Distance (number); Average Price ($\bar{P}_{ij}$) on Distance (million USD); Average Quantity ($\bar{Q}_{ij}$) on Distance (1,000 kilograms).]

Table 1: Summary Statistics

| Variable | Observations | Mean | Std. Dev. | Min | Max |
|---|---|---|---|---|---|
| $T_{ij}$: Aggregate Value (in mil USD) | 26542 | 1007.22 | 20055.34 | 0.00076 | 1402730.00 |
| $N_{ij}$: Number of Shipments | 26542 | 71.23 | 753.25 | 1.00 | 30216.00 |
| $\overline{PQ}_{ij}$: Average Value (in mil USD) | 26542 | 9.42 | 68.27 | 0.00076 | 5249.07 |
| $N^k_{ij}$: Number of Unique Sectors | 26542 | 3.11 | 2.96 | 1.00 | 13.00 |
| $N^f_{ij}$: Average Number of Shipments | 26542 | 6.93 | 58.71 | 1.00 | 2324.31 |
| $\bar{P}_{ij}$: Average Price (in mil USD) | 26542 | 0.58 | 3.73 | 0.000057 | 226.25 |
| $\bar{Q}_{ij}$: Average Quantity (in 1,000 kg) | 26542 | 15.36 | 6.61 | 0.10 | 65.50 |
| $D_{ij}$: Distance (km) | 26542 | 955.42 | 618.57 | 1.00 | 4552.60 |

Table 2: Linear Regression Results

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) |
|---|---|---|---|---|---|---|---|
| Dependent variable: | $\ln T_{ij}$ | $\ln N_{ij}$ | $\ln \overline{PQ}_{ij}$ | $\ln N^k_{ij}$ | $\ln N^f_{ij}$ | $\ln \bar{Q}_{ij}$ | $\ln \bar{P}_{ij}$ |
| $\ln D_{ij}$ | -1.285*** | -1.366*** | 0.0809 | 0.253*** | -1.619*** | 0.335*** | -0.254* |
| | (0.185) | (0.0793) | (0.151) | (0.0471) | (0.0494) | (0.0571) | (0.145) |
| $(\ln D_{ij})^2$ | -0.0310** | 0.0170*** | -0.0480*** | -0.0674*** | 0.0844*** | -0.0373*** | -0.0107 |
| | (0.0147) | (0.00629) | (0.0120) | (0.00374) | (0.00392) | (0.00453) | (0.0115) |
| $\mathit{Country}_{ij}$ | 1.749*** | 1.148*** | 0.600*** | 0.547*** | 0.601*** | -0.0339** | 0.634*** |
| | (0.0472) | (0.0202) | (0.0384) | (0.0120) | (0.0126) | (0.0145) | (0.0370) |
| $\mathit{Region}_{ij}$ | 2.306*** | 2.111*** | 0.196 | 0.0879** | 2.023*** | -0.0834* | 0.279** |
| | (0.163) | (0.0699) | (0.133) | (0.0415) | (0.0436) | (0.0503) | (0.128) |
| Constant | 9.928*** | 8.841*** | 1.088** | 1.685*** | 7.155*** | 1.921*** | -0.834* |
| | (0.636) | (0.272) | (0.517) | (0.162) | (0.170) | (0.196) | (0.499) |
| Observations | 26,542 | 26,542 | 26,542 | 26,542 | 26,542 | 26,542 | 26,542 |
| $R^2$ | 0.464 | 0.676 | 0.212 | 0.567 | 0.632 | 0.135 | 0.210 |
| $\varepsilon_D$ | -1.712 | -1.132 | -0.580 | -0.675 | -0.457 | -0.179 | -0.401 |

Standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1. Origin and destination fixed effects are included in all the regressions. Distance elasticity is evaluated at the sample mean distance of 978 km.

Table 3: Summary Statistics

| Goods Classification / Sector | Region Pairs with $T_{ij} > 0$ | Percent (%) |
|---|---|---|
| 02: Coal, petroleum, and gas | 769 | 1.12 |
| 03: Metal ores and other mining products | 3,491 | 5.09 |
| 04: Food, beverages, and tobacco | 10,635 | 15.49 |
| 05: Textile and leather products | 2,961 | 4.31 |
| 06: Wood prod., paper, and media | 8,348 | 12.16 |
| 07: Coke and refined petroleum products | 2,037 | 2.97 |
| 08: Chemicals, rubber, and plastic products | 9,423 | 13.73 |
| 09: Non-metallic mineral products | 6,213 | 9.05 |
| 10: Basic metal products | 8,731 | 12.72 |
| 11: Machinery and electrical equipment | 7,597 | 11.07 |
| 12: Transport equipment | 6,714 | 9.78 |
| 13: Furniture and other manu. goods | 5,342 | 7.78 |

Full description of goods division in Appendix. Total number of region pairs is 68,644.

Table 4: Probit Regression Results

| Goods Div. | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 10 | 11 | 12 | 13 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $\ln D_{ij}$ | 1.768*** | -0.137 | -0.601** | -0.337* | 0.353 | 0.829*** | -0.283* | -0.224 | -0.144 | -0.428** | 0.157 | 0.356** |
| | (0.264) | (0.277) | (0.239) | (0.201) | (0.226) | (0.178) | (0.159) | (0.287) | (0.150) | (0.169) | (0.127) | (0.150) |
| $(\ln D_{ij})^2$ | -0.223*** | -0.0806*** | -0.0395** | -0.0371** | -0.122*** | -0.157*** | -0.0583*** | -0.0745*** | -0.0709*** | -0.0307** | -0.0734*** | -0.0950*** |
| | (0.0241) | (0.0231) | (0.0185) | (0.0164) | (0.0181) | (0.0158) | (0.0126) | (0.0231) | (0.0120) | (0.0134) | (0.0101) | (0.0122) |
| Region = 1 | 1.075*** | 0.490*** | 0.561** | 0.644*** | 0.848*** | 0.944*** | 0.742*** | 0.642*** | 1.135*** | 1.083*** | 0.756*** | 0.726*** |
| | (0.102) | (0.120) | (0.225) | (0.114) | (0.180) | (0.107) | (0.178) | (0.169) | (0.189) | (0.174) | (0.112) | (0.115) |
| Country = 1 | 0.538*** | 0.517*** | 0.420*** | 0.351*** | 0.287*** | 0.555*** | 0.222*** | 0.374*** | 0.281*** | 0.434*** | 0.382*** | 0.526*** |
| | (0.0466) | (0.0308) | (0.0264) | (0.0313) | (0.0273) | (0.0340) | (0.0269) | (0.0269) | (0.0269) | (0.0268) | (0.0260) | (0.0270) |
| $\ln GO^k_i$ | 0.0417*** | 0.00973*** | 0.151*** | 0.184*** | 0.345*** | 0.0454*** | 0.236*** | 0.226*** | 0.309*** | 0.171*** | 0.0827*** | 0.200*** |
| | (0.00513) | (0.00267) | (0.0108) | (0.0112) | (0.00973) | (0.00365) | (0.00773) | (0.0139) | (0.00793) | (0.00763) | (0.00459) | (0.0141) |
| $\ln GO^k_j$ | -0.00350 | -0.00468* | 0.00175 | 0.0395*** | 0.115*** | 0.0165*** | -0.0232** | 0.0140* | 0.107*** | 0.0430*** | 0.0147*** | 0.00913 |
| | (0.00598) | (0.00267) | (0.00691) | (0.00898) | (0.0143) | (0.00389) | (0.0101) | (0.00842) | (0.0119) | (0.00667) | (0.00349) | (0.00823) |
| $\ln \mathit{Pop}_j$ | 0.333*** | 0.225*** | 0.334*** | 0.238*** | 0.212*** | 0.147*** | 0.325*** | 0.200*** | 0.165*** | 0.233*** | 0.199*** | 0.181*** |
| | (0.0298) | (0.0161) | (0.0125) | (0.0194) | (0.0158) | (0.0203) | (0.0155) | (0.0153) | (0.0145) | (0.0129) | (0.0124) | (0.0148) |
| $\ln E^k_j$ | -0.0269*** | 0.187*** | 0.197*** | 0.184*** | 0.214*** | 0.193*** | 0.186*** | 0.126*** | 0.212*** | 0.206*** | 0.241*** | 0.212*** |
| | (0.00890) | (0.00822) | (0.00602) | (0.00967) | (0.00745) | (0.0125) | (0.00711) | (0.00525) | (0.00763) | (0.00657) | (0.00690) | (0.00632) |
| Log-likelihood | -2470.09 | -7761.15 | -19481.38 | -8232.76 | -15654.48 | -5013.58 | -18091.21 | -12594.57 | -16836.80 | -16675.37 | -15831.61 | -12530.69 |
| Marginal effect of $\ln E^k_j$ | -0.0971*** | 0.512*** | 0.368*** | 0.485*** | 0.476*** | 0.609*** | 0.378*** | 0.296*** | 0.444*** | 0.430*** | 0.512*** | 0.505*** |
| | (0.0010) | (0.0006) | (0.0001) | (0.0007) | (0.0003) | (0.0018) | (0.0002) | (0.0002) | (0.0003) | (0.0002) | (0.0002) | (0.0003) |

Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1. All regressions have 68,644 observations, which is the total number of possible region pairs. The marginal effects of $\ln E^k_j$ are calculated as elasticities, $\frac{d\Pr}{dE^k_j}\frac{E^k_j}{\Pr}$, and computed at the mean.

Table 5: Robustness Check: Probit Results with Different Measures for $\ln E^k_j$

| Goods Div. | Number of Firms: $\ln E^k_j$ | Number of Firms: Log-likelihood | Number of Employees: $\ln E^k_j$ | Number of Employees: Log-likelihood |
|---|---|---|---|---|
| 02 | 0.241*** | -2425.81 | 0.157*** | -2432.87 |
| | (0.0226) | | (0.0172) | |
| 03 | 0.102*** | -8163.99 | 0.0719*** | -8182.08 |
| | (0.00877) | | (0.00590) | |
| 04 | 0.0609*** | -20060.53 | 0.0399** | -20079.19 |
| | (0.00925) | | (0.0160) | |
| 05 | 0.167*** | -8254.04 | 0.213*** | -8262.10 |
| | (0.00915) | | (0.0124) | |
| 06 | 0.111*** | -16010.14 | 0.140*** | -16034.70 |
| | (0.0103) | | (0.0177) | |
| 07 | 0.118*** | -5180.34 | 0.0750*** | -5184.76 |
| | (0.0136) | | (0.00894) | |
| 08 | 0.109*** | -18386.18 | 0.135*** | -18377.41 |
| | (0.0153) | | (0.0161) | |
| 09 | 0.195*** | -12816.86 | 0.115*** | -12881.18 |
| | (0.0140) | | (0.0163) | |
| 10 | 0.177*** | -17080.63 | 0.276*** | -17084.14 |
| | (0.0109) | | (0.0179) | |
| 11 | 0.0748*** | -17170.87 | 0.136*** | -17132.71 |
| | (0.0119) | | (0.0125) | |
| 12 | 0.0905*** | -16397.07 | 0.0585*** | -16401.75 |
| | (0.0100) | | (0.00707) | |
| 13 | 0.151*** | -12948.60 | 0.0684*** | -13042.49 |
| | (0.0103) | | (0.00821) | |

Coefficients of $\ln E^k_j$ are reported here; the specification includes the other variables in equation (5). Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1. All regressions have 68,644 observations, which is the total number of possible region pairs.
