An Occupational Approach to Analyzing Regional Invention

NCSES 22-202 | May 6, 2022 | Kathryn R. Dotzel and Timothy R. Wojan
 

Disclaimers

Working papers are intended to report exploratory results of research and analysis undertaken by the National Center for Science and Engineering Statistics. Any opinions, findings, conclusions, or recommendations expressed in this working paper do not necessarily reflect the views of the National Science Foundation. This working paper has been released to inform interested parties of ongoing research or activities and to encourage further discussion of the topic.

Any views, findings, or opinions expressed in this paper are solely those of the authors. They do not reflect any official position of the Food and Drug Administration or the Department of Health and Human Services.

 

Abstract

The purpose of this working paper is to refine and evaluate the occupation-driven approach for analyzing regional invention suggested by Wojan, Dotzel, and Low (2015), who argue that patenting rates should be computed on the subset of workers that might plausibly contribute to patenting. We introduce several modifications to their original analysis, including use of a special tabulation of more detailed occupation data and added controls for regional characteristics that could confound identification of the inventive class—the group of occupations most associated with patent production. We then critically evaluate the findings by implementing a simultaneous equations model of the interrelationship between the composition and productivity of the inventive class. Substitutions of alternative subpopulations of knowledge workers such as the college-educated population for the inventive class allow us to assess the value added of the updated inventive class construct. Simultaneous equations results confirm the existence of an interrelationship between patent production and membership in the inventive class in rural regions and highlight an apparent disarticulation of inventive employment and patent-intensive manufacturing employment in urban areas.

 

Introduction

Though not escaping criticism, patents per capita has endured as the default metric for regional inventive capacity for the past 30 years, a dependence that can largely be credited to the absence of a compelling alternative (Galindo-Rueda 2013; Usai 2011; Krammer 2009; Organisation for Economic Co-operation and Development [OECD] 2009, 2010; Carlino, Chatterjee, and Hunt 2007; Griliches 1990). As the inventive capacity of regions becomes more central to understanding the innovation economy, a metric of convenience may impose significant costs. In the past decade, patenting productivity in the United States, measured as number of patents granted to American inventors per capita, has reached levels rivaling those seen during the Industrial Revolution and post–Civil War era (Rothwell et al. 2013). Continued reliance on patents per capita will erroneously discount the patenting productivity of regions highly amenable to less-patentable ventures, such as tourism and the arts, while unquestioningly exalting the patenting prowess of high-technology clusters with workforces dominated by inventive workers.

If patenting is largely reliant on occupation-specific human capital, then a valid analysis of the phenomenon should incorporate this fundamental input. In a 2015 publication, Wojan, Dotzel, and Low draw a parallel between demography and the economics of invention to illuminate this argument. Just as a fertility rate is calculated on a denominator of women of child-bearing age, a regional patenting rate should be computed on the subset of the population that might plausibly contribute to patenting. Identifying that subset—an inventive class—is done empirically using an iterative regression technique to uncover the occupations strongly associated with patent production. This parallel to demography extends to traditional standardization and decomposition techniques that distinguish between scale- and productivity-based differences in patent production across regions.

In this study, we introduce several modifications to Wojan, Dotzel, and Low’s (2015) original analysis, including use of a special tabulation of more detailed occupation data from the Census Bureau and added controls for regional characteristics that might influence local patenting productivity. We additionally limit the universe of candidate occupations for the inventive class to those for which engagement in the development, design, or creation of novel ideas by workers is sufficiently important. These changes allow for a more precise identification of the impact of occupational employment on patent production and serve to remove occupations represented in the original inventive class with no obvious connection to patenting. We further extend Wojan, Dotzel, and Low’s analysis by implementing a simultaneous equations model of the interrelationship between the composition and productivity of the updated inventive class. Substitutions of alternative subpopulations of potential inventors and contributors to the inventive process (i.e., denominators) for the inventive class allow us to further assess the value added of the updated construct.

This paper is organized as follows. The next section provides a brief review of the relevant literature on regional invention. This is followed by a redefinition of Wojan, Dotzel, and Low’s (2015) inventive class using more detailed occupation data. The updated inventive class is then used to standardize and decompose traditional population-denominated patenting rates. Analysis of the interrelationship between the composition and productivity of the inventive class and other subpopulations of knowledge workers provides an empirical evaluation of alternative regional patent indicators. Robustness checks and limitations of the analysis are addressed before concluding with a discussion of the broader implications of this research for regional invention indicators.

 

Summary of the Literature on Regional Invention

With the transformation from an industrial economy to an innovation economy there is increasing interest in the innovative capacity of economically lagging regions (Rodríguez-Pose and Wilkie 2019; Stephens, Partridge, and Faggian 2013; Usai 2011). Patents per capita have been used to investigate the heterogeneity of the innovation process across regions to assess the viability of extending innovation policy—possibly customized to different regional types—to all areas (Apa et al. 2018). Establishing the construct validity of variables used to measure innovative capacity would appear to be a requisite first step in advancing this policy debate.

Most of the criticism regarding patents per capita as a metric for innovative capacity, defined by Furman, Porter, and Stern at the national level as “the ability of a country to produce and commercialize a flow of innovative technology over the long term” (2002:889), has been directed at the numerator—that is, patents. Carlino and Kerr (2014) argue that patents may be a poor measure of innovative activity because they represent only an initial step toward innovation and are not necessarily exploited in the region they are produced (Feldman and Kogler 2010; Feldman 1994). Furthermore, most patents have little economic impact (Carlino and Kerr 2014), and not all innovation is patentable (Griliches 1990), such as innovation in services and marketing (OECD 2009). Even in the most highly patented industry of manufacturing, trade secrets and lead-time advantages are more frequently used to protect intellectual property (Cohen, Nelson, and Walsh 2000; Shackelford and Jankowski 2021).

In contrast to the numerator, few studies directly challenge the use of population as a denominator for calculating patenting rates. Morel et al. argue that patents per capita “underestimates the innovative capacity of developing countries, because it fails to detect the productivity of highly capable centers of excellence within countries with large populations” (2005:401). The illogicality of including individuals that have no likelihood of participating in a patent application in the patenting rate denominator also holds for developed countries, whose populations include many children, physical laborers, workers in service-providing industries, and retirees that are highly unlikely to patent. Predictions from the spatial division of labor—that is, of low demand for high-skilled workers in rural compared to urban areas—are borne out empirically even within highly patentable sectors (Gibbs, Kusmin, and Cromartie 2005; Wojan 1999; Massey 1995). This further diminishes the value of the patents per capita measure as a means of comparing patenting productivity across regions.

Although Carlino, Chatterjee, and Hunt acknowledge the reasonableness of an inventiveness metric based on occupations traditionally associated with patenting, they argue against this in practice because “it is not obvious what the appropriate set of occupations should be” (2007:404). Rodríguez-Pose and Wilkie (2019) acknowledge the potential problem of per capita normalization in their examination of patenting in lagging regions but find that their population-denominated results are robust to limiting the denominator to the labor force. The select studies that propose or use alternative occupational-derived denominators in place of population when computing patenting rates—such as research and development employees (Fritsch and Slavtchev 2010; Scherer 1965) and scientists and engineers (Furman, Porter, and Stern 2002; Meliciani 2000)—run the risk of being too selective in defining the inventive class and contributing to the potentially misleading result that formal R&D resources are most productive where they are rare. In line with Wojan, Dotzel, and Low’s (2015) study, our goal is to identify a denominator that better captures the variability of local inventive capacity across space.

The inductive technique aims to strike a balance between restricting the patenting rate denominator to a few select occupations with the highest propensity to patent and inundating the denominator with individuals who are highly unlikely to participate in a patent application. We focus on regional inventive capacity, acknowledging Carlino and Kerr’s (2014) argument that patent production is a poor metric for regional innovation in general. Despite their ability to represent merely one facet of the wider innovation economy, patents remain a powerful predictor of regional productivity growth, low unemployment rates, and the formation of publicly traded firms (Rothwell et al. 2013). Our main hypothesis is that a refined, inductively identified denominator will better capture the geographical variability of local inventive capacity, compared to the more traditional per capita measure, and draw a more accurate picture of invention across space.

 

Redefining the Inventive Class Using Detailed Occupation Data

Data and methodology

To identify the detailed Census occupations strongly related to patent production, we modify Wojan, Dotzel, and Low’s “hybrid rational/inductive approach” (2015:539), regressing the aggregate number of patents produced in commuting zone (CZ) z during the period 2000–05 (patents) on the share of the workforce employed in a selection of detailed Census occupations (equation 1). The hybrid approach allows us to exploit a shared understanding of core occupations in the inventive class, defined by the National Science Foundation’s (NSF’s) classification of science, engineering, and technical (SET) occupations (referred to as “scientists, engineers and technicians” by NSF) (Morrison and Green 2005), without excluding occupations demonstrating a strong association with patenting. The model is estimated at the commuting zone level to minimize discrepancies between place of work and place of residence when assigning patenting activity to regions, and rural and urban commuting zones are analyzed separately.

Equation 1

Patents are estimated as a function of 19 randomly selected occupational shares; the science, engineering and technical occupational share; the share of remaining occupations; historical patent stock; share of population with a college degree; the wage-rental ratio; a natural amenities measure and geographic variables.
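The displayed form of equation 1 did not survive on this page; only the verbal description above remains. One way to write it that is consistent with the variable names defined in the surrounding text is sketched below. The linear additive form, the intercept, the coefficient symbols, and the error term are assumptions made for illustration and are not confirmed by the text. PopDens appears because the text lists it among the added regressors, even though the description above does not name it.

```latex
\begin{equation}
\begin{aligned}
\mathit{Patents}_z ={}& \alpha + \sum_{j=1}^{19} \beta_j\,\mathit{OccSh}_{jz}
  + \gamma\,\mathit{SETSh}_z + \delta\,\mathit{ExclSh}_z
  + \theta_1\,\mathit{PatStock}_z + \theta_2\,\mathit{HKSh}_z \\
&+ \theta_3\,\mathit{PopDens}_z + \theta_4\,\mathit{WageRentR}_z
  + \theta_5\,\mathit{NatAmen}_z + \boldsymbol{\lambda}'\,\mathbf{Geo}_z
  + \varepsilon_z
\end{aligned}
\tag{1}
\end{equation}
```

Here z indexes commuting zones, j indexes the 19 randomly drawn non-SET occupation shares in a given regression, and Geo collects the categorical controls for Census region and Rural-Urban Continuum Code.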

We include share of the workforce in SET occupations in each regression (SETSh) and, ultimately, all 48 SET occupations as part of our inventive class due to these occupations’ strong a priori relationship with invention. We randomly select an additional 19 non-SET occupation shares (OccSh) to include in each of 10,000 separate regressions. We select from only those non-SET occupations for which engagement in the development, design, or creation of novel ideas by workers is sufficiently important, as described below.

Descriptions of variables included in equation 1, excluding the occupation shares, with associated summary statistics, are provided in table 1. Absent from Wojan, Dotzel, and Low’s (2015) analysis but included in our regressions are 1975–1980 patent stock (PatStock), share of working-age population with a bachelor’s degree or higher (HKSh), population density (PopDens), wage-rental ratio (WageRentR), and the Department of Agriculture’s (USDA’s) Economic Research Service (ERS) natural amenity rank (NatAmen). These variables are intended to control for conditions affecting knowledge sharing and quality of life in each commuting zone that may influence patenting productivity.

Table 1. Description of variables in inventive class regressions, excluding occupation shares

(Descriptive statistics)
Note(s):

Equation 1 additionally includes categorical controls for Census region and ERS Rural-Urban Continuum Code of the commuting zone's most urban county.

Source(s):

U.S. Patent and Trademark Office; Census Bureau, 2000 Census of Population and Housing, Summary File 1 and Summary File 4; Department of Agriculture Economic Research Service, 1999.

Our hypothesis is that commuting zones with a history of patenting, as measured by patent stock, are more likely to have institutions in place and an inventive culture that supports patent production. Past discoveries in a given region provide a direct measure of the local knowledge stock that may inform and promulgate later inventions (Furman, Porter, and Stern 2002). By controlling for workforce education, we account for the human capital endowment of commuting zones, which is not necessarily interchangeable with occupational employment. A skilled workforce improves the ability of firms in a region to absorb and develop new knowledge (Cohen and Levinthal 1990). Population density serves as a proxy for agglomeration economies, which may promote cooperative invention (Knudsen et al. 2008). Additionally, a number of studies find that in-person interactions play a key role in facilitating the flow of knowledge between collaborators (e.g., McCann and Simonen [2005]; Storper and Venables [2004]). Natural amenities could increase the productivity of inventive workers by driving the location choices of higher-quality workers (Guimarães, Munn, and Woodward 2015). Less speculatively, enhancement of individual creativity due to exposure to natural beauty is now supported by several statistical studies (Pasanen, Neuvonen, and Korpela 2018; Plambech and Konijnendijk van den Bosch 2015). Amenity-rich areas may also facilitate creative thought processes that are essential to finding the nonobvious solutions that characterize invention. At face value, wage-rental ratio captures cost of living in a commuting zone, but it can also be viewed as a negative proxy for amenities, both natural and man-made (Roback 1982, 1988).

Two sets of categorical variables capturing the Rural-Urban Continuum Code (RUCC) of the commuting zone’s most urban county and U.S. Census region, respectively, are additionally included in equation 1 (represented by Geo). Table 2 summarizes the land area, population, and regional distribution of the sample’s 309 urban commuting zones and 378 rural commuting zones by assigned RUCC.

Table 2. Summary of commuting zones in inventive class regressions, by assigned Rural-Urban Continuum Code

(Descriptive statistics)
Note(s):

Each commuting zone (CZ) has been assigned the Economic Research Service (ERS) RUCC (defined at the county level) associated with the CZ’s most urban county. All Hawaii and Alaskan CZs (including the Alaskan portion of CZ 171), as well as CZs 654, 646, 680, and 685, are excluded from this analysis due to missing values for one or more variables in equation 1.

Source(s):

Census Bureau, 2000 Census of Population and Housing.

To define the two patenting measures, we rely on a cross-section of an annual data set of all utility patents granted by the U.S. Patent and Trademark Office (USPTO), which geolocates inventors to their county of residence at the time of application. Although these USPTO data are available annually for years 1975–2010, reliance on a special Equal Employment Opportunity (EEO) tabulation of Census 2000 occupational data to define the occupational measures precludes a multiyear analysis. We aggregate patent data for 6 years to generate Patents and PatStock to minimize cases of nonexistent and low patent production in commuting zones. For each commuting zone, Patents represents the aggregate of utility patents awarded in years 2000–05, and PatStock represents the aggregate of utility patents awarded in years 1975–80. Employing a 6-year aggregate for our dependent variable also accounts for the time lag between patent application and granting—which is, on average, 2 years (Hall, Jaffe, and Trajtenberg 2001).
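The two aggregation windows can be illustrated with a minimal sketch. The record layout, the county identifiers, the county-to-commuting-zone crosswalk, and all counts below are hypothetical; only the window boundaries (2000–05 for Patents, 1975–80 for PatStock) come from the text.

```python
from collections import defaultdict

# Hypothetical inputs: (county id, grant year, utility patents) records and a
# county-to-commuting-zone crosswalk. All values are made up for illustration.
county_patents = [
    ("county_1", 1976, 12), ("county_1", 2001, 30), ("county_1", 2004, 25),
    ("county_2", 1979, 3), ("county_2", 2003, 8), ("county_2", 1990, 5),
]
county_to_cz = {"county_1": "cz_a", "county_2": "cz_a"}

PATENTS_WINDOW = (2000, 2005)  # dependent variable window, inclusive
STOCK_WINDOW = (1975, 1980)    # historical patent stock window, inclusive


def aggregate(records, crosswalk, window):
    """Sum county-year patent counts to commuting zones within a year window."""
    first, last = window
    totals = defaultdict(int)
    for county, year, count in records:
        if first <= year <= last:
            totals[crosswalk[county]] += count
    return dict(totals)


patents = aggregate(county_patents, county_to_cz, PATENTS_WINDOW)
pat_stock = aggregate(county_patents, county_to_cz, STOCK_WINDOW)
```

Records outside both windows (the 1990 record here) contribute to neither measure.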

Remaining variables in equation 1 are generated based on data from the Department of Agriculture’s (USDA’s) ERS and the Census Bureau. All occupation share measures are based on a special EEO county-level tabulation of Census 2000 occupational data. Following Furman, Porter, and Stern (2002), the use of occupation data in our analysis allows us to correct for distortions caused by variability in the size of dependent population across regions. Because occupation data for subsequent years are not available to us, the analysis is cross-sectional in nature. In a departure from Wojan, Dotzel, and Low’s (2015) study, the analysis is additionally restricted to the contiguous United States because ERS does not provide natural amenity metrics for Alaska and Hawaii.

At the most detailed level, the Census 2000 Occupational Classification System provides county-level data on the number of employees in 509 occupations (Fronczek and Johnson 2003). Our analysis excludes the four military-specific Census occupations as well as legislators, postmasters and mail superintendents, and judges, magistrates, and other judicial workers because data were not provided by Census for these occupations. Of the remaining 502 occupations, 454 do not belong to the group of SET occupations used by NSF to track advancements in the innovative and inventive economy (defined as the non-SET occupations). Differing from Wojan, Dotzel, and Low (2015), only those non-SET occupations with Occupational Information Network (O*NET) Thinking Creatively (TC) “Importance” and “Level” ratings that exceed the minimum ratings across SET occupations—1.2 and 1, respectively, both for “agricultural food science technicians”—are included in the random selection procedure.
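The screening rule for candidate non-SET occupations can be sketched as a simple filter. The occupation names and ratings below are hypothetical; the two cutoffs (1.2 for Importance and 1 for Level) are the SET minimums stated above. The text words the cutoff both as "exceed" and as "at least as great as", so the inclusive comparison used here is an assumption.

```python
# Minimum Thinking Creatively (TC) ratings across SET occupations, per the text.
MIN_IMPORTANCE = 1.2
MIN_LEVEL = 1.0

# Hypothetical non-SET occupations: name -> (TC Importance, TC Level).
non_set_tc = {
    "occupation_a": (3.4, 3.9),
    "occupation_b": (1.1, 0.8),
    "occupation_c": (2.0, 0.9),
    "occupation_d": (1.2, 1.0),
}


def is_candidate(importance, level):
    """Both ratings must clear the SET minimum (inclusive, by assumption)."""
    return importance >= MIN_IMPORTANCE and level >= MIN_LEVEL


candidates = sorted(
    name for name, (imp, lvl) in non_set_tc.items() if is_candidate(imp, lvl)
)
```

An occupation must clear both cutoffs, so one with a high Importance rating but a Level rating below 1 (occupation_c here) is screened out.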

The O*NET database, sponsored by the Department of Labor’s Employment and Training Administration, is the product of a nationwide survey of workers from a broad set of occupations. Each occupation in the O*NET database is associated with a set of descriptors that “defines the key features of [the] occupation as a standardized, measurable set of variables” (O*NET Resource Center 2017). O*NET provides data on 227 descriptors across six domains, including “work activities,” and, within this domain, “mental processes.” O*NET’s TC value describes one such mental process and is attributed to occupations that require workers to “develop…, design…, or creat[e] new applications, ideas, relationships, systems, or products, including artistic contributions” as part of their work activities (O*NET OnLine 2017b). Because O*NET uses the Standard Occupational Classification (SOC) System to classify occupations, it was necessary to match each three-digit Census occupation to its six-digit SOC equivalent occupation before assigning TC values. A detailed description of our Census-SOC matching methodology is available upon request.

After completing the match of the Census occupations to the SOC occupations, each Census occupation was then matched to its TC value based on archived data from O*NET OnLine, beginning with the 3.0 database, published in August 2000. If a match was not made using the 3.0 database (7% of cases), Census occupations were then matched to the 3.1 database values. Unmatched occupations were subsequently matched to TC values from the O*NET databases in chronological order, ending with database 20.3, published in April 2016.
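The chronological fallback described above amounts to taking each occupation's TC value from the earliest database release in which it appears. A minimal sketch follows. The archive structure, SOC codes, and ratings are hypothetical, and the real sequence contains many more releases between 3.1 and 20.3.

```python
# Hypothetical archive: (database version, {SOC code: TC rating}) pairs listed
# in chronological order of release. Codes and ratings are made up.
onet_archive = [
    ("3.0", {"11-1111": 3.2, "22-2222": 2.1}),
    ("3.1", {"11-1111": 3.3, "33-3333": 1.8}),
    ("20.3", {"44-4444": 2.7}),
]


def earliest_tc(soc_code, archive):
    """Return (version, rating) from the first release containing soc_code."""
    for version, ratings in archive:
        if soc_code in ratings:
            return version, ratings[soc_code]
    return None  # never matched in any release


matches = {
    soc: earliest_tc(soc, onet_archive)
    for soc in ["11-1111", "33-3333", "44-4444", "55-5555"]
}
```

Because the loop stops at the first hit, a later revision of a rating (3.3 for the first code in release 3.1) never overrides the earlier value.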

Based on our criterion that candidate non-SET occupations should have TC Importance and Level ratings at least as great as the minimum ratings across SET occupations, 300 of the 454 non-SET occupations remain in the pool of candidate occupations. Following each regression, we update a collection of count variables that record, for each occupation share, the number of regressions in which it is included and the number in which its coefficient is positive and significant at the 5% level. These counts allow us to calculate the percentage of regressions in which an occupation share effect is positive and significant, separately for the rural and urban samples. To mitigate omitted variable bias, the combined workforce share of the candidate non-SET occupations whose individual shares are not included in a given run is captured by the ExclSh variable.

Our inventive subset inclusion criteria for the non-SET occupations follow from Wojan et al.’s study (2015): those occupations associated with coefficients that are positive and significant in at least 75% of their regressions in the metro or nonmetro analysis are characterized as inventive.
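The bookkeeping behind the iterative procedure can be sketched on synthetic data. This is an illustration, not the authors' estimation code. It assumes ordinary least squares, a large-sample critical value of 1.96, a scaled-down design (12 candidates, 4 drawn per run, 200 runs rather than 19 drawn in each of 10,000 runs), and it omits the regional controls. All data are simulated so that only candidate 0 truly affects patenting.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (all hypothetical): 300 zones, 12 candidate occupation shares,
# one SET share. Only candidate 0 is constructed to raise patenting.
n_zones, n_cand, n_draw, n_runs = 300, 12, 4, 200
occ = rng.uniform(0.0, 1.0, size=(n_zones, n_cand))
set_sh = rng.uniform(0.0, 1.0, size=n_zones)
patents = 5.0 * occ[:, 0] + 2.0 * set_sh + rng.normal(0.0, 1.0, size=n_zones)


def ols_t_stats(X, y):
    """Coefficients and t statistics from ordinary least squares."""
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(xtx_inv))
    return beta, beta / se


included = np.zeros(n_cand, dtype=int)  # times each share entered a regression
pos_sig = np.zeros(n_cand, dtype=int)   # times it was positive and significant

for _ in range(n_runs):
    chosen = rng.choice(n_cand, size=n_draw, replace=False)
    excluded = np.setdiff1d(np.arange(n_cand), chosen)
    excl_sh = occ[:, excluded].sum(axis=1)  # combined share of undrawn candidates
    X = np.column_stack([np.ones(n_zones), occ[:, chosen], set_sh, excl_sh])
    beta, t = ols_t_stats(X, patents)
    for pos, j in enumerate(chosen, start=1):  # column 0 is the intercept
        included[j] += 1
        # 1.96 is the large-sample two-sided 5% critical value (an assumption).
        if beta[pos] > 0 and abs(t[pos]) > 1.96:
            pos_sig[j] += 1

share_pos_sig = pos_sig / np.maximum(included, 1)
inventive = [
    j for j in range(n_cand) if included[j] > 0 and share_pos_sig[j] >= 0.75
]
```

The 75% threshold is applied to each occupation's own count of appearances, so an occupation that is drawn rarely is judged on the same relative scale as one that is drawn often.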

Discussion of results

Of the 300 non-SET occupations included in the analysis, 11 are identified as inventive, that is, consistently associated with positive, significant coefficients (see table 3). Occupations that are strongly associated with patenting in both urban and rural commuting zones include the collection of SET occupations and also marketing managers. The coefficient estimate for engineering managers was significant in 60% of the rural equations in which it appeared and in 100% of urban equations. Occupations with a strong association to patenting in urban commuting zones include market and survey researchers, designers and technical writers, and electronics assemblers—a production occupation. Occupations with a strong association to patenting in rural commuting zones include all the remaining production occupations.

Table 3. Inventive occupations

(Percent positive and significant coefficients in iterative estimation of equation 1)

CZ = commuting zone.

Note(s):

Summaries of coefficients for the iterative regression analysis are available in Dotzel (2017:165–70).

Source(s):

Dotzel K, 2017, Three Essays on Human Capital and Innovation in the United States, PhD dissertation, Graduate School of The Ohio State University.

Several occupations found by Wojan et al. (2015) to be consistently associated with patenting in nonmetro commuting zones are not represented in our results. A number of these exclusions (e.g., “entertainers and performers,” “postsecondary teachers”) are likely due to our added controls for Census region and human capital stock: “actors” and “producers and directors,” suboccupations of “entertainers and performers,” are highly concentrated in the West Census region, while “postsecondary teachers” are concentrated in commuting zones with a high percentage of workers with bachelor’s degrees. “Printing workers” and “customer service representatives,” occupations not traditionally associated with patent production, are additionally absent from the updated inventive class. “Marketing and sales managers,” “market and survey researchers,” and “technical writers” represent new additions compared to the 2015 study. In general, it seems that the modifications made to the previous analysis are effective in removing spurious occupations with no discernible connection to patenting.

Table 4 compares patenting rates across commuting zones containing a global city; these zones are characterized by their “availability of advanced producer services essential for the global coordination of activities by multinational corporations” (Wojan et al. 2015:549). Average absolute value of the difference in patents per inventive class member rank and patents per capita rank for global city commuting zones is larger for the updated inventive class (2.3) compared to Wojan et al. (2015) (1.4). In line with the trend recognized by Wojan and colleagues (2015), commuting zones specializing in media or finance or containing high-amenity cities typically regarded as tourist and/or retirement destinations—including the cities of Los Angeles, Miami, New York, and Orlando—perform better when commuting zones are ranked by patents per inventive class member than when ranked by patents per capita. These cities additionally fall within the bottom five cities when ranked by inventive class employment share, suggesting that they are particularly disadvantaged by their small inventive class populations.
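The rank comparison summarized above can be sketched as follows. Zone names and counts are hypothetical; the sketch ranks zones by patents per capita and by patents per inventive class member, then averages the absolute rank differences.

```python
# Hypothetical commuting zones: name -> (patents, population, inventive class).
zones = {
    "zone_a": (900, 1_000_000, 30_000),
    "zone_b": (500, 1_000_000, 10_000),
    "zone_c": (700, 1_000_000, 20_000),
    "zone_d": (200, 1_000_000, 8_000),
}


def ranks(values):
    """Rank 1 = highest value; assumes no ties."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


per_capita = {z: p / pop for z, (p, pop, ic) in zones.items()}
per_member = {z: p / ic for z, (p, pop, ic) in zones.items()}

rank_pc = ranks(per_capita)
rank_ic = ranks(per_member)
# Positive shift = the zone ranks better per inventive class member.
rank_shift = {z: rank_pc[z] - rank_ic[z] for z in zones}
mean_abs_shift = sum(abs(d) for d in rank_shift.values()) / len(rank_shift)
```

In this made-up example, zone_b has the smallest inventive class share among the three leading zones and climbs from third to first once the denominator is restricted, which is the pattern the text reports for cities with small inventive class populations.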

Table 4. Patenting statistics for commuting zones containing global cities

(Alternative patenting rates)

a See Wojan TR, Dotzel KR, Low SA, 2015, Decomposing Regional Patenting Rates: How the Composition Factor Confounds the Rate Factor, Regional Studies, Regional Science 2(1):535–51.

Note(s):

Following Wojan, Dotzel, and Low (2015), the Aleutians West Census Area, AK, is omitted from the Seattle, WA, commuting zone. Difference in ranks column denotes the change in rank, when global city commuting zones are ranked by patents per inventive class member, between Wojan, Dotzel, and Low's (2015) study and our updated analysis.

Source(s):

U.S. Patent and Trademark Office; Census Bureau, special tabulations (2015) of the 2000 Census.

 

Standardization and Decomposition Analysis Using Newly Defined Inventive Class

A still underrecognized weakness of existing studies of regional invention is their general lack of acknowledgement of differences in the composition of inventive populations across space. In the field of demography, it would be unprecedented to base conclusions regarding regional fertility on a rate (e.g., number of live births divided by total population) that does not consider the size of the region’s childbearing population. As shown by Wojan et al. (2015), tools of standardization and decomposition, summarized by Das Gupta (1993) and traditionally applied in demography and other social science disciplines, can be extended to the topic of invention using our newly defined inventive class. Patents per capita for each region can be decomposed as the product of share of total population in the inventive class, known as the compositional factor (CF), and patents produced per inventive class member, known as the rate factor (RF). This decomposition allows for standardization with respect to either factor, that is, it shows what observed patenting rates would look like if CFs (or, alternatively, RFs) were identical across regions. In addition, the decomposition answers to what extent we can attribute differences in observed patenting rates across regions to variation in each of these factors.
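Written out, the decomposition is an identity, and in the two-population case the difference in per capita rates splits exactly into a rate effect and a compositional effect. The notation below (the region subscript r, the overbars for two-group averages, and the label names) is introduced here for illustration.

```latex
\[
\frac{\text{Patents}_r}{\text{Population}_r}
= \underbrace{\frac{\text{InventiveClass}_r}{\text{Population}_r}}_{\mathrm{CF}_r}
\times
\underbrace{\frac{\text{Patents}_r}{\text{InventiveClass}_r}}_{\mathrm{RF}_r}
\]
\[
\mathrm{CF}_1\mathrm{RF}_1 - \mathrm{CF}_2\mathrm{RF}_2
= \underbrace{\overline{\mathrm{CF}}\,(\mathrm{RF}_1 - \mathrm{RF}_2)}_{\text{rate effect}}
+ \underbrace{\overline{\mathrm{RF}}\,(\mathrm{CF}_1 - \mathrm{CF}_2)}_{\text{compositional effect}},
\qquad
\overline{\mathrm{CF}} = \tfrac{1}{2}(\mathrm{CF}_1 + \mathrm{CF}_2),\quad
\overline{\mathrm{RF}} = \tfrac{1}{2}(\mathrm{RF}_1 + \mathrm{RF}_2)
\]
```

Expanding the right-hand side of the second line recovers the left-hand side term by term, so the two effects always sum to the observed difference.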

Following Wojan et al. (2015), we first investigate how variation in population growth rates impacts the two factors’ contribution to differences in observed patenting rates. We do this by separating commuting zones that contain a global city into two groups: those ranked in the top 50% of commuting zones based on their population growth rate between 1975 and 2000, and those ranked in the bottom 50%. We then apply Das Gupta’s (1993) standardization and decomposition methodology for the two-factor, two-population case.

Results calculated using the updated inventive class in table 5, defined using the detailed Census occupations in table 3, mirror those based on the 2015 inventive class, with slight differences in magnitude of rates and percent distribution of effects. The standardization of factors calculates the per capita patenting rate that would have been observed if either the inventive class rate or inventive class composition were expressed as the average across both high-growth and low-growth cities. For example, the RF standardization demonstrates that if both groups had the same average patenting rate among inventive class members, low-growth global cities would have higher per capita patenting rates than high-growth global cities (0.0031 vs. 0.0029). Differences in the magnitudes of factors across studies suggest a smaller and more productive inventive class for global cities in the updated analysis.

The observed, population-denominated patenting rate (i.e., patents per capita) for commuting zones containing high-growth global cities is 31% higher than the rate for commuting zones with low-growth global cities. Results of the decomposition analysis tell us that 126% of the difference in patents per capita rates across the defined populations is driven by disparities in inventive class patenting productivity across regions, a smaller percentage than indicated by Wojan et al.’s (2015) results. While the observed patenting rate for high-growth global city commuting zones is higher, if inventive class productivity were consistent between the two populations, the patenting rate of low-growth global cities would dominate. The decomposition results also tell us that the high-growth population suffers based on its inventive class composition. If share of population in the inventive class were identical across groups, the gap between the groups’ patenting rates would increase. Nonetheless, the compositional disadvantage of high-growth global city commuting zones is more than offset by their productivity advantage.
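A rate-factor share above 100% arises whenever the compositional effect has the opposite sign to the overall difference. The sketch below applies the two-factor, two-population arithmetic to hypothetical CF and RF values chosen only to reproduce that sign pattern; they are not the values in table 5.

```python
def two_factor_decomposition(cf_1, rf_1, cf_2, rf_2):
    """Two-factor, two-population decomposition of rate = CF * RF."""
    mean_cf = (cf_1 + cf_2) / 2
    mean_rf = (rf_1 + rf_2) / 2
    return {
        "difference": cf_1 * rf_1 - cf_2 * rf_2,
        "rf_effect": mean_cf * (rf_1 - rf_2),
        "cf_effect": mean_rf * (cf_1 - cf_2),
        # RF held at the two-group average, so only composition differs.
        "rf_standardized": (mean_rf * cf_1, mean_rf * cf_2),
        # CF held at the two-group average, so only productivity differs.
        "cf_standardized": (mean_cf * rf_1, mean_cf * rf_2),
    }


# Hypothetical groups: group 1 has the smaller but more productive inventive class.
result = two_factor_decomposition(cf_1=0.040, rf_1=0.080, cf_2=0.050, rf_2=0.050)

rf_share = 100 * result["rf_effect"] / result["difference"]
cf_share = 100 * result["cf_effect"] / result["difference"]
```

With these inputs the rate effect exceeds the whole observed gap and the compositional effect is negative, so the two percentage shares still sum to 100. Under RF standardization the second group has the higher rate, mirroring the reversal described above.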

Table 5. Comparison of standardization and decomposition of patenting rates for global city commuting zones based on population growth rate

(Patenting rates)

a See Wojan TR, Dotzel KR, Low SA, 2015, Decomposing Regional Patenting Rates: How the Composition Factor Confounds the Rate Factor, Regional Studies, Regional Science 2(1):535–51.

Note(s):

RF is patents per inventive class member, and CF is inventive class as a share of total commuting zone population. Following Wojan, Dotzel, and Low (2015), Aleutians West Census Area, AK, is omitted from the Seattle, WA, commuting zone.

Source(s):

U.S. Patent and Trademark Office; Census Bureau, special tabulations (2015) of the 2000 Census.

Next, we examine how patenting rates differ between populations defined based on level of urbanization. Following Wojan et al. (2015), we divide commuting zones into three groups—rural; urban, without a global city (UWGC); and urban, containing a global city (UCGC). For this step in the analysis, we apply Das Gupta’s two-factor, three-population standardization and decomposition methodology, which ensures that there is “only one standardized rate for a population when standardization is done with respect to the same factor” (1993:97) and that effects are internally consistent.

The results for our two-factor, three-population case based on level of urbanization are presented in table 6a and table 6b. Our results indicate that for each region-type pair, differences in the composition of inventive populations across regions are responsible for 50%–55%—that is, the majority—of the difference in population-denominated patenting rates. This contrasts with the decomposition results of Wojan et al. (2015), which identify productivity differences as the main driver of differences in observed population-denominated patenting rates for each region-type pair. While observed patents per capita rates indicate that global cities are approximately six times more productive than rural areas in producing patents, when inventive class population share is held constant across the two populations, global cities are only twice as productive. This provides compelling evidence that the composition of inventive populations across regions must be considered by researchers if meaningful conclusions are to be made concerning the inventive capacity of rural and urban regions.

Standardization of patenting rates for commuting zones based on level of urbanization

(Patenting rates)

a See Wojan TR, Dotzel KR, Low SA, 2015, Decomposing Regional Patenting Rates: How the Composition Factor Confounds the Rate Factor, Regional Studies, Regional Science 2(1):535–51.

Note(s):

RF is patents per inventive class member, and CF is inventive class as a share of total commuting zone population. In a departure from Wojan, Dotzel, and Low (2015), all Hawaiian and Alaskan commuting zones (including Aleutians West Census Area, AK, part of the Seattle, WA, commuting zone) are omitted. This change results in slight discrepancies in observed patenting rates across studies.

Source(s):

U.S. Patent and Trademark Office; Census Bureau, special tabulations (2015) of the 2000 Census.

Decomposition of patenting rates for commuting zones based on level of urbanization

(Rate and composition factors pertaining to patenting)

CF = compositional factor; RF = rate factor; UCGC = urban, containing a global city; UWGC = urban, without a global city.

a See Wojan TR, Dotzel KR, Low SA, 2015, Decomposing Regional Patenting Rates: How the Composition Factor Confounds the Rate Factor, Regional Studies, Regional Science 2(1):535–51.

Note(s):

RF is patents per inventive class member, and CF is inventive class as a share of total commuting zone population. In a departure from Wojan, Dotzel, and Low (2015), all Hawaiian and Alaskan commuting zones (including Aleutians West Census Area, AK, part of the Seattle, WA, commuting zone) are omitted. This change results in slight discrepancies in observed patenting rates across studies.

Source(s):

U.S. Patent and Trademark Office; Census Bureau, special tabulations (2015) of the 2000 Census.

 

Empirical Evaluation of Newly Defined Inventive Class

Two-factor decomposition of the per capita patenting rate makes it clear that continued reliance on the measure will confound our understanding of the true contributors to inventive capacity. In the extreme case, a regional policy to eliminate the social safety net plausibly could be associated with an increase in the per capita patenting rate. However, the two-factor decomposition would correctly identify the cause as an increase in the share of the population in the inventive class as the disadvantaged population is decimated. This hypothetical example calls into question the value of any finding that fails to model both the rate and compositional factors.

Data and methodology

In this exercise, we evaluate the correlates of inventive production in commuting zones and membership in our inductively defined inventive class. Contributors to the inventive process may be drawn to regions with existing inventive populations—and, likewise, support regional inventive productivity through own-invention or the dissemination of knowledge to other inventors in their region (Agrawal, Kapur, and McHale 2008; Thompson and Fox-Kean 2005; Jaffe, Trajtenberg, and Henderson 1993). Since the direction of causality between these two processes is not obvious, a simultaneous equations model is adopted. Generally, each equation in a simultaneous equations system is referred to as a structural equation. All dependent variables are assumed to be endogenous to the system, while all other variables are treated as exogenous and are taken to be instruments for the endogenous variables. Following Faggian and McCann (2009) in their study of interregional human capital flows in Great Britain, our simultaneous equations model is estimated using three-stage least squares (3SLS).

To estimate cross-sectional simultaneous equation models, limited or full information methods are available (Greene 2012). The limited information two-stage least squares (2SLS) method is most commonly used but provides inefficient parameter estimates when error terms are correlated across equations, a possibility that we have no theoretical reason to rule out a priori in this case. The full-information 3SLS method, developed by Zellner and Theil (1962), combines the traditional 2SLS procedure with seemingly unrelated regressions. Under 3SLS, the variance-covariance matrix of cross-equation error terms is estimated in the second stage and used to correct coefficient estimates; this represents the deviation from 2SLS. The 3SLS and 2SLS methods produce equivalent results when error terms are uncorrelated across equations.
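The three stages can be sketched on synthetic data as follows. This is a textbook-style illustration rather than the estimation code used for this paper: the function `three_sls`, the two-equation system, and all parameter values are assumptions chosen for the example.

```python
import numpy as np

def three_sls(ys, Zs, X):
    """Textbook 3SLS. ys: dependent variables, Zs: regressor matrices
    (endogenous and included exogenous), X: all exogenous instruments."""
    n = X.shape[0]

    def project(A):
        # Stage 1: fitted values from regressing A on the instruments
        return X @ np.linalg.lstsq(X, A, rcond=None)[0]

    Z_hat = [project(Z) for Z in Zs]
    # Stage 2: equation-by-equation 2SLS, then the cross-equation error covariance
    two_sls = [np.linalg.solve(Zh.T @ Z, Zh.T @ y) for Zh, Z, y in zip(Z_hat, Zs, ys)]
    resid = np.column_stack([y - Z @ d for y, Z, d in zip(ys, Zs, two_sls)])
    sigma_inv = np.linalg.inv(resid.T @ resid / n)
    # Stage 3: GLS on the stacked system, weighting by the inverse covariance
    G = len(ys)
    A = np.block([[sigma_inv[i, j] * (Z_hat[i].T @ Z_hat[j]) for j in range(G)]
                  for i in range(G)])
    b = np.concatenate([sum(sigma_inv[i, j] * (Z_hat[i].T @ ys[j]) for j in range(G))
                        for i in range(G)])
    return two_sls, np.linalg.solve(A, b)

# Synthetic two-equation system with correlated errors (all values hypothetical)
rng = np.random.default_rng(0)
n = 5000
x1, x2, x3 = rng.normal(size=(3, n))
e = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 1.0]], size=n)
a1, b1 = 0.5, 1.0            # y1 = a1*y2 + b1*x1 + e1
a2, b2, b3 = 0.3, 1.0, 0.8   # y2 = a2*y1 + b2*x2 + b3*x3 + e2
u1 = b1 * x1 + e[:, 0]
u2 = b2 * x2 + b3 * x3 + e[:, 1]
y1 = (u1 + a1 * u2) / (1 - a1 * a2)  # reduced form of the system
y2 = a2 * y1 + u2

const = np.ones(n)
X = np.column_stack([const, x1, x2, x3])
Z1 = np.column_stack([y2, x1, const])
Z2 = np.column_stack([y1, x2, x3, const])
two_sls_est, three_sls_est = three_sls([y1, y2], [Z1, Z2], X)
print(np.round(three_sls_est, 3))  # order: a1, b1, const, a2, b2, b3, const
```

If the estimated cross-equation covariance matrix were diagonal, the stage 3 weights would cancel equation by equation and the 3SLS estimates would match 2SLS.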

Data from the Census Bureau, USDA’s ERS, Patent and Trademark Office, and Department of Justice’s (DOJ’s) Uniform Crime Reporting Statistics (DOJ 2006) are used to generate variables used in the analysis. Choice of variables mirrors those selected by Faggian and McCann (2009), with a few modifications motivated by differences in data availability and study relevance, as described below. Table 7 includes descriptions of and summary statistics for all considered variables, while pairwise correlation coefficients for these variables are presented in table 8.

Description of variables in simultaneous equations model—rural and urban commuting zones

(Descriptive statistics)
Note(s):

Equations 2 and 3 additionally include categorical controls for Census region and ERS Rural-Urban Continuum Code of the commuting zone's most urban county.

Source(s):

U.S. Patent and Trademark Office; Census Bureau, Special Tabulation (2000), Census of Population and Housing, Summary File 1 and Summary File 4, and Shapefile; Integrated Postsecondary Education Data System, 2000; Department of Justice, 2000; Department of Agriculture, Economic Research Service, 1999, 2000; Bureau of Labor Statistics, Quarterly Census of Employment and Wages, 1998; Florida R, 2002, The Rise of the Creative Class: And How It’s Transforming Work, Leisure, Community, and Everyday Life, New York: Basic Books; McGranahan D, Wojan T, 2007, Recasting the Creative Class to Examine Growth Processes in Rural and Urban Counties, Regional Studies 41(2):197–216.

Association among variables in simultaneous equations model—rural and urban commuting zones

(Pearson correlation coefficients)
Source(s):

U.S. Patent and Trademark Office; Census Bureau, Special Tabulation (2000), Census of Population and Housing, Summary File 1 and Summary File 4, and Shapefile; Integrated Postsecondary Education Data System, 2000; Department of Justice, 2000; Department of Agriculture, Economic Research Service, 1999, 2000; Bureau of Labor Statistics, Quarterly Census of Employment and Wages, 1998.

The first structural equation (equation 2) estimates the aggregate number of patents produced in commuting zone z during the period 2000–05 (Patents) as a function of inventive class population (IC), aggregate 1975–80 patent stock (PatStock), population density (PopDens), wage-rental ratio (WageRentR), ERS natural amenity rank (NatAmen), university R&D expenditures (UniResExp), and a location quotient for patent-intensive industries (LQPatInd). As is the case for the iterative regression analysis (equation 1), use of an aggregate patenting measure serves to minimize cases of nonexistent and low patent production in commuting zones and accounts for the time lag between patent application and granting. A commuting zone geography is also similarly adopted to minimize discrepancies between place of work and place of residence when assigning patenting activity to regions.

In the second structural equation (equation 3), commuting zone inventive class population is related to Patents, unemployment rate (UR), WageRentR, NatAmen, crime rate (Crime), proximity to an urban center (ProxUrbCen), and LQPatInd. Each structural equation additionally includes controls for ERS RUCC and Census region (in 𝑮𝒆𝒐).

Equation 2

Patents are estimated as a function of the inventive class; historical patent stock; population density; wage-rental ratio; a measure of natural amenities; university R&D expenditures; a location quotient for patent-intensive manufacturing; and geographic variables.

Equation 3

The inventive class is estimated as a function of patents; the unemployment rate; wage-rental ratio; a measure of natural amenities; crime rate; proximity to an urban center; a location quotient for patent-intensive manufacturing; and geographic variables.
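For readers who prefer symbolic notation, the two structural equations can be sketched as below, using the variable names given in the text. The linear additive form, the coefficient symbols, and the error terms are assumptions made for illustration; population density appears in equation 2 because the variable list in the text names it.

```latex
\begin{align}
\mathit{Patents}_z &= \alpha_0 + \alpha_1 \mathit{IC}_z + \alpha_2 \mathit{PatStock}_z
  + \alpha_3 \mathit{PopDens}_z + \alpha_4 \mathit{WageRentR}_z + \alpha_5 \mathit{NatAmen}_z \notag \\
  &\quad + \alpha_6 \mathit{UniResExp}_z + \alpha_7 \mathit{LQPatInd}_z
  + \boldsymbol{Geo}_z \boldsymbol{\alpha}_8 + \varepsilon_z \tag{2} \\
\mathit{IC}_z &= \beta_0 + \beta_1 \mathit{Patents}_z + \beta_2 \mathit{UR}_z
  + \beta_3 \mathit{WageRentR}_z + \beta_4 \mathit{NatAmen}_z + \beta_5 \mathit{Crime}_z \notag \\
  &\quad + \beta_6 \mathit{ProxUrbCen}_z + \beta_7 \mathit{LQPatInd}_z
  + \boldsymbol{Geo}_z \boldsymbol{\beta}_8 + \mu_z \tag{3}
\end{align}
```

Here z indexes commuting zones, and the two error terms are allowed to be correlated with each other, which is the case that motivates 3SLS over 2SLS.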

The definition of our location quotient (LQ)—the ratio of the commuting zone’s share of employment to the nation’s share of employment in patent-intensive industries, as determined by the Economics and Statistics Administration and the USPTO (Department of Commerce 2012)—represents a departure from Faggian and McCann (2009), whose location quotient is defined based on employment in the wider manufacturing industry. We include LQPatInd in both structural equations because the agglomeration of workers in occupations highly amenable to patenting could concurrently affect the inventive productivity and size of the considered subpopulations. Our addition of historic patent stock to the structural equation for patents follows from equation 1.
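The location quotient defined above reduces to a ratio of two employment shares. A minimal sketch, with a hypothetical function name and made-up employment counts, is:

```python
def location_quotient(cz_pi_emp, cz_total_emp, nat_pi_emp, nat_total_emp):
    """Share of a commuting zone's employment in patent-intensive industries,
    relative to the same share for the nation."""
    cz_share = cz_pi_emp / cz_total_emp
    nat_share = nat_pi_emp / nat_total_emp
    return cz_share / nat_share

# Hypothetical commuting zone: 1,200 of 20,000 jobs are in patent-intensive industries.
# Hypothetical nation: 5 million of 125 million jobs are in patent-intensive industries.
lq = location_quotient(1_200, 20_000, 5_000_000, 125_000_000)
print(round(lq, 3))
```

A value above 1 means the commuting zone has a higher share of patent-intensive employment than the nation as a whole.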

The inclusion of a natural amenity measure in both structural equations signifies a second deviation from Faggian and McCann’s (2009) analysis. Initial consideration of favorable climate as a determinant of location choice can be traced to early studies by Graves (1979, 1980) and Carlino and Mills (1987). More recent works explore how natural amenities factor into the location choices of high-skilled workers specifically (Fiore et al. 2015; Whisler et al. 2008; Gottlieb and Joseph 2006). In addition to affecting the composition of the considered subpopulations, natural amenities may promote patent production from these workers by stimulating the creative thought processes essential to invention, as argued for equation 1 in section 3.1 (Pasanen, Neuvonen, and Korpela 2018; Plambech and Konijnendijk van den Bosch 2015). The interpretation of wage-rental ratio as a negative proxy for amenities (Roback 1982, 1988) similarly justifies its addition to equation 2.

The use of university R&D expenditures (UniResExp) in place of the composite university, government, and private R&D expenditure measure used in Faggian and McCann (2009) is a third deviation. Data limitations in the United States relative to the European Union necessitate this choice, but UniResExp may be the critical component of the composite for examining rural patenting, as land grant research universities in rural commuting zones may have a large impact on differential patenting rates. Studies of rural innovation (Aryal et al. 2018) and high technology manufacturing location (Woodward, Figueiredo, and Guimarães 2006) provide empirical support for this conjecture. The absence of private R&D is more of a concern in the urban commuting zone equations, where this type of expenditure is concentrated.

Although our primary motivation for this modeling exercise is to identify the correlates of regional inventive capacity, we also intend to assess the value added of the inventive class construct by substituting for our inventive class, as dependent variables in equation 3, three other subpopulations of knowledge workers who may potentially contribute to the patenting process: (1) SET workers, which Furman, Porter, and Stern (2002) include (as “scientists and engineers”) in their model of international innovative capacity; (2) Richard Florida’s (2002) creative class (CC), as recast by McGranahan and Wojan (2007); and (3) workers possessing a bachelor’s degree or higher (BA). Notably, a strong correlation between each subpopulation pair is visible in table 8, with CC and BA showing the strongest correlation in both rural and urban regions. We exclude measures of regional university quality and student density from equation 3 to isolate the impact of human capital on invention to the considered subpopulations of knowledge workers.

Based on the summary statistics in table 7, the average rural commuting zone produced approximately 35 patents during the period 2000–05, compared with 1,867 patents for the average urban commuting zone. Mean populations of the considered groups of knowledge workers for urban commuting zones measured in year 2000 likewise greatly exceed those for rural commuting zones (by factors of 27, 30, 21, and 20 for inventive class, SET employment, CC, and human capital stock, respectively).

Discussion of results

Results for each structural equation are provided in table 9 for rural commuting zones and in table 10 for urban commuting zones.

Results for simultaneous equations model—rural commuting zones

(Estimated regression coefficients on patents and population subset)

* p < 0.10; ** p < 0.05; *** p < 0.01.

Note(s):

Structural equation estimates are based on 358 observations. Controls for Census region and Rural-Urban Continuum Code are included in each regression.

Source(s):

Dotzel K, 2017, Three Essays on Human Capital and Innovation in the United States, PhD Dissertation, Graduate School of The Ohio State University, p. 108.

Results for simultaneous equations model—urban commuting zones

(Estimated regression coefficients on patents and population subset)

* p < 0.10; ** p < 0.05; *** p < 0.01.

Note(s):

Structural equation estimates are based on 301 observations. Controls for Census region and Rural-Urban Continuum Code are included in each regression.

Source(s):

Dotzel K, 2017, Three Essays on Human Capital and Innovation in the United States, PhD Dissertation, Graduate School of The Ohio State University, p. 112.

For rural commuting zones, results for the first structural equation indicate that membership in each of the four considered classes of knowledge workers positively impacts patent production. Size of the SET workforce has the largest estimated effect on invention: for every 11 SET workers added to a rural commuting zone, one patent is produced during the 6-year period of interest. Each additional SET worker’s contribution to patenting is approximately twice that of an additional worker in the inventive class: one patent is produced for every 20 additional IC workers. From a precision perspective, the IC coefficient in the rural patents equation is more precisely estimated than the SET coefficient; the p-value for the SET point estimate is approximately three times that of the IC point estimate (p-values 0.000806 and 0.000238, not shown). Workers employed in the creative class and workers possessing a bachelor’s degree have the smallest estimated impact on patent production: 50 and 63 additional creative class members and educated workers, respectively, are associated with the production of one patent. The much higher p-values for coefficients on both CC and BA in their respective regressions, compared to those for IC and SET, confirm that both measures include a good deal of information that is not relevant to patent production.
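The workers-per-patent figures quoted above are reciprocals of the corresponding coefficients, so the relative effect sizes can be checked with simple arithmetic. In the sketch below, the coefficients are implied by the counts reported in the text rather than copied from table 9, and the variable names are illustrative.

```python
# Additional workers associated with one additional patent (rural results, from the text)
workers_per_patent = {"SET": 11, "IC": 20, "CC": 50, "BA": 63}

# Implied marginal effect: patents per additional worker
implied_coef = {group: 1 / count for group, count in workers_per_patent.items()}

# Relative size of the SET effect compared with the IC effect
set_to_ic = implied_coef["SET"] / implied_coef["IC"]

# Ratio of the two p-values quoted in the text (SET relative to IC)
p_ratio = 0.000806 / 0.000238

for group, coef in implied_coef.items():
    print(f"{group}: {coef:.4f} patents per additional worker")
print(f"SET effect relative to IC: {set_to_ic:.2f}")
print(f"SET p-value relative to IC: {p_ratio:.2f}")
```

The implied SET effect is a little under twice the IC effect, and the SET p-value is a little over three times the IC p-value, in line with the approximations in the text.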

Although size of the SET workforce appears to dominate as a predictor of invention in rural regions, results for the second structural equation indicate that inventive output may have a significant role to play in supporting other groups of knowledge workers that a sole focus on SET workers obscures. For the rural sample, the production of an additional patent increases estimated employment in SET occupations by 9, inventive class membership by 13, creative class membership by 40, and educated workforce size by 60. Our results seem to confirm the existence of an interrelationship between patent production and membership in our updated inventive class.

Wage-rental ratio, which can be interpreted as a negative proxy for amenities (Roback 1982, 1988), is positively associated with patenting when the creative class or workers with a bachelor’s degree are the population of interest. Estimated effects for LQPatInd across specifications may provide insight into these results. The constructed location quotient for patent-intensive industries is associated with a significant coefficient in the BA patents specification, suggesting that the WageRentR variable may be picking up a tendency for manufacturing industries—and, more specifically, inventive industries—to concentrate in low-amenity commuting zones. The coefficient on LQPatInd is similarly positive (and relatively large in magnitude) in the creative class patents equation, though not precisely estimated. Other explanatory variables associated with significant effects in the first structural equation include PatStock and UniResExp for the inventive class specification. Insignificance of UniResExp estimates in the SET, CC, and BA patents results is notable given the policy interest in public R&D funding.

Based on the rural results for the second structural equation, WageRentR negatively impacts size of all considered populations of knowledge workers. This could be reflective of a willingness of these workers to accept lower wages or higher rents in exchange for access to amenities (e.g., dining, shopping, entertainment). The apparent ability of the inventive class to better account for the contribution of rural workers in patent-intensive industries, as evidenced by the positive and significant coefficient on LQPatInd in the second structural equation, compared to the alternative subpopulations of knowledge workers highlights its potential value to studies of rural invention. Analysis of urban commuting zones (table 10), discussed below, points to regional variation in the inventive process and suggests the value of examining contributions from a wider spectrum of knowledge workers to get a complete picture of patent production.

As is the case for rural commuting zones, the subset of workers employed in SET occupations has the greatest impact on successful invention in urban commuting zones. However, the associated coefficient on SET in table 10 is only 25% larger than the coefficient on IC when the inventive class is substituted for SET workers as the population of interest (compared to the roughly twofold difference that exists in rural commuting zones). In contrast to the rural results, the SET effect is more precisely estimated than the IC effect in the urban case. The magnitudes of the effects associated with workers employed in the creative class and workers possessing a bachelor’s degree are approximately one-third and one-fourth, respectively, of those for the inventive class and SET workforce. However, the estimated impact of creative class membership on patent production in urban regions is not precisely estimated. It is estimated that additions of 14, 11, and 48 workers to the inventive class, SET workforce, and educated workforce, respectively (and individually), contribute to the production of one patent in urban commuting zones for the period of interest.

Based on the estimated coefficients from the second structural equation, the production of an additional patent is associated with the addition of 10 workers employed in SET occupations, 14 inventive class members, 45 creative class members, and 63 educated workers to an urban commuting zone. A positive association between ERS natural amenity rank and patent production is observed in the IC and SET results, while the estimated impact of LQPatInd in the first structural equation is large and consistently positive for all considered subpopulations of knowledge workers in the urban case.

The large, negative coefficients on LQPatInd across specifications in the second structural equation are initially perplexing. Correlation between LQPatInd and IC is notably low in urban regions (0.17) relative to rural regions (0.34). When the urban sample is further divided into commuting zones that contain and do not contain a global city, associated correlation coefficients are 0.37 for UCGC commuting zones and 0.04 for UWGC commuting zones. Scatter plots of inventive class membership by patent-intensive LQ are displayed in figure 1, figure 2, and figure 3 for rural, UWGC, and UCGC commuting zones, respectively. In each figure, commuting zone IDs from year 2000 are used in lieu of traditional scatter plot markers. Based on figure 3, the positive trend observed for urban commuting zones containing a global city appears to be driven by the commuting zones containing Chicago, IL (identified by commuting zone ID 58), San Jose, CA (218), and Los Angeles, CA (323). When these commuting zones are excluded from the group of UCGCs, correlation between LQPatInd and IC falls to -0.33.
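The sensitivity of a correlation coefficient to a few influential observations can be illustrated with a small sketch. The data below are entirely made up and are not the values plotted in figure 3; only the three excluded commuting zone IDs (58, 218, and 323) come from the text.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

# Hypothetical sample: 20 ordinary commuting zones plus 3 influential ones
base_lq = np.linspace(0.2, 1.5, 20)
base_ic = 100 - 30 * base_lq + 5 * np.sin(np.arange(20))  # mild negative trend

cz_id = np.array(list(range(1, 21)) + [58, 218, 323])
lq_pat_ind = np.concatenate([base_lq, [2.5, 3.0, 2.8]])
ic = np.concatenate([base_ic, [400.0, 500.0, 450.0]])

# Correlation on the full sample, then with the three influential zones dropped
r_all = pearson_r(lq_pat_ind, ic)
keep = ~np.isin(cz_id, [58, 218, 323])
r_trimmed = pearson_r(lq_pat_ind[keep], ic[keep])

print(f"all commuting zones: r = {r_all:.2f}")
print(f"excluding IDs 58, 218, 323: r = {r_trimmed:.2f}")
```

In this constructed example, three observations with both a high location quotient and a large inventive class are enough to turn a negative association in the remaining sample into a positive one overall.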

Inventive class membership, by patent-intensive location quotient—rural commuting zones: 2000

(Occupational employment and employment concentration)
Note(s):
Commuting zones (CZs) are identified in the figure by their Economic Research Service (ERS) 2000 CZ IDs (available at https://ers.usda.gov/data-products/commuting-zones-and-labor-market-areas). The 2000 ERS CZs are constructed using intercounty commuter flow data from the 2000 Census of Population. No naming protocol for CZs was developed, so the Commuting zone description names a statistical area contained in the CZ. Information on the constituent counties making up each Commuting zone ID is available at the website, along with documentation of how the CZs were constructed. The group of patent-intensive industries is characterized by the above-average utility patent intensities (i.e., patent-to-employee ratios) of its industries. Select three- and four-digit industries within the following North American Industry Classification System subsectors are included: computer and electronic product manufacturing; chemical manufacturing; electrical equipment, appliance, and component manufacturing; machinery manufacturing; and miscellaneous manufacturing (Department of Commerce 2012). A location quotient of more than 1.000 indicates a CZ has a higher share of patent-intensive industry employment than the nation as a whole.
Source(s):

Census Bureau, 2000 Census Special Tabulation; Bureau of Labor Statistics, Quarterly Census of Employment and Wages, 1998; Department of Commerce, 2012, Intellectual Property and the U.S. Economy: Industries in Focus, available at https://www.uspto.gov/sites/default/files/news/publications/IP_Report_March_2012.pdf.

Inventive class membership, by patent-intensive location quotient—urban commuting zones without a global city: 2000

(Occupational employment and employment concentration)