Introduction
Research publications and presentations at conferences represent the main mechanisms for disseminating research findings. Presentations are represented in the published research literature as conference proceedings. Published literature is an indicator of scientific activity and global research partnerships. Additionally, analysis of how the published literature is cited provides insight into the impact of research output. Scientific publications are not merely an exercise of ivory tower academics but serve as a key linkage enabling public uses of scientific output (Yin et al. 2021).
This report presents data on research publication output by country, scientific field, international collaboration, and impact measures. The first section examines comparative country data on publication output across science and engineering (S&E) fields. It also includes two sidebars: (1) publications by members of underrepresented groups and the impact on the research and development (R&D) workforce, and (2) measuring cross-disciplinary publication output. The second section focuses on collaboration between researchers in the United States and those in other regions, countries, and economies by examining coauthoring and citation patterns. This section includes a sidebar on the 2020 coronavirus publication output and collaboration network. The third section provides an analysis of scientific impact as measured by citations in research publications.
The analysis reported here is based on counting publications and citations using bibliometric data in Scopus, a database of scientific literature with English-language titles and abstracts (Science-Metrix 2021a). Counting publications and citations as an indicator of research output has both benefits and limitations. A benefit is that this approach provides comparable information for analyzing research output across countries. A potential limitation is that country-specific incentive payments for academic publications are not considered (Franzoni, Scellato, and Stephan 2011). Additional limitations are that the counts do not measure the amount of research contained in each article or the contributions of associated data sets (Sugimoto and Larivière 2018).
There are two potential sources of bias in the data: the inclusion of non-peer-reviewed articles, and a bias toward English-speaking countries because Scopus requires articles to contain an English-language title and abstract. The first bias is mitigated by removing articles published in journals that lack substantive peer review, sometimes referred to as predatory journals (Grudniewicz et al. 2019). This filtering removed 1 million research articles and conference papers from 2008 to 2020 (see Technical Appendix). The potential bias toward English-speaking countries is more difficult to address. One step taken by Elsevier, the publisher of Scopus, is to expand coverage of publications from non-English-speaking countries. Specifically, Scopus increased its coverage of Chinese publications by 624% from 1996 to 2020 (see Technical Appendix).
The analysis covers over 44 million articles published from 1996 to 2020. The articles include conference papers and research articles (collectively referred to as articles) published in conference proceedings and peer-reviewed scientific and technical journals. The articles exclude editorials, errata, letters, and other material that does not typically present new scientific data, theories, methods, apparatuses, or experiments. The articles also exclude working papers and preprints, which generally have not yet been peer reviewed.
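The selection rule described above can be sketched as a simple filter. This is a minimal illustration only: the record layout, the field names doc_type and year, and the document-type labels are hypothetical and are not taken from Scopus or the Technical Appendix.

```python
# Document types kept for analysis; labels are illustrative assumptions.
INCLUDED_TYPES = {"research article", "conference paper"}

FIRST_YEAR, LAST_YEAR = 1996, 2020  # period covered by the analysis


def select_articles(records):
    """Keep research articles and conference papers published in the period."""
    return [
        r for r in records
        if r["doc_type"] in INCLUDED_TYPES
        and FIRST_YEAR <= r["year"] <= LAST_YEAR
    ]


# Hypothetical records, for illustration only.
records = [
    {"id": 1, "doc_type": "research article", "year": 2005},
    {"id": 2, "doc_type": "editorial", "year": 2005},
    {"id": 3, "doc_type": "conference paper", "year": 2020},
    {"id": 4, "doc_type": "preprint", "year": 2019},
    {"id": 5, "doc_type": "research article", "year": 1995},
]

selected_ids = [r["id"] for r in select_articles(records)]
```

In this sketch the editorial and the preprint are dropped by type, and the 1995 article is dropped by year. The removal of journals lacking substantive peer review would be a separate step and is not modeled here.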
Articles with authors working in multiple countries are used both for counting publication output by country and for determining international collaborations. The country is determined by the institutional address of each author as listed in the article. For counting country output, each country receives a fractional count based on the proportion of the article's authors with institutional addresses in that country. For determining international collaboration, each country or region represented by one or more authors is counted once. Because whole counting is used for international collaboration and fractional counting is used for publication output, those values are not directly comparable.
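The difference between the two counting methods can be illustrated with a short sketch. It assumes that fractional credit is split in proportion to the number of authors from each country; the function names and the three sample articles are hypothetical and serve only to show why the two totals cannot be compared directly.

```python
from collections import Counter, defaultdict
from fractions import Fraction


def fractional_counts(articles):
    """Credit each country with its share of each article's authors."""
    totals = defaultdict(Fraction)
    for author_countries in articles:
        n_authors = len(author_countries)
        for country, n in Counter(author_countries).items():
            totals[country] += Fraction(n, n_authors)
    return dict(totals)


def whole_counts(articles):
    """Count a country once per article on which it has at least one author."""
    totals = Counter()
    for author_countries in articles:
        totals.update(set(author_countries))
    return dict(totals)


# Hypothetical data: one list of author countries per article.
articles = [
    ["US", "US", "CN"],  # two U.S. authors, one author in China
    ["US", "DE"],
    ["CN"],
]

frac = fractional_counts(articles)
whole = whole_counts(articles)
```

With these three articles, the fractional counts sum to 3, the number of articles, while the whole counts sum to 5 because each coauthored article is counted once for every participating country.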
Assignment of articles to S&E fields uses the 14 fields of science in the National Center for Science and Engineering Statistics (NCSES) Taxonomy of Disciplines (TOD) (Science-Metrix 2019). The categorization is done by first assigning the journal to one of the 176 subfields in the Science-Metrix classification and then to the TOD. This approach works well for most journals and fields; for example, all of dentistry gets assigned to health sciences. Challenges arise for subfields that are more general, such as energy, and multidisciplinary journals, such as Science or Nature. For these fields and journals, classification occurs at the article level based on an algorithm using author affiliations, the names of journals referenced in the bibliography, the titles of the references, the publication’s abstract, the publication’s author-defined keywords, the publication’s title, and the scientific field of references.
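The two-stage assignment described above can be sketched as a lookup with a fallback. This is an illustration under stated assumptions, not the Science-Metrix implementation: the journal names and table excerpts are hypothetical, and the article-level algorithm is represented by a placeholder function because its details are not given here.

```python
from typing import Callable, Dict

# Hypothetical excerpts of the two lookup tables (journal -> subfield -> TOD field).
JOURNAL_TO_SUBFIELD: Dict[str, str] = {
    "Example Dental Journal": "dentistry",
    "Example Energy Journal": "energy",
}
SUBFIELD_TO_TOD: Dict[str, str] = {
    "dentistry": "health sciences",
}

# Cases named in the text as needing article-level classification.
GENERAL_SUBFIELDS = {"energy"}
MULTIDISCIPLINARY_JOURNALS = {"Science", "Nature"}


def classify(article: dict, article_level: Callable[[dict], str]) -> str:
    """Assign a TOD field at the journal level, else fall back to the article level."""
    journal = article["journal"]
    if journal not in MULTIDISCIPLINARY_JOURNALS:
        subfield = JOURNAL_TO_SUBFIELD[journal]
        if subfield not in GENERAL_SUBFIELDS:
            return SUBFIELD_TO_TOD[subfield]
    return article_level(article)


def placeholder_classifier(article: dict) -> str:
    # Stand-in only: the real algorithm uses affiliations, references,
    # abstract, keywords, and title to choose a field.
    return "decided at article level"


dental = classify({"journal": "Example Dental Journal"}, placeholder_classifier)
energy = classify({"journal": "Example Energy Journal"}, placeholder_classifier)
multi = classify({"journal": "Nature"}, placeholder_classifier)
```

Passing the article-level classifier as a parameter keeps the journal-level lookup separate from the algorithm used for general subfields and multidisciplinary journals.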
Publications-related data are best viewed as trends. Year-to-year differences are often not indicative of a pattern due to the process by which the information is indexed in Scopus. Additional details regarding document selection, limitations, and sources of bias are available in the Technical Appendix.