A list of demonstration projects is below. These projects were supported through two funding mechanisms by way of Congress and the Office of Management and Budget. This list will be updated as projects are initiated or completed.
Foreign Born Scientists and Engineers in the Workforce (Active)
An evidence-building project to understand the availability of and demand for global science and engineering training and talent. This project explores data sources and linkage and conducts analyses to investigate return on investment for science and engineering talent trained in the United States.
Privacy Preserving Technologies Phase 1: Environmental Scan (Active)
An environmental scan of organizations, both within and outside of government, to understand the current landscape of privacy preserving technologies for the protection of persons, data, and systems that contribute to the use of confidential data, including both individual-level and business data, for evidence building and policymaking.
Data Protection Toolkit Use Case Analysis (Active)
This project conducts a use case analysis of the Federal Committee on Statistical Methodology's Data Protection Toolkit. This use case analysis will identify successful use cases and potential enhancements to the Toolkit for enabling access to federal data assets while protecting confidentiality.
National Center for Health Statistics: National Vital Statistics System Modernization—New Opportunities for Interoperable Data (Active)
This project highlights new opportunities for the use of interoperable health data to support timely research and public health surveillance. The results will inform planning for both the National Vital Statistics System (NVSS) and a potential future National Secure Data Service (NSDS), as well as the broader data and evidence ecosystem, by highlighting the possible applications for interoperable data, the level of data access needed for these uses, and the related privacy and confidentiality implications.
Secure Compute Environment Scan (SCE) (Active)
An environmental scan of secure computing environments within the federal government as they relate to data sharing and to privacy and confidentiality restrictions within multiple statutes. This scan will include a compilation of legal requirements under different statutes for data protection and security in support of IT infrastructure to store and use confidential data.
Utilizing Privacy Preserving Record Linkage to Link Data from Two Federal Statistical Agencies (National Center for Health Statistics, National Center for Science and Engineering Statistics) (Active)
Development of a proof of concept to deploy a privacy preserving record linkage tool to link and share data from two federal statistical agencies. This tool will inform data linkage and sharing for a National Secure Data Service.
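As a rough illustration of the general idea behind privacy preserving record linkage, the sketch below has each agency turn normalized identifiers into keyed-hash tokens so that only tokens, not names or birth dates, are compared. It is a minimal sketch, not the tool used in this project: the field names (`name`, `dob`, `id`), the shared secret, and the exact-match rule are all assumptions made for illustration.

```python
import hashlib
import hmac


def normalize(value: str) -> str:
    """Lowercase the value and collapse whitespace so trivial differences do not block a match."""
    return " ".join(value.strip().lower().split())


def make_token(secret: bytes, name: str, dob: str) -> str:
    """Build a keyed hash (HMAC-SHA256) of the normalized identifiers."""
    message = f"{normalize(name)}|{normalize(dob)}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def tokenize_records(records, secret: bytes) -> dict:
    """Run separately inside each agency: map token -> local record id (hypothetical fields)."""
    return {make_token(secret, r["name"], r["dob"]): r["id"] for r in records}


def link_tokens(tokens_a: dict, tokens_b: dict) -> list:
    """Run by the linking party: pair record ids whose tokens match exactly."""
    shared = tokens_a.keys() & tokens_b.keys()
    return sorted((tokens_a[t], tokens_b[t]) for t in shared)


if __name__ == "__main__":
    secret = b"example-shared-secret"  # assumption: a key agreed on by both agencies
    agency_a = [
        {"id": "A1", "name": "Ada Lovelace", "dob": "1815-12-10"},
        {"id": "A2", "name": "Alan Turing", "dob": "1912-06-23"},
    ]
    agency_b = [
        {"id": "B7", "name": "  ada  LOVELACE ", "dob": "1815-12-10"},
        {"id": "B8", "name": "Grace Hopper", "dob": "1906-12-09"},
    ]
    pairs = link_tokens(tokenize_records(agency_a, secret), tokenize_records(agency_b, secret))
    print(pairs)
```

Exact-match tokens like these do not tolerate spelling errors; production tools typically add error-tolerant encodings and stronger key management, which this sketch leaves out.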
Evaluation of Noise Infusion for Large-Scale Demographic Sample Survey (Survey of Doctorate Recipients) (Active)
Evaluation of the use of noise infusion for a demographic survey as a possible privacy-preserving method in the use of federally confidential data. This research will inform the National Secure Data Service Demonstration project by exploring noise infusion for a sample survey and assessing the disclosure protections and quality considerations for the resulting estimates.
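To make the trade-off between disclosure protection and estimate quality concrete, the sketch below perturbs each respondent's value by a random multiplicative factor and compares a weighted estimate before and after. It is a minimal sketch under stated assumptions: the 5% to 10% distortion band, the function names, and the example values are hypothetical and are not taken from the project or from the Survey of Doctorate Recipients.

```python
import random


def infuse_noise(values, low=0.05, high=0.10, seed=None):
    """Multiply each value by (1 +/- m), with m drawn uniformly from [low, high].

    Keeping m away from zero means every record is distorted by at least `low`.
    """
    rng = random.Random(seed)
    noisy = []
    for v in values:
        magnitude = rng.uniform(low, high)
        sign = rng.choice((-1, 1))
        noisy.append(v * (1 + sign * magnitude))
    return noisy


def weighted_mean(values, weights):
    """Survey-weighted mean of the values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def relative_error(true_value, noisy_value):
    """Absolute relative difference between the noisy and the original estimate."""
    return abs(noisy_value - true_value) / abs(true_value)


if __name__ == "__main__":
    salaries = [72000.0, 95000.0, 110000.0, 64000.0, 130000.0]  # made-up values
    weights = [120.0, 80.0, 60.0, 150.0, 40.0]                  # made-up survey weights
    noisy_salaries = infuse_noise(salaries, seed=2024)
    original = weighted_mean(salaries, weights)
    perturbed = weighted_mean(noisy_salaries, weights)
    print(f"original estimate:  {original:.2f}")
    print(f"perturbed estimate: {perturbed:.2f}")
    print(f"relative error:     {relative_error(original, perturbed):.4f}")
```

Because individual distortions partly cancel in an aggregate, the estimate usually moves less than any single record does; an evaluation of the kind described above would measure that effect across many estimates and subgroups.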
Models for a Data Concierge Service for a National Secure Data Service (Active)
An exploration of models for a data concierge service that can offer technical assistance to individuals seeking access to federal data within a National Secure Data Service, including answering general questions, identifying confidential data assets that meet the individual’s evidence-building needs, and supporting the development of evidence-building proposals to apply for access to confidential data.
Utilizing Privacy Preserving Record Linkage with Parent Agency Data and Statistical Agency Data to Inform Programs and Policies (Active)
An investigation of the use of privacy preserving record linkage methods to connect survey and administrative data in a secure environment as a guide for future interagency data sharing and linking initiatives. This project will also demonstrate development of a data sharing agreement between a federal statistical agency and its parent agency.
Creating and Validating Synthetic Data (National Center for Science and Engineering Statistics, Census Bureau; Annual Business Survey) (Active)
A test and comparison of methods for creating synthetic data to support a tiered access model, exploring the use of synthetic data for evidence building, and testing the use of verification metrics in validating estimates produced from synthetic data. This research will help to inform the National Secure Data Service Demonstration project and support tiered access through the development of a publicly available dataset that can be accessed without the current barriers to restricted use data.
Federated Data Usage Platform (DUP) (Active)
Research to develop a robust and sustainable framework to enable the federal data ecosystem to better understand the uses of federal data in support of a potential future National Secure Data Service (NSDS). This research will identify options for a future state-of-the-art, updatable, publicly accessible platform that provides information on federal data usage as part of the NSDS Demonstration project.
Creating and Validating Synthetic Data (National Center for Science and Engineering Statistics; Survey of Earned Doctorates) (Active)
A test and comparison of methods for creating synthetic data to support a tiered access model, exploring the use of synthetic data with a demographic survey for evidence building, and testing the use of verification metrics in validating estimates produced from synthetic data. This research will help to inform the National Secure Data Service Demonstration project and support tiered access through the development of a publicly available dataset that can be accessed without the current barriers to restricted use data.
Expanding Equitable Access to Restricted Use Data through Federal Statistical Research Data Centers (FSRDCs) (Active)
An environmental scan of user demand and strategies for expanding access to the restricted use data made available through Federal Statistical Research Data Centers (FSRDCs) beyond the traditional base of users at high research activity (R1) universities. The results of this project will inform planning for both the FSRDCs and a potential future National Secure Data Service by providing critical evidence on unmet needs, service gaps, and resource demands required to expand access to restricted use data to a wider community for evidence-based decision-making.
Informing Evidence-Building Capacity among State, Local, Territorial, and Tribal Governments within a National Secure Data Service (Active)
An exploration of how a potential future National Secure Data Service (NSDS) could support capacity building for research and data science among state, local, territorial, and tribal governments. A gap analysis will be conducted to determine which needs are not currently being met, and recommendations will be made on how these needs could be incorporated into a broader data concierge service model within a future NSDS.
Secure Compute Environment for National Secure Data Service Demonstration Project (Active)
A project to design and build a secure compute environment that will be leveraged as part of an overall effort to build a linkage and access infrastructure to support the National Secure Data Service Demonstration project. This compute environment will increase abilities to process and analyze data, maintain data security, and expand research access while also allowing privacy-preserving technologies to be tested as required under Section 10375 of the CHIPS and Science Act.
Data Access Alternatives: Artificial Intelligence Supported Interfaces (Active)
An effort to create and test machine-learning-backed or “artificial intelligence” (AI)-backed user experiences with federal statistical data. This project aims to improve user interactions that involve obtaining answers to questions via search engines or e-mailing federal staff or contractors. This project seeks to develop and pilot an AI chatbot (or the like) that answers users' text queries submitted via an interface.
Building Capacity for State, Local, and Territorial Governments to Use Administrative Data for Evidence-Building (Active)
An exploration of how a potential, future National Secure Data Service could support state, local, and territorial capacity building through development of an interface and roadmap to enable repeatable state and local data analysis that may inform state and federal policies.
Data Integration to Estimate STEM Attrition and Workforce Supply: A Pilot Approach (Active)
A project aimed at improving the understanding of the impact of science, technology, engineering, and mathematics (STEM) attrition that occurs within educational and workforce pathways on the supply of STEM talent required to meet future workforce demand.
Engaging Policy Stakeholders to Inform a Future National Secure Data Service (Active)
An exploration of the development of a National Secure Data Service framework to enable federal policy stakeholders to use data efficiently and effectively for informed, evidence-based decisions.
Artificial Intelligence–Ready Data Products to Facilitate Discovery and Use (Active)
An exploration of how a future National Secure Data Service could provide shared information and tools for making statistical data products more readily ingestible by artificial intelligence technologies.
Secure Compute Environment Testbed for a National Secure Data Service (Active)
A design and build of a secure compute environment testbed that will be leveraged as part of an overall effort to test linkage and access infrastructure to support the National Secure Data Service Demonstration project.
Synthetic Data Generation with Large, Real-World Data (Active)
A project to improve the understanding of how synthetic data generators work with large real-world data (e.g., data sets with over 30 billion rows of data) to inform a toolkit that generates synthetic data.
Artificial Intelligence for Enhancing Data Quality, Standardization, and Integration for Federal Statistics (Active)
An exploration of the development of a set of data processing tools that use artificial intelligence to enhance data standardization and integration activities that are central to the data quality requirements for the federal statistical system and beyond.