Across the United States health research enterprise, a new dawn has broken for federally funded investigations, one that promises to accelerate the impact of the data and information gained from this work. Driven by the interest of Executive Branch and Congressional leaders, new policies now require researchers to submit data management and sharing plans with their new research proposals. The requirement is intended to allow more rapid translation of project results, support the validation of research findings, and accelerate access to high-value large data sets. Dr. Lawrence Tabak, in leading the NIH implementation of the policy, recently pointed to the extensive planning and preparations that have been underway since the NIH Data Management and Sharing (DMS) Policy was issued on October 29, 2020. NIH has been working with the research community for many years in preparation for this important step, and a key element of that preparation is the NIH common data resource.
With the policy requirements firmly in place, perspectives from the research community offer insight into how investigators are adapting to the new practices and what to expect from them down the road. An important aspect of implementing the policy is the use of tools that help researchers comply with these federal requirements and leverage technology to maximize the intended benefit. We met up with Maria Praetzellis of the California Digital Library, who is the Product Manager for Research Data Management at the University of California Office of the President, where she oversees a free, open-source platform for researchers to create and manage their data management plans. DMPTool, funded by the National Science Foundation, is a platform geared toward creating machine-actionable content about research projects that can be applied in support of data sharing across the enterprise. She noted that the tool and its coming enhancements are emerging from a community of practice in which research institutions across the country are working together to embrace DMS plans in a meaningful and consistent way. “From our vantage point, we see this policy-oriented research serving many purposes to facilitate sharing of ideas and leading researchers toward projects with more significant impact than before.” Aside from DMPTool, researchers and their institutions have numerous options for tools that enable automated knowledge transfer, and more resources will likely emerge to support data management and help researchers share ideas and collaborate more easily.
To learn more about the researcher experience with data sharing, we spoke with Dr. Melissa Haendel, Director of the Translational and Integrative Science Lab (TISLab) and the Center for Data to Health at the University of Colorado Anschutz School of Medicine. We asked her to share her insights on data sharing and, in particular, to highlight her leadership of the National COVID Cohort Collaborative (N3C).
In looking at the future impact of these policies, Dr. Haendel notes that “there are a lot of informatics methods that rely on having data en masse from many sites in order to ensure ethical AI and improved bias detection, data quality and data repair, as well as methods that enable analytics for rare events or conditions. Many of these initiatives are not currently achievable with the current data availability; further, even given the data we do have, the lack of consistency and even awareness of lack of consistency mean that many methods are simply limited for lack of ability to examine and validate across many sites and sources. We will see a lot more knowledge integration with raw data.” In addition, we spoke with Dr. Haendel about the ways that researchers will organize data in the future. She notes that “ontologies that are traditionally representative of cell or disease classification will be further refined with more multi-modal data types such as single-cell RNAseq data or multi-omics data. This will support rapid advancement of precision medicine. Context is everything however, and metadata will be critical for combining such data and knowledge sources. In the end, researchers will have increasing authority over their own data, how it is used and the ways in which they can themselves interrogate the data.”
Some researchers and health policy advocates have raised concerns about the potential hazards of sharing protected health information and the untoward consequences that could follow. Dr. Haendel addresses these concerns directly: “I believe that the N3C has demonstrated that HIPAA-limited data can be shared effectively under a secure and well-governed structure. While many distributed networks exist for clinical data whereby the data resides behind institutional firewalls, we have very much revealed the lack of comparability and consistency of data even within distributed networks. Nothing beats actually having the data in hand. Furthermore, with the right governance, you can limit access and use these data exclusively for data quality and harmonization activities, giving improved data.”
Data, as the centerpiece of discovery and the translation of ideas, can change the trajectory of an investigator’s career. The DMS policy may also serve as a key pivot point in how academic careers are advanced overall. These funding requirements are likely to bring us closer to the day when academic careers are recognized and promoted on the basis of dataset value rather than solely on the basis of published manuscripts. In addition, while the NIH DMS policy is currently limited to this agency, the long-standing NIH public access policy enacted in 2013 for open sharing of publications has recently been expanded to cover all federally funded research, now without an embargo or restrictions on open publication. Looking broadly at federal research policies, the cumulative effects of these policies on the research enterprise will continue to open the door to greater opportunities to address key questions and fuel more breakthroughs in science, medicine and health.
Read Downing’s previous post on NIH data policy changes here.