Become a Member

Access exclusive resources, grow your professional network, and help shape the future of health services research.

Join Today
Attend the Science of D&I Conference

This year’s theme addresses existing and future efforts to maximize the benefits of D&I science.

Learn More
- Events
- Past Events
Join Our Advocacy Efforts

Explore our policy priorities and learn how you can engage with lawmakers and decision-makers to improve health and health care for all.

Learn More
Blog
Loss of Public Data Compromises Scientific Integrity

Our data series explores the crucial role of data for upholding the principles of scientific integrity.

Learn More
Staying Ahead in a Time of Change

AcademyHealth is helping researchers navigate the rapidly shifting policy landscape with timely Situation Reports and exclusive virtual events featuring expert insights.

Learn More

Professional Development

EDM Forum - Big Data and Big Crowds: Getting Useful Data from Text Fields Using Large Data Sets

Overview: Abstracting information from imaging reports is traditionally done by trained researchers who review the report text and record the presence or absence of key words or findings. This standard method is laborious, time-consuming, and expensive. The alternative, natural language processing (NLP), requires special expertise to implement and tailor the algorithm. A third, more recent option is crowdsourcing through Amazon Mechanical Turk (mTurk), a marketplace where people can sign up to work on human intelligence tasks (HITs).

This webinar reviewed three approaches for abstracting data from imaging reports based on the experiences of the Back pain Outcomes using Longitudinal Data (BOLD) Project. It also discussed strategies for working with trained researchers and with workers in the Amazon mTurk marketplace, and for evaluating the effectiveness of the abstraction each group performs.
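As an illustration of what evaluating crowd abstraction against trained researchers might look like, the sketch below combines several workers' yes/no answers per report by majority vote and compares the result with a trained researcher's answer, treated here as the reference standard. This is not the BOLD Project's method; the report IDs, worker answers, and function names are hypothetical and exist only for this example.

```python
from collections import Counter

def majority_vote(labels):
    """Combine boolean worker answers; a tie counts as 'finding absent'."""
    counts = Counter(labels)
    return counts[True] > counts[False]

def sensitivity_specificity(predicted, reference):
    """Compare predicted labels with reference labels, position by position."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)          # finding correctly recorded
    tn = sum(1 for p, r in pairs if not p and not r)  # absence correctly recorded
    fp = sum(1 for p, r in pairs if p and not r)      # finding recorded in error
    fn = sum(1 for p, r in pairs if not p and r)      # finding missed
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec

# Hypothetical data: three workers each answer "Is the finding present?"
worker_labels = {
    "report_1": [True, True, False],
    "report_2": [False, False, False],
    "report_3": [True, False, False],
    "report_4": [True, True, True],
}
# Hypothetical reference standard from a trained researcher.
reference = {
    "report_1": True,
    "report_2": False,
    "report_3": True,
    "report_4": True,
}

report_ids = sorted(worker_labels)
crowd = [majority_vote(worker_labels[r]) for r in report_ids]
truth = [reference[r] for r in report_ids]
sens, spec = sensitivity_specificity(crowd, truth)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

The same comparison could be run on NLP output by substituting its labels for `crowd`. Cost and time per report would need to be tracked separately.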

The BOLD Project established a cohort of 5,239 senior patients with back pain recruited from primary care settings in health systems that are part of the HMO Research Network. The BOLD-Extension of Research (BOLDER) Project provided for an 18-month extension of the project. Several BOLDER projects require collecting not only quantitative variables, such as counts of particular CPT codes, but also information buried in text fields such as radiology reports. BOLDER will yield approximately 6,400 text-based spine imaging reports.

Free

101

Learning Objectives: At the conclusion of the session, participants were able to:

- Review the BOLD/BOLDER registry
- Review the approaches for abstracting data from imaging reports
- Describe the accuracy, cost, and time of Amazon mTurk and NLP compared with abstraction by trained researchers