Overview: Abstracting information from imaging reports is traditionally done by trained researchers who review the report text and record the presence or absence of key words or findings. This standard abstraction method is laborious, time-consuming, and expensive. The alternative, natural language processing (NLP), requires special expertise to implement and tailor the algorithm. A third, more recent option is crowdsourcing through Amazon mTurk, a marketplace where people can sign up to work on human intelligence tasks (HITs).

This webinar reviewed three approaches for abstracting data from imaging reports based on the experiences of the Back pain Outcomes using Longitudinal Data (BOLD) Project. The webinar also discussed strategies for working with and evaluating the effectiveness of abstraction conducted by trained researchers and individuals in the Amazon mTurk marketplace.

The BOLD Project established a cohort of 5,239 senior patients with back pain recruited from primary care settings in health systems that are part of the HMO Research Network. The BOLD-Extension of Research (BOLDER) Project provided for an 18-month extension of the project. Several projects that are part of BOLDER require collecting not only quantitative variables, such as counts of particular CPT codes, but also information buried in text fields such as radiology reports. BOLDER will yield approximately 6,400 text-based spine imaging reports.

Download presentation slides here.


Learning Objectives: At the conclusion of the session, participants were able to:

  • Review the BOLD/BOLDER registry
  • Review the approaches for abstracting data from imaging reports
  • Describe the accuracy, cost, and time of Amazon mTurk and NLP compared with abstraction by trained researchers