
Due to the overwhelming response from researchers at the AcademyHealth Annual Research Meeting (ARM), CareSet is delighted to extend its special DocGraph data offer for ARM 2023 attendees until September 15th!

Our national Medicare referral data shows:

  • referrals and collaborations between health care providers, by showing the number of patients they share
  • direction of the referral relationships

Access for ARM attendees:

While 2014-2020 data is available to all researchers free of charge, ARM attendees also get non-commercial access to the most recent years, which are not publicly available:

  • Free access to the 2021 dataset.
  • A discounted rate for the 2022 dataset.

ARM researchers will receive a non-commercial license to use the data in their studies.

Search your AcademyHealth conference follow-up emails for the phrase “Parting Gift for ARM Attendees” to retrieve the discount code. Then request the data here!

About DocGraph data

When a Medicare patient visits a health care provider, the provider must submit a claim to the Centers for Medicare & Medicaid Services (CMS). This claim includes the provider's NPI (National Provider Identifier) as well as information about the patient and the care provided. By analyzing the millions of claims submitted across the nation each year, we can draw insights about patient flows.

DocGraph is structured as a directed graph showing implied referrals: patients shared over time between two providers. For instance, if 20 patients first visited Dr. Ashley, and then a month later they all saw Dr. Patch, our dataset would indicate 20 referrals from Dr. Ashley to Dr. Patch. This contrasts with explicit referrals, which depend on the “referring provider” field on claims. That field is often left blank, which makes explicit referrals unsuitable for wide-scale analyses. Implied referrals offer a clearer picture of how providers collaborate.
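To make the idea of an implied referral concrete, here is a minimal Python sketch that reproduces the Dr. Ashley and Dr. Patch example. It is an illustration only and not CareSet's actual derivation, which is described in the documentation that ships with the dataset. The function name `implied_referrals`, the `(patient_id, provider, day)` input layout, and the rule of counting only consecutive visits are all assumptions made for this example.

```python
from collections import defaultdict


def implied_referrals(visits):
    """Count distinct patients per directed provider pair.

    visits: iterable of (patient_id, provider, day) tuples (hypothetical layout).
    Returns {(from_provider, to_provider): number_of_distinct_patients}.
    """
    # Group each patient's visits together.
    by_patient = defaultdict(list)
    for patient_id, provider, day in visits:
        by_patient[patient_id].append((day, provider))

    # For each patient, walk the visits in chronological order and record
    # every move from one provider to a different provider.
    patients_per_pair = defaultdict(set)
    for patient_id, seen in by_patient.items():
        seen.sort()
        for (_, first), (_, second) in zip(seen, seen[1:]):
            if first != second:
                patients_per_pair[(first, second)].add(patient_id)

    return {pair: len(patients) for pair, patients in patients_per_pair.items()}


# The example from the text: 20 patients see Dr. Ashley, then Dr. Patch a month later.
visits = []
for patient_id in range(20):
    visits.append((patient_id, "Dr. Ashley", 0))
    visits.append((patient_id, "Dr. Patch", 30))

counts = implied_referrals(visits)
print(counts)
```

Because the pair is keyed as (first provider, second provider), the result is directional: these visits produce an edge from Dr. Ashley to Dr. Patch and none in the opposite direction.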

To protect patient privacy and comply with CMS cell suppression policies, the data includes only provider pairs with patient counts over 10. More details about how the referrals are derived are provided in the documentation that comes with the dataset. Data fields are as follows:

from_npi* - The provider seen first in sequence

to_npi - The provider seen second in sequence

patient_count - The total number of patients shared between the two providers over the time period

transaction_count - The count of times that a patient switched between the two providers, in the from-to direction. 

average_day_wait - The average number of days it took for a patient to switch to the second provider after having seen the first provider.

std_day_wait - The standard deviation of days it took for a patient to switch to the second provider

*“NPI” stands for National Provider Identifier, a unique identifier assigned to a person or institution that bills for health care services in the United States.
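As a rough illustration of working with these fields, the following Python sketch reads rows into typed records using only the standard library. It assumes the data arrives as a CSV file whose header uses the field names listed above; the actual delivery format may differ, so check the dataset documentation. The `Edge` class, the `read_edges` helper, and the NPIs and values in `SAMPLE` are invented for this example.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """One directed provider pair (hypothetical record type for this sketch)."""
    from_npi: str            # provider seen first in sequence
    to_npi: str              # provider seen second in sequence
    patient_count: int
    transaction_count: int
    average_day_wait: float
    std_day_wait: float


def read_edges(handle):
    """Parse rows from an open CSV file handle into Edge records."""
    return [
        Edge(
            from_npi=row["from_npi"],
            to_npi=row["to_npi"],
            patient_count=int(row["patient_count"]),
            transaction_count=int(row["transaction_count"]),
            average_day_wait=float(row["average_day_wait"]),
            std_day_wait=float(row["std_day_wait"]),
        )
        for row in csv.DictReader(handle)
    ]


# Made-up sample rows; the NPIs and numbers are not real data.
SAMPLE = """from_npi,to_npi,patient_count,transaction_count,average_day_wait,std_day_wait
1000000001,1000000002,20,25,30.0,4.5
1000000002,1000000001,12,12,14.2,3.1
"""

edges = read_edges(io.StringIO(SAMPLE))
for edge in edges:
    print(edge.from_npi, "->", edge.to_npi, edge.patient_count)
```

NPIs are kept as strings here because they are identifiers rather than quantities. With a real file, `read_edges(open(path, newline=""))` would replace the in-memory sample.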

Utilizing DocGraph Data

Although a commercial license is available for this data, we always ensure that researchers have free or affordable access to a non-commercial license.

  • Journalists use our data to investigate fraud, waste, and abuse in the health system. When researching a provider engaging in potentially unethical behavior, our data can reveal referral relationships or inappropriate partnerships that might put patients at risk. It also informs journalists about the associates in a provider’s network who may be interviewed in the investigation.
  • Researchers use DocGraph as both supplemental and primary data for their projects. They examine how provider networks relate to insurance plans, social determinants of health (SDOH), data interoperability, and cost and quality outcomes, as well as the impact of COVID-19 on provider networks. Recent studies citing our data have appeared in Health Services Research, BMJ Open, Medical Care, and the Journal of General Internal Medicine, and in work from the Urban Institute. Authors come from HHS ONC, University of Colorado School of Medicine, Vanderbilt University Medical Center, University of Southern California, and more.
  • Other potential research avenues leveraging this dataset include patient pathway analysis, network graph evaluations, geospatial studies, analysis of access to care, and preventive epidemiological modeling. If you're delving into research concerning patient location or movement patterns, this data is especially useful.
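As one small example of the network graph evaluations mentioned above, the sketch below totals inbound patient counts per receiving provider, a weighted in-degree. The `(from_npi, to_npi, patient_count)` tuple layout, the function name `inbound_patient_totals`, and the placeholder provider labels are assumptions for illustration, not part of the dataset.

```python
from collections import Counter


def inbound_patient_totals(edges):
    """Sum patient_count over all edges pointing at each provider.

    edges: iterable of (from_npi, to_npi, patient_count) tuples (hypothetical layout).
    """
    totals = Counter()
    for from_npi, to_npi, patient_count in edges:
        totals[to_npi] += patient_count
    return totals


# Made-up edges using placeholder labels instead of real NPIs.
edges = [
    ("A", "C", 20),
    ("B", "C", 15),
    ("C", "D", 11),
]

totals = inbound_patient_totals(edges)
top = totals.most_common(1)
print(top)
```

Note that a patient shared along several edges is counted once per edge, so these totals measure inbound referral volume rather than unique patients.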

What other Medicare or Medicaid data would be valuable for your research?

We want your feedback as we design and release other public use files (PUFs)! If your research could benefit from Medicare or Medicaid data that is not currently available from CMS, or is cost prohibitive from other data vendors, please fill out this form and let us know. You may also join our PUF mailing list (Google account sign-in required) to hear about data releases. We hope to hear from you!

Our Commitment to Health Care Transparency

We want our data to help researchers study the structure of the health care system and how provider and institutional behaviors impact patient outcomes.

We’ve released Medicare referral data since 2012, when we first obtained it through a journalistic FOIA request. Later we gained direct access to the source data, allowing us to oversee the creation and maintenance of this PUF, previously managed by CMS.

We are committed to offering free and affordable datasets and are always open to collaborating with those in need of financial support. As health care data journalists, data liberators, and researchers, our priority is to open recent and longitudinal health care data to fuel high quality health care research and journalism.


Alma Trotter

Data Policy Manager and Healthcare Data Journalist, CareSet

