Improving Health Equity in Medicaid: Addressing Data Quality and Methodology

This post is the fifth in the Health Equity Meeting Blog Series, summarizing the discussion by a panel of experts who addressed our current methods of analyzing Medicaid data for improving health equity.

Conversations and analysis surrounding health equity require a better understanding of current disparities in health care outcomes and delivery. Using the Institute of Medicine’s (IOM) definition of health disparities as a guidepost, any difference in treatment not justified by underlying health conditions or patient preference is considered a disparity. When examining data, controls and methods of analysis must be employed in order to accurately measure health disparities.

“It’s all too easy to look at a paper and jump to the final result, but not realize that exactly what controls were used could really change the nature of the finding,” said Dr. Kosali Simon, Distinguished Professor at Indiana University, who moderated a session entitled “Methodological Challenges Associated with Improving Health Equity in Medicaid” on this topic.

Methodological challenges are only exacerbated when analyzing Medicaid health equity data, as there are differences in reporting requirements across states. Data analysis of health disparities requires workarounds and estimations to account for the quality of data available.

Improvements are Needed in Collection and Reporting of Race and Ethnicity Data

Medicaid’s data system, recently updated to the Transformed Medicaid Statistical Information System (T-MSIS), assists researchers trying to answer health equity-related questions. However, accurately measuring race and ethnicity using existing data systems is complex. Data for T-MSIS are collected, processed, and reported by states, so the completeness and quality of race, ethnicity, and language (REL) data vary by state. Understanding how to measure REL accurately is essential for developing and monitoring solutions to eliminate racial disparities in health and health care.

Indirectly Estimating Race is a Workaround for Poor REL Data

“As people have mentioned before, the data are pretty imperfect and not as good as we would hope it to be,” said Dr. Carol Irvin, Senior Fellow at Mathematica. To work around problems in completeness and quality of REL data, researchers have developed indirect estimators of race and ethnicity. For example, the Bayesian Improved Surname Geocoding (BISG) method was developed by RAND to predict the race/ethnicity of an individual using the individual’s surname and geocoded location. BISG can also be enhanced, for example with regression models that use age, sex, and other individual-level characteristics to predict race/ethnicity. These characteristics can improve prediction because the distribution of race/ethnicity differs across them.
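To make the Bayesian step concrete, here is a minimal Python sketch of the core BISG update. It assumes two inputs for one person: a surname-based probability for each group, and the share of each group's population that lives in the person's geocoded area. The function name, the group labels, and every number are hypothetical and chosen only for illustration; they do not come from RAND's implementation or from any Medicaid data.

```python
def bisg_posterior(surname_probs, geo_share_by_race):
    """Combine surname and geography evidence with Bayes' rule.

    surname_probs: P(race | surname), one entry per group.
    geo_share_by_race: P(area | race), the share of each group's
        population living in the person's geocoded area.
    Returns P(race | surname, area), normalized to sum to 1.
    """
    unnormalized = {
        race: surname_probs[race] * geo_share_by_race[race]
        for race in surname_probs
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("No group has nonzero probability for these inputs.")
    return {race: value / total for race, value in unnormalized.items()}


# Hypothetical inputs for a single person (illustrative values only).
surname_probs = {"white": 0.05, "black": 0.01, "hispanic": 0.92, "other": 0.02}
geo_share_by_race = {"white": 0.0001, "black": 0.0002, "hispanic": 0.0004, "other": 0.0001}

posterior = bisg_posterior(surname_probs, geo_share_by_race)
most_likely = max(posterior, key=posterior.get)
```

The result is a probability for each group rather than a single label, so analyses can weight individuals by these probabilities instead of assigning each person to one category.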

Dr. John L. Czajka, formerly Senior Fellow at Mathematica, is working on a pilot project to develop and evaluate BISG estimates of race and ethnicity for Medicaid. Dr. Czajka found that for this population generally, overall accuracy was higher with BISG alone than BISG with regression analysis. However, a problem with indirect estimates at the state level is that there is considerable variation in predictive accuracy across the states and, within states, across groups. For example, Dr. Czajka found that white, non-Hispanic race/ethnicity is best predicted in states with very small minority populations; likewise, Hispanic and non-white, non-Hispanic race/ethnicity are predicted best in states where their population shares are large.

Improvements are Needed to Identify and Analyze Multiracial Individuals

Another common methodological challenge in measuring health equity in Medicaid is how to identify multiracial individuals in datasets. Few states report data on multiracial individuals, and most of those that do underreport that population. Furthermore, surname and geographic location have little predictive value for identifying multiracial individuals. Thus, researchers have typically excluded multiracial individuals from their analyses. However, this exclusion leads to problems, as Dr. Ninez Ponce of the University of California, Los Angeles Fielding School of Public Health’s Department of Health Policy and Management articulated.

“We’ve got to deal with the multi-racial population because according to the 2020 Census, this population grew substantially from nine million to 33.8 million. If we exclude that population in Medicaid analyses, then we further suppress or eliminate insights on the smaller race groups—American Indians and Alaska Natives and Native Hawaiians and Pacific Islanders—which have over half of their population reporting more than one race.”

Complete Medicaid Data Fail to Account for Disparities in Need and Access to Treatment

Even when available Medicaid data are complete and highly accurate, analyzing equitable outcomes remains methodologically challenging, because Medicaid claims data only provide insight into patients who have received care or a diagnosis. Racial differences in access to care may arise from patient preferences, clinical needs, legal and regulatory systems, and discrimination. Under the IOM definition of unequal treatment, data must be adjusted for need in order to isolate health disparities.

Dr. Benjamin Lê Cook, Associate Professor of Psychiatry at Harvard Medical School/Cambridge Health Alliance, used mental health care to demonstrate the importance of adjusting for levels of need. Reliance on claims data alone could result in inaccurate assumptions about disparities in treatment access. Dr. Cook labels this a “denominator problem,” as those who do not access treatment will not have a claim and thus will not be counted in the denominator when calculating treatment rates. This problem is further exacerbated when looking at data on racial and ethnic disparities.

Ideally, Medicaid claims data would be merged with other data sources to better capture information on the need for care. One potential solution draws on the Medical Expenditure Panel Survey (MEPS), a set of surveys that provides data on health care cost, use, and insurance; these data can be used to create more accurate Medicaid samples. New methods of matching and imputing data from community datasets and surveys also hold promise.
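A small worked example may help show why the denominator matters. The Python sketch below compares two hypothetical groups with the same underlying need for care but different access to a diagnosis. All counts are invented for illustration, and the need estimate is assumed to come from an outside source such as a survey; none of these figures were presented by the panel.

```python
from dataclasses import dataclass


@dataclass
class GroupCounts:
    with_need: int   # people who need care (assumed known from an outside source)
    with_claim: int  # people who appear in claims with a diagnosis
    treated: int     # people who received treatment


def claims_based_rate(group: GroupCounts) -> float:
    """Treatment rate when only people with a claim are in the denominator."""
    return group.treated / group.with_claim


def need_adjusted_rate(group: GroupCounts) -> float:
    """Treatment rate when everyone who needs care is in the denominator."""
    return group.treated / group.with_need


# Hypothetical groups: equal need, unequal access to a diagnosis.
group_a = GroupCounts(with_need=200, with_claim=150, treated=120)
group_b = GroupCounts(with_need=200, with_claim=60, treated=48)

for name, group in [("A", group_a), ("B", group_b)]:
    print(
        f"Group {name}: claims-based {claims_based_rate(group):.2f}, "
        f"need-adjusted {need_adjusted_rate(group):.2f}"
    )
```

In this made-up case, both groups show the same claims-based treatment rate (0.80), so claims alone suggest no disparity. Once need is the denominator, the rates diverge (0.60 versus 0.24), because people who never received a diagnosis are counted.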

Without quality data, it is impossible to know the extent of the problem and how to address health equity. To work around poor REL data, researchers have made progress in using BISG methods and other models to estimate REL metrics. However, these methods are not ideal, and incomplete and poor-quality REL data continue to be a barrier to addressing health equity. While the collection and reporting of health equity data are improving, methodological challenges remain in analyzing the data, such as how to identify and analyze multiracial individuals. Further advancements in methodology are therefore needed to address health equity.

This post highlights quotes and learnings from the panel "Methodological Challenges Associated with Improving Health Equity in Medicaid" presented at the meeting "Harnessing Medicaid to Improve Health Equity: A Research and Policy Agenda" on Dec. 1 and 2, 2021. This meeting was co-hosted by Julie Donohue of the University of Pittsburgh, Susan Kennedy of AcademyHealth, Genevieve Kenney of the Urban Institute, Chima Ndumele of Yale University, and Kosali Simon of Indiana University. 



Julianne Akard

Bachelor's Candidate - Indiana University

Julianne Akard is a senior at Indiana University studying chemistry, economics, and healthcare management and ...


Elizabeth McAvoy

Bachelor's Candidate - Indiana University

Elizabeth McAvoy is an undergraduate at Indiana University-Bloomington, pursuing a Law & Public Policy BSPA an...
