Recently, the potential for artificial intelligence and machine learning to drive creative business solutions and consumer applications in healthcare has captured the imagination of researchers, policymakers, health care providers, entrepreneurs, and patients. What does this interest mean for our ability to understand and improve health and health care delivery, and where do data scientists fit into the big, multidisciplinary tent of health services researchers and health policy professionals? In this first installment of a blog series on data science in health care, I’ll describe a few aspects of the nature of data science and highlight some basic trends in the methods being used, including some notions about the human-computer and machine-to-machine interactions that are rendering actionable insights from data across all aspects of health care.
Throughout, we’ll discuss the role and contributions of data scientists. These professionals are in high demand for their quantitative talents, talents that have previously revolutionized the fundamentals of commerce from Wall Street to Main Street. Data science “quants” apply their engineering, computer science, and mathematical expertise to deliver powerful analytic capabilities. In doing so, they are becoming an important cog in the knowledge engine powering the rise of the modern-day, information-based health care system. The quants and their tools are changing all aspects of the economy and society, including how humans interact with machines, and with the neural networks those machines run, to support decision-making.
Note: Throughout, I use the terms “data scientists” and “quants” somewhat interchangeably to refer to individuals using their unique skills and tools to make sense of and provide meaning to a wide array of data.
The immense breadth of impact that data science is having today is powered by the explosion of data being produced and enabled by low-cost computing. It is made more valuable by rapid prognostication and probability assessment, and by the ability to comprehend interrelationships among many variables, which together provide competitive advantage and value.
While AI has been applied in medical problem-solving since the 1970s, we are in the midst of a new information age where the impact of AI and machine learning is rapidly expanding into all dimensions of healthcare. What are the reasons for this explosion in medicine and health care?
First, there is an economic demand for better results. The quest for improved productivity, safety, and efficiency is fueling innovation across the sector, and data is the key ingredient. From harnessing algorithms and electronic health record data to support value-based care delivery, to predictive modeling in population health, to computing on high-dimensional datasets that support discovery research and personalized medicine, data science is making contributions up and down the healthcare verticals.
Second, it is easier and cheaper to collect, store, and use data than ever before. The dramatic fall in costs of computing, super-charged processing speeds, increased ease of use of analytic and visualization tools, and low costs of data storage are powerful incentives for use of health data. Another major driver is the avalanche of quality digital inputs from electronic health records (structured and unstructured), biosensor outputs, coded administrative financial transactions, and consumer-provided data sources (including social media, search queries, and financial transactions). These vast new data resources are now highly contextualized, very fluid in their extensibility (an ability to expand/adapt), and increasingly accessible to quants working to derive information that informs health care decision-making.
Finally, there is increasing interest in health and health care among data scientists themselves. Many quants find the challenges of medicine and health care appealing for their personal and societal value, because the work offers an important sense of purpose.
From its beginnings as a way to collect, organize and extract information, data mining in healthcare has become the foundation for much more advanced learning. Today and in the future, quants are training machines to learn in ways that go beyond human capacity. This transition creates enormous opportunities and challenges – some of which we’ll discuss in future installments of this series – for answering critical questions in health and health care delivery.
So, what are some of the new ways that the quants and their tools are answering important health care questions?
One way is through basic data mining, which is the process of finding previously unknown patterns and trends in databases and using that information to build predictive models. Data mining in and of itself is not new. It has been used intensively and extensively by financial institutions, for credit scoring and fraud detection; marketers, for direct marketing and cross-selling or up-selling; retailers, for market segmentation and product design; and manufacturers, for quality control and maintenance scheduling. In healthcare, data mining is becoming increasingly popular, if not essential.
Several factors have motivated the new uses of data mining applications in healthcare. The existence of medical insurance waste, fraud and abuse, for example, has led many healthcare insurers to use data mining for program integrity. Data mining applications can be used by healthcare organizations to help make customer relationship management decisions, by providers to identify effective treatments and best practices, and by patients researching options for better and more affordable healthcare services. Because the huge amounts of data generated by healthcare transactions are too complex and voluminous to be processed and analyzed by traditional methods, data mining provides the methodology and technology to transform these mounds of data into useful information for decision-making.
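To make the program-integrity idea a little more concrete, here is a minimal sketch in Python of one very simple pattern-finding step: flagging providers whose billing looks unlike their peers'. It is an illustration under stated assumptions, not how any particular insurer works. The provider names and dollar amounts are invented, the function name `flag_outliers` is hypothetical, and the technique (a modified z-score built on the median absolute deviation, with the commonly cited 3.5 cutoff) is just one of many possible outlier rules.

```python
from statistics import median

# Hypothetical average billed amount per visit, by provider (made-up numbers).
avg_billed = {
    "provider_a": 102.0,
    "provider_b": 98.0,
    "provider_c": 105.0,
    "provider_d": 95.0,
    "provider_e": 100.0,
    "provider_f": 310.0,
}

def flag_outliers(values, threshold=3.5):
    """Return the keys whose modified z-score exceeds `threshold`.

    The modified z-score uses the median and the median absolute
    deviation (MAD), so one extreme value cannot hide itself by
    inflating the spread the way it can with a mean and standard deviation.
    """
    center = median(values.values())
    mad = median(abs(v - center) for v in values.values())
    if mad == 0:
        return []  # no spread at all, nothing can be called unusual
    return sorted(
        key for key, v in values.items()
        if 0.6745 * abs(v - center) / mad > threshold
    )

flagged = flag_outliers(avg_billed)
print(flagged)
```

A flag like this is only a starting point for human review. Real program-integrity work would adjust for specialty, case mix, and geography before drawing any conclusion.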
Another way healthcare quants are responding to the challenges of improving health and health care is the integration of health data with data from related sectors. When data from healthcare (and HIPAA-protected) services are integrated with data from outside health care, a more comprehensive picture of social, behavioral, and other dimensions of health status and variations in outcomes emerges. Together, these data provide important insights into extending lessons from clinical trials into real-world experiences.
Yet another frontier is the application of machine learning to healthcare data with a focus on pattern mining. These activities enhance the application of algorithms to analyze increasingly large datasets. Novel applications of AI are appearing in the form of “bots,” automated verbal response systems similar to Siri and Echo that provide interactive health care information to consumers or patients. Bots supported by AI systems are now being evaluated for use in clinical trials enrollment and consumer research applications.
One such approach is the transition from building intelligent systems to “deep learning” that is fundamentally shaped by the ability to train neural networks. Neural networks are computer systems of algorithms assembled to mimic cognition patterns of the human brain. Deep learning is a form of machine learning that uses a model of computing that is inspired by the structure and function of the human brain. These technical capabilities enable object recognition, activity pattern recognition, and labeling of objects.
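For readers who want to see what “training” means at the smallest possible scale, the sketch below trains a single artificial neuron (a perceptron) in Python. This is not deep learning itself; deep networks stack many such units in layers and use more sophisticated training rules. The toy task (learning the logical OR of two inputs) and the function names are assumptions chosen purely for illustration.

```python
# Toy training data: the logical OR function, as (inputs, label) pairs.
samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

def predict(weights, bias, inputs):
    """Output 1 if the weighted sum of the inputs plus the bias is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

def train(samples, epochs=10, learning_rate=1.0):
    """Perceptron rule: after each mistake, nudge the weights toward the right answer."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, label in samples:
            error = label - predict(weights, bias, inputs)  # -1, 0, or +1
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

weights, bias = train(samples)
predictions = [predict(weights, bias, inputs) for inputs, _ in samples]
print(predictions)
```

Nothing in the code says “OR.” The neuron starts with all-zero weights and arrives at a working rule only by being corrected on examples, which is the basic idea that deep learning scales up.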
A new pathway in health care is emerging through a form of AI known as “reinforcement learning,” which moves applications and algorithms toward computers making decisions and executing actions. Here the program extends machine learning so that the machine learns from interaction, including interaction with humans, and continuously improves by testing its own actions and observing the results, often with deep (for example, convolutional) neural networks estimating which actions are most promising. Deep learning applications in biomedical research and health care have emerged in recent years as a powerful tool, promising to reshape the future of AI applications.
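As a rough illustration of learning by trial and feedback, the Python sketch below uses tabular Q-learning, one classic reinforcement-learning method, on a made-up five-cell corridor in which the only reward comes from reaching the last cell. The environment, the parameter values, and the function names are all assumptions for illustration. Systems such as those described here replace the small lookup table with a neural network.

```python
import random

N_STATES = 5                 # corridor cells 0..4; reaching cell 4 ends an episode
ACTIONS = ("left", "right")

def step(state, action):
    """Hypothetical environment: move one cell; reward 1 only on reaching the goal."""
    nxt = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by acting, observing rewards, and updating estimates."""
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _step in range(100):                       # cap on episode length
            if rng.random() < epsilon:                 # explore occasionally
                action = rng.choice(ACTIONS)
            else:                                      # otherwise act greedily
                best = max(q[state].values())
                action = rng.choice(
                    [a for a in ACTIONS if q[state][a] == best])
            nxt, reward = step(state, action)
            # Move the estimate toward reward plus discounted future value.
            target = reward + gamma * max(q[nxt].values())
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if state == N_STATES - 1:
                break
    return q

q_table = q_learning()
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

The program is never told that “right” is the correct direction. It discovers that from the rewards its own actions produce, which is the sense in which a reinforcement learner improves through its own testing.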
For example, a major scientific achievement in computing this past year featured AlphaGo, a reinforcement-learning program that outperformed top human players at Go, a massively complex ancient board game. The DeepMind neural networks behind it function differently than other AI platforms such as IBM’s Deep Blue or Watson, which were assembled from large databases, developed for a pre-defined purpose, and function only within that scope. Earlier Go programs relied mainly on Monte Carlo tree search (a search procedure that evaluates candidate moves by simulating many possible continuations of the game), somewhat like a chess computer using a database of options programmed by a human to select the next move. In these recent advances, by contrast, the Google DeepMind programs that operate AlphaGo select moves based on knowledge the programs learned themselves, through machine learning in an artificial neural network trained on human and computer play. DeepMind’s related work on video games makes the same point: the computational system is not pre-programmed but is a neural network of algorithms that learns from experience, using only raw pixels as data input.
The potential implications of reinforcement learning for evidence development and medical decision-making are substantial. For example, deep learning has shown practical utility in the United Kingdom’s National Health Service, where patient records are analyzed using Google DeepMind technology. In another example, Streams, a new application that offers providers clinical alerts, messaging, and task management, is now capable of issuing a message that suggests an action, such as “don’t use X medication,” as determined through the AI application.
As you can imagine, this is only the beginning. We’ll have more conversations about health care data science applications as these digital data pioneers hasten the pace of analysis and stretch the breadth of AI and machine learning.
Want to know more? Follow this blog for thoughts from AcademyHealth, the Health Datapalooza steering committee, and other thought leaders and experts, and share your thoughts with us @hdpalooza on Twitter.