and Machine Learning
Machine learning is a form of artificial intelligence in which algorithms learn from data, with or without explicit guidance, to improve predictions or classifications of current data. An algorithm, at its simplest, is designed to accomplish a specific task, then trained on data, and revised. The process is repeated until the algorithm achieves optimal performance in terms of fit to the training data. The machine itself generates the algorithm rather than relying on external coding to direct the algorithm’s construction. The ability to ingest a broader swath of variables and to explore multiple permutations offers gains over classic approaches to modeling.
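The train, revise, and repeat cycle described above can be made concrete with a deliberately small sketch. It assumes a one-variable linear model fitted by gradient descent on made-up numbers; the function name `fit_line`, the learning rate, and the stopping tolerance are illustrative choices, not anything prescribed here. Note that the loop stops when the fit to the training data stops improving, which says nothing about how the model behaves on new data.

```python
def fit_line(xs, ys, learning_rate=0.05, max_rounds=10_000, tolerance=1e-12):
    """Fit y ~ w*x + b by repeatedly revising w and b to reduce training error."""
    w, b = 0.0, 0.0                      # start from an uninformed model
    n = len(xs)
    previous_loss = float("inf")
    loss = previous_loss
    for _ in range(max_rounds):
        # Evaluate the current model on the training data.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        loss = sum(e * e for e in errors) / n   # mean squared error
        # Stop once another round no longer improves the training fit.
        if previous_loss - loss < tolerance:
            break
        previous_loss = loss
        # Revise the parameters in the direction that reduces the error.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, loss


# Illustrative training data generated from y = 2x + 1 (not real measurements).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b, loss = fit_line(xs, ys)
print(f"w={w:.3f}, b={b:.3f}, training loss={loss:.2e}")
```

The values of `w` and `b` are never written by hand; they emerge from the data through repeated revision, which is the sense in which the machine rather than external coding produces the final model.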
Machine learning has great potential for therapeutic development and healthcare, ranging from discovery to diagnosis to decision making. Unlocking machine learning’s full potential, however, requires recognizing and addressing issues raised to date.
Early successes in applying machine learning in other industries may not readily translate when attempting to scale machine learning in healthcare. That underscores a fundamental difference in applying machine learning to health and healthcare. Health, and specifically the understanding of diseases and treatments, is fundamentally different from other areas where machine learning has been used.
The fields of drug, device, and healthcare interventions are replete with examples of interventions that worked mechanistically and improved intermediate or other proxy measures, but failed in final evaluation. Despite the volume of healthcare data generated during current practice, much information that could meaningfully support insights remains uncollected, siloed, or otherwise inaccessible to researchers. The data most often used are collected primarily for reimbursement, and such administrative data are unlikely to offer truly meaningful insights compared with richer clinical data.
In short, health represents a distinct challenge for machine learning because of our still-limited understanding of disease, the effects of our interventions, and the lack of integrated data that can effectively capture this information at meaningful scale. Given this more challenging analytical environment, we are more thoughtful about how we employ machine learning in health and healthcare.
Poor-quality data will not yield meaningful insights, and no analytical method, regardless of its sophistication, can overcome shortfalls in data sufficiency, representativeness, or scale. We are aware that statistical techniques depend on the quality of data available to generate better findings.
Machine learning algorithms perform as well as or better than a conventional statistical approach on the data set used to develop them. We run tests in different populations, without recalibration or retraining, and demonstrate consistent results across settings to confirm an algorithm’s utility.
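One way to picture testing in a different population without recalibration or retraining is sketched below. It is a minimal illustration under stated assumptions: the frozen model `predict`, its coefficients, the two tiny cohorts, and the `max_auc_drop` threshold of 0.05 are all hypothetical, and discrimination (AUC) is only one of several measures that could be compared across settings.

```python
import math


def auc(scores, labels):
    """Probability that a random positive case outscores a random negative case."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5          # ties count as half a win
    return wins / (len(positives) * len(negatives))


def validate_externally(predict, development, external, max_auc_drop=0.05):
    """Score both cohorts with the same frozen model and compare discrimination."""
    dev_auc = auc([predict(x) for x, _ in development], [y for _, y in development])
    ext_auc = auc([predict(x) for x, _ in external], [y for _, y in external])
    return {
        "development_auc": dev_auc,
        "external_auc": ext_auc,
        # Hypothetical acceptance rule: external AUC may not fall by more than max_auc_drop.
        "consistent": dev_auc - ext_auc <= max_auc_drop,
    }


def predict(x):
    """Hypothetical frozen model: coefficients are fixed and never refit below."""
    return 1.0 / (1.0 + math.exp(-(0.8 * x - 2.0)))


# Made-up cohorts of (risk factor value, observed outcome); not real patient data.
development = [(1, 0), (2, 0), (3, 1), (4, 1)]
external = [(1, 0), (3, 0), (2, 1), (4, 1)]

result = validate_externally(predict, development, external)
print(result)
```

In this toy case the model separates the development cohort perfectly (AUC 1.0) but ranks only three of four positive-negative pairs correctly in the external cohort (AUC 0.75), so the check reports the results as not consistent. Because `predict` is applied unchanged to both cohorts, any drop reflects the new population rather than a refit model.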
An algorithm that fails to replicate established findings or counters the established body of evidence is more likely an indication of a methodological oversight or a data artifact than a truly novel insight. Our algorithms offer insights that are credible and aligned with the scientific or clinical consensus.