
Perspective Article

Artificial Intelligence in Medicine: Today and Tomorrow


  • 1 Medical Informatics, School of Medicine, Université Libre de Bruxelles, Brussels, Belgium
  • 2 Unit of Epidemiology, Biostatistics and Clinical Research, School of Public Health, Université Libre de Bruxelles, Brussels, Belgium
  • 3 Hôpital Erasme, Université Libre de Bruxelles, Brussels, Belgium

Artificial intelligence-powered medical technologies are rapidly evolving into applicable solutions for clinical practice. Deep learning algorithms can deal with increasing amounts of data provided by wearables, smartphones, and other mobile monitoring sensors in different areas of medicine. Currently, only very specific settings in clinical practice benefit from the application of artificial intelligence, such as the detection of atrial fibrillation, epilepsy seizures, and hypoglycemia, or the diagnosis of disease based on histopathological examination or medical imaging. The implementation of augmented medicine is long-awaited by patients because it allows for greater autonomy and more personalized treatment; however, it is met with resistance from physicians who were not prepared for such an evolution of clinical practice. This phenomenon also creates the need to validate these modern tools with traditional clinical trials, to debate the educational upgrade of the medical curriculum in light of digital medicine, and to weigh the ethical considerations of ongoing connected monitoring. The aim of this paper is to discuss recent scientific literature and provide a perspective on the benefits, future opportunities, and risks that established artificial intelligence applications in clinical practice pose for physicians, healthcare institutions, medical education, and bioethics.

1. Introduction

The expression “Medical Technology” is widely used to refer to a range of tools that enable health professionals to provide patients and society with a better quality of life by performing early diagnosis, reducing complications, optimizing treatment and/or providing less invasive options, and reducing the length of hospitalization. Before the mobile era, medical technologies were mainly known as classic medical devices (e.g., prosthetics, stents, implants); the emergence of smartphones, wearables, sensors, and communication systems has since revolutionized medicine by making it possible to house artificial intelligence (AI)-powered tools (such as applications) in very small form factors ( 1 ). AI has revolutionized medical technologies and can be commonly understood as the part of computer science able to deal with complex problems that have many applications in areas with huge amounts of data but little theory ( 2 ).

Intelligent (i.e., AI-powered) medical technologies have been met with enthusiasm by the general population partly because they enable a 4P model of medicine (Predictive, Preventive, Personalized, and Participatory), and therefore patient autonomy, in ways previously not possible ( 3 ); smartphones, for instance, are becoming the go-to item for filling and distributing an electronic personal health record ( 4 ), monitoring vital functions with biosensors ( 5 ), and helping to reach optimal therapeutic compliance ( 6 ), thereby placing the patient at the center of the care pathway. The development of intelligent medical technologies is enabling a new field in medicine: augmented medicine, i.e., the use of new medical technologies to improve different aspects of clinical practice. Several AI-based algorithms have been approved in the last decade by the Food and Drug Administration (FDA) and could therefore be implemented. Augmented medicine is enabled not only by AI-based technologies but also by several other digital tools, such as surgical navigation systems for computer-assisted surgery ( 7 ) and virtuality-reality continuum tools for surgery, pain management, and psychiatric disorders ( 8 – 10 ).

Although the field of augmented medicine seems to encounter success with patients, it can be met with a certain resistance by healthcare professionals, in particular physicians; four widely discussed reasons for this phenomenon should be mentioned. First, unpreparedness as to the potential of digital medicine stems from the evident lack of basic and continuing education in this discipline ( 11 ). Second, the early digitization of healthcare processes, very different from the promise of augmented medicine, came with a steep increase in the administrative burden mainly linked to electronic health records ( 12 ), which has come to be known as one of the main components of physician burnout ( 13 ). Third, there is increasing fear as to the risk of AI replacing physicians ( 14 ), although the current mainstream opinion in the literature is that AI will complement physician intelligence in the future ( 15 , 16 ). Fourth, the current worldwide lack of a legal framework defining liability in the case of adoption or rejection of algorithmic recommendations leaves the physician exposed to potential legal consequences when using AI ( 17 ).

As for the lack of education in digital medicine, several private medical schools are preparing their future medical leaders for the challenges of augmented medicine by either combining the medical curriculum with the engineering curriculum or implementing digital health literacy and use in an upgraded curriculum ( 18 ).

The aim of this paper is to summarize recent developments of AI in medicine, present the main use cases where AI-powered medical technologies can already be applied in clinical practice, and offer a perspective on the challenges and risks that healthcare professionals and institutions face when implementing augmented medicine, both in clinical practice and in the education of future medical leaders.

2. Current Applications of Artificial Intelligence in Medicine

2.1. Cardiology

2.1.1. Atrial Fibrillation

The early detection of atrial fibrillation was one of the first applications of AI in medicine. AliveCor received FDA approval in 2014 for their mobile application Kardia, allowing for smartphone-based ECG monitoring and detection of atrial fibrillation. The recent REHEARSE-AF study ( 19 ) showed that remote ECG monitoring with Kardia in ambulatory patients is more likely to identify atrial fibrillation than routine care. Apple also obtained FDA approval for the Apple Watch 4, which allows for easy acquisition of an ECG and detection of atrial fibrillation that can be shared with the practitioner of choice through a smartphone ( 20 ). Several critiques of wearable and portable ECG technologies have been raised ( 21 ), highlighting limitations to their use, such as the false positive rate originating from movement artifacts and barriers to the adoption of wearable technology among elderly patients, who are more likely to suffer from atrial fibrillation.
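As a toy illustration of how such devices can flag a possible arrhythmia, consider rhythm irregularity: atrial fibrillation produces erratic beat-to-beat (R-R) intervals. The sketch below is purely illustrative (it is not AliveCor's or Apple's algorithm, and the 0.10 cut-off is an assumed, non-clinical threshold): it flags a recording when the coefficient of variation of its R-R intervals is high.

```python
def rr_irregularity(rr_ms):
    """Coefficient of variation of R-R intervals (milliseconds)."""
    mean = sum(rr_ms) / len(rr_ms)
    var = sum((x - mean) ** 2 for x in rr_ms) / len(rr_ms)
    return (var ** 0.5) / mean

def possible_afib(rr_ms, threshold=0.10):
    # threshold is an illustrative cut-off, not a clinically validated one
    return rr_irregularity(rr_ms) > threshold

regular = [800, 810, 795, 805, 800, 798]     # steady sinus rhythm
irregular = [620, 950, 710, 1100, 560, 880]  # erratic intervals

print(possible_afib(regular))    # False
print(possible_afib(irregular))  # True
```

A production detector would of course work on raw ECG waveforms and validated decision rules; the point here is only that beat-interval variability is a simple, computable signal.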

2.1.2. Cardiovascular Risk

Applied to electronic patient records, AI has been used to predict the risk of cardiovascular disease, for instance acute coronary syndrome ( 22 ) and heart failure ( 23 ), better than traditional scales. Recent comprehensive reviews ( 24 ) have, however, reported that results can vary depending on the sample sizes used in the underlying studies.

2.2. Pulmonary Medicine

The interpretation of pulmonary function tests has been reported as a promising field for the development of AI applications in pulmonary medicine. A recent study ( 25 ) reported that AI-based software provides more accurate interpretations and serves as a decision support tool when interpreting results from pulmonary function tests. The study received several critiques, one of which ( 26 ) noted that the rate of accurate diagnosis among the pulmonologists participating in the study was considerably lower than the country average.

2.3. Endocrinology

Continuous glucose monitoring enables patients with diabetes to view real-time interstitial glucose readings and provides information on the direction and rate of change of blood glucose levels ( 27 ). Medtronic received FDA approval for their smartphone-paired Guardian system for glucose monitoring ( 28 ). In 2018, the company partnered with Watson (an AI developed by IBM) for their Sugar.IQ system to help customers better prevent hypoglycemic episodes based on repeated measurements. Continuous blood glucose monitoring can enable patients to optimize their blood glucose control and reduce the stigma associated with hypoglycemic episodes; however, a study focusing on patient experience with glucose monitoring reported that participants, while expressing confidence in the notifications, also described feelings of personal failure to regulate glucose levels ( 27 ).
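A minimal sketch of how trend-based hypoglycemia prevention can work (illustrative only; this is not Medtronic's or IBM's method, and the 5-minute sampling interval, 20-minute horizon, and 70 mg/dL threshold are assumptions): project the latest rate of change of the glucose readings forward and warn when the projection falls below a low-glucose threshold.

```python
def predict_low(readings_mgdl, minutes_apart=5, horizon_min=20, low=70):
    """Linear projection of the last two CGM readings `horizon_min` ahead."""
    # rate of change in mg/dL per minute, from the two most recent readings
    rate = (readings_mgdl[-1] - readings_mgdl[-2]) / minutes_apart
    projected = readings_mgdl[-1] + rate * horizon_min
    return projected < low

stable = [110, 108, 109, 110]   # flat trend: no alert
falling = [120, 105, 92, 81]    # dropping roughly 2 mg/dL per minute

print(predict_low(stable))   # False
print(predict_low(falling))  # True
```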

2.4. Nephrology

Artificial intelligence has been applied in several settings in clinical nephrology. For instance, it has proven useful for predicting the decline of glomerular filtration rate in patients with polycystic kidney disease ( 29 ) and for establishing the risk of progressive IgA nephropathy ( 30 ). However, a recent review reports that research is currently limited by the sample sizes necessary for inference ( 31 ).

2.5. Gastroenterology

The specialty of gastroenterology benefits from a wide range of AI applications in clinical settings. Gastroenterologists have made use of convolutional neural networks, among other deep learning models, to process images from endoscopy and ultrasound ( 32 ) and detect abnormal structures such as colonic polyps ( 33 ). Artificial neural networks have also been used to diagnose gastroesophageal reflux disease ( 34 ) and atrophic gastritis ( 35 ), as well as to predict outcomes in gastrointestinal bleeding ( 36 ), survival in esophageal cancer ( 37 ), inflammatory bowel disease ( 38 ), and metastasis in colorectal cancer ( 39 ) and esophageal squamous cell carcinoma ( 40 ).
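The building block of the convolutional neural networks mentioned above is the convolution operation: a small filter slides across the image and responds strongly wherever its pattern appears, and a network stacks many learned filters to highlight structures such as polyp boundaries. A minimal pure-Python sketch (illustrative only; real CNNs learn their filters from data rather than using a hand-written edge filter):

```python
def convolve2d(image, kernel):
    """Valid 2-D convolution (no padding) over nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(image[i + u][j + v] * kernel[u][v]
                           for u in range(kh) for v in range(kw)))
        out.append(row)
    return out

# a dark-to-bright vertical boundary in a toy 4x4 "image"
image = [[0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 0, 9, 9]]
edge_filter = [[-1, 1],
               [-1, 1]]  # responds where brightness jumps left-to-right

response = convolve2d(image, edge_filter)
# the strongest response sits on the boundary column
```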

2.6. Neurology

2.6.1. Epilepsy

Intelligent seizure detection devices are promising technologies with the potential to improve seizure management through permanent ambulatory monitoring. Empatica received FDA approval in 2018 for their wearable Embrace, which, using electrodermal sensors, can detect generalized epilepsy seizures and report to a mobile application that alerts close relatives and a trusted physician, with complementary information about the patient's location ( 41 ). A report focused on patient experience revealed that, in contrast to heart monitoring wearables, patients suffering from epilepsy faced no barriers to the adoption of seizure detection devices and reported high interest in wearable usage ( 42 ).

2.6.2. Gait, Posture, and Tremor Assessment

Wearable sensors have proven useful to quantitatively assess gait, posture, and tremor in patients with multiple sclerosis, Parkinson disease, Parkinsonism, and Huntington disease ( 43 ).

2.7. Computational Diagnosis of Cancer in Histopathology

Paige.ai has received breakthrough status from the FDA for an AI-based algorithm capable of diagnosing cancer in computational histopathology with great accuracy, allowing pathologists to gain time to focus on important slides ( 44 ).

2.8. Medical Imaging and Validation of AI-Based Technologies

A long-awaited meta-analysis compared the performance of deep learning software and radiologists in the field of imaging-based diagnosis ( 45 ): although deep learning seems to be as efficient as radiologists for diagnosis, the authors pointed out that 99% of studies were found not to have a reliable design; furthermore, only one in a thousand of the papers reviewed validated their results by having the algorithms diagnose medical imaging from other source populations. These findings support the need for extensive validation of AI-based technologies through rigorous clinical trials ( 5 ).

3. Discussion: Challenges and Future Directions of Artificial Intelligence in Medicine

3.1. Validation of AI-Based Technologies: Toward a Replication Crisis

One of the core challenges of the application of AI in medicine in the coming years will be the clinical validation of the core concepts and tools recently developed. Although many studies have already suggested the utility of AI with clear opportunities based on promising results, several well-recognized and frequently reported limitations of AI studies are likely to complicate such validation. We will hereby address three such limitations and suggest possible ways to overcome them.

First, the majority of studies comparing the efficiency of AI vs. clinicians are found to have unreliable designs and to lack primary replication, i.e., validation of the algorithms on samples coming from sources other than the one used to train them ( 45 ). This difficulty could be overcome in the open science era, as open data and open methods are bound to receive more and more attention as best practices in research. However, transitioning to open science could prove difficult for medical AI companies that develop software as a core business.

Second, studies reporting AI applications in clinical practice are known to be limited by retrospective designs and small sample sizes; such designs potentially include selection and spectrum bias, i.e., models are developed to optimally fit a given data set (a phenomenon also known as overfitting) but do not replicate the same results in other datasets ( 32 ). Continuous reevaluation and calibration after the adoption of algorithms suspected of overfitting are necessary to adapt software to fluctuations in patient demographics ( 46 ). Furthermore, there is a growing consensus as to the need to develop algorithms designed to fit larger communities while taking subgroups into account ( 47 ).
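Overfitting can be made concrete with a toy experiment (illustrative only, not drawn from the cited studies): a model that memorizes its training sample achieves perfect in-sample accuracy, yet performs at chance level on new data when there is no true signal to learn.

```python
import random

# A 1-nearest-neighbour "model" memorises its training set perfectly,
# yet fails to generalise when the labels carry no real signal.
random.seed(0)

def make_data(n):
    # random features, random labels: no true relationship to learn
    return [([random.random() for _ in range(5)], random.randint(0, 1))
            for _ in range(n)]

def predict(train, x):
    # label of the closest training point (squared Euclidean distance)
    return min(train, key=lambda p: sum((a - b) ** 2
                                        for a, b in zip(p[0], x)))[1]

def accuracy(train, data):
    return sum(predict(train, x) == y for x, y in data) / len(data)

train, test = make_data(50), make_data(200)
print(accuracy(train, train))  # 1.0: perfect fit on the memorised data
print(accuracy(train, test))   # ~0.5: chance level on new data
```

This is the gap that external validation on an independent source population is designed to expose.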

Third, only a few studies are known to compare AI and clinicians on the same data sets; even in that scenario, critiques have pointed at lower-than-expected diagnostic accuracy rates among specialty doctors ( 26 ). Pitting AI against clinicians, although well represented in the scientific literature, is probably not the best way to tackle the issue of performance in medical expertise: several studies are now approaching the interaction between clinicians and algorithms ( 47 ), as the combination of human and artificial intelligence outperforms either alone.

3.2. Ethical Implications of Ongoing Monitoring

Medical technology is one of the most promising markets of the 21st century, with an estimated market value rapidly approaching one trillion dollars in 2019. An increasing share of this revenue comes from the retail of medical devices (such as heart monitoring devices) to a younger population that is not the primary target consumer profile (because health problems such as atrial fibrillation are less likely to appear in it). Because of this phenomenon, the Internet of Things (IoT) is redefining the concept of the healthy individual as a combination of the quantified self (personal indicators coded in the smartphone or wearable) and a series of wearable-provided lifestyle parameters (activity monitoring, weight control, etc.).

Furthermore, in the last couple of years several wearable companies have concluded important deals with insurance companies or governments to organize large-scale distribution of these products; such initiatives mainly aim to induce lifestyle change in large populations. As western countries continue to evolve toward health systems centered on the patient's individual responsibility for their own health and well-being, the ethical implications of ongoing medical monitoring with medical devices through the Internet of Things are frequently discussed. For instance, ongoing monitoring and privacy violations have the potential to increase the stigma around chronically ill or more disadvantaged citizens ( 48 ) and possibly to penalize citizens unable to adopt new standards of healthy lifestyle, for instance by reducing access to health insurance and care; little to no debate has been focused on these crucial potential pitfalls in health policy making.

In this techno-political framework, the issue of data protection and ownership, although more than two decades old ( 49 ), becomes more and more crucial. Several attitudes toward data ownership are described in the literature: although some works argue for common ownership of patient data to benefit personalized medicine approaches ( 50 , 51 ), consensus is shifting toward patient ownership, as it has positive effects on patient engagement and may improve information sharing if a data use agreement between the patient and healthcare professionals is developed ( 52 ).

3.3. The Need to Educate Augmented Doctors

Several universities have started to create new medical curricula, including a doctor-engineer program ( 18 ), to answer the need to educate future medical leaders for the challenges of artificial intelligence in medicine ( 53 ). Such curricula feature a stronger emphasis on the hard sciences (such as physics and mathematics) and the addition of computational sciences, coding, algorithmics, and mechatronic engineering. These “augmented doctors” could draw on both clinical experience and digital expertise to solve modern health problems, participate in defining digital strategies for healthcare institutions, manage the digital transition, and educate patients and peers.

Society as well as healthcare institutions could benefit from these professionals as a safety net for any process involving AI in medicine, but also as a driver of innovation and research. Aside from basic medical education, there is a need to implement ongoing educational programs on digital medicine targeting qualified physicians, so as to allow retraining in this growing field. In most cutting-edge hospitals around the world, such experts serve as Chief Medical Information Officers (CMIO).

3.4. The Promise of Ambient Clinical Intelligence: Avoiding Dehumanization by Technology

As reported by several studies ( 12 , 13 ), electronic health records can be an important administrative burden and a source of burnout, a phenomenon increasingly present in physicians, both in training and in practice. Although artificial intelligence solutions such as natural language processing are becoming more and more capable of helping the physician deliver complete medical records, further solutions are needed to address the increasing time allocated to indirect patient care.

Ambient clinical intelligence (ACI) is understood as a sensitive, adaptive, and responsive digital environment surrounding the physician and the patient ( 54 ), capable, for instance, of analyzing the interview and automatically filling in the patient's electronic health record. Several projects are underway to develop an ACI, which would be a crucial application of artificial intelligence in medicine, much needed to solve modern problems with the physician workforce.

One of the great barriers to the adoption of intelligent medical technologies among physicians is the fear of a dehumanization of medicine. This is mainly due to the increasing administrative burden ( 12 ) imposed on physicians. However, modern technologies such as ACI and natural language processing have the potential to ease the administrative burden and help clinicians focus more on the patient.

3.5. Will Doctors Be Replaced by Artificial Intelligence?

As recently discussed in the literature ( 15 , 16 ), doctors will most likely not be replaced by artificial intelligence: smart medical technologies exist as support for the physician in order to improve patient management. As recent studies have indicated ( 45 ), however, comparisons frequently pit artificial intelligence solutions against physicians, as if the two were in competition. Future studies should instead compare physicians using artificial intelligence solutions with physicians working without such applications, and extend those comparisons to translational clinical trials; only then will artificial intelligence be accepted as complementary to physicians. Healthcare professionals today stand in a privileged position to welcome the digital evolution and be the main drivers of change, although a major revision of medical education is needed to equip future leaders with the competences to do so.

4. Conclusion

The implementation of artificial intelligence in clinical practice is a promising area of development that is evolving rapidly alongside other modern fields such as precision medicine, genomics, and teleconsultation. While scientific progress should remain rigorous and transparent in developing new solutions to improve modern healthcare, health policies should now focus on tackling the ethical and financial issues associated with this cornerstone of the evolution of medicine.

Author Contributions

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

1. Steinhubl SR, Muse ED, Topol EJ. The emerging field of mobile health. Sci Trans Med . (2015) 7:283rv3. doi: 10.1126/scitranslmed.aaa3487

2. Peng Y, Zhang Y, Wang L. Artificial intelligence in biomedical engineering and informatics: an introduction and review. Artif Intell Med . (2010) 48:71–3. doi: 10.1016/j.artmed.2009.07.007

3. Orth M, Averina M, Chatzipanagiotou S, Faure G, Haushofer A, Kusec V, et al. Opinion: redefining the role of the physician in laboratory medicine in the context of emerging technologies, personalised medicine and patient autonomy ('4P medicine'). J Clin Pathol . (2019) 72:191–7. doi: 10.1136/jclinpath-2017-204734

4. Abdulnabi M, Al-Haiqi A, Kiah MLM, Zaidan AA, Zaidan BB, Hussain M. A distributed framework for health information exchange using smartphone technologies. J Biomed Informat. (2017) 69:230–50. doi: 10.1016/j.jbi.2017.04.013

5. Topol EJ. A decade of digital medicine innovation. Sci Trans Med . (2019) 11:7610. doi: 10.1126/scitranslmed.aaw7610

6. Morawski K, Ghazinouri R, Krumme A, Lauffenburger JC, Lu Z, Durfee E, et al. Association of a smartphone application with medication adherence and blood pressure control: the MedISAFE-BP randomized clinical trial. JAMA Int Med . (2018) 178:802–9. doi: 10.1001/jamainternmed.2018.0447

7. Overley SC, Cho SK, Mehta AI, Arnold PM. Navigation and robotics in spinal surgery: where are we now? Neurosurgery . (2017) 80:S86–99. doi: 10.1093/neuros/nyw077

8. Tepper OM, Rudy HL, Lefkowitz A, Weimer KA, Marks SM, Stern CS, et al. Mixed reality with HoloLens: where virtual reality meets augmented reality in the operating room. Plast Reconstruct Surg . (2017) 140:1066–70. doi: 10.1097/PRS.0000000000003802

9. Mishkind MC, Norr AM, Katz AC, Reger GM. Review of virtual reality treatment in psychiatry: evidence versus current diffusion and use. Curr Psychiat Rep . (2017) 19:80. doi: 10.1007/s11920-017-0836-0

10. Malloy KM, Milling LS. The effectiveness of virtual reality distraction for pain reduction: a systematic review. Clin Psychol Rev . (2010) 30:1011–8. doi: 10.1016/j.cpr.2010.07.001

11. Haag M, Igel C, Fischer MR, German Medical Education Society (GMA) “Digitization-Technology-Assisted Learning and Teaching” joint working group “Technology-enhanced Teaching and Learning in Medicine (TeLL)” of the german association for medical informatics biometry and epidemiology (gmds) and the German Informatics Society (GI). Digital teaching and digital medicine: a national initiative is needed. GMS J Med Educ . (2018) 35:Doc43. doi: 10.3205/zma001189

12. Chaiyachati KH, Shea JA, Asch DA, Liu M, Bellini LM, Dine CJ, et al. Assessment of inpatient time allocation among first-year internal medicine residents using time-motion observations. JAMA Int Med . (2019) 179:760–7. doi: 10.1001/jamainternmed.2019.0095

13. West CP, Dyrbye LN, Shanafelt TD. Physician burnout: contributors, consequences and solutions. J Int Med . (2018) 283:516–29. doi: 10.1111/joim.12752

14. Shah NR. Health care in 2030: will artificial intelligence replace physicians? Ann Int Med . (2019) 170:407–8. doi: 10.7326/M19-0344

15. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med . (2019) 25:44–56. doi: 10.1038/s41591-018-0300-7

16. Verghese A, Shah NH, Harrington RA. What this computer needs is a physician: humanism and artificial intelligence. JAMA . (2018) 319:19–20. doi: 10.1001/jama.2017.19198

17. Price WN, Gerke S, Cohen IG. Potential liability for physicians using artificial intelligence. JAMA . (2019) 322:1765–6. doi: 10.1001/jama.2019.15064

18. Briganti G. Nous Devons Former des Médecins « Augmentés ». Le Specialiste. (2019). Available online at: https://www.lespecialiste.be/fr/debats/nous-devons-former-des-medecins-laquo-nbsp-augmentes-raquo.html (accessed October 26, 2019).

19. Halcox JPJ, Wareham K, Cardew A, Gilmore M, Barry JP, Phillips C, et al. Assessment of remote heart rhythm sampling using the AliveCor heart monitor to screen for atrial fibrillation: the REHEARSE-AF study. Circulation . (2017) 136:1784–94. doi: 10.1161/CIRCULATIONAHA.117.030583

20. Turakhia MP, Desai M, Hedlin H, Rajmane A, Talati N, Ferris T, et al. Rationale and design of a large-scale, app-based study to identify cardiac arrhythmias using a smartwatch: the apple heart study. Ame Heart J. (2019) 207:66–75. doi: 10.1016/j.ahj.2018.09.002

21. Raja JM, Elsakr C, Roman S, Cave B, Pour-Ghaz I, Nanda A, et al. Apple watch, wearables, and heart rhythm: where do we stand? Ann Trans Med . (2019) 7:417. doi: 10.21037/atm.2019.06.79.

22. Huang Z, Chan TM, Dong W. MACE prediction of acute coronary syndrome via boosted resampling classification using electronic medical records. J Biomed Inform . (2017) 66:161–70. doi: 10.1016/j.jbi.2017.01.001

23. Mortazavi BJ, Downing NS, Bucholz EM, Dharmarajan K, Manhapra A, Li SX, et al. Analysis of machine learning techniques for heart failure readmissions. Circ Cardiovasc Qual Outcomes . (2016) 9:629–40. doi: 10.1161/CIRCOUTCOMES.116.003039

24. Dorado-Díaz PI, Sampedro-Gómez J, Vicente-Palacios V, Sánchez PL. Applications of artificial intelligence in cardiology. The future is already here. Revista Española de Cardiología . (2019) 72:1065–75. doi: 10.1016/j.rec.2019.05.014

25. Topalovic M, Das N, Burgel PR, Daenen M, Derom E, Haenebalcke C, et al. Artificial intelligence outperforms pulmonologists in the interpretation of pulmonary function tests. Eur Respirat J . (2019) 53:1801660. doi: 10.1183/13993003.01660-2018.

26. Delclaux C. No need for pulmonologists to interpret pulmonary function tests. Eur Respirat J . (2019) 54:1900829. doi: 10.1183/13993003.00829-2019

27. Lawton J, Blackburn M, Allen J, Campbell F, Elleri D, Leelarathna L, et al. Patients' and caregivers' experiences of using continuous glucose monitoring to support diabetes self-management: qualitative study. BMC Endocrine Disord . (2018) 18:12. doi: 10.1186/s12902-018-0239-1

28. Christiansen MP, Garg SK, Brazg R, Bode BW, Bailey TS, Slover RH, et al. Accuracy of a fourth-generation subcutaneous continuous glucose sensor. Diabet Technol Therapeut . (2017) 19:446–56. doi: 10.1089/dia.2017.0087

29. Niel O, Boussard C, Bastard P. Artificial intelligence can predict GFR decline during the course of ADPKD. Am J Kidney Dis Off J Natl Kidney Found . (2018) 71:911–2. doi: 10.1053/j.ajkd.2018.01.051

30. Geddes CC, Fox JG, Allison ME, Boulton-Jones JM, Simpson K. An artificial neural network can select patients at high risk of developing progressive IgA nephropathy more accurately than experienced nephrologists. Nephrol Dialysis, Transplant . (1998) 13:67–71.

31. Niel O, Bastard P. Artificial intelligence in nephrology: core concepts, clinical applications, and perspectives. Am J Kidney Dis . (2019) 74:803–10. doi: 10.1053/j.ajkd.2019.05.020

32. Yang YJ, Bang CS. Application of artificial intelligence in gastroenterology. World J Gastroenterol . (2019) 25:1666–83. doi: 10.3748/wjg.v25.i14.1666

33. Fernández-Esparrach G, Bernal J, López-Cerón M, Córdova H, Sánchez-Montes C, Rodríguez de Miguel C, et al. Exploring the clinical potential of an automatic colonic polyp detection method based on the creation of energy maps. Endoscopy . (2016) 48:837–42. doi: 10.1055/s-0042-108434

34. Pace F, Buscema M, Dominici P, Intraligi M, Baldi F, Cestari R, et al. Artificial neural networks are able to recognize gastro-oesophageal reflux disease patients solely on the basis of clinical data. Eur J Gastroenterol Hepatol . (2005) 17:605–10. doi: 10.1097/00042737-200506000-00003

35. Lahner E, Grossi E, Intraligi M, Buscema M, Corleto VD, Delle Fave G, et al. Possible contribution of artificial neural networks and linear discriminant analysis in recognition of patients with suspected atrophic body gastritis. World J Gastroenterol . (2005) 11:5867–73. doi: 10.3748/wjg.v11.i37.5867

36. Das A, Ben-Menachem T, Cooper GS, Chak A, Sivak MV, Gonet JA, et al. Prediction of outcome in acute lower-gastrointestinal haemorrhage based on an artificial neural network: internal and external validation of a predictive model. Lancet . (2003) 362:1261–6. doi: 10.1016/S0140-6736(03)14568-0

37. Sato F, Shimada Y, Selaru FM, Shibata D, Maeda M, Watanabe G, et al. Prediction of survival in patients with esophageal carcinoma using artificial neural networks. Cancer . (2005) 103:1596–605. doi: 10.1002/cncr.20938

38. Peng JC, Ran ZH, Shen J. Seasonal variation in onset and relapse of IBD and a model to predict the frequency of onset, relapse, and severity of IBD based on artificial neural network. Int J Colorect Dis . (2015) 30:1267–73. doi: 10.1007/s00384-015-2250-6

39. Ichimasa K, Kudo SE, Mori Y, Misawa M, Matsudaira S, Kouyama Y, et al. Artificial intelligence may help in predicting the need for additional surgery after endoscopic resection of T1 colorectal cancer. Endoscopy . (2018) 50:230–40. doi: 10.1055/s-0043-122385

40. Yang HX, Feng W, Wei JC, Zeng TS, Li ZD, Zhang LJ, et al. Support vector machine-based nomogram predicts postoperative distant metastasis for patients with oesophageal squamous cell carcinoma. Br J Cancer . (2013) 109:1109–16. doi: 10.1038/bjc.2013.379

41. Regalia G, Onorati F, Lai M, Caborni C, Picard RW. Multimodal wrist-worn devices for seizure detection and advancing research: focus on the Empatica wristbands. Epilep Res . (2019) 153:79–82. doi: 10.1016/j.eplepsyres.2019.02.007

42. Bruno E, Simblett S, Lang A, Biondi A, Odoi C, Schulze-Bonhage A, et al. Wearable technology in epilepsy: the views of patients, caregivers, and healthcare professionals. Epilep Behav . (2018) 85:141–9. doi: 10.1016/j.yebeh.2018.05.044

43. Dorsey ER, Glidden AM, Holloway MR, Birbeck GL, Schwamm LH. Teleneurology and mobile technologies: the future of neurological care. Nat Rev Neurol . (2018) 14:285–97. doi: 10.1038/nrneurol.2018.31

44. Campanella G, Hanna MG, Geneslaw L, Miraflor A, Silva VWK, Busam KJ, et al. Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat Med . (2019) 25:1301–9. doi: 10.1038/s41591-019-0508-1

45. Liu X, Faes L, Kale AU, Wagner SK, Fu DJ, Bruynseels A, et al. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. Lancet Digit Health . (2019) 1:e271–97. doi: 10.1016/S2589-7500(19)30123-2

46. Panch T, Mattie H, Celi LA. The “inconvenient truth” about AI in healthcare. NPJ Digit Med . (2019) 2:1–3. doi: 10.1038/s41746-019-0155-4

47. Kelly CJ, Karthikesalingam A, Suleyman M, Corrado G, King D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med . (2019) 17:195. doi: 10.1186/s12916-019-1426-2

48. Mittelstadt B. Ethics of the health-related internet of things: a narrative review. Ethics Informat Technol . (2017) 19:157–75. doi: 10.1007/s10676-017-9426-4

49. Williamson JB. Preserving confidentiality and security of patient health care information. Top Health Informat Manage . (1996) 16:56–60.

50. Montgomery J. Data sharing and the idea of ownership. New Bioeth . (2017) 23:81–6. doi: 10.1080/20502877.2017.1314893

51. Rodwin MA. The case for public ownership of patient data. JAMA . (2009) 302:86–8. doi: 10.1001/jama.2009.965

52. Mikk KA, Sleeper HA, Topol EJ. The pathway to patient data ownership and better health. JAMA . (2017) 318:1433–4. doi: 10.1001/jama.2017.12145

53. Brouillette M. AI added to the curriculum for doctors-to-be. Nat Med . (2019). 25:1808–9. doi: 10.1038/s41591-019-0648-3

54. Acampora G, Cook DJ, Rashidi P, Vasilakos AV. A survey on ambient intelligence in health care. Proc IEEE Inst Elect Electron Eng . (2013) 101:2470–94. doi: 10.1109/JPROC.2013.2262913

Keywords: digital medicine, mobile health, medical technologies, artificial intelligence, monitoring

Citation: Briganti G and Le Moine O (2020) Artificial Intelligence in Medicine: Today and Tomorrow. Front. Med. 7:27. doi: 10.3389/fmed.2020.00027

Received: 03 November 2019; Accepted: 17 January 2020; Published: 05 February 2020.

Copyright © 2020 Briganti and Le Moine. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Giovanni Briganti, giovanni.briganti@hotmail.com

† These authors have contributed equally to this work

AI in health and medicine


  • 1 Department of Biomedical Informatics, Harvard University, Cambridge, MA, USA.
  • 2 Department of Computer Science, Stanford University, Stanford, CA, USA.
  • 3 Scripps Translational Science Institute, San Diego, CA, USA. [email protected].
  • PMID: 35058619
  • DOI: 10.1038/s41591-021-01614-0

Artificial intelligence (AI) is poised to broadly reshape medicine, potentially improving the experiences of both clinicians and patients. We discuss key findings from a 2-year weekly effort to track and share key developments in medical AI. We cover prospective studies and advances in medical image analysis, which have reduced the gap between research and deployment. We also address several promising avenues for novel medical AI research, including non-image data sources, unconventional problem formulations and human-AI collaboration. Finally, we consider serious technical and ethical challenges in issues spanning from data scarcity to racial bias. As these challenges are addressed, AI's potential may be realized, making healthcare more accurate, efficient and accessible for patients worldwide.

© 2022. Springer Nature America, Inc.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Artificial Intelligence*
  • Delivery of Health Care*
  • Prospective Studies

Grants and funding

  • UL1 TR002550/TR/NCATS NIH HHS/United States


  • Review Article
  • Open access
  • Published: 28 October 2022

Artificial intelligence for strengthening healthcare systems in low- and middle-income countries: a systematic scoping review

  • Tadeusz Ciecierski-Holmes   ORCID: orcid.org/0000-0001-5642-218X 1 , 2 ,
  • Ritvij Singh   ORCID: orcid.org/0000-0002-4412-3163 3 ,
  • Miriam Axt   ORCID: orcid.org/0000-0003-3730-5177 1 ,
  • Stephan Brenner   ORCID: orcid.org/0000-0001-5397-7008 1 &
  • Sandra Barteit   ORCID: orcid.org/0000-0002-3806-6027 1  

npj Digital Medicine, volume 5, Article number: 162 (2022)

  • Health policy
  • Translational research

In low- and middle-income countries (LMICs), a growing number of publications has promoted AI as a potential means of strengthening healthcare systems. We aimed to evaluate the scope and nature of AI technologies in the specific context of LMICs. In this systematic scoping review, we used a broad variety of AI and healthcare search terms. Our literature search included records published between 1st January 2009 and 30th September 2021 from the Scopus, EMBASE, MEDLINE, Global Health and APA PsycInfo databases, and grey literature from a Google Scholar search. We included studies that reported a quantitative and/or qualitative evaluation of a real-world application of AI in an LMIC health context. A total of 10 references evaluating the application of AI in an LMIC were included. Applications varied widely, including clinical decision support systems, treatment planning and triage assistants, and health chatbots. Only half of the papers reported which algorithms and datasets were used to train the AI. A number of challenges in using AI tools were reported, including issues with reliability, mixed impacts on workflows, poor user-friendliness and a lack of adaptation to local contexts. Many barriers exist that prevent the successful development and adoption of well-performing, context-specific AI tools, such as limited data availability, trust and evidence of cost-effectiveness in LMICs. Additional evaluations of the use of AI in healthcare in LMICs are needed to identify their effectiveness and reliability in real-world settings and to generate an understanding of best practices for future implementations.



Rapid technological developments of the past few decades have paved the way for an abundance of technologies that have revolutionised, and continue to revolutionise, medicine and healthcare 1 , 2 , 3 . The field of artificial intelligence (AI), in particular, benefits greatly from the expanding accessibility of the internet, progress in software system development, and the rapid advancement of microprocessor technology, which has translated into a variety of widely available devices including tablets, smartphones, laptops and virtual reality appliances 4 . With a universally recognised and accepted definition still underway 5 , this paper uses the definition by Russell and Norvig, which describes AI as the wider field of “designing and building intelligent agents that receive percepts from the environment and take actions that affect that environment” 6 .

Particularly relevant AI technologies in medicine and healthcare include knowledge engineering, machine learning (e.g. precision medicine, neural network models), natural language processing, rule-based expert systems, surgical robots, or image and signal processing 7 . Medical education, clinical practice and healthcare delivery have all benefited from these technology advancements, which have offered new techniques and methodological approaches. AI is revolutionising the foundations of healthcare with its potential to improve both the scope and accessibility of healthcare provision at a global scale 8 , 9 .

Given these technological developments, AI has the potential to substantially change how medical care and public health programmes are implemented in the near future, especially in health systems where the distribution of and access to care have so far been challenging 3 , 10 . In low- and middle-income countries (LMICs), the value of AI is seen in its potential to strengthen health systems by supporting and standardising clinical judgement and applying healthcare processes more objectively with a data-oriented approach 11 . Furthermore, given the shortages of skilled health workers in areas such as sub-Saharan Africa, where medical education capacities are limited 12 , AI-powered clinical tools could represent one way to increase the quantity and quality of medical care 13 . However, current AI applications and machine learning still require large amounts of complete and regularly updated datasets, which remain scarce for most LMICs 14 . While reports on the application of different AI technologies in LMICs continue to grow, the actual evidence base has so far not been reviewed. The scope and extent of implemented AI remains unclear, as does whether AI technologies have demonstrated real potential for healthcare delivery in LMICs.

The goal of this systematic scoping review is therefore to review and map existing literature on health-specific AI applications and to summarise currently available evidence of AI’s efficacy across LMICs. To allow for a comprehensive outline of AI technologies applied to both medical practice and healthcare delivery, this paper systematically reviews and identifies all relevant literature across a wide range of AI applications, health systems, and LMICs. A further focus is on strengths, weaknesses and perceptions of the application of AI in healthcare in LMICs, exploring the following questions:

What are the effects of current AI-based technology on healthcare provision (e.g. diagnosis, treatment, health outcomes, provider or patient time, costs, etc.)?

What are the experiences of providers and patients with respect to the application of current AI-based healthcare technology (e.g. acceptance, perceived usefulness, trust in technology, feasibility to implement and integrate, etc.)?

What are key elements that support or challenge AI implementation in the LMIC healthcare context?

Eligible records

Our database searches and handsearch identified a total of 1126 articles, of which 1104 were included in title and abstract review after removal of duplicates (see Fig. 1 for details). The final sample of peer-reviewed articles entering analysis included a total of ten studies, described in Table 1 . A list of references for the included studies is available in Supplementary Note 2 .

figure 1

Flowchart of study identification, exclusion based on titles and abstracts, and inclusion in the final review after assessing full texts.

Study characteristics

Four studies were conducted in China, while the other six represent a range of LMICs across Latin America, South Asia and Sub-Saharan Africa (see Table 2 for a summary of key characteristics). Overall, a majority of studies ( n  = 8, 80%) were conducted in the context of upper-middle-income countries. All identified studies were published in 2018 or later. While most studies are based on cross-sectional designs, they varied in their quantitative and qualitative methodologies. Study populations ranged from 12 participants in a clinical research setting to 45,000 in research involving mHealth platforms 15 , 16 .

Features of studied AI technologies

Table 3 summarises the features of the studied AI technologies. Of the AI technologies studied in the reviewed articles, three were applied to the care of communicable diseases (two to HIV/tuberculosis, one to COVID-19), four to the care of non-communicable illnesses (three to various cancers, one to child nutrition), and three to general primary healthcare including pregnancy care. Within their clinical context, three technologies were applied to patient triage, four to screening or diagnostics, and three to care planning or provision. Of these, three tools assisted with triage and screening tasks performed by frontline health workers 17 , 18 , 19 . Four tools assisted physicians with diagnoses, clinical decision making and treatment planning 15 , 20 , 21 , 22 . Two articles studied the use of chatbots by individuals in the community, one being an ‘AI Doctor’ for primary care self-diagnosis 23 , and another offering social support messages on a health forum 24 . Two articles examined AI technologies used in distributing health educational information and support on child nutrition or pregnancy-related care with target populations in the community 16 , 24 .

Transparency of data and algorithms used in training AI tools

Overall, included studies varied regarding the extent to which datasets and algorithms used in the training and testing of AI tools were made transparent. Further, none of the datasets described in any of these studies were immediately accessible to the public in full. Five studies, however, provide reference to the datasets used 15 , 16 , 17 , 18 , 24 , and five studies described the AI algorithms used in detail 15 , 16 , 17 , 21 , 24 . Studies using commercially available products provided limited or no information on their respective datasets and algorithms 18 , 19 , 20 , 22 , 23 . Information gathered about the datasets and algorithms used can be found in Supplementary Table 2 and the Supplementary Discussion.

Interpretability of AI models

Most AI tools ( n  = 7, 70%) lacked any interpretability of their outputs, relying on ‘black-box’ algorithms 15 , 16 , 17 , 21 , 22 , 23 , 24 . Two AI tools for diagnosing TB or COVID-19 from chest X-rays provided interpretable heatmaps of areas of interest on the X-ray 18 , 19 . One study used IBM Watson for Oncology, a cancer treatment planning assistant, which provides relevant literature, such as clinical trial data, for a particular treatment it has recommended, though it remains largely a black-box tool 20 , 25 .
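Heatmaps of the kind described above can be produced model-agnostically by occlusion sensitivity: mask each region of the input and record how much the model's output drops. The sketch below is purely illustrative, using a toy "image" and a stand-in scoring function in place of a trained network (all names are hypothetical, not taken from the reviewed tools):

```python
import numpy as np

def occlusion_heatmap(image, score_fn, patch=4):
    """Model-agnostic occlusion sensitivity: zero out each patch and
    record the resulting drop in the classifier's score."""
    base = score_fn(image)
    h, w = image.shape
    heat = np.zeros((h // patch, w // patch))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            masked = image.copy()
            masked[i:i + patch, j:j + patch] = 0.0
            heat[i // patch, j // patch] = base - score_fn(masked)
    return heat

# Toy "X-ray": a bright square stands in for the finding the model keys on.
img = np.zeros((16, 16))
img[4:8, 4:8] = 1.0
score = lambda x: x[4:8, 4:8].mean()  # stand-in for a trained model's output
heat = occlusion_heatmap(img, score)
print(np.unravel_index(heat.argmax(), heat.shape))  # patch driving the score
```

Masking the patch that contains the bright region causes the largest score drop, so the heatmap highlights it. Production systems typically use gradient-based saliency instead, but the principle of attributing an output to image regions is the same.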

Strengths, weaknesses and perceptions of implemented AIs

In the next sections, we focus specifically on cost-savings and improvements in health outcomes, effect on workflows and time to treatment and diagnosis, local adequacy of AI, and user-friendliness, reliability and trust in AI technologies summarised in Table 4 .

Reliability of AI tools

Concordance between the AI tools and physicians was reported in four studies 17 , 20 , 21 , 22 . Perfect concordance was reported in small samples of ultrasound-triaged breast lumps and radiation treatment plans 17 , 21 , but some discordance was also found between clinical decision support systems and the treatment options available locally 20 , 22 .

Concordance between IBM Watson for Oncology's treatment suggestions and physicians' clinical decisions varied from 12% to 96% across several cancers 20 . Discordant cases included those where a suggested treatment was too expensive, not available, considered too aggressive or inconvenient for the patient, or where locally available alternatives would have been preferred. Baidu Inc's ‘Brilliant Doctor’ clinical decision support system made generally good suggestions, but sometimes disagreed with physicians on their first choice of diagnosis and treatment 22 . Participating physicians reported that inadequate care recommendations were usually a result of the system's poor interoperability with other IT systems, its use of inaccurate information, and missing information on patients' income and insurance background 22 . The misalignment with the local clinical context was attributed to the training protocols used: both tools were trained on data from outside their applied contexts, and thus did not fully account for local disease incidence and the treatment options available 20 , 22 .
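Concordance figures such as the 12–96% range above are simple agreement rates over paired decisions. A minimal illustration with entirely made-up treatment choices:

```python
# Hypothetical paired treatment choices for 8 cases (illustrative only)
physician = ["chemo", "surgery", "chemo", "radio", "chemo", "surgery", "radio", "chemo"]
ai_system = ["chemo", "surgery", "radio", "radio", "chemo", "chemo", "radio", "chemo"]

# Concordance = fraction of cases where the AI and the physician agree
concordance = sum(p == a for p, a in zip(physician, ai_system)) / len(physician)
print(f"{concordance:.0%}")  # 6 of 8 cases agree -> 75%
```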

Effect on workflows and time to treatment and diagnosis

AI technologies improved workflows in a number of ways. Non-sonographer healthcare workers (HCWs) could be trained in AI-based ultrasound triage, thus reducing the workloads of formally trained sonographers 17 . Similarly, automated radiation treatment planning reduced the time spent by radiation oncologists in making treatment plans 21 . COVID-19 triage was also more time-efficient in an Ecuadorian hospital once an AI computed tomography (CT)-screening tool was implemented 18 . The ‘Brilliant Doctor’ clinical decision support system also had a partially positive impact in rural Chinese primary-care clinics by suggesting diagnostic alternatives to physicians, thus facilitating medical information search and potentially reducing the likelihood of medical errors 22 . Notably, however, higher workloads were reported in clinical settings with low capacity for adopting new AI tools 15 , 22 .

Shortened time to diagnosis or treatment was reported in two studies. Delft's ‘CAD4TB’ TB screening tool reduced time to the initiation of treatment compared to standard sputum screening tests in a Malawian primary-care clinic, while a social support chatbot improved response times for individuals seeking social support in online forums in China 19 , 24 .

User-friendliness and compatibility with existing infrastructure

User-friendliness and compatibility with existing infrastructure are essential in this context, as healthcare personnel and patients may be neither trained in nor accustomed to new technologies, while being short on time and resources and making potentially life-changing decisions under pressure. These aspects were noted in four of our included studies.

The ‘Brilliant Doctor’ clinical decision support system was found to require too much information from physicians, which was perceived as too time-consuming in a majority of cases 22 . A lack of integration with existing IT systems also meant that critical laboratory information was not factored into the AI's decision-making process 22 . Physicians in Peruvian TB clinics also reported problems with an app-based TB diagnostics tool utilising chest X-rays, including app crashes and mistranslations 15 . Poor internet connectivity inside the clinics and the overall limited availability of X-ray viewers impeded the uploading of X-ray images to the TB diagnostic tool by nurses 15 .

Fan et al. 23 reported that self-diagnosis chatbots were used mostly by younger patients. Although a majority of user feedback for the ‘Doctor Bot’ chatbot was positive, some chatbot users also perceived the provided information to be insufficient, overwhelming, or difficult to understand 23 .

Garzon-Chavez et al. 18 reported a rather successful incorporation of the AI-assisted chest CT triage tool into the hospital's COVID-19 triage process, which required cases identified as high COVID-19 risk arriving at the emergency room to first undergo CT-based screening. Later in the pandemic, once Reverse Transcription Polymerase Chain Reaction (RT-PCR) tests became more readily available, the AI-assisted chest CT scans remained the dominant form of triage due to their speed, despite lower accuracy.

Trust in AI systems

User-friendliness is linked to another critical point when introducing AI systems in healthcare: end-user trust in the technologies. Two of our included studies discussed user trust in AI technologies.

Physicians interviewed in Wang et al. 22 expressed distrust in clinical decision support systems, as the basis on which diagnostic or therapeutic decision-making occurred was not sufficiently transparent. Similarly, Fan et al. 23 reported that diagnoses produced by the AI self-diagnosis chatbot were perceived as inaccurate by some users.

Wang et al. 24 further pointed out problematic behaviour by their social support chatbot, whose identity was hidden from end-users on an online social support forum. In one case, in comforting a user who recently had a child, the AI mimicked a human response implying it had the same experience with its own baby. Given the chatbot’s identity was hidden, this raised questions about how AIs should be trained in order to avoid responding inappropriately to user posts 24 .

Cost-savings and improvements in health outcomes

Only MacPherson et al. 19 conducted a cost-effectiveness study of their AI tool. Compared to usual care, the ‘CAD4TB’ TB-screening tool improved patients' quality-adjusted life-years (QALYs) by reducing the average time to receive treatment. However, the cost was measured at $4,620.47 per QALY gained, which was deemed beyond the willingness-to-pay threshold in the Malawian context 19 . Wang et al. 24 found that an AI chatbot performed comparably with humans in promoting positive responses from online forum users seeking emotional support.
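A cost-per-QALY figure like the one above is an incremental cost-effectiveness ratio (ICER): the extra cost of the intervention divided by the QALYs it gains over usual care. A small sketch with purely illustrative numbers (not drawn from the MacPherson et al. study), including a hypothetical willingness-to-pay threshold:

```python
def icer(cost_new, cost_old, qaly_new, qaly_old):
    """Incremental cost-effectiveness ratio: extra cost per QALY gained."""
    return (cost_new - cost_old) / (qaly_new - qaly_old)

# Purely illustrative inputs, not taken from any included study
ratio = icer(cost_new=12_000, cost_old=8_000, qaly_new=105.0, qaly_old=104.0)
threshold = 500  # hypothetical willingness-to-pay per QALY for the setting
verdict = "cost-effective" if ratio <= threshold else "above willingness-to-pay"
print(ratio, verdict)  # 4000.0 per QALY gained, above the threshold
```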

Local adequacy of AI

Local adequacy of AI tools was a common theme in our studies, with three studies discussing challenges with applying AI tools to new lower-resource contexts.

Zhou et al. 20 suggested that the US-based training of IBM Watson for Oncology on US medical literature led to inappropriate treatment suggestions in the Chinese context. Ugarte-Gil et al. 15 reported unexpected complications with the implementation of their TB diagnostic tool: their implementation sites had less internet connectivity, X-ray viewer capacity and mobile technology proficiency among health care workers than expected, which reduced the tool's effectiveness. Wang et al. 22 reported that the AI clinical decision support tool had not accounted well for rural primary-care physician workflows in its design, and that it might have been more useful as a triage assistant than as a physician assistant.

The literature on AI applications for healthcare in LMICs has been growing steadily in recent years and is so far largely dominated by studies and reports from China and India 26 . Despite the substantial improvements in the technical capabilities of AI in different branches of medicine, such as ophthalmology and radiology 27 , 28 , many studies were not included in this review because they were proofs of concept and did not describe AI implementations in real-world, low-resource settings, limiting our understanding of the true performance and benefits of AI 29 . Such research is critical to understanding both the adaptation and the potential performance of AI tools in medical and other health-related fields in settings where this technology has so far not played a strong role 30 . However, we found that researchers are actively addressing this knowledge gap. We came across a rather large number of LMIC-based publications of research protocols related to planned or ongoing AI evaluations, as well as studies published since the time we performed our literature search that would have met our inclusion criteria. For instance, recent ophthalmology studies from Thailand and Rwanda have demonstrated the potential of AI-assisted diabetic retinopathy screening in LMICs while also flagging issues similar to those of our included studies, such as the challenge of integrating AI systems into existing workflows and infrastructure 31 , 32 . The private sector is also highly active in developing AI tools for healthcare, as our grey literature search revealed (see Table 5 ). None of the AI tools described in the grey literature provide concrete evidence that they improve health outcomes or reduce costs associated with healthcare, although one can assume that some tools, such as automated drone deliveries of medical supplies in rural Rwanda, are hugely beneficial 33 . Increased efforts to provide prospective evaluations of such tools would benefit the wider healthcare community by offering lessons on which AI tools can improve health outcomes and/or reduce costs in particular contexts, and on what may be required for these tools to be implemented successfully.

The performance of AI applications in healthcare settings varies greatly, as was also observed in previous reviews comparing AI applications in medical imaging to clinicians 34 , 35 . Similarly, studies included in this review found inconsistencies in diagnostic sensitivity and specificity between AI tools and physician assessments 17 , 18 , 20 , 22 . We were also unable to identify many studies performing prospective feasibility testing or trials of AI tools in real-world settings to test their performance 34 , 35 . The reported performance of AI tools tested on retrospective datasets should be treated with caution, as a tool's accuracy likely diminishes when applied to new data from different contexts 18 , 35 . Further studies of the performance of AI tools applied in healthcare settings are required that take into account data and concept drift 36 . Based on our review, existing evidence is also limited by inconsistent transparency with respect to AI implementation and performance. For instance, there seems to be no systematic approach to reporting the datasets used for AI training, testing and validation, the underlying training algorithms, or the key AI outputs that would allow a more direct comparison of AI performance and help identify potential causes of poor performance 34 .
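The sensitivity and specificity figures these comparisons rest on reduce to four counts in a confusion matrix. A minimal sketch with hypothetical labels (1 = disease present; the data are invented for illustration):

```python
def diagnostic_metrics(y_true, y_pred):
    """Sensitivity and specificity from binary labels (1 = disease present)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical reference-standard diagnoses and AI reads for 10 scans
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
ai    = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
sens, spec = diagnostic_metrics(truth, ai)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")  # 0.75 and 0.83
```

A tool reported on a retrospective dataset can see both numbers shift once the case mix changes, which is why prospective, in-context evaluation matters.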

The underlying dataset is a key element of training an AI tool and determines its performance. Data from the included research suggested that AI systems were trained on data collected outside of the implementation context 17 , 20 , 22 . However, AI models trained on high-income country data may introduce bias into AI outputs, leading to poor performance or, worse, wrong results, which is harmful both in a health context and to the wider establishment of AI in healthcare, because trust may be broken. Given that data is dynamic and may change its statistical features over time (data and concept drift), it is critical that AI models receive context-specific and updated data on a frequent basis; otherwise, their performance may worsen over time. This could lead to a downward spiral, as poor performance is likely to lead to poor acceptance among HCWs and a loss of trust in AI-based systems. While middle-income countries, like China and South Africa, have substantial collections of data pertaining to both the health system and health service delivery at the national and sub-national levels, the selection of training data is more limited in many low-income countries 11 . On the other hand, available context-specific datasets might be underused, untapped, or deemed too limited or inadequate, as the contained information is too asymmetric, asynchronous or varied in type, and too spread across locations, to facilitate reliable AI training 11 . There are no clear estimates of the amount of training data needed when designing an AI project. To better leverage small datasets in the context of LMICs, additional modelling techniques and simple classifiers should be considered, like the Naive Bayes algorithm, which allows a sufficiently strong learning process even on small datasets 37 . While public health institutions, donor-funded programmes and the business sector all generate large volumes of data, such data is often inaccessible to researchers and AI implementers 38 .
Data collection and storage are often fragmented, or intended only for very specific purposes, such as programme reporting, policy development, strategic planning and advocacy 39 . Furthermore, some LMICs still face challenges in the digitisation of routinely collected data, as well as limited digital literacy with respect to data collection and management 38 . Ongoing efforts to harmonise fragmented health information systems and foster accurate, reliable, timely and interoperable datasets will be crucial in advancing AI technologies 38 . Routine data collection platforms, such as OpenMRS or DHIS2, are well-established in low- and middle-income countries, and other initiatives, such as Health and Demographic Surveillance Systems 40 , provide enormous, standardised population health datasets spanning decades. Yet data ownership and data sharing rules can still pose barriers to accessing this data for research and commercial purposes. The Nairobi data sharing guidelines of 2014, as well as the Global Digital Health Index, are first steps toward addressing this issue. To develop datasets that may be used for AI, privacy regulations, data access and ownership agreements, and other essential challenges must be resolved. Public health agencies can play an important role in encouraging data sharing and providing public access to health data (both internal and private-sector generated), while also developing the governance mechanisms required to protect individual privacy.
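As a concrete illustration of the kind of simple classifier suggested above for small datasets, a Gaussian Naive Bayes can be fitted from scratch in a few lines. The dataset below is entirely invented and the features ([age, symptom score]) are hypothetical; this is a sketch of the technique, not a clinical model:

```python
import math

def fit_gnb(X, y):
    """Gaussian Naive Bayes: class priors plus per-feature mean/variance."""
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        feats = []
        for col in zip(*rows):
            m = sum(col) / len(col)
            var = max(sum((v - m) ** 2 for v in col) / len(col), 1e-6)
            feats.append((m, var))
        model[c] = (len(rows) / len(X), feats)
    return model

def predict_gnb(model, x):
    """Pick the class with the highest log-posterior under the
    independence assumption."""
    def log_post(c):
        prior, feats = model[c]
        s = math.log(prior)
        for v, (m, var) in zip(x, feats):
            s += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return s
    return max(model, key=log_post)

# Hypothetical tiny screening dataset: [age, symptom score] -> label 0/1
X = [[30, 1], [35, 2], [40, 1], [60, 8], [65, 7], [70, 9]]
y = [0, 0, 0, 1, 1, 1]
model = fit_gnb(X, y)
print(predict_gnb(model, [33, 2]), predict_gnb(model, [68, 8]))  # 0 1
```

Because each class needs only a mean and variance per feature, even six training examples yield a usable decision boundary, which is the property that makes such classifiers attractive where data is scarce.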

Usability and integration of digital health tools, including AI tools, remain a challenge in high- and low-resource settings alike. Coiera 41 and Cabitza et al. 42 identified some of the complex challenges of the “last mile of implementation” that cause statistically high-performing AI to translate poorly into real-world applications. Especially in low-resource settings, the effectiveness of AI tools depends on how well these technologies can be utilised by end-users and integrated within an existing infrastructure 43 . To perform well in a real-world setting, AI tools should complement existing organisational networks of people, processes and technologies 41 , 42 . Inadequately designed user interfaces can further limit the positive impact of an AI tool on clinical applicability, irrespective of diagnostic accuracy 42 . Complex or confusing user interfaces can frustrate end-users or limit successful application of the tool, negatively impacting the uptake of technologies by front-line health workers or patients in low-resource settings 15 , 22 , 44 . Successful introduction of novel digital tools in low-resource settings therefore needs to account for, and build, the basic capacity of HCWs to adopt technologically complex tools 44 . In some of the studies included in our review, AI integration was limited by incompatibility with existing electronic health record systems, which in turn limited performance, as decisions could not be fully supported by relevant health record data. Another barrier to successful AI implementation is the often unstable internet connectivity in some low-resource areas, since poor or intermittent internet access disrupts the use of cloud-based tools needed to upload key data elements, such as radiology images 44 .

Trust and acceptance among users are critical for AI in global health and healthcare in general. Trust in AI applications has been found to be stronger when a technology and its algorithms are understandable and assist users toward their goals 45 . A majority of reviewed studies still relied on a ‘black box’ approach, which leaves it unclear how the algorithms used eventually arrive at their results. Furthermore, only half of the studies provided a transparent description of their AI methodologies. Healthcare AI should be transparent about the source of data, qualify AI-based suggestions, and be explainable when used by clinicians and patients to make decisions 46 . Otherwise, it could undermine the foundations of trust and increase the likelihood that healthcare AI technology is rejected. Patient data security is also an essential aspect, particularly as cyberattacks grow more sophisticated 47 . The adoption of approaches and structures similar to those regulating the pharmaceutical industry and the production of medicines might therefore be a feasible path forward for AI in healthcare. Likewise, AI healthcare applications may need to go through a similar process of preclinical research, clinical research, regulatory evaluation and post-market safety monitoring. It is also necessary to investigate revisions of medical curricula that strengthen future HCWs' digital literacy and knowledge, which may increase trust in, and effective usage of, technologies such as AI-based systems. Currently, users often have not received sufficient training and feel overwhelmed; digital systems are therefore often regarded as additional burdens. Another approach that appears to build user trust, and hence potentially boost technology acceptance, is the slow introduction of innovations, which “allows for incremental social learning” 45 . In general, technology acceptance is a complex process 48 . Other factors, such as a thorough understanding of the users' benefits relative to other available technologies and pathways, undoubtedly play an essential part in lessening innovation resistance, and it seems beneficial to communicate proactively from the start of the development process 45 . Overall, trust is a complex and delicate component and should be a key priority, particularly at the start of the wider implementation of AI-based healthcare applications; otherwise users, both patients and health care workers, may reject the technologies and impede further progress.

Affordability is an important characteristic of AI tools in a LMIC context. Even if the technologies are efficacious, this benefit cannot be realised if they are more expensive than the legacy approaches with which HCWs are familiar. Our review and the wider literature suggest there is a dearth of evidence on the improvements in health outcomes and cost-savings associated with the implementation of AI tools in any context 49 , and of eHealth tools more generally 50 . We hypothesise that this finding reflects the maturity of AI healthcare research, since cost-effectiveness analyses necessarily occur later in the AI tool development and implementation timeline. To evaluate whether AI tools are affordable in LMICs, more cost-effectiveness analysis studies are needed.
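The headline quantity such cost-effectiveness analyses report is the incremental cost-effectiveness ratio (ICER): the extra cost per extra unit of health effect of the new tool over the comparator. A minimal sketch, with purely illustrative numbers not drawn from any reviewed study:

```python
def icer(cost_new, cost_old, effect_new, effect_old):
    """Incremental cost-effectiveness ratio: additional cost per
    additional unit of health effect (e.g. per DALY averted)."""
    return (cost_new - cost_old) / (effect_new - effect_old)

# Hypothetical figures: an AI screening tool costing $12 per patient
# vs. $8 for standard care, averting 0.010 vs. 0.006 DALYs per
# patient screened.
print(icer(12.0, 8.0, 0.010, 0.006))  # ≈ 1000 dollars per DALY averted
```

A tool is then commonly judged affordable when this ratio falls below a locally chosen willingness-to-pay threshold, which is precisely the comparison the review argues is still missing for AI in LMICs.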

A number of local challenges were reported in the studies included in this review as well as in the wider literature. AI-based systems were not sufficiently integrated into existing workflows and infrastructure; healthcare centres in LMICs are subject to system outages caused by power or internet connectivity disruptions 15 , 32 , 51 ; and, as a result of donor-funded agendas in LMICs, there is intermittent advancement that is susceptible to trends or “fashions” 38 , further eroding faith in these systems due to their lack of utility and continuity. Additionally, there seems to be a concern among HCWs in LMICs that AI may eventually take over their jobs, impeding its further adoption 52 .

AI applications in healthcare require a holistic systems approach to implementation. Consideration of the multiple interacting facilitators and barriers to AI implementation in real-world settings is required, in addition to the technical performance of the AI system in addressing a specific health problem, in order to have maximal impact on human health. Future implementations may also want to consider ‘effective coverage’, that is, the need, use and quality of a health intervention, as a performance metric 53 . Further studies are required to address contextual challenges, such as trust, HCW job insecurity, data insecurity and sustainability, in order to inform future AI implementations in healthcare in LMICs.
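As a rough illustration of the ‘effective coverage’ idea, the metric can be read as the share of people needing an intervention who actually receive it at adequate quality. The function and figures below are our own simplified assumptions, not taken from Ng et al.:

```python
def effective_coverage(in_need, users, quality):
    """Crude effective-coverage estimate: the fraction of those in
    need who use the service, discounted by the quality (0..1) of
    what they receive."""
    return (users / in_need) * quality

# Hypothetical district: 1000 people need screening, 600 receive it,
# and the service delivers only half of its potential benefit.
print(effective_coverage(in_need=1000, users=600, quality=0.5))  # → 0.3
```

Under this reading, a technically accurate AI tool that few people can reach, or that is used poorly, still scores low, which is the point of preferring effective coverage over raw diagnostic accuracy as a performance metric.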

Although we attempted to perform a broad search for studies of AI deployed in healthcare in LMICs, we may have missed important papers that would have met our inclusion criteria. We mitigated this risk by also performing a Google Scholar search with broad search terms, exploring grey literature extensively, and looking at papers cited in multiple reviews of AI in healthcare and at research presented at various AI and healthcare conferences. Only articles published in English were included. This is a limitation of the review, since China has a highly active AI research field; however, we were able to include research articles produced in China and published in English. Articles also had to have been peer-reviewed, which notably excluded a small number of recently published manuscripts on https://arxiv.org/ . We concentrated exclusively on completed studies, which may have substantially reduced the number of papers, leaving out ongoing research activity that may have been communicated via other channels. The field of AI research is rapidly evolving; our review therefore also excludes relevant new research published between the time of our database search and the publication of this paper.

This systematic review has identified ten articles describing a wide variety of AI technologies implemented in varying healthcare settings across seven LMICs. AI has demonstrated potential in triage, diagnostics and treatment planning. However, many challenges and barriers to successful implementation exist. Greater transparency and availability of the algorithms and datasets used to train AIs could allow for a greater understanding of why particular tools perform well or poorly. Further studies of AI use-cases in healthcare settings are required along a number of avenues, including: prospective studies that demonstrate the real-world reliability and cost-effectiveness of AI tools; analyses of end-user perspectives on AI usability and trust in AI technologies; and studies of how to effectively integrate AI systems into existing healthcare infrastructure.

To identify and map all relevant AI studies in LMICs that addressed our research questions, we considered a systematic scoping review as the most suitable methodology for our evidence review 54 . We followed five iterative stages as described by Arksey and O’Malley and systematically reviewed identified literature in line with published scoping review guidelines 55 , 56 , 57 . We report our findings in accordance with the Preferred Reporting Items for System Reviews and Meta-Analyses Extension for Scoping Reviews (PRISMA-ScR) 58 .

Databases searched

Our literature search included five electronic databases: Scopus, EMBASE, MEDLINE, Global Health and APA PsycInfo. A search strategy for each database was developed to identify relevant studies (see Supplementary Table 1 for search terms used). We further expanded our search to include grey literature via Google Scholar 59 . We also conducted a handsearch of journals and conference papers discussing AI applications in global health.

Overall, we included only peer-reviewed literature. Since AI in healthcare is a rapidly evolving field, numerous publications were available ahead of print. In these instances, we only included pre-prints that had already undergone at least initial peer review. We also reviewed papers presented at AI conferences, as it is common in the field for publications to be made available at key conferences, which also peer-review submissions.

Search criteria

We applied a variety of search terms consisting of concepts related to AI, healthcare, and LMICs to identify a broad range of peer-reviewed, original records on AI, health and healthcare in LMICs. Our literature search included records published between 1st January 2009 and 30th September 2021. We limited our search to literature published after 2009, as this year marks the point at which graphics processing units (GPUs) were repurposed for AI applications, thus providing a substantial boost in the speed at which AI models could be trained and implemented 60 . LMICs were defined based on the World Bank Group Classification of Economies as of January 2021 61 . We only included records describing original studies. Records without full text and articles such as commentaries, letters, policy briefs and study protocols were excluded. Our search further included records that described a quantitative and/or qualitative evaluation of an implemented AI application related to healthcare. Hence, studies merely describing theoretical AI approaches, such as machine learning methods in a non-specific or non-LMIC context without defining a real-world application of AI in a LMIC health context, were not considered.
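These eligibility rules, i.e. the 2009–2021 publication window combined with the World Bank LMIC classification, can be sketched as a simple filter. The record structure and the country subset below are hypothetical illustrations, not the actual search output:

```python
from datetime import date

# Hypothetical bibliographic records; field names are illustrative.
records = [
    {"title": "AI triage tool in Peru", "published": date(2020, 5, 1), "country": "Peru"},
    {"title": "Deep learning theory paper", "published": date(2008, 3, 1), "country": "USA"},
    {"title": "Chatbot study in China", "published": date(2021, 8, 15), "country": "China"},
]

# Small illustrative subset of the World Bank LMIC classification (January 2021).
LMICS = {"Peru", "China", "Malawi", "India", "Rwanda"}

# The review's publication window.
START, END = date(2009, 1, 1), date(2021, 9, 30)

def is_eligible(record):
    """Apply the date window and the LMIC-setting criterion."""
    return START <= record["published"] <= END and record["country"] in LMICS

eligible = [r["title"] for r in records if is_eligible(r)]
print(eligible)  # → ['AI triage tool in Peru', 'Chatbot study in China']
```

In practice the full criteria also require peer review, an original study design, and an evaluated real-world AI application, which cannot be decided mechanically and were screened by hand.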

Study screening and selection

Records identified by the above database searches were entered into the Covidence Systematic Review Software for title and abstract review 62 . Inclusion and exclusion criteria were identified following the PICOS (population, intervention, comparison, outcome, study design) framework (see Table 6 for details) 63 . Three reviewers (T.C.H., R.S. and M.A.) screened titles and abstracts independently to select those articles fully meeting the inclusion criteria related to the application of AI in healthcare in an LMIC. Discrepancies in reviewer ratings were discussed and decided within the entire research team (T.C.H., R.S., M.A., St.B. and S.B.). Once relevant articles had been identified, the reviewers (T.C.H., R.S. and M.A.) screened all full texts to exclude those articles that did not meet the inclusion criteria on full-text review.
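The decision rule above, unanimous reviewer agreement or referral to the full team, can be sketched as follows. This is a minimal illustration of the protocol, not the Covidence implementation:

```python
from collections import Counter

def screen(votes):
    """Merge independent reviewer decisions for one record.

    votes: a list of 'include' / 'exclude' decisions, one per reviewer.
    Unanimous decisions pass through; any disagreement is flagged for
    discussion by the full research team.
    """
    tally = Counter(votes)
    if tally["include"] == len(votes):
        return "include"
    if tally["exclude"] == len(votes):
        return "exclude"
    return "discuss"

print(screen(["include", "include", "include"]))  # → include
print(screen(["include", "exclude", "include"]))  # → discuss
```

Requiring unanimity before inclusion and escalating every disagreement is a conservative choice that trades screening speed for lower risk of wrongly excluding a relevant study.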

Data extraction and synthesis

We used a data extraction form to chart characteristics and map key findings from the final set of articles (see Supplementary Fig. 1 ). Key AI characteristics included aspects such as the application field and context, dataset sources and algorithms used. Additionally, we mapped the specific use of each AI application as an assistant for either patients, health workers, or physicians 64 . We extracted descriptive and methodological characteristics of each reviewed study. Content mapping focused on extracting and comparing pertinent outcomes and reported lessons learned.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

All data generated and analysed during this study are included in the article and its supplementary information files.

Atkinson, R. D. & Castro, D. Digital Quality of Life: Understanding the Personal and Social Benefits of the Information Technology Revolution . https://papers.ssrn.com/abstract=1278185 (2008).

Murdoch, T. B. & Detsky, A. S. The inevitable application of big data to health care. JAMA 309 , 1351–1352 (2013).

Topol, E. The Creative Destruction Of Medicine: How The Digital Revolution Will Create Better Health Care (Basic Books, 2012).

Ceruzzi, P. E. Computing: A Concise History . (MIT Press, 2012).

Wang, P. On defining artificial intelligence. J. Artif. Gen. Intell. 10 , 1–37 (2019).

Russell, S. & Norvig, P. Artificial Intelligence: A Modern Approach (Prentice Hall, 2002).

Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Healthc. J. 6 , 94–98 (2019).

Liang, H. et al. Evaluation and accurate diagnoses of pediatric diseases using artificial intelligence. Nat. Med. 25 , 433–438 (2019).

Steinhubl, S. R., Muse, E. D. & Topol, E. J. The emerging field of mobile health. Sci. Transl. Med. 7 , 283rv3 (2015).

Goldhahn, J., Rampton, V. & Spinas, G. A. Could artificial intelligence make doctors obsolete? BMJ 363 , k4563 (2018).

Reddy, C. L., Mitra, S., Meara, J. G., Atun, R. & Afshar, S. Artificial Intelligence and its role in surgical care in low-income and middle-income countries. Lancet Digit. Health 1 , e384–e386 (2019).

Frenk, J. et al. Health professionals for a new century: transforming education to strengthen health systems in an interdependent world. Lancet 376 , 1923–1958 (2010).

Oren, O., Gersh, B. J. & Bhatt, D. L. Artificial intelligence in medical imaging: switching from radiographic pathological data to clinically meaningful endpoints. Lancet Digit. Health 2 , e486–e488 (2020).

Lee, J. et al. Interventions to improve district-level routine health data in low-income and middle-income countries: a systematic review. BMJ Glob. Health 6 , e004223 (2021).

Ugarte-Gil, C. et al. Implementing a socio-technical system for computer-aided tuberculosis diagnosis in Peru: A field trial among health professionals in resource-constraint settings. Health Inform. J. 26 , 2762–2775 (2020).

Ganju, A., Satyan, S., Tanna, V. & Menezes, S. R. AI for improving children’s health: a community case study. Front. Artif. Intell. 3 , 544972 (2021).

Love, S. M. et al. Palpable breast lump triage by minimally trained operators in mexico using computer-assisted diagnosis and low-cost ultrasound. J. Glob. Oncol . https://doi.org/10.1200/JGO.17.00222 (2018).

Garzon-Chavez, D. et al. Adapting for the COVID-19 pandemic in Ecuador, a characterization of hospital strategies and patients. PLoS ONE 16 , e0251295 (2021).

MacPherson, P. et al. Computer-aided X-ray screening for tuberculosis and HIV testing among adults with cough in Malawi (the PROSPECT study): A randomised trial and cost-effectiveness analysis. PLoS Med. 18 , e1003752 (2021).

Zhou, N. et al. Concordance study between IBM watson for oncology and clinical practice for patients with cancer in China. Oncologist 24 , 812–819 (2019).

Kisling, K. et al. Fully automatic treatment planning for external-beam radiation therapy of locally advanced cervical cancer: a tool for low-resource clinics. J. Glob. Oncol . https://doi.org/10.1200/JGO.18.00107 (2019).

Wang, D. et al. “Brilliant AI Doctor” in rural clinics: challenges in AI-powered clinical decision support system deployment. in Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems 1–18 (ACM, 2021).

Fan, X. et al. Utilization of self-diagnosis health chatbots in real-world settings: case study. J. Med. Internet Res. 23 , e19928 (2021).

Wang, L. et al. CASS: towards building a social-support chatbot for online health community. Proc. ACM Hum.-Comput. Interact . 5 , 1-31 (2021).

Bumrungrad International Hospital. IBM Watson for Oncology Demo . https://www.youtube.com/watch?v=338CIHlVi7A (2015).

Guo, Y., Hao, Z., Zhao, S., Gong, J. & Yang, F. Artificial intelligence in health care: bibliometric analysis. J. Med. Internet Res. 22 , e18228 (2020).

Lu, W. et al. Applications of artificial intelligence in ophthalmology: general overview. J. Ophthalmol . 2018 , 5278196 (2018).

Amisha, Malik, P., Pathania, M. & Rathaur, V. K. Overview of artificial intelligence in medicine. J. Fam. Med. Prim. Care 8 , 2328–2331 (2019).

Roberts, M. et al. Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans. Nat. Mach. Intell. 3 , 199–217 (2021).

Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 , 195 (2019).

Mathenge, W. et al. Impact of artificial intelligence assessment of diabetic retinopathy on referral service uptake in a low resource setting: The RAIDERS randomized trial. Ophthalmol. Sci . 2 , 100168 (2022).

Ruamviboonsuk, P. et al. Real-time diabetic retinopathy screening by deep learning in a multisite national screening programme: a prospective interventional cohort study. Lancet Digit. Health 4 , e235–e244 (2022).

Mhlanga, M., Cimini, T., Amaechi, M., Nwaogwugwu, C. & McGahan, A. From A to O-Positive: Blood Delivery Via Drones in Rwanda. Reach Alliance https://reachalliance.org/wp-content/uploads/2021/03/Zipline-Rwanda-Final-April19.pdf (2021).

Liu, X. et al. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. Lancet Digit. Health 1 , e271–e297 (2019).

Nagendran, M. et al. Artificial intelligence versus clinicians: systematic review of design, reporting standards, and claims of deep learning studies. BMJ 368 , m689 (2020).

Žliobaitė, I., Pechenizkiy, M. & Gama, J. An overview of concept drift applications. in Big Data Analysis: New Algorithms for a New Society (eds. Japkowicz, N. & Stefanowski, J.) 91–114 (Springer International Publishing, 2016).

5 Ways to Deal with the Lack of Data in Machine Learning. KDnuggets . https://www.kdnuggets.com/5-ways-to-deal-with-the-lack-of-data-in-machine-learning.html/ .

GIZ. From Strategy To Implementation - On The Pathways Of The Youngest Countries In Sub-saharan Africa Towards Digital Transformation Of Health Systems . https://www.governinghealthfutures2030.org/pdf/resources/FromStrategyToImplementation-GIZReport.pdf (2021).

Nutley, T. & Reynolds, H. Improving the use of health data for health system strengthening. Glob. Health Action 6 , 20001 (2013).

Ye, Y., Wamukoya, M., Ezeh, A., Emina, J. B. O. & Sankoh, O. Health and demographic surveillance systems: a step towards full civil registration and vital statistics system in sub-Sahara Africa? BMC Public Health 12 , 741 (2012).

Coiera, E. The last mile: where artificial intelligence meets reality. J. Med. Internet Res. 21 , e16323 (2019).

Cabitza, F., Campagner, A. & Balsano, C. Bridging the “last mile” gap between AI implementation and operation: “data awareness” that matters. Ann. Transl. Med. 8 , 501 (2020).

Asan, O. & Choudhury, A. Research trends in artificial intelligence applications in human factors health care: mapping review. JMIR Hum. Factors 8 , e28236 (2021).

Wallis, L., Blessing, P., Dalwai, M. & Shin, S. D. Integrating mHealth at point of care in low- and middle-income settings: the system perspective. Glob. Health Action 10 , 1327686 (2017).

Hengstler, M., Enkel, E. & Duelli, S. Applied artificial intelligence and trust—the case of autonomous vehicles and medical assistance devices. Technol. Forecast. Soc. Change 105 , 105–120 (2016).

Nundy, S., Montgomery, T. & Wachter, R. M. Promoting trust between patients and physicians in the era of artificial intelligence. JAMA 322 , 497–498 (2019).

Gafni, R. & Pavel, T. Cyberattacks against the health-care sectors during the COVID-19 pandemic. Inf. Comput. Secur. 30 , 137–150 (2021).

Venkatesh, V. & Bala, H. Technology acceptance model 3 and a research agenda on interventions. Decis. Sci. 39 , 273–315 (2008).

Wolff, J., Pauling, J., Keck, A. & Baumbach, J. The economic impact of artificial intelligence in health care: systematic review. J. Med. Internet Res. 22 , e16866 (2020).

Sanyal, C., Stolee, P., Juzwishin, D. & Husereau, D. Economic evaluations of eHealth technologies: a systematic review. PLoS ONE 13 , e0198112 (2018).

Chawla, S. et al. Electricity and generator availability in LMIC hospitals: improving access to safe surgery. J. Surg. Res. 223 , 136–141 (2018).

Antwi, W. K., Akudjedu, T. N. & Botwe, B. O. Artificial intelligence in medical imaging practice in Africa: a qualitative content analysis study of radiographers’ perspectives. Insights Imaging 12 , 80 (2021).

Ng, M. et al. Effective coverage: a metric for monitoring universal health coverage. PLoS Med. 11 , e1001730 (2014).

Munn, Z. et al. Systematic review or scoping review? Guidance for authors when choosing between a systematic or scoping review approach. BMC Med. Res. Methodol. 18 , 143 (2018).

Arksey, H. & O’Malley, L. Scoping studies: towards a methodological framework. Int. J. Soc. Res. Methodol. 8 , 19–32 (2005).

Peters, M. D. J. et al. Guidance for conducting systematic scoping reviews. JBI Evid. Implement. 13 , 141–146 (2015).

Muka, T. et al. A 24-step guide on how to design, conduct, and successfully publish a systematic review and meta-analysis in medical research. Eur. J. Epidemiol. 35 , 49–60 (2020).

Tricco, A. C. et al. PRISMA extension for scoping reviews (PRISMA-ScR): checklist and explanation. Ann. Intern. Med. 169 , 467–473 (2018).

Haddaway, N. R., Collins, A. M., Coughlin, D. & Kirk, S. The role of google scholar in evidence reviews and its applicability to grey literature searching. PLoS ONE 10 , e0138237 (2015).

Raina, R., Madhavan, A. & Ng, A. Y. Large-scale deep unsupervised learning using graphics processors. In Proceedings of the 26th Annual International Conference on Machine Learning 873–880 (Association for Computing Machinery, 2009).

World Bank Country and Lending Groups – World Bank Data Help Desk. https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups .

Harrison, H., Griffin, S. J., Kuhn, I. & Usher-Smith, J. A. Software tools to support title and abstract screening for systematic reviews in healthcare: an evaluation. BMC Med. Res. Methodol. 20 , 7 (2020).

Methley, A. M., Campbell, S., Chew-Graham, C., McNally, R. & Cheraghi-Sohi, S. PICO, PICOS and SPIDER: a comparison study of specificity and sensitivity in three search tools for qualitative systematic reviews. BMC Health Serv. Res . 14 , 579 (2014).

Artificial Intelligence in Global Health: Defining a Collective Path Forward . https://www.usaid.gov/sites/default/files/documents/1864/AI-in-Global-Health_webFinal_508.pdf (2019).

Bereskin, Caulder, P. L.-I., Kovarik, R. & Cowan, C. AI in focus: BlueDot and the response to COVID-19. Lexology https://www.lexology.com/library/detail.aspx?g=a94f63b4-2829-4f62-97f7-43f2aecd12a6 (2020).

ASEAN BioDiaspora Virtual Center. COVID-19 Situational Report in the ASEAN Region . 16 https://asean.org/wp-content/uploads/2021/10/COVID-19_Situational-Report_ASEAN-BioDiaspora-Regional-Virtual-Center_11Oct2021-1.pdf (2021).

Smart delivery robot-Pudu robotics. Smart delivery robot-Pudu robotics https://www.pudutech.com/ .

Simonite, T. Chinese Hospitals Deploy AI to Help Diagnose Covid-19. Wired .

Li, K. et al. Assessing the predictive accuracy of lung cancer, metastases, and benign lesions using an artificial intelligence-driven computer aided diagnosis system. Quant. Imaging Med. Surg. 11 , 3629–3642 (2021).

Weinstein, E. China’s Use of AI in its COVID-19 Response . https://cset.georgetown.edu/publication/chinas-use-of-ai-in-its-covid-19-response/ (2020).

Liu, X. et al. A 2-year investigation of the impact of the computed tomography–derived fractional flow reserve calculated using a deep learning algorithm on routine decision-making for coronary artery disease management. Eur. Radiol. 31 , 7039–7046 (2021).

Ruijin Hospital: Develop AI-powered chronic disease management products with 4Paradigm. 4Paradigm . https://en.4paradigm.com/content/details_262_1198.html .

Han, M. Langfang’s epidemic prevention and control strategy, robots are online on duty. Beijing Daily https://ie.bjd.com.cn/5b165687a010550e5ddc0e6a/contentApp/5b1a1310e4b03aa54d764015/AP5e4aae66e4b0c4aab142c4d8?isshare=1&app=8ED108F8-A848-43A8-B32F-83FD7330B638&from=timeline (2020).

Research on the Application of Intelligent Triage Innovation Technology in Southwest Medical University Hospital. Futong . http://www.futong.com.cn/intell-medical-case2.html (2020).

iFLYTEK Corporate Social Responsibility Report . https://www.iflytek.com/en/usr/uploads/2020/09/csr.pdf (2020).

Across China: Drones for blood deliveries take off in China - Xinhua | English.news.cn. http://www.xinhuanet.com/english/2021-03/27/c_139839745.htm (2021).

Truog, S., Lawrence, E., Defawe, O., Ramirez Rufino, S. & Perez Richiez, O. Medical Cargo Drones in Rural Dominican Republic . https://publications.iadb.org/publications/english/document/Medical-Cargo-Drones-in-Rural-Dominican-Republic.pdf (2020).

Knoblauch, A. M. et al. Bi-directional drones to strengthen healthcare provision: experiences and lessons from Madagascar, Malawi and Senegal. BMJ Glob. Health 4 , e001541 (2019).

He, J. et al. Artificial intelligence-based screening for diabetic retinopathy at community hospital. Eye 34 , 572–576 (2020).

Rajalakshmi, R., Subashini, R., Anjana, R. M. & Mohan, V. Automated diabetic retinopathy detection in smartphone-based fundus photography using artificial intelligence. Eye 32 , 1138–1144 (2018).

Mollura, D. J. et al. Artificial intelligence in low- and middle-income countries: innovating global health radiology. Radiology 297 , 513–520 (2020).

Partnerships. Alexapath . http://alexapath.com/Company/Partnership (2020).

Nakasi, R., Tusubira, J. F., Zawedde, A., Mansourian, A. & Mwebaze, E. A web-based intelligence platform for diagnosis of malaria in thick blood smear images: a case for a developing country. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 4238–4244 (IEEE, 2020).

Morales, H. M. P., Guedes, M., Silva, J. S. & Massuda, A. COVID-19 in Brazil—preliminary analysis of response supported by artificial intelligence in municipalities. Front. Digit. Health 3 , 52 (2021).

Chinas AI doctor-bot help each doctor treat 600-700 patients daily. China Experience . https://www.china-experience.com/china-experience-insights/chinas-ai-doctor-bot-help-each-doctor-treat-600-700-patients-daily (2020).

Sapio Analytics launches ‘empathetic’ healthcare chatbot. MobiHealthNews https://www.mobihealthnews.com/news/asia/sapio-analytics-launches-empathetic-healthcare-chatbot (2021).

Index Labs TZ Company Limited. eShangazi is one-year-old! Medium https://medium.com/@indexlabstz/eshangazi-is-one-year-old-46b2b93978a4 (2018).

Patient Retention Solution. BroadReach Healthcare https://broadreachcorporation.com/patient-retention-solution/ (2020).

Sophisticated nudging in HIV: combining predictive analytics and behavioural insights. Indlela https://indlela.org/sophisticated-nudging-in-hiv-combining-predictive-analytics-and-behavioural-insights/ (2021).

Digital health: 5 innovative projects. Terre des hommes . https://www.tdh.ch/en/news/digital-health-5-innovative-projects (2021).

Ubenwa - 2019 In Review. https://www.ubenwa.ai/ubenwa-2019-highlight.html (2020).

Acknowledgements


We thank Yash Kulkarni (Imperial College London) and Nicholas Robinson (Imperial College London) for their assistance in our initial database search. We also thank Dr. Laurence Court (MD Anderson Cancer Centre) for having insightful discussions with us about automated radiation treatment planning and their research. The authors wish to thank the German Research Foundation (Deutsche Forschungsgemeinschaft) for supporting this work as part of a Deutsche Forschungsgemeinschaft–funded research unit (Forschungsgruppe FOR 2936/project: 409670289). For the publication fee we acknowledge financial support by Deutsche Forschungsgemeinschaft within the funding programme “Open Access Publikationskosten“ as well as by Heidelberg University.

Author information

Authors and affiliations

Heidelberg Institute of Global Health (HIGH), Faculty of Medicine and University Hospital, Heidelberg University, Heidelberg, Germany

Tadeusz Ciecierski-Holmes, Miriam Axt, Stephan Brenner & Sandra Barteit

University of Cambridge, School of Clinical Medicine, Addenbrooke’s Hospital, Cambridge, CB2 0SP, UK

Tadeusz Ciecierski-Holmes

Imperial College London, Faculty of Medicine, Sir Alexander Fleming Building, London, SW7 2DD, UK

Ritvij Singh

Contributions

The authors confirm contribution to the paper as follows: conceptualisation: T.C.H., R.S., St.B. and S.B.; data curation: T.C.H. and R.S.; formal analysis: T.C.H., St.B. and S.B.; funding acquisition: N.A.; investigation: T.C.H., R.S., M.A., St.B. and S.B.; methodology: T.C.H., R.S., St.B. and S.B.; project administration: T.C.H.; resources: T.C.H. and R.S.; software: not applicable; supervision: T.C.H., St.B. and S.B.; validation: T.C.H., R.S. and M.A.; visualisation: T.C.H. and St.B.; writing – original draft: T.C.H., R.S., M.A. and S.B.; and writing – review & editing: T.C.H., R.S., M.A., St.B. and S.B. All authors reviewed the results and approved the final version of the manuscript.

Corresponding author

Correspondence to Tadeusz Ciecierski-Holmes .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary material

NPJ Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .

Reprints and permissions

About this article

Cite this article.

Ciecierski-Holmes, T., Singh, R., Axt, M. et al. Artificial intelligence for strengthening healthcare systems in low- and middle-income countries: a systematic scoping review. npj Digit. Med. 5 , 162 (2022). https://doi.org/10.1038/s41746-022-00700-y

Received : 22 April 2022

Accepted : 29 September 2022

Published : 28 October 2022

DOI : https://doi.org/10.1038/s41746-022-00700-y


J Korean Med Sci. 2020 Nov 2; 35(42)

Artificial Intelligence in Health Care: Current Applications and Issues

Chan-woo Park

1 Department of Orthopedic Surgery, Samsung Medical Center, School of Medicine, Sungkyunkwan University, Seoul, Korea.

Sung Wook Seo

2 Division of Pulmonary and Critical Care Medicine, Department of Medicine, Samsung Medical Center, School of Medicine, Sungkyunkwan University, Seoul, Korea.

BeomSeok Ko

3 Department of Surgery, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea.

Byung Wook Choi

4 Department of Radiology, Severance Hospital, Yonsei University College of Medicine, Seoul, Korea.

Chang Min Park

5 Department of Radiology, Seoul National University College of Medicine, Seoul, Korea.

Dong Kyung Chang

6 Division of Gastroenterology, Department of Medicine, Samsung Medical Center, School of Medicine, Sungkyunkwan University, Seoul, Korea.

Hwiuoung Kim

Hyunchul Kim

7 Department of R&D Planning, Korea Health Industry Development Institute (KHIDI), Cheongju, Korea.

8 Health Innovation Big Data Center, Asan Institute for Life Science, Asan Medical Center, Seoul, Korea.

Jinhee Jang

9 Department of Radiology, Seoul St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea.

Jong Chul Ye

10 Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea.

Jong Hong Jeon

11 Protocol Engineering Center, Electronics and Telecommunications Research Institute (ETRI), Daejeon, Korea.

Joon Beom Seo

12 Department of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea.

Kwang Joon Kim

13 Division of Geriatrics, Department of Internal Medicine, Severance Hospital, Yonsei University College of Medicine, Seoul, Korea.

Kyu-Hwan Jung

14 VUNO Inc., Seoul, Korea.

15 Department of Convergence Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea.

Seungwook Paek

16 Lunit Inc., Seoul, Korea.

Soo-Yong Shin

17 Big Data Research Center, Samsung Medical Center, School of Medicine, Sungkyunkwan University, Seoul, Korea.

Soyoung Yoo

Yoon Sup Choi

18 Digital Healthcare Partners, Seoul, Korea.

Youngjun Kim

19 Center for Bionics, Korea Institute of Science and Technology (KIST), Seoul, Korea.

Hyung-Jin Yoon

20 Department of Biomedical Engineering, Seoul National University College of Medicine, Seoul, Korea.

In recent years, artificial intelligence (AI) technologies have advanced greatly and become a reality in many areas of our daily lives. In the health care field, numerous efforts are being made to implement AI technology in practical medical treatment. With the rapid developments in machine learning algorithms and improvements in hardware performance, AI is expected to play an important role in effectively analyzing and utilizing extensive amounts of health and medical data. However, AI technology has various unique characteristics that differ from existing health care technologies, so a number of areas within the current health care system need to be supplemented for AI to be utilized more effectively and frequently. In addition, acceptance of AI in health care among medical practitioners and the public is still low, and there are various concerns regarding the safety and reliability of AI technology implementations. Therefore, this paper aims to introduce the current research and application status of AI technology in health care and discuss the issues that need to be resolved.

Graphical Abstract

[Graphical abstract figure]


Continuous developments in artificial intelligence (AI) technologies are expected to bring innovations to the future of health care. Machine learning, a subfield of AI, is the study of computer algorithms that improve automatically through experience by applying mathematical approaches. 1 Deep learning, a subset of machine learning, refers to algorithms that learn by processing input data through artificial neural networks that mimic neurons in the biological brain. 2 The explosive growth of digital data, the expansion of computing power fueled by innovations in hardware technologies such as the graphics processing unit, and rapid developments in machine learning algorithms, popularly implemented using deep learning, are all leaving a significant mark on the health care field. Accordingly, numerous medical journals have already published a vast number of studies analyzing massive amounts of health data using machine learning to diagnose and treat patients. 3 , 4 Further, various studies have shown that applying AI in health care gives better results than existing technologies. Examples include analyzing medical images with AI to discriminate findings and use them for treatment; predicting the course of a disease from various medical and health care data; developing medical devices that support decision-making during treatment or diagnosis; and encrypting medical data. 5 , 6 , 7 , 8 , 9 , 10 , 11
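As a toy illustration of the "artificial neuron" idea described above (not any specific medical model), a single unit computes a weighted sum of its inputs and squashes it through a nonlinear activation; deep networks stack many such units into layers. All values below are invented:

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of the inputs plus a bias,
    passed through a sigmoid activation into the range (0, 1)."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-z))

# Two inputs with illustrative weights; the output can be read as a score.
score = neuron([0.5, 0.8], [1.2, -0.4], 0.1)
print(round(score, 3))
```

Training consists of nudging the weights and biases of many such units so the network's outputs match labeled examples.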

Further, various attempts have been made to develop and commercialize AI-based medical devices. In addition to the top medical device manufacturers, such as General Electric, Siemens, and Philips, leading global information technology (IT) companies, such as Samsung, Google, Apple, Microsoft, and Amazon, and numerous competitive startups have demonstrated significant research achievements in the use of AI in health care, and they are now working to turn those research achievements into business achievements as well. These efforts by industry and the medical field are also contributing to the successful approval of AI-based medical devices by regulatory agencies. In the U.S., the Food and Drug Administration (FDA) approved an AI-based medical device for the first time in 2017, and in Korea, the Ministry of Food and Drug Safety has approved AI-based medical devices since 2018.

However, there are still many concerns regarding AI-based medical technology because AI-based health care technologies differ substantially from traditional ones; thus, the number of actual clinical implementations of AI is still limited. 12 , 13 To introduce AI technology into actual clinical settings and to deliver meaningful outcomes to the people involved in health care, including doctors and patients, various challenges need to be addressed. Therefore, this study discusses the status of current domestic and foreign developments in AI technology in health care and examines the issues that need to be resolved to implement AI in health care.


Unlike typical technologies that deal with physical realms, AI technology is breaking new ground, as it has implications in more psychological realms such as the experience, intelligence, and judgment of experts. In particular, since the drastic improvement in the performance of machine learning algorithms for pattern recognition brought by deep learning technology, the ability of AI to analyze data patterns has become similar to average human ability for specific tasks (e.g., image recognition and speech recognition). 14 Because deep learning algorithms are based on artificial neural networks resembling the network of neurons in the human brain and can learn very complex non-linear relationships, they are being actively used in tasks dealing with medical data. 15 Accordingly, a number of studies on the use of AI-based technologies in health care are currently being conducted ( Table 1 ).

[Table 1: studies on AI-based technologies in health care. IT = information technology.]

Medical image analysis

In addition to radiology, the use of machine learning algorithms in medical image analysis has expanded widely to most medical departments that use images, such as pathology, dermatology, cardiology, gastroenterology, and ophthalmology. In detail, machine learning algorithms use computed tomography (CT), magnetic resonance imaging, ultrasound, pathology images, fundus images, and endoscope data to diagnose disease or classify its severity. 8 , 16 , 17 , 18 , 19 , 20 , 21 , 22 When a machine learning algorithm was applied to real-time colonoscopy, diagnostic accuracy was 94% and the negative predictive value was 96% in analyzing 466 tiny polyps. 23 , 24 Among the many deep learning algorithms, the convolutional neural network, which performs strongly in image pattern analysis, has proven beneficial in analyzing medical images with complex patterns. 4 , 25

In the medical industry, Siemens Healthineers has developed the AI-based AI-Rad Companion Chest CT software to assist chest CT diagnosis, 26 and GE Healthcare is also working on AI-based medical image analysis technology. Further, Philips Healthcare has developed IntelliSpace Discovery, an open platform for AI development and deployment, 27 and is working to commercialize its IntelliSite Pathology Solution in the digital pathology field. 27 Arterys has received FDA clearance for its Cardio AI, Liver AI, and Lung AI software to establish its Medical Imaging Cloud AI platform. 28 Apart from the abovementioned examples, various companies, such as Zebra Medical Vision and Aidoc, are working to commercialize AI-based medical image analysis tools. In Korea, a number of startups, such as Vuno, Lunit, JLK Inspection, and Deepnoid, are in the process of commercializing AI-based medical image analysis systems by receiving approval from the Ministry of Food and Drug Safety.

Smart IoT devices, signal, and in-vitro diagnostic analysis

Various technology giants such as IBM, Google, Apple, and Samsung are competing to develop and commercialize devices and services that can help improve user health by acquiring health information from daily life using a combination of Internet of Things (IoT) technologies and wearable devices. In 2017, Apple installed a deep learning algorithm that detects atrial fibrillation on its smartwatch and received FDA approval. 23 Using photoplethysmography and accelerometer sensors, it learns the user's usual heart rates at rest and during activity and sends a warning if a reading differs significantly from the expected values. Deep learning algorithms also show excellent accuracy in analyzing electrocardiograms (ECGs). In a recent study, 91,232 single-lead ECGs were analyzed using a deep learning algorithm, with high diagnostic performance similar to that of cardiologists. 29 Applying an ECG pattern analysis algorithm to smart devices would be especially useful for patients with cardiovascular disease or for those with chronic kidney disease and high blood potassium levels.
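The baseline-versus-observation logic described for the smartwatch can be caricatured with a simple z-score rule. This sketch is purely illustrative: the actual on-device detector is a proprietary deep learning model, and the function name, baseline values, and threshold here are all invented:

```python
from statistics import mean, stdev

def heart_rate_alert(resting_baseline, observed_bpm, z_threshold=3.0):
    """Return True when an observed reading deviates strongly from the
    user's learned resting baseline (illustrative z-score rule only)."""
    mu, sigma = mean(resting_baseline), stdev(resting_baseline)
    return abs(observed_bpm - mu) / sigma > z_threshold

baseline = [62, 64, 61, 63, 65, 62, 60, 63]  # resting bpm over several days
print(heart_rate_alert(baseline, 64))    # ordinary reading -> False
print(heart_rate_alert(baseline, 130))   # large spike at rest -> True
```

A learned model replaces the fixed threshold with patterns inferred from millions of labeled recordings, but the monitoring loop is the same: compare new sensor data against expectations and alert on deviation.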

Global companies are developing other innovative medical devices as well, including non-invasive glucose meters. 30 , 31 Such bio-signal monitoring devices can be used for disease management and treatment in different hospital environments, such as the intensive care unit, operating room, emergency room, and recovery room, in addition to being used for predicting and diagnosing diseases based on accumulated real-time information. One prime example is the collaboration between Medtronic and IBM in developing Sugar.IQ, which combines Medtronic's continuous glucose monitoring with IBM's Watson AI. 32 Sugar.IQ has been proven to directly help diabetic patients by reducing the number of hypoglycemia and hyperglycemia episodes. 33 In addition, physIQ's pinpointIQ device, which recently received FDA approval, helps in preparing for sudden and fatal situations by continuously measuring patient conditions with wearable biosensors and detecting subtle changes. 34 Philips has also commercialized its Connected Care Solutions, an IT solution that enables medical staff to monitor patient conditions from anywhere, including the emergency room, recovery room, and nurses' station, through tablet PCs and smartphones. 35 Meanwhile, findings from studies reporting improvements in diagnostic sensitivity and specificity from applying machine learning to multi-biomarker analysis have been used in real-world cases to classify cancer-related biomarkers; moreover, machine learning has been applied to in-vitro diagnosis of diseases such as tuberculosis and dementia. 36 , 37 , 38 , 39

AI using electronic medical records (EMRs)

Currently, various attempts are actively being made to develop AI systems using EMRs. In the U.S., EMR companies including Allscripts, Athenahealth, Cerner, eClinicalWorks, and Epic are conducting research on optimizing hospital treatment processes using AI technology. 40 In Korea, EvidNet is working to develop multi-hospital clinical big data analysis technology based on observational health data sciences and informatics and common data model (CDM). 41 , 42 Further, IBM has developed Watson for Oncology, which provides optimal personalized treatments for cancer patients, and its clinical trials are underway. 43 As for the Korean companies, Selvas AI and Linewalks are conducting similar studies.


Issues of utilizing health care data

In most cases, health care data include personal identification information in forms such as personal codes, numbers, text, voice, sound, and images. Creating a data-driven AI medical device requires large amounts of such data carrying sensitive personal information, but obtaining that information may raise legal issues regarding personal privacy. 44 , 45 , 46 The widespread use of wireless AI devices in health care requires new technologies such as IoT and cloud computing, which enable such devices to overcome their processing and storage limitations. 47 However, cloud-assisted AI devices can give rise to serious security concerns over private health care data. To address this issue, technical research and attempts to change laws and regulations are in progress. On the technical side, various encryption technologies and de-identification or anonymization technologies that remove identity information are being developed. A popular example is the CDM-based distributed research network, which has recently received considerable attention in Korea. 48 Furthermore, a number of privacy-preserving data mining technologies, such as federated learning and homomorphic encryption, have been developed in the U.S. 49 , 50
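As a rough illustration of the de-identification idea, the following sketch drops a few direct identifiers and generalizes a birth date to a year. A real Safe Harbor pipeline must handle all 18 HIPAA identifier classes, free text, and images; the field names here are hypothetical:

```python
# Direct identifiers to drop outright (a small, illustrative subset of
# the 18 HIPAA Safe Harbor identifier classes; field names are invented).
DIRECT_IDENTIFIERS = {"name", "phone", "email", "address", "ssn"}

def deidentify(record):
    """Remove direct identifiers and generalize the birth date to a year."""
    cleaned = {}
    for key, value in record.items():
        if key in DIRECT_IDENTIFIERS:
            continue                           # drop the identifier entirely
        elif key == "birth_date":
            cleaned["birth_year"] = value[:4]  # keep only the year
        else:
            cleaned[key] = value
    return cleaned

record = {"name": "Jane Doe", "phone": "555-0100",
          "birth_date": "1984-06-02", "diagnosis": "atrial fibrillation"}
print(deidentify(record))  # {'birth_year': '1984', 'diagnosis': 'atrial fibrillation'}
```

Techniques such as federated learning and homomorphic encryption go further: the sensitive data never leaves the institution at all, and only model updates or encrypted values are shared.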

Many countries worldwide are currently forming institutional and legal systems to address the conflicting interests between the use of health care information and the protection of personal information. In the U.S., the Health Insurance Portability and Accountability Act (HIPAA), established in 1996, gave individuals data rights to copies of their medical information, and the Blue Button system was established to allow individuals to diversify the use of their data by viewing their own personal health records online. Under the HIPAA guidelines, anonymization of the 18 identifiers of protected health information was established to efficiently facilitate the use of health data. 51 In addition, through the enactment of the Health Information Technology for Economic and Clinical Health (HITECH) Act in 2009, electronic health records have been developed and promoted to increase the interoperability of medical information between hospitals. 52 Furthermore, the Centers for Medicare & Medicaid Services launched the MyHealthEData and Blue Button 2.0 services in 2018 to enable patients to access and control their medical records and other health data. In Europe, the General Data Protection Regulation, established in 2016, reinforced basic individual rights over personal information by mandating EU members to protect personal information in accordance with six data protection principles. 53 In Korea, efforts have been made in recent years to promote big data research by granting rights to collect and use health care data based on the revision of the Bioethics and Safety Act. However, despite these efforts, no country has been able to systematically resolve the privacy issues regarding health care data.

Regulatory affairs and policies for new devices

Most AI-based medical devices exist in the form of software, and in terms of regulatory affairs they are generally new devices different from traditional ones; hence, new policies need to be established to approve and regulate them. The International Medical Device Regulators Forum has categorized AI software intended for medical purposes as “Software as a Medical Device (SaMD)”. 54 In the U.S., the Digital Health Unit was established by the FDA's Center for Devices and Radiological Health in May 2017 to promote expertise in digital health care device approvals and regulations, and the FDA announced guidelines for SaMD in December 2017. 55 The FDA acknowledges that the current regulations for traditional medical devices are not suitable for SaMDs, which are faster to develop and modify. 56 The FDA has recently developed a Software Precertification Program, which enables faster marketing of SaMD through a developer-centered certification pathway, unlike the existing pathway centered on individual products. Manufacturers who achieve “organizational excellence” in this pathway can obtain exemption from premarket review for low-risk products. Japan, in accordance with the AI medical development plans announced in 2018, is planning to create comprehensive rules governing the use of AI in medical devices to minimize AI medical device-related disputes and prevent the resulting R&D hindrances. 57 Lastly, in Korea, the “Approval and Review Guidelines for Big Data and AI-based Medical Devices” and “Review Guidelines for Clinical Effectiveness of AI-based Medical Devices” were announced in 2017, making them some of the first AI-related approval guidelines in the world. 58 However, a standardized review index for the safety and effectiveness of AI-based medical devices is still lacking worldwide. 4

Safety and liability issues

The report on AI published by the U.S. National Science and Technology Council emphasizes fairness and safety to prevent discrimination or failure and to keep AI from causing unintended consequences ( Fig. 1 ). 59 , 60 For example, if an AI is developed with a bias toward a specific population group, mismanagement could occur in the prevention or diagnosis of a disease, and discrimination could arise in which a specific population group is excluded from benefits. To address this, the federal government is promoting verification of the effectiveness and fairness of AI through evidence-based assessment, and federal research funding is mandated to be allocated based on transparency, effectiveness, and fairness. Furthermore, the government recommends that universities and secondary schools include topics related to ethics, safety, and privacy in AI or data science curricula. The council has also highlighted cyber security related to AI and emphasized establishing responsible strategies and plans at the federal level, including R&D for sustainable security system development and operation in response to cyberattacks.

[Fig. 1; AI = artificial intelligence.]

The present health care system assumes that all responsibility lies in the hands of the medical staff in the event of a medical accident. AI-based medical technology may affect the judgment of physicians in various areas and may sometimes have negative impacts, resulting in medical accidents. In such cases, liability issues would arise, and under the current health care system, it is highly likely that the medical institution or physicians who ultimately introduced the AI-based technology would be held responsible. Hence, physicians need to learn how to better utilize and interpret AI algorithms and be aware of the potential legal consequences associated with AI use in medical practice. 61 In addition, efforts in academia and policymaking should be made to clarify the liability issues and to evaluate medical accident risks based on the various characteristics of AI technologies. 62 Specifically, new policies should be introduced for the establishment and operation of AI monitoring centers in medical institutions and a national-level center for monitoring the safety of AI-based medical technologies. In addition, a system for assigning liability and strengthening the awareness of patients and medical staff regarding medical accidents that may occur when applying AI-based medical technologies should be established.

Balanced application with existing health care systems

Applying a newly developed AI technology to real-world health care service can lead to unexpected problems. Therefore, the introduction of AI devices should be in harmony with existing health care systems, and the performance of AI devices must be monitored periodically. It is also important that AI devices should be easy to use and familiar to medical staff and patients to avoid any misunderstandings and errors when making medical decisions.

As AI technology by its nature relies mainly on data, performance changes may be found when a pre-applied AI technology is retrained with the desired field data. This change may not always be a performance improvement; rather, it may be a performance degradation. 63 In addition, AI device performance may vary when the distribution or severity of patients in an institution changes owing to differences in social, economic, and medical environments. Although the current laws and regulations in Korea do not allow the use of field data to improve AI device performance, considering the nature of the technology, retraining using field data could be approved in the near future provided certain conditions are fulfilled. Accordingly, the performance of AI devices should be checked periodically even after clinical application to catch any unexpected performance degradation or malfunction.
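The periodic performance check suggested above might, in its simplest form, compare a model's rolling agreement with confirmed diagnoses against the accuracy validated at approval time. This is a minimal sketch with invented names and thresholds, not a regulatory-grade monitoring system:

```python
def degraded(recent_agreement, baseline_accuracy, tolerance=0.05):
    """recent_agreement: booleans, True when the AI output matched the
    confirmed clinical diagnosis. Flags a drop of more than `tolerance`
    below the accuracy validated at approval time."""
    if not recent_agreement:
        return False                  # no field data yet: nothing to flag
    rolling = sum(recent_agreement) / len(recent_agreement)
    return (baseline_accuracy - rolling) > tolerance

field_cases = [True] * 80 + [False] * 20  # 80% agreement on recent cases
print(degraded(field_cases, baseline_accuracy=0.94))  # True: >5-point drop
```

In practice such checks would also stratify by patient subgroup, since a model can hold its overall accuracy while degrading for one population.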

Considering the importance and complexity of the modern health care field, the AI technology should be applied to health care as naturally and seamlessly as possible without causing excessive changes in the current medical practices. For this reason, it is necessary to implement the interaction and interface technologies that can enable the medical staff to apply AI technology to the medical field in a natural way even if they do not directly understand the technical aspects of the AI devices. These technologies can be implemented in the form of conversational AI, voice recognition, real-time recommendation, monitoring, and various visual overlay technologies. At the same time, careful considerations should be given to these user interface elements when developing and integrating AI technologies into health care.


AI technologies are expected to bring innovations to existing medical technologies and future health care. The currently available AI-based health care technologies have shown outstanding results in accurately diagnosing and classifying patient conditions and predicting the course of diseases by using accumulated medical data. Accordingly, these technologies are expected to assist medical staff in treatment decision-making and, in the process, improve treatment results. However, AI-based health care technologies currently face various issues regarding privacy, reliability, safety, and liability. For AI technologies to be applied more actively in health care, greater public awareness of AI, the establishment of standardized guidelines, and systematic improvements will be required in addition to technological advancements.


The content of this article is the official opinion of the Korean Society of Artificial Intelligence in Medicine (KOSAIM) and is a summary of the White Paper on Artificial Intelligence in Medicine. The authors would like to thank all the members of KOSAIM for their support in the publication of this article.

Disclosure: The authors have no potential conflicts of interest to disclose.

Author Contributions:

  • Conceptualization: Yoon HJ, Seo JB.
  • Writing - original draft: Park CW, Seo SW, Kang N.
  • Writing - review & editing: Ko BS, Choi BW, Park CM, Chang DK, Kim H, Kim H, Lee H, Jang J, Ye JC, Jeon JH, Seo JB, Kim KJ, Jung KH, Kim N, Paek S, Shin SY, Yoo S, Choi YS, Kim Y, Yoon HJ.

AI improves accuracy of skin cancer diagnoses in Stanford Medicine-led study

Artificial intelligence algorithms powered by deep learning improve skin cancer diagnostic accuracy for doctors, nurse practitioners and medical students in a study led by the Stanford Center for Digital Health.

April 11, 2024 - By Krista Conger


Artificial intelligence helped clinicians diagnose skin cancer more accurately, a Stanford Medicine-led study found. Image: Chanelle Malambo/peopleimages.com, stock.adobe.com

A new study led by researchers at Stanford Medicine finds that computer algorithms powered by artificial intelligence based on deep learning can help health care practitioners to diagnose skin cancers more accurately. Even dermatologists benefit from AI guidance, although their improvement is less than that seen for non-dermatologists.

“This is a clear demonstration of how AI can be used in collaboration with a physician to improve patient care,” said professor of dermatology and of epidemiology Eleni Linos, MD. Linos leads the Stanford Center for Digital Health, which was launched to tackle some of the most pressing research questions at the intersection of technology and health by promoting collaboration between engineering, computer science, medicine and the humanities.

Linos, associate dean of research and the Ben Davenport and Lucy Zhang Professor in Medicine, is the senior author of the study, which was published on April 9 in npj Digital Medicine. Postdoctoral scholar Jiyeong Kim, PhD, and visiting researcher Isabelle Krakowski, MD, are the lead authors of the research.

“Previous studies have focused on how AI performs when compared with physicians,” Kim said. “Our study compared physicians working without AI assistance with physicians using AI when diagnosing skin cancers.”

AI algorithms are increasingly used in clinical settings, including dermatology. They are created by feeding a computer hundreds of thousands or even millions of images of skin conditions labeled with information such as diagnosis and patient outcome. Through a process called deep learning, the computer eventually learns to recognize telltale patterns in the images that correlate with specific skin diseases including cancers. Once trained, an algorithm written by the computer can be used to suggest possible diagnoses based on an image of a patient’s skin that it has not been exposed to.



These diagnostic algorithms aren’t used alone, however. They are overseen by clinicians who also assess the patient, come to their own conclusions about a patient’s diagnosis and choose whether to accept the algorithm’s suggestion.

An accuracy boost

Kim and Linos’ team reviewed 12 studies detailing more than 67,000 evaluations of potential skin cancers by a variety of practitioners with and without AI assistance. They found that, overall, health care practitioners working without AI assistance accurately diagnosed about 75% of people with skin cancer (a statistical measurement known as sensitivity). Conversely, they correctly identified about 81.5% of people with cancer-like skin conditions who did not have cancer (a companion measurement known as specificity).

Health care practitioners who used AI to guide their diagnoses did better. Their diagnoses were about 81.1% sensitive and 86.1% specific. The improvement may seem small, but the differences are critical for people told they don’t have cancer but do, or for those who do have cancer but are told they are healthy.

When the researchers split the health care practitioners by specialty or level of training, they saw that medical students, nurse practitioners and primary care doctors benefited the most from AI guidance — improving on average about 13 points in sensitivity and 11 points in specificity. Dermatologists and dermatology residents performed better overall, but the sensitivity and specificity of their diagnoses also improved with AI.
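For readers unfamiliar with the two metrics, sensitivity and specificity fall directly out of a confusion matrix. The counts below are hypothetical, chosen only to reproduce the unaided figures reported above:

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN): share of true cancers flagged.
    Specificity = TN / (TN + FP): share of benign lesions cleared."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical confusion-matrix counts mirroring the unaided 75% / 81.5%.
sens, spec = sensitivity_specificity(tp=750, fn=250, tn=815, fp=185)
print(sens, spec)  # 0.75 0.815
```

Raising sensitivity means fewer cancers are missed (fewer false negatives); raising specificity means fewer healthy patients are sent for unnecessary biopsies (fewer false positives).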

“I was surprised to see everyone’s accuracy improve with AI assistance, regardless of their level of training,” Linos said. “This makes me very optimistic about the use of AI in clinical care. Soon our patients will not just be accepting, but expecting, that we use AI assistance to provide them with the best possible care.”



Researchers at the Stanford Center for Digital Health, including Kim, are interested in learning more about the promise of and barriers to integrating AI-based tools into health care. In particular, they are planning to investigate how the perceptions and attitudes of physicians and patients to AI will influence its implementation.

“We want to better understand how humans interact with and use AI to make clinical decisions,” Kim said. 

Previous studies have indicated that a clinician’s degree of confidence in their own clinical decision, the degree of confidence of the AI, and whether the clinician and the AI agree on the diagnosis all influence whether the clinician incorporates the algorithm’s advice when making clinical decisions for a patient.

Medical specialties like dermatology and radiology, which rely heavily on images — visual inspection, pictures, X-rays, MRIs and CT scans, among others — for diagnoses are low-hanging fruit for computers that can pick out levels of detail beyond what a human eye (or brain) can reasonably process. But even other more symptom-based specialties, or prediction modeling, are likely to benefit from AI intervention, Linos and Kim feel. And it’s not just patients who stand to benefit.

“If this technology can simultaneously improve a doctor’s diagnostic accuracy and save them time, it’s really a win-win. In addition to helping patients, it could help reduce physician burnout and improve the human interpersonal relationships between doctors and their patients,” Linos said. “I have no doubt that AI assistance will eventually be used in all medical specialties. The key question is how we make sure it is used in a way that helps all patients regardless of their background and simultaneously supports physician well-being.”

Researchers from the Karolinska Institute, the Karolinska University Hospital and the University of Nicosia contributed to the research.

The study was funded by the National Institutes of Health (grants K24AR075060 and R01AR082109), Radiumhemmet Research, the Swedish Cancer Society and the Swedish Research Council.



AI for Reliable and Equitable Real World Evidence Generation in Medicine

Call for Submissions

Organizing Committee

The " AI for Reliable and Equitable Real World Evidence Generation in Medicine " workshop is dedicated to advancing the understanding and exploring the transformative role of artificial intelligence (AI) in analyzing real-world data (RWD) for real-world evidence (RWE) generation, leading to evidence-based medicine (EBM). Focused on leading-edge research and innovation, the workshop will feature research papers and panel discussions that delve into key aspects of machine learning innovations and applications in RWE generation from EHRs and claims, including structured data, natural language processing (NLP) of clinical notes, medical imaging, and waveform data processing from wearable devices. The workshop will feature both innovative AI methodology as well as their applications to real-world problems and their impact on transforming evidence-based medicine. The workshop seeks to facilitate in-depth discussions on the integration of AI technologies to enhance the reliability and equity of RWE generation. The workshop serves as a platform for engaging multiple stakeholders across healthcare research, including researchers, clinicians, pharmaceutical and industry professionals to delve into the intricacies of these advanced methodologies, fostering dialogue and collaboration. Attendees can anticipate in-depth discussions, presentations, and networking opportunities, gaining valuable insights into the forefront of AI-driven strategies shaping the future of these discoveries.

AI encompasses statistical and computational machine learning, deep learning, and generative AI (e.g., large language models, diffusion models), all of which are welcome approaches. We include innovative AI methods as well as applications of AI methods to the field of evidence generation for real-world effectiveness, safety, and equity research.


Scientific Paper Program Committee

Kun Huang, PhD, MS Professor of Medicine, Chair of Genomic Data Science of Precision Health Initiative, Assistant Dean for Data Science Indiana University

Mattia Prosperi, PhD Professor And Associate Dean For AI And Innovation University of Florida

Daniel O. Scharfstein, ScD Chief, Division of Biostatistics in Department of Population Health Sciences University of Utah

Xiaoqian Jiang, PhD Associate Vice President of Medical AI, Chair of Department of Data Science and Artificial Intelligence The University of Texas Health Science Center at Houston

Prediction Models and Clinical Outcomes—A Call for Papers

  • 1 Department of Medicine, University of Washington, Seattle
  • 2 Deputy Editor, JAMA Network Open
  • 3 Epidemiology, Rutgers The State University of New Jersey, New Brunswick
  • 4 Statistical Editor, JAMA Network Open
  • 5 Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, Massachusetts
  • 6 Editor, JAMA Network Open

The need to classify disease and predict outcomes is as old as medicine itself. Nearly 50 years ago, the advantage of applying multivariable statistics to these problems became evident. 1 Since then, the increasing availability of databases containing often-complex clinical information from tens or even hundreds of millions of patients, combined with powerful statistical techniques and computing environments, has spawned exponential growth in efforts to create more useful, focused, and accurate prediction models. JAMA Network Open receives dozens of manuscripts weekly that present new or purportedly improved instruments intended to predict a vast array of clinical outcomes. Although we are able to accept only a small fraction of those submitted, we have, nonetheless, published nearly 2000 articles dealing with predictive models over the past 6 years.

The profusion of predictive models has been accompanied by growing recognition of the need for standards to help ensure the accuracy of these models. An important milestone was the publication of the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) guidelines nearly a decade ago. 2 TRIPOD is a reporting guideline intended to enable readers to better understand the methods used in published studies but does not prescribe which methods should be applied. Since then, while the field has continued to advance and technology has improved, many predictive models in widespread use, when critically evaluated, have been found neither to adhere to reporting standards nor to perform as well as expected. 3 , 4

There are numerous reasons why the performance of models falls short, even when efforts are made to adhere to methodologic standards. Despite the vast amounts of data that are often brought to bear, the data may not be appropriate to the task, or they may have been collected and analyzed in ways that are biased. Additionally, that some models fall short may simply reflect the inherent difficulty of predicting relatively uncommon events that arise from complex biological processes occurring within complex clinical environments. Moreover, clinical settings are highly variable, and predictive models typically perform worse outside of the environments in which they were developed. A comprehensive discussion of these issues is beyond the scope of this article, but as physicist Niels Bohr once remarked, “it is very difficult to predict—especially the future.” 5

Although problems with accuracy are well documented, hundreds of predictive models are in regular use in clinical practice and are frequently the basis for critically important decisions. Many such models have been widely adopted without subsequent efforts to confirm that they actually continue to perform as expected. That is not to say that such models are without utility, because even a suboptimal model may perform better than an unaided clinician. Nevertheless, we believe that a fresh examination of selected, well-established predictive models is warranted if not previously done. JAMA Network Open has published articles addressing prediction of relatively common clinical complications, such as recurrent gastrointestinal bleeding. 6 We think there remains considerable opportunity for research in this vein. In particular, we seek studies that examine current performance of commonly applied clinical prediction rules. We are particularly interested in studies using data from a variety of settings and databases as well as studies that simultaneously assess multiple models addressing the same or similar outcomes.

We also remain interested in the derivation of new models that address a clear clinical need. They should use data that are commonly collected as part of routine care or that can, in principle, be readily extracted from electronic health records. We generally require that prediction models be validated with at least 1 other dataset distinct from the development dataset. In practice, this means data from different health systems or different publicly available or commercial datasets. We note that internal validation techniques, such as split samples, hold-out, k-fold, and others, are not designed to overcome the intrinsic differences between data sources and, therefore, are not suited to quantifying performance externally. While the population to which the models apply should be described explicitly, ideally any such models should be applicable to patients from the wide range of races, ethnicities, and backgrounds commonly encountered in clinical practice. Most importantly, we are interested in examples of models that have been evaluated in clinical settings, assessing their feasibility and potential clinical benefit. This includes studies with negative as well as positive outcomes.
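The distinction drawn above between internal and external validation can be illustrated with a toy sketch. Everything below is synthetic and hypothetical (not from the editorial or any model it describes); the point is only that a model's discrimination is re-estimated on data from a different source than the data used to develop it.

```python
# Toy external validation: the same model scores are evaluated on the
# development dataset and on a dataset from a different health system.
# All numbers are synthetic, for illustration only.

def auc(scores, labels):
    """Concordance (c-statistic): probability that a randomly chosen
    case is ranked above a randomly chosen non-case (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Development data (hypothetical health system A).
dev_scores, dev_labels = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2], [1, 1, 1, 0, 0, 0]
# External data (hypothetical health system B): same model, new population.
ext_scores, ext_labels = [0.8, 0.6, 0.5, 0.5, 0.3, 0.2], [1, 0, 1, 0, 1, 0]

print(f"apparent AUC (development): {auc(dev_scores, dev_labels):.2f}")
print(f"external AUC (new source):  {auc(ext_scores, ext_labels):.2f}")
```

The apparent (development) discrimination here is perfect, while the external estimate is much lower, which is the optimism the editorial warns that split-sample or k-fold validation alone cannot detect.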

Please see the journal’s Instructions for Authors for information on manuscript preparation and submission. 7 This is not a time-limited call for studies on this topic.

Published: April 12, 2024. doi:10.1001/jamanetworkopen.2024.9640

Open Access: This is an open access article distributed under the terms of the CC-BY License . © 2024 Fihn SD et al. JAMA Network Open .

Corresponding Author: Stephan D. Fihn, MD, MPH, Department of Medicine, University of Washington, 325 Ninth Ave, Box 359780, Seattle, WA 98104 ( [email protected] ).

Conflict of Interest Disclosures: Dr Berlin reported receiving consulting fees from Kenvue related to acetaminophen outside the submitted work. No other disclosures were reported.

Fihn SD , Berlin JA , Haneuse SJPA , Rivara FP. Prediction Models and Clinical Outcomes—A Call for Papers. JAMA Netw Open. 2024;7(4):e249640. doi:10.1001/jamanetworkopen.2024.9640

National Academy of Medicine drafts code of conduct for AI in healthcare


The National Academy of Medicine has issued a landscape review, code principles and commitments saying accurate, safe, reliable and ethical AI transformation in healthcare and biomedical science is achievable.

Based on the Leadership Consortium’s Learning Health System Core Principles, an initiative the academy has spearheaded since 2006, the organization said its new draft framework promotes responsible behavior in AI development, use and ongoing assessment. 

Its core tenets require inclusive collaboration, ongoing safety assessment, efficiency and environmental protection.


The full  commentary , which contains a landscape review and “Draft Code of Conduct Framework: Code Principles and Code Commitments,” was developed through the academy’s AI Code of Conduct initiative, under a steering committee of expert stakeholders, according to an announcement.

The code principles and the proposed code commitments "reflect simple guideposts to guide and gauge behavior in a complex system and provide a starting point for real-time decision-making and detailed implementation plans to promote the responsible use of AI," the National Academy of Medicine said.

The academy's Artificial Intelligence Code of Conduct initiative that launched in January 2023 engaged many stakeholders – listed in the acknowledgments – in co-creating the new draft framework.

"The promise of AI technologies to transform health and healthcare is tremendous, but there is concern that their improper use can have harmful effects," Victor Dzau, academy president, said in a statement.

"There is a pressing need for establishing principles, guidelines and safeguards for the use of AI in healthcare,” he added.

Beginning with an extensive review of the existing literature surrounding AI guidelines, frameworks and principles – some 60 publications – the editors named three areas of inconsistency: inclusive collaboration, ongoing safety assessment, and efficiency or environmental protection. 

"These issues are of particular importance as they highlight the need for clear, intentional action between and among various stakeholders comprising the interstitium, or connective tissue that unify a system in pursuit of a shared vision," they wrote.

Their commentary also identifies additional risks of the use of AI in healthcare, including misdiagnosis, overuse of resources, privacy breaches and workforce displacement or "inattention based on over-reliance on AI."

The 10 code principles and six code commitments in the framework ensure that best AI practices maximize human health while minimizing potential risks, the academy said, noting they serve as "basic guideposts" to support organizational improvement at scale.

"Health and healthcare organizations that orient their visions and activities to these 10 principles will help advance the system-wide alignment, performance and continuous improvement so important in the face of today’s challenges and opportunities," the academy said.

"This new framework puts us on the path to safe, effective and ethical use of AI, as its transformational potential is put to use in health and medicine," Michael McGinnis, National Academy of Medicine executive officer, added.

Peter Lee, president of Microsoft Research and an academy steering committee member, noted that the academy invites  public comment  (through May 1) to refine the framework and accelerate AI integration in healthcare. 

"Such advancements are pivotal in surmounting the barriers we face in U.S. healthcare today, ensuring a healthier tomorrow for all," Lee said.

In addition to input from stakeholders, the academy said it would convene critical contributors into workgroups and test the framework in case studies. The academy will also consult individuals, patient advocates, health systems, product development partners and key stakeholders – including government agencies – before it releases a final code of conduct for AI in healthcare framework.


Last year, the Coalition for Health AI developed a  blueprint for AI  that took a patient-centric approach to addressing barriers to trust and other challenges of AI to help inform the academy’s AI Code of Conduct.

It was built on the  White House's AI Bill of Rights  and the National Institute of Standards and Technology's  AI Risk Management Framework .

"Transparency and trust in AI tools that will be influencing medical decisions is absolutely paramount for patients and clinicians," Dr. Brian Anderson, chief digital health physician at MITRE, a CHAI cofounder and now its chief executive officer, said in the blueprint's announcement.

While most healthcare leaders agree that trust is a chief driver to improving healthcare delivery and patient outcomes with AI,  how health systems should put ethical AI into practice  is a terrain littered with unanswered questions.

"We don't have yet a scalable plan as a nation in terms of how we're going to support critical access hospitals or [federally qualified health centers] or health systems that are less resourced, that don't have the ability to stand up these governance committees or these very fancy dashboards that are going to be monitoring for model drift and performance," he told  Healthcare IT News  last month.


"The new draft code of conduct framework is an important step toward creating a path forward to safely reap the benefits of improved health outcomes and medical breakthroughs possible through responsible use of AI," Dzau said in the National Academy of Medicine’s announcement.

Andrea Fox is senior editor of Healthcare IT News. Email: [email protected]

Healthcare IT News is a HIMSS Media publication.



New AI method captures uncertainty in medical images

In biomedicine, segmentation involves annotating pixels from an important structure in a medical image, like an organ or cell. Artificial intelligence models can help clinicians by highlighting pixels that may show signs of a certain disease or anomaly.

However, these models typically only provide one answer, while the problem of medical image segmentation is often far from black and white. Five expert human annotators might provide five different segmentations, perhaps disagreeing on the existence or extent of the borders of a nodule in a lung CT image.

"Having options can help in decision-making. Even just seeing that there is uncertainty in a medical image can influence someone's decisions, so it is important to take this uncertainty into account," says Marianne Rakic, an MIT computer science PhD candidate.

Rakic is lead author of a paper with others at MIT, the Broad Institute of MIT and Harvard, and Massachusetts General Hospital that introduces a new AI tool that can capture the uncertainty in a medical image.

Known as Tyche (named for the Greek divinity of chance), the system provides multiple plausible segmentations that each highlight slightly different areas of a medical image. A user can specify how many options Tyche outputs and select the most appropriate one for their purpose.

Importantly, Tyche can tackle new segmentation tasks without needing to be retrained. Training is a data-intensive process that involves showing a model many examples and requires extensive machine-learning experience.

Because it doesn't need retraining, Tyche could be easier for clinicians and biomedical researchers to use than some other methods. It could be applied "out of the box" for a variety of tasks, from identifying lesions in a lung X-ray to pinpointing anomalies in a brain MRI.

Ultimately, this system could improve diagnoses or aid in biomedical research by calling attention to potentially crucial information that other AI tools might miss.

"Ambiguity has been understudied. If your model completely misses a nodule that three experts say is there and two experts say is not, that is probably something you should pay attention to," adds senior author Adrian Dalca, an assistant professor at Harvard Medical School and MGH, and a research scientist in the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL).

Their co-authors include Hallee Wong, a graduate student in electrical engineering and computer science; Jose Javier Gonzalez Ortiz PhD '23; Beth Cimini, associate director for bioimage analysis at the Broad Institute; and John Guttag, the Dugald C. Jackson Professor of Computer Science and Electrical Engineering. Rakic will present Tyche at the IEEE Conference on Computer Vision and Pattern Recognition, where Tyche has been selected as a highlight.

Addressing ambiguity

AI systems for medical image segmentation typically use neural networks. Loosely based on the human brain, neural networks are machine-learning models comprising many interconnected layers of nodes, or neurons, that process data.

After speaking with collaborators at the Broad Institute and MGH who use these systems, the researchers realized two major issues limit their effectiveness. The models cannot capture uncertainty and they must be retrained for even a slightly different segmentation task.

Some methods try to overcome one pitfall, but tackling both problems with a single solution has proven especially tricky, Rakic says.

"If you want to take ambiguity into account, you often have to use an extremely complicated model. With the method we propose, our goal is to make it easy to use with a relatively small model so that it can make predictions quickly," she says.

The researchers built Tyche by modifying a straightforward neural network architecture.

A user first feeds Tyche a few examples that show the segmentation task. For instance, examples could include several images of lesions in a heart MRI that have been segmented by different human experts so the model can learn the task and see that there is ambiguity.

The researchers found that a "context set" of just 16 example images is enough for the model to make good predictions, and there is no limit to the number of examples one can use. The context set enables Tyche to solve new tasks without retraining.

For Tyche to capture uncertainty, the researchers modified the neural network so it outputs multiple predictions based on one medical image input and the context set. They adjusted the network's layers so that, as data move from layer to layer, the candidate segmentations produced at each step can "talk" to each other and the examples in the context set.

In this way, the model can ensure that candidate segmentations are all a bit different, but still solve the task.

"It is like rolling dice. If your model can roll a two, three, or four, but doesn't know you have a two and a four already, then either one might appear again," she says.

They also modified the training process so the model is rewarded for maximizing the quality of its best prediction.
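The best-prediction training idea described above resembles a "winner-takes-all" loss, in which only the best of several candidate masks determines the loss, leaving the other candidates free to cover alternative plausible segmentations. The sketch below is a hypothetical plain-Python illustration of that general idea, not Tyche's actual implementation; `dice` and `best_candidate_loss` are names invented here.

```python
# Hypothetical winner-takes-all loss over candidate segmentations.
# Masks are flat lists of 0/1 pixels; all data below are made up.

def dice(pred, target):
    """Dice overlap between two binary masks (1.0 = identical)."""
    inter = sum(p * t for p, t in zip(pred, target))
    return 2 * inter / ((sum(pred) + sum(target)) or 1)

def best_candidate_loss(candidates, target):
    """Loss = 1 - Dice of the BEST candidate; in a real framework,
    gradients would flow only through that winning candidate."""
    return min(1 - dice(c, target) for c in candidates)

target = [1, 1, 1, 0, 0, 0]          # one annotator's mask
candidates = [
    [1, 1, 0, 0, 0, 0],              # under-segments
    [1, 1, 1, 1, 0, 0],              # over-segments
    [1, 1, 1, 0, 0, 0],              # matches this annotator exactly
]
print(f"best-candidate loss: {best_candidate_loss(candidates, target):.2f}")
```

Because only the best match is penalized, the under- and over-segmenting candidates are not pushed toward the winner and can keep representing other annotators' plausible readings.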

If the user asked for five predictions, at the end they can see all five medical image segmentations Tyche produced, even though one might be better than the others.

The researchers also developed a version of Tyche that can be used with an existing, pretrained model for medical image segmentation. In this case, Tyche enables the model to output multiple candidates by making slight transformations to images.

Better, faster predictions

When the researchers tested Tyche with datasets of annotated medical images, they found that its predictions captured the diversity of human annotators, and that its best predictions were better than any from the baseline models. Tyche also performed faster than most models.

"Outputting multiple candidates and ensuring they are different from one another really gives you an edge," Rakic says.

The researchers also saw that Tyche could outperform more complex models that have been trained using a large, specialized dataset.

For future work, they plan to try using a more flexible context set, perhaps including text or multiple types of images. In addition, they want to explore methods that could improve Tyche's worst predictions and enhance the system so it can recommend the best segmentation candidates.

This research is funded, in part, by the National Institutes of Health, the Eric and Wendy Schmidt Center at the Broad Institute of MIT and Harvard, and Quanta Computer.

Story Source:

Materials provided by Massachusetts Institute of Technology . Original written by Adam Zewe. Note: Content may be edited for style and length.


Published on 12.4.2024 in Vol 26 (2024)

Application of AI in Multilevel Pain Assessment Using Facial Images: Systematic Review and Meta-Analysis

Authors of this article:

  • Jian Huo 1 *, MSc
  • Yan Yu 2 *, MMS
  • Wei Lin 3 , MMS
  • Anmin Hu 2, 3, 4 , MMS
  • Chaoran Wu 2 , MD, PhD

1 Boston Intelligent Medical Research Center, Shenzhen United Scheme Technology Company Limited, Boston, MA, United States

2 Department of Anesthesia, Shenzhen People's Hospital, The First Affiliated Hospital of Southern University of Science and Technology, Shenzhen Key Medical Discipline, Shenzhen, China

3 Shenzhen United Scheme Technology Company Limited, Shenzhen, China

4 The Second Clinical Medical College, Jinan University, Shenzhen, China

*these authors contributed equally

Corresponding Author:

Chaoran Wu, MD, PhD

Department of Anesthesia

Shenzhen People's Hospital, The First Affiliated Hospital of Southern University of Science and Technology

Shenzhen Key Medical Discipline

No 1017, Dongmen North Road

Shenzhen, 518020

Phone: 86 18100282848

Email: [email protected]

Background: The continuous monitoring and recording of patients’ pain status is a major problem in current research on postoperative pain management. In the large number of original or review articles focusing on different approaches for pain assessment, many researchers have investigated how computer vision (CV) can help by capturing facial expressions. However, there is a lack of proper comparison of results between studies to identify current research gaps.

Objective: The purpose of this systematic review and meta-analysis was to investigate the diagnostic performance of artificial intelligence models for multilevel pain assessment from facial images.

Methods: The PubMed, Embase, IEEE, Web of Science, and Cochrane Library databases were searched for related publications before September 30, 2023. Studies that used facial images alone to estimate multiple pain values were included in the systematic review. A study quality assessment was conducted using the Quality Assessment of Diagnostic Accuracy Studies, 2nd edition tool. The performance of these studies was assessed by metrics including sensitivity, specificity, log diagnostic odds ratio (LDOR), and area under the curve (AUC). The intermodal variability was assessed and presented by forest plots.
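For readers unfamiliar with the metrics named above, the sketch below shows how sensitivity, specificity, and the log diagnostic odds ratio (LDOR) are derived from a 2x2 confusion table. The counts are synthetic, chosen only for illustration; this is not the review's data or code.

```python
# Diagnostic accuracy metrics from a 2x2 confusion table.
# tp/fp/fn/tn counts below are made-up illustrative values.
import math

def diagnostic_metrics(tp, fp, fn, tn):
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    dor = (tp * tn) / (fp * fn)           # diagnostic odds ratio
    return sensitivity, specificity, math.log(dor)  # LDOR = ln(DOR)

sens, spec, ldor = diagnostic_metrics(tp=98, fp=2, fn=2, tn=98)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} LDOR={ldor:.2f}")
```

With 98% sensitivity and 98% specificity, the LDOR comes out near 7.8, which is why meta-analyses pool the log of the diagnostic odds ratio rather than the (very large and skewed) raw ratio.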

Results: A total of 45 reports were included in the systematic review. The reported test accuracies ranged from 0.27-0.99, and the other metrics, including the mean squared error (MSE), mean absolute error (MAE), intraclass correlation coefficient (ICC), and Pearson correlation coefficient (PCC), ranged from 0.31-4.61, 0.24-2.8, 0.19-0.83, and 0.48-0.92, respectively. In total, 6 studies were included in the meta-analysis. Their combined sensitivity was 98% (95% CI 96%-99%), specificity was 98% (95% CI 97%-99%), LDOR was 7.99 (95% CI 6.73-9.31), and AUC was 0.99 (95% CI 0.99-1). The subgroup analysis showed that the diagnostic performance was acceptable, although imbalanced data were still emphasized as a major problem. All studies had at least one domain with a high risk of bias, and for 20% (9/45) of studies, there were no applicability concerns.

Conclusions: This review summarizes recent evidence in automatic multilevel pain estimation from facial expressions and compared the test accuracy of results in a meta-analysis. Promising performance for pain estimation from facial images was established by current CV algorithms. Weaknesses in current studies were also identified, suggesting that larger databases and metrics evaluating multiclass classification performance could improve future studies.

Trial Registration: PROSPERO CRD42023418181; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=418181


The definition of pain was revised to “an unpleasant sensory and emotional experience associated with, or resembling that associated with, actual or potential tissue damage” in 2020 [ 1 ]. Acute postoperative pain management is important, as pain intensity and duration are critical influencing factors for the transition of acute pain to chronic postsurgical pain [ 2 ]. To avoid the development of chronic pain, guidelines were promoted and discussed to ensure safe and adequate pain relief for patients, and clinicians were recommended to use a validated pain assessment tool to track patients’ responses [ 3 ]. However, these tools, to some extent, depend on communication between physicians and patients, and continuous data cannot be provided [ 4 ]. The continuous assessment and recording of patient pain intensity will not only reduce caregiver burden but also provide data for chronic pain research. Therefore, automatic and accurate pain measurements are necessary.

Researchers have proposed different approaches to measuring pain intensity. Physiological signals, for example, electroencephalography and electromyography, have been used to estimate pain [ 5 - 7 ]. However, it was reported that current pain assessment from physiological signals has difficulties isolating stress and pain with machine learning techniques, as they share conceptual and physiological similarities [ 8 ]. Recent studies have also investigated pain assessment tools for certain patient subgroups. For example, people with deafness or an intellectual disability may not be able to communicate well with nurses, and an objective pain evaluation would be a better option [ 9 , 10 ]. Measuring pain intensity from patient behaviors, such as facial expressions, is also promising for most patients [ 4 ]. As the most comfortable and convenient method, computer vision techniques require no attachments to patients and can monitor multiple participants using 1 device [ 4 ]. However, pain intensity, which is important for pain research, is often not reported.

With the growing trend of assessing pain intensity using artificial intelligence (AI), it is necessary to summarize current publications to determine the strengths and gaps of current studies. Existing research has reviewed machine learning applications for acute postoperative pain prediction, continuous pain detection, and pain intensity estimation [ 10 - 14 ]. Input modalities, including facial recordings and physiological signals such as electroencephalography and electromyography, were also reviewed [ 5 , 8 ]. There have also been studies focusing on deep learning approaches [ 11 ]. AI was applied in children and infant pain evaluation as well [ 15 , 16 ]. However, no study has focused on pain intensity measurement, and no comparison of test accuracy results has been made.

Current AI applications in pain research can be categorized into 3 types: pain assessment, pain prediction and decision support, and pain self-management [ 14 ]. We consider accurate and automatic pain assessment to be the most important area and the foundation of future pain research. In this study, we performed a systematic review and meta-analysis to assess the diagnostic performance of current publications for multilevel pain evaluation.

This study was registered with PROSPERO (International Prospective Register of Systematic Reviews; CRD42023418181) and carried out in strict accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines [ 17 ].

Study Eligibility

Studies that reported AI techniques for multiclass pain intensity classification were eligible. Records involving nonhuman or infant participants or 2-class pain detection were excluded. Only studies using facial images of the test participants were accepted. Studies whose reference standard was a clinically used pain assessment tool, such as the visual analog scale (VAS) or numerical rating scale (NRS), or another pain intensity indicator were excluded from the meta-analysis. Textbox 1 presents the eligibility criteria.

Study characteristics and inclusion criteria

  • Participants: children and adults aged 12 months or older
  • Setting: no restrictions
  • Index test: artificial intelligence models that measure pain intensity from facial images
  • Reference standard: no restrictions for systematic review; Prkachin and Solomon pain intensity score for meta-analysis
  • Study design: no need to specify

Study characteristics and exclusion criteria

  • Participants: infants aged 12 months or younger and animal subjects
  • Setting: no need to specify
  • Index test: studies that use other information such as physiological signals
  • Reference standard: other pain evaluation tools, e.g., NRS, VAS, were excluded from meta-analysis
  • Study design: reviews

Report characteristics and inclusion criteria

  • Year: published between January 1, 2012, and September 30, 2023
  • Language: English only
  • Publication status: published
  • Test accuracy metrics: no restrictions for systematic reviews; studies that reported contingency tables were included for meta-analysis

Report characteristics and exclusion criteria

  • Year: no need to specify
  • Language: no need to specify
  • Publication status: preprints not accepted
  • Test accuracy metrics: studies that reported insufficient metrics were excluded from meta-analysis

Search Strategy

In this systematic review, the PubMed, Embase, IEEE, Web of Science, and Cochrane Library databases were searched through December 2022, with no restrictions applied. The keywords were “artificial intelligence” AND “pain recognition.” Multimedia Appendix 1 shows the detailed search strategy.

Data Extraction

Two reviewers independently screened titles and abstracts and selected eligible records, and disagreements were resolved by discussion with a third collaborator. A consensus data extraction sheet was prespecified and used to summarize study characteristics independently. Table S5 in Multimedia Appendix 1 shows the detailed items and explanations for data extraction. Diagnostic accuracy data were extracted into contingency tables comprising true positives, false positives, false negatives, and true negatives. These data were used to calculate the pooled diagnostic performance of the different models. Some studies included multiple models, and these models were considered independent of each other.
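As an illustration of how an extracted contingency table translates into the diagnostic metrics pooled later, a minimal sketch follows; the cell counts are hypothetical and not taken from any included study.

```python
import math

def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, and log diagnostic odds ratio (LDOR)
    from a 2x2 contingency table.

    A 0.5 continuity correction is applied when any cell is zero, a
    common convention in diagnostic test accuracy meta-analysis.
    """
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ldor = math.log((tp / fn) / (fp / tn))  # log of the diagnostic odds ratio
    return sensitivity, specificity, ldor

# Hypothetical counts: 90 true positives, 5 false positives,
# 10 false negatives, 95 true negatives.
sens, spec, ldor = diagnostic_metrics(tp=90, fp=5, fn=10, tn=95)
print(sens, spec, round(ldor, 2))  # 0.9 0.95 5.14
```

In the meta-analysis itself, such per-table estimates are pooled with a bivariate model rather than averaged directly.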

Study Quality Assessment

All included studies were independently assessed by 2 reviewers using the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool [ 18 ]. QUADAS-2 assesses bias risk across 4 domains: patient selection, index test, reference standard, and flow and timing. The first 3 domains are also assessed for applicability concerns. In the systematic review, a specific extension of QUADAS-2, namely QUADAS-AI, was used to specify the signaling questions [ 19 ].

Statistical Analysis
Meta-analyses were conducted between different AI models. Models with different algorithms or training data were considered different. To evaluate the performance differences between models, the contingency tables during model validation were extracted. Studies that did not report enough diagnostic accuracy data were excluded from meta-analysis.

Hierarchical summary receiver operating characteristic (SROC) curves were fitted to evaluate the diagnostic performance of AI models. These curves were plotted with 95% CIs and prediction regions around averaged sensitivity, specificity, and area under the curve estimates. Heterogeneity was assessed visually by forest plots. A funnel plot was constructed to evaluate the risk of bias.

Subgroup meta-analyses were conducted to evaluate the performance differences at both the model level and task level, and subgroups were created based on different tasks and the proportion of positive and negative samples.

All statistical analyses and plots were produced using R (version 4.2.2; R Core Team) and the R package meta4diag (version 2.1.1; Guo J and Riebler A) [ 20 ].

Study Selection and Included Study Characteristics

A flow diagram representing the study selection process is shown in Figure 1 . After 1039 duplicates were removed, the titles and abstracts of 5653 papers were screened, with a percentage agreement of 97% for title and abstract screening. After screening, 51 full-text reports were assessed for eligibility, among which 45 were included in the systematic review [ 21 - 65 ]; the percentage agreement of the full-text review was 87%. Contingency tables could not be made for 40 of the included studies, and meta-analyses were conducted based on 8 AI models extracted from 6 studies. Individual study characteristics are provided in Tables 1 and 2 . Facial feature extraction methods can be categorized into 2 classes: geometrical features (GFs) and deep features (DFs). One typical method of extracting GFs is to calculate distances between facial landmarks, whereas DFs are usually extracted by convolution operations. A total of 20 studies included temporal information, and most of them (18) extracted it through the 3D convolution of video sequences. Feature transformation was also commonly applied to reduce training time or to fuse features extracted by different methods before inputting them into the classifier. The most frequently used classifiers were support vector machines (SVMs) and convolutional neural networks (CNNs). Table 1 presents the model designs of the included studies.
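A toy sketch of the GF idea described above: pairwise Euclidean distances between facial landmarks form a feature vector for a downstream classifier such as an SVM. The landmark coordinates here are made up; real pipelines use tens of detected landmarks.

```python
import math
from itertools import combinations

def geometric_features(landmarks):
    """Pairwise Euclidean distances between (x, y) landmark points:
    a simple geometric feature (GF) vector."""
    return [math.dist(p, q) for p, q in combinations(landmarks, 2)]

# 3 toy landmark points; a real face model would use many more.
feats = geometric_features([(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)])
print(feats)  # [5.0, 10.0, 5.0]
```

Deep features, by contrast, are learned by convolutional layers directly from pixels rather than hand-crafted in this way.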


a The – symbol indicates no temporal features; + indicates time information extracted from 2 images at different times; ++ indicates deep temporal features extracted through the convolution of video sequences.

b SVM: support vector machine.

c GF: geometric feature.

d GMM: Gaussian mixture model.

e TPS: thin plate spline.

f DML: distance metric learning.

g MDML: multiview distance metric learning.

h AAM: active appearance model.

i RVR: relevance vector regressor.

j PSPI: Prkachin and Solomon pain intensity.

k I-FES: individual facial expressiveness score.

l LSTM: long short-term memory.

m HCRF: hidden conditional random field.

n GLMM: generalized linear mixed model.

o VLAD: vector of locally aggregated descriptor.

p SVR: support vector regression.

q MDS: multidimensional scaling.

r ELM: extreme learning machine.

s Labeled to distinguish different architectures of ensembled deep learning models.

t DCNN: deep convolutional neural network.

u GSM: Gaussian scale mixture.

v DOML: distance ordering metric learning.

w LIAN: locality and identity aware network.

x BiLSTM: bidirectional long short-term memory.

a UNBC: University of Northern British Columbia-McMaster shoulder pain expression archive database.

b LOSO: leave one subject out cross-validation.

c ICC: intraclass correlation coefficient.

d CT: contingency table.

e AUC: area under the curve.

f MSE: mean squared error.

g PCC: Pearson correlation coefficient.

h RMSE: root mean squared error.

i MAE: mean absolute error.

j ICC: intraclass correlation coefficient.

k CCC: concordance correlation coefficient.

l Reported both external and internal validation results and summarized as intervals.

Table 2 summarizes the characteristics of model training and validation. Most studies used publicly available databases, for example, the University of Northern British Columbia-McMaster shoulder pain expression archive database [ 57 ]. Table S4 in Multimedia Appendix 1 summarizes the public databases. A total of 7 studies used self-prepared databases. Frames from video sequences were the most common test objects, with 37 studies outputting frame-level pain intensity, while few measured pain intensity from whole video sequences or photos. It was common for a study to redefine pain levels into fewer classes than the ground-truth labels. For model validation, cross-validation and leave-one-subject-out validation were commonly used; only 3 studies performed external validation. Different evaluation metrics were used to report test accuracy, including sensitivity, specificity, mean absolute error (MAE), mean squared error (MSE), the Pearson correlation coefficient (PCC), and the intraclass correlation coefficient (ICC).

Methodological Quality of Included Studies

Table S2 in Multimedia Appendix 1 presents the study quality summary, as assessed by QUADAS-2. All studies carried a risk of bias in patient selection, caused by 2 issues. First, the training data are highly imbalanced, and any method of adjusting the data distribution may introduce bias. Second, the QUADAS-AI correspondence letter [ 19 ] specifies that preprocessing that changes image size or resolution may introduce bias. However, the applicability concern is low, as the images properly represent the feeling of pain. Studies that used k-fold or leave-one-out cross-validation were considered to have a low risk of bias. Although the Prkachin and Solomon pain intensity (PSPI) score was used by most of the studies, its ability to represent individual pain levels has not been clinically validated; as such, the risk of bias and applicability concerns were considered high when the PSPI score was used as the index test. As an advantage of computer vision techniques, the time interval between index tests was short and was assessed as having a low risk of bias. Risk proportions are shown in Figure 2 . Of all 315 entries, 39% (124) were assessed as high risk. In total, 5 studies had the lowest risk of bias, with 6 domains assessed as low risk [ 26 , 27 , 31 , 32 , 59 ].


Pooled Performance of Included Models

The 6 studies included in the meta-analysis contributed 8 different models. The characteristics of these models are summarized in Table S1 in Multimedia Appendix 2 [ 23 , 24 , 26 , 32 , 41 , 57 ]. Classification of PSPI scores greater than 0, 2, 3, 6, and 9 was selected, with each threshold treated as a separate task to create contingency tables; 27 contingency tables were extracted from the 8 models. Test performance is shown in Figure 3 as hierarchical SROC curves. The sensitivity, specificity, and log diagnostic odds ratio (LDOR) were calculated: the pooled sensitivity was 98% (95% CI 96%-99%), the specificity was 98% (95% CI 97%-99%), the LDOR was 7.99 (95% CI 6.73-9.31), and the AUC was 0.99 (95% CI 0.99-1).
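For a single 2×2 table, the LDOR equals the sum of the logits of sensitivity and specificity, which offers a quick consistency check on the pooled estimates above (pooled values come from a correlated bivariate model, so the identity holds only approximately here):

```python
import math

def logit(p):
    """Log-odds of a proportion p in (0, 1)."""
    return math.log(p / (1 - p))

# Pooled sensitivity and specificity of 98% each:
ldor = logit(0.98) + logit(0.98)
print(round(ldor, 2))  # 7.78, close to the pooled LDOR of 7.99
```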


Subgroup Analysis

In this study, subgroup analysis was conducted to investigate the performance differences within models. A total of 8 models were separated and summarized as a forest plot in Multimedia Appendix 3 [ 23 , 24 , 26 , 32 , 41 , 57 ]. The pooled sensitivity, specificity, and LDOR were, respectively: model 1, 95% (95% CI 86%-99%), 99% (95% CI 98%-100%), and 8.38 (95% CI 6.09-11.19); model 2, 94% (95% CI 84%-99%), 95% (95% CI 88%-99%), and 6.23 (95% CI 3.52-9.04); model 3, 100% (95% CI 99%-100%), 100% (95% CI 99%-100%), and 11.55 (95% CI 8.82-14.43); model 4, 83% (95% CI 43%-99%), 94% (95% CI 79%-99%), and 5.14 (95% CI 0.93-9.31); model 5, 92% (95% CI 68%-99%), 94% (95% CI 78%-99%), and 6.12 (95% CI 1.82-10.16); model 6, 94% (95% CI 74%-100%), 94% (95% CI 78%-99%), and 6.59 (95% CI 2.21-11.13); model 7, 98% (95% CI 90%-100%), 97% (95% CI 87%-100%), and 8.31 (95% CI 4.3-12.29); and model 8, 98% (95% CI 93%-100%), 97% (95% CI 88%-100%), and 8.65 (95% CI 4.84-12.67).

Heterogeneity Analysis

The meta-analysis results indicated that AI models are applicable for estimating pain intensity from facial images. However, extreme heterogeneity existed within the models, except for models 3 and 5, proposed by Rathee and Ganotra [ 24 ] and Semwal and Londhe [ 32 ], respectively. A funnel plot is presented in Figure 4 , and a high risk of bias was observed.


Pain management has long been a critical problem in clinical practice, and AI may offer a solution. For acute pain management, automatic measurement of pain can reduce the burden on caregivers and provide timely warnings. For chronic pain management, as specified by Glare et al [ 2 ], further research is needed, and measuring pain presence, intensity, and quality is among the issues to be solved in chronic pain studies. Computer vision could improve pain monitoring through real-time detection for clinical use and through data recording for prospective pain studies. To our knowledge, this is the first meta-analysis dedicated to AI performance in multilevel pain classification.

In this study, each model’s performance at specific pain levels was described by stacking multiple classes into one, making each task a binary classification problem. After careful selection across medical and engineering databases, we observed promising results for AI in evaluating multilevel pain intensity from facial images, with high sensitivity (98%), specificity (98%), LDOR (7.99), and AUC (0.99). It is reasonable to believe that AI can accurately evaluate pain intensity from facial images. Moreover, study quality and risk of bias were evaluated using an adapted QUADAS-2 assessment tool, which is a strength of this study.

To investigate the source of heterogeneity, we assumed that a well-designed model should show similar effect sizes across different pain levels and conducted a subgroup meta-analysis. The funnel and forest plots exhibited extreme heterogeneity. Within-model heterogeneity was observed in Multimedia Appendix 3 [ 23 , 24 , 26 , 32 , 41 , 57 ] for all but 2 models. Models 3 and 5 differed in many aspects, including their algorithms and validation methods, but both were trained with a relatively small data set in which the proportion of positive to negative classes was relatively close to 1. Training with imbalanced data is a critical problem in computer vision studies [ 66 ]; in the University of Northern British Columbia-McMaster pain data set, for example, fewer than 10 of 48,398 frames had a PSPI score greater than 13. We therefore emphasize that imbalanced data sets are one major cause of heterogeneity and of the poorer performance of AI algorithms.

We tentatively propose minimizing the effect of training with imbalanced data by stacking multiple classes into one, an approach already present in studies included in the systematic review [ 26 , 32 , 42 , 57 ]. Common methods of minimizing bias include resampling and data augmentation [ 66 ]. The stacking method was also used in this meta-analysis to compare test results across studies; it is applicable only when classes differ solely in intensity. A disadvantage of combining classes is that a model with few output classes would be insufficient for clinical practice: commonly used pain evaluation tools, such as the VAS, have 10 discrete levels. We therefore recommend that future studies use at least 10 pain levels for model training.
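A minimal sketch of the stacking idea: frame-level multilevel labels are collapsed into one binary task per threshold, which also softens the imbalance caused by rare high-intensity frames. The PSPI scores below are hypothetical.

```python
from collections import Counter

def stack_classes(scores, threshold):
    """Collapse multilevel pain labels into a binary task:
    1 if the score exceeds the threshold, otherwise 0."""
    return [int(s > threshold) for s in scores]

scores = [0, 0, 1, 2, 3, 4, 6, 9, 12]   # toy frame-level PSPI scores
labels = stack_classes(scores, threshold=3)
print(labels)           # [0, 0, 0, 0, 0, 1, 1, 1, 1]
print(Counter(labels))  # 5 negatives, 4 positives after stacking at PSPI > 3
```

Repeating this at several thresholds (e.g., PSPI > 0, 2, 3, 6, 9) yields one binary task per threshold, as in the meta-analysis.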

This study has several limitations. First, insufficient data were included because most studies reported performance metrics (e.g., mean squared error and mean absolute error) that could not be summarized into a contingency table. To create a contingency table suitable for meta-analysis, a study should report the number of samples in each pain class used for model validation, together with the accuracy, sensitivity, specificity, and F 1 -score for each pain class. Such a table cannot be created when a study reports only the MAE, PCC, or other metrics commonly used in AI development. Second, a small-study effect was observed in the funnel plot, and the heterogeneity could not be minimized. Another limitation is that the PSPI score is not clinically validated and is not the only tool for assessing pain from facial expressions; other clinically validated pain intensity assessment methods exist, such as the Faces Pain Scale-revised, the Wong-Baker Faces Pain Rating Scale, and the Oucher Scale [ 3 ], and more databases could be created based on these tools. Finally, AI-assisted pain assessment is expected to cover larger populations, including patients who cannot communicate, for example, patients with dementia or patients with masked faces. However, only 1 study considered patients with dementia, which again reflects the limited databases available [ 50 ].
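As an illustration of the reporting items listed above, the sketch below (with hypothetical numbers) shows how a 2×2 contingency table can be reconstructed when a study reports per-class sample counts together with sensitivity and specificity:

```python
def contingency_table(n_pos, n_neg, sensitivity, specificity):
    """Rebuild TP/FN/TN/FP from class sizes and reported sensitivity
    and specificity (rounded to whole samples)."""
    tp = round(sensitivity * n_pos)
    tn = round(specificity * n_neg)
    return {"TP": tp, "FN": n_pos - tp, "TN": tn, "FP": n_neg - tn}

ct = contingency_table(n_pos=200, n_neg=800, sensitivity=0.95, specificity=0.90)
print(ct)  # {'TP': 190, 'FN': 10, 'TN': 720, 'FP': 80}
```

When only the MAE or PCC is reported, no such reconstruction is possible, which is why those studies had to be excluded from the meta-analysis.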

AI is a promising tool that can aid pain research in the future. In this systematic review and meta-analysis, one computer vision (CV) approach to measuring pain intensity from facial images was investigated. Despite some risk of bias and applicability concerns, CV models can achieve excellent test accuracy. More CV studies in pain estimation, reporting accuracy in contingency tables, and more pain databases are encouraged. Specifically, priority should be given to creating a balanced public database that contains not only healthy but also nonhealthy participants, ideally recorded in a clinical environment. Researchers are also encouraged to report validation results as accuracy, sensitivity, specificity, or contingency tables, together with the number of samples in each pain class, to enable inclusion in future meta-analyses.

Authors' Contributions
WL, AH, and CW contributed to the literature search and data extraction. JH and YY wrote the first draft of the manuscript. All authors contributed to the conception and design of the study, the risk of bias evaluation, data analysis and interpretation, and contributed to and approved the final version of the manuscript.

Data Availability

The data sets generated during and analyzed during this study are available in the Figshare repository [ 67 ].

Conflicts of Interest

None declared.

Multimedia Appendix 1: PRISMA checklist, risk of bias summary, search strategy, database summary, and reported items and explanations.

Multimedia Appendix 2: Study performance summary.

Multimedia Appendix 3: Forest plot presenting the pooled performance of subgroups in the meta-analysis.

  • Raja SN, Carr DB, Cohen M, Finnerup NB, Flor H, Gibson S, et al. The revised International Association for the Study of Pain definition of pain: concepts, challenges, and compromises. Pain. 2020;161(9):1976-1982. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Glare P, Aubrey KR, Myles PS. Transition from acute to chronic pain after surgery. Lancet. 2019;393(10180):1537-1546. [ CrossRef ] [ Medline ]
  • Chou R, Gordon DB, de Leon-Casasola OA, Rosenberg JM, Bickler S, Brennan T, et al. Management of postoperative pain: a clinical practice guideline from the American Pain Society, the American Society of Regional Anesthesia and Pain Medicine, and the American Society of Anesthesiologists' Committee on Regional Anesthesia, Executive Committee, and Administrative Council. J Pain. 2016;17(2):131-157. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Hassan T, Seus D, Wollenberg J, Weitz K, Kunz M, Lautenbacher S, et al. Automatic detection of pain from facial expressions: a survey. IEEE Trans Pattern Anal Mach Intell. 2021;43(6):1815-1831. [ CrossRef ] [ Medline ]
  • Mussigmann T, Bardel B, Lefaucheur JP. Resting-State Electroencephalography (EEG) biomarkers of chronic neuropathic pain. A systematic review. Neuroimage. 2022;258:119351. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Moscato S, Cortelli P, Chiari L. Physiological responses to pain in cancer patients: a systematic review. Comput Methods Programs Biomed. 2022;217:106682. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Thiam P, Hihn H, Braun DA, Kestler HA, Schwenker F. Multi-modal pain intensity assessment based on physiological signals: a deep learning perspective. Front Physiol. 2021;12:720464. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Rojas RF, Brown N, Waddington G, Goecke R. A systematic review of neurophysiological sensing for the assessment of acute pain. NPJ Digit Med. 2023;6(1):76. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Mansutti I, Tomé-Pires C, Chiappinotto S, Palese A. Facilitating pain assessment and communication in people with deafness: a systematic review. BMC Public Health. 2023;23(1):1594. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • El-Tallawy SN, Ahmed RS, Nagiub MS. Pain management in the most vulnerable intellectual disability: a review. Pain Ther. 2023;12(4):939-961. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Gkikas S, Tsiknakis M. Automatic assessment of pain based on deep learning methods: a systematic review. Comput Methods Programs Biomed. 2023;231:107365. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Borna S, Haider CR, Maita KC, Torres RA, Avila FR, Garcia JP, et al. A review of voice-based pain detection in adults using artificial intelligence. Bioengineering (Basel). 2023;10(4):500. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • De Sario GD, Haider CR, Maita KC, Torres-Guzman RA, Emam OS, Avila FR, et al. Using AI to detect pain through facial expressions: a review. Bioengineering (Basel). 2023;10(5):548. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Zhang M, Zhu L, Lin SY, Herr K, Chi CL, Demir I, et al. Using artificial intelligence to improve pain assessment and pain management: a scoping review. J Am Med Inform Assoc. 2023;30(3):570-587. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Hughes JD, Chivers P, Hoti K. The clinical suitability of an artificial intelligence-enabled pain assessment tool for use in infants: feasibility and usability evaluation study. J Med Internet Res. 2023;25:e41992. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Fang J, Wu W, Liu J, Zhang S. Deep learning-guided postoperative pain assessment in children. Pain. 2023;164(9):2029-2035. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Whiting PF, Rutjes AWS, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011;155(8):529-536. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Sounderajah V, Ashrafian H, Rose S, Shah NH, Ghassemi M, Golub R, et al. A quality assessment tool for artificial intelligence-centered diagnostic test accuracy studies: QUADAS-AI. Nat Med. 2021;27(10):1663-1665. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Guo J, Riebler A. meta4diag: Bayesian bivariate meta-analysis of diagnostic test studies for routine practice. J Stat Soft. 2018;83(1):1-31. [ CrossRef ]
  • Hammal Z, Cohn JF. Automatic detection of pain intensity. Proc ACM Int Conf Multimodal Interact. 2012;2012:47-52. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Adibuzzaman M, Ostberg C, Ahamed S, Povinelli R, Sindhu B, Love R, et al. Assessment of pain using facial pictures taken with a smartphone. 2015. Presented at: 2015 IEEE 39th Annual Computer Software and Applications Conference; July 01-05, 2015;726-731; Taichung, Taiwan. [ CrossRef ]
  • Majumder A, Dutta S, Behera L, Subramanian VK. Shoulder pain intensity recognition using Gaussian mixture models. 2015. Presented at: 2015 IEEE International WIE Conference on Electrical and Computer Engineering (WIECON-ECE); December 19-20, 2015;130-134; Dhaka, Bangladesh. [ CrossRef ]
  • Rathee N, Ganotra D. A novel approach for pain intensity detection based on facial feature deformations. J Vis Commun Image Represent. 2015;33:247-254. [ CrossRef ]
  • Sikka K, Ahmed AA, Diaz D, Goodwin MS, Craig KD, Bartlett MS, et al. Automated assessment of children's postoperative pain using computer vision. Pediatrics. 2015;136(1):e124-e131. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Rathee N, Ganotra D. Multiview distance metric learning on facial feature descriptors for automatic pain intensity detection. Comput Vis Image Und. 2016;147:77-86. [ CrossRef ]
  • Zhou J, Hong X, Su F, Zhao G. Recurrent convolutional neural network regression for continuous pain intensity estimation in video. 2016. Presented at: 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW); June 26-July 01, 2016; Las Vegas, NV. [ CrossRef ]
  • Egede J, Valstar M, Martinez B. Fusing deep learned and hand-crafted features of appearance, shape, and dynamics for automatic pain estimation. 2017. Presented at: 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017); May 30-June 03, 2017;689-696; Washington, DC. [ CrossRef ]
  • Martinez DL, Rudovic O, Picard R. Personalized automatic estimation of self-reported pain intensity from facial expressions. 2017. Presented at: 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW); July 21-26, 2017;2318-2327; Honolulu, HI. [ CrossRef ]
  • Bourou D, Pampouchidou A, Tsiknakis M, Marias K, Simos P. Video-based pain level assessment: feature selection and inter-subject variability modeling. 2018. Presented at: 2018 41st International Conference on Telecommunications and Signal Processing (TSP); July 04-06, 2018;1-6; Athens, Greece. [ CrossRef ]
  • Haque MA, Bautista RB, Noroozi F, Kulkarni K, Laursen C, Irani R. Deep multimodal pain recognition: a database and comparison of spatio-temporal visual modalities. 2018. Presented at: 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018); May 15-19, 2018;250-257; Xi'an, China. [ CrossRef ]
  • Semwal A, Londhe ND. Automated pain severity detection using convolutional neural network. 2018. Presented at: 2018 International Conference on Computational Techniques, Electronics and Mechanical Systems (CTEMS); December 21-22, 2018;66-70; Belgaum, India. [ CrossRef ]
  • Tavakolian M, Hadid A. Deep binary representation of facial expressions: a novel framework for automatic pain intensity recognition. 2018. Presented at: 2018 25th IEEE International Conference on Image Processing (ICIP); October 07-10, 2018;1952-1956; Athens, Greece. [ CrossRef ]
  • Tavakolian M, Hadid A. Deep spatiotemporal representation of the face for automatic pain intensity estimation. 2018. Presented at: 2018 24th International Conference on Pattern Recognition (ICPR); August 20-24, 2018;350-354; Beijing, China. [ CrossRef ]
  • Wang J, Sun H. Pain intensity estimation using deep spatiotemporal and handcrafted features. IEICE Trans Inf & Syst. 2018;E101.D(6):1572-1580. [ CrossRef ]
  • Bargshady G, Soar J, Zhou X, Deo RC, Whittaker F, Wang H. A joint deep neural network model for pain recognition from face. 2019. Presented at: 2019 IEEE 4th International Conference on Computer and Communication Systems (ICCCS); February 23-25, 2019;52-56; Singapore. [ CrossRef ]
  • Casti P, Mencattini A, Comes MC, Callari G, Di Giuseppe D, Natoli S, et al. Calibration of vision-based measurement of pain intensity with multiple expert observers. IEEE Trans Instrum Meas. 2019;68(7):2442-2450. [ CrossRef ]
  • Lee JS, Wang CW. Facial pain intensity estimation for ICU patient with partial occlusion coming from treatment. 2019. Presented at: BIBE 2019; The Third International Conference on Biological Information and Biomedical Engineering; June 20-22, 2019;1-4; Hangzhou, China.
  • Saha AK, Ahsan GMT, Gani MO, Ahamed SI. Personalized pain study platform using evidence-based continuous learning tool. 2019. Presented at: 2019 IEEE 43rd Annual Computer Software and Applications Conference (COMPSAC); July 15-19, 2019;490-495; Milwaukee, WI. [ CrossRef ]
  • Tavakolian M, Hadid A. A spatiotemporal convolutional neural network for automatic pain intensity estimation from facial dynamics. Int J Comput Vis. 2019;127(10):1413-1425. [ FREE Full text ] [ CrossRef ]
  • Bargshady G, Zhou X, Deo RC, Soar J, Whittaker F, Wang H. Ensemble neural network approach detecting pain intensity from facial expressions. Artif Intell Med. 2020;109:101954. [ CrossRef ] [ Medline ]
  • Bargshady G, Zhou X, Deo RC, Soar J, Whittaker F, Wang H. Enhanced deep learning algorithm development to detect pain intensity from facial expression images. Expert Syst Appl. 2020;149:113305. [ CrossRef ]


Edited by A Mavragani; submitted 26.07.23; peer-reviewed by M Arab-Zozani, M Zhang; comments to author 18.09.23; revised version received 08.10.23; accepted 28.02.24; published 12.04.24.

©Jian Huo, Yan Yu, Wei Lin, Anmin Hu, Chaoran Wu. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 12.04.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.


April 12, 2024


Study shows AI improves accuracy of skin cancer diagnoses

by Krista Conger, Stanford University


A new study led by researchers at Stanford Medicine finds that computer algorithms powered by artificial intelligence based on deep learning can help health care practitioners to diagnose skin cancers more accurately. Even dermatologists benefit from AI guidance, although their improvement is less than that seen for non-dermatologists.

"This is a clear demonstration of how AI can be used in collaboration with a physician to improve patient care," said professor of dermatology and of epidemiology Eleni Linos, MD. Linos leads the Stanford Center for Digital Health, which was launched to tackle some of the most pressing research questions at the intersection of technology and health by promoting collaboration between engineering, computer science, medicine and the humanities.

Linos, associate dean of research and the Ben Davenport and Lucy Zhang Professor in Medicine, is the senior author of the study, which was published in npj Digital Medicine. Postdoctoral scholar Jiyeong Kim, Ph.D., and visiting researcher Isabelle Krakowski, MD, are the lead authors of the research.

"Previous studies have focused on how AI performs when compared with physicians," Kim said. "Our study compared physicians working without AI assistance with physicians using AI when diagnosing skin cancers."

AI algorithms are increasingly used in clinical settings, including dermatology. They are created by feeding a computer hundreds of thousands or even millions of images of skin conditions, each labeled with information such as diagnosis and patient outcome.

Through a process called deep learning, the computer eventually learns to recognize telltale patterns in the images that correlate with specific skin diseases, including cancers. Once trained, the resulting algorithm can suggest possible diagnoses based on an image of a patient's skin that it has never been exposed to.
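The train-then-predict workflow described above can be sketched in miniature. This is an illustrative toy only: a nearest-centroid classifier stands in for a deep network, and the "images" are made-up three-value feature vectors rather than real photographs.

```python
from statistics import mean

def train_centroids(examples):
    """Learn one centroid (mean feature vector) per diagnosis label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {label: [mean(col) for col in zip(*vecs)]
            for label, vecs in by_label.items()}

def predict(centroids, features):
    """Suggest the label whose centroid is nearest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: dist(centroids[label]))

# Hypothetical labeled training "images" (feature values are invented)
training = [
    ([0.9, 0.8, 0.7], "melanoma"),
    ([0.8, 0.9, 0.6], "melanoma"),
    ([0.1, 0.2, 0.3], "benign nevus"),
    ([0.2, 0.1, 0.2], "benign nevus"),
]
model = train_centroids(training)
print(predict(model, [0.85, 0.8, 0.65]))  # prints "melanoma"
```

A real dermatology model would replace the centroid step with a convolutional network trained on photographs, but the shape of the pipeline, labeled examples in, suggested diagnosis out, is the same.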

These diagnostic algorithms aren't used alone, however. They are overseen by clinicians who also assess the patient, come to their own conclusions about a patient's diagnosis and choose whether to accept the algorithm's suggestion.

An accuracy boost

Kim and Linos' team reviewed 12 studies detailing more than 67,000 evaluations of potential skin cancers by a variety of practitioners with and without AI assistance. They found that, overall, health care practitioners working without aid from artificial intelligence accurately identified about 75% of people with skin cancer—a statistical measure known as sensitivity. The same practitioners correctly ruled out cancer in about 81.5% of people who had cancer-like skin conditions but did not have cancer—a companion measure known as specificity.

Health care practitioners who used AI to guide their diagnoses did better: their diagnoses were about 81.1% sensitive and 86.1% specific. The improvement may seem small, but the differences are critical for people who are told they don't have cancer but do, or who are told they have cancer but don't.
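Sensitivity and specificity, the two measures the study reports, are straightforward to compute from a set of binary calls. A minimal sketch (the toy labels below are invented for illustration, not study data):

```python
def sensitivity_specificity(y_true, y_pred):
    """Compute sensitivity (true-positive rate) and specificity
    (true-negative rate). Labels: 1 = cancer, 0 = benign."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # fraction of cancers correctly flagged
    specificity = tn / (tn + fp)  # fraction of benign lesions correctly cleared
    return sensitivity, specificity

# Toy example: 4 true cancers, 4 benign lesions (illustrative only)
truth = [1, 1, 1, 1, 0, 0, 0, 0]
calls = [1, 1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(truth, calls)
print(sens, spec)  # prints 0.75 0.75
```

In this toy set, one missed cancer lowers sensitivity and one false alarm lowers specificity, which is exactly the trade-off the study's percentages summarize across 67,000 evaluations.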

When the researchers split the health care practitioners by specialty or level of training, they saw that medical students, nurse practitioners and primary care doctors benefited the most from AI guidance—improving on average about 13 points in sensitivity and 11 points in specificity. Dermatologists and dermatology residents performed better overall, but the sensitivity and specificity of their diagnoses also improved with AI.

"I was surprised to see everyone's accuracy improve with AI assistance, regardless of their level of training," Linos said. "This makes me very optimistic about the use of AI in clinical care. Soon our patients will not just be accepting, but expecting, that we use AI assistance to provide them with the best possible care."

Researchers at the Stanford Center for Digital Health, including Kim, are interested in learning more about the promise of and barriers to integrating AI-based tools into health care. In particular, they are planning to investigate how the perceptions and attitudes of physicians and patients to AI will influence its implementation.

"We want to better understand how humans interact with and use AI to make clinical decisions," Kim said.

Previous studies have indicated that a clinician's degree of confidence in their own clinical decision, the degree of confidence of the AI, and whether the clinician and the AI agree on the diagnosis all influence whether the clinician incorporates the algorithm's advice when making clinical decisions for a patient.

Medical specialties like dermatology and radiology, which rely heavily on images—visual inspection, pictures, X-rays, MRIs and CT scans, among others—for diagnoses are low-hanging fruit for computers that can pick out levels of detail beyond what a human eye (or brain) can reasonably process. But even other more symptom-based specialties, or prediction modeling, are likely to benefit from AI intervention, Linos and Kim feel. And it's not just patients who stand to benefit.

"If this technology can simultaneously improve a doctor's diagnostic accuracy and save them time, it's really a win-win. In addition to helping patients, it could help reduce physician burnout and improve the human interpersonal relationships between doctors and their patients," Linos said.

"I have no doubt that AI assistance will eventually be used in all medical specialties. The key question is how we make sure it is used in a way that helps all patients regardless of their background and simultaneously supports physician well-being."

