Sarita A Joshi

Sarita A. Joshi, an AI Specialist at Google Cloud and a Senior Member of the IEEE, empowers healthcare organizations to achieve transformative outcomes with AI. Her expertise is built on years of leading AI initiatives at Google and Amazon Web Services, where she served as Senior Science Manager and spearheaded AI Centers of Excellence (CoEs). With a background spanning consulting, R&D, and product engineering at industry giants like Amazon, Accenture, and Philips Healthcare, Sarita brings a unique blend of technical acumen and strategic vision. Her contributions extend to the research community through speaking engagements and peer review work at leading AI conferences such as ACM, NeurIPS, AAAI, and IEEE. Sarita holds a Master's degree in Computer Science from Northeastern University, equipping her with the knowledge and experience to guide others in navigating the complexities of AI in healthcare.
Authored Publications
The global adoption of Large Language Models (LLMs) in healthcare shows promise for enhancing clinical workflows and improving patient outcomes. However, Automatic Speech Recognition (ASR) errors in critical medical entities remain a significant challenge, and these errors can lead to severe consequences if undetected. This study investigates the prevalence and impact of ASR errors in medical transcription across Africa, Europe, and North America. By examining variations in accented English across three continents, we analyze the impact of regional speech patterns on ASR performance. Our research quantifies both the potential and limitations of LLMs in mitigating ASR inaccuracies within various medical settings, with particular attention to performance variations across regional accents and medical terminology. Our findings highlight significant disparities in ASR accuracy across regions and identify specific conditions under which LLM corrections prove most effective.
In the rapidly evolving landscape of medical documentation, transcribing clinical dialogues accurately is increasingly paramount. This study explores the potential of Large Language Models (LLMs) to enhance the accuracy of Automatic Speech Recognition (ASR) systems in medical transcription. Utilizing the Primock57 dataset, which encompasses a diverse range of medical consultations, we apply advanced LLMs to refine ASR-generated transcripts. Our research is multifaceted, focusing on improvements in general Word Error Rate (WER), Medical Concept WER (MC-WER) for the accurate transcription of essential medical terms, and speaker diarization accuracy. Additionally, we assess the role of LLM post-processing in improving semantic textual similarity, thereby preserving the contextual integrity of clinical dialogues. Through a series of experiments, we compare the efficacy of zero-shot and Chain-of-Thought (CoT) prompting techniques in enhancing diarization and correction accuracy. Our findings demonstrate that LLMs, particularly through CoT prompting, not only improve the diarization accuracy of existing ASR systems but also achieve state-of-the-art performance in this domain. This improvement extends to more accurately capturing medical concepts and enhancing the overall semantic coherence of the transcribed dialogues. These findings illustrate the dual role of LLMs in augmenting ASR outputs and independently excelling in transcription tasks, holding significant promise for transforming medical ASR systems and leading to more accurate and reliable patient records in healthcare settings.
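Both abstracts evaluate transcripts with Word Error Rate (WER). As a rough illustration only (not the papers' evaluation code), WER is the word-level edit distance between a reference and a hypothesis transcript, normalized by the reference length; the example sentences below are hypothetical:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level Levenshtein distance divided by
    the number of words in the reference."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming table: d[i][j] is the edit distance between
    # the first i reference words and the first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
    return d[len(ref)][len(hyp)] / len(ref)


print(wer("patient denies chest pain", "patient denies chest pains"))  # 0.25
```

MC-WER, as described in the abstract, restricts the same computation to the medical terms in the reference, so a single mistranscribed drug name or condition weighs far more heavily than in general WER.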