Khaled Saab’s Post

Research Scientist at Google DeepMind; Stanford PhD

Introducing *Med-Gemini*, a family of models that extends the best of Gemini into medicine! ✨⚕️

*Highlights* of what you can do with Med-Gemini:
> Answer medical questions with up-to-date knowledge using agentic web search 🔎❤️‍🩹
> Converse about your medical images, videos, and long multi-visit health records 📷📹📃
> Do a literature search by uploading tens of biomedical papers and asking questions 📚
> And so much more! 🏗️

*Development* of Med-Gemini included:
> Advancing clinical reasoning with self-training and search
> Improving multimodal understanding with fine-tuning
> Leveraging long-context capabilities with chain-of-reasoning

Paper: https://lnkd.in/gx5i_PZS

Below, I talk more about self-training with web search to improve Gemini’s clinical reasoning.

----------

*Clinical reasoning* is an iterative process where physicians combine their knowledge with patient information to form a case representation, guiding further data collection until a diagnosis is confirmed. Importantly, as medical knowledge rapidly evolves, clinicians also integrate up-to-date information from authoritative sources.

*Lack of reasoning traces for supervision* is a challenge when training LLMs to improve their clinical reasoning. A popular dataset for medically fine-tuning LLMs is the MedQA (USMLE) train set, but this dataset contains only the questions and their answer choices, not the reasoning that leads to the correct answer. Also, how do we train an LLM to effectively integrate information from sources on the web?

*Self-training with web search*, where we use a handful of expert reasoning traces (with and without web search integration) to kickstart a self-training loop, alleviates the need for large-scale reasoning supervision. For each training question, Med-Gemini generates reasoning traces using the expert demonstrations as in-context examples. We then use the reasoning traces that reached the correct answer to self-train Med-Gemini.

*Uncertainty-guided search at inference* is how we decide when to invoke search. We ask Med-Gemini to answer the same question multiple times and estimate uncertainty by calculating the entropy across the predicted answers; high uncertainty invokes an iteration of web search.

In conclusion, we found that self-training with web search greatly improved Med-Gemini’s clinical reasoning ability, led to a new SoTA on MedQA (USMLE), and generalized to other challenging benchmarks. Check out the paper for more details.

----------

It was so fun to learn from and work alongside amazing mentors and teammates Vivek Natarajan, Tao Tu, Alan Karthikesalingam MD PhD, Wei-Hung Weng, Ryutaro Tanno, Elahe Vedadi, David Stutz, Mike Schäkermann and so many more folks across Google Research/DeepMind. It was an incredible honor to be supported by our inspiring leaders Jeff Dean, Demis Hassabis, Oriol Vinyals, Koray Kavukcuoglu, Yossi Matias, Greg Corrado, and Joëlle Barral.

So excited to continue on the journey of investigating the art of the possible in health AI!
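The two mechanisms described in the post, keeping only reasoning traces that reach the correct answer and using answer entropy to decide when to search, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the trace dictionary layout, the use of natural-log Shannon entropy, and the 0.5 threshold are all assumptions made for illustration.

```python
import math
from collections import Counter


def answer_entropy(answers):
    """Shannon entropy (in nats) of the empirical distribution of sampled answers."""
    counts = Counter(answers)
    total = len(answers)
    # Sum p * log(1/p) over each distinct answer; 0.0 when all samples agree.
    return sum((c / total) * math.log(total / c) for c in counts.values())


def should_search(answers, threshold=0.5):
    """Hypothetical decision rule: invoke web search when entropy exceeds a threshold.

    The threshold value is an illustrative assumption, not taken from the post.
    """
    return answer_entropy(answers) > threshold


def keep_correct_traces(traces, gold_answer):
    """Self-training filter: keep only traces whose final answer matches the label.

    Each trace is assumed to be a dict with "reasoning" and "answer" keys.
    """
    return [t for t in traces if t["answer"] == gold_answer]


# Five sampled answers to the same multiple-choice question.
confident = ["B", "B", "B", "B", "B"]
uncertain = ["A", "B", "A", "C", "B"]
print(should_search(confident))  # all samples agree, so no search
print(should_search(uncertain))  # samples disagree, so search is triggered

traces = [
    {"reasoning": "trace 1", "answer": "B"},
    {"reasoning": "trace 2", "answer": "D"},
]
print(len(keep_correct_traces(traces, gold_answer="B")))
```

In this sketch, entropy is zero when every sample agrees and grows as the sampled answers spread across more options, which is what lets a single threshold separate confident questions from ones that warrant a search iteration.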

Vivek Natarajan

AI Researcher, Google

Incredibly delighted to introduce Med-Gemini, our latest family of multimodal models for medicine. Building on Gemini’s core capabilities, Med-Gemini models are specialized for
- advanced reasoning with seamless web-search integration
- complex multimodal understanding spanning 1 million+ context tokens

The results are promising. We evaluate Med-Gemini on 14 medical benchmarks spanning text, images, surgical videos, EHRs, waveforms, genomics and more, establishing new SoTA on 10 of them, and surpassing the GPT-4 model family on every benchmark where a direct comparison is viable, often by a wide margin. On the popular MedQA (USMLE) benchmark, Med-Gemini achieves SoTA performance of 91.1% accuracy. On 7 multimodal medical benchmarks, Med-Gemini improves over GPT-4V by an average relative margin of 44.5%.

Perhaps the most exciting advance is the biomedical applications unlocked by the long-context capabilities of Med-Gemini (currently beyond the capabilities of other popular models). In addition to SoTA results on long-context EHR and video tasks, we also include qualitative demonstrations of what such Med-Gemini capabilities might potentially enable, including:
- supporting multimodal diagnostic conversations
- facilitating improved clinician-EHR interactions
- accelerating biomedical research with the ability to summarize and generate insights from dozens of full-content research articles.

Our paper has more details, including expert-level performance on real-world relevant tasks such as medical summarization. In due course, we expect the capabilities to be available via Google Cloud MedLM APIs.

Zooming out, it's incredible to see how far we have come in less than a year. We have gone from LLMs that were only doing single-turn medical QA (Med-PaLM 2) to large multimodal models (LMMs) that can natively understand and intelligently converse about biomedical data spanning millions of tokens and perform complex tasks at expert level. At the same time, capability advancements go hand in hand with reliability demonstrations for biomedicine. We are excited to keep pushing the frontiers.

Paper link - https://lnkd.in/geJWPd9X

A real privilege to work with incredible teammates and leads spanning Google Research, Google DeepMind, Google Health, Google Cloud, and Verily, including Alan Karthikesalingam MD PhD, Tao Tu, Khaled Saab, Juraj Gottweis, Wei-Hung Weng, Ryutaro Tanno, S. Sara Mahdavi, David Stutz, Ellery Wulczyn, Yong Cheng, Tomer Golany, Mike Schäkermann, Jimmy Hu, Tim Strother, Chunjong Park, David Barrett, Elahe Vedadi, Fan Zhang, Juan Manuel Zambrano Chaves, Anil Palepu, Daniel McDuff, Luyang Liu, Christopher Semturs, Nenad Tomašev, Aishwarya Kamath, Natasha Latysheva, Jean-Baptiste ALAYRAC, Basil Mustafa, Neil Houlsby, Philip Mansfield, Jonathan Krause, Kavita Kulkarni, Renée Wong, Ehud Rivlin, Yossi Matias, Joëlle Barral, Greg Corrado, Dale Webster, Ewa Dominowska, Jonathon Shlens, S. M. Ali Eslami, Claire Cui, Oriol Vinyals, Koray Kavukcuoglu, James Manyika, Jeff Dean, and Demis Hassabis.

Amazing work

Akhil Vydyula

Instructor & Author | Senior Data Engineer | Udemy Instructor | 1M+ Students Trained in Data Science | Top 0.01% in Machine Learning | Top 1% in Topmate Bookings | Gen AI | LLM | Pyspark | Big Data • SQL

1mo
Rami Roberto Saab

🔍 Connecting you to the talent you need | Founder & CEO at Invictus Direct

1mo

Inspiring work Khaled Saab!

Kirstin Schauble

Senior Systems Engineer @ ANELLO Photonics | Stanford EE PhD

1mo

Incredible 🚀
