r/MachineLearning

[D] Monitoring and Debugging RAG Systems in Production

Hi!

I’m part of a team from MIT, where we specialize in developing advanced tools for data visualizations of the latent space. We are currently exploring how visualizations can help increase the effectiveness of RAG monitoring systems and would love to gather insights from how people manage RAGs currently.

We know there are existing monitoring tools like Ragas, Arize (Phoenix), and LangSmith. We are curious about:

  • How often you look at monitoring data

  • What the end-user application your RAG supports looks like

We believe that a visualization tool could greatly enhance the ability to monitor and debug RAG systems in real-time by:

  • Providing intuitive, graphical representations of system performance and behavior.

  • Highlighting potential issues and bottlenecks at a glance.

If you’re willing to share more detailed insights through an interview, please let us know! Happy to get connected and learn more!

u/newpeak

RAG is not an LLMOps problem currently, because it still has a lot of room for improvement in terms of relevance. Therefore, monitoring the pipeline is not helpful yet. Only when RAG results are good enough will LLMOps requirements emerge.

Treating RAG as an orchestration problem belongs to the so-called RAG 1.0, which does not make sense. RAG 2.0 means an end-to-end solution, which requires a series of components to work closely together, including:

  1. Excellent data chunking tools to recognize the semantics of unstructured data.

  2. Query models or query rewrite operators, because there is always a semantic gap between questions and answers.

  3. Good enough databases for retrieval. For the foreseeable future, hybrid search is a MUST; Blended RAG, for example, even requires three-way hybrid search.

  4. Ranking models that are tailored to vertical scenarios.

  5. Dynamic orchestration for Agentic RAG.

I think you could focus on any of the above, as they are much more pressing needs. PS: we are striving to make the above happen through https://github.com/infiniflow/ragflow , on which further work could be built.
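To make the hybrid search point (item 3 above) concrete, here is a minimal sketch of fusing three ranked result lists with reciprocal rank fusion (RRF). RRF is just one common fusion method, swapped in for illustration; it is not necessarily what Blended RAG or RAGFlow does. The retriever names, the document IDs, and the constant `k=60` are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in;
    k=60 is a commonly used default, assumed here rather than tuned.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical top-3 results from three separate retrievers (illustrative only).
full_text_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_b", "doc_a", "doc_d"]
sparse_hits = ["doc_b", "doc_c", "doc_a"]

fused = reciprocal_rank_fusion([full_text_hits, dense_hits, sparse_hits])
print(fused)
```

A document that ranks well in several lists rises to the top even if no single retriever put it first everywhere, which is the basic appeal of combining retrieval paths.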

u/newpeak - really great response.

u/aveho_adhuc_7409

RAG monitoring can be a black box, really interested in seeing what you develop!