🧑‍⚖️ Self-improving evaluators in LangSmith

One method for evaluating LLM systems is to use another LLM "as a judge." These "LLM-as-a-Judge" evaluators can review raw text, using a prompt to guide the grader and automate human review. However, these systems require constant prompt engineering to stay aligned with human preferences.

In LangSmith, you can now use "LLM-as-a-Judge" evaluators with a self-improving feedback loop:
+ Allow a human to easily correct the "LLM-as-a-Judge"
+ Easily pass these corrections back to the "LLM-as-a-Judge" as few-shot examples

In part 1 last week, we showed how to apply self-improving evaluators to any LangSmith project:
+ The evaluator is applied automatically to all traces in your project and can run on production logs
+ It's easy to review, correct, and pass corrections back to improve the evaluator

Here in part 2, we show how to pin self-improving evaluators to any LangSmith dataset:
+ The evaluator is applied on every experiment run on your dataset

In both cases, the evaluator can be self-improved with human feedback!

🎥 Video: https://lnkd.in/gi6CG6qH
📓 Docs: https://lnkd.in/gPbYCnvm
🤓 Data flywheel resource: https://lnkd.in/gurFTjC9
✍️ Blog: https://lnkd.in/gvEgXuJU
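To make the feedback loop concrete, here is a minimal, framework-free sketch of the idea (this is *not* the LangSmith API; `fake_llm`, `judge`, and the data shapes are hypothetical stand-ins). A human correction is stored and replayed as a few-shot example in later judge prompts, so the judge's next grade reflects the reviewer's preference:

```python
# Sketch of a self-improving LLM-as-a-Judge loop. All names here are
# illustrative; in LangSmith the platform manages this loop for you.

RUBRIC = "You are a judge. Grade the answer PASS or FAIL for helpfulness."


def build_judge_prompt(answer: str, corrections: list[dict]) -> str:
    """Assemble the judge prompt; human corrections become few-shot examples."""
    shots = "".join(
        f"\nAnswer: {c['answer']}\nGrade: {c['grade']}" for c in corrections
    )
    return f"{RUBRIC}{shots}\nAnswer: {answer}\nGrade:"


def fake_llm(prompt: str) -> str:
    """Hypothetical model stub: if the prompt's few-shot examples already
    contain a grade for this exact answer, echo it (crudely mimicking
    in-context learning); otherwise default to PASS."""
    head, _, tail = prompt.rpartition("Answer:")
    answer = tail.split("\nGrade:")[0].strip()
    marker = f"Answer: {answer}\nGrade: "
    if marker in head:
        return head.split(marker, 1)[1].split("\n")[0].strip()
    return "PASS"


def judge(answer: str, corrections: list[dict]) -> str:
    """Run the LLM-as-a-Judge over one answer with the current corrections."""
    return fake_llm(build_judge_prompt(answer, corrections))


corrections: list[dict] = []

# 1) The judge grades an output on its own: defaults to PASS.
first = judge("We will refund you.", corrections)

# 2) A human reviewer disagrees and records a correction.
corrections.append({"answer": "We will refund you.", "grade": "FAIL"})

# 3) The correction is passed back as a few-shot example, and the
#    judge now grades the same answer FAIL.
second = judge("We will refund you.", corrections)
```

The same pattern scales to a real model: the few-shot block steers the judge toward human-preferred grades without rewriting the rubric by hand each time.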
Nag Maddula - Please take a look for the product modeler, to refine the answers before publishing to the user!
The concept of self-improving evaluators in LangSmith is a game-changer! It's impressive how it allows a human to easily correct and improve the LLM-as-a-Judge systems, creating a continuous feedback loop for enhanced performance. The ability to apply these evaluators to any project or dataset further adds to its utility. Excited to see more advancements in this space! Keep up the excellent work.