Li Yin’s Post

LightRAG | AI researcher | x MetaAI

Both AI research and engineering use PyTorch, but there is not a single shared library between the RAG and agent research and engineering communities. I have read hundreds of papers on RAG and agents; almost none of them use frameworks like LangChain or LlamaIndex. I have talked to many ML engineers building RAG and agent products; most use only the utility functions of these tools, like text splitters or output parsers. The majority of people who use these two popular frameworks are software engineers without much background in AI.

Building RAG and agent applications is no different from training a model; you just currently tune your prompts manually. You need evaluations, you need iterations, you need to get in-context learning working well and gather more data to automate your prompting process and move into model finetuning.

The problem with researchers' code is that it has no abstraction at all. It works only on the benchmarking dataset, lacks the robust string processing needed to generalize tool calls, and is hard to read because it is poorly organized. It is therefore really hard to adapt to production. The problem with the frameworks is the opposite: they stack many layers of abstraction without the strong engineering needed to adapt robustly to different use cases. They are closer to no-code solutions, perfect out of the box but terrible for products and research.

There can be a solution that serves both the ML research and product communities: a library that provides strong string processing, an easy-to-understand tool interface to support tool calls, various output formats, and access to different models, with great monitoring capabilities, like TensorBoard for PyTorch. That is the mission and purpose of LightRAG.
Let researchers and engineers work on what matters: prompts, datasets, evaluations, and model finetuning, without either building everything from scratch with subpar solutions or using an over-engineered framework that makes them spend more time figuring out how to customize it than on what matters. I know mine is not a popular voice, but this is super important, and I think the majority of the community is moving in the wrong direction, including some of the people I looked up to and considered the best in the field. If you are passionate about a library that really unites RAG and agent research and engineering, please work with me. I know I'm not able to do this alone.
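To make the "easy-to-understand tool interface" idea concrete, here is a hypothetical sketch (not LightRAG's actual API; all names are illustrative) of how a minimal tool-call layer could register plain functions and dispatch a model-emitted JSON call:

```python
import json

# Registry mapping tool names to plain Python functions.
TOOLS = {}

def tool(fn):
    """Register a plain function as a callable tool."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def add(a: int, b: int) -> int:
    return a + b

def dispatch(call_json: str):
    """Execute a model-emitted call like {"name": ..., "args": ...}."""
    call = json.loads(call_json)
    return TOOLS[call["name"]](**call["args"])

result = dispatch('{"name": "add", "args": {"a": 2, "b": 3}}')
# result == 5
```

The point is that nothing here needs a framework: a dict, a decorator, and `json.loads` are enough to get a readable tool interface that both researchers and engineers can extend.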

Li Yin

LightRAG | AI researcher | x MetaAI

2mo

Thanks to everyone for your support. I never planned a library. I was solving a few hard problems with RAG and agents in our production, and only then found that we had learned so much from both the research and the engineering, and that we already had our own library. I thought hard about whether to open-source it; it is the core tech foundation of our startup. But I think we can do something great here if we open-source it and unite the community to build something amazing. To join the discussion on such a library, please join the Discord (light-rag) channel: https://discord.gg/ezzszrRZvT Looking forward to more input on the final direction and structure of the library. Here is my previous post on LightRAG: https://www.linkedin.com/posts/li-yin-ai_join-the-people-search-copilot-discord-server-activity-7188204575042990080-A_rQ?utm_source=share&utm_medium=member_desktop

Chris Blodgett

VP Development Formax.us

2mo

The problem I have with the RAG frameworks is that most of the stuff really isn't that useful, and the layers of abstraction just overcomplicate things and make them inflexible. Take the prompt template in LangChain: it is basically a glorified Python f"" string with {} placeholders passed in as its parameters. The 'chains' basically come down to prompting the models in very specific ways and restricting the output so much that you can funnel it into doing what you want; that's pretty much it. I don't need a framework when the gist of the matter is creating a string to pass to an LLM :) Most of the work is just coming up with the prompts you want, in the order you want, with enough code around the responses to get what you need out of them for what you're trying to do. Most of what I've seen in the frameworks can be done with a fairly minimal amount of Python code, and then you'll know exactly what is going on, in the most efficient way possible.
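The "glorified f-string" point can be shown in a few lines. This is a minimal sketch in plain Python (not LangChain's API) of everything a prompt "template" fundamentally does:

```python
# A "prompt template" is just deferred string formatting.
def make_prompt(template: str):
    """Return a function that fills the template's {placeholders}."""
    def fill(**kwargs):
        return template.format(**kwargs)
    return fill

qa_prompt = make_prompt(
    "Answer the question using only the context below.\n"
    "Context: {context}\n"
    "Question: {question}\n"
)

prompt = qa_prompt(context="The sky is blue.", question="What color is the sky?")
print(prompt)
```

Swapping templates, reordering prompts, or post-processing responses is then ordinary Python, with no framework indirection between you and the string sent to the model.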

Laurentiu Petrea

Director of AI | Creator of CataLlama | I build LLMs, Simple AI Agents & RAG pipelines | 12 years in tech 6 years in Leadership, Operations & Strategy | Engineering & Product

2mo

100% my experience. We started with LangChain and now we are removing it and going "native" for the reasons you already mentioned. Whenever I was looking for scripts to fine-tune, it was a mess because everything was so disorganised and poorly written (we fixed that); same for RAG. Bridging the gap is needed: software engineers need to learn AI without the abstraction layers of popular frameworks, and ML engineers need to write code that others can work with. That's probably why MLOps is perceived as so hard: someone needs to make sense of the code. Wrapping FastAPI on top and dockerising it is a piece of cake. What do you need help with?

Ion Moșnoi

6y+ in AI / ML | fix fast GEN AI problems | RAG | enterprise LLM | NLP | Python | Langchain | GPT4 | AI tools dev | Finetuning | AI ChatBot | B2B Contractor | Freelancer

2mo

I mostly use LangChain to save and retrieve information using FAISS. I don't use the character splitter, because it does not make any sense to split the text at an arbitrary point once it passes 1,000 characters. I use NER to detect titles, paragraphs, and other important concept features in order to split and link the chunks based on their properties; different applications require different NER classes. It will be hard to make engineers and researchers use the same library.
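The structure-aware splitting Ion describes can be sketched as follows. This is an illustrative stand-in: a regex heading detector plays the role of the NER model, and the chunk-linking scheme is an assumption, not his production code:

```python
import re

# Detect numbered or ALL-CAPS section titles; in practice an NER model
# (with application-specific classes) would do this detection instead.
TITLE_RE = re.compile(r"^(?:\d+\.\s+.+|[A-Z][A-Z ]{3,})$")

def split_by_structure(text: str):
    """Split on detected titles and link neighboring chunks."""
    chunks, current_title, current_lines = [], "Preamble", []
    for line in text.splitlines():
        if TITLE_RE.match(line.strip()):
            if current_lines:
                chunks.append({"title": current_title,
                               "text": "\n".join(current_lines).strip()})
            current_title, current_lines = line.strip(), []
        else:
            current_lines.append(line)
    if current_lines:
        chunks.append({"title": current_title,
                       "text": "\n".join(current_lines).strip()})
    # Link each chunk to its neighbors so retrieval can expand context.
    for i, chunk in enumerate(chunks):
        chunk["prev"] = chunks[i - 1]["title"] if i > 0 else None
        chunk["next"] = chunks[i + 1]["title"] if i + 1 < len(chunks) else None
    return chunks

doc = ("1. Introduction\nRAG combines retrieval with generation.\n"
       "2. Methods\nWe use FAISS for vector search.")
for c in split_by_structure(doc):
    print(c["title"], "->", c["next"])
```

Unlike a fixed 1,000-character splitter, this never cuts mid-paragraph, and the `prev`/`next` links let a retriever pull in adjacent sections when a chunk alone lacks context.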

Olli-Pekka Heinisuo

Co-founder, Senior Software Architect @ Softlandia

2mo

If you are building production-grade RAG systems, you should not use LangChain or similar abstraction libraries. There is no holy-grail library for RAG, and there probably never will be, since RAG does not generalize to all problems and use cases. The libs get you 80% there, and then you need to throw them away to deliver the final, hardest 20%. Instead of these one-size-fits-all libraries, assemble a custom RAG for your specific use case from smaller building blocks. Maybe that helps you build your own library. The current abstraction level in the libraries is too high. I think the Unix philosophy of doing one thing and doing it well is the correct approach.
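The "smaller building blocks" approach can be sketched as three single-purpose functions that compose into a pipeline. The naive word-overlap scoring and the stubbed model call are placeholders; each piece is independently replaceable, which is the Unix-philosophy point:

```python
# A custom RAG assembled from small, single-purpose functions.
# Scoring and the "LLM" are stand-ins: swap in a vector index or a
# real model call without touching the other pieces.

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the query."""
    q = set(query.lower().split())
    return sorted(docs, key=lambda d: len(q & set(d.lower().split())),
                  reverse=True)[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Join retrieved context into a single prompt string."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\nQuestion: {query}\nAnswer:"

def generate(prompt: str) -> str:
    """Stand-in for a real model call (e.g. an HTTP request)."""
    return f"[model output for {len(prompt)} prompt chars]"

docs = ["FAISS indexes dense vectors.",
        "Unix pipes compose small tools.",
        "RAG retrieves context before generation."]
answer = generate(build_prompt("How does RAG use context?",
                               retrieve("RAG context", docs)))
```

When the last 20% demands a custom reranker or prompt format, you edit one small function instead of fighting a framework's abstraction.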

(Samuel) Yongrui S.

Founder&CEO of Chat Data

2mo

Building on top of other people's frameworks takes more time than using basic programming tools to build our own for a problem as simple as RAG, and it is less flexible. Why would people learn tools like LangChain at all? Just a waste of time.

Leo Walker

Data Scientist | Military Veteran

2mo

I always enjoy seeing the LangChain or LlamaIndex teams replicate research papers with their tools; the one that comes to mind is the RAPTOR approach to RAG. https://youtu.be/jbGchdTL7d0?si=bQt4OhpAllikImlR

Kevin Tran

Research Scientist @ Stanford GSB | LinkedIn Top Voice 2019 in Data Science & Analytics

2mo

Practically, I very much agree. I think the best framework is one that is high-level enough that you can iterate on your ideas quickly without having to start from scratch, while still giving you a lot of flexibility to adapt to your specific use case. For example, pandas, scikit-learn, Hugging Face, etc. seem like great frameworks for their intended purposes. LightRAG sounds promising based on your description. I just joined your Slack channel.

David Xue

Chief AI Consultant at Blue Cargo Inc | Empowering 100+ Researchers in Industrial AI Application

2mo

This might be the best, or at least the boldest, perspective I've seen on the challenges between research and practical application in RAG and fine-tuning. I've experienced similar frustrations over the last six months: initially excited by the potential of frameworks like LangChain and LlamaIndex, only to be disappointed by their inflexibility due to excessive abstraction. I completely agree that bridging the gap between practical application and research is daunting yet crucial. I greatly appreciate the efforts being made with LightRAG to address these issues. I might contribute, not on the research side, but possibly on the funding, marketing, or community aspects. Appreciate your work and perseverance.

Siddhesh Gunjal

Senior Research Scientist @ NielsenIQ | Creator of Slackker (PyPi Package) | Former Adjunct faculty @ upGrad

2mo

This is something I have realised too. We are currently using Haystack by deepset to build our production-level products. It has a good balance of abstraction and supports building good RAG systems with third-party embedding stores. But sometimes it's a hassle to bring everything into one place for different use cases. So yes, I really love your idea for LightRAG and would love to contribute.
