TELLUS International's Post

I have mentioned in my LinkedIn shares and TELLUS International blog posts that using RAG (Retrieval Augmented Generation) with a Large Language Model (LLM) is an interesting opportunity for organizations to combine internal information with an LLM. A typical example is a Q&A database whose contents are used with an LLM. According to a recent article in ZDNET, RAG is the practice of "having an LLM respond to a prompt by sending a request to some external data source, such as a vector database, and retrieve authoritative data". Furthermore, "the most common use of RAG is to reduce the propensity of LLMs to produce hallucinations, where a model asserts falsehoods confidently," states the ZDNET article. RAG does not come without its issues, though, and the article provides valuable insights into what current academic research has identified and how vendors are trying to circumvent the potential issues with the use of RAG. New research suggests LLM training methods to make RAG more reliable and avoid hallucinations or incorrect results. I recommend reviewing the article from ZDNET (written by Tiernan Ray, Senior Contributing Writer). It is very valuable content. #RAG #AI #TELLUSInt #continuouslearning #GenAI TELLUS International https://lnkd.in/ggR9ksAm
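To picture the practice the ZDNET quote describes, here is a minimal retrieve-then-generate sketch. The `vector_db.search` and `llm` calls are hypothetical placeholders, not any particular vendor's API:

```python
# A minimal RAG loop, assuming a hypothetical `vector_db` client and
# `llm` completion function -- both placeholders, not a specific product's API.

def answer_with_rag(question: str, vector_db, llm, k: int = 3) -> str:
    # 1. Retrieve: ask the external data source for the k most similar snippets.
    snippets = vector_db.search(query=question, top_k=k)  # hypothetical call

    # 2. Augment: pack the authoritative snippets into the prompt as context.
    context = "\n\n".join(s.text for s in snippets)
    prompt = (
        "Answer the question using ONLY the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

    # 3. Generate: the LLM grounds its answer in the retrieved text,
    # which is what reduces the propensity to hallucinate.
    return llm(prompt)  # hypothetical call
```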
More Relevant Posts
-
AI Transformation, Business Modeling, Software Pricing/Packaging, and Advisory. Published author with a strong software business background. Providing interim management roles in the software/IT field
Make room for RAG: How Gen AI's balance of power is shifting
zdnet.com
-
In my experience, many practitioners and companies are still curious about how Retrieval Augmented Generation (#RAG) works, when it should be used, how to connect the dots, and so on. With this in mind, I've written a new blog post at LLMs.HowTo on Retrieval Augmented Generation (#RAG) and how it enhances AI capabilities beyond traditional large language models. In this introductory post, I delve into:
- The limitations of conventional large language models
- How RAG addresses these challenges by integrating dynamic knowledge retrieval
- The practical applications and benefits of using RAG in various industries
Whether you're a data scientist, software engineer, or a curious tech enthusiast, hopefully this post helps cut through the noise when it comes to #RAG and #LLMs. https://lnkd.in/ewnQGFvV What are your thoughts on the potential of RAG to transform AI applications? What else would you like to learn about RAG? Let me know in the comments. #AI #MachineLearning #DataScience #RAG #ArtificialIntelligence #SemanticSearch #GPT #Chatbots #RetrievalAugmentedGeneration
Understanding Retrieval Augmented Generation (RAG): Supercharging LLM Capabilities with Embeddings and Semantic Search | LLMs.HowTo
llmshowto.com
-
1/2 Key takeaways:
- The power of RAG: enhances model performance for more accurate and context-aware content.
- LangChain: build robust language model applications; it simplifies the RAG process.
- Query transformation: ensures models understand and process queries accurately.
- HyDE: boosts retrieval by generating a hypothetical answer document and searching with its embedding (a rough sketch follows the link below).
- Smart routing: selects the best data source for accurate and reliable information retrieval.
- Diverse retrieval techniques: Self-RAG, adaptive RAG, and CRAG, each suited to different scenarios.
- Generation phase: the final step, synthesizing the retrieved information into coherent and accurate responses.
- Real-world application: demonstrates RAG's flexibility and power using Neo4j scenarios.
- Code examples and tools: detailed guides to help users understand and implement RAG easily.
- LangSmith: visualize the entire RAG process with a GUI for easier debugging and optimization.
#AI #Tech #NLP #DataScience #Innovation #Developers #Tools #Technology #Applications https://lnkd.in/gwuHWFuJ
Learn RAG with Langchain
sakunaharinda.xyz
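To make the HyDE bullet above concrete, here is a minimal sketch of the idea under stated assumptions: `llm` and `embed` are hypothetical placeholder callables, documents are pre-embedded into a NumPy matrix, and real LangChain pipelines wrap these steps differently:

```python
import numpy as np

def hyde_retrieve(query, llm, embed, doc_embeddings, docs, top_k=3):
    """HyDE: search with the embedding of a *hypothetical* answer, which
    often lands closer to real answer documents than the raw query does.
    `llm` and `embed` are placeholder callables, not a specific API."""
    # 1. Ask the model to write a plausible (possibly wrong) answer passage.
    hypothetical = llm(f"Write a short passage that answers: {query}")

    # 2. Embed the hypothetical document instead of the query itself.
    q_vec = embed(hypothetical)  # shape (d,)

    # 3. Rank the real documents by cosine similarity to the hypothetical one.
    sims = doc_embeddings @ q_vec / (
        np.linalg.norm(doc_embeddings, axis=1) * np.linalg.norm(q_vec)
    )
    best = np.argsort(sims)[::-1][:top_k]
    return [docs[i] for i in best]
```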
-
Check out my latest article, "The Practical Limitations and Advantages of Retrieval Augmented Generation (RAG)," on Towards Data Science! My team has observed a tendency to believe that RAG can enable a language model to answer any question regardless of its complexity, but the truth is that it still has many shortcomings. Here are some key pain points and competencies of RAG in a nutshell.

Limitations:
- The inability to reason iteratively
- The inability to identify biased or non-factual retrieved data before passing it to the LLM
- The dependency on the organization of the underlying data

Beneficial capabilities:
- Accessing domain-specific or confidential information, with the proper security implementation (see the permission-filtering sketch below)
- Providing the language model with the most up-to-date information, circumventing the knowledge cutoff of its training data

To further understand where RAG shines or misses the mark, read my article below! #RAG #LLM #SLM #Limitation #EthicalAI #AI #artificialintelligence https://lnkd.in/g5RUVDbe
The Limitations and Advantages of Retrieval Augmented Generation (RAG)
towardsdatascience.com
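On the "confidential information with proper security" point, a common pattern is to enforce permissions at retrieval time, before anything reaches the model. A minimal sketch, assuming a hypothetical in-memory index of records with `vector`, `text`, and `allowed_groups` fields:

```python
import numpy as np

def retrieve_authorized(query_vec, index, user_groups, top_k=3):
    """Return the top_k snippets the requesting user may see.
    `index` is a hypothetical list of records with `vector`, `text`,
    and `allowed_groups` fields -- a stand-in for a real vector store."""
    scored = []
    for rec in index:
        # The security check happens BEFORE anything reaches the LLM:
        # unauthorized snippets never enter the prompt.
        if not set(rec["allowed_groups"]) & set(user_groups):
            continue
        sim = float(np.dot(rec["vector"], query_vec))
        scored.append((sim, rec["text"]))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [text for _, text in scored[:top_k]]
```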
-
Senior Data Scientist/Machine Learning/NLP/Deep Learning Talks about #rlhf, #generativeai, #nlppractitioner, #largelanguagemodels, and #machinelearningsolutions.
Meta AI Research has introduced Shepherd, a language model specifically tuned to critique model responses and suggest refinements, extending beyond the capabilities of an untuned model to identify diverse errors and provide suggestions to remedy them. #meta #llm #nlp #nlppractitioner #ai #ai4good
Repo link: https://lnkd.in/gESAKKY3
Paper link: https://lnkd.in/gnXRYc45
GitHub - facebookresearch/Shepherd: This is the repo for the paper Shepherd -- A Critic for Language Model Generation
github.com
-
Enterprise Solution Architect at Adobe | GenAI | AI & ML | Data Engineering | Data Platform | Delta Lake| Data Warehouse | Micro Services | Python | Scala| Spark | AWS | AZURE
Boost Your LLM Application Performance - RAG vs. Fine-Tuning

As the demand for Large Language Models (LLMs) continues to rise, developers and organisations are actively creating applications to leverage their capabilities. Yet, when pre-trained LLMs fail to meet expectations, a crucial question arises: how can we enhance the performance of LLM applications? This dilemma leads us to explore options like Retrieval-Augmented Generation (RAG) and model fine-tuning to optimize the outcomes.

RAG: This method combines the capabilities of retrieval (or search) with LLM text generation. It incorporates a retriever system that fetches pertinent document snippets from an extensive database, along with an LLM that generates responses using the information gathered from those snippets. Essentially, RAG enables the model to access external information, enhancing the quality of its responses.

Fine-tuning: This process involves refining a pre-trained LLM by training it on a targeted dataset, customizing it for a specific task or enhancing its performance. Fine-tuning adjusts the model's weights using your own data, tailoring the model to the specific requirements of your application.

Considerations for choice: Both RAG and fine-tuning are potent techniques for improving LLM-based applications, each addressing distinct aspects of the optimization process. Choosing between them is pivotal, as they can offer comparable outcomes while differing in complexity, cost, and quality.
- Complexity vs. context: choose RAG when deep and diverse external context is essential, even if it adds complexity. Opt for fine-tuning for simpler tasks where a specialized, task-specific model is sufficient.
- Data availability: if abundant task-specific data is available, fine-tuning can be effective. RAG is preferable when extensive external information is required.
- Resource constraints: RAG demands more computational resources due to its external retrieval mechanism. Fine-tuning might be a better choice for resource-constrained environments.
- Task complexity: for complex tasks requiring nuanced understanding, RAG's ability to leverage diverse external contexts can provide significant advantages.

My evaluation is grounded in the observation that both methods can yield similar results yet vary in their intricacy and resource requirements. Ultimately, the choice between RAG and fine-tuning depends on the specific requirements of the task, the availability of data, and the complexity of the desired application. A minimal sketch contrasting the two workflows follows. #generativeAI #RAG #retrievalaugmentedgeneration #LLM #finetuning
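As promised above, a schematic contrast of the two workflows; `base_llm`, `retriever`, and `Trainer` are hypothetical placeholders, not a specific library's API:

```python
# Two ways to specialize the same base model -- schematic only;
# `base_llm`, `retriever`, and `Trainer` are placeholders, not a real library.

# Option A: RAG -- no weight updates; knowledge lives in an external store
# and is injected into the prompt at inference time.
def rag_answer(question, base_llm, retriever):
    context = "\n".join(retriever.top_k(question, k=3))  # hypothetical call
    return base_llm(f"Context:\n{context}\n\nQ: {question}\nA:")

# Option B: Fine-tuning -- knowledge and behavior are baked into the weights
# by continued training on task-specific (prompt, completion) pairs.
def finetune(base_llm, task_dataset, Trainer):
    trainer = Trainer(model=base_llm, train_data=task_dataset, epochs=3)
    return trainer.train()  # returns a specialized model; no retriever needed
```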
-
How to Use Cross Attention and Self Attention Blocks with LLMs

LLMs use cross attention and self attention blocks, both built on the attention mechanism, to process and generate natural language. The attention mechanism computes the relevance or similarity between different parts of an input sequence. Self attention blocks operate on a single input sequence, while cross attention blocks relate two different input sequences (a toy implementation of both follows below).

Cross attention and self attention blocks enable LLMs to:
- Capture the semantic and syntactic relationships within and across different input sequences
- Generate coherent and relevant output sequences
- Handle variable-length input sequences
- Be efficient and scalable compared to other methods

To use cross attention and self attention blocks, you need to:
- Have a basic understanding of the transformer architecture, which is the foundation of LLMs
- Have access to a pre-trained LLM or a framework that allows you to train your own LLM

Here are some examples of how LLMs with cross attention and self attention blocks can be applied to different tasks or domains:
- Content creation: generate different types of content, such as blog posts, product descriptions, reviews, or social media posts, and adjust the tone, style, and depth of the content depending on your audience and purpose. For example, a role prompt can instruct your LLM to write like a poet, a journalist, a comedian, or a teacher.
- Text summarization: generate summaries of long texts, such as articles, books, or reports, and control the level of detail and the perspective of the summaries depending on your needs. For example, a prefix prompt can provide context or guidance to your LLM, such as the main topic, the key points, or the target length of the summary.
- Text analysis: analyze the sentiment, tone, or style of a given text, such as customer feedback, a tweet, or a speech; compare and contrast different texts; or identify the role or intention of the author. For example, a chain-of-thought prompt can ask your LLM to explain why a text is positive, negative, or neutral, and to provide evidence or examples to support its analysis.

#genai #llm #gpt4 #learning
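Here is the toy single-head NumPy implementation referenced above; it shows the only structural difference between the two blocks, namely where the keys and values come from (no masking, batching, or multi-head logic):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: weight each value by query-key similarity."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # relevance of every key to every query
    return softmax(scores, axis=-1) @ V  # weighted sum of values

rng = np.random.default_rng(0)
d = 16
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

X = rng.normal(size=(5, d))  # one sequence (e.g. the text being generated)
Y = rng.normal(size=(7, d))  # another sequence (e.g. an encoded source text)

# Self-attention: queries, keys, and values all come from the SAME sequence.
self_out = attention(X @ Wq, X @ Wk, X @ Wv)   # shape (5, d)

# Cross-attention: queries come from X, but keys/values come from a DIFFERENT
# sequence Y -- this is how a decoder attends to an encoder's output.
cross_out = attention(X @ Wq, Y @ Wk, Y @ Wv)  # shape (5, d)
```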
-
Going through the list of papers for #ICML2024 and #ICLR2024 and found some interesting ones that I'll share over the next few days. But here's the big one that got me hooked:

Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
Paper: https://lnkd.in/dWkn_d_6
Authors: Yeonhong Park and others.

Essentially, the paper proposes:
- Rapid generation of multiple models at different quantization levels. Think about having 2-, 3-, 4-, 5-, 6-, 7-, or 8-bit models always in memory, but with roughly the memory requirement of only the 8-bit model (well, almost - they do need some metadata, but its memory requirement is an order of magnitude lower).
- A specialized LLM serving engine with support for serving different quantization levels of the same model.

Why is this important? Imagine the scenario where you have a highly fine-tuned LLM (or an SLM) that you deploy for your users. As a failsafe, you also need to deploy a secondary LLM that comes into play if the first one goes down. However, as soon as the secondary LLM comes online, the users will notice - it's not a fine-tuned version of your first model! Or even if you deploy a fine-tuned version, you end up deploying TWO models, in memory, in parallel. This paper solves that problem: a possible failsafe method for an in-production LLM that fails over to smaller, more heavily quantized versions based on bandwidth and user requirements. Incredible!

For a truly redundant system, though, this still doesn't work, because if the LLM serving engine goes down, all models go down as well. So don't throw away your failsafe LLM just yet, but this might be a handy alternative. A toy illustration of the nested-quantization idea follows below.

Book a free 15 minute call with me: https://lnkd.in/dGnHs_Mn
#icml2024 #iclr2024 #llm #nlp #machinelearning #consulting #generativeai #openai #microsoft #iclr #icml #research #paper
Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
arxiv.org
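As promised, a toy illustration of the nested-quantization idea (emphatically not the paper's actual algorithm, which the authors engineer far more carefully): store weights once at 8 bits, then derive coarser models by dropping low-order bits, so every precision shares the same memory:

```python
import numpy as np

def quantize_u8(w):
    """Uniform 8-bit quantization of a float weight tensor."""
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / 255.0
    q = np.round((w - lo) / scale).astype(np.uint8)
    return q, lo, scale

def view_at_precision(q8, lo, scale, bits):
    """Derive a `bits`-bit model from the stored 8-bit weights by keeping
    only the top `bits` bits -- no extra copy of the model is needed."""
    shift = 8 - bits
    q_top = (q8 >> shift).astype(np.float32)  # truncate the low-order bits
    step = scale * (1 << shift)               # coarser quantization step
    # add the expected value of the dropped bits so the estimate is unbiased
    return q_top * step + lo + (step - scale) / 2.0

w = np.random.default_rng(1).normal(size=(4, 4)).astype(np.float32)
q8, lo, scale = quantize_u8(w)
for bits in (8, 4, 2):
    w_hat = view_at_precision(q8, lo, scale, bits)
    print(bits, "bits -> max reconstruction error:", np.abs(w - w_hat).max())
```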
-
Producing accurate and reliable AI-generated text has always been a challenge. The paper Corrective Retrieval Augmented Generation (CRAG), published on 1/29/24 (link: https://lnkd.in/eh3si68K), introduces an innovative solution to enhance the robustness and accuracy of text generated by Large Language Models (LLMs). The researchers developed a method that significantly improves the performance of retrieval-augmented generation (RAG) models by introducing a corrective mechanism that evaluates and refines the relevance of retrieved documents. This ensures that the generated text is not only fluent but also factually accurate.

The Problem with Current LLMs
LLMs, despite their ability to understand instructions and produce fluent text, are prone to hallucinations - generating information that might be irrelevant or incorrect. The traditional method to mitigate this issue, RAG, relies heavily on the relevance of retrieved documents, which can sometimes be inaccurate or misleading.

Introducing CRAG: A Novel Approach
CRAG proposes a lightweight retrieval evaluator to assess the quality of retrieved documents. It uses a decompose-then-recompose algorithm to focus on key information, filtering out irrelevant content. This method is plug-and-play, easily integrating with existing RAG-based approaches to improve their performance across various datasets.

Empirical Evidence of Success
The researchers conducted experiments across four datasets covering both short- and long-form generation tasks. The results showed that CRAG significantly outperformed existing RAG and Self-RAG approaches, demonstrating its adaptability and generalizability in improving factual accuracy in AI-generated text.

Time to Upgrade Your RAG to CRAG! A rough sketch of the corrective loop follows below.

#llm #llms #rag #crag #artificialintelligence
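Based purely on the description above, the corrective loop might be sketched like this; `retriever`, `evaluator`, `web_search`, and `llm` are placeholders, and the thresholds are made up for illustration, not taken from the paper:

```python
# A rough sketch of a CRAG-style corrective loop: score each retrieved
# document, keep the good ones, and fall back to an alternative source when
# retrieval looks unreliable. All callables here are hypothetical placeholders.

def corrective_rag(question, retriever, evaluator, web_search, llm,
                   upper=0.7, lower=0.3):
    docs = retriever(question)
    scores = [evaluator(question, d) for d in docs]  # lightweight relevance scores

    if max(scores, default=0.0) >= upper:
        # Confident: keep only strong documents (the paper's
        # decompose-then-recompose step would further strip
        # irrelevant passages inside each kept document).
        context = [d for d, s in zip(docs, scores) if s >= upper]
    elif max(scores, default=0.0) <= lower:
        # Retrieval looks wrong: discard it and seek external evidence.
        context = web_search(question)
    else:
        # Ambiguous: blend the filtered retrieval with external evidence.
        kept = [d for d, s in zip(docs, scores) if s > lower]
        context = kept + web_search(question)

    prompt = "Context:\n" + "\n".join(context) + f"\n\nQuestion: {question}"
    return llm(prompt)
```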
-
Journalist at The Technology Letter, ZDNet, Barron's Advisor
Thank you for reading.