TELLUS International's Post

I have mentioned in my LinkedIn shares and TELLUS International blog posts that using RAG (Retrieval Augmented Generation) with a Large Language Model (LLM) is an interesting opportunity for organizations to combine internal information with an LLM. A typical example is a Q&A database whose contents are used with an LLM. According to a recent article in ZDNET, RAG is the practice of "having an LLM respond to a prompt by sending a request to some external data source, such as a vector database, and retrieve authoritative data". Furthermore, "the most common use of RAG is to reduce the propensity of LLMs to produce hallucinations, where a model asserts falsehoods confidently," states the ZDNET article. RAG does not come without its issues, though, and the article provides valuable insights into what current academic research has identified and how vendors are trying to circumvent the potential issues with the use of RAG. New research suggests LLM training methods to make RAG more reliable and avoid hallucinations or incorrect results. I recommend reviewing the article from ZDNET (written by Tiernan Ray, Senior Contributing Writer). It is very valuable content. #RAG #AI #TELLUSInt #continuouslearning #GenAI TELLUS International https://lnkd.in/ggR9ksAm
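To picture the practice the ZDNET quote describes, here is a minimal retrieve-then-generate sketch. The `vector_db.search` and `llm` calls are hypothetical placeholders, not any particular vendor's API:

```python
# A minimal RAG loop, assuming a hypothetical `vector_db` client and
# `llm` completion function -- both placeholders, not a specific product's API.

def answer_with_rag(question: str, vector_db, llm, k: int = 3) -> str:
    # 1. Retrieve: ask the external data source for the k most similar snippets.
    snippets = vector_db.search(query=question, top_k=k)  # hypothetical call

    # 2. Augment: pack the authoritative snippets into the prompt as context.
    context = "\n\n".join(s.text for s in snippets)
    prompt = (
        "Answer the question using ONLY the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

    # 3. Generate: the LLM grounds its answer in the retrieved text,
    # which is what reduces the propensity to hallucinate.
    return llm(prompt)  # hypothetical call
```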
More Relevant Posts
-
AI Transformation, Business Modeling, Software Pricing/Packaging, and Advisory. Published author with a strong software business background. Providing interim management roles in the software/IT field
Make room for RAG: How Gen AI's balance of power is shifting
zdnet.com
-
In my experience, many practitioners and companies are still curious about how Retrieval Augmented Generation (#RAG) works, when it should be used, how to connect the dots, and so on. With this in mind, I've written a new blog post at LLMs.HowTo on Retrieval Augmented Generation (#RAG) and how it enhances AI capabilities beyond traditional large language models. In this introductory post, I delve into:
- The limitations of conventional large language models
- How RAG addresses these challenges by integrating dynamic knowledge retrieval
- The practical applications and benefits of using RAG in various industries
Whether you're a data scientist, software engineer, or a curious tech enthusiast, hopefully this post helps cut through the noise when it comes to #RAG and #LLMs. https://lnkd.in/ewnQGFvV What are your thoughts on the potential of RAG to transform AI applications? What else would you like to learn about RAG? Let me know in the comments. #AI #MachineLearning #DataScience #RAG #ArtificialIntelligence #SemanticSearch #GPT #Chatbots #RetrievalAugmentedGeneration
Understanding Retrieval Augmented Generation (RAG): Supercharging LLM Capabilities with Embeddings and Semantic Search | LLMs.HowTo
llmshowto.com
-
1/2 Key takeaways:
- The power of RAG: enhances model performance for more accurate and context-aware content.
- LangChain: build robust language model applications; it simplifies the RAG process.
- Query transformation: ensures models understand and process queries accurately.
- HyDE: boosts retrieval by generating a hypothetical answer document and searching with its embedding (a rough sketch follows the link below).
- Smart routing: selects the best data source for accurate and reliable information retrieval.
- Diverse retrieval techniques: Self-RAG, adaptive RAG, and CRAG, each suited to different scenarios.
- Generation phase: the final step, synthesizing the retrieved information into coherent and accurate responses.
- Real-world application: demonstrates RAG's flexibility and power using Neo4j scenarios.
- Code examples and tools: detailed guides to help users understand and implement RAG easily.
- LangSmith: visualize the entire RAG process with a GUI for easier debugging and optimization.
#AI #Tech #NLP #DataScience #Innovation #Developers #Tools #Technology #Applications https://lnkd.in/gwuHWFuJ
Learn RAG with Langchain
sakunaharinda.xyz
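To make the HyDE bullet above concrete, here is a minimal sketch of the idea under stated assumptions: `llm` and `embed` are hypothetical placeholder callables, documents are pre-embedded into a NumPy matrix, and real LangChain pipelines wrap these steps differently:

```python
import numpy as np

def hyde_retrieve(query, llm, embed, doc_embeddings, docs, top_k=3):
    """HyDE: search with the embedding of a *hypothetical* answer, which
    often lands closer to real answer documents than the raw query does.
    `llm` and `embed` are placeholder callables, not a specific API."""
    # 1. Ask the model to write a plausible (possibly wrong) answer passage.
    hypothetical = llm(f"Write a short passage that answers: {query}")

    # 2. Embed the hypothetical document instead of the query itself.
    q_vec = embed(hypothetical)  # shape (d,)

    # 3. Rank the real documents by cosine similarity to the hypothetical one.
    sims = doc_embeddings @ q_vec / (
        np.linalg.norm(doc_embeddings, axis=1) * np.linalg.norm(q_vec)
    )
    best = np.argsort(sims)[::-1][:top_k]
    return [docs[i] for i in best]
```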
-
Check out my latest article, "The Practical Limitations and Advantages of Retrieval Augmented Generation (RAG)," on Towards Data Science! My team has observed a tendency to believe that RAG can enable a language model to answer any question regardless of its complexity, but the truth is that it still has many shortcomings. Here are some key pain points and competencies of RAG in a nutshell.

Limitations:
- The inability to reason iteratively
- The inability to identify biased or non-factual retrieved data before passing it to the LLM
- The dependency on the organization of the underlying data

Beneficial capabilities:
- Accessing domain-specific or confidential information, with the proper security implementation (see the permission-filtering sketch below)
- Providing the language model with the most up-to-date information, circumventing the knowledge cutoff of its training data

To further understand where RAG shines or misses the mark, read my article below! #RAG #LLM #SLM #Limitation #EthicalAI #AI #artificialintelligence https://lnkd.in/g5RUVDbe
The Limitations and Advantages of Retrieval Augmented Generation (RAG)
towardsdatascience.com
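On the "confidential information with proper security" point, a common pattern is to enforce permissions at retrieval time, before anything reaches the model. A minimal sketch, assuming a hypothetical in-memory index of records with `vector`, `text`, and `allowed_groups` fields:

```python
import numpy as np

def retrieve_authorized(query_vec, index, user_groups, top_k=3):
    """Return the top_k snippets the requesting user may see.
    `index` is a hypothetical list of records with `vector`, `text`,
    and `allowed_groups` fields -- a stand-in for a real vector store."""
    scored = []
    for rec in index:
        # The security check happens BEFORE anything reaches the LLM:
        # unauthorized snippets never enter the prompt.
        if not set(rec["allowed_groups"]) & set(user_groups):
            continue
        sim = float(np.dot(rec["vector"], query_vec))
        scored.append((sim, rec["text"]))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [text for _, text in scored[:top_k]]
```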
-
Senior Data Scientist/Machine Learning/NLP/Deep Learning Talks about #rlhf, #generativeai, #nlppractitioner, #largelanguagemodels, and #machinelearningsolutions.
Meta AI Research has introduced Shepherd, a language model specifically tuned to critique model responses and suggest refinements, extending beyond the capabilities of an untuned model to identify diverse errors and provide suggestions to remedy them. #meta #llm #nlp #nlppractitioner #ai #ai4good
Repo link: https://lnkd.in/gESAKKY3
Paper link: https://lnkd.in/gnXRYc45
GitHub - facebookresearch/Shepherd: This is the repo for the paper Shepherd -- A Critic for Language Model Generation
github.com
-
Enterprise Solution Architect at Adobe | GenAI | AI & ML | Data Engineering | Data Platform | Delta Lake| Data Warehouse | Micro Services | Python | Scala| Spark | AWS | AZURE
Boost Your LLM Application Performance - RAG vs. Fine-Tuning

As the demand for Large Language Models (LLMs) continues to rise, developers and organisations are actively creating applications to leverage their capabilities. Yet, when pre-trained LLMs fail to meet expectations, a crucial question arises: how can we enhance the performance of LLM applications? This dilemma leads us to explore options like Retrieval-Augmented Generation (RAG) and model fine-tuning to optimize the outcomes.

RAG: This method combines the capabilities of retrieval (or search) with LLM text generation. It incorporates a retriever system that fetches pertinent document snippets from an extensive database, along with an LLM that generates responses using the information gathered from those snippets. Essentially, RAG enables the model to access external information, enhancing the quality of its responses.

Fine-tuning: This process involves refining a pre-trained LLM by training it on a targeted dataset, customizing it for a specific task or enhancing its performance. Fine-tuning adjusts the model's weights using your own data, tailoring the model to the specific requirements of your application.

Considerations for choice: Both RAG and fine-tuning are potent techniques for improving LLM-based applications, each addressing distinct aspects of the optimization process. Choosing between them is pivotal, as they can offer comparable outcomes while differing in complexity, cost, and quality.
- Complexity vs. context: choose RAG when deep and diverse external context is essential, even if it adds complexity. Opt for fine-tuning for simpler tasks where a specialized, task-specific model is sufficient.
- Data availability: if abundant task-specific data is available, fine-tuning can be effective. RAG is preferable when extensive external information is required.
- Resource constraints: RAG demands more computational resources due to its external retrieval mechanism. Fine-tuning might be a better choice for resource-constrained environments.
- Task complexity: for complex tasks requiring nuanced understanding, RAG's ability to leverage diverse external contexts can provide significant advantages.

My evaluation is grounded in the observation that both methods can yield similar results yet vary in their intricacy and resource requirements. Ultimately, the choice between RAG and fine-tuning depends on the specific requirements of the task, the availability of data, and the complexity of the desired application. A minimal sketch contrasting the two workflows follows. #generativeAI #RAG #retrievalaugmentedgeneration #LLM #finetuning
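As promised above, a schematic contrast of the two workflows; `base_llm`, `retriever`, and `Trainer` are hypothetical placeholders, not a specific library's API:

```python
# Two ways to specialize the same base model -- schematic only;
# `base_llm`, `retriever`, and `Trainer` are placeholders, not a real library.

# Option A: RAG -- no weight updates; knowledge lives in an external store
# and is injected into the prompt at inference time.
def rag_answer(question, base_llm, retriever):
    context = "\n".join(retriever.top_k(question, k=3))  # hypothetical call
    return base_llm(f"Context:\n{context}\n\nQ: {question}\nA:")

# Option B: Fine-tuning -- knowledge and behavior are baked into the weights
# by continued training on task-specific (prompt, completion) pairs.
def finetune(base_llm, task_dataset, Trainer):
    trainer = Trainer(model=base_llm, train_data=task_dataset, epochs=3)
    return trainer.train()  # returns a specialized model; no retriever needed
```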
-
How to Use Cross Attention and Self Attention Blocks with LLMs

LLMs use cross attention and self attention blocks, both built on the attention mechanism, to process and generate natural language. The attention mechanism computes the relevance or similarity between different parts of an input sequence. Self attention blocks operate on a single input sequence, while cross attention blocks relate two different input sequences (a toy implementation of both follows below).

Cross attention and self attention blocks enable LLMs to:
- Capture the semantic and syntactic relationships within and across different input sequences
- Generate coherent and relevant output sequences
- Handle variable-length input sequences
- Be efficient and scalable compared to other methods

To use cross attention and self attention blocks, you need to:
- Have a basic understanding of the transformer architecture, which is the foundation of LLMs
- Have access to a pre-trained LLM or a framework that allows you to train your own LLM

Here are some examples of how LLMs with cross attention and self attention blocks can be applied to different tasks or domains:
- Content creation: generate different types of content, such as blog posts, product descriptions, reviews, or social media posts, and adjust the tone, style, and depth of the content depending on your audience and purpose. For example, a role prompt can instruct your LLM to write like a poet, a journalist, a comedian, or a teacher.
- Text summarization: generate summaries of long texts, such as articles, books, or reports, and control the level of detail and the perspective of the summaries depending on your needs. For example, a prefix prompt can provide context or guidance to your LLM, such as the main topic, the key points, or the target length of the summary.
- Text analysis: analyze the sentiment, tone, or style of a given text, such as customer feedback, a tweet, or a speech; compare and contrast different texts; or identify the role or intention of the author. For example, a chain-of-thought prompt can ask your LLM to explain why a text is positive, negative, or neutral, and to provide evidence or examples to support its analysis.

#genai #llm #gpt4 #learning
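Here is the toy single-head NumPy implementation referenced above; it shows the only structural difference between the two blocks, namely where the keys and values come from (no masking, batching, or multi-head logic):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: weight each value by query-key similarity."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # relevance of every key to every query
    return softmax(scores, axis=-1) @ V  # weighted sum of values

rng = np.random.default_rng(0)
d = 16
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

X = rng.normal(size=(5, d))  # one sequence (e.g. the text being generated)
Y = rng.normal(size=(7, d))  # another sequence (e.g. an encoded source text)

# Self-attention: queries, keys, and values all come from the SAME sequence.
self_out = attention(X @ Wq, X @ Wk, X @ Wv)   # shape (5, d)

# Cross-attention: queries come from X, but keys/values come from a DIFFERENT
# sequence Y -- this is how a decoder attends to an encoder's output.
cross_out = attention(X @ Wq, Y @ Wk, Y @ Wv)  # shape (5, d)
```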
-
Going through the list of papers for #ICML2024 and #ICLR2024 and found some interesting ones that I'll share over the next few days. But here's the big one that got me hooked:

Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
Paper: https://lnkd.in/dWkn_d_6
Authors: Yeonhong Park and others.

Essentially, the paper proposes:
- Rapid generation of multiple models at different quantization levels. Think about having 2-, 3-, 4-, 5-, 6-, 7-, or 8-bit models always in memory, but with roughly the memory requirement of only the 8-bit model (well, almost - they do need some metadata, but its memory requirement is an order of magnitude lower).
- A specialized LLM serving engine with support for serving different quantization levels of the same model.

Why is this important? Imagine the scenario where you have a highly fine-tuned LLM (or an SLM) that you deploy for your users. As a failsafe, you also need to deploy a secondary LLM that comes into play if the first one goes down. However, as soon as the secondary LLM comes online, the users will notice - it's not a fine-tuned version of your first model! Or even if you deploy a fine-tuned version, you end up deploying TWO models, in memory, in parallel. This paper solves that problem: a possible failsafe method for an in-production LLM that fails over to smaller, more heavily quantized versions based on bandwidth and user requirements. Incredible!

For a truly redundant system, though, this still doesn't work, because if the LLM serving engine goes down, all models go down as well. So don't throw away your failsafe LLM just yet, but this might be a handy alternative. A toy illustration of the nested-quantization idea follows below.

Book a free 15 minute call with me: https://lnkd.in/dGnHs_Mn
#icml2024 #iclr2024 #llm #nlp #machinelearning #consulting #generativeai #openai #microsoft #iclr #icml #research #paper
Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
arxiv.org
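As promised, a toy illustration of the nested-quantization idea (emphatically not the paper's actual algorithm, which the authors engineer far more carefully): store weights once at 8 bits, then derive coarser models by dropping low-order bits, so every precision shares the same memory:

```python
import numpy as np

def quantize_u8(w):
    """Uniform 8-bit quantization of a float weight tensor."""
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / 255.0
    q = np.round((w - lo) / scale).astype(np.uint8)
    return q, lo, scale

def view_at_precision(q8, lo, scale, bits):
    """Derive a `bits`-bit model from the stored 8-bit weights by keeping
    only the top `bits` bits -- no extra copy of the model is needed."""
    shift = 8 - bits
    q_top = (q8 >> shift).astype(np.float32)  # truncate the low-order bits
    step = scale * (1 << shift)               # coarser quantization step
    # add the expected value of the dropped bits so the estimate is unbiased
    return q_top * step + lo + (step - scale) / 2.0

w = np.random.default_rng(1).normal(size=(4, 4)).astype(np.float32)
q8, lo, scale = quantize_u8(w)
for bits in (8, 4, 2):
    w_hat = view_at_precision(q8, lo, scale, bits)
    print(bits, "bits -> max reconstruction error:", np.abs(w - w_hat).max())
```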
-
Producing accurate and reliable AI-generated text has always been a challenge. The paper Corrective Retrieval Augmented Generation (CRAG), published on 1/29/24 (link: https://lnkd.in/eh3si68K), introduces an innovative solution to enhance the robustness and accuracy of text generated by Large Language Models (LLMs). The researchers developed a method that significantly improves the performance of retrieval-augmented generation (RAG) models by introducing a corrective mechanism that evaluates and refines the relevance of retrieved documents. This ensures that the generated text is not only fluent but also factually accurate.

The Problem with Current LLMs
LLMs, despite their ability to understand instructions and produce fluent text, are prone to hallucinations - generating information that might be irrelevant or incorrect. The traditional method to mitigate this issue, RAG, relies heavily on the relevance of retrieved documents, which can sometimes be inaccurate or misleading.

Introducing CRAG: A Novel Approach
CRAG proposes a lightweight retrieval evaluator to assess the quality of retrieved documents. It uses a decompose-then-recompose algorithm to focus on key information, filtering out irrelevant content. This method is plug-and-play, easily integrating with existing RAG-based approaches to improve their performance across various datasets.

Empirical Evidence of Success
The researchers conducted experiments across four datasets covering both short- and long-form generation tasks. The results showed that CRAG significantly outperformed existing RAG and Self-RAG approaches, demonstrating its adaptability and generalizability in improving factual accuracy in AI-generated text.

Time to Upgrade Your RAG to CRAG! A rough sketch of the corrective loop follows below.

#llm #llms #rag #crag #artificialintelligence
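Based purely on the description above, the corrective loop might be sketched like this; `retriever`, `evaluator`, `web_search`, and `llm` are placeholders, and the thresholds are made up for illustration, not taken from the paper:

```python
# A rough sketch of a CRAG-style corrective loop: score each retrieved
# document, keep the good ones, and fall back to an alternative source when
# retrieval looks unreliable. All callables here are hypothetical placeholders.

def corrective_rag(question, retriever, evaluator, web_search, llm,
                   upper=0.7, lower=0.3):
    docs = retriever(question)
    scores = [evaluator(question, d) for d in docs]  # lightweight relevance scores

    if max(scores, default=0.0) >= upper:
        # Confident: keep only strong documents (the paper's
        # decompose-then-recompose step would further strip
        # irrelevant passages inside each kept document).
        context = [d for d, s in zip(docs, scores) if s >= upper]
    elif max(scores, default=0.0) <= lower:
        # Retrieval looks wrong: discard it and seek external evidence.
        context = web_search(question)
    else:
        # Ambiguous: blend the filtered retrieval with external evidence.
        kept = [d for d, s in zip(docs, scores) if s > lower]
        context = kept + web_search(question)

    prompt = "Context:\n" + "\n".join(context) + f"\n\nQuestion: {question}"
    return llm(prompt)
```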
-
Journalist at The Technology Letter, ZDNet, Barron's Advisor
Thank you for reading.