[Question]: How can I persist node mappings and retrieve them back for RecursiveRetrieval? #14583
Comments
Hello @harshsavasil! I'm a bot here to help you with any questions or issues you have. Let's see how we can solve this together.
To persist node mappings and retrieve them back later, follow the two steps below.
Persisting Node Mappings
To persist the node mappings, call persist on the mapping object:
node_mapping.persist(persist_dir="your_persist_directory")
Retrieving Node Mappings
To retrieve the node mappings, use from_persist_dir:
retrieved_node_mapping = SimpleObjectNodeMapping.from_persist_dir(persist_dir="your_persist_directory")
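For illustration only, here is a minimal standard-library sketch of the same persist-and-reload idea. It assumes the node mappings can be reduced to a dictionary from node ID to a JSON-serializable dict; the file name and both helper functions are hypothetical and are not part of LlamaIndex.

```python
import json
from pathlib import Path
from typing import Dict

# Hypothetical file name, chosen only for this sketch.
MAPPING_FILE = "node_mappings.json"


def save_mappings(mappings: Dict[str, dict], persist_dir: str) -> Path:
    """Write the ID -> payload mapping to a JSON file inside persist_dir."""
    directory = Path(persist_dir)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / MAPPING_FILE
    path.write_text(json.dumps(mappings), encoding="utf-8")
    return path


def load_mappings(persist_dir: str) -> Dict[str, dict]:
    """Read the mapping back from the JSON file inside persist_dir."""
    path = Path(persist_dir) / MAPPING_FILE
    return json.loads(path.read_text(encoding="utf-8"))
```

A reloaded dictionary of this shape would still need to be converted back into node objects before being handed to a retriever; that step depends on how the payloads were serialized.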
@dosu can't we store these node mappings in a database for better storage and efficient retrieval?
Yes, you can store node mappings in a database for better storage and efficient retrieval. Serialize the nodes and their relationships into a format suitable for database storage, such as JSON, and then use a database client to insert the records. Here is an example of how you might serialize the nodes and store them in a SQLite database:
import json
import sqlite3
from typing import List

from llama_index.core.node_parser.relational.unstructured_element import UnstructuredElementNodeParser
from llama_index.core.schema import BaseNode, TextNode

# Initialize the parser
parser = UnstructuredElementNodeParser()


# Example function to store nodes in a SQLite database
def store_nodes_in_db(nodes: List[BaseNode], db_path: str) -> None:
    # Connect to the SQLite database (or create it if it doesn't exist)
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()
    # Create a table for storing nodes if it doesn't exist
    cursor.execute('''
        CREATE TABLE IF NOT EXISTS nodes (
            id TEXT PRIMARY KEY,
            type TEXT,
            content TEXT,
            relationships TEXT,
            metadata TEXT
        )
    ''')
    # Insert nodes into the database
    for node in nodes:
        # Relationship values are objects that json.dumps cannot encode directly,
        # so take them from the node's plain-dict form instead
        node_dict = node.to_dict()
        node_data = {
            'id': node.node_id,
            'type': node.class_name(),  # e.g. "TextNode"
            'content': node.get_content(),
            'relationships': json.dumps(node_dict.get('relationships', {}), default=str),
            'metadata': json.dumps(node.metadata, default=str),
        }
        cursor.execute('''
            INSERT OR REPLACE INTO nodes (id, type, content, relationships, metadata)
            VALUES (:id, :type, :content, :relationships, :metadata)
        ''', node_data)
    # Commit the transaction and close the connection
    conn.commit()
    conn.close()


# Example usage
text_node = TextNode(id_="example_id", text="Example content")
nodes = parser.get_nodes_from_node(text_node)
store_nodes_in_db(nodes, 'nodes.db')
This approach can be adapted to other types of databases (e.g., PostgreSQL, MongoDB) by using the appropriate database client and query syntax [1].
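As a rough sketch of the round trip for the mappings themselves, the snippet below stores a dictionary of node ID to JSON payload in SQLite and reads it back. It uses only the standard library; the `node_mappings` table, its columns, and both function names are assumptions made for this example, and plain dicts stand in for serialized nodes.

```python
import json
import sqlite3
from typing import Dict


def save_mappings_to_db(conn: sqlite3.Connection, mappings: Dict[str, dict]) -> None:
    """Upsert every ID -> payload pair into a (hypothetical) node_mappings table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS node_mappings ("
        "node_id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO node_mappings (node_id, payload) VALUES (?, ?)",
        [(node_id, json.dumps(payload)) for node_id, payload in mappings.items()],
    )
    conn.commit()


def load_mappings_from_db(conn: sqlite3.Connection) -> Dict[str, dict]:
    """Rebuild the ID -> payload dictionary from the table."""
    rows = conn.execute("SELECT node_id, payload FROM node_mappings").fetchall()
    return {node_id: json.loads(payload) for node_id, payload in rows}
```

Keeping the whole payload in one JSON column avoids having to mirror every node field in the schema, at the cost of not being able to filter on those fields in SQL.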
@dosu How will I retrieve these nodes from the database for RAG?
To retrieve nodes from the database for Retrieval-Augmented Generation (RAG), you can create a custom retriever that fetches the nodes from the database and converts them into NodeWithScore objects.
Example Implementation
import sqlite3
from typing import List

from llama_index.core.base.base_retriever import BaseRetriever
from llama_index.core.schema import NodeWithScore, QueryBundle, TextNode


class SQLiteRetriever(BaseRetriever):
    """Retriever that fetches data from an SQLite database."""

    def __init__(self, db_path: str, top_k: int = 10) -> None:
        """Initialize with database path and top_k results."""
        self.db_path = db_path
        self.top_k = top_k
        super().__init__()

    def _retrieve(self, query_bundle: QueryBundle) -> List[NodeWithScore]:
        """Retrieve nodes from the SQLite database."""
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        # Substring match on content, capped at top_k rows. The nodes table
        # has no score column, so a constant placeholder score is used below.
        cursor.execute(
            "SELECT id, content FROM nodes WHERE content LIKE ? LIMIT ?",
            ('%' + query_bundle.query_str + '%', self.top_k),
        )
        results = cursor.fetchall()
        conn.close()
        # Convert results to NodeWithScore objects
        result_nodes = [
            NodeWithScore(node=TextNode(id_=row[0], text=row[1]), score=1.0)
            for row in results
        ]
        return result_nodes


# Example usage
db_path = 'nodes.db'
retriever = SQLiteRetriever(db_path=db_path, top_k=10)
# Call the public retrieve() method rather than _retrieve() directly
nodes = retriever.retrieve("example query")
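One detail worth illustrating: a LIKE lookup treats `%` and `_` in the user's query as wildcards unless they are escaped. The sketch below shows one way to escape them and to cap results in SQL. It assumes a `nodes` table with `id` and `content` columns like the one created earlier in this thread; `escape_like` and `search_nodes` are hypothetical helper names, and this remains plain substring matching, not semantic retrieval.

```python
import sqlite3
from typing import List, Tuple


def escape_like(term: str, escape_char: str = "\\") -> str:
    """Escape LIKE wildcards so the term is matched literally."""
    return (
        term.replace(escape_char, escape_char * 2)
        .replace("%", escape_char + "%")
        .replace("_", escape_char + "_")
    )


def search_nodes(conn: sqlite3.Connection, query: str, top_k: int = 10) -> List[Tuple[str, str]]:
    """Return up to top_k (id, content) rows whose content contains the query."""
    pattern = f"%{escape_like(query)}%"
    cursor = conn.execute(
        "SELECT id, content FROM nodes "
        "WHERE content LIKE ? ESCAPE '\\' "
        "ORDER BY id LIMIT ?",
        (pattern, top_k),
    )
    return cursor.fetchall()
```

Rows returned this way carry no relevance ranking, so any score attached to them downstream is a placeholder.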
Question
Hey,
I'm trying to create RAG applications on top of web pages. I'm using UnstructuredElementNodeParser to parse my documents.
Calling
base_nodes, node_mappings = node_parser.get_base_nodes_and_mappings(raw_nodes)
gives me base nodes and node mappings. I'm using a vector index to store the base nodes, but what about the node mappings? How can I store my node mappings and retrieve them later from the index to pass them into a RecursiveRetriever for retrieval?