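I want to visualize a knowledge graph, but the code written as shown on the official site raises the error below. How do I fix it and get the visualization working? #14622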

Open
1 task done
li-hhhh opened this issue Jul 8, 2024 · 7 comments
Labels
question Further information is requested

li-hhhh commented Jul 8, 2024

Question Validation

  • I have searched both the documentation and discord for an answer.

Question

Error:
Traceback (most recent call last):
  File "/home/cdhd/PythonCode/pythonProject/Knowledge/Neo4j/graph_neo4j.py", line 93, in <module>
    net.show("example.html")
  File "/home/cdhd/PythonCode/pythonProject/.venv/lib/python3.10/site-packages/pyvis/network.py", line 546, in show
    self.write_html(name, open_browser=False,notebook=True)
  File "/home/cdhd/PythonCode/pythonProject/.venv/lib/python3.10/site-packages/pyvis/network.py", line 530, in write_html
    out.write(self.html)
UnicodeEncodeError: 'ascii' codec can't encode character '\xa9' in position 230491: ordinal not in range(128)
Code:
g = index.get_networkx_graph()

# directed=False builds an undirected graph; directed=True builds a directed graph
net = Network(notebook=True, cdn_resources="in_line", directed=True)

net.from_nx(g)
net.show("example.html")
import IPython

IPython.display.HTML(filename='example.html')

@li-hhhh li-hhhh added the question Further information is requested label Jul 8, 2024
dosubot bot commented Jul 8, 2024

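To resolve the UnicodeEncodeError and get the knowledge graph visualization working, make sure your HTML file is correctly UTF-8 encoded. You can include the following meta tag in the HTML file to declare UTF-8:

<meta charset="utf-8">

If you still get the UnicodeEncodeError, the file may be being read or written incorrectly. Make sure your environment handles UTF-8 correctly.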
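One way to check how the environment handles encoding is to look at the default text encoding Python falls back to when `open()` is called without an `encoding` argument. The sketch below is illustrative only and uses just the standard library; the helper name `can_encode` is hypothetical. It prints the locale's preferred encoding and shows that the character named in the traceback, `'\xa9'` (the copyright sign), cannot be encoded as ASCII but can be encoded as UTF-8.

```python
import locale


def can_encode(text: str, encoding: str) -> bool:
    """Return True if `text` can be encoded with `encoding` (hypothetical helper)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Encoding used for text files when no `encoding` argument is passed to open().
default_encoding = locale.getpreferredencoding(False)
print("default text encoding:", default_encoding)

copyright_sign = "\xa9"  # the character named in the traceback
print("encodable as ascii:", can_encode(copyright_sign, "ascii"))
print("encodable as utf-8:", can_encode(copyright_sign, "utf-8"))
```

If the default turns out to be ASCII (for example under a `C` or `POSIX` locale), writing the generated HTML without an explicit `encoding` would be expected to fail in the way the traceback shows.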

Here is an example that uses Python's http.server module to serve the HTML file with correct UTF-8 handling:

import http.server
import socketserver

PORT = 8000

Handler = http.server.SimpleHTTPRequestHandler

with socketserver.TCPServer(("", PORT), Handler) as httpd:
    print("Serving at port", PORT)
    httpd.serve_forever()

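Save this script in the same directory as your HTML file and run it. Then navigate to http://localhost:8000/example.html in your browser to view the knowledge graph.

If the problem persists, make sure all text content in the HTML file is correctly UTF-8 encoded. You can use a text editor that supports UTF-8 to verify and save the file.

In addition, the LlamaIndex repository mentions some alternative ways to visualize a knowledge graph that may avoid the Unicode handling problem in pyvis. For example, you can use yfiles_jupyter_graphs or networkx for visualization:

Using yfiles_jupyter_graphs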

def show_jupyter_graph(self) -> None:
    try:
        from yfiles_jupyter_graphs import GraphWidget
    except ImportError:
        raise ImportError(
            "Please install yfiles_jupyter_graphs to visualize the graph: `pip install yfiles_jupyter_graphs`"
        )

    w = GraphWidget()
    nodes = []
    edges = []
    for node in self.graph.nodes.values():
        node = {"id": node.id, "properties": {"label": node.id}}
        nodes.append(node)
    for triplet in self.graph.triplets:
        edge = {
            "id": triplet[1],
            "start": triplet[0],
            "end": triplet[2],
            "properties": {"label": triplet[1]},
        }
        edges.append(edge)
    w.nodes = nodes
    w.edges = edges
    display(w)

Using networkx

def get_networkx_graph(self, limit: int = 100) -> Any:
    try:
        import networkx as nx
    except ImportError:
        raise ImportError(
            "Please install networkx to visualize the graph: `pip install networkx`"
        )

    g = nx.Graph()
    subjs = list(self.index_struct.table.keys())

    rel_map = self._graph_store.get_rel_map(subjs=subjs, depth=1, limit=limit)

    added_nodes = set()
    for keyword in rel_map:
        for path in rel_map[keyword]:
            subj = keyword
            for i in range(0, len(path), 2):
                if i + 2 >= len(path):
                    break

                if subj not in added_nodes:
                    g.add_node(subj)
                    added_nodes.add(subj)

                rel = path[i + 1]
                obj = path[i + 2]

                g.add_edge(subj, obj, label=rel, title=rel)
                subj = obj
    return g

These methods offer alternatives to pyvis and may help you avoid the Unicode handling problem you are running into [1][2][3].

To continue talking to Dosu, mention @dosu.

Author

li-hhhh commented Jul 8, 2024

I don't quite follow what you mean, so could you please apply your solution to my code? Thank you. My code is below:
from llama_index.core import SimpleDirectoryReader, KnowledgeGraphIndex, StorageContext, Settings
import nest_asyncio

from llama_index.embeddings.huggingface import HuggingFaceEmbedding

from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.core import PromptTemplate
from llama_index.graph_stores.neo4j import Neo4jGraphStore
import torch
from pyvis.network import Network

import yapf

nest_asyncio.apply()

# Load documents
documents = SimpleDirectoryReader('/home/cdhd/PythonCode/pythonProject/TestFile/公司主要人员架构类').load_data()
model_config = {"protected_namespaces": ()}

SYSTEM_PROMPT = """You are a helpful AI assistant.
"""

query_wrapper_prompt = PromptTemplate(
    "[INST]<<SYS>>\n" + SYSTEM_PROMPT + "<</SYS>>\n\n{query_str}[/INST] "
)

llm = HuggingFaceLLM(
    context_window=4096,
    max_new_tokens=1024,
    generate_kwargs={"temperature": 0.2, "do_sample": True},
    query_wrapper_prompt=query_wrapper_prompt,
    # tokenizer_name='/home/cdhd/PythonCode/pythonProject/model/gpt2',
    # model_name='/home/cdhd/PythonCode/pythonProject/model/gpt2',
    tokenizer_name='/home/cdhd/Desktop/glm4/glm-4-9b-chat',
    model_name='/home/cdhd/Desktop/glm4/glm-4-9b-chat',
    device_map="auto",
    # change these settings below depending on your GPU
    model_kwargs={"torch_dtype": torch.float16},
)

# llm = AutoModel.from_pretrained('/home/cdhd/Desktop/glm4/glm-4-9b-chat', trust_remote_code=True)

Settings.llm = llm

embed_model = HuggingFaceEmbedding(
    model_name='/home/cdhd/PythonCode/pythonProject/model/bge-large-zh-v1.5/models--BAAI--bge-large-zh-v1.5/snapshots/79e7739b6ab944e86d6171e44d24c997fc1e0116',
    # model_name = r"E:\Huggingface_model\bge-large-zh-v1.5",
    # cache_folder=r"E:\Huggingface_model\bge-large-zh-v1.5\models--BAAI--bge-large-zh-v1.5\snapshots\79e7739b6ab944e86d6171e44d24c997fc1e0116",
)
Settings.embed_model = embed_model

graph_store = Neo4jGraphStore(
    username="neo4j",
    password="12345678",
    url="bolt://localhost:7687",
    database="neo4j"
)

# Clear the existing graph data
graph_store.query(
    """
MATCH (n) DETACH DELETE n
"""
)

# Create the storage context
storage_context = StorageContext.from_defaults(graph_store=graph_store)

# Build the index with embeddings
index = KnowledgeGraphIndex.from_documents(
    documents,
    storage_context=storage_context,
    max_triplets_per_chunk=2,
    include_embeddings=True,
)
retriever = index.as_retriever()
ans = retriever.retrieve("公司的董事长是谁?")
print(ans)
for node in ans:
    print(node.text)

query_engine = index.as_query_engine(
    include_text=True, response_mode="tree_summarize"
)

response = query_engine.query(
    "公司的董事长是谁?"
)
print("response:", response)
print("hhhhhhhhhhhh")
g = index.get_networkx_graph()

# directed=False builds an undirected graph; directed=True builds a directed graph
net = Network(notebook=True, cdn_resources="in_line", directed=True)

net.from_nx(g)
net.show("example.html")

import IPython

# Note: the file written above is example.html, not Knowledge_graph.html
IPython.display.HTML(filename='Knowledge_graph.html')

dosubot bot commented Jul 8, 2024

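To resolve the UnicodeEncodeError and get the knowledge graph visualization working, make sure your HTML file is correctly UTF-8 encoded. You can include the following meta tag in the HTML file to declare UTF-8:

<meta charset="utf-8">

Also make sure the HTML file is written with UTF-8 encoding. Here is the modified code: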

from llama_index.core import SimpleDirectoryReader, KnowledgeGraphIndex, StorageContext, Settings
import nest_asyncio
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.core import PromptTemplate
from llama_index.graph_stores.neo4j import Neo4jGraphStore
import torch
from pyvis.network import Network
import IPython

nest_asyncio.apply()

# Load documents
documents = SimpleDirectoryReader('/home/cdhd/PythonCode/pythonProject/TestFile/公司主要人员架构类').load_data()
model_config = {"protected_namespaces": ()}

SYSTEM_PROMPT = """You are a helpful AI assistant.
"""

query_wrapper_prompt = PromptTemplate(
    "[INST]<<SYS>>\n" + SYSTEM_PROMPT + "<</SYS>>\n\n{query_str}[/INST] "
)

llm = HuggingFaceLLM(
    context_window=4096,
    max_new_tokens=1024,
    generate_kwargs={"temperature": 0.2, "do_sample": True},
    query_wrapper_prompt=query_wrapper_prompt,
    tokenizer_name='/home/cdhd/Desktop/glm4/glm-4-9b-chat',
    model_name='/home/cdhd/Desktop/glm4/glm-4-9b-chat',
    device_map="auto",
    model_kwargs={"torch_dtype": torch.float16},
)

Settings.llm = llm

embed_model = HuggingFaceEmbedding(
    model_name='/home/cdhd/PythonCode/pythonProject/model/bge-large-zh-v1.5/models--BAAI--bge-large-zh-v1.5/snapshots/79e7739b6ab944e86d6171e44d24c997fc1e0116',
)
Settings.embed_model = embed_model

graph_store = Neo4jGraphStore(
    username="neo4j",
    password="12345678",
    url="bolt://localhost:7687",
    database="neo4j"
)

# Clear the existing graph data
graph_store.query(
    """
MATCH (n) DETACH DELETE n
"""
)

# Create the storage context
storage_context = StorageContext.from_defaults(graph_store=graph_store)

# Build the index with embeddings
index = KnowledgeGraphIndex.from_documents(
    documents,
    storage_context=storage_context,
    max_triplets_per_chunk=2,
    include_embeddings=True,
)

retriever = index.as_retriever()
ans = retriever.retrieve("公司的董事长是谁?")
print(ans)
for node in ans:
    print(node.text)

query_engine = index.as_query_engine(
    include_text=True, response_mode="tree_summarize"
)

response = query_engine.query(
    "公司的董事长是谁?"
)
print("response:", response)

# Get the NetworkX graph
g = index.get_networkx_graph()

# directed=False builds an undirected graph; directed=True builds a directed graph
net = Network(notebook=True, cdn_resources="in_line", directed=True)
net.from_nx(g)

# Write the HTML file with UTF-8 encoding
with open("example.html", "w", encoding="utf-8") as f:
    f.write(net.generate_html())

# Display the HTML file
IPython.display.HTML(filename='example.html')

This code writes the HTML file with UTF-8 encoding, which avoids the UnicodeEncodeError.
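The key change can also be tried in isolation, without pyvis or LlamaIndex. This is a minimal sketch, assuming the HTML is available as an ordinary Python string; the sample string and the helper name `write_html_utf8` are placeholders and not part of pyvis.

```python
import tempfile
from pathlib import Path


def write_html_utf8(html: str, path: Path) -> None:
    """Write `html` to `path` as UTF-8, independent of the locale default."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)


# Placeholder HTML containing the non-ASCII character from the traceback.
sample_html = "<html><body><p>\xa9 example</p></body></html>"

with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "example.html"
    write_html_utf8(sample_html, out_path)
    # Read the file back with the same encoding to confirm nothing was lost.
    round_trip = out_path.read_text(encoding="utf-8")

print("round trip matches:", round_trip == sample_html)
```

Passing `encoding="utf-8"` explicitly means the write no longer depends on whatever default the process locale provides.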

Author

li-hhhh commented Jul 8, 2024

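I have already tried this approach. The HTML file has content and no error is raised, but when I open it in a browser the network graph still does not appear and the page is blank. How can I fix this?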

dosubot bot commented Jul 8, 2024

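To fix the HTML file showing a blank page in the browser, first make sure the basic structure of the HTML file is correct and that it contains the necessary elements. Here is a possible solution:

  1. Make sure the HTML file contains the <html>, <head>, and <body> tags.
  2. Make sure the <head> tag contains <meta charset="utf-8"> to set the character encoding.
  3. Make sure the <body> tag contains some content or elements so the browser has something to render.

Here is the basic structure of an example HTML file: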

<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
    <title>Example Page</title>
    <script>
        // Your JavaScript code here
    </script>
    <style>
        /* Your CSS code here */
    </style>
</head>
<body>
    <h1>Hello, World!</h1>
    <p>This is a sample HTML file.</p>
</body>
</html>

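In your code, make sure the HTML produced by net.generate_html() contains this basic structure. You can inspect the generated HTML file by hand, or add the structure when generating the file.

Here is the modified code, which ensures the generated HTML file contains the necessary structure: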

from llama_index.core import SimpleDirectoryReader, KnowledgeGraphIndex, StorageContext, Settings
import nest_asyncio
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.core import PromptTemplate
from llama_index.graph_stores.neo4j import Neo4jGraphStore
import torch
from pyvis.network import Network
import IPython

nest_asyncio.apply()

# Load documents
documents = SimpleDirectoryReader('/home/cdhd/PythonCode/pythonProject/TestFile/公司主要人员架构类').load_data()
model_config = {"protected_namespaces": ()}

SYSTEM_PROMPT = """You are a helpful AI assistant.
"""

query_wrapper_prompt = PromptTemplate(
    "[INST]<<SYS>>\n" + SYSTEM_PROMPT + "<</SYS>>\n\n{query_str}[/INST] "
)

llm = HuggingFaceLLM(
    context_window=4096,
    max_new_tokens=1024,
    generate_kwargs={"temperature": 0.2, "do_sample": True},
    query_wrapper_prompt=query_wrapper_prompt,
    tokenizer_name='/home/cdhd/Desktop/glm4/glm-4-9b-chat',
    model_name='/home/cdhd/Desktop/glm4/glm-4-9b-chat',
    device_map="auto",
    model_kwargs={"torch_dtype": torch.float16},
)

Settings.llm = llm

embed_model = HuggingFaceEmbedding(
    model_name='/home/cdhd/PythonCode/pythonProject/model/bge-large-zh-v1.5/models--BAAI--bge-large-zh-v1.5/snapshots/79e7739b6ab944e86d6171e44d24c997fc1e0116',
)
Settings.embed_model = embed_model

graph_store = Neo4jGraphStore(
    username="neo4j",
    password="12345678",
    url="bolt://localhost:7687",
    database="neo4j"
)

# Clear the existing graph data
graph_store.query(
    """
MATCH (n) DETACH DELETE n
"""
)

# Create the storage context
storage_context = StorageContext.from_defaults(graph_store=graph_store)

# Build the index with embeddings
index = KnowledgeGraphIndex.from_documents(
    documents,
    storage_context=storage_context,
    max_triplets_per_chunk=2,
    include_embeddings=True,
)

retriever = index.as_retriever()
ans = retriever.retrieve("公司的董事长是谁?")
print(ans)
for node in ans:
    print(node.text)

query_engine = index.as_query_engine(
    include_text=True, response_mode="tree_summarize"
)

response = query_engine.query(
    "公司的董事长是谁?"
)
print("response:", response)

# Get the NetworkX graph
g = index.get_networkx_graph()

# directed=False builds an undirected graph; directed=True builds a directed graph
net = Network(notebook=True, cdn_resources="in_line", directed=True)
net.from_nx(g)

# Write the HTML file with UTF-8 encoding, wrapped in a basic structure
html_content = f"""
<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
    <title>Knowledge Graph</title>
</head>
<body>
    {net.generate_html()}
</body>
</html>
"""

with open("example.html", "w", encoding="utf-8") as f:
    f.write(html_content)

# Display the HTML file
IPython.display.HTML(filename='example.html')

This code ensures the generated HTML file has the basic HTML structure and is written with UTF-8 encoding, so as to avoid the blank page problem [1].
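One thing worth checking with this approach: if `net.generate_html()` already returns a complete HTML document (an assumption about pyvis's output that is not confirmed in this thread), wrapping it in another `<html>`/`<body>` shell would nest one document inside another. The sketch below counts document-level tags using only the standard library; `count_tags` and both stand-in strings are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser


class _TagCounter(HTMLParser):
    """Count how often each start tag appears in an HTML string."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def count_tags(html: str) -> Counter:
    """Return a Counter mapping tag name to number of start tags (hypothetical helper)."""
    parser = _TagCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


# Stand-in for a generated page that is already a full document.
generated = "<html><head></head><body><div id='mynetwork'></div></body></html>"
# The same page wrapped in a second document shell, as in the code above.
wrapped = f"<!DOCTYPE html><html><head></head><body>{generated}</body></html>"

for name, html in [("generated", generated), ("wrapped", wrapped)]:
    counts = count_tags(html)
    print(name, {tag: counts[tag] for tag in ("html", "head", "body")})
```

A count above 1 for `html` or `body` in the real file would suggest the wrapper is unnecessary and the generated string could be written to the file directly.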

Author

li-hhhh commented Jul 9, 2024

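I changed the code as you suggested, but after running it the HTML file still does not show the knowledge graph in the browser. The page is blank apart from a single horizontal line. Is there another way to solve this?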

dosubot bot commented Jul 9, 2024

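To fix the HTML file showing a blank page in the browser, try the following steps:

  1. Check the generated HTML: make sure the HTML produced by net.generate_html() contains all the necessary JavaScript and CSS resources.

  2. Inspect the HTML file by hand: open the generated file, view the source, and confirm that all JavaScript and CSS resources load correctly.

  3. Debug the JavaScript: check the console output in the browser's developer tools for any errors or warnings.

  4. Simplify the code: try building a minimal example to confirm that basic graph visualization works.

Here is a simplified example that ensures the generated HTML file contains all the necessary resources and displays the graph correctly: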

from llama_index.core import SimpleDirectoryReader, KnowledgeGraphIndex, StorageContext, Settings
import nest_asyncio
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.core import PromptTemplate
from llama_index.graph_stores.neo4j import Neo4jGraphStore
import torch
from pyvis.network import Network
import IPython

nest_asyncio.apply()

# Load documents
documents = SimpleDirectoryReader('/home/cdhd/PythonCode/pythonProject/TestFile/公司主要人员架构类').load_data()
model_config = {"protected_namespaces": ()}

SYSTEM_PROMPT = """You are a helpful AI assistant.
"""

query_wrapper_prompt = PromptTemplate(
    "[INST]<<SYS>>\n" + SYSTEM_PROMPT + "<</SYS>>\n\n{query_str}[/INST] "
)

llm = HuggingFaceLLM(
    context_window=4096,
    max_new_tokens=1024,
    generate_kwargs={"temperature": 0.2, "do_sample": True},
    query_wrapper_prompt=query_wrapper_prompt,
    tokenizer_name='/home/cdhd/Desktop/glm4/glm-4-9b-chat',
    model_name='/home/cdhd/Desktop/glm4/glm-4-9b-chat',
    device_map="auto",
    model_kwargs={"torch_dtype": torch.float16},
)

Settings.llm = llm

embed_model = HuggingFaceEmbedding(
    model_name='/home/cdhd/PythonCode/pythonProject/model/bge-large-zh-v1.5/models--BAAI--bge-large-zh-v1.5/snapshots/79e7739b6ab944e86d6171e44d24c997fc1e0116',
)
Settings.embed_model = embed_model

graph_store = Neo4jGraphStore(
    username="neo4j",
    password="12345678",
    url="bolt://localhost:7687",
    database="neo4j"
)

# Clear the existing graph data
graph_store.query(
    """
MATCH (n) DETACH DELETE n
"""
)

# Create the storage context
storage_context = StorageContext.from_defaults(graph_store=graph_store)

# Build the index with embeddings
index = KnowledgeGraphIndex.from_documents(
    documents,
    storage_context=storage_context,
    max_triplets_per_chunk=2,
    include_embeddings=True,
)

retriever = index.as_retriever()
ans = retriever.retrieve("公司的董事长是谁?")
print(ans)
for node in ans:
    print(node.text)

query_engine = index.as_query_engine(
    include_text=True, response_mode="tree_summarize"
)

response = query_engine.query(
    "公司的董事长是谁?"
)
print("response:", response)

# Get the NetworkX graph
g = index.get_networkx_graph()

# directed=False builds an undirected graph; directed=True builds a directed graph
net = Network(notebook=True, cdn_resources="in_line", directed=True)
net.from_nx(g)

# Generate the HTML
html_content = net.generate_html()

# Wrap it in a basic HTML structure
html_content = f"""
<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
    <title>Knowledge Graph</title>
    <style>
        body {{ font-family: Arial, sans-serif; }}
        #mynetwork {{ width: 100%; height: 800px; border: 1px solid lightgray; }}
    </style>
</head>
<body>
    <div id="mynetwork"></div>
    {html_content}
</body>
</html>
"""

with open("example.html", "w", encoding="utf-8") as f:
    f.write(html_content)

# Display the HTML file
IPython.display.HTML(filename='example.html')

This code ensures the generated HTML file has the basic HTML structure and is written with UTF-8 encoding. If the problem persists, check the console output in the browser's developer tools for any errors or warnings and fix them accordingly [1][2].
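Before turning to the browser console, the saved file can also be inspected from Python. The sketch below uses only the standard library to list duplicated element ids and to separate inline scripts from external ones; the sample page and the name `inspect_page` are hypothetical. A duplicated container id is one possible cause worth ruling out, since the wrapper above adds its own `<div id="mynetwork">`, and whether the generated HTML already contains a container with that id is an assumption to verify against the real file.

```python
from html.parser import HTMLParser


class _PageInspector(HTMLParser):
    """Collect element ids and script information from an HTML string."""

    def __init__(self):
        super().__init__()
        self.ids = []
        self.script_srcs = []
        self.inline_scripts = 0

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if attr_map.get("id"):
            self.ids.append(attr_map["id"])
        if tag == "script":
            if attr_map.get("src"):
                self.script_srcs.append(attr_map["src"])
            else:
                self.inline_scripts += 1


def inspect_page(html: str) -> dict:
    """Summarize ids and scripts found in `html` (hypothetical helper)."""
    parser = _PageInspector()
    parser.feed(html)
    parser.close()
    duplicate_ids = sorted({i for i in parser.ids if parser.ids.count(i) > 1})
    return {
        "duplicate_ids": duplicate_ids,
        "inline_scripts": parser.inline_scripts,
        "external_scripts": parser.script_srcs,
    }


# Made-up page with a repeated container id, one inline and one external script.
sample_page = (
    '<html><body><div id="mynetwork"></div><div id="mynetwork"></div>'
    "<script>var x = 1;</script>"
    '<script src="https://example.invalid/lib.js"></script></body></html>'
)
report = inspect_page(sample_page)
print(report)
```

To run it on the real output, read the file with `open("example.html", encoding="utf-8").read()` and pass the result to `inspect_page`.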
