GPT4All 本地部署

在本文中，我们将学习如何在仅使用 CPU 的计算机上部署和使用 GPT4All 模型（我正在使用没有 GPU 的 Macbook Pro！）并学习如何使用 Python 与我们的文档进行交互。一组 PDF 文件或在线文章将成为我们问答的知识库。

1. GPT4All 是什么

根据官方网站 GPT4All 的描述，它是一个免费使用、本地运行的、注重隐私的聊天机器人。不需要 GPU 或互联网。

GTP4All 是一个生态系统，用于训练和部署在消费级 CPU 上本地运行的强大且定制的大型语言模型。

我们的 GPT4All 模型是一个 4GB 的文件，您可以下载并连接到 GPT4All 开源生态系统软件中。Nomic AI 提供高质量和安全的软件生态系统，努力实现个人和组织轻松训练和本地实施自己的大型语言模型。

2. 它是如何工作的？

该过程非常简单（当您了解之后），并且可以在其他模型上重复。具体步骤如下：

加载 GPT4All 模型；
使用 Langchain 检索并加载我们的文档；
将文档分割成小块，以便嵌入式能够理解；
使用 FAISS 根据我们要传递给 GPT4All 的问题创建嵌入式向量数据库；
在基于问题的语境中在嵌入式向量数据库上执行相似性搜索（语义搜索）：这将作为我们问题的上下文；
使用 Langchain 将问题和上下文提供给 GPT4All，并等待答案；

所以我们需要的是嵌入式向量。嵌入式向量是一条信息的数值表示，例如文本、文档、图像、音频等。该表示捕捉了被嵌入的语义含义，这正是我们所需要的。对于这个项目，我们无法依赖于重型 GPU 模型：因此，我们将下载 Alpaca 原生模型，并使用 Langchain 中的 LlamaCppEmbeddings。不用担心！每一步都有详细的解释。

3. Let's start coding

3.1. 创建虚拟环境

为你的新 Python 项目创建一个新的文件夹，例如 GPT4ALL_Fabio（请用你的名字替换 Fabio）：

PowerShell

mkdir GPT4ALL_Fabiocd GPT4ALL_Fabio

接下来，创建一个新的 Python 虚拟环境。如果你安装了多个 Python 版本，请指定你想要使用的版本。在这个例子中，我将使用与 Python 3.10 关联的主要安装。

PowerShell

python3 -m venv .venv

命令 python3 -m venv .venv 创建了一个名为 .venv 的新虚拟环境（点号会创建一个名为 venv 的隐藏目录）。

虚拟环境提供了一个隔离的 Python 安装环境，允许你仅针对特定项目安装软件包和依赖项，而不会影响系统范围的 Python 安装或其他项目。这种隔离有助于保持一致性，并防止不同项目需求之间的潜在冲突。

一旦虚拟环境创建好了，你可以使用以下命令来激活它：

PowerShell

source .venv/bin/activate

3.2. 安装依赖

对于我们正在构建的项目，我们不需要太多的软件包。我们只需要以下几个：

GPT4All 的 Python 绑定；
Langchain 用于与我们的文档交互；

PowerShell

pip install pygpt4all==1.0.1
pip install pyllamacpp==1.0.6
pip install langchain==0.0.149
pip install unstructured==0.6.5
pip install pdf2image==1.16.3
pip install pytesseract==0.3.10
pip install pypdf==3.8.1
pip install faiss-cpu==1.7.4

对于 LangChain，你可以看到我们指定了版本号。这个库最近正在接受很多更新，所以为了确保我们的设置明天也能正常工作，最好指定一个我们知道能正常工作的版本。Unstructured 是 pdf 加载器、pytesseract 和 pdf2image 的必需依赖库。

注意：GitHub 存储库中有一个 requirements.txt 文件，其中包含与此项目相关的所有版本。你可以在将其下载到主项目文件目录后，使用以下命令一次性进行安装：

PowerShell

pip install -r requirements.txt

请记住，某些库有根据你在虚拟环境中运行的 Python 版本提供的不同版本可用。

3.3. 下载模型

这是一个非常重要的步骤。

对于这个项目，我们当然需要 GPT4All 模型。在 Nomic AI 网站上描述的过程非常复杂，并且需要一些我们不一定都拥有的硬件（比如我）。所以这里是已经转换并准备好使用的模型链接。只需点击下载。

正如简介中简要描述的，我们还需要嵌入模型，这是一个可以在我们的 CPU 上运行而不会崩溃的模型。点击这里的链接下载已经转换为 4 位并准备好用作嵌入模型的 alpaca-native-7B-ggml 模型。

为什么我们需要嵌入？如果你还记得流程图中的第一步，在收集知识库文档之后，我们需要将它们嵌入。LLamaCPP 嵌入来自 Alpaca 模型，非常适合这项工作，而且这个模型也很小（4 GB）。顺便说一下，你也可以在问答环节中使用 Alpaca 模型！

2023.05.25 更新：Mani Windows 用户在使用 llamaCPP 嵌入时遇到了问题。这主要是因为在安装 python 包 llama-cpp-python 时使用了以下命令：

PowerShell

pip install llama-cpp-python

这个命令会从源代码编译库。通常 Windows 上的机器默认没有安装 CMake 或 C 编译器。但不要担心，有解决办法。

在 Windows 上运行 llama-cpp-python 的安装过程时，需要编译源代码，但由于 Windows 默认没有安装 CMake 和 C 编译器，因此无法从源代码构建。

在 Mac 用户使用 Xtools 和 Linux 用户上，通常操作系统中已经安装了 C 编译器。

3.4. 为了避免问题，你必须使用预编译的 wheel

访问 https://github.com/abetlen/llama-cpp-python/releases。

找到适用于你架构和 Python 版本的预编译的 wheel 版本，你必须选择 0.1.49 版本，因为更高的版本不兼容。

在我这里，我使用的是 Windows 10 64 位，Python 3.10。

所以我的文件是 llama_cpp_python-0.1.49-cp310-cp310-win_amd64.whl：

Note：这个问题在 GitHub 存储库中有记录 https://github.com/fabiomatricardi/GPT4All_Medium/issues/2。

下载完成后，你需要将这两个模型文件放到 models 目录中，如下所示。

目录结构和放置模型文件的位置。

4. 与 GPT4All 的基本交互

由于我们想要对 GPT 模型的交互进行控制，我们需要创建一个 Python 文件（我们称之为 pygpt4all_test.py），导入依赖项并给模型发送指令。你会发现这很容易。

Python

from pygpt4all.models.gpt4all import GPT4All

这是我们模型的 Python 绑定。现在我们可以调用它并开始提问。让我们试一个有创意的问题。

我们创建一个函数来读取模型的回调，并要求 GPT4All 完成我们的句子。

Python

def new_text_callback(text):
    print(text, end="")

model = GPT4All('./models/gpt4all-converted.bin')
model.generate("Once upon a time, ", n_predict=55, new_text_callback=new_text_callback)

第一条语句告诉我们的程序在哪里找到模型（记住我们在上面的部分所做的事情）。

第二条语句要求模型生成一个回答，并完成我们的提示文本 "Once upon a time,"。

要运行它，请确保虚拟环境仍处于激活状态，然后简单地运行：

PowerShell

python3 pygpt4all_test.py

你应该会看到模型正在加载的文本和句子的完成。根据你的硬件资源，这可能需要一些时间。

结果可能与你的不同 … 但对我们来说重要的是它正在工作，我们可以继续使用 LangChain 创建一些高级功能。

注意（更新于 2023.05.23）：如果你遇到与 pygpt4all 相关的错误，请查看故障排除部分，其中提供了由 Rajneesh Aggarwal 或 Oscar Jeong 给出的解决方案。

5. LangChain 在 GPT4All 上的模板

LangChain 框架是一个非常强大的库。它提供了组件，以一种易于使用的方式与语言模型进行交互，并且还提供了 Chains。可以将 Chains 视为以特定方式组装这些组件，以最好地完成特定的用例。它们旨在成为一个更高级的接口，使人们能够轻松地开始使用特定的用例。这些 Chains 也可以进行定制。

在我们的下一个 Python 测试中，我们将使用一个 Prompt Template。语言模型接受文本作为输入，这段文本通常被称为 prompt。通常情况下，这不仅仅是一个硬编码的字符串，而是模板、示例和用户输入的组合。LangChain 提供了多个类和函数，使构建和处理 prompt 变得简单。让我们看看如何实现。

创建一个新的 Python 文件，命名为 my_langchain.py。

Python

# Import of langchain Prompt Template and Chain
from langchain import PromptTemplate, LLMChain

# Import llm to be able to interact with GPT4All directly from langchain
from langchain.llms import GPT4All

# Callbacks manager is required for the response handling 
from langchain.callbacks.base import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

local_path = './models/gpt4all-converted.bin' 
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

我们从 LangChain 中导入了 Prompt Template 和 Chain，以及 GPT4All llm 类，以便能够直接与我们的 GPT 模型进行交互。

然后，在设置了 llm 路径之后（与之前一样），我们实例化了回调管理器，以便能够捕获我们查询的响应。

创建一个模板非常简单：根据文档教程，我们可以使用如下代码：

Python

template = """Question: {question}

Answer: Let's think step by step on it.

"""
prompt = PromptTemplate(template=template, input_variables=["question"])

template 变量是一个多行字符串，包含了与模型的交互结构：在花括号中插入模板的外部变量，对于我们的情况就是问题。

由于它是一个变量，你可以决定它是一个硬编码的问题还是用户输入的问题：这里是两个示例。

Python

# Hardcoded question
question = "What Formula 1 pilot won the championship in the year Leonardo di Caprio was born?"

# User input question...
question = input("Enter your question: ")

对于我们的测试运行，我们将注释掉用户输入的问题。现在我们只需要将我们的模板、问题和语言模型连接在一起。

Python

template = """Question: {question}
Answer: Let's think step by step on it.
"""

prompt = PromptTemplate(template=template, input_variables=["question"])

# initialize the GPT4All instance
llm = GPT4All(model=local_path, callback_manager=callback_manager, verbose=True)

# link the language model with our prompt template
llm_chain = LLMChain(prompt=prompt, llm=llm)

# Hardcoded question
question = "What Formula 1 pilot won the championship in the year Leonardo di Caprio was born?"

# User imput question...
# question = input("Enter your question: ")

#Run the query and get the results
llm_chain.run(question)

记得验证你的虚拟环境仍然处于激活状态，并运行以下命令：

PowerShell

python3 my_langchain.py

你可能会得到与我不同的结果。令人惊奇的是，你可以看到 GPT4All 在尝试为你找到答案时所遵循的整个推理过程。调整问题可能会得到更好的结果。

6. 使用 LangChain 和 GPT4All 回答关于文件的问题

在这里，我们开始了令人惊奇的部分，因为我们将使用 GPT4All 作为一个聊天机器人来回答我们的问题。

根据与 GPT4All 进行问答的工作流程，我们需要加载我们的 PDF 文件，并将其分成块。接下来，我们需要为我们的嵌入向量准备一个向量存储库。我们需要将我们的分块文档输入到向量存储库中进行信息检索，然后将它们与该数据库上的相似性搜索一起作为 LLM 查询的上下文进行嵌入。

为此，我们将直接使用 Langchain 库中的 FAISS。FAISS 是 Facebook AI Research 开发的开源库，旨在快速在大规模高维数据集中查找相似项。它提供索引和搜索方法，使得在数据集中快速找到最相似的项变得更加简单和快速。对我们来说特别方便的是，它简化了信息检索，并允许我们本地保存创建的数据库：这意味着在第一次创建后，将非常快速地加载数据库以供以后使用。

6.1. 创建向量索引数据库

创建一个新的文件叫做 my_knowledge_qna.py：

Python

from langchain import PromptTemplate, LLMChain
from langchain.llms import GPT4All
from langchain.callbacks.base import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

# function for loading only TXT files
from langchain.document_loaders import TextLoader

# text splitter for create chunks
from langchain.text_splitter import RecursiveCharacterTextSplitter

# to be able to load the pdf files
from langchain.document_loaders import UnstructuredPDFLoader
from langchain.document_loaders import PyPDFLoader
from langchain.document_loaders import DirectoryLoader

# Vector Store Index to create our database about our knowledge
from langchain.indexes import VectorstoreIndexCreator

# LLamaCpp embeddings from the Alpaca model
from langchain.embeddings import LlamaCppEmbeddings

# FAISS  library for similaarity search
from langchain.vectorstores.faiss import FAISS

import os  #for interaaction with the files
import datetime

第一组库与之前使用的相同：另外我们使用 Langchain 进行向量存储索引的创建，使用 LlamaCppEmbeddings 与我们的 Alpaca 模型进行交互（使用 cpp 库进行 4 位量化和编译），以及 PDF 加载器。

我们还要加载具有自己路径的 LLMs：一个用于嵌入和一个用于文本生成。

Python

# assign the path for the 2 models GPT4All and Alpaca for the embeddings 
gpt4all_path = './models/gpt4all-converted.bin' 
llama_path = './models/ggml-model-q4_0.bin' 
# Calback manager for handling the calls with  the model
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

# create the embedding object
embeddings = LlamaCppEmbeddings(model_path=llama_path)
# create the GPT4All llm object
llm = GPT4All(model=gpt4all_path, callback_manager=callback_manager, verbose=True)

为了测试，让我们看看是否成功读取了所有的 PDF 文件：第一步是声明三个函数，用于处理每个单独的文档。第一个函数是将提取的文本分割成块，第二个函数是创建带有元数据的向量索引（例如页码等），最后一个函数是用于测试相似性搜索（稍后我将更详细地解释）。

Python

# Split text 
def split_chunks(sources):
    chunks = []
    splitter = RecursiveCharacterTextSplitter(chunk_size=256, chunk_overlap=32)
 for chunk in splitter.split_documents(sources):
        chunks.append(chunk)
 return chunks


def create_index(chunks):
    texts = [doc.page_content for doc in chunks]
    metadatas = [doc.metadata for doc in chunks]

    search_index = FAISS.from_texts(texts, embeddings, metadatas=metadatas)

 return search_index


def similarity_search(query, index):
 # k is the number of similarity searched that matches the query
 # default is 4
    matched_docs = index.similarity_search(query, k=3) 
    sources = []
 for doc in matched_docs:
        sources.append(
            {
                "page_content": doc.page_content,
                "metadata": doc.metadata,
            }
        )

 return matched_docs, sources

现在我们可以测试文档目录中的索引生成：我们需要将所有的 PDF 文件放在那里。Langchain 还有一种加载整个文件夹的方法，不论文件类型如何：由于后续处理比较复杂，我将在下一篇关于 LaMini 模型的文章中介绍。

我们将把这些函数应用到列表中的第一个文档上。

Python

# get the list of pdf files from the docs directory into a list  format
pdf_folder_path = './docs'
doc_list = [s for s in os.listdir(pdf_folder_path) if s.endswith('.pdf')]
num_of_docs = len(doc_list)
# create a loader for the PDFs from the path
loader = PyPDFLoader(os.path.join(pdf_folder_path, doc_list[0]))
# load the documents with Langchain
docs = loader.load()
# Split in chunks
chunks = split_chunks(docs)
# create the db vector index
db0 = create_index(chunks)

在第一行中，我们使用 os 库来获取 docs 目录中的 PDF 文件列表。然后，我们使用 Langchain 加载 docs 文件夹中的第一个文档（doc_list[0]），将其分割成块，然后使用 LLama 嵌入创建向量数据库。

正如您所看到的，我们使用了 pyPDF 方法。这个方法稍微复杂一些，因为您需要逐个加载文件，但是使用 pypdf 将 PDF 加载到文档数组中使您能够得到一个数组，其中每个文档都包含页面内容和带有页码的元数据。当您想要知道我们将通过查询提供给 GPT4All 的上下文的来源时，这非常方便。以下是来自 readthedocs 的示例：

然后执行下面的命令运行：

PowerShell

python3 my_knowledge_qna.py

在加载用于嵌入的模型之后，您将看到令牌在索引中的工作方式：不要惊慌，因为这需要时间，特别是如果您只在 CPU 上运行，就像我一样（这需要 8 分钟）。

正如我之前解释的，pyPDF 方法较慢，但为我们提供了用于相似性搜索的额外数据。为了遍历所有文件，我们将使用 FAISS 中的一种便捷方法，它允许我们将不同的数据库合并在一起。现在我们做的是使用上面的代码生成第一个数据库（我们将其称为 db0），然后使用 for 循环创建列表中下一个文件的索引，并立即将其与 db0 合并。

以下是代码：请注意，我添加了一些日志，以便使用 datetime.datetime.now() 提供进度状态，并打印结束时间和开始时间的差值来计算操作所需的时间（如果您不喜欢，可以将其删除）。

合并指令如下：

Python

# merge dbi with the existing db0db0.merge_from(dbi)

最后一条指令是将我们的数据库保存到本地：整个生成过程可能需要数小时（取决于文档的数量），所以我们只需执行一次这个操作非常好！

Python

# Save the databasae locallydb0.save_local("my_faiss_index")

以下是完整的代码。当我们与 GPT4All 互动并直接从文件夹加载索引时，我们将对其中许多部分进行注释。

Python

# get the list of pdf files from the docs directory into a list  format
pdf_folder_path = './docs'
doc_list = [s for s in os.listdir(pdf_folder_path) if s.endswith('.pdf')]
num_of_docs = len(doc_list)
# create a loader for the PDFs from the path
general_start = datetime.datetime.now() #not used now but useful
print("starting the loop...")
loop_start = datetime.datetime.now() #not used now but useful
print("generating fist vector database and then iterate with .merge_from")
loader = PyPDFLoader(os.path.join(pdf_folder_path, doc_list[0]))
docs = loader.load()
chunks = split_chunks(docs)
db0 = create_index(chunks)
print("Main Vector database created. Start iteration and merging...")
for i in range(1,num_of_docs):
    print(doc_list[i])
    print(f"loop position {i}")
    loader = PyPDFLoader(os.path.join(pdf_folder_path, doc_list[i]))
    start = datetime.datetime.now() #not used now but useful
    docs = loader.load()
    chunks = split_chunks(docs)
    dbi = create_index(chunks)
    print("start merging with db0...")
    db0.merge_from(dbi)
    end = datetime.datetime.now() #not used now but useful
    elapsed = end - start #not used now but useful
 #total time
    print(f"completed in {elapsed}")
    print("-----------------------------------")
loop_end = datetime.datetime.now() #not used now but useful
loop_elapsed = loop_end - loop_start #not used now but useful
print(f"All documents processed in {loop_elapsed}")
print(f"the daatabase is done with {num_of_docs} subset of db index")
print("-----------------------------------")
print(f"Merging completed")
print("-----------------------------------")
print("Saving Merged Database Locally")
# Save the databasae locally
db0.save_local("my_faiss_index")
print("-----------------------------------")
print("merged database saved as my_faiss_index")
general_end = datetime.datetime.now() #not used now but useful
general_elapsed = general_end - general_start #not used now but useful
print(f"All indexing completed in {general_elapsed}")
print("-----------------------------------")

6.2. 向 GPT4All 提问关于您的文档的问题

现在我们到了这一步。我们有了我们的索引，我们可以加载它，并使用一个提示模板向 GPT4All 提问。我们先从一个硬编码的问题开始，然后循环遍历我们的输入问题。

将以下代码放入一个名为 db_loading.py 的 Python 文件中，并在终端中使用 python3 db_loading.py 命令运行它:

Python

from langchain import PromptTemplate, LLMChain
from langchain.llms import GPT4All
from langchain.callbacks.base import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
# function for loading only TXT files
from langchain.document_loaders import TextLoader
# text splitter for create chunks
from langchain.text_splitter import RecursiveCharacterTextSplitter
# to be able to load the pdf files
from langchain.document_loaders import UnstructuredPDFLoader
from langchain.document_loaders import PyPDFLoader
from langchain.document_loaders import DirectoryLoader
# Vector Store Index to create our database about our knowledge
from langchain.indexes import VectorstoreIndexCreator
# LLamaCpp embeddings from the Alpaca model
from langchain.embeddings import LlamaCppEmbeddings
# FAISS  library for similaarity search
from langchain.vectorstores.faiss import FAISS
import os  #for interaaction with the files
import datetime

# TEST FOR SIMILARITY SEARCH

# assign the path for the 2 models GPT4All and Alpaca for the embeddings 
gpt4all_path = './models/gpt4all-converted.bin' 
llama_path = './models/ggml-model-q4_0.bin' 
# Calback manager for handling the calls with  the model
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

# create the embedding object
embeddings = LlamaCppEmbeddings(model_path=llama_path)
# create the GPT4All llm object
llm = GPT4All(model=gpt4all_path, callback_manager=callback_manager, verbose=True)

# Split text 
def split_chunks(sources):
    chunks = []
    splitter = RecursiveCharacterTextSplitter(chunk_size=256, chunk_overlap=32)
 for chunk in splitter.split_documents(sources):
        chunks.append(chunk)
 return chunks


def create_index(chunks):
    texts = [doc.page_content for doc in chunks]
    metadatas = [doc.metadata for doc in chunks]

    search_index = FAISS.from_texts(texts, embeddings, metadatas=metadatas)

 return search_index


def similarity_search(query, index):
 # k is the number of similarity searched that matches the query
 # default is 4
    matched_docs = index.similarity_search(query, k=3) 
    sources = []
 for doc in matched_docs:
        sources.append(
            {
                "page_content": doc.page_content,
                "metadata": doc.metadata,
            }
        )

 return matched_docs, sources

# Load our local index vector db
index = FAISS.load_local("my_faiss_index", embeddings)
# Hardcoded question
query = "What is a PLC and what is the difference with a PC"
docs = index.similarity_search(query)
# Get the matches best 3 results - defined in the function k=3
print(f"The question is: {query}")
print("Here the result of the semantic search on the index, without GPT4All..")
print(docs[0])

打印的文本是与查询最匹配的前 3 个来源的列表，还提供了文档名称和页码。

现在我们可以使用相似性搜索作为查询的上下文，使用提示模板。在这 3 个函数之后，只需将所有代码替换为以下内容：

Python

# Load our local index vector db
index = FAISS.load_local("my_faiss_index", embeddings)

# create the prompt template
template = """
Please use the following context to answer questions.
Context: {context}
---
Question: {question}
Answer: Let's think step by step."""

# Hardcoded question
question = "What is a PLC and what is the difference with a PC"
matched_docs, sources = similarity_search(question, index)
# Creating the context
context = "\n".join([doc.page_content for doc in matched_docs])
# instantiating the prompt template and the GPT4All chain
prompt = PromptTemplate(template=template, input_variables=["context", "question"]).partial(context=context)
llm_chain = LLMChain(prompt=prompt, llm=llm)
# Print the result
print(llm_chain.run(question))

运行之后你会得到一个这样的结果：

Text

Please use the following context to answer questions.
Context: 1.What is a PLC
2.Where and Why it is used
3.How a PLC is different from a PC
PLC is especially important in industries where safety and reliability are
critical, such as manufacturing plants, chemical plants, and power plants.
How a PLC is different from a PC
Because a PLC is a specialized computer used in industrial and
manufacturing applications to control machinery and processes.,the
hardware components of a typical PLC must be able to interact with
industrial device. So a typical PLC hardware include:
---
Question: What is a PLC and what is the difference with a PC
Answer: Let's think step by step. 1) A Programmable Logic Controller (PLC), 
also called Industrial Control System or ICS, refers to an industrial computer 
that controls various automated processes such as manufacturing 
machines/assembly lines etcetera through sensors and actuators connected 
with it via inputs & outputs. It is a form of digital computers which has 
the ability for multiple instruction execution (MIE), built-in memory 
registers used by software routines, Input Output interface cards(IOC) 
to communicate with other devices electronically/digitally over networks 
or buses etcetera
2). A Programmable Logic Controller is widely utilized in industrial 
automation as it has the ability for more than one instruction execution. 
It can perform tasks automatically and programmed instructions, which allows 
it to carry out complex operations that are beyond a 
Personal Computer (PC) capacity. So an ICS/PLC contains built-in memory 
registers used by software routines or firmware codes etcetera but 
PC doesn't contain them so they need external interfaces such as 
hard disks drives(HDD), USB ports, serial and parallel 
communication protocols to store data for further analysis or 
report generation.

如果你希望使用用户输入来替换问题：

Python

question = "What is a PLC and what is the difference with a PC"

就像下面这样：

Python

question = input("Your question: ")

7. 结论

现在是你进行实验的时候了。对与你的文档相关的所有主题提出不同的问题，并观察结果。在 Prompt 和模板方面，肯定还有很大的改进空间：你可以在这里找到一些灵感。而且 Langchain 的文档真的很棒（我都能够理解！）。

参考链接

GPT4All 本地部署 ​

1. GPT4All 是什么 ​

2. 它是如何工作的？ ​

3. Let's start coding ​

3.1. 创建虚拟环境 ​

3.2. 安装依赖 ​

3.3. 下载模型 ​

3.4. 为了避免问题，你必须使用预编译的 wheel ​

4. 与 GPT4All 的基本交互 ​

5. LangChain 在 GPT4All 上的模板 ​

6. 使用 LangChain 和 GPT4All 回答关于文件的问题 ​

6.1. 创建向量索引数据库 ​

6.2. 向 GPT4All 提问关于您的文档的问题 ​

7. 结论 ​

GPT4All 本地部署

1. GPT4All 是什么

2. 它是如何工作的？

3. Let's start coding

3.1. 创建虚拟环境

3.2. 安装依赖

3.3. 下载模型

3.4. 为了避免问题，你必须使用预编译的 wheel

4. 与 GPT4All 的基本交互

5. LangChain 在 GPT4All 上的模板

6. 使用 LangChain 和 GPT4All 回答关于文件的问题

6.1. 创建向量索引数据库

6.2. 向 GPT4All 提问关于您的文档的问题

7. 结论