mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-05-21 16:40:07 +08:00
### What problem does this PR solve? restore openai-compatible chat completions api ### Type of change - [x] Refactoring
2795 lines
71 KiB
Markdown
2795 lines
71 KiB
Markdown
---
|
||
sidebar_position: 5
|
||
slug: /python_api_reference
|
||
sidebar_custom_props: {
|
||
categoryIcon: SiPython
|
||
}
|
||
---
|
||
# Python API
|
||
|
||
A complete reference for RAGFlow's Python APIs. Before proceeding, please ensure you [have your RAGFlow API key ready for authentication](https://ragflow.io/docs/dev/acquire_ragflow_api_key).
|
||
|
||
:::tip NOTE
|
||
Run the following command to download the Python SDK:
|
||
|
||
```bash
|
||
pip install ragflow-sdk
|
||
```
|
||
|
||
:::
|
||
|
||
---
|
||
|
||
## ERROR CODES
|
||
|
||
---
|
||
|
||
| Code | Message | Description |
|
||
|------|-----------------------|----------------------------|
|
||
| 400 | Bad Request | Invalid request parameters |
|
||
| 401 | Unauthorized | Unauthorized access |
|
||
| 403 | Forbidden | Access denied |
|
||
| 404 | Not Found | Resource not found |
|
||
| 500 | Internal Server Error | Server internal error |
|
||
| 1001 | Invalid Chunk ID | Invalid Chunk ID |
|
||
| 1002 | Chunk Update Failed | Chunk update failed |
|
||
|
||
---
|
||
|
||
## OpenAI-Compatible API
|
||
|
||
---
|
||
|
||
### Create chat completion
|
||
|
||
Creates a model response for the given historical chat conversation via OpenAI's API.
|
||
|
||
#### Parameters
|
||
|
||
##### chat_id: `string`, *Required*
|
||
|
||
Existing chat assistant ID. This value is part of the request path: `/api/v1/openai/<chat_id>/chat/completions`.
|
||
|
||
##### model: `string`, *Required*
|
||
|
||
The model used to generate the response. You may also use the legacy placeholder value `"model"` to keep using the chat assistant's configured model.
|
||
|
||
##### messages: `list[object]`, *Required*
|
||
|
||
A list of historical chat messages used to generate the response. This must contain at least one message with the `user` role.
|
||
|
||
##### stream: `boolean`
|
||
|
||
Whether to receive the response as a stream. Set this to `false` explicitly if you prefer to receive the entire response in one go instead of as a stream.
|
||
|
||
#### Returns
|
||
|
||
- Success: Response [message](https://platform.openai.com/docs/api-reference/chat/create) like OpenAI
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from openai import OpenAI
|
||
import json
|
||
|
||
model = "glm-4-flash@ZHIPU-AI"
|
||
client = OpenAI(api_key="ragflow-api-key", base_url="http://ragflow_address/api/v1/openai/<chat_id>/chat")
|
||
|
||
stream = True
|
||
reference = True
|
||
|
||
request_kwargs = dict(
|
||
model=model,
|
||
messages=[
|
||
{"role": "system", "content": "You are a helpful assistant."},
|
||
{"role": "user", "content": "Who are you?"},
|
||
{"role": "assistant", "content": "I am an AI assistant named..."},
|
||
{"role": "user", "content": "Can you tell me how to install neovim"},
|
||
],
|
||
extra_body={
|
||
"reference": reference,
|
||
"reference_metadata": {
|
||
"include": True,
|
||
"fields": ["author", "year", "source"],
|
||
},
|
||
},
|
||
)
|
||
|
||
if stream:
|
||
completion = client.chat.completions.create(stream=True, **request_kwargs)
|
||
for chunk in completion:
|
||
print(chunk)
|
||
else:
|
||
resp = client.chat.completions.with_raw_response.create(
|
||
stream=False, **request_kwargs
|
||
)
|
||
print("status:", resp.http_response.status_code)
|
||
raw_text = resp.http_response.text
|
||
print("raw:", raw_text)
|
||
|
||
data = json.loads(raw_text)
|
||
print("assistant:", data["choices"][0]["message"].get("content"))
|
||
print("reference:", data["choices"][0]["message"].get("reference"))
|
||
```
|
||
|
||
When `extra_body.reference` is `true`, the streamed final chunk may include `choices[0].delta.reference`, and the non-stream response may include `choices[0].message.reference`.
|
||
|
||
When `extra_body.reference_metadata.include` is `true`, each reference chunk may include a `document_metadata` object in both streaming and non-streaming responses.
|
||
|
||
## DATASET MANAGEMENT
|
||
|
||
---
|
||
|
||
### Create dataset
|
||
|
||
```python
|
||
RAGFlow.create_dataset(
|
||
name: str,
|
||
avatar: Optional[str] = None,
|
||
description: Optional[str] = None,
|
||
embedding_model: Optional[str] = "BAAI/bge-large-zh-v1.5@BAAI",
|
||
permission: str = "me",
|
||
chunk_method: str = "naive",
|
||
parser_config: DataSet.ParserConfig = None
|
||
) -> DataSet
|
||
```
|
||
|
||
Creates a dataset.
|
||
|
||
#### Parameters
|
||
|
||
##### name: `string`, *Required*
|
||
|
||
The unique name of the dataset to create. It must adhere to the following requirements:
|
||
|
||
- Maximum 128 characters.
|
||
- Case-insensitive.
|
||
|
||
##### avatar: `string`
|
||
|
||
Base64 encoding of the avatar. Defaults to `None`
|
||
|
||
##### description: `string`
|
||
|
||
A brief description of the dataset to create. Defaults to `None`.
|
||
|
||
|
||
##### permission
|
||
|
||
Specifies who can access the dataset to create. Available options:
|
||
|
||
- `"me"`: (Default) Only you can manage the dataset.
|
||
- `"team"`: All team members can manage the dataset.
|
||
|
||
##### chunk_method, `string`
|
||
|
||
The chunking method of the dataset to create. Available options:
|
||
|
||
- `"naive"`: General (default)
|
||
- `"manual`: Manual
|
||
- `"qa"`: Q&A
|
||
- `"table"`: Table
|
||
- `"paper"`: Paper
|
||
- `"book"`: Book
|
||
- `"laws"`: Laws
|
||
- `"presentation"`: Presentation
|
||
- `"picture"`: Picture
|
||
- `"one"`: One
|
||
- `"email"`: Email
|
||
|
||
##### parser_config
|
||
|
||
The parser configuration of the dataset. A `ParserConfig` object's attributes vary based on the selected `chunk_method`:
|
||
|
||
- `chunk_method`=`"naive"`:
|
||
`{"chunk_token_num":512,"delimiter":"\\n","html4excel":False,"layout_recognize":True,"raptor":{"use_raptor":False},"parent_child":{"use_parent_child":False,"children_delimiter":"\\n"}}`.
|
||
- `chunk_method`=`"qa"`:
|
||
`{"raptor": {"use_raptor": False}}`
|
||
- `chunk_method`=`"manuel"`:
|
||
`{"raptor": {"use_raptor": False}}`
|
||
- `chunk_method`=`"table"`:
|
||
`None`
|
||
- `chunk_method`=`"paper"`:
|
||
`{"raptor": {"use_raptor": False}}`
|
||
- `chunk_method`=`"book"`:
|
||
`{"raptor": {"use_raptor": False}}`
|
||
- `chunk_method`=`"laws"`:
|
||
`{"raptor": {"use_raptor": False}}`
|
||
- `chunk_method`=`"picture"`:
|
||
`None`
|
||
- `chunk_method`=`"presentation"`:
|
||
`{"raptor": {"use_raptor": False}}`
|
||
- `chunk_method`=`"one"`:
|
||
`None`
|
||
- `chunk_method`=`"knowledge-graph"`:
|
||
`{"chunk_token_num":128,"delimiter":"\\n","entity_types":["organization","person","location","event","time"]}`
|
||
- `chunk_method`=`"email"`:
|
||
`None`
|
||
|
||
#### Returns
|
||
|
||
- Success: A `dataset` object.
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
dataset = rag_object.create_dataset(name="kb_1")
|
||
```
|
||
|
||
---
|
||
|
||
### Delete datasets
|
||
|
||
```python
|
||
RAGFlow.delete_datasets(ids: list[str] | None = None, delete_all: bool = False)
|
||
```
|
||
|
||
Deletes datasets by ID.
|
||
|
||
#### Parameters
|
||
|
||
##### ids: `list[str]` or `None`
|
||
|
||
The IDs of the datasets to delete. Defaults to `None`.
|
||
|
||
- If omitted, or set to `null` or an empty array, no datasets are deleted.
|
||
- If an array of IDs is provided, only the datasets matching those IDs are deleted.
|
||
|
||
##### delete_all: `bool`
|
||
|
||
Whether to delete all datasets owned by the current user when `ids` is omitted, or set to `None` or an empty list. Defaults to `False`.
|
||
|
||
#### Returns
|
||
|
||
- Success: No value is returned.
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
rag_object.delete_datasets(ids=["d94a8dc02c9711f0930f7fbc369eab6d","e94a8dc02c9711f0930f7fbc369eab6e"])
|
||
rag_object.delete_datasets(delete_all=True)
|
||
```
|
||
|
||
---
|
||
|
||
### List datasets
|
||
|
||
```python
|
||
RAGFlow.list_datasets(
|
||
page: int = 1,
|
||
page_size: int = 30,
|
||
orderby: str = "create_time",
|
||
desc: bool = True,
|
||
id: str = None,
|
||
name: str = None,
|
||
include_parsing_status: bool = False
|
||
) -> list[DataSet]
|
||
```
|
||
|
||
Lists datasets.
|
||
|
||
#### Parameters
|
||
|
||
##### page: `int`
|
||
|
||
Specifies the page on which the datasets will be displayed. Defaults to `1`.
|
||
|
||
##### page_size: `int`
|
||
|
||
The number of datasets on each page. Defaults to `30`.
|
||
|
||
##### orderby: `string`
|
||
|
||
The field by which datasets should be sorted. Available options:
|
||
|
||
- `"create_time"` (default)
|
||
- `"update_time"`
|
||
|
||
##### desc: `bool`
|
||
|
||
Indicates whether the retrieved datasets should be sorted in descending order. Defaults to `True`.
|
||
|
||
##### id: `string`
|
||
|
||
The ID of the dataset to retrieve. Defaults to `None`.
|
||
|
||
##### name: `string`
|
||
|
||
The name of the dataset to retrieve. Defaults to `None`.
|
||
|
||
##### include_parsing_status: `bool`
|
||
|
||
Whether to include document parsing status counts in each returned `DataSet` object. Defaults to `False`. When set to `True`, each `DataSet` object will include the following additional attributes:
|
||
|
||
- `unstart_count`: `int` Number of documents not yet started parsing.
|
||
- `running_count`: `int` Number of documents currently being parsed.
|
||
- `cancel_count`: `int` Number of documents whose parsing was cancelled.
|
||
- `done_count`: `int` Number of documents that have been successfully parsed.
|
||
- `fail_count`: `int` Number of documents whose parsing failed.
|
||
|
||
#### Returns
|
||
|
||
- Success: A list of `DataSet` objects.
|
||
- Failure: `Exception`.
|
||
|
||
#### Examples
|
||
|
||
##### List all datasets
|
||
|
||
```python
|
||
for dataset in rag_object.list_datasets():
|
||
print(dataset)
|
||
```
|
||
|
||
##### Retrieve a dataset by ID
|
||
|
||
```python
|
||
dataset = rag_object.list_datasets(id = "id_1")
|
||
print(dataset[0])
|
||
```
|
||
|
||
##### List datasets with parsing status
|
||
|
||
```python
|
||
for dataset in rag_object.list_datasets(include_parsing_status=True):
|
||
print(dataset.done_count, dataset.fail_count, dataset.running_count)
|
||
```
|
||
|
||
---
|
||
|
||
### Update dataset
|
||
|
||
```python
|
||
DataSet.update(update_message: dict)
|
||
```
|
||
|
||
Updates configurations for the current dataset.
|
||
|
||
#### Parameters
|
||
|
||
##### update_message: `dict[str, str|int]`, *Required*
|
||
|
||
A dictionary representing the attributes to update, with the following keys:
|
||
|
||
- `"name"`: `string` The revised name of the dataset.
|
||
- Basic Multilingual Plane (BMP) only
|
||
- Maximum 128 characters
|
||
- Case-insensitive
|
||
- `"avatar"`: (*Body parameter*), `string`
|
||
The updated base64 encoding of the avatar.
|
||
- Maximum 65535 characters
|
||
- `"embedding_model"`: (*Body parameter*), `string`
|
||
The updated embedding model name.
|
||
- Ensure that `"chunk_count"` is `0` before updating `"embedding_model"`.
|
||
- Maximum 255 characters
|
||
- Must follow `model_name@model_factory` format
|
||
- `"permission"`: (*Body parameter*), `string`
|
||
The updated dataset permission. Available options:
|
||
- `"me"`: (Default) Only you can manage the dataset.
|
||
- `"team"`: All team members can manage the dataset.
|
||
- `"pagerank"`: (*Body parameter*), `int`
|
||
refer to [Set page rank](https://ragflow.io/docs/dev/set_page_rank)
|
||
- Default: `0`
|
||
- Minimum: `0`
|
||
- Maximum: `100`
|
||
- `"chunk_method"`: (*Body parameter*), `enum<string>`
|
||
The chunking method for the dataset. Available options:
|
||
- `"naive"`: General (default)
|
||
- `"book"`: Book
|
||
- `"email"`: Email
|
||
- `"laws"`: Laws
|
||
- `"manual"`: Manual
|
||
- `"one"`: One
|
||
- `"paper"`: Paper
|
||
- `"picture"`: Picture
|
||
- `"presentation"`: Presentation
|
||
- `"qa"`: Q&A
|
||
- `"table"`: Table
|
||
- `"tag"`: Tag
|
||
|
||
#### Returns
|
||
|
||
- Success: No value is returned.
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
dataset = rag_object.list_datasets(name="kb_name")
|
||
dataset = dataset[0]
|
||
dataset.update({"embedding_model":"BAAI/bge-zh-v1.5", "chunk_method":"manual"})
|
||
```
|
||
|
||
---
|
||
|
||
## FILE MANAGEMENT WITHIN DATASET
|
||
|
||
---
|
||
|
||
### Upload documents
|
||
|
||
```python
|
||
DataSet.upload_documents(document_list: list[dict])
|
||
```
|
||
|
||
Uploads documents to the current dataset.
|
||
|
||
#### Parameters
|
||
|
||
##### document_list: `list[dict]`, *Required*
|
||
|
||
A list of dictionaries representing the documents to upload, each containing the following keys:
|
||
|
||
- `"display_name"`: (Optional) The file name to display in the dataset.
|
||
- `"blob"`: (Optional) The binary content of the file to upload.
|
||
|
||
#### Returns
|
||
|
||
- Success: No value is returned.
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
dataset = rag_object.create_dataset(name="kb_name")
|
||
dataset.upload_documents([{"display_name": "1.txt", "blob": "<BINARY_CONTENT_OF_THE_DOC>"}, {"display_name": "2.pdf", "blob": "<BINARY_CONTENT_OF_THE_DOC>"}])
|
||
```
|
||
|
||
---
|
||
|
||
### Update document
|
||
|
||
```python
|
||
Document.update(update_message:dict)
|
||
```
|
||
|
||
Updates configurations for the current document.
|
||
|
||
#### Parameters
|
||
|
||
##### update_message: `dict[str, str|dict[]]`, *Required*
|
||
|
||
A dictionary representing the attributes to update, with the following keys:
|
||
|
||
- `"display_name"`: `string` The name of the document to update.
|
||
- `"meta_fields"`: `dict[str, Any]` The meta fields of the document.
|
||
- `"chunk_method"`: `string` The parsing method to apply to the document.
|
||
- `"naive"`: General
|
||
- `"manual`: Manual
|
||
- `"qa"`: Q&A
|
||
- `"table"`: Table
|
||
- `"paper"`: Paper
|
||
- `"book"`: Book
|
||
- `"laws"`: Laws
|
||
- `"presentation"`: Presentation
|
||
- `"picture"`: Picture
|
||
- `"one"`: One
|
||
- `"email"`: Email
|
||
- `"parser_config"`: `dict[str, Any]` The parsing configuration for the document. Its attributes vary based on the selected `"chunk_method"`:
|
||
- `"chunk_method"`=`"naive"`:
|
||
`{"chunk_token_num":128,"delimiter":"\\n","html4excel":False,"layout_recognize":True,"raptor":{"use_raptor":False},"parent_child":{"use_parent_child":False,"children_delimiter":"\\n"}}`.
|
||
- `chunk_method`=`"qa"`:
|
||
`{"raptor": {"use_raptor": False}}`
|
||
- `chunk_method`=`"manuel"`:
|
||
`{"raptor": {"use_raptor": False}}`
|
||
- `chunk_method`=`"table"`:
|
||
`None`
|
||
- `chunk_method`=`"paper"`:
|
||
`{"raptor": {"use_raptor": False}}`
|
||
- `chunk_method`=`"book"`:
|
||
`{"raptor": {"use_raptor": False}}`
|
||
- `chunk_method`=`"laws"`:
|
||
`{"raptor": {"use_raptor": False}}`
|
||
- `chunk_method`=`"presentation"`:
|
||
`{"raptor": {"use_raptor": False}}`
|
||
- `chunk_method`=`"picture"`:
|
||
`None`
|
||
- `chunk_method`=`"one"`:
|
||
`None`
|
||
- `chunk_method`=`"knowledge-graph"`:
|
||
`{"chunk_token_num":128,"delimiter":"\\n","entity_types":["organization","person","location","event","time"]}`
|
||
- `chunk_method`=`"email"`:
|
||
`None`
|
||
|
||
#### Returns
|
||
|
||
- Success: No value is returned.
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
dataset = rag_object.list_datasets(id='id')
|
||
dataset = dataset[0]
|
||
doc = dataset.list_documents(id="wdfxb5t547d")
|
||
doc = doc[0]
|
||
doc.update([{"parser_config": {"chunk_token_num": 256}}, {"chunk_method": "manual"}])
|
||
```
|
||
|
||
---
|
||
|
||
### Download document
|
||
|
||
```python
|
||
Document.download() -> bytes
|
||
```
|
||
|
||
Downloads the current document.
|
||
|
||
#### Returns
|
||
|
||
The downloaded document in bytes.
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
dataset = rag_object.list_datasets(id="id")
|
||
dataset = dataset[0]
|
||
doc = dataset.list_documents(id="wdfxb5t547d")
|
||
doc = doc[0]
|
||
open("~/ragflow.txt", "wb+").write(doc.download())
|
||
print(doc)
|
||
```
|
||
|
||
---
|
||
|
||
### List documents
|
||
|
||
```python
|
||
Dataset.list_documents(
|
||
id: str = None,
|
||
keywords: str = None,
|
||
page: int = 1,
|
||
page_size: int = 30,
|
||
order_by: str = "create_time",
|
||
desc: bool = True,
|
||
create_time_from: int = 0,
|
||
create_time_to: int = 0
|
||
) -> list[Document]
|
||
```
|
||
|
||
Lists documents in the current dataset.
|
||
|
||
#### Parameters
|
||
|
||
##### id: `string`
|
||
|
||
The ID of the document to retrieve. Defaults to `None`.
|
||
|
||
##### keywords: `string`
|
||
|
||
The keywords used to match document titles. Defaults to `None`.
|
||
|
||
##### page: `int`
|
||
|
||
Specifies the page on which the documents will be displayed. Defaults to `1`.
|
||
|
||
##### page_size: `int`
|
||
|
||
The maximum number of documents on each page. Defaults to `30`.
|
||
|
||
##### orderby: `string`
|
||
|
||
The field by which documents should be sorted. Available options:
|
||
|
||
- `"create_time"` (default)
|
||
- `"update_time"`
|
||
|
||
##### desc: `bool`
|
||
|
||
Indicates whether the retrieved documents should be sorted in descending order. Defaults to `True`.
|
||
|
||
##### create_time_from: `int`
|
||
Unix timestamp for filtering documents created after this time. 0 means no filter. Defaults to 0.
|
||
|
||
##### create_time_to: `int`
|
||
Unix timestamp for filtering documents created before this time. 0 means no filter. Defaults to 0.
|
||
|
||
#### Returns
|
||
|
||
- Success: A list of `Document` objects.
|
||
- Failure: `Exception`.
|
||
|
||
A `Document` object contains the following attributes:
|
||
|
||
- `id`: The document ID. Defaults to `""`.
|
||
- `name`: The document name. Defaults to `""`.
|
||
- `thumbnail`: The thumbnail image of the document. Defaults to `None`.
|
||
- `dataset_id`: The dataset ID associated with the document. Defaults to `None`.
|
||
- `chunk_method` The chunking method name. Defaults to `"naive"`.
|
||
- `source_type`: The source type of the document. Defaults to `"local"`.
|
||
- `type`: Type or category of the document. Defaults to `""`. Reserved for future use.
|
||
- `created_by`: `string` The creator of the document. Defaults to `""`.
|
||
- `size`: `int` The document size in bytes. Defaults to `0`.
|
||
- `token_count`: `int` The number of tokens in the document. Defaults to `0`.
|
||
- `chunk_count`: `int` The number of chunks in the document. Defaults to `0`.
|
||
- `progress`: `float` The current processing progress as a percentage. Defaults to `0.0`.
|
||
- `progress_msg`: `string` A message indicating the current progress status. Defaults to `""`.
|
||
- `process_begin_at`: `datetime` The start time of document processing. Defaults to `None`.
|
||
- `process_duration`: `float` Duration of the processing in seconds. Defaults to `0.0`.
|
||
- `run`: `string` The document's processing status:
|
||
- `"UNSTART"` (default)
|
||
- `"RUNNING"`
|
||
- `"CANCEL"`
|
||
- `"DONE"`
|
||
- `"FAIL"`
|
||
- `status`: `string` Reserved for future use.
|
||
- `parser_config`: `ParserConfig` Configuration object for the parser. Its attributes vary based on the selected `chunk_method`:
|
||
- `chunk_method`=`"naive"`:
|
||
`{"chunk_token_num":128,"delimiter":"\\n","html4excel":False,"layout_recognize":True,"raptor":{"use_raptor":False}}`.
|
||
- `chunk_method`=`"qa"`:
|
||
`{"raptor": {"use_raptor": False}}`
|
||
- `chunk_method`=`"manuel"`:
|
||
`{"raptor": {"use_raptor": False}}`
|
||
- `chunk_method`=`"table"`:
|
||
`None`
|
||
- `chunk_method`=`"paper"`:
|
||
`{"raptor": {"use_raptor": False}}`
|
||
- `chunk_method`=`"book"`:
|
||
`{"raptor": {"use_raptor": False}}`
|
||
- `chunk_method`=`"laws"`:
|
||
`{"raptor": {"use_raptor": False}}`
|
||
- `chunk_method`=`"presentation"`:
|
||
`{"raptor": {"use_raptor": False}}`
|
||
- `chunk_method`=`"picure"`:
|
||
`None`
|
||
- `chunk_method`=`"one"`:
|
||
`None`
|
||
- `chunk_method`=`"email"`:
|
||
`None`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
dataset = rag_object.create_dataset(name="kb_1")
|
||
|
||
filename1 = "~/ragflow.txt"
|
||
blob = open(filename1 , "rb").read()
|
||
dataset.upload_documents([{"name":filename1,"blob":blob}])
|
||
for doc in dataset.list_documents(keywords="rag", page=0, page_size=12):
|
||
print(doc)
|
||
```
|
||
|
||
---
|
||
|
||
### Delete documents
|
||
|
||
```python
|
||
DataSet.delete_documents(ids: list[str] | None = None, delete_all: bool = False)
|
||
```
|
||
|
||
Deletes documents by ID.
|
||
|
||
#### Parameters
|
||
|
||
##### ids: `list[str]` or `None`
|
||
|
||
The IDs of the documents to delete. Defaults to `None`.
|
||
|
||
- If omitted, or set to `null` or an empty array, no documents are deleted.
|
||
- If an array of IDs is provided, only the documents matching those IDs are deleted.
|
||
|
||
##### delete_all: `bool`
|
||
|
||
Whether to delete all documents in the current dataset when `ids` is omitted, or set to `None` or an empty list. Defaults to `False`.
|
||
|
||
#### Returns
|
||
|
||
- Success: No value is returned.
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
dataset = rag_object.list_datasets(name="kb_1")
|
||
dataset = dataset[0]
|
||
dataset.delete_documents(ids=["id_1","id_2"])
|
||
dataset.delete_documents(delete_all=True)
|
||
```
|
||
|
||
---
|
||
|
||
### Parse documents
|
||
|
||
```python
|
||
DataSet.async_parse_documents(document_ids:list[str]) -> None
|
||
```
|
||
|
||
Parses documents in the current dataset.
|
||
|
||
#### Parameters
|
||
|
||
##### document_ids: `list[str]`, *Required*
|
||
|
||
The IDs of the documents to parse.
|
||
|
||
#### Returns
|
||
|
||
- Success: No value is returned.
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
dataset = rag_object.create_dataset(name="dataset_name")
|
||
documents = [
|
||
{'display_name': 'test1.txt', 'blob': open('./test_data/test1.txt',"rb").read()},
|
||
{'display_name': 'test2.txt', 'blob': open('./test_data/test2.txt',"rb").read()},
|
||
{'display_name': 'test3.txt', 'blob': open('./test_data/test3.txt',"rb").read()}
|
||
]
|
||
dataset.upload_documents(documents)
|
||
documents = dataset.list_documents(keywords="test")
|
||
ids = []
|
||
for document in documents:
|
||
ids.append(document.id)
|
||
dataset.async_parse_documents(ids)
|
||
print("Async bulk parsing initiated.")
|
||
```
|
||
|
||
---
|
||
|
||
### Parse documents (with document status)
|
||
|
||
```python
|
||
DataSet.parse_documents(document_ids: list[str]) -> list[tuple[str, str, int, int]]
|
||
```
|
||
|
||
*Asynchronously* parses documents in the current dataset.
|
||
|
||
This method encapsulates `async_parse_documents()`. It awaits the completion of all parsing tasks before returning detailed results, including the parsing status and statistics for each document. If a keyboard interruption occurs (e.g., `Ctrl+C`), all pending parsing tasks will be cancelled gracefully.
|
||
|
||
#### Parameters
|
||
|
||
##### document_ids: `list[str]`, *Required*
|
||
|
||
The IDs of the documents to parse.
|
||
|
||
#### Returns
|
||
|
||
A list of tuples with detailed parsing results:
|
||
|
||
```python
|
||
[
|
||
(document_id: str, status: str, chunk_count: int, token_count: int),
|
||
...
|
||
]
|
||
```
|
||
- `status`: The final parsing state (e.g., `success`, `failed`, `cancelled`).
|
||
- `chunk_count`: The number of content chunks created from the document.
|
||
- `token_count`: The total number of tokens processed.
|
||
|
||
---
|
||
|
||
#### Example
|
||
|
||
```python
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
dataset = rag_object.create_dataset(name="dataset_name")
|
||
documents = dataset.list_documents(keywords="test")
|
||
ids = [doc.id for doc in documents]
|
||
|
||
try:
|
||
finished = dataset.parse_documents(ids)
|
||
for doc_id, status, chunk_count, token_count in finished:
|
||
print(f"Document {doc_id} parsing finished with status: {status}, chunks: {chunk_count}, tokens: {token_count}")
|
||
except KeyboardInterrupt:
|
||
print("\nParsing interrupted by user. All pending tasks have been cancelled.")
|
||
except Exception as e:
|
||
print(f"Parsing failed: {e}")
|
||
```
|
||
|
||
---
|
||
|
||
### Stop parsing documents
|
||
|
||
```python
|
||
DataSet.async_cancel_parse_documents(document_ids:list[str])-> None
|
||
```
|
||
|
||
Stops parsing specified documents.
|
||
|
||
#### Parameters
|
||
|
||
##### document_ids: `list[str]`, *Required*
|
||
|
||
The IDs of the documents for which parsing should be stopped.
|
||
|
||
#### Returns
|
||
|
||
- Success: No value is returned.
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
dataset = rag_object.create_dataset(name="dataset_name")
|
||
documents = [
|
||
{'display_name': 'test1.txt', 'blob': open('./test_data/test1.txt',"rb").read()},
|
||
{'display_name': 'test2.txt', 'blob': open('./test_data/test2.txt',"rb").read()},
|
||
{'display_name': 'test3.txt', 'blob': open('./test_data/test3.txt',"rb").read()}
|
||
]
|
||
dataset.upload_documents(documents)
|
||
documents = dataset.list_documents(keywords="test")
|
||
ids = []
|
||
for document in documents:
|
||
ids.append(document.id)
|
||
dataset.async_parse_documents(ids)
|
||
print("Async bulk parsing initiated.")
|
||
dataset.async_cancel_parse_documents(ids)
|
||
print("Async bulk parsing cancelled.")
|
||
```
|
||
|
||
---
|
||
|
||
## CHUNK MANAGEMENT WITHIN DATASET
|
||
|
||
---
|
||
|
||
### Add chunk
|
||
|
||
```python
|
||
Document.add_chunk(content:str, important_keywords:list[str] = [], questions:list[str] = [], image_base64:str = None, *, tag_kwd:list[str] = []) -> Chunk
|
||
```
|
||
|
||
Adds a chunk to the current document.
|
||
|
||
#### Parameters
|
||
|
||
##### content: `string`, *Required*
|
||
|
||
The text content of the chunk.
|
||
|
||
##### important_keywords: `list[str]`
|
||
|
||
The key terms or phrases to tag with the chunk.
|
||
|
||
##### questions: `list[str]`
|
||
|
||
Optional questions to use when embedding the chunk.
|
||
|
||
##### image_base64: `string`
|
||
|
||
A base64-encoded image to associate with the chunk. If the chunk already has an image, the new image will be vertically concatenated below the existing one.
|
||
|
||
##### tag_kwd: `list[str]`
|
||
|
||
Tag keywords to associate with the chunk.
|
||
|
||
#### Returns
|
||
|
||
- Success: A `Chunk` object.
|
||
- Failure: `Exception`.
|
||
|
||
A `Chunk` object contains the following attributes:
|
||
|
||
- `id`: `string`: The chunk ID.
|
||
- `content`: `string` The text content of the chunk.
|
||
- `important_keywords`: `list[str]` A list of key terms or phrases tagged with the chunk.
|
||
- `tag_kwd`: `list[str]` A list of tag keywords associated with the chunk.
|
||
- `questions`: `list[str]` A list of questions associated with the chunk.
|
||
- `image_id`: `string` The image ID associated with the chunk (empty string if no image).
|
||
- `create_time`: `string` The time when the chunk was created (added to the document).
|
||
- `create_timestamp`: `float` The timestamp representing the creation time of the chunk, expressed in seconds since January 1, 1970.
|
||
- `dataset_id`: `string` The ID of the associated dataset.
|
||
- `document_name`: `string` The name of the associated document.
|
||
- `document_id`: `string` The ID of the associated document.
|
||
- `available`: `bool` The chunk's availability status in the dataset. Value options:
|
||
- `False`: Unavailable
|
||
- `True`: Available (default)
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
datasets = rag_object.list_datasets(id="123")
|
||
dataset = datasets[0]
|
||
doc = dataset.list_documents(id="wdfxb5t547d")
|
||
doc = doc[0]
|
||
chunk = doc.add_chunk(content="xxxxxxx")
|
||
```
|
||
|
||
Adding a chunk with an image:
|
||
|
||
```python
|
||
import base64
|
||
|
||
with open("image.jpg", "rb") as f:
|
||
img_b64 = base64.b64encode(f.read()).decode()
|
||
chunk = doc.add_chunk(content="description of image", image_base64=img_b64)
|
||
```
|
||
|
||
---
|
||
|
||
### List chunks
|
||
|
||
```python
|
||
Document.list_chunks(keywords: str = None, page: int = 1, page_size: int = 30, id : str = None) -> list[Chunk]
|
||
```
|
||
|
||
Lists chunks in the current document.
|
||
|
||
#### Parameters
|
||
|
||
##### keywords: `string`
|
||
|
||
The keywords used to match chunk content. Defaults to `None`
|
||
|
||
##### page: `int`
|
||
|
||
Specifies the page on which the chunks will be displayed. Defaults to `1`.
|
||
|
||
##### page_size: `int`
|
||
|
||
The maximum number of chunks on each page. Defaults to `30`.
|
||
|
||
##### id: `string`
|
||
|
||
The ID of the chunk to retrieve. Default: `None`
|
||
|
||
#### Returns
|
||
|
||
- Success: A list of `Chunk` objects.
|
||
- Failure: `Exception`.
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
dataset = rag_object.list_datasets("123")
|
||
dataset = dataset[0]
|
||
docs = dataset.list_documents(keywords="test", page=1, page_size=12)
|
||
for chunk in docs[0].list_chunks(keywords="rag", page=0, page_size=12):
|
||
print(chunk)
|
||
```
|
||
|
||
---
|
||
|
||
### Delete chunks
|
||
|
||
```python
|
||
Document.delete_chunks(ids: list[str] | None = None, delete_all: bool = False)
|
||
```
|
||
|
||
Deletes chunks by ID.
|
||
|
||
#### Parameters
|
||
|
||
##### ids: `list[str]` or `None`
|
||
|
||
The IDs of the chunks to delete. Defaults to `None`.
|
||
|
||
- If omitted, or set to `null` or an empty array, no chunks are deleted.
|
||
- If an array of IDs is provided, only the chunks matching those IDs are deleted.
|
||
|
||
##### delete_all: `bool`
|
||
|
||
Whether to delete all chunks in the current document when `ids` is omitted, or set to `None` or an empty list. Defaults to `False`.
|
||
|
||
#### Returns
|
||
|
||
- Success: No value is returned.
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
dataset = rag_object.list_datasets(id="123")
|
||
dataset = dataset[0]
|
||
doc = dataset.list_documents(id="wdfxb5t547d")
|
||
doc = doc[0]
|
||
chunk = doc.add_chunk(content="xxxxxxx")
|
||
doc.delete_chunks(["id_1","id_2"])
|
||
doc.delete_chunks(delete_all=True)
|
||
```
|
||
|
||
---
|
||
|
||
### Update chunk
|
||
|
||
```python
|
||
Chunk.update(update_message: dict)
|
||
```
|
||
|
||
Updates content or configurations for the current chunk.
|
||
|
||
#### Parameters
|
||
|
||
##### update_message: `dict[str, str|list[str]|bool]` *Required*
|
||
|
||
A dictionary representing the attributes to update, with the following keys:
|
||
|
||
- `"content"`: `string` The text content of the chunk.
|
||
- `"important_keywords"`: `list[str]` A list of key terms or phrases to tag with the chunk.
|
||
- `"questions"`: `list[str]` A list of questions associated with the chunk.
|
||
- `"tag_kwd"`: `list[str]` A list of tag keywords to associate with the chunk.
|
||
- `"positions"`: `list` Updated source positions for the chunk.
|
||
- `"available"`: `bool` The chunk's availability status in the dataset. Value options:
|
||
- `False`: Unavailable
|
||
- `True`: Available (default)
|
||
- `"image_base64"`: `string` Base64-encoded image content to associate with the chunk.
|
||
|
||
#### Returns
|
||
|
||
- Success: No value is returned.
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
dataset = rag_object.list_datasets(id="123")
|
||
dataset = dataset[0]
|
||
doc = dataset.list_documents(id="wdfxb5t547d")
|
||
doc = doc[0]
|
||
chunk = doc.add_chunk(content="xxxxxxx")
|
||
chunk.update({"content":"sdfx..."})
|
||
```
|
||
|
||
---
|
||
|
||
### Retrieve chunks
|
||
|
||
```python
|
||
RAGFlow.retrieve(question:str="", dataset_ids:list[str]=None, document_ids=list[str]=None, page:int=1, page_size:int=30, similarity_threshold:float=0.2, vector_similarity_weight:float=0.3, top_k:int=1024,rerank_id:str=None,keyword:bool=False,cross_languages:list[str]=None,metadata_condition: dict=None) -> list[Chunk]
|
||
```
|
||
|
||
Retrieves chunks from specified datasets.
|
||
|
||
#### Parameters
|
||
|
||
##### question: `string`, *Required*
|
||
|
||
The user query or query keywords. Defaults to `""`.
|
||
|
||
##### dataset_ids: `list[str]`, *Required*
|
||
|
||
The IDs of the datasets to search. Defaults to `None`.
|
||
|
||
##### document_ids: `list[str]`
|
||
|
||
The IDs of the documents to search. Defaults to `None`. You must ensure all selected documents use the same embedding model. Otherwise, an error will occur.
|
||
|
||
##### page: `int`
|
||
|
||
The starting index for the documents to retrieve. Defaults to `1`.
|
||
|
||
##### page_size: `int`
|
||
|
||
The maximum number of chunks to retrieve. Defaults to `30`.
|
||
|
||
##### Similarity_threshold: `float`
|
||
|
||
The minimum similarity score. Defaults to `0.2`.
|
||
|
||
##### vector_similarity_weight: `float`
|
||
|
||
The weight of vector cosine similarity. Defaults to `0.3`. If x represents the vector cosine similarity, then (1 - x) is the term similarity weight.
|
||
|
||
##### top_k: `int`
|
||
|
||
The number of chunks engaged in vector cosine computation. Defaults to `1024`.
|
||
|
||
##### rerank_id: `string`
|
||
|
||
The ID of the rerank model. Defaults to `None`.
|
||
|
||
##### keyword: `bool`
|
||
|
||
Indicates whether to enable keyword-based matching:
|
||
|
||
- `True`: Enable keyword-based matching.
|
||
- `False`: Disable keyword-based matching (default).
|
||
|
||
##### cross_languages: `list[string]`
|
||
|
||
The languages that should be translated into, in order to achieve keywords retrievals in different languages.
|
||
|
||
##### metadata_condition: `dict`
|
||
|
||
filter condition for `meta_fields`.
|
||
|
||
#### Returns
|
||
|
||
- Success: A list of `Chunk` objects representing the document chunks.
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
dataset = rag_object.list_datasets(name="ragflow")
|
||
dataset = dataset[0]
|
||
name = 'ragflow_test.txt'
|
||
path = './test_data/ragflow_test.txt'
|
||
documents =[{"display_name":"test_retrieve_chunks.txt","blob":open(path, "rb").read()}]
|
||
docs = dataset.upload_documents(documents)
|
||
doc = docs[0]
|
||
doc.add_chunk(content="This is a chunk addition test")
|
||
for c in rag_object.retrieve(dataset_ids=[dataset.id],document_ids=[doc.id]):
|
||
print(c)
|
||
```
|
||
|
||
---
|
||
|
||
## CHAT ASSISTANT MANAGEMENT
|
||
|
||
---
|
||
|
||
### Create chat assistant
|
||
|
||
```python
|
||
RAGFlow.create_chat(
|
||
name: str,
|
||
icon: str = "",
|
||
dataset_ids: list[str] | None = None,
|
||
llm_id: str | None = None,
|
||
llm_setting: dict | None = None,
|
||
prompt_config: dict | None = None,
|
||
**kwargs
|
||
) -> Chat
|
||
```
|
||
|
||
Creates a chat assistant.
|
||
|
||
#### Parameters
|
||
|
||
##### name: `string`, *Required*
|
||
|
||
The name of the chat assistant.
|
||
|
||
##### icon: `string`
|
||
|
||
Base64 encoding of the avatar. Defaults to `""`.
|
||
|
||
##### dataset_ids: `list[str]`
|
||
|
||
The IDs of the associated datasets. Defaults to `[]`. When omitted or empty, the SDK creates an empty chat assistant and you can attach datasets later.
|
||
|
||
##### llm_id: `str | None`
|
||
|
||
The LLM model name/ID to use. If `None`, the user’s default chat model is used. Defaults to `None`.
|
||
|
||
##### llm_setting: `dict | None`
|
||
|
||
Configuration for LLM generation parameters. Defaults to `None` (server-side defaults apply). Supported keys:
|
||
|
||
- `"temperature"`: `float` Controls the randomness of the model's output. Higher values increase creativity, while lower values make responses more deterministic. Defaults to `0.1`.
|
||
- `"top_p"`: `float` Sets the nucleus sampling threshold. The model considers only the results of the tokens with `top_p` probability mass. Defaults to `0.3`.
|
||
- `"presence_penalty"`: `float` Penalizes tokens based on whether they have appeared in the text so far, increasing the likelihood of the model talking about new topics. Defaults to `0.4`.
|
||
- `"frequency_penalty"`: `float` Penalizes tokens based on their existing frequency in the text, decreasing the likelihood of repeating the same lines. Defaults to `0.7`.
|
||
- `"max_token"`: `int` The maximum number of tokens to generate in the response. Defaults to `512`.
|
||
|
||
##### prompt_config: `dict | None`
|
||
|
||
Instructions and behavioral settings for the LLM. Defaults to `None` (server-side defaults apply). Supported keys:
|
||
|
||
- `"system"`: `string` The core system prompt or instructions defining the assistant's persona.
|
||
- `"empty_response"`: `string` The specific message returned when no relevant information is retrieved. If left blank, the LLM will generate its own response. Defaults to `None`.
|
||
- `"prologue"`: `string` The initial greeting displayed to the user. Defaults to `"Hi! I’m your assistant. What can I do for you?"`.
|
||
- `"quote"`: `boolean` Determines whether the assistant should include citations or source references in its responses. Defaults to `True`.
|
||
- `"parameters"`: `list[dict]` A list of variables utilized within the system prompt. Each entry must include a `"key"` (`string`) and an `"optional"` (`boolean`) status. The `knowledge` key is reserved for retrieved context chunks. Default: `[{"key": "knowledge", "optional": true}]`.
|
||
|
||
#### Returns
|
||
|
||
- Success: A `Chat` object representing the chat assistant.
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
datasets = rag_object.list_datasets(name="kb_1")
|
||
dataset_ids = []
|
||
for dataset in datasets:
|
||
dataset_ids.append(dataset.id)
|
||
assistant = rag_object.create_chat("Miss R", dataset_ids=dataset_ids)
|
||
```
|
||
|
||
---
|
||
|
||
### Update chat assistant
|
||
|
||
```python
|
||
Chat.update(update_message: dict)
|
||
```
|
||
|
||
Performs a partial update to the configuration settings for the current chat assistant.
|
||
|
||
`Chat.update()` utilizes the `PATCH /api/v1/chats/{chat_id}` endpoint. Only the specified keys are modified, while all other existing fields are preserved.
|
||
|
||
#### Parameters
|
||
|
||
##### update_message: `dict`, *Required*
|
||
|
||
A dictionary containing the attributes to be updated. Supported keys include:
|
||
|
||
- `"name"`: `string` The updated name of the chat assistant.
|
||
- `"icon"`: `string` A Base64-encoded string representing the assistant's avatar.
|
||
- `"dataset_ids"`: `list[string]` A list of unique identifiers for the datasets associated with the assistant.
|
||
- `"llm_id"`: `string` The unique identifier or name of the LLM to be used.
|
||
- `"llm_setting"`: `dict` Configuration for LLM generation parameters:
|
||
- `"temperature"`: `float` Controls the randomness of the model's output.
|
||
- `"top_p"`: `float` Sets the nucleus sampling threshold.
|
||
- `"presence_penalty"`: `float` Penalizes tokens based on whether they have already appeared in the text.
|
||
- `"frequency_penalty"`: `float` Penalizes tokens based on their existing frequency in the text.
|
||
- `"max_token"`: `int` The maximum number of tokens to generate in the response.
|
||
- `"prompt_config"`: `dict` Instructions and behavioral settings for the LLM:
|
||
- `"system"`: `string` The core system prompt or instructions defining the assistant's persona.
|
||
- `"empty_response"`: `string` The message returned when no relevant information is retrieved. Leave blank to allow the LLM to improvise.
|
||
- `"prologue"`: `string` The initial greeting displayed to the user.
|
||
- `"quote"`: `boolean` Determines whether the assistant should include citations or source references.
|
||
- `"parameters"`: `list[dict]` Variables used within the system prompt (e.g., the reserved `knowledge` key).
|
||
- `"similarity_threshold"`: `float` The minimum similarity score required for retrieved context chunks. Defaults to `0.2`.
|
||
- `"vector_similarity_weight"`: `float` The weight assigned to vector cosine similarity within the hybrid search score. Defaults to `0.3`.
|
||
- `"top_n"`: `int` The number of top-ranked chunks provided to the LLM as context. Defaults to `6`.
|
||
- `"top_k"`: `int` The size of the initial candidate pool retrieved for reranking. Defaults to `1024`.
|
||
- `"rerank_id"`: `string` The unique identifier for the reranking model. If left empty, standard vector cosine similarity is used for ranking.
|
||
|
||
#### Returns
|
||
|
||
- Success: No value is returned.
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
datasets = rag_object.list_datasets(name="kb_1")
|
||
dataset_id = datasets[0].id
|
||
assistant = rag_object.create_chat("Miss R", dataset_ids=[dataset_id])
|
||
assistant.update({"name": "Stefan", "llm_setting": {"temperature": 0.8}, "top_n": 8})
|
||
```
|
||
|
||
---
|
||
|
||
### Delete chat assistants
|
||
|
||
```python
|
||
RAGFlow.delete_chats(ids: list[str] | None = None, delete_all: bool = False)
|
||
```
|
||
|
||
Deletes chat assistants by ID.
|
||
|
||
#### Parameters
|
||
|
||
##### ids: `list[str]` or `None`
|
||
|
||
The IDs of the chat assistants to delete. Defaults to `None`.
|
||
|
||
- If omitted, or set to `null` or an empty array, no chat assistants are deleted.
|
||
- If an array of IDs is provided, only the chat assistants matching those IDs are deleted.
|
||
|
||
##### delete_all: `bool`
|
||
|
||
Whether to delete all chat assistants owned by the current user when `ids` is omitted, or set to `None` or an empty list. Defaults to `False`.
|
||
|
||
#### Returns
|
||
|
||
- Success: No value is returned.
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
rag_object.delete_chats(ids=["id_1","id_2"])
|
||
rag_object.delete_chats(delete_all=True)
|
||
```
|
||
|
||
---
|
||
|
||
### List chat assistants
|
||
|
||
```python
|
||
RAGFlow.list_chats(
|
||
page: int = 1,
|
||
page_size: int = 30,
|
||
orderby: str = "create_time",
|
||
desc: bool = True,
|
||
id: str | None = None,
|
||
name: str | None = None,
|
||
keywords: str | None = None,
|
||
owner_ids: str | list[str] | None = None,
|
||
parser_id: str | None = None
|
||
) -> list[Chat]
|
||
```
|
||
|
||
Lists chat assistants.
|
||
|
||
#### Parameters
|
||
|
||
##### page: `int`
|
||
|
||
Specifies the page on which the chat assistants will be displayed. Defaults to `1`.
|
||
|
||
##### page_size: `int`
|
||
|
||
The number of chat assistants on each page. Defaults to `30`.
|
||
|
||
##### orderby: `string`
|
||
|
||
The attribute by which the results are sorted. Available options:
|
||
|
||
- `"create_time"` (default)
|
||
- `"update_time"`
|
||
|
||
##### desc: `bool`
|
||
|
||
Indicates whether the retrieved chat assistants should be sorted in descending order. Defaults to `True`.
|
||
|
||
##### id: `string | None`
|
||
|
||
Exact match on chat assistant ID. Defaults to `None`.
|
||
|
||
Filters results by the exact name of the chat assistant. Defaults to `None`.
|
||
|
||
##### keywords: `string | None`
|
||
|
||
Performs a case-insensitive fuzzy search against chat assistant names. Defaults to `None`.
|
||
|
||
##### owner_ids: `string | list[string] | None`
|
||
|
||
Filters results by one or more owner tenant IDs. Defaults to `None`.
|
||
|
||
##### parser_id: `string | None`
|
||
|
||
Filters results by a specific parser type identifier. Defaults to `None`.
|
||
|
||
If `id` or `name` is specified, exact filtering takes precedence over the fuzzy matching provided by `keywords`.
|
||
|
||
#### Returns
|
||
|
||
- Success: A list of `Chat` objects.
|
||
- Failure: `Exception`.
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
for assistant in rag_object.list_chats():
|
||
print(assistant)
|
||
```
|
||
|
||
---
|
||
|
||
## SESSION MANAGEMENT
|
||
|
||
---
|
||
|
||
### Create session with chat assistant
|
||
|
||
```python
|
||
Chat.create_session(name: str = "New session") -> Session
|
||
```
|
||
|
||
Creates a session with the current chat assistant.
|
||
|
||
#### Parameters
|
||
|
||
##### name: `string`
|
||
|
||
The name of the chat session to create.
|
||
|
||
#### Returns
|
||
|
||
- Success: A `Session` object containing the following attributes:
|
||
- `id`: `string` The auto-generated unique identifier of the created session.
|
||
- `name`: `string` The name of the created session.
|
||
- `message`: `list[Message]` The opening message of the created session. Default: `[{"role": "assistant", "content": "Hi! I am your assistant, can I help you?"}]`
|
||
- `chat_id`: `string` The ID of the associated chat assistant.
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
assistant = rag_object.list_chats(name="Miss R")
|
||
assistant = assistant[0]
|
||
session = assistant.create_session()
|
||
```
|
||
|
||
---
|
||
|
||
### Update chat assistant's session
|
||
|
||
```python
|
||
Session.update(update_message: dict)
|
||
```
|
||
|
||
Updates the current session of the current chat assistant.
|
||
|
||
#### Parameters
|
||
|
||
##### update_message: `dict[str, Any]`, *Required*
|
||
|
||
A dictionary representing the attributes to update, with only one key:
|
||
|
||
- `"name"`: `string` The revised name of the session.
|
||
|
||
#### Returns
|
||
|
||
- Success: No value is returned.
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
assistant = rag_object.list_chats(name="Miss R")
|
||
assistant = assistant[0]
|
||
session = assistant.create_session("session_name")
|
||
session.update({"name": "updated_name"})
|
||
```
|
||
|
||
---
|
||
|
||
### List chat assistant's sessions
|
||
|
||
```python
|
||
Chat.list_sessions(
|
||
page: int = 1,
|
||
page_size: int = 30,
|
||
orderby: str = "create_time",
|
||
desc: bool = True,
|
||
id: str = None,
|
||
name: str = None,
|
||
user_id: str = None
|
||
) -> list[Session]
|
||
```
|
||
|
||
Lists sessions associated with the current chat assistant.
|
||
|
||
#### Parameters
|
||
|
||
##### page: `int`
|
||
|
||
Specifies the page on which the sessions will be displayed. Defaults to `1`.
|
||
|
||
##### page_size: `int`
|
||
|
||
The number of sessions on each page. Defaults to `30`.
|
||
|
||
##### orderby: `string`
|
||
|
||
The field by which sessions should be sorted. Available options:
|
||
|
||
- `"create_time"` (default)
|
||
- `"update_time"`
|
||
|
||
##### desc: `bool`
|
||
|
||
Indicates whether the retrieved sessions should be sorted in descending order. Defaults to `True`.
|
||
|
||
##### id: `string`
|
||
|
||
The ID of the chat session to retrieve. Defaults to `None`.
|
||
|
||
##### name: `string`
|
||
|
||
The name of the chat session to retrieve. Defaults to `None`.
|
||
|
||
##### user_id: `str`
|
||
|
||
The optional user-defined ID to filter sessions by. Defaults to `None`.
|
||
|
||
#### Returns
|
||
|
||
- Success: A list of `Session` objects associated with the current chat assistant.
|
||
- Failure: `Exception`.
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
assistant = rag_object.list_chats(name="Miss R")
|
||
assistant = assistant[0]
|
||
for session in assistant.list_sessions():
|
||
print(session)
|
||
```
|
||
|
||
---
|
||
|
||
### Delete chat assistant's sessions
|
||
|
||
```python
|
||
Chat.delete_sessions(ids: list[str] | None = None, delete_all: bool = False)
|
||
```
|
||
|
||
Deletes sessions of the current chat assistant by ID.
|
||
|
||
#### Parameters
|
||
|
||
##### ids: `list[str]` or `None`
|
||
|
||
The IDs of the sessions to delete. Defaults to `None`.
|
||
|
||
- If omitted, or set to `null` or an empty array, no sessions are deleted.
|
||
- If an array of IDs is provided, only the sessions matching those IDs are deleted.
|
||
|
||
##### delete_all: `bool`
|
||
|
||
Whether to delete all sessions of the current chat assistant when `ids` is omitted, or set to `None` or an empty list. Defaults to `False`.
|
||
|
||
#### Returns
|
||
|
||
- Success: No value is returned.
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
assistant = rag_object.list_chats(name="Miss R")
|
||
assistant = assistant[0]
|
||
assistant.delete_sessions(ids=["id_1","id_2"])
|
||
assistant.delete_sessions(delete_all=True)
|
||
```
|
||
|
||
---
|
||
|
||
### Converse with chat assistant
|
||
|
||
```python
|
||
Session.ask(question: str = "", stream: bool = False, **kwargs) -> Optional[Message, iter[Message]]
|
||
```
|
||
|
||
Asks a specified chat assistant a question to start an AI-powered conversation.
|
||
|
||
:::tip NOTE
|
||
In streaming mode, not all responses include a reference, as this depends on the system's judgement.
|
||
:::
|
||
|
||
#### Parameters
|
||
|
||
##### question: `string`, *Required*
|
||
|
||
The question to start an AI-powered conversation. Default to `""`
|
||
|
||
##### stream: `bool`
|
||
|
||
Indicates whether to output responses in a streaming way:
|
||
|
||
- `True`: Enable streaming (default).
|
||
- `False`: Disable streaming.
|
||
|
||
##### **kwargs
|
||
|
||
The parameters in prompt(system).
|
||
|
||
#### Returns
|
||
|
||
- A `Message` object containing the response to the question if `stream` is set to `False`.
|
||
- An iterator containing multiple `message` objects (`iter[Message]`) if `stream` is set to `True`
|
||
|
||
The following shows the attributes of a `Message` object:
|
||
|
||
##### id: `string`
|
||
|
||
The auto-generated message ID.
|
||
|
||
##### content: `string`
|
||
|
||
The content of the message. Defaults to `"Hi! I am your assistant, can I help you?"`.
|
||
|
||
##### reference: `list[Chunk]`
|
||
|
||
A list of `Chunk` objects representing references to the message, each containing the following attributes:
|
||
|
||
- `id` `string`
|
||
The chunk ID.
|
||
- `content` `string`
|
||
The content of the chunk.
|
||
- `img_id` `string`
|
||
The ID of the snapshot of the chunk. Applicable only when the source of the chunk is an image, PPT, PPTX, or PDF file.
|
||
- `document_id` `string`
|
||
The ID of the referenced document.
|
||
- `document_name` `string`
|
||
The name of the referenced document.
|
||
- `document_metadata` `dict`
|
||
Optional document metadata, returned only when `extra_body.reference_metadata.include` is `true`.
|
||
- `position` `list[str]`
|
||
The location information of the chunk within the referenced document.
|
||
- `dataset_id` `string`
|
||
The ID of the dataset to which the referenced document belongs.
|
||
- `similarity` `float`
|
||
A composite similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity. It is the weighted sum of `vector_similarity` and `term_similarity`.
|
||
- `vector_similarity` `float`
|
||
A vector similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between vector embeddings.
|
||
- `term_similarity` `float`
|
||
A keyword similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between keywords.
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
assistant = rag_object.list_chats(name="Miss R")
|
||
assistant = assistant[0]
|
||
session = assistant.create_session()
|
||
|
||
print("\n==================== Miss R =====================\n")
|
||
print("Hello. What can I do for you?")
|
||
|
||
while True:
|
||
question = input("\n==================== User =====================\n> ")
|
||
print("\n==================== Miss R =====================\n")
|
||
|
||
cont = ""
|
||
for ans in session.ask(question, stream=True):
|
||
print(ans.content[len(cont):], end='', flush=True)
|
||
cont = ans.content
|
||
```
|
||
|
||
---
|
||
|
||
### Create session with agent
|
||
|
||
```python
|
||
Agent.create_session(**kwargs) -> Session
|
||
```
|
||
|
||
Creates a session with the current agent.
|
||
|
||
#### Parameters
|
||
|
||
##### **kwargs
|
||
|
||
The parameters in `begin` component.
|
||
|
||
Also supports:
|
||
|
||
- `release` (`bool | str`, optional): When set to `True` (or `"true"`), creates a session with the published agent app only.
|
||
|
||
#### Returns
|
||
|
||
- Success: A `Session` object containing the following attributes:
|
||
- `id`: `string` The auto-generated unique identifier of the created session.
|
||
- `message`: `list[Message]` The messages of the created session assistant. Default: `[{"role": "assistant", "content": "Hi! I am your assistant, can I help you?"}]`
|
||
- `agent_id`: `string` The ID of the associated agent.
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow, Agent
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
agent_id = "AGENT_ID"
|
||
agent = rag_object.get_agent(agent_id)
|
||
session = agent.create_session()
|
||
# Or create in release mode:
|
||
# session = agent.create_session(release=True)
|
||
```
|
||
|
||
---
|
||
|
||
### Converse with agent
|
||
|
||
```python
|
||
Session.ask(question: str = "", stream: bool = False, **kwargs) -> Optional[Message | iter[Message]]
|
||
```
|
||
|
||
Asks a specified agent through the unified completion endpoint.
|
||
|
||
:::tip NOTE
|
||
In streaming mode, not all responses include a reference, as this depends on the system's judgement.
|
||
:::
|
||
|
||
#### Parameters
|
||
|
||
##### question: `string`
|
||
|
||
The user message sent to the agent. If the **Begin** component takes parameters, `question` can be an empty string.
|
||
|
||
##### stream: `bool`
|
||
|
||
Indicates whether to output responses in a streaming way:
|
||
|
||
- `True`: Enable streaming.
|
||
- `False`: Disable streaming.
|
||
|
||
##### kwargs: `dict`
|
||
|
||
Additional request parameters forwarded to the completion API. Common options:
|
||
|
||
- `inputs`: Variables defined in the **Begin** component.
|
||
- `session_id`: Continue an existing session instead of creating a new one.
|
||
- `release`: Use the latest published version of the agent.
|
||
- `return_trace`: Include execution trace information in the response.
|
||
- Other custom Begin component parameters supported by the current workflow.
|
||
|
||
#### Returns
|
||
|
||
- A `Message` object containing the response to the question if `stream` is set to `False`
|
||
- An iterator containing multiple `message` objects (`iter[Message]`) if `stream` is set to `True`
|
||
|
||
The following shows the attributes of a `Message` object:
|
||
|
||
##### id: `string`
|
||
|
||
The auto-generated message ID.
|
||
|
||
##### content: `string`
|
||
|
||
The content of the message. Defaults to `"Hi! I am your assistant, can I help you?"`.
|
||
|
||
##### reference: `list[Chunk]`
|
||
|
||
A list of `Chunk` objects representing references to the message, each containing the following attributes:
|
||
|
||
- `id` `string`
|
||
The chunk ID.
|
||
- `content` `string`
|
||
The content of the chunk.
|
||
- `image_id` `string`
|
||
The ID of the snapshot of the chunk. Applicable only when the source of the chunk is an image, PPT, PPTX, or PDF file.
|
||
- `document_id` `string`
|
||
The ID of the referenced document.
|
||
- `document_name` `string`
|
||
The name of the referenced document.
|
||
- `document_metadata` `dict`
|
||
Optional document metadata, returned only when `extra_body.reference_metadata.include` is `true`.
|
||
- `position` `list[str]`
|
||
The location information of the chunk within the referenced document.
|
||
- `dataset_id` `string`
|
||
The ID of the dataset to which the referenced document belongs.
|
||
- `similarity` `float`
|
||
A composite similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity. It is the weighted sum of `vector_similarity` and `term_similarity`.
|
||
- `vector_similarity` `float`
|
||
A vector similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between vector embeddings.
|
||
- `term_similarity` `float`
|
||
A keyword similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between keywords.
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow, Agent
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
AGENT_id = "AGENT_ID"
|
||
agent = rag_object.get_agent(AGENT_id)
|
||
session = agent.create_session()
|
||
|
||
print("\n===== Miss R ====\n")
|
||
print("Hello. What can I do for you?")
|
||
|
||
while True:
|
||
question = input("\n===== User ====\n> ")
|
||
print("\n==== Miss R ====\n")
|
||
|
||
cont = ""
|
||
for ans in session.ask(question, stream=True):
|
||
print(ans.content[len(cont):], end='', flush=True)
|
||
cont = ans.content
|
||
```
|
||
|
||
Use Begin inputs and request trace output:
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow, Agent
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
agent = rag_object.get_agent("AGENT_ID")
|
||
session = agent.create_session()
|
||
|
||
message = session.ask(
|
||
"",
|
||
stream=False,
|
||
inputs={
|
||
"line_var": {
|
||
"type": "line",
|
||
"value": "I am line_var",
|
||
}
|
||
},
|
||
return_trace=True,
|
||
)
|
||
|
||
print(message.content)
|
||
print(message.reference)
|
||
```
|
||
|
||
---
|
||
|
||
### List agent sessions
|
||
|
||
```python
|
||
Agent.list_sessions(
|
||
page: int = 1,
|
||
page_size: int = 30,
|
||
orderby: str = "update_time",
|
||
desc: bool = True,
|
||
id: str = None
|
||
) -> List[Session]
|
||
```
|
||
|
||
Lists sessions associated with the current agent.
|
||
|
||
#### Parameters
|
||
|
||
##### page: `int`
|
||
|
||
Specifies the page on which the sessions will be displayed. Defaults to `1`.
|
||
|
||
##### page_size: `int`
|
||
|
||
The number of sessions on each page. Defaults to `30`.
|
||
|
||
##### orderby: `string`
|
||
|
||
The field by which sessions should be sorted. Available options:
|
||
|
||
- `"create_time"`
|
||
- `"update_time"`(default)
|
||
|
||
##### desc: `bool`
|
||
|
||
Indicates whether the retrieved sessions should be sorted in descending order. Defaults to `True`.
|
||
|
||
##### id: `string`
|
||
|
||
The ID of the agent session to retrieve. Defaults to `None`.
|
||
|
||
#### Returns
|
||
|
||
- Success: A list of `Session` objects associated with the current agent.
|
||
- Failure: `Exception`.
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
AGENT_id = "AGENT_ID"
|
||
agent = rag_object.get_agent(AGENT_id)
|
||
sessons = agent.list_sessions()
|
||
for session in sessions:
|
||
print(session)
|
||
```
|
||
---
|
||
### Delete agent's sessions
|
||
|
||
```python
|
||
Agent.delete_sessions(ids: list[str] | None = None, delete_all: bool = False)
|
||
```
|
||
|
||
Deletes sessions of an agent by ID.
|
||
|
||
#### Parameters
|
||
|
||
##### ids: `list[str]` or `None`
|
||
|
||
The IDs of the sessions to delete. Defaults to `None`.
|
||
|
||
- If omitted, or set to `None` or an empty array, no sessions are deleted.
|
||
- If an array of IDs is provided, only the sessions matching those IDs are deleted.
|
||
|
||
##### delete_all: `bool`
|
||
|
||
Whether to delete all sessions of the current agent when `ids` is omitted, or set to `None` or an empty list. Defaults to `False`.
|
||
|
||
#### Returns
|
||
|
||
- Success: No value is returned.
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
AGENT_id = "AGENT_ID"
|
||
agent = rag_object.get_agent(AGENT_id)
|
||
agent.delete_sessions(ids=["id_1","id_2"])
|
||
agent.delete_sessions(delete_all=True)
|
||
```
|
||
|
||
---
|
||
|
||
## AGENT MANAGEMENT
|
||
|
||
---
|
||
|
||
### List agents
|
||
|
||
```python
|
||
RAGFlow.list_agents(
|
||
page: int = 1,
|
||
page_size: int = 30,
|
||
orderby: str = "update_time",
|
||
desc: bool = True
|
||
) -> List[Agent]
|
||
```
|
||
|
||
Lists agents. This is a collection API and always returns a list.
|
||
|
||
#### Parameters
|
||
|
||
##### page: `int`
|
||
|
||
Specifies the page on which the agents will be displayed. Defaults to `1`.
|
||
|
||
##### page_size: `int`
|
||
|
||
The number of agents on each page. Defaults to `30`.
|
||
|
||
##### orderby: `string`
|
||
|
||
The attribute by which the results are sorted. Available options:
|
||
|
||
- `"create_time"`
|
||
- `"update_time"` (default)
|
||
|
||
##### desc: `bool`
|
||
|
||
Indicates whether the retrieved agents should be sorted in descending order. Defaults to `True`.
|
||
|
||
#### Returns
|
||
|
||
- Success: A list of `Agent` objects.
|
||
- Failure: `Exception`.
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
for agent in rag_object.list_agents():
|
||
print(agent)
|
||
```
|
||
|
||
---
|
||
|
||
### Get agent
|
||
|
||
```python
|
||
RAGFlow.get_agent(agent_id: str) -> Agent
|
||
```
|
||
|
||
Gets a single agent by ID and returns the detailed agent payload.
|
||
|
||
#### Parameters
|
||
|
||
##### agent_id: `string`
|
||
|
||
The ID of the agent to retrieve.
|
||
|
||
#### Returns
|
||
|
||
- Success: An `Agent` object.
|
||
- Failure: `Exception`.
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
agent = rag_object.get_agent("AGENT_ID")
|
||
print(agent)
|
||
```
|
||
|
||
---
|
||
|
||
### Create agent
|
||
|
||
```python
|
||
RAGFlow.create_agent(
|
||
title: str,
|
||
dsl: dict,
|
||
description: str | None = None
|
||
) -> None
|
||
```
|
||
|
||
Create an agent.
|
||
|
||
#### Parameters
|
||
|
||
##### title: `string`
|
||
|
||
Specifies the title of the agent.
|
||
|
||
##### dsl: `dict`
|
||
|
||
Specifies the canvas DSL of the agent.
|
||
|
||
##### description: `string`
|
||
|
||
The description of the agent. Defaults to `None`.
|
||
|
||
#### Returns
|
||
|
||
- Success: Nothing.
|
||
- Failure: `Exception`.
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
rag_object.create_agent(
|
||
title="Test Agent",
|
||
description="A test agent",
|
||
dsl={
|
||
# ... canvas DSL here ...
|
||
}
|
||
)
|
||
```
|
||
|
||
---
|
||
|
||
### Update agent
|
||
|
||
```python
|
||
RAGFlow.update_agent(
|
||
agent_id: str,
|
||
title: str | None = None,
|
||
description: str | None = None,
|
||
dsl: dict | None = None
|
||
) -> None
|
||
```
|
||
|
||
Update an agent.
|
||
|
||
#### Parameters
|
||
|
||
##### agent_id: `string`
|
||
|
||
Specifies the id of the agent to be updated.
|
||
|
||
##### title: `string`
|
||
|
||
Specifies the new title of the agent. `None` if you do not want to update this.
|
||
|
||
##### dsl: `dict`
|
||
|
||
Specifies the new canvas DSL of the agent. `None` if you do not want to update this.
|
||
|
||
##### description: `string`
|
||
|
||
The new description of the agent. `None` if you do not want to update this.
|
||
|
||
#### Returns
|
||
|
||
- Success: Nothing.
|
||
- Failure: `Exception`.
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
rag_object.update_agent(
|
||
agent_id="58af890a2a8911f0a71a11b922ed82d6",
|
||
title="Test Agent",
|
||
description="A test agent",
|
||
dsl={
|
||
# ... canvas DSL here ...
|
||
}
|
||
)
|
||
```
|
||
|
||
---
|
||
|
||
### Delete agent
|
||
|
||
```python
|
||
RAGFlow.delete_agent(
|
||
agent_id: str
|
||
) -> None
|
||
```
|
||
|
||
Delete an agent.
|
||
|
||
#### Parameters
|
||
|
||
##### agent_id: `string`
|
||
|
||
Specifies the id of the agent to be deleted.
|
||
|
||
#### Returns
|
||
|
||
- Success: Nothing.
|
||
- Failure: `Exception`.
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
rag_object.delete_agent("58af890a2a8911f0a71a11b922ed82d6")
|
||
```
|
||
|
||
---
|
||
|
||
|
||
|
||
## Memory Management
|
||
|
||
### Create Memory
|
||
|
||
```python
|
||
Ragflow.create_memory(
|
||
name: str,
|
||
memory_type: list[str],
|
||
embd_id: str,
|
||
llm_id: str
|
||
) -> Memory
|
||
```
|
||
|
||
Create a new memory.
|
||
|
||
#### Parameters
|
||
|
||
##### name: `string`, *Required*
|
||
|
||
The unique name of the memory to create. It must adhere to the following requirements:
|
||
|
||
- Basic Multilingual Plane (BMP) only
|
||
- Maximum 128 characters
|
||
|
||
##### memory_type: `list[str]`, *Required*
|
||
|
||
Specifies the types of memory to extract. Available options:
|
||
|
||
- `raw`: The raw dialogue content between the user and the agent . *Required by default*.
|
||
- `semantic`: General knowledge and facts about the user and world.
|
||
- `episodic`: Time-stamped records of specific events and experiences.
|
||
- `procedural`: Learned skills, habits, and automated procedures.
|
||
|
||
##### embd_id: `string`, *Required*
|
||
|
||
The name of the embedding model to use. For example: `"BAAI/bge-large-zh-v1.5@BAAI"`
|
||
|
||
- Maximum 255 characters
|
||
- Must follow `model_name@model_factory` format
|
||
|
||
##### llm_id: `string`, *Required*
|
||
|
||
The name of the chat model to use. For example: `"glm-4-flash@ZHIPU-AI"`
|
||
|
||
- Maximum 255 characters
|
||
- Must follow `model_name@model_factory` format
|
||
|
||
#### Returns
|
||
|
||
- Success: A `memory` object.
|
||
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import RAGFlow
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
memory = rag_obj.create_memory("name", ["raw"], "BAAI/bge-large-zh-v1.5@SILICONFLOW", "glm-4-flash@ZHIPU-AI")
|
||
```
|
||
|
||
---
|
||
|
||
|
||
|
||
### Update Memory
|
||
|
||
```python
|
||
Memory.update(
|
||
update_dict: dict
|
||
) -> Memory
|
||
```
|
||
|
||
Updates configurations for a specified memory.
|
||
|
||
#### Parameters
|
||
|
||
##### update_dict: `dict`, *Required*
|
||
|
||
Configurations to update. Available configurations:
|
||
|
||
- `name`: `string`, *Optional*
|
||
|
||
The revised name of the memory.
|
||
|
||
- Basic Multilingual Plane (BMP) only
|
||
- Maximum 128 characters, *Optional*
|
||
|
||
- `avatar`: `string`, *Optional*
|
||
|
||
The updated base64 encoding of the avatar.
|
||
|
||
- Maximum 65535 characters
|
||
|
||
- `permission`: `enum<string>`, *Optional*
|
||
|
||
The updated memory permission. Available options:
|
||
|
||
- `"me"`: (Default) Only you can manage the memory.
|
||
- `"team"`: All team members can manage the memory.
|
||
|
||
- `llm_id`: `string`, *Optional*
|
||
|
||
The name of the chat model to use. For example: `"glm-4-flash@ZHIPU-AI"`
|
||
|
||
- Maximum 255 characters
|
||
- Must follow `model_name@model_factory` format
|
||
|
||
- `description`: `string`, *Optional*
|
||
|
||
The description of the memory. Defaults to `None`.
|
||
|
||
- `memory_size`: `int`, *Optional*
|
||
|
||
Defaults to `5*1024*1024` Bytes. Accounts for each message's content + its embedding vector (≈ Content + Dimensions × 8 Bytes). Example: A 1 KB message with 1024-dim embedding uses ~9 KB. The 5 MB default limit holds ~500 such messages.
|
||
|
||
- Maximum 10 * 1024 * 1024 Bytes
|
||
|
||
- `forgetting_policy`: `enum<string>`, *Optional*
|
||
|
||
Evicts existing data based on the chosen policy when the size limit is reached, freeing up space for new messages. Available options:
|
||
|
||
- `"FIFO"`: (Default) Prioritize messages with the earliest `forget_at` time for removal. When the pool of messages that have `forget_at` set is insufficient, it falls back to selecting messages in ascending order of their `valid_at` (oldest first).
|
||
|
||
- `temperature`: (*Body parameter*), `float`, *Optional*
|
||
|
||
Adjusts output randomness. Lower = more deterministic; higher = more creative.
|
||
|
||
- Range [0, 1]
|
||
|
||
- `system_prompt`: (*Body parameter*), `string`, *Optional*
|
||
|
||
Defines the system-level instructions and role for the AI assistant. It is automatically assembled based on the selected `memory_type` by `PromptAssembler` in `memory/utils/prompt_util.py`. This prompt sets the foundational behavior and context for the entire conversation.
|
||
|
||
- Keep the `OUTPUT REQUIREMENTS` and `OUTPUT FORMAT` parts unchanged.
|
||
|
||
- `user_prompt`: (*Body parameter*), `string`, *Optional*
|
||
|
||
Represents the user's custom setting, which is the specific question or instruction the AI needs to respond to directly. Defaults to `None`.
|
||
|
||
#### Returns
|
||
|
||
- Success: A `memory` object.
|
||
|
||
- Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import Ragflow, Memory
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
memory_obejct = Memory(rag_object, {"id": "your memory_id"})
|
||
memory_object.update({"name": "New_name"})
|
||
```
|
||
|
||
---
|
||
|
||
|
||
|
||
### List Memory
|
||
|
||
```python
|
||
Ragflow.list_memory(
|
||
page: int = 1,
|
||
page_size: int = 50,
|
||
tenant_id: str | list[str] = None,
|
||
memory_type: str | list[str] = None,
|
||
storage_type: str = None,
|
||
keywords: str = None) -> dict
|
||
```
|
||
|
||
List memories.
|
||
|
||
#### Parameters
|
||
|
||
##### page: `int`, *Optional*
|
||
|
||
Specifies the page on which the datasets will be displayed. Defaults to `1`
|
||
|
||
##### page_size: `int`, *Optional*
|
||
|
||
The number of memories on each page. Defaults to `50`.
|
||
|
||
##### tenant_id: `string` or `list[str]`, *Optional*
|
||
|
||
The owner's ID, supports search multiple IDs.
|
||
|
||
##### memory_type: `string` or `list[str]`, *Optional*
|
||
|
||
The type of memory (as set during creation). A memory matches if its type is **included in** the provided value(s). Available options:
|
||
|
||
- `raw`
|
||
- `semantic`
|
||
- `episodic`
|
||
- `procedural`
|
||
|
||
##### storage_type: `string`, *Optional*
|
||
|
||
The storage format of messages. Available options:
|
||
|
||
- `table`: (Default)
|
||
|
||
##### keywords: `string`, *Optional*
|
||
|
||
The name of memory to retrieve, supports fuzzy search.
|
||
|
||
#### Returns
|
||
|
||
Success: A dict of `Memory` object list and total count.
|
||
|
||
```json
|
||
{"memory_list": list[Memory], "total_count": int}
|
||
```
|
||
|
||
Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```
|
||
from ragflow_sdk import Ragflow, Memory
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
rag_obejct.list_memory()
|
||
```
|
||
|
||
---
|
||
|
||
|
||
|
||
### Get Memory Config
|
||
|
||
```python
|
||
Memory.get_config()
|
||
```
|
||
|
||
Get the configuration of a specified memory.
|
||
|
||
#### Parameters
|
||
|
||
None
|
||
|
||
#### Returns
|
||
|
||
Success: A `Memory` object.
|
||
|
||
Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import Ragflow, Memory
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
memory_obejct = Memory(rag_object, {"id": "your memory_id"})
|
||
memory_obejct.get_config()
|
||
```
|
||
|
||
---
|
||
|
||
|
||
|
||
### Delete Memory
|
||
|
||
```python
|
||
Ragflow.delete_memory(
|
||
memory_id: str
|
||
) -> None
|
||
```
|
||
|
||
Delete a specified memory.
|
||
|
||
#### Parameters
|
||
|
||
##### memory_id: `string`, *Required*
|
||
|
||
The ID of the memory.
|
||
|
||
#### Returns
|
||
|
||
Success: Nothing
|
||
|
||
Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import Ragflow, Memory
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
rag_object.delete_memory("your memory_id")
|
||
```
|
||
|
||
---
|
||
|
||
|
||
|
||
### List messages of a memory
|
||
|
||
```python
|
||
Memory.list_memory_messages(
|
||
agent_id: str | list[str]=None,
|
||
keywords: str=None,
|
||
page: int=1,
|
||
page_size: int=50
|
||
) -> dict
|
||
```
|
||
|
||
List the messages of a specified memory.
|
||
|
||
#### Parameters
|
||
|
||
##### agent_id: `string` or `list[str]`, *Optional*
|
||
|
||
Filters messages by the ID of their source agent. Supports multiple values.
|
||
|
||
##### keywords: `string`, *Optional*
|
||
|
||
Filters messages by their session ID. This field supports fuzzy search.
|
||
|
||
##### page: `int`, *Optional*
|
||
|
||
Specifies the page on which the messages will be displayed. Defaults to `1`.
|
||
|
||
##### page_size: `int`, *Optional*
|
||
|
||
The number of messages on each page. Defaults to `50`.
|
||
|
||
#### Returns
|
||
|
||
Success: a dict of messages and meta info.
|
||
|
||
```json
|
||
{"messages": {"message_list": [{message dict}], "total_count": int}, "storage_type": "table"}
|
||
```
|
||
|
||
Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import Ragflow, Memory
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
memory_obejct = Memory(rag_object, {"id": "your memory_id"})
|
||
memory_obejct.list_memory_messages()
|
||
```
|
||
|
||
---
|
||
|
||
|
||
|
||
### Add Message
|
||
|
||
```python
|
||
Ragflow.add_message(
|
||
memory_id: list[str],
|
||
agent_id: str,
|
||
session_id: str,
|
||
user_input: str,
|
||
agent_response: str,
|
||
user_id: str = ""
|
||
) -> str
|
||
```
|
||
|
||
Add a message to specified memories.
|
||
|
||
#### Parameters
|
||
|
||
##### memory_id: `list[str]`, *Required*
|
||
|
||
The IDs of the memories to save messages.
|
||
|
||
##### agent_id: `string`, *Required*
|
||
|
||
The ID of the message's source agent.
|
||
|
||
##### session_id: `string`, *Required*
|
||
|
||
The ID of the message's session.
|
||
|
||
##### user_input: `string`, *Required*
|
||
|
||
The text input provided by the user.
|
||
|
||
##### agent_response: `string`, *Required*
|
||
|
||
The text response generated by the AI agent.
|
||
|
||
##### user_id: `string`, *Optional*
|
||
|
||
The user participating in the conversation with the agent. Defaults to `""`.
|
||
|
||
#### Returns
|
||
|
||
Success: A text `"All add to task."`
|
||
|
||
Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import Ragflow, Memory
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
message_payload = {
|
||
"memory_id": memory_ids,
|
||
"agent_id": agent_id,
|
||
"session_id": session_id,
|
||
"user_id": "",
|
||
"user_input": "Your question here",
|
||
"agent_response": """
|
||
Your agent response here
|
||
"""
|
||
}
|
||
client.add_message(**message_payload)
|
||
```
|
||
|
||
---
|
||
|
||
|
||
|
||
### Forget Message
|
||
|
||
```python
|
||
Memory.forget_message(message_id: int) -> bool
|
||
```
|
||
|
||
Forget a specified message. After forgetting, this message will not be retrieved by agents, and it will also be prioritized for cleanup by the forgetting policy.
|
||
|
||
#### Parameters
|
||
|
||
##### message_id: `int`, *Required*
|
||
|
||
The ID of the message to forget.
|
||
|
||
#### Returns
|
||
|
||
Success: True
|
||
|
||
Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import Ragflow, Memory
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
memory_object = Memory(rag_object, {"id": "your memory_id"})
|
||
memory_object.forget_message(message_id)
|
||
```
|
||
|
||
---
|
||
|
||
|
||
|
||
### Update message status
|
||
|
||
```python
|
||
Memory.update_message_status(message_id: int, status: bool) -> bool
|
||
```
|
||
|
||
Update message status, enable or disable a message. Once a message is disabled, it will not be retrieved by agents.
|
||
|
||
#### Parameters
|
||
|
||
##### message_id: `int`, *Required*
|
||
|
||
The ID of the message to enable or disable.
|
||
|
||
##### status: `bool`, *Required*
|
||
|
||
The status of message. `True` = `enabled`, `False` = `disabled`.
|
||
|
||
#### Returns
|
||
|
||
Success: `True`
|
||
|
||
Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import Ragflow, Memory
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
memory_object = Memory(rag_object, {"id": "your memory_id"})
|
||
memory_object.update_message_status(message_id, True)
|
||
```
|
||
|
||
---
|
||
|
||
|
||
|
||
### Search message
|
||
|
||
```python
|
||
Ragflow.search_message(
|
||
query: str,
|
||
memory_id: list[str],
|
||
agent_id: str=None,
|
||
session_id: str=None,
|
||
user_id: str=None,
|
||
similarity_threshold: float=0.2,
|
||
keywords_similarity_weight: float=0.7,
|
||
top_n: int=10
|
||
) -> list[dict]
|
||
```
|
||
|
||
Searches and retrieves messages from memory based on the provided `query` and other configuration parameters.
|
||
|
||
#### Parameters
|
||
|
||
##### query: `string`, *Required*
|
||
|
||
The search term or natural language question used to find relevant messages.
|
||
|
||
##### memory_id: `list[str]`, *Required*
|
||
|
||
The IDs of the memories to search. Supports multiple values.
|
||
|
||
##### agent_id: `string`, *Optional*
|
||
|
||
The ID of the message's source agent. Defaults to `None`.
|
||
|
||
##### session_id: `string`, *Optional*
|
||
|
||
The ID of the message's session. Defaults to `None`.
|
||
|
||
##### user_id: `string`, *Optional*
|
||
|
||
The user participating in the conversation with the agent. Defaults to `None`.
|
||
|
||
##### similarity_threshold: `float`, *Optional*
|
||
|
||
The minimum cosine similarity score required for a message to be considered a match. A higher value yields more precise but fewer results. Defaults to `0.2`.
|
||
|
||
- Range [0.0, 1.0]
|
||
|
||
##### keywords_similarity_weight: `float`, *Optional*
|
||
|
||
Controls the influence of keyword matching versus semantic (embedding-based) matching in the final relevance score. A value of 0.5 gives them equal weight. Defaults to `0.7`.
|
||
|
||
- Range [0.0, 1.0]
|
||
|
||
##### top_n: `int`, *Optional*
|
||
|
||
The maximum number of most relevant messages to return. This limits the result set size for efficiency. Defaults to `10`.
|
||
|
||
#### Returns
|
||
|
||
Success: A list of `message` dict.
|
||
|
||
Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import Ragflow
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
rag_object.search_message("your question", ["your memory_id"])
|
||
```
|
||
|
||
---
|
||
|
||
|
||
|
||
### Get Recent Messages
|
||
|
||
```python
|
||
Ragflow.get_recent_messages(
|
||
memory_id: list[str],
|
||
agent_id: str=None,
|
||
session_id: str=None,
|
||
limit: int=10
|
||
) -> list[dict]
|
||
```
|
||
|
||
Retrieves the most recent messages from specified memories. Typically accepts a `limit` parameter to control the number of messages returned.
|
||
|
||
#### Parameters
|
||
|
||
##### memory_id: `list[str]`, *Required*
|
||
|
||
The IDs of the memories to search. Supports multiple values.
|
||
|
||
##### agent_id: `string`, *Optional*
|
||
|
||
The ID of the message's source agent. Defaults to `None`.
|
||
|
||
##### session_id: `string`, *Optional*
|
||
|
||
The ID of the message's session. Defaults to `None`.
|
||
|
||
##### limit: `int`, *Optional*
|
||
|
||
Control the number of messages returned. Defaults to `10`.
|
||
|
||
#### Returns
|
||
|
||
Success: A list of `message` dict.
|
||
|
||
Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import Ragflow
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
rag_object.get_recent_messages(["your memory_id"])
|
||
```
|
||
|
||
---
|
||
|
||
|
||
|
||
### Get Message Content
|
||
|
||
```python
|
||
Memory.get_message_content(message_id: int)
|
||
```
|
||
|
||
Retrieves the full content and embed vector of a specific message using its unique message ID.
|
||
|
||
#### Parameters
|
||
|
||
##### message_id: `int`, *Required*
|
||
|
||
#### Returns
|
||
|
||
Success: A `message` dict.
|
||
|
||
Failure: `Exception`
|
||
|
||
#### Examples
|
||
|
||
```python
|
||
from ragflow_sdk import Ragflow
|
||
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
|
||
memory_object = Memory(rag_object, {"id": "your memory_id"})
|
||
memory_object.get_message_content(message_id)
|
||
```
|
||
|
||
---
|