mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-03-11 10:17:56 +08:00
Feat: Support get aggregated parsing status to dataset via the API (#13481)
### What problem does this PR solve?

Support getting aggregated parsing status to dataset via the API

Issue: #12810

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: heyang.why <heyang.why@alibaba-inc.com>
@@ -835,14 +835,14 @@ Failure:

### List datasets

**GET** `/api/v1/datasets?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id}`
**GET** `/api/v1/datasets?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id}&include_parsing_status={include_parsing_status}`

Lists datasets.

#### Request

- Method: GET
- URL: `/api/v1/datasets?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id}`
- URL: `/api/v1/datasets?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id}&include_parsing_status={include_parsing_status}`
- Headers:
  - `'Authorization: Bearer <YOUR_API_KEY>'`

@@ -854,6 +854,13 @@ curl --request GET \
     --header 'Authorization: Bearer <YOUR_API_KEY>'
```

```bash
# List datasets with parsing status
curl --request GET \
     --url 'http://{address}/api/v1/datasets?include_parsing_status=true' \
     --header 'Authorization: Bearer <YOUR_API_KEY>'
```

##### Request parameters

- `page`: (*Filter parameter*)
@@ -870,6 +877,13 @@ curl --request GET \
  The name of the dataset to retrieve.
- `id`: (*Filter parameter*)
  The ID of the dataset to retrieve.
- `include_parsing_status`: (*Filter parameter*)
  Whether to include document parsing status counts in the response. Defaults to `false`. When set to `true`, each dataset object in the response includes the following additional fields:
  - `unstart_count`: Number of documents whose parsing has not started.
  - `running_count`: Number of documents currently being parsed.
  - `cancel_count`: Number of documents whose parsing was cancelled.
  - `done_count`: Number of documents that have been successfully parsed.
  - `fail_count`: Number of documents whose parsing failed.
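
The request described above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_list_datasets_request` and `list_datasets` are hypothetical and not part of RAGFlow, and the boolean is sent as the lowercase string `true`/`false` on the assumption that the server accepts the same spelling as the curl example.

```python
import json
import urllib.parse
import urllib.request


def build_list_datasets_request(address, api_key, include_parsing_status=True, **filters):
    """Build (but do not send) a GET request for /api/v1/datasets.

    `filters` may carry the other documented query parameters,
    for example page=1 or name="Test Dataset".
    """
    params = {
        # Assumption: lowercase "true"/"false", as in the curl example.
        "include_parsing_status": "true" if include_parsing_status else "false",
    }
    params.update(filters)
    url = f"http://{address}/api/v1/datasets?{urllib.parse.urlencode(params)}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def list_datasets(address, api_key, **kwargs):
    """Send the request and return the decoded JSON response body."""
    request = build_list_datasets_request(address, api_key, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `list_datasets("<address>", "<YOUR_API_KEY>")` would then return the parsed response, with the five count fields on each dataset object when the flag is enabled.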

#### Response

@@ -917,6 +931,49 @@ Success:
}
```

Success (with `include_parsing_status=true`):

```json
{
    "code": 0,
    "data": [
        {
            "avatar": null,
            "cancel_count": 0,
            "chunk_count": 30,
            "chunk_method": "qa",
            "create_date": "2026-03-09T18:57:13",
            "create_time": 1773053833094,
            "created_by": "928f92a210b911f1ac4cc39e0b8fa3ad",
            "description": null,
            "document_count": 1,
            "done_count": 1,
            "embedding_model": "text-embedding-v2@Tongyi-Qianwen",
            "fail_count": 0,
            "id": "ba6586c21ba611f1a3dc476f0709e75e",
            "language": "English",
            "name": "Test Dataset",
            "parser_config": {
                "graphrag": { "use_graphrag": false },
                "llm_id": "deepseek-chat@DeepSeek",
                "raptor": { "use_raptor": false }
            },
            "permission": "me",
            "running_count": 0,
            "similarity_threshold": 0.2,
            "status": "1",
            "tenant_id": "928f92a210b911f1ac4cc39e0b8fa3ad",
            "token_num": 1746,
            "unstart_count": 0,
            "update_date": "2026-03-09T18:59:32",
            "update_time": 1773053972723,
            "vector_similarity_weight": 0.3
        }
    ],
    "total_datasets": 1
}
```
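
A client might reduce the five count fields to a single progress figure. The sketch below is one way to do that, not something the API provides: `summarize_parsing_status` is a hypothetical helper, and it assumes the five counts together cover every document in the dataset, which the sample response is consistent with but the documentation does not state.

```python
STATUS_FIELDS = (
    "unstart_count",
    "running_count",
    "cancel_count",
    "done_count",
    "fail_count",
)


def summarize_parsing_status(dataset):
    """Return the per-status counts, their sum, and the fraction parsed.

    Missing fields are treated as 0, so a dataset object fetched without
    include_parsing_status=true yields a total of 0 and a ratio of 0.0.
    """
    counts = {field: int(dataset.get(field, 0)) for field in STATUS_FIELDS}
    total = sum(counts.values())
    done_ratio = counts["done_count"] / total if total else 0.0
    return {"counts": counts, "total": total, "done_ratio": done_ratio}


# Count fields copied from the sample response.
sample = {
    "name": "Test Dataset",
    "document_count": 1,
    "unstart_count": 0,
    "running_count": 0,
    "cancel_count": 0,
    "done_count": 1,
    "fail_count": 0,
}
summary = summarize_parsing_status(sample)
print(summary["total"], summary["done_ratio"])  # prints: 1 1.0
```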

Failure:

```json