mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-05-06 02:07:49 +08:00
fix(metadata): handle unhashable list values in metadata split (#13116)
### What problem does this PR solve?

This PR fixes missing metadata on documents synced from the Moodle connector, especially for **Book** modules.

Background:

- Moodle Book metadata includes fields like `chapters`, which is a `list[dict]`.
- During metadata normalization in `DocMetadataService._split_combined_values`, list deduplication used `dict.fromkeys(...)`.
- `dict.fromkeys(...)` fails for unhashable values (like `dict`), causing the metadata update to fail.
- Result: documents were imported, but metadata was not saved for affected module types (notably Books).

What this PR changes:

- Replaces hash-based list deduplication with `dedupe_list(...)`, which safely handles unhashable list items while preserving order.
- This allows Book metadata (and other complex list metadata) to be persisted correctly.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

Contribution during my time at RAGcon GmbH.
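The failure described above can be reproduced in isolation. The following is a minimal sketch; the `chapters` payload and its field values are made up for illustration and are not taken from a real Moodle response.

```python
# Hypothetical Book metadata: `chapters` is a list of dicts, as described above.
chapters = [
    {"title": "Intro", "id": 1},
    {"title": "Intro", "id": 1},
    {"title": "Setup", "id": 2},
]

# Hash-based dedupe works for hashable items such as strings...
tags = ["a", "b", "a"]
deduped_tags = list(dict.fromkeys(tags))

# ...but dict keys must be hashable, so a list of dicts raises TypeError.
try:
    list(dict.fromkeys(chapters))
    error = None
except TypeError as exc:
    error = str(exc)

print(deduped_tags)
print(error)
```

Because the exception is raised during normalization, it is consistent with the reported symptom: the document itself is imported while its metadata is never written.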
@@ -231,8 +231,9 @@ class DocMetadataService:
             new_values.append(item)
         else:
             new_values.append(item)
-    # Remove duplicates while preserving order
-    processed[key] = list(dict.fromkeys(new_values))
+    # Remove duplicates while preserving order.
+    # Use string-based dedupe to support unhashable values (e.g. dict entries).
+    processed[key] = dedupe_list(new_values)
 else:
     processed[key] = value
 
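The new comment in the hunk describes a string-based dedupe, but the body of `dedupe_list` is not part of this diff. As a rough illustration of the idea only, here is a sketch under the hypothetical name `dedupe_list_sketch`; using JSON serialization as the comparison key is an assumption, not the project's confirmed approach.

```python
import json
from typing import Any


def dedupe_list_sketch(values: list[Any]) -> list[Any]:
    """Order-preserving dedupe that tolerates unhashable items.

    Hypothetical stand-in for illustration; not the project's dedupe_list.
    """
    seen: set[str] = set()
    result: list[Any] = []
    for item in values:
        try:
            # Assumed key: canonical JSON, so key order inside dicts is ignored.
            key = json.dumps(item, sort_keys=True, default=str)
        except (TypeError, ValueError):
            # Fall back to repr for values JSON cannot serialize or sort.
            key = repr(item)
        if key not in seen:
            seen.add(key)
            result.append(item)  # keep the first occurrence
    return result


chapters = [{"title": "Intro"}, {"title": "Intro"}, {"title": "Setup"}]
print(dedupe_list_sketch(chapters))
```

Keying on a string representation trades strict equality semantics for robustness: any item can be compared, at the cost of treating values with identical serialized forms as duplicates.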