Files
Ahmad Intisar 3c4d1da98f Feature/table parser column roles (#13710)
### What problem does this PR solve?

The table file parser (CSV/Excel) currently treats all columns
identically — every column is both vectorized (embedded in chunk text)
and stored as filterable metadata. There's no way for users to control
which columns should be searchable by semantic meaning versus which
should only be filterable attributes.

For example, when ingesting a news articles CSV with columns like title,
content, country, category, source, etc., the embedding includes
metadata fields like country: Brazil and source: Reuters in the chunk
text, which dilutes the semantic quality of the embedding without adding
retrieval value.

The RDBMS connector (MySQL/PostgreSQL) already supports content_columns
/ metadata_columns, but this capability was missing for file-based table
ingestion.

This PR adds column-level control (vectorize / metadata / both) for the
table file parser, following RAGFlow's existing patterns.

Backward compatible: Datasets without table_column_roles or with
table_column_mode: auto behave exactly as before (all columns = both).

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-05-11 10:06:04 +08:00
..