Closes#1398
### What problem does this PR solve?
Adds native support for EPUB files. EPUB content is extracted in spine
(reading) order and parsed using the existing HTML parser. No new
dependencies required.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
To check this parser manually:
```python
uv run --python 3.12 python -c "
from deepdoc.parser import EpubParser
with open('$HOME/some_epub_book.epub', 'rb') as f:
data = f.read()
sections = EpubParser()(None, binary=data, chunk_token_num=512)
print(f'Got {len(sections)} sections')
for i, s in enumerate(sections[:5]):
print(f'\n--- Section {i} ---')
print(s[:200])
"
```
### What problem does this PR solve?
add support for eml file parser
#1363
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Support extracting table for markdown file in general parser
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
feat: support json file.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: KevinHuSh <kevinhu.sh@gmail.com>
### What problem does this PR solve?
- Update readme
- Add license
### Type of change
- [x] Documentation Update
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>