vllm/entrypoints at 7c7714d856eee6fa94aade729b67f00584f72a4c - vllm

Files

Alexander Matveev 7c7714d856 [Core][Bugfix][Perf] Introduce MQLLMEngine to avoid asyncio OH (#8157)

Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>

2024-09-18 13:56:58 +00:00

llm

[Core] Support load and unload LoRA in api server (#6566)

2024-09-05 18:10:33 -07:00

offline_mode

[Bugfix] Offline mode fix (#8376)

2024-09-12 11:11:57 -07:00

openai

[Core][Bugfix][Perf] Introduce MQLLMEngine to avoid asyncio OH (#8157)

2024-09-18 13:56:58 +00:00

__init__.py

[CI/Build] Move test_utils.py to tests/utils.py (#4425)

2024-05-13 23:50:09 +09:00

conftest.py

Support for guided decoding for offline LLM (#6878)

2024-08-04 03:12:09 +00:00

test_chat_utils.py

[Frontend] Multimodal support in offline chat (#8098)

2024-09-04 05:22:17 +00:00