vllm/entrypoints at e3dd0692fa2c803cd6f59a88d2fdf8bca26d8d96 - vllm

Files

Andy 2529d09b5a [Frontend] Batch inference for llm.chat() API (#8648)

Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>

2024-09-24 09:44:11 -07:00

llm

[Frontend] Batch inference for llm.chat() API (#8648)

2024-09-24 09:44:11 -07:00

offline_mode

[Bugfix] Offline mode fix (#8376)

2024-09-12 11:11:57 -07:00

openai

Add output streaming support to multi-step + async while ensuring RequestOutput obj reuse (#8335)

2024-09-23 15:38:04 -07:00

__init__.py

[CI/Build] Move test_utils.py to tests/utils.py (#4425)

2024-05-13 23:50:09 +09:00

conftest.py

Support for guided decoding for offline LLM (#6878)

2024-08-04 03:12:09 +00:00

test_chat_utils.py

[Frontend] Multimodal support in offline chat (#8098)

2024-09-04 05:22:17 +00:00