vllm/dev at b4e4eda92e1d3a013fc4007db64b69d8604264ff - vllm

Files

Alexander Matveev 7c7714d856 [Core][Bugfix][Perf] Introduce MQLLMEngine to avoid asyncio OH (#8157)

Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>

2024-09-18 13:56:58 +00:00

dockerfile

[Bugfix][Doc] FIx Duplicate Explicit Target Name Errors (#5703)

2024-06-19 23:10:47 -07:00

engine

Fix autodoc directives (#4272)

2024-04-23 01:53:01 +00:00

input_processing

[VLM][Core] Support profiling with multiple multi-modal inputs per prompt (#7126)

2024-08-14 17:55:42 +00:00

kernel

fix document error for value and v_vec illustration (#3421)

2024-03-15 16:06:09 -07:00

multimodal

[Core][VLM] Stack multimodal tensors to represent multiple images within each prompt (#7902)

2024-08-28 01:53:56 +00:00

offline_inference

[Frontend] Refactor prompt processing (#4028)

2024-07-22 10:13:53 -07:00

profiling

[Core][Bugfix][Perf] Introduce MQLLMEngine to avoid asyncio OH (#8157)

2024-09-18 13:56:58 +00:00

sampling_params.rst

[Core] Consolidate prompt arguments to LLM engines (#4328)

2024-05-28 13:29:31 -07:00