268c325078
Fix range_ratio Bug in RandomDataset ( #16126 )
...
Signed-off-by: jadewang21 <jadewangcn@outlook.com >
2025-04-10 15:31:17 -07:00
7cd0bd7212
[Bugfix] Fix output token length check logic ( #16419 )
...
Signed-off-by: look <eeslook@163.com >
2025-04-10 20:16:48 +00:00
04149cce27
[BugFix] fix some typos found by typos. ( #16314 )
...
Signed-off-by: yihong0618 <zouzou0208@gmail.com >
2025-04-09 03:43:59 -07:00
ba10801961
[Benchmark] Add sampling parameters to benchmark_serving. ( #16022 )
...
Signed-off-by: Hyesoo Yang <hyeygit@gmail.com >
2025-04-06 12:30:35 +08:00
06f21ce7a5
[Benchmark] Add AIMO Dataset to Benchmark ( #15955 )
...
Signed-off-by: Ziji Shi <shi.ziji.sm@gmail.com >
Signed-off-by: StevenShi-23 <shi.ziji.sm@gmail.com >
2025-04-03 06:09:18 +00:00
aa557e6422
[Benchmark]Fix error message ( #15866 )
...
Signed-off-by: wangli <wangli858794774@gmail.com >
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com >
2025-04-02 01:32:24 -07:00
effc5d24fa
[Benchmark] Update Vision Arena Dataset and HuggingFaceDataset Setup ( #15748 )
...
Signed-off-by: Jennifer Zhao <ai.jenniferzhao@gmail.com >
2025-03-31 15:38:58 +08:00
70e132244a
[Minor] Remove TGI launching script ( #15646 )
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu >
2025-03-28 09:30:08 -07:00
e7f720ea56
[Misc]add coding benchmark for speculative decoding ( #15303 )
...
Signed-off-by: CXIAAAAA <cxia0209@gmail.com >
2025-03-28 10:47:05 +08:00
583a9778e0
[Benchmark] Do not save detailed info to json by default ( #14879 )
...
Signed-off-by: simon-mo <simon.mo@hey.com >
2025-03-16 21:48:11 -07:00
1253b15774
[Feature] Consolidate performance benchmark datasets ( #14036 )
...
Signed-off-by: Jennifer Zhao <7443418+JenZhao@users.noreply.github.com >
Signed-off-by: Roger Wang <ywang@roblox.com >
Co-authored-by: Jennifer Zhao <7443418+JenZhao@users.noreply.github.com >
Co-authored-by: Roger Wang <ywang@roblox.com >
2025-03-10 07:23:11 +00:00
ad60bbb2b2
[Doc] Fix a typo ( #14385 )
2025-03-06 16:31:52 -08:00
a4f1ee35d6
Deprecate best_of Sampling Parameter in anticipation for vLLM V1 ( #13997 )
...
Signed-off-by: vincent-4 <vincentzhongy+githubvincent4@gmail.com >
Signed-off-by: Brayden Zhong <b8zhong@uwaterloo.ca >
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com >
Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca >
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com >
2025-03-05 20:22:43 +00:00
cf069aa8aa
Update deprecated Python 3.8 typing ( #13971 )
2025-03-02 17:34:51 -08:00
e7ef74e26e
Fix some issues with benchmark data output ( #13641 )
...
Signed-off-by: Huy Do <huydhn@gmail.com >
2025-02-24 10:23:18 +08:00
9bebc9512f
[Misc] Deprecate --dataset from benchmark_serving.py ( #13708 )
...
Signed-off-by: Roger Wang <ywang@roblox.com >
2025-02-23 13:32:20 +00:00
45186834a0
Run v1 benchmark and integrate with PyTorch OSS benchmark database ( #13068 )
...
Signed-off-by: Huy Do <huydhn@gmail.com >
2025-02-17 08:16:32 +00:00
3ee696a63d
[RFC][vllm-API] Support tokenizer registry for customized tokenizer in vLLM ( #12518 )
...
Signed-off-by: Keyun Tong <tongkeyun@gmail.com >
2025-02-12 12:25:58 +08:00
58047c6f04
[Benchmark] Add BurstGPT to benchmark_serving ( #13063 )
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu >
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com >
2025-02-10 21:25:30 -08:00
7e1837676a
[misc] Add LoRA to benchmark_serving ( #12898 )
...
Signed-off-by: Varun Sundar Rabindranath <varun@neuralmagic.com >
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com >
2025-02-08 17:15:44 +08:00
e489ad7a21
[Misc] Add SPDX-License-Identifier headers to python source files ( #12628 )
...
- **Add SPDX license headers to python source files**
- **Check for SPDX headers using pre-commit**
commit 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745
Author: Russell Bryant <rbryant@redhat.com >
Date: Fri Jan 31 14:18:24 2025 -0500
Add SPDX license headers to python source files
This commit adds SPDX license headers to python source files as recommended to the project by the Linux Foundation. These headers provide a concise way that is both human and machine readable for communicating license information for each source file. It helps avoid any ambiguity about the license of the code and can also be easily used by tools to help manage license compliance.
The Linux Foundation runs license scans against the codebase to help ensure we are in compliance with the licenses of the code we use, including dependencies. Having these headers in place helps that tool do its job.
More information can be found on the SPDX site:
- https://spdx.dev/learn/handling-license-info/
Signed-off-by: Russell Bryant <rbryant@redhat.com >
commit 5a1cf1cb3b80759131c73f6a9dddebccac039dea
Author: Russell Bryant <rbryant@redhat.com >
Date: Fri Jan 31 14:36:32 2025 -0500
Check for SPDX headers using pre-commit
Signed-off-by: Russell Bryant <rbryant@redhat.com >
---------
Signed-off-by: Russell Bryant <rbryant@redhat.com >
2025-02-02 11:58:18 -08:00
823ab79633
Update pre-commit hooks ( #12475 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com >
2025-01-27 17:23:08 -07:00
3c818bdb42
[Misc] Use VisionArena Dataset for VLM Benchmarking ( #12389 )
...
Signed-off-by: Roger Wang <ywang@roblox.com >
2025-01-24 00:22:04 -08:00
222a9dc350
[Benchmark] More accurate TPOT calc in benchmark_serving.py ( #12288 )
...
Signed-off-by: Nick Hill <nhill@redhat.com >
2025-01-22 13:46:14 +08:00
936db119ed
benchmark_serving support --served-model-name param ( #12109 )
...
Signed-off-by: zibai <zibai.gj@alibaba-inc.com >
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com >
2025-01-19 09:59:56 +00:00
238c0d93b4
[Misc] Add tokenizer_mode param to benchmark_serving.py ( #11174 )
...
Signed-off-by: Alexander Matveev <alexm@neuralmagic.com >
2024-12-13 16:19:10 +00:00
c11f172187
[Misc] Adding MMMU-Pro vision dataset to serving benchmark ( #10804 )
...
Signed-off-by: Roger Wang <ywang@roblox.com >
Co-authored-by: Chen Zhang <zhangch99@outlook.com >
Co-authored-by: Isotr0py <2037008807@qq.com >
2024-12-01 08:47:05 +00:00
8b6725b0cf
[Misc] Update benchmark to support image_url file or http ( #10287 )
...
Signed-off-by: rbbang <anjaehyun87@gmail.com >
2024-11-16 18:15:40 +08:00
a62bc0109c
[Misc] Add Gamma-Distribution Request Generation Support for Serving Benchmark. ( #10105 )
...
Signed-off-by: Mozhou <spli161006@gmail.com >
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com >
2024-11-07 11:20:30 +00:00
ea4adeddc1
[Bugfix] Fix E2EL mean and median stats ( #9984 )
...
Signed-off-by: daitran2k1 <tranquangdai7a@gmail.com >
2024-11-04 09:37:58 +00:00
855e0e6f97
[Frontend][Misc] Goodput metric support ( #9338 )
2024-10-20 18:39:32 +00:00
7dbe738d65
[Misc] benchmark: Add option to set max concurrency ( #9390 )
...
Signed-off-by: Russell Bryant <rbryant@redhat.com >
2024-10-18 11:15:28 -07:00
d65049daab
[Bugfix] Add random_seed to sample_hf_requests in benchmark_serving script ( #9013 )
...
Co-authored-by: Isotr0py <2037008807@qq.com >
2024-10-17 21:11:11 +00:00
5d264f4ab8
pass ignore_eos parameter to all benchmark_serving calls ( #9349 )
2024-10-15 13:30:44 -07:00
94bf9ae4e9
[Misc] Fix sampling from sonnet for long context case ( #9235 )
2024-10-11 00:33:16 +00:00
18b296fdb2
[core] remove beam search from the core ( #9105 )
2024-10-07 05:47:04 +00:00
fbb74420e7
[CI] Update performance benchmark: upgrade trt-llm to r24.07, and add SGLang ( #7412 )
2024-10-04 14:01:44 -07:00
22f5851b80
Update benchmark_serving.py to read and write json-datasets, results in UTF8, for better compatibility with Windows ( #8997 )
2024-10-01 11:07:06 -07:00
e585b583a9
[Bugfix] Support testing prefill throughput with benchmark_serving.py --hf-output-len 1 ( #8891 )
2024-09-28 18:51:22 +00:00
0e088750af
[MISC] Fix invalid escape sequence '\' ( #8830 )
...
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io >
2024-09-27 01:13:25 -07:00
c52ec5f034
[Bugfix] fixing sonnet benchmark bug in benchmark_serving.py ( #8616 )
2024-09-19 05:24:24 +00:00
1b6de8352b
[Benchmark] Support sample from HF datasets and image input for benchmark_serving ( #8495 )
2024-09-17 07:34:27 +00:00
795b662cff
Enable Random Prefix Caching in Serving Profiling Tool (benchmark_serving.py) ( #8241 )
2024-09-06 20:18:16 -07:00
e5cab71531
[Frontend] Add --logprobs argument to benchmark_serving.py ( #8191 )
2024-09-06 09:01:14 -07:00
77d9e514a2
[MISC] Replace input token throughput with total token throughput ( #8164 )
...
Co-authored-by: Michael Goin <michael@neuralmagic.com >
2024-09-04 20:23:22 +00:00
0c785d344d
Add more percentiles and latencies ( #7759 )
2024-08-29 16:48:11 -07:00
dd53c4b023
[misc] Add Torch profiler support ( #7451 )
...
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com >
2024-08-21 15:39:26 -07:00
ccb20db8bd
[Bugfix] Benchmark serving script used global parameter 'args' in function 'sample_random_requests' ( #6428 )
2024-07-14 19:27:01 -07:00
dbfe254eda
[Feature] vLLM CLI ( #5090 )
...
Co-authored-by: simon-mo <simon.mo@hey.com >
2024-07-14 15:36:43 -07:00
a4feba929b
[CI/Build] Add nightly benchmarking for tgi, tensorrt-llm and lmdeploy ( #5362 )
2024-07-11 13:28:38 -07:00