[Sampler] Support returning final logprobs (#22387)

Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
This commit is contained in:
22quinn
2025-08-20 21:28:32 -07:00
committed by GitHub
parent f64ee61d9e
commit f571ff8eb6
7 changed files with 125 additions and 69 deletions

View File

@ -154,12 +154,15 @@ differences compared to V0:
##### Logprobs Calculation
Logprobs in V1 are now returned immediately once computed from the models raw output (i.e.
By default, logprobs in V1 are now returned immediately once computed from the models raw output (i.e.
before applying any logits post-processing such as temperature scaling or penalty
adjustments). As a result, the returned logprobs do not reflect the final adjusted
probabilities used during sampling.
Support for logprobs with post-sampling adjustments is in progress and will be added in future updates.
You can adjust this behavior by setting the `--logprobs-mode` flag.
Four modes are supported: `raw_logprobs` (default), `processed_logprobs`, `raw_logits`, `processed_logits`.
Raw means the values before applying any logit processors, like bad words.
Processed means the values after applying all processors, including temperature and top_k/top_p.
##### Prompt Logprobs with Prefix Caching