Merge remote-tracking branch 'origin/main' into il_tool
Signed-off-by: Lucia Fang <fanglu@fb.com>
@@ -26,6 +26,8 @@ See <gh-file:LICENSE>.

## Developing

+--8<-- "docs/getting_started/installation/python_env_setup.inc.md"
+
Depending on the kind of development you'd like to do (e.g. Python, CUDA), you can choose to build vLLM with or without compilation.
Check out the [building from source][build-from-source] documentation for details.
@@ -42,7 +44,7 @@ For an optimized workflow when iterating on C++/CUDA kernels, see the [Increment

Install MkDocs along with the [plugins](https://github.com/vllm-project/vllm/blob/main/mkdocs.yaml) used in the vLLM documentation, as well as required dependencies:

```bash
-pip install -r requirements/docs.txt
+uv pip install -r requirements/docs.txt
```

!!! note
@@ -98,13 +100,14 @@ For additional features and advanced configurations, refer to the official [MkDo

??? console "Commands"

    ```bash
-    pip install -r requirements/common.txt -r requirements/dev.txt
+    # These commands are only for Nvidia CUDA platforms.
+    uv pip install -r requirements/common.txt -r requirements/dev.txt --torch-backend=auto

     # Linting, formatting and static type checking
-    pre-commit install --hook-type pre-commit --hook-type commit-msg
+    pre-commit install

     # You can manually run pre-commit with
-    pre-commit run --all-files
+    pre-commit run --all-files --show-diff-on-failure

     # To manually run something from CI that does not run
     # locally by default, you can run:
@@ -122,6 +125,10 @@ For additional features and advanced configurations, refer to the official [MkDo

Therefore, we recommend developing with Python 3.12 to minimise the chance of your local environment clashing with our CI environment.

+!!! note "Install python3-dev if Python.h is missing"
+    If any of the above commands fails with `Python.h: No such file or directory`, install
+    `python3-dev` with `sudo apt install python3-dev`.
+
!!! note
    Currently, the repository is not fully checked by `mypy`.
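A minimal way to pin a local environment to the recommended Python 3.12 (assuming `uv` is installed; `python3.12 -m venv` works equally well):

```bash
# Create and activate a Python 3.12 virtual environment with uv
uv venv --python 3.12
source .venv/bin/activate
python --version  # should report Python 3.12.x
```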
@@ -153,7 +160,7 @@ Using `-s` with `git commit` will automatically add this header.

!!! tip
    You can enable automatic sign-off via your IDE:

    - **PyCharm**: Click on the `Show Commit Options` icon to the right of the `Commit and Push...` button in the `Commit` window.
      It will bring up a `git` window where you can modify the `Author` and enable `Sign-off commit`.
    - **VSCode**: Open the [Settings editor](https://code.visualstudio.com/docs/configure/settings)
@@ -20,19 +20,19 @@ the failure?

- **Use this title format:**

-    ```
+    ```text
    [CI Failure]: failing-test-job - regex/matching/failing:test
    ```

- **For the environment field:**

-    ```
+    ```text
    Still failing on main as of commit abcdef123
    ```

- **In the description, include failing tests:**

-    ```
+    ```text
    FAILED failing/test.py:failing_test1 - Failure description
    FAILED failing/test.py:failing_test2 - Failure description
https://github.com/orgs/vllm-project/projects/20
@@ -57,8 +57,7 @@ cc the PyTorch release team to initiate discussion on how to address them.

## Update CUDA version

-The PyTorch release matrix includes both stable and experimental [CUDA versions](https://github.com/pytorch/pytorch/blob/main/RELEASE.md#release-compatibility-matrix). Due to limitations, only the latest stable CUDA version (for example,
-`torch2.7.0+cu12.6`) is uploaded to PyPI. However, vLLM may require a different CUDA version,
+The PyTorch release matrix includes both stable and experimental [CUDA versions](https://github.com/pytorch/pytorch/blob/main/RELEASE.md#release-compatibility-matrix). Due to limitations, only the latest stable CUDA version (for example, torch `2.7.1+cu126`) is uploaded to PyPI. However, vLLM may require a different CUDA version,
such as 12.8 for Blackwell support.
This complicates the process as we cannot use the out-of-the-box
`pip install torch torchvision torchaudio` command. The solution is to use
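For illustration, a hedged sketch of pulling a specific CUDA build from the dedicated PyTorch wheel index (the exact command and CUDA tag used in vLLM's setup may differ):

```bash
# Install torch built against CUDA 12.8 from the PyTorch wheel index
# instead of the default PyPI build
uv pip install torch --index-url https://download.pytorch.org/whl/cu128
```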
@@ -107,6 +106,7 @@ releases (which would take too much time), they can be built from
source to unblock the update process.

### FlashInfer

Here is how to build and install it from source with `torch2.7.0+cu128` in vLLM [Dockerfile](https://github.com/vllm-project/vllm/blob/27bebcd89792d5c4b08af7a65095759526f2f9e1/docker/Dockerfile#L259-L271):

```bash
@@ -122,6 +122,7 @@ public location for immediate installation, such as [this FlashInfer wheel link]
team if you want to get the package published there.

### xFormers

Similar to FlashInfer, here is how to build and install xFormers from source:

```bash
@@ -139,7 +140,7 @@ uv pip install --system \

### causal-conv1d

-```
+```bash
uv pip install 'git+https://github.com/Dao-AILab/causal-conv1d@v1.5.0.post8'
```
@@ -31,7 +31,7 @@ Features that fall under this policy include (at a minimum) the following:

The deprecation process consists of several clearly defined stages that span
multiple Y releases:

-**1. Deprecated (Still On By Default)**
+### 1. Deprecated (Still On By Default)

- **Action**: Feature is marked as deprecated.
- **Timeline**: A removal version is explicitly stated in the deprecation
@@ -46,7 +46,7 @@ warning (e.g., "This will be removed in v0.10.0").

- GitHub Issue (RFC) for feedback
- Documentation and use of the `@typing_extensions.deprecated` decorator for Python APIs
|
||||
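The `@typing_extensions.deprecated` decorator mentioned above can be sketched as follows; `old_tokenize` is a hypothetical function for illustration, not a vLLM API:

```python
from typing_extensions import deprecated  # requires typing_extensions >= 4.5


# The decorator marks the API for type checkers and emits a
# DeprecationWarning at runtime when the function is called.
@deprecated("old_tokenize is deprecated; this will be removed in v0.10.0")
def old_tokenize(text: str) -> list[str]:
    # Hypothetical legacy behavior: naive whitespace tokenization.
    return text.split()
```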
-**2. Deprecated (Off By Default)**
+### 2. Deprecated (Off By Default)

- **Action**: Feature is disabled by default, but can still be re-enabled via a
CLI flag or environment variable. Feature throws an error when used without
@@ -55,7 +55,7 @@ re-enabling.
while signaling imminent removal. Ensures any remaining usage is clearly
surfaced and blocks silent breakage before full removal.

-**3. Removed**
+### 3. Removed

- **Action**: Feature is completely removed from the codebase.
- **Note**: Only features that have passed through the previous deprecation
@@ -5,7 +5,12 @@

## Profile with PyTorch Profiler

-We support tracing vLLM workers using the `torch.profiler` module. You can enable tracing by setting the `VLLM_TORCH_PROFILER_DIR` environment variable to the directory where you want to save the traces: `VLLM_TORCH_PROFILER_DIR=/mnt/traces/`
+We support tracing vLLM workers using the `torch.profiler` module. You can enable tracing by setting the `VLLM_TORCH_PROFILER_DIR` environment variable to the directory where you want to save the traces: `VLLM_TORCH_PROFILER_DIR=/mnt/traces/`. Additionally, you can control the profiling content by specifying the following environment variables:
+
+- `VLLM_TORCH_PROFILER_RECORD_SHAPES=1` to enable recording Tensor Shapes, off by default
+- `VLLM_TORCH_PROFILER_WITH_PROFILE_MEMORY=1` to record memory, off by default
+- `VLLM_TORCH_PROFILER_WITH_STACK=1` to enable recording stack information, on by default
+- `VLLM_TORCH_PROFILER_WITH_FLOPS=1` to enable recording FLOPs, off by default

The OpenAI server also needs to be started with the `VLLM_TORCH_PROFILER_DIR` environment variable set.
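The on/off semantics of these flags (`1` enables, `0` or unset leaves the default) can be sketched in Python; `profiler_flag` is a hypothetical helper, not vLLM's actual parsing code:

```python
import os


def profiler_flag(name: str, default: bool) -> bool:
    """Return True when the env var is '1', falling back to the given default."""
    return os.environ.get(name, "1" if default else "0") == "1"


# Example: turn on tensor-shape recording; leave the others at their defaults.
os.environ["VLLM_TORCH_PROFILER_RECORD_SHAPES"] = "1"

record_shapes = profiler_flag("VLLM_TORCH_PROFILER_RECORD_SHAPES", default=False)
with_stack = profiler_flag("VLLM_TORCH_PROFILER_WITH_STACK", default=True)
with_flops = profiler_flag("VLLM_TORCH_PROFILER_WITH_FLOPS", default=False)
print(record_shapes, with_stack, with_flops)  # → True True False
```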
@@ -112,13 +117,13 @@ vllm bench serve \

In practice, you should set the `--duration` argument to a large value. Whenever you want the server to stop profiling, run:

-```
+```bash
nsys sessions list
```

to get the session id in the form of `profile-XXXXX`, then run:

-```
+```bash
nsys stop --session=profile-XXXXX
```
@@ -32,9 +32,9 @@ We prefer to keep all vulnerability-related communication on the security report
on GitHub. However, if you need to contact the VMT directly for an urgent issue,
you may contact the following individuals:

-- Simon Mo - simon.mo@hey.com
-- Russell Bryant - rbryant@redhat.com
-- Huzaifa Sidhpurwala - huzaifas@redhat.com
+- Simon Mo - <simon.mo@hey.com>
+- Russell Bryant - <rbryant@redhat.com>
+- Huzaifa Sidhpurwala - <huzaifas@redhat.com>

## Slack Discussion