[TPU] Use mark_dynamic to reduce compilation time (#7340)

Woosuk Kwon
2024-08-10 18:12:22 -07:00
committed by GitHub
parent 4c5d8e8ea9
commit 90bab18f24
3 changed files with 50 additions and 16 deletions

@@ -56,7 +56,7 @@ First, install the dependencies:
$ pip uninstall torch torch-xla -y
$ # Install PyTorch and PyTorch XLA.
-$ export DATE="+20240726"
+$ export DATE="+20240808"
$ pip install https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch-nightly${DATE}-cp310-cp310-linux_x86_64.whl
$ pip install https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-nightly${DATE}-cp310-cp310-linux_x86_64.whl
@@ -65,7 +65,7 @@ First, install the dependencies:
$ pip install torch_xla[pallas] -f https://storage.googleapis.com/jax-releases/jax_nightly_releases.html -f https://storage.googleapis.com/jax-releases/jaxlib_nightly_releases.html
$ # Install other build dependencies.
$ pip install packaging aiohttp
$ pip install -r requirements-tpu.txt
Next, build vLLM from source. This will only take a few seconds: