[Doc][3/N] Reorganize Serving section (#11766)

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Cyrus Leung
2025-01-07 11:20:01 +08:00
committed by GitHub
parent d93d2d74fd
commit 8ceffbf315
40 changed files with 248 additions and 133 deletions


@ -65,32 +65,14 @@ getting_started/troubleshooting
getting_started/faq
```
```{toctree}
:caption: Serving
:maxdepth: 1
serving/openai_compatible_server
serving/deploying_with_docker
serving/deploying_with_k8s
serving/deploying_with_helm
serving/deploying_with_nginx
serving/distributed_serving
serving/metrics
serving/integrations
serving/tensorizer
serving/runai_model_streamer
serving/engine_args
serving/env_vars
serving/usage_stats
```
```{toctree}
:caption: Models
:maxdepth: 1
models/supported_models
models/generative_models
models/pooling_models
models/supported_models
models/extensions/index
```
```{toctree}
@ -99,7 +81,6 @@ models/pooling_models
features/quantization/index
features/lora
features/multimodal_inputs
features/tool_calling
features/structured_outputs
features/automatic_prefix_caching
@ -108,6 +89,32 @@ features/spec_decode
features/compatibility_matrix
```
```{toctree}
:caption: Inference and Serving
:maxdepth: 1
serving/offline_inference
serving/openai_compatible_server
serving/multimodal_inputs
serving/distributed_serving
serving/metrics
serving/engine_args
serving/env_vars
serving/usage_stats
serving/integrations/index
```
```{toctree}
:caption: Deployment
:maxdepth: 1
deployment/docker
deployment/k8s
deployment/nginx
deployment/frameworks/index
deployment/integrations/index
```
```{toctree}
:caption: Performance
:maxdepth: 1