From 17af6aa0da5e825007d1d32cf6df8b3c78c5558d Mon Sep 17 00:00:00 2001
From: jinghanhu <359577025@qq.com>
Date: Sat, 25 Oct 2025 04:31:50 +0800
Subject: [PATCH] [Document] Add ms-swift library to rlhf.md (#27469)

---
 docs/training/rlhf.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/training/rlhf.md b/docs/training/rlhf.md
index b207c9ed37..0b7e384dc8 100644
--- a/docs/training/rlhf.md
+++ b/docs/training/rlhf.md
@@ -5,6 +5,7 @@ Reinforcement Learning from Human Feedback (RLHF) is a technique that fine-tunes
 The following open-source RL libraries use vLLM for fast rollouts (sorted alphabetically and non-exhaustive):
 
 - [Cosmos-RL](https://github.com/nvidia-cosmos/cosmos-rl)
+- [ms-swift](https://github.com/modelscope/ms-swift/tree/main)
 - [NeMo-RL](https://github.com/NVIDIA-NeMo/RL)
 - [Open Instruct](https://github.com/allenai/open-instruct)
 - [OpenRLHF](https://github.com/OpenRLHF/OpenRLHF)