DAMO-ConvAI/rlp
langhaobeijing 611141c7e9
Update README.md
2024-03-28 17:25:03 +08:00

Fine-Tuning Language Models with Reward Learning on Policy

The code and data will be added soon.