Publication | Open Access

Mitigating the Alignment Tax of RLHF

Citations: 15

References: 0

Year: 2024

Abstract

Yong Lin, Hangyu Lin, Wei Xiong, Shizhe Diao, Jianmeng Liu, Jipeng Zhang, Rui Pan, Haoxiang Wang, Wenbin Hu, Hanning Zhang, Hanze Dong, Renjie Pi, Han Zhao, Nan Jiang, Heng Ji, Yuan Yao, Tong Zhang. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. 2024.