Improve your agent’s tool-calling accuracy with SFT and DPO on Amazon SageMaker AI | Amazon Web Services
Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) can improve the tool-calling accuracy of a small language model on Amazon SageMaker AI. SFT trains the model on high-quality examples of correct tool calls, while DPO uses human preference feedback to steer it toward better tool calls and away from worse ones.
Why it was selected: SFT and DPO can improve an AI agent's ability to choose the right tool when carrying out complex tasks.
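
To make the DPO step mentioned above more concrete, here is a minimal sketch of the standard DPO loss for a single preference pair. It is not taken from the AWS article: the tool names (`get_weather`, `search_web`), the example record, the log-probability values, and `beta=0.1` are all hypothetical, chosen only for illustration. A real pipeline would get the log-probabilities from the policy and reference models rather than hard-coding them.

```python
import math

# Hypothetical preference record: same prompt, one preferred and one
# dispreferred tool call. Field names and tool names are illustrative only.
example_pair = {
    "prompt": "What's the weather in Seattle right now?",
    "chosen": '{"tool": "get_weather", "arguments": {"city": "Seattle"}}',
    "rejected": '{"tool": "search_web", "arguments": {"query": "Seattle"}}',
}


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one pair: -log(sigmoid(beta * margin)).

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    """
    # How much more the policy prefers "chosen" over "rejected",
    # measured relative to the reference model.
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch for numerical stability.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Assumed (made-up) log-probabilities for the pair above.
loss = dpo_loss(policy_chosen_logp=-1.0, policy_rejected_logp=-5.0,
                ref_chosen_logp=-2.0, ref_rejected_logp=-2.0)
print(f"DPO loss for the example pair: {loss:.4f}")
```

The loss equals log 2 when the policy and reference agree, and it falls as the policy assigns relatively more probability to the preferred tool call than the reference does.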
