AI Paper Review: Language Models are Unsupervised Multitask Learners (GPT-2)
freeCodeCamp.org, 3,193 words (about 13 minutes)
92
GPT-2 demonstrated that training a large language model solely on unsupervised next-word prediction enables emergent multitask capabilities, performing translation, QA, and summarization without task-specific fine-tuning.
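Why it was selected: GPT-2 was trained on text from 8 million web pages, has 1.5 billion parameters, and was the first to demonstrate zero-shot transfer.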
Featured Article #GPT-2 #Large Language Models #Zero-Shot Learning #Transformer (English)

