AI Paper Review: Language Models are Unsupervised Multitask Learners (GPT-2)
GPT-2 demonstrated that a large language model trained solely on unsupervised next-word prediction acquires multitask capabilities: it can perform translation, question answering, and summarization in a zero-shot setting, without any task-specific fine-tuning.
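Why it was selected: GPT-2 was trained on text from 8 million web pages, has up to 1.5 billion parameters, and was the first to demonstrate zero-shot transfer capability.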
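
To make "zero-shot" concrete: no weights are updated for a task. Instead, the task is expressed as text, and the model simply continues it. The sketch below only builds such prompts and does not call any model. The "TL;DR:" suffix and the "english sentence = french sentence" pattern follow the task framing described in the GPT-2 paper. The function names and the "Q:"/"A:" layout are hypothetical, chosen here for illustration.

```python
from typing import List, Tuple


def summarization_prompt(article: str) -> str:
    """Append the 'TL;DR:' hint so a language model continues with a summary."""
    return f"{article.strip()}\nTL;DR:"


def translation_prompt(pairs: List[Tuple[str, str]], sentence: str) -> str:
    """Condition on 'english = french' example pairs, then leave the last one open."""
    lines = [f"{english} = {french}" for english, french in pairs]
    lines.append(f"{sentence} =")
    return "\n".join(lines)


def qa_prompt(context: str, question: str) -> str:
    """Hypothetical 'Q:'/'A:' layout (an assumption, not the paper's exact format)."""
    return f"{context.strip()}\nQ: {question.strip()}\nA:"


if __name__ == "__main__":
    print(summarization_prompt("GPT-2 was trained on text from 8 million web pages."))
    print(translation_prompt([("good morning", "bonjour")], "thank you"))
    print(qa_prompt("Paris is the capital of France.", "What is the capital of France?"))
```

Each prompt would be passed to a pretrained language model, and the text it generates next is read off as the answer. The same next-word predictor serves all three tasks; only the input text changes.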




