[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"$fkr_RelK1TKxX7SCS1UjZfyj4mgL5sSrHoVoZo9bncx0":3},{"code":4,"msg":5,"data":6},200,"操作成功",{"id":7,"title":8,"content":9,"digest":10,"source":10,"coverPath":11,"thumbsCoverPath":12,"isTop":13,"isShow":14,"baseClick":13,"clickCount":15,"createTime":16,"typeId":17,"isNewest":18,"newsInfoTypeRespVo":19,"voiceUrl":22,"voiceSize":23,"taskId":24,"releaseTime":25,"titleEn":26,"contentEn":27,"voiceUrlEn":28,"taskIdEn":29,"voiceSizeEn":30},1416,"230个大模型在婴幼儿认知题上集体翻车！揭秘多模态大模型的核心知识缺陷 ","\u003Cp>\u003Cstrong class=\"ql-lineHeight-1-75\" style=\"font-size: 18px; color: rgb(255, 153, 0);\">一篇被Yann LeCun转发的ICML 2025研究给了多模态大模型当头一棒——\u003C\u002Fstrong>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cstrong class=\"ql-lineHeight-1-75\" style=\"font-size: 18px; color: rgb(255, 153, 0);\">大部分AI在复杂任务上表现很好，但在人类从小就会的基础认知能力上却很拉垮。\u003C\u002Fstrong>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">研究者建了测评题库CoreCognition，覆盖在人类婴幼儿阶段即出现的12种核心认知能力（如客体永恒、视角采择、直觉物理、知觉恒常等），用来对模型进行系统性测试。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">在CoreCognition基准的1503道“经典发展心理学测验”上，230个主流模型系统暴露出对世界常识的“核心知识盲区”。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">在归一化准确率对比中，多模态大模型在基础核心认知能力上普遍落后，差距往往达到两位数，即便规模更大也难以弥补。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">这是否意味着MLLM（多模态大模型）的先天认知结构中，缺少那些支撑早期人类学习的基础知识机制？\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">也就是说，它们是否缺乏“core knowledge”（核心认知能力）？\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cstrong class=\"ql-lineHeight-1-75\" style=\"font-size: 18px; color: 
rgb(255, 153, 0);\">构建CoreCognition Benchmark\u003C\u002Fstrong>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cimg alt=\"undefined\" src=\"https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F10\u002Fef6229b26ae048d487f41001380c34ad\u002Fbb2713f16f5442bcb1c2b90894722c1b.webp\" width=\"863\" height=\"undefined\" style=\"display: block; margin: auto;\">\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">来自加州大学圣地亚哥分校、约翰霍普金斯大学、埃默里大学、北卡罗来纳大学教堂山分校、斯坦福大学、卡内基梅隆大学等机构的研究人员，花费一年时间构造并开源了业界首个核心认知基准CoreCognition。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">基准围绕发展心理学与皮亚杰分层框架，覆盖从连续性到机械推理\u003C\u002Fspan>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">12 项核心认知概念\u003C\u002Fstrong>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">，共\u003C\u002Fspan>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">1503\u003C\u002Fstrong>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">道多模态题目，\u003C\u002Fspan>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">每类≥95例\u003C\u002Fstrong>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">，含图像与视频。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cimg alt=\"undefined\" src=\"https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F10\u002Fb9d7981a46e64729880b0bb4a8520467\u002Fea88f303e4834123b7c5fda41058089a.webp\" width=\"698\" height=\"undefined\" style=\"display: block; margin: auto;\">\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">研究团队在设计题目时遵循以下高标准：\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">判别性强\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" 
class=\"ql-lineHeight-1-75\">不具备目标核心知识的模型在逻辑上更易选择错误选项。\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">最小混淆\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">题目尽量仅依赖待测概念完成推理，剔除与其他核心知识或外部能力的耦合，避免跨概念干扰。\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">无文本捷径\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">所有题目必须联合利用图像与文本才能得出正确答案。\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">所有数据由12位具备认知科学、计算机科学或统计学背景的高年级本科或研究生协作完成标注与审核，经过两轮交叉验证和Amazon Mechanical Turk人工校验。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cstrong class=\"ql-lineHeight-1-75\" style=\"font-size: 18px; color: rgb(255, 153, 0);\">干预测试揭示“假理解”陷阱\u003C\u002Fstrong>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">为了进一步验证模型是否真的掌握核心概念，研究团队提出了\u003C\u002Fspan>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Concept Hacking（概念干预）&nbsp;\u003C\u002Fstrong>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">方法：通过构造“对照组”（control）与“干预组”（manipulated），故意在测试任务中反转与核心知识相关的关键特征，但保持其余细节一致，检测模型是否真正理解概念还是走捷径。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cimg alt=\"undefined\" src=\"https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F10\u002F0d9261a5e1f94711bc78ace965f4bbde\u002Fdbe0019688124e6bb3cc02e4fc08063a.webp\" width=\"undefined\" height=\"undefined\" style=\"display: block; margin: auto;\">\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">例如其中的Intuitive 
Physics测试：\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">原版题\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">同时释放两颗小球，哪一个会先落地？考察基础直觉物理（相同释放高度、忽略空气阻力时，自由落体等时到地）。\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">孪生版\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">保持大小不变，但改变释放高度，用以检验模型是否真正依据高度差\u002F落地时间推断，而非套用“同时落地”的固定模板。\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">人类表现\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">两题均能作对，能根据高度改变及时更新判断。\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">模型表现\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">原题作对（选C），孪生版仍沿用旧模式选C，直接翻车——暴露出对表面模板的依赖，而非对落体规律的真实理解。\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cstrong style=\"font-size: 18px; color: rgb(255, 153, 0);\">五大关键发现\u003C\u002Fstrong>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">一、在与人类早期认知直接相关的低层能力（如边界感、连续性、客体永恒、空间性、视角采择等）上，模型显著落后于高层能力（如意向理解、工具使用、机械推理），与人类各层稳定高分的模式明显不同。这表明\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">当前MLLMs在人类早期即具备的基础“核心知识”上存在系统性短板\u003C\u002Fstrong>\u003Cspan style=\"font-size: 18px;\" 
class=\"ql-lineHeight-1-75\">。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cimg alt=\"undefined\" src=\"https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F10\u002F602c4a099a614795b4158759aec249fe\u002F4d5f62982abf4722812ef6b786d88db2.webp\" width=\"undefined\" height=\"undefined\" style=\"display: block; margin: auto;\">\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">二、关联性矩阵显示，高层能力族内关联较强，底层能力Permanence\u002FSpatiality\u002FContinuity与高层能力相关性普遍偏弱。说明模型缺乏人类由低到高的脚手架式认知发展结构，模型的高级感知与推理并不是建立在基础的认知能力上的。这也能解释为什么模型出现鲁棒性缺陷。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cimg alt=\"undefined\" src=\"https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F10\u002Fe2ecc0e9909b4392963f3c5fbae156fd\u002F89d0242e4a50457c9f419c43237eae43.webp\" width=\"undefined\" height=\"undefined\" style=\"display: block; margin: auto;\">\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">三、研究团队将三阶段12个核心能力的得分与26个公开基准做相关性分析，结果表明除Perspective和Intuitive Physics外，大多数核心能力与公开基准（除ChartQA）及高层能力显著正相关。这表明核心知识越强，上层任务越稳。而Perspective和Intuitive Physics能力作为人类高级推理的基础展现出的低相关性，与我们之前在关系矩阵里看到的模式一致，这正是现有模型核心知识缺陷的直接证据。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">四、基于230个模型拟合“规模—表现”的回归斜率显示，低层能力随规模提升改善显著更少或几乎不变；其中Perspective-taking甚至出现反向规模效应（模型越大越差）。增加模型规模主要利好高层能力，对低层核心能力帮助有限甚至为负。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cimg alt=\"undefined\" src=\"https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F10\u002Ff4f0061566e8457b99a8cbd6f45a9fc7\u002F1d28ca033d244ccc8705d0da6473f664.webp\" width=\"undefined\" height=\"undefined\" style=\"display: block; margin: auto;\">\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">五、Concept 
Hacking实验结果显示，大模型相较小模型整体并未取得提升，部分情形甚至更差。这说明单靠扩规模不足以消除对捷径的依赖，也难以获得稳健的核心知识。直观上，模型并非“越大越懂”，而是越大越善于投机。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">结合结果图中的信息，模型可归纳为四类：\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">核心知识型\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">控制题与操纵题均表现良好（接近人类水平，但样本占比极少），说明具备稳健的核心概念理解与迁移。\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">捷径依赖型\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">控制题得分高、操纵题显著下降，提示主要依赖表面线索或训练相似性，缺乏对概念要素的因果把握。\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">核心缺陷型\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">控制题即低于或接近偶然水平，操纵题亦无稳定收益，反映基础“核心知识”不足。\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">偶然型\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">控制题与操纵题均近似随机波动，整体不可依赖（更多体现噪声与运气）。\u003C\u002Fspan>\u003C\u002Fli>\u003Cli>\u003Cbr>\u003C\u002Fli>\u003C\u002Ful>\u003Cimg alt=\"undefined\" src=\"https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F10\u002Fb9a2c186f86a45f6b50ce0408a842b20\u002Fcb3fe127faed4dfca87ffdb072acca7a.webp\" width=\"undefined\" height=\"undefined\" style=\"display: block; margin: auto;\">\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cstrong style=\"font-size: 18px;\" 
class=\"ql-lineHeight-1-75\">认知指令带来短期增益，但难以弥补底层缺口\u003C\u002Fstrong>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">对比推理模型与其对应非推理版本模型性能显示，推理模型多数核心能力任务未见显著提升，症结不在“会不会用推理”，而在底层表征是否具备，即预训练阶段对核心知识的覆盖与结构化不足。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cimg alt=\"undefined\" src=\"https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F10\u002Fde8e591c4fcd4157b6a7b01cd3e42015\u002F7dba64fa3ad843af8dbceaaa84aedda0.webp\" width=\"842\" height=\"undefined\" style=\"display: block; margin: auto;\">\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">与此同时，研究团队发现，引入认知指令（在题目前明确提示相关概念，如perspective taking）可带来约6%的即刻增益，提示模型内部可能分布式存有相关线索，但缺少有效的检索与调用机制。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">然而，此类做法在真实场景中可获得性与可用性受限，实际应用往往无法提供如此明确的概念标签来引导模型。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">在引人注目的“能写会画”之外，真正的智能首先取决于对世界最朴素规则的把握。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">这项研究说明：参数堆叠并不等于理解，地基是否扎实才是关键。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">与其一味追求“更大、更强”，不如换个起点：先把核心知识补齐，让模型学会在变化、多样与噪声中保持一致的常识判断与因果直觉。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">简单说就是：先长地基，再长楼层；规模是加法，核心认知是乘法。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan class=\"ql-lineHeight-1-75\" style=\"color: rgb(187, 187, 
187);\">论文地址：https:\u002F\u002Farxiv.org\u002Fabs\u002F2410.10855\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cspan class=\"ql-lineHeight-1-75\" style=\"color: rgb(187, 187, 187);\">Website：https:\u002F\u002Fgrow-ai-like-a-child.github.io\u002Fcore-knowledge\u002F\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cspan class=\"ql-lineHeight-1-75\" style=\"color: rgb(187, 187, 187);\">Dataset：https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fwilliamium\u002FCoreCognition\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"color: rgb(187, 187, 187);\">【新闻来源】量子位 | 公众号 QbitAI \u003C\u002Fspan>\u003Ca href=\"https:\u002F\u002Fwww.sohu.com\u002Fa\u002F942435345_610300?_trans_=000020_cozecj\" rel=\"noopener noreferrer\" target=\"_blank\" style=\"color: rgb(187, 187, 187);\">https:\u002F\u002Fwww.sohu.com\u002Fa\u002F942435345_610300?_trans_=000020_cozecj\u003C\u002Fa>\u003C\u002Fp>\u003Cp class=\"ql-align-justify\">\u003Cspan style=\"color: rgb(187, 187, 187);\">（本网转发此文章，旨在为读者提供更多的信息资讯，所涉内容不构成投资、消费建议。文章事实如有疑问，请与有关方核实，文章观点非本网观点，仅供读者参考。）\u003C\u002Fspan>\u003C\u002Fp>\u003Cp class=\"ql-align-justify\">\u003Cbr>\u003C\u002Fp>","","https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F10\u002F136cca0a1a0d44ad94c85bf06d6e216e\u002FAI领域.jpg","https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F10\u002Fthumbs\u002F136cca0a1a0d44ad94c85bf06d6e216e\u002FAI领域.jpg",0,1,69,"2025-10-11 18:20",2,false,{"id":17,"name":20,"enName":21},"芯位视野","Xinwei 
Vision","https:\u002F\u002Fxinwei-dev-test.oss-cn-shenzhen.aliyuncs.com\u002Fintelligent\u002Faudio%3Af1f3ab97-d104-4fd5-b77f-3b3db25a2997%3A0.wav?Expires=1760185231&OSSAccessKeyId=LTAI5tNvY2RkKjZw4LLWsrPK&Signature=9H6Wmw%2B1MlKwcApZy3DxjOV63iI%3D",12271328,"f1f3ab97-d104-4fd5-b77f-3b3db25a2997","2025-10-11 18:11","230 large models failed collectively in infant cognitive questions! Revealing the core knowledge deficiencies of multimodal large models","\u003Cp>\u003Cstrong class=\"ql-lineHeight-1-75\" style=\"font-size: 18px; color: rgb(255, 153, 0);\">A study from ICML 2025 that was retweeted by Yann LeCun gave multimodal large models a hard blow——\u003C\u002Fstrong>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cstrong class=\"ql-lineHeight-1-75\" style=\"font-size: 18px; color: rgb(255, 153, 0);\">Most AI perform well in complex tasks, but they are very poor in basic cognitive abilities that humans learn from a young age.\u003C\u002Fstrong>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Researchers built an evaluation question bank called CoreCognition, covering 12 core cognitive abilities that appear during the human infant stage (such as object permanence, perspective taking, intuitive physics, perceptual constancy, etc.), to conduct systematic testing on the models.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">On the 1503 \"classic developmental psychology tests\" in the CoreCognition benchmark, 230 mainstream model systems exposed \"core knowledge blind spots\" about the world's common sense.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">In normalized accuracy comparisons, multimodal large models generally lagged behind in basic core cognitive abilities, with gaps often reaching double digits, and 
even larger models could not compensate.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Does this mean that MLLMs (multimodal large models) lack the fundamental knowledge mechanisms that support early human learning in their innate cognitive structures?\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">In other words, do they lack \"core knowledge\" (core cognitive abilities)?\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cstrong class=\"ql-lineHeight-1-75\" style=\"font-size: 18px; color: rgb(255, 153, 0);\">Building CoreCognition Benchmark\u003C\u002Fstrong>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cimg alt=\"undefined\" src=\"https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F10\u002Fef6229b26ae048d487f41001380c34ad\u002Fbb2713f16f5442bcb1c2b90894722c1b.webp\" width=\"863\" height=\"undefined\" style=\"display: block; margin: auto;\">\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Researchers from institutions such as the University of California, San Diego, Johns Hopkins University, Emory University, the University of North Carolina at Chapel Hill, Stanford University, and Carnegie Mellon University spent a year constructing and open-sourcing the industry's first core cognition benchmark, CoreCognition.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">The benchmark revolves around developmental psychology and Piaget's hierarchical framework, covering 12 core cognitive concepts, including continuity and mechanical reasoning, with a total of 1503 multimodal questions, with at least 95 examples per category, including images and videos.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cimg 
alt=\"undefined\" src=\"https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F10\u002Fb9d7981a46e64729880b0bb4a8520467\u002Fea88f303e4834123b7c5fda41058089a.webp\" width=\"698\" height=\"undefined\" style=\"display: block; margin: auto;\">\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">When designing the questions, the research team followed the following high standards:\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Discriminative\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Models without the target core knowledge are logically more likely to choose the wrong option.\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Minimal Confusion\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Questions rely as much as possible only on the concept being tested for reasoning, removing coupling with other core knowledge or external abilities, to avoid cross-concept interference.\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">No Text Shortcut\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">All questions must be solved by combining images and text to get the correct answer.\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">All data were annotated and reviewed by 12 senior undergraduates or graduate students with backgrounds in cognitive science, computer science, or statistics, undergoing two rounds of 
cross-validation and Amazon Mechanical Turk manual verification.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cstrong class=\"ql-lineHeight-1-75\" style=\"font-size: 18px; color: rgb(255, 153, 0);\">Intervention Tests Reveal the \"False Understanding\" Trap\u003C\u002Fstrong>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">To further verify whether the models truly understand the core concepts, the research team proposed the Concept Hacking method: by constructing \"control groups\" (control) and \"intervention groups\" (manipulated), intentionally reversing key features related to core knowledge in the test tasks while keeping other details consistent, to detect whether the model truly understands the concept or is just taking shortcuts.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cimg alt=\"undefined\" src=\"https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F10\u002F0d9261a5e1f94711bc78ace965f4bbde\u002Fdbe0019688124e6bb3cc02e4fc08063a.webp\" width=\"undefined\" height=\"undefined\" style=\"display: block; margin: auto;\">\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">For example, the Intuitive Physics test:\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Original version\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Release two small balls at the same time, which one will fall first? 
It tests basic intuitive physics (released from the same height, and ignoring air resistance, objects in free fall reach the ground at the same time).\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Twin version\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">The ball sizes stay the same but the release heights differ, testing whether the model actually reasons from the height difference and landing time rather than applying a fixed \"they land at the same time\" template.\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Human performance\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Humans answer both questions correctly, promptly updating their judgment when the release height changes.\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Model performance\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">The model answers the original question correctly (choice C) but still chooses C on the twin version and fails, exposing a dependence on surface templates rather than a real understanding of how objects fall.\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cstrong style=\"font-size: 18px; color: rgb(255, 153, 0);\">Five Key Findings\u003C\u002Fstrong>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">First, on low-level abilities directly related to early human cognition (such as boundary awareness, continuity, object permanence, spatiality, 
perspective taking, etc.), models perform significantly worse than on high-level abilities (such as intention understanding, tool use, and mechanical reasoning), in clear contrast to humans, who score consistently high across all levels. This indicates that\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">current MLLMs have systematic shortcomings in the fundamental \"core knowledge\" that humans possess early on\u003C\u002Fstrong>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cimg alt=\"undefined\" src=\"https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F10\u002F602c4a099a614795b4158759aec249fe\u002F4d5f62982abf4722812ef6b786d88db2.webp\" width=\"undefined\" height=\"undefined\" style=\"display: block; margin: auto;\">\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Second, the correlation matrix shows strong correlations within the family of high-level abilities, while the low-level abilities Permanence\u002FSpatiality\u002FContinuity correlate only weakly with high-level abilities. This suggests that models lack the scaffolded, low-to-high structure of human cognitive development: their advanced perception and reasoning are not built on basic cognitive abilities. 
This could also explain why models show robustness deficits.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cimg alt=\"undefined\" src=\"https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F10\u002Fe2ecc0e9909b4392963f3c5fbae156fd\u002F89d0242e4a50457c9f419c43237eae43.webp\" width=\"undefined\" height=\"undefined\" style=\"display: block; margin: auto;\">\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Third, the research team correlated scores on the 12 core abilities, grouped into three stages, with 26 public benchmarks. Except for Perspective and Intuitive Physics, most core abilities are significantly positively correlated with the public benchmarks (ChartQA excepted) and with high-level abilities, indicating that the stronger the core knowledge, the more stable the performance on higher-level tasks. Perspective and Intuitive Physics underpin advanced human reasoning, yet they show low correlation here, consistent with the pattern in the correlation matrix; this is direct evidence of the core knowledge deficit in current models.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Fourth, regression slopes of performance against scale, fitted across the 230 models, show that low-level abilities improve significantly less with scale, or barely at all; Perspective-taking even shows an inverse scaling effect (the larger the model, the worse the performance). 
Increasing model scale mainly benefits high-level abilities, with limited or even negative effects on low-level core abilities.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cimg alt=\"undefined\" src=\"https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F10\u002Ff4f0061566e8457b99a8cbd6f45a9fc7\u002F1d28ca033d244ccc8705d0da6473f664.webp\" width=\"undefined\" height=\"undefined\" style=\"display: block; margin: auto;\">\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Fifth, the Concept Hacking experiment results showed that large models did not achieve improvement overall compared to smaller models, and in some cases even performed worse. This indicates that simply scaling up is not enough to eliminate dependency on shortcuts or gain robust core knowledge. Intuitively, models are not \"bigger means wiser,\" but rather \"bigger means better at exploiting loopholes.\"\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Combined with the information in the result figures, models can be categorized into four types:\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Core Knowledge Type\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Perform well in both control and manipulation questions (close to human level, but with a small sample size), indicating robust core concept understanding and transferability.\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Shortcut Dependent Type\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">High scores on control questions, but significant decline on 
manipulation questions, suggesting reliance on surface cues or similarity to training data rather than a causal grasp of the concept's elements.\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Core Deficiency Type\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Scores on control questions are already at or below chance level, and manipulation questions show no stable gains either, reflecting insufficient basic \"core knowledge\".\u003C\u002Fspan>\u003C\u002Fli>\u003C\u002Ful>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Random Type\u003C\u002Fstrong>\u003C\u002Fli>\u003Cli>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Scores on both control and manipulation questions fluctuate near chance and are unreliable overall, reflecting mostly noise and luck.\u003C\u002Fspan>\u003C\u002Fli>\u003Cli>\u003Cbr>\u003C\u002Fli>\u003C\u002Ful>\u003Cimg alt=\"undefined\" src=\"https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F10\u002Fb9a2c186f86a45f6b50ce0408a842b20\u002Fcb3fe127faed4dfca87ffdb072acca7a.webp\" width=\"undefined\" height=\"undefined\" style=\"display: block; margin: auto;\">\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cstrong style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Cognitive instructions bring short-term gains, but cannot compensate for underlying gaps\u003C\u002Fstrong>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Comparing reasoning models with their non-reasoning counterparts shows no significant improvement on most core-ability tasks. The problem is not \"whether they can use reasoning\", but 
whether the underlying representations are sufficient, i.e., whether the core knowledge was covered and structured during the pre-training phase.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cimg alt=\"undefined\" src=\"https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F10\u002Fde8e591c4fcd4157b6a7b01cd3e42015\u002F7dba64fa3ad843af8dbceaaa84aedda0.webp\" width=\"842\" height=\"undefined\" style=\"display: block; margin: auto;\">\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">At the same time, the research team found that introducing cognitive instructions (explicitly naming the relevant concept, such as perspective taking, before the question) brings about a 6% immediate gain, suggesting that the model may hold distributed cues internally but lack an effective mechanism for retrieving and invoking them.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">However, this approach has limited practical use: in real applications it is often impossible to supply such explicit concept labels to guide the model.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Beyond the impressive \"writing and drawing\" capabilities, true intelligence first depends on grasping the most basic rules of the world.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">This study shows that stacking parameters does not equal understanding; what matters is whether the foundation is solid.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Instead of blindly pursuing \"bigger and stronger,\" it's better to change the starting point: first 
fill in the core knowledge, so that the model learns to maintain consistent common-sense judgment and causal intuition in changing, diverse, and noisy environments.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">In short, build the foundation first, then build the floors; scale is addition, core cognition is multiplication.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan class=\"ql-lineHeight-1-75\" style=\"color: rgb(187, 187, 187);\">Paper: https:\u002F\u002Farxiv.org\u002Fabs\u002F2410.10855\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cspan class=\"ql-lineHeight-1-75\" style=\"color: rgb(187, 187, 187);\">Website: https:\u002F\u002Fgrow-ai-like-a-child.github.io\u002Fcore-knowledge\u002F\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cspan class=\"ql-lineHeight-1-75\" style=\"color: rgb(187, 187, 187);\">Dataset: https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fwilliamium\u002FCoreCognition\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"color: rgb(187, 187, 187);\">【News source】 QbitAI | Official Account QbitAI \u003C\u002Fspan>\u003Ca href=\"https:\u002F\u002Fwww.sohu.com\u002Fa\u002F942435345_610300?_trans_=000020_cozecj\" rel=\"noopener noreferrer\" target=\"_blank\" style=\"color: rgb(187, 187, 187);\">https:\u002F\u002Fwww.sohu.com\u002Fa\u002F942435345_610300?_trans_=000020_cozecj\u003C\u002Fa>\u003C\u002Fp>\u003Cp class=\"ql-align-justify\">\u003Cspan style=\"color: rgb(187, 187, 187);\">（This article is reprinted by this website to provide readers with more information and news. 
The content does not constitute investment or consumer advice. If there are any questions about the facts of the article, please verify with the relevant parties. The views of the article are not the views of this website, and are for reference only.）\u003C\u002Fspan>\u003C\u002Fp>\u003Cp class=\"ql-align-justify\">\u003Cbr>\u003C\u002Fp>","https:\u002F\u002Fxinwei-dev-test.oss-cn-shenzhen.aliyuncs.com\u002Fintelligent\u002Faudio%3A98c5be73-1248-46f1-aaa9-3b893f78516c%3A0.wav?Expires=1774838460&OSSAccessKeyId=LTAI5tNvY2RkKjZw4LLWsrPK&Signature=x04vnJh%2Fu%2BOXWdJI%2B9Ne6dmnQdY%3D","98c5be73-1248-46f1-aaa9-3b893f78516c",16909570]