[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"$fSfKN5oEXklkjfvMuYBN_HseQN-3isB2_HYFTsgHfWj4":3},{"code":4,"msg":5,"data":6},200,"操作成功",{"id":7,"title":8,"content":9,"digest":10,"source":10,"coverPath":11,"thumbsCoverPath":12,"isTop":13,"isShow":14,"baseClick":13,"clickCount":15,"createTime":16,"typeId":17,"isNewest":18,"newsInfoTypeRespVo":19,"voiceUrl":22,"voiceSize":23,"taskId":24,"releaseTime":25,"titleEn":26,"contentEn":27,"voiceUrlEn":28,"taskIdEn":29,"voiceSizeEn":30},1220,"字节跳动开源 VeOmni 框架：提升多模态训练效率的新利器","\u003Cp>\u003Cstrong class=\"ql-lineHeight-1-75\" style=\"font-size: 18px; color: rgb(255, 153, 0);\">近日，字节跳动宣布开源其内部开发的 VeOmni 框架，这是一款专注于多模态模型训练的统一框架。随着人工智能技术的不断发展，特别是从单一语言模型向文本、图像和视频的多模态演进，算法工程师们在训练过程中面临诸多挑战，特别是训练流程的碎片化问题。为了应对这些困扰，VeOmni 应运而生。\u003C\u002Fstrong>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cimg alt=\"undefined\" src=\"https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F08\u002F76e2b9ce0ea0464eb2b6767c818dbe22\u002F6389079019275364517117051.png\" width=\"undefined\" height=\"undefined\" style=\"display: block; margin: auto;\">\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">VeOmni 由字节跳动的 Seed 团队与火山机器学习平台共同研发，旨在实现 “统一多模态、统一并行策略和统一算力底座” 的目标。该框架通过提供统一的 API，将多种混合并行策略整合到一个框架中，支持各种模型的快速训练。无论是大规模语言模型、视觉语言模型，还是视频生成模型，开发者都可以轻松上手。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">该框架具备显著的性能优化能力。例如，它通过显存计算的双优化策略，能够在保证显存充足的情况下，最大限度地减少额外计算开销。此外，VeOmni 还采用了多维并行体系，支持不同的并行原语，从而有效降低显存峰值。这些技术的结合，使得 VeOmni 在实际训练中表现出色，相比同类开源方案，其训练吞吐量提升了40% 以上。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">在蒸馏加速方面，VeOmni 也展现了其强大的优势。通过集成多种前沿的蒸馏技术，用户可以显著减少模型推理所需的步骤和资源消耗，从而加速模型的部署和应用。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">VeOmni 
框架的开源，不仅提升了字节跳动内部模型训练的效率，也为更多的 AI 研究者和开发者提供了一个强大的工具，助力多模态 AI 技术的发展。\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"color: rgb(187, 187, 187);\">【新闻来源】\u003C\u002Fspan>\u003Cspan style=\"color: rgb(187, 187, 187); font-size: 14px;\">AIbase基地 \u003C\u002Fspan>\u003Ca href=\"https:\u002F\u002Fnews.aibase.com\u002Fzh\u002Fnews\u002F20518\" rel=\"noopener noreferrer\" target=\"_blank\" style=\"color: rgb(187, 187, 187);\">https:\u002F\u002Fnews.aibase.com\u002Fzh\u002Fnews\u002F20518\u003C\u002Fa>\u003C\u002Fp>\u003Cp class=\"ql-align-justify\">\u003Cspan style=\"color: rgb(187, 187, 187);\">（本网转发此文章，旨在为读者提供更多的信息资讯，所涉内容不构成投资、消费建议。文章事实如有疑问，请与有关方核实，文章观点非本网观点，仅供读者参考。）\u003C\u002Fspan>\u003C\u002Fp>","","https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F08\u002F06adf11be73c499abfa01d0cb37680a9\u002FAI领域.jpg","https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F08\u002Fthumbs\u002F06adf11be73c499abfa01d0cb37680a9\u002FAI领域.jpg",0,1,217,"2025-08-14 18:49",2,false,{"id":17,"name":20,"enName":21},"芯位视野","Xinwei Vision","https:\u002F\u002Fxinwei-dev-test.oss-cn-shenzhen.aliyuncs.com\u002Fintelligent\u002Faudio%3Aafaedc47-96cb-43f3-9d12-7f18c1802cad%3A0.wav?Expires=1755172399&OSSAccessKeyId=LTAI5tNvY2RkKjZw4LLWsrPK&Signature=QcutrUZKtNpYBY5v%2FOlzJf%2BU%2BlM%3D",3378466,"afaedc47-96cb-43f3-9d12-7f18c1802cad","2025-08-14 18:47","ByteDance open-sources VeOmni framework: a new tool for improving multi-modal training efficiency","\u003Cp>\u003Cstrong class=\"ql-lineHeight-1-75\" style=\"font-size: 18px; color: rgb(255, 153, 0);\">ByteDance recently announced that it is open-sourcing VeOmni, an internally developed unified framework for multi-modal model training. 
As artificial intelligence moves from language-only models to multi-modal models spanning text, images, and video, algorithm engineers face many challenges during training, particularly fragmented training pipelines. VeOmni was created to address these problems.\u003C\u002Fstrong>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cimg alt=\"undefined\" src=\"https:\u002F\u002Fimage.51xinwei.com\u002F2025\u002F08\u002F76e2b9ce0ea0464eb2b6767c818dbe22\u002F6389079019275364517117051.png\" width=\"undefined\" height=\"undefined\" style=\"display: block; margin: auto;\">\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">VeOmni was jointly developed by ByteDance's Seed team and the Volcano Machine Learning Platform, with the goal of \"unified multi-modality, unified parallel strategies, and unified computing infrastructure\". It exposes a unified API that brings multiple hybrid parallel strategies into a single framework and supports fast training across model types. Developers can get started easily whether they are training a large language model, a vision-language model, or a video generation model.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">The framework offers significant performance optimizations. For example, its joint memory-and-compute optimization strategy minimizes additional computational overhead while keeping GPU memory sufficient. VeOmni also adopts a multi-dimensional parallelism system that supports different parallel primitives, which effectively reduces peak memory usage. 
Together, these techniques let VeOmni perform well in real training runs, with training throughput more than 40% higher than comparable open-source solutions.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">VeOmni also has an advantage in distillation acceleration. By integrating a range of cutting-edge distillation techniques, it lets users significantly reduce the number of steps and the resources required for model inference, which speeds up model deployment and application.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"font-size: 18px;\" class=\"ql-lineHeight-1-75\">Open-sourcing VeOmni not only improves the efficiency of model training inside ByteDance, but also gives more AI researchers and developers a powerful tool, helping advance multi-modal AI technology.\u003C\u002Fspan>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cbr>\u003C\u002Fp>\u003Cp>\u003Cspan style=\"color: rgb(187, 187, 187);\">【News Source】\u003C\u002Fspan>\u003Cspan style=\"color: rgb(187, 187, 187); font-size: 14px;\">AIbase \u003C\u002Fspan>\u003Ca href=\"https:\u002F\u002Fnews.aibase.com\u002Fzh\u002Fnews\u002F20518\" rel=\"noopener noreferrer\" target=\"_blank\" style=\"color: rgb(187, 187, 187);\">https:\u002F\u002Fnews.aibase.com\u002Fzh\u002Fnews\u002F20518\u003C\u002Fa>\u003C\u002Fp>\u003Cp class=\"ql-align-justify\">\u003Cspan style=\"color: rgb(187, 187, 187);\">(This article is republished by this website to provide readers with more information. Its content does not constitute investment or consumer advice. 
If there are any questions about the facts of the article, please verify with the relevant parties. The views expressed in the article are not the views of this website, and are for reference only.)\u003C\u002Fspan>\u003C\u002Fp>","https:\u002F\u002Fxinwei-dev-test.oss-cn-shenzhen.aliyuncs.com\u002Fintelligent\u002Faudio%3Ae68a789f-28a3-419e-9420-b35082d230d8%3A0.wav?Expires=1774838496&OSSAccessKeyId=LTAI5tNvY2RkKjZw4LLWsrPK&Signature=3xmI6QWuEI%2FIMeFIoeUlrP6VwkE%3D","e68a789f-28a3-419e-9420-b35082d230d8",4440378]