Hugging Face Blog
AI
LLM

Mixture of Experts (MoEs) in Transformers

Aritra Roy Gosthipaty, Pedro Cuenca, merve, Ilyas Moutawwakil, Arthur Zucker, Sergio Paniego, Pablo Montalvo
Published
2026/2/26 08:00:00
Source type
blog
Language
en
Abstract

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

Resource links

Careers: apply.workable.com/huggingface
torch.bmm: docs.pytorch.org/docs/stable/generated/torch.bmm.html
torch._grouped_mm: docs.pytorch.org...enerated/torch.nn.functional.grouped_mm.html
bench-v4 branch: github.com/ariG23498/transformers/tree/bench-v4
bench-v5 branch: github.com/ariG23498/transformers/tree/bench-v5
Update on GitHub: github.com...ggingface/blog/blob/main/moe-transformers.md
SplitModulelist: github.com...0a4ed/src/transformers/core_model_loading.py
GroupedGemmParallel: github.com...transformers/integrations/tensor_parallel.py
RouterParallel: github.com...transformers/integrations/tensor_parallel.py
MergeModulelist: github.com.../main/src/transformers/core_model_loading.py
weight loading refactor: github.com/huggingface/transformers/pull/41580
PR #42697: github.com/huggingface/transformers/pull/42697
this blog: huggingface.co/blog/moe
MiniMax M2: huggingface.co/collections/MiniMaxAI/minimax-m2
Qwen 3.5: huggingface.co/collections/Qwen/qwen35
Kimi K2.5: huggingface.co/collections/moonshotai/kimi-k25
gpt-oss models: huggingface.co/collections/openai/gpt-oss
GLM-5: huggingface.co/collections/zai-org/glm-5
Dataset page (expert_backend.png): huggingface.co...ain/blog/moe-transformers/expert_backend.png
Dataset page (faster_training.png): huggingface.co...in/blog/moe-transformers/faster_training.png
Dataset page (loading_benchmark.png): huggingface.co.../blog/moe-transformers/loading_benchmark.png
Dataset page (moe_2y_timeline.png): huggingface.co...in/blog/moe-transformers/moe_2y_timeline.png
Dataset page (moe_routing.png): huggingface.co...e/main/blog/moe-transformers/moe_routing.png
DeepSeek R1: huggingface.co/deepseek-ai/DeepSeek-R1
DeepSeek V2: huggingface.co/deepseek-ai/DeepSeek-V2
DeepSeek-V3 checkpoint index: huggingface.co...eek-V3/raw/main/model.safetensors.index.json
Experts Backend system: huggingface.co/docs/transformers/experts_interface
WeightConverter: huggingface.co...ansformers/main/en/internal/weight_converter
AutoModelForCausalLM.from_pretrained("model_id"): huggingface.co/docs/transformers/main/en/model_doc/auto
generic WeightConverter: huggingface.co/docs/transformers/main/en/weightconverter
grouped GEMMs and fused MoE implementations: huggingface.co/kernels-community/megablocks
Mixtral-8x7B: huggingface.co/mistralai/Mixtral-8x7B-v0.1
gpt-oss-20b: huggingface.co/openai/gpt-oss-20b
Scaling laws: huggingface.co/papers/2001.08361
OLMoE: Open Mixture-of-Experts Language Models: huggingface.co/papers/2409.02060
Maarten Grootendorst: newsletter.maartengrootendorst.com/p/a-visual-guide-to-mixture-of-experts
ULMFiT: nlp.fast.ai...ification/2018/05/15/introducing-ulmfit.html
Unsloth's official guide: unsloth.ai/docs/new/faster-moe
rumored: x.com/soumithchintala/status/1671267150101721090
YouTube video on routing: youtu.be/CDnkFbW-uEQ
Original source page: huggingface.co/blog/moe-transformers
Metadata

Source: Hugging Face Blog
Type: blog
Extraction status: raw
Keywords
AI
LLM