Description
The 16 papers in this episode are:
[00:25] 📱 BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices
[01:06] 🌍 Generative World Explorer
[01:43] 🔍 Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
[02:24] 🎥 AnimateAnything: Consistent and Controllable Animation for Video Generation
[03:08] 🧠 Top-$nσ$: Not All Logits Are You Need
[03:55] 🧠 Awaker2.5-VL: Stably Scaling MLLMs with Parameter-Efficient Mixture of Experts
[04:40] ⚡ SmoothCache: A Universal Inference Acceleration Technique for Diffusion Transformers
[05:19] 📚 Drowning in Documents: Consequences of Scaling Reranker Inference
[06:00] 🩺 Comprehensive and Practical Evaluation of Retrieval-Augmented Generation Systems for Medical Question Answering
[06:37] 📱 SlimLM: An Efficient Small Language Model for On-Device Document Assistance
[07:19] 🎥 VeGaS: Video Gaussian Splatting
[07:50] 🔄 Adaptive Decoding via Latent Preference Optimization
[08:27] 🎥 StableV2V: Stablizing Shape Consistency in Video-to-Video Editing
[09:11] 🇩🇪 LLäMmlein: Compact and Competitive German-Only Language Models from Scratch
[09:43] 👕 FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on
[10:18] 📜 Evaluating the role of 'Constitutions' for learning from AI feedback
Follow Us
You can also find us on the following platform for more information beyond the podcast content:
Xiaohongshu: AI速递