2024.11.21 每日AI论文 | 4比特注意力加速显著,视频生成基准全面评估。
Listen now
Description
本期的 8 篇论文如下: [00:28] ⚡ SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration(SageAttention2技术报告:用于即插即用推理加速的精确4比特注意力机制) [01:10] 📹 VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models(VBench++:全面且多功能的视频生成模型基准套件) [01:51] 🎮 VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation(视频自动竞技场:通过用户模拟评估大型多模态模型在视频分析中的能力) [02:33] 🎯 SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory(SAMURAI:利用运动感知记忆机制将分割模型适应于零样本视觉跟踪) [03:10] 🌐 Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents(你的LLM是否秘密地成为互联网的世界模型?基于模型的网络代理规划) [03:52] 🔄 When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training(精度与位置的碰撞:BFloat16在长上下文训练中破坏了RoPE) [04:34] 🎨 Stylecodes: Encoding Stylistic Information For Image Generation(风格编码:为图像生成编码风格信息) [05:11] 🩺 ORID: Organ-Regional Information Driven Framework for Radiology Report Generation(器官-区域信息驱动的放射报告生成框架) 【关注我们】 您还可以在以下平台找到我们,获得播客内容以外更多信息 小红书: AI速递
More Episodes
本期的 7 篇论文如下: [00:33] ⚡ Continuous Speculative Decoding for Autoregressive Image Generation(自回归图像生成的连续推测解码) [01:14] 📚 RedPajama: an Open Dataset for Training Large Language Models(红睡衣:用于训练大型语言模型的开放数据集) [01:58] 🤖 Soft Robotic Dynamic In-Hand Pen Spinning(软体机器人动态手内笔旋转) [02:39] 🚀 ITACLIP: Boosting...
Published 11/20/24
本期的 16 篇论文如下: [00:25] 📱 BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices(BlueLM-V-3B:移动设备上多模态大语言模型的算法与系统协同设计) [01:06] 🌍 Generative World Explorer(生成世界探索者) [01:43] 🔍 Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of...
Published 11/19/24