2024.11.21 Daily AI Papers | 4-bit attention delivers significant speedups; video generation benchmark offers comprehensive evaluation


Description

The 8 papers in this episode are:
[00:28] ⚡ SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration
[01:10] 📹 VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models
[01:51] 🎮 VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation
[02:33] 🎯 SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory
[03:10] 🌐 Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents
[03:52] 🔄 When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training
[04:34] 🎨 Stylecodes: Encoding Stylistic Information For Image Generation
[05:11] 🩺 ORID: Organ-Regional Information Driven Framework for Radiology Report Generation

Follow us: you can also find us on the following platform for more information beyond the podcast content.
Xiaohongshu: AI速递

More Episodes


2024.11.20 Daily AI Papers | Faster image generation, innovation in language model datasets

The 7 papers in this episode are:
[00:33] ⚡ Continuous Speculative Decoding for Autoregressive Image Generation
[01:14] 📚 RedPajama: an Open Dataset for Training Large Language Models
[01:58] 🤖 Soft Robotic Dynamic In-Hand Pen Spinning
[02:39] 🚀 ITACLIP: Boosting...

Published 11/20/24

HuggingFace 每日AI论文速递 (HuggingFace Daily AI Papers Express)


2024.11.19 Daily AI Papers | Efficient deployment on mobile devices, virtual exploration for embodied AI

The 16 papers in this episode are:
[00:25] 📱 BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices
[01:06] 🌍 Generative World Explorer
[01:43] 🔍 Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of...

Published 11/19/24