2024.11.13 Daily AI Papers | A New Framework for 3D Object Segmentation, a Multimodal Understanding and Generation Model
Description
The 6 papers in this episode:
[00:28] 🔍 SAMPart3D: Segment Any Part in 3D Objects
[01:06] 🌐 JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation
[01:42] 🤔 Stronger Models are NOT Stronger Teachers for Instruction Tuning
[02:21] 🌐 Wavelet Latent Diffusion (WaLa): Billion-Parameter 3D Generative Model with Compact Wavelet Encodings
[03:02] 📚 BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions
[03:55] 🔍 Hardware and Software Platform Inference

Follow us
You can also find us on the following platform for more information beyond the podcast:
Xiaohongshu: AI速递
More Episodes
The 5 papers in this episode: [00:41] TOP1(🔥93) | 🧠 LLaVA-o1: Let Vision Language Models Reason Step-by-Step [02:41] TOP2(🔥55) | 🌍 Generative World Explorer [05:00] TOP3(🔥44) | 🧠 Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference...
Published 11/23/24
The 14 papers in this episode: [00:26] 🧠 Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization [01:12] 🌐 Multimodal Autoregressive Pre-training of Large Vision Encoders [01:55] 🧠 Marco-o1: Towards Open Reasoning Models for...
Published 11/22/24