2024.11.26 Daily AI Papers | Automated 3D Material Generation, Innovation in Zero-Shot Image Generation.


Description

The 21 papers in this episode are:
[00:26] 🌐 Material Anything: Generating Materials for Any 3D Object via Diffusion
[01:05] 🎨 Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator
[01:48] 🤖 From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
[02:22] 🌐 Knowledge Transfer Across Modalities with Natural Language Supervision
[03:00] 🧠 MH-MoE: Multi-Head Mixture-of-Experts
[03:34] 🎥 DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation
[04:13] 🌐 One Diffusion to Generate Them All
[04:54] 👁 VisualLens: Personalization through Visual History
[05:34] 🔍 Factorized Visual Tokenization and Generation
[06:15] 🔍 O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?
[07:00] 🩺 GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and A Comprehensive Multimodal Dataset Towards General Medical AI
[07:39] 🌐 SplatFlow: Multi-View Rectified Flow Model for 3D Gaussian Splatting Synthesis
[08:25] 🔄 From CISC to RISC: language-model guided assembly transpilation
[09:03] ⚙ Cautious Optimizers: Improving Training with One Line of Code
[09:49] 🤖 The Impossible Test: A 2024 Unsolvable Dataset and A Chance for an AGI Quiz
[10:30] 🔮 Predicting Emergent Capabilities by Finetuning
[11:04] 📊 SegBook: A Simple Baseline and Cookbook for Volumetric Medical Image Segmentation
[11:48] 🩺 Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline
[12:25] 🤔 LLMs Do Not Think Step-by-step In Implicit Reasoning
[13:00] 🌐 Best of Both Worlds: Advantages of Hybrid Graph Sequence Models
[13:34] 🔗 Edge Weight Prediction For Category-Agnostic Pose Estimation

[Follow us] You can also find us on the following platforms for more information beyond the podcast content.
Xiaohongshu: AI速递

More Episodes


2024.11.27 Daily AI Papers | ShowUI Improves GUI Efficiency, F2F Improves Image Editing.

The 18 papers in this episode are:
[00:28] 🖥 ShowUI: One Vision-Language-Action Model for GUI Visual Agent
[01:08] 🎥 Pathways on the Image Manifold: Image Editing via Video Generation
[01:45] ⭐ Star Attention: Efficient LLM Inference over Long...

Published 11/27/24

HuggingFace Daily AI Papers Express

Published 11/26/24

2024.11.25 Daily AI Papers | Style-Friendly SNR Sampler Improves Image Generation, TÜLU 3 Open Model Takes the Performance Lead.

The 14 papers in this episode are:
[00:26] 🎨 Style-Friendly SNR Sampler for Style-Driven Generation
[01:08] 🚀 TÜLU 3: Pushing Frontiers in Open Language Model Post-Training
[01:53] 🌐 OminiControl: Minimal and Universal Control for Diffusion...

Published 11/25/24