2024.11.12 Daily AI Papers | Seamless object insertion, generalist editing models improve precision

Description

This episode covers the following 14 papers:
[00:23] 🖼 Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models
[01:05] 🎨 OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision
[01:49] 📚 Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models
[02:27] 📚 M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework
[03:04] 🖼 Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models
[03:42] 🧠 IOPO: Empowering LLMs with Complex Instruction Following via Input-Output Preference Optimization
[04:33] 🦎 GitChameleon: Unmasking the Version-Switching Capabilities of Code Generation Models
[05:11] 🌐 Watermark Anything with Localized Messages
[05:50] 🧠 Counterfactual Generation from Language Models
[06:22] 🤖 KMM: Key Frame Mask Mamba for Extended Motion Generation
[06:56] 🎲 Game-theoretic LLM: Agent Workflow for Negotiation Games
[07:35] 📊 Golden Touchstone: A Comprehensive Bilingual Benchmark for Evaluating Financial Large Language Models
[08:15] 🧠 NeKo: Toward Post Recognition Generative Correction Large Language Models with Task-Oriented Experts
[08:54] 🧠 Ablation is Not Enough to Emulate DPO: How Neuron Dynamics Drive Toxicity Reduction

[Follow us] You can also find us on the following platform for more information beyond the podcast. Xiaohongshu: AI速递

More Episodes

[Weekend Special] Hottest AI papers of the 4th week of November | LLaVA-o1 improves multimodal reasoning, Genex optimizes embodied AI planning.

This episode covers the following 5 papers:
[00:41] TOP1(🔥93) | 🧠 LLaVA-o1: Let Vision Language Models Reason Step-by-Step
[02:41] TOP2(🔥55) | 🌍 Generative World Explorer
[05:00] TOP3(🔥44) | 🧠 Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference...

Published 11/23/24

HuggingFace Daily AI Papers Express (HuggingFace 每日AI论文速递)

2024.11.22 Daily AI Papers | Mixed preference optimization improves reasoning, innovation in multimodal autoregressive pre-training.

This episode covers the following 14 papers:
[00:26] 🧠 Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization
[01:12] 🌐 Multimodal Autoregressive Pre-training of Large Vision Encoders
[01:55] 🧠 Marco-o1: Towards Open Reasoning Models for...

Published 11/22/24