2024.11.27 Daily AI Papers | ShowUI Boosts GUI Efficiency, F2F Improves Image Editing.


Description

The 18 papers in this episode are:
[00:28] 🖥 ShowUI: One Vision-Language-Action Model for GUI Visual Agent
[01:08] 🎥 Pathways on the Image Manifold: Image Editing via Video Generation
[01:45] ⭐ Star Attention: Efficient LLM Inference over Long Sequences
[02:24] ⚡ Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration
[03:01] 📊 MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs
[03:44] 🎨 TEXGen: a Generative Diffusion Model for Mesh Textures
[04:27] 🎨 SketchAgent: Language-Driven Sequential Sketch Generation
[05:11] 🔄 Learning 3D Representations from Procedural 3D Programs
[05:55] 🧠 VLRewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models
[06:50] 🔄 SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE
[07:27] 🖼 FINECAPTION: Compositional Image Captioning Focusing on Wherever You Want at Any Granularity
[08:09] 🎨 DreamMix: Decoupling Object Attributes for Enhanced Editability in Customized Image Inpainting
[08:41] 📹 SALOVA: Segment-Augmented Long Video Assistant for Targeted Retrieval and Routing in Long-Form Video Analysis
[09:19] 📉 Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens
[10:05] 🧬 MolReFlect: Towards In-Context Fine-grained Alignments between Molecules and Texts
[10:40] 👕 Controllable Human Image Generation with Personalized Multi-Garments
[11:12] 🤖 Visual Counter Turing Test (VCT^2): Discovering the Challenges for AI-Generated Image Detection and Introducing Visual AI Index (V_AI)
[11:55] 🎥 AnchorCrafter: Animate CyberAnchors Saling Your Products via Human-Object Interacting Video Generation

[Follow us] You can also find us on the following platform for more information beyond the podcast: Xiaohongshu: AI速递

More Episodes


2024.11.26 Daily AI Papers | Automating 3D Material Generation, Innovating Zero-Shot Image Generation.

The 21 papers in this episode are:
[00:26] 🌐 Material Anything: Generating Materials for Any 3D Object via Diffusion
[01:05] 🎨 Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator
[01:48] 🤖 From Generation to Judgment:...

Published 11/26/24

HuggingFace Daily AI Papers Express (HuggingFace 每日AI论文速递)


2024.11.25 Daily AI Papers | Style-Friendly SNR Sampler Improves Image Generation, TÜLU 3 Open-Source Model Outperforms.

The 14 papers in this episode are:
[00:26] 🎨 Style-Friendly SNR Sampler for Style-Driven Generation
[01:08] 🚀 TÜLU 3: Pushing Frontiers in Open Language Model Post-Training
[01:53] 🌐 OminiControl: Minimal and Universal Control for Diffusion...

Published 11/25/24