Ep. 243 - Part 2 - June 9, 2024
Description
ArXiv Computer Vision research for Sunday, June 09, 2024.

00:20: ControlLoc: Physical-World Hijacking Attack on Visual Perception in Autonomous Driving
02:23: Unified Text-to-Image Generation and Retrieval
03:51: F-LMM: Grounding Frozen Large Multimodal Models
05:34: Multi-Stain Multi-Level Convolutional Network for Multi-Tissue Breast Cancer Image Segmentation
07:43: BOSC: A toolbox for aerial imagery mapping
08:27: Mamba YOLO: SSMs-Based YOLO For Object Detection
10:12: Solution for CVPR 2024 UG2+ Challenge Track on All Weather Semantic Segmentation
11:02: Scaling Graph Convolutions for Mobile Vision
12:59: RefGaussian: Disentangling Reflections from 3D Gaussian Splatting for Realistic Rendering
14:28: Self-supervised Adversarial Training of Monocular Depth Estimation against Physical-World Attacks
15:45: Procrastination Is All You Need: Exponent Indexed Accumulators for Floating Point, Posits and Logarithmic Numbers
16:40: OmniControlNet: Dual-stage Integration for Conditional Image Generation
17:51: GCtx-UNet: Efficient Network for Medical Image Segmentation
19:14: InfoGaussian: Structure-Aware Dynamic Gaussians through Lightweight Information Shaping
20:40: BD-SAT: High-resolution Land Use Land Cover Dataset & Benchmark Results for Developing Division: Dhaka, BD
22:19: Bits-to-Photon: End-to-End Learned Scalable Point Cloud Compression for Direct Rendering
23:28: MeanSparse: Post-Training Robustness Enhancement Through Mean-Centered Feature Sparsification
24:38: Solution for SMART-101 Challenge of CVPR Multi-modal Algorithmic Reasoning Task 2024
26:12: CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark
29:32: Inter-slice Super-resolution of Magnetic Resonance Images by Pre-training and Self-supervised Fine-tuning
31:04: Causality-inspired Latent Feature Augmentation for Single Domain Generalization
32:41: MHS-VM: Multi-Head Scanning in Parallel Subspaces for Vision Mamba
34:13: FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model