Ep. 241 - Part 1 - June 7, 2024
Description
ArXiv Computer Vision research for Friday, June 07, 2024.
00:20: Image Processing Based Forest Fire Detection
01:08: STAR: Skeleton-aware Text-based 4D Avatar Generation with In-Network Motion Retargeting
03:05: UVCPNet: A UAV-Vehicle Collaborative Perception Network for 3D Object Detection
04:47: UCDNet: Multi-UAV Collaborative 3D Object Detection Network by Reliable Feature Mapping
06:14: SMART: Scene-motion-aware human action recognition framework for mental disorder group
08:12: LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model
09:34: Evaluating and Mitigating IP Infringement in Visual Generative AI
11:01: MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models
12:20: OVMR: Open-Vocabulary Recognition with Multi-Modal References
13:57: ACE Metric: Advection and Convection Evaluation for Accurate Weather Forecasting
15:11: XctDiff: Reconstruction of CT Images with Consistent Anatomical Structures from a Single Radiographic Projection Image
16:22: MTS-Net: Dual-Enhanced Positional Multi-Head Self-Attention for 3D CT Diagnosis of May-Thurner Syndrome
17:58: CDeFuse: Continuous Decomposition for Infrared and Visible Image Fusion
19:41: MGIMM: Multi-Granularity Instruction Multimodal Model for Attribute-Guided Remote Sensing Image Detailed Description
21:24: PQPP: A Joint Benchmark for Text-to-Image Prompt and Query Performance Prediction
22:58: Interpretable Multimodal Out-of-context Detection with Soft Logic Regularization
24:24: SMC++: Masked Learning of Unsupervised Video Semantic Compression
26:19: Diffusion-based Generative Image Outpainting for Recovery of FOV-Truncated CT Images
27:09: MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks
28:35: Predictive Dynamic Fusion
29:43: Online Continual Learning of Video Diffusion Models From a Single Video Stream
30:40: A short review on graphonometric evaluation tools in children
31:49: Navigating Efficiency in MobileViT through Gaussian Process on Global Architecture Factors
33:04: EGOR: Efficient Generated Objects Replay for incremental object detection
34:37: 3rd Place Solution for MeViS Track in CVPR 2024 PVUW workshop: Motion Expression guided Video Segmentation
36:02: Multi-Granularity Language-Guided Multi-Object Tracking
37:56: Normal-guided Detail-Preserving Neural Implicit Functions for High-Fidelity 3D Surface Reconstruction
39:52: Ada-VE: Training-Free Consistent Video Editing Using Adaptive Motion Prior
41:48: 3DRealCar: An In-the-wild RGB-D Car Dataset with 360-degree Views
43:54: Seeing the Unseen: Visual Metaphor Captioning for Videos
45:09: Zero-Shot Video Editing through Adaptive Sliding Score Distillation
46:28: Labeled Data Selection for Category Discovery