Ep. 238 - Part 1 - June 4, 2024
Description
ArXiv Computer Vision research for Tuesday, June 04, 2024.
00:20: Plug-and-Play Diffusion Distillation
01:29: Enhance Image-to-Image Generation with LLaVA Prompt and Negative Prompt
02:33: The Crystal Ball Hypothesis in diffusion models: Anticipating object positions from initial noise
04:03: Dealing with All-stage Missing Modality: Towards A Universal Model with Robust Reconstruction and Personalization
05:38: Choroidal Vessel Segmentation on Indocyanine Green Angiography Images via Human-in-the-Loop Labeling
07:31: 3D Imaging of Complex Specular Surfaces by Fusing Polarimetric and Deflectometric Information
08:47: MetaMixer Is All You Need
10:36: Multi-Scale Direction-Aware Network for Infrared Small Target Detection
12:26: Leveraging Predicate and Triplet Learning for Scene Graph Generation
14:15: OpenGaussian: Towards Point-Level 3D Gaussian-based Open Vocabulary Understanding
15:57: FaceCom: Towards High-fidelity 3D Facial Shape Completion via Optimization and Inpainting Guidance
17:17: Domain Game: Disentangle Anatomical Feature for Single Domain Generalized Segmentation
18:41: Analyzing the Effect of Combined Degradations on Face Recognition
19:58: UA-Track: Uncertainty-Aware End-to-End 3D Multi-Object Tracking
21:42: Analyzing the Feature Extractor Networks for Face Image Synthesis
22:59: Radar Spectra-Language Model for Automotive Scene Parsing
24:17: GraVITON: Graph based garment warping with attention guided inversion for Virtual-tryon
25:32: Can CLIP help CLIP in learning 3D?
26:34: Why Only Text: Empowering Vision-and-Language Navigation with Multi-modal Prompts
28:19: SMCL: Saliency Masked Contrastive Learning for Long-tailed Recognition
29:19: I4VGen: Image as Stepping Stone for Text-to-Video Generation
30:34: PuFace: Defending against Facial Cloaking Attacks for Facial Recognition Models
31:57: M3DM-NR: RGB-3D Noisy-Resistant Industrial Anomaly Detection via Multimodal Denoising
33:37: Image contrast enhancement based on the Schrödinger operator spectrum
34:44: Understanding Retrieval Robustness for Retrieval-Augmented Image Captioning
35:46: Optimised ProPainter for Video Diminished Reality Inpainting
36:43: Continual Unsupervised Out-of-Distribution Detection
37:55: Progressive Confident Masking Attention Network for Audio-Visual Segmentation
39:06: Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation