Ep. 245 - Part 2 - June 11, 2024
Description
ArXiv Computer Vision research for Tuesday, June 11, 2024.
00:21: NeRSP: Neural 3D Reconstruction for Reflective Objects with Sparse Polarized Images
01:27: Beyond Bare Queries: Open-Vocabulary Object Retrieval with 3D Scene Graph
03:14: T2S-GPT: Dynamic Vector Quantization for Autoregressive Sign Language Production from Text
04:45: Benchmarking and Boosting Radiology Report Generation for 3D High-Resolution Medical Images
06:23: FaceGPT: Self-supervised Learning to Chat about 3D Human Faces
07:52: RecMoDiffuse: Recurrent Flow Diffusion for Human Motion Generation
09:15: VoxNeuS: Enhancing Voxel-Based Neural Surface Reconstruction via Gradient Interpolation
10:51: RAD: A Comprehensive Dataset for Benchmarking the Robustness of Image Anomaly Detection
12:05: RGB-Sonar Tracking Benchmark and Spatial Cross-Attention Transformer Tracker
13:52: MeMSVD: Long-Range Temporal Structure Capturing Using Incremental SVD
15:15: Can Foundation Models Reliably Identify Spatial Hazards? A Case Study on Curb Segmentation
16:56: MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance
18:20: Open-World Human-Object Interaction Detection via Multi-modal Prompts
20:03: Which Country Is This? Automatic Country Ranking of Street View Photos
20:44: Needle In A Multimodal Haystack
22:10: Is One GPU Enough? Pushing Image Generation at Higher-Resolutions with Foundation Models
23:24: Towards Realistic Data Generation for Real-World Super-Resolution
24:37: Unsupervised Object Detection with Theoretical Guarantees
25:43: Embedded Graph Convolutional Networks for Real-Time Event Data Processing on SoC FPGAs
27:45: A Framework for Efficient Model Evaluation through Stratification, Sampling, and Estimation
29:01: Cinematic Gaussians: Real-Time HDR Radiance Fields with Depth of Field
30:24: Minimizing Energy Costs in Deep Learning Model Training: The Gaussian Sampling Approach
32:09: Global-Regularized Neighborhood Regression for Efficient Zero-Shot Texture Anomaly Detection
33:52: Deep Implicit Optimization for Robust and Flexible Image Registration
35:28: Visual Representation Learning with Stochastic Frame Prediction
More Episodes
ArXiv Computer Vision research for Thursday, June 13, 2024. 00:21: LRM-Zero: Training Large Reconstruction Models with Synthesized Data 01:56: Scale-Invariant Monocular Depth Estimation via SSI Depth 03:08: GGHead: Fast and Generalizable 3D Gaussian Heads 04:55: Multiagent Multitraversal...
Published 06/15/24
ArXiv Computer Vision research for Thursday, June 13, 2024. 00:21: INS-MMBench: A Comprehensive Benchmark for Evaluating LVLMs' Performance in Insurance 02:11: Large-Scale Evaluation of Open-Set Image Classification Techniques 03:43: PC-LoRA: Low-Rank Adaptation for Progressive Model...
Published 06/15/24