ICCV 2021 results are out! The latest 200 ICCV 2021 papers, grouped by topic (updating)
The acceptance results for ICCV 2021, one of the three top computer vision conferences, were announced recently. ICCV received 6,236 valid submissions this year, of which 1,617 were accepted, an acceptance rate of 25.9%.

Accepted paper IDs: https://docs.google.com/spreadsheets/u/1/d/e/2PACX-1vRfaTmsNweuaA0Gjyu58H_Cx56pGwFhcTYII0u1pg0U7MbhlgY0R6Y-BbK3xFhAiwGZ26u3TAtN5MnS/pubhtml

极市平台 (Jishi Platform) has sorted the papers accepted to ICCV 2021 by topic, including detection, segmentation, estimation, tracking, visual localization, low-level image processing, image and video retrieval, and 3D vision. All of our ICCV 2021 paper collections are gathered in our GitHub project, which has received 1,300 stars so far.

The GitHub project will continue to be updated. Project address:

Papers collected so far (updated August 19):

Detection
2D Object Detection
[12] G-DetKD: Towards General Distillation Framework for Object Detectors via Contrastive and Semantic-guided Feature Imitation | paper
[11] Vector-Decomposed Disentanglement for Domain-Invariant Object Detection | paper
[10] Oriented R-CNN for Object Detection | paper | code
[9] Conditional DETR for Fast Training Convergence | paper | code
[8] Boosting Weakly Supervised Object Detection via Learning Bounding Box Adjusters | paper | code
[7] GraphFPN: Graph Feature Pyramid Network for Object Detection | paper
    Commentary: Fudan and HKU propose GraphFPN: boosting object detection performance with a graph feature pyramid
[6] SimROD: A Simple Adaptation Method for Robust Object Detection | paper
[5] Active Learning for Deep Object Detection via Probabilistic Modeling | paper
[4] Detecting Invisible People | paper | project | video
[3] Conditional Variational Capsule Network for Open Set Recognition | paper | code
[2] MDETR: Modulated Detection for End-to-End Multi-Modal Understanding (Oral) | paper | code | project | colab
    Commentary: No detector needed to extract features! LeCun's team proposes MDETR: truly end-to-end multi-modal reasoning
[1] DetCo: Unsupervised Contrastive Learning for Object Detection | paper | code
    Commentary: Outperforming MoCo v2 from Kaiming He's team, DetCo: contrastive learning tailored for object detection

3D Object Detection
[6] LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-based 3D Detector | paper
[5] RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection | paper
[4] Is Pseudo-Lidar needed for Monocular 3D Object detection? | paper
[3] Fog Simulation on Real LiDAR Point Clouds for 3D Object Detection in Adverse Weather | paper | code
[2] Geometry Uncertainty Projection Network for Monocular 3D Object Detection | paper
[1] Unsupervised Domain Adaptive 3D Detection with Multi-Level Consistency | paper

Salient Object Detection
[2] Specificity-preserving RGB-D Saliency Detection | paper | code
[1] Disentangled High Quality Salient Object Detection | paper

Camouflaged Object Detection
[1] TransForensics: Image Forgery Localization with Dense Self-Attention | paper

Image Anomaly Detection / Surface Defect Detection
[2] DRÆM -- A discriminatively trained reconstruction embedding for surface anomaly detection | paper
[1] Divide-and-Assemble: Learning Block-wise Memory for Unsupervised Anomaly Detection | paper

Edge Detection
[2] Pixel Difference Networks for Efficient Edge Detection | paper | code
[1] RINDNet: Edge Detection for Discontinuity in Reflectance, Illumination, Normal and Depth | paper

Segmentation
Image Segmentation
[2] Labels4Free: Unsupervised Segmentation using StyleGAN | paper | code | project
[1] Mining Latent Classes for Few-shot Segmentation (Oral) | paper | code

Instance Segmentation
[5] Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks | paper | code
[4] SOTR: Segmenting Objects with Transformers | paper | code
[3] Hierarchical Aggregation for 3D Instance Segmentation | paper | code
[2] Crossover Learning for Fast Online Video Instance Segmentation | code
[1] Instances as Queries | paper | code

Semantic Segmentation
[18] Multi-Anchor Active Domain Adaptation for Semantic Segmentation (Oral) | paper
[17] Multi-Target Adversarial Frameworks for Domain Adaptation in Semantic Segmentation | paper
[16] Exploiting a Joint Embedding Space for Generalized Zero-Shot Semantic Segmentation | paper
[15] LabOR: Labeling Only if Required for Domain Adaptive Semantic Segmentation (Oral) | paper
[14] Dual Path Learning for Domain Adaptation of Semantic Segmentation | paper | code
[13] Deep Metric Learning for Open World Semantic Segmentation | paper
[12] Complementary Patch for Weakly Supervised Semantic Segmentation | paper
[11] RECALL: Replay-based Continual Learning in Semantic Segmentation | paper
[10] Simpler is Better: Few-shot Semantic Segmentation with Classifier Weight Transformer | paper | code
[9] Learning Meta-class Memory for Few-Shot Semantic Segmentation | paper
[8] Personalized Image Semantic Segmentation | paper
[7] VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation | paper | code
[6] Leveraging Auxiliary Tasks with Affinity Learning for Weakly Supervised Semantic Segmentation | paper
[5] ReDAL: Region-based and Diversity-aware Active Learning for Point Cloud Semantic Segmentation (point cloud semantic segmentation) | paper
[4] Domain Adaptive Video Segmentation via Temporal Consistency Regularization (video semantic segmentation) | paper | code
[3] Standardized Max Logits: A Simple yet Effective Approach for Identifying Unexpected Road Obstacles in Urban-Scene Segmentation (Oral) | paper
[2] Re-distributing Biased Pseudo Labels for Semi-supervised Semantic Segmentation: A Baseline Investigation (Oral) | paper | code
[1] Calibrated Adversarial Refinement for Stochastic Semantic Segmentation | paper | code

Video Object Segmentation
[2] Joint Inductive and Transductive Learning for Video Object Segmentation | paper | code
[1] Full-Duplex Strategy for Video Object Segmentation | paper | project

Referring Image Segmentation
[1] Vision-Language Transformer and Query Generation for Referring Segmentation | paper | code

Dense Prediction
[1] FaPN: Feature-aligned Pyramid Network for Dense Image Prediction | paper | code

Face
[1] Learning Facial Representations from the Cycle-consistency of Face | paper

Facial Recognition/Detection
[2] SynFace: Face Recognition with Synthetic Data | paper
[1] PASS: Protected Attribute Suppression System for Mitigating Bias in Face Recognition | paper

Face Generation/Face Synthesis/Face Reconstruction/Face Editing
[5] FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning | paper
[4] Disentangled Lifespan Face Synthesis | paper | code
[3] MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement (audio-driven facial animation) | paper | video
[2] Focal Frequency Loss for Image Reconstruction and Synthesis | paper | code
[1] HeadGAN: One-shot Neural Head Synthesis and Editing | paper

Face Forgery/Face Anti-Spoofing
[1] Exploring Temporal Coherence for More General Video Face Forgery Detection | paper

3D Vision
[3] Differentiable Surface Rendering via Non-Differentiable Sampling | paper
[2] M3D-VTON: A Monocular-to-3D Virtual Try-On Network (3D try-on) | paper
[1] Score-Based Point Cloud Denoising | paper

Point Cloud
[10] ME-PCN: Point Completion Conditioned on Mask Emptiness (point cloud completion) | paper
[9] Adaptive Graph Convolution for Point Cloud Analysis | paper | code
[8] PICCOLO: Point Cloud-Centric Omnidirectional Localization | paper
[7] AdaFit: Rethinking Learning-based Normal Estimation on Point Clouds | paper | code
[6] SnowflakeNet: Point Cloud Completion by Snowflake Point Deconvolution with Skip-Transformer | paper | code
[5] DRINet: A Dual-Representation Iterative Learning Network for Point Cloud Segmentation | paper
[4] Unsupervised Learning of Fine Structure Generation for 3D Point Clouds by 2D Projection Matching | paper | code
[3] (Just) A Spoonful of Refinements Helps the Registration Error Go Down (Oral) | paper
[2] Learning with Noisy Labels for Robust Point Cloud Segmentation (point cloud segmentation) | paper | code
[1] HRegNet: A Hierarchical Network for Large-scale Outdoor LiDAR Point Cloud Registration | paper | project

3D Reconstruction
[6] Deep Hybrid Self-Prior for Full 3D Mesh Generation | paper | project
[5] PR-RRN: Pairwise-Regularized Residual-Recursive Networks for Non-rigid Structure-from-Motion | paper
[4] Learning Canonical 3D Object Representation for Fine-Grained Recognition | paper
[3] ELLIPSDF: Joint Object Pose and Shape Optimization with a Bi-level Ellipsoid and Signed Distance Function Description | paper
[2] Discovering 3D Parts from Image Collections | paper | project
[1] PlaneTR: Structure-Guided Transformers for 3D Plane Recovery | paper | code

Neural Network Structure Design & Optimization
[2] Unifying Nonlocal Blocks for Neural Networks | paper
[1] Energy-Based Open-World Uncertainty Modeling for Confidence Calibration (confidence calibration) | paper

CNN
[3] MicroNet: Improving Image Recognition with Extremely Low FLOPs | paper | code1 | code2
[2] Learning to Resize Images for Computer Vision Tasks | paper
[1] Bias Loss for Mobile Neural Networks | paper
    Commentary: Surpassing MobileNet V3 | SkipNet + Bias Loss explained: a new milestone for lightweight models

Attention
[4] Residual Attention: A Simple but Effective Method for Multi-Label Recognition | paper
[3] Fast Convergence of DETR with Spatially Modulated Co-Attention | paper | code
[2] SCOUTER: Slot Attention-based Classifier for Explainable Image Recognition | paper | code
[1] FcaNet: Frequency Channel Attention Networks | paper | code

Transformer
[10] An Empirical Study of Training Self-Supervised Vision Transformers (Oral) | paper
    Commentary: Solving training instability, new work from Kaiming He's team: self-supervised learning + Transformer = MoCoV3
[9] LeViT: a Vision Transformer in ConvNet’s Clothing for Faster Inference | paper | code
    Commentary: Facebook proposes LeViT: 0.077 ms per image with ResNet50-level accuracy
[8] Emerging Properties in Self-Supervised Vision Transformers | paper | code
    Commentary: When Transformer meets self-supervised learning: Facebook open-sources DINO
[7] Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet | paper | code
    Commentary: ResNet fully surpassed, and by a Transformer: Yitu Technology open-sources the scalable T2T-ViT, whose lightweight version beats MobileNet
[6] Vision Transformer with Progressive Sampling | paper | code
[5] Rethinking and Improving Relative Position Encoding for Vision Transformer | paper | code
    Commentary: Relative position encoding in Vision Transformers
[4] AutoFormer: Searching Transformers for Visual Recognition | paper | code
[3] Rethinking Spatial Dimensions of Vision Transformers | paper | code
[2] Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers (Oral) | paper | code
[1] Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions (Oral) | paper | code
    Commentary: Pyramid Vision Transformer (PVT): a versatile backbone for dense prediction

Neural Architecture Search (NAS)
[3] BN-NAS: Neural Architecture Search with Batch Normalization | paper
[2] NASOA: Towards Faster Task-oriented Online Fine-tuning with a Zoo of Models | paper
[1] AutoFormer: Searching Transformers for Visual Recognition | paper | code

Loss Function
[3] Rank & Sort Loss for Object Detection and Instance Segmentation (Oral) | paper | code
    Commentary: No hyperparameter tuning, notable gains: RS Loss, a new loss function for detection and segmentation, is open-sourced
[2] Focal Frequency Loss for Image Reconstruction and Synthesis | paper | code
[1] Orthogonal Projection Loss | paper | code

Visualization/Interpretability
[1] Finding Representative Interpretations on Convolutional Neural Networks | paper

Model Training/Generalization
[3] MultiTask-CenterNet (MCN): Efficient and Diverse Multitask Learning using an Anchor Free Approach (multi-task learning) | paper
[2] Impact of Aliasing on Generalization in Deep Convolutional Networks | paper
[1] Learning Compatible Embeddings | paper | code

Noisy Label
[1] Learning with Noisy Labels via Sparse Regularization | paper | code

Long-Tailed Distribution
[1] ACE: Ally Complementary Experts for Solving Long-Tailed Recognition in One-Shot (Oral) | paper | code

Out-of-Distribution Detection
[2] Trash to Treasure: Harvesting OOD Data with Cross-Modal Matching for Open-Set Semi-Supervised Learning | paper
[1] CODEs: Chamfer Out-of-Distribution Examples against Overconfidence Issue | paper

Model Compression
Knowledge Distillation
[4] G-DetKD: Towards General Distillation Framework for Object Detectors via Contrastive and Semantic-guided Feature Imitation | paper
[3] Online Multi-Granularity Distillation for GAN Compression | paper | code
[2] Distilling Holistic Knowledge with Graph Neural Networks | paper | code
[1] AGKD-BML: Defense Against Adversarial Attack by Attention Guided Knowledge Distillation and Bi-directional Metric Learning | paper | code

Pruning

Quantization
[2] Distance-aware Quantization | paper
[1] Generalizable Mixed-Precision Quantization via Attribution Rank Preservation | paper | code

Image Generation/Image Synthesis
[7] Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates (gesture generation) | paper | code
[6] Orthogonal Jacobian Regularization for Unsupervised Disentanglement in Image Generation | paper | code
[5] ILVR: Conditioning Method for Denoising Diffusion Probabilistic Models (Oral) | paper
[4] Toward Spatially Unbiased Generative Models | paper
[3] A Light Stage on Every Desk | paper | project
[2] Handwriting Transformers | paper
[1] On Generating Transferable Targeted Perturbations | paper | code

View Synthesis
[1] PixelSynth: Generating a 3D-Consistent Experience from a Single Image | paper | project

GAN/Generative/Adversarial
[13] Unsupervised Geodesic-preserved Generative Adversarial Networks for Unconstrained 3D Pose Transfer | paper | code
[12] Online Multi-Granularity Distillation for GAN Compression | paper | code
[11] AGKD-BML: Defense Against Adversarial Attack by Attention Guided Knowledge Distillation and Bi-directional Metric Learning | paper | code
[10] Meta Gradient Adversarial Attack | paper
[9] Sketch Your Own GAN | paper | code | project
    Commentary: Create a GAN model from a single sketch, easy even for beginners: new work from Jun-Yan Zhu's team accepted to ICCV 2021
[8] Feature Importance-aware Transferable Adversarial Attacks | paper | code
[7] From Continuity to Editability: Inverting GANs with Consecutive Images | paper | code
[6] Learnable Boundary Guided Adversarial Training | paper | code
[5] Transporting Causal Mechanisms for Unsupervised Domain Adaptation (Oral) | paper
[4] Robustness via Cross-Domain Ensembles (Oral) | paper | code | model | homepage | video
[3] HeadGAN: One-shot Neural Head Synthesis and Editing | paper
[2] Labels4Free: Unsupervised Segmentation using StyleGAN | paper | code | project
[1] EigenGAN: Layer-Wise Eigen-Learning for GANs | paper | code

Image Processing
[3] Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling | paper | code
[2] Accelerating Atmospheric Turbulence Simulation via Learned Phase-to-Space Transform | paper
[1] Equivariant Imaging: Learning Beyond the Range Space (Oral) | paper

Super Resolution
[2] Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution | paper | code
[1] Learning for Scale-Arbitrary Super-Resolution from Scale-Specific Networks | paper | code

Image Denoising/Deblurring/Deraining/Dehazing
[1] Rethinking Coarse-to-Fine Approach in Single Image Deblurring | paper | code

Image Editing/Image Inpainting
[1] Occlusion-Aware Video Object Inpainting (video inpainting) | paper

Style Transfer
[5] SSH: A Self-Supervised Framework for Image Harmonization (image harmonization) | paper | code
[4] Domain-Aware Universal Style Transfer | paper
[3] AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer | paper | code1 | code2
[2] ALADIN: All Layer Adaptive Instance Normalization for Fine-grained Style Similarity (style transfer) | paper
[1] Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts (font generation) | paper | code

Image Quality Assessment
[1] MUSIQ: Multi-scale Image Quality Transformer | paper

Estimation
Human Pose Estimation
[8] Learning Skeletal Graph Neural Networks for Hard 3D Pose Estimation | paper | code
[7] EventHPE: Event-based 3D Human Pose and Shape Estimation | paper
[6] HandFoldingNet: A 3D Hand Pose Estimation Network Using Multiscale-Feature Guided Folding of a 2D Hand Skeleton | paper | code
[5] Online Knowledge Distillation for Efficient Pose Estimation | paper
[4] Probabilistic Monocular 3D Human Pose Estimation with Normalizing Flows | paper
[3] Human Pose Regression with Residual Log-likelihood Estimation (Oral) | paper | code
[2] PyMAF: 3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop (Oral) | paper | code | project
[1] HuMoR: 3D Human Motion Model for Robust Pose Estimation (Oral) | paper | video | project

Depth Estimation
[4] Self-supervised Monocular Depth Estimation for All Day Images using Domain Separation | paper
[3] Towards Interpretable Deep Networks for Monocular Depth Estimation | paper | code
[2] Regularizing Nighttime Weirdness: Efficient Self-supervised Monocular Depth Estimation in the Dark | paper
[1] MonoIndoor: Towards Good Practice of Self-Supervised Monocular Depth Estimation for Indoor Environments | paper

Image & Video Retrieval/Video Understanding
[5] ASMR: Learning Attribute-Based Person Search with Adaptive Semantic Margin Regularizer | paper
[4] Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models | paper | code
[3] DOLG: Single-Stage Image Retrieval with Deep Orthogonal Fusion of Local and Global Features | paper
[2] Hand Image Understanding via Deep Multi-Task Learning (hand image understanding) | paper
[1] Cross-Sentence Temporal and Semantic Relations in Video Activity Localisation | paper

Action/Activity Recognition, Detection, and Segmentation
[7] Group-aware Contrastive Regression for Action Quality Assessment (action quality assessment) | paper
[6] Foreground-Action Consistency Network for Weakly Supervised Temporal Action Localization (action localization) | paper | code
[5] Learning Action Completeness from Points for Weakly-supervised Temporal Action Localization (action localization) | paper | code
[4] Elaborative Rehearsal for Zero-shot Action Recognition | paper | code
[3] Skeleton Cloud Colorization for Unsupervised 3D Action Representation Learning | paper
[2] Enriching Local and Global Contexts for Temporal Action Localization | paper
[1] Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition | paper | code

Person Re-Identification/Detection
[6] Learning by Aligning: Visible-Infrared Person Re-identification using Cross-Modal Correspondences | paper
[5] Towards Discriminative Representation Learning for Unsupervised Person Re-identification | paper
[4] Learning Instance-level Spatial-Temporal Patterns for Person Re-identification | paper | cleaned database
[3] An Intermediate Domain Module for Domain Adaptive Person Re-ID (Oral) | paper | code
[2] Spatio-Temporal Representation Factorization for Video-based Person Re-Identification | paper
[1] TransReID: Transformer-based Object Re-Identification | paper | code
    Commentary: Transformers dominate: leading across ReID tasks, Alibaba and Zhejiang University propose TransReID

Image/Video Captioning
[1] End-to-End Dense Video Captioning with Parallel Decoding | paper | code

Visual Localization
[4] PICCOLO: Point Cloud-Centric Omnidirectional Localization | paper
[3] Normalization Matters in Weakly Supervised Object Localization | paper
[2] TS-CAM: Token Semantic Coupled Attention Map for Weakly Supervised Object Localization | paper | code
[1] Boundary-sensitive Pre-training for Temporal Localization in Videos | paper

Image Matching
[5] Pixel-Perfect Structure-from-Motion with Featuremetric Refinement | paper | code
[4] Progressive Correspondence Pruning by Consensus Learning | paper | code | project
    Commentary: CLNet: progressive correspondence pruning based on consensus learning
[3] Multi-scale Matching Networks for Semantic Correspondence | paper
[2] Warp Consistency for Unsupervised Learning of Dense Correspondences (Oral) | paper | code
[1] COTR: Correspondence Transformer for Matching Across Images | paper

3D Vision
[1] MVTN: Multi-View Transformation Network for 3D Shape Recognition | paper

Object Tracking
[8] Learning Spatio-Temporal Transformer for Visual Tracking | paper | code
    Commentary: Topping the tracking leaderboards: Dalian University of Technology and MSRA propose STARK, a Transformer-based object tracker
[7] Box-Aware Feature Enhancement for Single Object Tracking on Point Clouds | paper
[6] Video Annotation for Visual Tracking via Selection and Refinement | paper
[5] Saliency-Associated Object Tracking | paper
[4] Learn to Match: Automatic Matching Network Design for Visual Tracking | paper | code
[3] HiFT: Hierarchical Feature Transformer for Aerial Tracking | paper | code
[2] Learning to Adversarially Blur Visual Object Tracking | paper | code
[1] Detecting Invisible People | paper | project | video

Medical Imaging
[2] Recurrent Mask Refinement for Few-Shot Medical Image Segmentation | paper
[1] Generative Adversarial Registration for Improved Conditional Deformable Templates | paper | code | homepage

Text Detection/Recognition
[4] Adaptive Boundary Proposal Network for Arbitrary Shape Text Detection | paper
[3] Joint Visual Semantic Reasoning: Multi-Stage Decoder for Text Recognition | paper
[2] Text is Text, No Matter What: Unifying Text Recognition using Knowledge Distillation | paper
[1] Towards the Unseen: Iterative Text Recognition by Distilling from Errors | paper

Remote Sensing Image
[4] Structured Outdoor Architecture Reconstruction by Exploration and Classification | paper
[3] Change is Everywhere: Single-Temporal Supervised Object Change Detection for High Spatial Resolution Remote Sensing Imagery (change detection) | paper | code
[2] Geography-Aware Self-Supervised Learning | paper
[1] Seasonal Contrast: Unsupervised Pre-Training from Uncurated Remote Sensing Data (transfer learning) | paper | code

Scene Graph
Scene Graph Generation
[4] Target Adaptive Context Aggregation for Video Scene Graph Generation | paper | code
[3] Unconditional Scene Graph Generation | paper
[2] Spatial-Temporal Transformer for Dynamic Scene Graph Generation | paper
    Commentary: A spatial-temporal context Transformer for video scene graph generation
[1] Unconstrained Scene Generation with Locally Conditioned Radiance Fields | paper

Scene Graph Prediction
[1] Generative Compositional Augmentations for Scene Graph Prediction | paper | code

Data Processing
Data Augmentation
[1] MixMo: Mixing Multiple Inputs for Multiple Outputs via Deep Subnetworks | paper
    Commentary: MixMo, performance for free: a new data augmentation or model ensembling method

Anomaly Detection
[3] A Hybrid Video Anomaly Detection Framework via Memory-Augmented Flow Reconstruction and Flow-Guided Frame Prediction | paper | code
[2] Weakly Supervised Temporal Anomaly Segmentation with Dynamic Time Warping | paper
[1] Weakly-supervised Video Anomaly Detection with Robust Temporal Feature Magnitude Learning | paper | code

Representation Learning
[4] Self-Supervised Visual Representations Learning by Contrastive Mask Prediction | paper
[3] Collaborative Unsupervised Visual Representation Learning from Decentralized Data | paper
[2] Enhancing Self-supervised Video Representation Learning via Multi-level Feature Optimization | paper
[1] In-Place Scene Labelling and Understanding with Implicit Scene Representation (Oral) | paper | project

Normalization/Regularization (Batch Normalization)

Image Clustering
[4] Instance Similarity Learning for Unsupervised Feature Representation | paper | code
[3] Graph Constrained Data Representation Learning for Human Motion Segmentation (human motion segmentation) | paper
[2] Improve Unsupervised Pretraining for Few-label Transfer | paper
[1] Clustering by Maximizing Mutual Information Across Views | paper

Few-shot/Zero-shot Learning
[3] Boosting the Generalization Capability in Cross-Domain Few-shot Learning via Noise-enhanced Supervised Autoencoder | paper
[2] Transductive Few-Shot Classification on the Oblique Manifold | paper
[1] FREE: Feature Refinement for Generalized Zero-Shot Learning | paper | code

Continual Learning/Life-long Learning
[3] Continual Neural Mapping: Learning An Implicit Scene Representation from Sequential Observations | paper
[2] RECALL: Replay-based Continual Learning in Semantic Segmentation | paper
[1] Few-Shot and Continual Learning with Attentive Independent Mechanisms | paper | code

Transfer Learning/Domain Adaptation
[14] PIT: Position-Invariant Transform for Cross-FoV Domain Adaptation | paper | code
[13] Multi-Target Adversarial Frameworks for Domain Adaptation in Semantic Segmentation | paper
[12] Semantic Concentration for Domain Adaptation | paper
[11] Dual Path Learning for Domain Adaptation of Semantic Segmentation | paper | code
[10] Zero-Shot Domain Adaptation with a Physics Prior (Oral) | paper | code
[9] BiMaL: Bijective Maximum Likelihood Approach to Domain Adaptation in Semantic Scene Segmentation | paper
[8] Domain Generalization via Gradient Surgery | paper
[7] Generalizing Gaze Estimation with Outlier-guided Collaborative Adaptation | paper | code
[6] Adversarial Unsupervised Domain Adaptation with Conditional and Label Shift: Infer, Align and Iterate | paper
[5] Recursively Conditional Gaussian for Ordinal Unsupervised Domain Adaptation (Oral) | paper
[4] Improve Unsupervised Pretraining for Few-label Transfer | paper
[3] Generalized Source-free Domain Adaptation | homepage | code
[2] Seasonal Contrast: Unsupervised Pre-Training from Uncurated Remote Sensing Data (transfer learning) | paper | code
[1] Calibrated prediction in and out-of-domain for state-of-the-art saliency modeling (transfer learning) | paper

Metric Learning
[4] Towards Interpretable Deep Metric Learning with Structural Matching | paper | code
[3] AGKD-BML: Defense Against Adversarial Attack by Attention Guided Knowledge Distillation and Bi-directional Metric Learning | paper | code
[2] Deep Metric Learning for Open World Semantic Segmentation | paper
[1] Learning with Memory-based Virtual Classes for Deep Metric Learning | paper

Incremental Learning
[2] Generalized and Incremental Few-Shot Learning by Explicit Learning and Calibration without Forgetting | paper | code
[1] Always Be Dreaming: A New Approach for Data-Free Class-Incremental Learning | paper | code | project

Contrastive Learning
[4] Improving Contrastive Learning by Visualizing Feature Transformation | paper | visualization tools and codes
[3] Parametric Contrastive Learning | paper | code
[2] Geography-Aware Self-Supervised Learning | paper
[1] CoMatch: Semi-supervised Learning with Contrastive Graph Regularization | paper | code

Active Learning
[2] Semi-Supervised Active Learning with Temporal Output Discrepancy | paper | code
[1] Active Learning for Deep Object Detection via Probabilistic Modeling | paper

Visual Reasoning/VQA
[3] Greedy Gradient Ensemble for Robust Visual Question Answering | paper | code
[2] On the hidden treasure of dialog in video question answering | paper
[1] Just Ask: Learning to Answer Questions from Millions of Narrated Videos (Oral) | paper | code | project

Meta Learning

Multi-Modal Learning
Audio-visual Learning
[1] The Right to Talk: An Audio-Visual Transformer Approach | paper

Vision-based Prediction
[7] MSR-GCN: Multi-Scale Residual Graph Convolution Networks for Human Motion Prediction (human motion prediction) | paper | code
[6] RAIN: Reinforced Hybrid Attention Inference Network for Motion Forecasting (motion prediction) | paper | project
[5] SLAMP: Stochastic Latent Appearance and Motion Prediction (motion prediction) | paper
[4] Unlimited Neighborhood Interaction for Heterogeneous Trajectory Prediction (trajectory prediction) | paper
[3] Personalized Trajectory Prediction via Distribution Discrimination (trajectory prediction) | paper | code
[2] Human Trajectory Prediction via Counterfactual Analysis (trajectory prediction) | paper | code
[1] On Exposing the Challenging Long Tail in Future Prediction of Traffic Actors | paper

Dataset
[7] LOKI: Long Term and Key Intentions for Trajectory Prediction (trajectory prediction) | paper | dataset
[6] Who's Waldo? Linking People Across Text and Images (Oral) | paper | project
[5] Towards Real-World Prohibited Item Detection: A Large-Scale X-ray Benchmark (prohibited item detection) | paper
[4] Towers of Babel: Combining Images, Language, and 3D Geometry for Learning Multimodal Vision (landmark photo collections) | paper | project
[3] Webly Supervised Fine-Grained Recognition: Benchmark Datasets and An Approach | paper | dataset
[2] OpenForensics: Large-Scale Challenging Dataset For Multi-Face Forgery Detection And Segmentation In-The-Wild | paper | project
[1] 4DComplete: Non-Rigid Motion Estimation Beyond the Observable Surface (4D reconstruction) | paper | dataset | video

Uncategorized
Stochastic Scene-Aware Motion Prediction (motion synthesis) (motion prediction) | paper | project
End-to-End Urban Driving by Imitating a Reinforcement Learning Coach (autonomous driving) (reinforcement learning) | paper
Learning RAW-to-sRGB Mappings with Inaccurately Aligned Supervision | paper | code
Asymmetric Bilateral Motion Estimation for Video Frame Interpolation (video frame interpolation) | paper | code
Focus on the Positives: Self-Supervised Learning for Biodiversity Monitoring | paper
DiagViB-6: A Diagnostic Benchmark Suite for Vision Models in the Presence of Shortcut and Generalization Opportunities | paper
MT-ORL: Multi-Task Occlusion Relationship Learning | paper | code
ProAI: An Efficient Embedded AI Hardware for Automotive Applications - a Benchmark Study | paper
Invisible Backdoor Attack with Sample-Specific Triggers (backdoor learning) | paper
    Commentary: Invisible backdoor attack with sample-specific triggers
SUNet: Symmetric Undistortion Network for Rolling Shutter Correction | paper
Learning to Cut by Watching Movies | paper | project
Paint Transformer: Feed Forward Neural Painting with Stroke Prediction (Oral) | paper | code
Internal Video Inpainting by Implicit Long-range Propagation | paper
CanvasVAE: Learning to Generate Vector Graphic Documents | paper
TkML-AP: Adversarial Attacks to Top-k Multi-Label Learning (multi-label learning) | paper
Out-of-Core Surface Reconstruction via Global TGV Minimization | paper
Variational Attention: Propagating Domain-Specific Knowledge for Multi-Domain Learning in Crowd Counting (crowd counting) | paper
Spatial Uncertainty-Aware Semi-Supervised Crowd Counting (crowd counting) | paper
Rethinking Counting and Localization in Crowds: A Purely Point-Based Framework (Oral) (crowd counting) | paper | code
Uniformity in Heterogeneity: Diving Deep into Count Interval Partition for Crowd Counting (crowd counting) | paper | code
Self-Conditioned Probabilistic Learning of Video Rescaling (video compression) | paper
Mixed SIGNals: Sign Language Production via a Mixture of Motion Primitives (gesture generation) | paper
Temporal-wise Attention Spiking Neural Networks for Event Streams Classification | paper
Long-Term Temporally Consistent Unpaired Video Translation from Simulated Surgical 3D Data (video translation / medical / video synthesis) | paper
Pathdreamer: A World Model for Indoor Navigation (visual navigation) | paper
iPOKE: Poking a Still Image for Controlled Stochastic Video Synthesis | paper | code | project
Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis | paper | project
KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs | paper | code