- Contrastive Learning for Many-to-many Multilingual Neural Machine Translation
  Paper • 2105.09501 • Published • 1
- Cross-modal Contrastive Learning for Speech Translation
  Paper • 2205.02444 • Published
- ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs
  Paper • 2210.03052 • Published
- Diffusion Glancing Transformer for Parallel Sequence to Sequence Learning
  Paper • 2212.10240 • Published • 1
Collections
Collections including paper arxiv:2406.09414

- Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis
  Paper • 2401.09048 • Published • 10
- Improving fine-grained understanding in image-text pre-training
  Paper • 2401.09865 • Published • 18
- Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
  Paper • 2401.10891 • Published • 62
- Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild
  Paper • 2401.13627 • Published • 78

- mDPO: Conditional Preference Optimization for Multimodal Large Language Models
  Paper • 2406.11839 • Published • 40
- Pandora: Towards General World Model with Natural Language Actions and Video States
  Paper • 2406.09455 • Published • 16
- WPO: Enhancing RLHF with Weighted Preference Optimization
  Paper • 2406.11827 • Published • 17
- In-Context Editing: Learning Knowledge from Self-Induced Distributions
  Paper • 2406.11194 • Published • 20

- Depth Anything V2
  Paper • 2406.09414 • Published • 103
- An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels
  Paper • 2406.09415 • Published • 51
- Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion
  Paper • 2406.04338 • Published • 39
- SAM 2: Segment Anything in Images and Videos
  Paper • 2408.00714 • Published • 122

- Drivable 3D Gaussian Avatars
  Paper • 2311.08581 • Published • 47
- Single-Shot Implicit Morphable Faces with Consistent Texture Parameterization
  Paper • 2305.03043 • Published • 6
- One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion
  Paper • 2311.07885 • Published • 40
- Dynamic Mesh-Aware Radiance Fields
  Paper • 2309.04581 • Published • 7