SHIC: Shape-Image Correspondences with no Keypoint Supervision • Paper • 2407.18907 • Published Jul 26 • 39
Wolf: Captioning Everything with a World Summarization Framework • Paper • 2407.18908 • Published Jul 26 • 30
Floating No More: Object-Ground Reconstruction from a Single Image • Paper • 2407.18914 • Published Jul 26 • 18
Harvesting Textual and Structured Data from the HAL Publication Repository • Paper • 2407.20595 • Published Jul 30 • 21
Knesset-DictaBERT: A Hebrew Language Model for Parliamentary Proceedings • Paper • 2407.20581 • Published Jul 30 • 23
WalkTheDog: Cross-Morphology Motion Alignment via Phase Manifolds • Paper • 2407.18946 • Published Jul 11 • 12
TAPTRv2: Attention-based Position Update Improves Tracking Any Point • Paper • 2407.16291 • Published Jul 23 • 10
Sentiment Analysis of Lithuanian Online Reviews Using Large Language Models • Paper • 2407.19914 • Published Jul 29 • 12
VolDoGer: LLM-assisted Datasets for Domain Generalization in Vision-Language Tasks • Paper • 2407.19795 • Published Jul 29 • 10
Integrating Large Language Models into a Tri-Modal Architecture for Automated Depression Classification • Paper • 2407.19340 • Published Jul 27 • 56
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis • Paper • 2310.00426 • Published Sep 30, 2023 • 61
ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning • Paper • 2309.16650 • Published Sep 28, 2023 • 10
RealFill: Reference-Driven Generation for Authentic Image Completion • Paper • 2309.16668 • Published Sep 28, 2023 • 14
Visual Riddles: a Commonsense and World Knowledge Challenge for Large Vision and Language Models • Paper • 2407.19474 • Published Jul 28 • 22
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning • Paper • 2407.18248 • Published Jul 25 • 30
Eventful Transformers: Leveraging Temporal Redundancy in Vision Transformers • Paper • 2308.13494 • Published Aug 25, 2023 • 9
Relighting Neural Radiance Fields with Shadow and Highlight Hints • Paper • 2308.13404 • Published Aug 25, 2023 • 8
SoTaNa: The Open-Source Software Development Assistant • Paper • 2308.13416 • Published Aug 25, 2023 • 11
Text Injection for Capitalization and Turn-Taking Prediction in Speech Models • Paper • 2308.07395 • Published Aug 14, 2023 • 6
Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification • Paper • 2308.07921 • Published Aug 15, 2023 • 22
RAVEN: In-Context Learning with Retrieval Augmented Encoder-Decoder Language Models • Paper • 2308.07922 • Published Aug 15, 2023 • 17
Enhancing Network Management Using Code Generated by Large Language Models • Paper • 2308.06261 • Published Aug 11, 2023 • 5
Improving Joint Speech-Text Representations Without Alignment • Paper • 2308.06125 • Published Aug 11, 2023 • 7
HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models • Paper • 2312.14091 • Published Dec 21, 2023 • 15
DreamTuner: Single Image is Enough for Subject-Driven Generation • Paper • 2312.13691 • Published Dec 21, 2023 • 26
EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism • Paper • 2312.04916 • Published Dec 8, 2023 • 6
DreaMoving: A Human Dance Video Generation Framework based on Diffusion Models • Paper • 2312.05107 • Published Dec 8, 2023 • 38
CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model • Paper • 2403.05034 • Published Mar 8 • 20
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context • Paper • 2403.05530 • Published Mar 8 • 59
Data Mixture Inference: What do BPE Tokenizers Reveal about their Training Data? • Paper • 2407.16607 • Published Jul 23 • 21
Learning to Manipulate Anywhere: A Visual Generalizable Framework For Reinforcement Learning • Paper • 2407.15815 • Published Jul 22 • 13
DreamCar: Leveraging Car-specific Prior for in-the-wild 3D Car Reconstruction • Paper • 2407.16988 • Published Jul 24 • 7
SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency • Paper • 2407.17470 • Published Jul 24 • 14
Longhorn: State Space Models are Amortized Online Learners • Paper • 2407.14207 • Published Jul 19 • 16
DDK: Distilling Domain Knowledge for Efficient Large Language Models • Paper • 2407.16154 • Published Jul 23 • 20