JB D. PRO

IAMJB · jbdel

AI & ML interests

None yet

IAMJB's activity

upvoted 2 papers 5 days ago

ZePo: Zero-Shot Portrait Stylization with Faster Sampling

Paper • 2408.05492 • Published Aug 10 • 7

OMCAT: Omni Context Aware Transformer

Paper • 2410.12109 • Published 7 days ago • 4

upvoted a paper 13 days ago

Instruction-Guided Visual Masking

Paper • 2405.19783 • Published May 30 • 1

upvoted a collection 13 days ago

Chexpert-plus RRG

8 items • Updated 13 days ago • 1

upvoted 3 papers 14 days ago

DinoBloom: A Foundation Model for Generalizable Cell Embeddings in Hematology

Paper • 2404.05022 • Published Apr 7 • 2

BIMCV-R: A Landmark Dataset for 3D CT Text-Image Retrieval

Paper • 2403.15992 • Published Mar 24 • 1

SteP: Stacked LLM Policies for Web Actions

Paper • 2310.03720 • Published Oct 5, 2023 • 7

upvoted 7 papers 15 days ago

TurtleBench: Evaluating Top Language Models via Real-World Yes/No Puzzles

Paper • 2410.05262 • Published 15 days ago • 9

VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide

Paper • 2410.04364 • Published 17 days ago • 26

TLDR: Token-Level Detective Reward Model for Large Vision Language Models

Paper • 2410.04734 • Published 16 days ago • 15

Grounding Language in Multi-Perspective Referential Communication

Paper • 2410.03959 • Published 18 days ago • 3

FAN: Fourier Analysis Networks

Paper • 2410.02675 • Published 19 days ago • 24

Presto! Distilling Steps and Layers for Accelerating Music Generation

Paper • 2410.05167 • Published 15 days ago • 15

Differential Transformer

Paper • 2410.05258 • Published 15 days ago • 159

upvoted 26 papers 18 days ago

Robin3D: Improving 3D Large Language Model via Robust Instruction Tuning

Paper • 2410.00255 • Published 22 days ago • 5

SciPrompt: Knowledge-augmented Prompting for Fine-grained Categorization of Scientific Topics

Paper • 2410.01946 • Published 20 days ago • 4

Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data

Paper • 2410.02056 • Published 20 days ago • 4

Contextual Document Embeddings

Paper • 2410.02525 • Published 19 days ago • 16

Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models

Paper • 2410.01335 • Published 21 days ago • 5

Learning the Latent Rules of a Game from Data: A Chess Story

Paper • 2410.02426 • Published 19 days ago • 5

Intelligence at the Edge of Chaos

Paper • 2410.02536 • Published 19 days ago • 5

Improving Autonomous AI Agents with Reflective Tree Search and Self-Learning

Paper • 2410.02052 • Published 20 days ago • 8

MedVisionLlama: Leveraging Pre-Trained Large Language Model Layers to Enhance Medical Image Segmentation

Paper • 2410.02458 • Published 19 days ago • 9

Vinoground: Scrutinizing LMMs over Dense Temporal Reasoning with Short Videos

Paper • 2410.02763 • Published 19 days ago • 7

L-CiteEval: Do Long-Context Models Truly Leverage Context for Responding?

Paper • 2410.02115 • Published 20 days ago • 10

Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations

Paper • 2410.02762 • Published 19 days ago • 9

MVGS: Multi-view-regulated Gaussian Splatting for Novel View Synthesis

Paper • 2410.02103 • Published 20 days ago • 8

SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration

Paper • 2410.02367 • Published 19 days ago • 45

CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling

Paper • 2409.19291 • Published 24 days ago • 18

Training Language Models on Synthetic Edit Sequences Improves Code Synthesis

Paper • 2410.02749 • Published 19 days ago • 12

Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models

Paper • 2410.02416 • Published 19 days ago • 25

Distilling an End-to-End Voice Assistant Without Instruction Training Data

Paper • 2410.02678 • Published 19 days ago • 21

VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment

Paper • 2410.01679 • Published 20 days ago • 22

Large Language Models as Markov Chains

Paper • 2410.02724 • Published 19 days ago • 31

Depth Pro: Sharp Monocular Metric Depth in Less Than a Second

Paper • 2410.02073 • Published 20 days ago • 38

Loong: Generating Minute-level Long Videos with Autoregressive Language Models

Paper • 2410.02757 • Published 19 days ago • 36

Video Instruction Tuning With Synthetic Data

Paper • 2410.02713 • Published 19 days ago • 33

Contrastive Localized Language-Image Pre-Training

Paper • 2410.02746 • Published 19 days ago • 30

Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models

Paper • 2410.02740 • Published 19 days ago • 52

LLaVA-Critic: Learning to Evaluate Multimodal Models

Paper • 2410.02712 • Published 19 days ago • 34

upvoted 8 papers 20 days ago

Closed-loop Long-horizon Robotic Planning via Equilibrium Sequence Modeling

Paper • 2410.01440 • Published 20 days ago • 3

EVER: Exact Volumetric Ellipsoid Rendering for Real-time View Synthesis

Paper • 2410.01804 • Published 20 days ago • 5

HelpSteer2-Preference: Complementing Ratings with Preferences

Paper • 2410.01257 • Published 21 days ago • 16

Quantifying Generalization Complexity for Large Language Models

Paper • 2410.01769 • Published 20 days ago • 13

ComfyGen: Prompt-Adaptive Workflows for Text-to-Image Generation

Paper • 2410.01731 • Published 20 days ago • 15

LEOPARD : A Vision Language Model For Text-Rich Multi-Image Tasks

Paper • 2410.01744 • Published 20 days ago • 23

From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging

Paper • 2410.01215 • Published 21 days ago • 30

RATIONALYST: Pre-training Process-Supervision for Improving Reasoning

Paper • 2410.01044 • Published 21 days ago • 34

upvoted 2 collections 20 days ago

CheXagent

8 items • Updated 18 days ago • 1

GREEN

9 items • Updated 11 days ago • 1

upvoted 10 papers 21 days ago

IDEAW: Robust Neural Audio Watermarking with Invertible Dual-Embedding

Paper • 2409.19627 • Published 23 days ago • 1

Can Models Learn Skill Composition from Examples?

Paper • 2409.19808 • Published 23 days ago • 8

Coffee-Gym: An Environment for Evaluating and Improving Natural Language Feedback on Erroneous Code

Paper • 2409.19715 • Published 23 days ago • 8

Image Copy Detection for Diffusion Models

Paper • 2409.19952 • Published 23 days ago • 12

Cottention: Linear Transformers With Cosine Attention

Paper • 2409.18747 • Published 25 days ago • 15

Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers

Paper • 2409.20537 • Published 22 days ago • 11

UniAff: A Unified Representation of Affordances for Tool Usage and Articulation with Vision-Language Models

Paper • 2409.20551 • Published 22 days ago • 13

Hyper-Connections

Paper • 2409.19606 • Published 24 days ago • 19

DiaSynth -- Synthetic Dialogue Generation Framework

Paper • 2409.19020 • Published 28 days ago • 19

Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models

Paper • 2409.18943 • Published 25 days ago • 26