LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models Paper • 2410.09732 • Published 10 days ago • 53
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models Paper • 2410.10139 • Published 9 days ago • 48
Toward General Instruction-Following Alignment for Retrieval-Augmented Generation Paper • 2410.09584 • Published 10 days ago • 43
MSI-Agent: Incorporating Multi-Scale Insight into Embodied Agents for Superior Planning and Decision-Making Paper • 2409.16686 • Published 28 days ago • 8
Instruction Following without Instruction Tuning Paper • 2409.14254 • Published about 1 month ago • 27
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models Paper • 2409.17146 • Published 27 days ago • 96
InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning Paper • 2409.12568 • Published Sep 19 • 46
MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines Paper • 2409.12959 • Published Sep 19 • 35
PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation Paper • 2409.06820 • Published Sep 10 • 62
MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery Paper • 2409.05591 • Published Sep 9 • 28
Towards a Unified View of Preference Learning for Large Language Models: A Survey Paper • 2409.02795 • Published Sep 4 • 72
Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis Paper • 2409.06135 • Published Sep 10 • 14
MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct Paper • 2409.05840 • Published Sep 9 • 45
How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with High-Quality Data Paper • 2409.03810 • Published Sep 5 • 30
CogVLM2: Visual Language Models for Image and Video Understanding Paper • 2408.16500 • Published Aug 29 • 56
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search Paper • 2408.08152 • Published Aug 15 • 51
I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm Paper • 2408.08072 • Published Aug 15 • 31
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output Paper • 2407.03320 • Published Jul 3 • 92
Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On Paper • 2407.08348 • Published Jul 11 • 50
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing Paper • 2406.08464 • Published Jun 12 • 62
Scaling Retrieval-Based Language Models with a Trillion-Token Datastore Paper • 2407.12854 • Published Jul 9 • 29
DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning Paper • 2407.04078 • Published Jul 4 • 16
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion Paper • 2407.01392 • Published Jul 1 • 39
Semi-Supervised Knowledge-Grounded Pre-training for Task-Oriented Dialog Systems Paper • 2210.08873 • Published Oct 17, 2022 • 1
Scaling Relationship on Learning Mathematical Reasoning with Large Language Models Paper • 2308.01825 • Published Aug 3, 2023 • 21
InstructERC: Reforming Emotion Recognition in Conversation with a Retrieval Multi-task LLMs Framework Paper • 2309.11911 • Published Sep 21, 2023 • 3
Query and Response Augmentation Cannot Help Out-of-domain Math Reasoning Generalization Paper • 2310.05506 • Published Oct 9, 2023 • 1
Large Language Models Meet Open-World Intent Discovery and Recognition: An Evaluation of ChatGPT Paper • 2310.10176 • Published Oct 16, 2023 • 1
Revisit Input Perturbation Problems for LLMs: A Unified Robustness Evaluation Framework for Noisy Slot Filling Task Paper • 2310.06504 • Published Oct 10, 2023 • 1
OccuQuest: Mitigating Occupational Bias for Inclusive Large Language Models Paper • 2310.16517 • Published Oct 25, 2023 • 1
DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning Paper • 2402.09136 • Published Feb 14 • 1
PreAct: Predicting Future in ReAct Enhances Agent's Planning Ability Paper • 2402.11534 • Published Feb 18 • 1
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery Paper • 2406.08587 • Published Jun 12 • 15
Step-Controlled DPO: Leveraging Stepwise Error for Enhanced Mathematical Reasoning Paper • 2407.00782 • Published Jun 30 • 23
We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning? Paper • 2407.01284 • Published Jul 1 • 76
Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models Paper • 2406.13542 • Published Jun 19 • 16
Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation Paper • 2406.18676 • Published Jun 26 • 5
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence Paper • 2406.11931 • Published Jun 17 • 57
How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition Paper • 2310.05492 • Published Oct 9, 2023 • 2
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models Paper • 2401.06066 • Published Jan 11 • 42