Minbyul Jeong

Minbyul

https://minstar.github.io/

AI & ML interests

Biomedical Natural Language Processing, Graph Network

Minbyul's activity

upvoted 3 papers 1 day ago

MedMobile: A mobile-sized language model with expert-level clinical capabilities

Paper • 2410.09019 • Published 11 days ago • 8

A Comparative Study on Reasoning Patterns of OpenAI's o1 Model

Paper • 2410.13639 • Published 5 days ago • 14

BenTo: Benchmark Task Reduction with In-Context Transferability

Paper • 2410.13804 • Published 5 days ago • 20

upvoted 2 papers 6 days ago

Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts

Paper • 2410.10626 • Published 8 days ago • 36

SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning

Paper • 2410.09754 • Published 10 days ago • 7

upvoted a paper 7 days ago

Thinking LLMs: General Instruction Following with Thought Generation

Paper • 2410.10630 • Published 8 days ago • 7

upvoted a paper 8 days ago

ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains

Paper • 2410.09870 • Published 9 days ago • 7

upvoted 4 papers 9 days ago

Everything Everywhere All at Once: LLMs can In-Context Learn Multiple Tasks in Superposition

Paper • 2410.05603 • Published 15 days ago • 11

Intriguing Properties of Large Language and Vision Models

Paper • 2410.04751 • Published 16 days ago • 16

Self-Boosting Large Language Models with Synthetic Preference Data

Paper • 2410.06961 • Published 13 days ago • 14

One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation

Paper • 2410.07170 • Published 13 days ago • 15

upvoted 10 papers 13 days ago

DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search

Paper • 2410.03864 • Published 18 days ago • 10

ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery

Paper • 2410.05080 • Published 15 days ago • 19

Differential Transformer

Paper • 2410.05258 • Published 15 days ago • 159

LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations

Paper • 2410.02707 • Published 19 days ago • 45

What Matters for Model Merging at Scale?

Paper • 2410.03617 • Published 18 days ago • 7

MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs

Paper • 2410.04698 • Published 16 days ago • 13

Named Clinical Entity Recognition Benchmark

Paper • 2410.05046 • Published 15 days ago • 17

TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention

Paper • 2410.05076 • Published 15 days ago • 6

Only-IF: Revealing the Decisive Effect of Instruction Diversity on Generalization

Paper • 2410.04717 • Published 16 days ago • 17

LongGenBench: Long-context Generation Benchmark

Paper • 2410.04199 • Published 17 days ago • 17

upvoted 2 papers 15 days ago

Tutor CoPilot: A Human-AI Approach for Scaling Real-Time Expertise

Paper • 2410.03017 • Published 19 days ago • 25

Addition is All You Need for Energy-efficient Language Models

Paper • 2410.00907 • Published 21 days ago • 138

upvoted 2 papers 16 days ago

Erasing Conceptual Knowledge from Language Models

Paper • 2410.02760 • Published 19 days ago • 12

Selective Attention Improves Transformer

Paper • 2410.02703 • Published 19 days ago • 22

upvoted 25 papers 18 days ago

LLaVA-Critic: Learning to Evaluate Multimodal Models

Paper • 2410.02712 • Published 19 days ago • 34

SciPrompt: Knowledge-augmented Prompting for Fine-grained Categorization of Scientific Topics

Paper • 2410.01946 • Published 20 days ago • 4

MedVisionLlama: Leveraging Pre-Trained Large Language Model Layers to Enhance Medical Image Segmentation

Paper • 2410.02458 • Published 19 days ago • 9

Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations

Paper • 2410.02762 • Published 19 days ago • 9

L-CiteEval: Do Long-Context Models Truly Leverage Context for Responding?

Paper • 2410.02115 • Published 20 days ago • 10

SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration

Paper • 2410.02367 • Published 20 days ago • 45

Training Language Models on Synthetic Edit Sequences Improves Code Synthesis

Paper • 2410.02749 • Published 19 days ago • 12

Large Language Models as Markov Chains

Paper • 2410.02724 • Published 19 days ago • 31

Quantifying Generalization Complexity for Large Language Models

Paper • 2410.01769 • Published 20 days ago • 13

Is Preference Alignment Always the Best Option to Enhance LLM-Based Translation? An Empirical Analysis

Paper • 2409.20059 • Published 23 days ago • 15

Not All LLM Reasoners Are Created Equal

Paper • 2410.01748 • Published 20 days ago • 27

From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging

Paper • 2410.01215 • Published 21 days ago • 30

RATIONALYST: Pre-training Process-Supervision for Improving Reasoning

Paper • 2410.01044 • Published 21 days ago • 34

Law of the Weakest Link: Cross Capabilities of Large Language Models

Paper • 2409.19951 • Published 23 days ago • 53

Can Models Learn Skill Composition from Examples?

Paper • 2409.19808 • Published 23 days ago • 8

Coffee-Gym: An Environment for Evaluating and Improving Natural Language Feedback on Erroneous Code

Paper • 2409.19715 • Published 23 days ago • 8

DiaSynth -- Synthetic Dialogue Generation Framework

Paper • 2409.19020 • Published 28 days ago • 19

Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models

Paper • 2409.18943 • Published 25 days ago • 26

MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning

Paper • 2409.20566 • Published 22 days ago • 50

LML: Language Model Learning a Dataset for Data-Augmented Prediction

Paper • 2409.18957 • Published 25 days ago • 9

Modulated Intervention Preference Optimization (MIPO): Keep the Easy, Refine the Difficult

Paper • 2409.17545 • Published 27 days ago • 17

MinerU: An Open-Source Solution for Precise Document Content Extraction

Paper • 2409.18839 • Published 25 days ago • 24

VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models

Paper • 2409.17066 • Published 27 days ago • 26

A Survey on the Honesty of Large Language Models

Paper • 2409.18786 • Published 25 days ago • 29

Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published 25 days ago • 84

upvoted 5 papers 26 days ago

Reducing the Footprint of Multi-Vector Retrieval with Minimal Performance Impact via Token Pooling

Paper • 2409.14683 • Published 30 days ago • 8

Pixel-Space Post-Training of Latent Diffusion Models

Paper • 2409.17565 • Published 27 days ago • 19

Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction

Paper • 2409.17422 • Published 27 days ago • 23

MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models

Paper • 2409.17481 • Published 27 days ago • 46

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions

Paper • 2409.18042 • Published 26 days ago • 36

upvoted 5 papers 28 days ago

A Case Study of Web App Coding with OpenAI Reasoning Models

Paper • 2409.13773 • Published Sep 19 • 4

An adapted large language model facilitates multiple medical tasks in diabetes care

Paper • 2409.13191 • Published Sep 20 • 6

Style over Substance: Failure Modes of LLM Judges in Alignment Benchmarking

Paper • 2409.15268 • Published 29 days ago • 11

Beyond Fine-tuning: Unleashing the Potential of Continuous Pretraining for Clinical LLMs

Paper • 2409.14988 • Published 29 days ago • 21

Phantom of Latent for Large Language and Vision Models

Paper • 2409.14713 • Published 30 days ago • 27