TokenButler

Posted on 2025-03-15 Updated on 2025-03-16 Categories: LLM, Efficacy, KV-Cache, Token Pruning Word count: 2.7k Reading time ≈ 2 min

TokenButler

TokenButler: Token Importance is Predictable. arXiv 2025. Code. Cornell University

Large Language Models (LLMs) rely on the Key-Value (KV) Cache to store token history, enabling efficient decoding of tokens. As the KV-Cache grows, it becomes a major memory and computation bottleneck. However, there is an opportunity to alleviate this bottleneck, especially because prior research has shown that only a small subset of tokens contribute meaningfully to each decoding step. A key challenge in finding these critical tokens is that they are dynamic and heavily dependent on the input query. Existing methods either risk quality by evicting tokens permanently, or retain the full KV-Cache but rely on retrieving chunks (pages) of tokens at generation, failing at dense, context-rich tasks. Additionally, many existing KV-Cache sparsity methods rely on inaccurate proxies for token importance. To address these limitations, we introduce TokenButler, a high-granularity, query-aware predictor that learns to identify these critical tokens. By training a lightweight predictor with less than 1.2% parameter overhead, TokenButler prioritizes tokens based on their contextual, predicted importance. This improves perplexity and downstream accuracy by over 8% relative to SoTA methods for estimating token importance. We evaluate TokenButler on a novel synthetic small-context co-referential retrieval task, demonstrating near-oracle accuracy. Code, models and benchmarks: [Code]
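To make "query-dependent token importance" concrete, here is a minimal sketch in plain Python. It is not the paper's method: it scores cached tokens with the exact query-key logits (an oracle), whereas the abstract describes a lightweight learned predictor that estimates importance instead. The function name `topk_sparse_attention` and the toy keys and values are assumptions made up for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def topk_sparse_attention(query, keys, values, k):
    """Attend only to the k cached tokens with the highest query-key logits.

    Assumption: the true logits serve as the importance score (an oracle).
    A learned predictor would replace `logits` with a cheaper estimate.
    """
    scale = 1.0 / math.sqrt(len(query))
    logits = [dot(query, key) * scale for key in keys]
    # Indices of the k most important cached tokens for THIS query.
    keep = sorted(range(len(keys)), key=lambda i: logits[i], reverse=True)[:k]
    keep.sort()  # restore positional order
    weights = softmax([logits[i] for i in keep])
    dim = len(values[0])
    out = [sum(w * values[i][d] for w, i in zip(weights, keep)) for d in range(dim)]
    return keep, out


# Toy KV cache with four tokens (hypothetical values).
keys = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.0, 0.0]]

keep_a, out_a = topk_sparse_attention([1.0, 0.0], keys, values, k=2)
keep_b, out_b = topk_sparse_attention([0.0, 1.0], keys, values, k=2)
print(keep_a)  # [0, 2]
print(keep_b)  # [1, 2]
```

The two queries keep different token subsets from the same cache, which is why permanently evicting tokens based on one decoding step can hurt later steps.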

Read more »

LMM-Hallucinations-Collection1

Posted on 2025-03-11 Updated on 2025-03-15 Categories: LMM, Hallucinations Word count: 11k Reading time ≈ 10 min

Hallucinations in LMM

Hallucinations in large multimodal models.

Read more »

MIE-Collection2

Posted on 2024-11-13 Updated on 2025-01-13 Categories: Collection, Multimodal, IE Word count: 41k Reading time ≈ 38 min

MRE and MNER 1

Collection 2 of paper summaries on multimodal information extraction.

Read more »

LMM-Grounding-Collection1

Posted on 2024-10-22 Updated on 2025-03-11 Categories: LMM, Grounding Word count: 24k Reading time ≈ 22 min

Grounding LMM

Large multimodal models (LMMs) for grounding.

Read more »

MCR-Collection1

Posted on 2024-09-19 Updated on 2024-11-17 Categories: Collection, Multimodal, MCR Word count: 15k Reading time ≈ 14 min

Multimodal Coreference Resolution

A survey of multimodal coreference resolution.

Read more »

MM-DA

Posted on 2024-09-06 Updated on 2024-11-17 Categories: Collection, Multimodal, LLM Word count: 39k Reading time ≈ 36 min

Multimodal data augmentation

A survey of multimodal data augmentation.

Read more »

LLM-DA-survey

Posted on 2024-09-05 Updated on 2024-09-10 Categories: Survey, LLM, DA Word count: 2.9k Reading time ≈ 3 min

Data Augmentation using LLMs: Data Perspectives, Learning Paradigms and Challenges

Nanyang Technological University. ACL 2024 Findings

In the rapidly evolving field of large language models (LLMs), data augmentation (DA) has emerged as a pivotal technique for enhancing model performance by diversifying training examples without the need for additional data collection. This survey explores the transformative impact of LLMs on DA, particularly addressing the unique challenges and opportunities they present in the context of natural language processing (NLP) and beyond. From both data and learning perspectives, we examine various strategies that utilize LLMs for data augmentation, including a novel exploration of learning paradigms where LLM-generated data is used for diverse forms of further training. Additionally, this paper highlights the primary open challenges faced in this domain, ranging from controllable data augmentation to multimodal data augmentation. This survey highlights a paradigm shift introduced by LLMs in DA, and aims to serve as a comprehensive guide for researchers and practitioners.

Read more »