OpenIE-LLM

Posted on 2024-03-03, updated on 2024-03-13. Categories: Paper, LLM, IE. Word count: 12k. Reading time ≈ 11 min.

Open Information Extraction 1

A collection of papers on open-domain information extraction, part 1.


LLM-data-augment2

Posted on 2023-11-02, updated on 2023-11-22. Categories: Paper, LLM, Data Augmentation. Word count: 10k. Reading time ≈ 9 min.

LLM Data Augmentation

A collection of papers on LLM-based data augmentation, part 2.


FewGen-icml2023

Posted on 2023-10-30, updated on 2023-11-03. Categories: Paper, Data Augmentation. Word count: 9.7k. Reading time ≈ 9 min.

FewGen-ICML2023

Recent studies have revealed the intriguing few-shot learning ability of pretrained language models (PLMs): They can quickly adapt to a new task when fine-tuned on a small amount of labeled data formulated as prompts, without requiring abundant task-specific annotations. Despite their promising performance, most existing few-shot approaches that only learn from the small training set still underperform fully supervised training by nontrivial margins. In this work, we study few-shot learning with PLMs from a different perspective: We first tune an autoregressive PLM on the few-shot samples and then use it as a generator to synthesize a large amount of novel training samples which augment the original training set. To encourage the generator to produce label-discriminative samples, we train it via weighted maximum likelihood where the weight of each token is automatically adjusted based on a discriminative meta-learning objective. A classification PLM can then be fine-tuned on both the few-shot and the synthetic samples with regularization for better generalization and stability. Our approach FewGen achieves an overall better result across seven classification tasks of the GLUE benchmark than existing few-shot learning methods, improving no-augmentation methods by 5+ average points, and outperforming augmentation methods by 3+ average points.

Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning. University of Illinois Urbana-Champaign. ICML 2023. Code.

A work that fine-tunes a language model to generate training data. Its main focus is how to make the generator produce label-discriminative (/dɪsˈkrɪmɪnətɪv/) samples.
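To make the "weighted maximum likelihood" idea from the abstract concrete, here is a minimal sketch of a per-token weighted negative log-likelihood. It assumes the token log-probabilities are already available and uses hand-set weights; in FewGen the weights are adjusted automatically by a discriminative meta-learning objective, which this sketch does not implement. The function name `weighted_nll` and the example numbers are hypothetical.

```python
import math


def weighted_nll(token_log_probs, token_weights):
    """Weighted negative log-likelihood of a single generated sequence.

    token_log_probs[j] is log p(x_j | x_<j, label) from the generator.
    token_weights[j] is a non-negative weight for token j. Here the weights
    are supplied by hand (an assumption for illustration); the paper learns
    them with a meta-learning objective.
    """
    if len(token_log_probs) != len(token_weights):
        raise ValueError("need exactly one weight per token")
    return -sum(w * lp for w, lp in zip(token_weights, token_log_probs))


# Hypothetical log-probabilities for a three-token sample.
log_probs = [math.log(0.5), math.log(0.25), math.log(0.8)]

# Uniform weights recover the ordinary maximum-likelihood loss.
plain_loss = weighted_nll(log_probs, [1.0, 1.0, 1.0])

# Up-weighting the second token makes its error dominate the loss,
# which is how a label-discriminative token could be emphasised.
emphasised_loss = weighted_nll(log_probs, [0.5, 2.0, 0.5])

print(plain_loss, emphasised_loss)
```

With all weights set to 1 the loss reduces to the standard sequence-level negative log-likelihood, so the weighting only changes which tokens the generator is pushed hardest to fit.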


synthetic-data-llm-sub

Posted on 2023-10-21, updated on 2023-10-29. Categories: Paper, LLM, Data Augmentation. Word count: 4.9k. Reading time ≈ 4 min.

Synthetic Data Generation with Large Language Models for Text Classification: Potential and Limitations

Purdue University. According to the authors' comment, accepted to EMNLP 2023.

The collection and curation of high-quality training data is crucial for developing text classification models with superior performance, but it is often associated with significant costs and time investment. Researchers have recently explored using large language models (LLMs) to generate synthetic datasets as an alternative approach. However, the effectiveness of the LLM-generated synthetic data in supporting model training is inconsistent across different classification tasks. To better understand factors that moderate the effectiveness of the LLM-generated synthetic data, in this study, we look into how the performance of models trained on these synthetic data may vary with the subjectivity of classification. Our results indicate that subjectivity, at both the task level and instance level, is negatively associated with the performance of the model trained on synthetic data. We conclude by discussing the implications of our work on the potential and limitations of leveraging LLM for synthetic data generation.

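Issue: Across different tasks, there is currently no consensus on whether LLM-generated data can match real human-annotated data.

Solution: The authors argue that one reason for this inconsistency is the subjectivity of the specific text classification task. Their experiments show that the more subjective the classification task, the worse the LLM-generated data performs.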


ConvRe-LLM

Posted on 2023-10-17. Categories: Paper, LLM, Capacity. Word count: 2.9k. Reading time ≈ 3 min.

ConvRe

An Investigation of LLMs’ Inefficacy in Understanding Converse Relations. Beihang University. EMNLP 2023. Code.

Large Language Models (LLMs) have achieved remarkable success in many formal language oriented tasks, such as structural data-to-text and semantic parsing. However current benchmarks mostly follow the data distribution of the pre-training data of LLMs. Therefore, a natural question rises that do LLMs really understand the structured semantics of formal languages. In this paper, we investigate this problem on a special case, converse binary relation. We introduce a new benchmark ConvRe focusing on converse relations, which contains 17 relations and 1240 triples extracted from popular knowledge graph completion datasets. Our ConvRE features two tasks, Re2Text and Text2Re, which are formulated as multi-choice question answering to evaluate LLMs’ ability to determine the matching between relations and associated text. For the evaluation protocol, apart from different prompting methods, we further introduce variants to the test text and few-shot example text. We conduct experiments on three popular LLM families and have observed various scaling trends. The results suggest that LLMs often resort to shortcut learning and still face challenges on our proposed benchmark.

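In this paper, the authors investigate how well LLMs understand converse relations and build a benchmark, ConvRe, for this purpose. They find that LLMs understand normal relations better than converse relations, and that understanding of converse relations actually gets worse as model size grows. The authors conjecture that LLMs learn many features during pre-training and, at inference time, tend to exploit these pre-trained features as shortcuts.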


MIE-collection1

Posted on 2023-09-26, updated on 2024-12-17. Categories: Paper, Multimodal, IE. Word count: 20k. Reading time ≈ 18 min.

MRE and MNER 1

A collection of paper summaries on multimodal information extraction, part 1.


LLM-data-augment1

Posted on 2023-09-22, updated on 2024-02-09. Categories: Paper, LLM, Data Augmentation. Word count: 46k. Reading time ≈ 42 min.

LLM Data Augmentation

A collection of papers on LLM-based data augmentation, part 1.


rethinking-role-of-demonstrations

Posted on 2023-09-20. Categories: Paper, LLM, ICL. Word count: 3.1k. Reading time ≈ 3 min.

Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?

University of Washington and Meta. EMNLP 2022. Code.

Large language models (LMs) are able to in-context learn: perform a new task via inference alone by conditioning on a few input-label pairs (demonstrations) and making predictions for new inputs. However, there has been little understanding of how the model learns and which aspects of the demonstrations contribute to end task performance. In this paper, we show that ground truth demonstrations are in fact not required: randomly replacing labels in the demonstrations barely hurts performance on a range of classification and multi-choice tasks, consistently over 12 different models including GPT-3. Instead, we find that other aspects of the demonstrations are the key drivers of end task performance, including the fact that they provide a few examples of (1) the label space, (2) the distribution of the input text, and (3) the overall format of the sequence. Together, our analysis provides a new way of understanding how and why in-context learning works, while opening up new questions about how much can be learned from large language models through inference alone.

The authors experimentally investigate which signals in in-context demonstrations actually help an LLM learn the task.
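The random-label setting described in the abstract can be sketched as follows. This is only an illustration of the idea, not the authors' code: the prompt template ("Review:" / "Sentiment:"), the function name `build_prompt`, and the example data are assumptions. The demonstrations keep their input text, label space, and format; only the pairing between input and label is broken.

```python
import random


def build_prompt(demos, query, label_space, randomize_labels=False, seed=0):
    """Build a few-shot prompt from (text, gold_label) demonstrations.

    With randomize_labels=True, each gold label is replaced by a label drawn
    uniformly from label_space, so the input distribution, label space, and
    sequence format stay the same while the input-label mapping is destroyed.
    The template below is a hypothetical one chosen for illustration.
    """
    rng = random.Random(seed)
    blocks = []
    for text, gold in demos:
        label = rng.choice(label_space) if randomize_labels else gold
        blocks.append(f"Review: {text}\nSentiment: {label}\n")
    # The query uses the same format, with the label left for the model.
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n".join(blocks)


demos = [("A great movie.", "positive"), ("The plot was boring.", "negative")]
labels = ["positive", "negative"]

gold_prompt = build_prompt(demos, "I enjoyed it.", labels)
random_prompt = build_prompt(demos, "I enjoyed it.", labels, randomize_labels=True)
print(random_prompt)
```

Comparing model accuracy on `gold_prompt` versus `random_prompt` over a test set is the kind of experiment the paper reports; the sketch stops at prompt construction and makes no claim about the resulting numbers.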


IE-data-augment-collection1

Posted on 2023-09-17, updated on 2024-12-15. Categories: Paper, IE, Data Augment. Word count: 42k. Reading time ≈ 39 min.

Data Augment for IE papers 1

A collection of information extraction papers based on data augmentation strategies, part 1.


when-how-paraphrase-NER

Posted on 2023-09-16, updated on 2023-09-17. Categories: Paper, NER, Data Augment. Word count: 5.1k. Reading time ≈ 5 min.

When and how to paraphrase for named entity recognition?

ACL 2023

While paraphrasing is a promising approach for data augmentation in classification tasks, its effect on named entity recognition (NER) is not investigated systematically due to the difficulty of span-level label preservation. In this paper, we utilize simple strategies to annotate entity spans in generations and compare established and novel methods of paraphrasing in NLP such as back translation, specialized encoder-decoder models such as Pegasus, and GPT-3 variants for their effectiveness in improving downstream performance for NER across different levels of gold annotations and paraphrasing strength on 5 datasets. We thoroughly explore the influence of paraphrasers, dynamics between paraphrasing strength and gold dataset size on the NER performance with visualizations and statistical testing. We find that the choice of the paraphraser greatly impacts NER performance, with one of the larger GPT-3 variants exceedingly capable of generating high quality paraphrases, yielding statistically significant improvements in NER performance with increasing paraphrasing strength, while other paraphrasers show more mixed results. Additionally, inline auto annotations generated by larger GPT-3 are strictly better than heuristic based annotations. We also find diminishing benefits of paraphrasing as gold annotations increase for most datasets. Furthermore, while most paraphrasers promote entity memorization in NER, the proposed GPT-3 configuration performs most favorably among the compared paraphrasers when tested on unseen entities, with memorization reducing further with paraphrasing strength. Finally, we explore mention replacement using GPT-3, which provides additional benefits over base paraphrasing for specific datasets.

A systematic analysis of using sentence paraphrasing as data augmentation for NER, across different settings and different models.
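The difficulty the abstract points to is span-level label preservation: after paraphrasing, the entity spans have to be located again. One simple heuristic, shown here as an assumption and not necessarily the strategy used in the paper, is to look up each gold entity's surface string in the paraphrase and discard paraphrases that drop an entity. The function name `project_entities` and the example sentence are hypothetical.

```python
def project_entities(paraphrase, entities):
    """Re-annotate a paraphrase by case-insensitive exact string matching.

    entities is a list of (surface, entity_type) pairs from the gold sentence.
    Returns a list of (start, end, entity_type) character spans, or None if
    any entity cannot be found, in which case the paraphrase is unusable for
    augmentation. Assumes lowercasing does not change string length, which
    holds for ASCII text.
    """
    lowered = paraphrase.lower()
    spans = []
    for surface, entity_type in entities:
        start = lowered.find(surface.lower())
        if start == -1:
            return None  # the paraphraser dropped or rewrote this entity
        spans.append((start, start + len(surface), entity_type))
    return spans


gold_entities = [("Barack Obama", "PER"), ("Paris", "LOC")]
kept = project_entities("Last year Barack Obama travelled to Paris.", gold_entities)
dropped = project_entities("Last year the former president went to France.", gold_entities)
print(kept, dropped)
```

A heuristic like this only matches the first occurrence of each surface form and fails whenever the paraphraser rewrites an entity mention, which is one reason a model-generated inline annotation can be preferable.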
