
A Re-evaluation of Knowledge Graph Completion Methods

2019-11-10 ACL 2020


1 Introduction

Recently proposed neural network models for Knowledge Graph Completion (KGC) report results that look inconsistent across datasets:

ConvKB shows a 21.8% improvement over ConvE on FB15k-237 but a 42.3% degradation on WN18RR, which is surprising given that the method is claimed to be better than ConvE.


3 Observations

On investigation, it turns out that at evaluation time some of the affected models, such as ConvKB and KBAT, assign many negative samples exactly the same score as the valid triple.

On average, ConvKB and CapsE have 125 and 278 entities with exactly the same score as the valid triplet over the entire evaluation dataset of FB15k-237, whereas ConvE has around 0.002.

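In this situation, if the valid triple is placed at the front of the candidate triples being evaluated, every tie is broken in its favor and the reported performance is spuriously high.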
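To make the effect of tie placement concrete, here is a minimal sketch in Python. The function name `rank_of_valid`, the placement labels `"first"` and `"last"`, and the convention that a higher score is better are all assumptions made for illustration; they are not taken from any model's actual evaluation code.

```python
def rank_of_valid(scores, valid_idx, placement):
    """Return the rank (1 = best) of the valid candidate.

    Assumption: a higher score means a more plausible triple.
    placement="first" puts the valid triple ahead of every candidate tied with it;
    placement="last" puts it behind every tied candidate.
    """
    valid_score = scores[valid_idx]
    # Candidates that strictly beat the valid triple.
    better = sum(1 for s in scores if s > valid_score)
    # Other candidates with exactly the same score as the valid triple.
    tied = sum(1 for i, s in enumerate(scores)
               if i != valid_idx and s == valid_score)
    if placement == "first":
        return better + 1
    if placement == "last":
        return better + tied + 1
    raise ValueError(f"unknown placement: {placement}")


# A degenerate model that gives every candidate the same score.
constant_scores = [0.0] * 5
rank_first = rank_of_valid(constant_scores, 2, "first")
rank_last = rank_of_valid(constant_scores, 2, "last")
```

With the constant scores above, the "first" placement reports a perfect rank even though the model has learned nothing, while the "last" placement reports the worst possible rank for the same scores.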

4 Evaluation Method



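In the RANDOM protocol, the correct triplet is placed at a random position in \(\mathcal{T}'\),

where \[ \mathcal{T}' = \{ (h, r, t') \mid t' \in \mathcal{E} \} \] is the set of candidate triples obtained by replacing the tail with every entity \(t'\) in the entity set \(\mathcal{E}\).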

RANDOM is the best evaluation technique, since it is both rigorous and fair to the model.
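The random placement can be sketched as follows. This is an illustrative sketch under assumptions, not the paper's implementation: the names `random_rank` and `expected_random_rank` are hypothetical, a higher score is assumed to be better, and "placed randomly" is interpreted as choosing a uniformly random slot among the candidates tied with the valid triple.

```python
import random


def _better_and_tied(scores, valid_idx):
    """Count candidates strictly better than, and exactly tied with, the valid one."""
    valid_score = scores[valid_idx]
    better = sum(1 for s in scores if s > valid_score)
    tied = sum(1 for i, s in enumerate(scores)
               if i != valid_idx and s == valid_score)
    return better, tied


def random_rank(scores, valid_idx, rng):
    """Rank (1 = best) of the valid triple with ties broken uniformly at random."""
    better, tied = _better_and_tied(scores, valid_idx)
    # The valid triple can land in any of the tied + 1 slots after the better ones.
    return better + 1 + rng.randint(0, tied)


def expected_random_rank(scores, valid_idx):
    """Average rank over all equally likely tie-breaking outcomes."""
    better, tied = _better_and_tied(scores, valid_idx)
    return better + 1 + tied / 2


rng = random.Random(0)  # seeded only to make the sketch reproducible
constant_scores = [0.0] * 5
sampled = [random_rank(constant_scores, 2, rng) for _ in range(100)]
average_case = expected_random_rank(constant_scores, 2)
```

Under this interpretation a model that scores every candidate identically gets an expected rank in the middle of the candidate list, so it is neither rewarded nor punished for producing ties, and a model without ties gets the same rank as under any other placement.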