KB + Text => Great KB: Skimming a Batch of Papers

Koji Matsuda (@conditional)
Note: the figures on each slide are quoted from the original papers.

Papers on KB + Text => Great KB
1. Connecting Language and Knowledge Bases with Embedding Models for Relation Extraction, Weston+, EMNLP 2013
2. Knowledge Graph and Text Jointly Embedding, Wang+, EMNLP 2014
3. Representing Text for Joint Embedding of Text and Knowledge Bases, Toutanova+, EMNLP 2015
4. Representation Learning of Knowledge Graphs with Entity Descriptions, Xie+, AAAI 2016 (February)
5. Text-Enhanced Representation Learning for Knowledge Graph, Wang and Li, IJCAI 2016 (July)
6. Distributed representation learning for knowledge graphs with entity descriptions, Fan+, Pattern Recognition Letters, 2016/09
7. Knowledge Representation via Joint Learning of Sequential Text and Knowledge Graphs, Wu+, https://arxiv.org/abs/1609.07075
8. WebBrain: Joint Neural Learning of Large-Scale Commonsense Knowledge, Chen+, ISWC 2016 (October?)
9. Joint Representation Learning of Text and Knowledge for Knowledge Graph Completion, Han+, https://arxiv.org/pdf/1611.04125v1.pdf
Red text marks papers whose first author is affiliated with Tsinghua University
(apparently there is more than one group...)

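KB + Text => Great KB
• Use text to "augment" the knowledge recorded in a knowledge graph (Freebase)
– Predict knowledge that is not written in the knowledge base
[Figure: overview of the approach]
• Input: a knowledge base (Freebase, WordNet) and text (Wikipedia, NYT, ClueWeb (FACC1))
• Knowledge embedding: TransE or RESCAL
• Mention embedding: CNN, RNN, etc.
• The two embeddings are fused
• Output: a much better knowledge base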

Preliminaries: Knowledge Graph
• Represent a knowledge base as a graph
– Nodes: entities; edges: relation types
• A fact: (head, relation, tail)
– (Washington, CapitalOf, USA)
– (Barack Obama, gender, Male)
• Joining the individual facts together yields a graph
– This is called a Knowledge Graph
– Freebase, WordNet…
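A minimal sketch of how (head, relation, tail) facts join into a graph. The triples are the two examples from the slide; the helper name `build_graph` and the adjacency-map layout are assumptions made for illustration, not something any of the papers prescribe.

```python
from collections import defaultdict

# Example facts (head, relation, tail) taken from the slide.
triples = [
    ("Washington", "CapitalOf", "USA"),
    ("Barack Obama", "gender", "Male"),
]

def build_graph(triples):
    """Index triples as an adjacency map: head -> list of (relation, tail)."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph

graph = build_graph(triples)
for head, edges in graph.items():
    for relation, tail in edges:
        print(f"{head} --{relation}--> {tail}")
```

Each node is an entity and each labeled edge is a relation type, so a missing fact corresponds to a missing edge.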

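Preliminaries: Distributed Knowledge Representation
• Embed the entities and relations of a Knowledge Graph into a vector space
– Applications: relation extraction from text, knowledge base completion (predicting missing tuples)
– TransE [Bordes+, 2013]
– TransH [Wang+, 2014]
• Project head (and tail) onto a relation-specific hyperplane, then add relation
– TransR [Lin+, 2015]
• Map entities into the relation's space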
[Excerpt from a paper, quoted on the slide]
... as a translation between entities. We introduce TransE and its extensions TransH and TransR in detail.

TransE
For each triple (h, r, t), TransE [Bordes et al., 2013] wants $h + r \approx t$ when (h, r, t) holds. This indicates that t should be the nearest entity from (h + r). Hence, TransE defines the following energy function
$$f_r(h, t) = \lVert h + r - t \rVert_{L_1/L_2} \tag{1}$$
The function returns a low score if (h, r, t) holds, and vice versa.

TransH
TransH [Wang et al., 2014b] enables an entity to have distinct embeddings when involved in different relations. For a relation r, TransH models the relation with a vector $r$ and a hyperplane with $w_r$ as the normal vector. Then the score function is defined as
$$f_r(h, t) = \lVert (h - w_r^{\top} h \, w_r) + r - (t - w_r^{\top} t \, w_r) \rVert_{L_1/L_2} \tag{2}$$

TransR
TransR [Lin et al., 2015b] models entities and relations in entity space and relation spaces, and performs translation in relation spaces. TransR sets a projection matrix $M_r \in \mathbb{R}^{k \times d}$, which projects entities from entity space to relation space. Via the mapping matrix, the energy function is correspondingly defined as
$$f_r(h, t) = \lVert h M_r + r - t M_r \rVert_{L_1/L_2} \tag{3}$$
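The three energy functions quoted above can be sketched in a few lines of NumPy. This is an illustrative sketch, not code from any of the papers: the function names, the default L1 norm, and the row-vector convention (entities have dimension k, relations dimension d, and `M_r` has shape (k, d)) are assumptions chosen to match the excerpt. A lower energy means the triple is more plausible.

```python
import numpy as np

def transe_energy(h, r, t, p=1):
    """TransE: || h + r - t || under the L1 (p=1) or L2 (p=2) norm."""
    return float(np.linalg.norm(h + r - t, ord=p))

def transh_energy(h, r, t, w_r, p=1):
    """TransH: project h and t onto the hyperplane with normal w_r, then translate."""
    w_r = w_r / np.linalg.norm(w_r)          # the normal vector is kept at unit length
    h_perp = h - np.dot(w_r, h) * w_r
    t_perp = t - np.dot(w_r, t) * w_r
    return float(np.linalg.norm(h_perp + r - t_perp, ord=p))

def transr_energy(h, r, t, M_r, p=1):
    """TransR: map entities into the relation space with M_r, then translate."""
    return float(np.linalg.norm(h @ M_r + r - t @ M_r, ord=p))

# Toy vectors (made up for illustration): t is exactly h + r.
h = np.array([1.0, 0.0])
r = np.array([0.0, 1.0])
t = np.array([1.0, 1.0])
corrupted_t = np.array([0.0, 0.0])

print(transe_energy(h, r, t), transe_energy(h, r, corrupted_t))
print(transh_energy(h, r, t, w_r=np.array([1.0, 0.0])))
print(transr_energy(h, r, t, M_r=np.eye(2)))
```

Training typically pushes the energy of observed triples below that of corrupted ones by a margin; that part is omitted here.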

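Preliminaries: Datasets and Evaluation
• Standard dataset for evaluating Knowledge Base Completion (KBC): FB15k
• Evaluation
– KBC: entity (head, tail) prediction and relation prediction
• (h, r, ?): predict tail given head and relation
• Metrics: MRR, Top-N accuracy (%)
– Triple classification: given (h, r, t), decide whether it is contained in the KB: Acc, AUC
– Relation extraction from text: P-R graph, AUC
Other datasets:
• FB20k [Xie+ AAAI2016]: intended for zero-shot experiments
• FB15k-237 [Toutanova 2015]: relation types restricted to 237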
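A small sketch of the two KBC metrics named above, MRR and Top-N accuracy (often called Hits@N). The helper names and the toy scores are hypothetical; the sketch assumes lower energy is better and ignores ties and the "filtered" evaluation setting that papers commonly report.

```python
def rank_of(energies, gold):
    """1-based rank of the gold candidate when candidates are sorted by ascending energy."""
    ordered = sorted(energies, key=energies.get)
    return ordered.index(gold) + 1

def mean_reciprocal_rank(ranks):
    """MRR: the average of 1 / rank over all test queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_n(ranks, n):
    """Top-N accuracy: the fraction of queries whose gold answer is ranked within the top n."""
    return sum(1 for r in ranks if r <= n) / len(ranks)

# Toy query (Washington, CapitalOf, ?) with made-up energies per candidate tail.
energies = {"USA": 0.1, "France": 0.9, "Japan": 0.5}
print(rank_of(energies, "USA"))

# Ranks of the gold answer over three hypothetical test queries.
ranks = [1, 2, 4]
print(mean_reciprocal_rank(ranks), hits_at_n(ranks, 3))
```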

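Connecting Language and Knowledge Bases with Embedding Models for Relation Extraction
Weston+ EMNLP 2013
• Use KB-derived information for Relation Extraction
– Introduce KB-derived information into the score function that decides whether head and tail stand in relation r
[Figure annotations]
– Score function for (mention, relation) pairs (learned)
– TransE
– Summed over all mentions m that contain h and t
• On the relation extraction task on NYT, performance improves over a text-only baseline (red line -> black line)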
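A rough sketch of the composite score described above: a learned mention-relation score summed over all mentions containing h and t, plus a TransE-style KB score. The function names, the dot-product form of the mention score, and the use of separate relation vectors for text and KB are assumptions made for illustration; the paper's exact formulation may differ.

```python
import numpy as np

def mention_score(mention_vec, r_text):
    """Score of a (mention, relation) pair; a dot product is assumed here."""
    return float(np.dot(mention_vec, r_text))

def kb_score(h, r_kb, t):
    """TransE-style KB score: negative squared distance, so higher is better."""
    return -float(np.sum((h + r_kb - t) ** 2))

def composite_score(mention_vecs, r_text, h, r_kb, t):
    """Sum the text score over all mentions of (h, t), then add the KB score."""
    text_part = sum(mention_score(m, r_text) for m in mention_vecs)
    return text_part + kb_score(h, r_kb, t)

# Made-up toy vectors: two mentions of the entity pair, and a KB triple that fits exactly.
mentions = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
score = composite_score(
    mentions,
    r_text=np.array([1.0, 1.0]),
    h=np.array([0.0, 0.0]),
    r_kb=np.array([1.0, 0.0]),
    t=np.array([1.0, 0.0]),
)
print(score)
```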

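Knowledge Graph and Text Jointly Embedding
Wang+, EMNLP 2014
• A probabilistic version of TransE: pTransE (TransE simply placed under a softmax)
• A model of relations between words
• An alignment model
• All optimized jointly
• Trained on Wikipedia
• Result: inference becomes possible even for out-of-KB entities
• There are several points in this paper that I did not fully understand, though...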
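To illustrate what "TransE under a softmax" can look like, here is a hedged sketch that turns TransE distances into a probability distribution over candidate tails. The function name, the factor of one half, and the toy vectors are assumptions for illustration, not details confirmed by the slide.

```python
import numpy as np

def tail_probabilities(h, r, candidate_tails):
    """Softmax over candidates of -0.5 * ||h + r - t||^2 (assumed form of the logit)."""
    logits = -0.5 * np.sum((h + r - candidate_tails) ** 2, axis=1)
    logits = logits - logits.max()           # subtract the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

# Toy example: the first candidate equals h + r, so it should get the highest probability.
h = np.array([1.0, 0.0])
r = np.array([0.0, 1.0])
candidates = np.array([[1.0, 1.0], [0.0, 0.0], [3.0, 3.0]])
probs = tail_probabilities(h, r, candidates)
print(probs, int(probs.argmax()))
```

In practice the normalization over all entities is expensive, so it is usually approximated (for example by negative sampling).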

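Representing Text for Joint Embedding of Text and Knowledge Bases
Toutanova+, EMNLP 2015
• DistMult model
– Based on a model called RESCAL rather than on TransE
• f(h, r, t) = r^T (h ⊙ t)
• Embed dependency paths into a vector space with a CNN
• Experiment: prediction on FB15k
– Text: FACC1, a corpus to which entity linking (EL) has already been applied
• 2.7M relations
– Adding text: MRR improves by about 2 points
• I think the effect is quite small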
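The DistMult score f(h, r, t) = r^T (h ⊙ t) is small enough to write out directly. The sketch below uses made-up vectors; the function name is an assumption. Unlike the TransE energy, a higher score means a more plausible triple.

```python
import numpy as np

def distmult_score(h, r, t):
    """DistMult: r^T (h * t), where * is the element-wise product."""
    return float(np.dot(r, h * t))

h = np.array([1.0, 2.0])
r = np.array([3.0, 4.0])
t = np.array([5.0, 6.0])

print(distmult_score(h, r, t))
# Swapping head and tail gives the same score, so this scoring function is symmetric.
print(distmult_score(t, r, h))
```

The symmetry is a known limitation: the score cannot distinguish (h, r, t) from (t, r, h).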

Representation Learning of Knowledge Graphs with Entity Descriptions
Xie+, AAAI 2016
• DKRL: a model that encodes the descriptions of entities in Freebase
– Knowledge graph embedding: TransE
– Encoder: CNN
• Even if an entity is not in the KB, inference is possible as long as it has a description (zero-shot)
– Demonstrated by building a new dataset, FB20k
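The zero-shot idea can be sketched as follows: an entity embedding is computed from the words of its description, then plugged into the TransE energy. This is a single-layer simplification with hypothetical names, shapes, and random weights; the actual DKRL encoder is likely deeper and is trained jointly with the KB embeddings.

```python
import numpy as np

def cnn_encode(word_vecs, filters, window=2):
    """Encode a description: convolve over word windows, apply tanh, then max-pool.

    word_vecs: (n_words, d_word) with n_words >= window
    filters:   (window * d_word, d_out)
    """
    n_words = word_vecs.shape[0]
    windows = np.stack(
        [word_vecs[i:i + window].reshape(-1) for i in range(n_words - window + 1)]
    )
    feature_maps = np.tanh(windows @ filters)   # (n_windows, d_out)
    return feature_maps.max(axis=0)             # max-pooling over positions

def zero_shot_energy(head_description, r, t, filters):
    """TransE energy where the head embedding comes only from its description."""
    h = cnn_encode(head_description, filters)
    return float(np.linalg.norm(h + r - t, ord=1))

rng = np.random.default_rng(0)
description = rng.normal(size=(5, 3))   # 5 words with 3-dimensional word vectors (made up)
filters = rng.normal(size=(6, 4))       # window of 2 words -> 4-dimensional entity vector
r = rng.normal(size=4)
t = rng.normal(size=4)
print(cnn_encode(description, filters).shape, zero_shot_energy(description, r, t, filters))
```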

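Distributed representation learning for knowledge graphs with entity descriptions
Fan+, Pattern Recognition Letters 2016/09
• Does the same thing as the previous paper
– Embeds the descriptions of head and tail
– The whole model is a shallow probabilistic model
• A probabilistic model in which the description is generated from an entity
• Compared with the previous paper, it has fewer parameters and performs better (this is the main contribution)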

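Text-Enhanced Representation Learning for Knowledge Graph
Wang and Li, IJCAI 2016
• From an entity-linked corpus, learn a representation of the context surrounding each entity
– Context embedding: weighted average
– Combination with the entity embedding: linear transformation
• Multiply the context embedding by a transformation matrix, then add the TransE representation
• Result: relation prediction improves for 1-to-N and N-to-1 relations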
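The combination step described above (weighted-average context embedding, multiplied by a transformation matrix, plus the TransE representation) can be sketched as below. The names `context_embedding`, `text_enhanced_entity`, and `A`, as well as the toy weights, are assumptions for illustration.

```python
import numpy as np

def context_embedding(neighbor_vecs, weights):
    """Weighted average of the vectors of words that co-occur with the entity."""
    weights = np.asarray(weights, dtype=float)
    return weights @ neighbor_vecs / weights.sum()

def text_enhanced_entity(entity_vec, neighbor_vecs, weights, A):
    """Linearly transform the context embedding and add the TransE entity embedding."""
    return A @ context_embedding(neighbor_vecs, weights) + entity_vec

# Toy inputs: two context words with co-occurrence weights 3 and 1 (made up).
neighbors = np.array([[1.0, 0.0], [0.0, 1.0]])
enhanced = text_enhanced_entity(
    entity_vec=np.array([1.0, 1.0]),
    neighbor_vecs=neighbors,
    weights=[3, 1],
    A=np.eye(2),
)
print(enhanced)
```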

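Knowledge Representation via Joint Learning of Sequential Text and Knowledge Graphs
Wu+, arXiv 2016/09
• Sentences that mention an entity help in understanding that entity
– There are generally several such sentences
– But noise is mixed in, so it is filtered out with attention
• The idea is to attend strongly to sentences that look like definitions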
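A minimal sketch of attention over the sentences that mention an entity: each sentence vector is scored against the entity vector, the scores are normalized with a softmax, and the sentence vectors are averaged with those weights. Dot-product scoring and the toy vectors are assumptions; the paper encodes sentences with an LSTM and may score them differently.

```python
import numpy as np

def attend(entity_vec, sentence_vecs):
    """Return (attention weights, attention-weighted sentence representation)."""
    scores = sentence_vecs @ entity_vec
    scores = scores - scores.max()            # subtract the max for numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum()
    return weights, weights @ sentence_vecs

# Toy example: the first sentence vector is aligned with the entity, the second is noise.
entity = np.array([1.0, 0.0])
sentences = np.array([[2.0, 0.0], [0.0, 2.0]])
weights, text_repr = attend(entity, sentences)
print(weights, text_repr)
```

Noisy sentences receive small weights, so they contribute little to the final text representation.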

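Joint Representation Learning of Text and Knowledge for Knowledge Graph Completion
Han+, arXiv 2016/11
• Similar to [Toutanova+ 2015]
– Applies a CNN to relation patterns rather than to dependency paths
• The argument: dependency parsing often fails on Web text, so a shallow approach is preferable
– Text is obtained by distant supervision
• Knowledge is acquired from sentences that contain both head and tail
• Relation classification task on NYT: performance improves over a text-only baseline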
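Distant supervision, as described above, labels every sentence that contains both head and tail with the KB relation between them. The sketch below uses plain substring matching and made-up sentences; real pipelines rely on entity linking, and the function name is hypothetical.

```python
def distant_supervision(triples, sentences):
    """Pair each sentence with every KB triple whose head and tail both appear in it."""
    examples = []
    for sentence in sentences:
        for head, relation, tail in triples:
            if head in sentence and tail in sentence:
                examples.append((sentence, head, relation, tail))
    return examples

triples = [("Washington", "CapitalOf", "USA")]
sentences = [
    "Washington is the capital of the USA.",
    "Washington was the first president.",
]
for example in distant_supervision(triples, sentences):
    print(example)
```

Labels obtained this way are noisy, since a sentence can mention both entities without expressing the relation.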

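Summary
| Paper | KB embedding | Text embedding | Combination | Evaluation |
| --- | --- | --- | --- | --- |
| Weston+ EMNLP 2013 | TransE (4M entities) | Relation prediction model; NYT corpus | Composite Score | Relation extraction on NYT+FB |
| Wang+ EMNLP 2014 | pTransE; Freebase (huge) | Skip-gram / Wikipedia | Alignment model + Composite Score | Triple classification, relation extraction, analogy |
| Toutanova+ EMNLP 2015 | DistMult (RESCAL); FB15k | Dependency paths + CNN; entity-linked ClueWeb12 (FACC1) | Composite Score | KBC on FB15k |
| Xie+ AAAI 2016 | TransE; FB15k | CNN; Freebase descriptions | Composite Score | KBC on FB15k and zero-shot (FB20k) |
| Wang and Li IJCAI 2016 | Trans*; FB15k / WN | Keyword extraction and averaging | Linear map into the entity space | KBC; many-to-many relations |
| Wu+ arXiv 2016/09 | TransE; FB15k | Reference sentences; LSTM + Attention (Wikipedia) | Composite Score | KBC / triple classification |
| Han+ arXiv 2016/11 | TransE; FB15k | Relation patterns + CNN; NYT + distant supervision | Composite Score | KBC on FB15k; relation classification on NYT |
Red text: papers I found interesting and would like to read closely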
