[Book Reading] 機械翻訳 - Section 5 No.2

Graph Structure
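・A search graph is used for decoding in the phrase-based model
・Weighted directed acyclic graph G = <Φ, V, E, s, g, A>
Φ: set of phrase pairs
(each phrase pair is scored by feature vector h(・) ・ weight vector ω)
V: vertices ≡ partial hypotheses
E: edges; the edge weights make up the weight of a route
E ⊆ V×V×Φ×A
s, g: start and goal vertices
A: set of weights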

Graph Structure
• out(v) = {e ∈ E | tail(e) = v}: set of edges that go
out from vertex v
• in(v) = {e ∈ E | head(e) = v}: set of edges that
head into vertex v
->Phrase pairs are linked by <out(v), in(v)>
In Figure 5.8, the phrase pair <へ行った, I went to> is
linked by
out(v) = <-----,0,<s>> and in(v) = <--・・・,9,went to>
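The edge sets out(v) and in(v) can be sketched in Python as follows. The `Edge` class, its field names and the three-vertex toy graph are assumptions made for illustration; they are not taken from Figure 5.8.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    tail: str      # vertex the edge leaves
    head: str      # vertex the edge enters
    phrase: tuple  # (source phrase, target phrase), an element of the phrase pair set
    weight: float  # edge weight, an element of A

def out_edges(edges, v):
    """out(v) = {e in E | tail(e) = v}"""
    return [e for e in edges if e.tail == v]

def in_edges(edges, v):
    """in(v) = {e in E | head(e) = v}"""
    return [e for e in edges if e.head == v]

# Hypothetical toy graph: s -> a -> g and s -> g
EDGES = [
    Edge("s", "a", ("f1", "e1"), -1.0),
    Edge("a", "g", ("f2", "e2"), -0.5),
    Edge("s", "g", ("f1 f2", "e1 e2"), -2.0),
]
```

Here `out_edges(EDGES, "s")` returns the two edges leaving the start vertex, and `in_edges(EDGES, "g")` returns the two edges entering the goal vertex.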

Graph Structure
• If $\Psi = (e_1, e_2, \ldots, e_l)$ is a route from the start to any vertex,
with $\mathrm{head}(e_k) = \mathrm{tail}(e_{k+1})$, then
Source language phrase sets:
$\bigcup_{k=1}^{l} f(\phi(e_k)) \equiv f(\Psi)$
Target language phrase sets:
$e(\phi(e_1)), \ldots, e(\phi(e_l)) \equiv e(\Psi)$
Route weight: $\omega(\Psi) = \bigotimes_{k=1}^{l} \omega(e_k)$

Graph Structure
• In Fig. 5.8, for the route
Start → <行った, He went> → <へ, to> → <領事館, the consulate>
-> the translation of the source language word sequence
「行った」「へ」「領事館」 is
“He went to the consulate”
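A small sketch of this route in Python, checking the chaining condition head(e_k) = tail(e_{k+1}) and reading off the source phrases, the target string and the route weight. The phrase pairs are the ones quoted above; the vertex names and the edge weights are invented for the example, and ⊗ is taken to be ordinary addition (as in the tropical semiring).

```python
from functools import reduce

# Each edge: (tail, head, (source phrase, target phrase), weight).
# Vertex names and weights are hypothetical.
route = [
    ("s", "v1", ("行った", "He went"), -1.2),
    ("v1", "v2", ("へ", "to"), -0.3),
    ("v2", "v3", ("領事館", "the consulate"), -0.8),
]

def is_route(edges):
    """Check head(e_k) = tail(e_{k+1}) for every pair of consecutive edges."""
    return all(a[1] == b[0] for a, b in zip(edges, edges[1:]))

def source_phrases(edges):
    """Set of source phrases covered by the route."""
    return {e[2][0] for e in edges}

def target_string(edges):
    """Target phrases concatenated in route order."""
    return " ".join(e[2][1] for e in edges)

def route_weight(edges, times=lambda a, b: a + b, one=0.0):
    """Product (under the given times operation) of the edge weights."""
    return reduce(times, (e[3] for e in edges), one)
```

With these assumed weights, `target_string(route)` gives "He went to the consulate" and `route_weight(route)` is the sum of the three edge weights.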

Semiring
• A set R equipped with two binary operations:
addition “+” and multiplication “×”
• Associative:
a+(b+c)=(a+b)+c, a×(b×c)=(a×b)×c
• Commutative (addition): a+b=b+a
• Distributive: a×(b+c)=(a×b)+(a×c), (b+c)×a=(b×a)+(c×a)
• Identity elements 0 and 1:
0+a=a+0=a; 1×a=a×1=a; 0×a=a×0=0
• Additive inverse and multiplicative inverse
are not required

Semiring
• The tropical semiring in Table 5.1 is used to solve the
maximization problem for the route weight in the
decoder
Semiring   A               ⊕     ⊗    0     1
Tropical   ℝ ∪ {−∞, +∞}    max   +    −∞    0
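As a minimal sketch, the tropical semiring of the table can be written down in Python and checked against the semiring laws on sample values. The `Semiring` class and the `satisfies_axioms` helper are hypothetical names introduced here, not part of the book.

```python
import math
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Semiring:
    plus: Callable[[float, float], float]   # the addition operation
    times: Callable[[float, float], float]  # the multiplication operation
    zero: float  # identity of plus, annihilator of times
    one: float   # identity of times

# Tropical semiring: plus = max, times = +, zero = -inf, one = 0
TROPICAL = Semiring(plus=max, times=lambda a, b: a + b, zero=-math.inf, one=0.0)

def satisfies_axioms(sr, a, b, c):
    """Check the semiring laws on one triple of sample values."""
    return all([
        sr.plus(a, sr.plus(b, c)) == sr.plus(sr.plus(a, b), c),      # associative plus
        sr.times(a, sr.times(b, c)) == sr.times(sr.times(a, b), c),  # associative times
        sr.plus(a, b) == sr.plus(b, a),                              # commutative plus
        sr.times(a, sr.plus(b, c))
            == sr.plus(sr.times(a, b), sr.times(a, c)),              # distributive
        sr.plus(sr.zero, a) == a,                                    # zero is identity of plus
        sr.times(sr.one, a) == a,                                    # one is identity of times
        sr.times(sr.zero, a) == sr.zero,                             # zero annihilates
    ])
```

Passing a different pair of operations (for example ordinary + and × with zero 0 and one 1) gives another semiring without changing any code that is written against `plus` and `times`.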

Semiring
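• In the weighted directed graph G, a route from the
start point to the end point for the source
language input f is $\Psi = (e_1, e_2, \ldots, e_l)$
• Score of Ψ = product of the weights of its partial routes
$\omega(\Psi) = \bigotimes_{k=1}^{l} \omega(e_k)$
-> The problem of maximizing this score is
$\max_{\Psi} \bigotimes_{e \in \Psi} \omega(e) = \bigoplus_{\Psi} \bigotimes_{e \in \Psi} \omega(e)$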

Semiring
• In Fig. 5.7, line 11
$Q(v', j''+1, e'_s e''_s) \leftarrow \max\{\, Q(v', j''+1, e'_s e''_s),\ Q(v, j, e' e'') + s_d + s_\phi + s_{lm} \,\}$
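Here the max is the additive operation ⊕, applied at
each vertex of G, starting from the edges with tail(e) = s
• As the semiring satisfies the distributive law
-> the weight ω(v) of any vertex v ∈ V is
$\omega(v) = \bigoplus_{\Psi} \bigotimes_{e \in \Psi} \omega(e) = \bigoplus_{e \in \mathrm{in}(v)} \omega(e) \otimes \omega(\mathrm{tail}(e))$
(Ψ ranges over the routes from s to v)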

Semiring
• Forward-backward algorithm for finding the
maximum route weight in the graph structure
• topological order(G): list of the vertices of graph G
arranged in topological order
• α, β: external variables holding the forward and backward weight of each vertex

Semiring
FORWARD(G)
• for each v ∈ topological order(G), for each e ∈ in(v):
ω = ω(e) ⊗ α(tail(e))
α(v) = α(v) ⊕ ω
[Figure: Start → … → tail(e) → v; the edge e has weight ω(e), and ω = ω(e) ⊗ α(tail(e))]
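A runnable sketch of FORWARD(G) in Python, with the tropical semiring as the default. The toy graph is invented, and the initialization α(start) = 1, α(v) = 0 (semiring one and zero) for all other vertices is an assumption, since the slide does not show it.

```python
import math
from collections import defaultdict

# Hypothetical DAG: edges as (tail, head, weight), vertices in topological order.
EDGES = [("s", "a", -1.0), ("s", "b", -2.0), ("a", "b", -0.5),
         ("a", "g", -3.0), ("b", "g", -1.5)]
TOPO = ["s", "a", "b", "g"]

def forward(edges, topo, start, plus=max, times=lambda x, y: x + y,
            zero=-math.inf, one=0.0):
    """Forward weights alpha(v); defaults implement the tropical semiring."""
    incoming = defaultdict(list)          # in(v) as (tail, weight) pairs
    for tail, head, w in edges:
        incoming[head].append((tail, w))
    alpha = {v: zero for v in topo}
    alpha[start] = one                    # assumed initialization
    for v in topo:                        # every tail is finished before its head
        for tail, w in incoming[v]:
            omega = times(w, alpha[tail])
            alpha[v] = plus(alpha[v], omega)
    return alpha
```

On this graph `forward(EDGES, TOPO, "s")["g"]` is the weight of the best route s → a → b → g.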

Semiring
BACKWARD(G)
• for each v ∈ reverse topological order(G), for each e ∈ out(v):
ω = ω(e) ⊗ β(head(e))
β(v) = β(v) ⊕ ω
[Figure: v → head(e) → … → Goal; the edge e has weight ω(e), and ω = ω(e) ⊗ β(head(e))]
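The mirror-image sketch for BACKWARD(G), again with tropical defaults. The toy graph is invented and the initialization β(goal) = 1 (semiring one) is an assumption not shown on the slide.

```python
import math
from collections import defaultdict

# Hypothetical DAG: edges as (tail, head, weight), vertices in topological order.
EDGES = [("s", "a", -1.0), ("s", "b", -2.0), ("a", "b", -0.5),
         ("a", "g", -3.0), ("b", "g", -1.5)]
TOPO = ["s", "a", "b", "g"]

def backward(edges, topo, goal, plus=max, times=lambda x, y: x + y,
             zero=-math.inf, one=0.0):
    """Backward weights beta(v): best weight of a route from v to the goal."""
    outgoing = defaultdict(list)          # out(v) as (head, weight) pairs
    for tail, head, w in edges:
        outgoing[tail].append((head, w))
    beta = {v: zero for v in topo}
    beta[goal] = one                      # assumed initialization
    for v in reversed(topo):              # every head is finished before its tail
        for head, w in outgoing[v]:
            omega = times(w, beta[head])
            beta[v] = plus(beta[v], omega)
    return beta
```

β at the start vertex equals α at the goal vertex, since both are the weight of the best complete route.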

Semiring
For the problem of choosing the optimum
translation from the search space expressed by the
weighted directed graph G:
Tropical semiring + Forward algorithm
->Viterbi algorithm
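To actually read off the optimum translation, the forward pass with max and + can remember the best incoming edge of each vertex and then follow these back-pointers from the goal. The sketch below does this on an invented graph; the edge labels stand in for target phrases and, like the function name `viterbi`, are hypothetical.

```python
import math

# Hypothetical DAG: (tail, head, label, weight); labels stand in for target phrases.
EDGES = [("s", "a", "x1", -1.0), ("s", "b", "x2", -2.0), ("a", "b", "x3", -0.5),
         ("a", "g", "x4", -3.0), ("b", "g", "x5", -1.5)]
TOPO = ["s", "a", "b", "g"]

def viterbi(edges, topo, start, goal):
    """Best route weight and the edges of the best route (goal must be reachable)."""
    best = {v: -math.inf for v in topo}
    back = {}                             # best incoming edge per vertex
    best[start] = 0.0
    for v in topo:
        for e in edges:
            tail, head, _, w = e
            if head == v and best[tail] + w > best[v]:
                best[v] = best[tail] + w
                back[v] = e
    path, v = [], goal
    while v != start:                     # follow back-pointers to the start
        path.append(back[v])
        v = back[v][0]
    path.reverse()
    return best[goal], path
```

For example, `viterbi(EDGES, TOPO, "s", "g")` returns the best weight together with the edge sequence x1, x3, x5.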

k-best
• Besides the forward-backward algorithm, a k-best
algorithm is used to find the k best routes by weight
• Dijkstra’s algorithm: for the single-source shortest
path problem
• Eppstein’s algorithm: finds the k shortest paths
efficiently using heaps

k-best
• Assume the tropical semiring, with β(v) already
computed by the backward algorithm
• Calculate the weights and choose the maximum using β(v)
• Fig. 5.10 algorithm
・cand: priority queue
・<v, s>: partial route
・<v′, s′>: partial route that extends <v, s> along an
edge e ∈ out(v), i.e. tail(e) = v
・D: set of <v′, s′>

k-best
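• k=1: initialize cand
• Optimize the weight of the partial routes and of the whole
route, repeating k times:
・take the optimal <v, s> out of cand and register it in D
・choose v′ = v and e′ = e ∈ out(v), and insert the result into cand
・cand is a heap ordered with β(・), so the optimal candidate comes out first
[Figure: cand, D and the whole route]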
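One possible reading of this procedure, sketched in Python: partial routes are kept in a priority queue ordered by their score so far combined with β of their last vertex, so complete routes are popped in order of decreasing weight. This is a best-first sketch under the tropical semiring, not a reproduction of the Fig. 5.10 algorithm; the toy graph, the path bookkeeping and the tie-breaking counter are all assumptions.

```python
import heapq
import math
from collections import defaultdict

EDGES = [("s", "a", -1.0), ("s", "b", -2.0), ("a", "b", -0.5),
         ("a", "g", -3.0), ("b", "g", -1.5)]
TOPO = ["s", "a", "b", "g"]

def backward(edges, topo, goal):
    """beta(v): best weight from v to the goal (max, +)."""
    beta = {v: -math.inf for v in topo}
    beta[goal] = 0.0
    for v in reversed(topo):
        for tail, head, w in edges:
            if tail == v:
                beta[v] = max(beta[v], w + beta[head])
    return beta

def k_best(edges, topo, start, goal, k):
    beta = backward(edges, topo, goal)
    out = defaultdict(list)
    for tail, head, w in edges:
        out[tail].append((head, w))
    # heapq is a min-heap, so the priority s + beta(v) is negated.
    cand = [(-(0.0 + beta[start]), 0, start, 0.0, [start])]
    tie = 1                               # unique counter to break priority ties
    done, results = [], []                # done plays the role of D
    while cand and len(results) < k:
        _, _, v, s, path = heapq.heappop(cand)
        done.append((v, s))
        if v == goal:
            results.append((s, path))
            continue
        for head, w in out[v]:            # extend <v, s> along each e in out(v)
            s2 = s + w
            heapq.heappush(cand, (-(s2 + beta[head]), tie, head, s2, path + [head]))
            tie += 1
    return results
```

Because β(v) is the exact best completion weight, the first complete route popped is the optimum and later ones follow in non-increasing order of weight.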

Limitation of Search Space
• If the search space is large
->any reordering is allowed
->the computational cost of the decoding algorithm
becomes massive
->limits are necessary:
・Distortion limit (distortion constraint)
・Reordering limit (reordering constraint)

Distortion Constraint
• Set an upper limit d on the distance between
phrase pairs $\phi_k$ and $\phi_{k-1}$: $|\mathrm{start}_k - \mathrm{end}_{k-1}| \le d$
Rationale: a heavily distorted hypothesis gets a large
penalty, so its model score is small anyway
For language pairs without large reordering, the
distortion constraint is very efficient
If d=0: no skips, translation proceeds from left to right
smoothly
->monotone translation
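A small sketch of the distortion check in Python. The slide does not state an indexing convention, so this example assumes half-open source spans [start, end), under which a phrase that directly follows the previous one has distance 0; the function names and the choice of 0 as the initial end position are also assumptions.

```python
def within_distortion_limit(start_k, end_prev, d):
    """Distance between the new phrase and the previous one is at most d.
    Assumes half-open spans, so start_k == end_prev means 'directly after'."""
    return abs(start_k - end_prev) <= d

def satisfies_limit(spans, d):
    """spans: source spans (start, end) in the order they are translated."""
    end_prev = 0                      # assumed: translation starts at position 0
    for start, end in spans:
        if not within_distortion_limit(start, end_prev, d):
            return False
        end_prev = end
    return True
```

With d = 0 only the left-to-right order `[(0, 2), (2, 3), (3, 5)]` passes, which is the monotone case; a larger d also admits orders such as `[(3, 5), (0, 2), (2, 3)]`.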

Distortion Constraint
• Additional constraint for the case where partial hypotheses
would not be able to reach the end point
j: first position in the source language sentence that is not yet translated
start_k: first position of the phrase being translated
If (j < start_k), add the condition
$\mathrm{end}_k - j \le d$
・IBM Constraint
[Figure: source positions j, start_k and end_k of phrase φ_k; the region that does not need to be examined]
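The additional condition can be sketched as follows, reading j as the first source position that is still untranslated (an interpretation, since the slide wording is ambiguous). The coverage list, the half-open span convention and the function names are assumptions for this example.

```python
def first_uncovered(coverage):
    """Index of the first untranslated source position (len(coverage) if none)."""
    for i, covered in enumerate(coverage):
        if not covered:
            return i
    return len(coverage)

def may_translate(coverage, start_k, end_k, d):
    """Apply end_k - j <= d only when the new phrase skips past position j."""
    j = first_uncovered(coverage)
    if j < start_k:
        return end_k - j <= d
    return True
```

A phrase that starts exactly at j is never rejected by this check; a phrase that jumps ahead is allowed only if it ends within d positions of j, so the decoder can still come back to fill the gap.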

Beam Search
・Prune unpromising partial hypotheses and keep
only the partial hypotheses with high scores,
to reduce computation
・Group the vertices of the search graph and prune the
partial hypotheses with low scores in each group

Beam Search
[Figure: partial hypotheses that are pruned vs. partial hypotheses that are chosen]

Beam Search
Grouping and pruning methods:
- Coverage vector grouping
- Cardinality grouping (by number of translated source words)
- Beam width pruning
- Histogram pruning
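A minimal sketch of grouping and pruning in Python. Hypotheses are grouped by how many source words they cover, then each group is pruned either to a fixed size or to a score margin. Interpreting "histogram pruning" as keeping the n best and "beam width pruning" as keeping everything within a margin of the best score is an assumption, as are the data layout and function names.

```python
from collections import defaultdict

def group_by_covered_count(hyps):
    """hyps: list of (coverage tuple of 0/1, score); group by number of covered words."""
    stacks = defaultdict(list)
    for cov, score in hyps:
        stacks[sum(cov)].append((cov, score))
    return stacks

def histogram_prune(stack, n):
    """Keep at most the n highest-scoring hypotheses."""
    return sorted(stack, key=lambda h: h[1], reverse=True)[:n]

def beam_width_prune(stack, width):
    """Keep hypotheses whose score is within `width` of the best one."""
    best = max(score for _, score in stack)
    return [h for h in stack if h[1] >= best - width]

# Hypothetical partial hypotheses over a three-word source sentence
HYPS = [((1, 0, 0), -1.0), ((0, 1, 0), -2.5), ((0, 0, 1), -1.2), ((1, 1, 0), -3.0)]
```

Grouping first matters because hypotheses that cover different numbers of words are not directly comparable by score.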

Heuristic Function
• Prevents partial hypotheses from being pruned just because of
the part that has not been translated yet
• Adds a predicted score for the rest of the route, as in
A* search, so that the route with the maximum score is found
• ->can reduce search errors
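The idea can be sketched by adding an optimistic estimate for the untranslated words to the score of each partial hypothesis before comparing them. The per-word estimate table and the function names below are hypothetical; how the estimates are obtained is outside this sketch.

```python
def future_estimate(coverage, per_word_estimate):
    """Sum of the estimated scores of the source words not yet covered."""
    return sum(est for covered, est in zip(coverage, per_word_estimate) if not covered)

def priority(score, coverage, per_word_estimate):
    """Score so far plus the predicted score of the remaining part."""
    return score + future_estimate(coverage, per_word_estimate)

# Hypothetical two-word sentence: word 0 is easy, word 1 is hard to translate.
ESTIMATES = [-0.5, -3.0]
easy_first = ((1, 0), -0.5)   # translated the easy word, score so far -0.5
hard_first = ((0, 1), -3.0)   # translated the hard word, score so far -3.0
```

By raw score `hard_first` looks much worse and would be pruned, but `priority` ranks the two hypotheses equally, because each still has to pay for the word it has not translated.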

Pre-reordering Method
For translation between languages which have
significantly different grammatical structures
• Pre-reordering rule
• Pre-reordering model
• Pre-reordering learning

Pre-reordering Rule
• Based on the tree from syntactic analysis, reorder the source into
target language word order
• Rule using head-driven phrase structure grammar (HPSG):
- Syntactic analysis
- Move the heads to the back

Pre-reordering Model
• The source language must have a syntactic analysis
tool and a morphological analysis tool
• Bilingual data are necessary
• The probabilities of the extracted pre-reordering patterns
are estimated by maximum-likelihood estimation (MLE)
• Suitable pre-reordering patterns are chosen
based on parts of speech from
morphological analysis, or on word classes
from clustering
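For count data, maximum-likelihood estimation reduces to relative frequencies. The sketch below estimates P(pattern | context) from observed pairs; conditioning on a part-of-speech context, and the labels "V NP", "swap" and "keep", are assumptions chosen for illustration, not data from the book.

```python
from collections import Counter

def mle(observations):
    """observations: list of (context, pattern); returns P(pattern | context)."""
    pair_counts = Counter(observations)
    context_counts = Counter(ctx for ctx, _ in observations)
    return {(ctx, pat): c / context_counts[ctx]
            for (ctx, pat), c in pair_counts.items()}

# Hypothetical observations extracted from word-aligned bilingual data
OBSERVATIONS = [("V NP", "swap")] * 3 + [("V NP", "keep")]
```

Here `mle(OBSERVATIONS)` assigns 0.75 to swapping and 0.25 to keeping the order in the "V NP" context.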

Pre-reordering Learning
• For language pairs without any syntactic
analysis tools or morphological analysis tools
• A provisional tree structure is generated
automatically from the syntactic analysis result
• Tree nodes are divided into 2 labels: reordering label
[X] and no-reordering label <X>
• The reordering model is formulated as a linear
ordering problem (LOP); an approximate solution
is found and used to build the parse tree
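In a linear ordering problem, each ordered pair of items has a preference score, and the goal is the ordering that maximizes the total score of all pairs placed in that order. The sketch below solves a three-item instance exactly by enumeration, which is only feasible for tiny inputs (hence the approximate solutions mentioned above). The preference matrix is invented and the function names are hypothetical.

```python
from itertools import permutations

def lop_score(order, pref):
    """Sum of pref[a][b] over all pairs where a is placed before b."""
    return sum(pref[a][b] for i, a in enumerate(order) for b in order[i + 1:])

def best_order(pref):
    """Exhaustive search over all orderings; exponential, for illustration only."""
    n = len(pref)
    return max(permutations(range(n)), key=lambda o: lop_score(o, pref))

# Hypothetical preferences: PREF[a][b] = gain for placing word a before word b
PREF = [
    [0, 1, 5],
    [4, 0, 2],
    [1, 3, 0],
]
```

For this matrix `best_order(PREF)` places item 1 first, then item 0, then item 2.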
