A Rule-Based Approach For Aligning Japanese-Spanish Sentences From A Comparable Corpora

Ramírez, Jessica C.; Matsumoto, Yuji

arXiv:1211.4488 (cs)

[Submitted on 19 Nov 2012]

Abstract: The performance of a Statistical Machine Translation (SMT) system is directly proportional to the quality and size of the parallel corpus it uses. However, for some language pairs there is a considerable lack of such corpora. Our long-term goal is to construct a Japanese-Spanish parallel corpus for use in SMT, because useful Japanese-Spanish parallel corpora are scarce. To address this problem, in this study we propose a method for extracting Japanese-Spanish parallel sentences from Wikipedia using POS tagging and a rule-based approach. The approach focuses mainly on the syntactic features of both languages. Human evaluation was performed on a sample and shows promising results in comparison with the baseline.
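
The abstract does not spell out the rules themselves, so the following is only a minimal sketch of what a POS-based, rule-driven sentence matcher could look like, not the authors' method. It assumes both sentences have already been tagged with a shared coarse tag set; the tag names, the `pos_similarity` and `align` helpers, the 0.75 threshold, and the toy sentences are all hypothetical and chosen for illustration.

```python
from collections import Counter

# Assumption: both taggers are mapped to the same coarse tags.
CONTENT_TAGS = {"NOUN", "PROPN", "VERB", "ADJ", "NUM"}


def content_profile(tagged):
    """Count content-word tags in a sentence given as [(token, tag), ...]."""
    return Counter(tag for _, tag in tagged if tag in CONTENT_TAGS)


def pos_similarity(ja_tagged, es_tagged):
    """Share of content tags the two sentences have in common, in [0, 1]."""
    ja = content_profile(ja_tagged)
    es = content_profile(es_tagged)
    total = max(sum(ja.values()), sum(es.values()))
    if total == 0:
        return 0.0
    shared = sum((ja & es).values())  # multiset intersection of tag counts
    return shared / total


def align(ja_sents, es_sents, threshold=0.75):
    """Pair each Japanese sentence with its best-scoring Spanish candidate.

    A pair is kept only if its score reaches the (hypothetical) threshold.
    Ties are broken in favour of the earlier Spanish sentence.
    """
    pairs = []
    for i, ja in enumerate(ja_sents):
        scored = [(pos_similarity(ja, es), j) for j, es in enumerate(es_sents)]
        if not scored:
            continue
        best_score, best_j = max(scored, key=lambda s: (s[0], -s[1]))
        if best_score >= threshold:
            pairs.append((i, best_j, best_score))
    return pairs


# Toy, hand-tagged input (illustrative only).
ja_sents = [
    [("東京", "PROPN"), ("は", "ADP"), ("日本", "PROPN"),
     ("の", "ADP"), ("首都", "NOUN"), ("です", "AUX")],
]
es_sents = [
    [("Tokio", "PROPN"), ("es", "AUX"), ("la", "DET"),
     ("capital", "NOUN"), ("de", "ADP"), ("Japón", "PROPN")],
    [("Llueve", "VERB"), ("mucho", "ADV")],
]

if __name__ == "__main__":
    print(align(ja_sents, es_sents))
```

Comparing tag counts alone would over-match on real Wikipedia text; a practical system would combine a syntactic filter like this with lexical evidence such as a bilingual dictionary, which is outside what the abstract describes.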

Comments:	International Journal on Natural Language Computing (IJNLC) Vol.1, No.3, October 2012
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:1211.4488 [cs.CL]
	(or arXiv:1211.4488v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1211.4488

Submission history

From: Jessica Ramírez
[v1] Mon, 19 Nov 2012 16:38:32 UTC (135 KB)
