Publication date:
2013
Abstract:
This paper addresses the impact of multiword translation errors in machine translation (MT). We have analysed translations of multiwords in the OpenLogos rule-based system (RBMT) and in the Google Translate statistical system (SMT) for the English-French, English-Italian, and English-Portuguese language pairs.
Our study shows that, for distinct reasons, multiwords remain a problematic area for MT independently of the approach, and require adequate linguistic quality evaluation metrics founded on a systematic categorization of errors by MT expert linguists.
We propose an empirically driven taxonomy for multiwords, and highlight the need for the development of specific corpora for multiword evaluation. Finally, the paper presents the Logos approach to multiword processing, illustrating how semantico-syntactic rules contribute to multiword translation quality.
CRIS type:
4.1 Contribution in conference proceedings
Keywords:
Machine Translation; Multiword; MT Evaluation
Authors:
Monti, Johanna; Barreiro, Anabela; Orliac, Brigitte; Batista, Fernando
Book title:
Workshop Proceedings for: Multi-word Units in Machine Translation and Translation Technologies (Organised at the 14th Machine Translation Summit)