المستخلص: |
Identifying and translating Multi-Word Expressions (MWEs) in text is a challenge for numerous Natural Language Processing (NLP) applications, especially Machine Translation (MT). In this paper, we describe a hybrid approach that combines linguistic and statistical information to extract and align MWEs from a sentence-aligned English-Arabic parallel corpus. To assess the quality of the mined bilingual MWEs, we conduct a task-based evaluation using Statistical Machine Translation (SMT). We investigate the performance of three methods for integrating the extracted bilingual MWEs into Moses, a phrase-based SMT system. We show experimentally that these textual units improve translation quality in both In-Domain and Out-Of-Domain configurations.
|