English to Persian transliteration

Karimi, S, Turpin, A and Scholer, F 2006, 'English to Persian transliteration', in F. Crestani, P. Ferragina and M. Sanderson (eds.) Proceedings of the 13th International Conference on String Processing and Information Retrieval (SPIRE) 2006, Glasgow, UK, 29 September 2006.


Document type: Conference Paper
Collection: Conference Papers

Title English to Persian transliteration
Author(s) Karimi, S
Turpin, A
Scholer, F
Year 2006
Conference name International Conference on String Processing and Information Retrieval
Conference location Glasgow, UK
Conference dates 29 September 2006
Proceedings title Proceedings of the 13th International Conference on String Processing and Information Retrieval (SPIRE) 2006
Editor(s) F. Crestani
P. Ferragina
M. Sanderson
Publisher Springer
Place of publication USA
Abstract Persian is an Indo-European language written using Arabic script, and is an official language of Iran, Afghanistan, and Tajikistan. Transliteration of Persian to English (that is, the character-by-character mapping of a Persian word that is not readily available in a bilingual dictionary) is an unstudied problem. In this paper we make three novel contributions. First, we present performance comparisons of existing grapheme-based transliteration methods on English to Persian. Second, we discuss the difficulties in establishing a corpus for studying transliteration. Finally, we introduce a new model of Persian that takes into account the habit of shortening, or even omitting, runs of English vowels. This trait makes transliteration of Persian particularly difficult for phonetic-based methods. This new model outperforms the existing grapheme-based methods on Persian, exhibiting a 24% relative increase in transliteration accuracy measured using the top-5 criterion.
Subjects Information and Computing Sciences not elsewhere classified
Keyword(s) Persian
language
scripts
characters
comparisons
transliteration
DOI - identifier 10.1007/11880561_21
Copyright notice © Springer-Verlag Berlin Heidelberg 2006
Created: Wed, 08 Apr 2009, 09:42:32 EST by Catalyst Administrator