Finding variants of out-of-vocabulary words in Arabic

Nwesri, A, Tahaghoghi, S and Scholer, F 2007, 'Finding variants of out-of-vocabulary words in Arabic', in Proceedings of the 2007 Workshop on Computational Approaches to Semitic Languages: Common Issues and Resources, Prague, Czech Republic, 28-29 June 2007.


Document type: Conference Paper
Collection: Conference Papers

Title: Finding variants of out-of-vocabulary words in Arabic
Author(s): Nwesri, A; Tahaghoghi, S; Scholer, F
Year: 2007
Conference name: ACL 2007 Workshop on Computational Approaches to Semitic Languages: Common Issues and Resources
Conference location: Prague, Czech Republic
Conference dates: 28-29 June 2007
Proceedings title: Proceedings of the 2007 Workshop on Computational Approaches to Semitic Languages: Common Issues and Resources
Publisher: Association for Computational Linguistics
Place of publication: USA
Abstract: Transliteration of a word into another language often leads to multiple spellings. Unless an information retrieval system recognises different forms of transliterated words, a significant number of documents will be missed when users specify only one spelling variant. Using two different datasets, we evaluate several approaches to finding variants of foreign words in Arabic, and show that the longest common subsequence (LCS) technique is the best overall.
Subjects: Business Information Management (incl. Records, Knowledge and Information Management, and Intelligence)
Copyright notice: © 2007 Association for Computational Linguistics
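
Illustrative note: the abstract names the longest common subsequence (LCS) technique as the best overall approach for finding spelling variants. The following is a minimal sketch of how LCS-based matching can work in general, not the authors' implementation. The function names, the normalisation by the longer string's length, and the 0.8 threshold are all assumptions made for illustration; the record does not state which normalisation or threshold the paper used.

```python
def lcs_length(a: str, b: str) -> int:
    """Return the length of the longest common subsequence of a and b."""
    # prev[j] holds the LCS length of the processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a: str, b: str) -> float:
    """LCS length divided by the longer string's length (assumed normalisation)."""
    if not a and not b:
        return 1.0
    return lcs_length(a, b) / max(len(a), len(b))


def find_variants(query: str, vocabulary: list[str], threshold: float = 0.8) -> list[str]:
    """Return vocabulary words whose similarity to query meets the (hypothetical) threshold."""
    return [
        word
        for word in vocabulary
        if word != query and lcs_similarity(query, word) >= threshold
    ]


if __name__ == "__main__":
    # Toy Latin-script example; Python strings are Unicode, so Arabic-script
    # words can be compared character by character in the same way.
    print(find_variants("colour", ["color", "colour", "cooler"]))
```

In this toy run, "color" shares a five-character subsequence with the six-character "colour" and so clears the assumed threshold, while "cooler" shares only four characters in order and does not.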
Created: Fri, 09 Oct 2009, 08:09:01 EST by Catalyst Administrator