As part of the Wojood project, Prof. Adnan Yahya presented a paper titled “Tools For Arabic People Names Processing And Retrieval, A Statistical Approach” at ALTIC’11 (Arabic Language Technology International Conference), which was held at the Bibliotheca Alexandrina (B.A.), Alexandria, Egypt, on October 9-10, 2011.
Paper Abstract: Arabic web content has been growing rapidly, generating a need for tools to overcome the many challenges of processing and retrieving Arabic content: challenges related to Arabic Language Processing, Search and Query Analysis. An important part of dealing with Arabic digital content is processing and analyzing Arabic people names. This paper reports on our work aimed at designing name pre-processing tools that can efficiently identify and process Arabic people names in queries and documents. We try to address challenges such as Name Gender Detection, Translation (Arabic to English), Correction, Auto Suggestion and Extraction from text. Throughout, we employ a statistical approach based on data obtained from high school student name lists in Palestine and Birzeit University student name lists. Based on this information, we constructed different types of databases of Arabic names and used them as the infrastructure for well-structured name tools that can be integrated into existing web search engines and document processing systems. We have been experimenting with some of the developed tools in our online application process at Birzeit University, with encouraging preliminary results.
Index Terms: Arabic Proper Names, Statistical Databases, Name Correction, Name Translation, Names Gender Detection, Proper Names, Extraction, Natural Language Processing.
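
To give a rough idea of what a statistical approach to Name Gender Detection could look like, here is a minimal Python sketch. It is not the method from the paper: the function names, the `min_share` threshold, and the toy records are all hypothetical, chosen only for illustration. The sketch assumes a list of (first name, gender) pairs such as one might derive from a student name list, counts how often each name appears with each gender, and returns a gender only when one label clearly dominates.

```python
from collections import Counter, defaultdict


def build_gender_counts(records):
    """Count how often each first name occurs with each gender label."""
    counts = defaultdict(Counter)
    for first_name, gender in records:
        counts[first_name][gender] += 1
    return counts


def guess_gender(name, counts, min_share=0.8):
    """Return the dominant gender for a name, or None if unknown or ambiguous.

    min_share is a hypothetical threshold: the dominant label must account
    for at least this fraction of the name's occurrences.
    """
    seen = counts.get(name)
    if not seen:
        return None  # name never observed in the list
    gender, hits = seen.most_common(1)[0]
    share = hits / sum(seen.values())
    return gender if share >= min_share else None


# Toy records, invented for illustration only (not data from the paper).
toy_records = [
    ("محمد", "M"),
    ("محمد", "M"),
    ("فاطمة", "F"),
    ("نور", "F"),
    ("نور", "M"),
]

counts = build_gender_counts(toy_records)
for name in ("محمد", "فاطمة", "نور", "سامي"):
    print(name, guess_gender(name, counts))
```

In this sketch, a name that appears with both genders in similar proportions, or that is absent from the list, yields `None` rather than a guess, so the calling system can fall back to another signal.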