Nakba-NLP 2025

The 1st International Workshop on Nakba Narratives as Language Resources

Part of the COLING-2025 Conference

Fully Virtual

January 20, 2025

Overview

The narratives of the (ongoing) Palestinian Nakba possess significant historical, cultural, literary, and academic value. Preserving this content and empowering it with AI tools is crucial for ensuring its accessibility and usability for present and future generations. Nakba narratives and testimonies exist in diverse formats such as manuscripts, books, audio recordings, novels, and films. Converting this content into a machine-understandable format presents a notable challenge. Establishing accessible archives and well-annotated collections is essential for researchers and historians to verify and share meaningful information.

This workshop aims to explore how artificial intelligence, natural language processing, and corpus linguistics can assist in understanding, disseminating, and preserving Nakba narratives and testimonies. The goal is to create accessible, comprehensive, and well-annotated collections that empower researchers and historians to validate and share critical insights derived from these data. The workshop targets datasets and narratives in Arabic, English, and other languages; however, submitted articles should be written in English.


Call for Papers

We invite submissions for Nakba-NLP 2025, a workshop dedicated to the exploration and preservation of Nakba narratives through the application of artificial intelligence, natural language processing, and corpus linguistics. All submitted papers should explain their relevance to the topic of ‘Nakba Narratives as Language Resources’. The organisers reserve the right to reject any papers that incite hatred, refute established facts, or undermine the suffering of individuals.

We seek contributions on the following issues of interest:

  • Digitisation of oral and written narratives
  • Creation and labeling of language corpora and datasets
  • Digital archives, metadata, and semantic/content mark-up
  • Annotation tools and annotation guidelines
  • Document classification, topic modeling, and information retrieval
  • Named entity recognition for identifying people, places, organizations, and events
  • Entity linking and relationship extraction
  • Event detection and event argument extraction
  • Knowledge Graphs and Linked Data
  • Vocabularies, dictionaries, and ontologies
  • Data visualisation
  • Knowledge representation
  • Machine translation, summarisation, and paraphrasing
  • Natural Language Generation
  • Large Language Models
  • Sentiment analysis and emotional content extraction
  • Discourse analysis (e.g., bias, offensive language, and misinformation) related to Nakba narratives
  • Voice & dialogue-based systems; ASR
  • Palestinian dialects (written and spoken)

Participants are invited to use the following archives: Institute for Palestine Studies, The Palestinian Museum, Nakba-Archive, POHA, Alhaq, ICHR, as well as Wikipedia and the Wikidata Knowledge Graph.


Submission Details

All submitted papers must clearly state and explain their relevance to the topic of ‘Nakba Narratives as Language Resources’. The organisers reserve the right to reject any papers that incite hatred, refute established facts, or undermine the suffering of individuals.

Submissions may be of two types:

  • Long papers – up to eight (8) pages, presenting substantial, original, completed, and unpublished work.
  • Short papers – up to four (4) pages, describing a small focused contribution, negative results, system demonstrations, etc.


The workshop supports the COLING anti-harassment policy: Policy.
COLING 2025 submission templates: Template.
Submission URL: Please submit here.

Important Dates

  • Submission Deadline: 28 November 2024 (extended from 25 November 2024)
  • Notifications of Acceptance: 5 December 2024
  • Camera Ready Deadline: 13 December 2024 (cannot be changed).

Welcome Message from the Organizers

Welcome to the Nakba Narratives as Language Resources workshop!

This workshop explores how natural language processing (NLP) tools can contribute to the documentation and understanding of significant historical events. Similar to other initiatives such as the Holocaust Testimonies as Language Resources workshop at LREC-COLING 2024, we aim to examine the intersection of language, technology, and social good, an essential and growing area of research.

We recognize that topics like the Nakba may be viewed as sensitive or politically charged. However, addressing such issues is not unprecedented in NLP. Research on socially significant topics, including gender bias, propaganda, and historical testimonies, often intersects with political and cultural discourse. Such efforts are widely recognized as essential for advancing NLP as a field. Similarly, this workshop seeks to foster rigorous and inclusive exploration of narratives surrounding the Nakba.

Our keynote speaker is a globally respected scholar specializing in the Nakba. While not an NLP researcher, his expertise offers invaluable context for understanding the linguistic and societal dynamics behind the narratives we aim to analyze and document. His talk will explore how words are used, shaped, and contested in political and cultural discourse on the Nakba, insights that are directly relevant to applying NLP to such narratives.

We are committed to fostering an inclusive and respectful environment where diverse perspectives are welcomed and all opinions can be expressed and discussed in an academic and constructive manner. Open dialogue is at the heart of our approach, and we value thoughtful contributions from every participant. This workshop adheres to the Diversity, Inclusion, and Anti-Harassment Policy of COLING 2025.

Thank you for your participation. We look forward to a thoughtful and impactful workshop.

Best regards,
The Workshop Organizers

Keynote Speaker

Ilan Pappé

University of Exeter, UK

The Words Laundrette: Unmasking Bias and Propaganda in the Discourses on the Ongoing Nakba

Understanding bias and propaganda in the discourse of Nakba narratives is of utmost importance, and this significance is further amplified by the advent of AI and technological advancements. In this talk, I will examine the Zionist language employed both for domestic and external consumption during the 1948 Nakba and the present genocide of Gaza. This language and its basic vocabulary shielded Israel for many years from internal criticism and international condemnation and granted it Western impunity.

This vocabulary represented a twin process of dehumanization and militarization of the Palestinian civil space. The reference to villages and towns as military bases at best, or as terrorist hotbeds at worst, was an important part of the indoctrination of the Israeli troops as well as a crucial aspect of the Israeli Hasbara, propaganda, outside.

This vocabulary should be seen as an eliminatory praxis by Israel and before that by the Zionist state. A verbal elimination indeed preceded the actual attempted one in 1948. The early attempt was to expunge the Palestinians as people of the land from history and memory before perpetrating a massive expulsion in 1948. The eliminatory vocabulary continued to be employed during the Nakba and ever since 1948. One of the most worrying aspects of the current phase of the ongoing Nakba is the sense that manipulation of language is not needed anymore and the eliminatory policies are presented as such in the most vile and direct manner. But it is noteworthy that the direct approach is done only in Hebrew, which Israeli policymakers still treat as a secret language only they understand.

Resisting this eliminatory praxis, common among settler colonial movements in the past, highlights the importance, quite often belittled, of scholarship, professional acumen, wordsmanship and articulation in any language that can reach a wider audience, who can be part of a solidarity movement or in a position to make a difference on the ground. Also, the slow process of introducing the counter vocabulary to the Israeli Jewish society, as done by Zochrot, should be acknowledged. None of these efforts is just about words or language, or even discourse. They have to be contextualized historically, legally and morally to appreciate how close they bring us to the adage: the word is [can be at least] mightier than the sword.

I hope through this talk to provide insights for AI practitioners on how to approach texts on this topic, as well as guidance on developing benchmarks and metrics to assess and track patterns of bias and propaganda over time and across diverse sources and languages.


Panel Discussion

Title: Digital Archives and Cultural Heritage in the LLMs Era.

Mediator:

  • Mo El-Haj – Panel Chair – Lancaster University, UK.

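Panelists:

  • Muhammad Abdul-Mageed – The University of British Columbia, Canada.
  • Antonio Moreno Sandoval – Autonomous University Madrid, Spain.
  • Dawn Knight – Cardiff University, UK.
  • Mustafa Jarrar – Birzeit University, Palestine.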

Organizers

Mustafa Jarrar

Birzeit University, Palestine

Nizar Habash

New York University Abu Dhabi, UAE

Mo El-Haj

Lancaster University, UK

Amal Haddad

University of Granada, Spain

Zeina Jallad

Harvard Law School, USA

Camille Mansour

Institute for Palestine Studies, Lebanon

Diana Allan

McGill University, Canada

Paul Rayson

Lancaster University, UK

Tymaa Hammouda

Birzeit University, Palestine

Program Committee

  • Abdelkader El Mahdaouy, Mohammed VI Polytechnic University
  • Abdellah El Mekki, School of Computer Sciences, Mohammed VI Polytechnic University
  • Abdelrahim Qaddoumi, NYU
  • Abdulrahman Abdulsalam, University of Utah
  • Abed Alhakim Freihat, University of Trento
  • Adnan Yahya, Birzeit University
  • Ala Alazzeh, Birzeit University
  • Ali AlKhathlan, King Abdulaziz University
  • Almoataz B. Al-Said, Cairo University
  • Amr Keleg, The University of Edinburgh
  • Areej Jaber, Technical University Khadouri
  • Ashraf Elnagar, University of Sharjah
  • Ayah Soufan, Strathclyde University
  • Azzeddine Mazroui, University Mohammed First, Faculty of Sciences
  • Badr AlKhamissi, EPFL
  • Baker Abdalhaq, Annajah National University
  • Basem Ezbidi, Birzeit University
  • Bassam Haddad, University of Petra
  • Bayan AbuShawar, Cybersecurity Department, Al Ain University
  • Dana Abdulrahim, University of Bahrain
  • Dima Taji, Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics
  • ElMoatez Billah Nagoudi, The University of British Columbia
  • Eyad Elyan, Robert Gordon University
  • Fadhl Eryani, University of Tübingen
  • Fadi Zaraket, American University of Beirut
  • Faisal Awartani, Insights for Research Polling and Training
  • Fatemah Husain, Kuwait University
  • Fatima Haouari, Qatar University
  • Fethi Bougares, LIUM- Le Mans Université
  • Fouzi Harrag, Ferhat Abbas University
  • Ghassan Mourad, Lebanese University
  • Go Inoue, Mohamed bin Zayed University of Artificial Intelligence
  • Habiba Dahmani, Mohamed Boudiaf University
  • Haithem Afli, ADAPT Centre, Munster Technological University
  • Hamada Nayel, Benha University
  • Ibrahim Abu Farha, University of Sheffield
  • Imed Zitouni, Google
  • Imene Bensalem, ESCF de Constantine & MISC Lab, Constantine 2 University, Algeria
  • Injy Hamed, Institute for Natural Language Processing, University of Stuttgart
  • Kamel Gaanoun, National Institute of Statistics and Applied Economics
  • Kamel Smaili, LORIA
  • Kaoukab Chebaro, Columbia University
  • Karim Bouzoubaa, Mohammed V University in Rabat
  • Khaled Shaalan, The British University in Dubai
  • Khalid Choukri, ELRA/ELDA
  • Khalil Mrini, Bytedance
  • Khloud Al Jallad, SySSR
  • Labib Arafeh, AlQuds University
  • Lama Nachman, Intel Labs
  • Lamia Hadrich-Belguith, ANLP Research Group, MIRACL Lab, FSEGS, Sfax University
  • Majdi Sawalha, The University of Jordan
  • Manar Alkhatib, British University in Dubai
  • Maram Hasanain, Qatar Computing Research Institute
  • Mo El-Haj, Lancaster University
  • Mohamed Lichouri, Centre de Recherche Scientifique et Technique pour le Développement de la Langue Arabe (CRSTDLA)
  • Mohamed Yahya, NLP Researcher
  • Mohammad Abuoudeh, Al Hussein Bin Talal University
  • Mohammed Attia, Google Inc.
  • Mohammed Khalilia, Birzeit University
  • Mohammed Salah Al-Radhi, Budapest University of Technology and Economics, Department of Telecommunications and Media Informatics
  • Mona Baker, University of Oslo
  • Moustafa Al-Hajj, Lebanese University
  • Muhammad Abdul-Mageed, The University of British Columbia
  • Munir Fakher Eldin, Birzeit University
  • Munther Dahleh, MIT
  • Nada Ghneim, Damascus University - Information Technology Engineering Faculty
  • Omar Shehabi, Yale Law School
  • Omar Tesdell, Birzeit University
  • Omar Trigui, University of Sousse in Tunisia
  • Owen Rambow, Stony Brook University
  • Radi Jarrar, Birzeit University
  • Radwan Tahboub, Palestine Polytechnic University
  • Raia Abu Ahmad, DFKI
  • Reem Suwaileh, Qatar University
  • Saad Ezzini, Lancaster University
  • Sabri Boughorbel, Qatar Computing Research Institute
  • Sahar Ghannay, CNRS, LISN
  • Salima Harrat, ENS Bouzaréah, Algiers
  • Salima Mdhaffar, LIA - University of Avignon
  • Samhaa R. El-Beltagy, Newgiza University/Optomatica
  • Sari Hanafi, American University of Beirut
  • Seif Mechti, ISSEPS
  • Serin Atiani, Princess Sumaya University for Technology
  • Shady Elbassuoni, American University of Beirut
  • Sultan Alrowili, University of Delaware
  • Susan Akram, Boston University
  • Tamer Elsayed, Qatar University
  • Thaher Gharabeh, University of Granada
  • Violetta Cavalli-Sforza, Al Akhawayn University
  • Wajdi Zaghouani, Northwestern University Qatar
  • Walid Magdy, The University of Edinburgh
  • Wassim El-Hajj, American University of Beirut
  • Watheq Mansour, The University of Queensland
  • Wissam Antoun, Inria
  • Yahya Mohamed Elhadj, Arab Center for Research and Policy Studies

Workshop Program

All times are in GMT
09:00 - 09:20 Opening Session: Welcome by Workshop Chairs

Chairs: Mustafa Jarrar, Camille Mansour

A note from the organizers
09:20 - 10:00 Session 1: Propaganda Detection

Chair:
Nizar Habash

  • Integrating Argumentation Features for Enhanced Propaganda Detection in Arabic Narratives on the Israeli War on Gaza. Sara Nabhani, Claudia Borg, Kurt Micallef, and Khalid Al-Khatib.
  • Multilingual Propaganda Detection: Exploring Transformer-Based Models mBERT, XLM-RoBERTa, and mT5. Mohamed Ibrahim Ragab, Ensaf Hussein Mohamed and Walaa Medhat.
10:00 - 10:40 Session 2: Bias Detection

Chair:
Mustafa Jarrar

  • The Missing Cause: An Analysis of Causal Attributions in Reporting on Palestine. Paulina Garcia Corral, Hannah Bechara, Krishnamoorthy Manohara and Slava Jankin.
  • Deciphering Implicatures: On NLP and Oral Testimonies. Zainab Sabra.
10:40 - 11:15 Break
11:15 - 12:30 Panel Discussion: Digital Archives and Cultural Heritage in the LLMs Era.
Panel Chair: Mo El-Haj
Panelists:
  • Muhammad Abdul-Mageed, The University of British Columbia, Canada.
  • Antonio Moreno Sandoval, Autonomous University Madrid, Spain.
  • Dawn Knight, Cardiff University, UK.
  • Mustafa Jarrar, Birzeit University, Palestine.
12:30 - 13:00 Lunch Break
13:00 - 13:45 Session 3: Classification of Narratives

Chair:
Tymaa Hammouda

  • Sentiment Analysis of Nakba Oral Histories: A Critical Study of Large Language Models. Huthaifa I. Ashqar.
  • Arabic Topic Classification Corpus of the Nakba Short Stories. Osama Hamed and Nadeem Zaidkilani.
  • Exploring Author Style in Nakba Short Stories: A Comparative Study of Transformer-Based Models. Osama Hamed and Nadeem Zaidkilani.
13:45 - 14:00 Break
14:00 - 15:00 Session 4: Tagging of Nakba Narratives

Chair:
Ghadir Awad

  • NakbaTR: A Turkish NER Dataset for Nakba Narratives. Esma Fatıma Bilgin Tasdemir and Şaziye Betül Özateş.
  • The Nakba Lexicon: Building a Comprehensive Dataset from Palestinian Literature. Izza AbuHaija, Salim Al Mandhari, Mo El-Haj, Jonas Sibony and Paul Rayson.
  • Cognitive Geographies of Catastrophe Narratives: Georeferenced Interview Transcriptions as Language Resource for Models of Forced Displacement. Annie K. Lamar, Rick Castle, Carissa Chappell, Emmanouela Schoinoplokaki, Allene M. Seet, Amit Shilo and Chloe Nahas.
15:00 - 16:00 Session 5: Nakba Narratives

Chair:
Osama Hamed

  • Collective Memory and Narrative Cohesion: A Computational Study of Palestinian Refugee Oral Histories in Lebanon. Ghadir A. Awad, Tamara N. Rayan, Lavinia Dunagan and David Gamba.
  • A cultural shift in Western perceptions of Palestine. Terry Regier and Muhammad Ali Khalidi.
  • Detecting Inconsistencies in Narrative Elements of Cross Lingual Nakba Texts. Nada Hamarsheh, Zahia Elabour, Aya Murra and Adnan Yahya.
16:00 - 17:00 Keynote: The Words Laundrette: Unmasking Bias and Propaganda in the Discourses on the Ongoing Nakba
By Ilan Pappé
17:00 - 17:15 Closing

Sponsors