Nakba-NLP 2025

The 1st International Workshop on Nakba Narratives as Language Resources

Part of the COLING-2025 Conference

Abu Dhabi, UAE (Fully Virtual)

January 20, 2025

Overview

The narratives of the (ongoing) Palestinian Nakba possess significant historical, cultural, literary, and academic value. Preserving this content and empowering it with AI tools is crucial for ensuring its accessibility and usability for present and future generations. Nakba narratives and testimonies exist in diverse formats such as manuscripts, books, audio recordings, novels, and films. Converting this content into a machine-understandable format presents a notable challenge. Establishing accessible archives and well-annotated collections is essential for researchers and historians to verify and share meaningful information.

This workshop aims to explore how artificial intelligence, natural language processing, and corpus linguistics can assist in understanding, disseminating, and preserving Nakba narratives and testimonies. The goal is to create accessible, comprehensive, and well-annotated collections that empower researchers and historians to validate and share critical insights derived from these data. The workshop targets datasets and narratives in Arabic, English, and other languages; however, submitted articles should be written in English.


Call for Papers

We invite submissions for Nakba-NLP 2025, a workshop dedicated to the exploration and preservation of Nakba narratives through the application of artificial intelligence, natural language processing, and corpus linguistics. All submitted papers should explain their relevance to the topic of ‘Nakba Narratives as Language Resources’. The organisers reserve the right to reject any papers that incite hatred, refute established facts, or undermine the suffering of individuals.

We seek contributions on the following issues of interest:

  • Digitisation of oral and written narratives
  • Creation and labeling of language corpora and datasets
  • Digital archives, metadata, and semantic/content mark-up
  • Annotation tools and annotation guidelines
  • Document classification, topic modeling, and information retrieval
  • Named entity recognition for identifying people, places, organizations, and events
  • Entity linking and relationship extraction
  • Event detection and event argument extraction
  • Knowledge Graphs and Linked Data
  • Vocabularies, dictionaries, and ontologies
  • Data visualisation
  • Knowledge representation
  • Machine translation, summarisation, and paraphrasing
  • Natural Language Generation
  • Large Language Models
  • Sentiment analysis and emotional content extraction
  • Discourse analysis (e.g., bias, offensive language, and misinformation) related to Nakba narratives
  • Voice & dialogue-based systems; ASR
  • Palestinian dialects (written and spoken)

Participants are invited to use the following archives: Institute for Palestine Studies, The Palestinian Museum, Nakba-Archive, POHA, Alhaq, and ICHR, as well as Wikipedia and the Wikidata Knowledge Graph.


Submission Details

All submitted papers must clearly state and explain their relevance to the topic of ‘Nakba Narratives as Language Resources’. The organisers reserve the right to reject any papers that incite hatred, refute established facts, or undermine the suffering of individuals.

Submissions may be of two types:

  • Long papers – up to eight (8) pages maximum, presenting substantial, original, completed, and unpublished work.
  • Short papers – up to four (4) pages, describing a small focused contribution, negative results, system demonstrations, etc.


The workshop supports the COLING anti-harassment policy: Policy.
COLING 2025 submission templates: Template.
Submission URL: Please submit here.

IMPORTANT DATES

  • Submission Deadline: 28 November 2024 (extended from 25 November 2024)
  • Notifications of Acceptance: 5 December 2024
  • Camera Ready Deadline: 13 December 2024 (cannot be changed).

SPEAKERS

TBD


Panel Discussion

Panelists, here is the esteemed list of panelists joining the discussion:

Title: “Digital Archives and Cultural Heritage in the LLMs Era”. This session will focus on how corpus and NLP techniques can support the preservation and accessibility of cultural heritage materials, such as testimonies, books, narratives, blogs, and media. The discussion will address the following questions:

  1. What best practices exist for working with communities to develop corpora from diverse narratives?
  2. What challenges arise in creating and curating corpora from underrepresented or low-resource languages, and how can cultural authenticity be maintained?
  3. How can large language models, multimodal models, and corpus-building techniques support the digitisation of cultural heritage?
  4. How are Large Language Models being used to document cultural heritage, and how can existing archives be utilised for training and fine-tuning them?
  5. What innovative approaches can ensure cultural heritage materials remain accessible, authentic, and preserved for future generations?

Organizers

Mustafa Jarrar

Birzeit University, Palestine

Nizar Habash

New York University Abu Dhabi, UAE

Mo El-Haj

Lancaster University, UK

Zeina Jallad

Harvard Law School, USA

Camille Mansour

Institute for Palestine Studies, Lebanon

Diana Allan

McGill University, Canada

Paul Rayson

Lancaster University, UK

Sanad Malaysha

Birzeit University, Palestine
Publicity Chair

Amal Haddad

University of Granada, Spain
Publicity Chair

Programme Committee

  • Abdelkader El Mahdaouy, Mohammed VI Polytechnic University
  • Abdellah El Mekki, School of Computer Sciences, Mohammed VI Polytechnic University
  • Abdelrahim Qaddoumi, NYU
  • Abdulrahman Abdulsalam, University of Utah
  • Abed Alhakim Freihat, University of Trento
  • Adnan Yahya, Birzeit University
  • Ala Alazzeh, Birzeit University
  • Ali AlKhathlan, King Abdulaziz University
  • Almoataz B. Al-Said, Cairo University
  • Amr Keleg, The University of Edinburgh
  • Areej Jaber, Technical University Khadouri
  • Ashraf Elnagar, University of Sharjah
  • Ayah Soufan, Strathclyde University
  • Azzeddine Mazroui, University Mohammed First, Faculty of Sciences
  • Badr AlKhamissi, EPFL
  • Baker Abdalhaq, Annajah National University
  • Basem Ezbidi, Birzeit University
  • Bassam Haddad, University of Petra
  • Bayan AbuShawar, Cyber Security Department, Al Ain University
  • Dana Abdulrahim, University of Bahrain
  • Dima Taji, Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics
  • ElMoatez Billah Nagoudi, The University of British Columbia
  • Eyad Elyan, Robert Gordon University
  • Fadhl Eryani, University of Tübingen
  • Fadi Zaraket, American University of Beirut
  • Faisal Awartani, Insights for Research Polling and Training
  • Fatemah Husain, Kuwait University
  • Fatima Haouari, Qatar University
  • Fethi Bougares, LIUM- Le Mans Université
  • Fouzi Harrag, Ferhat Abbas University
  • Ghassan Mourad, Lebanese University
  • Go Inoue, Mohamed bin Zayed University of Artificial Intelligence
  • Habiba Dahmani, Mohamed Boudiaf University
  • Haithem Afli, ADAPT Centre, Munster Technological University
  • Hamada Nayel, Benha University
  • Ibrahim Abu Farha, University of Sheffield
  • Imed Zitouni, Google
  • Imene Bensalem, ESCF de Constantine & MISC Lab, Constantine 2 University, Algeria
  • Injy Hamed, Institute for Natural Language Processing, University of Stuttgart
  • Kamel Gaanoun, National Institute of Statistics and Applied Economics
  • Kamel Smaili, LORIA
  • Kaoukab Chebaro, Columbia University
  • Karim Bouzoubaa, Mohammed V University in Rabat
  • Khaled Shaalan, The British University in Dubai
  • Khalid Choukri, ELRA/ELDA
  • Khalil Mrini, Bytedance
  • Khloud Al Jallad, SySSR
  • Labib Arafeh, AlQuds University
  • Lama Nachman, Intel Labs
  • Lamia Hadrich-Belguith, ANLP Research Group, MIRACL Lab, FSEGS, Sfax University
  • Majdi Sawalha, The University of Jordan
  • Manar Alkhatib, British University in Dubai
  • Maram Hasanain, Qatar Computing Research Institute
  • Mo El-Haj, Lancaster University
  • Mohamed Lichouri, Centre de Recherche Scientifique et Technique pour le Développement de la Langue Arabe (CRSTDLA)
  • Mohamed Yahya, NLP Researcher
  • Mohammad Abuoudeh, Al Hussein Bin Talal University
  • Mohammed Attia, Google Inc.
  • Mohammed Khalilia, Birzeit University
  • Mohammed Salah Al-Radhi, Budapest University of Technology and Economics, Department of Telecommunications and Media Informatics
  • Mona Baker, University of Oslo
  • Moustafa Al-Hajj, Lebanese University
  • Muhammad Abdul-Mageed, The University of British Columbia
  • Munir Fakher Eldin, Birzeit University
  • Munther Dahleh, MIT
  • Nada Ghneim, Damascus University - Information Technology Engineering Faculty
  • Omar Shehabi, Yale Law School
  • Omar Tesdell, Birzeit University
  • Omar Trigui, University of Sousse in Tunisia
  • Owen Rambow, Stony Brook University
  • Radi Jarrar, Birzeit University
  • Radwan Tahboub, Palestine Polytechnic University
  • Raia Abu Ahmad, DFKI
  • Reem Suwaileh, Qatar University
  • Saad Ezzini, Lancaster University
  • Sabri Boughorbel, Qatar Computing Research Institute
  • Sahar Ghannay, CNRS, LISN
  • Salima Harrat, ENS Bouzaréah, Algiers
  • Salima Mdhaffar, LIA - University of Avignon
  • Samhaa R. El-Beltagy, Newgiza University/Optomatica
  • Sari Hanafi, American University of Beirut
  • Seif Mechti, ISSEPS
  • Serin Atiani, Princess Sumaya University for Technology
  • Shady Elbassuoni, American University of Beirut
  • Sultan Alrowili, University of Delaware
  • Susan Akram, Boston University
  • Tamer Elsayed, Qatar University
  • Thaher Gharabeh, University of Granada
  • Violetta Cavalli-Sforza, Al Akhawayn University
  • Wajdi Zaghouani, Northwestern University Qatar
  • Walid Magdy, The University of Edinburgh
  • Wassim El-Hajj, American University of Beirut
  • Watheq Mansour, The University of Queensland
  • Wissam Antoun, Inria
  • Yahya Mohamed Elhadj, Arab Center for Research and Policy Studies

SPONSORS