Relation Extraction

Two corpora: (1) Relation Extraction corpus supporting 40 relation types (WojoodRelations), see article, and (2) Event-Argument Relation Extraction corpus supporting 3 relation types (WojoodHadath), see article.

Performance: WojoodRelations (88.61%), WojoodHadath (93.99%), WojoodOutOfDomain (74.90%).

  • Models: NLIRE and NLIEAE BERT-based models.

    Method: We formulated both the Relation Extraction and Event-Argument Extraction tasks as Natural Language Inference (NLI) problems. For each task, a BERT-based model was fine-tuned on a large set of NLI-style sentence pairs automatically derived from the corresponding dataset (WojoodRelations or WojoodHadath).
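
    The NLI formulation above can be illustrated with a minimal sketch. It assumes each candidate relation is verbalized as a hypothesis sentence and paired with the original sentence as the premise. The English templates, the example sentence, the label strings, and the names NLIPair and build_nli_pairs are all hypothetical; the corpora are Arabic, and the templates actually used by the NLIRE and NLIEAE models are not given here.

```python
from dataclasses import dataclass

# Hypothetical English templates, one per relation label. The real corpora
# are Arabic and their actual template wording is not given in this page.
TEMPLATES = {
    "employee_of": "{subj} is an employee of {obj}.",
    "lives_in": "{subj} lives in {obj}.",
    "capital_of": "{subj} is the capital of {obj}.",
}


@dataclass(frozen=True)
class NLIPair:
    premise: str     # the original sentence
    hypothesis: str  # a verbalized candidate relation
    label: str       # "entailment" or "not_entailment" (assumed label names)


def build_nli_pairs(sentence, subj, obj, gold_relation):
    """Turn one annotated entity pair into one NLI pair per candidate relation."""
    pairs = []
    for relation, template in TEMPLATES.items():
        hypothesis = template.format(subj=subj, obj=obj)
        label = "entailment" if relation == gold_relation else "not_entailment"
        pairs.append(NLIPair(sentence, hypothesis, label))
    return pairs


pairs = build_nli_pairs(
    sentence="Sara works at Birzeit University.",
    subj="Sara",
    obj="Birzeit University",
    gold_relation="employee_of",
)
for pair in pairs:
    print(pair.label, "|", pair.hypothesis)
```

    In this sketch the gold relation yields the single entailment pair and every other template yields a negative pair, which is one plausible way to derive many training pairs from each annotated example.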

    WojoodRelations Corpus: 550K tokens, 40 relations (Relation Extraction).

    WojoodHadath Corpus: 550K tokens, 3 relations (Event-Argument Extraction).

    WojoodOutOfDomain Corpus: 80K tokens, 3 relations (Event-Argument Extraction).

    Relations (WojoodRelations - 40 Types):

    leader_of manager_of president_of employee_of
    member_of owner_of president_of student_at
    has_compititor has_conflict_with has_partner_with has_parent
    has_relative has_sibling has_spouse capital_of
    has_currency has_population official_language has_border_with
    headquartered_in lives_in located_in nearby
    branch_count employs found_on has_alternate_name
    has_property subsidary geopolitical_division birth_date
    birth_place death_date has_occupation builder_of
    founder_of manufacturer_of

    Relations (WojoodHadath - 3 Types):

    hasAgent: participant(s) involved in the event.
    hasLocation: where the event occurred.
    hasDate: when the event occurred.

    Please email Prof. Jarrar (mjarrar AT birzeit.edu) for the annotation guidelines.
  • SinaTools: Relation Extraction module as a Python library.

    Hugging Face: NLI-based fine-tuned BERT module.

    WojoodRelations (Corpus only)

    WojoodHadath (Corpus only)

    WojoodOutOfDomain (Corpus only)

  • Alaa Aljabari, Mohammed Khalilia, Mustafa Jarrar: WojoodRelations: Arabic Relation Extraction Corpus and Modeling. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP 2025), China. Association for Computational Linguistics.


    Alaa Aljabari, Lina Duaibes, Mustafa Jarrar, Mohammed Khalilia: Event-Arguments Extraction Corpus and Modeling using BERT for Arabic. In Proceedings of the Second Arabic Natural Language Processing Conference (ArabicNLP 2024), Bangkok, Thailand. Association for Computational Linguistics.