Frequently Asked Questions
Participant support for KnowledgeGraphEval 2026, including task scope, subtasks, datasets, evaluation, submission formats, timeline, and contact details.
General
KnowledgeGraphEval 2026 is an ArabicNLP 2026 shared task on Arabic knowledge graph construction and domain adaptation. It focuses on extracting entities and semantic relations from Arabic text to build structured, machine-readable knowledge representations.
The goal is to evaluate Arabic information extraction systems under realistic settings. The task measures how well systems recognize entities, classify semantic relations, and produce subject-relation-object triples from Arabic text.
Arabic knowledge graph construction remains underexplored. This task supports research on Arabic NER, relation extraction, domain adaptation, LLM-based extraction, RAG systems, semantic search, and knowledge-intensive NLP applications.
Participation
Teams may participate in Subtask 1, Subtask 2, Subtask 3, or any combination of them.
Team names must be a single word and should be meaningful and appropriate. Do not use spaces, special symbols, or offensive terms.
The shared task is open to academic, industry, and independent research teams interested in Arabic NLP, information extraction, domain adaptation, and knowledge graph construction.
Participants may use external resources when allowed by the official rules of each subtask. The use of external data, pretrained models, LLMs, and retrieval resources must be reported in the system description paper.
Data and Resources
Subtask 1 uses Konooz, a large-scale Arabic NER benchmark for cross-domain and cross-dialect evaluation. It covers 10 domains, 16 Arabic dialects, and Modern Standard Arabic.
The domains are Sports, Economics, Health, Agriculture, Art, Finance, History, Law, Politics, and Science.
Subtask 2 uses WojoodRelations, an Arabic benchmark for semantic relation extraction. It extends the Wojood NER corpus with annotated relations between entity pairs.
Subtask 3 also uses WojoodRelations, but participants receive raw Arabic sentences only. Systems must identify entities and extract relations jointly.
Data formats by subtask:
- Subtask 1: CoNLL sequence-labeling format.
- Subtask 2: Structured JSON format with predefined entity mentions.
- Subtask 3: Raw sentences with required JSON output for extracted triples.
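As a rough illustration of the CoNLL sequence-labeling format used in Subtask 1, the sketch below reads such data into sentences. It assumes one token per line with the tag in the last whitespace-separated column and a blank line between sentences; the exact column layout of the released files is not stated here, so treat `read_conll` and the sample text as hypothetical.

```python
from typing import List, Tuple

Sentence = Tuple[List[str], List[str]]  # (tokens, tags)


def read_conll(text: str) -> List[Sentence]:
    """Parse CoNLL-style text into (tokens, tags) pairs.

    Assumption: the token is the first column, the tag is the last column,
    and sentences are separated by blank lines.
    """
    sentences: List[Sentence] = []
    tokens: List[str] = []
    tags: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current sentence, if any.
            if tokens:
                sentences.append((tokens, tags))
                tokens, tags = [], []
            continue
        parts = line.split()
        tokens.append(parts[0])
        tags.append(parts[-1])
    if tokens:
        # Flush the last sentence when the text has no trailing blank line.
        sentences.append((tokens, tags))
    return sentences
```

Calling `read_conll` on the full contents of a file yields one `(tokens, tags)` pair per sentence, which can be passed directly to a tagger or scorer.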
Evaluation
Subtask 1 is evaluated using entity-level micro F1-score. A prediction is correct only when both the entity span and entity type exactly match the gold annotation.
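The exact-match rule can be sketched as follows. This is not the official scorer: it assumes flat (non-nested) BIO tags, ignores stray `I-` tags that do not continue an entity of the same type, and uses illustrative tag names such as `PER` and `LOC`. The function names are hypothetical.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, int, str]  # (sentence index, start, end exclusive, type)


def bio_to_spans(tags: List[str], sent_id: int = 0) -> Set[Span]:
    """Convert one sentence of flat BIO tags into a set of typed spans."""
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((sent_id, start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def entity_micro_f1(gold: List[List[str]],
                    pred: List[List[str]]) -> Tuple[float, float, float]:
    """Entity-level micro precision, recall, and F1 over all sentences.

    A predicted entity counts only if its span and type both match gold.
    """
    gold_spans: Set[Span] = set()
    pred_spans: Set[Span] = set()
    for i, (g, p) in enumerate(zip(gold, pred)):
        gold_spans |= bio_to_spans(g, i)
        pred_spans |= bio_to_spans(p, i)
    tp = len(gold_spans & pred_spans)
    precision = tp / len(pred_spans) if pred_spans else 0.0
    recall = tp / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Under this rule, a prediction with the right span but the wrong type counts as both a false positive and a false negative, so near misses earn no partial credit.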
Subtask 2 is evaluated using micro F1-score over predicted relation labels. Systems must classify each candidate entity pair as one relation type or no-relation.
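A minimal sketch of micro F1 over relation labels is shown below. Whether the official scorer counts no-relation as a regular class is not stated here, so the sketch exposes this as a hypothetical `ignore_label` parameter. The label strings in the usage note are also illustrative and not taken from WojoodRelations.

```python
from typing import List, Optional, Tuple


def relation_micro_f1(gold: List[str],
                      pred: List[str],
                      ignore_label: Optional[str] = None
                      ) -> Tuple[float, float, float]:
    """Micro precision, recall, and F1 over one label per candidate pair.

    If ignore_label is set (for example "no-relation"), that label is
    treated as the negative class and earns no true positives.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if g == p:
            if g != ignore_label:
                tp += 1
        else:
            if p != ignore_label:
                fp += 1  # predicted a relation that is not in gold
            if g != ignore_label:
                fn += 1  # missed the gold relation
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

With `ignore_label=None`, every label counts as a class and micro F1 reduces to plain accuracy. Passing `ignore_label="no-relation"` instead rewards only correctly predicted relations, which is a common convention in relation extraction.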
Subtask 3 is evaluated using precision, recall, and F1-score over extracted relation triples. A prediction is correct only if the subject, relation, and object exactly match the gold annotation.
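Triple-level scoring with exact matching can be sketched as a set comparison. The sketch assumes each triple is keyed by a sentence ID and compared as raw strings with no normalization; the tuple layout and the function name are hypothetical, and the official script may differ.

```python
from typing import Iterable, Tuple

# (sentence_id, subject, relation, object); this layout is an assumption.
Triple = Tuple[str, str, str, str]


def triple_prf(gold: Iterable[Triple],
               pred: Iterable[Triple]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 over exactly matching triples."""
    gold_set, pred_set = set(gold), set(pred)
    tp = len(gold_set & pred_set)  # all four fields must be identical
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Because matching is exact, a triple with subject and object swapped, or with an entity string that differs by a single character, does not count as correct.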
Submission
Submissions will be handled through Codabench. The Codabench links will be added to the shared task website when they are available.
Submission formats by subtask:
- Subtask 1: BIO sequence-labeling output.
- Subtask 2: Sentence ID and predicted relation label.
- Subtask 3: JSON file with extracted subject-relation-object triples.
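For Subtask 3, a submission file could be assembled roughly as follows. The field names (`sentence_id`, `triples`, `subject`, `relation`, `object`) and the relation label in the usage note are placeholders, since the official schema is defined by the submission rules and not here. The one detail worth keeping regardless of schema is `ensure_ascii=False`, which writes Arabic text as readable characters instead of `\u` escapes.

```python
import json
from typing import Dict, List, Tuple


def build_submission(predictions: Dict[str, List[Tuple[str, str, str]]]) -> str:
    """Serialize per-sentence (subject, relation, object) triples to JSON.

    The record layout is hypothetical; adapt it to the official format.
    """
    records = [
        {
            "sentence_id": sent_id,
            "triples": [
                {"subject": s, "relation": r, "object": o}
                for s, r, o in triples
            ],
        }
        for sent_id, triples in predictions.items()
    ]
    # ensure_ascii=False keeps Arabic characters unescaped in the output.
    return json.dumps(records, ensure_ascii=False, indent=2)
```

For example, `build_submission({"s1": [("القدس", "located_in", "فلسطين")]})` returns a JSON string that should be written to disk with `encoding="utf-8"`.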
Baseline code, evaluation scripts, and submission rules will be released with the training and development data.
Timeline
Training and development data, baseline code, and evaluation scripts are scheduled for release on June 5, 2026. Blind test data will be released on July 20, 2026.
The current website timeline lists July 20, 2026 as the registration deadline and blind test data release date.
Final official results and rankings are scheduled for July 30, 2026.
Contact
For questions about the shared task, datasets, submissions, or participation, contact the organizers by email.
- Alaa Aljabari: aaljabari@birzeit.edu
- Nagham Hamad: nhamad@birzeit.edu
- Shared task email: KnowledgeGraphEval@gmail.com
Updates about registration, Codabench links, data release, evaluation scripts, and submission rules will be announced through the official shared task website.