Resources and Downloads
Download and access NLP data, corpora, tools and services
Retrieves lexical concepts from all lexicons that have the SearchTerm in their synsets. An authenticated user (application or end user) can search the dictionaries for a term they provide. They can set the results page size and set the search filter to search definitions, translations, synonyms, or a combination of them.
Actors | Authenticated user. |
URL schema | https://{domain}/api/term/{term}/?type={filter-no}&page={page-no}&limit={pageSize}&apikey={key} |
Pre-conditions | The user has registered and provided their API Key. |
API Parameters |
|
Flow of events |
|
Retrieved Data | results JSON object (list of lexical concepts). |
https://ontology.birzeit.edu/sina/api/term/virus/?type=3&page=1&limit=10&apikey=sampleKey
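As a minimal sketch, the URL schema above can be filled in with Python's standard library. The base URL is copied from the sample request; the helper name `term_search_url` and its argument names are illustrative and not part of the service.

```python
from urllib.parse import quote, urlencode

# Base URL copied from the sample request above.
BASE_URL = "https://ontology.birzeit.edu/sina"


def term_search_url(term, filter_no, page, limit, apikey, base_url=BASE_URL):
    """Fill in the term-search URL schema (hypothetical helper)."""
    query = urlencode(
        {"type": filter_no, "page": page, "limit": limit, "apikey": apikey}
    )
    # quote() percent-encodes non-ASCII terms such as Arabic words.
    return f"{base_url}/api/term/{quote(term, safe='')}/?{query}"


print(term_search_url("virus", 3, 1, 10, "sampleKey"))
```

With these arguments the helper reproduces the sample request shown above.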
Retrieves a specific lexical concept from a lexicon, given its ID.
Actors | Authenticated user. |
URL schema | https://{domain}/api/lexicalconcept/{id}?apikey={key} |
Pre-conditions | The user has registered and provided their API Key. |
API Parameters |
|
Flow of events |
|
Retrieved Data | results JSON object (one lexical concept). |
https://ontology.birzeit.edu/sina/api/lexicalconcept/1520039900?apikey=sampleKey
Retrieves all concepts from the Arabic Ontology that have the SearchTerm in their synsets.
Actors | Authenticated user. |
URL schema | https://{domain}/api/OntologyTermSearch/{term}?page={page-no}&limit={pageSize}&apikey={key} |
Pre-conditions | The user has registered and provided their API Key. |
API Parameters |
|
Flow of events |
|
Retrieved Data | results JSON object (list of ontology concepts). |
https://ontology.birzeit.edu/sina/api/OntologyTermSearch/virus/?page=1&limit=5&apikey=sampleKey
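Because this search is paged, a client usually walks the pages in order. The sketch below shows one way to do that. It assumes that each page decodes to a list of concepts and that a page past the end comes back empty; neither is confirmed by the text above. The names `ontology_search_url`, `iter_concepts`, and the caller-supplied `fetch_page` are hypothetical.

```python
from urllib.parse import quote, urlencode

# Base URL copied from the sample request above.
BASE_URL = "https://ontology.birzeit.edu/sina"


def ontology_search_url(term, page, limit, apikey, base_url=BASE_URL):
    """Build one page URL, following the form of the sample request."""
    query = urlencode({"page": page, "limit": limit, "apikey": apikey})
    return f"{base_url}/api/OntologyTermSearch/{quote(term, safe='')}/?{query}"


def iter_concepts(fetch_page, term, limit, apikey, max_pages=100):
    """Yield concepts page by page until a page comes back empty.

    fetch_page(url) is supplied by the caller and is ASSUMED to return
    the list of concepts for that page (an empty list when exhausted).
    """
    for page in range(1, max_pages + 1):
        items = fetch_page(ontology_search_url(term, page, limit, apikey))
        if not items:
            break
        yield from items
```

Passing the fetcher in as an argument keeps the paging logic testable without network access; `max_pages` is a safety cap in case the service never returns an empty page.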
Retrieves basic information about a given concept from the Arabic Ontology.
Actors | Authenticated user. |
URL schema | https://{domain}/api/OntologyConcept/{conceptID}?apikey={key} |
Pre-conditions | The user has registered and provided their API Key. |
API Parameters |
|
Flow of events |
|
Retrieved Data | results JSON object (One concept from the Arabic Ontology). |
https://ontology.birzeit.edu/sina/api/OntologyConcept/293572?apikey=sampleKey
Retrieves the subtypes of an ontology concept, given its ID.
Actors | Authenticated user. |
URL schema | https://{domain}/api/OntologyConceptSubtypes/{superId}?apikey={key} |
Pre-conditions | The user has registered and provided their API Key. |
API Parameters |
|
Flow of events |
|
Retrieved Data | results JSON object (list of ontology concepts). |
https://ontology.birzeit.edu/sina/api/OntologyConceptSubtypes/293572?apikey=sampleKey
Retrieves all Arabic Ontology concepts that are part of a given ontology concept.
Actors | Authenticated user. |
URL schema | https://{domain}/api/ConceptParts/{partOfID}?apikey={key} |
Pre-conditions | The user has registered and provided their API Key. |
API Parameters |
|
Flow of events |
|
Retrieved Data | results JSON object (list of ontology concepts). |
https://ontology.birzeit.edu/sina/api/ConceptParts/293121?apikey=sampleKey
Retrieves all Arabic Ontology concepts that are instances of a given ontology concept.
Actors | Authenticated user. |
URL schema | https://{domain}/api/ConceptInstances/{instanceOfID}?apikey={key} |
Pre-conditions | The user has registered and provided their API Key. |
API Parameters |
|
Flow of events |
|
Retrieved Data | results JSON object (list of ontology concepts). |
https://ontology.birzeit.edu/sina/api/ConceptInstances/293121?apikey=sampleKey
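The four concept endpoints above (OntologyConcept, OntologyConceptSubtypes, ConceptParts, ConceptInstances) share the same URL shape, so one helper can cover all of them. This is only a sketch: the endpoint names come from the URL schemas above, while the dictionary keys and the function name `concept_url` are illustrative labels.

```python
from urllib.parse import quote, urlencode

# Base URL copied from the sample requests above.
BASE_URL = "https://ontology.birzeit.edu/sina"

# Endpoint names copied from the URL schemas; the keys are illustrative.
CONCEPT_ENDPOINTS = {
    "concept": "OntologyConcept",
    "subtypes": "OntologyConceptSubtypes",
    "parts": "ConceptParts",
    "instances": "ConceptInstances",
}


def concept_url(relation, concept_id, apikey, base_url=BASE_URL):
    """Build the URL for one concept-related endpoint (hypothetical helper)."""
    endpoint = CONCEPT_ENDPOINTS[relation]  # KeyError for an unknown relation
    concept = quote(str(concept_id), safe="")
    return f"{base_url}/api/{endpoint}/{concept}?{urlencode({'apikey': apikey})}"


for relation in ("concept", "subtypes"):
    print(concept_url(relation, 293572, "sampleKey"))
```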
A Palestinian morphologically annotated corpus (Curras) with 56K tokens and a newly annotated Lebanese corpus (Baladi) with 10K tokens. Each token is annotated with 16 different features. Five other dialect corpora will be added soon.
The four corpora consist of about 1.2 million tokens that we collected from different social media platforms. The Yemeni corpus (~1.05M tokens) was collected automatically from Twitter, while the other three dialect corpora (~50K tokens each) were collected manually from Facebook and YouTube. Each word in the four corpora was annotated with different morphological features.
The dataset consists of 500 synsets from the 10K synsets in Arabic WordNet. For each synset, Arabic candidate synonyms are extracted. There are 3K candidate synonyms in total, each with a fuzziness value.
Actors | Authenticated user. |
URL schema | https://{domain}/sina/v2/api/SynonymGenerator/?apikey={key} |
Pre-conditions | The user has registered and provided their API Token. |
API Parameters |
|
Flow of events |
|
Retrieved Data | Returns the candidate synonyms with their fuzzy values. |
A relatively large dataset of context-gloss pairs, labeled with True/False, was developed for fine-tuning BERT for WSD. Read the article to learn more about this dataset.
Actors | Authenticated user. |
URL schema | https://{domain}/v2/api/SALMA/{text}?apikey={key} |
Pre-conditions | The user has registered and provided their API Key. The text must be in the HTTP request body. |
API Parameters |
|
Flow of events |
|
Retrieved Data | results JSON object. |
Extract named entities from a given Arabic text. 22 types of entities are supported, which can be single or overlapping entities. Different output formats are supported.
Actors | Authenticated user. |
URL schema | https://{domain}/sina/v2/api/wojood/?apikey={key} |
Pre-conditions | The user has registered and provided their API Key. |
API Parameters |
|
Flow of events |
|
Retrieved Data | Returns the results in the specified format. |
Lemmatizes every token in a given sentence. The lemma and POS of every token are retrieved.
Actors | Authenticated user. |
URL schema | https://{domain}/v2/api/ALMADB/{text}?apikey={key} |
Pre-conditions | The user has registered and provided their API Key. The text must be in the HTTP request body. |
API Parameters |
|
Flow of events |
|
Retrieved Data | results JSON object. |
Retrieves the lemmas that match the SearchTerm, together with their linguistic features, from our lemma index. An authenticated user (application or end user) can search the lemma index. No filters can be applied in this service.
Actors | Authenticated user. |
URL schema | https://{domain}/api/LemmaSearch/{term}?apikey={key} |
Pre-conditions | The user has registered and provided their API Key. |
API Parameters |
|
Flow of events |
|
Retrieved Data | results JSON object (a list of morphological results). |
https://ontology.birzeit.edu/sina/api/LemmaSearch/اخذ?apikey=sampleKey
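Arabic terms in the URL path are percent-encoded as UTF-8 when the request is sent. The sketch below makes that step explicit using the standard library; `lemma_search_url` is a hypothetical helper name, and the base URL is copied from the sample request.

```python
from urllib.parse import quote, unquote

# Base URL copied from the sample request above.
BASE_URL = "https://ontology.birzeit.edu/sina"


def lemma_search_url(term, apikey, base_url=BASE_URL):
    """Build a LemmaSearch URL with the term percent-encoded as UTF-8."""
    return (
        f"{base_url}/api/LemmaSearch/{quote(term, safe='')}"
        f"?apikey={quote(apikey, safe='')}"
    )


url = lemma_search_url("اخذ", "sampleKey")
print(url)           # the term appears as %D8%A7%D8%AE%D8%B0
print(unquote(url))  # decodes back to the readable form shown above
```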
Retrieves a basic morphological analysis of the SearchTerm.
Actors | Authenticated user. |
URL schema | https://{domain}/api/sina-morphizer/{term}?lang={lan}&apikey={key} |
Pre-conditions | The user has registered and provided their API Token. |
API Parameters |
|
Flow of events |
|
Retrieved Data | results JSON object (a list of morphological results). |
https://ontology.birzeit.edu/sina/api/sina-morphizer/هون?lang=dialect&apikey=sampleKey
Normalizes an Arabic word based on the selected parameters (remove diacritics, remove small Arabic diacritics, remove shadda, remove digits, normalize Alef, remove special characters).
Actors | Any user. |
URL schema | https://{domain}/v2/api/ArStrip/{word}/{parameters} |
Pre-conditions | None. |
API Parameters |
|
Flow of events |
|
Retrieved Data | results JSON object. |
https://ontology.birzeit.edu/sina/v2/api/ArStrip/هَذا/true/true/true/false/false/false
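The sample request passes six true/false path segments after the word. A minimal sketch of building such a URL follows. It assumes the segments appear in the same order as the options listed in the description above, which the text does not confirm; the flag names and `arstrip_url` are illustrative.

```python
from urllib.parse import quote

# Base URL copied from the sample request above.
BASE_URL = "https://ontology.birzeit.edu/sina"

# ASSUMPTION: the path order matches the order of options in the description.
FLAG_ORDER = (
    "remove_diacritics",
    "remove_small_diacritics",
    "remove_shadda",
    "remove_digits",
    "normalize_alef",
    "remove_special_chars",
)


def arstrip_url(word, base_url=BASE_URL, **flags):
    """Build an ArStrip URL; flags that are not given default to false."""
    unknown = set(flags) - set(FLAG_ORDER)
    if unknown:
        raise ValueError(f"unknown flags: {sorted(unknown)}")
    parts = ["true" if flags.get(name, False) else "false" for name in FLAG_ORDER]
    return f"{base_url}/v2/api/ArStrip/{quote(word, safe='')}/" + "/".join(parts)


print(
    arstrip_url(
        "هَذا",
        remove_diacritics=True,
        remove_small_diacritics=True,
        remove_shadda=True,
    )
)
```

Keyword flags make the call site readable and reject misspelled option names before any request is sent.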
This web service computes whether two Arabic words are the same or not, regardless of how they are diacritized, and returns “Same” or “Different”. The output also contains the implication direction, the distance, the number of conflicting diacritics, and other outputs. The direction (1, 2, or 3) specifies which word implies the other. Read more in this article.
Actors | Any user. |
URL schema | https://{domain}/api/Implication/{word1}/{word2} |
Pre-conditions | None. |
API Parameters |
|
Flow of events | The system returns the JSON data object. |
Retrieved Data | results JSON object. |
https://ontology.birzeit.edu/sina/api/Implication2/%D9%81%D9%8E%D8%B9%D9%8E%D9%84%D9%8E/%D9%81%D9%8E%D8%B9%D9%84/false/false/false/false
Takes a set of delimited Arabic words and removes duplicates regardless of how they are diacritized, based on the selected parameters.
Actors | Any user. |
URL schema | https://{domain}/api/DuplicateCleaner/{words}/{separator}/{parameters} |
Pre-conditions | None. |
API Parameters |
|
Flow of events | The system returns the JSON data object. |
Retrieved Data | results JSON object. |
https://ontology.birzeit.edu/sina/api/DuplicateCleaner/%D9%81%D8%B9%D9%84%20%20%7C%20%D9%81%D8%B9%D9%84%D9%8E/%7C/true/true/true/true
Takes two sets of words and outputs the union, intersection and similarity measure between them. The service is smart and can tolerate the same words with different diacritics based on the selected parameters.
Actors | Any user. |
URL schema | https://{domain}/api/Jaccard/{set of words1}/{set of words2}/{parameters}/{separator} |
Pre-conditions | None. |
API Parameters |
|
Flow of events | The system returns the JSON data object. |
Retrieved Data | results JSON object. |
https://ontology.birzeit.edu/sina/api/Jaccard/%7B%D9%81%D9%8E%D8%B9%D9%8E%D9%84%D9%8E|%20%D9%81%D9%8E%D8%B9%D9%8E%D9%84%7D/%7B%D9%81%D8%B9%D9%84%D9%8E,%20%D9%81%D9%8E%D8%B9%D9%8E%D9%84%7D/false/false/|
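To illustrate the idea of a diacritic-tolerant comparison, here is a simplified local sketch. It is not the service's algorithm: it strips every diacritic before comparing, so any two words that differ only in diacritics count as the same, whereas the service's behavior depends on the selected parameters. The function names and the character range used for diacritics are assumptions of this example.

```python
import re

# Fathatan..sukun (U+064B-U+0652) plus dagger alef (U+0670).
DIACRITICS = re.compile("[\u064B-\u0652\u0670]")


def strip_diacritics(word):
    """Remove Arabic diacritics and surrounding whitespace."""
    return DIACRITICS.sub("", word.strip())


def jaccard_ignoring_diacritics(words1, words2):
    """Return (union, intersection, similarity) over undiacritized words."""
    a = {strip_diacritics(w) for w in words1 if w.strip()}
    b = {strip_diacritics(w) for w in words2 if w.strip()}
    union = a | b
    intersection = a & b
    # Jaccard similarity = |A ∩ B| / |A ∪ B|; defined as 0.0 for two empty sets.
    similarity = len(intersection) / len(union) if union else 0.0
    return union, intersection, similarity


_, _, sim = jaccard_ignoring_diacritics(["فَعَلَ", "فَعَل"], ["فعلَ", "كتب"])
print(sim)
```

In this usage the first set collapses to one undiacritized word and the second to two, so the similarity is one shared word out of two distinct words.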
Retrieves the terms (lexicon entries) that begin with a given string of characters.
Actors | Authenticated user. |
URL schema | https://{domain}/api/Autocomplete/{term}?limit={number}&apikey={key} |
Pre-conditions | The user has registered and provided their API Key. |
API Parameters |
|
Flow of events |
|
Retrieved Data | results JSON object. |
https://ontology.birzeit.edu/sina/api/Autocomplete/time?limit=10&apikey=sampleKey
Details of the error messages returned by the APIs.
Error Code | Error Message |
-1 | User blocked, exceeded access limit |
-3 | user is not authenticated |
-4 | Incorrect API parameter value |
-5 | No Data Records Found |
-6 | Incorrect Data Value |
login-error | {"error":"invalid_grant","error_description":"Bad credentials"} |
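A client can map these codes to exceptions so that failures are not silently treated as data. The sketch below copies the numeric codes and messages from the table above. How a code is carried in a response is not specified here, so the caller is assumed to extract it first; `ApiError` and `check_error_code` are hypothetical names.

```python
# Codes and messages copied from the table above.
API_ERRORS = {
    -1: "User blocked, exceeded access limit",
    -3: "user is not authenticated",
    -4: "Incorrect API parameter value",
    -5: "No Data Records Found",
    -6: "Incorrect Data Value",
}


class ApiError(Exception):
    """Raised for one of the documented numeric error codes."""

    def __init__(self, code):
        self.code = code
        super().__init__(f"{code}: {API_ERRORS.get(code, 'unknown error')}")


def check_error_code(code):
    """Raise ApiError for a documented error code; otherwise return None."""
    if code in API_ERRORS:
        raise ApiError(code)


try:
    check_error_code(-3)
except ApiError as err:
    print(err)
```

A caller may prefer to treat -5 (no records) as an empty result rather than a failure; that choice is left to the application.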