Sina Arabic Tools and Resources

LexAPI - Arabic Lexicographic Web Services

A set of RESTful web services that together form an API through which third-party software developers can directly retrieve linguistic data from the lexicographic database (150 multilingual lexicons + Arabic Ontology).

Implication Function

A web service that determines whether two Arabic words are the same, taking into account whether their diacritics are compatible or incompatible. The output includes the implication direction, the distance, the number of conflicts, and other values.
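The idea of diacritic compatibility can be illustrated with a minimal Python sketch. It is not the service's implementation. The names `split_letters` and `compare`, the result keys, and the direction labels are hypothetical, and the compatibility rule is an assumption: a letter with no diacritics is compatible with anything, while two letters carrying different diacritics count as one conflict. The distance value is not computed, because the description above does not define it.

```python
# Arabic diacritics, U+064B (fathatan) through U+0652 (sukun).
DIACRITICS = {chr(c) for c in range(0x064B, 0x0653)}
FATHA, DAMMA, KASRA = "\u064E", "\u064F", "\u0650"


def split_letters(word):
    """Split a word into (letter, marks) pairs; marks is a sorted string."""
    units = []
    for ch in word:
        if ch in DIACRITICS:
            if units:  # a mark with no preceding letter is ignored
                units[-1][1] += ch
        else:
            units.append([ch, ""])
    return [(letter, "".join(sorted(marks))) for letter, marks in units]


def compare(word_a, word_b):
    """Compare two words under the assumed compatibility rule."""
    a, b = split_letters(word_a), split_letters(word_b)
    if [letter for letter, _ in a] != [letter for letter, _ in b]:
        # Different base letters: the words cannot be the same.
        return {"same": False, "conflicts": None, "direction": None}

    conflicts = 0
    a_adds = b_adds = False
    for (_, marks_a), (_, marks_b) in zip(a, b):
        if marks_a == marks_b:
            continue
        if marks_a and marks_b:
            conflicts += 1   # both letters are marked, but differently
        elif marks_a:
            a_adds = True    # only word_a specifies this letter
        else:
            b_adds = True    # only word_b specifies this letter

    if conflicts:
        direction = None
    elif a_adds and b_adds:
        direction = "mixed"
    elif a_adds:
        direction = "a_more_specific"
    elif b_adds:
        direction = "b_more_specific"
    else:
        direction = "equal"
    return {"same": conflicts == 0, "conflicts": conflicts, "direction": direction}


# Usage: the letters kaf, ta, ba with different vowel patterns.
K, T, B = "\u0643", "\u062A", "\u0628"
bare = K + T + B
kataba = K + FATHA + T + FATHA + B + FATHA
kutiba = K + DAMMA + T + KASRA + B + FATHA
result_compatible = compare(kataba, bare)
result_conflicting = compare(kataba, kutiba)
```

Under these assumptions, the fully diacritized word and its bare form are treated as the same word, while the two differently vocalized forms conflict on two letters.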

Word-Duplicates Cleaner

A web service that takes a set of delimited Arabic words and smartly removes duplicates, regardless of how they are diacritized. Some diacritics are combined if needed.
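As a rough illustration of diacritic-aware deduplication, the Python sketch below merges words that share the same letters and have no conflicting diacritics, combining their marks letter by letter. It is a simplified sketch under stated assumptions, not the service's algorithm: the function names, the comma delimiter, and the greedy first-match merging are all hypothetical choices.

```python
# Arabic diacritics, U+064B (fathatan) through U+0652 (sukun).
DIACRITICS = {chr(c) for c in range(0x064B, 0x0653)}


def split_letters(word):
    """Split a word into (letter, marks) pairs; marks is a sorted string."""
    units = []
    for ch in word:
        if ch in DIACRITICS:
            if units:  # a mark with no preceding letter is ignored
                units[-1][1] += ch
        else:
            units.append([ch, ""])
    return [(letter, "".join(sorted(marks))) for letter, marks in units]


def merge(a, b):
    """Return the combined units if a and b are compatible, else None."""
    if len(a) != len(b):
        return None
    merged = []
    for (letter_a, marks_a), (letter_b, marks_b) in zip(a, b):
        if letter_a != letter_b:
            return None
        if marks_a and marks_b and marks_a != marks_b:
            return None  # conflicting diacritics: treat as different words
        merged.append((letter_a, marks_a or marks_b))
    return merged


def remove_duplicates(text, delimiter=","):
    """Deduplicate delimited words, merging compatible diacritizations."""
    kept = []
    for raw in text.split(delimiter):
        word = raw.strip()
        if not word:
            continue
        units = split_letters(word)
        for i, existing in enumerate(kept):
            combined = merge(existing, units)
            if combined is not None:
                kept[i] = combined  # absorb the duplicate, keeping all marks
                break
        else:
            kept.append(units)
    return ["".join(letter + marks for letter, marks in units) for units in kept]


# Usage: bare form, fully diacritized form, and a conflicting vocalization.
K, T, B = "\u0643", "\u062A", "\u0628"
FATHA, DAMMA, KASRA = "\u064E", "\u064F", "\u0650"
kataba = K + FATHA + T + FATHA + B + FATHA
kutiba = K + DAMMA + T + KASRA + B + FATHA
cleaned = remove_duplicates(",".join([K + T + B, kataba, kutiba]))
```

Because compatibility is not transitive, this greedy approach depends on input order: a bare word is merged into the first compatible entry it meets.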

Word-Duplicates Cleaner from file

A web service that reads a file and smartly removes word duplicates in each line.

Arabic Jaccard Function

A web service that takes two sets of words and outputs the union, intersection, and similarity between them.
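For reference, the standard Jaccard similarity of two sets is the size of their intersection divided by the size of their union. The Python sketch below shows that computation on plain word sets. It assumes exact string matching, whereas the service presumably compares words in a diacritic-aware way. The function name `jaccard` and the convention of returning 1.0 for two empty sets are assumptions made for this example.

```python
def jaccard(words_a, words_b):
    """Return (union, intersection, similarity) for two word collections."""
    a, b = set(words_a), set(words_b)
    union = a | b
    intersection = a & b
    # Two empty sets are treated as identical (an assumed convention).
    similarity = len(intersection) / len(union) if union else 1.0
    return union, intersection, similarity


# Usage: two small sets of undiacritized words sharing two members.
first = {"كتب", "قلم", "بيت"}
second = {"قلم", "بيت", "باب"}
union, intersection, similarity = jaccard(first, second)
print(similarity)  # 2 shared words out of 4 distinct words -> 0.5
```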

A framework and 30 parsers for digitizing lexicons.

30 parsers that handle a broad set of delicate issues in both Arabic and English lexical entries across various types of lexicons. We used them in digitizing, restructuring, and normalizing 150 lexicons.

Curras - Corpus for Palestinian Dialect

Download and search a well-annotated corpus of the Palestinian dialect (~60k words).

CODE for Palestinian Dialect orthography

A set of spelling guidelines on how the Palestinian dialect can be written.

Method for ontological Classification of Processes

Five ontological notions {Homeomericity, Cumulativity, Telicity, Instantaneity, Atomicity} that can be used to describe and classify ontologies, plus a full annotation of the Gene Ontology Top Level Processes.

MashQL - a Query Mashup Language

A novel query formulation language for querying the Data Web, graph databases, and data mashups. Its indexing of big data (Graph Signature) is considerably faster than that of, for example, Oracle's graph database.