Available Modules

Automated Translation: Automatically translates text and documents from one language to another, identifying the source language without user input. Supports 100 languages.

Automatic Summary: Summarizes a text or document by extracting the most relevant sentences. Supports 26 languages.
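Extracting "the most relevant sentences" can be illustrated with a minimal sketch. The module's actual method is not documented here, so this example assumes a simple word-frequency score; the function name `summarize` and its parameters are hypothetical.

```python
import re
from collections import Counter


def summarize(text, max_sentences=1):
    """Return the highest-scoring sentences, kept in their original order.

    Assumption: a sentence's relevance is the average document-wide
    frequency of its words. This is an illustrative heuristic only.
    """
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"\w+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"\w+", sentence.lower())
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    # Rank sentence indices by score, then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


print(summarize("Cats purr. Cats like cats. Dogs bark.", max_sentences=1))
```

A production summarizer would typically use language-aware sentence splitting and a richer relevance model; the sketch only shows the extractive principle of selecting existing sentences rather than generating new ones.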

Classifier (IPTC) LP1: Automatically classifies documents using the standard taxonomy of the IPTC (International Press Telecommunications Council, the global standards body of the news media). Classification based on custom taxonomies (patents, cyber security, military intelligence, or others) can be created on demand. Language Pack 1: English, French, German, Italian, Polish, Portuguese, Spanish, Romanian, Hungarian.

Classifier (IPTC) LP2: Automatically classifies documents using the standard taxonomy of the IPTC (International Press Telecommunications Council, the global standards body of the news media). Classification based on custom taxonomies (patents, cyber security, military intelligence, or others) can be created on demand. Language Pack 2: Arabic, Chinese, Farsi, Hebrew, Ukrainian, Russian, Turkish, Serbian.

Image Text Extraction (OCR): Extracts the text from images or scanned documents. Language Pack 1: English, French, German, Italian, Polish, Portuguese, Spanish, Romanian, Hungarian. Language Pack 2: Arabic, Chinese Simplified, Chinese Simplified (Vertical), Chinese Traditional, Chinese Traditional (Vertical), Persian (Farsi), Hebrew, Ukrainian, Russian, Turkish, Serbian (Generic), Serbian (Latin script).

Language Detector: Automated identification of the language in a source text or document. Supports over 170 languages.

Named Entity Recognition: Identifies and extracts Named Entities (keywords) from text and documents in over 100 languages: persons, locations, organizations, titles, time references, PII, and more.

Semantic Similarity: The Semantic Similarity Engine detects when pieces of content carry the same meaning, regardless of differences in syntax or grammar. It can even compare content written in different languages, and it can be used to cluster documents based on the information they contain. Supports over 50 languages.
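Comparing meaning rather than wording is commonly done by mapping each text to a numeric vector and measuring the angle between vectors. The sketch below shows cosine similarity under that assumption; the vectors are hypothetical stand-ins for whatever representation the engine produces, and nothing here reflects the module's actual interface.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors.

    Values near 1.0 mean the vectors point the same way; 0.0 means
    they are orthogonal (no similarity under this measure).
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical vectors for two texts with the same meaning in
# different languages, and one unrelated text.
text_en = [0.9, 0.1, 0.3]
text_fr = [0.8, 0.2, 0.3]
unrelated = [0.0, 1.0, 0.0]

print(cosine_similarity(text_en, text_fr) > cosine_similarity(text_en, unrelated))
```

Because the comparison happens in vector space, it does not depend on shared vocabulary, which is what makes cross-language comparison possible in principle.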

Sentiment Analysis LP1: Determines the polarity of any text content: negative (neg), neutral (neu), or positive (pos). Language Pack 1: English, French, German, Italian, Polish, Portuguese, Spanish, Romanian, Hungarian.

Sentiment Analysis LP2: Determines the polarity of any text content: negative (neg), neutral (neu), or positive (pos). Language Pack 2: Arabic, Chinese, Farsi, Japanese, Russian.

Speech to Text: Automated Speech Transcription identifies and extracts spoken language from audio files and transforms it into text, in over 100 languages.

Summarizer: Conveys the meaning of a text through its key sentences alone, so readers do not have to read the full content.

Topic Clustering: Groups similar documents into clusters (partitions) such that documents within the same cluster are more similar to one another than to any document in any other cluster.
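The grouping idea can be sketched with a deliberately simple approach. This example assumes Jaccard word overlap as the similarity measure and a greedy single-pass assignment; both are illustrative choices, and the names `jaccard` and `cluster` and the `threshold` parameter are hypothetical, not part of the module.

```python
def jaccard(a, b):
    """Word-overlap similarity between two texts, from 0.0 to 1.0."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0  # two empty texts are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def cluster(docs, threshold=0.3):
    """Greedy clustering: each document joins the first cluster whose
    first member is at least `threshold` similar, else starts a new one.
    Returns a list of clusters, each a list of document indices.
    """
    clusters = []
    for i, doc in enumerate(docs):
        for members in clusters:
            if jaccard(doc, docs[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


docs = [
    "stock markets fall sharply",
    "stock markets fall again",
    "team wins football final",
    "team wins football cup",
]
print(cluster(docs))
```

On these four sample documents the two finance headlines land in one cluster and the two sports headlines in another, matching the property described above: similarity within a cluster is higher than similarity across clusters.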