RecoMadeEasy Embedded AudioVisual Recognition Engine by Recognition Technologies, Inc.
  • AudioVisual Recognition
    (Combines Speaker, Speech, and Face Recognition with Object Detection and Recognition behind a single interface)
    Server Based

  • Speaker Recognition
    (Language- and Text-Independent, aka: Speaker Biometrics, Voice Biometrics, or SIV)
    Recipient: Frost & Sullivan Award 2011
    Server Based

  • Large-Vocabulary Speech Recognition
    Available for English, Spanish, Mandarin, Arabic, and German
    Also Available in Bilingual Spanish-English, Mandarin-English, Arabic-English, and German-English
    (Customizable domain full transcription ~ 240,000+ word vocabulary)
    Server Based

    The Large-Vocabulary Speech Recognition engine provides full speech transcription on small embedded devices as well as on servers. It supports a configurable lexicon on the order of 230,000 unique words, and its language model can be customized to your domain of interest. By default, the engine ships with a generic language model that covers most idiosyncrasies of the language. Medical, legal, financial, and other domain-specific language models are also available. In addition, new domains may be defined and trained in a matter of hours.

    When used in conjunction with our Speaker Recognition engine, it provides full diarization: the Speaker Recognition engine segments the audio by speaker and labels each speaker's identity, and the Speech Recognition engine transcribes what each individual says. The two engines work together and return timestamps and other details, such as score and confidence, for each result. They can also return multiple candidate results, sorted by relevance score. Results may be returned as XML, JSON, or clean human-readable text or HTML. We provide a C++ API as well as web, Android, iOS, and command-line interfaces.
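    As an illustration of how a client might consume a diarized JSON result, here is a minimal Python sketch. The JSON layout and every field name in it (`segments`, `speaker`, `start`, `end`, `hypotheses`, `text`, `score`) are assumptions made up for this example and are not the engine's documented output format. It simply picks the highest-scoring hypothesis for each speaker segment.

```python
import json

# Hypothetical result layout: every field name below is an assumption
# for illustration, not the engine's documented schema.
SAMPLE_RESULT = """
{
  "segments": [
    {"speaker": "speaker_1", "start": 0.0, "end": 2.4,
     "hypotheses": [
       {"text": "good morning", "score": 0.91},
       {"text": "good mourning", "score": 0.42}
     ]},
    {"speaker": "speaker_2", "start": 2.4, "end": 4.1,
     "hypotheses": [
       {"text": "hello there", "score": 0.88}
     ]}
  ]
}
"""


def best_transcript(result_json):
    """Return (speaker, start, end, text) for the top-scoring hypothesis of each segment."""
    result = json.loads(result_json)
    lines = []
    for seg in result["segments"]:
        # Do not rely on the hypotheses being pre-sorted; pick the best by score.
        top = max(seg["hypotheses"], key=lambda h: h["score"])
        lines.append((seg["speaker"], seg["start"], seg["end"], top["text"]))
    return lines


if __name__ == "__main__":
    for speaker, start, end, text in best_transcript(SAMPLE_RESULT):
        print(f"[{start:.1f}-{end:.1f}] {speaker}: {text}")
```

    Keeping the lower-ranked hypotheses in the parsed structure, rather than discarding them at parse time, would let an application fall back to an alternative when the top result is rejected downstream.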

    Supported Languages

      The RecoMadeEasy® Speech Recognition engine is currently available for the following languages:


      English

      • All dialects of English
      • Supports 8 kHz and 16 kHz audio

      Spanish

      • Major dialects of Spanish
      • Supports 8 kHz and 16 kHz audio

      Chinese (Mandarin)

      • Major dialects of Mandarin Chinese
      • Supports 8 kHz and 16 kHz audio

      Arabic

      • Major dialects of Arabic
      • Supports 8 kHz and 16 kHz audio

      German

      • Major dialects of German
      • Supports 8 kHz and 16 kHz audio
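
    Since every listed language supports 8 kHz and 16 kHz audio, a client may want to check a recording's sample rate before submitting it. The following is a minimal sketch using only Python's standard `wave` module. The helper names (`wav_sample_rate`, `is_supported`, `make_silent_wav`) are hypothetical and not part of any RecoMadeEasy API, and the sketch assumes the input is an uncompressed WAV file.

```python
import io
import wave

# Sample rates taken from the supported-languages list.
SUPPORTED_RATES_HZ = (8000, 16000)


def wav_sample_rate(source):
    """Return the sample rate in Hz of a WAV file (path or binary file-like object)."""
    with wave.open(source, "rb") as wav:
        return wav.getframerate()


def is_supported(source):
    """True if the WAV file's sample rate is one of the supported rates."""
    return wav_sample_rate(source) in SUPPORTED_RATES_HZ


def make_silent_wav(rate_hz, seconds=0.1):
    """Build a short mono 16-bit silent WAV in memory, for demonstration only."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00" * int(rate_hz * seconds))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for rate in (8000, 16000, 44100):
        print(rate, is_supported(make_silent_wav(rate)))
```

    Audio recorded at another rate, such as 44.1 kHz, would presumably need to be resampled to one of the supported rates first; how the engine itself handles such input is not stated here.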

  • Face Recognition
    (face detection and recognition)
    Server Based

  • Object Recognition
    (object detection and recognition)
    Server Based

  • Interactive Voice Response (IVR)
    (Graph-based logic, easily configured)

  • Automatic Language Proficiency Rating (ALPR)
    (Multi-lingual automated language proficiency rating)

  • Signature Recognition
    Status: Advanced Development Stage

  • Keystroke Recognition
    Status: Research Stage

For further information, please contact us at 1-800-215-0841 from inside the U.S. or +1-914-997-5676 from any other country. Alternatively, you may send an email to Recognition Technologies, Inc.