Clarifai Guide
Clarifai Home
v7.0
v7.0
  • Welcome
  • Getting Started
    • Quick Start
    • Applications
      • Create an Application
      • Application Settings
      • Collaboration
    • Authentication
      • App-Specific API Keys
      • Personal Access Tokens
      • Scopes
      • Authorize
      • 2FA
    • Glossary
  • How-To
    • Portal
      • Auto Annotation
      • Custom Models
      • Text Classification
      • Visual Text Recognition
    • API
      • Auto Annotation
      • Batch Predict CSV on Custom Text Model
      • Custom KNN Face Classifier Workflow
      • Custom Models
      • Custom Text Model
      • Visual Text Recognition
  • API Guide
    • API overview
      • API Clients
      • Using Postman with Clarifai APIs
      • Status Codes
      • Pagination
      • Patching
    • Data Mode
      • Supported Formats
      • Create, Get, Update, Delete
      • Collectors
    • Concepts
      • Create, Get, Update
      • Languages
      • Search by Concept
      • Knowledge Graph
    • Scribe Label
      • Annotations
      • Training Data
      • Positive and Negative Annotations
      • Tasks
      • Task Annotations
    • Enlight Train
      • Clarifai Models
      • Model Types
      • Create, Get, Update, Delete
      • Deep Training
      • Evaluate
        • Interpreting Evaluations
        • Improving Your Model
    • Mesh Workflows
      • Base Workflows
      • Create, Get, Update, Delete
      • Input Nodes
      • Workflow Predict
    • Armada Predict
      • Images
      • Video
      • Prediction Parameters
      • Multilingual Classification
    • Spacetime Search
      • Search Overview
      • Combine or Negate
      • Filter
      • Rank
      • Index Images for Search
      • Legacy Search
        • Combine or Negate
        • Filter
        • Rank
        • Saved Searches
  • Portal Guide
    • Portal Overview
    • Data Mode
      • Supported Formats
      • Bulk Labeling
      • CSV and TSV
      • Collectors
    • Concepts
      • Create, Get, Update, Delete
      • Knowledge Graph
      • Languages
    • Scribe Label
      • Create a Task
      • Label Types
      • Labeling Tools
      • AI Assist
      • Workforce Management
      • Review
      • Training Data
      • Positive and Negative Annotations
    • Enlight Train
      • Training Basics
      • Clarifai Models
      • Model Types
      • Deep Training
      • Evaluate
        • Interpreting Evaluations
        • Improving Your Model
    • Mesh Workflows
      • Base Workflows
      • Setting Up a Mesh Workflow
      • Input Nodes
    • Armada Predict
    • Spacetime Search
      • Rank
      • Filter
      • Combine or Negate
      • Saved Searches
      • Visual Search
  • Data Labeling Services
    • Scribe LabelForce
  • Product Updates
    • Upcoming API Changes
    • Changelog
      • Release 7.0
      • Release 6.11
      • Release 6.10
      • Release 6.9
      • Release 6.8
      • Release 6.7
      • Release 6.6
      • Release 6.5
      • Release 6.4
      • Release 6.3
      • Release 6.2
      • Release 6.1
      • Release 6.0
      • Release 5.11
      • Release 5.10
Powered by GitBook
On this page

Was this helpful?

  1. How-To
  2. Portal

Visual Text Recognition

Work with text in images, just like you work with encoded text.

PreviousText ClassificationNextAPI

Last updated 4 years ago

Was this helpful?

Visual text recognition helps you convert printed text in images and videos into machine-encoded text. You can input a scanned document, a photo of a document, a scene-photo (such as the text on signs and billboards), or text superimposed on an image (such as in a meme) and output the words and individual characters present in the images. VTR lets you "digitize" text so that it can be edited, searched, stored, displayed and analyzed.

Please note: The current version of our VTR model is not designed for use with handwritten text, or documents with tightly-packed text (like you might see on the page of a novel, for example).

How VTR works

VTR works by first detecting the location of text in your photos or video frames, then cropping the region where the text is present, and then finally running a specialized classification model that will extract text from the cropped image. To accomplish these different tasks, you will need to configure a workflow. You will then add these three models to your workflow:

  • Visual Text Detection

  • 1.0 Cropper

  • Visual Text Recognition

Start by creating an app with General-Detection as the base workflow.

Next, navigate to Model Mode and click "Create Workflow".

Under "User" select Clarifai to access Clarifai Models.

Add these three models to your workflow:

  • Visual Text Detection

  • 1.0 Cropper

  • Visual Text Recognition

Connect the input nodes in your workflow.

  • Connect "1.0 Cropper" to "Visual Text Detector".

  • Connect "Visual Text Recognition" to "1.0 Cropper".

Upload your inputs and navigate to Explorer view. On the righthand sidebar click the "gear" icon under app workflow. Select your newly created workflow and view your detected text.