Auto Annotation

Use AI to help you build AI. Auto annotation uses your model predictions to label your training data.

This tutorial demonstrates how to configure auto-annotation workflows in the Clarifai API. With auto-annotation, you can use model predictions to label your inputs. This can help you prepare training data or assign other useful labels and metadata to your inputs. Because models do most of the annotation work, you can speed up and scale up your annotation process while maintaining quality standards, typically reducing the human effort of labeling data by orders of magnitude. And because auto-annotation is built into our APIs, it integrates seamlessly with the search, training, and prediction functionality of the Clarifai platform.

When a model predicts a concept, it does so with a confidence score between 0 and 1. In this walkthrough, we use that score in our workflow: when your model's predictions are confident (close to 1), you can have your data automatically labeled with that concept, and when they are less confident, you can have the input sent to a human for review.

Create Concepts

Create the concepts that we'll be using in our model. In this tutorial we'll create the following concepts: people, man and adult.

# Insert here the initialization code as outlined on this page:
# https://docs.clarifai.com/api-guide/api-overview/api-clients#client-installation-instructions

post_concepts_response = stub.PostConcepts(
    service_pb2.PostConceptsRequest(
        user_app_id=userDataObject,  # The userDataObject is created in the overview and is required when using a PAT
        concepts=[
            resources_pb2.Concept(id="peopleID", name="people"),
            resources_pb2.Concept(id="manID", name="man"),
            resources_pb2.Concept(id="adultID", name="adult"),
        ]
    ),
    metadata=metadata
)

if post_concepts_response.status.code != status_code_pb2.SUCCESS:
    raise Exception("Post concepts failed, status: " + post_concepts_response.status.description)

Link the newly created concepts with the corresponding concepts in the clarifai/main General model.

Run the code below three times, once for each concept created previously. The concept IDs in the clarifai/main General model are the following:

  • ai_l8TKp2h5 - the people concept,

  • ai_dxSG2s86 - the man concept,

  • ai_VPmHr5bm - the adult concept.

Your model's concept IDs are the ones you created in the previous step: peopleID, manID, and adultID.

# Insert here the initialization code as outlined on this page:
# https://docs.clarifai.com/api-guide/api-overview/api-clients#client-installation-instructions

post_concept_relations_response = stub.PostConceptRelations(
    service_pb2.PostConceptRelationsRequest(
        user_app_id=userDataObject,  # The userDataObject is created in the overview and is required when using a PAT
        concept_id="{YOUR_MODEL_CONCEPT_ID}",
        concept_relations=[
            resources_pb2.ConceptRelation(
                object_concept=resources_pb2.Concept(id="{GENERAL_MODEL_CONCEPT_ID}", app_id="main"),
                predicate="synonym"
            )
        ]
    ),
    metadata=metadata
)

if post_concept_relations_response.status.code != status_code_pb2.SUCCESS:
    raise Exception("Post concept relations failed, status: " + post_concept_relations_response.status.description)
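
Rather than editing the placeholders by hand three times, the three pairs can be kept in one mapping and looped over. The sketch below is plain Python and makes no API calls; `CONCEPT_LINKS` and `relation_requests` are hypothetical names introduced here for illustration. Each resulting dictionary holds the values to substitute for `{YOUR_MODEL_CONCEPT_ID}` and `{GENERAL_MODEL_CONCEPT_ID}` in one run of the request above.

```python
# Hypothetical helper: pairs each app concept ID with the matching
# clarifai/main General concept ID listed in this tutorial.
CONCEPT_LINKS = {
    "peopleID": "ai_l8TKp2h5",
    "manID": "ai_dxSG2s86",
    "adultID": "ai_VPmHr5bm",
}


def relation_requests(links, predicate="synonym", general_app_id="main"):
    """Return one plain-dict description per PostConceptRelations call."""
    return [
        {
            "concept_id": app_concept_id,             # {YOUR_MODEL_CONCEPT_ID}
            "object_concept_id": general_concept_id,  # {GENERAL_MODEL_CONCEPT_ID}
            "object_app_id": general_app_id,
            "predicate": predicate,
        }
        for app_concept_id, general_concept_id in links.items()
    ]


planned_calls = relation_requests(CONCEPT_LINKS)
for call in planned_calls:
    print(call["concept_id"], "->", call["object_concept_id"], f"({call['predicate']})")
```

In a real script, the loop body would issue the PostConceptRelations request shown above with these values and check the response status after each call.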

Create a Concept Mapper Model

We're going to create a concept mapper model that translates the concepts from the General model to our new concepts. The model will map the concepts as synonyms. Hypernyms and hyponyms are supported as well.

We'll set the knowledge_graph_id value to be empty. If you want only a subset of the relationships in your app to be followed, assign a knowledge_graph_id to each concept relation in that subset and pass the same knowledge_graph_id as input to this model. The model will then follow only the relationships in that subset of your app's knowledge graph.

from google.protobuf.struct_pb2 import Struct

# Insert here the initialization code as outlined on this page:
# https://docs.clarifai.com/api-guide/api-overview/api-clients#client-installation-instructions

params = Struct()
params.update({
    "knowledge_graph_id": ""
})

post_models_response = stub.PostModels(
    service_pb2.PostModelsRequest(
        user_app_id=userDataObject,  # The userDataObject is created in the overview and is required when using a PAT
        models=[
            resources_pb2.Model(
                id="synonym-model-id",
                model_type_id="concept-synonym-mapper",
                output_info=resources_pb2.OutputInfo(
                    params=params,
                )
            ),
        ]
    ),
    metadata=metadata
)

if post_models_response.status.code != status_code_pb2.SUCCESS:
    raise Exception("Post models failed, status: " + post_models_response.status.description)
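
To make the mapper's role concrete, here is a plain-Python illustration of synonym mapping. It is not the Clarifai implementation: `RELATIONS` and `map_concepts` are hypothetical names, the scores and the `ai_unrelated` ID are made up, and treating an empty knowledge_graph_id as "relations with no graph ID" is an assumption of this sketch.

```python
# Each relation: (app concept ID, General concept ID, knowledge graph ID).
RELATIONS = [
    ("peopleID", "ai_l8TKp2h5", ""),
    ("manID", "ai_dxSG2s86", ""),
    ("adultID", "ai_VPmHr5bm", ""),
]


def map_concepts(predictions, relations, knowledge_graph_id=""):
    """Translate {general_concept_id: score} into {app_concept_id: score},
    following only relations that belong to the requested knowledge graph."""
    lookup = {
        general_id: app_id
        for app_id, general_id, graph_id in relations
        if graph_id == knowledge_graph_id
    }
    # Concepts without a matching relation are dropped from the output.
    return {
        lookup[general_id]: score
        for general_id, score in predictions.items()
        if general_id in lookup
    }


# Made-up General model output; "ai_unrelated" is a placeholder ID.
general_output = {"ai_l8TKp2h5": 0.98, "ai_VPmHr5bm": 0.91, "ai_unrelated": 0.77}
mapped = map_concepts(general_output, RELATIONS)
print(mapped)
```

Passing a non-empty `knowledge_graph_id` to this sketch would ignore all three relations, since they were created without one.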

Create a "Greater Than" Concept Thresholder Model

This model outputs only the predictions whose values are >= the concept values defined in the model.

from google.protobuf.struct_pb2 import Struct

# Insert here the initialization code as outlined on this page:
# https://docs.clarifai.com/api-guide/api-overview/api-clients#client-installation-instructions

params = Struct()
params.update({
    "concept_threshold_type": resources_pb2.GREATER_THAN
})

post_models_response = stub.PostModels(
    service_pb2.PostModelsRequest(
        user_app_id=userDataObject,  # The userDataObject is created in the overview and is required when using a PAT
        models=[
            resources_pb2.Model(
                id="greater-than-model-id",
                model_type_id="concept-threshold",
                output_info=resources_pb2.OutputInfo(
                    data=resources_pb2.Data(
                        concepts=[
                            resources_pb2.Concept(id="peopleID", value=0.5),
                            resources_pb2.Concept(id="manID", value=0.5),
                            resources_pb2.Concept(id="adultID", value=0.95),
                        ]
                    ),
                    params=params
                )
            ),
        ]
    ),
    metadata=metadata
)

if post_models_response.status.code != status_code_pb2.SUCCESS:
    raise Exception("Post models failed, status: " + post_models_response.status.description)

Create a "Less Than" Concept Thresholder Model

This model outputs only the predictions whose values are < the concept values defined in the model.

from google.protobuf.struct_pb2 import Struct

# Insert here the initialization code as outlined on this page:
# https://docs.clarifai.com/api-guide/api-overview/api-clients#client-installation-instructions

params = Struct()
params.update({
    "concept_threshold_type": resources_pb2.LESS_THAN
})

post_models_response = stub.PostModels(
    service_pb2.PostModelsRequest(
        user_app_id=userDataObject,  # The userDataObject is created in the overview and is required when using a PAT
        models=[
            resources_pb2.Model(
                id="less-than-model-id",
                model_type_id="concept-threshold",
                output_info=resources_pb2.OutputInfo(
                    data=resources_pb2.Data(
                        concepts=[
                            resources_pb2.Concept(id="peopleID", value=0.5),
                            resources_pb2.Concept(id="manID", value=0.5),
                            resources_pb2.Concept(id="adultID", value=0.95),
                        ]
                    ),
                    params=params
                )
            ),
        ]
    ),
    metadata=metadata
)

if post_models_response.status.code != status_code_pb2.SUCCESS:
    raise Exception("Post models failed, status: " + post_models_response.status.description)
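
Together, the two thresholder models split the mapped concepts into two disjoint groups. The plain-Python sketch below mimics that split using the threshold values from this tutorial. The function names and the sample scores are made up for illustration, and dropping concepts that have no configured threshold is an assumption of the sketch.

```python
# Per-concept thresholds, matching the values used in both models above.
THRESHOLDS = {"peopleID": 0.5, "manID": 0.5, "adultID": 0.95}


def greater_than(concepts, thresholds):
    """Keep concepts whose score is >= the per-concept threshold."""
    return {c: s for c, s in concepts.items() if c in thresholds and s >= thresholds[c]}


def less_than(concepts, thresholds):
    """Keep concepts whose score is < the per-concept threshold."""
    return {c: s for c, s in concepts.items() if c in thresholds and s < thresholds[c]}


# Made-up mapper output for a single input.
mapped = {"peopleID": 0.98, "manID": 0.5, "adultID": 0.91}

confident = greater_than(mapped, THRESHOLDS)
uncertain = less_than(mapped, THRESHOLDS)
print("auto-label:", confident)
print("needs review:", uncertain)
```

Note that a score exactly equal to its threshold (manID here) lands in the "greater than" group, and that adultID at 0.91 falls short of its stricter 0.95 threshold.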

Create a "Write Success as Me" Annotation Writer Model

Any incoming Data object full of concepts, regions, etc. will be written by this model to the database as an annotation with ANNOTATION_SUCCESS status, as if the app owner had done the work themselves.

from google.protobuf.struct_pb2 import Struct

# Insert here the initialization code as outlined on this page:
# https://docs.clarifai.com/api-guide/api-overview/api-clients#client-installation-instructions

params = Struct()
params.update({
    "annotation_status": status_code_pb2.ANNOTATION_SUCCESS,
    "annotation_user_id": "{YOUR_USER_ID}"
})

post_models_response = stub.PostModels(
    service_pb2.PostModelsRequest(
        user_app_id=userDataObject,  # The userDataObject is created in the overview and is required when using a PAT
        models=[
            resources_pb2.Model(
                id="write-success-model-id",
                model_type_id="annotation-writer",
                output_info=resources_pb2.OutputInfo(
                    params=params
                )
            ),
        ]
    ),
    metadata=metadata
)

if post_models_response.status.code != status_code_pb2.SUCCESS:
    raise Exception("Post models failed, status: " + post_models_response.status.description)

Create a "Write Pending as Me" Annotation Writer Model

Any incoming Data object full of concepts, regions, etc. will be written by this model to the database as an annotation with ANNOTATION_PENDING status, as if the app owner had done the work themselves, but marked pending because it needs further review.

from google.protobuf.struct_pb2 import Struct

# Insert here the initialization code as outlined on this page:
# https://docs.clarifai.com/api-guide/api-overview/api-clients#client-installation-instructions

params = Struct()
params.update({
    "annotation_status": status_code_pb2.ANNOTATION_PENDING,
    "annotation_user_id": "{YOUR_USER_ID}"
})

post_models_response = stub.PostModels(
    service_pb2.PostModelsRequest(
        user_app_id=userDataObject,  # The userDataObject is created in the overview and is required when using a PAT
        models=[
            resources_pb2.Model(
                id="write-pending-model-id",
                model_type_id="annotation-writer",
                output_info=resources_pb2.OutputInfo(
                    params=params
                )
            ),
        ]
    ),
    metadata=metadata
)

if post_models_response.status.code != status_code_pb2.SUCCESS:
    raise Exception("Post models failed, status: " + post_models_response.status.description)
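
The two writer models differ only in the status they attach. As a rough mental model, each one turns the concepts it receives into a record like the one below. This is a plain-Python sketch, not the Clarifai data model: `AnnotationRecord`, `write_annotation`, the input ID, and the user ID are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AnnotationRecord:
    """Simplified stand-in for an annotation stored by a writer model."""
    input_id: str
    user_id: str
    status: str
    concepts: dict = field(default_factory=dict)


def write_annotation(input_id, concepts, status, user_id):
    """Build the record a writer model would store for the given concepts."""
    return AnnotationRecord(input_id, user_id, status, dict(concepts))


# Made-up outputs of the two thresholder models for one input.
success = write_annotation("input-1", {"peopleID": 0.98}, "ANNOTATION_SUCCESS", "my-user-id")
pending = write_annotation("input-1", {"adultID": 0.91}, "ANNOTATION_PENDING", "my-user-id")
print(success)
print(pending)
```

Both records carry the same user ID, which is why the annotations appear as if the app owner had created them.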

Create the Workflow

We will now connect all the models together into a single workflow.

Every input is first passed to the General Embed model, which generates embeddings. The embeddings are sent to the General Concept model, which predicts concepts, and to the cluster model. The concept model's output (a list of concepts with prediction values) is then sent to the concept mapper model, which maps Clarifai concepts to the concepts within your app: people, man, and adult in this case.

The mapped concepts are sent to both concept threshold models (GREATER THAN and LESS THAN). The GREATER THAN model filters out the concepts whose values are lower than the corresponding values you defined in the model and sends the remaining concepts to the "write success as me" model, which labels the input with these concepts (your app concepts only) as you, with success status. You can train or search on these concepts immediately.

The LESS THAN model filters out the concepts whose values are at or above the corresponding values you defined in the model and sends the remaining concepts to the "write pending as me" model, which labels the input with these concepts (your app concepts only) as you, with pending status.

The model IDs and model version IDs from the public clarifai/main application are fixed to the latest versions available at the time of this writing, so they are already hard-coded in the code examples below (check GET /models for an always up-to-date list of available models). It's possible to use other public model or model version IDs.

# Insert here the initialization code as outlined on this page:
# https://docs.clarifai.com/api-guide/api-overview/api-clients#client-installation-instructions

post_workflows_response = stub.PostWorkflows(
    service_pb2.PostWorkflowsRequest(
        user_app_id=userDataObject,  # The userDataObject is created in the overview and is required when using a PAT
        workflows=[
            resources_pb2.Workflow(
                id="auto-annotation-workflow-id",
                nodes=[
                    resources_pb2.WorkflowNode(
                        id="general-embed",
                        model=resources_pb2.Model(
                            id="bbb5f41425b8468d9b7a554ff10f8581",
                            model_version=resources_pb2.ModelVersion(
                                id="bb186755eda04f9cbb6fe32e816be104"
                            )
                        )
                    ),
                    resources_pb2.WorkflowNode(
                        id="general-concept",
                        model=resources_pb2.Model(
                            id="aaa03c23b3724a16a56b629203edc62c",
                            model_version=resources_pb2.ModelVersion(
                                id="aa7f35c01e0642fda5cf400f543e7c40"
                            )
                        )
                    ),
                    resources_pb2.WorkflowNode(
                        id="general-cluster",
                        model=resources_pb2.Model(
                            id="cccbe437d6e54e2bb911c6aa292fb072",
                            model_version=resources_pb2.ModelVersion(
                                id="cc2074cff6dc4c02b6f4e1b8606dcb54"
                            )
                        ),
                    ),
                    resources_pb2.WorkflowNode(
                        id="mapper",
                        model=resources_pb2.Model(
                            id="synonym-model-id",
                            model_version=resources_pb2.ModelVersion(
                                id="{YOUR_SYNONYM_MODEL_VERSION_ID}"
                            )
                        ),
                        node_inputs=[
                            resources_pb2.NodeInput(node_id="general-concept")
                        ]
                    ),
                    resources_pb2.WorkflowNode(
                        id="greater-than",
                        model=resources_pb2.Model(
                            id="greater-than-model-id",
                            model_version=resources_pb2.ModelVersion(
                                id="{YOUR_GREATER_THAN_MODEL_VERSION_ID}"
                            )
                        ),
                        node_inputs=[
                            resources_pb2.NodeInput(node_id="mapper")
                        ]
                    ),
                    resources_pb2.WorkflowNode(
                        id="write-success",
                        model=resources_pb2.Model(
                            id="write-success-model-id",
                            model_version=resources_pb2.ModelVersion(
                                id="{YOUR_WRITE_SUCCESS_MODEL_VERSION_ID}"
                            )
                        ),
                        node_inputs=[
                            resources_pb2.NodeInput(node_id="greater-than")
                        ]
                    ),
                    resources_pb2.WorkflowNode(
                        id="less-than",
                        model=resources_pb2.Model(
                            id="less-than-model-id",
                            model_version=resources_pb2.ModelVersion(
                                id="{YOUR_LESS_THAN_MODEL_VERSION_ID}"
                            )
                        ),
                        node_inputs=[
                            resources_pb2.NodeInput(node_id="mapper")
                        ]
                    ),
                    resources_pb2.WorkflowNode(
                        id="write-pending",
                        model=resources_pb2.Model(
                            id="write-pending-model-id",
                            model_version=resources_pb2.ModelVersion(
                                id="{YOUR_WRITE_PENDING_MODEL_VERSION_ID}"
                            )
                        ),
                        node_inputs=[
                            resources_pb2.NodeInput(node_id="less-than")
                        ]
                    ),
                ]
            )
        ]
    ),
    metadata=metadata
)

if post_workflows_response.status.code != status_code_pb2.SUCCESS:
    raise Exception("Post workflows failed, status: " + post_workflows_response.status.description)
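
Before posting a workflow, it can help to sanity-check the node graph offline. The sketch below encodes the data flow as described in the prose above (node ID mapped to the nodes whose output it consumes) and uses the standard library's `graphlib` (Python 3.9+) to confirm that every referenced node exists and that the graph has no cycles. `WORKFLOW_EDGES` and `execution_order` are hypothetical names, and this check is independent of any validation the API performs.

```python
from graphlib import TopologicalSorter

# Node ID -> IDs of the nodes it takes input from, following the description above.
WORKFLOW_EDGES = {
    "general-embed": [],
    "general-concept": ["general-embed"],
    "general-cluster": ["general-embed"],
    "mapper": ["general-concept"],
    "greater-than": ["mapper"],
    "write-success": ["greater-than"],
    "less-than": ["mapper"],
    "write-pending": ["less-than"],
}


def execution_order(edges):
    """Return one valid run order, or raise if the graph is malformed."""
    referenced = {dep for deps in edges.values() for dep in deps}
    unknown = referenced - set(edges)
    if unknown:
        raise ValueError(f"inputs reference undefined nodes: {sorted(unknown)}")
    # static_order() raises graphlib.CycleError if the graph contains a cycle.
    return list(TopologicalSorter(edges).static_order())


order = execution_order(WORKFLOW_EDGES)
print(order)
```

Any order it prints places each node after the nodes it depends on, so the two writer nodes always come after their respective thresholders.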

Make the New Workflow your App's Default

Make this the default workflow of the app so that it runs every time we add an input and executes the auto-annotation process. If the workflow is not your app's default workflow, you can still call PostWorkflowResults on new inputs to check that you configured the workflow graph and your models properly, but the data will not be written to the database. We recommend doing this check before making the workflow your default and adding inputs to your app.

# Insert here the initialization code as outlined on this page:
# https://docs.clarifai.com/api-guide/api-overview/api-clients#client-installation-instructions

patch_apps_response = stub.PatchApps(
    service_pb2.PatchAppsRequest(
        user_app_id=userDataObject,  # The userDataObject is created in the overview and is required when using a PAT
        action="overwrite",
        apps=[
            resources_pb2.App(
                id="{YOUR_APP_ID}",
                default_workflow_id="auto-annotation-workflow-id"
            )
        ]
    ),
    metadata=metadata
)

if patch_apps_response.status.code != status_code_pb2.SUCCESS:
    raise Exception("Patch apps failed, status: " + patch_apps_response.status.description)
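
When checking the workflow with PostWorkflowResults before making it the default, a small helper can condense the response into one line per node. The sketch below assumes the response exposes `results[0].outputs`, with each output carrying `model.id` and `data.concepts`; treat those field names as my reading of the response shape and verify them against the API reference. `summarize_outputs` is a hypothetical helper, and the response here is a stand-in built from made-up scores so that the example runs without network access.

```python
from types import SimpleNamespace


def summarize_outputs(workflow_response):
    """Map each output's model ID to {concept ID: value} for the first result."""
    summary = {}
    for output in workflow_response.results[0].outputs:
        summary[output.model.id] = {c.id: c.value for c in output.data.concepts}
    return summary


def _fake_output(model_id, concepts):
    """Build a stand-in object with only the fields read by summarize_outputs."""
    return SimpleNamespace(
        model=SimpleNamespace(id=model_id),
        data=SimpleNamespace(
            concepts=[SimpleNamespace(id=cid, value=v) for cid, v in concepts.items()]
        ),
    )


# Stand-in for a PostWorkflowResults response; scores are made up.
fake_response = SimpleNamespace(results=[SimpleNamespace(outputs=[
    _fake_output("greater-than-model-id", {"peopleID": 0.98}),
    _fake_output("less-than-model-id", {"adultID": 0.91}),
])])

for model_id, concepts in summarize_outputs(fake_response).items():
    print(model_id, concepts)
```

With a real response, the same helper makes it easy to see which concepts each thresholder passed on for a given test image.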

Add an Image

Adding the image will trigger the default workflow.

# Insert here the initialization code as outlined on this page:
# https://docs.clarifai.com/api-guide/api-overview/api-clients#client-installation-instructions

post_inputs_response = stub.PostInputs(
    service_pb2.PostInputsRequest(
        user_app_id=userDataObject,  # The userDataObject is created in the overview and is required when using a PAT
        inputs=[
            resources_pb2.Input(
                data=resources_pb2.Data(
                    image=resources_pb2.Image(
                        url="{YOUR_IMAGE_URL}"
                    )
                )
            )
        ]
    ),
    metadata=metadata
)

if post_inputs_response.status.code != status_code_pb2.SUCCESS:
    raise Exception("Post inputs failed, status: " + post_inputs_response.status.description)

List Annotations

Now you can list annotations by your user ID to see the annotations created by your workflow.

# Insert here the initialization code as outlined on this page:
# https://docs.clarifai.com/api-guide/api-overview/api-clients#client-installation-instructions

list_annotations_response = stub.ListAnnotations(
    service_pb2.ListAnnotationsRequest(
        user_app_id=userDataObject,  # The userDataObject is created in the overview and is required when using a PAT
        user_ids=["{YOUR_USER_ID}"],
        list_all_annotations=True,
    ),
    metadata=metadata
)

if list_annotations_response.status.code != status_code_pb2.SUCCESS:
    raise Exception("List annotations failed, status: " + list_annotations_response.status.description)

for annotation in list_annotations_response.annotations:
    print(annotation)
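
Printing raw annotations can get noisy. One option is to group the labeled concepts by annotation status, so that auto-approved labels and labels awaiting review are easy to tell apart. The sketch below reads `status.code`, `input_id`, and `data.concepts` from each annotation; `concepts_by_status` is a hypothetical helper, and the status codes and annotations are placeholders so that the example runs on its own.

```python
from collections import defaultdict
from types import SimpleNamespace

# Placeholder codes for this sketch only; in real code compare against
# status_code_pb2.ANNOTATION_SUCCESS and status_code_pb2.ANNOTATION_PENDING.
SUCCESS, PENDING = 1, 2


def concepts_by_status(annotations):
    """Group (input ID, concept ID) pairs by annotation status code."""
    grouped = defaultdict(list)
    for annotation in annotations:
        for concept in annotation.data.concepts:
            grouped[annotation.status.code].append((annotation.input_id, concept.id))
    return dict(grouped)


def _fake_annotation(input_id, code, concept_ids):
    """Build a stand-in object with only the fields read by concepts_by_status."""
    return SimpleNamespace(
        input_id=input_id,
        status=SimpleNamespace(code=code),
        data=SimpleNamespace(concepts=[SimpleNamespace(id=c) for c in concept_ids]),
    )


# Stand-ins for list_annotations_response.annotations.
fake_annotations = [
    _fake_annotation("input-1", SUCCESS, ["peopleID", "manID"]),
    _fake_annotation("input-1", PENDING, ["adultID"]),
]

grouped = concepts_by_status(fake_annotations)
print("approved:", grouped.get(SUCCESS, []))
print("pending review:", grouped.get(PENDING, []))
```

The pending group is the queue a human reviewer would work through.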
