Labeling support tickets with Robocorp tools

In this tutorial, you will learn how to use aito.ai with the Robocorp tools and run intelligent workflows in the cloud.

Before you begin

Prerequisites

This tutorial expects you to be familiar with both Python and Robot Framework at least on a novice level.

Getting an aito.ai instance

If you want to follow along with this tutorial, get your own free aito.ai instance from the Console.

  1. Sign in or create an account, if you haven't got one already.

  2. In the Console go to the instances page and click the "Create an instance" button.

  3. Select the instance type you want to create and fill in the needed fields. Sandbox is the free instance type for testing and small projects. Visit our pricing page to learn more about the aito.ai instance types.

  4. Click "Create instance" and wait a moment while your instance is created. You will receive an email once your instance is ready.

Accessing instance details

Once the instance has been created, you can access its URL and API keys as follows:

  1. Log in to the Console

  2. Click on the instance you created

  3. Go to the overview page. You can copy the API keys after clicking the eye icon.

Getting Robocorp Lab & Cloud

For this tutorial you will need to install and have access to the Robocorp Lab and Cloud. Robocorp provides you with good instructions on how to get access to their products in their docs.

How to install and setup Robocorp Lab & Cloud

Use case: Labeling support tickets

Most companies nowadays use a ticketing system. Whether the work is internal development or external customer support, usually everything has to have a ticket. To keep track of which tickets are important and which team each ticket belongs to, the tickets need to be labeled. The person labeling the tickets needs to read through the whole ticket before they can decide which class the ticket belongs to, or how urgently it needs to be addressed. With many incoming tickets, this task takes a lot of time and isn't very motivating for the employees.

This is where aito.ai lends a helping hand. aito.ai just needs the historical data on how the tickets have been labeled in the past, and it can deduce how new incoming tickets should be labeled.

aito.ai also gives a probability for the predicted label, so the employee can manually check those cases in which aito.ai wasn't entirely sure about the label. So instead of going through 50 tickets per day, the employee only has to check the more tricky ones. By using Robocorp you can create the whole infrastructure in a flash and have the whole process running in the cloud as a scheduled task.

Data

The data used in this tutorial comes from Kaggle's ticket classification. The data consists of anonymized, labeled support tickets with the original messages from users. There are two datasets. One is used as training data, so it is already labeled (category column) and also includes subcategory columns. The testing data has neither the labeling information (category) nor the subcategories.

Training tickets

Overview

  1. Setup Robocorp Lab

  2. Creating the ticket labeling robot

  3. Running your robot in the Robocorp cloud

#1 Setup Robocorp Lab

If you followed the instructions in chapter "Before you begin", you should now have an aito.ai instance, access to Robocorp cloud and Robocorp Lab installed.

#2 Creating the ticket labeling robot

Define the workflow

First you will have to plan the workflow you want the robot to do. Where does the data come from? How to get it to aito.ai? What do I want to achieve with the classification? Where do I want to store the results?

To make things a bit easier for you, we have planned a workflow for the robot.

  1. Download the training and test data CSV

  2. Upload the training data to aito.ai

  3. Create an output for the results that have a probability over a threshold

  4. Create an output for the results below a threshold

  5. Label the tickets defined in the test data

  6. Add the good results to aito.ai as more training data

The idea behind the workflow is that we trust aito.ai to label the tickets correctly when the probability of the label is high enough, i.e. over a defined threshold, but if the probability is lower than the set threshold we want a human to check the results and fix them accordingly.
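The threshold rule described above can be sketched in plain Python. This is only an illustration: the function name route_by_threshold and the example predictions are made up, and the only thing taken from aito.ai is that each prediction carries its probability in a field called "$p".

```python
from typing import Dict, List, Tuple


def route_by_threshold(
    predictions: List[Dict], threshold: float
) -> Tuple[List[Dict], List[Dict]]:
    """Split predictions into (trusted, needs_curation) by probability."""
    trusted, needs_curation = [], []
    for prediction in predictions:
        # "$p" holds the probability of the predicted label
        if prediction["$p"] < threshold:
            needs_curation.append(prediction)
        else:
            trusted.append(prediction)
    return trusted, needs_curation


# Hypothetical predictions, made up for illustration only
example = [
    {"feature": 4, "$p": 0.93},
    {"feature": 6, "$p": 0.55},
    {"feature": 5, "$p": 0.80},
]
trusted, needs_curation = route_by_threshold(example, threshold=0.8)
```

A prediction whose probability is exactly equal to the threshold counts as trusted here. Whether that boundary case should go to a human instead is a choice you can make for your own workflow.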

In the example workflow we add back to aito.ai the results to which aito.ai has given a high enough probability, just to show that training data can be easily added to aito.ai without any extra retraining steps. In reality you shouldn't add the results aito.ai was already confident about, but rather the results a user has manually curated, because those teach aito.ai to label the tricky tickets better the next time it is used.

Writing a python helper library

You can easily add Python scripts to the Robocode Lab template: just place the Python file in the root directory. Create a file called AitoRFHelper.py in the root directory.

In AitoRFHelper.py you will write helper functions to transform the data, predict with aito.ai, create output files and write data to the outputs. Our Python SDK makes it easier to integrate aito.ai with Python-based systems. You can leave the AitoRFHelper.py file empty for now. We will get back to it when we start creating the workflow for the robot.

Setup the robot

Credentials

In order to use your Robot Framework robot with aito.ai, you will have to give it the URL and API key of your aito.ai instance. The Robocorp platform offers you a way to handle sensitive information through their vault. That way you can have the sensitive information defined outside your repository, either in the cloud environment or on your local machine. We will go through both setups, but let's first get the robot to run locally. In order to use the vault you will have to create a vault.json somewhere on your machine and define the URL and API key as variables for your robot to use. Here are the steps to follow:

1. Create the vault.json file somewhere on your computer (not in the robot's directory if you're thinking to add it to git for example) and store your instance information in the file as variables.

{
  "credentials": {
    "aito_api_url": "your-aito-instance-api-url",
    "aito_api_key": "your-aito-instance-write-key"
  }
}

2. Next create a directory under your robot's root directory (the directory where you created the AitoRFHelper.py) called devdata and create a json file in the directory called env.json. Copy the following text into the env.json and modify the RPA_SECRET_FILE to point to the vault.json you created in the first step.

{
    "RPA_SECRET_MANAGER": "RPA.Robocloud.Secrets.FileSecrets",
    "RPA_SECRET_FILE": "/path/to/vault.json"
}

3. Then create another directory under your robot's root directory called variables and under it a python script called variables.py. Copy the following script to the variables.py.

from RPA.Robocloud.Secrets import Secrets

secrets = Secrets()
AITO_API_URL = secrets.get_secret("credentials")["aito_api_url"]
AITO_API_KEY = secrets.get_secret("credentials")["aito_api_key"]

4. Open your robot file tasks.robot and underneath the *** Settings *** add:

*** Settings ***
Variables    variables.py

5. You're now able to use the instance URL and API key in the robot script by referring to the variables ${AITO_API_URL} and ${AITO_API_KEY} respectively.
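If the robot fails to find the credentials when run locally, it can help to check the vault file by itself. The following is a minimal sketch that uses only the Python standard library and does not go through the RPA.Robocloud.Secrets library at all. The function name check_vault_file and the REQUIRED_KEYS constant are hypothetical helpers, and the expected layout is the vault.json shown in step 1.

```python
import json
from pathlib import Path

# Keys that variables.py reads from the "credentials" secret
REQUIRED_KEYS = ("aito_api_url", "aito_api_key")


def check_vault_file(path, secret_name="credentials"):
    """Return the secret's dict, or raise ValueError describing what is missing."""
    vault = json.loads(Path(path).read_text(encoding="utf-8"))
    if secret_name not in vault:
        raise ValueError(f"secret '{secret_name}' not found in {path}")
    secret = vault[secret_name]
    # Treat empty strings the same way as missing keys
    missing = [key for key in REQUIRED_KEYS if not secret.get(key)]
    if missing:
        raise ValueError(
            f"missing keys in '{secret_name}': {', '.join(missing)}"
        )
    return secret
```

Calling check_vault_file with the same path you set in RPA_SECRET_FILE should return the two values without raising an error.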

Setup aito.ai Client

After you have set up the credentials you can define the aito.ai Client in your robot file. In order to have all the necessary keywords at your disposal, you will also have to set up the aito.ai API. Just add the setup for the client and aito.ai API to your robot's settings as follows.

*** Settings ***
Variables    variables.py
Library    aito.api
Library    aito.client.AitoClient    ${AITO_API_URL}    ${AITO_API_KEY}    False    WITH NAME    aito_client

Now you're all set to start working on your robot!

Scripting the robot

The robot is written in the file tasks/robot.robot, which is created by Robocode Lab. Let's go through the workflow step by step.

1. Download the training and test data CSV

You will add the keywords for the workflow to the *** Tasks *** cell of the robot file. Add a keyword Download Ticket Data to the tasks.

*** Tasks ***
Label tickets
    Download Ticket Data

Next you will have to define what the keyword means. Add a *** Keywords *** cell and name the keyword Download Ticket Data. It's good practice to add each new keyword as a new cell, as this allows you to run just a single keyword at a time. The library RPA.HTTP includes a Download keyword that can be used for downloading files from URLs. For this tutorial the data files have been uploaded to S3, and you can use those URLs for your robot. Set the overwrite argument to True, so that when the robot is run it will download new files and overwrite the old ones.

*** Keywords ***
Download Ticket Data
    Download    https://aitoai-test-resources.s3-eu-west-1.amazonaws.com/datasets/ticket_classification/training_three_quarters_of_tickets_less_cols.csv    overwrite=True
    Download    https://aitoai-test-resources.s3-eu-west-1.amazonaws.com/datasets/ticket_classification/200_test_cases_wo_answers.csv    overwrite=True

The library RPA.HTTP has to be added to the *** Settings *** cell in order for the Download keyword to work. At this point you can also add the AitoRFHelper that you created earlier to the libraries directory.

*** Settings ***
Variables    variables.py
Library    aito.api
Library    aito.client.AitoClient    ${AITO_API_URL}    ${AITO_API_KEY}    False    WITH NAME    aito_client
Library           RPA.HTTP
Library           AitoRFHelper

To test how the robot now runs, press the forward button on top of the script. You will be asked to restart the kernel; press restart. The whole robot file is then run. You can check from the log that everything ran successfully (you can ignore the warning about the missing keywords for now). The downloaded files should appear in the robot's directory.

2. Upload the training data to aito.ai

Next you will have to upload the training data into aito.ai. The training data includes the already labeled tickets, which aito.ai can then use to deduce labels for new tickets. To use the aito.ai Client, add the Get Library Instance keyword to the tasks. Then add a new task called Upload Training Data. It takes the client, the name of the training data file and the name of the table to be created in aito.ai as arguments. The tasks section should now look like the following.

*** Tasks ***
Label tickets
    Download Ticket Data
    ${client}    Get Library Instance    aito_client
    Upload Training Data    ${client}    ${trainingFile}    ${tableName}

You will also need to add a new library to the settings called Collections to be able to use the Get Library Instance keyword.

*** Settings ***
Variables    variables.py
Library    aito.api
Library    aito.client.AitoClient    ${AITO_API_URL}    ${AITO_API_KEY}    False    WITH NAME    aito_client
Library           RPA.HTTP
Library           AitoRFHelper
Library    Collections

The training file and table name arguments are defined through variables, so you will need to add a new cell to the robot file called *** Variables ***.

*** Variables ***
${trainingFile}    training_three_quarters_of_tickets_less_cols.csv
${tableName}    tickets

You will also have to define the keyword Upload Training Data as a new cell in the robot file. The Upload Training Data keyword will only upload data into aito.ai if the table does not yet exist. The Quick Add Table keyword, provided by the aito.api, will infer a schema from the CSV, transform the data, upload the schema and the data.

*** Keywords ***
Upload Training Data
    [Arguments]    ${client}    ${filePath}    ${tableName}
    ${tableExists}    Check Table Exists    client=${client}    table_name=${table_name}
    Run Keyword Unless    ${tableExists}    Quick Add Table    client=${client}    input_file=${filePath}    table_name=${tableName}

Now run your robot to check that everything works as expected.
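The same "upload only if the table is missing" guard can be sketched in Python. The sketch assumes that the Check Table Exists and Quick Add Table keywords correspond to functions named check_table_exists and quick_add_table in the aito.api module, taking the same keyword arguments as in the robot code above. The api parameter is a hypothetical seam added for this example, so that the logic can be tried out with a stand-in object instead of a real instance.

```python
def upload_training_data(api, client, file_path, table_name):
    """Upload the CSV as a new table unless the table already exists.

    Returns True when an upload happened and False when it was skipped.
    """
    # Assumed to mirror the Check Table Exists keyword
    if api.check_table_exists(client=client, table_name=table_name):
        return False
    # Assumed to mirror the Quick Add Table keyword
    api.quick_add_table(
        client=client, input_file=file_path, table_name=table_name
    )
    return True
```

Because of the guard, running the robot a second time leaves the existing table untouched instead of uploading the training data again.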

3. Create an output for the results that have a probability over a threshold

Next create an output where you will write the results you deem have a good probability. Add a Create Output task and define output and table name as arguments.

*** Tasks ***
Label tickets
    Download Ticket Data
    ${client}    Get Library Instance    aito_client
    Upload Training Data    ${client}    ${trainingFile}    ${tableName}
    ${header}=    Create Output    ${client}    ${output}    ${tableName}

As output and table name are used as variables, add them to the variables as well.

*** Variables ***
${trainingFile}    training_three_quarters_of_tickets_less_cols.csv
${testingFile}    200_test_cases_wo_answers.csv
${tableName}    tickets
${output}    output_aito_classified.csv

Now you're missing the definition of the Create Output so add the code that creates an output file to the AitoRFHelper.py.

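# The aito Python SDK; aito_api is used by the helper functions in this file
import aito.api as aito_api


def create_output(aito_client, file_name, table_name):
    """ Create the output file and write the header row """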
    schema = aito_api.get_table_schema(
        client=aito_client,
        table_name=table_name
    )
    features = schema.columns + ["$p"]
    with open(file_name, "w+") as open_file:
        line = ",".join(features) + "\n"
        open_file.write(line)
    return features

Now you can again run your robot to test that it works as intended. You should see the file output_aito_classified.csv created in the tasks directory.

Checking the Python script

You can also test that your Python script runs by opening the script and clicking the same forward button you use to run the robot.

4. Create an output for the results below a threshold

Let's also create an output file for the results that need manual curation. As before, add the Create Output task to the tasks. This time use a different output variable for the file name.

*** Tasks ***
Label tickets
    Download Ticket Data
    ${client}    Get Library Instance    aito_client
    Upload Training Data    ${client}    ${trainingFile}    ${tableName}
    ${header}=    Create Output    ${client}    ${output}    ${tableName}
    ${headerCuration}=    Create Output    ${client}    ${outputNotLabeled}    ${tableName}

Also remember to add the file name variable definition to variables.

*** Variables ***
${trainingFile}    training_three_quarters_of_tickets_less_cols.csv
${testingFile}    200_test_cases_wo_answers.csv
${tableName}    tickets
${output}    output_aito_classified.csv
${outputNotLabeled}    output_needs_curation.csv

You're now ready to test out this step. You should see the file output_needs_curation.csv created in the tasks directory.

5. Label the tickets defined in the test data

Now let's get to the real beef of this tutorial: labeling tickets using aito.ai. First you can add a task that creates a simple predict and evaluate query from the uploaded data. We won't go through using the evaluate query in this tutorial, but now you know how to get it. Then add a Label Tickets With Aito task to the tasks. It takes the client, the headers of both the labeled output and the manual curation output, and the predict query.

*** Tasks ***
Label tickets
    Download Ticket Data
    ${client}    Get Library Instance    aito_client
    Upload Training Data    ${client}    ${trainingFile}    ${tableName}
    ${header}=    Create Output    ${client}    ${output}    ${tableName}
    ${headerCuration}=    Create Output    ${client}    ${outputNotLabeled}    ${tableName}
    ${predictQuery}    ${evaluateQuery}=    Quick Predict And Evaluate    ${client}    ${table_name}     ${predictField}
    Label Tickets With Aito    ${client}    ${header}    ${headerCuration}    ${predictQuery}

Then you can create the Label Tickets With Aito keyword in a new cell. The keyword takes four arguments: the client, the headers for the two output files and the predict query. The test data is then read and formatted from CSV into JSON, as aito.ai queries work in JSON. The Format CSV To Json keyword is implemented in AitoRFHelper.py. The result is fetched separately for each row in a for loop and then stored either in the labeled output or in the manual curation output, depending on the probability of the result.

*** Keywords ***
Label Tickets With Aito
    [Arguments]    ${client}    ${header}    ${headerCuration}    ${predictQuery}
    ${testData}=    Format CSV To Json    ${client}    ${testingFile}    ${tableName}
    FOR    ${testRow}    IN    @{testData}
        &{result}=    Predict Row    ${client}    ${testRow}    ${predictQuery}
        Run Keyword If    ${result}[$p] < ${threshold}    Append Output    ${outputNotLabeled}    ${testRow}    ${result}    ${headerCuration}    ${predictField}
        ...    ELSE    Append Output    ${output}    ${testRow}    ${result}    ${header}    ${predictField}
    END

Add to the variables the field to be predicted as predictField (in this case it is category) and the threshold you want to use for the predictions as threshold.

*** Variables ***
${trainingFile}    training_three_quarters_of_tickets_less_cols.csv
${testingFile}    200_test_cases_wo_answers.csv
${tableName}    tickets
${output}    output_aito_classified.csv
${outputNotLabeled}    output_needs_curation.csv
${predictField}    category
${threshold}    0.8

If you look at the Label Tickets With Aito a bit closer you'll notice that you still have three keywords which haven't been defined, Format CSV To Json, Predict Row and Append Output. Those you will add into the AitoRFHelper.py. The formatting of the file is needed for making the queries for aito.ai.

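# pandas reads the CSV; DataFrameHandler comes from the aito Python SDK
import pandas
from aito.utils.data_frame_handler import DataFrameHandler


def format_csv_to_json(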
        aito_client,
        file_path,
        table_name,
        schema=None
        ):
    """ Format CSV file to JSON """
    # If schema is not given expect to get it from aito.ai
    if not schema:
        schema = aito_api.get_table_schema(
            client=aito_client,
            table_name=table_name
        )

    # Convert the data to be in correct data types by using the schema
    file_df = pandas.read_csv(file_path)
    data_frame_handler = DataFrameHandler()
    converted_file_df = data_frame_handler.convert_df_using_aito_table_schema(
          df=file_df,
          table_schema=schema
        )

    # Modify NA values to be None
    converted_file_df = converted_file_df.where(
        pandas.notnull(converted_file_df), None)

    return converted_file_df.to_dict(orient="records")

def predict_row(aito_client, datadict, predict_query):
    """ Use aito.ai predict endpoint to predict a result for
    the given field for a data row.
    """
    # Use the template query and formulate a query from the data values
    for key in datadict.keys():
        predict_query["where"][key] = datadict[key]

    # Send query to aito.ai predict endpoint
    result = aito_api.predict(
        client=aito_client,
        query=predict_query
        )

    return {
        "feature": result["hits"][0]["feature"],
        "$p": result["hits"][0]["$p"]
        }

The query created in the predict_row function looks like the following (data from the first row of the test data is used as an example).

{
	"from": "tickets",
	"where": {
		"title": "for call issues",
		"body": "for call issues hi we have daily meeting with client last period we have issues dialing scenario we calling client phone enter conference number can dial others cannot ones cannot receive message saying busy we already asked client if somehow conference number but confirmed today during call got suddenly disconnected join call anymore could you please look into reflecting bad our work cheers application engineer",
		"ticket_type": 0,
		"business_service": 87,
		"urgency": 1,
		"impact": 3
	},
	"predict": "category",
	"limit": 1
}

The from clause defines the table we're using for the prediction. It resembles FROM in SQL.

The where clause defines the prior information we have on the ticket by using propositions, e.g. "ticket_type": 0. This information is used by aito.ai to deduce a label for the ticket.

In the predict clause you define the column which you are trying to predict, in this case it's the category column.

With "limit": 1 aito.ai will return only one result. By default the results are ordered from the highest probability to the lowest, so the first result is the one with the highest probability.
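The predict_row function fills in the where clause by modifying the query template it is given. A variant that builds a fresh query for every row, and leaves the template untouched, could look like the sketch below. The function name build_predict_query is hypothetical, and the template shape is an assumption based on the example query above.

```python
import copy


def build_predict_query(template, row):
    """Return a new query whose "where" clause holds the row's values."""
    # Deep copy so that the shared template is never modified
    query = copy.deepcopy(template)
    where = query.setdefault("where", {})
    for key, value in row.items():
        where[key] = value
    return query


# Assumed template shape, based on the example query in this tutorial
template = {"from": "tickets", "where": {}, "predict": "category", "limit": 1}
row = {"ticket_type": 0, "business_service": 87, "urgency": 1, "impact": 3}
query = build_predict_query(template, row)
```

With this approach, a column that is present in one row but missing from the next cannot leak from one query into another.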

Remember to also add the append_output function to AitoRFHelper.py before trying to run the labeling task.

def append_output(file_name, row, result, header, predict_field):
    """ Append data rows to the created output file"""
    with open(file_name, "a") as open_file:
        row[predict_field] = result["feature"]
        row["$p"] = result["$p"]
        line = ",".join(
            [str(row.get(key, "")) for key in header]) + "\n"
        open_file.write(line)

Now you're ready to test how aito.ai makes predictions! Press the forward button in the robot script and see aito.ai in action. After running the robot you should now have content in the labeled output or the manual curation output files. It works, yay!
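The create_output and append_output functions build each line by joining the values with commas, which works as long as no value contains a comma, a quote or a line break. If your own ticket data does, a variant based on Python's csv module handles the quoting for you. The sketch below is one possible alternative, not the code used by the robot above. The function names write_header and append_row are hypothetical.

```python
import csv


def write_header(file_name, header):
    """Create the output file (overwriting any old one) with a header row."""
    with open(file_name, "w", newline="", encoding="utf-8") as open_file:
        csv.writer(open_file).writerow(header)


def append_row(file_name, row, result, header, predict_field):
    """Append one labeled row, quoting values where the CSV format needs it."""
    values = dict(row)  # copy, so the caller's row is left untouched
    values[predict_field] = result["feature"]
    values["$p"] = result["$p"]
    with open(file_name, "a", newline="", encoding="utf-8") as open_file:
        writer = csv.DictWriter(
            open_file,
            fieldnames=header,
            restval="",             # columns missing from the row stay empty
            extrasaction="ignore",  # keys not in the header are skipped
        )
        writer.writerow(values)
```

One difference to keep in mind: the csv module writes None as an empty field, whereas str(None) in append_output writes the text None.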

6. Add the good results to aito.ai as more training data

Now you have only one step left, adding more training data to aito.ai. Note that normally you should add the manually curated data into aito.ai as those tickets are the ones that are harder for aito.ai so it needs more data in order to do better labeling. Adding the results aito.ai gives can skew the accuracy of the predictions.

In this tutorial we add the data aito.ai has been sure about as training data, just to show how easy it is to add more training data into aito.ai (without any additional retraining steps!). Add a keyword Upload Data to tasks.

*** Tasks ***
Label tickets
    Download Ticket Data
    ${client}    Get Library Instance    aito_client
    Upload Training Data    ${client}    ${trainingFile}    ${tableName}
    ${predictQuery}    ${evaluateQuery}=    Quick Predict And Evaluate    ${client}    ${table_name}     ${predictField}
    ${header}=    Create Output    ${client}    ${output}    ${tableName}
    ${headerCuration}=    Create Output    ${client}    ${outputNotLabeled}    ${tableName}
    Label Tickets With Aito    ${client}    ${header}    ${headerCuration}    ${predictQuery}
    Upload Data    ${client}    ${output}    ${tableName}

Then add the Upload Data keyword as follows. The data has to first be transformed into the JSON format (function was defined in the AitoRFHelper.py) and then the Upload Entries can be used to upload the data into aito.ai.

*** Keywords ***
Upload Data
    [Arguments]    ${client}    ${fileName}    ${tableName}
    ${entryData}=    Format CSV To Json    ${client}    ${fileName}    ${tableName}
    Upload Entries    ${client}    ${tableName}    ${entryData}

When you now run the robot, after labeling the data the robot will add the newly labeled data into aito.ai. And you're done building your robot! Next let's go through how to run the robot in the Robocorp Cloud.

The full robot can be found as an attachment in this article, in case you just want to test running the robot in the Robocorp cloud.

#3 Running your robot in the Robocorp cloud

Set up your robot in Robocorp cloud:

  1. Click on your example organization (or use some existing organization, no need to create a new one if you don't want to)

  2. Click on the "New Process" button

  3. Give a name to the process e.g. "Ticket labeler"

  4. Go to the "Robots" tab and click "New Robot"

  5. Give the robot a name, e.g. "Ticket Labeling Robot" (otherwise use the default options)

  6. Now open up your robot code in the Robocode Lab (robot.robot file)

  7. Click the "Publish to Robocorp Cloud" button in the upper right corner

  8. Select the correct Organization (your example organization), Process (Ticket labeler) and Robot (Ticket Labeling Robot) and click "".

  9. Wait for the process to finish. You'll see a note at the bottom of the IDE that says "The robot was successfully pushed to Robocorp Cloud." when finished.

  10. Navigate to the "Ticket labeler" process and click "Add Step"

  11. Select the "Ticket Labeling Robot" and press "Add to process"

  12. Go to the tab "Vault" and press "Add"

  13. Give the name "credentials" to the secret and click the "Add item" button twice, once for aito_api_url and once for aito_api_key (the same keys as in vault.json)

  14. Now you're ready to run your robot in the cloud! Go back to the "Processes" tab and click "Ticket labeler". You should see a "Run Process" button. Click it and the robot will start running.

What's next

Now that you have the robot up and running, you could integrate it with your ticketing system. For example, you could choose tickets from a certain time period and upload them to aito.ai as training data. Then schedule the robot to run on a certain day to get new unlabeled tickets from the ticketing system, use aito.ai to label them and write the labels back to the ticketing system. The tickets that need manual curation could be sent to an employee via email, and they could then send the curated file to an email address owned by another robot, which would be triggered by the email to store the data into the ticketing system and into aito.ai as additional training data.
