Quick start
Get Typless up and running in 5 minutes!
Typless is an AI-powered document extraction platform for automating manual data entry from any document. With an easy-to-use API, you can extract data with a single call in a matter of seconds.
1. Create a new document type
Before you start extracting data, you need to define a document type.
Navigate to the Dashboard page and click on the New document type button in the top right corner of the table.
Next, select the Simple invoice template and click Create document type. This will create a new document type with the name simple-invoice and the following fields:
supplier_name
invoice_number
pay_due_date
issue_date
total_amount
2. Get your API key
To use the Typless API, you need to authorize your requests with an API key.
To obtain the API key, visit the Settings page (log in first if you are not signed in).
Send the key in the Authorization header of every request, in the form: Token YOUR-API-KEY
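The header format above can be wrapped in a small helper so the key is not hard-coded in every script. This is a minimal sketch: the function name `build_headers` and the environment variable `TYPLESS_API_KEY` are assumptions made for this example, not part of the Typless API.

```python
import os


def build_headers(api_key: str) -> dict:
    """Build request headers using the 'Token YOUR-API-KEY' scheme."""
    if not api_key:
        raise ValueError("API key is empty")
    return {
        "Accept": "application/json",
        "Content-Type": "application/json",
        "Authorization": f"Token {api_key}",
    }


# TYPLESS_API_KEY is a hypothetical variable name chosen for this example;
# the fallback keeps the placeholder visible if the variable is not set.
headers = build_headers(os.environ.get("TYPLESS_API_KEY", "YOUR-API-KEY"))
```

Reading the key from the environment keeps it out of source control; the resulting `headers` dictionary can be passed to the requests in the following steps.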
3. Add a new supplier
Typless is a tool for automation, so you first need to fill the dataset and train a model on it. To automate a new supplier, first add its invoices to the dataset. Download an invoice from Amazing Company:
To add a document to the dataset, use the add-document endpoint or the training room, where you can easily upload a file and fill out the necessary information.
The dataset is created by uploading an original file with the correct value for each field defined inside the document type:
import base64

import requests

file_name = 'amazing_company_1.pdf'

# Make sure the file is in your current working directory
with open(file_name, 'rb') as file:
    base64_data = base64.b64encode(file.read()).decode('utf-8')

url = "https://developers.typless.com/api/add-document"

# One entry per field defined in the simple-invoice document type
payload = {
    "learning_fields": [
        {"name": "supplier_name", "value": "Amazing Company"},
        {"name": "invoice_number", "value": "333"},
        {"name": "issue_date", "value": "2021-02-01"},
        {"name": "pay_due_date", "value": "2021-03-31"},
        {"name": "total_amount", "value": "15.0000"},
    ],
    "line_items": [],
    "file": base64_data,
    "file_name": file_name,
    "document_type_name": "simple-invoice",
}

headers = {
    "Accept": "application/json",
    "Content-Type": "application/json",
    # Replace YOUR-API-KEY with the key from the Settings page
    "Authorization": "Token YOUR-API-KEY",
}

response = requests.post(url, json=payload, headers=headers)
print(response.json())
Response:
{
'details': ['0cbabe33bc9f0f093d2bd06ae3f86240368b6937'],
'message': 'Document added successfully.'
}
You have just added your first document to your data set!
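If you add several invoices, writing the `learning_fields` list by hand gets repetitive. The sketch below builds the same payload shape as the request above from a plain dictionary of field values. The helper name `build_add_document_payload` and its parameters are hypothetical, and the placeholder bytes stand in for a real PDF.

```python
import base64


def build_add_document_payload(file_bytes, file_name, document_type_name, field_values):
    """Return a payload with the same keys as the add-document request above."""
    return {
        # Values are sent as strings, as in the example request
        "learning_fields": [
            {"name": name, "value": str(value)}
            for name, value in field_values.items()
        ],
        "line_items": [],
        "file": base64.b64encode(file_bytes).decode("utf-8"),
        "file_name": file_name,
        "document_type_name": document_type_name,
    }


# Placeholder bytes; in practice read them with open(file_name, 'rb')
payload = build_add_document_payload(
    b"%PDF-1.4 placeholder",
    "amazing_company_1.pdf",
    "simple-invoice",
    {
        "supplier_name": "Amazing Company",
        "invoice_number": "333",
        "issue_date": "2021-02-01",
        "pay_due_date": "2021-03-31",
        "total_amount": "15.0000",
    },
)
```

The resulting `payload` can be posted to the add-document endpoint exactly as in the request above.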
4. Execute training
To build the model for extraction, you need to trigger the training. Go to the Dashboard page, look for the simple-invoice document type in the list, and click on the ⚙️ settings icon.
Wait a few moments for the training to finish. You can refresh the status by clicking on the refresh button in the top left part of the table. After the training status says Models trained, the model is ready for data extraction.
5. Extract data from documents
After the training is finished, you can start extracting data from documents issued by trained suppliers. Download another invoice from Amazing Company:
Extract the data:
import base64

import requests

file_name = 'amazing_company_2.pdf'

# Make sure the file is in your current working directory
with open(file_name, 'rb') as file:
    base64_data = base64.b64encode(file.read()).decode('utf-8')

url = "https://developers.typless.com/api/extract-data"

payload = {
    "file": base64_data,
    "file_name": file_name,
    "document_type_name": "simple-invoice",
}

headers = {
    "Accept": "application/json",
    "Content-Type": "application/json",
    # Replace YOUR-API-KEY with the key from the Settings page
    "Authorization": "Token YOUR-API-KEY",
}

response = requests.post(url, json=payload, headers=headers)

# Print the first predicted value for each field
for field in response.json()['extracted_fields']:
    print(f'{field["name"]}: {field["values"][0]["value"]}')
Response:
// Example extraction response - the provided recipe will not produce equal results
{
"file_name": "invoice_2.pdf",
"object_id": "1cb25cc8-c9fa-4149-9a83-b4ed6a2173b9",
"extracted_fields": [
{
"name": "supplier_name",
"values": [
{
"x": -1,
"y": -1,
"width": -1,
"height": -1,
"value": "ScaleGrid",
"confidence_score": "0.968",
"page_number": -1
}
],
"data_type": "AUTHOR"
},
{
"name": "invoice_number",
"values": [
{
"x": 1989,
"y": 545,
"width": 323,
"height": 54,
"value": "20190500005890",
"confidence_score": "0.250",
"page_number": 0
},
{
"x": 167,
"y": 574,
"width": 391,
"height": 54,
"value": "GB123456789",
"confidence_score": "0.250",
"page_number": 0
}
],
"data_type": "STRING"
},
{
"name": "issue_date",
"values": [
{
"x": 2072,
"y": 628,
"width": 240,
"height": 54,
"value": "2019-06-05",
"confidence_score": "0.358",
"page_number": 0
}
],
"data_type": "DATE"
},
{
"name": "pay_due_date",
"values": [
{
"x": 2072,
"y": 628,
"width": 240,
"height": 54,
"value": "2019-06-05",
"confidence_score": "0.358",
"page_number": 0
}
],
"data_type": "DATE"
},
{
"name": "total_amount",
"values": [
{
"x": 2146,
"y": 1196,
"width": 126,
"height": 54,
"value": "47.5300",
"confidence_score": "0.990",
"page_number": 0
}
],
"data_type": "NUMBER"
}
],
"line_items": [],
"customer": null
}
For each of the defined fields, you get an object inside extracted_fields.
Every field has up to 5 best-predicted value blocks with coordinates, recognized value, and confidence score.
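Because each field can carry several candidate blocks, you may want to keep only the most confident one and ignore weak predictions. The sketch below is one way to do that, assuming the response shape shown in the example above, where `confidence_score` is a string. The function name `best_values` and the 0.5 threshold are arbitrary choices for illustration, not Typless recommendations.

```python
def best_values(response_json, min_confidence=0.5):
    """Map each field name to its most confident value, or None if too uncertain."""
    result = {}
    for field in response_json.get("extracted_fields", []):
        candidates = field.get("values", [])
        if not candidates:
            result[field["name"]] = None
            continue
        # confidence_score is a string in the example response, so convert it
        top = max(candidates, key=lambda c: float(c["confidence_score"]))
        confident = float(top["confidence_score"]) >= min_confidence
        result[field["name"]] = top["value"] if confident else None
    return result


# Trimmed copy of the example response above
sample = {
    "extracted_fields": [
        {"name": "supplier_name",
         "values": [{"value": "ScaleGrid", "confidence_score": "0.968"}]},
        {"name": "invoice_number",
         "values": [{"value": "20190500005890", "confidence_score": "0.250"},
                    {"value": "GB123456789", "confidence_score": "0.250"}]},
        {"name": "total_amount",
         "values": [{"value": "47.5300", "confidence_score": "0.990"}]},
    ]
}

print(best_values(sample))
```

Fields that come back as `None` are candidates for manual review; with the sample above, that is `invoice_number`, whose two candidates both score 0.250.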
The values are always in a string format.
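Since values arrive as strings, you will usually convert them before storing them. The sketch below uses the `data_type` labels seen in the example response (NUMBER, DATE, STRING, AUTHOR) and assumes dates come back in the YYYY-MM-DD form shown there; `convert_value` is a hypothetical helper name, and other data types or formats may exist that this sketch does not handle.

```python
from datetime import date
from decimal import Decimal


def convert_value(value, data_type):
    """Convert an extracted string based on the field's data_type."""
    if data_type == "NUMBER":
        # Decimal avoids floating-point rounding on monetary amounts
        return Decimal(value)
    if data_type == "DATE":
        # Assumes the ISO format seen in the example, e.g. "2019-06-05"
        return date.fromisoformat(value)
    # STRING, AUTHOR, and any unknown type are returned unchanged
    return value


print(convert_value("47.5300", "NUMBER"))
print(convert_value("2019-06-05", "DATE"))
print(convert_value("ScaleGrid", "AUTHOR"))
```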
📘 For simple layouts, one invoice in your data set is enough, but for more complex ones, you may need around 5 from the same supplier.
Congratulations! You just successfully trained and extracted data with Typless. You can now use it to automate manual data entry from any of your invoices.
What’s Next
Now that you know the basic concepts of Typless, you can explore the pre-made use cases with examples to help you meet your project requirements and get up and running in a couple of minutes!