Register
Login
Resources
Docs Blog Datasets Glossary Case Studies Tutorials & Webinars
Product
Data Engine LLMs Platform Enterprise
Pricing Explore
Connect to our Discord channel
Integration:  dvc git
e1bd998206
Initial dataset commit
1 year ago
e1bd998206
Initial dataset commit
1 year ago
0154230ccf
Created the Readme File
1 year ago
e1bd998206
Initial dataset commit
1 year ago
Storage Buckets
Data Pipeline
Legend
DVC Managed File
Git Managed File
Metric
Stage File
External File

README.md

You have to be logged in to leave a comment. Sign In

Cover

layoutlm-document-qa

Source: HuggingFace layoutlm-document-qa Model

Description

This is a fine-tuned version of the multi-modal LayoutLM model for the task of question answering on documents. It has been fine-tuned using both the SQuAD2.0 and DocVQA datasets.

Usage

To run some of these examples, you must have PIL, pytesseract, and PyTorch installed in addition to transformers.

from transformers import pipeline

nlp = pipeline(
    "document-question-answering",
    model="impira/layoutlm-document-qa",
)

nlp(
    "https://templates.invoicehome.com/invoice-template-us-neat-750px.png",
    "What is the invoice number?"
)
# {'score': 0.9943977, 'answer': 'us-001', 'start': 15, 'end': 15}

nlp(
    "https://miro.medium.com/max/787/1*iECQRIiOGTmEFLdWkVIH2g.jpeg",
    "What is the purchase amount?"
)
# {'score': 0.9912159, 'answer': '$1,000,000,000', 'start': 97, 'end': 97}

nlp(
    "https://www.accountingcoach.com/wp-content/uploads/2013/10/income-statement-example@2x.png",
    "What are the 2020 net sales?"
)
# {'score': 0.59147286, 'answer': '$ 3,750', 'start': 19, 'end': 20}

NOTE: This model and pipeline was recently landed in transformers via PR #18407 and PR #18414, so you'll need to use a recent version of transformers. For example:

pip install git+https://github.com/huggingface/transformers.git@2ef774211733f0acf8d3415f9284c49ef219e991

License

This model is available on HuggingFace under the MIT License.

Citation

This model was created by the team at Impira.
Tip!

Press p or to see the previous file or, n or to see the next file

About

This is a fine-tuned version of the multi-modal LayoutLM model for the task of question answering on documents.

Collaborators 1

Comments

Loading...