Document processing in AI Builder

Ijlal Monawwar


AI Builder is one of the newer services introduced in Power Platform. It is a no-code technology that provides machine learning capability through AI models. It is primarily used to automate tasks, which makes our applications more productive and helps us gain insights from our processes. An example would be automating data extraction from a file or document and organizing the result into tables. A number of prebuilt models are available out of the box, such as document processing for invoices, receipts, and forms, classification models, and sentiment analysis, and we can also tailor models to our own business needs to optimize how they perform. In this post I focus on some of the terminology and best practices for developing and training a document processing model.


Collections

We can use collections to group documents that share a similar layout. For example, a collection can group documents with an identical structure, such as tables with the same number of columns and rows.
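To make the idea of grouping by layout concrete, here is a minimal Python sketch. It is only an illustration, not AI Builder's API: collections are created in the AI Builder UI, and the `Document` type, its fields, and the file names below are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Hypothetical description of a sample document's table layout."""
    name: str
    columns: int
    rows: int


def group_into_collections(documents):
    """Group document names by a layout key of (columns, rows)."""
    collections = defaultdict(list)
    for doc in documents:
        collections[(doc.columns, doc.rows)].append(doc.name)
    return dict(collections)


docs = [
    Document("sales_jan.pdf", columns=4, rows=10),
    Document("sales_feb.pdf", columns=4, rows=10),
    Document("inventory.pdf", columns=6, rows=25),
]
collections = group_into_collections(docs)
```

In this sketch the two sales reports land in one collection because their layout key matches, while the inventory report gets a collection of its own.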


Tagging

Tagging is used to extract data from documents by selecting the area where the data is expected to be found. We draw a bounding box to limit the area of selection, which trains the model to adjust its data extraction bounds. There are two tagging modes. The simple tagging mode extracts data at the table level. We use it for simple, straightforward tables that have no complexities such as nested values. The advanced tagging mode, on the other hand, extracts data from tables at the cell level. We use it when tables are skewed, where tagging with a grid isn't possible, or when we need to extract nested items, like an item within a cell.
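The role of the bounding box can be sketched in a few lines of Python. This is an assumption-laden illustration rather than a description of how AI Builder works internally: tagging is done by drawing in the UI, and the `Box` and `Word` types, the coordinates, and the `text_in_tag` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page coordinates (hypothetical units)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, other: "Box") -> bool:
        """True when `other` lies entirely inside this box."""
        return (
            self.left <= other.left
            and self.top <= other.top
            and self.right >= other.right
            and self.bottom >= other.bottom
        )


@dataclass(frozen=True)
class Word:
    """A word on the page together with its position."""
    text: str
    box: Box


def text_in_tag(tag: Box, words: List[Word]) -> str:
    """Join the words that fall fully inside the tagged area."""
    return " ".join(w.text for w in words if tag.contains(w.box))


words = [
    Word("Invoice", Box(10, 10, 60, 20)),
    Word("Total:", Box(10, 100, 50, 110)),
    Word("42.00", Box(55, 100, 90, 110)),
]
total_tag = Box(50, 95, 100, 115)  # the area drawn around the total amount
extracted = text_in_tag(total_tag, words)
```

Here only `42.00` sits inside the tagged area, so the label and the page title are left out of the extracted value. Tagging at the cell level can be pictured as drawing one such box per cell instead of one grid over the whole table.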


Tables

To list all the pieces of information that we want the AI model to extract from our documents, we use tables. A table holds the information that is captured while the document is being processed. We can then use these tables in our Power Apps or flows, for example to derive insights or to store the data in Dataverse tables.
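As a rough picture of what working with an extracted table might look like downstream, the Python sketch below totals the rows of an invoice table. The row shape and the column names (`Description`, `Quantity`, `UnitPrice`) are assumptions made for illustration. In practice the output is consumed inside Power Apps or a flow, not in Python.

```python
from decimal import Decimal

# Hypothetical rows, shaped as one dict per table row with text values.
extracted_items = [
    {"Description": "Widget", "Quantity": "2", "UnitPrice": "9.50"},
    {"Description": "Gadget", "Quantity": "1", "UnitPrice": "20.00"},
]


def line_total(row):
    """Quantity times unit price for a single extracted row."""
    return Decimal(row["Quantity"]) * Decimal(row["UnitPrice"])


def invoice_total(rows):
    """Sum of all line totals; Decimal avoids float rounding on money."""
    return sum((line_total(row) for row in rows), Decimal("0"))


total = invoice_total(extracted_items)
```

A check like this, comparing the computed total with the total field read from the same document, is one simple insight that can be derived before the rows are stored.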

Field or table not in document

Sometimes a document does not contain a given table or field. In that case, we can tell the AI model so by selecting ‘Not available in the document’ in the three-dot menu.

Join us next time as we continue our journey of learning canvas apps. We hope this information was useful, and we look forward to sharing more insights into the Power Platform world.

