Harnessing Generative AI & OCR tools for revenue optimisation
Arianna Fischer 17 July, 2024
How QuantSpark identified high-value, complex processes for a global travel logistics firm, revealing millions of dollars in untapped revenue. QuantSpark combined the latest Generative AI technologies with Optical Character Recognition to transform a client’s billing process.
Executive summary
A global travel logistics firm asked QuantSpark to examine its international billing processes and identify potential discrepancies driving revenue losses.
QuantSpark deployed Optical Character Recognition (OCR) tools to automate contract transcription, followed by Large Language Models (LLMs) to parse and interpret the transcribed contracts. We then identified revenue discrepancies in our client’s historical invoices accounting for up to 3% of revenue value.
As a result, QuantSpark delivered a streamlined invoicing process, reducing manual effort and time-consuming tasks, thereby enhancing operational efficiency and minimising the risk of human errors. The solution not only identified millions of dollars in revenue discrepancies, but is poised to enhance billing accuracy for our client prior to invoices being sent out, reinforcing their financial stability.
The problem
Each day, our client experiences revenue losses due to under-billing for the services it provides.
With each new contract signing, renewal, or amendment, the pricing details for each service offered, along with contract conditions, stipulations, and clauses, are manually entered into a billing system, which makes accuracy a challenge.
As services are executed in real time, this billing system uses the entered information to generate invoices. Because our client is a large corporation serving numerous customers, any inadvertent human error in this process poses a significant risk of revenue loss. Historically, identifying potential under-billing has relied on manual, labour-intensive methods, with inconsistent performance and limited scalability due to contractual complexities.
The solution
QuantSpark worked alongside the client on a 6-week feasibility study to investigate whether a Generative AI-powered solution could support the reconciliation of invoiced revenue to expected revenue. Our proof-of-concept solution is divided into three parts:
- Transcription of contracts.
- Extraction of relevant contract data and application to historical service data.
- Comparing QuantSpark-generated invoices to our client’s historically issued invoices.
Figure: Full process diagram with all three parts of the solution.
Transcription of contracts
Our client supplied a subset of scanned contracts of varying quality from which to extract billing information. While some scans were in pristine condition, others displayed blurry or misaligned elements. Moreover, structure, colour scheme, font style, and size varied considerably, making it difficult to formulate a uniform approach capable of accommodating these differences.
Our solution focused on two primary patterns in which information was presented in the contracts: tables and paragraph text. We fine-tuned an OCR model to segment the contracts into these two categories. Subsequently, we further refined a set of OCR models to transcribe the paragraph text and reconstruct the tables into a structured format suitable for database ingestion.
Finally, we reassembled the transcribed segments in their original sequence to yield a fully transcribed document. Although our proof of concept predates multi-modal LLMs such as GPT-4 with vision, our current transition to a minimum viable product gives us the opportunity to adopt these models and improve transcription quality.
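To illustrate the reassembly step, here is a minimal sketch of how transcribed segments might be put back into reading order. The `Region` structure, its fields, and the sample values are hypothetical and chosen for illustration only. They are not taken from the actual solution, and the OCR models themselves are out of scope here.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """One segmented block of a scanned page (hypothetical structure)."""
    page: int        # page number within the scan, starting at 1
    top: float       # vertical position of the block on its page
    kind: str        # "paragraph" or "table"
    content: object  # str for paragraphs, list of row lists for tables


def render(region: Region) -> str:
    """Render a region as text; table rows become pipe-separated lines."""
    if region.kind == "table":
        return "\n".join(
            " | ".join(str(cell) for cell in row) for row in region.content
        )
    return str(region.content)


def assemble(regions: list[Region]) -> str:
    """Join regions in reading order: by page, then top to bottom."""
    ordered = sorted(regions, key=lambda r: (r.page, r.top))
    return "\n\n".join(render(r) for r in ordered)


# Illustrative input, deliberately out of order.
regions = [
    Region(page=1, top=300.0, kind="table",
           content=[["Service", "Rate"], ["Cleaning", "60"]]),
    Region(page=2, top=50.0, kind="paragraph",
           content="Cancellation terms apply."),
    Region(page=1, top=40.0, kind="paragraph",
           content="Schedule of charges"),
]
document = assemble(regions)
print(document)
```

Keeping tables as row lists until the final rendering step means the same structure could also be loaded into a database, in line with the structured format mentioned above.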
Extraction of relevant contract data and application to historical service data
With our transcribed documents in hand, we turned to LLMs to extract pertinent information from the contracts. The logic embedded in these contracts falls into two categories:
- Prices for services, such as cleaning or refuelling services. These often reside within the tables of contracts, whose structures can differ greatly even within the same contract.
- Contract stipulations, which encompass conditions and clauses detailing alterations to service prices in scenarios like cancellations or delays in service execution.
Extracting prices for services
Deriving prices and rates for services from tables boils down to the art of ‘prompt engineering’. This process entails designing and refining input prompts to elicit desired responses from LLMs, thereby maximising the model's performance and relevance in generating text-based outputs.
Our approach involves providing the model with the contract's validity date, a summarised overview regarding the contents of a given table, the table itself, and a user query, such as “What is the rate for cleaning services?”.
This gives the LLM enough context to return its interpretation of the most up-to-date price for the desired service. To improve its handling of common-sense and complex reasoning tasks, we employed chain-of-thought prompting. This guides the LLM to set out its intermediate reasoning steps before giving a final answer, which encourages consistent, relevant responses that follow a logical flow.
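As a rough illustration, the four inputs described above and a chain-of-thought instruction could be assembled into a single prompt as follows. The function names, the wording of the instruction, and the `ANSWER:` convention are assumptions made for this sketch rather than the prompts used in the project, and no particular LLM API is assumed.

```python
from typing import Optional


def build_pricing_prompt(validity_date: str, table_summary: str,
                         table_text: str, question: str) -> str:
    """Combine contract context and a user query into one prompt."""
    return "\n\n".join([
        f"Contract valid from: {validity_date}",
        f"Summary of the table: {table_summary}",
        f"Table:\n{table_text}",
        f"Question: {question}",
        # Chain-of-thought instruction: ask for the reasoning first.
        "Reason step by step: identify the relevant rows, check that they "
        "apply for the validity date, then state the rate. Finish with a "
        "line of the form 'ANSWER: <rate>'.",
    ])


def extract_answer(response: str) -> Optional[str]:
    """Return the text after the last 'ANSWER:' line, if there is one."""
    for line in reversed(response.splitlines()):
        stripped = line.strip()
        if stripped.upper().startswith("ANSWER:"):
            return stripped.split(":", 1)[1].strip()
    return None


prompt = build_pricing_prompt(
    validity_date="2023-01-01",
    table_summary="Per-service rates in USD.",
    table_text="Service | Rate\nCleaning | 60\nRefuelling | 45",
    question="What is the rate for cleaning services?",
)

# Hand-written stand-in for a model response, used only to show parsing.
sample_response = "The Cleaning row lists a rate of 60.\nANSWER: $60"
rate = extract_answer(sample_response)
```

Asking for the final answer on a clearly marked line keeps the reasoning visible for review while making the rate itself easy to parse.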
Applying contract stipulations to service pricing
Having obtained the most up-to-date pricing for a given service, we semantically scanned the transcribed document for contract stipulations, including any conditions or clauses potentially impacting pricing. This process selects statements from the contracts most semantically akin to a set of examples provided to our scanner. Leveraging LLMs once again, we assess the relevance of these statements to the service in question and adjust the price accordingly.
Consider this example.
If there is a cancellation for cleaning services, the scanner would search for the most semantically similar statement in the contract to examples of cancellation clauses provided. Suppose it finds the following statement:
"Should the cleaning service be cancelled within 24 hours of the scheduled appointment time, the fee for the service shall be subject to doubling."
The LLM would then apply this statement to the most recent rate for cleaning services. For instance, if the regular rate for cleaning services were $60, the rate would be updated to $120.
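The example above can be sketched in code. This is a simplified illustration under stated assumptions: word-count cosine similarity stands in for whatever semantic similarity measure the scanner uses, a keyword check stands in for the LLM's interpretation of the clause, and all function names and sample sentences other than the quoted clause are hypothetical.

```python
import math
import re
from collections import Counter


def vectorise(text: str) -> Counter:
    """Bag-of-words vector: a crude stand-in for a semantic embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def most_similar(statements: list[str], examples: list[str]) -> str:
    """Pick the contract statement closest to any example clause."""
    example_vectors = [vectorise(e) for e in examples]

    def score(statement: str) -> float:
        vector = vectorise(statement)
        return max(cosine(vector, e) for e in example_vectors)

    return max(statements, key=score)


def interpret_multiplier(clause: str) -> float:
    """Keyword stand-in for the LLM step that interprets the clause."""
    return 2.0 if "doubl" in clause.lower() else 1.0


contract_statements = [
    "Invoices are payable within 30 days of receipt.",
    "Should the cleaning service be cancelled within 24 hours of the "
    "scheduled appointment time, the fee for the service shall be "
    "subject to doubling.",
    "Refuelling is charged per litre delivered.",
]
cancellation_examples = [
    "If the service is cancelled within 48 hours, a cancellation fee applies.",
    "Cancelled services will be charged at an increased rate.",
]

clause = most_similar(contract_statements, cancellation_examples)
regular_rate = 60.0
adjusted_rate = regular_rate * interpret_multiplier(clause)
print(clause)
print(adjusted_rate)
```

In practice the keyword check would be far too brittle for real contracts, which is why the interpretation step is handed to an LLM.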
Putting it all together
Comparing QuantSpark’s generated invoices with those issued historically revealed discrepancies, which we flagged and categorised by potential driver for our client to review. These drivers included operational considerations, such as missing contractual documents that would justify price increases seen in some historically issued invoices.
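A minimal sketch of this comparison step is shown below. The invoice identifiers, amounts, tolerance, and driver labels are all invented for illustration. The real categorisation of drivers was more detailed than this.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Discrepancy:
    invoice_id: str
    expected: Decimal  # amount implied by the contract terms
    issued: Decimal    # amount on the historically issued invoice
    driver: str        # illustrative label for the likely cause


def reconcile(expected: dict[str, Decimal], issued: dict[str, Decimal],
              tolerance: Decimal = Decimal("0.01")) -> list[Discrepancy]:
    """Flag invoices whose issued amount differs from the expected one."""
    findings = []
    for invoice_id, expected_amount in expected.items():
        issued_amount = issued.get(invoice_id)
        if issued_amount is None:
            findings.append(Discrepancy(
                invoice_id, expected_amount, Decimal("0"), "not invoiced"))
        elif expected_amount - issued_amount > tolerance:
            findings.append(Discrepancy(
                invoice_id, expected_amount, issued_amount, "under-billed"))
        elif issued_amount - expected_amount > tolerance:
            # May point to a price increase with no contract on file.
            findings.append(Discrepancy(
                invoice_id, expected_amount, issued_amount,
                "above contract rate"))
    return findings


# Invented sample data.
expected_invoices = {
    "INV-001": Decimal("120.00"),
    "INV-002": Decimal("60.00"),
    "INV-003": Decimal("75.00"),
}
issued_invoices = {
    "INV-001": Decimal("60.00"),
    "INV-002": Decimal("60.00"),
    "INV-003": Decimal("90.00"),
}

findings = reconcile(expected_invoices, issued_invoices)
under_billed_total = sum(
    (f.expected - f.issued for f in findings if f.driver == "under-billed"),
    Decimal("0"),
)
```

Using `Decimal` rather than floating-point numbers avoids rounding artefacts when comparing monetary amounts.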
Business impact
- Streamlined Process: Our solution streamlined the invoicing process by automating the labour-intensive task of manually inputting contract billing logic into our client’s billing system. By utilising OCR tools to transcribe scanned documents, we minimised manual effort and reduced time-consuming tasks. This enhancement significantly boosted operational efficiency and reduced the risk of human errors associated with manual data entry.
- Improved Accuracy and Billing Efficiency: Our utilisation of Large Language Models (LLMs) to read and understand niche contract details for invoice generation contributed to improving billing accuracy. By extracting relevant contract data and applying it to historic service data, our proof of concept solution is poised to actively contribute to enhancing billing accuracy prior to invoices being sent out. This mitigates the risk of millions of dollars in revenue loss due to under-billing and therefore reinforces our client's financial stability by ensuring consistent and accurate invoicing practices.
- Increased Scalability and Resilience: The implementation of our solution not only addressed existing data quality challenges but also positioned our client for greater scalability and resilience in managing contractual complexities. By automating processes traditionally reliant on manual intervention, we can facilitate quicker decision-making, ultimately contributing to improved business agility and competitiveness.