Our modern lives are deeply intertwined with digital technologies. Our data is instantly available on our devices and in the cloud. But when it comes to our business finances, we still have to deal with paper invoices and receipts. Consequently, we have to input the data manually, a major pain point of running your own business, as it takes a lot of time, effort and a decent understanding of accounting to do well.
Money In and Money Out are designed to minimize that pain by making intelligent suggestions based on the user's previous activity. When adding a new transaction, a suggestion is offered after just a couple of letters are entered into the Description field. If the suggestion is used, the other fields autocomplete, again based on the user's previous activity.
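To make the idea concrete, here is a minimal sketch of prefix-based suggestions. It is an illustration under assumptions, not the apps' actual implementation: the `Transaction` fields, the `suggest` function and the two-letter threshold are all hypothetical, and the sample history is invented.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    # Hypothetical fields; the real apps may store different data.
    description: str
    category: str
    amount: float
    currency: str


def suggest(history, typed, min_chars=2):
    """Return the most frequent past transaction whose description
    starts with the typed text, or None if there is no suggestion."""
    prefix = typed.strip().lower()
    if len(prefix) < min_chars:
        return None  # wait for "a couple of letters" before suggesting
    matches = [t for t in history if t.description.lower().startswith(prefix)]
    if not matches:
        return None
    # Pick the transaction the user has entered most often.
    best, _count = Counter(matches).most_common(1)[0]
    return best


# Invented sample history for illustration only.
history = [
    Transaction("Coffee beans", "Supplies", 12.50, "EUR"),
    Transaction("Courier fee", "Shipping", 8.00, "EUR"),
    Transaction("Coffee beans", "Supplies", 12.50, "EUR"),
]

suggestion = suggest(history, "co")
if suggestion is not None:
    # The remaining fields can be pre-filled from the suggestion.
    print(suggestion.description, suggestion.category, suggestion.amount, suggestion.currency)
```

Ranking by frequency is only one possible choice; a real system could just as well weight recent transactions more heavily.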
"In the time that it took for my credit card charge to go through, I had already added it!"
Via: Siegfried Grimbeek
What is OCR?
Optical Character Recognition (OCR) is a technology that extracts text from images, such as photographs, scanned documents or books. In many scenarios, it can be used to automate work that previously had to be done manually.
Why we didn't use OCR
Well, creating an intelligent system that could accurately and quickly predict the transaction the user was trying to add was not our first solution; that would (and did) take ages to implement well. We had originally planned to use OCR to capture the information on the receipt automatically, so all the user had to do was take a photo of the receipt.
Variations and lack of standards
OCR works best for reading simple text documents with very standardised layouts or short strings of text. We quickly realised that invoices and receipts do not have a standard format and that the information they contain is inconsistent. After collecting examples from around the world, we tried to identify common patterns that we could potentially use to recognize the information needed. Unfortunately, the level of variation was incredible, which made recognition slower and less accurate.
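A small sketch shows why pattern matching struggles with this kind of variation. The pattern and the sample strings below are invented for illustration; they are not taken from the receipts we collected, and `extract_total` is a hypothetical helper.

```python
import re

# Hypothetical pattern: the word "total", an optional colon, an optional
# currency symbol, then an amount written with a decimal point (e.g. 12.50).
TOTAL_PATTERN = re.compile(r"\btotal\b\s*:?\s*[$€£]?\s*(\d+\.\d{2})", re.IGNORECASE)


def extract_total(text):
    """Return the total as a float, or None if the pattern does not match."""
    match = TOTAL_PATTERN.search(text)
    return float(match.group(1)) if match else None


# Invented examples of how the same information can be printed.
samples = [
    "TOTAL: $12.50",     # matches
    "Total 12.50",       # matches
    "Summe 12,50 EUR",   # different language and decimal comma: no match
    "AMOUNT DUE 12.50",  # different wording: no match
]

for line in samples:
    print(line, "->", extract_total(line))
```

Every new wording, language or number format needs another pattern, and each extra pattern adds processing time and room for false matches.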
Speed & accuracy
There are several solutions available on the market for optical recognition, such as Tesseract (open source) or ABBYY Cloud OCR (proprietary). ABBYY Cloud OCR interested us, as it has a rich selection of options that can even recognize the breakdown of purchased goods by following the layout of the receipt, along with a wide range of supported languages. It does so slowly and expensively, though, and with varying accuracy.
While you can work around slow recognition by pushing it into the background (which isn't an ideal experience), accuracy is paramount when working with financial data. However, none of the receipts we tested could be processed accurately enough, or in less time than it would have taken to enter the data manually.
Decent OCR is expensive, which would mean increasing the price of the apps or charging an ongoing subscription in order to keep developing and improving them sustainably.
Language & region
While different countries and regions have different regulations on what information must be included on invoices and receipts, these do not seem to make the information or the layout any more consistent. Different languages complicate this much further, as translating on the fly would be needed, which makes processing slower and means relying on machine translation, like Google Translate, which would reduce the accuracy of the data being captured. The consequence was false results, which require heavy editing by the user.
Surprisingly, the currency of the transaction was not included on many of our example receipts, so it can't be captured consistently. Our apps support multi-currency and automatic conversion, to reduce user effort and increase accuracy. To work around this, we could have used the 'last used currency' or automatically switched based on the user's current location, but neither produces reliable enough results.
Privacy & security
The financial data and transactions of a business are important private data. To increase speed, many OCR solutions rely on uploading the image to the provider's cloud server for processing, with the resulting data transmitted back. Such dependencies are risky, as they mean relying on the provider's security, both now and in the future. The potential risks to data privacy were too high for consideration.
Will we use OCR in the future for Money In and Money Out?
Not for now, and not in its current form. There are too many pain points in capturing reliable data from invoices and receipts. It would take significant improvements to the technology, or enforced global standards for invoices and receipts, for that to change.
While it would have been easier just to accept the limitations of OCR and hope it improved at some point, rejecting it gave us the opportunity to explore different solutions. After much experimentation, we found a mix of basic Machine Learning and AI to be much faster and more accurate, and it learns from the user's activity. While this took much longer to build, it means fewer of the dependency, cost and security issues that come with third-party solutions, and allows for a much better experience.