I was contacted by a startup based in Madrid that owns an APP to save on our bills effortlessly. The idea is simple and effective, users upload their bills and after they provide suggestions and help to find better suppliers and save money. I help them to acquire more customers without the need of growing their team building a webapp that automatize the recognition of the bills no matter the format that customers upload it.
Bizen did not have technical team so I build a serverless architecture around AWS Lambda functions. A trigger starts the function every time a bill was uploaded to a s3 bucket. The function extract all the text from the bills, calling Amazon Textract, if necessary, and then a system of templates build upon regex expressions (no pain no gain) structure the information and find the necessary fields for injecting Bizen databases.
It’s unbelievable the variety of fields and units that billing system has, most of the time in this project was adjusting the templates to their particularities. It’s clearly necessary a standardisation in this point.
Technology stack Python + AWS Lambda + regeX + AWS textract