In the run-up to the first MTD deadline, we’re launching our Making Tax Beautiful series – a fortnightly look into the changing world of VAT from our Director of Product Strategy, Russell Gammon. This week, Russell takes a look at Machine Learning and how one of 2018’s most popular buzzwords will work hand-in-hand with HMRC’s digital journey.
I am doomed! As a qualified accountant, if I enter my job into the expertly named “Will a robot take my job?” website, I get told there’s a 94% chance I’m soon to be automated. Well, I’d better bump up those pension payments whilst I still can.
But it’s OK. I’ve found a new career. What I can do is professionally tour AI conferences. If I google “AI conferences London”, I’m hit with over 20 events between now and the end of 2018 (it’s nearly October…). I just need to find somebody to pay for it (perhaps a robot?).
Of course, I jest. I’ve read lots of great articles on how accountants won’t be entirely removed from the equation, that they’ll use automation and robotics to remove all manual, trivial steps in a process and take time doing more “value add” activities. This makes sense and is something that’ll hopefully tide me over until the aforementioned pension kicks in.
What even IS Machine Learning?
The finalists for 2018 “Corporate buzzword of the year” have to be AI, machine learning, blockchain and RPA (Robotic Process Automation). All interesting topics that have filled up my LinkedIn Newsfeed for a while now. In the context of a Tax function, Machine Learning at its most basic is the ability for machines to learn from analysing data and improve their analysis over time.
As for the difference between ML and AI, it’s a little more nuanced but comes down to “what do we consider to be smart?”. Rather than explain it myself, I’ll point you to this rather excellent article from Forbes, which explains it nicely.
But WHY is it important?
Time for a little story. As a “wet behind the ears” graduate in a Big 4 firm, you’ll be exposed to all manner of manual tasks. Whilst I very much enjoyed my time doing my ACA qualification, I was no exception and spent many a day formatting spreadsheets, producing scripts for software and carrying out checks on iXBRL files (what a fun notice period it was!).
However, my most dreaded of tasks was the ten months where I spent around 6 days/month doing the VAT return for a travel agency. The process was, end-to-end, entirely manual. Run reports A, B, C and D, reformat them in this way, copy/paste those results into another workbook…
One of the largest time-sinks was interrogating the Accounts Payable data. There were all manner of checks that I had to carry out, step by step, manually in Excel. The quality of the data simply meant that it needed a reasonably fine-toothed comb approach to it.
So, when it comes back to “replacing accountants with robots”, to me, this is exactly the type of task which lends itself perfectly to automation.
So, what are we doing about it?
The way we are structuring our for:sight platform is entirely agnostic to the type of data it is looking at. Generally, a process will start with importing data from Excel. Alternatively, it can import data directly from a cloud-based source such as Sage, SAP or Microsoft Dynamics; we really don’t mind.
Once we have all the data in the system, we then identify which parts of it matter. HMRC only require you to hold the date/time, amount and rate of supply. That is clearly very useful information, but we expand on it where required: country of supply, VAT charged (and/or VAT code), currency and invoice text, for example, are all useful bits of information.
Once that data is in, perhaps translated (FX, anybody?), perhaps merged (multiple source systems, anybody?), it’s time to analyse it. This is where Machine Learning can significantly shorten the work of the poor ACA-studying junior who dreads the monthly (or quarterly) days of trawling through Excel.
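To make the "translated" and "merged" steps a little more concrete, here is a minimal Python sketch. It is an illustration only, not the for:sight implementation: the `Line` record, its field names, the `to_gbp` and `merge` helpers and the FX rates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Iterable, List


# Hypothetical record shape; the field names are illustrative only.
@dataclass(frozen=True)
class Line:
    source: str         # which system the row came from
    invoice_date: date  # date of supply
    country: str        # country of supply
    vat_code: str       # e.g. "T1"
    currency: str       # ISO currency code of the invoice
    net: Decimal        # net amount in the invoice currency
    vat_rate: Decimal   # e.g. Decimal("0.20")
    text: str           # free-text invoice description


# Made-up FX rates into GBP, purely for illustration.
FX_TO_GBP = {"GBP": Decimal("1"), "EUR": Decimal("0.90"), "USD": Decimal("0.75")}


def to_gbp(line: Line) -> Decimal:
    """Translate the net amount into GBP, rounded to pence."""
    return (line.net * FX_TO_GBP[line.currency]).quantize(Decimal("0.01"))


def merge(*sources: Iterable[Line]) -> List[Line]:
    """Combine rows from several source systems into one date-ordered list."""
    return sorted((l for src in sources for l in src), key=lambda l: l.invoice_date)
```

Using `Decimal` rather than floats keeps the pence exact, which matters once the figures end up on a VAT return.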
What does it do?
So, let’s get into the detail. Firstly, and importantly, we’re using standard, well-known, readily available ML libraries.
We’re configuring the engine to understand that each column has potentially useful data in it, and to assign a weighting to those columns. It will then look for inconsistencies in the data, particularly (but not exclusively) looking at the VAT rate being used. It will then learn over time and adjust the weightings, based on the decisions made by users.
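For the curious, the idea of weighted columns that learn from user decisions can be sketched in a few lines of Python. This is a deliberately simple toy using only the standard library, not the ML libraries we actually use, and everything in it is an assumption made for illustration: the column names, the `RateChecker` class, the voting scheme and the `step` parameter.

```python
from collections import Counter, defaultdict

COLUMNS = ("country", "vat_code", "supplier")  # hypothetical column names


class RateChecker:
    """Toy consistency check: each column votes for the VAT rate it usually sees."""

    def __init__(self, rows):
        self.weights = {c: 1.0 for c in COLUMNS}
        # counts[column][value] is a Counter of the rates seen alongside that value
        self.counts = {c: defaultdict(Counter) for c in COLUMNS}
        for row in rows:
            self._record(row)

    def _record(self, row):
        for c in COLUMNS:
            self.counts[c][row[c]][row["rate"]] += 1

    def _others(self, column, row):
        # Evidence from the other rows sharing this value; the row's own count is left out.
        return self.counts[column][row[column]] - Counter({row["rate"]: 1})

    def expected_rate(self, row):
        """Weighted vote across columns; None if no column has any evidence."""
        scores = Counter()
        for c in COLUMNS:
            seen = self._others(c, row)
            total = sum(seen.values())
            for rate, n in seen.items():
                scores[rate] += self.weights[c] * n / total
        return scores.most_common(1)[0][0] if scores else None

    def is_flagged(self, row):
        expected = self.expected_rate(row)
        return expected is not None and expected != row["rate"]

    def confirm_correct(self, row, step=0.5):
        """A user says a flagged row is right: shift the weights and keep the evidence."""
        for c in COLUMNS:
            seen = self._others(c, row)
            if seen and seen.most_common(1)[0][0] != row["rate"]:
                self.weights[c] *= 1 - step  # this column pointed the wrong way
        self._record(row)  # count the confirmation as extra evidence
```

Each row is a plain dict with the three columns plus a `rate` key, and the sketch assumes the rows being checked are the same ones the checker was built from. A row is flagged when the weighted vote of the other rows disagrees with its own rate. Confirming a flagged row reduces the weight of the columns that disagreed and records the decision, so similar rows are less likely to be flagged next time.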
To show you what I mean, let’s take the following example of some AP data:
We think we’d flag 6 of the transactions. Rationale as follows:
1 – Wait a minute, this date is from the previous quarter! All other dates appear to be in Q3, so we should flag. However, this isn’t actually picked up by ML, it’s picked up by some standard checks that we’ve built in.
2 – It certainly looks like UK and T1 both strongly indicate that the VAT rate should be 20%. So, when it’s not 20%, it’s going to flag it.
3 – UK is a strong indicator of a 20% rate, and ABC Co also looks reasonably like it would be applying VAT at 20%. So here we would flag this in the first instance.
4 – Again, it looks like UK is a strong indicator of 20%, and we have limited information about T5 or “Zebra Co”, so we would likely flag this initially. However (for example) we know that T5 actually does indicate 5%, so the user would confirm it as correct. We’d then expect that over time the ML algorithm would perhaps learn and stop flagging this type of transaction.
5 – We have other rows where invoices with the phrase “client entertaining” have been rated at 0%, whereas this one has been rated at 20%. We would, therefore, flag it.
6 – A final, obvious one. China and T0 both strongly indicate 0%, and therefore this gets flagged (as it’s wrong!).
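The first item in the list is caught by a standard check rather than by ML, and that kind of rule is easy to sketch in Python. The version below assumes calendar quarters and treats the most common quarter in the batch as the return period; both are simplifications (real VAT periods can be staggered), and the function names are hypothetical.

```python
from collections import Counter
from datetime import date


def quarter_of(d: date) -> tuple:
    """Calendar quarter as (year, quarter number). Assumes calendar quarters."""
    return (d.year, (d.month - 1) // 3 + 1)


def out_of_period(dates):
    """Return the indexes of dates that fall outside the batch's most common quarter."""
    quarters = [quarter_of(d) for d in dates]
    if not quarters:
        return []
    usual, _ = Counter(quarters).most_common(1)[0]
    return [i for i, q in enumerate(quarters) if q != usual]
```

In practice you would more likely pass the return period in explicitly, but inferring it from the data mirrors the "all other dates appear to be in Q3" reasoning above.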
So, what does this mean?
Firstly, it takes a chunk of time out of the process. Not only does it reduce the number of hours involved, but the hours it saves are the mind-numbing hours that make bright people want to change jobs.
Secondly, it spots patterns that probably haven’t been spotted before. By interrogating every single line-item, my bet is that it’ll flag something that hasn’t been flagged before. This could result in tangible extra VAT being recovered.
We’re not promising to find everything that is wrong with the data, or automagically fix everything for you. We are promising to spot patterns in data and flag things to users that look odd, in the hope of speeding up, and improving the quality of, your data cleansing process. We’re Making Tax Smart.