On the front page of the Daily Telegraph for Friday 3 November 2023 the headline reads: "Musk tells Sunak AI will end work". Tucked away in the Business section is another headline: "AI generated false 'evidence' of accounting scandals". This piece, by Adam Mawardi, reports how James Guthrie, an emeritus professor at Macquarie University, submitted evidence to an Australian parliamentary inquiry that contained false claims about the involvement of KPMG and Deloitte in financial scandals, including one scandal that never actually occurred: a wholly fictional "event". Although Guthrie has acknowledged responsibility, the fictions themselves were inserted by AI.
Mawardi's article focuses on the submission, the apology Professor Guthrie has issued, and some responses from KPMG and Deloitte. I suggest that the wider conclusions we can draw from the affair are likely to affect us all (yes, I know: drawing conclusions from a sample of one, and so on).
Professor Guthrie has spent 35 years in accounting education. He is the editor of a highly regarded accounting journal and is described as "providing thought leadership to benefit the wider accounting profession".
It is reasonable to suppose that an academic submission to the Australian parliament is not something one can buy off the shelf at Tesco or order online from Amazon. By implication, a modicum of artisanal accounting and writing effort should be involved. Indeed, the Australian parliament might well have expected it.
Guthrie used Google's Bard chatbot for research (it is perhaps ironic that Google should pick a word associated with Shakespeare, a writer of fiction, but I digress), and apparently he did not realise that "AI can generate authoritative-sounding output that can be incorrect, incomplete or biased".
Is it unreasonable, given Guthrie's experience and profile, to suggest that any such incorrect output ought to have been picked up during editing and review, long before submission? Further, given his undoubted expertise, should a major fictional accounting scandal not have rung alarm bells? As in: "Gosh, how did I miss that when it happened?" It is, after all, his specialist area.
Now, the above relates to what ought to have been an expertly hand-crafted, one-off item with some AI input. The implications for what will almost certainly occur in less hand-finished, mass-user situations are disturbing. We already know that large organisations in trouble tend to devote more time, effort and money to managing the reputational fallout of a problem than to resolving it (the Post Office/Horizon affair is one example, the UK infected blood scandal another). Another tendency is that, even without AI, issues rooted in IT can go unnoticed for so long that resolution becomes impractical, if not impossible. Witness the problem at Nottingham University Hospitals Trust, where 400,000 patient letters were generated but never sent. The issue dates back as far as 2000, but the majority of unsent letters began to pile up from 2008, with up to 45,000 documents a year going unsent by 2014. Apparently letters requiring sign-off were placed into a folder few staff knew existed. The trust stated that no patient harm had occurred, but that seems very, very unlikely. What also seems unlikely is that over a period of 18 years no one investigated, reported or attempted to resolve the problem.
Amongst our friends, a topic that has arisen repeatedly and insistently at social events over the last decade is the increasing difficulty of sorting out problems with large organisations that deal with individual customers, particularly utility providers. All our friends and acquaintances agree that, for the most part, when things are running smoothly goods and services are provided more effectively and efficiently than they were in the past. But we also all agree that problem resolution has become a nightmare: multiple lengthy phone calls (if a telephone call is even an option); suggestions that most issues can be resolved on the organisation's website (they almost never can); no ownership of the problem by an individual who might actually speed resolution; and so on. The list goes on, but the outcome is the same in every case: the customer has to put in far more time and effort than they should just to get back to a fair and equitable position. A cynic might suspect that organisations have deliberately engineered customer service to perform so badly that customers with minor issues will simply give up and go away.
As off-the-shelf AI is deployed in customer service operations, the situation will get worse. Just as Guthrie's research threw up fictional events, corporate AI will draw erroneous conclusions and inferences. The difficulties we experience now will multiply as AI bots are integrated into process flows, and any human we do manage to interact with will be looking at a screen of information telling them all is well and everything that could be done has been done. That information will be believed. The problem is that in some cases, probably a small minority, it will be misleading and inaccurate. For the customers affected the end result will be frustrating and time-consuming, not to mention especially stressful for the vulnerable. Worse, there will be no management reporting to highlight the issue, and thus there will be no issue.
The reality for all of us is that the AI cat is already well and truly out of the bag. Anyone who believes that the 2023 AI Safety Summit will help to establish controls that benefit the population at large is deluded. Indeed, when I asked a well-known AI engine whether AI could be effectively controlled, it responded: "It's important to note that complete and absolute control over AI may be difficult to achieve, especially in the long term. AI technologies can have unintended consequences, and malicious actors may attempt to use AI for harmful purposes." Indeed.
I conclude that Musk is wrong: AI will not end work. It will generate vast amounts of it for consumers, most of it unpaid, deeply unsatisfying and, perhaps most troubling of all, completely beneath any kind of managerial or governmental radar.