08 Feb

Overcoming the Algorithmic Trust Paradox
I would be angry too. You’re sitting there, watching the halftime show, and you see a commercial for an AI breakthrough. The TV cuts to the broadcast booth, where a few former-players-turned-announcers work through three or four examples of how “such and such” AI identified the three most significant plays of the first half. Using visible markup tools and camera angle changes, all purportedly supplied by the artificially intelligent system, the criticality of those plays is brought to life in glaring detail as the ESPN “probability of victory” widget calculates the odds in “real time.” You have to feel for the military operator or intelligence community analyst who has their head in their hands at this moment. Even trying to watch the game for relaxation brings them crashing back to a major complaint from their work life: “How the hell can AI work like this, bringing patterns of life to the surface for the NFL, when I spend my days wishing for an AI solution to help do essentially the same thing?” This is a typical grievance from many users of AI. Government personnel, in particular, just can’t seem to understand WHY automation and AI, in their many forms, seem to show love to everyone but them. The problem is in no way unique to government personnel, but those are the people I spend most of my time helping, so maybe it just feels that way to me. I know that across every industry, whenever anyone tries to break outside the realm of the small-scale, tried-and-true approach we saw in a demo to something that will actually solve a business problem, we always seem to find an “edge case” to send us back to the training table, if not back to the drawing board. We need to be more educated when looking with wonder at the marvels of AI, and realistic about what solutions AI can bring and, more importantly, how it brings them.
The most shining examples of AI success in the business world deal with structured data, meaning data that can (for the most part) be placed into the rows and columns of a spreadsheet. Humanity is drowning in data, and while we’re finding newer and better ways of collecting ever more of it on what seems like a weekly basis, we are grossly behind in developing tools to deal with the datasets we’ve created. As an example of AI evolution, many years ago a credit card transaction log might have a couple of If-This-Then-That (IFTTT) statements attached: if the owner lived in zip code 12345 and any transaction occurred outside that zip code, the transaction would be denied, unless the cardholder had previously cleared their travel plans with their bank, earning a temporary reprieve from the IFTTT rules established for their protection. Now, with e-commerce, easy travel, and a credit-card-friendly economy making location restrictions impossible, financial institutions cannot rely on those same methods of customer fraud protection. Today, banks and other financial institutions use big data analysis and AI to help detect fraud in near-real time. They do this by using personal transaction data to determine a pattern for how each cardholder uses their card. There are too many cards creating too many transactions to examine each one individually; the old options simply don’t scale to this volume, and an “every transaction” model requires more computing power than is profitable for the credit card company. The solution is to create a profile that is based on, but doesn’t require the memory of, every individual transaction, and to apply it to incoming data points to catch anything falling outside the model.
If it’s outside by a statistically significant degree, the fraud protections of that company kick in. This produces a win-win for all parties involved. AI functionalities like “sort” and “sift” allow the credit card issuer to process a fraction of the data being produced, saving money. The customers can self-report the transactions that the algorithms don’t catch, making up for the false negatives that slip by the fraud protection AI tools, saving time. There is also a third win where the banks can partner with retailers and marketers to offer advertising to their customers based on their collective AI profiles and sell market trend analysis as an additional income source, but that has nothing to do with the utility of AI, just a secondary monetization opportunity.
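The shift described above, from hard-coded IFTTT rules to a statistical profile applied to incoming transactions, can be sketched in a few lines. This is a minimal illustration, not any bank’s actual method; the zip-code rule, the toy transaction amounts, and the three-standard-deviation threshold are all assumptions made for the example.

```python
from statistics import mean, stdev

# Old approach: a hard-coded IFTTT rule (illustrative only).
def rule_based_flag(txn, home_zip="12345", travel_cleared=False):
    """Deny any transaction outside the cardholder's home zip code."""
    return txn["zip"] != home_zip and not travel_cleared

# Newer approach: a per-cardholder profile summarizing past behavior,
# so no system ever has to re-read every historical transaction.
def build_profile(history):
    amounts = [t["amount"] for t in history]
    return {"mean": mean(amounts), "stdev": stdev(amounts)}

def profile_based_flag(txn, profile, z_threshold=3.0):
    """Flag a transaction whose amount deviates from the profile by a
    statistically significant degree (here, > 3 standard deviations)."""
    z = (txn["amount"] - profile["mean"]) / profile["stdev"]
    return abs(z) > z_threshold

history = [{"amount": a} for a in (40, 55, 38, 60, 45, 52, 47, 58)]
profile = build_profile(history)
print(profile_based_flag({"amount": 50}, profile))    # a typical purchase
print(profile_based_flag({"amount": 2500}, profile))  # a statistical outlier
```

The point of the profile is exactly the scaling argument above: the bank stores two summary numbers per cardholder instead of every transaction, and each new data point is checked against the model in constant time.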
The stakes are high all around. Nobody wants to accept an 80% solution rate, least of all military decision makers, whose decision-making outcomes carry uniquely high stakes. But most sports teams or banks will not make a decision on an 80% solution, either. The commercial landscape, including the NFL, does offer an advantage over the government: profit margins that allow for experimentation in the AI space with a “cover your losses” mentality that government agencies cannot afford. Take, for example, the algorithms mentioned earlier. The NFL and ESPN can afford a “halftime show” algorithm that isn’t exactly what it was portrayed to be, because the ROI is good and, in the end, it doesn’t matter. Same with the banks. They are able to afford their customers a very good level of automated fraud protection, say 89% just to pick a number, knowing that their customers will call to dispute any charges missed by their AI. For the bank, the development and infrastructure necessary to grow that capability by another 5% just isn’t affordable. In the NFL example, the AI showcasing the important first-half plays in all actuality likely returned 6-10 candidate “big plays,” and a human narrowed them down to the three shown during halftime. To highlight those plays, the AI certainly made the “tracks” the players ran visible, but humans worked to clean up the false positives and false negatives before airing live on TV. Finally, different renderings were made using AI-aided graphics, but humans were there making all of the important decisions wherever the AI ran into a wall with the halftime deadline looming. The report could not have been created without AI, but the marketing associated with delivering that information is where we really need to be vigilant as customers in the AI marketplace.
Just because something is made possible by AI, that doesn’t mean the AI did it. It’s simply “brought to you by… AI.” The many people involved in the chain of events needed for AI to bring it to you go unmentioned. Arguments may certainly be made that some other important play of the first half, one that wasn’t mentioned, proves the AI isn’t perfect, but it really doesn’t matter. The AI worked its magic to return options humans could use and massage for broadcast to enrich the fan viewing experience. Everybody wins. Where we need to stay aware is the marketing and messaging that make the government analyst watching the game at home feel like the world of AI is passing them by. Obviously, the government can’t take an operational risk on an 80% solution, and there just aren’t enough people working in the Intelligence Community to run the NFL/ESPN model and absorb the human costs of AI correction on a truncated timeline.
There are several examples of human-machine “super systems” that actually only require a human to babysit an algorithm. A potential problem with such systems is that they tend to leave the human disengaged, confused, and unprepared to retake control when the AI hands it back. There are also examples of AI working with engaged humans but rendering mistaken answers with no supporting background information. Systems like these tend to breed mistrust between operator and machine. For the government to ever fully adopt AI at a higher-than-amusing level of importance, it is going to need answers it can see and verify, not answers handed down by fiat that invite more questions than they resolve. Feedback-loop and confidence-based systems are capable of building human trust in machines, but they must be built the correct way, using the right data and the right methodology for the task at hand.
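One simple form a confidence-based, human-in-the-loop system can take is a routing rule: the algorithm acts alone only when its confidence clears a threshold, and everything else goes to a human reviewer together with the evidence behind the answer, so the human stays engaged rather than babysitting. This is a hypothetical sketch; the threshold value, the detection label, and the record fields are all assumptions made for illustration.

```python
# Illustrative human-in-the-loop routing. Confidence is assumed to be a
# probability in [0, 1] produced by some upstream model (not shown).
def route(detection, confidence, threshold=0.90):
    """Auto-accept high-confidence answers; send the rest, with their
    supporting evidence, to a human reviewer."""
    if confidence >= threshold:
        return {"decision": detection, "path": "automated",
                "confidence": confidence}
    return {"decision": None, "path": "human_review",
            "confidence": confidence, "evidence": detection}

print(route("vehicle_track_present", 0.97))  # cleared for automation
print(route("vehicle_track_present", 0.62))  # escalated to a human
```

The design choice worth noting is that the low-confidence branch returns the evidence rather than a bare verdict, which is exactly the “answers they can see and verify” property described above.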
Military and government personnel are familiar with the good old-fashioned OODA loop. John Boyd’s Observe, Orient, Decide, and Act method gives a framework for cycling through operational problems. For the sake of my government brothers and sisters, I’m definitely not going, nor am I particularly qualified, to write a drawn-out example of how OODA loop techniques solve everything from ingrown toenails to cancer. I mention it only because when it comes to AI, shiny object syndrome takes over for everyone (from AI providers to AI consumers) and impatience reigns. With AI, in particular, there is no patience for a process-based approach. “We just want it to fire!” “How long will this thing take to render a response?” “We built this to track a car; see how well it does in this one particular demo example?” However, the importance of correctness in AI development cannot be overstated. Only by labeling the right dataset in the right manner can a library of ground truth truly be developed and refined. During the development of these libraries, appropriate subject matter experts (SMEs) need to be engaged early and often to counteract assumptions applied to the data by technologists and data scientists. Kept free of bias, such a library can be used to train and develop a host of algorithms with a methodology that not only applies to relevant data, but scales.
All entities looking for high-confidence AI must have the original data labeled in a way both computers and humans can understand, so that both SMEs and computers can take advantage of it. In such an environment, the possibilities explode into light. “If I change something here, what will happen there?” “What happens if I make two libraries of ground truth? Can I test their resultant algorithms in different theaters to see which one is best?” “What if I version control my ground truth? Will that lead me to version-controlled algorithms?” We’re heading toward this future, and it is very exciting. But as we go, both government and industry need to employ some patience while stumbling toward high-trust AI, and that patience is not what’s being sold in Super Bowl commercials.
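The version-control questions above can be made concrete with a small sketch: give every ground-truth library a stable version ID derived from its contents, so each trained algorithm can be traced back to the exact labeled data that produced it. This is an illustrative assumption of how such versioning might work, not a description of any particular product; the clip names and labels are invented.

```python
import hashlib
import json

def ground_truth_version(labeled_examples):
    """Return a stable, content-derived version ID for a list of
    labeled examples, independent of the order they are stored in."""
    canonical = json.dumps(sorted(labeled_examples, key=json.dumps),
                           sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]

# Two hypothetical libraries of ground truth for a "big play" detector.
library_a = [{"clip": "play_017.mp4", "label": "big_play"},
             {"clip": "play_042.mp4", "label": "routine"}]
library_b = library_a + [{"clip": "play_099.mp4", "label": "big_play"}]

# Distinct libraries yield distinct, comparable version IDs, so the
# algorithms trained on each can be tested head to head and every model
# artifact can record which ground-truth version it came from.
print(ground_truth_version(library_a))
print(ground_truth_version(library_b))
```

Because the ID depends only on the labeled content, relabeling a single example produces a new version, which is exactly the traceability that turns version-controlled ground truth into version-controlled algorithms.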
Gabe Harris is the VP of Product Development at Orions Systems, a pioneer in the development of smart vision systems for government, sports, law enforcement, and anyone attempting to use unstructured data as a first-class data type. For further discussion visit us at http://www.orionssystems.com.