Originally published in Forbes
While the generative AI industry rides high on an audacious buildup, its promised returns are still mostly speculative. GenAI’s reliability has become the perennial elephant in the room; large language models sometimes get it wrong. For example, while almost three-quarters of lawyers plan to use genAI for their work, their AI tools hallucinate at least one-sixth of the time. If you don’t already know the answer, you won’t know when genAI leads you astray. So how can you trust it – and how can enterprises deploy it?
Predictive AI has the potential to do what might otherwise be impossible: realize genAI’s bold, ambitious promise of autonomy – or at least a great deal of that often overzealous promise. By predicting which cases require a human in the loop, predictive AI can lend an otherwise unusable genAI system the trust needed to unleash it broadly.
For example, consider a question-answering system based on genAI. Such systems can be quite reliable if only meant to answer questions pertaining to several pages’ worth of knowledge, but performance comes into question for more ambitious, wider-scoped systems. Let’s assume the system is 95% reliable, meaning users receive false or otherwise problematic information 5% of the time.
A generative AI system that is 95% reliable.
This degree of reliability may be impressive and unprecedented, but for many industry projects, it isn’t good enough. In that case, the system is unviable. As is, it won’t deploy and won’t realize any value.
The solution is predictive intervention. If predictive AI flags for human review the, say, 15% of cases most likely to be problematic, this might decrease the rate of problematic content reaching customers to an acceptable 1%.
A generative AI system with predictive intervention that is 99% reliable.
In such a scenario, the system employs an (expensive) human 15% of the time but achieves 85% of genAI’s promise of autonomy. That is, 85% of the time, genAI does its thing with no human in the loop – a vast improvement over an unworkable genAI system that cannot be deployed.
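To spell out the arithmetic behind those figures, here is a minimal back-of-the-envelope sketch in Python. The 80% “catch rate” – the share of problematic cases that falls within the flagged 15% – is an assumption implied by the drop from 5% to 1%, since four of the five problematic percentage points must land in the reviewed pile.

```python
# Back-of-the-envelope arithmetic for predictive intervention,
# using the hypothetical figures above.

base_error_rate = 0.05    # 5% of genAI responses are problematic
review_fraction = 0.15    # predictive AI flags the riskiest 15% for human review
catch_rate = 0.80         # assumed share of problematic cases inside that 15%

errors_caught = base_error_rate * catch_rate            # 0.04, handled by humans
residual_error_rate = base_error_rate - errors_caught   # 0.01 reaches customers
autonomy = 1.0 - review_fraction                         # 0.85, no human in the loop

print(f"Problematic content reaching customers: {residual_error_rate:.0%}")  # 1%
print(f"Cases handled with no human in the loop: {autonomy:.0%}")            # 85%
```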
While it’s a safe bet that enterprises will ultimately turn to predictive intervention so that they can enjoy a healthy dose of genAI’s promised autonomy, I haven’t yet seen much movement – outside the speculative chatter when AI professionals “talk shop.”
In research projects, predictive AI has been tested for detecting genAI hallucinations, but that work hasn’t made a dent outside the lab. One reason is that hallucination – getting the answer wrong – isn’t the only kind of genAI failure on which to intervene predictively. A genAI system could also fail to identify and address the user’s intention, provide an unethical or offensive response or engage in topics outside its scope, such as a healthcare system engaging in a personal conversation.
The prevalence and range of genAI’s problematic behavior will only increase as systems are designed to address more ambitious goals, including allowing them to handle sensitive data and enact consequential transactions such as purchases, prescriptions or flight changes.
One early industry step has been the use of predictive intervention to improve the reliability of other kinds of AI systems. For example, the consultancy NLP Logix applies this approach, predicting when speech transcription (aka speech recognition) systems are most likely to fail in order to target human audits that ensure the overall quality of (mostly) automatic transcriptions. The firm will present this work at Machine Learning Week (a conference series that I founded).
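To make the mechanism concrete, here is a minimal sketch of risk-based triage, assuming a predictive model that scores each output with a probability of failure and a fixed human review budget. It illustrates the general pattern of predictive intervention, not NLP Logix’s actual system.

```python
# Minimal sketch of risk-based triage: send the riskiest outputs to human
# review and let the rest through automatically.
# risk_scores is assumed to come from a predictive model (e.g., a classifier's
# predicted probability of failure); review_budget is the fraction humans can audit.

import numpy as np

def triage(outputs, risk_scores, review_budget=0.15):
    """Split outputs into (auto_approved, flagged_for_review) by predicted risk."""
    n_flag = int(np.ceil(len(outputs) * review_budget))
    flagged = set(np.argsort(risk_scores)[-n_flag:].tolist())  # highest-risk indices
    auto_approved = [o for i, o in enumerate(outputs) if i not in flagged]
    flagged_for_review = [o for i, o in enumerate(outputs) if i in flagged]
    return auto_approved, flagged_for_review

# Usage with made-up scores: 100 transcriptions, 15 routed to human audit.
outputs = [f"transcription_{i}" for i in range(100)]
risk_scores = np.random.rand(100)   # stand-in for a real model's failure probabilities
auto_approved, flagged_for_review = triage(outputs, risk_scores)
print(len(auto_approved), "auto-approved;", len(flagged_for_review), "sent to human review")
```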
Predictive intervention represents an inevitable move toward automating the assignment of human labor. In general, any technology is meant to automate. As technology improves, human labor shifts; we take on tasks that are more difficult to automate, such as those that require human creativity and judgment. As AI improves, the scope of human work will become increasingly limited to those more complex tasks that a machine cannot handle.
In this way, AI itself will act as an arbiter that decides which tasks fall into human hands. That is, it will automate the identification of work that cannot (yet) be automated.
Predictive intervention is only one way in which genAI and predictive AI are destined to merge. GenAI can empower predictive AI projects by itself acting as a predictive model, genAI chatbots can guide predictive AI projects, and large database models can complement large language models, tapping a company’s tabular data to empower predictive AI as well. This June, I’ll be presenting a keynote address on this topic, “Five Ways to Hybridize Predictive and Generative AI” (also at this online event on April 8, 2025).
About the author
Eric Siegel is a leading consultant and former Columbia University professor who helps companies deploy machine learning. He is the founder of the long-running Machine Learning Week conference series, the instructor of the acclaimed online course “Machine Learning Leadership and Practice – End-to-End Mastery,” executive editor of The Machine Learning Times and a frequent keynote speaker. He wrote the bestselling Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, which has been used in courses at hundreds of universities, as well as The AI Playbook: Mastering the Rare Art of Machine Learning Deployment. Eric’s interdisciplinary work bridges the stubborn technology/business gap. At Columbia, he won the Distinguished Faculty award when teaching the graduate computer science courses in ML and AI. Later, he served as a business school professor at UVA Darden. Eric also publishes op-eds on analytics and social justice. You can follow him on LinkedIn.