While a model can fail for many reasons, one of them is ML overconfidence. An AI/ML model is called overconfident when it shows high confidence in nearly all of its predictions, even the incorrect ones. The problem is more evident in classification (predicting whether an image shows a dog or a cat) than in regression (predicting continuous values like the price of a house or car), since classifiers output a probability distribution over classes that can be read directly as confidence.
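To make the classification case concrete, here is a minimal sketch (the logit values are invented for illustration) of how a classifier's raw scores become a probability distribution whose maximum is its stated confidence. An overconfident model is one whose maximum probability sits near 1.0 even on inputs it gets wrong.

```python
import numpy as np

def softmax(logits):
    """Convert raw model scores (logits) into a probability distribution."""
    z = np.exp(logits - np.max(logits))  # subtract max for numerical stability
    return z / z.sum()

# Hypothetical logits from a dog-vs-cat classifier for one image.
logits = np.array([4.2, 0.3])  # [dog, cat]
probs = softmax(logits)
confidence = probs.max()       # the model's stated confidence in its top class

print(probs)       # probability assigned to each class
print(confidence)  # close to 1.0 here, regardless of whether the model is right
```

Note that a high value of `confidence` says nothing by itself about correctness; that gap between stated confidence and actual accuracy is exactly what calibration work tries to close.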
“Modern AI-ML models, as they have grown in scale, have become overconfident. This means that even when the model is likely incorrect in its prediction, it can exhibit high confidence. This makes model confidence quite unreliable,” said Prasanna Sattigeri, Research Staff Member, IBM Research AI, MIT-IBM Watson AI Lab.
Overconfidence leads to false trust in model predictions, so model errors can go undetected. If a model's confidence in its predictions is meaningful and accurate, a human or another system can intervene and take over when the model is not confident, thus avoiding those errors.
“For example, consider a doctor or a loan officer utilizing an AI’s predictions for decision making. When the model is not confident, they can reject the model’s predictions and bring in new information and/or other experts to make the decision. This framework is called selective prediction, or prediction with a reject option. We need the model to accurately convey its confidence, and not be overconfident, for selective prediction to work well,” Sattigeri explained.
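The selective prediction framework Sattigeri describes can be sketched as a simple confidence threshold: accept the model's prediction when its confidence clears the threshold, otherwise reject and defer to a human. The probabilities and the 0.8 operating point below are assumptions for illustration; in practice the threshold is tuned per application.

```python
import numpy as np

# Hypothetical per-example class probabilities from some classifier.
probabilities = np.array([
    [0.98, 0.02],   # confident: accept the model's prediction
    [0.55, 0.45],   # uncertain: reject and defer to a human expert
    [0.10, 0.90],   # confident: accept
])

CONFIDENCE_THRESHOLD = 0.8  # assumed operating point, tuned per application

def selective_predict(probs, threshold):
    """Return (prediction, accepted) pairs; reject when confidence is low."""
    decisions = []
    for p in probs:
        pred = int(np.argmax(p))          # the model's top class
        accepted = float(np.max(p)) >= threshold
        decisions.append((pred, accepted))
    return decisions

decisions = selective_predict(probabilities, CONFIDENCE_THRESHOLD)
# decisions -> [(0, True), (0, False), (1, True)]
```

The catch, as the article notes, is that this scheme only works if confidence is trustworthy: an overconfident model rarely falls below the threshold, so almost nothing gets routed to the human reviewer.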
Overconfident models can have a significant economic impact, and they also create mistrust among users. The cost of an overconfident model varies by industry: the impact of recommending a movie to a user based on an overconfident model is very different from an overconfident prediction made by an autonomous car.
For most machine learning tasks, the objective is to accurately predict the unseen based on the historical data we already have access to. Overconfident models undermine this core premise.
“Consider an example where you, as the Head of Marketing, are trying to decide the best marketing channel to spend your money on to entice a specific audience segment. An overconfident model might recommend you to allocate a larger portion of your budget to a particular channel than what’s needed. In this example, the model might have predicted the right channel accurately, but it made you spend significantly more money to generate the same amount of sales,” explained Ashwin Thote, Principal Data Scientist at Bose Corporation.
Thote added that overconfident models, specifically those trained on narrow datasets, pose a serious threat to companies and can cost them their reputation. Credit card, mortgage, and insurance companies are prime targets of synthetic identity fraud. In general, overconfident models trained on partial or biased data pay disproportionate attention to certain features in the data; once those features are discovered, the systems become vulnerable to fraud. Tailoring resumes to match job requirements is another way applicants take advantage of overconfident screening models.
The main reason to deploy ML models is to increase ROI and reduce human error, but because of overconfident models, a company might end up losing clients and sales, or even see negative ROI.
“Depending upon where the model is being used, the impact and the cost paid could be significant due to overconfidence in a model. For instance, in the case of face detection, the model might incorrectly predict with high confidence. It could lead to a security breach if being used for authentication, which is a major threat. In contrast, it is a minor event if the model is used for just finding images from a face in an image gallery,” said Dr. Manjeet Dahiya, VP & Head – AI and Machine Learning, CarDekho.