Have you ever attended a talk expecting it to be about one thing, only to find it was about something else? Well, that happened to me yesterday. Probably my fault for not reading the abstract in detail. I decided to stay anyway to see if I could learn something. Lo and behold, I did! In fact, I felt a little ignorant for not having seen or thought of it before. The speaker referred to it as “The Lambda Question” (I’m not sure if this is the standard name). Say you want to optimize this function:
minimize f(x)/g(x), such that x belongs to X
You can transform this problem into a feasibility problem (like Constraint Programmers do) by asking the question: for a given lambda, is there an x for which
f(x)/g(x) < lambda ?
The question above can be answered (assuming g(x) > 0 for every x in X) by looking for an x that satisfies

f(x) – lambda * g(x) < 0.
If you have an efficient procedure for finding such an x, you can then do binary search on lambda to solve the original minimization problem. Neat little trick!
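To make the trick concrete, here is a minimal sketch of the binary search, assuming g(x) > 0 on X and a hypothetical oracle `feasible(lmbda)` that answers the lambda question (True iff some x in X has f(x) – lmbda · g(x) < 0). The bracketing bounds `lo` and `hi`, the tolerance, and the toy problem at the bottom are all illustrative choices, not anything from the talk:

```python
def minimize_ratio(feasible, lo, hi, tol=1e-6):
    """Binary search on lambda for min f(x)/g(x) over x in X.

    feasible(lmbda) must return True iff some x in X satisfies
    f(x) - lmbda * g(x) < 0, i.e. f(x)/g(x) < lmbda (with g > 0 on X).
    lo and hi must bracket the optimum: feasible(lo) is False,
    feasible(hi) is True.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if feasible(mid):
            hi = mid  # some x beats mid, so the optimal ratio is below mid
        else:
            lo = mid  # no x beats mid, so the optimal ratio is at least mid
    return hi

# Toy instance: minimize (x^2 + 1) / x over X = {1, ..., 10}.
X = range(1, 11)
f = lambda x: x * x + 1
g = lambda x: x  # positive on X, as the transformation requires

def feasible(lmbda):
    # Brute-force oracle; in practice this would be your efficient procedure.
    return any(f(x) - lmbda * g(x) < 0 for x in X)

print(round(minimize_ratio(feasible, 0.0, 100.0), 3))  # 2.0, attained at x = 1
```

The same bisection works whenever the oracle is monotone in lambda, which it always is here: raising lambda only makes the question easier to answer “yes” to.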
The second topic of this post is interpretability. This year, I decided to attend as many talks/tutorials about machine learning (ML) as possible. I may be a bit late to the party, but I’m interested in the combination of ML and optimization. Therefore, I figured the first step is to learn the terminology, the algorithms, and their weaknesses. I’m taking a Data Mining class at the University of Miami, which has already taught me a good deal, and I came to Phoenix determined to expand my knowledge.
I was speaking with a friend about the issue of interpretability and this anecdote came up: a neural network model was built to help doctors predict a certain outcome, and the doctors refused to use it because they did not know how the prediction was obtained and couldn’t interpret the neural net’s reasoning. This led to the creation of a decision tree model for the problem, which the doctors were much more amenable to because they could literally “see” what was going on. I believe almost everyone has heard some version of this story by now. Nevertheless, the interesting take on this idea of interpretability came up during a conversation I was having with friends last night at dinner. It started with a comment from my friend David Bergman that went more or less like this: “Sure, neural nets are not interpretable, but neither are MIPs, and I don’t see anyone complaining about that.”
This statement generated a very lively conversation at the table. Are these two kinds of un-interpretability the same? My first reaction was that they are not. When a MIP solver stops and gives you the answer, it doesn’t show you “why” that is the answer, but you can go back and check that the answer satisfies all the constraints and be confident that it is feasible (typically, if you ignore the delicate cases of tiny constraint violations). The bounding mechanism of branch-and-bound is what gives you the quality guarantee. On the other hand, when a (shallow/deep) neural net outputs a recommendation, there’s some degree of uncertainty there. The counter-argument was that you could trace back the path of activated neurons and see “why” that was the answer. I’m not sure. I still believe there is some intrinsic difference here, but I can’t put my finger on it. I think MIPs are more interpretable. What do you think?
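The check I have in mind is trivially mechanical, which is part of my point. Here is a sketch of it for a hypothetical MIP with made-up constraint data A·x ≤ b and all-integer variables; the matrix, right-hand side, solution, and tolerance below are all invented for illustration:

```python
import numpy as np

# Hypothetical MIP data: constraints A @ x <= b, all variables integer.
A = np.array([[1.0, 2.0],
              [3.0, 1.0]])
b = np.array([10.0, 12.0])
x = np.array([2.0, 3.0])  # a solution reported by some solver

tol = 1e-6  # the kind of feasibility tolerance solvers typically use

# Independently verify the answer, without trusting the solver:
is_integral = bool(np.all(np.abs(x - np.round(x)) <= tol))
satisfies = bool(np.all(A @ x <= b + tol))
print(is_integral and satisfies)  # True
```

There is no analogous five-line certificate for “this neural net’s recommendation is right,” which is the asymmetry I keep coming back to.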