By Zulqarnain Haider
The marriage between machine learning and mixed-integer programming (MIP) has recently been the talk of the town. Researchers around the world are imagining new ways of combining these two powerful fields. The joint session on machine learning and discrete optimization was the second in a three-part series spanning multiple days and many speakers. Soon after the session started, the room was filled to capacity. It opened with a talk by Anirbit Mukherjee who, dressed in traditional Indian garb, described his recent research on understanding the underlying mathematics of neural networks. In a wide-ranging survey of the field, he described his work formalizing the connections between neural networks and sparse coding.
The second speaker, Elias B. Khalil from Georgia Tech, presented his work on machine learning security using integer programming. He introduced binarized neural networks (BNNs) and an integer programming-based approach for attacking, or perturbing, a BNN so that it predicts incorrectly. By identifying the vulnerabilities of machine learning models through adversarial examples, one can make them robust to “attacks,” an idea distilled in the maxim “attack to protect.” The speaker concluded his talk by asserting that the presented approach, called iProp, compares favorably with other state-of-the-art attack algorithms such as the fast gradient sign method (FGSM).
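FGSM, the baseline mentioned above, perturbs an input by a single step in the direction of the sign of the loss gradient. As a rough illustration of that idea only (not of the iProp approach from the talk), here is a minimal NumPy sketch of an FGSM-style perturbation on a simple logistic-regression model; the weights, input, and step size are hypothetical:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_loss(w, x, y):
    # Logistic loss for a label y in {-1, +1} under weights w.
    return -np.log(sigmoid(y * np.dot(w, x)))

def fgsm(w, x, y, eps):
    # Gradient of the logistic loss with respect to the input x ...
    grad = -y * sigmoid(-y * np.dot(w, x)) * w
    # ... then one step of size eps in the direction of the gradient's sign.
    return x + eps * np.sign(grad)

# Hypothetical model weights and input, for illustration only.
w = np.array([1.0, -2.0, 0.5])
x = np.array([0.3, -0.1, 0.8])
y = 1.0

x_adv = fgsm(w, x, y, eps=0.1)
print(logistic_loss(w, x, y), logistic_loss(w, x_adv, y))  # loss increases
```

The appeal of FGSM is its cheapness (one gradient evaluation per input), which is also why exact, optimization-based attacks such as integer programming can find perturbations it misses.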
The third speaker, Giulia Zarpellon from Polytechnique Montréal, spoke about learning to branch more efficiently in branch and bound (B&B). She proposed a three-pronged reinforcement learning framework for better defining the state of the B&B tree, the branching policy, and the rewards, and suggested taking context and temporality into account in variable branching. She concluded by describing computational experiments on 54 heterogeneous instances.
Thiago Serra (pictured), from Carnegie Mellon University, was the fourth speaker and the session chair. In a fast-paced talk, he presented his results, both theoretical and empirical, on the number of linear regions that the piecewise linear functions represented by neural networks can attain. This number can be used to compare different, but similar, neural network configurations. The talk focused mainly on deriving tighter bounds on the number of linear regions, including a method for exactly enumerating them by modeling the deep neural network (DNN) as a mixed-integer program.
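The notion of counting linear regions can be illustrated without the MIP machinery from the talk: for a one-hidden-layer ReLU network on a one-dimensional input, each hidden unit switches on or off at a single breakpoint, so n units split the input line into at most n + 1 linear regions. A minimal empirical sketch, with hypothetical weights chosen so that all breakpoints are distinct:

```python
import numpy as np

# Hypothetical weights for a one-hidden-layer ReLU net with scalar input:
# unit i computes relu(w[i] * x + b[i]) and flips at x = -b[i] / w[i].
w = np.array([1.0, -0.5, 2.0, 0.7, -1.2])
b = np.array([0.3, 1.0, -0.5, 0.0, 2.0])
n_hidden = len(w)

def activation_pattern(x):
    # Record which ReLU units are active (pre-activation > 0) at input x.
    return tuple((w * x + b > 0).astype(int))

# Scan a dense grid and count distinct activation patterns; each distinct
# pattern corresponds to one linear region of the network's function.
xs = np.linspace(-10.0, 10.0, 100001)
regions = {activation_pattern(x) for x in xs}
print(len(regions))  # 6, i.e. n_hidden + 1 regions for these weights
```

In higher dimensions the regions are polyhedra rather than intervals and grid scanning becomes hopeless, which is where an exact MIP-based enumeration of activation patterns, as in the talk, becomes useful.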
The final speaker of the session, Ellen Vitercik (Carnegie Mellon), talked about algorithm design as a learning problem. She took the case of the clustering problem, a special case of the combinatorial partitioning problem. The aim of the study was to select, on firm theoretical foundations, the algorithm that performs best over a sample of problems from a given domain.