By Zulqarnain Haider
The third joint session on recent developments at the wild frontier between machine learning and discrete optimization was as engaging and promising as the previous two. Session chair Alfredo Torrico kick-started the session by inviting the first speaker, Matthew Staib of MIT, to talk about his work on distributionally robust submodular maximization. The speaker began by defining submodularity and surveying its wide-ranging applications in combinatorial optimization, graph theory, and data mining. The talk focused on efficient algorithms that use distributionally robust optimization (DRO) to better generalize the underlying submodular function f. The empirical results showed that DRO improves generalization to the unknown stochastic submodular function.
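For readers unfamiliar with the setting, here is a minimal sketch (not from the talk) of the classic greedy algorithm for maximizing a monotone submodular function under a cardinality constraint, which attains a (1 - 1/e) approximation guarantee; the coverage function and sensor data below are purely illustrative.

```python
def greedy_submodular_max(f, ground_set, k):
    """Pick k elements, each time adding the element with the largest marginal gain."""
    S = set()
    for _ in range(k):
        best, best_gain = None, float("-inf")
        for e in ground_set - S:
            gain = f(S | {e}) - f(S)  # marginal gain of adding e to S
            if gain > best_gain:
                best, best_gain = e, gain
        S.add(best)
    return S

# Illustrative coverage function: the value of a set of "sensors" is the
# number of distinct points they cover (coverage functions are submodular).
coverage = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d"}, 4: {"d"}}
f = lambda S: len(set().union(*(coverage[e] for e in S))) if S else 0

chosen = greedy_submodular_max(f, set(coverage), 2)
print(chosen, f(chosen))
```

The DRO approach in the talk goes further: instead of maximizing f under a single empirical distribution, it hedges against an ambiguity set of distributions, which the sketch above does not attempt to capture.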
The second speaker, Sebastien Martin of MIT, gave a talk titled “The Benefits of Stochastic Proximal Steps.” He compared the performance of the proximal point algorithm with that of its more famous cousin, the stochastic gradient algorithm, on basic convex optimization problems. The talk emphasized that under certain conditions, the proximal point algorithm can outperform the stochastic gradient algorithm. The two algorithms were compared, with different minibatch sizes and proximal step approximations, on a 1D convex quadratic optimization problem and ordinary least squares.
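A hedged 1D illustration of the contrast (my own toy example, not the speaker's code): for f(x) = (c/2)(x - m)^2, the gradient step is x - η·c·(x - m), while the proximal point step x⁺ = argmin_y f(y) + (y - x)²/(2η) has the closed form x⁺ = (x + η·c·m)/(1 + η·c). The proximal step contracts toward the minimizer m for any step size η > 0, whereas gradient descent diverges once η > 2/c.

```python
c, m = 10.0, 3.0   # curvature and minimizer of f(x) = (c/2) * (x - m)^2
eta = 1.0          # large step size: eta > 2/c, so the plain gradient step is unstable

x_gd = x_prox = 0.0
for _ in range(20):
    x_gd = x_gd - eta * c * (x_gd - m)               # gradient step (diverges here)
    x_prox = (x_prox + eta * c * m) / (1 + eta * c)  # proximal step (contracts by 1/(1+eta*c))

print(abs(x_gd - m), abs(x_prox - m))
```

This stability under large steps is one intuition for why proximal-style updates can beat stochastic gradient updates in the regimes the talk studied.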
The third speaker of the session was Adrian Rivera from Georgia Tech, who started his talk by introducing the online convex optimization (OCO) and online saddle point (OSP) problems. He showed that the standard algorithm for the one-player OCO problem fails to solve the OSP problem, in which two players engage in convex-concave games. He presented an algorithm that achieves near-optimal bounds for the OSP problem. He concluded by introducing the constrained version of the OCO problem with knapsack constraints and relating it to the OSP problem and the presented algorithm.
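To fix ideas on the one-player OCO baseline the talk starts from, here is a minimal sketch (illustrative only, not from the talk) of online projected gradient descent on the interval [0, 1] with quadratic losses f_t(x) = (x - z_t)², using the standard O(1/√t) step sizes that yield O(√T) regret against the best fixed decision.

```python
import math

def project(x):
    """Project onto the feasible interval [0, 1]."""
    return min(1.0, max(0.0, x))

z = [0.0, 1.0] * 50  # alternating targets, an adversarial-looking loss sequence
x, losses = 0.5, []
for t, zt in enumerate(z, start=1):
    losses.append((x - zt) ** 2)                       # suffer the loss at the current decision
    x = project(x - (1.0 / math.sqrt(t)) * 2 * (x - zt))  # then take a projected gradient step

# Regret: online loss minus the loss of the best fixed decision (grid search).
best_fixed = min(sum((u - zt) ** 2 for zt in z) for u in [i / 100 for i in range(101)])
regret = sum(losses) - best_fixed
print(regret)
```

The talk's point is that this kind of single-player no-regret scheme does not directly transfer to the two-player saddle point setting, which needs its own algorithm.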
The fourth and final speaker of the session was Elias B. Khalil from Georgia Tech, who talked about his work on learning combinatorial optimization algorithms over graphs. He introduced a realistic setting in which one may need to repeatedly solve the same combinatorial graph problem on slightly different data, using the minimum vertex cover, max cut, and travelling salesman problems as examples. Current greedy algorithms, he emphasized, cannot exploit the distribution of problem instances, so there is a need to learn better greedy heuristics. He proposed a framework based on reinforcement learning and graph embedding to address this challenge. In conclusion, he showed that the results of his approach compare favorably with standard greedy heuristics on well-known graph problems.
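For context, here is a hedged sketch of the kind of hand-designed baseline such learned heuristics are compared against: a greedy minimum vertex cover that repeatedly picks the vertex touching the most uncovered edges. The graph below is an invented example; the talk's framework instead learns which vertex to pick next via reinforcement learning over a graph embedding.

```python
from collections import Counter

def greedy_vertex_cover(edge_list):
    """Greedy heuristic: repeatedly add the highest-degree vertex among uncovered edges."""
    edges = set(map(frozenset, edge_list))
    cover = set()
    while edges:
        deg = Counter(v for e in edges for v in e)  # degrees in the remaining graph
        v, _ = deg.most_common(1)[0]
        cover.add(v)
        edges = {e for e in edges if v not in e}    # remove edges now covered by v
    return cover

# Star graph: center 0 connected to leaves 1..4; greedy finds the optimal cover {0}.
result = greedy_vertex_cover([(0, 1), (0, 2), (0, 3), (0, 4)])
print(result)
```

Such fixed rules apply the same decision logic to every instance; the talk's contribution is replacing the hand-coded selection rule with one trained on the instance distribution.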