Cynthia Rudin
Computer Science, Electrical and Computer Engineering, Statistical Science, Mathematics, and Biostatistics & Bioinformatics
The Extreme of Interpretability in Machine Learning
With widespread use of machine learning, there have been serious societal consequences from using black box models for high-stakes decisions, including flawed bail and parole decisions in criminal justice, flawed models in healthcare, and black box loan decisions in finance. Transparency and interpretability of machine learning models are critical in high-stakes decisions. In this talk, I will focus on two of the most fundamental problems in the field of interpretable machine learning: optimal sparse decision trees and optimal scoring systems. I will also discuss a hypothesis for why we can find interpretable models with the same accuracy as black box models. I will mainly discuss work from these papers:
Jiachang Liu, Chudi Zhong, Margo Seltzer, and Cynthia Rudin. Fast Sparse Classification for Generalized Linear and Additive Models. AISTATS, 2022.
Hayden McTavish, Chudi Zhong, Reto Achermann, Ilias Karimalis, Jacques Chen, Cynthia Rudin, and Margo Seltzer. Fast Sparse Decision Tree Optimization via Reference Ensembles. AAAI, 2022. https://arxiv.org/abs/2112.00798
Jimmy Lin, Chudi Zhong, Diane Hu, Cynthia Rudin, and Margo Seltzer. Generalized and Scalable Optimal Sparse Decision Trees. ICML, 2020. https://arxiv.org/abs/2006.08690
Cynthia Rudin, Caroline Wang, and Beau Coker. The Age of Secrecy and Unfairness in Recidivism Prediction. Harvard Data Science Review, 2020. https://hdsr.mitpress.mit.edu/pub/7z10o269
Cynthia Rudin is a professor of computer science, electrical and computer engineering, and statistical science at Duke University, and directs the Prediction Analysis Lab, whose main focus is interpretable machine learning. She is also an associate director of the Statistical and Applied Mathematical Sciences Institute (SAMSI). Previously, Prof. Rudin held positions at MIT, Columbia, and NYU. She holds an undergraduate degree from the University at Buffalo and a PhD from Princeton University. She is a three-time winner of the INFORMS Innovative Applications in Analytics Award, was named one of the “Top 40 Under 40” by Poets and Quants in 2015, and was named by Businessinsider.com as one of the 12 most impressive professors at MIT in 2015. She is past chair of both the INFORMS Data Mining Section and the Statistical Learning and Data Science section of the American Statistical Association. She has also served on committees for DARPA, the National Institute of Justice, and AAAI. She has served on three committees for the National Academies of Sciences, Engineering and Medicine, including the Committee on Applied and Theoretical Statistics, the Committee on Law and Justice, and the Committee on Analytic Research Foundations for the Next-Generation Electric Grid. She is a fellow of the American Statistical Association and a fellow of the Institute of Mathematical Statistics. She was a Thomas Langford Lecturer at Duke University during the 2019-2020 academic year, and will be the Terng Lecturer at the Institute for Advanced Study in 2020.
Pascal Van Hentenryck
A. Russell Chandler III Chair and Professor
H. Milton Stewart School of Industrial and Systems Engineering
Machine Learning and Optimization for Engineering
This talk reviews progress at the intersection of machine learning and optimization for engineering applications. It covers several new paradigms in which combining the two brings unique benefits, including optimization proxies, end-to-end learning, learning to optimize, and end-to-end optimization. The methodological advances are illustrated on applications in power systems, mobility, manufacturing, and ethical decision making.
Pascal Van Hentenryck is the A. Russell Chandler III Chair and Professor in the H. Milton Stewart School of Industrial and Systems Engineering at the Georgia Institute of Technology and the Associate Chair for Innovation and Entrepreneurship. He is the director of the NSF AI Institute for Advances in Optimization and the director of the Socially Aware Mobility (SAM) and Risk-Aware Market Clearing (RAMC) labs. Several of his optimization systems have been in commercial use for more than 20 years for solving logistics, supply chain, and manufacturing applications. His current research focuses on machine learning, optimization, and privacy with applications in energy, mobility, and supply chains.
Juan Pablo Vielma
Research Scientist in Operations Research
Modeling and Duality in Domain Specific Languages for Mathematical Optimization
Domain-specific languages (DSLs) for mathematical optimization allow users to write problems in a natural algebraic format. However, what is considered natural can vary from user to user. For instance, JuMP’s interface makes a distinction between conic formulations and nonlinear programming formulations whose constraints are level sets of nonlinear functions. Tradeoffs between such alternative modeling formats are further amplified when dual solutions are considered. In this talk we describe work related to these tradeoffs in JuMP and OR-Tools. In particular, we consider modeling using a wide range of non-symmetric cones (and their solution with Hypatia.jl) and defining precise dual contracts for two-sided constraints (and other non-conic convex constraints).
Juan Pablo Vielma is a research scientist in the Operations Research team at Google in Cambridge, MA. Dr. Vielma has a B.S. in Mathematical Engineering from the University of Chile and a Ph.D. in Industrial Engineering from the Georgia Institute of Technology. He has received the Presidential Early Career Award for Scientists and Engineers (PECASE), the NSF CAREER Award, the INFORMS Computing Society Prize and the INFORMS Optimization Society Student Paper Prize. Dr. Vielma has served as chair of the INFORMS Section on Energy, Natural Resources, and the Environment and as vice-chair for Integer and Discrete Optimization of the INFORMS Optimization Society. He is currently an associate editor for Operations Research, Operations Research Letters, Mathematical Programming, and Mathematical Programming Computation. He is also a member of the NumFOCUS steering committee for JuMP.
Andreas Wächter
Department of Industrial Engineering and Management Sciences
The ARPA-E Grid Optimization Competition
In recent years, the Advanced Research Projects Agency-Energy (ARPA-E) has been organizing the “Grid Optimization Competition”. To participate, teams from academia and industry submitted computer program implementations of algorithms for solving large, realistic Security-Constrained Optimal Power Flow (SCOPF) problems. The performance of the solvers was tested and ranked independently by the organizers. The goal of SCOPF is to determine the most cost-efficient operation of an electrical power grid in such a way that it can withstand contingencies in the form of outages of any of its components. Mathematically, this is an extremely large-scale two-stage nonlinear and nonconvex optimization problem. In this presentation, the approaches of several teams will be described, including that of our own GO-SNIP team, which placed second in the first challenge.
Andreas Wächter is a Professor in the Department of Industrial Engineering and Management Sciences at Northwestern University. His research interests include the design, analysis, implementation, and application of numerical algorithms for nonlinear continuous and mixed-integer optimization, scientific computing, power systems, and sustainability. He obtained his master’s degree in Mathematics at the University of Cologne, Germany, and his Ph.D. in Chemical Engineering at Carnegie Mellon University. Before joining Northwestern University in 2011, he was a Research Staff Member in the Department of Mathematical Sciences at IBM Research in Yorktown Heights, NY. He is a recipient of the 2011 Wilkinson Prize for Numerical Software and the 2009 INFORMS Computing Society Prize for his work on the open-source optimization package Ipopt.