EVENT DETAILS
Selvaprabu Nadarajah, Ph.D.
University of Illinois at Chicago
Abstract: Approximate linear programs (ALPs) are well-known models for computing value function approximations (VFAs) of intractable Markov decision processes (MDPs). VFAs from ALPs have desirable theoretical properties, define an operating policy, and provide a lower bound on the optimal policy cost. However, solving ALPs near-optimally remains challenging, for example, when approximating MDPs with nonlinear cost functions and transition dynamics or when rich basis functions are required to obtain a good VFA. We address this tension between theory and solvability by proposing a convex saddle-point reformulation of an ALP that includes as primal and dual variables, respectively, a vector of basis function weights and a constraint violation density function over the state-action space. To solve this reformulation, we develop a proximal stochastic mirror descent (PSMD) method that learns regions of high ALP constraint violation via its dual update. We establish that PSMD returns a near-optimal ALP solution and a lower bound on the optimal policy cost in a finite number of iterations with high probability. We numerically compare PSMD with several benchmarks on inventory control and energy storage applications. We find that the PSMD lower bound is tighter than a perfect information bound. In contrast, the constraint sampling approach to solving ALPs may not provide a lower bound, and applying row generation to tackle ALPs is not computationally viable. PSMD policies outperform problem-specific heuristics and are comparable to or better than the policies obtained using constraint sampling. Overall, our ALP reformulation and solution approach broaden the applicability of approximate linear programming.
Biography: Selvaprabu (Selva) Nadarajah is an Assistant Professor of Information and Decision Sciences at the University of Illinois at Chicago (UIC) College of Business. He obtained his PhD in Operations Research from the Tepper School of Business, Carnegie Mellon University, where he received the William L. Cooper doctoral dissertation award. Before starting his PhD, Selva was a consultant at the Canadian Tire Corporation, a large Canadian retail firm, where he helped set up the optimization and modeling group. Selva's research interests lie at the intersection of Operations Management and Business Analytics. He studies dynamic decision-making problems encountered by users or owners of energy assets (e.g., networks of natural gas storage assets, oil refineries, and renewable power generators), and more recently, the accounting of social objectives in the planning of such operations. His research involves modeling these problems and developing efficient approximate dynamic programming techniques to solve the resulting models using math programming, first-order methods, and machine learning.
TIME Tuesday, November 13, 2018, 10:00 AM - 12:00 PM
LOCATION M228, Technological Institute
CONTACT Agnes Kaminski a-kaminski@northwestern.edu
CALENDAR Department of Industrial Engineering and Management Sciences