Essays in large scale optimization algorithm and its application in revenue management
- This dissertation focuses on large-scale optimization algorithms and their applications in revenue management. It comprises three chapters. Chapter 1, Managing Randomization in the Multi-Block Alternating Direction Method of Multipliers for Quadratic Optimization, provides theoretical foundations for managing randomization in the multi-block alternating direction method of multipliers (ADMM) for quadratic optimization. Chapter 2, How a Small Amount of Data Sharing Benefits Distributed Optimization and Learning, presents both theoretical and practical evidence that sharing a small amount of data can greatly benefit distributed optimization and learning. Chapter 3, Dynamic Exploration and Exploitation: The Case of Online Lending, studies exploration/exploitation trade-offs and the value of dynamically extracting information in the context of online lending.
The first chapter is joint work with Kresimir Mihic and Yinyu Ye. The Alternating Direction Method of Multipliers (ADMM) has gained a lot of attention for solving large-scale, objective-separable constrained optimization problems. However, the two-block variable structure of the ADMM still limits the practical computational efficiency of the method, because one big matrix factorization is needed at least once even for linear and convex quadratic programming. This drawback may be overcome by enforcing a multi-block structure of the decision variables in the original optimization problem. Unfortunately, the multi-block ADMM, with more than two blocks, is not guaranteed to converge. On the other hand, two positive developments have been made. First, if in each cyclic loop one randomly permutes the updating order of the multiple blocks, then the method converges in expectation for solving any system of linear equations with any number of blocks. Second, such a randomly permuted ADMM also works for equality-constrained convex quadratic programming even when the objective function is not separable.
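The randomly permuted multi-block scheme described above can be illustrated with a minimal sketch. It is not taken from the dissertation: the function name `rp_admm_linear_system`, the penalty parameter `beta`, the fixed number of sweeps, and the small tridiagonal test system are all assumptions chosen for illustration. The sketch treats the linear system as "minimize 0 subject to Ax = b", minimizes the augmented Lagrangian over one block of variables at a time in a freshly shuffled order, and then updates the multiplier. The abstract states convergence in expectation only, so a single run on an arbitrary system is not guaranteed to behave like this well-conditioned example.

```python
import numpy as np

def rp_admm_linear_system(A, b, blocks, beta=1.0, sweeps=200, seed=0):
    """Illustrative randomly permuted multi-block ADMM for A x = b.

    `blocks` is a list of index lists partitioning the columns of A.
    All names and defaults here are assumptions made for this sketch.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(A.shape[1])
    y = np.zeros(A.shape[0])  # Lagrange multiplier for A x = b
    for _ in range(sweeps):
        # Draw a new update order for the blocks in every cycle.
        for i in rng.permutation(len(blocks)):
            idx = blocks[i]
            A_i = A[:, idx]
            # Contribution of all other blocks to A x.
            rest = A @ x - A_i @ x[idx]
            # Minimize the augmented Lagrangian over block i only.
            rhs = A_i.T @ (b - rest + y / beta)
            x[idx] = np.linalg.solve(A_i.T @ A_i, rhs)
        # Multiplier update after the full cycle.
        y = y - beta * (A @ x - b)
    return x, y

# Small, well-conditioned example system (an assumption for illustration).
n = 6
A = np.eye(n) + 0.1 * (np.eye(n, k=1) + np.eye(n, k=-1))
b = np.arange(1.0, n + 1.0)
blocks = [[0, 1], [2, 3], [4, 5]]

x, y = rp_admm_linear_system(A, b, blocks)
print("residual norm:", np.linalg.norm(A @ x - b))
```

Each block update only requires solving a system of the block's size, which is the reason the abstract gives for preferring a multi-block structure over one large factorization.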
The goal of this paper is twofold. First, we add more randomness into the ADMM by developing a randomly assembled cyclic ADMM (RAC-ADMM), in which the decision variables in each block are randomly assembled. We discuss the theoretical properties of RAC-ADMM, show when random assembling helps and when it hurts, and develop a criterion to guarantee that it converges almost surely. Second, using the theoretical guidance on RAC-ADMM, we conduct multiple numerical tests on both randomly generated and large-scale benchmark quadratic optimization problems, including continuous problems, binary graph-partition and quadratic assignment problems, and selected machine learning problems. Our numerical tests show that RAC-ADMM, with a variable-grouping strategy, can significantly improve computational efficiency in solving most quadratic optimization problems.
The second chapter is joint work with Yinyu Ye. Distributed optimization algorithms have been widely used in machine learning and statistical estimation, especially in settings where multiple decentralized data centers exist and the decision maker is required to perform collaborative learning across those centers. While distributed optimization algorithms have merits in parallel processing and in protecting local data security, they often suffer from slow convergence compared with centralized optimization algorithms. This paper focuses on how a small amount of data sharing can benefit distributed optimization and learning for more advanced optimization algorithms. Specifically, we consider how data sharing can benefit the distributed multi-block alternating direction method of multipliers (ADMM) and the preconditioned conjugate gradient method (PCG), with applications to the machine learning tasks of linear and logistic regression.
These algorithms are commonly regarded as lying between first-order and second-order methods, and we show that data sharing can greatly boost the convergence speed of this class of algorithms. Theoretically, we prove that a small amount of data sharing improves the convergence rate from near-worst to near-optimal when applying ADMM and PCG methods to machine learning tasks. A theoretical by-product is a tight upper bound on the linear convergence rate of distributed ADMM applied to linear regression. We further propose a meta randomized data-sharing scheme and provide its tailored applications in multi-block ADMM and PCG methods, in order to enjoy both the benefit of data sharing and the efficiency of distributed computing. The numerical evidence indicates that our algorithms provide good-quality estimators in both least squares and logistic regression within far fewer iterations by sharing only 5% of pre-fixed data, while purely distributed optimization algorithms may take hundreds of times more iterations to converge. We hope that the findings of this paper will encourage even a small amount of data sharing among different regions to combat difficult global learning problems.
The third chapter is joint work with Haim Mendelson. This paper studies exploration and exploitation trade-offs in the context of online lending. Unlike traditional contexts where the cost of exploration is an opportunity cost of lost revenue or some other implicit cost, in the case of unsecured online lending, the lender effectively gives away money in order to learn about the borrower's ability to repay. In our model, the lender maximizes the expected net present value of the cash flow she receives by dynamically adjusting the loan amounts and the interest (discount) rate as she learns about the borrower's unknown income.
The lender has to carefully balance the trade-off between earning more interest when she lends more and the risk of default, and we provide the optimal dynamic policy for the lender. The optimal policy supports classic "lean experimentation" in certain regimes while challenging that concept in others. When the demand elasticity is zero (the discount rate is set exogenously), or the elasticity is a decreasing function of the discount rate, the optimal policy is characterized by a large number of small experiments with increasing repayment amounts. When the demand elasticity is constant or when it is an increasing function of the discount rate, we obtain a two-step optimal policy: the lender performs a single experiment and then, if the borrower repays the loan, offers the same loan amount and discount rate in each subsequent period without any further experimentation. This result sheds light on how to account for market churn, as measured by elasticity, in dynamic experiment design under uncertainty. We further provide the implications of the optimal policies, including the impact of income variability, the value of information, and consumer segmentation. Lastly, we extend the methodology to analyze the Buy-Now-Pay-Later business model and provide policy suggestions.
|Type of resource
|electronic resource; remote; computer; online resource
|1 online resource.
|Zhu, Mingxi, (Researcher in optimization algorithms)
|Degree committee member
|Degree committee member
|Stanford University, Graduate School of Business
|Statement of responsibility
|Submitted to the Graduate School of Business.
|Thesis Ph.D. Stanford University 2023.
- © 2023 by Mingxi Zhu
- This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).