Sequential decision making and learning in strategic environments
- Businesses often make operational decisions (e.g. pricing, inventory, sourcing) without precise knowledge of their environment (e.g. unknown consumer demand or supplier reliability). When a business faces such a decision repeatedly and can update its chosen action, a key aspect of its success is the ability to learn and improve its decisions over time. A large literature studies these settings and has developed policies that enable businesses to achieve long-run success (see, e.g., Araman and Caldentey 2010). Typically, these policies achieve good outcomes by carefully balancing a tradeoff between exploring (taking an action which generates information) and exploiting (taking an action which generates the highest immediate payoff). This work extends the literature by considering problems of sequential decision making in an environment with incomplete information and other strategic participants who have their own incentives. In general, the policies proposed by previous work and the resulting dynamics are predicated on the assumption that the decision maker's environment is exogenous, so considering an environment with agents that strategically react to the policy can lead to substantially different policies and dynamics. This work explores these dynamics in two settings. In the first chapter, we ask how a firm can design an optimal dynamic sourcing policy when buying from a supplier with privately known cost and quality. The key difference from existing models of supply learning is that the buyer and supplier must endogenously agree on a price each period. With this consideration, the buyer has two sources of information to learn about the seller: stochastic realizations of delivered quality and the strategic decisions of the seller. Therefore, in addition to the classic exploration/exploitation tradeoff, the buyer must decide how to explore.
We establish the equilibrium of the interaction, characterize the buyer's learning policy, and then show how it compares and contrasts with more traditional learning dynamics without a strategic seller. Moreover, we show that the ability to evaluate and learn from quality outcomes can be detrimental to a buyer engaging with a strategic seller. In the second chapter, we consider an extension of the traditional dynamic pricing setup where a seller has a priori incomplete demand information but interacts with customers through a platform (e.g. Amazon) that has its own payoff and can take actions to influence customers' purchase decisions. In this setup, we characterize how the platform should optimally control the seller's information and learning dynamics in order to generate platform-preferred prices and payoffs. We establish that the platform should release (some) initial information to a seller about customer demand, and should then take costly actions to prevent the seller from learning more. In comparison to traditional settings where a seller will avoid prices which generate no information, we establish that, in equilibrium, it is in fact optimal for the seller to set such 'confounding' prices.
|Type of resource
|electronic resource; remote; computer; online resource
|1 online resource
|Degree committee member
|Degree committee member
|Stanford University, Graduate School of Business
|Statement of responsibility
|Gregory James Macnamara
|Submitted to the Graduate School of Business
|Thesis Ph.D. Stanford University 2020
- © 2020 by Gregory Macnamara
- This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).