On the power of (even a little) flexibility in dynamic resource allocation
[摘要] Abstract We study the role of partial flexibility in large-scale dynamic resource allocation problems, in which multiple types of processing resources are used to serve multiple types of incoming demands that arrive stochastically over time. Partial flexibility refers to scenarios where (a) only a small fraction of the total processing resources is flexible, or (b) each resource is capable of serving only a small number of demand types. Two main running themes are architecture and information: the former asks how a flexible system should be structured to fully harness the benefits of flexibility, and the latter looks into how information, across the system or from the future, may critically influence performance. Overall, our results suggest that, with the right architecture, information, and decision policies, large-scale systems with partial flexibility can often vastly outperform their inflexible counterparts in terms of delay and capacity, and sometimes be almost as good as fully flexible systems. Our main findings are: 1. Flexible architectures. We show that, just like in fully flexible systems, a large capacity region and a small delay can be achieved even with very limited flexibility, where each resource is capable of serving only a vanishingly small fraction of all demand types. However, the system architecture and scheduling policy need to be chosen more carefully compared to the case of a fully flexible system. (Chapters 3 and 4.) 2. Future information in flexible systems. We show that delay performance in a partially flexible system can be significantly improved by having access to predictive information about future inputs. When future information is sufficient, we provide an optimal scheduling policy under which delay stays bounded in heavy-traffic. Conversely, we show that as soon as future information becomes insufficient, delay diverges to infinity under any policy. (Chapters 5 and 6.) 3. Decentralized partial pooling. For the family of Partial Pooling flexible architectures, first proposed and analyzed by [84], we demonstrate that a decentralized scheduling policy can achieve the same heavy-traffic delay scaling as an optimal centralized longest-queue-first policy used in prior work. This demonstrates that asymptotically optimal performance can be achieved in a partially flexible system with little information sharing. Our finding, which makes use of a new technical result concerning the limiting distribution of an M/M/1 queue fed by a superposition of input processes, strengthens the result of [84], and provides a simpler line of analysis. (Chapter 7.)
[发布日期] [发布机构] Massachusetts Institute of Technology
[效力级别] [学科分类]
[关键词] [时效性]