Partially Observed Inventory Control

Industrial investigations indicate that errors in inventory records are common and often unavoidable. Such errors cause substantial waste and cost to industry. Inventory control in the presence of such errors is essentially a partially observed decision-making problem. Although robust frameworks such as Partially Observable Markov Decision Processes (POMDPs) have been applied to inventory control, most work applies POMDPs to single-commodity problems or assumes independence between commodities, because problems with large discrete action spaces are difficult to solve. This work applies our method, QBASE, to problems with multiple commodities whose demand levels may be correlated. Numerical experiments on partially observed multi-commodity inventory control problems indicate that our proposed solution can find less conservative inventory control strategies that yield higher profits than existing solutions.
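To make the "partially observed" aspect concrete, the sketch below shows one step of a generic discrete Bayes filter over the true stock level of a single commodity when the recorded level may be wrong. It is only an illustration under simplifying assumptions (one commodity, orders arrive before demand, unmet demand is lost, additive recording error). It is not QBASE and not the POMDP model used in the experiments, and every name in it (`belief_update`, `demand_pmf`, `obs_error_pmf`, and so on) is hypothetical.

```python
def belief_update(belief, order, demand_pmf, obs, obs_error_pmf, capacity):
    """One Bayes-filter step over the true stock level (illustrative only).

    belief[s]        : prior probability that the true stock is s (0..capacity)
    order            : units ordered (the action), assumed to arrive before demand
    demand_pmf[d]    : probability that demand equals d
    obs              : recorded stock level after the step
    obs_error_pmf[e] : probability that (record - true stock) equals e
    """
    # Prediction: push the prior through the assumed stock dynamics.
    predicted = [0.0] * (capacity + 1)
    for s, p in enumerate(belief):
        if p == 0.0:
            continue
        stocked = min(s + order, capacity)  # excess beyond capacity is dropped
        for d, pd in enumerate(demand_pmf):
            s_next = max(stocked - d, 0)    # unmet demand is lost
            predicted[s_next] += p * pd

    # Correction: weight each candidate stock level by the likelihood
    # of seeing the (possibly erroneous) record, then normalise.
    posterior = [
        predicted[s] * obs_error_pmf.get(obs - s, 0.0)
        for s in range(capacity + 1)
    ]
    total = sum(posterior)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [x / total for x in posterior]


# Hypothetical numbers: stock is known to be 2, we order 2 units,
# demand is 0 or 1 with equal probability, and the record is off by
# -1, 0, or +1 with probabilities 0.25, 0.5, 0.25.
prior = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
posterior = belief_update(
    belief=prior,
    order=2,
    demand_pmf=[0.5, 0.5],
    obs=4,
    obs_error_pmf={-1: 0.25, 0: 0.5, 1: 0.25},
    capacity=5,
)
```

With several commodities and correlated demand, both the state and the set of joint order quantities grow combinatorially, which is the difficulty the paragraph above refers to.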


Results of a numerical experiment on a partially observed inventory control problem with 6 different types of commodities. Left: Average total discounted reward, which reflects the average total profit, as time increases. Right: The inventory levels maintained for each type of commodity. A higher inventory level indicates a more conservative strategy.

Details of the POMDP model and experiments are available in the reference below:



  • Erli Wang
  • Hanna Kurniawati
  • Dirk P. Kroese