Item talk:Q151698
State-dependent resource harvesting with lagged information about system states
Markov decision processes (MDPs), which involve a temporal sequence of actions conditioned on the state of the managed system, are increasingly being applied in natural resource management. This study focuses on the modification of a traditional MDP to account for those cases in which an action must be chosen after a significant time lag in observing system state, but just prior to a new observation. To calculate an optimal decision policy under these conditions, possible actions must be conditioned on the previously observed system state and the action taken at that time. We show how to solve these problems when the state transition structure is known and when it is uncertain. Our focus is on the latter case, and we show how actions must be conditioned not only on the previous system state and action, but also on the probabilities associated with alternative models of system dynamics. To demonstrate this framework, we calculated and simulated optimal, adaptive policies for MDPs with lagged states for the problem of deciding annual harvest regulations for mallards (Anas platyrhynchos) in the United States. In this particular example, changes in harvest policy induced by the use of lagged information about system state were sufficient to maintain expected management performance (e.g., population size, harvest) even in the face of an uncertain system state at the time of a decision.
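The known-transition case described above can be illustrated with a minimal sketch. It is not the authors' implementation; it assumes a discounted, infinite-horizon formulation solved by value iteration, and every name and number in it is hypothetical (the function `solve_lagged_mdp`, the three population states, the two harvest actions, the transition table `P`, and the reward table `R`). The key idea it shows is that the decision is indexed by the pair (previous observed state, previous action): that pair determines a probability distribution over the current, still unobserved state, and the expected return of each candidate action is averaged over that distribution.

```python
def solve_lagged_mdp(P, R, gamma=0.95, tol=1e-9, max_iter=10_000):
    """Value iteration over augmented states (s_prev, a_prev).

    P[a][s][s2]: probability of moving from state s to s2 under action a.
    R[s][a]: reward for taking action a when the true (unobserved) state is s.
    Returns V[s_prev][a_prev] and policy[s_prev][a_prev].
    """
    n_actions = len(P)
    n_states = len(P[0])
    V = [[0.0] * n_actions for _ in range(n_states)]
    policy = [[0] * n_actions for _ in range(n_states)]
    for _ in range(max_iter):
        delta = 0.0
        new_V = [[0.0] * n_actions for _ in range(n_states)]
        for s_prev in range(n_states):
            for a_prev in range(n_actions):
                # Distribution over the current state, which has not been
                # observed yet when the action must be chosen.
                belief = P[a_prev][s_prev]
                best_value, best_action = float("-inf"), 0
                for a in range(n_actions):
                    # Once the current state s is observed, the next decision
                    # is made from the augmented state (s, a).
                    q = sum(
                        belief[s] * (R[s][a] + gamma * V[s][a])
                        for s in range(n_states)
                    )
                    if q > best_value:
                        best_value, best_action = q, a
                new_V[s_prev][a_prev] = best_value
                policy[s_prev][a_prev] = best_action
                delta = max(delta, abs(best_value - V[s_prev][a_prev]))
        V = new_V
        if delta < tol:
            break
    return V, policy


# Hypothetical toy problem: states 0/1/2 = low/medium/high population,
# actions 0/1 = restrictive/liberal harvest regulations.
P = [
    [[0.6, 0.4, 0.0], [0.1, 0.5, 0.4], [0.0, 0.2, 0.8]],  # restrictive
    [[0.9, 0.1, 0.0], [0.5, 0.4, 0.1], [0.1, 0.5, 0.4]],  # liberal
]
R = [[0.0, 1.0], [1.0, 3.0], [2.0, 5.0]]  # illustrative harvest values

V, policy = solve_lagged_mdp(P, R)
for s_prev in range(3):
    for a_prev in range(2):
        print(f"last state {s_prev}, last action {a_prev} -> action {policy[s_prev][a_prev]}")
```

Under the same assumptions, the uncertain-dynamics case could be sketched by extending the augmented state with a vector of model probabilities and averaging the transition tables of the alternative models by those probabilities; that extension is omitted here to keep the example short.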