The goal of this programming exercise was to was to deliver a package with a drone as quickly as possible when the system dynamics are subject to probabilistic uncertainty.
The drone operates in a discretized world of M x N cells and its dynamics at a certain time step are modelled through three state variables, namely its position on the x and y-axis and a binary variable telling if the UAV is carrying the package or not. At time step k the system evolves in the following way:
- One of the allowable control inputs uk is applied. In particular uk ∈ {North, South, East, West, Hover}. However the control input is not allowed if it leads ouside the world boundaries or against and obstacle;
- From the new desired position, a gust of wind, occurring with probability pwind, moves the drone either North, South, East and West uniformly at random;
- The drone crashes if it ends up outside the grid world boundaries or against the obstacles;
- Whenever a drone crashes it is brought to a base station carrying no package. The fixing procedure takes Nc time steps.
Find the policy minimizing the expected number of time steps needed to deliver a package by using the following methods for the Infinite Horizon problem:
- Value Iteration;
- Policy Iteration;
- Linear Programming.
Further Info in "Assignment.pdf"