It has been nearly two months since I started grad school, and I have not yet told you what I am working on. (Well, some of you know, but I still have not announced it.) The project is called “Bounded-Parameter Markov Decision Processes”, and it deals with decision making under a particular kind of uncertainty. The broad context is the optimization of technical systems under uncertainty, and this is one approach to that general problem.
So, what is it all about? It begins with Markov chains, a fairly old formalism that describes a system with a set of states, like “lights on” and “lights off”, together with the probabilities that the system moves from one state to another after one unit of time. With some linear algebra and probability theory, it is possible to compute the behavior of such a system.

Then we extend this formalism into a Markov decision process (MDP): now, in each state, we can make a decision and execute an action from some given action set; in our case these actions might be “flip switch” and “do nothing” (as you can see, even in this simple model inaction is also a decision and has consequences). Each action has a cost attached to it (costs may also be negative, thus acting as rewards) and alters the state in its own way: in our example, flipping the switch means a state transition, while doing nothing means staying in the previous state. This leads to the obvious question of finding a strategy that minimizes the total cost. And for this case, too, one can employ Mighty Linear Algebra and several algorithms that deliver optimal strategies.
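To make this a bit more concrete, here is a tiny sketch of value iteration, one of the standard algorithms for computing an optimal strategy in such a model, applied to the light-switch example. All the concrete numbers (transition probabilities, costs, discount factor) are made up purely for illustration and are not taken from any real model.

```python
# A minimal value-iteration sketch for the light-switch MDP.
# All numbers here are illustrative assumptions.

STATES = ["on", "off"]
ACTIONS = ["flip", "nothing"]
GAMMA = 0.9  # discount factor for future costs

# P[s][a] = {next_state: probability}; the switch is assumed to be
# slightly sticky, so "flip" only succeeds with probability 0.95.
P = {
    "on":  {"flip": {"off": 0.95, "on": 0.05}, "nothing": {"on": 1.0}},
    "off": {"flip": {"on": 0.95, "off": 0.05}, "nothing": {"off": 1.0}},
}

# C[s][a] = immediate cost: leaving the lights on burns electricity,
# flipping the switch takes a little effort.
C = {
    "on":  {"flip": 0.5, "nothing": 1.0},
    "off": {"flip": 0.5, "nothing": 0.0},
}

def q_value(V, s, a):
    """Expected cost of taking action a in state s, then following V."""
    return C[s][a] + GAMMA * sum(p * V[t] for t, p in P[s][a].items())

def value_iteration(tol=1e-8):
    """Repeat the Bellman update until the value function stops changing."""
    V = {s: 0.0 for s in STATES}
    while True:
        V_new = {s: min(q_value(V, s, a) for a in ACTIONS) for s in STATES}
        if max(abs(V_new[s] - V[s]) for s in STATES) < tol:
            return V_new
        V = V_new

V = value_iteration()
# The strategy that is greedy with respect to V is optimal.
policy = {s: min(ACTIONS, key=lambda a: q_value(V, s, a)) for s in STATES}
print(V)       # expected total discounted cost from each state
print(policy)  # here: flip the switch when the lights are on, otherwise do nothing
```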
Now, my problem is a little more complicated. In my case, the transition probabilities are not known exactly; one only knows lower and upper bounds on them, which in the most general case (bounds of zero and one) means arbitrary transition probabilities. In this setting it is far less obvious what the optimal strategy is and how to find it.
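To give a flavor of what changes, here is a small sketch of a single pessimistic (worst-case) backup when only interval bounds on the transition probabilities are known: “nature” may pick any distribution inside the bounds, and we assume it picks the one that hurts us most. This is just one building block of the interval variants of value iteration, not the full algorithm I am working with, and again all the numbers are invented.

```python
# Sketch of a single worst-case backup under interval-bounded transition
# probabilities. Bounds and cost-to-go values are invented for illustration;
# the bounds are assumed to admit at least one proper distribution
# (sum of lower bounds <= 1 <= sum of upper bounds).

def worst_case_distribution(lower, upper, values):
    """Pick the distribution within the per-state bounds that maximizes the
    expected cost-to-go, by greedily pushing the free probability mass
    toward the most expensive successor states."""
    dist = dict(lower)                      # start from the lower bounds
    remaining = 1.0 - sum(lower.values())   # mass still to be distributed
    for s in sorted(values, key=values.get, reverse=True):  # costly states first
        extra = min(upper[s] - lower[s], remaining)
        dist[s] += extra
        remaining -= extra
    return dist

# Interval bounds for the successors of one (state, action) pair, plus a
# current estimate of the cost-to-go of each successor state:
lower = {"on": 0.25, "off": 0.5}
upper = {"on": 0.5, "off": 0.75}
V = {"on": 5.0, "off": 1.0}

dist = worst_case_distribution(lower, upper, V)
print(dist)  # {'on': 0.5, 'off': 0.5} -- as much mass as allowed on the expensive state
print(sum(p * V[s] for s, p in dist.items()))  # 3.0, the pessimistic value of this pair
```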
But the thing that bothers me most is that I would actually like to get a joint Math/CS degree afterwards, since what I am doing is basically lots of (applied) math 🙂