Robust learning and evaluation in sequential decision making