Efficient reinforcement learning with value function generalization