Posterior sampling for efficient reinforcement learning