Non-stationary bandit learning: algorithm design and theory