Hardware-aware algorithms for efficient machine learning