3D scene understanding with efficient spatio-temporal reasoning