Submodular optimization in massive datasets