Learning Object-Centric Visual Representations for Common Sense Reasoning