EVALUATING MODELS OF REFERRING EXPRESSION PRODUCTION WITH CROSS-LINGUISTIC DATA