Pre-finetuning methods for domain and task adaptation, with applications to discourse and translation