Assessing the evolution of written language through data mining in large corpora