Inference on the generalization error of machine learning algorithms and the design of hierarchical medical term embeddings