Data representations and data quality in the context of machine learning and healthcare