3KG v2: Universal Electrocardiogram Representations for Label-Efficient Phenotype Discovery




We propose 3KG v2, a new self-supervised learning method for universal representation learning of electrocardiograms (ECGs). This method builds upon its predecessor by featuring a new contrastive objective and transformation space. We assess the quality of representations generated by this algorithm by transferring pretrained models from a large public dataset to various downstream tasks, including some without prior clinical association with ECGs. Performance evaluation is conducted in both few-shot and full-data settings to account for limited and complete training data availability, respectively. For each task, we perform a linear evaluation to assess the effectiveness of the pretrained representations. For tasks in the full-data setting, we additionally perform full fine-tuning to determine the performance ceiling of each method in a practical deployment scenario.
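As an illustration of the linear evaluation protocol mentioned above, the following is a minimal sketch and not the thesis's implementation. It assumes a frozen encoder whose weights are never updated and trains only a logistic-regression probe on its output features, once on all examples and once on a small subset to mimic a few-shot setting. The encoder (`W_ENC`, `encode`), the synthetic data, the subset size of 20, and all hyperparameters are hypothetical stand-ins chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a pretrained ECG encoder: a fixed random
# projection followed by tanh. The real 3KG v2 encoder is not shown here.
W_ENC = rng.normal(size=(16, 8))


def encode(x):
    """Map raw inputs to frozen features; W_ENC is never updated."""
    return np.tanh(x @ W_ENC)


def fit_linear_probe(features, labels, lr=0.5, steps=500):
    """Train logistic regression by gradient descent on frozen features."""
    w = np.zeros(features.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(features @ w + b)))
        grad = p - labels  # gradient of the log loss w.r.t. the logits
        w -= lr * features.T @ grad / len(labels)
        b -= lr * grad.mean()
    return w, b


def accuracy(w, b, features, labels):
    """Fraction of examples whose predicted class matches the label."""
    return float(np.mean(((features @ w + b) > 0) == (labels == 1)))


# Synthetic toy data (not ECGs): labels are linear in the frozen features.
x = rng.normal(size=(200, 16))
feats = encode(x)
true_w = rng.normal(size=8)
y = (feats @ true_w > 0).astype(float)

w_before = W_ENC.copy()
w_full, b_full = fit_linear_probe(feats, y)          # full-data setting
w_few, b_few = fit_linear_probe(feats[:20], y[:20])  # few-shot setting

acc_full = accuracy(w_full, b_full, feats, y)
acc_few = accuracy(w_few, b_few, feats[20:], y[20:])  # held-out examples
print(f"full-data accuracy: {acc_full:.3f}, few-shot accuracy: {acc_few:.3f}")
```

Full fine-tuning would differ from this sketch by also updating the encoder weights during training, which is why it is described as a performance ceiling rather than a test of the frozen representations.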

Our results demonstrate that 3KG v2 consistently outperforms a randomly initialized model trained solely on the target task across all downstream tasks and settings. Specifically, we achieve state-of-the-art performance in few-shot diagnosis of right ventricular function and aortic valve stenosis, two conditions that typically require large amounts of labeled ECGs for effective model training. Moreover, we find that fine-tuning 3KG v2 on our source task’s labels can lead to exceptional transfer capability across a variety of tasks. Notably, our model demonstrates a high level of accuracy in predicting left atrial volume index, achieving a 0.720 C-index even in a few-shot setting. This achievement appears to be unprecedented, as we are not aware of any other ECG model that performs well on this task. While 3KG v2 outperforms our baselines and shows promising results on the majority of phenotype discovery tasks, there is still room for improvement in the absolute performance of any ECG model on many of these complex tasks. Further research is warranted to continue developing and improving such methods for phenotype discovery tasks on ECGs.
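For readers unfamiliar with the C-index reported above, the sketch below shows one common way to compute a concordance index for a continuous target. This record does not state which variant the thesis uses, so the definition here is an assumption: the uncensored pairwise form, in which pairs with tied targets are skipped and tied predictions count as half concordant. The function name `concordance_index` and the example values are illustrative only.

```python
from itertools import combinations


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs that y_pred ranks in the same order as y_true."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(y_true)), 2):
        if y_true[i] == y_true[j]:
            continue  # a pair with tied targets is not comparable
        comparable += 1
        d_true = y_true[i] - y_true[j]
        d_pred = y_pred[i] - y_pred[j]
        if d_pred == 0:
            concordant += 0.5  # tied predictions count as half concordant
        elif (d_true > 0) == (d_pred > 0):
            concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Illustrative values only, not data from the thesis.
perfect = concordance_index([1, 2, 3], [0.1, 0.2, 0.3])
reversed_order = concordance_index([1, 2, 3], [0.3, 0.2, 0.1])
constant = concordance_index([1, 2, 3], [0.5, 0.5, 0.5])
```

Under this definition, a value of 0.5 corresponds to an uninformative ranking and 1.0 to a perfect one, which gives context for a reported value such as 0.720.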


Type of resource text
Date modified December 11, 2023
Publication date November 20, 2023


Author Gopal, Bryan
Advisor Ng, Andrew
Thesis advisor Piech, Chris
Research team head Rajpurkar, Pranav
Research team head Tison, Geoffrey H (ORCID: https://orcid.org/0000-0002-0310-3326, unverified)


Subject Machine learning
Subject Deep learning (Machine learning)
Subject Artificial intelligence
Subject Cardiology
Subject Electrocardiography
Subject Electrocardiography > Interpretation
Subject Echocardiography
Subject Echocardiography > Digital techniques
Subject Few-shot Learning
Subject Phenotype Discovery
Subject Universal Representations
Genre Text
Genre Thesis

Bibliographic information

Access conditions

Use and reproduction
User agrees that, where applicable, content will not be used to identify or to otherwise infringe the privacy or confidentiality rights of individuals. Content distributed via the Stanford Digital Repository may be subject to additional license and use restrictions applied by the depositor.
This work is licensed under a Creative Commons Attribution Non Commercial 4.0 International license (CC BY-NC).

Preferred citation

Gopal, B., Ng, A., Piech, C., Rajpurkar, P., and Tison, G. (2023). 3KG v2: Universal Electrocardiogram Representations for Label-Efficient Phenotype Discovery. Stanford Digital Repository. Available at https://purl.stanford.edu/cs342wn7296. https://doi.org/10.25740/cs342wn7296.


Undergraduate Theses, School of Engineering
