Automated discovery of novel gene-trait/disease hypotheses
Abstract/Contents
- Abstract
- Advances in genome sequencing technology give us unprecedented access to read and study the genetic material that encodes every living organism. Despite continuous research efforts to fully understand the genome, our knowledge of gene function is still largely incomplete. This poses many problems, especially in clinical settings. It is estimated that over 7 million births each year are affected by genetic disorders worldwide, and even with genome sequencing being available and even affordable, many patients remain undiagnosed. Identifying the genetic cause is a labor-intensive task, especially for disorders that are caused by genes with unknown functions or pathogenicity. This dissertation describes machine learning approaches to improve the discovery rate of novel gene functions in two domains: monogenic disorder diagnosis and inbred mouse strain analysis. These methods highlight the novel gene-phenotype hypotheses that are most likely to be true, in order to inspire further experimental validation and ultimately expand our gene function knowledge for clinical applications. To offer potential novel diagnosis hypotheses for monogenic disease patients who cannot be diagnosed with current patient-oriented knowledgebases, I introduced InpherNet. InpherNet is a gene prioritization classifier that ranks candidate genes based on phenotypic annotations of their biological neighbors. It aims to propose novel pathogenic, disease-causing genes when previously diagnosed patient-based annotation is missing or partial. Inbred mouse strains are carefully maintained populations of mice that have gone through successive sibling mating for over 20 generations. Through this repetitive inbreeding process, each strain became genetically homogeneous while developing strain-specific, distinctive genotypes and phenotypes. Many genetic factors have been discovered by mapping the inter-strain genotype differences against their phenotype differences.
To accelerate novel functional discoveries using inbred mouse strains, I built AIMHIGH (Analysis of Inbred Mouse strains' High-Impact Genotype-phenotype Hypotheses). AIMHIGH uses experiments that measure phenotypic differences among inbred strains to automatically select trait-relevant candidate genes. Any undiscovered gene-phenotype hypotheses are ranked by a literature-based discovery classifier to propose the most promising candidates. Together, these methods employ machine learning to suggest the most exciting testable hypotheses, to accelerate novel gene-trait discovery and improve the diagnostic rate for patients with genetic disorders.
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | 2022; ©2022 |
Publication date | 2022; 2022 |
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Yoo, Boyoung |
---|---|
Degree supervisor | Bejerano, Gill, 1970- |
Thesis advisor | Bejerano, Gill, 1970- |
Thesis advisor | Bernstein, Jonathan A |
Thesis advisor | Kundaje, Anshul, 1980- |
Degree committee member | Bernstein, Jonathan A |
Degree committee member | Kundaje, Anshul, 1980- |
Associated with | Stanford University, Computer Science Department |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Boyoung Yoo. |
---|---|
Note | Submitted to the Computer Science Department. |
Thesis | Thesis Ph.D. Stanford University 2022. |
Location | https://purl.stanford.edu/rk051yn2117 |
Access conditions
- Copyright
- © 2022 by Boyoung Yoo
- License
- This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).