Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Investigating the ability of machine learning techniques to provide insight into the aetiology of complex psychiatric genetic disorders

Vivian-Griffiths, Timothy 2017. Investigating the ability of machine learning techniques to provide insight into the aetiology of complex psychiatric genetic disorders. PhD Thesis, Cardiff University.
Item availability restricted.

PDF - Accepted Post-Print Version (4MB)
PDF - Supplemental Material (156kB), restricted to repository staff only


One of the biggest challenges in psychiatric genetics is examining how interactions between genetic variants affect the aetiologies of complex disorders. Current techniques consider linear combinations of the variants, because considering all possible combinations of interactions is computationally infeasible. The work in this thesis attempted to address this problem using a machine learning model, the Support Vector Machine (SVM). SVMs can either build linear models or use kernel methods to consider the effects of interactions. The dataset used for all of the experiments was taken from a study of people with treatment-resistant schizophrenia receiving the medication clozapine, with controls taken from the Wellcome Trust Case/Control Consortium study. The first experiment used individual Single Nucleotide Polymorphisms (SNPs) as inputs to the SVMs and compared the results with a polygenic score, a linear combination of the risk contributions of the SNPs that provides a single risk score for each individual. When more SNPs were entered into the models, one of the non-linear kernels gave better results than the linear SVMs. The second experiment attempted to explain this behaviour using simulated phenotypes built from different contributions of main effects and interactions. The results strongly suggested that interactions were making a contribution. The final experiment used risk scores built from gene sets. The models identified a set involved in synaptic development that has previously been implicated in schizophrenia, and when the scores from the individual genes were entered, the non-linear kernels again showed improvement, suggesting that interactions occur between these genes. The conclusion was that SVMs are an effective way to assess the possible presence of interactions before searching for them explicitly.
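
The contrast the abstract draws between a polygenic score and interaction effects can be illustrated with a toy calculation. The following is a minimal sketch and not the thesis's method: it uses no SVM, and the two-SNP genotype grid, the weights, and the trait definition are all hypothetical values chosen for illustration. It shows that a trait driven purely by an interaction between two SNPs is uncorrelated with an additive score, while a product term picks it up.

```python
from itertools import product
from math import sqrt


def polygenic_score(genotype, weights):
    """Linear combination of risk-allele counts (0, 1 or 2) for one individual."""
    if len(genotype) != len(weights):
        raise ValueError("genotype and weights must have the same length")
    return sum(w * g for w, g in zip(weights, genotype))


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy population: every combination of two SNPs, each coded 0/1/2.
# (A uniform grid, not realistic genotype frequencies.)
genotypes = list(product((0, 1, 2), repeat=2))

# Hypothetical per-SNP weights; in practice these would be estimated effect sizes.
weights = (0.3, 0.5)

# Hypothetical trait with no main effects, only an interaction between the SNPs.
trait = [(g1 - 1) * (g2 - 1) for g1, g2 in genotypes]

additive_scores = [polygenic_score(g, weights) for g in genotypes]
product_terms = [g1 * g2 for g1, g2 in genotypes]

r_additive = pearson(additive_scores, trait)
r_product = pearson(product_terms, trait)

print(f"additive score vs trait: r = {r_additive:.3f}")
print(f"product term vs trait:   r = {r_product:.3f}")
```

On this grid the additive score carries no information about the trait, whatever weights are chosen, whereas the product term does. A non-linear kernel can represent such product terms implicitly, without enumerating every pair of variants.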

Item Type: Thesis (PhD)
Status: Unpublished
Schools: Medicine
Subjects: R Medicine > R Medicine (General)
Date of First Compliant Deposit: 24 May 2017
Last Modified: 28 Jun 2019 02:38
