class: center, middle, inverse, title-slide

# NetCoupler: Inferring causal pathways between high-dimensional metabolic data and external factors

### Luke W. Johnston
Steno Diabetes Center Aarhus, Denmark

---
layout: true

<!-- To the Danish Epidemiological Society -->

<div class="my-header"></div>

<div class="my-footer">
<span>
<img src="../../common/dda-logo.png" alt="DDA" width="75">
<img src="../../common/sdca-logo.png" alt="SDCA" width="55">
<a href="https://slides.lwjohnst.com/des/2021-05-21/">slides.lwjohnst.com/des/2021-05-21</a>
</span>
</div>

---

# Outline:

- History and background on the analytic problem
- NetCoupler implementation and development
- Examples using NetCoupler
- Current challenges

---
class: middle

# History and background on the analytic problem

---

## Amount of data generated is massive

.pull-left[

- -omics type data
- Metabolic biomarkers

]

.pull-right[

- High dimensionality
- Complex networks

]

???

We want to ask questions using this type of data, so we need some way to handle that complexity.

---

## Potential analysis: Dimensionality reduction

<div class="figure" style="text-align: center">
<img src="../../au-ph/2019-08-15/images/pca.png" alt="Reducing number of variables with PCA." width="280" />
<p class="caption">Reducing number of variables with PCA.</p>
</div>

???

This has the advantage of making things simpler while retaining as much of the variance in the data as possible. Afterward you can do modelling on each principal component. The disadvantage of this approach is that it loses a lot of information, since the interdependence and connections between variables are not maintained.

---

## Potential analysis: Many regression-type models

.center[
```
O1 = M1 + covariates
O1 = M2 + covariates
...
O1 = M7 + covariates
O1 = M8 + covariates
```
]

???

One way you might go about analyzing this data is by running many regression models, for instance one for each metabolic variable. This has problems, since you're simply running a bunch of models and not taking into account the inherent interdependencies between variables.
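
As a rough illustration of the one-model-per-metabolite approach above, here is a minimal Python sketch on simulated data. Everything in it is an assumption made for illustration: the `solve` and `ols` helpers, the single `age` covariate, the simulated values, and the choice that only M1 affects the outcome. It is not part of NetCoupler and does not reflect how NetCoupler fits its models.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(y, columns):
    """Least-squares coefficients for y = intercept + columns, via the normal equations.

    `columns` is a list of predictor columns; the returned list starts with the intercept.
    """
    design = [[1.0] * len(y)] + columns
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in design] for ci in design]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in design]
    return solve(xtx, xty)


# Simulated (hypothetical) data: eight metabolites M1..M8 and one covariate.
random.seed(1)
n = 200
age = [random.gauss(50, 10) for _ in range(n)]
metabolites = {f"M{i}": [random.gauss(0, 1) for _ in range(n)] for i in range(1, 9)}

# Assumed truth for this simulation only: the outcome O1 depends on M1 and age.
outcome = [
    2.0 * m1 + 0.05 * a + random.gauss(0, 0.1)
    for m1, a in zip(metabolites["M1"], age)
]

# One model per metabolite: O1 = Mi + covariates. Keep the metabolite coefficient.
estimates = {
    name: ols(outcome, [values, age])[1] for name, values in metabolites.items()
}

for name, estimate in estimates.items():
    print(f"{name}: {estimate:.3f}")
```

Each model is fitted in isolation, so nothing in `estimates` reflects how the metabolites relate to one another, which is the limitation described above.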
---

## Potential analysis: Network analysis

<img src="index_files/figure-html/img-traditional-network-analysis-1.png" width="50%" height="50%" style="display: block; margin: auto;" />

???

This approach is nice in that you can extract information about the connections between metabolic variables. But there is no way to incorporate the disease outcome with this approach. Also, in order to construct the network properly, most methods require you to provide a prespecified base network, which you might not know.

---

## But, what if...

- want info about network structure?

--

- don't know the network structure?

--

- have an exposure, metabolites, and outcome?

---

## ... what you want to know is something like this?

<img src="../../au-ph/2019-08-15/images/network.png" width="75%" style="display: block; margin: auto;" />

---
class: middle

# (Potential) solution: NetCoupler

---

## Initial development

.pull-left[

- Developed by Clemens Wittenbecher for his [PhD thesis](https://publishup.uni-potsdam.de/opus4-ubp/frontdoor/deliver/index/docId/40459/file/wittenbecher_diss.pdf)
- Algorithm that:
    - Finds most likely network structure
    - Allows inclusion of exposure and outcome
    - Identifies causal links from, to, and within the network

]

.pull-right[

![Clemens Wittenbecher](https://avatars3.githubusercontent.com/u/33724052?size=200)

]

---

## Four basic phases of the algorithm

<img src="../../iarc/2020-12-16/images/netcoupler-process.svg" width="90%" style="display: block; margin: auto;" />

???

A key feature is that the algorithm is flexible enough to handle different types of models.

---

## Infer (potentially) causal pathways with graphical model output

<img src="../../au-ph/2019-08-15/images/nc-causal-pathways.png" width="85%" style="display: block; margin: auto;" />

---

## Developing NetCoupler as an R package

--

.center[
<img src="../../iarc/2020-12-16/images/netcoupler-github.png" width="70%" style="display: block; margin: auto;" />
]

.footnote[

- [github.com/NetCoupler](https://github.com/NetCoupler/NetCoupler)
- **Goal**: Submit to CRAN by mid-2021.

]

???

Met Clemens at EDEG and asked whether it was an R package. We started working together after that.

---
class: middle

# Example projects using NetCoupler

---

## UK Biobank: Metabolic pathways between components of stature and HbA1c

<img src="../../iarc/2020-12-16/images/aim-ukbiobank.svg" height="90%" style="display: block; margin: auto;" />

---

## UK Biobank characteristics

.pull-left[

- Basics:
    - ~480,000 participants
    - Cross-sectional
    - Stature measures
    - Various demographics

]

.pull-right[

- Metabolic variables:
    - Cholesterol
    - Albumin
    - Alanine Aminotransferase
    - Apolipoprotein A and B
    - Aspartate Aminotransferase
    - C-reactive Protein
    - Gamma Glutamyltransferase
    - HDL and LDL Cholesterol
    - Triglycerides

]

---

## Link between leg length, liver markers, HbA1c

<img src="../../iarc/2020-12-16/images/ukbiobank-netcoupler-results.png" width="60%" style="display: block; margin: auto;" />

---
class: middle

# Current challenges

---

## Several limitations and possible improvements

- **Mainly**, performance can be slow
    - E.g. larger data or networks

--

- Untested on larger networks

--

- Untested on non-cross-sectional/time-to-event data

--

- Visualizing can be tricky

--

- Interpreting estimates can be tricky

---
class: middle

# Thanks! Questions?