
NetCoupler: Inferring causal pathways between high-dimensional metabolic data and external factors

Luke W. Johnston

Steno Diabetes Center Aarhus, Denmark

1 / 21

Outline:

  • History and background on the analytic problem

  • NetCoupler implementation and development

  • Examples using NetCoupler

  • Current challenges

2 / 21

History and background on the analytic problem

3 / 21

Amount of data generated is massive

  • -omics type data

  • Metabolic biomarkers

  • High dimensionality

  • Complex networks

4 / 21

We want to ask questions of this type of data, so we need some way to handle that complexity.

Potential analysis: Dimensionality reduction

Reducing number of variables with PCA.

5 / 21

This has the advantage of making things simpler while retaining as much of the variance in the data as possible. Afterward you can model each principal component. The disadvantage is that it loses a lot of information, since the interdependence and connections between variables are not maintained.
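To make the trade-off concrete, here is a minimal sketch of PCA on simulated data. Everything in it is an assumption for illustration: the four variables (M1 to M3 sharing one latent driver, M4 as pure noise), the sample size, and the use of power iteration to find only the first component. It is not taken from the slides or from NetCoupler.

```python
import math
import random


def center(rows):
    """Subtract each column mean so every variable has mean zero."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def first_component(cov, iterations=500):
    """Loadings of the first principal component via power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Simulated data (hypothetical): M1 to M3 share one latent driver, M4 is noise.
random.seed(1)
rows = []
for _ in range(200):
    latent = random.gauss(0, 1)
    rows.append([latent + random.gauss(0, 0.3) for _ in range(3)]
                + [random.gauss(0, 1)])

centered = center(rows)
cov = covariance(centered)
loadings = first_component(cov)

# Component score per participant: one number replacing four variables.
scores = [sum(l * x for l, x in zip(loadings, r)) for r in centered]
pc1_variance = sum(s * s for s in scores) / (len(scores) - 1)
explained = pc1_variance / sum(cov[j][j] for j in range(len(cov)))
print("loadings:", [round(l, 2) for l in loadings])
print("share of variance on PC1:", round(explained, 2))
```

The score is a weighted blend of all four variables, so any model fitted to it can no longer say which metabolite, or which link between metabolites, is doing the work.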

Potential analysis: Many regression-type models

O1 = M1 + covariates
O1 = M2 + covariates
...
O1 = M7 + covariates
O1 = M8 + covariates
6 / 21

One way you might analyze this data is to run many regression models, for instance one for each metabolic variable.

This of course has problems, since you're simply running a bunch of separate models and not accounting for the inherent interdependencies between variables.
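A small sketch of the "one model per metabolite" pattern, under stated assumptions: the data are simulated, `age` stands in for the covariates, and only M1 truly affects the outcome while M2 is merely correlated with M1. The helper names (`slope`, `residuals`, `adjusted_slope`) are made up for this example. The adjustment uses the Frisch-Waugh-Lovell result, which gives the same metabolite coefficient as fitting outcome ~ metabolite + covariate.

```python
import random


def slope(x, y):
    """Ordinary least squares slope of y on x (model includes an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Part of y not explained by a straight-line fit on x."""
    b = slope(x, y)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


def adjusted_slope(metabolite, outcome, covariate):
    """Coefficient of the metabolite in outcome ~ metabolite + covariate.

    Regress the covariate out of both variables, then regress the
    residuals on each other (Frisch-Waugh-Lovell).
    """
    return slope(residuals(covariate, metabolite),
                 residuals(covariate, outcome))


# Simulated data (hypothetical): only M1 has a real effect on the outcome.
random.seed(2)
n = 500
age = [random.gauss(50, 10) for _ in range(n)]
m1 = [0.02 * a + random.gauss(0, 1) for a in age]
m2 = [0.8 * x + random.gauss(0, 0.5) for x in m1]   # tracks M1, no own effect
m3 = [random.gauss(0, 1) for _ in range(n)]          # unrelated to everything
outcome = [1.0 * x + 0.05 * a + random.gauss(0, 1) for x, a in zip(m1, age)]

metabolites = {"M1": m1, "M2": m2, "M3": m3}

# One separate model per metabolite, each adjusted for age only.
estimates = {name: adjusted_slope(values, outcome, age)
             for name, values in metabolites.items()}
for name, estimate in estimates.items():
    print(name, round(estimate, 2))
```

In this setup M2 gets a sizeable estimate even though it has no effect of its own, purely because it is linked to M1. That is the interdependence the separate models cannot see.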

Potential analysis: Network analysis

7 / 21

This approach is nice in that you can extract information about the connections between metabolic variables. But it offers no way to incorporate the disease outcome, and to construct the network properly most methods require a prespecified base network, which you might not know.
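As a rough illustration of what a network built from metabolic variables looks like, the sketch below links any two variables whose correlation passes a threshold. This is a deliberately simple stand-in chosen for the example, not the method shown on the slide and not NetCoupler; the simulated chain M1 → M2 → M3, the isolated M4, and the 0.5 threshold are all assumptions.

```python
import itertools
import math
import random


def correlation(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_network(data, threshold=0.5):
    """Return {(a, b): r} for every pair with |r| at or above the threshold."""
    edges = {}
    for a, b in itertools.combinations(sorted(data), 2):
        r = correlation(data[a], data[b])
        if abs(r) >= threshold:
            edges[(a, b)] = r
    return edges


# Simulated data (hypothetical): a chain M1 -> M2 -> M3, with M4 unconnected.
random.seed(3)
n = 300
m1 = [random.gauss(0, 1) for _ in range(n)]
m2 = [0.8 * x + random.gauss(0, 0.6) for x in m1]
m3 = [0.8 * x + random.gauss(0, 0.6) for x in m2]
m4 = [random.gauss(0, 1) for _ in range(n)]

data = {"M1": m1, "M2": m2, "M3": m3, "M4": m4}
edges = correlation_network(data)
for (a, b), r in edges.items():
    print(a, "--", b, round(r, 2))
```

A plain correlation threshold will typically also link M1 and M3, which are only connected through M2, so it cannot separate direct from indirect links. Note too that the outcome appears nowhere in this network.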

But, what if...

  • want info about network structure?

  • don't know the network structure?

  • have an exposure, metabolites, and outcome?

8 / 21

... and what you want to know is something like this?

9 / 21

(Potential) solution: NetCoupler

10 / 21

Initial development

  • Developed by Clemens Wittenbecher for his PhD thesis

  • Algorithm that:

    • Finds most likely network structure
    • Allows inclusion of exposure and outcome
    • Identifies causal links from, to, and within the network

Clemens Wittenbecher

11 / 21

Four basic phases of the algorithm

12 / 21

A key point is that the algorithm is flexible enough to handle different types of models.

Infer (potentially) causal pathways with graphical model output

13 / 21

Developing NetCoupler as an R package

14 / 21

I met Clemens at EDEG and asked if NetCoupler was an R package. We started working together after that.

Example projects using NetCoupler

15 / 21

UK Biobank: Metabolic pathways between components of stature and HbA1c

16 / 21

UK Biobank characteristics

  • Basics:
    • ~480,000 participants
    • Cross-sectional
    • Stature measures
    • Various demographics
  • Metabolic variables:
    • Cholesterol
    • Albumin
    • Alanine Aminotransferase
    • Apolipoprotein A and B
    • Aspartate Aminotransferase
    • C-reactive Protein
    • Gamma Glutamyltransferase
    • HDL and LDL Cholesterol
    • Triglycerides
17 / 21

18 / 21

Current challenges

19 / 21

Several limitations and areas for improvement

  • Mainly, performance can be slow
    • E.g. with larger datasets or networks
  • Untested on larger networks

  • Untested on non-cross-sectional/time-to-event data

  • Visualizing can be tricky

  • Interpreting estimates can be tricky

20 / 21

Thanks! Questions?

21 / 21
