Implementing an open and scalable infrastructure for the DD2 data

I assume you're all familiar with the general purpose of this project: to help maximize the utility and use of the DD2 resource for researchers and, eventually, for diabetes patients. So I'll get into the aims right away, briefly describe what the project looks like at a conceptual level, go over the timeline, and cover where we are right now and the immediate next steps. Then I have a couple of questions that I'd like us to discuss.

We'll keep this really informal, so just jump in and ask questions whenever you want.

Aims of the Data Infrastructure Framework (DIF) Project

  1. Primary aim: Create and implement an efficient, scalable, and open source data infrastructure framework that connects multiple stakeholders with the data, documentation, and findings

  2. Secondary aim: Create this framework so that other research groups and companies that are unable to build something similar can relatively easily implement it and modify it as needed for their own purposes.

In short: Make a software product that makes it easier to find, store, and use data for research projects that abide by best practices, and make it easy and free for others to use.

We're still working out a better name, but for now we're calling it DIF

These aims are for the full project itself, and may seem vague, but bear with me.

Just to clarify, infrastructure here means the computational structure of the data and all its support structures: for instance, how the files and folders are organized, where the data files are saved and in what file format, and how to connect to the data. In many ways it's like the roads and buildings of a city, where the data is the people moving about.

"Framework", on the other hand, is the bundle or package that contains the instructions for creating an infrastructure, which someone can take and use to create the infrastructure somewhere else. You can think of it as the blueprint for building a city.

Guiding principles

  1. Follow and enable FAIR principles

  2. Openly licensed and re-usable (e.g. CC-BY, MIT)

  3. State-of-the-art principles and tools in software and UI design

  4. Built from software that may be more familiar to researchers/academia

  5. Friendly to beginner and non-technical users

FAIR = Findable, Accessible, Interoperable, Reusable

General timeline

The full 5-year timeline is available on the website.

By User 1 I mean any process that gets the data into the format needed for the backend. I'm aware there are already well-developed pipelines for getting DD2 data into a database format, right? So we'll need some way of connecting such a pipeline to feed into the DIF's backend.

This timeline was stretched to account for various potential delays, so the estimates could realistically be shortened by maybe 30-40%... but it's best to stay conservative.

Once we've gotten to the User 1 MVP, we'd like to start testing it out on DD2 to find any potential issues. So within the next year to year and a half, we could begin meaningfully contributing to the DD2 database.

Next steps

  • Already hired RSE and DBA starting Sept

  • Onboard the team, have two-day welcome and brainstorming session in Sept

    • Detail and agree on tasks for next several months
    • Agree on longer term plan
  • Aim for "Minimum Viable Product" of first component within ~1 to 1.5 years

General questions on next steps

  • Company helping out with this?

    • How can we fit in with those plans?
  • Co-hiring data manager for DD2 and this project, any timelines or things to discuss/consider?

Thinking of questions for next steps, some big, immediate ones that come to mind are:

Data manager to map all data, resources, and documentation.
