Center for Data Science and Informatics (CDSI)

The amount of data produced is exploding. By some estimates, 2.5 quintillion bytes of data are created every day (a quintillion is 1 followed by 18 zeros). The volume is growing so quickly that 90 percent of the world's data has been produced in the last two years. This explosion is also occurring in all areas of biomedical research. A single human genome contains roughly three billion base pairs, or about six billion counting both parental copies, and a single research study may require analyzing the genome sequences of tens of thousands of patients. Capturing, curating, storing, searching, sharing, transferring, and analyzing these huge data sets is now at the forefront of modern science. New approaches will help expand the impact of informatics technologies on health and disease.

Three pillars of big data

Classically, scientific progress has been anchored on two pillars: Theory and Experimentation. The Big Data revolution has since hit science, as the sheer volume of scientific data increases exponentially. Advances in scientific computing technology, together with Big Data, have created a third pillar: Computation. Data Science brings together these three pillars to accelerate discoveries. In 2012, the Harvard Business Review declared data scientist the "sexiest job of the 21st century." This role brings together deep domain knowledge, a solid foundation in statistical and mathematical methods, advanced computation and visualization technology, and a desire to tackle "wicked problems."

CDSI by the Numbers:

  • NMEDW has supported 858 research projects and more than 150,000 report executions, a 258% increase since 2011
  • 70 packages, with 164 forks, on GitHub
  • Over 60,000 Bioconductor downloads in 2014 alone

Upcoming CDSI Seminars and Training

Aug 17, 10:00am - 12:00pm

Justin Starren, MD, PhD, FACMI

"Our understanding of how networks of genes interact to produce health or disease has been revolutionized in recent years due to advances in bioinformatics. Through projects like eMERGE, Northwestern is a national leader in Biomedical Data Science. Not only are we gaining knowledge about the fundamental mechanisms of disease, we also are applying that knowledge to the care of our patients."
Justin Starren, MD, PhD, FACMI
Chief of the Division of Health and Biomedical Informatics
Deputy Director, NUCATS

Learn More

About

An overview of the formation and contents of CDSI.

Leadership and Governance

An overview of the individuals contributing to leadership and governance of CDSI.

Vision

The vision for CDSI and its future at the center of big data research at Northwestern University.

Resources and Services

Details on the resources and services CDSI provides to researchers and collaborators.

Education and Training

Learn more about education and training available through CDSI.

Success Stories

Find out more about how researchers have used our resources and services to great success. 

Media

External articles on CDSI.