Pitt Turns Rapid Coronavirus Data Sharing into Sustainable Research Infrastructure

By: Erin Hare and Allison Hydzik

Wilbert van Panhuis, M.D., Ph.D., was scrolling through Twitter over the winter holiday break when he noticed chatter among infectious disease epidemiologists about a new virus infecting people in Wuhan, China.

Those tweets spurred a scramble for his team at the University of Pittsburgh to establish a platform for research collaborations and data sharing on what would become the COVID-19 pandemic. Immediately, the researchers started to compile datasets and early research publications into a central COVID-19 repository for the scientific community, and last week, they launched the online portal for COVID-19 modeling research — a clearinghouse for sharing data-driven discoveries about COVID-19.

“Multiple datasets of transcribed case information have emerged early in the outbreak,” van Panhuis said. “Scientists so far have published close to a thousand COVID-19 reports and papers, including 150+ papers reporting estimates of epidemiological characteristics of the outbreak. These numbers are growing every day.”

Wilbert van Panhuis, M.D., Ph.D., assistant professor of epidemiology at the University of Pittsburgh Graduate School of Public Health and of biomedical informatics at Pitt’s School of Medicine.


Van Panhuis directs the Coordination Center for the Models of Infectious Disease Agent Study (MIDAS), a collaborative research network launched by the National Institutes of Health in 2004 to establish U.S. modeling capabilities against infectious disease threats. The Coordination Center, based in Pitt’s Graduate School of Public Health, is charged with building the MIDAS community, providing outreach and training for infectious disease modeling, and establishing the capability for data and model sharing. In short, the Pitt MIDAS Center is a global infectious disease data matchmaking service.

Many of the 300 MIDAS members are conducting modeling research on COVID-19 and are contributing to an extraordinary international collection of data and information regarding the outbreak.

“It’s exciting and gratifying to be able to do something useful to help with this pandemic,” said van Panhuis, also an assistant professor of epidemiology and biomedical informatics at Pitt. “We’re playing a crucial role in bringing the infectious disease modeling research community together to efficiently share information.”

A completely new research culture has emerged during this outbreak. One aspect is rapid sharing of data and model results by community members.

During past outbreaks of diseases such as Ebola, individual MIDAS researchers took on the role of facilitating connections between scientists, data sources and public health officials. This is the first time that role is being filled by a coordinating center in such an organized fashion.

The wealth of data and information emerging from the scientific and public health community can be difficult to navigate, and reports of confusion about data provenance and comparability have appeared in the news and on social media. The MIDAS Coordination Center and the broader MIDAS community are using established data science principles to make COVID-19 data easier to discover and use for the benefit of public health.

The MIDAS Online Portal for COVID-19 Modeling Research contains not only data but also rich, standard metadata (critical information about who collected the data, when, where and how) so researchers can quickly judge the usefulness of each dataset for their projects.
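To illustrate why the who, when, where and how matter, here is a minimal Python sketch of a metadata record and a simple filter over a small catalog. It is not the portal's actual schema or code: the class name, field names, helper function and placeholder entries are all assumptions made up for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DatasetMetadata:
    """Hypothetical metadata record; field names are illustrative only."""
    title: str
    collected_by: str    # who collected the data
    collected_on: date   # when it was collected
    location: str        # where it was collected
    method: str          # how it was collected


def matches(record: DatasetMetadata, location: str, since: date) -> bool:
    """Return True if the record covers the location and is recent enough."""
    return record.location == location and record.collected_on >= since


# Placeholder entries, not real datasets.
records = [
    DatasetMetadata("Placeholder A", "Example Lab", date(2020, 2, 1),
                    "Wuhan", "transcribed case reports"),
    DatasetMetadata("Placeholder B", "Example Agency", date(2020, 1, 5),
                    "Wuhan", "official bulletins"),
    DatasetMetadata("Placeholder C", "Example Lab", date(2020, 2, 10),
                    "Pittsburgh", "hospital reports"),
]

# Keep only Wuhan datasets collected on or after January 15, 2020.
selected = [r.title for r in records if matches(r, "Wuhan", date(2020, 1, 15))]
print(selected)
```

With consistent fields like these, a researcher can narrow a large catalog to the few datasets relevant to a project without opening each file.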

Scientists are using these data to calculate important features of the disease, such as how infectious the virus is and how long it takes before an infected person becomes contagious.

Health officials can use this kind of information to determine the effectiveness of airport screenings, border closures and the proper duration of social distancing. The MIDAS Network Coordination Center connects researchers to government agencies, such as the Centers for Disease Control and Prevention, so those agencies can make decisions using the most up-to-date, evidence-based research findings.

The Pitt MIDAS Center has deep experience in data science research for infectious diseases. Van Panhuis founded the Project Tycho Data Repository 10 years ago, and since then 5,000 people have registered to use these data, resulting in over 100 new scientific works. Now, van Panhuis is building on this experience to turn rapid COVID-19 data sharing initiatives into a sustainable infrastructure for modeling research and global health.

For example, van Panhuis’s MIDAS team is making sure that data and metadata are reliable before posting them. They check that at least one person has put their real name to the information and that the data appear valid. They don’t pass judgment on whether the science was done well, but they do make sure that basic details about the scientific process are disclosed so that people who access the findings can make their own determinations about their validity.
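The kind of screening described above could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the MIDAS team's actual procedure: the `Submission` class, its fields, the `cases` column and the specific checks are hypothetical. Code like this can only confirm that a name is present, not that it is a real one, so a person would still need to review each submission.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Submission:
    """Hypothetical submission; all field names are assumptions."""
    contributors: list[str] = field(default_factory=list)
    methods_description: str = ""
    rows: list[dict] = field(default_factory=list)


def screening_problems(sub: Submission) -> list[str]:
    """List basic problems found; an empty list means none were found."""
    problems = []
    # At least one contributor must be named.
    if not any(name.strip() for name in sub.contributors):
        problems.append("no named contributor")
    # Basic details about the scientific process must be disclosed.
    if not sub.methods_description.strip():
        problems.append("methods not disclosed")
    # A crude plausibility check: data exist and counts are not negative.
    if not sub.rows:
        problems.append("no data rows")
    elif any(row.get("cases", 0) < 0 for row in sub.rows):
        problems.append("negative case count")
    return problems


ok = Submission(["A. Researcher"], "Transcribed from public bulletins",
                [{"cases": 3}])
bad = Submission([], "", [{"cases": -1}])
print(screening_problems(ok))
print(screening_problems(bad))
```

The function reports problems rather than scoring quality, which mirrors the approach described: confirm that the basics are disclosed and leave the scientific judgment to readers.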

This is the beginning of a rich database, which is crucial for a strong, reproducible scientific process, even during a pandemic.

“It’s a very democratic, open and fair process,” van Panhuis said. “We simply want to get data to researchers and the research to public health officials in a way that doesn’t leave out anyone who may have a valuable contribution to make. It’s amazing to see our dream of being a big data facilitator come to fruition in such a meaningful way.”