Texas Virtual Data Library (Tx-ViDaL): A Secure Compliant Data Infrastructure for Population Informatics

Hye-Chung Kum
Associate Professor
Department of Health Policy and Management, School of Public Health
Department of Computer Science and Engineering
Department of Industrial and Systems Engineering
The Center for Remote Health Technologies and Systems (CRHTS)
Texas A&M University
Location: Koldus - Room 111
Time: October 9, 2017 - 2:00-3:00pm


The social genome is the collective footprints of our society captured in ever-larger and ever-more complex databases (e.g., government administrative data) about people in the digital society. Population informatics applies data science to social genome data to answer fundamental questions about human society and population health, much like bioinformatics applies data science to human genome data to answer questions about individual health. To enable population informatics research at TAMU, we propose to (1) develop a secure cloud computing data infrastructure that supports data-intensive research involving sensitive person-level data (e.g., health data) and meets the myriad legal requirements for handling such data (e.g., HIPAA, Texas HB 300) and (2) accumulate good data sources (e.g., HCUP, CMS, SEER), which often need to be purchased or processed to be fit for research, and make them available to researchers with appropriate approvals and permissions. This project will extend the capacity of the current Texas A&M High Performance Research Computing (HPRC), a shared computing infrastructure used by many A&M investigators and students, to support an even wider user base, including those who need secure compliant computing as well as good data sources. It will also extend the current TX-RDC infrastructure, a shared infrastructure to access federal data, to a much wider array of data sources, including Texas state data. Such a research infrastructure will enable researchers across many disciplines (e.g., public health, public policy, sociology, remote health, transportation, computer science, statistics, etc.) at Texas A&M to develop new research agendas and collaborative team science, becoming thought leaders for data-intensive research in their respective fields. We will discuss use cases, such as the recently funded NSF ERC PATHS-UP project, which requires secure compliant computing; how such systems will be set up; and potential data sources that could be made available.

Speaker's Bio

Dr. Hye-Chung Kum is an associate professor in the Department of Health Policy and Management at the School of Public Health, with joint appointments in both the Department of Computer Science and Engineering and the Department of Industrial and Systems Engineering at Texas A&M University. She is the founder and director of the Population Informatics Lab (https://pinformatics.org/) and is a member of the Center for Remote Health Technologies and Systems (CRHTS) at TEES. With a PhD in computer science (data mining) and a master's in macro social work (policy and management), she has over 15 years of experience in data science applied to big data about people (e.g., EHR, government administrative data) for improving population health, policy, and management. She has led several multi-disciplinary teams of computer, health, and ELSI (Ethical, Legal, and Social Implications) scientists on topics such as record linkage, information privacy, sequential pattern mining, screening and survival of liver cancer patients, Medicaid waiver, welfare reform, hospital uncompensated care costs, and employment outcomes of children in foster care.