Leveraging Public Cloud Services for CLIA-Certified Personalized Medicine Pipelines
Time: Tuesday, July 30, 6:30pm - 8:30pm
Location: Crystal Foyer and Crystal B
Description: In 2012, a joint effort between Fairview Hospital's Molecular Diagnostic Lab and the Minnesota Supercomputing Institute introduced the Next-Generation Sequencing Diagnostic Pipeline (NGSDP), a novel cloud-based analysis pipeline validated for clinical use in compliance with the Clinical Laboratory Improvement Amendments (CLIA). Since then, the partnership has grown its portfolio of active pipelines to seven and expanded the available gene panels to over 6,700 genes. The design of the backing cloud infrastructure has likewise been significantly overhauled for efficiency, scalability, automation, and fault tolerance.

This document introduces our revised cloud-native, scheduled infrastructure solution for containerized applications. The infrastructure, deployed on Amazon Web Services, has two core components: a compute cluster of virtual machines and a simple queue-based job scheduler. The cluster runs analysis pipelines packaged as Docker containers, each with its own resource requirements. The scheduler, akin to scheduling products for traditional HPC, allows batch submission of patient samples with a persistent job queue.
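The two components described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `JobQueue` and `Node` names, the SQLite-backed queue, the first-fit placement rule, and the image name in the usage note are all assumptions made for illustration. The step that would actually launch a container on the chosen virtual machine is omitted.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Node:
    """A hypothetical cluster virtual machine and its free capacity."""
    name: str
    free_cpus: int
    free_mem_gb: int


class JobQueue:
    """Job queue persisted in SQLite so queued samples survive a restart.

    Hypothetical design: one row per patient sample, with the container
    image and its resource requirements stored alongside the job state.
    """

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS jobs ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "sample TEXT, image TEXT, cpus INTEGER, mem_gb INTEGER, "
            "state TEXT DEFAULT 'queued', node TEXT)"
        )
        self.db.commit()

    def submit_batch(self, samples, image, cpus, mem_gb):
        """Queue one job per sample and return the new job ids."""
        ids = []
        for sample in samples:
            cur = self.db.execute(
                "INSERT INTO jobs (sample, image, cpus, mem_gb) "
                "VALUES (?, ?, ?, ?)",
                (sample, image, cpus, mem_gb),
            )
            ids.append(cur.lastrowid)
        self.db.commit()
        return ids

    def dispatch(self, nodes):
        """Place queued jobs, oldest first, on the first node with room.

        A job that fits nowhere stays queued; later jobs are still tried.
        """
        placed = []
        rows = self.db.execute(
            "SELECT id, cpus, mem_gb FROM jobs "
            "WHERE state = 'queued' ORDER BY id"
        ).fetchall()
        for job_id, cpus, mem_gb in rows:
            for node in nodes:
                if node.free_cpus >= cpus and node.free_mem_gb >= mem_gb:
                    node.free_cpus -= cpus
                    node.free_mem_gb -= mem_gb
                    self.db.execute(
                        "UPDATE jobs SET state = 'running', node = ? "
                        "WHERE id = ?",
                        (node.name, job_id),
                    )
                    placed.append((job_id, node.name))
                    break
        self.db.commit()
        return placed

    def states(self):
        """Return a mapping of job id to current state."""
        return dict(self.db.execute("SELECT id, state FROM jobs").fetchall())
```

For example, submitting three samples that each need 4 CPUs and 16 GB to a queue served by a single 8-CPU, 32 GB node (`JobQueue().submit_batch(["s1", "s2", "s3"], "panel:1", 4, 16)` followed by `dispatch([Node("vm-a", 8, 32)])`) would start the first two jobs and leave the third queued until capacity frees up. Passing a file path instead of `":memory:"` is what would make the queue persistent in this sketch.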

Our solution has been successfully running CLIA-validated pipelines since 2016. Two variants of the solution are presented to address the need for such an architecture both in the traditional public cloud space and in Amazon's GovCloud, where fewer cloud services are available to handle sensitive or controlled-access data. Furthermore, the solution is driven by a traditional research computing environment, details of which are presented with emphasis on security, monitoring, and user workflow.