Developing a Meta Framework for Key-Value Memory Networks on HPC Clusters
Machine Learning/Artificial Intelligence
Time: Tuesday, July 30, 1:30pm - 2pm
Description: We propose a novel framework, DARE-MetaQA, whose underlying architecture is versatile enough to support various learning models for large-scale Question Answering (QA). It is aimed in particular at the biomedical domain, for example with PubMed as the information resource from which facts are retrieved for answering. Our framework addresses inherent challenges of the existing Key-Value Memory Network model in two specific aspects: (1) prediction performance and (2) computational scalability. For the first aspect, we propose extending the algorithmic repertoire with ensemble and meta-learning methods; the second aspect is tackled through the choice of effective distributed parallel techniques. The resulting meta framework is advantageous for developing a flexible learning model, which is desirable given the complicated nature of machine reasoning. In this work, we focus on the development side, working toward an optimal implementation for a specific computing environment: shared-nothing multi-node systems. Specifically, two high-end cluster systems (Comet and Bridges) were selected, both of which are equipped with multiple GPUs per node and support Singularity, the container technology for HPC. We highlight our main implementation decisions and achievements during development, and discuss the theoretical and technical underpinnings of the current framework architecture and of future work. The development has been supported by the XSEDE ECSS program.
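
For readers unfamiliar with the base model, the following is a minimal sketch of a single read step in a generic key-value memory: a query is scored against each key, the scores are normalized with a softmax, and the result is the weighted sum of the values. This is not the DARE-MetaQA implementation. The function names (`softmax`, `dot`, `kv_memory_read`) and the toy vectors are assumptions made for illustration only, and a real model would use learned embeddings and several read hops.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def kv_memory_read(query, keys, values):
    """One read step: attend over keys, return the weighted sum of values.

    Hypothetical helper for illustration; embeddings are plain lists here.
    """
    weights = softmax([dot(query, k) for k in keys])
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


# Toy example: the query aligns with the first key, so the first value dominates.
query = [1.0, 0.0]
keys = [[5.0, 0.0], [0.0, 5.0]]
values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
weights, output = kv_memory_read(query, keys, values)
```

In a QA setting the keys and values would typically encode retrieved facts (for instance, sentences drawn from a resource such as PubMed), and the output vector would be used to update the query or to score candidate answers.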