Towards an open and reusable computing infrastructure for Minnesota Earth observation data
Time: Tuesday, July 30, 6:30pm - 8:30pm
Location: Crystal Foyer and Crystal B
Description: Earth observation satellites (EOS) have become indispensable for studying Earth system processes. The challenges of extracting information from EOS data are often discussed in terms of the Big Data paradigms of variability and volume, but an underappreciated facet of EOS data is their multipurpose nature, particularly for optical imagery obtained from satellites like Landsat and Sentinel-2. Because these sensors record observations across a number of discrete portions of the electromagnetic spectrum from all objects within their field of view, even a single image is relevant to many different domain science research questions. The 40+ year observational record from the Landsat missions in particular is an underexplored resource for many domain science questions, and using this record effectively requires new tools and open HPC environments.

In this paper, we discuss efforts at the Minnesota Supercomputing Institute (MSI) to establish a framework that enables EOS data reuse and exploration. We describe the tools and cyberinfrastructure developed to obtain Landsat imagery for Minnesota. These tools document data provenance, allowing for fully reproducible data products. In addition, we discuss the prototype Python programming environment and workflows that enable Landsat data exploration and reuse and facilitate computation within MSI’s HPC environment. We also discuss future needs and our vision for creating an Open Data Cube environment for Minnesota.