ePubs

The open archive for STFC research publications

Full Record Details

Persistent URL http://purl.org/net/epubs/work/63692
Record Status Checked
Record Id 63692
Title A Workflow Pipeline for Scientific Data Analysis
Contributors
Abstract As detector resolution and speed increase, so does the amount of data that must be transferred and analysed. This is especially true for production and high-throughput beamlines, and for experimental stations that require prompt feedback. Such beamlines use automated acquisition software, sample changers and other experimental apparatus, all of which make it easier to generate larger amounts of data. The ability to handle this avalanche of data easily and robustly is key to scientific discovery and insight. We present a workflow pipeline for scientific data analysis that helps address this concern. It uses an industry-standard messaging system for reliable task sequencing and triggering. Generic actors handle common tasks such as file transfers. Technique-specific analysis code is implemented in, or called from, custom actors that may be written in Java, C++ or Python. Experimental metadata and provenance information are stored along with raw and analysed data in a single HDF5 file that is manipulated by different stages of the pipeline. The system is deployed at the APS 2-BM-B and 8-ID-I beamlines. The tomography beamline at 2-BM-B uses the pipeline to transfer data from detector computers to a cluster for GPU reconstruction. This beamline can produce over 10 TB of raw detector data and over 40 TB of reconstructed data per day. The X-ray photon correlation spectroscopy (XPCS) beamline at 8-ID-I uses the pipeline to move data from detector computers to a Hadoop Distributed File System (HDFS) on a distributed-memory cluster for multi-tau analysis. The XPCS beamline can produce up to 2 TB of raw data per day.
Organisation
Keywords NOBUGS2012
Funding Information
Related Research Object(s):
Licence Information:
Language English (EN)
Type Presentation
Details Presented at NOBUGS 2012, RAL, UK, 24-26 Sep 2012.
URI(s)
Local file(s) Schwarz-NOBUGS2012.pptx
Year 2012
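
Illustrative sketch (not part of the record): the abstract describes generic and custom actors that are triggered in sequence by messages and that accumulate provenance alongside the data. The Python below is a minimal sketch of that idea only, under stated assumptions. The standard-library queue.Queue stands in for the industry-standard messaging system, a plain dict stands in for the HDF5 file, and every name (transfer_actor, analysis_actor, run_pipeline, demo, scan_0001.dat) is hypothetical rather than taken from the presented system.

```python
import queue
import shutil
import tempfile
from pathlib import Path


def transfer_actor(msg):
    """Generic actor (hypothetical): copy a file, pass the new location on."""
    dest = Path(msg["dest_dir"]) / Path(msg["path"]).name
    shutil.copy(msg["path"], dest)
    return {**msg, "path": str(dest),
            "provenance": msg["provenance"] + ["transfer"]}


def analysis_actor(msg):
    """Custom actor (hypothetical): placeholder analysis, here a byte count."""
    size = Path(msg["path"]).stat().st_size
    return {**msg, "result": size,
            "provenance": msg["provenance"] + ["analysis"]}


def run_pipeline(first_msg, actors):
    """Hand each message to the next actor in order, one queue per stage.

    Assumption: in-process queues replace a real message broker.
    """
    queues = [queue.Queue() for _ in range(len(actors) + 1)]
    queues[0].put(first_msg)
    for i, actor in enumerate(actors):
        msg = queues[i].get()            # the message triggers this stage
        queues[i + 1].put(actor(msg))    # the result triggers the next stage
    return queues[-1].get()


def demo():
    """Run a two-stage pipeline on a 16-byte stand-in for detector data."""
    with tempfile.TemporaryDirectory() as src, \
            tempfile.TemporaryDirectory() as dst:
        raw = Path(src) / "scan_0001.dat"
        raw.write_bytes(b"\x00" * 16)
        msg = {"path": str(raw), "dest_dir": dst, "provenance": []}
        return run_pipeline(msg, [transfer_actor, analysis_actor])
```

Calling demo() returns the final message, whose "provenance" list records the stages in the order they ran. A real deployment would presumably replace the queues with a broker and write the provenance into the HDF5 file, but those details are not given in this record.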