In a number of previous posts, I have described how cloud computing can be effective in supporting “high-burst” or one-time computing needs. Generally, the cloud is too expensive to support science services on a 24/7 basis. I recently learned of an imaginative application of cloud computing in the service of education and training.
The Gaia mission, a cornerstone mission of the European Space Agency, aims to produce a three-dimensional map of the Milky Way by measuring positions and radial velocities of 1 billion stars. In January of this year, the Gaia Research for European Astronomy Training (GREAT) Initial Training Network held a training school aimed at new graduate students.
Part of the school involved exercises exploring various aspects of the simulated data contained in the Gaia Universe Model Snapshot (GUMS) version 10 (described here). The students worked in groups on exercises that involved querying the GUMS data set, which contains 1.6 billion stars, to research aspects of the structure and dynamics of the Milky Way. They used a custom-developed Java framework based on Hadoop, provided on a virtual machine that they installed on their laptops and that contained a small subset of the GUMS data set.
Groups that got their code working quickly on the subset were able to run it against the full GUMS data set, which was stored in the Amazon EC2 cloud. This gave them a taste of processing close to the data, an approach that will likely play a big role in astronomy in the era of “big data.”
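I do not have the school's framework to show, so here is only a rough sketch of the map-and-reduce style of query that a Hadoop-based exercise typically involves. It uses plain Java streams rather than Hadoop, and the `Star` class, its fields, and the magnitude-binning task are all my own assumptions for illustration, not the GUMS schema or the students' actual exercises.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class GumsSketch {

    // Hypothetical star record; the field names are illustrative only.
    public static final class Star {
        final double distanceParsec;
        final double magnitude;

        public Star(double distanceParsec, double magnitude) {
            this.distanceParsec = distanceParsec;
            this.magnitude = magnitude;
        }
    }

    // "Map" step: assign each star to a bin one magnitude wide.
    public static int magnitudeBin(Star s) {
        return (int) Math.floor(s.magnitude);
    }

    // "Reduce" step: count the stars that fall in each bin.
    public static Map<Integer, Long> countByMagnitudeBin(List<Star> stars) {
        return stars.stream()
                .collect(Collectors.groupingBy(
                        GumsSketch::magnitudeBin,
                        TreeMap::new,
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        // A tiny made-up sample standing in for the subset on the virtual machine.
        List<Star> sample = List.of(
                new Star(120.0, 8.3),
                new Star(450.0, 8.9),
                new Star(900.0, 12.1));
        System.out.println(countByMagnitudeBin(sample));
    }
}
```

The appeal of this style is that the same two small functions work whether the input is a handful of stars on a laptop or the full catalogue spread across many machines; only the engine that feeds them the data changes.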
The figure below shows some of the results from the school:
I wish to thank Will O’Mullane for drawing my attention to the web page on this training school.