One of the topics that came up at coffee at the recent SPIE was why astronomers are not using the cloud very much. I decided to ask some of my colleagues this very question. Among the answers:
“What’s the cloud?”
“I can do all of my work at the desktop, so why bother with it?”
“It’s just a fad”
“Probably need to know a lot of IT to use. Don’t speak the language, and have no time to learn.”
“Is it reliable?”
“Don’t know what to use it for”
The responses alarmed me. Fair enough, a lot of science computing can be done on desktops – I do quite a bit of my own computing on my desktop too. But astronomers are missing out on a powerful tool.
I have been using the Amazon EC2 cloud a lot in the past year, in collaboration with my colleagues Ewa Deelman, Gideon Juve and Mats Rynge at ISI, and I can put a lot of misconceptions to rest. It is easy to get started. Just create an account the way you would for any on-line service, pick the processors you want to use, and launch a virtual machine (the same type of VM you use to, say, run Windows on your Mac). You are then ready to use Amazon EC2 as if it were running on your desktop. In all the time I have been using it, it has proven more reliable than the machines at my office.
It’s definitely not a fad: usage has grown, and the cloud is now part of the computing mainstream. The chart below shows the growth in usage of Amazon EC2:
The question of what to use it for is more interesting. There is one simple answer: if you are doing processing that bogs down your machine for any length of time, it is a candidate for running on the cloud. Two examples from my own work. First, we generated periodograms for over 200,000 light curves (variation of intensity with time) released by the Kepler project. It took an hour and cost $11. We did not want to process them on our local cluster because it is already in heavy day-to-day operational use. Second, we are generating multi-wavelength galactic plane mosaics to use in studies of the star formation history of our Galaxy. The processing would put excessive load on our local server.
A more difficult issue is assessing cost, as Amazon is pay-as-you-go. Users should really do a cost-benefit study to determine whether the charges are worth paying. Such a study need not be hard and can draw on results from other cost-benefit studies (see my earlier posts). And the cloud can be cheap: raw processing costs very little, for instance, but transferring data can be expensive.
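As a rough illustration of what a back-of-the-envelope cost estimate might look like, here is a minimal Python sketch. The `CloudRates` fields, the function names and any prices you plug in are my own assumptions for illustration; they are not Amazon's actual rates or billing rules, so check the current price list before relying on the numbers.

```python
from dataclasses import dataclass


@dataclass
class CloudRates:
    """Hypothetical pay-as-you-go prices in US dollars (not real Amazon rates)."""
    per_instance_hour: float       # price of one virtual machine for one hour
    per_gb_transferred_out: float  # price of moving one GB out of the cloud
    per_gb_month_stored: float     # price of keeping one GB for one month


def estimate_cost(rates, instances, hours, gb_out, gb_stored=0.0, months=0.0):
    """Split a job's estimated cost into compute, transfer and storage."""
    compute = rates.per_instance_hour * instances * hours
    transfer = rates.per_gb_transferred_out * gb_out
    storage = rates.per_gb_month_stored * gb_stored * months
    return {
        "compute": compute,
        "transfer": transfer,
        "storage": storage,
        "total": compute + transfer + storage,
    }


def cost_per_item(total_cost, n_items):
    """Average cost of processing a single item, e.g. one light curve."""
    return total_cost / n_items


if __name__ == "__main__":
    # Made-up rates, chosen only to show how the pieces add up.
    rates = CloudRates(per_instance_hour=0.5,
                       per_gb_transferred_out=0.1,
                       per_gb_month_stored=0.05)
    print(estimate_cost(rates, instances=4, hours=2, gb_out=10))
    # The Kepler run mentioned above: $11 for 200,000 light curves.
    print(cost_per_item(11.0, 200_000))
```

Applied to the Kepler run, the per-item figure works out to $11 / 200,000, or about 0.0055 cents per light curve. Keeping compute, transfer and storage as separate terms makes it easy to see which one dominates for a given job.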
The above discussion highlights a real concern of mine, that astronomers are missing out on powerful tools for doing science. In this age of data-intensive science, astronomers armed with knowledge of these technologies have a real competitive advantage. So what to do about it? In my view:
– Introduce IT into PhD science curricula. IT knowledge is as important as knowing your field.
– Have an on-line journal dedicated to science computing.
– Publish web-pages telling astronomers how to get the most out of the cloud. A discussion forum would be useful too.
– Hands-on How To sessions at meetings and conferences.