Day Five, the last day of Astroinformatics 2013, hosted by CSIRO, is now underway. In this successful meeting, we have been discussing advances in data processing, manipulation and management – crucial topics in modern astronomy. You can see the agenda here, along with a growing list of presentations, and you can follow on Twitter with the hashtag #astroinfo. So far, a YouTube feed for Day 5 has not been set up, but try a search for Astroinformatics on YouTube to see if one has been deployed.
This morning’s sessions discussed Astroinformatics for the SKA and its pathfinder projects. Tim Cornwell described the computing challenges imposed by the SKA. Ray Norris talked about “Data Challenges from next-generation radio continuum surveys.” He mainly talked about the SKA pathfinder telescope ASKAP, which uses phased arrays to enable very fast surveys, and the Evolutionary Map of the Universe (EMU) survey that is being conducted with it. He pointed out that modern radio surveys will be dominated by normal galaxies rather than radio-loud galaxies, and listed some of the challenges as follows:
- Need redshifts of the sources, as radio surveys go deeper than optical surveys. The proposed answer is a statistical approach: studying the distribution of measured properties such as polarization.
- Extracting science from such a huge amount of data. Systematic effects will be especially hard to correct. One approach that is being used is to simulate effects up front and see whether data are consistent with the simulations.
- New area of phase space. He cited the discovery of pulsars as a classic example. But how does this happen with 70 PB of data, when most data will never be seen by a human eye? One approach is to engage young astronomers by inviting them to test new techniques by doing science with survey data.
Andreas Wicenec spoke about data management challenges imposed by the SKA. He gave a good overview of what is involved in data management – managing failures, backups and so on – and pointed out that simply finding a file in a big data system is non-trivial. He described how data from the Murchison Widefield Array (MWA) was serving as a pathfinder for building an SKA archive.
Sudhanshu Barway (“South African Astro-informatics Alliance (SA3)”) described how South Africa is archiving all data from its national facilities, including adoption of VO standards, in preparation for the deployment of the SKA.
Jessica Chapman described how an agile management approach is being used to design an archive for ASKAP. Data storage rates will be 15 TB/day when ASKAP is fully operational. The archive plans to make use of VO protocols, and the user requirements document will be released in January 2014. One of the big issues is to provide temporary disk space for groups to develop new data sets and analyze them.
Stuart Weston spoke about “Survey Cross Identification for Data Mining,” and described cross-identification methods specific to radio surveys, and how to cross-match data at other wavelengths. He emphasized that a 1% difference in the success rate of current techniques – nearest neighbor and likelihood ratio – will translate to 700K sources in the full ASKAP data set, so more sophisticated techniques will still be needed.
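For readers who have not met nearest-neighbor cross-matching before, here is a minimal sketch of the idea in Python. It is not the method presented in the talk: the function names, the brute-force search and the 5 arcsecond match radius are all assumptions chosen for illustration, and a real survey pipeline would use a spatial index rather than comparing every pair. The last lines simply take the quoted figures at face value: if 1% corresponds to 700K sources, the full catalogue holds about 70 million.

```python
import numpy as np


def to_unit_vectors(ra_deg, dec_deg):
    """Convert RA/Dec in degrees to unit vectors on the sphere."""
    ra = np.radians(np.asarray(ra_deg, dtype=float))
    dec = np.radians(np.asarray(dec_deg, dtype=float))
    return np.column_stack(
        [np.cos(dec) * np.cos(ra), np.cos(dec) * np.sin(ra), np.sin(dec)]
    )


def nearest_neighbor_match(radio_ra, radio_dec, other_ra, other_dec,
                           max_sep_arcsec=5.0):
    """Brute-force nearest-neighbor match (hypothetical helper).

    For each radio source, find the closest source in the other catalogue
    and flag it as a match if it lies within max_sep_arcsec (an assumed
    radius, not a value from the talk).
    """
    a = to_unit_vectors(radio_ra, radio_dec)
    b = to_unit_vectors(other_ra, other_dec)
    # Chord length between every pair of unit vectors, shape (N, M).
    chord = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    idx = np.argmin(chord, axis=1)
    nearest_chord = chord[np.arange(len(a)), idx]
    # Angular separation from chord length: theta = 2 * arcsin(chord / 2).
    sep_rad = 2.0 * np.arcsin(np.clip(nearest_chord / 2.0, 0.0, 1.0))
    sep_arcsec = np.degrees(sep_rad) * 3600.0
    return idx, sep_arcsec, sep_arcsec <= max_sep_arcsec


# Toy catalogues: the first radio source has a counterpart 1 arcsec away,
# the second has nothing nearby.
radio_ra, radio_dec = [10.0, 20.0], [-30.0, -30.0]
other_ra, other_dec = [10.0, 50.0], [-30.0 + 1.0 / 3600.0, 10.0]

idx, sep_arcsec, matched = nearest_neighbor_match(
    radio_ra, radio_dec, other_ra, other_dec
)

# Scale of the problem, from the figures quoted above:
# 700K sources per 1% implies this many sources in total.
sources_per_percent = 700_000
implied_total_sources = sources_per_percent * 100
```

Nearest neighbor is the simplest of the two techniques mentioned; the likelihood ratio approach additionally weighs each candidate by how probable a chance alignment is, which this sketch does not attempt.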