This is the title of a poster presented by Tom Donaldson, Bernie Shiao (both of STScI), John Good (Caltech/IPAC-NExScI) and myself at the 231st AAS meeting in Washington DC (January 8-12). I am attaching a copy of the poster below, and linking to a copy of the paper we prepared for the proceedings of ADASS XXVII (Santiago, Chile).
Briefly, we compared database performance along the following dimensions:
- Indexing depth (cell size) of Hierarchical Triangular Mesh (HTM) vs. HEALPix
- PostgreSQL vs. SQL Server
- Linux vs. Solaris vs. Windows
and we did this for two catalogs: the 2MASS All Sky Catalog (which covers the complete sky; 470,000,000 sources) and the unmerged Hubble Source Catalog (which has sparse sky coverage; 384,000,000 sources).
The main results are:
- Query time is dominated by I/O.
- Indexing depth, rather than choice of index, has the greatest impact on performance: there is a trade-off between too many sources per cell (shallow index) and too many cells to search (deep index).
- Optimum index depth depends on the query radius distribution (we used a log scale from 1 arcsec to 1 degree).
See the poster for the figures showing these results.
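The depth trade-off in the results above can be illustrated with some back-of-the-envelope arithmetic. The Python sketch below is not the code used in the study. It assumes that HTM level 0 is the 8 base triangles (conventions for counting depth vary), that sources are spread uniformly over the sky (roughly reasonable for 2MASS, not for the Hubble Source Catalog), and that the number of cells touched by a cone search is roughly the cone area divided by the mean cell area. All function names are made up for this illustration.

```python
import math

# Area of the whole sky in square degrees (about 41253).
SKY_AREA_SQ_DEG = 4 * math.pi * (180 / math.pi) ** 2


def htm_cells(level):
    """Number of HTM trixels, assuming level 0 is the 8 base triangles."""
    return 8 * 4 ** level


def healpix_cells(order):
    """Number of HEALPix pixels: 12 * nside**2 with nside = 2**order."""
    return 12 * 4 ** order


def cone_area_sq_deg(radius_deg):
    """Area of a spherical cap with the given angular radius."""
    steradians = 2 * math.pi * (1 - math.cos(math.radians(radius_deg)))
    return steradians * (180 / math.pi) ** 2


def log_spaced_radii(n, lo_deg=1 / 3600, hi_deg=1.0):
    """n radii evenly spaced in log10, by default 1 arcsec to 1 degree."""
    lo, hi = math.log10(lo_deg), math.log10(hi_deg)
    step = (hi - lo) / (n - 1)
    return [10 ** (lo + i * step) for i in range(n)]


def summarize(n_cells, n_sources, radius_deg):
    """Crude estimate of both sides of the trade-off (uniform sky assumed)."""
    cell_area = SKY_AREA_SQ_DEG / n_cells
    return {
        "sources_per_cell": n_sources / n_cells,
        # A cone always touches at least one cell; edge effects are ignored.
        "cells_per_cone": max(1.0, cone_area_sq_deg(radius_deg) / cell_area),
    }


if __name__ == "__main__":
    n_sources = 470_000_000  # 2MASS All Sky Catalog source count from the post
    for level in (6, 10, 14):  # illustrative depths, not those in the study
        for radius in log_spaced_radii(3):
            s = summarize(htm_cells(level), n_sources, radius)
            print(f"level={level:2d} radius={radius:.5f} deg "
                  f"sources/cell={s['sources_per_cell']:.1f} "
                  f"cells/cone={s['cells_per_cone']:.1f}")
```

At shallow depths each cell holds many sources, so even a 1 arcsec search has to filter a large candidate list; at deep levels a 1 degree search touches a large number of cells. Swapping `htm_cells` for `healpix_cells` changes the cell counts only by a constant factor at a given depth, which is in line with the finding that depth matters more than the choice of index.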