A Study of the Efficiency of Spatial Indexing Methods Applied to Large Astronomical Databases

This is the title of a poster presented by Tom Donaldson, Bernie Shiao (both of STScI) , John Good (Caltech/IPAC-NExScI) and myself at the 231st AAS meeting in Washington DC (January 8-12).  I am attaching a copy of the poster below, and linking a copy of the paper we prepared for the proceedings of ADASS XXVII (Santiago, Chile).

Briefly, we studied the the comparative performance of databases as follows:

  • Indexing depth (cell size) of Hierarchical Triangular Mesh (HTM) vs. HEALPix
  • PostgreSQL vs. SQL servers
  • Linux vs. Solaris vs. Windows

and we did this for two catalogs: the 2MASS All Sky Catalog (which covers the complete sky; 470,000,000) and the unmerged Hubble Source catalog (which has sparse sky coverage; 384,000,000 sources).

The main results are:

  • Query time is dominated by I/O.
  • Indexing depth—and not choice of index—has the greatest impact on performance: trade-off between too many sources and too many cells.
  • Optimum index depth depends on query radius distribution. (We used a log scale from 1 arcsec to 1 degree).

See the poster for the figures showing these results.

 

Slide1

Link to ADASS paper (PDF)

 

Advertisements
This entry was posted in Uncategorized. Bookmark the permalink.

One Response to A Study of the Efficiency of Spatial Indexing Methods Applied to Large Astronomical Databases

  1. Pingback: A Study of the Efficiency of Spatial Indexing Methods Applied to Large Astronomical Databases – MeasurementDataBases for Industry & Science

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s