The VAO Spectral Energy Distribution (SED) Tool, Iris

The Virtual Astronomical Observatory (VAO) has just released a beta version of a Spectral Energy Distribution (SED) tool, called Iris. It is a desktop application that analyzes 1-D astronomical SEDs.

To quote from the release announcement: “Iris is a tool that allows the astronomer to build a SED of a source from multiple, separate data segments or photometric points, gathered from various observatories across a wide spectral range, and fit the aggregate SED with emission and/or absorption spectral models; individual data segments may also be separately analyzed.  SED data may be uploaded into the application from IVOA-compliant VOTable and FITS format files, or imported directly from the NASA Extragalactic Database (NED). Data written in unsupported formats may be converted for upload using the SED Importer tool which is bundled with the software.  Iris can write SED data files in one of the aforementioned supported file formats, as well as files for restoring custom fitting sessions.”
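The workflow the announcement describes, gathering photometric points from separate data segments and fitting a model to the aggregate, can be sketched in a few lines of Python. The segment values, the simple power-law model, and the function names below are all invented for illustration; Iris itself is a GUI application and does not expose this API.

```python
import math

# Hypothetical photometric segments (frequency in Hz, flux density in Jy)
# gathered from two different "observatories" -- illustrative values only.
segment_a = [(1.4e9, 2.0), (4.8e9, 1.1)]
segment_b = [(8.4e9, 0.75), (2.2e10, 0.45)]

def aggregate(*segments):
    """Merge separate data segments into one SED, sorted by frequency."""
    return sorted(pt for seg in segments for pt in seg)

def fit_power_law(sed):
    """Least-squares fit of log S = log A + alpha * log nu."""
    xs = [math.log10(nu) for nu, s in sed]
    ys = [math.log10(s) for nu, s in sed]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    alpha = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    log_a = my - alpha * mx
    return 10 ** log_a, alpha

sed = aggregate(segment_a, segment_b)
amp, alpha = fit_power_law(sed)
print(f"spectral index alpha = {alpha:.2f}")
```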


Why Advanced Computing Matters

I thought I would have a change of pace this week, and have a post that features penguins. No, not the Linux penguins, but the Penguins of Madagascar. These computer-savvy flightless antipodean waterfowl have an interesting video that explains the importance of High Performance Computing in simple language:

Video courtesy of the Council on Competitiveness.

This post is adapted from a post of the same title appearing at the International Science Grid This Week.


Teaching Scientific Computing With The TeraGrid

I have been reading a presentation by Loffler et al. (2011), given at the TeraGrid 11 conference in Salt Lake City, on a graduate course taught at Louisiana State University in Fall 2010. The course aimed to teach high-performance computing techniques, and used the TeraGrid as an integral part of the class. You may download the talk here: Using the TeraGrid to Teach Scientific Computing

The class covered basic skills, including Numerical Methods, Vector Algebra, Basic Visualization, and Best Programming and Software Engineering practices. These skills are used in studying Networks and Data, Simulations & Application Frameworks, Scientific Visualization and Distributed Scientific Computing.

Students were given hands-on experience at running simulation codes on the TeraGrid, including codes to model black holes, predict the effects of hurricanes and optimize oil and gas production from underground reservoirs. Here is a summary of what was taught on simulations, taken verbatim from the talk:

“Coverage:

  • What is a simulation?
  • Example of an initial value problem
  • Typical work-flows on supercomputers, especially TeraGrid, such as batch processing, computing time allocations and login procedures
  • Usage of Cactus code: assemble, configure, build and execute across many cores within TeraGrid
  • Follow Einstein Toolkit tutorial to use physics example code on TeraGrid
  • Visualization of simulation results.”
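The "initial value problem" bullet above is the natural starting point for a course like this. A minimal sketch in Python, using forward Euler on dy/dt = -k·y, a toy stand-in for the far larger systems that codes like Cactus solve on the TeraGrid (the problem and parameters are invented for illustration):

```python
def euler(f, y0, t0, t1, steps):
    """Integrate the initial value problem y' = f(t, y), y(t0) = y0,
    from t0 to t1 with `steps` forward Euler steps."""
    dt = (t1 - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        y += dt * f(t, y)   # one explicit Euler step
        t += dt
    return y

# Exponential decay dy/dt = -k * y with y(0) = 1; the exact answer
# at t = 1 is exp(-k), which the numerical result should approach.
k = 2.0
y_end = euler(lambda t, y: -k * y, 1.0, 0.0, 1.0, 1000)
print(y_end)
```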

This class is a very good example of the kind of class that I think needs to be a mandatory part of graduate education in this era of “Big Data” astronomy.


A Training Roadmap for New HPC Users

The TeraGrid 11 conference was held in Salt Lake City, July 18-21, 2011. The goal of the conference was to “…showcase the capabilities, achievements, and impact of the TeraGrid in research and education.” I will write about some of the talks that interested me here on Astronomy Computing Today. One of the talks that caught my eye was A Training Roadmap for New HPC Users by Mark Richards and Scott Lathrop (NCSA). They start by pointing out that 95% of HPC users are trained as scientists, and proceed to describe their ideas for educating the growing body of HPC users to get the most out of these powerful resources.

They presented a web-based flowchart (shown above), developed in consultation with the HPC community, whose nodes represent the important skills needed in the HPC world. Each node links to a description and resources, such as videos, articles and books. Their goal is to use the chart to engage new users quickly, and to use community feedback to evolve and improve this service.
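A skills flowchart of this kind is essentially a directed graph whose edges encode prerequisites and whose nodes carry links to resources. A hypothetical sketch (the skill names and resources below are invented, not taken from the Richards and Lathrop chart):

```python
# Hypothetical skills graph: each node lists its prerequisites and
# some learning resources, as the flowchart's nodes do.
skills = {
    "shell basics":         {"requires": [], "resources": ["intro video"]},
    "batch job submission": {"requires": ["shell basics"],
                             "resources": ["queue tutorial"]},
    "MPI programming":      {"requires": ["batch job submission"],
                             "resources": ["MPI reference book"]},
}

def learning_path(skill, graph):
    """Return the skill and its prerequisites in learnable order
    (a depth-first walk of the prerequisite edges)."""
    path = []
    def visit(s):
        for req in graph[s]["requires"]:
            visit(req)
        if s not in path:
            path.append(s)
    visit(skill)
    return path

print(learning_path("MPI programming", skills))
# -> ['shell basics', 'batch job submission', 'MPI programming']
```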


TeraGrid Science Highlights 2010

After 10 years of service to the national science and engineering community, the TeraGrid project has ended. It has been replaced by XSEDE, the Extreme Science and Engineering Discovery Environment. The TeraGrid project recently published a brochure of its science highlights for 2010, covering a broad range of topics.

Two chapters will interest astronomers. “Counting Comets” describes how Tom Quinn and Nathan Kaib used 500,000 hours of CPU time to show that the Oort cloud contains a mere trillion comets, an order of magnitude fewer than theorized. “Dawn of the Giants” describes how Richard Durisen performed simulations of the formation of gas giant planets. The simulations show that for sufficiently extended and massive disks, gas giant planets can form via direct collapse caused by gravitational instabilities on a time scale of thousands rather than millions of years.


An Astronomy Computing Today Word Cloud

I thought I would have a change of scenery this week and have a little fun. I went to the wordle web site and created a word cloud: it takes the words in the blog posts since I started the blog in May 2010, and creates a graphic that gives greater prominence to words that appear more frequently in the source text.

So what does it show? Well, “computing” is much more prominent than “astronomy” or “astronomers,” which reflects that the posts lean towards computing rather than astronomy. “Cloud(s)” features prominently, along with “application” and “software,” which reflects my interest in running applications in the cloud. The large number of words in the word cloud suggests that I have covered a broad range of topics in the past 15 months, which is something I have tried to do here, in what I think of as my on-line diary of computing topics that interest me.
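Under the hood, a word cloud is just a word-frequency count with sizes scaled to the counts. A minimal sketch of what a site like wordle computes (the sample text, stopword list, and size scaling below are invented for illustration):

```python
from collections import Counter
import re

# Tiny stand-in for the blog's text -- illustrative only.
text = """Computing in the cloud: running astronomy applications as
cloud software matures. Computing costs in the cloud vary."""

# Tokenize, drop common filler words, and count frequencies.
words = re.findall(r"[a-z]+", text.lower())
stopwords = {"in", "the", "as", "a", "of"}
counts = Counter(w for w in words if w not in stopwords)

# Scale font sizes between 10 and 40 points by relative frequency,
# so the most frequent word is rendered most prominently.
top = counts.most_common(5)
max_n = top[0][1]
sizes = {w: 10 + 30 * n / max_n for w, n in top}
print(top)
```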


The Astrophysics Source Code Library

The Astrophysics Source Code Library is a free on-line reference library of astrophysical source codes used to generate results published in or submitted to a refereed journal. The library is housed on the discussion forum for Astronomy Picture of the Day (APOD) at http://asterisk.apod.com/viewforum.php?f=35.

Front Page of the ASCL Forum


The library cites three main benefits to astronomers:

1. Increased Falsifiability

    Perhaps a crucial error was made in the coding of a sound idea. ASCL presents a way for authors to bolster their results by demonstrating the integrity of their source code(s). Conversely, ASCL presents a way for readers to bolster their confidence in published results by checking details of the source code(s).

2. Increased Communication

    Perhaps an author finds it difficult to describe completely in the text of a paper how the results were obtained. ASCL creates a way for authors to present more detailed information about how their computations were carried out.

3. Increased Utility

    Perhaps an author has created a code that (s)he feels is itself useful to other astrophysicists. ASCL creates a way for these authors to disseminate a source code of significant utility to astrophysicists.

To date, over 200 codes have been added to the library, which provides a description of each code, and links to the source and papers describing it. The library also provides links to a very helpful list of software engineering papers that will be of interest to astronomy developers.

Disclosure: I am a member of the ASCL Advisory Board.


Magellan: Experiences From A Science Cloud

This is the title of a paper presented by Lavanya Ramakrishnan et al. at the ScienceCloud2011 workshop (paper; slides). Magellan is a project funded through the Department of Energy’s (DOE) Advanced Scientific Computing Research (ASCR) program to investigate the use of cloud computing for science at the Argonne Leadership Computing Facility (ALCF) and the National Energy Research Scientific Computing Facility (NERSC). Ramakrishnan et al. describe their experiences in using Magellan to support science applications chosen from the genome sequencing of soil samples, climate change, and the STAR nuclear physics experiment.

STAR Nuclear Physics Analysis on Magellan


They used a testbed that contained virtualization software such as Eucalyptus, OpenStack, and Hadoop (an open source implementation of MapReduce) to study issues such as whether open-source cloud software stacks are ready for DOE science applications.
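Hadoop's MapReduce model is easy to sketch in miniature: a map phase emits key-value pairs, a shuffle groups them by key, and a reduce phase combines each group. A toy in-memory version using word counting, the canonical example (real Hadoop distributes these phases across a cluster; this sketch only shows the programming model):

```python
from collections import defaultdict
from itertools import chain

def map_phase(records, mapper):
    """Apply the mapper to every record, yielding (key, value) pairs."""
    return chain.from_iterable(mapper(r) for r in records)

def shuffle(pairs):
    """Group all emitted values by key, as Hadoop's shuffle does."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups, reducer):
    """Combine each key's values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}

records = ["star nuclear physics", "star physics"]
mapper = lambda line: [(w, 1) for w in line.split()]
reducer = lambda key, values: sum(values)
result = reduce_phase(shuffle(map_phase(records, mapper)), reducer)
print(result)  # {'star': 2, 'nuclear': 1, 'physics': 2}
```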

Their conclusions:

  • Current-day cloud computing solutions have gaps for science:
      • Performance, reliability, and stability need to improve.
      • Programming models are difficult to apply to legacy applications.
      • Security mechanisms need to improve (e.g., root privileges for untrained users may be a serious risk).
  • Nevertheless, HPC centers can improve their services and policies by adopting some of the technologies and mechanisms used by Magellan:
      • Support for data-intensive workloads.
      • Allowing custom software environments.
      • Providing different levels of service.

Debunking Some Common Misconceptions of Science in the Cloud

This is the title of a presentation by Shane Canon (LBNL) and collaborators given at the ScienceCloud2011 workshop in San Jose, June 2011. I gave an overview of this meeting in an earlier post.

Canon et al. give what I think is the most realistic assessment of the value of cloud computing to scientific applications. They point out that the cloud is at the peak of inflated expectations in the Hype Cycle, and set out to get it on the path to the Plateau of Productivity:

They debunk the following misconceptions about the cloud:

  • Clouds are simple to use and don’t require system administrators.
  • My job will run immediately in the cloud.
  • Clouds are more efficient.
  • Clouds allow you to ride Moore’s Law without additional investment.
  • Commercial Clouds are much cheaper than operating your own system.

They conclude that:

  • Cloud computing as it exists today is not ready for High Performance Computing, because:
      • There are large overheads to convert applications to cloud environments.
      • Virtual instances underperform bare-metal systems.
      • The cloud is less cost-effective than most large centers.

Commercial clouds are, however, effective in cases such as these:

  • Individual projects with high-burst needs:
      • Avoid paying for idle hardware.
      • Access to larger-scale resources (elasticity).
  • High-throughput applications with modest data needs:
      • Bioinformatics.
      • Monte Carlo simulations.
  • Infrastructure-challenged sites, where facility costs are much greater than IT costs.
  • Undetermined or volatile needs: use clouds to baseline requirements, then build the application in-house.
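The "avoid paying for idle hardware" point can be made concrete with a back-of-the-envelope cost model: renting only the hours you use beats owning a cluster only below some break-even utilization. All prices and cluster sizes below are invented for illustration, not taken from the talk.

```python
# Hypothetical numbers: a 128-core owned cluster versus on-demand
# cloud cores billed per core-hour. Illustrative values only.
owned_cost_per_year = 100_000.0              # amortization + power + admin
owned_core_hours_per_year = 365 * 24 * 128   # capacity of 128 cores, always on
cloud_price_per_core_hour = 0.10

def cheaper_in_cloud(utilization):
    """True if renting only the hours actually used costs less than owning."""
    used_hours = owned_core_hours_per_year * utilization
    return used_hours * cloud_price_per_core_hour < owned_cost_per_year

# Break-even utilization: owned cost / (capacity * cloud price).
# Below this, bursty workloads are cheaper in the cloud.
break_even = owned_cost_per_year / (owned_core_hours_per_year *
                                    cloud_price_per_core_hour)
print(f"break-even utilization: {break_even:.0%}")
```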

Help! My Software Has Turned Into A Techno Turkey!

This is the title of a presentation I gave today at an internal symposium at IPAC, and I thought I would post it here. The title refers to the fact that a lot of scientific software developed on desktops will not prove useful for analyzing massive data sets that will soon be coming our way.  Many scientists simply do not have training in writing scalable and portable software. I make some suggestions for what the community can do about this. In summary:

  • Massive data sets are driving a new business model for scientific computing, where analysis will have to be done near the data.
  • The computationally self-taught scientist working at a desktop will be at a big disadvantage in this new world.
  • Software components that are portable and scalable will have a much bigger role to play in the future.  And software will need to be sharable, and used by user communities to develop new applications.
  • I think we need more formal computer education for scientists, and a cultural change to reward computational skills.