How Good Are Performance Metrics for Parallel Applications?

This week, I am attending a Software Sustainability conference in Cardiff, Wales, so my blog post today will be brief. I have often wondered about all the claims and counterclaims made about the performance of parallel applications. If you are interested in this topic, I can recommend an article on the International Science Grid This Week web page. The article, Inflated Performance, is by turns insightful and trenchant, and gives a fine overview of, for example, how to interpret the speed-ups claimed when researchers port code from CPUs to GPUs (the speed-up is real, but often only because parts of the code had to be rewritten).

Posted in astroinformatics, cyberinfrastructure, High performance computing, Parallelization

Resumes, interviews and getting jobs

I thought that this week, I might offer some tips on these two essential parts of getting jobs. I have been on both ends of resumes and interviews for many years, but I wanted to give an employer’s perspective on these topics. I have hired about 40 people in my career as a manager, mainly software engineers, and I think it is helpful to look at resumes and interviews from the employer’s point of view. If you are applying for a job, you will almost certainly be one of many people applying for it, so how can you stand out?

First of all, I look for more than just technical credentials. Invariably, I look for employees who are reliable, have the maturity to follow rules and policies, will work well with members of a team, and when things start to go wrong, have the wherewithal to start doing something about it. I also want employees who will contribute to the broader life of the department, and be an asset to the organization.

Prepare yourself in your current job by developing these traits. It may sound obvious and even silly, but get in the habit of getting to work on time, and if you cannot (and it happens to all of us) make sure someone knows about it. Same goes if you are sick. Make sure you follow policies. Become an asset. If asked to help organize seminars, or be on an organizing committee, do it. Don’t say “I’m too busy.” Everyone is. It’s important to learn how to contribute when you are very busy. Make sure at least one of your references can speak to your attitude to work. I am going to ask about it.

Let’s move on to resumes. Their only purpose is to get you an interview. Your resume should be positive and reflect your achievements and the impact of your work. If you have a paper with a huge number of citations, say so. If you have, say, developed software on a very tight schedule, and it has developed a large user base, say so. This information is more important than the exact nature of the software itself. One of the most unpleasant aspects of resumes is that they are often pre-screened by HR departments, who look to see that the resume addresses all the job requirements. You should assume that this will happen, and so you may need to tailor your resume to the job description. Just what you wanted, more work. But if you take the time to do this, your chances of getting the interview are so much the higher.

Well, you have the interview, now what? When you walk through the door, you need to be seen as someone who means business. It doesn’t matter if you know that everyone wears T-shirts and jeans (as I do); wear a business suit. If you don’t have one, buy one. Think of it as an investment in yourself. Don’t try to look cool by, say, wearing a T-shirt and dark glasses with a suit (yes, someone I interviewed did this. You might look cool in a night club, but you’ll look like an idiot at an interview. Context!). Don’t worry about being nervous. It’s normal, and it means you are taking the interview seriously. Most interviewers are happy to see a few nerves. And it is better to look nervous than cocky, which is the kiss of death. Make sure you have a good answer when asked why you have applied for the position. Don’t say “well, I need a job.” Investigate the company, and respond by saying what great things you can bring to it to make it an even better place.

Finally, when you get home exhausted after the day’s events, put your feet up and have a relaxing glass of wine.  You’ve earned it. But before you do that, write a thank-you note to the interviewer(s).  So many people forget this.  I found it hard to separate the two best candidates for a position once. One sent a thank-you note, and one did not. Guess who got the job?

Posted in careers, education, jobs, Uncategorized

Was It Hard To Return To Astronomy and Research?

This was the most common question I was asked about my last post. I spent several years in Earth Sciences before returning to astronomy in 2000 as the Manager of the Infrared Science Archive (IRSA) at IPAC, Caltech.  My position gave me 20% of my time to spend on science as a senior member of the IPAC science staff.

I wanted to spend my science time working on brown dwarfs. I have had a longstanding interest in this field, ever since Neill Reid and I made an early search for them as long ago as 1986. The trouble was, I had only 20% of my time for science, and by 2000 the field had blossomed into a dynamic research area, with a huge literature. I found it overwhelming just to read it, let alone develop a research project.

But an opportunity was at hand. IPAC was a contributor to the National Virtual Observatory (NVO), a project to develop the infrastructure to make astronomy databases talk to each other. In its early days, the NVO decided to perform three science pilot projects to inform the requirements on infrastructure. One of them was a search for brown dwarfs by cross-matching the databases of sources produced by the Sloan Digital Sky Survey (SDSS) and Two Micron All Sky Survey (2MASS) projects. IPAC was granted technical leadership for this project.

This was my opportunity to get back into research. I recruited my colleague Davy Kirkpatrick at IPAC, one of the founders of the brown dwarf field and one of its leading experts, to work with me on science goals. And I recruited Serge Monkewitz, a very bright Software Engineer, to build the search engine. Working together, we got the engine going, and we set up search criteria that would optimize our chances of finding brown dwarf candidates, especially very cool (in temperature!) ones, designated T-dwarfs.  Then we went through the list of candidates, and culled out image artifacts.  We then recruited Stan Metchev, then at UCLA and now on the faculty at Stony Brook, to work with us to measure spectra and make positive identifications. It took 5 or 6 years to observe the candidates, what with the usual allocations of cloudy nights, but we hit paydirt. We found 2 new very cool T-types, and made the first estimates of their space densities, and we found a number of unusual warm brown dwarfs.

So, what advice might I offer to someone in a similar position? Trying to develop a project by yourself after an absence may prove very, very hard. Instead, I would advise people to work with experts in the field, develop a well-defined project with them, and keep at it. Learn from the experts – Davy pointed me to the papers I should read, and patiently answered all my questions. When you have a result, go to a meeting and present it. Make a real effort to talk to attendees, and see if you can set up further collaborations with them.

Posted in Astronomy, careers, education, jobs, Uncategorized

How I Survived in Astronomy

The Astrobetter site has some very good posts and sane advice on getting jobs and building a career.  This week, with the job season upon us, I decided to write about how I managed to stay in astronomy. I hope that on the way I can offer some advice and encouragement to those of you in the job market.

I got my Ph.D. from Caltech in 1983, specializing in cataclysmic variable stars. I hoped that after a postdoc in Cambridge, England, I would be able to get a faculty job at a respected school and settle down to a career in academia. I enjoyed my time at Cambridge, but there were no faculty jobs in my native Britain (and I mean none), and so I returned to the U.S. to do a second postdoc at Steward Observatory. I worked with Gary Schmidt, an exceptionally competent astronomer and an expert in polarimetry.

I learned one of my greatest lessons from Gary: that my interest in cataclysmic variables (CVs) was too narrow to make me attractive to many astronomy departments. So with his encouragement, I used polarimetry to expand my interests into the world of Seyfert galaxies and quasars, and my interest in CVs led directly to an interest in spectroscopy of red dwarfs (my first project in that area was with Neill Reid; it was an early effort to try to identify brown dwarfs. We didn’t find any).

The expansion of my interests was good for the brain – I found all of my new projects interesting and challenging – and helped me gain broader exposure in astronomy. It wasn’t enough to get me a faculty position though, and I did allow myself to be discouraged by what I saw as a failure, even though there were very few faculty jobs available then.

Instead, I accepted a Support Scientist job at Goddard working on COBE, which was about to launch, and I was to analyze the near-IR polarimetry data from the DIRBE instrument. The important thing was to be in astronomy. Nevertheless, I went to COBE with mixed feelings, as an employee of a government contractor, but once I opened my eyes to the opportunities it presented, it did wonders for me. I was able to work on an important mission with some outstanding scientists and software engineers. Aside from the polarimetry, I became involved in data validation, and in generating data products such as weekly sky maps, and I started to learn about project management and working closely with developers. Eventually, I was given responsibility for managing a software product. And then I surprised myself: I was really starting to enjoy software management, and it turned out (please excuse the immodesty) I was good at it.

After COBE ended, I worked for four years managing calibration software projects on Earth Sciences/Remote Sensing Missions at Goddard. It was there that I really learned about industry-standard software development and what it takes to manage a whole mission. I realized I was actually better at management than at science research, and the desire to be a faculty member steadily waned in importance. I moved up the ladder, eventually becoming a Branch Head in my company. And the new field offered some fascinating new problems to keep my brain engaged.

But I did miss my astronomy, and in 2000 I accepted a position at IPAC managing the Infrared Science Archive. IRSA was then a struggling new archive with only two missions in it. The grounding I got in software management at Goddard turned out to be the perfect preparation to run IRSA. I have since assumed two new roles, managing the NASA Star and Exoplanet Database, and managing the new Virtual Astronomical Observatory. My research these days is actually in the field of IT research. For the past seven years, I have worked with colleagues at USC on investigating the applicability of new technologies, such as cloud computing, to astronomy. I enjoy this every bit as much as I enjoy astronomy.

So what advice can I give people? Well, you can have a fulfilling career in astronomy outside the traditional faculty path. There are many ways to make a difference in astronomy outside pure research positions. Re-invent yourself every few years – it prevents burnout, and helps you develop broader skills. Look for opportunities rather than waiting for them. Try things out – you may surprise yourself in discovering new interests and talents.

Posted in Astronomy, careers, education, jobs

First Thoughts on The Astronomy and Astrophysics Decadal Survey

The National Academies of Science has just released its latest Decadal Survey of Astronomy and Astrophysics. This survey is commissioned every 10 years to review the state of the profession and recommend national priorities in astronomy for the next decade.  The survey recommendations are in a weighty tome entitled New Worlds, New Horizons in Astronomy and Astrophysics, which you can download free in PDF. I am going to try to digest and comment on it in coming weeks, from the point of view of computational astronomy.


Today, I will post some initial comments. I was delighted to see recognition of astronomy archives and their contribution to research. The report pointed out the number of archival papers resulting from public 2MASS data and Space Telescope data. I was, however, disappointed in the lack of recommendations regarding the impact that the tsunami of data will have on astronomical computing.

There is,  I think, a recognition in the community that we need to curate and preserve data sets, and our national funding agencies are indeed committed to this.  We are still a long way from curating all our data: witness the data products created by astronomers for publication, most of which are not available in electronic form and continue to be lost. But there are some serious issues to sort out in curating and preserving data. Curating data takes time and energy. It involves creating a complete record of the processing history of a data set (its provenance) in sufficient detail that an astronomer can reproduce the data set and evaluate its quality and usefulness. All this requires methodologies and standards to do right. But more than that, analyzing and combining the massive new data sets that are coming along to create new products and perform innovative research will require computing techniques and technologies that go beyond what can be done on desktop machines, and there needs to be an investment in them and in educating astronomers in how to use them. I would have liked to have seen a recommendation in the survey to do just this.

Posted in Uncategorized

Hardware Virtualization

The benefits of virtualization have become very clear – installing multiple operating systems on a server reduces the number of servers needed to run a project, reduces the load on physical plant, saves space and saves money.

Virtualization remains a bit mysterious to many, so I am happy to recommend a series of three articles by Greg Pfister that have been published in International Science Grid This Week. The latest article is here, and you can navigate from it to the first two.

Posted in Uncategorized

Social Networking and Astronomers.

My colleague Roy Williams pointed me to the bioinformatics site myexperiment.org. Here, practitioners in this field share workflows they develop to perform all kinds of biochemical tasks. Others extend these workflows, which are for the most part public, and thereby enable broader and richer collaborations. I can’t tell you how successful this approach is, nor what the science value of the workflows is, but I did start to wonder why astronomers don’t seem to attempt collaborations like this.

We share tools and develop libraries that are accessible to all. Kelle Cruz started the splendid and successful astrobetter forum, now on Facebook as well, which shares tips and tricks for astronomers working on Mac OS X platforms. But this is quite different from building on-line collaborations between professional astronomers. Why have astronomers not taken to this? I have been pondering this question, but I have no ready answer. As sciences go, astronomy is a small field that is highly competitive. Faculty jobs are in short supply, and career advancement is based largely on the strength of an astronomer’s publication record. Sharing too much information about analysis tools may be seen as giving an advantage to a competitor. Astronomers have tended to get little credit for sharing data and software, and I think the field’s reward system does need to change. But I think there is another characteristic of astronomy at work. Astronomy has for many years been the province of the lonely scientist on the mountaintop communing with the universe through the telescope. I am inclined to think that isolation is a more ingrained characteristic in astronomy than in other fields, and this perhaps is why astronomers have not yet taken to what might be called “social collaboration.” Astronomy is definitely emerging from these older social attitudes, as younger scientists embrace new approaches to science, driven by changes in technology and the vast quantities of data that astronomers can now process on-line. Maybe in the next few years we will see new approaches to collaboration lead to new discoveries from this treasure trove of data.

Posted in astroinformatics, Astronomy, social networking

The Death Knell For The Grid?

Well, the title comes not from me, but from International Science Grid This Week. It is in response  to Amazon’s announcement that it will be providing high-performance compute services.  Readers of my earlier posts will see that one of Amazon’s weaknesses for science applications is that high-performance clusters offer superior performance for I/O-bound applications.  It will be important for science apps  to see how well Amazon’s HP services perform on the Montage image mosaic engine, the subject of earlier comparative studies. And to see what the costs are.  In the meantime, read Craig Lee’s excellent article on the subject at iSGTW.

Posted in Cloud computing, cyberinfrastructure, High performance computing, Uncategorized

So How Much Computing Should An Astronomer Learn?

I was asked this by a colleague after last week’s post on why astronomers have not taken to cloud computing. So I thought this week I would give my answer to this question.  What I think astronomers need to learn as part of formal instruction is

  • How a computer works and what limits its performance – when do I/O, memory, processing speed start to limit performance, and what a scientist can do about this
  • Languages – at least one low level language, preferably C, and at least one scripting language; my choice would be Python. C is powerful – as it is close to machine language, it can perform bit-level manipulations, it is versatile, and it is the best choice for many high performance apps. Python introduces astronomers to object-oriented coding as well as a powerful scripting language with an easy syntax. Armed with these two languages, it becomes easier to learn IDL, Java and many other languages.
  • How to make code portable – Astronomers are much more likely nowadays to have to run their code on a remote computer, so they should learn how to develop portable code that is easy to maintain (component based design, avoid shared memory and avoid system calls…)
  • Parallel programming. Astronomers will process larger and larger volumes of data in the future, so I think they should learn how to perform parallel processing – MPI, vectorizing code, etc.
  • How relational databases work,  their theoretical basis (set theory), and how to design relational tables
  • Computing platforms – clusters, clouds, grids. What they are and when  to use them.
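
The relational-database item in the list above can be made a little more concrete. What follows is only a minimal sketch using Python's built-in sqlite3 module, not a recommended schema: the table names (source, detection), the survey labels and the magnitudes are all invented for the illustration. It shows two tables linked by a key, and two queries phrased as set operations (an intersection and a difference), which is where the set-theory basis shows through.

```python
import sqlite3

# SQL fragment: names of sources that have a detection in a given survey.
# The schema and the survey labels are hypothetical, chosen for illustration.
_NAMES_IN_SURVEY = """
    SELECT s.name
    FROM source AS s
    JOIN detection AS d ON d.source_id = s.source_id
    WHERE d.survey = ?
"""


def build_catalog_db():
    """Create an in-memory database with two related tables and toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE source (
            source_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL UNIQUE
        );
        CREATE TABLE detection (
            detection_id INTEGER PRIMARY KEY,
            source_id    INTEGER NOT NULL REFERENCES source(source_id),
            survey       TEXT NOT NULL,
            magnitude    REAL NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO source (source_id, name) VALUES (?, ?)",
        [(1, "obj-A"), (2, "obj-B"), (3, "obj-C")],
    )
    conn.executemany(
        "INSERT INTO detection (source_id, survey, magnitude) VALUES (?, ?, ?)",
        [
            (1, "optical", 20.1),
            (1, "infrared", 15.3),
            (2, "optical", 19.4),
            (3, "infrared", 16.0),
        ],
    )
    conn.commit()
    return conn


def detected_in_both(conn, survey_a, survey_b):
    """Set intersection: sources seen in survey_a AND in survey_b."""
    query = _NAMES_IN_SURVEY + " INTERSECT " + _NAMES_IN_SURVEY + " ORDER BY 1"
    return [row[0] for row in conn.execute(query, (survey_a, survey_b))]


def detected_only_in(conn, survey_a, survey_b):
    """Set difference: sources seen in survey_a but NOT in survey_b."""
    query = _NAMES_IN_SURVEY + " EXCEPT " + _NAMES_IN_SURVEY + " ORDER BY 1"
    return [row[0] for row in conn.execute(query, (survey_a, survey_b))]


if __name__ == "__main__":
    db = build_catalog_db()
    print("In both surveys:", detected_in_both(db, "optical", "infrared"))
    print("Infrared only:  ", detected_only_in(db, "infrared", "optical"))
```

Keeping detections in their own table, rather than adding one magnitude column per survey to the source table, means a new survey needs new rows rather than a change to the table design.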

Of course, in a few years, I might advocate different technologies, such as computing on mobile platforms, if that takes off in science.

Posted in Astronomy, education, software maintenance, software sustainability, Uncategorized

Why Don’t You Use The Cloud?

One of the topics that came up at coffee at the recent SPIE was why astronomers are not using the cloud very much. I decided to ask some of my colleagues this very question.  Among the answers:

“What’s the cloud?”

“I can do all of my work at the desktop, so why bother with it?”

“It’s just a fad”

“Probably need to know a lot of IT to use. Don’t speak the language, and have no time to learn.”

“Is it reliable?”

“Don’t know what to use it for”

The responses alarmed me. Fair enough, a lot of science computing can be done on desktops – I do quite a bit of my own computing on my desktop too. But astronomers are missing out on a powerful tool.

I have been using the Amazon EC2 cloud a lot in the past year, in collaboration with my colleagues Ewa Deelman, Gideon Juve and Mats Rynge at ISI, and I can put a lot of misconceptions to rest. It is easy to get started. Just create an account the way you would for any on-line service, pick the processors you want to use, and launch a virtual machine (the same type of VM you use to, say, run Windows from your Mac). And you are ready to use Amazon EC2 as if it were running on your desktop. In all the time I have been using it, it has proven more reliable than machines at my office.

It’s definitely not a fad, as usage has grown and the cloud is now part of the computing mainstream. The chart below shows the growth in usage of Amazon EC2:

Growth in usage of Amazon EC2 (from http://blog.rightscale.com)

The question of what to use it for is more interesting. There is one simple answer. If you are doing processing that bogs down your machine for any length of time, it is a candidate for running on the cloud. Some examples from my own work: we generated periodograms for over 200,000 light curves (variation of intensity with time) released by the Kepler project. It took an hour and cost $11. We did not want to process them on our local cluster because it is in heavy day-to-day operations use already. We are also generating multi-wavelength galactic plane mosaics to use in studies of the star formation history of our Galaxy. The processing would put excessive load on our local server.

A more difficult issue is assessing cost, as Amazon is pay-as-you-go. Users should really do a cost-benefit study to determine whether they are prepared to pay for the costs. Such a study need not be hard and can use results from other cost-benefit studies – see my earlier posts. And it can be cheap – raw processing costs very little, for instance, but transferring data can be expensive.
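
To give a feel for what the simplest possible version of such a study might look like, here is a back-of-the-envelope sketch in Python. It is only an illustration: the function names are my own invention, and every price in it is a made-up placeholder, not Amazon's actual rate. You would substitute the provider's current pricing and your own estimates of processing time and data volume.

```python
def estimate_cloud_cost(cpu_hours, price_per_cpu_hour,
                        gb_transferred, price_per_gb_transferred,
                        gb_stored=0.0, months_stored=0.0,
                        price_per_gb_month=0.0):
    """Break a pay-as-you-go bill into compute, transfer and storage.

    All prices are supplied by the caller; none are real published rates.
    """
    costs = {
        "compute": cpu_hours * price_per_cpu_hour,
        "transfer": gb_transferred * price_per_gb_transferred,
        "storage": gb_stored * months_stored * price_per_gb_month,
    }
    costs["total"] = sum(costs.values())
    return costs


def dominant_cost(costs):
    """Return the name of the largest component of the bill."""
    parts = {name: value for name, value in costs.items() if name != "total"}
    return max(parts, key=parts.get)


if __name__ == "__main__":
    # Hypothetical job: little processing, a lot of data moved.
    # The prices (0.25 per CPU hour, 0.50 per GB) are placeholders.
    job = estimate_cloud_cost(cpu_hours=4, price_per_cpu_hour=0.25,
                              gb_transferred=100,
                              price_per_gb_transferred=0.50)
    for name, value in job.items():
        print(f"{name:>8}: {value:.2f}")
    print("Largest component:", dominant_cost(job))
```

Splitting the bill into components makes it easy to see which one dominates, and so whether it is the processing or the data movement that deserves attention when planning a job.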

The above discussion highlights a real concern of mine, that astronomers are missing out on powerful tools for doing science.  In this age of data-intensive science, astronomers armed with knowledge of these technologies have a real competitive advantage. So what to do about it? In my view:

– Introduce IT into Ph.D. science curricula. IT knowledge is as important as knowing your field.

– Have an on-line journal dedicated to science computing.

– Publish web-pages telling astronomers how to get the most out of the cloud.  A discussion forum would be useful too.

– Hands-on How To sessions at meetings and conferences.

Posted in Astronomy, Cloud computing, High performance computing