Scientific Programming Does Not Compute?

This week’s post is about an article published in Nature by Zeeya Merali entitled “Why Scientific Computing Does Not Compute” (Nature, 467, 775; October 14, 2010). Merali develops a compelling case that most scientists teach themselves computing and that this seriously compromises the quality of scientific software. As I found out myself in my younger days as an astronomy postdoc, we teach ourselves just enough to be dangerous. Merali quotes a survey by Greg Wilson (Toronto, Canada) which revealed that although 45% of researchers spent more time writing code than they did 5 years earlier, only 34% saw the need for more formal training and only 47% had a good understanding of testing.

So what to do? Wilson himself created an excellent on-line training class, Software Carpentry, to teach scientists the necessary software engineering skills, including version control, testing and validation. The U.K. Software Sustainability Institute was established to advise researchers on constructing quality, reusable code. I was a member of a workshop which strongly endorsed the creation of a corresponding organization in the U.S.

Yet I think we are missing an important part of the solution: per an earlier post, I think scientists should be required to have formal instruction in computing and software engineering in graduate school. They should be expected to meet a minimum acceptable standard for graduation, just as they do for their major. And this instruction should emphasize that quality software starts with specification and design and ends with delivery and documentation; coding is just one part of the job. Courses such as Software Carpentry would make an excellent starting point for designing classes.


6 Responses to Scientific Programming Does Not Compute?

  1. Pingback: — Software development for scientists

  2. Steve B says:

    Interesting post, Bruce. I’ll add that a lot of this stems from a misconception among scientists that “software” is equivalent to “code”; in some sense, the “metadata” of software (requirements, testing, etc.) is as important as the code itself. It is interesting to note that historically, it has been the physical sciences pushing for this kind of reform, but increasingly, biological and social sciences are utilizing large amounts of computer resources in pursuit of research goals. This is reflected in the attendance of the workshop: only perhaps a few percent of attendees were not from the physical sciences. Indeed, quantitative biology projects (like NEON) require increasingly sophisticated cyberinfrastructure, part of which is an excellent suite of software, but finding professionals with both domain science and software engineering skills is next to impossible.

    I agree that this is something that should be instilled as part of a professional training programme, but the vast majority of practicing scientists are without such training, and are loath to undertake curriculum revision to include it. This is despite objections to the lack of code transparency in published papers, and movements to force authors to include their code in order to gain submission approval.

    There is a bit of a culture clash here as well. While many academic staff acknowledge the usefulness of software in research, many lack expertise, much less formal training. Simultaneously, incoming students are increasingly leveraging modern software technologies to get their work done. As a result, there is little departmental impetus from above for change, much less support for reform of graduation requirements from the relevant graduate divisions.

    This is a rather pervasive problem, and it will be interesting to see how things play out!

  3. astrocompute says:

    Thanks for this interesting post. You raise a very good point: those who should introduce the changes are the ones least likely to do it. I am inclined to think that the funding agencies may be in the best position to begin making changes, and there are signs that this may happen. The agencies demand, rightly so, reviews of data products before they hit the streets. Why not make demands of the software that produced them?

  4. Eli Bressert says:

    Training scientists during their graduate studies is a great idea. I have seen many scientists struggle with getting code done quickly/efficiently because they were never trained how to program. Knowing the basic algorithms to apply to problems and having knowledge of how to program in the key scientific languages (e.g. C, Fortran, Python) should definitely be a must.

    There are a handful of people I know who came from a programming background and embarked on a science-related Masters/PhD study. They have all excelled quickly since their learning curve was minimal. They could jump right into the heart of the science without worrying about programming. Maybe science-based B.A./B.Sc. degrees should already include programming in their curricula?

  5. Warrick Ball says:

    I’ve just picked up this article from last month and thought I’d point it out as it’s on the same subject.

  6. astrocompute says:

    Thanks for the reference – I will look it up.
