Software Carpentry Lessons Learned – A Talk at SciPy2014

This is a video of a highly entertaining presentation by Greg Wilson, the founder of Software Carpentry, which teaches lab skills for scientific computing through intensive “boot camps” and on-line resources. The story of how Software Carpentry came about alone makes this worth watching (I won’t spoil things – it’s near the beginning). The video is based, albeit loosely, on an article called “Software Carpentry: Lessons Learned” in F1000Research, an open science journal aimed primarily at researchers in the life sciences. The paper and the presentation are refreshing for their honesty about what has worked and what has not.


Scientists are generally not taught the skills to design, write, test, debug, install and maintain software: they often have no more than a class or two in programming (I know that’s all I had). Software Carpentry has been aiming to remedy this, first through on-line tutorials and videos, and now through a series of short, intensive boot camps, 91 in all since 2013, attended by 3,500 people.

The content and organization of these boot camps have been honed through some hard-earned lessons. The on-line resources attracted up to 2,000 visitors a month, but they were considered too easy to earn credit in computer science classes, were considered unhelpful in getting science done, and did not induce people to contribute or update material. Early week-long classes were considered too long, and they presented textbook software engineering that scientists generally don’t find useful. Efforts to hold Massive Open On-line Courses (MOOCs, before the acronym took hold) were not that successful either – few people contributed material and only 10% completed the class (which later studies showed to be typical of MOOCs).

The big change came in 2012, with the introduction of short, intensive courses held on-site. With the change in format came a change in content: out went the textbook material, and in came tools that help scientists in their daily work and, along the way, teach them many of the methods of software engineering. The aim is to develop computational competence rather than to make people experts. The tools are disarmingly simple; I quote from the “Lessons Learned” paper:

  • The Unix shell. We only show participants a dozen basic commands; the real aim is to introduce them to the idea of combining single-purpose tools (via pipes and filters) to achieve desired effects, and to getting the computer to repeat things (via command completion, history, and loops) so that people don’t have to.

  • Programming in Python (or sometimes R). The real goal is to show them when, why, and how to grow programs step-by-step as a set of comprehensible, reusable, and testable functions.

  • Version control. We begin by emphasizing how this is a better way to back up files than creating directories with names like “final”, “really_final”, “really_final_revised”, and so on, then show them that it’s also a better way to collaborate than FTP or Dropbox.

  • Using databases and SQL. The real goal is to show them what structured data actually is (in particular, why atomic values and keys are important) so that they will understand why it’s important to store information this way.
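
To make the last point a little more concrete, here is a minimal sketch of my own; it is not taken from the Software Carpentry lessons or the paper. It uses Python’s built-in sqlite3 module, and the table names, column names and readings are all hypothetical. Each field holds a single (atomic) value, and each reading points back to a person through a key, so a question like “what is the mean temperature per person?” becomes one short query.

```python
import sqlite3


def build_db():
    """Create a small in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when asked to.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE person (
            person_id   TEXT PRIMARY KEY,   -- key: identifies one person
            family_name TEXT NOT NULL       -- atomic: one name per field
        );
        CREATE TABLE reading (
            reading_id INTEGER PRIMARY KEY,
            person_id  TEXT NOT NULL REFERENCES person(person_id),
            quantity   TEXT NOT NULL,       -- atomic: one quantity per row
            value      REAL NOT NULL        -- atomic: one number per row
        );
        """
    )
    conn.executemany(
        "INSERT INTO person VALUES (?, ?)",
        [("dyer", "Dyer"), ("lake", "Lake")],
    )
    conn.executemany(
        "INSERT INTO reading (person_id, quantity, value) VALUES (?, ?, ?)",
        [
            ("dyer", "temperature", -21.5),
            ("dyer", "temperature", -18.5),
            ("dyer", "salinity", 0.13),
            ("lake", "temperature", -16.0),
            ("lake", "salinity", 0.05),
        ],
    )
    return conn


def mean_by_person(conn, quantity):
    """Return {family_name: mean value} for one measured quantity."""
    rows = conn.execute(
        """
        SELECT p.family_name, AVG(r.value)
        FROM reading AS r
        JOIN person AS p ON p.person_id = r.person_id
        WHERE r.quantity = ?
        GROUP BY p.family_name
        ORDER BY p.family_name
        """,
        (quantity,),
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    print(mean_by_person(build_db(), "temperature"))
```

Had the readings been stored as a single text field such as “-21.5, -18.5”, the same question would need string parsing before any arithmetic could be done, which is the kind of pain that atomic values and keys are meant to avoid.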



Videos From the 2014 Sagan Summer Workshop On-line

The NASA Exoplanet Science Institute (NExScI) hosts the Sagan Workshops, annual themed conferences aimed at introducing the latest techniques in exoplanet astronomy to young researchers. The workshops emphasize interaction with data, and include hands-on sessions where participants use their laptops to follow step-by-step tutorials given by experts. This year’s conference topic was “Imaging Planets and Disks”. It covered topics such as:

  • Properties of Imaged Planets
  • Integrating Imaging and RV Datasets
  • Thermal Evolution of Planets
  • The Challenges and Science of Protostellar And Debris Disks…

The agenda, the presentations and the videos have all been posted on-line. Some of the talks are also on YouTube.

The presentations showcase the extraordinary richness of exoplanet research. If you are unfamiliar with NASA’s exoplanet program, Gary Blackwood provides an introduction (not available for embedding – visit the web page). My favorite talk, of many good ones, was Travis Barman speaking on the “Crown Jewels of Young Exoplanets.”




SciPy2014 videos on-line

This year’s annual Scientific Computing with Python conference, SciPy2014, was held in Austin, TX, July 6-12, 2014. This conference “… allows participants from academic, commercial, and governmental organizations to showcase their latest Scientific Python projects, learn from skilled users and developers, and collaborate on code development.”

As is customary, there were some excellent presentations, with talks and tutorials from many scientific disciplines. You can view them all on-line, and I will write posts about some of them in the coming weeks. Some of my favorite talks:

Computational Thinking is Computational Learning, Lorena Barba

The Wonderful World of Scientific Computing with Python, David Sanders

Astropy and astronomical tools Part I, Greenfield, Bray, Robitaille, Aldcroft


The CAVE2 Immersive Visualisation Platform at Monash University

The availability of massive data sets is driving the development of new large-scale visualization systems. One such system is the CAVE2 Immersive Visualisation Platform built at Monash University. Described as a next-generation immersive hybrid 2D and 3D virtual reality environment, this facility exploits the latest design deployed at the Electronic Visualization Laboratory at the University of Illinois at Chicago. It combines scalable-resolution display walls with virtual-reality methods to create a seamless 2D/3D environment. The virtual-reality simulation aims to have a resolution that can match human visual acuity.

Here is an introductory video:

By the numbers, it features:

  • A display system with a curved video wall of eighty 46″ 3D LCD panels, arranged in 20 four-panel columns;
  • Head and motion tracking facilities;
  • 3D sound facilities;
  • A high-performance compute and render cluster delivering one trillion computations per second for each of the 80 screens, for real-time display of 2D and 3D imagery;
  • A super-fast local disk system, enabling, for example, 30 frames-per-second playback of 84-million-pixel images; and
  • A high performance 10Gbps network fabric.
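
To get a feel for what these figures imply, here is a back-of-the-envelope sketch of my own. The panel count, per-screen compute rate, frame size, frame rate and network speed come from the list above; the 3 bytes per pixel (uncompressed 24-bit color) is purely my assumption, since the stored image format is not stated, so treat the resulting data rate as illustrative only.

```python
# Figures quoted in the list above.
COLUMNS = 20
PANELS_PER_COLUMN = 4
OPS_PER_SCREEN_PER_SECOND = 10**12   # "one trillion computations per second"
PIXELS_PER_FRAME = 84_000_000        # "84 million pixel images"
FRAMES_PER_SECOND = 30
NETWORK_BITS_PER_SECOND = 10 * 10**9  # "10Gbps network fabric"

# ASSUMPTION (not from the source): uncompressed 24-bit color.
ASSUMED_BYTES_PER_PIXEL = 3


def panel_count():
    """Number of LCD panels in the video wall."""
    return COLUMNS * PANELS_PER_COLUMN


def aggregate_ops_per_second():
    """Compute rate summed over every screen."""
    return panel_count() * OPS_PER_SCREEN_PER_SECOND


def playback_bytes_per_second(bytes_per_pixel=ASSUMED_BYTES_PER_PIXEL):
    """Data rate needed to play frames back under the assumed pixel size."""
    return PIXELS_PER_FRAME * FRAMES_PER_SECOND * bytes_per_pixel


def network_bytes_per_second():
    """Network speed converted from bits to bytes per second."""
    return NETWORK_BITS_PER_SECOND // 8


if __name__ == "__main__":
    print("panels:", panel_count())
    print("aggregate ops/s:", aggregate_ops_per_second())
    print("playback GB/s:", playback_bytes_per_second() / 1e9)
    print("network GB/s:", network_bytes_per_second() / 1e9)
```

Under that assumption, playback needs several gigabytes per second, more than a single 10Gbps link can carry, which would be one reason to put a fast disk system next to the display rather than streaming everything over the network.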

Users will be able to build their own applications to take advantage of the following middleware:

  • The Scalable Adaptive Graphics Environment (SAGE) for large-screen, hybrid 2-d/3-d collaborative applications and meetings. SAGE comes configured with applications including image display (2-d and 3-d), movie playing (e.g. H.264 HD content), audio file playback and PDF document display. Additional applications we will be supporting in the near term (late 2013, early 2014) include desktop sharing. For programmers, SAGE provides the ability to stream pixels in from third-party programs running locally or remotely.
  • Omegalib for immersive 3-d mode including control input and head-tracked stereoscopic visualisation.
  • CalVR for immersive 3-d mode including control input and head-tracked stereoscopic visualisation.

Here are some of the data formats that can already be displayed:

  • 3-d geometry models as composites of one or more Alias Wavefront (OBJ) format files.
  • 3-d geometry stored in a single Autodesk FBX format model.
  • H.264-encoded Quicktime (MOV) files up to HD resolution.
  • Single-frame high-resolution 2-d images (JPEG).


