The Kid Is Back: The Next Generation of the Montage Image Mosaic Engine

I am delighted to say that we have received funding from the National Science Foundation (NSF) to deliver the next generation of the Montage Image Mosaic Engine. This new effort responds to the dramatic evolution of the computational landscape in astronomy over the past few years. Over the next two years, we will deliver:

  • Support for data cubes.
  • Support for two sky partitioning schemes, the Hierarchical Equal Area isoLatitude Pixelization (HEALPix), standard in cosmic background experiments; and the Tessellated Octahedral Adaptive Subdivision Transform (TOAST), used in immersive platforms such as the World Wide Telescope.
  • A set of turnkey tools and associated tutorials that will enable astronomers who are not experts in distributed platforms and technologies to launch and manage processing at scale.
  • A library that will allow Montage to be run directly from languages such as Python.
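
As a rough illustration of the HEALPix scheme named in the list above, the sketch below computes the number of equal-area pixels and an approximate pixel size for a given resolution parameter. It relies only on the standard HEALPix relation N_pix = 12 × N_side²; the function names are hypothetical and are not part of Montage.

```python
import math


def healpix_npix(nside: int) -> int:
    """Number of equal-area pixels on the sphere for a given N_side."""
    if nside < 1:
        raise ValueError("nside must be a positive integer")
    return 12 * nside * nside


def healpix_pixel_size_arcmin(nside: int) -> float:
    """Approximate pixel size: square root of the pixel area, in arcminutes."""
    area_sr = 4.0 * math.pi / healpix_npix(nside)  # pixel area in steradians
    return math.degrees(math.sqrt(area_sr)) * 60.0


if __name__ == "__main__":
    for nside in (1, 64, 1024):
        print(nside, healpix_npix(nside), round(healpix_pixel_size_arcmin(nside), 2))
```

Doubling N_side quadruples the pixel count and halves the pixel size, which is why the scheme suits hierarchical, multi-resolution sky maps.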

Montage has recently been relicensed, and is now available under a BSD 3-clause license. We will be making the code available on GitHub.  We will also overhaul the web page and revive the Montage blog.

The project staff are: Bruce Berriman (PI), John Good (Architect), Marcy Harbut (Documentation), Tom Robitaille and Ewa Deelman (collaborators). We are guided by a Users’ Panel consisting of Adam Ginsburg, August Muench and Suzanne Jacoby.

Just to whet your appetite, we show  a short video that displays the structure of a molecular disk wind in HD 163296, measured by ALMA (PI: M. Rawlings). The video shows a re-projection by Montage of a data cube of the star that covers multiple velocities relative to the center of the CO J=3-2 line.

And here is a poster that describes some of the features we will be delivering, presented at the 2015 NSF SI2 PI Workshop, February 15 and 16 2015 in Arlington, VA.

Montage-SI2-PI-Meeting-2015-Feb-11-fixed

PDF version:

Montage-SI2-PI-Meeting-2015-Feb-11-fixed

 

 

Posted in Uncategorized | Leave a comment

There Is Life – and A Career – Outside Astronomy!

This week, I am going to write about a different topic: jobs outside astronomy. It’s a topic that is, in fact, close to my heart, as I spent four illuminating years outside astronomy, working on calibration and quality assurance for Earth Science remote sensing missions at Goddard Space Flight Center. Those positions turned out to be highly formative for me: I learned the software engineering skills that prepared me for my current work in building archives and managing data.

So it was with great interest that, at the 225th AAS Meeting in Seattle,  I attended a standing room only Special Session of the  Working Group on Astroinformatics and Astrostatistics and listened to a talk by Jessica Kirkpatrick on how she made the transition from astronomer to data scientist, perhaps one of the most enticing positions for astronomers. The talk is on Jessica’s blog at http://berkeleyjess.blogspot.com/2015/01/astrophysicist-to-data-scientist-talk.html, and her blog in general has a lot of advice on making this transition – young scientists out there, go take a look if you are considering such a career move.

Jessica explained why, after doing a Ph.D. on statistical analysis of SDSS quasars, she became a data scientist. Most of her reasons will look familiar: availability of jobs, higher salaries, improved work-life balance. She went on to describe the skills needed to enter this type of work. These are not only technical skills (Python, SQL, git, …) but also the ability to work in an interdisciplinary team and the ability to work with customers.

Perhaps the most heartening lesson from the presentation is that skills developed to study astronomy can be directly transferred to or adapted for new fields. I certainly found that my skills in calibrating astronomical instruments and in quality assurance were of value in remote sensing. So while astronomy jobs may be scarce these days, it is also true that there are many interesting problems out there to investigate, and the skills needed to do astronomy are much needed in the big wide world.

I am sometimes asked to give early-career scientists some advice on careers, so I will end by slightly rephrasing something I wrote in 2010: “… you can have a fulfilling career … outside the traditional faculty path. There are many ways to make a difference … outside pure research positions. Re-invent yourself every few years – it prevents burn out, and helps you develop broader skills. Look for opportunities rather than waiting for them. Try things out – you may surprise yourself in discovering new interests and talents.”

Posted in astroinformatics, Astronomy, blogging, Career Advice, careers, information sharing, jobs, programming, Python, Scientific computing, social media, social networking, software engineering

Licensing Astrophysics Codes session at AAS 225

On Tuesday, January 6, the ASCL, AAS Working Group on Astronomical Software (WGAS), and the Moore-Sloan Data Science Environment sponsored a special session on software licenses, with support from the AAS. This subject was suggested as a topic of interest in the Astrophysics Code Sharing II: The Sequel session at AAS 223.

Frossie Economou from the LSST and chair of the WGAS opened the session with a few words of welcome and stressed the importance of licensing. I gave a 90-second overview of the ASCL before turning the podium over to Alberto Accomazzi from the NASA Astrophysics Data System (ADS), who introduced the panel of speakers and later moderated the open discussion (opening slides), after which Frossie again took the podium for some closing remarks. The panel of six speakers discussed different licenses and shared considerations that arise when choosing a license; they also covered institutional concerns about intellectual property, governmental restrictions on exporting codes, concerns about software beyond licensing, and information on how much software is licensed and characteristics of that software. The floor was then opened for discussion and questions.

Discussion period moderated by Alberto Accomazzi

Presentations
Some of the main points from each presentation are summarized below, with links to the slides used by the presenters.

  • Copy-left and Copy-right, Jacob VanderPlas (eScience institute, University of Washington)
    Jake exhorted everyone to always license codes, as in the US, copyright law defaults to “all privileges retained” unless otherwise specified. He pointed out that “free software” can refer to the freedoms that are available to users of the software. He covered the major differences between BSD/MIT-style “permissive” licensing and GPL “sticky” licensing while acknowledging that the difference between them can be a contentious issue.
    slides (PDF)
  • University tech transfer perspective on software licensing, Laura L. Dorsey (Center for Commercialization, University of Washington)
    Universities care about software licenses for a variety of reasons, Laura stated, which can include limiting the university’s risk, respecting IP rights, complying with funding obligations, and retaining academic and research use rights. She also covered factors software authors may care about, among them receiving attribution, controlling the software, and making money. She reinforced the importance of licensing code and discussed the common components of a software license.
    slides (PDF)
  • Relicensing the Montage Image Mosaic Engine, G. Bruce Berriman (Infrared Processing and Analysis Center, Caltech)
    In last year’s Astrophysics Code Sharing session, Bruce had discussed the limitations of the Caltech license under which the code Montage was licensed; since then, Montage has been relicensed to a BSD 3-Clause License. Following on the heels of Laura’s discussion and serving as a case study for institutional concerns regarding software, Bruce related the reasons for and concerns about the relicensing, and discussed working with the appropriate office at Caltech to bring about this change.
    slides (PDF)

Restricted algorithms; image by Adam M. Jacobs

  • Export Controls on Astrophysical Simulation Codes, Daniel Whalen (Institute for Theoretical Astrophysics, University of Heidelberg)
    Dan’s presentation covered some of the government issues that arise from research codes, including why certain codes fall under export controls; a primary reason is to prevent the development of nuclear weapons. Dan also brought up how foreign intelligence agencies collect information and what specific simulations are restricted, and stated that Federal rules are changing, but slowly.
    slides (PDF)
  • Why licensing is just the first step, Arfon M. Smith (GitHub Inc.)
    Arfon went beyond licensing in his presentation to discuss open source and open collaborations, and how GitHub delivers on a “theoretical promise of open source.” He shared statistics on the growth of collaborative coding using GitHub, demonstrated how a collaborative coding process can work, and pointed out that through this exposed process, community knowledge is increased and shared. He challenged the audience to contemplate the many reasons for releasing a project and to ask themselves what kind of project they want to create.
    slides (PDF)
  • Licenses in the wild, Daniel Foreman-Mackey (New York University)
    First, I have to note that Dan made it through 41 slides in just over the six minutes allotted for his talk, covering about seven slides/minute; I don’t know whether to be more impressed with his presentation skills or the audience’s information-intake abilities! After declaring that he knows nothing about licensing, Dan showed us, and how, that he knows plenty about mining data and extracting information from it. From his “random” selection of 1.6 million GitHub repositories, he noted with some glee that 63 languages are more popular on GitHub than IDL is, that the share of repositories with licenses has increased since 2012 to 17%, and that only 28,972 of the 1.6 million mentioned the license in the README file. Dan also determined the popularity of various licenses overall and by language and shared that information as well.
    slides (PDF)


Percentage of licensed GitHub repos; image by Arfon Smith

Open Discussion
After Dan’s presentation, Alberto Accomazzi opened the floor for discussion. Takeaway points included:

  • Discuss licensing with your institution; it’s likely there is an office or staff member devoted to dealing with these issues
  • This office is likely very familiar with issues you bring to it, including who to refer you to when the issues are outside their purview
  • “Friends don’t let friends write their own licenses.” IOW, select an existing license rather than writing your own
  • License your code
  • Let others know how you want your code cited/acknowledged

My thanks to David W. Hogg, Kelle Cruz, Matt Turk, and Peter Teuben for work — which started last March! — on developing the session, to Alberto for his excellent moderating and to Frossie for opening and closing it. My thanks also to the wonderful Jake, Laura, Bruce, Dan W, Arfon, and Dan F-M for presenting at this session, and to the Moore-Sloan Data Science Environment and AAS for their sponsorship.

Resources
Many resources on licensing, including excellent posts by Jake and Bruce, can be found here.

This post first appeared in the ASCL blog and is reproduced here with the permission of the ASCL Editor, Alice Allen. Disclosure: I am a member of the ASCL Advisory Board.

Posted in astroinformatics, Astronomy, blogging, BSD, Computing, GitHub, GPL, informatics, information sharing, Licenses, Montage, Open Access, Open Source, programming, publishing, Scientific computing, social media, social networking, software engineering, software maintenance, software sustainability, user communities

It’s impossible to conduct research without software, say 7 out of 10 UK researchers

This is the title of a blog post at the UK’s Software Sustainability Institute, which reports the results of a survey conducted amongst 417 UK researchers across many disciplines. This is the most extensive survey of its type conducted to date, and the data are available for download. The key findings, as summarized in the blog post, are:

  • 92% of academics use research software
  • 69% say that their research would not be practical without it
  • 56% develop their own software (worryingly, 21% of those have no training in software development)
  • 70% of male researchers develop their own software, and only 30% of female researchers do so.

These results are summarized in the graphic below:

SoftwareSurveyGraphs

The wordle below summarizes the languages used by survey participants (there were a total of 566 packages reported altogether):

ResearchSoftwareSurveyWordle27Nov14 copy

Perhaps the most striking conclusion is that a large number of researchers develop their own software, but relatively few have had software training. One other striking finding is that when asked whether they had included software costs in proposals “22% said that they had, 57% said they had not, and 20% said that they had not even though they knew software development would make up part of the bid!” So while software is crucial to research, it has far to go in being recognized as such by educators and granting agencies.

Based on http://www.software.ac.uk/blog/2014-12-04-its-impossible-conduct-research-without-software-say-7-out-10-uk-researchers by Simon Hettrick. The figures are reproduced from this article.

Posted in astroinformatics, Astronomy, cyberinfrastructure, education, informatics, information sharing, programming, Python, R, Scientific computing, social media, social networking, software engineering, software maintenance, software sustainability

Collaborating With GitHub – 225th AAS Meeting

We are in the process of making the Montage mosaic engine accessible through GitHub, so it was good timing to attend the workshop on Collaborating With GitHub at the 225th American Astronomical Society (AAS) meeting in Seattle, WA (Jan 4-8, 2015). The class was attended by about 25 people, and taught by Arfon Smith, Matthew McCullough, and Gus Muench. Rather than describe the details of the class, I will refer readers as much as possible to on-line resources and training materials that they can follow to get started with GitHub. The material we covered can also be reviewed in Arfon Smith’s on-line slides at https://training.github.com/kit/foundations/index.html?teacher=arfon.

Set-up for the class was straightforward (I believe this is a standard set-up for GitHub training) and is described at https://training.github.com/articles/github-class-prerequisites/. It involves establishing a GitHub account, installing the Git command line and desktop program for your machine, and, optionally, reading the first few chapters of the free ProGit book.

The class itself involved the following topics:

  • Introduction to Git and GitHub.
  • Paper and analysis workflows (discussion of complexities in collaborative workflows and how they might benefit from incorporating Git and GitHub)
  • Collaborative changes  (working with partners to edit documents)
  • Richer collaboration  (e.g. permissions)
  • Advanced topics (how GitHub plugs into existing tools).

You can learn the basics of setting up Git and creating and working with repositories at https://help.github.com/articles/set-up-git/ and in Chapter 2 of the ProGit book.

GitHub is a powerful tool for collaboration, including writing papers as well as developing code. Chapter 3 of the ProGit book tells you about the mechanisms for doing this: pushing changes and opening pull requests, forking a repository to create your own copy, and monitoring the project for changes.

One of the most interesting topics I learned about was how GitHub plugs into powerful tools. The example we studied was integration into ShareLaTeX, an on-line collaborative LaTeX editor. Learn how to connect it to your GitHub account at https://www.sharelatex.com/github/

 

 

Posted in astroinformatics, Astronomy, Computing, cyberinfrastructure, document management, GitHub, High performance computing, informatics, information sharing, metrics, Open Access, Open Source, programming, publishing, Scientific computing, social media, social networking, software engineering, software maintenance, software sustainability, user communities, Version Control, Web 2.0

Software Carpentry Workshop at the 225th AAS Meeting

Many of you have heard of the Software Carpentry Foundation, a non-profit organization started some years ago by Greg Wilson, whose members teach researchers basic software skills. The organization exists to teach fundamental software engineering practices to scientists, who in many fields spend as much as 40% of their time on programming, yet are mostly self-taught. One of the organization’s most successful teaching methods is the software boot camp, an immersive, intensive workshop lasting one or two days whose content is customized to the needs of the audience. These boot camps are taught by volunteers, generally scientists who have themselves attended a boot camp: to date, 81 instructors have taught 265 workshops for 9,000 learners. And the instructors themselves are usually not programming experts: they are people who have learned valuable skills at boot camps and want to pass this knowledge to others.

I attended my first boot camp at the 225th American Astronomical Society Meeting in Seattle. I have an interest in how to best educate astronomers in software engineering skills and on developing sustainable software, so I wanted to see first-hand the dynamics of how the boot camps work. And of course, I could pick up some new software skills.  The pic below shows us hard at work, being taught the essentials of Python by Erik Bray.

photo copy

The class took place over two days, Jan 3 and Jan 4, though I was only able to attend on Jan 3. It was taught by Azalee Bostroem, Erik Bray, Matt Davis, and Phil Rosenfield, ably assisted by Pauline Barmby. The topics covered are listed below, with more detail on the workshop page at http://abostroem.github.io/2015-01-03-aas/:

  • Automating tasks with the Unix shell (Day One)
  • Building programs with Python (Day One)
  • Code Review (Day One)
  • Version control with Git (Day Two)
  • Python for Astronomers (Day Two)

Preparation was straightforward.  A few days before the meeting, we were asked to fill out a short questionnaire to allow the instructors to assess the skill level of participants, and to install required software on our laptops, including “all-in-one” scientific Python installer Anaconda.  I had some trouble installing all the Python packages, but was able to solve the problems quickly enough using the troubleshooting guide. The instructors were on hand one hour before class started to help participants sort out any installation problems.

The instruction was by design hands-on and interactive. In general, the instructors showed us on the projector how to perform various tasks, and we followed along on our laptops. Then we carried out various exercises by ourselves or in small groups, usually taking 5 to 10 minutes. The instructors circulated to help out anyone who requested assistance. Finally, we went over at least one solution on the projector.

A complete record of all activities was kept in an Etherpad file (I am not sure whether this is or will remain public) shared among all the participants, who took advantage of it to ask questions that the instructors answered quickly. The Etherpad record, as well as all the class materials, allows us to go back and reproduce the exercises, or to do them for the first time when things went a little too quickly.

Matt Davis introduced the Unix shell as a powerful tool for automating operations on files. We broadly followed the on-line tutorial at http://software-carpentry.org/v4/shell/ and there is a Unix reference guide at http://abostroem.github.io/2015-01-03-aas/novice/ref/01-shell.html.

The Python part of the syllabus was intended to introduce us to the important concepts in programming. The lesson plan is on-line at http://abostroem.github.io/2015-01-03-aas/intermediate/python-review/index.html, and the class notes and materials are at http://nbviewer.ipython.org/github/abostroem/2015-01-03-aas/blob/gh-pages/intermediate/python-review/python-full.ipynb. The lessons were carried out by constructing an IPython notebook. By the end of the session, we were able to:

  1. Describe and distinguish the seven core elements shared by all programming languages.
  2. Use Python to write simple programs that use these core elements, using both the core library and scientific packages such as numpy.
  3. Make and save simple publication-quality plots using matplotlib (and APLpy, time permitting).
  4. Read, manipulate, and save data files in csv and text formats.
  5. Create standalone Python scripts that can be run from the command line.

The seven core elements referenced in item 1 are:

  1. Individual things (the number 2, the string ‘hello’, a matplotlib figure)
  2. Commands that operate on things (the + symbol, the len function)
  3. Groups of things (Python lists, tuples, and dictionaries)
  4. Ways to repeat yourself (for and while loops)
  5. Ways to make choices (if and try statements)
  6. Ways to create chunks (functions, objects/classes, and modules)
  7. Ways to combine chunks (function composition)
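
To make the list above concrete, here is a minimal sketch that touches each of the seven elements in turn. It is not taken from the workshop materials; the variable names, the sample magnitudes, and the helper functions are illustrative assumptions.

```python
import math

# 1. Individual things: a number and a string.
count = 2
greeting = "hello"

# 2. Commands that operate on things: the + operator and the len function.
total = count + 3
length = len(greeting)

# 3. Groups of things: a list, a tuple, and a dictionary.
magnitudes = [12.1, 13.4, 11.8]
position = (10.68, 41.27)
catalog = {"M31": position}


# 6. Ways to create chunks: small functions.
def mag_to_flux(mag):
    """Relative flux for a magnitude (illustrative zero point of 0)."""
    return 10 ** (-0.4 * mag)


def flux_to_mag(flux):
    """Inverse of mag_to_flux."""
    return -2.5 * math.log10(flux)


# 4. Ways to repeat yourself: a for loop.
fluxes = []
for mag in magnitudes:
    fluxes.append(mag_to_flux(mag))

# 5. Ways to make choices: an if statement and a try statement.
if min(magnitudes) < 12.0:
    label = "bright"
else:
    label = "faint"


def safe_log10(value):
    """Return log10(value), or None when the logarithm is undefined."""
    try:
        return math.log10(value)
    except ValueError:
        return None


# 7. Ways to combine chunks: function composition.
def round_trip(mag):
    """Convert a magnitude to flux and back again."""
    return flux_to_mag(mag_to_flux(mag))
```

Running the file in an interactive session and inspecting `fluxes`, `label`, or `round_trip(12.1)` is one way to explore how the pieces fit together.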

Matt Davis led us in a code review session. We split up into groups of three or four to give our critiques of a common Python script, and compared notes at the end. The exercise was intended to show how even rapid code reviews by third parties can improve the quality of a piece of code.

So what did I learn? I am not a Python expert, but I learned basic Python during the lesson, and I was impressed with the Python notebook for keeping a record of my efforts. The lesson on code review was instructive, and I think participants came away seeing the value of such reviews (as a manager of software projects, I am an advocate of code reviews). When I chatted with participants afterwards, there was strong agreement that scientists can learn important software techniques through these immersive classes, and gain the confidence to learn more. If you want to get the most out of these classes, be prepared for a day or two of sustained hard work: the boot camp moniker is apt. Finally, I am impressed by Software Carpentry’s business model for sustaining these classes: bootstrapping itself to develop a cadre of instructors and sustain its body of expertise.

Posted in astroinformatics, Astronomy, Computing, cyberinfrastructure, education, informatics, information sharing, Open Source, programming, publishing, Python, Scientific computing, social networking, software engineering, software maintenance, software sustainability, user communities

The dot Astronomy 6 Conference

For my last post of 2014, I would like to draw everyone’s attention to the on-line material about the dot Astronomy 6 Conference, held this year at the Adler Planetarium in Chicago, on December 8-10. This annual meeting on on-line astronomy generally has 50-60 participants, and includes a full day hack session, on-the-fly presentations and discussion sessions, as well as traditional talks. The aim of these meetings is to use the power of the internet to unleash creative new ideas.

This year, Brooke Simmons (Day One), Meredith Rawls (Day Two) and Elisabeth Newton (Day Three) each did a fine job of recording activities. I will summarize highlights for each day, but I recommend reading the daily blogs to get a complete picture of events.

Day One: Arfon Smith described how GitHub is a powerful tool for collaborations, and Erin Braswell talked about the Open Science Framework, “… a 1-stop shop bringing together all the different tools we use for science… You can compartmentalize projects into components, some of which are allowed to be private, and identify collaborators who are working on that component (so that people get appropriate credit). Version control is built in and it’s integrated with GitHub, so you can always link back to specific states of a project or file.” Dustin Lang talked about how he used Astrometry.net to analyze images of Comet Holmes. Alberto Pepe talked about how Authorea supports collaborative paper writing. Geert Barentsen opined that the data deluge is largely hyped, except in radio astronomy.

Day Two:  Hack day. A sampling of the offerings —

  • Astronomy will be able to officially name an exoplanet when the IAU people get back to us and approve our organization… in about a month
  • There is now an astronomy hubot called botastro
  • Chromoscope.net now has a trippy version: kaleidochromescope
  • Survey of software use by astronomers is in progress. Take the survey
  • We’re learning how APOD images are shared and which ones are most popular. Everyone’s favorites seem to be skyscapes, and it seems to be that the most popular images feature the Earth.
  • Progress on hacking citations: they have scraped metadata for citations in papers and are working on a paper recommendation engine (see #astrorec)
  • New Rate My Institution survey is in development and seeking feedback
  • New website featuring a timeline of museum artifacts that are astronomically related
  • Browser extension called “unclockify” will transform hms RA/Dec coordinates into decimals
  • A baby website that juxtaposes two paper titles. User clicks on the one they think has more citations and sees if they are right or not.
  • Kepler sonification: generating a pop song solely from Kepler data is in progress, a la keplerphone.
  • New game called “transits”… sounds a lot like my research, ha! Exoplanets and star orbits and such.
  • Single-column tablet-friendly emulateApJ LaTeX template
  • Statistics tutorials are making lots of progress, exploring effects of including or excluding outliers, will have an AstroBetter category called “tutorials”
  • There is a super epic parody video in the works… I am excited
  • The new astrobites website looks much shinier but is still a work in progress
  • Map-based visualization of the AAS job registry is coming along nicely
  • App to fill in citations you’re missing based on papers you are already citing: Cite Me Maybe
  • Automatic benchmarking tool “.Travis” for use with github projects is getting started
  • Created a visualization for galaxy zoo classifications
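
The coordinate conversion mentioned in the “unclockify” item above is simple arithmetic, and a minimal sketch of it follows. This is not the extension’s code: the function names and the colon-separated input format are assumptions made for illustration. Right ascension in hours is multiplied by 15 to give degrees, and the sign of the declination is read from the string so that values such as -00:30:00 are handled correctly.

```python
def sexagesimal_to_decimal(text: str) -> float:
    """Convert a colon-separated 'A:B:C' string to A + B/60 + C/3600, keeping the sign."""
    text = text.strip()
    sign = -1.0 if text.startswith("-") else 1.0
    parts = [float(p) for p in text.lstrip("+-").split(":")]
    if len(parts) != 3:
        raise ValueError("expected three colon-separated fields")
    whole, minutes, seconds = parts
    return sign * (whole + minutes / 60.0 + seconds / 3600.0)


def ra_to_degrees(ra_hms: str) -> float:
    """Right ascension 'hh:mm:ss.s' to decimal degrees (1 hour = 15 degrees)."""
    return 15.0 * sexagesimal_to_decimal(ra_hms)


def dec_to_degrees(dec_dms: str) -> float:
    """Declination '+dd:mm:ss.s' to decimal degrees."""
    return sexagesimal_to_decimal(dec_dms)


if __name__ == "__main__":
    print(ra_to_degrees("00:42:44.3"), dec_to_degrees("+41:16:09"))
```

Taking the sign from the text rather than from the numeric value of the first field is the main design choice here, since a leading field of -00 would otherwise lose its sign.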

Day Three:  Much of the day was devoted to demos of Day Two’s hacks.  Two topics of broad interest got my attention. To quote, “Two sessions focused .Astronomy themes of open access and software development. One continued the discussion about open access journals, bringing up points about peer review and commenting, curation of articles, and the various cultural barriers to the adoption of new publishing processes. As one of the hacks showed, astronomers tend to have little or no training in that area. Another session discussed how to integrate some of these skills into graduate curricula, including both formal and informal education.”

 

Posted in astroinformatics, Astronomy, astronomy surveys, Computing, data archives, Data Management, Hack Days, informatics, information sharing, Internet, On-line Journals, Open Access, Open Source, programming, publishing, Python, Scientific computing, social media, social networking, software engineering, software maintenance, software sustainability, Time domain astronomy, Uncategorized, user communities, Virtual Observatory, Web 2.0