Free the Science: One Scholarly Society’s bold vision for open access and why it matters now more than ever

This is a repost of a very interesting article by Ellen Finnie in the IO: In The Open blog. Ellen is a member of the ECS Group of Advisory Librarians (EGALs).

The Electrochemical Society, a small nonprofit scholarly society founded in 1902, has an important message for all of us who are concerned about access to science.   Mary Yess, Deputy Executive Director and Chief Content Officer and Publisher, could not be clearer about the increased urgency of ECS’ path:  “We have got to move towards an open science environment. It has never been more important – especially in light of the recently announced ‘gag orders’ on several US government agencies– to actively promote the principles of open science.”    What they committed to in 2013 as an important open access initiative has become, against the current political backdrop, truly a quest to “free the science.”

ECS’s “Free the Science” program is designed to accelerate the ability of the research ECS publishes — for example, in sustainable clean energy, clean water, climate science, food safety, and medical care — to generate solutions to our planet’s biggest problems.  It is a simple and yet powerful proposition, as ECS frames it:

“We believe that if this research were openly available to anyone who wished to read it, anywhere in the world, it would contribute to faster problem solving and technology development, accelerate the pace of scientific discovery, encourage innovation, enrich education, and even stimulate the economy.”

How this small society — which currently publishes just 2 journals — came to this conclusion, and how they plan to move to an entirely open access future, is, I believe, broadly instructive at a time when our political environment has only one solid state: uncertainty.

ECS’s awakening to OA was jump-started by the 2013 OSTP memorandum on public access to research.   It became clear to the ECS that while their technical audience had perhaps not at that time fully embraced open access, the OSTP memo represented a sea change.  By spring of 2013, the board had resolved that ECS was heading towards OA, and they launched a hybrid open access option for their key journals in 2014.

And here’s where the story gets even more interesting.  If you look only at their first offering in 2014 or even their current offerings, you won’t immediately see their deeper plan, which goes well beyond hybrid OA.  For ECS, as Yess clearly indicates, “Gold open access is not the way to go.”   In fact, ECS “doesn’t believe in gold open access,” seeing it as “just a shell game.”

As Yess explains it, “If we hit tipping point to all gold OA, the big commercial players will simply flip all their journals to OA, and the subscription money from library budgets will come out of author budgets, costs will spiral up and we’ll be in the same escalating price environment we’ve been suffering from for years.”  So Yess is “skeptical about gold working.  Given the size and market share of the large STM publishers, they will make Gold OA work to their stakeholders’ benefit, and it will not benefit researchers and their communities.”

There is broad (though hopefully not divisive or overly distracting) debate about whether the APC market will function well for research libraries, and what adjustments to this APC market might make it work.  But meanwhile, what’s a society – the sole nonprofit society to still be publishing their own journals in the relevant disciplines — to do?  ECS’s multi-pronged and contingency-based path is one we could all benefit from watching.  What they envision is “a community of people supporting the content.”  Their insight is to work in the same framework they have had since 1902 — community support– but to evolve what that community support looks like.

Under their subscription-based publishing model, they had relied on a combination of library subscriptions, the Society’s own coffers, and authors’ page charges. Competition from commercial publishers forced ECS to eliminate page charges and to rely on subscriptions and other revenue to support the publications program.  This model has already shown signs of falling apart, with ECS, like many smaller societies, increasingly edged out by big deals from major publishers which preclude cancellations of their journals.

So ECS felt they needed to think differently.  Starting with offering hybrid OA in their flagship journals (rather than launching new OA-specific titles) has allowed the ECS to “test the waters” and has introduced OA to their community of scholars, generating interest around all of the issues.   They started with a two-year program offering generous numbers of APC waivers to members, meeting attendees, and all library subscribers.  This has resulted, as they hoped, in raised awareness, good uptake, and recognition for their OA program.

Then in 2016 they introduced ECS Plus, through which libraries can opt to pay a bit more than the cost of a single ECS APC (which is $800) to upgrade their subscription to the package of ECS journals, and as a result have all APCs waived for authors on their campuses who choose the OA option.  Since its launch, ECS has seen small but encouraging growth in this program. They now have about 800 subscribers, and “there is some evidence the library community feels this is a valuable program,” Yess says.

ECS aims to become “platinum OA” by 2024 – entirely open access, with no APCs, operating fully in what Yess calls an “open science environment.”  They expect to take many roads to achieve this goal.  One is reducing publication costs.  Toward that end, they have entered an agreement with the Center for Open Science to build, at no cost to ECS, a new digital library platform which, once adopted, will reduce ECS’s publication costs.

In addition, this platform will allow ECS to fulfill the “need to move beyond the original concept of open access in standard journals, and beyond the idea of being a publisher in the old sense of journals, articles, issues – to get beyond containerized thinking,” Yess says.

Moving beyond those ‘containers’ will be more possible given their work with the Center for Open Science to offer a preprint server.  The preprint server will be built on the same platform and framework as the preprint servers SocArXiv and PsyArXiv, and will integrate with preprint servers outside of the Open Science Framework such as bioRxiv and arXiv.  ECS hopes to launch this preprint server in beta next month.

While reducing costs and breaking out of old containers, ECS will also need to generate non-subscription revenue if they want to balance the books.  They want to work with the library community to obtain a commitment to pay some kind of cost, possibly looking at a model inspired by SCOAP3.  They also plan to seek donations and endowments from organizations and research funders.  And if the cost reductions and new revenue streams don’t provide a sufficient financial foundation, Yess says that APCs are “a contingency plan” for ECS.

Regardless of which of these roads the ECS takes, for Yess, the overall direction is clear:  “Scholarly publishing has to change. Period.”  Their solutions to the need for change are generated from their own context, and are certainly not one-size-fits-all.   But regardless of whether the specific mechanisms work for other societies, what is instructive from the ECS approach is that they are embracing new realities, envisioning a new, open, scholarly sharing environment, and are building their future from their original base in a community of science and technology.  They are finding a way to maximize the potential of the digital age to support their mission to “free the science” for the betterment of humanity.

In this time of tumult and doubt on our national stage, when the merits of science – and even the existence of facts  — are questioned at the highest levels, ECS’s doubling down on OA and open science can help those of us committed to leveraging science and scholarship for all of humanity, everywhere, see a hopeful way forward, a way that keeps us moving toward our aim of democratized access to science.

 

Posted in Uncategorized

Spherical Panoramas for Astrophysical Data Visualization

This is the title of a paper by Brian Kent (NRAO), which has been accepted for publication in the PASP Special Focus Issue: Techniques and Methods for Astrophysical Data Visualization. You can download the paper from arXiv at https://arxiv.org/abs/1701.08807. (I understand that the special issue will be published in the spring.) This is a timely publication, given the growing number of wide-area image products released in astronomy.

Brian shows how the three-dimensional software package Blender and the Google Spatial Media module can be used in tandem to immerse users in data exploration. The spherical panoramas that can be output from these open-source technologies include static panoramas, single-pass fly-throughs and orbit flyovers.  Briefly, Blender is a 3D graphics suite whose output can be processed by Google Spatial Media to create 360-degree video, and that output can be exported to YouTube (with the appropriate metadata added to the header).

All-sky astronomy maps are well suited to this spherical panorama approach. The paper gives an itemized workflow for creating such panoramas. It involves combining images in equirectangular, cylindrical equal-area or Hammer-Aitoff projections. Tools such as Montage can perform the projection calculations.
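To make the projections mentioned above a little more concrete, here is a minimal sketch of the underlying arithmetic for two of them. It is not taken from the paper or from Montage; the function names, the image layout (longitude running left to right, north at the top) and the image size are my own assumptions for illustration. In practice a tool such as Montage would handle this, along with resampling and coordinate-system details that the sketch ignores.

```python
import math


def equirectangular_pixel(lon_deg, lat_deg, width, height):
    """Map longitude/latitude (degrees) to (x, y) in an equirectangular image.

    Assumed layout (hypothetical): longitude -180..+180 runs left to right,
    latitude +90..-90 runs top to bottom.
    """
    x = (lon_deg + 180.0) / 360.0 * width
    y = (90.0 - lat_deg) / 180.0 * height
    return x, y


def hammer_aitoff(lon_deg, lat_deg):
    """Hammer-Aitoff projection of longitude/latitude (degrees).

    Returns (x, y) with |x| <= 2*sqrt(2) and |y| <= sqrt(2).
    """
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    denom = math.sqrt(1.0 + math.cos(phi) * math.cos(lam / 2.0))
    x = 2.0 * math.sqrt(2.0) * math.cos(phi) * math.sin(lam / 2.0) / denom
    y = math.sqrt(2.0) * math.sin(phi) / denom
    return x, y


# The centre of the sky map lands in the middle of a 4096 x 2048 image.
print(equirectangular_pixel(0.0, 0.0, 4096, 2048))
print(hammer_aitoff(180.0, 0.0))
```

The 2:1 aspect ratio of the equirectangular image follows directly from covering 360 degrees of longitude and 180 degrees of latitude at the same scale.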

Brian maintains a web page http://www.cv.nrao.edu/~bkent/blender/ called “An Introduction to 3D Graphics and Visualization for the Sciences.” It includes some panoramic video demos, and I embed a couple below; use your mouse to pan across the images.

 

 

Finally, if you really want to learn this technology, consider reading Brian’s book “3D Scientific Visualization with Blender.”

 

 

 

Posted in Astronomy, computer videos, Computing, computing videos, cyberinfrastructure, High performance computing, information sharing, Milky Way, Montage, programming, publishing, Scientific computing, software engineering, visualization, workflows

From The Front Lines of SPIE Astronomical Telescopes and Instrumentation 2016

I attended the SPIE meeting on Astronomical Telescopes and Instrumentation in Edinburgh, Scotland from June 26 through July 1, and I am sharing my views on the conference presentations. Approximately 2,000 astronomers, software engineers and instrumentation specialists crowded the Edinburgh International Conference Center (EICC) for the week.  You can see a detailed review of the meeting and a large collection of photographs on the SPIE web page. Parts of this post are based on the SPIE review.

As a software specialist, I gravitated towards the software presentations, which focused on software solutions to challenges in cyberinfrastructure. There were many interesting talks. Paul Hirst of Gemini described how building the next generation of the Gemini archive in the Amazon cloud is proving cost effective, given the high cost of power in Hawaii. Steve Berukoff’s team described how they are building a petascale data system for the Daniel K. Inouye Solar Telescope, under construction on Maui. Trey Roby described how his team was modernizing the underpinnings of the Firefly web-based presentation system by replacing the Google Web Toolkit with JavaScript. Joerg Retzlaff discussed lessons learned in the publication of science data products through the ESO archive. Tom McGlynn described the NASA archive model for the implementation and operation of the Virtual Observatory.

Tim Jenness described the challenges of handling large amounts of data and the efforts of the LSST team to join the Astropy community, leveraging and contributing to those software packages within the confines set by current funding limits and methodologies. Marco Molinaro shared the results of his team’s EU-FP7 program, VIALACTEA, which provides an infrastructure for handling and manipulating diverse datasets into a more homogeneous database. I described work at the Keck Observatory Archive using R-tree indexing schemes to enable faster, more efficient searches for solar system objects.

My favorite talk was by Asher Baltzell, who discussed a cloud-based data reduction scheme applied to Magellan AO (MagAO) images and the resulting development of a free cyberinfrastructure for community use. The MagAO system featured prominently at the meeting. See the presentations in the MagAO blog at https://visao.as.arizona.edu/uncategorized/magao-at-2016-spie-astronomical-telescopes-instrumentation/.

See the SPIE review for excellent talks on gravitational waves, the operation of the Large Millimeter Telescope (LMT), and four presentations by NASA Science and Technology Definition Teams on submissions for the 2020 Decadal Survey, among others.

The conference reconvenes in 2018 in Austin, Texas.

Posted in astroinformatics, Astronomy, computer modeling, cyberinfrastructure, Data Management, databases, Gemini, Grid Computing, High performance computing, informatics, information sharing, programming, Scientific computing, software engineering, software maintenance, software sustainability, TMT, user communities, Virtual Observatory, visualization, W. M. Keck Observatory

Astronomy Software (1986)

I admit that I have a soft spot for these old videos. This one is from 1986. The interfaces may seem primitive now, yet the calculations are quite sophisticated; these types of programs are just as useful today.

 

Posted in Astronomy, computer videos, Computing, computing videos, History of Computing!, information sharing, programming, Scientific computing

The Mother of All Demos, presented by Douglas Engelbart (1968)

If you have never seen this, I recommend it.  On December 9, 1968, Douglas Engelbart gave an extraordinarily prescient demonstration of experimental computer technologies that are now ubiquitous. The live demonstration featured the introduction of the computer mouse, video conferencing, teleconferencing, hypertext, word processing, hypermedia, object addressing and dynamic file linking, bootstrapping, and a collaborative real-time editor.

You can also see it in nine parts at these links:
(1/9) http://youtube.com/watch?v=JfIgzSoTMOs
(2/9) http://youtube.com/watch?v=a11JDLBXtPQ
(3/9) http://youtube.com/watch?v=61oMy7Tr-bM
(4/9) http://youtube.com/watch?v=fNXLK78ZaFo
(5/9) http://youtube.com/watch?v=7zz1SwCTCEE
(6/9) http://youtube.com/watch?v=6dVNxlLYTsQ
(7/9) http://youtube.com/watch?v=XiJA7_Sw9aM
(8/9) http://youtube.com/watch?v=EI8LZKW5Lwk
(9/9) http://youtube.com/watch?v=VYDg2wr2QfI

Posted in computer modeling, computer videos, Computing, computing videos, cyberinfrastructure, History of Computing!, information sharing, Scientific computing, software engineering, Uncategorized

The Pegasus Workflow Manager and the Discovery of Gravitational Waves

We have all heard so much about the wonderful discovery of gravitational waves, and with just cause! In today’s post, I want to give a shout-out to the Pegasus Workflow Manager, one of the crucial pieces of software used in analyzing the LIGO data. Processing these data requires complex workflows that involve transferring and managing large data sets and performing thousands of tasks. Among other things, the software managing these workflows must be automated and portable across distributed platforms, be able to manage dependencies between jobs, and be highly fault tolerant: if jobs fail, they must be restarted automatically without losing data already processed. The Pegasus Workflow Manager performs these functions on behalf of LIGO.
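To illustrate two of the requirements listed above, dependency management and automatic restart of failed jobs, here is a toy sketch in Python. It is emphatically not how Pegasus is implemented and it does not use the Pegasus API; `run_workflow`, the task names and the retry limit are all hypothetical, and everything runs in a single process rather than across distributed resources.

```python
from graphlib import TopologicalSorter


def run_workflow(tasks, dependencies, max_retries=3):
    """Run tasks in dependency order, retrying a task that fails.

    tasks:        dict of task name -> zero-argument callable
    dependencies: dict of task name -> set of task names it depends on
    Results of tasks that already succeeded are kept while a failed
    task is retried, so finished work is never repeated.
    """
    graph = {name: set(dependencies.get(name, ())) for name in tasks}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        for attempt in range(1, max_retries + 1):
            try:
                results[name] = tasks[name]()
                break
            except Exception:
                if attempt == max_retries:
                    raise  # give up after the final attempt
    return results


# Hypothetical three-step pipeline in which the middle step fails once.
calls = {"filter": 0}


def flaky_filter():
    calls["filter"] += 1
    if calls["filter"] == 1:
        raise RuntimeError("simulated transient failure")
    return "filtered"


tasks = {"fetch": lambda: "raw", "filter": flaky_filter, "report": lambda: "done"}
dependencies = {"filter": {"fetch"}, "report": {"filter"}}
results = run_workflow(tasks, dependencies)
print(results)
```

A production workflow manager adds much more on top of this idea, such as scheduling across sites, data staging and monitoring, but the core of "run jobs in dependency order and restart the ones that fail" is the same.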

Specifically, Pegasus managed the workflow for the Compact Binary Coalescence Group, which aims to find inspiral signals from compact binaries. The figure below shows the workflow:

[Figure: the Compact Binary Coalescence workflow managed by Pegasus]

Each of these workflows has (to quote from the Pegasus web site):

  • 60,000 compute tasks
  • Input Data: 5000 files (10GB total)
  • Output Data: 60,000 files (60GB total)

and using Pegasus in the production pipeline gave LIGO the following capabilities (again, to quote from the website).

  • “Run an analysis workflows across sites.  Analysis workflows are launched to execute on XSEDE and OSG resources with post processing steps running on LIGO Data Grid.
  • Monitor and share workflows using the Pegasus Workflow Dashboard.
  • Easier debugging of their workflows.
  • Separate their workflow logs directories from the execution directories. Their earlier pipeline required the logs to be the shared filesystem of the clusters. This resulted in scalability issues as the load on the NFS increased drastically when large workflows were launched.
  • Ability to re-run analysis later on without running all the sub workflows from start. This leverages the data reuse capabilities of Pegasus. LIGO data may need to be analyzed several times due to changed in e.g. detector calibration or data-quality flags. Complete re-analysis of the data is a very computationally intensive task. By using the workflow reduction capabilities of Pegasus, the LSC and Virgo have been able to re-use existing data products from previous runs, when those data products are suitable.”
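The data reuse idea in the last bullet can be sketched in a few lines. This is a deliberately simplified illustration under my own assumptions, not the Pegasus algorithm: the hypothetical `plan_rerun` function only checks whether each job's output file already exists in a directory, whereas a real system would also decide whether an existing product is still suitable and would prune upstream jobs that are no longer needed.

```python
import tempfile
from pathlib import Path


def plan_rerun(jobs, data_dir):
    """Split jobs into those that must run and those whose output can be reused.

    jobs:     dict of job name -> name of the output file the job produces
    data_dir: directory holding data products from earlier runs
    """
    to_run, reused = [], []
    for name, output in jobs.items():
        if (Path(data_dir) / output).is_file():
            reused.append(name)   # product already exists, skip the job
        else:
            to_run.append(name)   # product missing, job must be executed
    return to_run, reused


# Hypothetical example: one product survives from a previous run.
previous_run = tempfile.mkdtemp()
(Path(previous_run) / "template_bank.dat").write_text("existing product")

jobs = {"make_bank": "template_bank.dat", "match_filter": "triggers.dat"}
to_run, reused = plan_rerun(jobs, previous_run)
print("to run:", to_run, "reused:", reused)
```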

At-scale workflows have applicability across all disciplines these days, and Pegasus has been successfully used in many disciplines, including astronomy, per the graphic below; learn more at the Pegasus applications showcase page:

[Figure: disciplines in which Pegasus has been used]

 

 

Posted in astroinformatics, Astronomy, Black Holes, Computing, cyberinfrastructure, Gravitational waves, High performance computing, informatics, LIGO, Operations, Parallelization, programming, Scientific computing, software engineering, software maintenance, workflows

The SAMI Data Archive: A Prototype of An Archive of the Future?

Astronomy data sets are not simply exploding in size; they are exploding in complexity too. Witness the data sets obtained from integral-field spectroscopy (IFS).  While the Sydney-AAO Multi-object Integral-field spectrograph (SAMI) survey has measured more than 1,000 galaxies, surveys such as those performed with Hector aim to survey 100,000 galaxies, and the SDSS Baryon Oscillation Spectroscopic Survey (BOSS) is expected to survey over 1 million galaxies. Such surveys require new approaches to archiving and data access. This is because what might be termed “classical” approaches based on storing data in FITS files and finding data by SQL-based queries may prove too slow and cumbersome when applied to these new kinds of data.

This was the thinking of Konstantopoulos et al. (2015) (Astronomy and Computing 13, 58-66) in developing the archive for the SAMI project, which they call samiDB. It is available online and written in Python. Their archive is underpinned by HDF5 for data storage and access. HDF5 may be best described as a smart data container that avoids the large overheads that come with distributed file systems (such as the Hadoop file system) and relational databases.

By taking advantage of the Python interface to HDF5, Konstantopoulos et al. were able to provide functionality equivalent to that offered by an SQL interface, with comparable performance. HDF5 in effect enables easy scanning and recovery of subsets of data within the HDF5 files. The authors summarize the benefits of their design this way:
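For readers who have not used HDF5 from Python, here is a small sketch of what "scanning and recovery of subsets" can look like with the h5py and NumPy packages. The file layout is entirely my own invention for illustration (one group per galaxy, a `redshift` attribute and a small `cube` dataset); it is not the samiDB schema, and `select_spectra` is a hypothetical helper, not part of their interface.

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), "survey_sketch.h5")

# Write a tiny file: one group per galaxy, each holding a data cube
# (rows x columns x wavelength) and a redshift stored as an attribute.
with h5py.File(path, "w") as f:
    for index, (gal_id, z) in enumerate([("galaxy_0001", 0.03), ("galaxy_0002", 0.07)]):
        grp = f.create_group(gal_id)
        grp.attrs["redshift"] = z
        grp.create_dataset("cube", data=np.full((10, 10, 256), float(index), dtype="f4"))


def select_spectra(path, max_redshift, row, col):
    """Return {galaxy id: spectrum at (row, col)} for galaxies with redshift <= max_redshift.

    Roughly analogous to an SQL SELECT with a WHERE clause on redshift.
    """
    selected = {}
    with h5py.File(path, "r") as f:
        for gal_id, grp in f.items():
            if grp.attrs["redshift"] <= max_redshift:
                # Slicing the dataset reads only the requested spectrum from disk.
                selected[gal_id] = grp["cube"][row, col, :]
    return selected


spectra = select_spectra(path, max_redshift=0.05, row=2, col=3)
print({gal_id: spectrum.shape for gal_id, spectrum in spectra.items()})
```

The point of the sketch is that the selection runs on lightweight attributes, and only the slices that pass the cut are actually read, which is the kind of subset recovery described above.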

“The engine behind samiDB is HDF5, a technology that packages data into a hierarchical format thereby saving space on disk and requiring minimal processing prowess to plough through complex data. The programmatic interface is written entirely in Python and it plugs neatly into a web front-end built with the Drupal content management system (the interface is under development).”

To give you a flavor of how data are presented to the user, here is a screenshot from their early release browser:

[Figure: screenshot of the samiDB early release browser]

Posted in astroinformatics, Astronomy, astronomy surveys, Computing, cyberinfrastructure, data archives, Data formats, Data Management, Data mining, databases, FITS, HDF5, High performance computing, informatics, information sharing, Open Source, Python, Scientific computing, software engineering, software maintenance, software sustainability, Uncategorized