CHORUS is now live - how does it stack up to PubMed?

What is CHORUS and why is it important to know about if you’re an academic? From the FAQ (bold emphasis mine):

CHORUS (Clearinghouse for the Open Research of the United States) is a not-for-profit public-private partnership to provide public access to the peer-reviewed publications that report on federally funded research. Conceived by publishers as a public access solution for funding agencies, research institutions, and the public, CHORUS is in active development with more than 100 signatories (and growing). Five goals drive CHORUS’ functionality: identification, discovery, access, preservation, and compliance. CHORUS is an information bridge, supporting agency search portals and enabling users to easily find free, public access journal articles on publisher platforms.

Only it fails at the one thing that it claims to support, public access - at least as far as I can tell so far. And this is the big worry we’ve had all along: that a solution to the White House’s OSTP mandate backed by paywall publishers would not work. For a critical overview of the concerns see Michael Eisen’s comments from one year ago when CHORUS was announced.

Why isn’t CHORUS working?

Let us jump right into doing a search. Here’s an example query for NIH funded research. When I ran this search today (August 1, 2014) I got only 3,775 results. Hmmm. That can’t be right, can it? Only 3,775 NIH funded articles? Moving on…

The first result I got was an article published July 2014 in the American Journal of Medical Genetics. I clicked the DOI expecting public access, and hit a paywall. Oh wait, that’s right - CHORUS also indexes embargoed research set to actually become public open access in 12-24+ months. Next several search results - same paywall. Not until the fifth result did I reach an Open Access article.

OK fine. Perhaps it is reasonable to include a mix of embargoed papers with public open access papers - even though OPEN RESEARCH is in the name of CHORUS. I’ll just click the filter for actual public open access papers and see my results. Hmm, unfortunately there is no filter for actual public open access papers. Ruh-rohs. 

And there does not appear to be any labeling on search results indicating whether a paper is actually public open access or still embargoed (for some unknown period of 1-2 years). Ruh-rohs again.

Are we just seeing teething pains here? In some things for sure, for example only having 3,775 NIH results (when there are millions). It can take time to get all of that backlog from publishers (though I don’t know why they’d launch with such a paltry number). However, I don’t believe the lack of Open Access labels or ability to search only for papers already Open Access (rather than embargoed) is a teething problem. That’s a major oversight and makes you wonder why it was left out in a system designed by a consortium of paywall publishers. I can’t imagine SPARC, for example, leaving out an Open Access filter if they had built this search.

What else is wrong with CHORUS? 

The above was just one technical problem, albeit a very concerning one. The main issue is the inherent conflict of interest that exists in allowing subscription publishers the ability to control a major research portal. As Michael Eisen put it, that’s like allowing the NRA to be in charge of background checks and the gun permit database.

In the title I asked, “how does CHORUS stack up to PubMed?” We need to make this comparison since one of the aims of CHORUS is to direct readers to the journal website, instead of reading/downloading from PubMed Central (PMC).

Perhaps most importantly, CHORUS allows publishers to retain reader traffic on their own journal sites, rather than sending the reader to a third party repository.


And if you believe Scholarly Kitchen then PMC is robbing advertising revenues from publishers and PMC is costing taxpayers money as a useless redundant index of actual public/open access papers. Let’s not mince words, Scholarly Kitchen (and by extension the Society for Scholarly Publishing) believes that PubMed and PMC should be shut down. No one believes taxpayer money should be needlessly wasted, but it is a tall order to replace PubMed and PMC, so our expectations for CHORUS should be just as high.

Unfortunately, it is clear from using the CHORUS search tool that I have far less access and insight into publicly available research. And while an open API is slated for the future, it is questionable whether it will be as feature rich as NCBI’s own API into PubMed and PMC. 

CHORUS also fragments what is otherwise an aggregated index in PubMed. CHORUS looks to index only US-based federally funded research that is either Open Access or slated to be after a lengthy embargo. This means you still need to rely on PMC to find a non-US funded Open Access article. Clearly we still want that since it helps US researchers, right? Then why shut PMC down?

CHORUS isn’t free either. They’ve set the business model up such that publishers pay to have their articles indexed there. Do you think publishers are going to absorb those costs, or pass them along to authors/subscribers? The fact that CHORUS won’t index unless a publisher pays is rather scary, especially if CHORUS were ever to become the de facto database for finding research.

In Summary

I think CHORUS will improve over time, for sure. My worries though are the inherent conflicts of interest and that a major mouthpiece for CHORUS is calling for the removal of PubMed and PMC. I’m also skeptical whenever I see an organization using deceptive acronyms. CHORUS is not a database of Open Research as its name suggests. At least not ‘Open’ in the sense that the US public thinks of open.

You see, if CHORUS can convince the public and US Congress or OSTP that research under a two year embargo is still ‘open’ then they’ve won. It’s a setback for what is really Open Access. Nothing short of marketing genius (or manufactured consent) to insert Open Research into the organizational name.

I think these are legitimate concerns that researchers and the OSTP should be asking of CHORUS.


FIRST Act isn’t the first to use doublespeak against the advancement of science

The Frontiers in Innovation, Research, Science and Technology (FIRST) Act (link) is doublespeak for “we’re actually going to limit Open Access.”

The FIRST Act is yet another bill that is winding its way through the US Congress that despite making claims FOR science will actually reduce the availability of Open Access. Luckily the Scholarly Publishing and Academic Resources Coalition (SPARC) has clarified the damage that this bill would actually do to scientific advancement within the U.S. PLOS has done another writeup of the severe consequences this bill would bring. 

In the past, similar bills such as the Research Works Act (RWA), backed by the Association of American Publishers and many paywall publishers, have used this doublespeak. The Clearinghouse for the Open Research of the United States (CHORUS), a publisher-backed proposal, is another initiative filled with doublespeak, with the real aim to control access - not open it up. And more recently the “Access to Research” initiative from publishers does the opposite of what its title proclaims. It limits access to research in the digital age by adding a physical barrier and forcing you to travel hundreds of miles to a participating library instead of providing access in the convenience of your lab or home.

What really fascinates me, however, is the continued use of marketing doublespeak in these legacy publisher proposals to manufacture consent and distort the facts for financial gain. That they are pronounced with a straight face each time makes me just a little sick inside that people like this actually exist. The opposite of heroes, value creators, and leaders. If you haven’t noticed, these tactics grind my gears to the point of evoking a visceral emotional response.

Now I’ve looked to see who outside Congress is backing the FIRST Act by way of either public support or Congressional campaign donations and have yet to find a connection to the usual suspect publishers or associations. Please leave a comment if you do find a connection. 


As Björn Brembs points out, a number of paywall journals and publishers have donated to the Congressmen responsible for bringing the FIRST Act to the House of Reps. This is more than a smoking gun leading back to Elsevier, and a few other large publishers known for backing previous anti-OA bills. 


Neylon highlights another misleading survey - this one from NPG

Thanks to this tweet by @CameronNeylon we see a very loaded question about Open Access licensing consequences from NPG. I should say also that there are a few other misleading questions from this NPG survey - which look to be as much propaganda as (poorly designed) survey material.


This seems to be the new scare tactic for anti-OA activists. Explain one possible commercial use case, one likely to offend or upset academics, while neglecting to state the many other reasons one would want to allow commercial re-use: teaching in academic situations (if the academic is paid that’s commercial use), text/data mining for new cures, in certain cases physicians may hesitate to use or cite the research after developing new tools based on that information, etc.

Maybe you have a moral reason to not want a biopharma giant to profit off of your Open Access article. Fine, fair enough, but that actually doesn’t prevent them from using the information - facts can’t be copyrighted. More often than not, the use of a Non-Commercial OA license (e.g. CC-BY-NC) has the opposite effect from what the author hoped to achieve. Peter Murray-Rust explains this in an excellent writeup here. An NC license doesn’t prevent the publisher you use from profiting off of the material, and it won’t stop pharmaceutical companies, but it does deter others with many legitimate use cases.

Had all software development in the early days of the 60s, 70s, and 80s restricted commercial use then we wouldn’t be here today discussing this. Open licensing with explicit reuse for commercial interests has been the foundation of software that powers a majority of the world’s websites, and software that powers research activity in academic institutions. The parallels between Open Access articles and the early days of open source software are massive.

For sure, all academics should be aware of the possible uses of their research, but the point is to make them fully aware of all use cases, not just a select few intended to scare. And we also need to understand that choosing a restrictive NC license may have unintended consequences as well. 

Updated to add: Many, including the Budapest Open Access Initiative, do not consider OA licenses with an NC clause to actually be Open Access. I agree with this position.


CHORUS: It’s actually spelled C-A-B-A-L

CHORUS is another attempt by subscription publishers to defeat Open Access. Probably no better writeup than Michael Eisen’s of how deceptive the intent and logic of this plan is.

CHORUS claims that it will save the US govt money if implemented, as part of the plan calls for the shuttering of PubMedCentral. The fallacy, of course, is that costs to the govt (i.e. taxpayers) will actually INCREASE as publishers now have control of the “Open Access” content via a CrossRef-like dispatching service. Maintaining this dispatch service requires passing on the costs to their journal subscriptions, which ultimately means the libraries and agencies foot the bill.

If this is really going to save taxpayers money, then why have the publishers that are part of CHORUS not provided a cost breakdown? Let’s see the expected operating costs, charges to publishers to join this new organization, and the details of the API restrictions and practicality of retrieving the full-text for data mining. Then let’s compare that spreadsheet to the cost of running PubMedCentral. But that’s just the financial cost; more concerning is the cost of giving control of Open Access content to organizations whose business model is counter to the principles of OA.

Are these APIs truly open? What happens if I decide to build an aggregator with this content that is supposed to be Open Access? Will I be restricted or charged for high volume access, because publishers are now losing eyeballs as researchers go to my aggregator search engine? Do we really want publishers in charge of the key to the only source of all embargoed Open Access content? How gullible do they think the Obama Administration is? 

CHORUS is a patronizing plan to researchers, libraries, and the American taxpayer. It’s a coordinated effort to sustain subscription-based publisher revenue streams and falsely paint PubMedCentral as a waste of taxpayer money. It is not about innovating on Open Access content and expanding its accessibility.


When stealing isn’t stealing - The most disturbing part of the Aaron Swartz story

Probably the most disturbing thing to me about the Aaron Swartz tragedy is this statement in 2011 from US Attorney Carmen Ortiz:

“Stealing is stealing whether you use a computer command or a crowbar, and whether you take documents, data or dollars.”

That is teaching our children that the law is always correct and that discretion should not be used when enforcing the law. It’s teaching our children not to question what they are being told by those in power. Had the American forefathers believed that “treason is treason” then the United States would never have had its Revolution and founding.

There is no physical law that governs the Universe that outlines stealing, killing, lying, etc. These are human fabrications to govern us as a society, tribe, and culture. We equally have the capacity to dictate when stealing isn’t stealing, or when an act of treason is the right thing to do, as the American founders discovered. That is how we advance as a civilization.

There is a tremendous difference between stealing for personal gain, and “stealing” [1] to release academic papers paid for with taxpayer money. A true leader would recognize that. It’s been reported that Carmen Ortiz had political ambitions to one day run for Governor of Massachusetts. Is that the kind of leader a state would want? A false leader who doesn’t recognize when an act has morally justified grounds? A real leader would act to make changes, not throw the book at someone.

MIT should be ashamed as well, whether they were actively pressing charges or passively standing by [reports are conflicted over this]. MIT as well is supposed to be leading us. In the past they were one of the first universities to offer free and open classroom lessons online. Here, they failed miserably to lead by example that academic research should be made open.

The Aaron Swartz story is bigger than just a 26-year-old doing some computer hijinks and getting bent over by those in a position of power. It’s even bigger than the importance of Open Access to academic research. It’s surfacing some major issues that we have in society in both the U.S. and beyond about true leadership [note: I am a US citizen currently residing in London, UK]. Ortiz was put into a position to use her discretion. Instead, she let her ambitions dictate Aaron’s fate.

At the end of the day, if it is against the law to steal whether morally motivated or not, then you’ve broken the law. Laws can be changed though. New countries can be formed. And leaders in power can use their discretion to apply fair judgment, not to further their own ambitions. Where have all the true leaders gone?


1. Note that Aaron wasn’t even technically stealing in terms of the law; at most it was breach of contract [according to several reports].


Aaron Swartz found dead, but lives on with Open Access

If it weren’t for Aaron’s heroic actions to release academic research articles in 2011, I am unsure if PeerJ would have ever been born. 

Today’s news was shocking. Aaron Swartz was found dead from apparent suicide on the 11th of January, 2013 at the age of just 26. For the science community and Open Access advocates, Aaron was the man responsible for the (near) liberation of all pay-walled JSTOR content in mid-2011. He also co-wrote the first RSS 1.0 specification at the age of 14 and led the early development of Reddit.

JSTOR and MIT eventually dropped the civil case against him (publicly anyway), but the U.S. government continued criminal proceedings against him. JSTOR, it should be noted, was not his first attempt at freeing information. Aaron was facing up to 35 years in prison for the act of setting academic research free. It’s unknown if this was the reason for his suicide, but that’s not why I am writing. 

The events around JSTOR and Aaron’s prosecution were probably the final straw for me. What kind of world do we live in, where such harsh punishment is sought for liberators of publicly funded information? The indictment of Aaron and the severity of the probable punishment angered me.

I wrote the following in July 2011 after learning about Aaron’s fate:

Will the JSTOR/PirateBay news be Academic Publishing’s Napster moment? i.e. end of the paywall era in favor of new biz models?

(Twitter reference)

Something had to be done. I wanted to turn Aaron’s technically illegal, but moral, act into something that could not be so easily thwarted by incumbent publishers, agendas or governments. Over the next few months I let that desire build up inside, until one day the answer came in the Fall of 2011.

It was then that the groundwork for PeerJ was first laid; a new way to cheaply publish primary academic research and let others read it for free. Aaron was significantly responsible for inspiring the birth of PeerJ and what I do now trying to make research freely available to anyone who wants it. 

I hope that when the history books are written in decades’ time, Open Access crusaders like Aaron will still be remembered. My thoughts go out to Aaron’s friends and family. Know that Aaron’s light and efforts will live on. Thank you, Aaron, for inspiring us.