Scirus "journal results" and "web results"

Scirus "journal results" and "web results" Stevan Harnad 17 Jun 2002 15:52 UTC
Note: The discussion with Peter Suber below is about Scirus, a new
scientific search engine provided (for free for all) by Elsevier:
http://www.scirus.com/search_simple_boolean/

Scirus retrieves both open-access and toll-access resources, but it
retrieves them in two categories: "journal results" and "web results."

The specific point under discussion here concerns the status of published,
refereed journal articles that are listed under "web results" rather than
"journal results."

The discussion is excerpted from the Budapest Open Access Initiative
drafters' Forum, which will soon open as a public Forum for all BOAI
signatories: http://www.soros.org/openaccess/view.cfm

On Sun, 16 Jun 2002, Peter Suber wrote:

> Stevan and others,
>
> Here's my latest thinking.  I believe that Elsevier's strategy here is to
> make the Scirus index so comprehensive that scientists make it their tool
> of first resort when they have a research question.  Elsevier wants to
> change the practice of scientific searching so that "if it isn't in Scirus,
> then it isn't visible" to serious, busy researchers who don't have time to
> run parallel searches.

That's probably true. But the only way they have been able to make it so
is by including huge amounts of free resources. They have thereby,
whether they realized it or not, whether they intended it or not, set
the Darwinian stage for direct competition between for-free and for-fee
resources -- particularly those that have a dual incarnation, a for-free
and a for-fee version, both in Scirus.

Now it's true that the fee version gets somewhat more prominence in the
"journal results" category, but the more I think about it, the more
convinced I am that this will be a trivial factor. People quickly learn
short-cuts, especially under the constraint of finite budgets! Not only
will individual users figure out ways to check whether there is also
a free version of the same journal article they just retrieved, buried
lower down there somewhere among the "web results," but, as I suggested in
an earlier posting, it will be easy to design a "supplemental" service,
on top of Scirus, that will automatically do this extra within-results
search for a user on any set of Scirus hits. (Here again, if Scirus's
hopes of wide use are realized, that will at the same time be hastening
the Darwinian evolution toward the optimal and inevitable.)
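
A minimal sketch of what such a supplemental within-results search might
look like, in Python. It assumes the hits have already been obtained as
simple records; the Hit type, the example URLs, and the rule of matching
on a normalized title are all hypothetical illustrations, not anything
Scirus itself provides:

```python
import re
from dataclasses import dataclass


@dataclass
class Hit:
    title: str
    url: str
    category: str  # "journal" or "web" (assumed labels)


def normalize(title: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]+", " ", title.lower()).split())


def find_free_versions(hits):
    """Map each journal hit's URL to the URLs of web hits with the same title."""
    web_by_title = {}
    for hit in hits:
        if hit.category == "web":
            web_by_title.setdefault(normalize(hit.title), []).append(hit.url)
    return {
        hit.url: web_by_title.get(normalize(hit.title), [])
        for hit in hits
        if hit.category == "journal"
    }


if __name__ == "__main__":
    hits = [
        Hit("The Self-Archiving Initiative.", "https://publisher.example/a1", "journal"),
        Hit("the self-archiving initiative", "https://archive.example/a1", "web"),
        Hit("Unrelated Paper", "https://publisher.example/a2", "journal"),
    ]
    for journal_url, free_urls in find_free_versions(hits).items():
        print(journal_url, "->", free_urls or "no free version found")
```

Exact title matching is the crudest possible rule; a real service would
presumably also compare authors and dates, but the principle is the same.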

So, the more I think about it, the less I am inclined to worry about
Scirus! Scirus is predicated on the idea that users can and will continue
to pay, come what may, and that giving them everything that is accessible
on one plate (with preferential status to the fee-based fare) is a way to
consolidate this state of affairs into the future. But if we open-access
advocates are right that there is an anomaly in this literature, one that
was unresolvable in the paper era but is now ripe for resolution in the
online era, then gathering everything on one plate will only serve to
highlight that conflict of interest, as well as the easy and obvious
(and inevitable) way to resolve it!

The users of Scirus will also be the authors of the articles. They
will see clearly, while wearing both their user hats and their author
hats, how their give-away texts are being put behind a financial
firewall that blocks both access and impact, and how they need not be!
For there, alongside those texts, among the "web results" will be the
growing number of examples of exactly how to get all of that work outside
the firewall!

Peter, we have been trying to persuade the scholarly/scientific
horses to drink for so long; maybe what they need is
not just more words, but direct, concrete, practical trial-and-error
experience! (I am reminded of how the "Monty Hall" puzzle, which
is often cognitively impenetrable to abstract reasoning,
quickly converges on the optimal solution as soon as it is
practically experienced as a reward-based series of trials:
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Foundations.Cognitive.Science2000/0033.html )
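
The trial-and-error point can be illustrated with a short simulation of
the standard three-door version of the puzzle. This is only a sketch:
the function names, the trial count, and the fixed seed are assumptions
chosen for the example.

```python
import random


def play(switch: bool, rng: random.Random) -> bool:
    """Play one round; return True if the final pick is the car."""
    car = rng.randrange(3)
    pick = rng.randrange(3)
    # The host opens a door that is neither the player's pick nor the car.
    opened = rng.choice([d for d in range(3) if d != pick and d != car])
    if switch:
        # Switch to the one remaining closed door.
        pick = next(d for d in range(3) if d != pick and d != opened)
    return pick == car


def win_rate(switch: bool, trials: int = 10000, seed: int = 0) -> float:
    """Fraction of rounds won over repeated trials with a fixed strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    print("always switch:", win_rate(True))
    print("always stay:  ", win_rate(False))
```

Over many trials the switching strategy wins about two times in three
and the staying strategy about one time in three, which is what repeated
rewarded experience teaches without any abstract argument.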

> If true, then letting Scirus index CogPrints, Medline, BioMedCentral, and
> arXiv (all currently in Scirus), just plays into Elsevier's hands.

I think it only seems to. If we are right about what outcome is optimal
and inevitable for this special anomalous literature (and I am pretty
sure we are!), then the users will converge on that same conclusion
through actual trial and error experience.

> On the other side, I wrote in FOSN for 4/8/02 and argue in an upcoming
> paper that we should adopt a similar strategy: make the searching and
> indexing tools for the public internet, or for the OAI subset of it, so
> powerful that they become the scientists' tools of first resort. Then we
> can say "if it isn't freely available on the public internet, then it
> isn't visible" to busy researchers or those at institutions of low to
> moderate wealth.

What the toll-based providers are doing is adding the free web contents
to the toll-based contents as a "value-added" incentive for their search
engine-users. This is exactly equivalent to the open-access providers
doing the same thing: adding to the free contents the (metadata of)
the firewalled proprietary contents. Either way, the outcome is that
the user faces a choice, and sees the costs/benefits -- and not only
from the user's point of view, experiencing the frustrations and costs
of toll-based access and access-denial, but (because of our species'
"mind-reading" capacity to put ourselves in others' shoes), they will
also see how this amounts to impact-denial for their own work, from the
author's point of view.

It is the contrast between toll-access and open-access that will provide
this object lesson, and especially the growing number of cases where
the very same paper is available both ways! Hence it does not matter
at all whether the universal engine that will teach this object lesson
is constructed (1) "bottom-up" by the open-access providers, adding
the fire-walled material to the open-access material to illustrate
the dead-ends and what remains to be done, or (2) "top-down," by
the toll-access providers, adding the open-access material to their
toll-access wares: The lesson for the user will be exactly the same. "Some
of these author give-aways the user still has to pay for, yet all it
takes to fix this is for authors to ensure that an open-access version
is available too!"

Once that token drops, the open-access era is upon us, irreversibly.

Search engines will certainly prevail, and commercial search engines
are unlikely to be able to compete with ever-improved free-sector
engines. I think that is transparent, and whoever is hoping to make (or
save) a business out of selling a commercial search engine is probably
making a big mistake.

But the need to be picked up by the search engines certainly will not
directly force open access, for the toll-access providers always have
the option of just releasing their metadata to the search engines, and
keeping their full-texts behind the fee-based firewall.

Which simply brings us back to where we were with Scirus: Both the
bottom-up and the top-down route lead to the same place: a Darwinian
competition between toll-access and open-access in the minds of the
users, who also happen to be the author/providers!

So the prospects are good, and I'm inclined to trust that the Darwinian
forces will optimize (or at least "satisfice") once the contingencies are
clear in real-life practice.

So let us keep working to make the contingencies clear in practice,
both by increasing the amount of open-access content available for this
Darwinian contest (through BOAI Strategies S1 and S2), and by meanwhile
continuing to try to penetrate researchers' brains directly by reason...

> What helps our strategy is that we offer free online data for the next
> generation of research software.  This allows garage-style innovation to
> occur in thousands of locations around the world.  On the other hand, what
> helps Elsevier's strategy is that it has a headstart, with Scirus, and gets
> help from every free source it annexes.

We (S1 & S2) offer open-access content; publishers offer toll-access
content. Search engines are search engines: They cover either one, the
other, or both. It doesn't matter whether they are provided by Google or
Scirus. (Scirus performs the added service of winnowing out the
scientific/scholarly content from the rest, though Google and others
-- especially the other OAI Service Providers
http://www.openarchives.org/service/listproviders.html -- will no doubt be
offering that too.)

> I reluctantly conclude that it might be best not to let FOS sources be
> indexed in Scirus, or at least not to publicize that they are indexed
> there. It will hurt FOS if Scirus becomes the tool of first choice for the
> busy scientist.

I doubt it matters, but we may as well leave the advertising to them:
It's what they're best at!

> I'm conscious that this strategy would hurt FOS if Scirus were *already*
> the tool of first resort for busy scientists.  But I don't think Scirus has
> achieved that kind of indispensability -- though I'd welcome the views of
> others closer to the front on this.
> So I endorse the second of your two strategies.

I don't think it matters either way: The second strategy was:

    (2) We could ignore it, on the (probably correct) theory that it
    makes no difference whatsoever that Scirus is doing this, and it
    will just end up as one of those historic footnotes on the last days
    and futile last-ditch efforts to preserve toll-based access to the
    give-away research corpus.

> I also wonder again
> whether it would be possible to negotiate a different arrangement with
> Elsevier:
>
> > > If Scirus (Elsevier) wanted your permission before indexing CogPrints, I
> > > wonder whether it would negotiate the terms. What if you said that
> > > permission depends on making the articles freely available without the
> > > ScienceDirect detour and toll gate?
>
> > They will reply they are already doing that -- via the "web results"
> > category...
>
> But you could reply that you want the CogPrints contents (at least the
> postprints) listed as "journal results" and that you want Scirus to link
> directly to free full-text editions (both preprints and
> postprints). Elsevier could say "no", but then so can the archives it
> wants to index.

Scirus could reply (rightly) that they prefer to list as "journal results"
only those articles that they harvest from authenticated journal sources.
(ArXiv and CogPrints contain plenty of journal articles, but those are
self-archived, hence not "authenticated" by the publisher.)
http://www.eprints.org/self-faq/#2.Authentication

I'm inclined to think it doesn't matter in what category the results are
classified by Scirus. Users will figure it out, and the supplemental
search software I proposed above can make it explicit. (By the way, BMC
articles are listed as "journal results," but available free; so with
authenticated publishers Scirus is not putting a price-tag on all items
classified as "journal results.")

> Jan: I wonder whether BMC might have better success negotiating these
> terms, given its size.

As far as I can tell, there is nothing for BMC to negotiate (see above).

> It's possible the negotiations would go nowhere. But Elsevier can't
> welcome the prospect of a press release announcing that an important source
> of open-access science (CogPrints, BMC...) is withdrawing from Scirus, and
> then giving the reasons why. This would show (1) that Scirus is spotty,
> not comprehensive, and (2) that Scirus charges for access to free literature.

My inclination is that there is no need for anyone to withdraw from
Scirus, and no need for a press release, one way or the other: Let them
do their best and I'm pretty confident all will turn out for the best!

Date: Mon, 17 Jun 2002 09:43:17 -0400
From: Peter Suber <peters@earlham.edu>

The only point on which we differ is whether it will be easy or difficult
for Scirus users to figure out how to get free access to the articles in
the search results which are in fact freely available elsewhere.  I suspect
it will be difficult.  Users like you who know that much of this literature
is already free and know how to find it will know what alternatives to try
when you hit a financial barrier.  But I suspect that most Scirus users
will hit the financial barrier and either pay, ask their institutions to
pay, or turn away frustrated.

I agree that when Scirus users are authors of open-access articles, they
will see the unjustified financial barrier between their articles and
readers.  But for the ordinary Scirus user, simply looking for research on
a given topic, Scirus will successfully give the impression that all the
articles in a return set are pay-per-view.  This may frustrate users, and
build demand for FOS, but it can't lead many users to look for free
editions when most users have no reason to believe that free editions even
exist.

Just one clarification of an earlier point.  Google is (and deserves to be)
the tool of first choice for most searching needs.  For that reason, users
have come to rely on it.  To the extent of this reliance, what isn't in
Google isn't visible.  My speculation was that Scirus is trying to become
the Google of science.  I don't think it has succeeded, but I do believe
that its success would be harmful to FOS, especially if most users can't
see past the Scirus toll-gate.  Hence, we should not help it succeed by
letting it index open-access texts, become more comprehensive, and hence
more useful and inviting to users.

      Peter

--------------------------------------------------------------------------

NOTE: A complete archive of the ongoing discussion of providing free
access to the refereed journal literature online is available at the
American Scientist September Forum (98 & 99 & 00 & 01):
    http://amsci-forum.amsci.org/archives/september98-forum.html
                            or
    http://www.cogsci.soton.ac.uk/~harnad/Hypermail/Amsci/index.html

Discussion can be posted to:
    september98-forum@amsci-forum.amsci.org

See also the Budapest Open Access Initiative:
    http://www.soros.org/openaccess

and the Free Online Scholarship Movement:
    http://www.earlham.edu/~peters/fos/timeline.htm