Yesterday saw a few further developments in the ACS spider trap saga.
Firstly, we have a response from the ACS, quoted in part below.
…ACS Publications, like other information providers, utilizes a number of standard protocols that strike a balance between ease of accessibility for the scientific community and the necessary protection of technical infrastructure.
Early on Thursday, April 3, a link was posted to Twitter that exposed one such tool—a “spider trap” designed to prevent unlicensed machine-aided crawling and data extraction activity—that resulted in the temporary disablement of access for a number of ACS' institutional customer accounts.
ACS worked diligently to resolve the issue, and as of 4 PM EDT April 3, service was restored for all subscribers affected by this incident. Simultaneously, steps were taken to address the specific protocol that triggered this outage.
We regret the lapse in service, and we would like to assure you that ACS Publications will continue to serve the broader audience of chemical professionals, including customers, members and the scientific community who value access to the high quality, trusted original research published in ACS journals. Employing the use of these types of tools is imperative to providing users with continued access to that trusted research. We will therefore continue to refine our security procedures to support evolving publishing access models while protecting both users and content from malicious activities.
Secondly, as PMR details, the story reached Hacker News, where the ensuing discussion illuminated some of Thursday's mysteries. It seems that pubs.acs.org sits on a platform provided by a third party, Atypon, whose clients include Elsevier, IEEE, Informa Healthcare/Taylor & Francis, OUP and Thieme. This explains the spider trap appearing in exactly the same form across a number of publishers' sites, and the DOI-esque link's odd combination of a Wiley prefix and an Informa landing page.
The really eye-catching development, though, relates to a comment posted on PMR's second blog post on the subject. Context: when initially relating Pandora's experience, Peter recalled that the University of Cambridge had been cut off by the ACS in a similar fashion several years earlier, when one of his students had inadvertently triggered a rapid-reading monitor by (humanly) downloading 20-odd papers in quick succession. Peter had mentioned this (and the reason for it, i.e., not a spider trap) twice when a commenter, Georgios Papadopoulos, sailed in with this, quoting Peter in the first line:
> Note that my own experience was not a spider trap but simply (humanly) reading too many papers too rapidly – publications are not meant to be read rapidly, are they?
This is really funny. Tom Demeranville described the trap very acurately.
These LINKS (they are not DOIs!) are not visble or clickable. Only a (dumb) spider follows them. You created such a dumb spider and you were scraping the content. You were not reading it or clicking on anything.
You were caught, but perhaps the funniest part of that was that then you also came up and exposed yourself. We usually never identify the writers of such crawlers.
At first glance, an ill-informed troll. However, once the involvement of Atypon was revealed, the name Georgios Papadopoulos suddenly gained significance—he's Atypon's CEO.
Of course, it could be an imposter, but Peter doesn't believe so. If that really comes from the Atypon CEO, it's quite staggering in its ignorance and arrogance. Someone stumbles across a ‘security measure’ deployed by your company, a measure of irresponsibly poor execution, and the internet investigates. Your response? Wade into the comments on a blog and gloat about it, simultaneously displaying either an unwillingness or an inability to read and understand some fairly simple circumstances described on that blog. Wow.
No doubt there's more to come—what the ACS will do about it being of primary interest.