The Ghost in the Machine: Why PIDs Are the Last Line of Defense Against the Paper Mill Pandemic
Verified Researcher
Jun 23, 2021•4 min read

The PID Illusion: Security or Smoke Screen?
Everyone is currently high on the promise of Persistent Identifiers (PIDs). Between the UK’s RINCC meeting on June 21, 2021, and the new Cost Benefit Analysis Report, the message is that PIDs are the magic bullet for administrative efficiency. We are told that ORCID, Crossref, and ROR are the plumbing of a modern, streamlined research ecosystem.
But here is the uncomfortable truth that the industry is too polite to say: PIDs are not just about metadata exchange. They are the only thing standing between us and the total collapse of scholarly trust.
The reality is that we are witnessing an industrial-scale assault on academic integrity. Paper mills churn out fabricated data while predatory journals provide the scholarly veneer to host the mess. While the establishment talks about reducing bureaucratic burden, the real war is being fought over identity. If we do not weaponize PIDs as a deterrent against fraud, we are not building an infrastructure. We are building a playground for bad actors.
The Identity Crisis: When Your 'Author' Doesn't Exist
Predatory publishers love the status quo. In a world of manual entry and "re-keyed" metadata, it is laughably easy to invent a researcher. You can manufacture an affiliation to a non-existent institute, claim a fictitious grant, and list a dozen ghost co-authors.
As Phill Jones and Alice Meadows observed in their recent look at the PID world, publishers have been too slow to use these tools properly. Most journals treat an ORCID iD like a library card (a passive piece of plastic). But in the hands of a predatory mill, that passivity is a massive loophole.
If a manuscript submission system neglects to query the ORCID API for an author's publication history, or fails to cross-reference ROR data to check whether a university actually exists, it is functionally helping the fraudsters. Advocates of a friction-free author experience argue that it keeps things simple. This is a mistake. An easy submission process is a gift to a paper mill. We need meaningful friction (the kind that demands a verifiable digital footprint) before a single word goes to review.
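To make "meaningful friction" concrete, here is a minimal sketch of the cheapest possible gate: confirming that a submitted ORCID iD is even well-formed before the system looks anything up. ORCID iDs end in an ISO 7064 MOD 11-2 check character. The function name and the choice to accept only the hyphenated form are my own assumptions, and passing this check says nothing about whether the person exists. It only catches iDs that were mistyped or carelessly invented.

```python
import re

# Hyphenated 16-character form, e.g. 0000-0002-1825-0097 (last char may be X).
_ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def is_well_formed_orcid(orcid: str) -> bool:
    """Structural check only (hypothetical helper): pattern plus check character.

    No network lookup happens here, so a True result does not prove
    that the iD is registered or that its owner is a real researcher.
    """
    if not _ORCID_PATTERN.fullmatch(orcid):
        return False
    chars = orcid.replace("-", "")
    # ISO 7064 MOD 11-2 over the first 15 digits.
    total = 0
    for ch in chars[:-1]:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return chars[-1] == expected
```

A gateway could run this before any API call and bounce malformed iDs back to the author. The real screening still has to come from the record behind the iD.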
Follow the Money: The ROI of Trust
The UK’s Cost Benefit Analysis focuses on the millions of pounds saved in administrative time. That’s a rounding error compared to the cost of a retraction. When a predatory journal accepts a paper-milled study supported by fake grant metadata, the entire record is poisoned.
Crossref and DataCite are currently pigeonholed as discovery tools. This is a failure of imagination. We should be using them for forensic validation. The PID Graph, that interconnected web of people and places, ought to be the first screen in any integrity check. If an author shows up with an ORCID record created yesterday that has no grants, no affiliations, and no history, the system needs to flag it immediately.
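As a rough illustration of that first screen, the sketch below flags a thin record. It assumes the submission system has already fetched the ORCID record and boiled it down to a few counts. The `OrcidSummary` shape, the flag names, and the 30-day threshold are all hypothetical choices of mine, not the ORCID API schema or an established standard.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class OrcidSummary:
    """Hypothetical pre-fetched summary of a record (not the real API schema)."""
    created: date
    works: int
    affiliations: int
    fundings: int


def integrity_flags(record: OrcidSummary, today: date,
                    min_age_days: int = 30) -> List[str]:
    """Return the reasons a record deserves a closer look (empty list if none)."""
    flags = []
    # A record created very recently has had no time to build a footprint.
    if (today - record.created).days < min_age_days:
        flags.append("new_record")
    if record.works == 0:
        flags.append("no_works")
    if record.affiliations == 0:
        flags.append("no_affiliations")
    if record.fundings == 0:
        flags.append("no_fundings")
    return flags


# Example: a record created the day before submission, with nothing in it.
ghost = OrcidSummary(created=date(2021, 6, 22), works=0, affiliations=0, fundings=0)
print(integrity_flags(ghost, today=date(2021, 6, 23)))
```

Returning a list of reasons instead of a single yes/no keeps the decision with an editor. A genuine early-career researcher will trip some of these flags too, so they should trigger scrutiny, not automatic rejection.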
Radical Proposals for the Post-Trust Era
The time for "encouraging adoption" is over. If we want to save scholarly publishing from the onslaught of predatory practices, we need structural shifts that prioritize integrity over convenience:
Mandatory Identity Provenance: We must move beyond simply requesting ORCIDs. Major funders and legitimate publishers should refuse to process any manuscript that does not come with a complete, authenticated PID trail. If the PID can't be traced back at least three years, the paper undergoes "High-Scrutiny Review."
The API Shield: Publishers must integrate ROR and Crossref grant DOIs directly into the submission gateway. If a researcher claims a grant that doesn't resolve in the Crossref database, the submission is rejected instantly for metadata non-compliance.
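The two proposals above can be sketched as a single routing decision at the submission gateway. This is a minimal sketch under stated assumptions: `triage_submission`, the outcome labels, and the `grant_resolves` callback are hypothetical names, and the callback stands in for whatever Crossref lookup a publisher actually wires up. Treating a missing PID trail as a refusal is my reading of "refuse to process."

```python
from datetime import date
from typing import Callable, Optional


def years_before(day: date, years: int) -> date:
    """Same calendar day `years` earlier (Feb 29 falls back to Feb 28)."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        return day.replace(year=day.year - years, day=28)


def triage_submission(
    earliest_pid_activity: Optional[date],
    grant_doi: Optional[str],
    grant_resolves: Callable[[str], bool],
    today: date,
    min_trail_years: int = 3,
) -> str:
    """Route a submission using the two rules proposed above."""
    # API Shield: a claimed grant DOI must resolve, or the submission is rejected.
    if grant_doi is not None and not grant_resolves(grant_doi):
        return "reject_metadata_noncompliance"
    # Mandatory Identity Provenance: no PID trail at all means no processing.
    if earliest_pid_activity is None:
        return "reject_no_pid_trail"
    # A trail younger than the threshold goes to High-Scrutiny Review.
    if earliest_pid_activity > years_before(today, min_trail_years):
        return "high_scrutiny_review"
    return "standard_review"


# Stub resolver for illustration; the DOI below is made up.
known_grants = {"10.99999/example-grant"}
print(triage_submission(date(2020, 1, 1), "10.99999/example-grant",
                        lambda doi: doi in known_grants, today=date(2021, 6, 23)))
```

Passing the resolver in as a function keeps the policy separate from the lookup, so the same rules can be tested offline and pointed at a live service later.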
We are at a crossroads. We can use PIDs to make it easier for researchers to fill out forms, or we can use them to make it impossible for fraudsters to hide. The industry is obsessed with the time saved. I am worried about the cost to our reputation. Without a hard line on identity, the PID graph is just a map of the rot.



Discussion (9)
Could you expand on the instrument PIDs? My department is struggling to track equipment ROI effectively.
Check out the PIDINST working group mentioned in the previous thread. It's a game changer.
Integrating RRIDs into our lab workflow has already saved us thousands in lost hours. Glad to see the 'ghost' of bad data finally being addressed.
Spot on.
Wow, this is actually deep. The machine metaphor really hits home for anyone dealing with metadata issues.
Finally someone mentioned the burden of reuse logic again. It literally feels like we are paying for others' laziness every time we search an empty ORCID profile.
I'm still seeing journals that make it nearly impossible to input co-author IDs during submission. The 'friction' is a choice at this point.
Excellent follow-up to the previous post. I remember when we just used card catalogs, but these digital footprints are clearly the future for our libraries!
The idea that PIDs alone can stop industrial-scale fraud seems overly optimistic. Without manual oversight, metadata can still be spoofed.