Oh my, I thought I had replied here long ago, but must have failed to save.
First, I want to say that while this issue has been pretty frustrating for our campaign, and for our ability to access information about our subject matter and our progress, I understand that it came about through good faith efforts. I do think there are some important lessons for next time (and I think it's worth some further discussion, here and maybe at a more general venue, to figure out exactly what those lessons are, as they may not be 100% clear to anybody yet).
Specifically in response to the comment immediately above, about the Cain City News: Personally, I strongly disagree with the conclusion; I understand that such merges would be sub-optimal, but in the grand scheme of things, if the choice is:
- Create tens of thousands of items, of which maybe 40% are duplicates, or
- De-dupe, potentially merging some hundreds or even thousands of items that are not actually the same, and then create far fewer items
I think the second option is VASTLY superior. These are items that did not previously exist in Wikidata; to the extent they are considered important, they will be fixed up by human editors and/or automated processes over time.
Furthermore, with many smart minds on the problem, it might be possible to substantially reduce the number of false positives in the de-duping process, so that most items like the Cain City News get caught and fixed ahead of time. (Maybe.)
Which is all to say, it's my understanding that best practice in these cases is to consult meaningfully with other Wikidata editors prior to importing thousands of items, and to allow some time for the discussion to unfold and ideas to emerge. I think this particular instance really underscores that need. Maybe others would agree with your assessment about items like Cain City, or maybe they would agree with me; but without asking the question, how do we know? We don't. It's important in a collaborative environment to assess consensus prior to taking large-scale actions that are difficult to undo.
Anyway, I think it would be a good idea to bring up some of the points in this discussion at WikiProject Periodicals or similar, and get some other perspectives that could inform future imports.