There should be a federated system for blocking IP ranges that other server operators within a chain of trust have already identified as belonging to crawlers. A bit like fediseer.com, but possibly more decentralized.
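To illustrate the consuming side, here is a minimal sketch of how a subscriber might merge blocklists published by trusted peers, weighting each peer's reports by a trust score. The peer URLs, the JSON schema, and the weights are all hypothetical; a real system would also need signatures and revocation.

```python
# Hypothetical sketch: merge crawler-range reports from trusted peers.
# Peer URLs, the JSON format, and trust weights are invented for illustration.
import ipaddress
import json
import urllib.request

# Trusted peers and how much weight their reports carry (0..1).
TRUSTED_PEERS = {
    "https://peer-a.example/crawler-ranges.json": 1.0,
    "https://peer-b.example/crawler-ranges.json": 0.6,
}

def fetch_reported_ranges(url):
    """Fetch a peer's published list of suspected crawler CIDR ranges."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)  # expected: ["203.0.113.0/24", ...]

def merge_blocklist(threshold=0.8):
    """Sum trust weights per range; block ranges that reach the threshold."""
    scores = {}
    for url, weight in TRUSTED_PEERS.items():
        for cidr in fetch_reported_ranges(url):
            net = ipaddress.ip_network(cidr, strict=False)
            scores[net] = scores.get(net, 0.0) + weight
    return {net: score for net, score in scores.items() if score >= threshold}
```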
(Here's an advantage of Markov chain maze generators like Nepenthes: even when crawlers recognize that they have been served garbage and delete it, one has still obtained highly reliable evidence that the IPs that requested it do, in fact, belong to crawlers.)
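The evidence-gathering step itself is cheap: since no human-facing page links into the maze, any request under the maze's URL prefix is damning on its own, regardless of what the crawler later does with the text. A rough sketch, assuming a standard combined/common access-log format and a made-up "/maze/" prefix:

```python
# Sketch: extract IPs that requested anything under the maze prefix.
# The log format (common/combined) and the "/maze/" prefix are assumptions.
import re

MAZE_PREFIX = "/maze/"
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:GET|HEAD) (\S+)')

def crawler_ips_from_access_log(path):
    """Collect client IPs that fetched maze pages."""
    ips = set()
    with open(path, encoding="utf-8", errors="replace") as log:
        for line in log:
            m = LOG_LINE.match(line)
            if m and m.group(2).startswith(MAZE_PREFIX):
                ips.add(m.group(1))
    return ips
```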
Also, whenever one is only partially confident in classifying an IP range as a crawler, instead of blocking it outright one can serve proof-of-work tasks (à la Anubis) with a difficulty proportional to that confidence. This would also help keep crawlers somewhat in the dark about whether they've been put on a blacklist.
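As a concrete sketch (not Anubis's actual protocol), the server could require a nonce such that a SHA-256 hash of challenge plus nonce has a number of leading zero bits scaled by the confidence score; the scaling constant below is arbitrary.

```python
# Sketch of confidence-scaled proof of work. The maximum difficulty and
# the linear scaling are arbitrary choices for illustration.
import hashlib
import secrets

MAX_DIFFICULTY_BITS = 22  # near-certain crawler: noticeable hashing cost

def difficulty_for(confidence):
    """Map a confidence score in [0, 1] to a number of leading zero bits."""
    return round(confidence * MAX_DIFFICULTY_BITS)

def leading_zero_bits(digest):
    """Count leading zero bits of a byte string."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits

def verify(challenge, nonce, confidence):
    """Server side: check the nonce against the confidence-derived difficulty."""
    digest = hashlib.sha256(challenge + nonce).digest()
    return leading_zero_bits(digest) >= difficulty_for(confidence)

def solve(challenge, confidence):
    """Client side: brute-force a nonce that meets the difficulty."""
    while True:
        nonce = secrets.token_bytes(8)
        if verify(challenge, nonce, confidence):
            return nonce
```

A nice property of the linear scaling is that low-confidence clients pay an almost imperceptible cost, so false positives barely notice, while the cost for high-confidence ranges grows without ever producing an explicit "you are blocked" signal.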