One further reason for centralisation (again, not "instead of" but "as well as") is consistency of metadata.
When I'm mapping specimen codes to GBIF I have one query interface and one return format. If I have to go to individual providers then all bets are off. Perhaps I'm lucky and the provider supports something like linked data, so I can figure out how to retrieve data (as opposed to a human-friendly web page). But instead I expect we will have all sorts of formats. For example, today I discovered records in GenBank that are linked to a tissue database with web pages like this:
So I have to write code to scrape this page and get the bit I need (the voucher code). Really? In this day and age? On the one hand it's great that this information exists, but if it's not computer readable then that makes it harder to integrate the data.
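To give a flavour of what that scraping involves, here is a rough sketch in Python. I'm assuming the page shows the voucher as a label followed by a code; the sample HTML, the "Voucher" label, the code "XYZ 12345", and the function name extract_voucher are all made up for illustration, and a real page would need its own pattern.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the non-empty text chunks of an HTML page, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def extract_voucher(html):
    """Return the voucher code found after a 'Voucher' label, or None.

    Assumption: the code looks like letters, an optional space or hyphen,
    then digits. This pattern is hypothetical and tied to the sample below.
    """
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    match = re.search(r"Voucher\s*:?\s*([A-Za-z]+[ -]?\d+)", text)
    return match.group(1) if match else None


# Invented sample markup standing in for a provider's web page.
sample = "<table><tr><td><b>Voucher:</b></td><td>XYZ 12345</td></tr></table>"
print(extract_voucher(sample))
```

The point is how fragile this is: the regular expression encodes guesses about one provider's layout, and every other provider would need its own version.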
Even if we use standard vocabularies we can still have problems. BigDig found a whole range of different versions of Darwin Core in the wild (see
http://bigdig.ecoforge.net/wiki/SchemaStatus ), and I suspect this is one of the sources of GBIF's problems (whoever decided that catalogNumber and catalogNumberText were a good idea has a lot to answer for).
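As a minimal sketch of the clean-up this forces on anyone aggregating records, the Python below folds variant field names into one canonical term. The alias table and the function normalise_record are my own illustration, not anything GBIF or BigDig actually uses; the only field names taken from above are catalogNumber and catalogNumberText.

```python
# Hypothetical alias table: lower-cased variant name -> canonical term.
FIELD_ALIASES = {
    "catalognumber": "catalogNumber",
    "catalognumbertext": "catalogNumber",
}


def normalise_record(record):
    """Return a copy of record with known variant field names unified.

    Empty values are dropped, and when several variants map to the same
    canonical term the first non-empty value wins (an arbitrary choice).
    """
    out = {}
    for key, value in record.items():
        if value in (None, ""):
            continue
        canonical = FIELD_ALIASES.get(key.lower(), key)
        out.setdefault(canonical, str(value))
    return out


# Invented example record using one of the variant names.
print(normalise_record({"CatalogNumberText": "XYZ 12345", "institutionCode": "XYZ"}))
```

A central aggregator can do this mapping once; without one, every consumer has to rediscover and maintain the alias table themselves.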
This is one reason I argue that we want both centralisation and decentralisation.