Sunday, July 28, 2002
A bit of commentary on Google and the Semantic Web
By Paul Ford
In response, ya see.
I've been receiving lots of feedback on the Google/Semantic Web piece, enough to address the readers directly.
First, ahem, RDF stands for “Resource Description Framework,” not “Format.” I am a shithead.
Second, the technologies being described in the piece all exist, more or less. Everything described could pretty much happen now; things would have to get worked out as we went along, and it would take years in practice, but there's zilch rocket science in there, no miracles, since obviously people have figured out how to do really huge, massively scalable data relation over at Google. Ultimately, it would be a really good thing, because it would pull down the big self-important wall-of-AI that's been built up over the last 30 years, and there would be fun tools for the peoples to play with, and AI and Knowledge Management would be for the peoples, and we could escape the desktop-windows pairuhdiggum.
Third, I actually doubt Google will try to get a cut of every transaction, as I've described. They've said over and over they only care about search. Although if they smell lots of money, well, corporations will be corporations.
Third, continued. See, I get worried about Google. They're beginning to control a space that is essential for open dissemination of information. So far they have only demonstrated excellent intentions, but the invisible hand of the market is quite a thing, and you often find it stuck right up your ass, or in your pocket looking for your wallet. Google is there to make money. There is nothing evil about that, but corporate money-making is not necessarily in the people's interests, and even companies that appear to have great intentions are forced to make difficult decisions that ultimately screw the consumer. When companies have power - and Google is getting real power over the way that information is disseminated - they need to be watched carefully.
Not that Google isn't sweet.
In some ways I wish there were an effort to create a P2P, hugely scalable, redundant spidering tool - exactly what Google has, but with a few million nodes on shared computers. Even better, if I could run an indexing algorithm against my own site, store the data locally, and report an overview (word list) via metadata - well, that would be snazzy, if a bit difficult to implement. Then, every relevant query via the P2P-based search mechanism could query my local server for full results. That way the search info about Ftrain would always be the freshest, and I would control the search of my site myself.
I did not explain that well.
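But here's roughly the shape of it, as a toy Python sketch. Every name in it is invented for illustration (build_index, overview, the ./ftrain directory, overview.json), and the real thing would need crawling, ranking, and all the P2P plumbing I'm waving my hands about:

    import json
    import re
    from collections import Counter
    from pathlib import Path

    WORD = re.compile(r"[a-z']+")

    def build_index(site_root):
        """Walk my local copy of the site and build a word -> pages index."""
        index = {}
        for path in Path(site_root).rglob("*.html"):
            text = path.read_text(errors="ignore").lower()
            for word in set(WORD.findall(text)):
                index.setdefault(word, []).append(str(path))
        return index

    def overview(index, top=500):
        """The 'word list' metadata I'd publish so the P2P network can route queries."""
        counts = Counter({word: len(pages) for word, pages in index.items()})
        return [word for word, _ in counts.most_common(top)]

    def search(index, term):
        """What my own server would answer when a relevant query actually arrives."""
        return index.get(term.lower(), [])

    if __name__ == "__main__":
        idx = build_index("./ftrain")  # hypothetical local mirror of the site
        Path("overview.json").write_text(json.dumps(overview(idx)))
        print(search(idx, "semantic"))

The point is just the split: the network only ever sees the little word-list overview, while the full index, and the answers, stay on my machine.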
Fourth, I'm telling you, if you'd only listen, that spreadsheets are important to the future of the Internet. Not the gunky ones we have now, but super-futuristic ultra-spreadsheets. Say I wanted to sell my books, so I put an ISBN into a spreadsheet and then applied a Semantic Web-based function. So I have ISBN 2884838483, and I enter item.book.isbn(2884838483) as the function. This goes out and talks to the Library of Congress, which spits back a nice MARC record, and an XSLT script converts that to an RDF description according to the Open Products Hierarchy and fills in title, author, publisher, and number of pages, just like that, in the spreadsheet. And each of those items can be related to other information, because there's a standard way to define data interchange (XML) and the actual structure of the data (RDF). Web-as-spreadsheet is fun to think about, I swear.
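If you want the gist in code, here's a toy sketch, with invented names throughout; isbn_lookup() stands in for the whole talk-to-the-Library-of-Congress, MARC-to-RDF business, which I am not about to write out here, so it just returns canned data:

    def isbn_lookup(isbn):
        """Stand-in for the real pipeline: query the Library of Congress,
        get a MARC record back, run the XSLT, end up with RDF-ish fields.
        Canned data here, so the spreadsheet flow is visible."""
        return {"title": "Some Book", "author": "Somebody",
                "publisher": "Some House", "pages": 288}

    FUNCTIONS = {"item.book.isbn": isbn_lookup}

    def evaluate_cell(formula):
        """Turn a cell like 'item.book.isbn(2884838483)' into a filled-in row."""
        name, _, rest = formula.partition("(")
        isbn = rest.rstrip(")")
        record = FUNCTIONS[name](isbn)
        return [record["title"], record["author"],
                record["publisher"], record["pages"]]

    print(evaluate_cell("item.book.isbn(2884838483)"))
    # -> ['Some Book', 'Somebody', 'Some House', 288]

The interesting part isn't the lookup, it's that the cell's formula names a shared vocabulary, so any other spreadsheet speaking that vocabulary knows exactly what those filled-in cells mean.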