A New Website for Harper's Magazine (Ftrain.com)

Monday, December 1, 2003

A New Website for Harper's Magazine

On December 1, 2003, a new website for Harper's Magazine launched at Harpers.org. I conceptualized, programmed, and designed the site, under the management of Harper's senior editor Roger D. Hodge. I also wrote some copy for the site, and have been editing the Archive of pre-1900 articles.

I desperately need a nap, but I thought I'd tell you a bit about the site first.

Image: screenshot of Harpers.org.

The site looks like this, but larger.

It's been noted that Harpers.org looks like Ftrain. It's actually the other way around: Ftrain looks like Harpers.org. I've been using you, the Ftrain reader, as a guinea pig for about 5 months, testing ideas I developed for Harper's, finding out what JavaScript worked in which browser, which interface ideas were too baffling to include, and seeing how you dealt with different sorts of links.

Thanks for that.

Because Ftrain readers are free with both praise and criticism, this turned out to be a good way to craft a site that was accessible, worked in most browsers, and was enjoyable to use (with some practice). Now that Harpers.org is up and its design is stable, Ftrain can change according to my whim, and I can begin to break things here in new ways.

Now, I am going to blow my own horn. [1]

Image: a man blows a trumpet.

Features of Harpers.org

The regular list of new-age website tomfoolery applies: XHTML/CSS/QXNYTLRPK, accessible for the people, JavaScript zip-zap, validating RSS hoo-ha, etc. The framework is solid XML and XSLT 2.0, and plays nice with others. But also:

Remixing Narrative

Harper's is built upon a Semantic Web framework, albeit a primitive one. I've written before about what the Semantic Web is and why it matters, if you're curious, so I won't rehash that here.

Using this framework, Harper's is divided into two parts: narrative content, like the Features and the Weekly Review, and a taxonomy (or ontology, depending on your preferred term), called Connections.

The taxonomy is a big list of interconnected topics—examples are Dolly the Sheep, Monkeys, and Satan.
The Weekly Review, which is narrative content, is a description of the events of the past week, published every Tuesday (see an example).

We cut up the Weekly Review into individual events (6,000 of them, going back to the year 2000), and tagged them by date, using XML and a bit of programming. We did the same with the Harper's Index, except instead of events, we marked things up as “facts.”

Then we added links inside the events and facts to items in the taxonomy. Magic occurred: on the Satan page, for instance, is a list of all the events and facts related to Satan, sorted by time. Where do these facts come from? From the Weekly Review and the Index. On the opposite side, as you read the Weekly Review in its narrative form, all of the links in the site's content take you to timelines. Take a look at a recent Harper's Index and click around a bit, and you'll see what I mean.

The best way to think about this is as a remix: the taxonomy is an automated remix of the narrative content on the site, except instead of chopping up a ballad to turn it into house music, we're turning narrative content into an annotated timeline. The content doesn't change, just the way it's presented.
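For the curious, the remix can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the XSLT that runs the site: the element names (`event`, `fact`, `link`), the `date` and `topic` attributes, and the placeholder text are all invented for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical markup: element and attribute names are assumptions,
# not the actual Harper's schema. The text is placeholder copy.
WEEKLY_REVIEW = """
<review date="2003-11-25">
  <event date="2003-11-19">Placeholder event mentioning <link topic="satan">Satan</link>.</event>
  <event date="2003-11-21">Placeholder event about <link topic="monkeys">monkeys</link> and <link topic="satan">Satan</link>.</event>
</review>
"""

INDEX = """
<index date="2003-11-01">
  <fact date="2003-11-01">Placeholder fact about <link topic="monkeys">monkeys</link>.</fact>
</index>
"""


def build_timelines(xml_sources):
    """Map each topic to its (date, kind, text) entries, oldest first."""
    timelines = defaultdict(list)
    for source in xml_sources:
        root = ET.fromstring(source.strip())
        for item in root.iter():
            if item.tag not in ("event", "fact"):
                continue
            # Flatten the item to plain text and collapse whitespace.
            text = " ".join("".join(item.itertext()).split())
            # Every link inside the item files it under a topic page.
            topics = {link.get("topic") for link in item.iter("link")}
            for topic in topics:
                timelines[topic].append((item.get("date"), item.tag, text))
    for entries in timelines.values():
        entries.sort()  # ISO dates sort correctly as strings
    return dict(timelines)


if __name__ == "__main__":
    for date, kind, text in build_timelines([WEEKLY_REVIEW, INDEX])["satan"]:
        print(date, kind, text)
```

The narrative XML is never modified. The timeline for a topic is just a second view derived from the same source files.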

Everything is in the Taxonomy

Harpers.org makes almost no distinction between data and metadata. Any block of text can have multiple blocks of text living inside of it (as when the Weekly Review contains events), and these blocks in turn can contain multiple blocks, and so forth. What this means in practice is that in addition to events and facts, I can add any arbitrary kind of data to the site. Links, Litigation, Questions, Answers, Lies, Photos, Crimes, any sort of boundary you can think of. By linking from inside of these boundaries to pages in the taxonomy, the taxonomy pages know to automatically list and sort them, whatever they are. How you display them is up to the XSLT code, and to the way the ontology is structured. Let's skip over that part.
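A rough sketch of the blocks-inside-blocks idea, again in Python rather than the site's own XSLT. The `Block` class, its field names, and the sample data are assumptions made up for illustration. The point is only that a topic page can list whatever links to it without knowing in advance what kinds of blocks exist.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Block:
    """A hypothetical block of content; blocks may nest to any depth."""
    kind: str                 # "review", "event", "fact", "photo", ...
    text: str = ""
    date: str = ""            # ISO date, so string order is time order
    topics: tuple = ()        # taxonomy pages this block links to
    children: list = field(default_factory=list)


def walk(block):
    """Yield a block and everything nested inside it."""
    yield block
    for child in block.children:
        yield from walk(child)


def topic_page(root, topic):
    """Group every block that links to `topic` by kind, sorted by date."""
    listing = defaultdict(list)
    for block in walk(root):
        if topic in block.topics:
            listing[block.kind].append(block)
    return {kind: sorted(blocks, key=lambda b: b.date)
            for kind, blocks in listing.items()}


# Placeholder content; a new kind ("photo") needs no new code.
SITE = Block("site", children=[
    Block("review", date="2003-11-25", children=[
        Block("event", "Placeholder event.", "2003-11-21", ("satan",)),
        Block("event", "Earlier placeholder event.", "2003-11-19", ("satan", "monkeys")),
    ]),
    Block("photo", "Placeholder photo caption.", "2003-10-02", ("satan",)),
])

if __name__ == "__main__":
    for kind, blocks in topic_page(SITE, "satan").items():
        print(kind, [b.date for b in blocks])
```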

Another example: the Bookstore is just another part of the site, and the ads for books are automatically generated from the bookstore. Advertising and editorial are produced with the same system using the same linking mechanism. In theory, this would allow by-topic sponsorships similar to keyword-based advertising on search engines: “I'd like this ad to run next to the religion category and on all pages related to religion.”
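Here is one way the by-topic sponsorship could work, sketched in Python. Everything in it is hypothetical: the `BROADER` table (which topics sit under which), the `SPONSORS` table, and the ad copy are invented, and the site does not necessarily model "related to" this way.

```python
from collections import deque

# Hypothetical taxonomy: each topic lists the broader topics it belongs to.
BROADER = {
    "satan": ["religion"],
    "religion": [],
    "monkeys": ["animals"],
    "animals": [],
}

# Hypothetical sponsorships: ads attached to a topic, not to a page.
SPONSORS = {
    "religion": ["Placeholder ad for a book on religion"],
}


def ads_for(topic, broader=BROADER, sponsors=SPONSORS):
    """Collect ads sponsoring this topic or any broader topic above it."""
    ads, seen, queue = [], set(), deque([topic])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # guard against cycles in the taxonomy
        seen.add(current)
        ads.extend(sponsors.get(current, []))
        queue.extend(broader.get(current, []))
    return ads


if __name__ == "__main__":
    print(ads_for("satan"))
    print(ads_for("monkeys"))
```

An ad bought against the religion category would then appear on the Satan page too, because Satan is filed under religion in this made-up taxonomy.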

There's other stuff under the hood, and I have many plans (dynamically generated maps! queryable content! news-trackers!), but actions speak louder, etc.

No Banner Ads

Banner ads are terrible for both readers and advertisers. We got rid of them for Harper's, put non-blinking ads to the side, and made them half image, half copy, flexibly sized. This is, I believe, good for brand-building—the advertiser's message is prominently displayed and persistently there as the user reads. Because this message is not obtrusive and animated, it need not be ignored. Because it is graphical and bold, and lives in its own place to the right, it can be seen as content unto itself, not simply tacked on to make a few spare bucks. The ads are an integral part of the page.

(I wouldn't mind animated ads, but they should only animate when the user mouses over them and shows interest. “The reader's freedom is a holy thing,” says William Gass, and I agree.)

Constructing Harpers.org

This was originally going to be a case study of how Harper's came into being. But only I care about that, and my pomposity has some limits. So I'll give you the entire thing in 4 bullets.

- Harper's has been very patient.
- Roger is an editor who knows XML and how to program. His kind are slightly more rare than talking dogs. He took on a great deal of complicated work in order to make this site happen, without even a shrug.
- 3,000 facts, 6,000 events, 12,000 links, 500 topics, and over 939 separate HTML pages. 300,000 words.
- I finished coding the first draft of the site by annotating printouts of XSLT code with a pencil, by propane light, in a 100-year-old log cabin in West Virginia, while muttering.

Now that everything is working fairly well, it's time to tear the guts out of the code and start again. A small team of Java coders and I are planning to take the work done on Harper's, and in other places like Rhetorical Device, and create an open-sourced content management system based on RDF storage. This will allow much larger content bases (the current system will start to get gimpy at around 30 megs of XML content, which is fine for Harper's but not for larger sites) and will let different kinds of content be merged. When this will be completed is open to discussion, of course. But it seems like the right next step, if we can just figure out how to find the time to get it done. More later.

Talk to Me

Now that I've done this, I'd love to talk about it. If you're in publishing or a related industry and want to talk to me about any of the work that went into the Sitekit code, for whatever reason, with the idea that I might help you think about your own content, please contact me, arrange a meeting, take me to lunch, throw paper airplanes over the East River from Manhattan in the hope that they will come in through the hole in my screen window, and so on. It's time for a career change.

Notes

1. Trumpet provided by Leslie Harpold. Ms. Harpold is Ftrain's preferred visual archivist and art director.

Ftrain.com

Ftrain.com is the website of Paul Ford and his pseudonyms. It is showing its age. I'm rewriting the code but it's taking some time.

EMAIL

About the author: I've been running this website from 1997. For a living I write stories and essays, program computers, edit things, and help people launch online publications. (LinkedIn). I wrote a novel. I was an editor at Harper's Magazine for five years; then I was a Contributing Editor; now I am a free agent. I was also on NPR's All Things Considered for a while. I still write for The Morning News, and some other places.

If you have any questions for me, I am very accessible by email. You can email me at ford@ftrain.com and ask me things and I will try to answer. Especially if you want to clarify something or write something critical. I am glad to clarify things so that you can disagree more effectively.
