Thursday, June 28, 2001
Running the Machine
By Paul Ford
Under the hood of Ftrain.com
This is the code for Ftrain with demo XML, except for the journal-manipulation stuff: ftrain_code.tar.gz (55K).
This is boring and technical, but I promised about 10 people I'd do it. If you're not interested in web site structure and layered digital narratives and XSLT and XML and so forth, don't even bother reading further. Sometimes I say that in a joking manner, but writing this up was about as much fun as pulling off my own toes, because at this point there are about 9 billion little decisions embedded in the code and I can't take the time to explain each one right now, even though I'd like to. No fun follows.
Anyone who wants to work on this code with me should drop a line. What's here is an enormously slimmed-down micro-version of the original 9 billion lines of shit-code I wrote, and now that it's fast, I want to expand the features: create a core suite of small (PHP?) functions that would auto-execute on each page to allow people to add content to the pages on an ad-hoc basis, and build about 30-40 different little tools to make a real, proper Web site publishing framework - tools that for some reason no one else seems to be bothering with, like creating an interactive fiction layer over the narrative, or tokenizing the narrative to allow people to move it around as they go. Allowing people to create "guided tours" of this content for their own sites, where their own comments and ideas about a page are published at the top of the page when they link someone to it. Finding ways for people to take narrative tokens with them to other Web sites, creating a linked space between different Web sites and a contiguous narrative experience between entirely different content zones, editorially and contextually adaptive pages, multiple story pathways, varying forms from individual content (plays from stories, stories from novels), versioning, serialized cross-linked content, and so forth. Blah blah blah.
Anyone who seriously wants to make a Web site with this stuff should let me know, too.
Someone please give me a grant and an office for a few months, while you're at it? I promise to behave and post to the site every day and avoid prison sex jokes for the duration. The grant should be in "narrative technologies" and there should be a stipend and a tiny, clean apartment within walking distance of an office and a grocery. I'll deliver open-sourced code and a public thesis or presentation at the end and scrub floors and give seminars and ask questions and actually write a continuous multiaxial narrative (spatial/ digital/ character/ chronological/ emotional/ rhetorical) and read the right books and toe the line and I give up.
Sometimes having grand plans is totally isolating. The bottleneck is my brain.
Right now, the system only works with the xsltproc command-line processor, which is part of libxslt, which in turn requires libxml. Both are at http://www.xmlsoft.org.
Version: Using libxml 20310 and libxslt 1100
Later versions may not work! I'm not sure why. With libxslt version 12, no page text appeared; version 13 may have fixed the problem, but I haven't tested it. Version 11 of libxslt does work, however.
SAXON currently doesn't work because of a difference in the handling of the document output function between libxml and libxslt. If you'd like to know how to make it work with SAXON, let me know and I'll write it up and send you an email. This will probably change in the future, since they both claim to support the standard and one behavior or another for document output stuff must be right. I'm assuming right now that LibXSLT does it right, and perhaps the new version of SAXON does, too.
No other XSLT processors have been tested but assume they're not going to work unless really up-to-date. Sablotron probably won't; it's missing some important functions. If you're using Xalan, stop.
Once you've installed the libraries, you need to edit the file work/scripts/ftrain_vars.xsl in the installed directory and change the first variable to be appropriate to the full path on your own machine. So:
<xsl:variable name="dir_root">/home/ford/ftrainDEMO/</xsl:variable>
becomes
<xsl:variable name="dir_root">/my/home/dir/ftrainDEMO/</xsl:variable>
Then go into the work directory and run the shell script ./buildsite.sh and, if you have xsltproc available, it should spit out a whole bunch of HTML files at the top level, along with one RDF file (like the one at http://ftrain.com/ftrain.rdf). View the index.html file with your browser and you should be ready to go.
Basically, if you don't know XSLT it's going to seem like a big pot of nonsense. My stuff is probably not a good way to learn XSLT, as I learned everything wrong and now I just use a lot of tricks.
When you run the buildsite.sh script, it runs three XSLT scripts. The first script, ftrain_map.xsl, slurps in the demo.xml file, which is the top-level XML file describing the "site," and writes out a file called map.xml. This file contains an exact map of all the content, with all the titles, sections, descriptions, and dates. It skips any sections I've flagged as not-to-be-released.
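For the curious, a map pass like this tends to look something like the following XSLT 1.0 fragment - a sketch only, with made-up element and attribute names (section, title, description, @status), not the real Ftrain DTD:

```xml
<!-- Sketch of a map pass: walk every section in the source, copy out
     its identifying data, recurse into child sections, and skip
     anything flagged as unreleased. Names here are stand-ins. -->
<xsl:template match="section">
  <xsl:if test="not(@status = 'hold')">
    <section id="{@id}" date="{@date}">
      <title><xsl:value-of select="title"/></title>
      <description><xsl:value-of select="description"/></description>
      <xsl:apply-templates select="section"/>
    </section>
  </xsl:if>
</xsl:template>
```

The point is that the output keeps the hierarchy but drops the body text, which is why the map ends up so much smaller than the source.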
The second script, ftrain_toc.xsl, slurps in map.xml, the file we just created. For me, this is much more efficient than using the original 2.5 meg file that is the Ftrain XML source; it's only 131K. (Remember that things can be 10-12x bigger in memory as DOM trees, so it's actually the difference between 30 megs and 1.2 megs.) This script spits out a bunch more XML files: a table of contents, a reverse-by-date table of contents, a forward-by-date table of contents, a last-10-entries listing, and an RDF representation of the site if people want to include that information on their own sites.
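Writing several output files from one stylesheet run is the "document output function" business mentioned above; with xsltproc the usual route is the EXSLT document extension. A hedged sketch of that shape - the file names and elements are invented, and the real ftrain_toc.xsl may do it differently:

```xml
<!-- Sketch: one transform, multiple result files, via exsl:document.
     toc.xml and last10.xml are made-up names for illustration. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    extension-element-prefixes="exsl">
  <xsl:template match="/">
    <exsl:document href="toc.xml">
      <toc><xsl:apply-templates select="//section" mode="toc"/></toc>
    </exsl:document>
    <exsl:document href="rss.rdf">
      <!-- RDF channel built from the same map data would go here -->
    </exsl:document>
  </xsl:template>
  <xsl:template match="section" mode="toc">
    <entry id="{@id}"><xsl:value-of select="title"/></entry>
  </xsl:template>
</xsl:stylesheet>
```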
Now we run a third script, called ftrain_main.xsl. This reads in the demo.xml file again. The demo.xml file has entities that point to the various table-of-contents listings we just produced, so, essentially, we've made the file contain maps of itself. Instead of parsing just this file by itself, we load in the map created by the first step using XSLT's document() function and step through that. For each section in the map, we look up the corresponding section in the XML document using a key, and spit out a document. This original/map approach is much faster than dealing with the big XML document alone - 15 seconds vs. 20 minutes.
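The map-then-key lookup is a standard XSLT 1.0 idiom, and it's worth seeing because there's one trap in it: key() only searches the document of the current node, so you have to step back into the source tree before calling it. A sketch, with stand-in names (by-id, map.xml):

```xml
<!-- Sketch of the original/map trick: iterate the small map document,
     then jump back into the big source tree with a key, which beats
     searching the source with // every time. -->
<xsl:key name="by-id" match="section" use="@id"/>

<xsl:template match="/">
  <!-- remember the source tree before we change context -->
  <xsl:variable name="source" select="/"/>
  <xsl:for-each select="document('map.xml')//section">
    <xsl:variable name="id" select="@id"/>
    <!-- key() only sees the current node's document, so switch the
         context back to the source tree before the lookup -->
    <xsl:for-each select="$source">
      <xsl:apply-templates select="key('by-id', $id)"/>
    </xsl:for-each>
  </xsl:for-each>
</xsl:template>
```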
The key to ftrain_main.xsl is that if the system doesn't have a rule for an element, it just passes it through. So essentially my DTD is HTML - <p> tags, <img src="whatever.gif"/> tags and the like, plus structural information. Each gathering of HTML is a sort of document, and gathered around it is the information that tells the document where it belongs. I've added a few things, like xrefs, which point to another section in the site.
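That pass-through behavior is the classic XSLT identity transform (the exact rule in ftrain_main.xsl may differ, but this is the standard shape):

```xml
<!-- The identity transform: any node without a more specific template
     is copied through unchanged, which is what lets plain HTML like
     <p> and <img> flow straight from source to output. -->
<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>
```

Any element you do want to handle specially - an xref, say - just gets its own template, which automatically wins over this catch-all.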
Should be enough to get you started if you're a Linux geek. Everyone else will have to wait - I've got things fairly abstract at this point but I don't know how other XSLT vendors implement their stuff and there are a bunch of little changes. Still, the basic functioning is all pretty standard and one script feeds into another. If there is any desire for a more generalized Ftrain system I'll try to meet it. But I know you're all weak, all talk, and that losing your e-commerce stocks took your fire away and you won't actually be joining me in uncovering the possibilities of new narrative connections via the global Interweb because the Web isn't cool anymore. Assholes. Me, I have a 20% stake in an e-commerce telecom startup and gave it hundreds of grievous painful hours of programming and consultancy time. 100,000 shares of nothing. That's the title of my success story. But I refuse to forsake the Web I love.
That's it. All documents linked together, all in harmony, all with full knowledge of their place in the hierarchy, but so many possibilities for each to transcend its place.