Done for a class I took this past semester on agent-based modeling and also just for fun, I present:
The SiteMap, an elaborate pun involving automata.
Nate Wessel – University of Cincinnati Department of Geography – December 2013
‘Sitemaps’ are hierarchical online website page listings used as aids to robotic web indexers. The goal of this project is to develop a sitemap that takes ‘map’ literally, translating the page hierarchy and organization into an imagined transportation hierarchy, and placing it in an imaginary spatial context derived from site metadata. Using principles from cellular automata models, I built a plugin for WordPress that, on each new page request, uses available post metadata to produce a unique, stochastic map-of-the-website. Written in PHP with some client-side animation help from JavaScript, the program outputs the map as Scalable Vector Graphics (SVG) in the body of a page. Human users, as much as bots, can then explore the map and make sense of the way posts and pages are connected in the logical structure of the site.
A ‘sitemap’ is a tool used by some websites, particularly very large or complex ones, to make sure that indexing search engines have accessed and indexed all of their (important) pages. Generally a search engine’s web-crawling bot will try to follow links from any part of the web recursively until it has found and recorded the contents of every website that it’s indexing. The sitemap then is a sort of shortcut, and insurance against the formation of content islands in a website, where pages or clusters of pages might exist and even be important in their own right, but might not be linked to from other parts of the web and thus never get indexed. The sitemap is also intended to relay hierarchical information more clearly and thus to place more fundamental pages higher in search rankings, all else being equal; generally, you want your ‘www.example.com’ to show up before your ‘www.example.com/puppies/kittens/tarzan-wearing-a-powdered-wig/toothpaste.jpg’.
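To make the ‘content island’ idea concrete, here is a minimal sketch in Python of a crawler that follows links outward from a home page and reports whatever it never reaches. The page names and the `links` dictionary are made up for illustration, and real search-engine crawlers are of course far more involved than this.

```python
from collections import deque

def reachable_pages(links, start):
    """Breadth-first crawl: every page reachable by following links from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

def content_islands(links, start):
    """Pages that exist on the site but that a link-following crawler never finds."""
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    return all_pages - reachable_pages(links, start)

# Hypothetical toy site: the two '/old-map' pages link only to each other.
links = {
    "/": ["/about", "/blog"],
    "/about": ["/"],
    "/blog": ["/blog/post-1"],
    "/blog/post-1": ["/blog"],
    "/old-map": ["/old-map/details"],
    "/old-map/details": ["/old-map"],
}

print(sorted(content_islands(links, "/")))
# → ['/old-map', '/old-map/details']
```

A sitemap that simply lists every page hands the crawler those two stranded pages directly, which is the ‘insurance’ described above.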
Sitemaps, then, are basically meant to be seen only by robots. But what if they could be made fit and even useful for ingestion and indexing by people too? My basic problem was this: I have a website, CincyMap.org, which is really built around maps and particularly designed around a transit theme. I put a lot of work into some of the content, but some of the best content gets buried after a few posts in a blog structure that favors recent posts over older ones. Search engines are keeping track just fine, since everything is linked to internally and sometimes from other parts of the web, but I find that readers simply don’t know what, if anything, they’re missing. It’s also not clear what the structure of the site is beyond the linear temporal structure of the blog format.
My goal then is secondarily, let’s just say it, to riff on an appealing graphic theme and make a pun, but primarily to engage readers by giving them an understandable, and thus indexable, display of what kind of stuff is on my website and how it’s interrelated.
A final complicating factor is that, like all good things, the website which the sitemap should attempt to describe will always be changing in unpredictable ways, and so the output or form of the map must adapt well to an infinite range of inputs. Properties that emerge from complex interactions will set the tone of each fleeting reinvention of the sitemap.
The sitemap was developed on a WordPress 3.7.1 blog with 85 posts in 26 categories having ~500 tags, ~250 of them unique. Currently, it’s implemented as part of a custom ‘theme’, though my goal is to isolate it in a ‘plugin’ that may eventually be released to the public. A theme in WordPress is a set of files that dictates the form and presentation of a site, sort of like a die through which content, stored in the same format for each site, is extruded. A plugin operates similarly but generally focuses not on presentation but on providing some particular functionality, like visitor tracking, or an inline content item, like a survey form.
WordPress and its themes and plugins are written in PHP and so, logically, is this project. PHP runs on the server side, and is great for outputting web pages or other text-based things because code can be interspersed with literal text such as HTML or SVG tags. Rather than having to ‘echo’ or ‘print’ every line of output, much of it can simply stay as static text, with PHP tags printing just the dynamic content.
The map’s essential idea is taken from cellular automata models, and there are two basic types of agents that interact in a cellular space: categories and tags. Categories are broad buckets into which posts can fall, and tags are generally more specific descriptive words. Every time the sitemap is requested, i.e. every time a page containing the sitemap is requested from the server by a visitor, the script runs and a wholly new map is generated and sent. Two people requesting the page at the same time will get totally different results. Here’s how it works:
Results are mixed. The landuse/tag model seems to have been more immediately successful than the transit/category model, almost certainly because the latter is much more complex and thus prone to bugs and my own errors in conception. I think the basic concept has proven itself to be a very interesting one and the project will be further developed for actual, prominent deployment on my site.
Specifically, the tag/landuse model demonstrated strong emergent qualities and looked plausibly like an actual map of some geographic arrangement.
It did not, however, demonstrate the strong clustering of similar or repeated tags that I had hoped for. Tags tended to be content once they were in the middle of a very large cluster, or once they had found an identical partner. This meant that some tags, like the word ‘transit’, that were repeated as many as 30 times did not cluster among themselves but rather scattered pretty randomly, with quite a few groups of two or three. The transit/category model also demonstrated some emergent qualities, but it only occasionally presented a plausible transit system that interacted well with the tags, and it sometimes even produced erratic errors such as a line hopping randomly to the other side of the matrix. This shattered the illusion and weakened the pun.
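The tag behavior described above can be illustrated with a small Python sketch. This is not the plugin’s PHP code; the grid, the eight-cell neighborhood, and the exact form of the ‘contentment’ test are all my assumptions, chosen only to show why a rule that is satisfied by a single identical neighbor tends to leave repeated tags stranded in pairs.

```python
def neighbors(grid, r, c):
    """Contents of the (up to eight) cells surrounding cell (r, c)."""
    rows, cols = len(grid), len(grid[0])
    cells = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) != (0, 0) and 0 <= r + dr < rows and 0 <= c + dc < cols:
                cells.append(grid[r + dr][c + dc])
    return cells

def is_content(grid, r, c):
    """Assumed rule: a tag stops moving once it touches an identical tag,
    or once every surrounding cell is occupied (it sits inside a cluster)."""
    around = neighbors(grid, r, c)
    has_partner = grid[r][c] in around
    fully_surrounded = all(cell is not None for cell in around)
    return has_partner or fully_surrounded

def restless_tags(grid):
    """Positions of tags that would still want to move."""
    return [(r, c)
            for r, row in enumerate(grid)
            for c, tag in enumerate(row)
            if tag is not None and not is_content(grid, r, c)]

# Hypothetical grid: two separate 'transit' pairs and one lone 'maps' tag.
grid = [
    ["transit", "transit", None, None,      None],
    [None,      None,      None, None,      None],
    [None,      "maps",    None, "transit", "transit"],
    [None,      None,      None, None,      None],
]

print(restless_tags(grid))
# → [(2, 1)]
```

Under this rule all four ‘transit’ tags are already content, so the two pairs never seek each other out; only the lone ‘maps’ tag is still restless. Requiring more than one identical neighbor before a tag settles would be one way to push toward larger clusters, though I haven’t tested that here.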
While the results at the moment are less than totally intriguing, I think this is a very interesting problem and that I’ve taken a good basic approach to addressing it. There are, as I’ve said, a number of particular problems with the implementation of the idea in code, but the approach itself is solid.
The basic problem was to create an image that replicated an artifact that modeled a complex, organic system (a city’s infrastructure). The emergent qualities of the city or of a landscape are some of the first things that let us identify a map as a map. A map, for example, of a place with perfectly rectilinear and evenly spaced streets, no center, and a perfectly square coastline would not at first strike us as a map, even if the visual tropes of a north arrow and scale bar were right on the page. What makes a map a map, at least as much as the style, the line-weights, and the earth-tone colors, is the subject matter and its recurring organic quality, which we, as life ourselves, are so quick to recognize and appreciate as beautiful. It’s this quality that required an agent-based approach.
It’s the familiarity we have with this same quality that I hope will eventually make this map interesting to readers, and worth exploring and playing with. Fantasy maps, particularly of urban rail systems, have always seemed to have some inexplicable intrigue. Most of these maps, and there really are a good many of them, are drawn by hand, most of them of real places transformed by an imagined transit system. A brave or naive few do attempt to imagine a wholly new city infrastructure, but I think these attempts are less successful because we can’t conceive of plausible cities ourselves for the same reason that socialism couldn’t. Cities thrive where central planning and singular conception fail. They result from the actions of agents with conflicting self-interests and they demonstrate qualities and characters that we’d be more likely to locate in a god or demon than in ourselves.
Perhaps I’m suggesting a new subgenre of the fantasy rail map: that created by the same method as a real map but from imagined initial conditions. My initial conditions here are a WordPress blog; yours may vary.
The current most-stable version of the sitemap for this site, still under development, is to be found here. The code as of today has been uploaded here.
Which one of these lines is the cincy streetcar?
The one with all the mad-crazy econ devo goin’ on all over the place yo!