The Incrementalist

3/16/2007

Dabble DB: Still sadly short of structured Shangri-La

Filed under: — Joe @ 9:41 pm

My latest side project is Headway, a resource for public transit hackers and the agencies who… often aren’t sure what to make of them. For whatever reason, the combination of sharp urban-dwelling creative folk and useful-but-confusing public transit systems has yielded many handy sites dedicated to making it easier to get around.

As I was setting up the blog, I found that I really wanted some kind of outboard brain that could help me keep all the people and sites straight, and hopefully provide a useful reference for others. For expediency’s sake, I just used the handy “one-click” install of MediaWiki that DreamHost provides and started typing away. A few weekends later, the Headway Wiki was starting to become something useful, but I was definitely chafing against MediaWiki’s limitations. I found that I generally wanted to represent the same kinds of things about each entry:

  • the name of the site
  • the web address
  • who runs it
  • when it was launched (often with some degree of fuzziness, because even the site’s creator doesn’t really remember)
  • which agencies it serves

…and a few other miscellaneous things. Unfortunately, MediaWiki is really oriented towards prose, and in fact I found myself using repetitive prose (with a smattering of bulleted lists) to express these things. Even worse, when I wanted to connect an entry about a third-party transit site to an entry on the agency that it was helping out, I had to manually maintain the link on both ends of the connection. That is, I couldn’t just tell the system that Boston Subway Station Map had information about the MBTA and have it automatically display that in the MBTA entry; I had to go and edit the MBTA page by hand.
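
A minimal Python sketch of the kind of two-way link described here might look like the following. The Entry and Registry names are hypothetical and invented purely for illustration; this is not how MediaWiki or Dabble DB works internally, just one way to record a connection once and have it show up on both ends.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One entry in the collection: a transit site or an agency (hypothetical model)."""
    name: str
    url: str = ""
    # Names of linked entries; kept in sync on both ends by Registry.link().
    links: set = field(default_factory=set)


class Registry:
    """Holds entries and maintains every link in both directions."""

    def __init__(self):
        self.entries = {}

    def add(self, name, url=""):
        entry = Entry(name, url)
        self.entries[name] = entry
        return entry

    def link(self, a, b):
        # Record the relationship once; both entries see it.
        self.entries[a].links.add(b)
        self.entries[b].links.add(a)

    def related(self, name):
        # Sorted so the display order is stable.
        return sorted(self.entries[name].links)


registry = Registry()
registry.add("Boston Subway Station Map")
registry.add("MBTA")
registry.link("Boston Subway Station Map", "MBTA")
```

With this in place, asking for registry.related("MBTA") lists the station map without anyone editing the MBTA entry by hand.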

I did make use of MediaWiki’s (apparently) single structural feature: categories. Categories are basically simple tags that you can add to articles, so that the software can automatically generate an index of articles that all share a particular tag. Still, in the end it was far more work than I wanted to do.

There really should be a better way to put together a structured data collection like this, something in between the limited expressiveness of MediaWiki and the programming involved in putting together a custom database-backed website using Ruby on Rails or what have you. I’m pretty sure that it’s possible, because I spent several years of my life working on tools like that for the MAYA Information Commons project. Sadly, that work still isn’t available to the general public, so it’s not really a contender here. However, there are a few intriguing new possibilities.

Enter Dabble DB. At first blush, it looked like just the thing that I was looking for. It has what’s probably the best available interface for experimenting with different ways of representing interconnected information. It’s pretty straightforward to create an item, add a few fields to it, and make some of those fields two-way links to other items. That’s no small feat, since my former co-workers and I spent the better part of 2004 building something similar (and if Dreaming in Code is to be believed, the folks on the respected Chandler team were at it for even longer, at around the same time). So far, so good. But after an evening trying to make the Headway data work in Dabble DB, I’ve run into a bunch of significant shortcomings.

No boolean fields

Starting with the smallest thing, there’s no straightforward way to represent a simple checkbox for things like “does this feed contain schedule information?” You can work around this by creating a multiple-choice field with the options “Yes” and “No”, but Dabble DB is missing an opportunity to make entering and displaying these fields simpler.

Limited spatial information

Here we are, a couple of years after the Google Maps API catalyzed a geographic revolution on the web, and Dabble DB’s only location options are “US or Canadian state/province code” and “Country Code”. To their credit, they do automatically link to a Google Maps search for your term in some cases, but they could provide far more interesting map views if they simply had a lat/lon geocode field and just dumped it into Google Maps.
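
As a rough illustration of how little a lat/lon field would need, here is a short Python sketch that turns a coordinate pair into a map link. The maps_url function name is hypothetical, and the URL pattern assumes the present-day Google Maps URLs search format rather than anything Dabble DB offers; the coordinates in the usage note are approximate and only for demonstration.

```python
from urllib.parse import urlencode


def maps_url(lat, lon):
    """Build a Google Maps search link for a latitude/longitude pair.

    Assumes the Google Maps URLs search format (api=1&query=lat,lon).
    """
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("coordinates out of range")
    # urlencode percent-encodes the comma between the two values.
    query = urlencode({"api": 1, "query": f"{lat},{lon}"})
    return "https://www.google.com/maps/search/?" + query
```

Calling maps_url(42.36, -71.06), roughly downtown Boston, yields a link that could sit next to each entry in a table view.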

Ontological limitations

It’s very cool that Dabble DB lets you put one item in multiple “categories” (schemas, basically). But in practice, their implementation is less handy than it would seem. Say you had two kinds of things, “websites” and “data providers”, both of which have names (of course) along with other more category-specific fields. If it turns out that you want to represent something that’s both a website and a data provider, and you put both categories on the same object, you end up with two name fields.

You could take a different tack and say that a “data provider” is a specific kind of “website”, so only the website category will have a name field. That’s great, but then there’s no easy way to have the system automatically add the “website” category when you go to create your next “data provider” item. Even worse, when you go to create a new view of your data based on “data providers”, there’s no way to choose to display the “name” field from the “websites” category in the table. (Note: this isn’t strictly true for the name field, since they special-case it so that you always have some kind of identifier, but it’s true for other attributes.)
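
The behavior wished for above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the CATEGORIES table, the “implies” key, the “feed format” field, and the expand and fields_for functions are all invented here, and none of them reflects Dabble DB’s actual data model. The idea is that a category can imply another one, and the combined field list never repeats a shared field such as the name.

```python
# Hypothetical schema: each category lists its own fields and the
# categories it implies (its "parents").
CATEGORIES = {
    "website": {"fields": ["name", "url"], "implies": []},
    "data provider": {"fields": ["feed format"], "implies": ["website"]},
}


def expand(categories):
    """Return the given categories plus everything they imply, parents first."""
    result = []

    def visit(cat):
        for parent in CATEGORIES[cat]["implies"]:
            visit(parent)
        if cat not in result:
            result.append(cat)

    for cat in categories:
        visit(cat)
    return result


def fields_for(categories):
    """Collect the fields of all expanded categories, each field only once."""
    seen = []
    for cat in expand(categories):
        for name in CATEGORIES[cat]["fields"]:
            if name not in seen:
                seen.append(name)
    return seen
```

Here, tagging an item as just a “data provider” pulls in the “website” category automatically, and a view built on data providers can still show the name and url fields.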

Rudimentary public views

I could probably work around all those things, but there’s one thing that makes Dabble DB unusable for the Headway data set: the public view is horribly impoverished. Here are the results of my experiments: my lovingly interlinked data has been reduced to a box of yellowing printouts, metaphorically speaking. There’s no apparent way for the viewer to see a single entry laid out in a readable form, let alone follow links between items or search & filter by different attributes.

It’s a shame, because Dabble DB really is the best that I’ve seen so far in most other respects.

Freebase to the rescue?

There’s another contender on the horizon: the wonderfully named Freebase. Tim O’Reilly recently threw a debutante ball for it on his influential blog, and it’s easy to see why it stirred some excitement (and controversy) in the online community. It sounds quite a bit like the things I was working on at MAYA, but with a pleasantly simple web-based interface and without the radical peer-to-peer architecture. On the other hand, it’s hard to say for sure, since the alpha is currently only open to a few fortunate souls, and details are scarce. Hopefully I’ll get a chance to check it out soon.

In the meantime, Dabble DB has a lot of potential, especially since they recently launched their free Creative Commons version (which made it a viable option for Headway). Hopefully, with a few refinements, they’ll be able to turn it into a compelling alternative to developing custom code any time you want to share some interconnected information.