The Incrementalist

3/16/2007

Dabble DB: Still sadly short of structured Shangri-La

Filed under: — Joe @ 9:41 pm

My latest side project is Headway, a resource for public transit hackers and the agencies who… often aren’t sure what to make of them. For whatever reason, the combination of sharp urban-dwelling creative folk and useful-but-confusing public transit systems has yielded many handy sites dedicated to making it easier to get around.

As I was setting up the blog, I found that I really wanted some kind of outboard brain that could help me keep all the people and sites straight, and hopefully provide a useful reference for others. For expediency’s sake, I just used the handy “one-click” install of MediaWiki that DreamHost provides and started typing away. A few weekends later, the Headway Wiki was starting to become something useful, but I was definitely chafing against MediaWiki’s limitations. I found that I generally wanted to represent the same kinds of things about each entry:

  • the name of the site
  • the web address
  • who runs it
  • when it was launched (often with some degree of fuzziness, because even the site’s creator doesn’t really remember)
  • which agencies it serves

…and a few other miscellaneous things. Unfortunately, MediaWiki is really oriented towards prose, and in fact I found myself using repetitive prose (with a smattering of bulleted lists) to express these things. Even worse, when I wanted to connect an entry about a third-party transit site to an entry on the agency that it was helping out, I had to manually maintain the link on both ends of the connection. That is, I couldn’t just tell the system that Boston Subway Station Map had information about the MBTA, and have it automatically display that in the MBTA entry; I had to go and edit the MBTA page by hand.
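To make the "both ends of the connection" complaint concrete, here is a minimal sketch of a two-way link, assuming a simple in-memory store. The `LinkStore` class and its method names are hypothetical and made up for illustration; this is not how MediaWiki or any other tool mentioned here works internally.

```python
from collections import defaultdict


class LinkStore:
    """Hypothetical store that records both directions of a link in one write."""

    def __init__(self):
        self._forward = defaultdict(set)  # site -> agencies it covers
        self._reverse = defaultdict(set)  # agency -> sites that cover it

    def link(self, site, agency):
        # A single call updates both ends, so neither entry is edited by hand.
        self._forward[site].add(agency)
        self._reverse[agency].add(site)

    def agencies_for(self, site):
        return sorted(self._forward.get(site, ()))

    def sites_for(self, agency):
        return sorted(self._reverse.get(agency, ()))


store = LinkStore()
store.link("Boston Subway Station Map", "MBTA")

# The MBTA entry can now list the site without anyone touching the MBTA page.
print(store.sites_for("MBTA"))
```

The point of the sketch is only that the reverse index is derived from the same write as the forward one, which is exactly the bookkeeping that a prose-oriented wiki leaves to the editor.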

I did make use of MediaWiki’s (apparently) single structural feature: categories. Categories are basically simple tags that you can add to articles, so that the software can automatically generate an index of articles that all share a particular tag. Still, in the end it was far more work than I wanted to do.

There really should be a better way to put together a structured data collection like this, something in between the limited expressiveness of MediaWiki and the programming involved in putting together a custom database-backed website using Ruby on Rails or what have you. I’m pretty sure that it’s possible, because I spent several years of my life working on tools like that for the MAYA Information Commons project. Sadly, that work still isn’t available to the general public, so it’s not really a contender here. However, there are a few intriguing new possibilities.

Enter Dabble DB. At first blush, it looked like just the thing that I was looking for. It has what’s probably the best available interface for experimenting with different ways of representing interconnected information. It’s pretty straightforward to create an item, add a few fields to it, and make some of those fields two-way links to other items. That’s no small feat, since my former co-workers and I spent the better part of 2004 building something similar (and if Dreaming in Code is to be believed, the folks on the respected Chandler team were at it for even longer, at around the same time). So far, so good. But after an evening trying to make the Headway data work in Dabble DB, I’ve run into a bunch of significant shortcomings.

No boolean fields

Starting with the smallest thing, there’s no straightforward way to represent a simple checkbox for things like “does this feed contain schedule information?” You can work around this by creating a multiple-choice field with the options “Yes” and “No”, but Dabble DB is missing an opportunity to make entering and displaying these fields simpler.

Limited spatial information

Here we are, a couple years after the Google Maps API catalyzed a geographic revolution on the web, and Dabble DB’s only location options are “US or Canadian state/province code” and “Country Code”. To their credit, they do automatically link to a Google Maps search for your term in some cases, but they could provide far more interesting map views if they simply had a lat/lon geocode field and just dumped it into Google Maps.
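As a rough sketch of how little is needed once an entry carries a lat/lon pair, the snippet below turns such pairs into KML placemarks that a map viewer could consume. The function names, the sample entry, and its coordinates are all assumptions for illustration (the coordinates are only roughly downtown Boston), and the namespace shown is the OGC KML 2.2 one; nothing here reflects Dabble DB's actual export.

```python
from xml.sax.saxutils import escape


def placemark(name, lat, lon):
    """Build one KML Placemark for a named point (hypothetical helper)."""
    # KML writes coordinates as longitude,latitude, in that order.
    return (
        "<Placemark>"
        f"<name>{escape(name)}</name>"
        f"<Point><coordinates>{lon},{lat}</coordinates></Point>"
        "</Placemark>"
    )


def kml_document(entries):
    """Wrap (name, lat, lon) tuples in a minimal KML document."""
    body = "".join(placemark(name, lat, lon) for name, lat, lon in entries)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<kml xmlns="http://www.opengis.net/kml/2.2">'
        f"<Document>{body}</Document>"
        "</kml>"
    )


# Illustrative entry only; the coordinates are approximate placeholders.
print(kml_document([("Example transit stop", 42.36, -71.06)]))
```

Because the placemarks are written inline rather than referenced through a network link, the resulting file carries the data itself.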

Ontological limitations

It’s very cool that Dabble DB lets you put one item in multiple “categories” (schemas, basically). But in practice, their implementation is less handy than it would seem. Say you had two kinds of things, “websites” and “data providers”, both of which have names (of course) along with other more category-specific fields. If it turns out that you want to represent something that’s both a website and a data provider, and you put both categories on the same object, you end up with two name fields.

You could take a different tack and say that a “data provider” is a specific kind of “website”, so only the website category will have a name field. That’s great, but then there’s no easy way to have the system automatically add the “website” category when you go to create your next “data provider” item. Even worse, when you go to create a new view of your data based on “data providers”, there’s no way to choose to display the “name” field from the “websites” category in the table. (Note: this isn’t strictly true for the name field, since they special-case it so that you always have some kind of identifier, but it’s true for other attributes.)
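Here is a minimal sketch of the behavior this section is asking for, assuming categories can name a parent: adding “data provider” implies “website”, and a shared field such as the name shows up only once. The `Category` class, the helper functions, and the `url` and `feed format` fields are hypothetical, invented for this example rather than taken from Dabble DB.

```python
class Category:
    """Hypothetical category (schema) with an optional parent category."""

    def __init__(self, name, fields, parent=None):
        self.name = name
        self.fields = list(fields)
        self.parent = parent

    def lineage(self):
        # Ancestors come first, so inherited fields precede specific ones.
        chain = self.parent.lineage() if self.parent else []
        return chain + [self]


def expand(categories):
    """Add every implied parent category, listing each category once."""
    seen = []
    for cat in categories:
        for c in cat.lineage():
            if c not in seen:
                seen.append(c)
    return seen


def visible_fields(categories):
    """Collect fields across all categories, keeping one copy of each name."""
    fields = []
    for cat in expand(categories):
        for field in cat.fields:
            if field not in fields:
                fields.append(field)
    return fields


website = Category("website", ["name", "url"])
data_provider = Category("data provider", ["feed format"], parent=website)

# Tagging an item as a data provider pulls in the website category as well.
print([c.name for c in expand([data_provider])])
# A view based on data providers can still show the inherited name field.
print(visible_fields([data_provider]))
```

Merging fields by name is a deliberate simplification; a real system would need some way to decide when two same-named fields from unrelated categories actually mean the same thing.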

Rudimentary public views

I could probably work around all those things, but there’s one thing that makes Dabble DB unusable for the Headway data set: the public view is horribly impoverished. Here are the results of my experiments: my lovingly interlinked data has been reduced to a box of yellowing printouts, metaphorically speaking. There’s no apparent way for the viewer to see a single entry laid out in a readable form, let alone follow links between items or search & filter by different attributes.

It’s a shame, because Dabble DB really is the best that I’ve seen so far in most other respects.

Freebase to the rescue?

There’s another contender on the horizon: the wonderfully named Freebase. Tim O’Reilly recently threw a debutante ball for it on his influential blog, and it’s easy to see why it stirred some excitement (and controversy) in the online community. It sounds quite a bit like the things I was working on at MAYA, but with a pleasantly simple web-based interface and without the radical peer-to-peer architecture. On the other hand, it’s hard to say for sure, since the alpha is currently only open to a few fortunate souls, and details are scarce. Hopefully I’ll get a chance to check it out soon.

In the meantime, Dabble DB has a lot of potential, especially since they recently launched their free Creative Commons version (which made it a viable option for Headway). Hopefully, with a few refinements, they’ll be able to turn it into a compelling alternative to developing custom code any time you want to share some interconnected information.

4 Responses to “Dabble DB: Still sadly short of structured Shangri-La”

  1. Avi Bryant Says:

    Joe, thanks for the kind words and the valuable feedback on Dabble DB. The shortcomings you mention are real, and we’re aware of them and working on them. I take your point about “a box of yellowing printouts” especially to heart: to begin with, Dabble was designed under the assumption that nearly everyone working with the data would be part of a single workgroup and be a registered user with full interactive access to the data. With Dabble DB Commons, we’re experimenting with more public-facing models, but the technology is still playing catch-up.

    One small clarification about our location support: the choice between US States and ISO Country Codes is simply a necessary disambiguation, since both use two-letter abbreviations, often without any other context. We do recognize more than just those two sets of locations. And we will certainly be providing full lat/long support in the future. (Unfortunately, due to their pricey commercial licensing, just “dumping it into Google Maps” isn’t as simple as it sounds - but note that we do already have a KML export for viewing data in Google Earth).

    Thanks again for the feedback,
    Avi

  2. Joe Says:

    Thanks for taking the time to respond, Avi. I was hoping that my quick case study would be useful to you guys (and others working in the same area).

    Your comments about public-facing views make sense, considering the origins of the product. Even so, when I was putting in sample data, I kept assuming that there was a much less cluttered view that made it easy to see your information when you weren’t in the process of editing, without all the extra fields and editing controls. If a data set is truly useful, whether public or private, you’ll likely end up with a disproportionate number of users that simply want to view & explore it without making changes.

    My comments about the states or country codes were trying to express my sense of surprise that the finest level of location granularity that Dabble DB offered was state/province, and only in North America, at that! (The fact that the map view only showed a state map of the US reinforced that impression.) Without more fine-grained detail, the KML feed is going to be of limited usefulness.

    In addition, the KML feed for my sample data appears to be a network link rather than containing the data itself, which means that if I feed it to Google Maps I don’t get anything useful. (To be sure, this is partly a limitation of Google Maps, but a non-network-link KML is more useful to non-Earth consumers.)

    As for not being able to use the Google Maps API, they offer the same terms as Dabble DB (free if your application is free), so it should at least be usable for the Creative Commons version. I don’t know that much about the enterprise maps pricing, so I’m curious to hear more about what makes it impractical for you guys.

    Anyway, thanks again for taking the time to listen; hopefully these will all turn out to be temporary problems, and I’ll be able to recommend Dabble DB without reservation in the future.

  3. Dae Park Says:

    Thank you for your interest in Freebase. Shoot me an email and I will send an invite.

  4. Joe Says:

    Thanks for the offer, Dae, but Kurt beat you to it—I’ve been having a blast wiring up types & topics all afternoon.
