We are the Metadata

I remember a conversation I had seven years ago with a retired Coast Guard pilot who was testing our Flight Record application, which was still in development:

“I just recorded a shipboard landing with a C130 in your application,” he complained.

“Is that a bad thing?” I blinked.

His eyes bulged. “I’d like to see you land a C130 on a Coast Guard cutter!”

I frowned, confused. “What’s a C130?”

He was taken aback. See, to him, a C130 was a great big airplane used to put out forest fires and fly heavy equipment and supplies all over the world:


[Image: C130]

To me, a C130 was:

C130

Because I was the one writing the application, it was the metadata inside my head, or rather the lack thereof, that made me write it so that users could record that they had landed the above behemoth on a ship with barely enough room for one helicopter. With the metadata inside the pilot’s head, we were able to get the proper constraints in place.

For nearly two years now, we have been reengineering our aviation logistics management application to accommodate the Coast Guard’s surface fleet. Our “Electronic Aircraft Logbook” has become the “Electronic Asset Logbook,” and “Aviation Logistics” has become “Asset Logistics.” But these are merely cosmetic changes. The real work has been behind the scenes, rearchitecting our database to make it wholly dynamic. For instance, an old table like this would no longer work:

aircraft_number  aircraft_type  aircraft_model  wing_type
2001             C130           C130J           Fixed
6001             H60            H60A            Rotary
6501             H65            H65B            Rotary

We can’t fit a boat into this table. Boats don’t have “Wing Types,” and they have numerous other attributes we want to record that we don’t have columns for. We could add new columns, but then we end up with lots of empty cells, and more empty cells when we try to put trucks or buildings into the table.

The solution is to make the database generic to any asset type. To do this, we convert the columns to rows:

asset_number  attribute      attribute_value
2001          aircraft_type  C130
2001          asset_group    aircraft
550001        boat_type      RBS
550001        asset_group    boat
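To make the column-to-row conversion concrete, here is a minimal Python sketch. It is not our production code: the function name `to_attribute_rows` and the in-memory lists are assumptions made purely for illustration, and the sample values simply mirror the tables above.

```python
# Hypothetical wide rows, mirroring the old aircraft table above.
wide_rows = [
    {"aircraft_number": 2001, "aircraft_type": "C130",
     "aircraft_model": "C130J", "wing_type": "Fixed"},
    {"aircraft_number": 6001, "aircraft_type": "H60",
     "aircraft_model": "H60A", "wing_type": "Rotary"},
]


def to_attribute_rows(row, key="aircraft_number", asset_group="aircraft"):
    """Turn one wide row into (asset_number, attribute, attribute_value) tuples."""
    asset_number = row[key]
    rows = [(asset_number, "asset_group", asset_group)]
    for attribute, value in row.items():
        if attribute == key or value is None:
            continue  # skip the key column and any empty cells
        rows.append((asset_number, attribute, value))
    return rows


attribute_rows = [r for row in wide_rows for r in to_attribute_rows(row)]

# A boat needs no new columns, only new rows.
attribute_rows += [(550001, "asset_group", "boat"), (550001, "boat_type", "RBS")]

for r in attribute_rows:
    print(r)
```

The point of the sketch is the last step: adding a boat changes the data, not the shape of the table.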

Mind you, I’m seriously over-simplifying this. In reality, a highly normalized database requires countless tables, each responsible for a subset of data. For instance, there would be a table of all possible attributes, a table of possible values for those attributes, a table defining the attributes for an asset group, a table establishing parent-child relationships among attributes, a table of IDs for identifying assets, a table of historical values to log changing attributes (with tables to support it), and so on until you get something that looks like this:


[Image: Normalized Database Detail]
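For a feel of what a small corner of such a schema might look like, here is a sketch using Python's built-in `sqlite3` module. Every table and column name is an assumption for illustration, not our real schema, and it covers only a handful of the tables listed above (no history, no parent-child relationships).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
CREATE TABLE asset_group (name TEXT PRIMARY KEY);
CREATE TABLE attribute (name TEXT PRIMARY KEY);
-- Which attributes an asset group is allowed to carry.
CREATE TABLE asset_group_attribute (
    asset_group TEXT NOT NULL REFERENCES asset_group(name),
    attribute   TEXT NOT NULL REFERENCES attribute(name),
    PRIMARY KEY (asset_group, attribute)
);
CREATE TABLE asset (
    asset_number INTEGER PRIMARY KEY,
    asset_group  TEXT NOT NULL REFERENCES asset_group(name)
);
-- The column-to-row table: one row per asset per attribute.
CREATE TABLE asset_attribute_value (
    asset_number INTEGER NOT NULL REFERENCES asset(asset_number),
    attribute    TEXT NOT NULL REFERENCES attribute(name),
    value        TEXT NOT NULL,
    PRIMARY KEY (asset_number, attribute)
);
""")

conn.executemany("INSERT INTO asset_group VALUES (?)", [("aircraft",), ("boat",)])
conn.executemany("INSERT INTO attribute VALUES (?)",
                 [("aircraft_type",), ("wing_type",), ("boat_type",)])
conn.executemany("INSERT INTO asset_group_attribute VALUES (?, ?)",
                 [("aircraft", "aircraft_type"), ("aircraft", "wing_type"),
                  ("boat", "boat_type")])
conn.executemany("INSERT INTO asset VALUES (?, ?)",
                 [(2001, "aircraft"), (550001, "boat")])
conn.executemany("INSERT INTO asset_attribute_value VALUES (?, ?, ?)",
                 [(2001, "aircraft_type", "C130"), (2001, "wing_type", "Fixed"),
                  (550001, "boat_type", "RBS")])

# Which attributes may a boat carry? The answer comes from data, not column names.
boat_attributes = [row[0] for row in conn.execute(
    "SELECT attribute FROM asset_group_attribute "
    "WHERE asset_group = ? ORDER BY attribute", ("boat",))]
print(boat_attributes)
```

Note that the foreign keys here only guarantee that an attribute exists; checking that an asset's attributes actually belong to its group would take a further constraint or an application-level check, which this sketch leaves out.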

Where does metadata fit into this? Metadata is data that describes data, but I dislike this definition because it could be anything. The attributes of an asset are data that describes data, as are the structures that describe the attributes. Even the aircraft or boat itself is metadata in a metaphysical sense, self-referential in the context of the user’s understanding of it.

In practice, although rarely defined as such, metadata is data used internally to inform the application or other applications. It’s not meant for the users. There’s metadata in the source code of this blog, which search engines use to categorize its content. In terms of our application, metadata is anything we use to inform the business layer, and the dream is to have a completely generic application that can be built entirely from metadata descriptions.
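As a rough illustration of metadata informing the business layer, here is a Python sketch of the constraint from the C130 story. The rule itself (only rotary-wing assets may record a shipboard landing), the dictionaries, and the `can_record` function are all hypothetical; they stand in for whatever the real application reads from its metadata tables.

```python
# Hypothetical metadata describing assets (normally read from the database).
ASSET_ATTRIBUTES = {
    2001: {"asset_group": "aircraft", "aircraft_type": "C130", "wing_type": "Fixed"},
    6001: {"asset_group": "aircraft", "aircraft_type": "H60", "wing_type": "Rotary"},
}

# Hypothetical metadata describing events: attribute name -> allowed values.
EVENT_RULES = {
    "shipboard_landing": {"wing_type": {"Rotary"}},
}


def can_record(event, asset_number):
    """Return True if the asset satisfies every attribute rule for the event."""
    attributes = ASSET_ATTRIBUTES[asset_number]
    required = EVENT_RULES.get(event, {})  # events without rules are unrestricted
    return all(attributes.get(name) in allowed
               for name, allowed in required.items())


print(can_record("shipboard_landing", 2001))  # the C130
print(can_record("shipboard_landing", 6001))  # the H60
```

Nothing in `can_record` knows what a C130 or a cutter is. The knowledge lives in the two dictionaries, which is exactly the part a human still has to supply.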

But for now, all metadata still comes back to us human developers, who must know what to do with it functionally. There are very few algorithms flexible enough to run completely dynamically on whatever metadata is fed to them, but this does not make metadata “metacrap” as Cory Doctorow described it.* Metadata categorizes my posts and pictures, and it is the only viable strategy for constructing the semantic web. One day, we will produce an Artificial Intelligence that can learn metadata, creating a strange loop that will raise it to the level of being our peer.

Until then, it’s back to tagging photos and posts for me.


* Doctorow totally loses prescience points with that essay, but it’s easy to identify his error in logic. If metadata won’t work because people will abuse it, then by that same logic, the Internet and e-mail wouldn’t work because too many people would be trying to game them. Metadata works for the same reason the World Wide Web works: the people who want it to work far outnumber the people trying to overthrow it.

