IMPORTANT NOTE: Most of what is discussed here has already been implemented in BeOS since 1996; however, the author has never used BeOS and so was not familiar with its capabilities while writing this article.
As technical people, we can all think of a bunch of cunning uses for a database filesystem. My personal dream use would be a superlative code management system; when integrated with a good editor/IDE, it could provide revision control, tagging, searchable documentation, name completion, and probably any number of other things. Imagine being able to search the Doxygen comments for that function you vaguely remember, the one that provides exactly the feature you want. Imagine being able to find every place a method is called so you can tweak its interface. Imagine being able to examine, shuffle and package changesets, as BitKeeper does.
But that is quite a lot to implement all in one go, even as an imaginary system, and it doesn't really show how a general user would be able to take advantage of the tools we want to provide.
Instead, I'd like to focus on the humble email client. Email clients have a number of features that make them interesting here:
- Everyone uses a mail client
- Email messages have a bunch of attributes that can be easily extracted
- Mail clients use a custom database
This observation inspires the following questions:
- If database filesystems are so good, are there any good reasons why no-one has implemented one for email?
- Can we explore the usefulness of a database filesystem by implementing one within a mail client?
- What "killer features" would such a mail client provide, and would they convince users to switch?
The rest of this article tries to find some answers to these questions by creating a specification for a database-backed mail client.
The Mail Database
First of all, we should explore what features a database-backed mail client would provide the user. In a pure email system, we would only need to store two different types of objects: email and addressbook entries. To simplify things, I'm ignoring all the other things, like task items and diary dates, that some mail clients store.
We can divide the attributes for each object into three different categories:
- Intrinsic attributes – These are defined by the objects themselves, e.g.,
  - The sender, date, recipients, subject etc. for an email.
  - The name, email address etc. for an addressbook entry.
- Client attributes – These are invented by the mail client to manage the database objects, e.g.,
  - Object type
  - Unique identifier
  - Per-message flags: draft, sent, unread, deleted etc.
  - Received date
- User attributes – These are attributes that the user maintains, e.g.,
  - Per-object flags, e.g., message has been replied-to, message has been forwarded, message needs response
  - Object category attributes, e.g., message is a personal/work message, addressbook entry is a friend/business associate
  - Custom attributes, e.g., Deal-with-by date
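To make the three categories concrete, here is a minimal sketch of how a single stored object might be modelled. The class name `MailObject`, its field names, and the sample values are all assumptions made for illustration; the article does not prescribe a schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, Optional, Set


@dataclass
class MailObject:
    """One database object: an email or an addressbook entry (hypothetical schema)."""

    # Client attributes: invented by the mail client to manage the object.
    object_id: str
    object_type: str  # "email" or "addressbook"
    client_flags: Set[str] = field(default_factory=set)  # draft, sent, unread, deleted
    received: Optional[datetime] = None

    # Intrinsic attributes: defined by the object itself (headers, contact fields).
    intrinsic: Dict[str, Any] = field(default_factory=dict)

    # User attributes: maintained by the user.
    user_flags: Set[str] = field(default_factory=set)   # e.g. "needs-response"
    categories: Set[str] = field(default_factory=set)   # e.g. "work", "personal"
    custom: Dict[str, Any] = field(default_factory=dict)  # e.g. a deal-with-by date


# A sample email; every value here is made up.
message = MailObject(
    object_id="msg-0001",
    object_type="email",
    client_flags={"unread"},
    received=datetime(2004, 3, 1, 9, 31),
    intrinsic={
        "sender": "alice@example.com",
        "recipients": ["bob@example.com"],
        "subject": "Quarterly report",
    },
    user_flags={"needs-response"},
    categories={"work"},
    custom={"deal_with_by": datetime(2004, 3, 5)},
)

# An addressbook entry uses the same object shape with different intrinsic fields.
contact = MailObject(
    object_id="contact-0001",
    object_type="addressbook",
    intrinsic={"name": "Alice", "email": "alice@example.com"},
    categories={"business associate"},
)
```

Keeping the intrinsic and custom attributes in open dictionaries is one possible design choice: it lets both object types share a single shape without fixing the attribute set in advance.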
We want to use the message attributes to help a user organize their email in ways that weren't possible with the old folder paradigm. For example, the user might want to:
- Set a "Needs reply" flag to see which messages still need a response.
- Set a "Deal with by" date to record any deadline imposed by the message, along with a "Completed" flag to set once the task is done.
- Set flags indicating that the message is work/personal/etc.
The user can use these new attributes to manage his email in lots of new and interesting ways, for example,
- The user can find all messages that have been waiting for a reply for longer than a week
- The user can find all messages with imminent deadlines
- The user can find all work messages from a particular sender
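The three searches listed above can be sketched as simple filters over message attributes. This is only an illustration: messages are plain dictionaries, and the key names (`flags`, `deal_with_by`, `completed`, `categories`), the flag string `needs-reply`, and the two-day definition of "imminent" are all assumptions, not part of any real mail client.

```python
from datetime import datetime, timedelta


def waiting_for_reply(messages, now, max_age=timedelta(days=7)):
    """Messages flagged 'needs-reply' that were received more than max_age ago."""
    return [m for m in messages
            if "needs-reply" in m["flags"] and now - m["received"] > max_age]


def imminent_deadlines(messages, now, window=timedelta(days=2)):
    """Uncompleted messages whose deadline falls between now and now + window."""
    return [m for m in messages
            if m["deal_with_by"] is not None
            and not m["completed"]
            and now <= m["deal_with_by"] <= now + window]


def work_messages_from(messages, sender):
    """Messages categorized as 'work' that came from the given sender."""
    return [m for m in messages
            if "work" in m["categories"] and m["sender"] == sender]


# Made-up sample data.
now = datetime(2004, 3, 10, 12, 0)
messages = [
    {"id": 1, "sender": "alice@example.com", "received": datetime(2004, 3, 1),
     "flags": {"needs-reply"}, "categories": {"work"},
     "deal_with_by": None, "completed": False},
    {"id": 2, "sender": "bob@example.com", "received": datetime(2004, 3, 9),
     "flags": {"needs-reply"}, "categories": {"personal"},
     "deal_with_by": datetime(2004, 3, 11), "completed": False},
    {"id": 3, "sender": "alice@example.com", "received": datetime(2004, 3, 8),
     "flags": set(), "categories": {"work"},
     "deal_with_by": datetime(2004, 3, 30), "completed": False},
]

stale = [m["id"] for m in waiting_for_reply(messages, now)]
urgent = [m["id"] for m in imminent_deadlines(messages, now)]
from_alice = [m["id"] for m in work_messages_from(messages, "alice@example.com")]
```

In a real database-backed client these would presumably be index-backed queries rather than linear scans, but the attribute logic would be the same.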
One attribute type that I haven't mentioned is an explicit message folder. Instead we can produce a folder-like hierarchy using any set of attributes. But will the user want to sort his email into a hierarchy? Considering the precedents – current mail clients, hierarchical databases and filesystems, DNS, taxonomy and any number of other examples – I think we can safely assume that the need to categorize objects into a hierarchy is hardwired into the human brain.
I can think of two approaches to producing a hierarchy from object attributes. First of all, we can categorize objects using a subset of the available attributes. At each level of the hierarchy, we choose an attribute, and assign messages into subcategories using that attribute.
This hierarchy is very simple to achieve but its usefulness is probably limited. Most attributes aren't suitable. Who would want to categorize their messages using the message ID? How would we use a multi-valued attribute such as recipients? Even the originator will only be useful under limited circumstances.
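The first approach, choosing one attribute per level, can be sketched as a recursive grouping. The function name `build_hierarchy` and the dictionary-based messages are assumptions for illustration, and the sketch handles only single-valued attributes; it leaves the multi-valued question raised above unanswered.

```python
def build_hierarchy(messages, attributes):
    """Group messages into nested dicts, one level per attribute name.

    With no attributes left, the leaf is simply the list of messages.
    """
    if not attributes:
        return list(messages)
    first, rest = attributes[0], attributes[1:]
    groups = {}
    for m in messages:
        # Messages lacking the attribute end up under the key None.
        groups.setdefault(m.get(first), []).append(m)
    return {value: build_hierarchy(group, rest)
            for value, group in groups.items()}


# Made-up sample data.
messages = [
    {"id": 1, "category": "work", "sender": "alice@example.com"},
    {"id": 2, "category": "work", "sender": "bob@example.com"},
    {"id": 3, "category": "personal", "sender": "alice@example.com"},
    {"id": 4, "category": "work", "sender": "alice@example.com"},
]

# Top level: category. Second level: sender.
tree = build_hierarchy(messages, ["category", "sender"])
work_from_alice = [m["id"] for m in tree["work"]["alice@example.com"]]
```

Changing the order of the attribute list produces a different "folder" layout over the same messages, which is the main attraction of deriving the hierarchy instead of storing it.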
The second option is to use a specific user-defined category attribute. The user enumerates all possible values of this attribute and assigns messages to their appropriate categories as he sees fit. To produce a hierarchy, we divide the category attribute into fields, with each field used to categorize objects at a given level in the hierarchy.
The most useful solution would probably be a combination of these two. At the highest level, the user would want to see their messages categorized using the message flags to produce categories like unread and uncategorized messages, messages waiting to be sent, deleted messages etc. Afterwards, it is probably sufficient to arrange messages according to the single category attribute.
Note that with this scheme, we no longer guarantee that the message categorization is disjoint – a given message can exist in more than one category. In fact it might be useful to make the category attribute multivalued. After all, not every message is easy to pigeonhole.
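A rough sketch of the second approach with a multivalued category attribute follows. It assumes each category value is a string whose fields are separated by "/" (for example "work/projects"); that separator, the category names, and the node layout are hypothetical choices made only for this example.

```python
def new_node():
    """An empty hierarchy node: sub-categories plus the messages filed here."""
    return {"children": {}, "messages": []}


def build_category_tree(messages, sep="/"):
    """File each message under every one of its category values.

    Each field of a category value selects one level of the hierarchy.
    """
    root = new_node()
    for m in messages:
        for category in m["categories"]:
            node = root
            for part in category.split(sep):
                # Create the sub-category on first use.
                if part not in node["children"]:
                    node["children"][part] = new_node()
                node = node["children"][part]
            node["messages"].append(m["id"])
    return root


# Made-up sample data; message 2 belongs to two categories.
messages = [
    {"id": 1, "categories": ["work/projects"]},
    {"id": 2, "categories": ["work/admin", "personal/friends"]},
    {"id": 3, "categories": ["personal/friends"]},
]

tree = build_category_tree(messages)
friends = tree["children"]["personal"]["children"]["friends"]["messages"]
admin = tree["children"]["work"]["children"]["admin"]["messages"]
```

Message 2 appears under both "work/admin" and "personal/friends", which illustrates the point that the categorization is no longer disjoint.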



