Sunday, September 25, 2005

Organizing Electronic Documents GTD-Style?

Over at Lifehack.Community user anithri asks "How do I organize a large and growing collection of Electronic documents?":
I have a collection of 200+ PDF's, Word docs, text files...It's easy to find one if I know the name of what I'm looking for already, but opening a large number of them looking for what I'm currently interested in is getting very old very quickly.

What I'd ideally like is an application that allows me to "tag" my files ala del.icio.us or flickr.com and then allow me to pull up lists of all files with a particular tag.
This is a big problem that's near and dear to my heart, and one that hasn't been adequately addressed yet. It's a huge topic (the British Computer Society recently called "Memories for life" one of the Grand Challenges in Computing), but I wanted to briefly: a) observe that current techniques are missing the point (relationships), and b) ask whether a GTD-style A-Z reference system applies to the digital realm.


Current Filing Techniques Aren't Relational

The two suggestions given in response to the Lifehack article ("use Spotlight as the tagging system", and "look into Google Desktop") are based on an IR-style index-and-search approach, also discussed in The Death of Folders? and The File Manager Is Dead. Long Live the Lifeblog. However, I think these approaches are missing one of the fundamental concepts about our information: It is connected. Among other things, documents relate to:
  • people (e.g., about them (incl. photos), received from them, or sent to them),
  • events (e.g., prepared for, or received during), or
  • projects (e.g., supporting information or output artifact)
In fact, it's hard for me to think of any document that simply exists by itself; i.e., context gives documents much greater meaning. The recent move towards tagging tries to leverage connections in an ad hoc manner - by providing support for arbitrary keywords, a form of linking in which connections are expressed via sharing the same keyword(s). However, this impoverished form of linking has limitations, including not supporting attributes on the links themselves. More on this later, but I want to leave you with some related links.
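As a rough illustration of the tagging-versus-linking distinction above, here is a minimal Python sketch. The `Link` class, the file names, and the relation labels are all made up for this example; they are not taken from any existing tool. The only point is that a shared tag tells you two files belong together, while an explicit link can also say how.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """An explicit connection from a document to a person, event, or project."""
    source: str    # hypothetical file name
    target: str    # the person, event, or project it relates to
    relation: str  # attribute carried by the link itself


# Tagging: documents are connected only by sharing a keyword.
tags = {
    "budget.xls": {"nsf-site-visit-2005"},
    "slides.ppt": {"nsf-site-visit-2005"},
}

# Linking: each connection also records *how* the two things are related.
links = [
    Link("budget.xls", "nsf-site-visit-2005", "supporting information"),
    Link("slides.ppt", "nsf-site-visit-2005", "output artifact"),
    Link("slides.ppt", "a colleague", "sent to"),
]


def tagged_with(tag):
    """All documents carrying the given tag."""
    return sorted(doc for doc, doc_tags in tags.items() if tag in doc_tags)


def linked_to(target, relation=None):
    """All documents linked to target, optionally filtered by relation."""
    return sorted(
        link.source
        for link in links
        if link.target == target and (relation is None or link.relation == relation)
    )
```

With tags alone, `tagged_with("nsf-site-visit-2005")` can only return both files. With links, `linked_to("nsf-site-visit-2005", "output artifact")` can narrow that down to the slides, which is the kind of question a plain keyword cannot answer.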
A simple alpha filing system for electronic documents?

The other idea this question stimulated is applying David Allen's GTD filing system to the digital realm. I'm currently testing this for email, and it has worked pretty well so far. Briefly, in addition to @action and @waiting-for, I have a top-level email directory for each letter of the alphabet, each of which contains email archive files (mbox files on my unix machine) for each project (e.g., n/nsf-site-visit-2005, p/personal-information-web). Finally, each of those latter files contains the relevant messages. Here's the conceptual map (vertical dimension is 'containment', with the outer-most container at the top):

paper | email
filing cabinet | email system
A-Z divider | a-z top-level directory
file folder | mbox email file
piece of paper | email message
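The mapping from a project name to its place in the a-z layout is simple enough to sketch in a few lines of Python. This is only an illustration of the scheme described above: the function name `archive_path` and the `mail` root directory are assumptions, not part of my actual setup.

```python
import string
from pathlib import PurePosixPath


def archive_path(project, root="mail"):
    """Return <root>/<first letter>/<project> for a project name."""
    name = project.strip().lower()
    # The a-z dividers only cover ASCII letters, so reject anything else.
    if not name or name[0] not in string.ascii_lowercase:
        raise ValueError("project name must start with a letter a-z")
    return PurePosixPath(root) / name[0] / name
```

For example, `archive_path("nsf-site-visit-2005")` gives `mail/n/nsf-site-visit-2005`, matching the `n/nsf-site-visit-2005` example above.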

This works OK - filing is pretty fast, for the same reasons as with the analog GTD version: it's quick to dream up a name, there are only a few places I might have put something, etc. However, due to the email client I use (pine), textual search is pretty difficult. (Side note: When will someone write a simple Lucene Java app to index mbox files? JavaMail has been around forever!) What I'd like to know is how well an analogous system would apply to documents. Maybe I'll give it a try, at least for new ones. However, compared to most people, my electronic document needs are pretty basic - I seem to rely mostly on email, printed documents, Manila folders, and letter size paper. (Yes, it's about as low tech as possible.)
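Short of a real Lucene index, a plain linear scan over the mbox files already goes a long way. The sketch below is not the Lucene/JavaMail app wished for above; it swaps in Python's standard `mailbox` module and does a simple case-insensitive substring match, with no index at all. The function names and the assumption that the archive root contains one directory per letter, each holding mbox files, are mine.

```python
import mailbox
from pathlib import Path


def message_text(msg):
    """Concatenate the text/plain parts of a message into one string."""
    parts = []
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)
        if not payload:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            parts.append(payload.decode(charset, errors="replace"))
        except LookupError:  # unknown charset name in the message
            parts.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(parts)


def search_mbox_tree(root, term):
    """Yield (mbox path, subject) for messages whose subject or body contains term."""
    needle = term.lower()
    # Assumed layout: <root>/<letter>/<project mbox file>
    for path in sorted(Path(root).glob("[a-z]/*")):
        if not path.is_file():
            continue
        box = mailbox.mbox(str(path), create=False)
        try:
            for msg in box:
                subject = str(msg.get("Subject", ""))
                if needle in subject.lower() or needle in message_text(msg).lower():
                    yield path, subject
        finally:
            box.close()
```

Something like `list(search_mbox_tree("mail", "budget"))` would then list every archived message mentioning "budget", along with the project file it lives in. It rereads every file on each search, so it would get slow on a large archive; that is where a proper index would earn its keep.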

As always, comments are welcome.

Reader Comments (25)

I've struggled with the same problem. I have so many documents and emails. Plus, it's never easy to predict what I'll need or when I'll need it. For those times when I want to browse through documents, and for ease in cleaning up old stuff, I use David Allen's simple alpha filing system. To get through so many documents quickly when the need arises, I rely on MSN Desktop Search. It's almost as good as tagging the content.

September 26, 2005 | Unregistered CommenterGadgetComa

Thanks for the information about your usage, GadgetComa. It would be great to get some detail about how you adapted the alpha system.

September 26, 2005 | Unregistered CommenterMatthew Cornell

The screen shots look really interesting, Обзоры софта. Thanks for the pointer. You might be interested in this post, which talks about meta-data and photos:

[ Photo Blogs, Wikis, and Memories for Life | http://www.matthewcornell.org/blog/2005/04/photo-blogs-wikis-and-memories-for.html ]

September 26, 2005 | Unregistered CommenterMatthew Cornell

I've been wondering whether GTD's alpha system would work in digital form, but I was a bit afraid to make the switch. I'm glad to know it worked for you, and I'll be doing it, starting only with my personal email. If the results are good, I'll do the same for my work email.

September 28, 2005 | Unregistered CommenterRicardo Mestre

I'd love to hear how your experiment goes, Ricardo.

September 28, 2005 | Unregistered CommenterMatthew Cornell

Unless I am mistaken, Microsoft's new Vista software will allow you to view your files in a tagging-like environment.

http://www.microsoft.com/windowsvista/clear.mspx

An example of the virtual folders is located at http://www.microsoft.com/presspass/presskits/windowsvista/images/image002.jpg

While it may not be perfect it does appear to be a start.

On a related note, thank you for reminding me that MSN Desktop Search existed, GadgetComa. I'll check it out.

October 2, 2005 | Unregistered CommenterJoseph

Thanks for the screen shot and link, Joseph - very interesting. I've had WinFS on my mind for a while, and it's nice to see things finally moving ahead. And I'm pleased if I was able to help on GadgetComa.

matt

October 2, 2005 | Unregistered CommenterMatthew Cornell

I just got back to my own comment and saw your question about how I use the alpha filing system in conjunction with MSN Desktop Search. Basically, I just file my documents the same way David Allen suggests filing paper documents. I use the term that means the most and create a folder for it. The desktop search tool lets me find anything I want with ease, so there's really no need to file simply to allow easy retrieval. What the filing does give me is the option to browse the content, either to purge old files or to refer to for ideas.

October 3, 2005 | Unregistered Commentergadgetcoma

Thanks for the detail, gadgetcoma.

matt

October 3, 2005 | Unregistered CommenterMatthew Cornell

Haven't heard anyone mention Google Desktop yet. I have a document collection that's in the 4000's and an email archive of 3+ GB (the result of working over 5 years at the same company). Google Desktop has more than once allowed me to get that one specific email or document in less than a minute, rendering most file-system or folder-based ordering moot.
And it does index all kinds of files.

October 25, 2005 | Unregistered CommenterAnonymous

Get a Mac.
It includes Spotlight, which searches the whole computer.

January 5, 2006 | Unregistered CommenterAnonymous

Thanks for the reminder about Spotlight, anonymous. I think search engine-based solutions like Spotlight are useful, but I'm starting to believe that, without explicit connections between information, they're limited for the kinds of uses I need.

January 6, 2006 | Unregistered CommenterMatthew Cornell

There might be company policy or privacy or copyright issues with this, but you could always email all the documents to a Gmail account, with very descriptive subject lines, and create the tagging system of your dreams there. You can tag an email with multiple tags (like del.icio.us).

May 22, 2006 | Unregistered CommenterBookworm

Thanks for the Google suggestion, Bookworm. That would solve the tagging, and maybe using tags consistently would be a form of linking. I would like explicit linking between concepts, so that I could place documents in a personal information network. This would allow finding information in novel (and hopefully useful) ways.

Love the blog, BTW...

May 23, 2006 | Unregistered CommenterMatthew Cornell

Vista does support OS-wide tagging; see:
http://blogs.msdn.com/pix/archive/2006/06/15/632677.aspx
and from a press release:

"• Tagging Files. Windows Vista’s powerful new search and organization features extensively utilize file properties (metadata) to provide users with an even more dynamic way to interact with their information. Users can tag photos in the Windows Photo Gallery, music in Windows Media Player 11, and documents in the Documents Explorer; it’s simple and provides more flexibility in file organization."

Tagging is a very powerful way of organizing, Matt, but you call it "impoverished". Why?

October 10, 2006 | Unregistered CommenterBob Walsh

Hi Bob,

Tagging is a very powerful way of organizing, Matt, but you call it "impoverished". Why? - It goes back to the (controversial?) idea that "it's the links, silly." In other words, as Google's founders realized, the interrelationships between data items are crucial to representing the kinds of real-world data people generate.

Tags are a kind of weak linking - I can "link" two data items (say two files) by tagging them the same, e.g., 'UMass.proposal.2006-10' (a project). But I can't add attributes to the actual connection between the files, right? Therefore I lose information...

I can say more, but not right now! Maybe it would be best to chat. Thanks for reading, and for your comment, Bob.

October 10, 2006 | Unregistered CommenterMatthew Cornell

Adding to all that's been said, I don't think a-z directories do any good for electronic documents, nor do I use individual folders for documents when I can just rename them as I want (if needed, I keep the original name in parentheses).
I'm now experimenting with a desktop wiki (moinmoin desktop) to help me organize my documents. Moinmoin makes it easy to attach documents to a page (you just have to upload them to the "pagename"\attachments directory), and allows you to comment on each "folder" (really a wiki page). Then you can categorize, tag, or even link these pages. Other advantages of a wiki are that it's platform independent, and even scalable to the web if you need to share your files. It seems promising but, as with any file organization system, only time and volume will prove it effective.

November 5, 2007 | Unregistered Commenterpgoes

Hi pgoes,

I don't think a-z directories do any good for electronic documents

Love to hear more about it.

nor do I use individual folders for documents when I can just rename them as I want

Are you saying you keep one big "flat" directory? A fine approach on Unix, but Windows suffers...

experimenting with [ moinmoin desktop | http://moinmoin.wikiwikiweb.de/DesktopEdition ] to help organize documents. comment each "folder", categorize, tag, or even link these pages

Neat! I'd love to hear how it works out

Thanks for the comment.

November 6, 2007 | Unregistered CommenterMatthew Cornell

There's also a great software tool, for Mac, called Together that I use for grouping files in various ways. It's like the iTunes for file organization. It can be found from Reinvented Software: http://reinventedsoftware.com/together/.

December 19, 2007 | Unregistered CommenterSpencer

Thanks for the tip, Spencer. I have a Mac-based client right now who might be interested. I wonder if there's an equivalent for Mail.app?

December 20, 2007 | Unregistered CommenterMatthew Cornell

I have an a-z file structure for my digital documents and have been using it for about a year now. For starters, I am a Mac user and I use Midnight Inbox as my GTD software app. I use Mail.app for email and I only have action, hold, and archive folders. In my documents folder I created an a-z file structure with Quicksilver triggers to the folders I use the most. Documents are filed in the a-z folders based not on the document name but on the project name. After a year of doing this, I'd say I like it, but some of my recurring project folders are quite big and specific files get harder to find.

February 13, 2008 | Unregistered CommenterClayton

Hi Clayton - Thanks for the story. I'm guessing the 'hold' email folder is for Waiting For items? And I've experienced the flat folder growth problem too, at least the "too many to find" part. Having a single huge project folder... I'd suggest breaking the folder down into subfolders, e.g., budget, travel, slides, ...

You would probably get a lot out of Mark Hurst's book "Bit Literacy" (I interviewed him [ here | http://www.matthewcornell.org/blog/2008/01/conversation-with-mark-hurst-web.html ] ). He covers document structuring schemes.

Thanks for the comment!

February 13, 2008 | Unregistered CommenterMatthew Cornell
Thank you for sharing your ideas regarding organized computer files. In this way, we no longer have to spend long hours tracing files in our computers, especially when we are on a tight work schedule. Let us remember that it is not only computer files that need organizing but paper documents as well. To avoid waste and producing too many copies, try maximizing the use of paper and recycle if possible. Unlike computer files that we can easily delete, we should dispose of our paper documents properly.

August 15, 2011 | Unregistered CommenterD_S_S San Antonio

I agree re: paper, D_S_S. It's a classic challenge that hasn't gone away in the computer age. In fact, the rise of personal printers has exacerbated the problem. Thanks for stopping by.

August 27, 2011 | Registered CommenterMatthew Cornell
