My Big-Arse Text File - a Poor Man's Wiki+Blog+PIM - The Experiment-Driven Life Blog - Matthew Cornell. Programmer, Research Software Engineer, Think, Try, Learn

My Big-Arse Text File - a Poor Man's Wiki+Blog+PIM

Sunday, August 21, 2005 at 11:32AM

I was excited to read this article (found via the unparalleled 43 Folders) describing one user's experiment with using a single (eventually large) text file to organize his stuff. It's an interesting read for me because I've been using a plain text file for my professional log/diary/journal/notes since Thu Sep 28 10:57:09 EDT 2000. In this post I'd like to talk about how I use the file, in hopes that it will give me some motivation and ideas.

FYI, my current file (see description next) has ~14,000 lines (~0.5MB), and my previous non-wiki file had ~55,000 lines (~1.5MB).

History

I've been using a single file for my professional ProgrammersNotebook since at least 1997. Initially it was an MS Word file (back in my Windows days), but when I moved to Linux I switched over to a simple ASCII file, which I edit in Emacs. The reason I used Word was its outliner - I organized the file by making each day a level 1 entry, and I listed them in reverse chronological order so that I could start at the top when adding the latest entry. The outliner let me structure the file a bit by breaking multi-line activities into separate entities. (Hey - sounds like the GTD principle of making something a project if it requires more than one next action. I'm in trouble - everything has GTD overtones these days...)

I set up the first ASCII file using Emacs' Outline Mode, organized just like the Word file - reverse chronological, with structure via nesting. Incremental search let me find (sometimes painfully) the items I needed, such as shell script notes, code snippets, and what I had been spending my time on. The problem was that it didn't allow linking and tagging, two of the primary ways I structure information.

So on 2004-07-19 I moved to a second format (still ASCII edited via Emacs), which I described in my post on Photo Blogs, Wikis, and Memories for Life. Briefly, the file has simple entries separated by '----' and a time-stamp at the end. For example:


  ----
  talked w/PersonOne re: Google-style undergraduate programming
  contest. not clear what the topic should be. also talked about future
  fun projects. one possibility: ProxIncrVisualization
  (2005-08-19 12:37:48)
  ----
  continued moving information over to planner. ugh- the undated pages
  are a pain! wrote PersonTwo re: help, or maybe ordering a dated
  set. PaperPlanners
  (2005-08-19 08:32:37)
  ----
  MUS: http://www.sourcewatch.org/index.php?title=SourceWatch
  CitizenshipOversightProject
  (2005-08-19 08:29:25)
  ----
  ...
  ----

The big improvement is linking and tagging via WikiCase (AKA CamelCase or WikiWords). This helps me navigate and find needed information. Of course it opens up another issue, that of consistent tagging. But we'll save that for later. The only other formatting I use in the file is a) I define an entry by placing a WikiWord on the first line by itself, and b) I have some shortcuts for words. The shortcuts are short uppercase words (two to four letters) that start a line and end with a colon (':'). My current ones include IN (inbox), MUS (Might Be Useful), IDEA, COOL, and OFF (vacation leave). Finally, URLs are treated specially - I don't mark them up, I just paste them verbatim.
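To make the format concrete, here is a minimal Python sketch of a parser for it. It is only an illustration: the regular expressions, the Entry class, and the parse_entries name are assumptions inferred from the sample entries above, not code I actually use.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed patterns, inferred from the sample entries above.
SEPARATOR = "----"
TIMESTAMP = re.compile(r"^\((\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\)$")
# A WikiWord: two or more capitalized "humps" run together (PaperPlanners).
WIKI_WORD = re.compile(r"\b(?:[A-Z][a-z0-9]+){2,}\b")
URL = re.compile(r"https?://\S+")
# A shortcut: a short uppercase word plus a colon at the start of an entry.
SHORTCUT = re.compile(r"^([A-Z]{2,4}):\s")


@dataclass
class Entry:
    text: str
    timestamp: Optional[str] = None
    shortcut: Optional[str] = None
    wiki_words: List[str] = field(default_factory=list)
    urls: List[str] = field(default_factory=list)


def parse_entries(raw: str) -> List[Entry]:
    """Split the file on '----' lines and pull out the pieces of each entry."""
    entries: List[Entry] = []
    chunk: List[str] = []
    # The extra separator flushes a final entry that lacks a closing '----'.
    for line in raw.splitlines() + [SEPARATOR]:
        line = line.strip()
        if line != SEPARATOR:
            chunk.append(line)
            continue
        lines = [l for l in chunk if l]
        chunk = []
        if not lines:
            continue
        entry = Entry(text="")
        stamp = TIMESTAMP.match(lines[-1])
        if stamp:  # the time-stamp is the last line of the entry
            entry.timestamp = stamp.group(1)
            lines = lines[:-1]
        entry.text = " ".join(lines)
        shortcut = SHORTCUT.match(entry.text)
        if shortcut:
            entry.shortcut = shortcut.group(1)
        entry.urls = URL.findall(entry.text)
        # Drop URLs first so CamelCase inside a URL is not counted as a tag.
        entry.wiki_words = WIKI_WORD.findall(URL.sub(" ", entry.text))
        entries.append(entry)
    return entries
```

Calling parse_entries on the file's contents returns a list of Entry objects in file order. Removing URLs before looking for WikiWords is a design choice of this sketch, so that "SourceWatch" inside a pasted URL is not treated as a tag.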

Together these merge (in a very low cost way) some of the good ideas from Wikis, Blogs, and PIM tools, with the simplicity of a text file. (There's a nice discussion of them here.)

Emacs customization

Well, not much really. All I have are keystrokes that create a new time-stamped entry and grab a URL's title. In addition I use the usual Emacs features like 'occur', interactive highlighting, and especially hippie-expand. I'd like to do more, but I just haven't had time.
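For anyone not using Emacs, the "new time-stamped entry" keystroke is easy to approximate. The following Python sketch is an assumption about how that could work, not my actual Emacs code; format_entry and prepend_entry are hypothetical names, and it assumes the newest entry goes at the top of the file, as in the sample above.

```python
from datetime import datetime
from pathlib import Path


def format_entry(text, now=None):
    """Return one entry: separator, body, then a time-stamp in parentheses."""
    now = now or datetime.now()
    return f"----\n{text.strip()}\n({now:%Y-%m-%d %H:%M:%S})\n"


def prepend_entry(path, text, now=None):
    """Put a new entry at the top of the log (assumed newest-first order)."""
    path = Path(path)
    # A brand-new file gets a lone separator so the entry is closed off.
    old = path.read_text(encoding="utf-8") if path.exists() else "----\n"
    path.write_text(format_entry(text, now) + old, encoding="utf-8")
```

Rewriting the whole file on each new entry is fine at this scale, since the file is well under a couple of megabytes.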

Isn't this just a cheap RDBMS?

At first glance, yes, it's just a text-based list of free-form records, which could be stored in a Relational Database System. (Actually, I helped build a new kind of database (Proximity) that directly supports representing semi-structured information like this, but that's another story.) My main reasons for not using a database are:

Easy to set up.
Customizable editors already available (easy to view, merge, format, search, edit, etc.)
Easy to back up.
Easy to write simple external tools to analyze, view, etc.
Supports schema changes.
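As an example of how small such an external tool can be, here is a hedged Python sketch that counts how often each WikiWord appears in the file. The regular expressions and the tag_counts name are assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed WikiWord pattern: two or more capitalized "humps" run together.
WIKI_WORD = re.compile(r"\b(?:[A-Z][a-z0-9]+){2,}\b")
URL = re.compile(r"https?://\S+")


def tag_counts(raw):
    """Count WikiWord occurrences, ignoring CamelCase inside pasted URLs."""
    return Counter(WIKI_WORD.findall(URL.sub(" ", raw)))
```

Something like tag_counts(text).most_common(20) gives a quick view of the most-used tags. Scanning the rarely used ones is one way to spot inconsistent tagging, such as a singular and a plural form of the same tag.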

Analysis and future

All I'll say here is that I use the file in a few basic ways. (See The Design and Long-Term Use of a Personal Electronic Notebook: A Reflective Analysis - AKA A Personal Electronic Notebook, by Thomas Erickson - for a great analysis of a personal journal tool the author built then used.) Mostly I use it to capture ideas, notes, URLs, and work activity like tasks, coding, and email. I've made myself enter every single URL that I come across that I think might even remotely be useful, because many times I've had to spend a LONG TIME trying to find something I've seen before. (Related: Stuff I've Seen and Keeping Found Things Found.)

To do nicer navigation and browsing I wrote a simple Java program (I used Jetty) to load the file's entries into RAM, show them chronologically, allow search, and turn WikiWords and URLs into links. I've used it a bit, but haven't been motivated to do more.
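The link-generation step is the only mildly fiddly part. Here is a rough sketch of the idea in Python rather than the Java I used; the linkify function, the TOKEN pattern, and the /tag/ URL scheme are all hypothetical.

```python
import html
import re

# URLs are tried first so CamelCase inside a URL is not linked separately.
TOKEN = re.compile(
    r"(?P<url>https?://\S+)|(?P<wiki>\b(?:[A-Z][a-z0-9]+){2,}\b)"
)


def linkify(text, tag_href="/tag/{}"):
    """Escape entry text for HTML and wrap URLs and WikiWords in links."""
    out = []
    pos = 0
    for m in TOKEN.finditer(text):
        out.append(html.escape(text[pos:m.start()]))  # plain text in between
        token = html.escape(m.group(0))
        if m.group("url"):
            out.append(f'<a href="{token}">{token}</a>')
        else:
            out.append(f'<a href="{tag_href.format(token)}">{token}</a>')
        pos = m.end()
    out.append(html.escape(text[pos:]))
    return "".join(out)
```

One known rough edge: the URL pattern takes everything up to the next whitespace, so punctuation typed directly after a pasted URL ends up inside the link.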

I think there's a great idea for a Journal Construction Kit that supports the emergence of customized specialization (see Jot for a commercial effort in this area). Here's a question: Is a general tool to support this kind of activity possible? Maybe it would be similar to JetBrains' Meta Programming System, but for information. Related: Chandler, and these two articles by Martin Fowler.

I'd love to hear from others who have created customized journal tools that support these features. I'm not excited by Emacs programming, I'm just trying to get work done. Any thoughts would be appreciated!

Reader Comments (35)

Thanks for the blog entry. I too have used a straight ascii text file for logging notes. I have tried many other systems but keep coming back to a simple text file. My log entries start with a date and a list of keywords, then the text, then actions, like so:

#_2008.09.08
@a_PIM @c_ANOTHER-TAG
A bunch of text, perhaps referencing where related material is stored...whatever.
:TODO Call whomever...

The tags are denoted by @, then a major category ('a' for admin). Actions are denoted by :.
I use a bit of C code to extract the actions, or all entries that match a keyword, and I run it via a macro. I use EditPlus on a Windows machine.

This gives me a chronological stream which I find useful since I tend to remember things related to a rough time frame. I find this more reliable than a particular folder for a given topic. At worst, the log contains sufficient tags and references that I can find the item quickly in the log and from there where related info is physically stored.

Since I am often at meetings away from my computer, I have a paper log as well, plus associated reports and other info related to the item. Periodically, I make short log entries based on the paper log and scan the paper pages. That way I can retrieve information when on the road even when not connected to the internet.

Works for me. I would like to embed the C code functionality (filtering) into the editor so that I can filter entries in the editor's view to show ONLY a subset of the log. Then I can add new entries with the context of past related entries in view. With EditPlus I can run the C code to open up a report file based on a given keyword but I cannot 'edit in place', if you know what I mean. I have to flip between windows, find the location of the old entries, make edits, etc. Fussy. Small quibble. DayNotez from www.natara.com is a very workable comparable solution that works on Palm, PocketPC and Windows. Mac too I think. I don't like being trapped in their file format but should they disappear, the data can be exported to text and a bit of code can be used to reformat it to your liking.

Cheers,
Bill

September 8, 2008 |

Bill Garland

Bill,

Interesting that you integrated tags explicitly. Makes more sense than my (strictly) camel-case convention. This assumes there are tag operators like find, rename, find all connections to/from, etc.

Re: actions, though, I explicitly decided not to integrate the two. My GTD-like system benefits from a dedicated tool. So mine is purely PIM.

macros... Yes; rapid entry a must. Worst case # keystrokes for mine: 5 to open file + 1 go top + 2 new entry + 2 position ... then *finally*, type content. 10 is NOT optimal, but close enough that I won't invest in improving it.

Re: paper log, same use here. Most of the time I'm at the computer when processing stuff, so I enter directly. And I always have paper capture tools on hand - a legal pad at the desk for phone calls, legal pad in briefcase for trips/meetings, and a tiny pastel (I have an 8 year old who picked it out) notebook with perforated pages. These I toss into my inbox, then when processing log it ("talked w/Nancy re: Cisco") and extract any action and text I want to keep (e.g., blog ideas :-) This use corresponds almost exactly (IIRC) to a nice use of the Paper Tiger program for Windows.

Thanks for the www.natara.com tip.

Cheers!

P.S. Cool lab - Nuclear Engineering. I was out near the [ Idaho National Laboratory | https://inlportal.inl.gov ] last month, FYI.

September 8, 2008 |

matthewcornell

I keep a bunch of plain text files in a directory, searchable using jEdit's HyperSearch feature. The Outline parser plug-in offers a dockable tree view of text structured with explicit or indent folding.
Other useful plug-ins are SuperAbbrevs (folding header templates), CandyFolds (folding visualization) and more. Data synchronization is easy with rsync (it's just plain text files). Regarding GTD, I use a system similar to the one explained here: http://www.tobinharris.com/2007/7/27/jedit-for-really-simple-getting-things-done
www.jedit.org
http://plugins.jedit.org/plugins/?Outline

October 24, 2008 |

Carlos Gomez

Thanks for the pointer, Carlos. I like the auto-search for lines starting "*" for next actions. I couldn't find much on SuperAbbrevs, though. I suspect creating this kind of solution in Emacs is straightforward, but I'm at a point where I can do most everything, i.e., golden handcuffs :-)

October 24, 2008 |

matthewcornell

Hi Matt,
I just use SuperAbbrevs in jEdit to quickly insert explicit folding structures in the text file I'm working on. Emacs should have a similar feature.

This plugin enables you to tab-expand an abbreviation with variables in it. After the abbreviation is expanded, you can use the tab key to jump between the variables. If you change the content of a variable in the abbreviation, all the variables with the same name will change accordingly. This concept is already known from Eclipse and TextMate.
http://plugins.jedit.org/plugins/?SuperAbbrevs

October 24, 2008 |

Carlos Gomez

[from http://www.tanitanis.com/ ]

Thanks genehack and sacha for the great Emacs pointers. I'll definitely check them out.

October 3, 2010 |

Matthew Cornell

[From peterson]

However, I'm comfortable with Emacs. You might want to look at the Emacs add-on mentioned by others in the comments.

October 3, 2010 |

Matthew Cornell

Very nice good work thanks

October 5, 2010 |

Desene online

Thanks for reading, and for your comment.

November 23, 2010 |

Radyo Dinle
