Large-Scale Info Managers and Scriv

Maria · July 10, 2007, 11:40pm

Hi,

the problem is not so much the license, it is the way Nifty Box recognizes the tags at the moment. It should be possible to connect the iBook to the desktop as a hard disk and tag as if it was an external disk.

The reason is that Nifty Box uses CoreServices and stores the data not only in the Finder comments field but also in an SQL database (very much like Aperture). That makes it incredibly fast and reliable, even if you move the files. But you depend on the database if you want to keep consistency while tagging. The developer is working on that problem and wants the app to scan disks for files with Nifty Box tags (they are included in one identifier tag in the comment field) and compare them with the data in the database. Then you could even tag on other computers and move files; they will be updated, and everything is kept consistent.

The solution is so charming in my eyes because in the end it is all about simple text in the comment field. You never need Nifty Box, you can always use what you have tagged on one computer with Spotlight searches on another.

My G5 is my main machine. All data are collected on this machine; I only sometimes copy data I work on or need for a presentation to my iBook. This workflow is ideal: tag on the main machine, use tagged files on any other, but only change them on the main machine. In short, I do not organise metadata on the second computer (the iBook), I just work with the data.

This restriction won’t apply in the future. And anyway, the whole idea works without Nifty Box as well. Only in my set up, the application is a tremendous aid in keeping my evolving tagging system simple and consistent.

Many words; I hope I could express what I meant.

Maria

autodidact · July 12, 2007, 6:23am

Maria,

thank you for posting the details on your directory structure and the process you go through to organize your files. I will need to tailor my own process, but I enjoy reading about the methods and processes that others use.

One more question, what advantage do you find in using a date directory structure (vs. subject or topic)?

I found an article at academHack blog on tagging files. http://academhack.outsidethetext.com/home/?p=154
The author gives a general view on tagging files. But more important, he has links to another site which outlines a more detailed process. Look at the section titled “If you are on a Mac—a better way” for these links.

–Carlos

Maria · July 12, 2007, 6:33am

Carlos,

Thanks for your interest, questions and the link. I will read it.

The advantage of using folders by date of tagging over topic is that I do not have to think about which topic I want to apply to a file. When I have the RFA analysis of pottery from a site in South Kyushu, I could file it into folders like “archaeometry”, “pottery”, “material analysis”, “South Kyûshû”, “Kofun period”, etc. This is why I use tags. So I do not need the topic and just dump everything into one place. I could of course give up creating folders for the month of tagging. This is just to keep the amount of files in one folder acceptable for me as a human being.

By the way, I realized that it is helpful to have some basic knowledge in data modeling and analysis. Tagging is something very different on the one hand, but with classical knowledge and training, the tagging system becomes complete and consistent.

All the best,
Maria

Addendum: I had a look at the sites, which are basically what I read when I started the system outlined here. The software mentioned in text and comments I tried, but I cannot recommend it because they all lack the ability for really consistent and complete tagging. I cannot come to terms with Quicksilver, although there are fans and it seems to be the only solution that does almost everything needed for serious tagging (but there was something I do not remember well, that was missing). So I came up with this other solution, though the additional software is just a front end to keep me on the safe side when tagging. The basis of the system is setting up a reasonable tagging system for the comment field, applying the tags in a controlled manner and using Spotlight combined searches for tags and other properties as well as content. No software needed. And I never found data as reliably as with this set up.
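As a rough illustration of the "Spotlight combined searches" idea, here is a minimal sketch that assembles an `mdfind` command line matching several tags in the Finder comment (the `kMDItemFinderComment` attribute) plus an optional extra condition. The function names and the example tag spellings are assumptions made for illustration only; nothing here reflects how Nifty Box itself writes or reads its tags.

```python
def spotlight_query(tags, extra=None):
    """Build a Spotlight query that requires every tag in the Finder comment.

    Assumption: tags are plain strings without double quotes.
    The trailing 'c' makes each comparison case-insensitive.
    """
    clauses = [f'kMDItemFinderComment == "*{tag}*"c' for tag in tags]
    if extra:
        # Any further raw Spotlight condition, e.g. on content type.
        clauses.append(extra)
    return " && ".join(clauses)


def mdfind_command(tags, folder=None, extra=None):
    """Return the argument list for mdfind; the command is not executed here."""
    cmd = ["mdfind"]
    if folder:
        cmd += ["-onlyin", folder]
    cmd.append(spotlight_query(tags, extra))
    return cmd
```

On a Mac, the returned list could be handed to `subprocess.run`. For example, `mdfind_command(["pottery", "kofun"], folder="/Users/me/Archive")` limits the search to one (hypothetical) folder.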

Maria · July 12, 2007, 11:48am

Hi AmberV,
it seems that the comment is part of the file. See Apple Developer documents:

MetadataAttributesRef.pdf, page 12:

kMDItemFinderComment
Finder comments for this item.
Value Type: CFString
Framework Path: CoreServices/CoreServices.h
Header: MDItem.h
Availability: Available in Mac OS X v10.4 and later.

Maria

michaeltsai · July 12, 2007, 1:26pm

Despite the fact that you can sometimes access them via the metadata APIs, Spotlight comments are actually not stored in the file itself or in Spotlight. Rather, they’re stored in an invisible .DS_Store next to the file.

Maria · July 12, 2007, 1:51pm

Ah, that was it, yes! Thanks for the correction. I remember trying to delete these files in several ways in order to test how stable working with comments would be, but I could not break the system. So, though I understood that there might be a case when these files get lost, I decided that this should be exceptional, chose to work with comments, and absolutely deleted the memories about separate files from my brain. It is getting old (the brain, myself…)

So now I remember why I chose the NiftyBox solution besides its being a front end. I had tested exporting the data and reconstructing comments with AppleScript. It worked well, so I knew in a worst case scenario I would still be on the secure side and could recreate all tags, as long as they were administered in NiftyBox. Wow, I remember going through all that, but I did too much this year; what is done and works is just forgotten!

Maria

AmberV · July 12, 2007, 5:02pm

This discussion on the usefulness of date organised folders prompted me to post a bit more on my filing system. I already discussed how I embed meaning into the file itself, but not how I sort the files.

I came across an interesting index card filing system designed by a neo-luddite from Japan who wanted a better PDA. He carries around a stack of index cards to keep all of his thoughts on, and at the end of the day, moves those cards to a filing system. For filing, he uses Noguchi’s strict chronological system with one exception: Accessed cards are not moved to the front, but marked on the top with a dark pen. This way, when looking at a large stack of cards, you can quickly see which ones have been accessed before. Frequently accessed cards will have a number of marks (he stops at four). So cards are stored strictly chronologically, and highlighted by re-use. Next, when creating the card he makes a mark along the top left in one of four places. The position of this mark indicates what kind of card it is. Is it information researched for some project? An original thought or idea? Something that needs to be done? Et cetera. That is along the top of the card so it can be picked out from a stack. On the front of the card he puts a small icon to denote the type, a title for the card, and the date stamp. The date stamp is also used as a unique identifier for cross-referencing.

And that’s it! But it is surprisingly effective. Knowing roughly when something was written, and what category it is can quickly narrow down a large stack of cards to a handful, and the title at the top of the card makes it easy to relocate the precise card you were looking for. Putting the date in a prominent position makes it easy to re-sort if you accidentally drop a handful of cards.

I tried out this technique for a while, and then decided that while I quite liked capturing my thoughts onto cards during the day, I really wanted things stored digitally in the end. So what I do now is simply type in these cards using MMD, and file them into my archival software. I use the same principles of accessibility. Everything is stored strictly by chronology, and the only visible data is the tag, title, and date. I don’t mark for access because that is only something that is useful in a physical stack of cards where the title cannot be available looking at 1000 of them in a filing drawer. Note, I periodically use card instead of file in the rest of this discussion because I often think of individual files as being cards; sorry for the confusion. Assume I mean files when I say cards, unless I’m specifically talking about physical index cards.

What I have found liberating is the concept of only having four tags. I put things into a year/90-day cycle structure, give it a rough category, title and date, and it is done. No more filing. Filing is drop dead simple. With cards you just put the day’s deck on the top of the stack. Done. With a computer, you just save them into the current date folder. Done.
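To make the "current date folder" step concrete, here is a minimal sketch that works out which folder a file saved on a given day would land in. The post does not say how the 90-day cycles are counted or named, so both are assumptions: cycles are taken as consecutive 90-day blocks starting on 1 January (the fourth absorbs the leftover days), and the `cycle-N` folder name is invented for the example.

```python
from datetime import date
from pathlib import Path


def cycle_folder(root, day):
    """Return root/<year>/cycle-<n> for the given date.

    Assumption: a cycle is a 90-day block counted from 1 January;
    days 271 onward all fall into the fourth cycle.
    """
    day_of_year = day.timetuple().tm_yday
    cycle = min((day_of_year - 1) // 90 + 1, 4)
    return Path(root) / str(day.year) / f"cycle-{cycle}"
```

Calling `cycle_folder("Archive", date.today())` gives the folder to save into; creating it with `mkdir(parents=True, exist_ok=True)` is left to the caller.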

The thing I discovered is that I simply never used extensive tags. I spent hours and hours entering in lists of keywords, and never once used them in an actual recall situation. With this current system, I have never lost a thought. Using date+type+title is exceptionally comprehensive in a single-person situation.

Now, I have gone and sub-typed a bit. For example, my second tag type is @Creative. Since I am very often looking for thoughts on a specific project within the @Creative tag, I’ll jot down the name of the project after that. So the tag becomes @Creative-book-X or whatever. But I am very careful to only make these sub-types an exception.

Another thing I like about this system is that it is well suited for the computer’s file system. The three main filing axes can be embedded right into the file system. For example:

07193687-R-Digital Photographer as Chemist, Photographer, and Darkroom Tech.md

This file name has the date at the front so it will always be hard sorted by date. The ‘R’ means ‘Record’, which covers ideas or thoughts I’ve had; then comes the title, and then the file extension (MultiMarkdown). It might look a little cluttered by itself, but in a list of several hundred, those dashes and numbers at the front form visual columns that the eye can quickly pick out data from. What to use for the tag marker is something I have not completely settled on. I’d love to use Unicode characters, but that is still not completely cross-platform and stable. Perhaps in the future that will be a valid technique. I experimented with numbers, but found letters are easier to pick out in a crowd if the letters are well chosen. C and R look quite a bit different, where 2 and 3 can be missed when rapidly viewing a large list. I use R, C, M, and I. The nice thing is that if you do change your mind, it is very easy to change them all at once using a bulk file renaming tool. So if/when Unicode becomes a valid technique, I can search for -R- and replace it with some icon surrounded by hyphens. For those rare cases where I sub-type:

07193687-R-Dreams-Analysis of highschool reunion theme.md

That way, it doesn’t get in the way of the primary filing columns, but is still in a visible slot. Using Smart Folders, I can search for -Dreams- and quickly isolate all files by sub-type. Or agents in Tinderbox, or whatever.
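For anyone who wants to script against this naming scheme, here is a minimal sketch of a parser for it. It treats the eight leading digits as an opaque stamp and assumes a sub-type is a single word with no spaces, followed directly by a hyphen. That assumption is not from the post, and it means a title beginning with a hyphenated word (say, "Self-portrait ideas") would be misread as having a sub-type. The `Card` type and function name are invented for the example.

```python
import re
from typing import NamedTuple, Optional


class Card(NamedTuple):
    stamp: str               # eight-digit date stamp, kept as text
    kind: str                # single-letter type marker, e.g. "R"
    subtype: Optional[str]   # e.g. "Dreams", or None
    title: str
    ext: str


# stamp-KIND-[Subtype-]Title.ext
CARD_NAME = re.compile(
    r"^(?P<stamp>\d{8})-(?P<kind>[A-Z])-"
    r"(?:(?P<subtype>[^\s-]+)-)?"
    r"(?P<title>.+)\.(?P<ext>\w+)$"
)


def parse_card_name(filename):
    """Split a card file name into its parts, or return None if it does not fit."""
    match = CARD_NAME.match(filename)
    if match is None:
        return None
    return Card(**match.groupdict())
```

Filtering a directory listing by `kind` or `subtype` then becomes a one-line list comprehension, which is roughly what the Smart Folder search for -Dreams- does.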

This is the type of extremely durable meta-data that isn’t going anywhere. With MMD on the inside, and most of the filing data on the “cover,” a directory full of this stuff will last for as long as we have “files” and “folders” on our systems, and conceivably even beyond that. This filename will work on any modern operating system, but to make it even more durable you could replace spaces with underscores and remove grammatical punctuation. This would increase its scope to the web. To reference this card from another card, I would simply type in [ic07193687]. This is its unique number derived from the date (ic- for Index Card). If I combine all of these cards into a single file, that will create a hyperlink between the linked text and this card. It’s also a human usable link. I can double-click, press Cmd-E to move the selection to the search buffer, and press Cmd-G to find the next instance of it. Or I can visually scan the directory for that number and find the card/file manually. Another efficiency benefit of using an archival software (be it integrated with the Finder, or something entirely separate that can recreate the finder structure later), is that you can now search the contents of files as well as the titles. This should not be underestimated, and it is a big reason why I ultimately wanted to stick with digital instead of going all paper.
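The two small transformations mentioned above (a web-safe file name, and the bracketed reference derived from the date stamp) could be sketched like this. The function names are made up, and the exact punctuation rules are an assumption: everything other than letters, digits, underscores, whitespace and hyphens is dropped, then runs of whitespace become single underscores.

```python
import re


def web_safe_name(filename):
    """Replace spaces with underscores and strip punctuation, keeping the extension."""
    stem, dot, ext = filename.rpartition(".")
    if not dot:                      # no extension present
        stem, ext = filename, ""
    stem = re.sub(r"[^\w\s-]", "", stem)       # drop punctuation, keep hyphens
    stem = re.sub(r"\s+", "_", stem.strip())   # spaces -> underscores
    return f"{stem}.{ext}" if ext else stem


def card_reference(filename, prefix="ic"):
    """Return the bracketed cross-reference for a card, e.g. [ic07193687]."""
    stamp = filename.split("-", 1)[0]
    return f"[{prefix}{stamp}]"
```

Because the type letter and hyphens survive, the visual columns at the front of the name are unchanged; only the title part is altered.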

Exporting my knowledge archive to another computer is a simple matter of zipping up the entire top level directory and moving it to another computer. Any further filing on top of this system is gravy, and not unlike the principle I described in an earlier post where modern technology is used to add efficiency to a system that has a low-technology back up. If my archival software fails, I simply lose a little efficiency, but I do not lose any data, and the primary retrieval system still exists.

There are several important concepts for making this system work:

The first carries over from the index card paradigm and that is to keep data parcels small. These files are not big. Rarely are they more than a few paragraphs long. In this sense, they are much like a library index card method. This is the same technique that Tinderbox, Scrivener, and other programs use to keep search results relevant. Since there are very few systems that can actually link to points within a file, you must assume that you can only link to the file itself from other files. If each file is many pages long, your cross-references lose their usefulness. But full files can be placed into the index directory without any problems. I just reference them differently internally so that I know what I am getting into from the cross-reference link. That little ic- in front is part of that. I will rarely link to files on the drive from a card file, because this is prone to link breakage in the long term, but occasionally I will do that. More often, I’ll just copy the source file into the filing directory and give it a new name. Then I know it is going nowhere, and linking to it is not nearly as risky. The main reason I’ll link to an “external” file is if the placement of that file is important. For example, a Scrivener template needs to be in a certain location to work. That it must be there is also a safety mechanism.

Second: Picking out threads can be done using the system itself, not embedding further complexity into the system. To explain what I mean by that, consider the most common method for threading. Say for a while you have a series of ideas revolving around one project. You want to somehow mark that these files/cards are relating to that project. The typical response would be to add a keyword. There are no keywords in this system though, so you would have to increase the complexity somewhere by adding another axis. Now, excepting the rare cases where I create a sub-type, the way I approach this is by creating an index of these cards and storing it as a new file. This might look something like:

07193724-R-INDEX for modern photography techniques.md

The capitalised “INDEX” makes it very easy to spot, and since it is filed using the same techniques as everything else, it is just as easy to isolate an index as it is any other file. This file will simply contain a list of all the files/cards using cross-referencing. Subsequent index files will refer to the prior index file at the top of the list, OR, they can simply add up all of the other lists and then continue the list. The key is that the old ones are not replaced or modified.
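A minimal sketch of generating the body of such an index file from a list of card file names follows. The layout (a bulleted list, with an optional pointer to the prior index on the first line) is an assumption for illustration; the post only says the file contains a list of cross-references and that older indexes are left untouched.

```python
def build_index(card_names, prefix="ic", previous_index=None):
    """Return the text of an index card listing cross-references to other cards.

    card_names: file names such as '07193687-R-Some title.md'.
    previous_index: file name of the prior index card, if any; it is
    referenced at the top rather than edited.
    """
    lines = []
    if previous_index:
        prev_stamp = previous_index.split("-", 1)[0]
        lines.append(f"Previous index: [{prefix}{prev_stamp}]")
        lines.append("")
    for name in sorted(card_names):           # stamp-first names sort by date
        stamp, _, rest = name.partition("-")
        title = rest.rsplit(".", 1)[0]        # drop the extension
        lines.append(f"- [{prefix}{stamp}] {title}")
    return "\n".join(lines) + "\n"
```

The result would be saved as a new file under its own date stamp, so the "never modify what is filed" rule still holds.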

This leads right into the third principle, and that is: Once you file something, you never ever touch it again, except to look at it! That is perhaps the most bizarre in this modern computing age. If you want to add ideas to an older file, you make a new one and cross-reference to the older one. I’ll admit, this one is probably more up to taste than being an absolute concept, but here is the philosophy behind it. The digital age has made modification of data a largely corrective instead of additive task. The original, once modified, is lost. Now in a perfect world that is fine. 99 times out of 100, you changed it for a reason and the old data is no longer necessary. But there is that 1 time where you have an accident and lose data that is now gone forever (unless you are extremely diligent in backing things up), or realise a year later that you really wish you had version one instead of version two, and so on. If you leave things strictly chronologically, where old data is never modified only referenced, then you will always have a full record of any idea’s evolution. In this day and age, when drives are large enough to store most of the textual data in the entire world, it really makes no practical sense to delete or overwrite your own thoughts with new ones. The one weak point in this is that if you take this seriously and never touch old files, you can create “future scanning” blind spots. While back-referencing from newer files is possible, your brain might only remember the original idea formulation’s date, and not the revision date. But the original does not reference the revision, only the revision references the original. For this reason, I will sometimes, very carefully, add future-references to old data cards, if I feel it is important to do so. But I always add these as an addendum, so it is clear that anything below that “line” was written after the original drafting of the file.

Another point of contention for some is whether or not anything should ever be culled from the system. I do a natural sort of culling when I transfer paper cards to digital. During the day I’ll jot down things I need to do; grocery lists and such, and these have no use in a knowledge archive so I do not store them. The individual that developed this system, on the other hand, does store them. He keeps everything, and consequently has thousands upon thousands of index cards in his archives. I believe in the retention of data that seems irrelevant because you never know, but I do draw the line with some things. What I will sometimes do is fold these into my daily diary entry. Sometimes it is fun to go back and see what you picked up at the grocery store ten years ago. You can scoff at your old eating habits or whatever. But to store it in its own file, I personally think that is overkill. I never delete anything I am unsure of though. If there is any hesitation, I save it. This philosophy has saved me many times in the past. We usually hesitate for a reason, and I’ve learned to trust that. I’m sure there is a lot of junk I’ll never need, but I’d rather have a lot of junk than a missing idea. In theory, the filing system should be able to accommodate the junk. If you can still find the important stuff amongst the junk, then it is okay to leave the junk. That is my philosophy on the matter.

Another principle is that this is a personal filing system. When we communicate with ourselves, we can take shortcuts that we could not otherwise take with other people. I think this is part of why the 4 tag category system works as well as it does. The whole concept of using large keyword lists evolved from a need to make data portable between large groups of humans. If you are indexing a catalogue of images, you want to place as many searchable terms as possible, because it is not practical to predict what everyone will think when they search for something. We have one very critical bit of information that another person will never have complete knowledge of: when we first thought of something. The chronological aspect is useless to pretty much anyone but ourselves and maybe a very close friend or two. But to ourselves, it is an incredibly useful map. Once you get the list down to a few dozen entries, finding a title is an exceptionally rapid process, even if we do not remember the title. As for what to use for the four tags, I believe that is a personal choice. I developed my own four-tag system instead of using the original, because one of his tags was GTD. While I do use GTD, I don’t want to store GTD tasks in my knowledge archive. He actually uses the card system as his GTD system, while I use a dedicated program for that. I replaced that tag with “Communications,” which by the way, this entry will be filed under in my system. It could probably be filed as Record, but I’ll remember it in the future as “that forum post I made in the summer of 2007.” Somebody else might just put it in Record if they do not have a communication tag.

It was a little scary adopting this system at first. I discovered that my extensive filing techniques had become a bit of a “safety blanket,” that in no way reflected upon my actual needs. I continue to evolve the system here and there, but am very careful to not tread on the core concepts set forth by the Noguchi method. I’ve been using it for a year now, and there are thousands of files set up this way. I have yet to spend more than a few minutes locating anything in that archive. Usually it is more in the order of seconds.

That, for me, is all I need to continue in its usage.

SteveJ · July 14, 2007, 1:59pm

Amber–

Thanks for offering this; it is the most helpfully provocative discussion of recording and archiving I’ve encountered in a long time. One question, if you’ve got the leisure to field it. When you say:

How is the link created? I’m new to MultiMarkdown, and might be missing a trick. I can see that “[ic07193687]” would produce a link, once the files were merged, if there were an MMD header comprising that string (like “## [ic07193687] ##”), but I don’t see how the bracketed string alone creates the link.

Again, thanks for the long and extraordinarily helpful post.

Steve

AmberV · July 14, 2007, 2:39pm

You’re welcome!

And, you are absolutely right. I did gloss over that detail. I always put the target reference in a top level header at the beginning of the document. Generally I create a stylesheet with:

body h1:first-child { display: none; }

As a rule which hides this first header from display in a browser. You don’t need brackets in the name. The use of a simple [bracketLink] in MMD is just a shorthand method for [bracketLink][]. They both do the same thing. The idea is, if you are not going to be needing to alias the link with another phrase, you might as well reduce complexity.

So [ic07193687] will create a link to a spot in the merged file that corresponds with:

# ic07193687 #

Even if the stylesheet is currently hiding the visible component from display, it will still work.

I put the two letter identification bit in front because the XHTML spec requires a non-numerical character in the first position of an id. Otherwise, I’d just use the number. I suppose I could just use a generic two letter thing like, ‘id07193687’, but I figured why not indicate if the linked item is going to be a short paragraph or two, or an entire article.
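The merge step itself is not spelled out in the thread, so the following is only a sketch of one way it could be done: concatenate every card in a directory into a single MultiMarkdown document, making sure each one starts with its `# icNNNNNNNN #` header so that a bare [icNNNNNNNN] elsewhere in the merged text has a target. The function name is invented, and the code assumes cards are UTF-8 `.md` files named with the stamp first.

```python
from pathlib import Path


def merge_cards(directory, prefix="ic"):
    """Concatenate all .md cards in date order into one MultiMarkdown string."""
    parts = []
    for path in sorted(Path(directory).glob("*.md")):
        stamp = path.name.split("-", 1)[0]
        header = f"# {prefix}{stamp} #"
        body = path.read_text(encoding="utf-8").strip()
        # Cards that already begin with their target header are left alone.
        if not body.startswith(header):
            body = f"{header}\n\n{body}"
        parts.append(body + "\n")
    return "\n".join(parts)
```

The original files are only read, never rewritten; the merged text is meant to be written to a separate output file and run through MultiMarkdown from there.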

Another alternative, if one wanted to create a network of thoughts instead of merged pages, would be to create a link to where the other file would be. If they are all going into the same web directory, [Index Card on Bamboo Fibers](./07194756-R-BambooFibers.html) would do the trick. In that case, you wouldn’t need to create a target link in the destination file at all.

fldsfslmn · July 15, 2007, 1:47am

Here’s Where I’m At

I’m currently engaged in a number of text-based projects. Here they are, ranked according to importance:

MA thesis (English)
Technical documentation (how I make money)
Creative writing (short fiction, somewhat neglected)
Web development (intermittent money-maker)
Blog (literary criticism, sorely neglected)

My MA program actually begins in the fall, but the acceptance letter was really the event which spelled the end of my haphazard Moleskine + NeoOffice workflow. I decided to look for ways technology could help me get organized, stay productive, and manage text. To that end I’ve fallen in love with, and bought, Scrivener and Coda (which is like Scrivener’s nerdy cousin from Silicon Valley, I suppose).

Unfortunately, nothing else has felt right. For a while I was interested in the GTD phenomenon and OS X implementations like iGTD and Actiontastic, but GTD is exactly the sort of “movement”

Maria · July 15, 2007, 2:20am

[quote=“fldsfslmn”]
…
Update: I wrote the first paragraph in a coffee shop without Wi-Fi. Upon moving to a wireless-enabled coffee shop, I tried clicking on my link to “Postscript on the Societies of Control”

AmberV · July 15, 2007, 4:47am

fldsfslmn, I’m in the same spot you are in. The application that I’m currently using for archival is aging and appears to have been abandoned. This makes me sad because it is a beautiful program with an innovative philosophy that really clicks with me. I’ve been auditioning alternatives for a while, but nothing quite strikes me as having the elegance that my old system has, while also retaining some of the most important things I have on my internal “goal” list too. It’s an idea that I’ve had in my head for a while, and I was just sitting down to write down what I want and do not want, when I saw your post. Amusing, but it looks like we are in the same boat.

I’m probably going to pursue my ideas using Ruby on Rails with AJAX, using MySQL as the back-end. Since having a file system is important to me though, I’ll probably make it so that it keeps a mirror directory somewhere. That way Spotlight, Smart Folders, and all that jazz can integrate with the system. Using MD5 hashes, I could probably make it so that the web application could detect changes made to the file system and integrate them with the MySQL system; but I haven’t decided on that yet. It is more risky. Why use a DB at all? MySQL is very safe, fast, and it is just second nature to RoR. The only thing that makes a file system better than a database, in my opinion, is accessibility and OS integration. Not two small things, granted! On the other hand: RoR+MySQL can be put up on a protected part of my web server, making access universal and consolidated. No worrying about synchronisation between machines and so on.
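The MD5 idea can be illustrated independently of Rails or MySQL. The sketch below (in Python rather than Ruby, and with invented function names) records a hash per file in the mirror directory and compares two such snapshots to report what was added, removed, or changed. How those differences would then be written back to a database is left open, as it is in the post.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root):
    """Map each file's path (relative to root) to its MD5 digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): md5_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_snapshots(old, new):
    """Return (added, removed, changed) lists of relative paths."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(k for k in set(old) & set(new) if old[k] != new[k])
    return added, removed, changed
```

A renamed file shows up here as one removal plus one addition with the same digest, so matching digests across the two lists would be one way to detect moves.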

Regarding web site archival: I am an ardent web site archiver, mostly for the reason you brought up. You cannot trust a web site to remain in existence for a long duration, so if you need information for a long duration, it is wise to archive it. Maria points out that this particular case has duplicates on the web, making your example a bad one. The point is, there are many cases where data exists in one spot, and if you rely on that data, you are trusting that spot to remain online when you need it.

I don’t know of any legal ramifications to worry about. I think that is what you are referring to as far as red flags go; if not, my apologies. When you consider that web pages are automatically cached by browsers, sometimes for a very long time, any system of web site archival is merely a formalisation of what is already happening. I am not sure where you live, but in the U.S., this falls under fair use. A classic example of this on a scale that no single sane human will ever surpass is Google’s phenomenally massive archive and caching system. Except for the minority cases where the web site specifically tells Google to lay off, they have a duplicate stash of nearly every web site in existence. Not only that, the stash is public. What we are talking about here is keeping a reference copy for personal usage. Given that Google has a legal precedent protecting its cache under fair use laws, I think the concept of local archival and classification is perfectly safe. There are programs like BrowseBack that do this systematically for every web site you ever touch. It saves a PDF version of every web page you visit (which frankly, I find to be an absurd concept, like using TIFF screenshots to store text files, but I digress).

Anyway, good luck on your project. I think if a person has the skills to create their own archival system, they should. Even though it takes a lot of time to do that, when you add up all of the time you waste trying to bend other programs to what you need, I’m sure it equals out–and you end up with something you really like and can control, rather than something you pretty much like and hope the developer keeps maintained. I’m not aware of any archival program that is universally appealing. There are too many diverse needs, too many different (really different) philosophies as to how things should be handled. Should you be able to edit things you archive? Are folders better than tags? Is something like Gmail’s label system better? And so on.

SteveJ · July 15, 2007, 2:27pm

That technique for linking, Amber, is very elegant, like the whole system you describe. For me (as for fldsfslmn, apparently) this thread has landed serendipitously, since two realizations, unrelated but convergent, arrived recently and almost simultaneously: (1) I had again mortgaged myself to a particular developer’s file format, and thus to their continued health, success, and attention-span–one of the problems discussed on this thread–and (2) my system for archiving and retrieving research and writing was better when I just used pen and paper than it has been since the computer came along. (I have the haunting sense of being older than most of the participants in this thread…)

Number (1) is non-trivial. I had hundreds of hours of research in two Windows apps (one of them askSam, mentioned by AndreasE earlier) that took many hours to massage back into usable form when I moved to the Mac. (I’ve still never gotten 'round to licking it fully back into shape.) I’ve wasted too much of my life designing ways out of the corners into which my software choices had painted me to expose myself to further dangers. Thus, as Maria and Amber and others have noted, the attractiveness of using plain text and the file system. I’ve been trying EagleFiler, since it is in many ways chiefly a front-end to folder/file structure. (It’s good enough for several purposes that I’ve already bought it; whether it will be an adequate solution for research and writing remains to be seen.)

But (2) was equally important. When I was a student, I took reading notes on 8 1/2 x 11 pages, which I inserted serially into notebooks, numbering the pages, with no attempt at classification, keeping a 3 x 5 bibliography card for each source noting the page numbers. Then I would index them using 4 x 6 topic cards, topically labelled. So when I was writing about the revolt of 1381, I might have a card headed “Norfolk revolt”; single lines would list references to the revolt in Norfolk by page-number in the notes with a few words of summary. There are obviously very close analogies to forms of classification offered by computer file systems and filing apps. But a large part of what my old system did for me was enforce a discipline of recursive processing: I had to review periodically the notes I had already taken and indexed in the light of later thinking and discovery; and the very act of adding new details to already existing 4 x 6 cards would remind me of connections I had forgotten. This is very close to what Amber describes here:

AmberV:

Now, excepting the rare cases where I create a sub-type, the way I approach this is by creating an index of these cards and storing it as a new file. This might look something like:
07193724-R-INDEX for modern photography techniques.md
The capitalised “INDEX” makes it very easy to spot, and since it is filed using the same techniques as everything else, it is just as easy to isolate an index as it is any other file. This file will simply contain a list of all the files/cards using cross-referencing. Subsequent index files will refer to the prior index file at the top of the list, OR, they can simply add up all of the other lists and then continue the list. The key is that the old ones are not replaced or modified.

The temptation since moving to the computer has been to try to imagine that the machine could replace some of that work of reviewing and reconsolidating research, rather than extending its power with searching, classifying, duplicating, and so on. Things that offer relatively little profit to my way of working (like formatting notes) take proportionately more time and attention than they repay (another attraction of plain-text), while the habits of thought and work that are intellectually most profitable seem oblique to the habits of thought and work that seem naturally to suggest themselves when I sit in front of my Mac.

The upshot is that I’ve been moving toward a system that moves back to my old habits: less attention to formatting, less attention to trying to file and classify in advance, exploitation of my own memory in chronological arrangement of notes and drafts, and recursive processing. Thus my engagement with what Maria and Amber and fldsfslmn have been writing in this and related threads. I’m still experimenting, as are others, obviously; I hope we can hear more about everyone’s experiments and experiences.

autodidact · July 23, 2007, 7:08am

@AmberV:

Thanks for pointing me to MMD. I downloaded the MMD documentation, and browsed the website, and I can say that it looks very appealing. I still haven’t set it up on my computer, but I will soon. I also have to give thanks to fletcher for all the work putting MMD together.

You’ve given a very thorough description of the system you use to organize your data and it seems that a lot of people are benefitting from that information. I know I am.
I have a question… you mention that in your system, R stands for “Record.” What do C, M, and I stand for? I can use whatever letters I want to use, etc, but I’m just wondering what tags you are using. I’ve found that the more ideas I come across, the more ideas are triggered in my mind.

I’ve thought of writing my own program too. Initially, I was considering using MySQL and PHP. But after reading this thread, I would consider just using a file scheme based on the techniques described here and developing something in PHP to parse those files for given words. It wouldn’t have to be anything too elaborate at first. I’ll post something on that when I get to that point. First I need to set my own archiving scheme and start doing some real work.
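A minimal sketch of the kind of word search described above, written here in Python rather than PHP. The function name `find_words`, the `*.txt` pattern, and the case-insensitive substring matching are all assumptions made for illustration; nothing in the thread specifies them.

```python
from pathlib import Path


def find_words(root, words, pattern="*.txt"):
    """Return {path: [(line_number, line), ...]} for lines containing any word.

    Matching is a plain case-insensitive substring test (an assumption).
    """
    wanted = [word.lower() for word in words]
    hits = {}
    # Walk the whole tree under root, in a stable order.
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            lowered = line.lower()
            if any(word in lowered for word in wanted):
                hits.setdefault(path, []).append((number, line.strip()))
    return hits
```

Calling `find_words("notes", ["norfolk"])` on a hypothetical `notes` folder would return every matching line with its file and line number, which is roughly what a hand-written topic card did.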

–Carlos

AmberV · July 23, 2007, 6:04pm

Carlos,

The four categories I use are:

1. Record -R- Just personal recording. Ideas; observations; people watching; basically anything you might put in a diary; or creative things that are not attached to any particular project, like a line of prose. It was liberating to separate creative from diary for me. In the past, I’ve had a problem with feeling guilty about keeping a mundane diary. I always felt like I should be doing something of Quality in it. This category is not about quality; it is simply about getting the “facts” down. I don’t have to worry about it being filled with eloquence, or using only the nicest inks, nibs, and papers. Just get it all out.
2. Creative -C- I draw the line between Record and Creative by saying, something that intends to “become” something goes in creative. Whether that be a thing that is already taking shape, or just an idea that might expand later. If I feel it is going to become a story, or if it is a list of subjects for the next time I take my camera out, then it goes in Creative. This is where I am most liberal about sub-categories. It just makes sense to designate which book something is about, or whatever.
3. Communications -M- The mnemonic is that an M looks like the fold of an envelope, as Google Mail so cleverly reveals in their logo. Forums, emails, letters to friends, blog posts, tech support, and other things like that go here. I’ll sub-categorise this one too, if it is a person or forum that I frequently communicate with.
4. Reference -I- The mnemonic is for Information, because I already had an ‘R’. Reference is just that; very similar to Record, except it is material that I have collected as opposed to produced. Everything from research for books, to recipes. This was a big one for me, because prior to really formalising all of this, I never took notes on anything. I would just memorise, and then things would fade as I stopped using them consistently. I still memorise though, and I’ll often use the reference category to aid in that; allowing this category to double as a flash card system. This is also where I store bulk documents downloaded from the web or scanned from paper media.
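A rough sketch of how the four letters could be read back out of a filename such as `07193724-R-INDEX for modern photography techniques.md`, the example quoted earlier in the thread. The layout `<digits>-<letter>-<title>.<extension>` is inferred from that single example, and `parse_name` is a hypothetical helper; sub-categories are not handled.

```python
import re

# The four category letters described in the post above.
CATEGORIES = {
    "R": "Record",
    "C": "Creative",
    "M": "Communications",
    "I": "Reference",
}

# Assumed layout: <digits>-<category letter>-<title>.<extension>
NAME_PATTERN = re.compile(
    r"^(?P<stamp>\d+)-(?P<letter>[RCMI])-(?P<title>.+)\.(?P<ext>\w+)$"
)


def parse_name(filename):
    """Split a filename into stamp, category, and title, or return None."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        return None
    title = match.group("title")
    return {
        "stamp": match.group("stamp"),  # kept as an opaque string
        "category": CATEGORIES[match.group("letter")],
        "title": title,
        # Index files are recognised by the capitalised INDEX prefix.
        "is_index": title.startswith("INDEX"),
    }
```

The stamp is deliberately left uninterpreted, since the thread does not spell out how its digits are composed.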

Before I switched to this simplified system, I was using a sort of radically simplified Dewey Decimal system of my own design (PDS, for Personal Decimal System). The six top-level categories were almost the same as the four categories above: Events, Commentary, Dreams, Creative, Communications, and Collected. Collected was different in that it could be prefixed to any of the other five basic categories. So 2.* would be commentary, like a review I wrote, but 6.2.* would be a collected review. This system allowed for a descending classification with the most important designations in front. For example, 6.2.2.3 would be a Collected (6) Commentary (2) Article (2) on Linguistics (3). I tried to avoid going any deeper than four levels (not including the Collected prefix), as I found it became too difficult to remember all of them. I still use this system in conjunction with the basic four system, because while I’ve been using it for a year now, I’m a very cautious person, and I’d like to be able to fall back to the more comprehensive system if need be. I thought of just using the DDC as it exists, but it seemed wasteful, as I do not regularly deal in 99% of it, and meanwhile there are other areas it doesn’t address that are personal to me, such as 1.5.4, diary entries relating to NaNoWriMo 2004.

I actually had one other category, a seventh, which was just called ‘meta’; it was reserved for anything regarding the development of the organisational system. For example, periodically I’d write an index or abstract document of a long evolving conversation with an individual. That way, in the following years I could quickly find “that one conversation we had about…” Such a document would get tagged as 7.1.1. The top-down method is very useful when searching or filtering documents, because most systems filter in a left to right manner. I can start typing -PDS-7… and the more numerals I add, the further the list is narrowed down topically in real-time using Spotlight.
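The left-to-right narrowing can be sketched in a few lines. This assumes the code appears in the filename directly after a `-PDS-` marker, as the search string suggests; the function name `filter_by_pds` and the sample filenames are hypothetical. Unlike typing into Spotlight, this version matches whole levels, so the prefix 7 does not pick up a code beginning with 71.

```python
def filter_by_pds(names, prefix):
    """Keep names whose PDS code equals the prefix or descends from it."""
    marker = "-PDS-"
    kept = []
    for name in names:
        start = name.find(marker)
        if start == -1:
            continue  # no PDS code in this name
        rest = name[start + len(marker):]
        # The code is the leading run of digits and dots after the marker.
        code = ""
        for ch in rest:
            if ch.isdigit() or ch == ".":
                code += ch
            else:
                break
        code = code.strip(".")  # drop the dot that begins a file extension
        if code == prefix or code.startswith(prefix + "."):
            kept.append(name)
    return kept


names = [
    "a-PDS-7.1.1.txt",                # meta: conversation index
    "b-PDS-6.2.2.3 linguistics.txt",  # collected commentary article
    "c-PDS-71.2.txt",                 # not under 7
]
print(filter_by_pds(names, "7"))
print(filter_by_pds(names, "6.2"))
```

Adding more levels to the prefix narrows the list further, which is the point of putting the most important designation first.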

Some day I’ll make up my mind between the PDS or the 4/tag system, but for now I’ll just keep using both. Since everything is just text files, global changes to the system are very simple if you know a little scripting, or even sed or grep.

Which of course comes back to how one maintains all of this. In the past week I think I’ve drifted a bit from the web-app approach, largely for the reasons you stated. It just starts getting too complicated once you deviate from the file system approach. The one main thing that I prefer with the separated database possibility is that I could enforce immutability of older documents in a way that could never be truly done using a file system.

But there are a lot of interesting interface concepts that could be approached from the philosophy of EagleFiler, where the application is a front-end to a file structure. There are things I would like to do, such as PDS tree browsing, version grouping, and synopsis card (like Scrivener does) overviews. But honestly, I’m so comfortable with command line interfaces, I just might use straight Ruby. Something like that could be developed in a matter of weeks, whereas learning AJAX or Cocoa or some other thing would require years.

suavito · December 23, 2007, 3:49pm

suavito · January 6, 2008, 12:14am

Has anyone an idea why you can’t just drag and drop files from Journler into Scrivener? Scriv doesn’t accept them.

If you first drag them onto the desktop and from there into Scrivener everything works fine, but this is one step too many for me.

The other competitor still in the race, Together, does not have these problems. Everything comes out of it as it went in, with the correct file name.

But Together, believe it or not, has no view zoom function whatsoever for its notes. You can magnify a PDF, but not its own note format.

As a common font size of 12 or 13 points is unreadable, the only work-around would be to set the default font size much higher. But then you would have to manually format the exported note in Scrivener (or whatever program). Again, one step too many for me.

sethbrown · April 12, 2011, 3:51pm

Thanks for all the incredible insight from AmberV and others on here. I read this forum many years ago and started implementing many of the ideas discussed on this forum to help me with my own workflow. I don’t know where my files would be without everyone on here!

I just blogged about my current information managing methods here:

drbunsen.org/home/2011/4/12/ … art-1.html

I would love any comments or feedback. Thanks!

_lt_svs_gt · August 18, 2011, 11:08pm

It has been great reading through these posts, and it has really got me thinking about managing my files.

However, I’ve hit one big hurdle. I seem to be unable to reconcile the use of folders with files in my thinking. I understand AmberV’s file naming categorisation, but how does this link in with the use of folders? Creating folders with categories that are not in the filename categories seems to weaken the system as the folder information is not attached to the file.

Do I create folders that represent areas of work, home, clients or should this data be in the filename?

Looking forward to some simple answers as my grey cells cannot seem to compute!

AmberV · August 19, 2011, 2:51am

The only reason I advocate and use folders at all is to avoid the trillion-file-folder problem. Most stuff gets slow once you have a few thousand files in it, and even if it doesn’t get slow, it gets unwieldy. Since the only reason I use a folder is to avoid this problem, I don’t want to spend any more time messing with folders than I have to, either making them or sorting things into them. The idea, for me, is to make filing as mindless as possible. So to that end I have a folder called 2011, and in that I have a folder called 270. It is the 231st day of the year, so I’ll be using that folder for another ~40 days, then I’ll make a new one called 360. Thus, I make four folders a year, and I never “sort” anything. Everything just goes into the last folder in the list, and so I borrow from the IT convention of having a latest sym-link at the top level that points to this last folder. Now I don’t even have to bother with whether it is 270 or 360. I just always, always save to “latest” in the Finder sidebar. End of story.
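A small sketch of that routine, for anyone who would rather script it. The folder names 90 and 180 are inferred from "four folders a year" (only 270 and 360 are named above), and `quarter_folder`, `update_latest`, and a link literally called `latest` inside the root are assumptions for illustration.

```python
import datetime
from pathlib import Path


def quarter_folder(day):
    """Map a date to (year, folder), the folder being 90, 180, 270, or 360."""
    day_of_year = day.timetuple().tm_yday
    # Days 1-90 -> 90, 91-180 -> 180, 181-270 -> 270, anything later -> 360.
    bucket = min(-(-day_of_year // 90) * 90, 360)
    return str(day.year), str(bucket)


def update_latest(root, day):
    """Create the current quarter folder under root and repoint root/latest."""
    year, bucket = quarter_folder(day)
    target = Path(root) / year / bucket
    target.mkdir(parents=True, exist_ok=True)
    link = Path(root) / "latest"
    if link.is_symlink():
        link.unlink()  # drop the link to the previous quarter
    # A relative target keeps the link valid if the whole tree is moved.
    link.symlink_to(Path(year) / bucket, target_is_directory=True)
    return target


# 19 August 2011 is day 231, which lands in the folder called 270.
print(quarter_folder(datetime.date(2011, 8, 19)))
```

Run once a day (or at login), `update_latest` would mean the only place ever saved to is the link, and the new folder appears on its own when the quarter rolls over.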

Quarterly folders work well for recollection. They reduce the list of things you need to hunt through by quite a bit, and a quarter is an easy target for your memory to come up with. First quarter of 2009: great, I don’t know when I wrote this letter, but I can go through a list of 80 ‘M’ files without too much hassle. Grep that list down if I know the token and I might only have a dozen or fewer to poke through. So, in a sense, the folder is a piece of filename data that is in use in my case, since the first part of the filename is the date.

As for folders not being ‘attached’ to the file: well, that depends on how you look at it. The full path of the file does include all of its folder ancestry, and the full path is often accessible, especially in a UNIX environment. It’s often very visible in a UNIX environment, like a URL. So in a way I think it’s a pity to waste it on something redundant like the date, but like I say, I prioritise ease of filing.

Having a top-level distinction between Work and theRest isn’t bad. If you tend to work in a “career” you might just want to leave it at that. Stuff that I archived from where I worked two years ago is still occasionally useful to me today because I still do some of the same things (taking care of web servers; coding pages; etc). So I haven’t started a new trunk for Scrivener, it just switched over from theRest at a certain point in time to Work.

Work-Latest ⇢ Work/2011/270
Personal-Latest ⇢ Personal/2011/270

That’s it.