Scrivener as Dedicated Research Database

I’ve added a list of links related to MultiMarkdown (some more interesting than others) to the Scrivener Wiki. There’s currently no tutorial, but you could start with the Markdown Web Dingus:

daringfireball.net/projects/markdown/dingus

This is Markdown, not MMD, but the concept is basically the same.

Amber,

  • Now I know what N and R stand for, but I’m dying to know: what other taxonomy do you use?
  • How many codes are there in your system?
  • Did you start with a few and have they expanded over time?

I started down the data management journey using Journler. Like Boswell, Journler was an excellent tool for letting me organize my data. After a while Journler and I parted ways and I converted everything through Together to DEVONThink Pro. I posted a lot about this on my blog, and don’t want to repeat it here, but I’ve dumped DTP and I’m now basing all my archiving and research management on the basic file system functionality of OS X because its data retrieval and organization capabilities seem to work much more elegantly than DTP (or the other shoebox applications), and they seem to be less prone to obsolescence.

Alex Payne wrote a great web post at al3x.net (“Case Against Everything Buckets”, and the follow-up) about the value of using single-purpose apps (of which Scrivener is one of the best) along with the file system to achieve robust research management capabilities. When I read his work it really changed my outlook on these types of applications.

What Scrivener lets me do is put just the right amount of research in with my writing project, while the mother lode of the library stays out in the file system. I fear that if Scrivener tried to recreate a research management tool or duplicate the file system, it would lose some of what makes it work so well as a single-purpose application.

I agree with you completely, Amber, that the only valuable unique functions in DTP kick in at huge data sets - all that AI stuff. I did find value, however, in using DTP (and Journler, and Together, and to a lesser extent EagleFiler…) as tools to organize my data in the first place. I have thousands of notes in rtf/d files, along with a significant library of pdfs. After looking at all the lit manager programs (Papers et al.) I came to the same conclusion: without a text-based coding system you’re just setting yourself up for a future conversion project or, worse, a data loss problem (I once used Commence, so I know about that well).

Going it alone in the file system, however, does require you to make your own internal text-based tagging system, and I am intrigued by yours, hence my question.

Doug

After using DevonNote for a couple of years, I came to the same conclusion when I got Scrivener. It’s just easier for me to use the file system (basically, Finder + Spotlight + TextEdit) to store my research for various projects past, present and prospective. Then I use Scrivener as the repository / organizer for the projects I’m actually working on at the moment. That system is even working well so far for my book in progress.

The file system is about as future-proof as we can hope for at this point, not that I expect Devon or EagleFiler or the rest to go belly up. (And I guess you could always export everything to rtf or even text if necessary.) I have a preference for minimalism – just the tools I need, less clutter – and this system means one less app (at least) to have running and consuming space and RAM and my time to RTFM and scale the learning curve. I fully appreciate the usefulness of Devon et al for research-intensive projects. But for my needs as a working journalist and book writer, the file system + Scrivener combo seems to be as much as I need and no more.

The arguments for and against bucket software are compelling. However, the basic OS X file system doesn’t always cut it for me. On the one hand, I’ll rely on a basic file-folder system on the hard drive for storing my music projects (seems strange, as those are quite huge, I know), but on the other hand, I’m a huge news/info junkie and need to merge my email and internet data together as seamlessly as possible.

I decided that my Scrivener research folders were getting a bit cumbersome, so I think DEVONthink is going to be the best solution since I’ll be using agent for searches from now on anyway. I went ahead and bought DEVONthink and agent yesterday. I’m happy I’m able to get searches done so quickly and thoroughly with agent and store my ‘keepers’ in a database in think. The other thing that sold me was the fact that I can have RSS feeds and archive emails from Entourage in DEVONthink. I’m currently using Times for my RSS newsreader, but if I need to archive a story that I want to keep as research, there’s no quick way of getting it out of Times and into Scrivener. And with DEVONthink, archiving and accessing email from Entourage is easy, as I frequently email news articles to my Gmail account (I do this at work during the day on my haggish corporate Windows setup). Now, I can simply drag/export/etc… the link over from Entourage into DEVONthink for later use in Scrivener.

I would be hesitant to use DEVONthink for databasing any of my other stuff, however. I like to keep my other ‘twiddly’ bits in actual folders in Documents. I don’t like the idea of certain things being locked up in a proprietary file system I may not be able to access later; iPhoto comes to mind on that point. I like to back up my photos to DVD in a standard OS X folder structure.

I have Circus Ponies Notebook but I stopped using it as I realized the proprietary save files it generates could get corrupted, or if you have one of those notebooks password-protected and forget what it is, you’d be in real trouble. Also, Notebook for me is too nebulous in workflow as far as structuring things goes. I wind up with more clutter inside the notebooks than what I had outside to begin with. Then the other problem: should I save the info somewhere other than Notebook as a ‘just in case’? Sort of kills the whole fun of it. Notebook is a neat idea but I’ve never found any real use for it. Just another bucket I wasted money on…

DEVONthink and agent seem a bit more reliable and logical but then again, should I save all that stuff somewhere else…‘just in case’? The debate rages on.

All of this data we save means nothing really. In the end, the software we buy to GTD means nothing either, as the formats will forever mutate and leave us saddened by the loss of those tiny digital ‘nothings’ we worked so hard on and spent so much time and money trying to preserve.

With that in mind, I hope to make DEVONthink and agent worth the money spent and actually use them to get all the data collected with them committed to text in the form of this book I’m working on. If there’s a tangible and lasting end result that justifies the means, then I say it’s worth the time and money spent on bucket software.

End of mindless rant, but I hope it helps others decide how best to organize and ‘future-proof’ their data.

End Analysis:

  1. Circus Ponies Notebook - Pretty but not very useful. Too ‘noodly-doodly’. Proprietary file system.
  2. DEVONthink - Can integrate email, RSS feeds and searches from agent together in one spot. Still a proprietary database system, though.
  3. When in doubt, do ‘raw’ backups of everything, and frequently, using the OS X file structure. I use Time Machine.

With the current 2.0 beta its metadata is in a proprietary format, but the files that you store go directly into the Mac file system.

Amber,

This would be terrific. Thank you very much!!!

Steve

You ask some good questions. It has taken me about four years to “perfect” the system that I currently use. I don’t for one minute think it is perfect, but in its current state it is better than anything that came before it. By the way, thanks for the link to Alex Payne’s blog. What an interesting read; I enjoyed it. Odd to consider he is behind Twitter, something I consider completely frivolous and a waste of humanity on the scale of the E-Channel, but oh well. :slight_smile: We are all entitled, as they say.

Four top-level categories: Record, Communication, Manifested, and Information. Does it cover everything in life? No, but it covers the four main areas of input and output that exist in mine as relating to a digital archive, and that is all that really matters to the system. Record is everything I take down and record. Communication is self-explanatory. Manifestation encompasses mainly creative pursuits, but can also address explanatory expositions that go beyond mere record or information. Information is mostly just documentation, either generated or collected. Things could be a bit blurry if you take them literally. Naturally all capital letter items are technically manifestations. I do not define tokens by what things are but what they represent to the future me. So if something will be useful as a reference or cross-linked to from another document for elaboration—it is Information, even if it is a Record of something, or something I manifested. I feel this way of looking at identity assignment places more emphasis on the intention of a document than the identity of the document. I don’t really care so much about what it was, but what it will be in the future when I try to find it.

Beneath that there is a strict three-level depth restriction. Nothing can be shallower or deeper. Something must have a super-category, a minor-category, and a topic or key. Example super-categories in Record are simple: Internal and External. I observe and record psychology or dreams into Internal, for example, but observe and record events into External. The minor-category allows for division of the super-category into logical parts. One of the above examples: {R2.1.Dream} is Record-2(Internal)-1(Observation)-Dream. Some branches might only have one super or minor category in them at the moment, but that is fine, as it enforces a rigid structure that reduces the proliferation of hair-splitting.
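To make the shape concrete, here is a minimal sketch of how a token like that could be parsed. The regular expression and the letter-to-branch mapping are my guesses at the scheme described above, not Amber’s actual code; only R = Record and I = Information are confirmed by the examples in the thread.

```ruby
# Hypothetical parser for tokens shaped like {R2.1.Dream}:
# branch letter, super-category digit, minor-category digit,
# then a plain-English keyword.
TOKEN = /\{([A-Z])(\d)\.(\d)\.([^}]+)\}/

# Guessed mapping; only "R" and "I" are attested in the thread
# ({R2.1.Dream}, {I3.1.Doc}); unknown letters pass through as-is.
BRANCHES = { "R" => "Record", "I" => "Information" }

def parse_token(text)
  m = TOKEN.match(text) or return nil
  letter, sup, minor, keyword = m.captures
  { branch:    BRANCHES.fetch(letter, letter),
    super_cat: Integer(sup),
    minor_cat: Integer(minor),
    keyword:   keyword }
end

p parse_token("{R2.1.Dream}")  # prints the parsed components
```

The keyword stays a free-form word, so the final level can proliferate without the parser (or your memory) needing to change.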

The fundamentals: reduce the top level to four broad categories and denote them as letters instead of numbers; use only two numbers; and keep super-types as much in dichotomy as possible to reduce complexity. Record is split into Internal and External; you cannot really address much with a hypothetical R3. Perhaps a religious person might put the supernatural in there, I don’t know. :wink: And finally, the most specific category, which is also the most prone to proliferation, is written down as a word, not a number. So there is no “5” = this person; I just write down the person’s name. Thus keywords can expand as much as they need to without increasing the need for memorisation.

I presume you mean the tokens that I have been discussing. There are other codes, but they are largely just syntax expansions of MMD to suit me. For example, I use the date and time as a unique ID. Each entry in the system has its own date and time and thus a unique way to link to it. Rather than provide the precise URI to the file in MMD linking syntax, I have a shortcut that just requires the unique ID, <|unique_id|>. My MultiMarkdown parser has been modified to intercept these codes and expand them out to full MMD syntax, in combination with a file search routine that pinpoints the precise URI for me. Thus when I render the file to XHTML or whatever, it gets a link to the actual file on the system. Cross-referencing is mindless and flexible. If I change where I archive everything on the filesystem, I just adjust a variable in the script that adjusts the URIs. The base documents themselves are therefore ignorant of file-system-specific information. This is a big problem with many of these applications, in that their linking ability is absolute-URI based. Move a file, and they get confused. Move everything, and your cross-references are useless.
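The actual parser modification isn’t shown in the thread, but the interception step might look roughly like this sketch. The archive root, the index hash (standing in for the file search routine), and the example ID-to-path entry are all my assumptions:

```ruby
# Hypothetical archive location; changing this one variable
# re-targets every expanded link, as described above.
ARCHIVE_ROOT = "/Users/me/Archive"

# Stand-in for the file search routine: a simple ID-to-path map.
def find_uri(unique_id, index)
  path = index[unique_id] or return nil
  File.join(ARCHIVE_ROOT, path)
end

# Expand <|unique_id|> shortcuts into full MMD-style links
# before handing the text to the MultiMarkdown processor.
def expand_links(text, index)
  text.gsub(/<\|([^|]+)\|>/) do
    id  = Regexp.last_match(1)
    uri = find_uri(id, index)
    uri ? "[#{id}](file://#{uri})" : "<|#{id}|>"  # leave unknown IDs alone
  end
end

index = { "09106911" => "M/on_archival.md" }  # hypothetical entry
puts expand_links("See <|09106911|> for details.", index)
# => See [09106911](file:///Users/me/Archive/M/on_archival.md) for details.
```

Because the source documents only ever contain IDs, moving the whole archive never invalidates a single cross-reference.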

As for the tokens themselves: it is largely irrelevant how many of them exist in total, since the highly specific portion is plain English. I don’t need to memorise the difference between {I3.1.Doc} and {I3.1.Invoice}; it is not only obvious which is which, but it can also be reasonably guessed what the I3.1 part means just by looking at those two and the (I)nformation lead. Thus the memory-intensive parts are the two numbers in the middle. I try to keep super-categories to at least two and no more than four (though I have yet to exceed three). This took a good deal of premeditation in order to make sure the skeleton would not be inadequate as data was added to it. There are a total of nine super-categories I need to remember. Information has three branches and the rest have two. There are around 30 minor-categories; so an average of three per super-category. This might sound like a lot to memorise, but since they are hierarchical and logical it really isn’t that bad. I don’t have to memorise all thirty, but rather only the parts of each super-category individually; hierarchical memory is much more suited to the mind than linear list memory. I try to keep patterns repeating, too. So R2.1 and R1.1 are very similar except in direction. Both are minor-category “Observation”, but the latter is External observation where the former is Internal observation. Likewise I2.1 is Indirect-Citation. Different concept, but similar to the notion of internal observation (at least to me).

Yes, but there has been extremely little expansion in super/minor categories, thanks to the amount of thought I put into it initially. Nearly all of the expansion has been in the suffix keywords, which was the original intention of the design. I’m not sure how many of those there are, but probably 120 or so, which means there might be 120 or 130 total tokens. But as I said, that is largely irrelevant, since the hard part is the 9 and the 30.

I cannot use DTP for two reasons. One, I don’t need it: my taxonomic system is good enough that I don’t need AI to tell me what is related to what, and whenever I’ve played with it, it has nearly always been wrong. If it is wrong on the stuff that I know about, why should I trust it with stuff I don’t know about? Second, the new version has no concept of punctuation or case-sensitive searching, which, naturally, my token system requires strict adherence to.

On the topic of file-system dependence: in fact, for a period of time last year, I did precisely what you are doing, and documented it rather thoroughly here in a few posts; I don’t remember where. I condensed my token expression down a bit so that it could be included in file names.

09106911-M-Scrivener-on_archival.md

Would have been this conversation. I have the date and time (that’s the unique ID by the way) in the numeral (using my weird datestamp format), ‘M’ for communication and then the final specifier. It doesn’t have the whole token, but in my experience this was enough to find things as the title helps a lot. It looks kind of messy all by itself, but in a long list in Finder, the hyphens make everything stand out nicely. I’d also use Hazel to automatically label everything with -M- or whatever to a certain colour. I could use grep to narrow it down further if necessary. Knowing the date is an astonishingly useful bit of meta-data when it comes to personal archival. I cannot recommend date emphasis enough. For multi-person archival it is much less useful, but if you know when you filed something, even only roughly, that can right there eliminate 99% of the chaff when looking for something.
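Since the datestamp and category letter live right in the filename, narrowing a folder down doesn’t even need the file contents. A hypothetical filter along these lines (the filename pattern is taken from the example above; the other filenames and the helper itself are invented for illustration):

```ruby
# Match names like 09106911-M-Scrivener-on_archival.md:
# datestamp, single category letter, then a free-form title.
NAME = /\A(\d+)-([A-Z])-(.+)\.md\z/

def filter_files(names, letter: nil, date_prefix: nil)
  names.select do |n|
    m = NAME.match(n) or next false
    (letter.nil?      || m[2] == letter) &&
      (date_prefix.nil? || m[1].start_with?(date_prefix))
  end
end

files = [
  "09106911-M-Scrivener-on_archival.md",   # the example from the post
  "09041203-R-morning_pages.md",           # hypothetical
  "08122700-M-letter_to_editor.md"         # hypothetical
]

p filter_files(files, letter: "M")
# => ["09106911-M-Scrivener-on_archival.md", "08122700-M-letter_to_editor.md"]
p filter_files(files, letter: "M", date_prefix: "09")
# => ["09106911-M-Scrivener-on_archival.md"]
```

This is essentially what a `grep` over an `ls` listing does; combining the letter with even a rough date prefix is what eliminates most of the chaff.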

The reason I wasn’t using Boswell is that I had a bit of a hiccup with it that resulted in a corrupted archive. I back up religiously, so I didn’t actually lose anything, but it was disconcerting enough to abandon it. I tried a number of other programs: Journler (probably my favourite, but its development future became uncertain so I moved out), EagleFiler (which I quite liked because it was Finder with a nice tagging system on top), and even just Finder as you are doing (actually Path Finder, but yes). I had grown so attached to the Boswell way of doing things, though, that nothing could measure up. There are just some things that program does that nobody else gets close to. So in the end I went back to Boswell after more carefully researching what happened; I learned how to avoid the problem and I’m a happy Amber.

The problem with “everything buckets” is one of the reasons I like Boswell so much: it focusses on doing only one thing, organising information (not files). It is dirt simple once you get over the philosophical hurdles, and allows for a more transparent and fluid assignment of information than folders and files can (for me). Here is a practical example: every day all of my journal entries (individual documents) land in a collector designed to look for them (this would be analogous to saving all of your diary entries into one folder, but I’m already ahead because the notebook filters are scattering these entries all over the system according to keywords). At the end of the day I go through that and pull out the ones I think are important to the 10-day period. I do this by selecting them, then selecting the 10-day notebook, and dragging the entries roughly over to the notebook list. In Boswell, drag and drop is “specify then drag”, not drag to specify. You select the target, then drag to a vague area. Seems like a small thing, but if I want to drag a single entry into forty notebooks all at once, I can do that with one drag. Try making forty aliases in Finder with a single drag. :slight_smile: It also allows inverse drags. Dragging ten notebooks onto a selected range of entries assigns those entries to those notebooks. You can combine two complex and arbitrary lists in a single move. Nothing else exists like that out there, nothing. It is just one of the philosophical hurdles you have to get over, though, because we are so used to point-to-point assignment; it is all we generally have. The notion of 8-point to 10-point co-assignment is a bit weird, but absolutely fluid once you get it. This is the way the brain works.

Every ten days I go through that notebook and put copies of the entries I feel are important to a 40-day perspective into a 40-day bin. Again, this is like dropping five aliases into a folder; the original entries are still in the main collector and the 10-day bin. Then once every 200 days I pull entries that are important to that cycle. I now have all diary entries in a huge notebook collecting all R1.1’s, 20 10-day bins, and 5 40-day bins. If I want to go back to the first part of 2006, I can read the entries in the 200-day bin from back then; if something interests me I can drill down into the smaller-level bins to get more context. Since all of these entries are “clones” scattered everywhere, they accomplish something that is very difficult to do with a file system and folders: they allow you to approach your data from multiple perspectives and intentions. The system described above is largely chronological, but what if I don’t remember the date? That’s fine; I can probably find it in a topical notebook. I tried to duplicate this in folder-based applications (including Finder) but it gets messy, really fast, if it was even possible at all. DTP has replicants, but it just is not as elegant: five clicks and a bunch of sub-menus for one alias at a time, for what two clicks and a drag does in Boswell for any number of aliases, where every single entry in the entire database is no more than three clicks away.

Another thing Boswell has that so few others on the market can do: functional searching. That means I can not only retrieve information in a Spotlight fashion, but act on that information. This would be like creating AppleScripts, but infinitely easier. A simple example: say I wished to create a new notebook to hold everything regarding Scrivener. As mentioned in the prior post, I would construct a search pattern for EndsWith .Scrivener}, and tell Boswell to move all matching entries into the Scrivener_NB and a few other notebooks as well. Done. It’s like creating a thousand aliases from a Spotlight search result. The beautiful thing is that now I can use that Scrivener notebook as a search source. I can rapidly split out five other notebooks pertaining to specific aspects of Scrivener by using that notebook as the data cluster instead of the entire archive. This speeds up searches and negates the need for complex cascading search terms. In Spotlight, I’d have to re-define the EndsWith and then tack on any specifics. With Boswell, I know the Scrivener notebook has everything I need already, so I just search within it. Since notebooks are like folders, in that you can manually add and subtract information from them, they are even more powerful than additional search criteria. Searches become more useful as you prune the data clusters. This isn’t terribly novel to Boswell; plenty of programs have container-style search restraints, but many are artificially limited. Since Boswell is set up to search this way by default, it is very easy to describe search constraints with a dozen notebooks in a single click and drag. DTP has a search constraint but can only work on one container at a time. Spotlight requires setting up rule after rule.

As you can divine, notebooks are really strange constructs in Boswell, and definitely one of the philosophical hurdles one has to approach in order to understand the software. They are simultaneously: search results; manually assembled collections of data; documents (drag a notebook to the desktop and poof, it is a single file, in the same vein as a compile out of Scrivener); filters (if a notebook’s name exists in the text of a document being archived, it automatically files the document within it; for example, a notebook named Scrivener will automatically acquire every single document I archive with the word ‘Scrivener’ in it); and search terms when used as a source.

And the file system’s benefits? I have those too. Every ten days I dump everything I’ve archived into a Boswell file, then run a script I wrote which builds a directory structure and puts the entries into it. It’s a mirror I never touch; a back-up; something for everything to link to universally (using those unique IDs I mentioned); something for Spotlight to search against. When Boswell fails to help me find something (rare), it’s there. Back when I was using the file system, I had a pretty decent system. I wrote a bevy of scripts in Ruby to make things easier for myself, but in the end it was just too much work. Making a new entry meant opening TextMate, then saving and locating the right folder to save in. In Boswell, it’s Cmd-E; start typing. Done. File it, or let it sit around for a bit and stew. I don’t have the Untitled File problem mentioned in the follow-up. Untitled stuff just stays like that until I change it or want to think about filing it.
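The mirroring script itself isn’t reproduced in the thread, but its shape might look something like this Ruby sketch. The entry format, the token-to-directory layout, and the `mirror` root are all assumptions on my part:

```ruby
require "fileutils"

MIRROR_ROOT = "mirror"  # hypothetical destination for the dump

# Assumed dump format: each entry carries its unique ID,
# its token, and the body text.
Entry = Struct.new(:id, :token, :body)

# File entries into token-derived folders; e.g. token "R2.1.Dream"
# lands in mirror/R/2.1/, named by unique ID so every other
# document can link to it universally.
def mirror_entries(entries, root = MIRROR_ROOT)
  entries.map do |e|
    branch, rest = e.token.split(".", 2)      # "R2" and "1.Dream"
    letter, num  = branch[0], branch[1..-1]   # "R" and "2"
    minor        = rest.split(".").first      # "1"
    dir = File.join(root, letter, "#{num}.#{minor}")
    FileUtils.mkdir_p(dir)
    path = File.join(dir, "#{e.id}.md")
    File.write(path, e.body)
    path
  end
end

paths = mirror_entries([Entry.new("09106911", "R2.1.Dream", "entry body")])
p paths  # => ["mirror/R/2.1/09106911.md"]
```

Because the script derives everything from the token and ID, the mirror can be rebuilt from scratch after every dump; it is never edited by hand.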

Anyway, if you like the strictly file-system approach, I recommend learning grep if you haven’t already. Spotlight is fast, but it can also be frustratingly vague. Grep is slow, but extremely accurate. Also, Hazel, which I mentioned above: Hazel is one of the best Mac applications out there. It is the Scrivener of file automation and can make a folder of files almost application-like.

I’d caution against using DTP as your email archive. It has only an extremely rudimentary understanding of the email header, and I’m not sure if it can even output an mbox file, or if it retains the full original email copy.

Update: Forgot to mention the one thing that always made me nervous whenever I wasn’t in Boswell: data security (which might seem ironic since it fubared my database once, but bear with me). Many of these applications support a form of locking which is in my opinion a very soft lock. The data cannot be edited, in some rare cases the meta-data remains locked, but in all cases you can still accidentally or even intentionally delete the file. Boswell just doesn’t let you. The only way to edit old stuff is to version it to a new copy. There is simply no way to delete anything, period. At first it seems limiting, but once you get used to it, it makes everything else—especially raw file systems—feel absolutely dangerous.

Amber,

What a fabulous reply. Thank you so much for the effort and insight. (I’m going to dig up your other posts as well…)

When I began managing my data in a more structured way (mostly in Journler) I began using a taxonomy of …

  • Thoughts: Things I made up (Drafts, observations, my writing, articles, papers)
  • Notes: Copies of others’ work (All those pdfs, web page printouts, notes from lectures or interviews)
  • Reference: Kind of like notes, but more referential - like systems notes, keystroke shortcuts and the like.

(I never adopted a notation like your Communications, because being somewhat shy and hermit like I tend to hide all that stuff in email clients. :smiley: )

Journler had a nice field called Category that let me add these markers, and at times I adopted and then dumped a few other categories - like Goals (which really were Thoughts with a “goal” tag.)

Journler also was a wonderful tagging tool, so I went off on dozens of cross-linkable tags of all sorts, and my filing system tagged along as I followed my hobby horse. These have now mostly sorted themselves out as folders. I was surprised at how a given level of abstraction in folder structure can make multiple cross-referencing redundant, especially when supported with full-text search.

Even though Phil Dow is working on a new release of Journler, and perhaps will open-source it, it became clear to me that (as the risk management folks say) I had a very high exposure to developer failure, so as I said, I went to flat files. I must say I miss tags. Even though I still segregate my Writing from my Notes, there was always a richness to the linkages that tags provide. The other side of that is that I find myself manually flipping through files more often and finding serendipitous connections that no system would have foreseen. I think of it as being closer to my data.

Thanks also for the app recommendations. I’m giving Boswell a whirl, if for no other reason than that my curiosity is insatiable.

Oh, and I agree completely with your characterization of time-stamp chronology as crucial to filing and retrieval. As my personal data collection ages, my sense is that chronology will ultimately become the trumping tag for all my aging files.

And thanks again for the generous reply.

Amber,

this was an impressive post, and impressive praise of this software named Boswell. You recommend it, I understand :slight_smile:

But when I look at their website, the last version dates from 2005, which is quite a while ago. How sure are you that Boswell will still work under OS X 10.6?

Okay, I think my eyes just glazed over… :open_mouth:

If one of AmberV’s previous posts had not done -->
snort would need the mop now.

Douger,

There are several large posts that I placed here a while back regarding the usage of the file system as an archival engine. All based on utterly portable principles: no Spotlight hacks. I don’t trust myself to still be using Apple in 20 years, and I trust Apple to still be using Spotlight even less (and I’ve had situations where meta-data got lost when being backed up onto non-Apple equipment. Not cool.)

Your original triad is actually very close to my first crack at it as well. This was, I should say, prior to my eight-wide top-level system, which was a monster to keep track of. :wink: I decided to go for a strict four after reading an insightful blog on physical index card filing and the Noguchi filing method. In fact that first super-category dichotomy in Information is the distinction between citations and references, and the actual works themselves. I call it Direct and Indirect to make it even more flexible. I2 is a citation or link; I1 would be the article itself, mirrored; I3 is a special category (fact) for what are essentially footnotes in the archival system: supporting documents to provide context for the things I observe. I split Thoughts out from Creativity because I am, if nothing else, a manic journalist. I probably produce on the order of 35,000 to 60,000 words every forty days, just internal thoughts and reflections on the world. So that is why I have an entire top-level branch dedicated specifically to Recording the present. Otherwise it would muck up the creative sections, though as can be expected there is a lot of cross-linkage between those two branches.

I prefer to express these cross-linkages in patterns rather than explicit double-token or binding-token usage. It’s lower maintenance. With an explicit system I have to anticipate the patterns I will be interested in four years from now; I submit that is too difficult, and potentially impossible. I would have to look at two articles in the archive and say, these should be connected, and affix a binding string between them. If, however, I use pattern-based implied linkages, I need never worry about it. One of them, as already pointed out, is the similarity in super/minor numbering. I can know correlations by numeric similarities at the simplest level, and have embedded mathematical relationships between the numbers to provide secondary and tertiary meanings. This allows the archive to literally blossom outward like a crystallising pool of water, governed by internal physical models. Extremely low maintenance and almost entirely retroactive. I place a lot of priority on “mindless” archival like that, because in my experience (with the volume of information I archive) if it requires more than five minutes of meta-data thought it doesn’t get done.
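A toy illustration of the implied-linkage idea: since the numbering is part of every token, related entries can be clustered after the fact with no stored cross-reference at all. The grouping key here (the bare numeric pair) is my guess at one such pattern, not a description of Amber’s actual relationships:

```ruby
# Cluster tokens by their shared numeric pair; entries with the
# same numbers across different branches (e.g. R2.1 and I2.1)
# surface together without any explicit binding token.
def implied_clusters(tokens)
  tokens.group_by { |t| t[/\d\.\d/] }
end

tokens = ["R2.1.Dream", "R1.1.Meeting", "I2.1.Cite", "I3.1.Doc"]
clusters = implied_clusters(tokens)
clusters.each { |pair, members| puts "#{pair}: #{members.join(', ')}" }
```

Grouping on the minor digit alone would instead surface the cross-branch kinships mentioned earlier (R2.1 ~ R1.1 ~ I2.1, all “observation-like”); either way the linkage is derived, never maintained.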

One thing I really like about this system is that it makes multi-year thread highlighting simple. If I have a thought one day and put it down, the moment I do I can see it snap into place with similar thoughts (Boswell shows the automated filing record) that I’ve had over the course of the decade since I started recording. Periodically, if the thread is striking enough, I’ll write a summary paper collecting all of their IDs together into a list and summarising the way in which it has morphed over the years, along with strategies for monitoring its further development. Given enough time, even these summaries become hits themselves. It’s really far too big for something like Tinderbox any more (I once had just a fraction of my archive in it, but it was too much for the linear XML format), but in a way the token system allows a form of Tinderbox philosophy in textual terms. I’ve written a bit on the topic of re-archival and retro-linking as ways of passively enriching a data set. Most people prefer in-place editing of material to enhance a data pool. Boswell forces a read-only attitude, and from that I developed the notion of enhancing data pools by reiteration and multi-faceted expansion. The core data pool remains inviolate, but is annotated outward.

I like your idea of affixing a goal signifier, but with the read-only mentality I’m worried it would dilute itself into uselessness. I could possibly use the single mutable piece of meta-data that Boswell offers, the “tag” (not at all like keywords in most apps; it’s rather more like a status).

I mainly used Journler’s labels for my top four. This made chronological list scanning extremely easy. I liked it so much that I’ve adopted the system in the Finder as well. I have Hazel set up to look for {M et cetera and assign the colour label appropriately. I found the tagging system kind of redundant, personally, but the Category field I did use.

Hmm. I’ve said this before and I’ll say it again. I kind of recommend it. I’ll put it this way:

On the positive side: I have not found a single application on any platform that more aptly approaches the problems of personal archival. Boswell just gets it. In doing so, it does some things that are profoundly weird. It can take quite a lot of fiddling and making mistakes to “get” the application. I know I barraged poor Will Volnak with page after page of feature requests and modifications, trying to get Boswell to work like everything else does, in short. But he was gentle about it and said: no, you just don’t get it yet, keep trying. Well, I paraphrase, but that was the gist. After about a quarter of a year of steady usage I did finally get it.

It’s profoundly strange not being able to delete things. You make a mistake, you want to get rid of it. But Boswell rightly asks: why? Drives are big, text is small. Why not just ignore it forever (there is a mechanism for doing so; it’s a bit like telling Spotlight to ignore a file while keeping it out of the Trash: it is there, but you’ll never come across it when searching. Or you can remove it from everything and it sits in the trash, ignored), and if for some reason what you thought was a mistake actually wasn’t a mistake, four years from now you’ll still have it.

On the notion of filing: it combines the strengths of every system I’ve seen. It recognises the weaknesses of dynamic smart folders and of static folders, and combines the two into a single concept: a folder that gets stuff added to it by filters automatically, but that you can prune or add to yourself. Adding only happens once, when you archive; Boswell never dynamically touches a single thing in your database after that point. It’s a simple concept, and the only thing I know that does anything like it is Gmail’s Label and Filter system. What Gmail lacks, though, is a functional search as I described. Boswell’s Manager is the archivalist’s (my spell checker suggests this should be Archivolts, to which I must emphatically poetically agree) dream: perform complex multi-point searches across manually assembled data sets and run functions on them, adding richness and increasing the future importance of the data.
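To make the archive-once idea concrete, here is a minimal sketch of how I read that filing model. This is my illustration, not Boswell’s actual code, and all the names (`archive`, `notebooks`, `filters`) are invented:

```python
def archive(entry, notebooks, filters):
    """File `entry` into every notebook whose filter matches.

    Filters run exactly once, at archive time. After this call nothing
    re-evaluates them, so each notebook stays static and can be pruned
    or extended by hand without a smart folder undoing the change.
    """
    for name, matches in filters:
        if matches(entry):
            notebooks.setdefault(name, []).append(entry)

# Hypothetical usage: one filter that files anything mentioning "OS 9".
notebooks = {}
filters = [("Mac Notes", lambda text: "os 9" in text.lower())]
archive("Some thoughts on OS 9 and Carbon", notebooks, filters)
archive("Grocery list", notebooks, filters)
# notebooks == {"Mac Notes": ["Some thoughts on OS 9 and Carbon"]}
```

The key design point is that the filter is a convenience at intake, not a live query: editing the notebook afterwards is safe because nothing ever re-runs the filter over the archive.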

I’ve said it elsewhere, what you get with Boswell is not so much an application with features, but an application that is built in a concise and efficient manner around a core concept. Everything about it supports that core concept and there is nothing superfluous about it. In a way, it is quite spare; but the stuff that is there is a nearly perfect symphony. There are only a few very minor things I would tweak about it if it were my code. We are talking little quibbles that most people would never even notice.

In the grey area: a lot of people don’t work in text archival. I do, because I am a strict MultiMarkdown user. I never archive results, only source documents. I say never; sure, every once in a while I’ll link up a PDF using a text document, like a library card file, in Boswell. Again, that is easy to do with MMD, since it supports file-system level links, and with my ID based system it doesn’t matter where the physical file lives relative to the archived document. So text only is a bit of a limiting factor—and I do mean text only. It has some simple OS 9 era formatting in it, but that formatting doesn’t get out: all it exports is raw plain-text. That is asking too much of the many people who are bred on the rich ambrosia of RTF these days. :slight_smile: For people like me, though, it’s a feature. You mentioned elsewhere that most of your work is in PDF and RTF. You’ll probably find Boswell way more work than it is worth.

Then there are the negatives, and some of them are pretty substantial. As AndraesE pointed out, the current version is quite old—built to run on OS 9, as a matter of fact—and this has more implications than simple future-proofing. Actually, there has been a steady stream of beta releases past the 2005 date you saw, but by and large it is an old program running in an ancient toolkit through Carbon and, on a modern computer, through Rosetta. There are translation problems going through that many filters. The program is solid as a rock on OS 9, and pretty solid on PPC Leopard, but toss Intel into the equation and it can be catastrophically unstable at times. As in, your whole computer just shuts down, due to deep bugs between Apple’s Rosetta engine and this toolkit. I honestly don’t use it much on my Intel laptop, but I use it heavily on my two PPC computers. No real problems there beyond a few quirks with copy and paste.

Will it run in Snow Leopard? I don’t know. It is quirky but functional on PPC Leopard, so PPC users are set—Snow Leopard is the end of the road for us on that score anyway. It’s already marginally unstable on Intel, and unless for some reason Apple decides to fix their broken Rosetta code for people running OldStuff, I doubt Snow Leopard will change anything. Apple doesn’t fix old bugs, if you haven’t noticed. They have an attitude, in general, about released stuff being obsolete stuff. Just look at the weak text engine we all have to live with: minor bugs getting slowly fixed over the course of entire operating system releases.

So all of that said, the ball is in Copernican’s court. I can say that they are not dead, Boswell hasn’t been abandoned, and it will not always be something written in an ancient toolkit for people running a 12 year old operating system. :slight_smile: Before I realised this, I was actually well on my way to writing my own clone from scratch. I’d rather have learned an entire programming language than use some other solution.

Something to keep in mind, in balance with what I’ve said above on the pro side: for the most part, this program was done. The core philosophy had been addressed. Sure, there were little things here and there, bug fixes, but the slow release cycle is really a testament to the fact that it was a cohesive application that had reached its zenith.

So do I recommend Boswell? I recommend you play with it, if only to see the possibilities and perhaps garner some inspiration for your own system. The demo is rather limited, with only 15 entries, which is hardly enough to get an idea of how it really plays. Where DEVONthink requires 10,000 files to do anything beyond nominally mimicking the Finder, Boswell requires the user to process maybe 500 entries before the philosophy all clicks together—at least it did for me. Perhaps my explorations, and my explanations of them, will give anyone else reading this a head start. But it is quirky. It is old and cranky. It probably will crash on you. It looks like a fossil. These are all sacrifices I’m willing to make, though: every other system out there feels philosophically fossilised, working on a metaphor that was never meant for personal archival, and at times only dimly grasping some of the basic concepts that Boswell expresses with near symphonic purity.

That is definitely how it has worked out for me. I should say, the combination of the four broad-stroke possibilities with the date stamp is a knife that cuts through the haze of time. When I was using a purely file-system-based setup, I cut everything up into 90-day blocks, with some change left over at the end of the year. So: 09090, 09180, and so on. Each folder represented a broad stroke of a year, rather than the smaller slice of a month. Five years from now, I’m not likely to remember whether I did something in March or April of this year, but I am likely to remember that it was somewhere in early spring. That narrows it down to the tail end of the 090 folder and the front part of the 180 folder.
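If I’ve read the naming right, the folder for a given date can be computed mechanically. This is a sketch of my interpretation; in particular, folding the year-end leftover days into the last 360 folder is my assumption, based on the “some change at the end of the year” remark:

```python
import math
from datetime import date

def archive_folder(d: date) -> str:
    """90-day folder name: two-digit year plus the day-of-year rounded
    up to the next multiple of 90 (090, 180, 270, 360). The last folder
    absorbs the five or six leftover days at the end of the year."""
    day_of_year = d.timetuple().tm_yday           # 1..366
    bucket = min(math.ceil(day_of_year / 90) * 90, 360)
    return f"{d.year % 100:02d}{bucket:03d}"

# Early spring 2009 straddles two folders, as described above:
archive_folder(date(2009, 3, 20))   # '09090' (day 79)
archive_folder(date(2009, 4, 10))   # '09180' (day 100)
```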

Plus, I put IDs on everything, not just files. My system gives me a new unique ID every 86.4 seconds. That is usually time enough, and some documents are scattered with multiple ID signifier codes so that specific notions can be tracked and referenced across time. I don’t care if I never use them; I’d rather assign the number up front and never use it than go back later and wish I had an older number (and if I don’t have one, I just use the current time; no big deal, I’ve got a lifetime supply of them).

Thank you for pointing this out, Amber. I researched the Noguchi Filing System online and came across a post by William Lise that is very helpful, especially considering that none of Noguchi’s books have been translated into any language I know. Apparently Lise had taken his post off the web (there are lots of dead ends), but he reposted it early this year.

Link: http://www.lise.jp/colleagues/noguchi.html

Anyway — this is a very interesting concept: organizing your files the way your brain works (i.e. in chronological order) and not the reverse (tagging, filing in folders by category, etc.). I think I’m going to try it and see if the work flows. :slight_smile:

Exactly. This is where I got the idea to use the Finder’s colour labels to shorten scan time. The principal modification made in the index card derivative is that resources are not re-ordered according to access time, but are instead marked as accessed again, to decrease scan time. The original chronology is thus not disturbed. Some of these concepts can be relayed to the digital realm with great results.

Updated: to expand on this a bit, since I originally wrote it on an iPod. The index card filing method[1], which is based on Noguchi Filing, has three key differences. The first is that it is designed for thought filing as opposed to arbitrary paper filing. You have an idea of some kind, you write it down on an index card, and then use a marker to accomplish what the coloured tape does on a manila envelope: the position of the mark indicates one of the four super-categories. Each card is titled and dated at the top, and the rest is free-form. Cards are placed into a card file in strict chronological order of generation. The second difference is that when you pull a card later on, you do not refile it at the front of the system as the Noguchi method prescribes, but return it to where it came from, keeping the original chronology intact. Instead, and this is the third difference, you use the marker to make a small dot on the right-hand side of the card to indicate that it has been pulled. If the card gets pulled again you add another dot; the designer of the system recommends up to four dots, as any more is redundant once a card is obviously frequently used.

When all of the cards are placed into a stack or card-drawer, the markings along the top are visible. It is possible to quickly scan down a huge stack of cards and find something by date and super-category based on the staggered positioning of the left-side indicator. It is also possible to, at a glance, see which cards are frequently accessed based on right-side dots.
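The card mechanics above reduce to a very small data model. Here is a sketch with invented names, just to make the two kinds of marks concrete; it is my translation of the paper system, not anyone’s actual software:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Card:
    """One index card: filed chronologically and never re-ordered."""
    title: str
    created: date
    category: int   # 0..3: which of the four super-categories the
                    # position of the left-edge mark indicates
    pulls: int = 0  # right-side dots; capped at four, since more
                    # dots add no information

    def pull(self) -> None:
        """Record an access without disturbing the chronology."""
        self.pulls = min(self.pulls + 1, 4)

card = Card("Noguchi notes", date(2009, 5, 1), category=2)
for _ in range(6):
    card.pull()
# card.pulls == 4  (pulls beyond four are redundant)
```

Scanning a stack then amounts to filtering a chronologically ordered list by `category` and sorting or eyeballing by `pulls`, which is exactly what the staggered marks let you do on paper.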

I tried the analogue system for a while, but did not care for what I call Index Card Syndrome: that is, sub-consciously tailoring and abbreviating your thoughts so that they fit onto a single card. :slight_smile: It also felt kind of wasteful, as I’d go through on average about fifty cards a week. Fortunately, the principles of this system are all fairly easy to relay to the digital file system, as elaborated on in the posts Douger was kind enough to dig up, below.

[1]: Which can be found as blogged, here

So I just thought I’d wade into the middle of this thick conversation to let everyone know what I think of DEVONthink and DEVONagent after waffling around with it for a few days.

I really love Agent, since I no longer have to manually monkey around with online search engines. It looks very promising with its user-configurable search algorithms and archive capabilities.

DEVONthink has really freed me up to focus on research with as little distraction as possible, by letting me keep my RSS feeds from Times, my email archives, and my exported Scrivener notes all in the same app. I particularly like being able to do writing in DEVONthink, as doing so narrows down my working environment. My WIP in Scrivener was actually starting to get a little distracting, having all my stuff right there for me to look at all at once (or the temptation to do so, anyway).

For me, research and writing are two totally different animals (like mixing and mastering are) and I like the fact that I can use good software to separate them into different steps of the larger process.

For anyone else who trips over this great thread, here is a link to one of the expansive 2007 posts. It contains definitions of the codes, usage, concepts, etc …

http://www.literatureandlatte.com/forum/viewtopic.php?f=15&t=2092&st=0&sk=t&sd=a&hilit=file+system+archival

I sift through so much swiftly decaying information each day it’s wonderful to find a thread like this. :slight_smile:

Well, since Amber revived the old thread, I should update what I wrote there. I no longer use Scrivener as a database. Like a couple of others hereabouts, I started experiencing some serious slowdowns, and even a crash or two, as my Scrivener database grew. I’m not sure how many words were in there, but it was almost a year’s worth of press releases, notes, and articles. Keith said that a file that size shouldn’t choke Scrivener, yet it was happening, for whatever reason. I also started wanting to access my files remotely (via Dropbox) from my new iPhone. Plus, I wanted to be able to search by keyword and get a specific file, not a result that just dropped me into a huge Scrivener file that I’d then have to search again. And Keith has often said that Scrivener was never intended to be used as a database.

All those factors, and my natural inclination to simplify, impelled me to start keeping my research in the form of rtf files, read in Bean or TextEdit. Then, when it comes time to draft an article, I import the relevant rtfs into Scrivener and proceed in the usual way. So instead of one database that holds a year’s worth of articles for each publication, I have a separate Scrivener file for each article, even the short ones.

So now my database application is the Finder. Everything works much faster this way. And when my MacBook was stolen last month – only a couple weeks after I exported all my Scriv research to rtf files – I was able to retrieve my files from my Dropbox account and read them on my iPhone, which came in very handy when I had to write a few short pieces on deadline in the several days before I was able to replace my Mac. I just feel more secure by storing my research in a form that’s easily readable with any number of applications.

I’m doing the same thing with my book in progress: collecting all research info in rtf or text files. When it comes time to start drafting chapters, we’ll see how Scrivener handles the massive load. I’d love to be able to put the whole book in a single Scrivener project, but if it struggles with that volume of data again, I’ll make a separate project for each chapter. I suppose if I need better search capability than a standard Spotlight or EasyFind search, I can use Spotlight comments or a free app like Tagit.

As I said in the earlier thread, I was a happy user of DevonNote before Scrivener arrived, but now I’m not sure why I need any database or info management apps beyond the Finder/TextEdit combo (for my general research archive) and Scrivener (for each individual article/book/project). I’d love to hear what other users of these info drawer apps think about this strategy, and what they offer that it doesn’t.

I have a very similar workflow (notes in individual txt or rtf files). I store them all in EagleFiler. This provides more sophisticated search functions and, most importantly, a very convenient way to view the search results, with the search terms highlighted. Unlike in a Finder-based system, I do not have to open each file that Spotlight would have found and re-search within it. Added benefits: robust data integrity checks and flexible OpenMeta tagging support. Finally, the files are stored in Finder folders, so I am not committing them to a proprietary database.

This comes up every so often, so you might do a search for some of the other threads on the subject.

I use DevonThink Pro because the Finder is inadequate for the volume of data I’m trying to manage. (My main DTP database exceeds 3.5 million words.) DTP’s search and classification functions are much more sophisticated.

I don’t use Scrivener because many items are used again and again in multiple projects. Having to keep everything in one .scriv document would defeat the purpose of Scrivener, but otherwise I’d always be trying to figure out where this or that item was stored. With everything in DTP, I don’t have that problem.

Katherine