Organizing pdfs AND word files?

I am a die-hard fan of Scrivener, and can’t praise it enough. I am also impressed by the knowledgeable folk who hang out in the user forums. I really need help! Here’s the situation: I am about to dive into a new non-fiction (academic) book project. To make the project as efficient as possible, I need to solve two problems: how to organize a huge number of pdf files with relevant research, and how to organize a lot of old Word files. I am looking for ideas about the best software for truly keeping tabs on accumulating research materials, notes, and old drafts and papers written in Word. I don’t want to keep all this stuff in Scrivener; I will only import relevant materials into Scrivener as needed.

I have looked at all kinds of organizers for pdf files (Sente, Papers, etc.). So far, I am unimpressed. I work in the humanities, and often scan a few pages or a chapter from a book. These programs are not good for home-made scans. They are mostly geared towards disciplines which work entirely by downloading new research papers from online sources. Of course, I do that too, but it’s by no means my main source of information. Also, these programs don’t deal well with my own Word files, which MUST be organized for this new project to work. Finally, I don’t really need a bibliography program; I have EndNote for that, which I am very happy with. But EndNote doesn’t let me organize, cross-reference, view, and annotate both pdfs and Word files. Is there ANY program out there that you all would recommend? If I need to, I will happily use two different ones, one for all kinds of pdfs, and one for Word files. What I am trying to achieve is a system where I can quickly find themes, names, ideas in all this disparate material. Should I try a straightforward database? I have never used Bento or FileMaker, and have never tried DevonThink (for some reason). I would prefer a system that doesn’t have too steep a learning curve.

Sorry to go on for so long, but it has become clear to me that I can’t write this book unless I find a really good way to organize all my materials. I do hope you Scriveners can help!

I would suggest you look into PersonalBrain, since one of your stated needs is creating relationships among your PDF and Word documents. Nothing, in my experience, does that better than PersonalBrain.

What you would do is attach your PDFs and Word documents to Thoughts in PB. If you get the pro version, you can attach multiple files to each Thought or you can have a different Thought for each document (a Thought is just PB-speak for an individual item).

You can use the Note feature for annotations. You can also Tag and Type Thought entries for further classification.

PersonalBrain can import these documents into its database (or Brain, as they call it) or it can map existing folder structures providing what they call “Virtual Thoughts.”

I wrote a fairly extensive review for Mac.Appstorm, which you can read here:

mac.appstorm.net/reviews/product … formation/

And feel free to ask me questions. (BTW, I have no affiliation with the company that makes PB, in case I sound too much like a PR flack.)

Steve

DevonThink Pro.

Don’t be intimidated by the learning curve. While there are lots of advanced things you can do with scripting and such, you can get started by just pulling in a bunch of documents and sorting them into groups. Once your group structure is set up and populated, the Auto Classify feature can sort the rest of the material for you, and the fuzzy search and See Also features make sure you can find it again.

I personally do NOT recommend Personal Brain. I love the idea of it, but find that its interface doesn’t scale to the number of items I’d need for any substantial project.

Katherine

Steve and Katherine:
Thank you for your quick and thoughtful postings! Personal Brain does look wonderful. Steve, your review of it is a masterpiece! Yet it looks as if it will require me to have an idea of where in my thinking each bit of material will fit in before I can file it properly. Which means that it probably isn’t as useful to my situation as something that would help me to put away a lot of files without quite knowing what I want them for yet. The key for me is to be able to retrieve the information when I look for it, and then ideally organize it some more as I realize what I am really doing in my book. I am willing to tag, to use keywords, etc., but it would be especially excellent if the software could also search the text itself (a little like Evernote does, except Evernote is hopeless for major research organization; it’s great for quick research for short pieces, though, particularly online research). I will now study DevonThink Pro. (Someone told me that whatever you put into DevonThink Pro won’t ever come out again: that it’s forever transformed into that specific format. Is that true? That’s not a dealbreaker for me, I just wonder.) If anyone else has any ideas, do let me know!

Another vote for Devonthink. See these threads for more info:

viewtopic.php?f=2&t=13678&p=98738&hilit=devonthink#p98738

viewtopic.php?f=18&t=13712&p=98945#p98945

I had 3.1 million words and many hundreds of documents, pdfs and images in my last project database. DT undoubtedly helped me to understand the subject, and I found that the database “morphed” as I gained better understanding. I’ve never tried Personal Brain, so I can’t comment. But I do know that DT handles massive amounts of data with ease.

DT isn’t all that difficult once you grasp a few central concepts – for example, a group is really the same as a folder, and gives all the files inside it a tag that is the same as the name of the folder, unless you tell DT not to do that – and a killer idea is replicants, which just means that the same file or document can appear in as many folders as you like at the same time, so a document that refers to Freud, Lacan, and Jung doesn’t have to be put only in the folder for Freud – it can appear in the folder for Lacan and the one for Jung as well. I found that moving the files around in the database was what helped me to work out the structure for what I was going to write. But I ramble.
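For anyone who finds the replicant idea easier to see in code, here is a minimal sketch. It is only an illustration of the concept, not how DT is actually built: the `Database` class, its method names, and the file name are all made up for the example. The point is simply that one document can sit in several groups at once, and that the group names then work like tags on that document.

```python
from collections import defaultdict


class Database:
    """Toy model of groups and replicants (hypothetical, not DT's real design)."""

    def __init__(self):
        # Maps each group name to the set of documents filed in it.
        self.groups = defaultdict(set)

    def replicate(self, document, *group_names):
        """File the same document in every named group, without copying it."""
        for name in group_names:
            self.groups[name].add(document)

    def groups_for(self, document):
        """Return the names of all groups holding the document.

        Because a group's name acts like a tag, this is also the
        document's list of tags in this toy model.
        """
        return sorted(name for name, docs in self.groups.items() if document in docs)


db = Database()
# Hypothetical file name, used only for illustration.
db.replicate("transference-notes.pdf", "Freud", "Lacan", "Jung")
print(db.groups_for("transference-notes.pdf"))
```

The document exists once but is reachable from all three groups, which is what makes it painless to reshuffle the structure later.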

Best, Martin.

Edit: the stuff about things in DT being trapped there is rubbish. A DT project is a package, like a Scrivener project. It is essentially a folder with loads of other folders and files inside it. You can open it and look inside it from the Finder at any time: just right click on the icon and choose “Reveal Package Contents” from the contextual menu.

And if you want something that will grow with you and help you to understand your material I would strongly recommend DT – that is EXACTLY what it did for me.

Martin, a very convincing vote for DevonThink. Particularly given your experience with handling tons of research! I do understand about packages, clearly the person who talked to me didn’t!

Ellida,

You don’t say what your subject area is, but if I may be so bold as to make a few suggestions –

if themes are going to be important in your work, you can set up groups for each of the themes (Social Constructionism, Cultural effects on cognition, etc.) and bung anything that looks like it belongs with that theme into the group (i.e. folder). Because of “replicants” a document can appear in any number of groups if it is relevant to various different themes (I find this VERY useful).

another way is to have a group / folder for every chapter, and put everything that relates to the chapter in that group – and once again, a document can appear in many groups if it is relevant to different chapters.

Learn to use the “See Also & Classify” side bar – very handy for making quick connections between stuff.

I use Bookends for all my pdfs of articles, and that uses an attachments folder where it manages and renames them after download. It is possible to index this folder (rather than importing the pdfs into DT) so that they stay where they were in the Finder and Bookends can manage them, but you can search them and read them from inside DT, because DT reads all the text of the pdfs into its database (if the pdf has a text layer). I would guess you can do the same with Sente [Edit: sorry, I had a brainstorm, you said you were using EndNote. There are lots of good ways of making DT and Sente talk to each other, and I imagine also for EndNote]. Have a look at the help file on indexing if you are not sure about it. And once you have the files indexed in DT, you can move them around and replicate them inside DT without disturbing them on the HD. Also VERY useful.

The AI will not work to begin with. You have to give it something to feed on by putting stuff into appropriate folders by hand to begin with. It will learn from that. There is some argument about whether it is best to have short documents and very specific folders with not many files in them, but I think you will find that part of the growing process – the database will morph with you as the project takes form.

Don’t be afraid to rearrange stuff – it’s a sign of developing insight (or in my case, total inability to understand what I’m doing).

Use the DT forums – there are lots of very knowledgeable people who will share their expertise. Like working with Endnote: devon-technologies.com/scrip … it=endnote

Best of luck with it, and if you need more help, let us know. We’ll be delighted to share our ignorance!
Martin.

Edit: also, learn to use some of the more sophisticated searches – being able to look for the word “decision” when it is within ten words distance of “making” (or some such) is amazingly useful at times.
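To make the idea of a proximity search concrete, here is a rough sketch of what "within ten words of" means. This is not DT's search syntax or its implementation; the `near` function and its parameters are invented for the example, and it assumes simple English text split on letters and apostrophes.

```python
import re


def near(text, first, second, max_distance=10):
    """Return True if `first` and `second` occur within `max_distance` words of each other."""
    # Lower-case the text and split it into plain words.
    words = re.findall(r"[a-z']+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    # Compare every occurrence of one word against every occurrence of the other.
    return any(
        abs(i - j) <= max_distance
        for i in first_positions
        for j in second_positions
    )


sample = "The decision was hard, but making it was harder."
print(near(sample, "decision", "making"))                   # the words are 4 apart
print(near(sample, "decision", "making", max_distance=3))   # too far apart for this limit
```

A plain search for both words would also match a document that mentions "decision" on page 1 and "making" on page 40; the distance limit is what filters those out.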

I’m not trying to talk you into PersonalBrain, because if you find DevonThink works for you that’s an excellent application. I just want to make a couple of points about your objections to PersonalBrain. First, if your documents are stored inside the brain, the search will find hits within PDFs and Word documents. You’ve got to use the search tool (there’s a tab for it among the other tools), and not the quick find – though you type the search expression into the same box. This isn’t as efficient as DevonThink, but it is effective.

As for needing to know where to store a document at the time you save it, there are a couple of techniques to mitigate this issue:

  1. You can create a tag called “unsorted” and pin that to the top of the plex so that you can quickly make that the Active Thought, then drag and drop your files under that until you find the right place for them. (This was suggested by one of the commenters to my review.)

  2. You can establish known parent categories for your research that can stay alive even after you further categorize each item. For example, start with a thought called “source,” then create child thoughts for each of the sources, then, of course, store the documents as individual thoughts under the appropriate source. Add tags that provide further detail about the contents. Then, in another area of the Brain, start building the structure of your work and link to your existing research as the hierarchy grows.

Anyway, I just don’t want to leave people with the impression PersonalBrain is somehow rigid and unable to accommodate a variety of workflows. On the contrary.

Steve

I would agree that it is essential to find a program that feels right. An application can be the best program in the world, but if it doesn’t quite do what you want, or work in the way that you prefer, it will just hold you up. So I would agree that it is important to experiment before committing oneself.

Martin.

Dear Steve and Martin,
I will devote my spare time for the next couple of days to exploring both Personal Brain and DevonThink. There really is no substitute for hands-on trial and error. If anyone else wants to recommend something, I will take a serious look at that too: I really need to get this project off the ground in the right way!

Another vote for DevonThink Pro. Actually, I recommend DevonThink Pro Office (DTPO) if you are going to include your own scans. DTPO includes OCR software that can make scanned PDFs “readable”. I use this a lot with my own scans and the occasional downloaded academic paper that is just an image file. Incredibly helpful.

As others have mentioned, the See Also feature is great and the Replicate feature is invaluable. Not to mention 1-click adding to DTPO and system wide shortcuts for capturing those transient thoughts or intriguing text snippets (or images, PDFs, movies, web pages, etc).

DOH! If only I’d thought of that 12 months ago!! :unamused:

I probably only use 1/20th of the capabilities of DTPO, and I know I don’t use it as much as I could. Even so, although I also use Papers and EndNote, I keep returning to my research database in DTPO whenever I want to look up a reference or find an obscure quote, or… well anything that involves a bit of searching or comparing. Key point is that even when under-utilised it is still helpful.

Especially if you live in a browser, you might check out Zotero (zotero.org/). I used it a bit and kind of liked it. For what I was doing, the “lock in” of a tool bothered me, but for your type of research, it might be just the thing.

If you only knew how long it took me to think of it :smiley: .

I don’t think Zotero would be of much use to Ellida, because Endnote already handles bibliography. I dabbled with Zotero and never got on with it, myself. But I don’t much like Firefox, which may have had something to do with it. Overall, however, I prefer to have my references on my hard disc, not out there in the wild.

Cheers, Martin.

This used to be true of the earlier version of DevonThink, but isn’t of the latest version.

Steve

I’m in no way trying to dispute Katherine’s experience with PersonalBrain, but do want to provide a different perspective. I find scalability one of PersonalBrain’s greatest attributes. Where I feel swamped by data in DevonThink, PersonalBrain helps me keep it all under control by allowing me to stay focussed in on any one topic at a time. (Although I can zoom out for a larger perspective, if need be.) I use PersonalBrain both at my day job (where I am stuck on a PC) and for personal projects (on my MacBook). I have just one PB database at the office that holds information on all my different projects and aspects of my job. I’ve got Word documents, PDFs, Excel spreadsheets, e-mails, web links, notes, screen clips, contact information and much more all in this one database. It handles all that information very deftly. Admittedly, this doesn’t make it a better choice (and I’m not arguing that it is) for an individual research project, just that scalability isn’t an issue in my own experience.

Steve

I am awed by so many helpful comments! I have now tinkered with Devonthink and Personal Brain for a day. For me, Personal Brain would be more useful for mindmapping than for heavy-duty data management. Also, I think Personal Brain will appeal powerfully to visual types, Devonthink somewhat more to the verbally oriented.

I am delighted to announce that your input has solved my problem. At least I hope so. After a day of tinkering it does look to me as if Devonthink Pro Office 2 will be exactly what I need to get my materials gathered (I must be able to scan things to the database). I think I will be able to use Devonthink to actually THINK about the materials as I try to organize them, as Martin indicates. That is extremely valuable to me, since I have years of work accumulated on my topics, which in this case are very abstract: this book will deal with philosophy and theory, which means that the organization of the material is at least half the thinking. The nature of this specific project may be the true reason why Personal Brain doesn’t appear to fit this project as well as Devonthink: I honestly don’t know what argument I want or need something for until late in the process. I need to move the item around from group to group, and change its tags many times. I also need to rethink tags and groups over and over again, to figure out what I am really writing about.

My inclination is to keep one database per project (i.e. not put research for other projects into the “book” database, but give them their own database). In cases of overlap, it looks as if it wouldn’t be too hard to put the same material in two different DT databases. Does anyone have any experience with one versus many databases in DT?

My own practice has been to have a database for each project. Some feel that this reduces the possibility of serendipity, but I find the benefits of focus on one thing outweigh this.

Glad we could help!

Cheers, Martin.

It really depends on the amount of crossover of your research materials, and whether finding more crossover is good or bad. You can easily store documents in top-level folders for each of your research projects, and then use the AI to mine all of your past projects for useful documents in any given research attempt. You can always separate your data into different databases later if you find having one large database to be problematic.

I’m going to suggest something that no one else seems inclined to do: Use Tags. Tags allow any given document to “belong” to many different topics. You can still use folders to categorize research materials according to their main subject matter, but tags allow you to note that a document touches on a number of other topics.

Also I didn’t notice if anyone mentioned the book on DT, but I highly recommend it for getting up to speed on what DT can do for you. Its title is “Take Control of Getting Started with DEVONthink 2” and it should be for sale as a PDF alongside Devonthink. Don’t let the awkward title deter you; it’s very helpful.

I don’t have any skin in the game, and you may be right that this doesn’t fit her use, but just to make you aware, I have used Zotero for filing and searching for various documents and references. It will allow much more than bibliography data, although it may not be obvious right away. I found it valuable for the ability to tag documents (I like that method of filing).

Secondly, Zotero syncs to the “cloud” optionally (which can be handy for some), but by default (at least it was back when I used it), it stores the data locally… it just happens to use the browser for an interface.

Anyway, like I said, just tossing out a suggestion and wanted to show the product in a good light since I liked using it.

Looks like Ellida has found a good solution for her use, which I’m glad… enjoy!

Bruce

With DevonThink, I’ve found it more helpful to think in terms of topic areas, not projects. Materials related to a given topic are likely to be used for multiple projects in that area. Though DTP can work across multiple databases, its search and AI tools only look at the current database by default, so there’s value in keeping everything together.

Since DevonThink can handle very large databases without difficulty – databases of several million words are pretty routine, and some users have much more than that – my suggestion would be to throw everything into one database and only split it down when/if a clear distinction emerges.

I find it helpful to use tags to highlight materials relevant to a single project, and groups to construct the topical hierarchy. That’s partly for historical reasons, as tags didn’t exist when I started using DTP. YMMV.

Katherine