Is there a way to search for (for example) the word "accounting" in a PDF file?

Thanks in advance for your time and help, and happy New Year.

There is a way. If the PDF has a so-called ‘text layer’ in it, usually put there by the creator of the PDF or added after the fact by running OCR on it, then in almost all PDF viewers you can press “ctrl-F” or use the search menu button to do the search.

If you can select words within your PDF as it shows in the editor in Scrivener, then you can also search for words within it directly in Scrivener using cmd-F. The ability to select words within the PDF just establishes that the PDF has, as rms says, a text layer, and that the pages are not just images of text.

I apologize to @rms and @gr for not expressing myself correctly.
If I look for the word accounting, I get a very long list of results. Too long to sift through. The results include PDFs which I always OCR.
I was looking for a way to search for the word accounting located exclusively in PDF files/docs.
thank you and again very sorry

If you mean searching a PDF from within Scrivener (a PDF you have placed in the project somewhere), the answer is not to do that. Open the PDF in an external editor and search it in an app designed for PDFs.

thank you but no: I want to search through hundreds of Scrivener docs to identify those which 1- are PDFs and 2- contain the word accounting.
thank you

A Spotlight search on the relevant folder(s) with those two criteria, called from Finder, should do the trick. There are lots of articles on the net, e.g. “How to run advanced Spotlight searches in the Finder”.

good idea. thank you

I tried that just now and found Spotlight did not search the contents of packages, but Find Any File does.

Interesting. I assumed without checking. oops.

Another option (untested, as I am not at the computer, but the OP could try): use the “grep” command in a terminal window. My hunch is that it can enter the subfolders in the packages. It is already available on all Macs.

I am much too old to tackle grep to any extent. You’d have to tell it to search only PDFs, only in the packages, only if a word appears. If you can show us how, that would be instructive.

Maybe I will give it a go tomorrow, when I am back at the computer. Those *nix commands have stood the test of time; they are an under-used resource for so many people and nowhere near as “dangerous” as some fear. Consider me challenged to try.

I think there is not a way within Scriv to do a search constrained by file type, but you can restrict your search to “Binder Selection Only”. Perhaps this helps.

thank you @drmajorbob @gr and @rms for all your comments on New Year’s Eve!
My question was basically about Scrivener search syntax, as in entering “accounting filetype:pdf” in the search box, but this does not appear to be possible.
As far as folder searches are concerned, my favorite is HoudahSpot, which has many options, the ability to create templates, etc.
Happy New Year to all of you

Well, as my first techy task in the New Year, I couldn’t make grep work either. Apologies for suggesting things when I didn’t really know what I was talking about. It was a surprise to me that Spotlight didn’t work with packages, but there you go.

Grep can’t, as far as I can tell, look inside PDFs to detect human-readable text. I had expected to see the PDF text layer available for searching, but no. It works great, of course, for text files. And it found stuff in the Scrivener file ‘search.indexes’, but I didn’t pursue how that file is organised.
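
A likely explanation, offered as an assumption rather than something checked on these particular files: PDFs commonly store page text in Flate (zlib) compressed streams, so the word never appears as plain bytes for grep to match. Here is a minimal Python sketch of the effect, using a made-up content stream rather than a real PDF:

```python
import zlib

# Made-up page content resembling a PDF text-drawing stream (illustrative only).
page_text = b"BT /F1 12 Tf (accounting ledger, accounting report) Tj ET\n" * 20

# PDF writers commonly deflate such streams before storing them in the file.
stored_bytes = zlib.compress(page_text)


def raw_search(data: bytes, word: bytes) -> bool:
    """Mimic a plain grep: look for the literal bytes of the word."""
    return word in data


print(raw_search(page_text, b"accounting"))                      # stream before compression
print(raw_search(stored_bytes, b"accounting"))                   # bytes as stored on disk
print(raw_search(zlib.decompress(stored_bytes), b"accounting"))  # after inflating again
```

The word is found before compression and after inflating, but not in the stored bytes, which is why a tool has to decode the PDF first instead of scanning the raw file.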

From internet searches I found there are techniques to search PDFs with convoluted use of “find” and “grep”. There is a small utility called ‘pdfgrep’ out there, which I did not download and try. I also did not purchase “Find Any File” to give it a try, but it looks interesting and I’m glad to hear it works. When I actually need to do this sort of searching, I’ll get it.
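
For anyone who wants to try the “only PDFs, only in the packages, only if a word appears” idea without the find/grep gymnastics, here is a rough Python sketch. It rests on assumptions: that Poppler’s `pdftotext` command is installed (it is not part of macOS by default), that the PDFs inside the project package keep a `.pdf` extension, and that they have a text layer. The function names and the example path are made up for illustration; none of this is a Scrivener feature.

```python
import os
import subprocess
from typing import Callable, Iterator, List


def iter_pdfs(root: str) -> Iterator[str]:
    """Yield every file under root whose name ends in .pdf (any letter case)."""
    # os.walk descends into package folders such as .scriv like any directory.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".pdf"):
                yield os.path.join(dirpath, name)


def pdftotext_extract(path: str) -> str:
    """Return a PDF's text layer using Poppler's pdftotext (assumed installed)."""
    # The trailing "-" asks pdftotext to write the text to standard output.
    result = subprocess.run(
        ["pdftotext", path, "-"],
        capture_output=True, text=True, errors="replace", check=False,
    )
    return result.stdout


def pdfs_containing(
    root: str, word: str, extract: Callable[[str], str] = pdftotext_extract
) -> List[str]:
    """List PDFs under root whose extracted text contains word, ignoring case."""
    needle = word.lower()
    return sorted(p for p in iter_pdfs(root) if needle in extract(p).lower())


# Hypothetical usage; replace the path with a real project or folder of projects:
# for hit in pdfs_containing("/Users/me/Documents/MyBook.scriv", "accounting"):
#     print(hit)
```

The extractor is passed in as a parameter so that a different tool can be swapped in if `pdftotext` is not available. The paths printed are the internal file names inside the package, which may not match the titles shown in the Binder.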

The journey to try to make ‘grep’ work was not a complete waste of time, as I did some poking around the package contents of a Scrivener project. I learned how they store the drafts and research files. Clever, and not what I expected compared to how other apps do similar things. So worth the time.

HoudahSpot can search for text inside a Scrivener project, but I don’t know if it can search only pdfs – I don’t keep pdfs in Scrivener projects. I use DEVONthink for that kind of thing. I find it a much more flexible way to work.

Yes, I too keep all research for Scrivener projects in DEVONthink. This abortive exercise to search PDFs in a Scrivener package showed me that the PDFs in Scrivener are buried and hidden from mere mortals like us. Their approach works well for most Scrivener writers, but I prefer the comfort of knowing the research PDFs are unchanged in DEVONthink. Just me.

good point, thank you

In the kind of work I do, it is highly unlikely that any pdf would only be relevant to a single project, therefore to me it makes no sense to store them inside the project. I regard the Research folder as a place where I keep links to research material, not the material itself. And even if a pdf were relevant only to a single project, I’d sooner keep it in a central “library” with all my other material. You never know when it might come in useful again, and then again you might stumble upon it by chance and realise it is still useful. If it is hidden away inside a Scrivener project, the chances of stumbling across it are almost none, particularly if you archive and zip up old projects. With the method I use, I can find a pdf (even accidentally) using Bookends, HoudahSpot, DEVONthink, DEVONsphere, the Finder and various others. That is partly what I mean by flexibility.
