PDF Workflows for Academics

I did a quick search on my Mac and I currently have 4,832 PDFs on my computer. While not all of them are academic articles, a huge number of them are. PDF continues to be one of the most versatile file types and one that I prefer when passing files back and forth. Invoices, webpages, pretty much everything can now be saved as a PDF. As you work with PDFs, here are a number of things you should know and tips you can use.


Creating PDFs from Digital Files

Many people are still under the impression that you need Adobe products in order to create and work with PDFs. While this was the case a while back, it is not the case anymore. On a Mac, you can "print to PDF" from anywhere on the computer. Simply hit print, and then hit the PDF button that appears in the bottom left corner. On PCs, in the export or save dialog boxes, you will have an option to save to PDF.


Creating PDFs from Hard Copy Files

At the beginning of my PhD studies, I already had a huge pile of articles photocopied during many library visits. Except for being stapled, they were unorganized. I had also started to prefer reading on my computer so that I could easily extract notes and organize my PDFs in my reference manager.

My solution was to digitize all of my articles (a long process but very worth it). Many new office printers can scan documents to PDF now. But if you do not have that ability, then getting a ScanSnap (which I now have on my desk) is the option for you. I purchased the smaller, cheaper option of the ScanSnap S1300, but if I had to choose again I probably would get the larger model ScanSnap iX500 as it can handle larger stacks of paper.


Apply OCR to PDF Files

OCR is short for Optical Character Recognition. It is the process of looking at an image and finding text in it. When you create PDFs from hard copy files, you are (usually) just creating images. In other words, you could not highlight any text in these PDFs, as there is no text there. OCR fixes this by finding the text and adding it to the PDF. There are several options out there for people to use.

  1. Free. The free options are not as accurate (in other words, they don't recognize the text as well), but free is sometimes all you can afford. There are several free online options, though not all of them give you back a PDF; some return only a text file. Here is one example.
  2. Paid. Several companies specialize in OCR. IRIS Readiris, ABBYY FineReader, and OmniPage are the top choices. I personally use ABBYY, but only because I use DEVONthink Pro Office, and ABBYY is built into that software. If you need full-featured PDF software that can manipulate PDFs, options like Adobe Acrobat or PDFpen on Mac also have OCR capabilities.
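
If you have a big folder of scans and are not sure which ones still need OCR, a small script can make a rough guess. The sketch below is only an illustration, not something from my own workflow: it assumes the free `pdftotext` command-line tool (part of Poppler) is installed, and the function names and the 50-character threshold are made up for the example. It treats a PDF as "scanned" when almost no text can be extracted from it.

```python
import shutil
import subprocess
from pathlib import Path


def extract_text(pdf_path: Path) -> str:
    """Return the text layer of a PDF using the pdftotext command-line tool."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext (from Poppler) is not installed")
    # The trailing "-" tells pdftotext to write the text to standard output.
    result = subprocess.run(
        ["pdftotext", str(pdf_path), "-"],
        capture_output=True,
        encoding="utf-8",
        errors="replace",
        check=True,
    )
    return result.stdout


def looks_scanned(text: str, min_chars: int = 50) -> bool:
    """Guess that a PDF is image-only if almost no text could be extracted."""
    # Ignore whitespace, including the page-break characters pdftotext emits.
    visible = sum(1 for ch in text if not ch.isspace())
    return visible < min_chars


def pdfs_needing_ocr(folder: Path) -> list[Path]:
    """List PDFs under `folder` whose text layer looks empty."""
    return [p for p in sorted(folder.rglob("*.pdf")) if looks_scanned(extract_text(p))]
```

Calling `pdfs_needing_ocr(Path.home() / "Articles")` (the folder name is just an example) would give you a list of files to run through your OCR tool of choice. The threshold is a guess, so a short but genuine text PDF could be flagged by mistake.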


Compress PDF Files

PDF files created from scans tend to be quite large, larger than they need to be, which is why you should compress them. Compression saves hard drive space, and if you use cloud backup or a cloud reference manager, it saves storage there too. I compress every PDF file I create from a scan.

I know of both free and paid ways to compress a PDF. One free option is the online tool on the Smallpdf site; check it out here. Another great free option, right on your Mac, is a service item. I created a simple one: right-click any PDF file, go to "Services", and choose the compress PDF option. I have zipped this service file for you (file here); place it in user>library>services. If you are using PDF software like Adobe Acrobat or PDFpen on Mac, you can compress with that as well.
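
For those who prefer a script to a right-click menu, here is one more free route. This is a minimal sketch rather than the service item described above: it assumes Ghostscript is installed and available as `gs` (on Windows the executable is usually named differently), and the function names are hypothetical. It builds a Ghostscript command that rewrites the PDF with one of Ghostscript's built-in quality presets.

```python
import subprocess
from pathlib import Path

# Ghostscript quality presets, roughly ordered from smallest file to highest quality.
PRESETS = ("screen", "ebook", "printer", "prepress")


def compress_command(src: Path, dst: Path, preset: str = "ebook") -> list[str]:
    """Build the Ghostscript command that writes a compressed copy of `src`."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    return [
        "gs",                        # assumed executable name
        "-sDEVICE=pdfwrite",         # output a PDF
        f"-dPDFSETTINGS=/{preset}",  # choose the quality preset
        "-dNOPAUSE",                 # do not wait for input between pages
        "-dBATCH",                   # exit when processing is finished
        "-dQUIET",                   # suppress startup messages
        f"-sOutputFile={dst}",
        str(src),
    ]


def compress_pdf(src: Path, dst: Path, preset: str = "ebook") -> None:
    """Run Ghostscript; the original file is left untouched."""
    subprocess.run(compress_command(src, dst, preset), check=True)
```

For example, `compress_pdf(Path("scan.pdf"), Path("scan-small.pdf"))` writes a smaller copy next to the original. I would compare the two files before deleting anything, since the lower presets reduce image resolution noticeably.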


Combine PDF Files

I sometimes need to combine PDF files into a single file. Again, the big PDF software mentioned above can do this, but even Preview on a Mac can do it by simply dragging and dropping. I have also created a service for combining PDFs, available at the link above. On the great Smallpdf site you can also merge PDFs for free.
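
If you need to merge files often, or in bulk, this can also be scripted. The following is a sketch under assumptions, not part of my own setup: it relies on the free `qpdf` command-line tool being installed, and the function names are invented for illustration. It builds a qpdf command that starts from an empty document and appends every page of each input file in order.

```python
import subprocess
from pathlib import Path


def merge_command(sources: list[Path], dst: Path) -> list[str]:
    """Build the qpdf command that concatenates `sources` into `dst`."""
    if len(sources) < 2:
        raise ValueError("need at least two PDFs to merge")
    # --empty starts from a blank document; the inputs listed between
    # --pages and -- are appended in the order given.
    return ["qpdf", "--empty", "--pages", *map(str, sources), "--", str(dst)]


def merge_pdfs(sources: list[Path], dst: Path) -> None:
    """Run qpdf; the input files are not modified."""
    subprocess.run(merge_command(sources, dst), check=True)
```

For instance, `merge_pdfs([Path("chapter1.pdf"), Path("chapter2.pdf")], Path("book.pdf"))` would produce a single file with chapter 1 followed by chapter 2.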


Highlighting & Notetaking on PDFs

Because of my huge love of reference managers, I highly recommend working with PDFs within that environment. In this regard, Sente, Bookends, Papers, EndNote, Qiqqa, and Mendeley can all highlight and make notes directly on the PDFs (see my posts on reference managers here).

If you use Zotero or just don't use a reference manager, both Adobe Reader and Mac's Preview can highlight and make notes directly on PDFs. On an iPad, GoodReader makes PDF-style notes similar to Preview and Reader. Or you can use something like Notability if you want to draw freehand on a PDF.


Intelligently Searching Your PDF Collection

When you want to do some serious searching within your own PDF collection, you need to go a step beyond a general search. And, while you can search within individual PDFs in Reader and Preview, having an app that specializes in searching a larger collection and presenting it well is very valuable.

For this job, I personally use DEVONthink Pro Office. Within this app, I point to my PDF collection, which is part of my reference manager library, and it can search it intelligently (DEVONthink probably has the most intelligent searching of any app). Another absolutely fantastic searcher on Mac is the FoxTrot Pro search tool. On the PC, Qiqqa is the reference manager I suggest, as it does this very well too. (If you're a PC user and know of another good app in this regard, let me know in the comments!)


Collaborating via PDF

While MS Word might work for collaborating, not everyone uses MS Word. For example, I use Mellel as my word processor, and during my PhD I had to find a convenient way for my advisors to annotate my files, as Mellel is a Mac-only app and moving from Mellel to MS Word would muck up my process. The solution I found, a.nnotate, was absolutely perfect for my needs. It allowed my two advisors to create notes and see each other's notes, and it gave me a way to quickly scroll through their notes. You need to pay for credits, but it is not expensive and was very worth it for me.

Another, newer option, which I haven't fully tested but which seems to be a nicer, slicker version of a.nnotate, is Kami. Kami is free, but there are paid advanced options as well. It also has a Chrome extension. In fact, Kami also does a lot of what has been mentioned above (OCR, combining, and splitting PDFs). I highly suggest checking out Kami if you need to do a lot of collaboration on PDFs. I certainly will be.


Is there anything I missed or any apps I should have mentioned? Let me know in the comments.

Posted by Danny Zacharias.