Zotero Can Now Do Even More with Your Citations

Zotero is a free tool for managing bibliographies and citations.1 It’s now even more useful for researchers in biblical studies. That’s particularly true if you use the styles for the Catholic Biblical Association, the Society of Biblical Literature, or Tyndale Bulletin.

Catholic Biblical Association

The style for the CBA is what you’ll see if you read a Catholic Biblical Quarterly article. Zotero has supported CBA style for some time. But per CBA’s current guidelines, the Zotero style now

  • supports custom citations specified by CBA and stored in Extra via the annote variable (e.g., annote: BDF),
  • allows series abbreviations to be stored in Extra via the collection-title-short variable (e.g., collection-title-short: NIGTC),
  • truncates page ranges per the guidance of the Chicago Manual of Style (e.g., 115-116 becomes 115-16),2
  • capitalizes English titles stored in sentence or lower case in “headline” style,
  • gives citations with a “sub verbo” locator the “s.v.” notation and those with a “section” locator the § symbol,3
  • overrides Chicago’s en dash with a hyphen when delimiting page ranges, and
  • includes a period at the end of a citation.
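To make the page-range truncation concrete, here is a minimal Python sketch of one reading of Chicago's pattern for abbreviating the second number in a range. It is not the code Zotero's citation processor runs, and the function name truncate_page_range is made up for illustration. The hyphen delimiter reflects the CBA override noted in the list above.

```python
def truncate_page_range(first: int, last: int, delimiter: str = "-") -> str:
    """Abbreviate the end of a page range, roughly following Chicago's pattern.

    Hypothetical helper for illustration only; it assumes last > first.
    """
    if last <= first:
        raise ValueError("last page must be greater than first page")
    f, l = str(first), str(last)
    # Below 100, at multiples of 100, or when the digit counts differ,
    # Chicago gives the second number in full.
    if first < 100 or first % 100 == 0 or len(f) != len(l):
        return f"{f}{delimiter}{l}"
    # Count the leading digits the two numbers share.
    shared = 0
    while f[shared] == l[shared]:
        shared += 1
    # First numbers ending in 01-09 keep only the changed part;
    # others keep at least two digits.
    min_digits = 1 if first % 100 < 10 else 2
    keep = max(len(l) - shared, min_digits)
    return f"{f}{delimiter}{l[-keep:]}"


print(truncate_page_range(115, 116))  # 115-16
print(truncate_page_range(101, 108))  # 101-8
print(truncate_page_range(100, 104))  # 100-104
```

You won't need anything like this in practice, since the style handles truncation for you. The sketch is only meant to show what kind of output to expect.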

The style also now fixes a few bugs that it had previously. These fixes include

  • correcting the output of a work cited with only editors as responsible parties from “, ed. [name(s)]” to “[name], ed.” or “[names], eds.”,
  • correcting the delimitation and spacing with volume-page citations (e.g., “1:105”), and
  • lowercasing “rev. ed.” and, if it appears other than at the start of a sentence, “ibid.”

Society of Biblical Literature

Like CBA, SBL style requires you to cite a number of resources by specific abbreviations.4

Abbreviation-based Citations

I’ve previously discussed how you could modify the SBL style in order to store and cite by these abbreviations. That was pretty messy.

Or you could install a customized style file where I’d already made that change. That worked, but it meant that you didn’t receive updates as quickly. It also meant that I had to keep reproducing the modified style every time an update came out. Otherwise, neither you nor I would benefit from the corrections that the update included.

Now, however, abbreviation-based citations are supported in the SBL style that’s in the Zotero repository.5

Commas before Locators

SBL style consistently calls for a comma before the abbreviation for “sub verbo” when you cite a source like BDAG.6 But other types of locators don’t get commas before them (e.g., section numbers or page numbers when you’re citing a multivolume reference work).7

Consequently, the style supplies a comma after the abbreviation when you select a “sub verbo” locator in the Zotero citation dialog. But the style otherwise omits one.

If you need a comma, you can include the comma as part of the abbreviation in the annote variable (e.g., annote: <i>ANET</i>,).8

Similarly, when citing signed dictionary articles, the style had been producing a comma before the locator. But SBL style calls for no comma to appear there, and that’s now the case.

Section Locators

In addition, for some time, citations with section locators had a space after § or §§ that shouldn’t have been there (thus, e.g., “§ 105” rather than “§105”). That’s now fixed too.

So, if you cite a grammar, you can just choose “section” as the locator type. You no longer need to drop in § or §§ as the first characters in the locator field.

Just choose a “section” locator and enter the sections you’re citing. Zotero will take care of the rest.

Quotation Marks with Sub Verbo Locators

When citing lexicon entries from sources like BDAG or HALOT, SBL style wants the head word to come in quotation marks. The Zotero style will automate this behavior if you select the “sub verbo” locator type in the citation dialog box.

Support for Identifying Sources as Physical

When you have an electronic source that’s identical to its print counterpart, SBL style generally treats the citations identically.9

In such cases, you give no DOI or URL in the citation because you’re citing a print-equivalent source. But in other styles—like that for the Tyndale Bulletin—you need to include a DOI or URL for a source whenever possible.

One solution is to add or remove DOIs or URLs from your Zotero library as needed for a given style. But that’s entirely unnecessary busywork.

Even if you have a DOI or URL stored for a given record, you can get the SBL style to suppress that information. To do so, just enter dimensions: yes in Zotero in that record’s Extra field.10

That way, you’re telling Zotero to treat the source as something that has physical dimensions. So, the SBL-style citation won’t include DOI or URL information.
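The logic described here can be sketched in a few lines of Python. This is only an illustration of the behavior, not the SBL style's actual code, and the function names parse_extra and show_doi_or_url are hypothetical. It assumes Extra holds one "variable: value" pair per line.

```python
def parse_extra(extra: str) -> dict[str, str]:
    """Split Extra-field text into {variable: value} pairs, one per line."""
    fields = {}
    for line in extra.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip():
            fields[name.strip().lower()] = value.strip()
    return fields


def show_doi_or_url(extra: str) -> bool:
    """Return False when any non-empty `dimensions` value marks the source as physical."""
    return not parse_extra(extra).get("dimensions")


print(show_doi_or_url("annote: BDF\ndimensions: yes"))  # False
print(show_doi_or_url("annote: BDF"))  # True
```

The check only looks for some value after dimensions:, which matches the note that the value itself can be anything you like.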

Tyndale Bulletin

According to the Tyndale Bulletin style guide,

In most respects, Tyndale Bulletin follows the conventions described in the second edition of The SBL Handbook of Style.11

And of course, Zotero has long supported SBL style. But there are also important differences between the styles in some details. Some of these differences include Tyndale Bulletin’s preferences for

  • British-style punctuation for quotations and any punctuation appearing with them12 and
  • including a work’s Digital Object Identifier (DOI) whenever one is available.13

Quotations

You could spend quite a while accommodating these requirements by hand. But if you install Zotero’s Tyndale Bulletin style, Zotero will be able to handle the type of quotation marks required and the placement of punctuation with them. Just select the Tyndale Bulletin style as the one you want to use in a given document, and you’ll be good to go.

DOIs

Once you start using the Tyndale Bulletin style, Zotero will also start including any DOIs you’ve saved for the works you’re citing.

That said, if you don’t normally ensure you save a DOI when it’s available, you’ll have to add that information to Zotero. Otherwise, Zotero won’t know to include a DOI in a given citation.

It’s not hard to add DOIs where they’re available, however. And thankfully, there are some good tools you can use to help you streamline that process as well.

Conclusion

Citing sources is important work. And no matter how good software gets, you still have to know the style you’re writing in because you’re responsible for the final product.

That responsibility never changes. But it also doesn’t mean you have to do everything by hand.

Careful use of tools like Zotero will go a long way in helping you keep your citations in order while also clearing your way so that you can focus on the substance of your research and writing.


  1. Header image provided by Zotero via Twitter. For more information or to download Zotero for yourself, see Corporation for Digital Scholarship, “Zotero: Your Personal Research Assistant,” Zotero, n.d. 

  2. If you specify the locator type as “section” rather than “page,” however, Chicago-style truncation doesn’t currently happen. 

  3. The style should be able to output § when you cite only one section and §§ when you cite multiple sections. But it currently uses § even when you cite multiple sections. 

  4. These comments pertain to the note-bibliography version of Zotero’s SBL style. If you use the parenthetical citation-reference list version, your needs and the behavior you observe may differ. 

  5. For some occasions where these abbreviations are relevant, see J. David Stark, “How to Cite Dictionaries with Zotero,” weblog, J. David Stark, 8 February 2021; J. David Stark, “How to Use Zotero to Properly Cite Grammars in SBL Style,” weblog, J. David Stark, 14 June 2021. 

  6. “Citing Reference Works 2: Lexica,” SBL Handbook of Style, 30 March 2017. 

  7. J. David Stark, “How to Use Zotero to Properly Cite Grammars in SBL Style,” weblog, J. David Stark, 14 June 2021; “Citing Text Collections 2: ANET,” weblog, SBL Handbook of Style, 1 June 2017. 

  8. It should be possible to further automate the inclusion or suppression of this comma (e.g., based on the number of volumes specified in a given record). But it’ll take some work to confirm exactly where this comma should appear or not beyond the cases noted here and how best to trigger that. 

  9. E.g., SBL Press, “Migne’s Patrologia Latina,” weblog, SBL Handbook of Style, 31 January 2017; Society of Biblical Literature, The SBL Handbook of Style, 2nd ed. (Atlanta: SBL, 2014), §6.2.25. 

  10. You can actually follow dimensions: with anything you like. The property just has to have some value to trigger the suppression of DOIs and URLs for SBL style. 

  11. “Tyndale Bulletin Style Guide” (Tyndale House, 2021), §4.1. 

  12. “Tyndale Bulletin Style Guide,” §8.1. This preference means that commas or periods appear outside a closing single quotation mark in citations of book sections and journal articles. “Tyndale Bulletin Style Guide,” §§11.3.6–11.3.8. 

  13. “Tyndale Bulletin Style Guide,” §§11.1, 11.3.2, 11.3.7. 

How to Expand Your Research Materials with Amazon

Biblical scholars need materials for research.1 And you can access quite a lot through your libraries.

In addition, there are also several good places to go online when you need to access something. One of these is Amazon.

Amazon as Bookseller

Of course, on Amazon, you can buy books. And the prices you’ll find there are often very competitive. But Amazon can also be a particularly helpful place to conduct research, even if you don’t buy something.

One of the best things about physical bookstores is cracking open a book and reading some of it for yourself. Amazon originally focused on selling books but has now obviously expanded quite far beyond that.2 Even so, they still try to mimic the experience of opening and previewing a physical book.

Looking Inside

So, here enters the “Look inside” option. “Look inside” isn’t available for every book—particularly if it’s a new or prerelease title.

But many titles will have this option. And when one does, you’ll see it over the upper right-hand corner of the book’s cover picture. To start previewing such a book, simply click the cover to “pick it up.”

Partial product page from Amazon showing the "Look inside" option for the book displayed on the page

Most helpful here are the search tool and the options under the menu button at the left. Sometimes these are a bit different, but generally what you’ll find are things like:

  • Front cover: This link takes you directly to the front cover of the book.
  • First Pages: Just as it sounds, this link takes you directly to the first few pages in the volume. This might be the frontmatter, preface, foreword, introduction, or first chapter. It just depends on how the links are done for that individual volume.
  • Back Cover: If you’re interested in endorsements for the volume or information about the author, you can use this link to jump to the back cover, which will often have information like this. In hardbacks with dust covers, sometimes you might see links for the front and back flaps in addition to or instead of a back cover link.
  • Surprise Me!: This link mimics the experience of flipping open a book at random and looking at whatever you happen to find there.

Use Cases

As with Google, Amazon will only show you some of a book’s pages due to copyright law. Even so, Amazon’s “Look inside” option can give you helpful information about a volume or its contents in several scenarios:

  1. With the copyright page preview, you can confirm bibliographic data. For instance, you might inter-library loan a chapter from a book but not get all of its publication information. Being able to “look inside” it on Amazon can be a good way to fill in what’s missing for your citation or bibliography.
  2. From the table of contents, you might be able to navigate to various sections of a book or confirm where a particular section ends.
  3. If you use the index or have found a reference to a given page and want to see that page, you can try typing the page number into the search box. This won’t always give you the page you’re looking for, and sometimes you need to look through a longer list of places in the book where the same number occurs. But by searching for the page number or another keyword, you’ll often be able to turn up a page or section that you need even if it’s not directly linked to elsewhere.
  4. If you already have a copy of the book, you can use the search box to help you find that quotation you half remember but can’t seem to turn up again in your physical copy.
  5. You might find that Amazon allows you to preview different pages than Google Books does, or vice versa. So, if you can’t preview what you need with one, it might be worth searching the other.

Conclusion

In the end, the same caution applies to Amazon as to Google Books. You always want to be sure you haven’t inadvertently misunderstood an argument simply because you’ve only read the portions of it that are available in an online preview.

That said, Amazon’s previews can make it easier for you to access some parts of some of the books you need for your research.

A hammer isn’t a substitute for a screwdriver, but that doesn’t mean you can only ever use a screwdriver. Similarly, while neither Amazon’s nor Google’s previews substitute for having a fuller copy of an argument all together, they can be valuable in making certain kinds of research jobs easier than they would have been otherwise.


  1. Header image provided by César Viteri

  2. Jillian D’Onfro, “Look at How Much Amazon Has Changed since It First Launched,” Business Insider, 20 March 2015. 

How to Expand Your Research Materials with Google Books

Biblical scholars need materials for research.1 And you can access quite a lot through your libraries.

In addition, there are also several good places to go online when you need to access something. One of these is Google Books, which aims to be “the world’s most comprehensive index of full-text books.”

As Google has pursued this aim, it’s had various challenges, twists, and turns over the years.2 But for all of that, Google Books can be quite helpful both for titles in the public domain and for those still under copyright.

Titles in the Public Domain

Google Books’s selection includes numerous full-text titles for works in the public domain. In these cases, you can download the books in EPUB, plain text, or PDF, which is probably the most useful format.

For instance, let’s say you wanted to read William Sanday and Arthur Headlam’s International Critical Commentary volume on Romans (Scribner, 1899). You could search for and find the title on Google Books. Then, simply click the button for “Download PDF.”

Screenshot of Google Books showing how to download Sanday and Headlam's ICC commentary on Romans in PDF

Titles under Copyright

In addition, Google Books can be helpful for accessing titles still under copyright. For such titles, Google Books provides three levels of access:

  • Preview: Titles with previews available allow you to search and view select pages in the book. Google only shows you some of the book in order to comply with copyright law.
  • Snippet-view: Titles with “snippet” views allow you to search the book and view select portions of pages. In this case, you normally get a few lines of a given page immediately around a given search result.
  • No preview: Titles that don’t have a preview give you only basic metadata about the title. This information can be helpful. But it also often contains errors or inaccuracies (e.g., incorrect additional authors, wrong publication years, missing series information). So before you rely on Google Books metadata, you need to cross check it with the print title.

Of these, I’ve very rarely found snippet view helpful. But occasionally, it’s helpful to search a title that I also have in print. That way, I can find where to read more thoroughly (e.g., in the case of non- or poorly indexed volumes).

Working with Book Previews

Where Google Books provides a preview of a book still under copyright, however, the service is more useful. Often, the table of contents is linked to the rest of the text so you can jump to individual sections.

If this isn’t the case or if you’re feeling a bit geeky and want to jump to a particular page, click the Share option in the three-dots menu.

Google Books link illustration

Then copy and paste the link provided into a new browser tab. The link should look something like
https://books.google.com/books?id=qEsn-q1qOu8C&pg=PA257#v=onepage&q&f=false.

The portion of the link with the “PA257” indicates the page number of the link. Simply change the number portion (e.g., 257) to go to a different page (e.g., 232). Of course, if Google hasn’t made available the page you choose, the new link won’t open that page.

Also, sometimes book links will have more than one section that looks like the “PA257” in the link above. This seems particularly to be the case when you’re previewing a text that has two volumes in one or some similar situation. In these cases, play around with both portions of the link until you find the one that adjusts the page number.
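If you’d rather not edit the link by hand, the page swap described above can be sketched in Python. The function name google_books_page_url is hypothetical, and the sketch assumes the link contains a segment of the form pg=PA followed by a number, as in the sample link. It changes only the first such segment, so links with more than one may still need some manual experimenting.

```python
import re


def google_books_page_url(url: str, page: int) -> str:
    """Return the link with the first pg=PA<number> segment pointed at `page`."""
    new_url, count = re.subn(
        r"(pg=PA)\d+",
        lambda match: f"{match.group(1)}{page}",
        url,
        count=1,
    )
    if count == 0:
        raise ValueError("no pg=PA<number> segment found in the link")
    return new_url


link = "https://books.google.com/books?id=qEsn-q1qOu8C&pg=PA257#v=onepage&q&f=false"
print(google_books_page_url(link, 232))
# https://books.google.com/books?id=qEsn-q1qOu8C&pg=PA232#v=onepage&q&f=false
```

As with editing the link manually, the new link will only open the page if Google has made that page available in the preview.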

Lastly, from the overhead menu bar (see above), you can search for text within a given volume. This can be particularly helpful if you have a print copy of a book that you’ve read, but you can’t seem to find a particular statement or section that’s relevant to your current project.

Searching Google Books’s Database

As with Internet Archive, searching Google Books’s massive database for what you need can take some time and patience. This is particularly true with older texts or series.

For example, sometimes the series name will display in the search results but without clearly indicating the contents of the particular volume for that link. So, you may need to click through several links or try different searches to identify the volume that’s actually what you’re looking for.

Other features that can be helpful are the “Other editions,” “More by author,” and “Similar books” sections on any given volume’s page. These sections show results based on similar titles, authors, or other metadata.

There have been a number of times when I’ve tried every search I can think of to find a given volume only to see it then listed among the volumes collected under these sections.

Conclusion

Especially if you’re using Google Books for accessing titles that are still under copyright, what you can get on Google Books is no substitute for the full text either in print or perhaps (if you need only a smaller section) electronically via inter-library loan. You always want to be sure you haven’t inadvertently misunderstood an argument simply because you’ve only read the portions of it available in Google Books (!).

Still, with this qualification in mind, Google Books can be an extremely helpful tool for getting access to a wide variety of research material—whether in the public domain or still under copyright.


  1. Header image provided by César Viteri

  2. E.g., see Scott Rosenberg, “How Google Book Search Got Lost,” Wired, 11 April 2017; James Somers, “Torching the Modern-Day Library of Alexandria,” The Atlantic, 20 April 2017. 

How to Expand Your Research Materials with Libraries

Researchers need materials.1 For biblical scholars, this most often means books and journals.

We’re responsible for interacting with relevant literature largely irrespective of how easy it is to access. But that doesn’t mean you can’t exercise some research savvy to access what you need more easily and cost effectively. After all, you didn’t get into biblical scholarship because it has the same upside potential as venture capital investing. 🙂

Current technology means that libraries aren’t the only places where you can expand your research materials. But libraries do have a wealth of materials that might not otherwise be at your disposal. Or you might not be able to access these materials as easily as you can through a library.

So, as you think about broadening the research materials you have access to, your libraries are good places to begin. And depending on your situation, you might find yourself with access to several different kinds of libraries.

Your School’s Library

If you’re already at an academic institution, this suggestion might seem overly obvious. You’re likely familiar with your school’s library and, at least generally, its holdings.

As the saying goes though, sometimes “familiarity breeds contempt.” That’s not to say you don’t like your library. But you might not think to look there for a given resource because “of course, it won’t have something like that.”

Still, you should check. You might be surprised by what you have access to either by searching the catalog or browsing the stacks.

This has happened to me more than once, and I’ve been pleasantly surprised by what my institution’s library happened to have.

For instance, in working on the land(s) promised to Abraham, I almost assumed my institution’s library wouldn’t have W. D. Davies’s The Gospel and the Land: Early Christianity and Jewish Territorial Doctrine (Berkeley: University of California Press, 1974). But thankfully I looked in the catalog and happily found that it was actually already there on the shelves.

Your Public Library

Even more likely to be overlooked is your local public library. It’s certainly true that public libraries cater to quite a general clientele. So, in principle, they’ll be less likely to have significant holdings of scholarly sources pertinent to biblical studies.

As with your institution’s library, however, it’s possible that you might be surprised by what’s on the shelves at your local public library. But your local public library is more likely to have holdings of interest in its own extended materials that are available either electronically or via interlibrary loan.

Other Schools’ Libraries

Even if you’re not a student, if you live near a theological library, you can almost always simply walk in and use materials in that library.

You can start finding them simply by searching Google Maps for “library,” perhaps along with the “near:[your address].” In addition to walking in and using materials at a library, you can often apply for checkout privileges at that library.

For instance, if you weren’t a Faulkner student but wanted to use Faulkner’s library, you could gain check out privileges for $25 per year. Though, in our case, a number of biblical studies-related resources are also held in the Kearley special collection, which doesn’t normally circulate. So, you’ll also need to learn the particular policies and processes of whatever local library you might find helpful to use.

Before paying even a nominal additional fee for check out privileges at a library, however, it’s worth looking into what reciprocal arrangements your school’s library may have with others that you might want to visit.

For example, if you attend a school that’s a member of the American Theological Library Association (ATLA), you already have check out privileges at the libraries of all other ATLA institutions (non-circulating collections and other specific policies excepted). To look for what other ATLA libraries might be near you, you can start with this Google Map that ATLA has prepared to show all their participating libraries.

If you need a specific resource, you can also search for that source in WorldCat to see libraries near you that may have this resource.

Your Libraries’ Extended Collections

Aside from what you’ll find if you walk into a physical library, any given library where you have check out privileges likely also has access to ways of extending its own collection. Two primary ways of doing so are electronic collections and interlibrary loan.

Electronic Collections

For various combinations of reasons, your libraries likely have access to substantive collections of electronic journals and books. Such resources have come a long way in recent decades.

More often than not, you’ll probably find that a given book or journal, if it’s held electronically, is held in the form of high-quality PDF files. These files mean that, when you look at the electronic holding, you’re seeing on the screen exactly what you’d see in a hard copy of the text.

Of course, onscreen reading has its downsides. But if it comes down to trying to find a hard copy or using an electronic version to which you have instant access, you might well find that you often prefer the electronic text, all things considered.

Your libraries’ electronic collections may also well surprise you with what they contain. For instance, for the same project I mentioned earlier, I needed to get a copy of Jacques T. A. G. M. van Ruiten, “Land and Covenant in Jubilees 14,” in The Land of Israel in Bible, History, and Theology: Studies in Honour of Ed Noort, ed. Jacques T. A. G. M. van Ruiten and J. Cornelis de Vos, VTSup 124 (Leiden: Brill, 2009), 259–76.

Besides me for this project, interest in this title might be quite low among current Faulkner library users. So, for all its wonderful scholarship, it might not be the best use of limited shelving space. Even so, our library had it available as an ebook that proved entirely adequate for what I needed from that essay for the project I was working on.

Interlibrary Loan

“Interlibrary loan” (ILL) is a service in which libraries cooperate to loan resources to each other’s patrons. No library is going to have everything. You can request an ILL through your institution’s library or another theological library where you have check out privileges.

But your local public library should also be able to provide some amount of ILL access. And you might be quite surprised at what you can borrow through the mail via ILL from a local public library—and the public librarians might be quite interested to see your ILL requests for what are, for their normal audience, some very obscure titles.

Continually requesting Harry Potter and the Sorcerer’s Stone has to get old. Surely a good request for Richard Bauckham’s The Fate of the Dead: Studies on the Jewish and Christian Apocalypses would help spice things up, right? Or, maybe a good scholarly French or German title? 🙂

So, if you have access to ILL services at a theological library, you can certainly use those. But don’t discount what you can access via ILL at your local public library either.

Conclusion

So, for students and faculty, the moral of the story is: Your library is a gem for you—don’t let it be a hidden one. Even if you doubt there is anything helpful, still look.

Think about what libraries you have access to, theological and otherwise. Go by, browse the shelves, and talk to the librarians to ensure you’re not overlooking a bank of helpful resources just because something’s accessible to you in a slightly different place than you thought to look. And while you’re at it, explore what you may have access to through your libraries’ electronic holdings or ILL.

Doing so can save you valuable time and effort in the research process, as well as expand the range of materials you have readily at your disposal.


  1. Header image provided by Jonathan Simcoe

How to Make Bulk Editing Items in Zotero Easy

Zotero is a fabulous tool for managing research material.1 The word processor integration makes it easy to insert citations on the fly as you write.

But the citations you insert will only be as good as the information in your Zotero library. So, if some of that’s incorrect or mis-formatted, Zotero will reflect those problems in the citations it creates.

Zotero makes it easy to correct information about any item in your library. But what happens if you need to change many items?

Fortunately, it’s quite easy to change many items all at once. So, there’s no need to make the same change to each of them individually.

1. Set up Zutilo.

To bulk edit multiple items in Zotero, you’ll need to install the latest version of the Zutilo extension.2 Once you have Zutilo installed in Zotero, go to Tools > Zutilo Preferences ….

From there, you’ll notice Zutilo can do a number of things and make several changes to your Zotero interface. To start bulk editing Zotero items, however, it might be simplest to disable all the options on the Zutilo User Interface tab except for

  • Copy item fields,
  • Paste into empty item fields, and
  • Paste into non-empty item fields.

For these items, choose to display them either in the Zotero context menu (i.e., the right- or command-click menu) or in a Zutilo-specific flyout from that menu.

Click OK, and you’ll have Zutilo ready to go.

2. Collect the items you need to edit.

Next, if you haven’t done so already, collect into one place all of the items you need to edit. You can do this by creating a saved search in Zotero based on the item metadata that you want to edit.3

For instance, if you’re using SBL style, “Grand Rapids” is a “well known” place of publication.4 Consequently, it shouldn’t be accompanied by a state name or abbreviation.

So, if you had some entries in your library with this additional information, you might create a saved search to group them all together for easy editing.

3. Use information from an existing item as a template.

For one of the items in this saved search, you’d open the context menu, and use Zutilo to copy the fields for that item.

Then, open a plain text file, and paste in the item fields that you copied. This will give you a long string of what might, at first, look like unreadable code gobbledygook. But if you look closely, especially at the beginning of what you pasted, you should notice how some of what shows up in that item’s record as you look at it in Zotero appears pretty transparently in what you’ve pasted into your text editor.

In the text editor, be sure to leave

  • the opening {,
  • the line with "itemType":,
  • any other lines for fields you want to use in your bulk edit,
  • and the closing }.

But delete the other lines. In this example, I’m bulk editing only the place of publication. So, the pasted text simplifies down to just the following:

{
  "itemType": "book",
  "place": "Grand Rapids, Michigan",
}

From this point, you need to make two changes. These are to

  1. change Grand Rapids, Michigan to just Grand Rapids, which is what you want the place name to be for all the relevant items in Zotero, and
  2. delete the comma before the closing }.

Your text file will then look as follows:

{
  "itemType": "book",
  "place": "Grand Rapids"
}
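If you want to double-check your text file before pasting it back, a short Python sketch like the following can help. It assumes the pasted text needs to be strictly valid JSON, which the step of deleting the trailing comma suggests. The helper name is_valid_json is made up for illustration.

```python
import json


def is_valid_json(text: str) -> bool:
    """Report whether `text` parses as strict JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# The intermediate version still has a comma before the closing brace.
with_trailing_comma = """{
  "itemType": "book",
  "place": "Grand Rapids, Michigan",
}"""
print(is_valid_json(with_trailing_comma))  # False

# Building the template from a dictionary avoids stray commas altogether.
template = {"itemType": "book", "place": "Grand Rapids"}
print(json.dumps(template, indent=2))
```

The last line prints the same three-field template shown above, ready to copy back onto your clipboard.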

4. Bulk edit the items in your saved search.

From this point, copy this content from your text file back onto your clipboard, and return to Zotero. Select all the records you want to update (i.e., all the records in your saved search), and open the context menu.

In this example, there aren’t any empty fields to fill. So, you’ll select “Paste into non-empty item fields.”

It may take Zotero a few seconds to process the changes depending on how many you’re making and how many records are involved. But once Zotero finishes, you should see an empty saved search folder.

The folder will be empty because you’ve updated all the records it contained. Now, none of those records matches the search criteria. All of them now have “Grand Rapids” and not “Grand Rapids, MI” or “Grand Rapids, Michigan.”

You can then delete your saved search folder and enjoy the benefits of cleaner citations from a tidier Zotero database without the time and tedium of having needed to edit each record manually.

Conclusion

Zotero’s a wonderful tool. And the various ways of getting bibliographic data into it make entering new items into your library incredibly easy.

But there’s also no accounting for the quality of the data that you’ll initially import into Zotero from whatever sources. And as the saying goes, “garbage in, garbage out.”

Fortunately, Zutilo makes it very easy to quickly correct data in multiple Zotero records, leaving you with less work to do in managing your materials and more time to focus on your research and writing.


  1. Header image provided by NordWood Themes

  2. For this resource and the fundamentals of the process described here, see “Editing Multiple Items at Once,” Zotero Forums, n.d. 

  3. For information about searching and saving searches in Zotero, see Zotero, “Searching,” Zotero, 30 January 2022. 

  4. Society of Biblical Literature, The SBL Handbook of Style, 2nd ed. (Atlanta: SBL, 2014), §6.1.4.1. 

How to Extract Text from Image-only PDFs with Zotero

If you have a PDF of a book chapter or journal article, it’ll be one of two basic types.1

On the one hand, it might have real text inside it. If so, you’ll be able to select specific letters or words inside the PDF.

On the other hand, it might just be a series of page images. If this is what you have, you can click on it all you want, but all you’ll select is the whole page image.

Even when you have real text in a PDF, you’ll have various issues if you try to copy and paste from it. And you probably shouldn’t be doing a lot of that anyhow. Strings of quotations generally aren’t the most effective way to make an argument.

But having real text inside your PDF chapter or article will make that PDF searchable and easier to annotate if you intend to read it electronically, underline or highlight text, or otherwise use your PDF like electronic paper.

If your PDF doesn’t have real text inside it, however, you can use Zotero to add it through “optical character recognition” (OCR). That is, you can have Zotero

  • “look” at an image-only PDF,
  • give a best guess about what text is on the page, and
  • save that text back with the image into another, combined PDF.

The OCR may not be perfect. But it will make your PDFs more usable.

1. Get Zotero ready.

To get Zotero ready to add text to your image-only PDFs, you’ll first need to install the Tesseract OCR engine and Poppler, which provides pdftoppm.

Once you have these tools, install the Zotero OCR extension in Zotero.

After you restart Zotero,

  1. Go to Tools > Zotero OCR Preferences.
  2. For the path to your OCR engine, enter the path to tesseract.exe (e.g., C:\Program Files\Tesseract-OCR\tesseract.exe).
  3. For the path to pdftoppm, enter the path where you have Poppler’s pdftoppm.exe (e.g., C:\Users\[yourusername]\poppler-0.68.0\bin\pdftoppm.exe).
  4. Customize the other options according to your preferences, and click “OK.” If you want Zotero’s OCR text back in a PDF file, you should at least leave the “Save output as a PDF with text layer” box checked. But you may want to leave unchecked the option to overwrite the initial PDF, just in case something goes amiss with the conversion.
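
Before you type these paths into the preferences, it can help to confirm that the files really are where you think they are. The Python sketch below is only an illustration. The helper name `missing_tools` is hypothetical, and the two paths are the example paths from the steps above, so you’d substitute your own.

```python
from pathlib import Path


def missing_tools(tool_paths: dict) -> list:
    """Return the names of tools whose path does not point to a file."""
    return [name for name, path in tool_paths.items() if not Path(path).is_file()]


# Example paths from the steps above; replace them with your own locations.
tool_paths = {
    "tesseract": r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    "pdftoppm": r"C:\Users\[yourusername]\poppler-0.68.0\bin\pdftoppm.exe",
}

for name in missing_tools(tool_paths):
    print(f"Could not find {name} at {tool_paths[name]}")
```

If the script prints nothing, both files exist at the paths you gave.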

2. Create a PDF with real text.

At this point, Zotero is ready to

  • run OCR on any image-only PDF in your library and
  • create a new PDF that maps these page images to real text.

To do so, find an image-only PDF in Zotero, right-click it, and choose to “OCR selected PDF(s).”

After you click this option, you’ll want to be patient. The process may take a while, even with a comparatively short PDF. And it can look like not much is happening.

But eventually, you should get a command line window that gives you some progress indicators as Tesseract works through your PDF.

When Tesseract finishes, you’ll see a new linked attachment in Zotero with a “.ocr.pdf” ending to the file name. You can use this file to interact with the real text that Tesseract worked out for your PDF’s page images. Zotero’s indexer and your PDF reader’s find function can use that text as well.
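
For a rough sense of what’s happening behind that command line window, the two tools can be driven along the following lines. This Python sketch only builds the commands; it doesn’t run them. The exact options the Zotero OCR extension passes may well differ, and the file names here are hypothetical.

```python
def pdftoppm_command(pdftoppm: str, pdf_path: str, image_prefix: str, dpi: int = 300) -> list:
    """Build a command that renders each PDF page to a PNG image."""
    return [pdftoppm, "-png", "-r", str(dpi), pdf_path, image_prefix]


def tesseract_command(tesseract: str, image_path: str, output_base: str, language: str = "eng") -> list:
    """Build a command that OCRs one page image into a PDF with a text layer."""
    # The trailing "pdf" asks Tesseract for a searchable PDF named <output_base>.pdf.
    return [tesseract, image_path, output_base, "-l", language, "pdf"]


# Hypothetical file names; pdftoppm numbers its output images after the prefix.
print(pdftoppm_command("pdftoppm", "article.pdf", "page"))
print(tesseract_command("tesseract", "page-1.png", "page-1"))
```

Each list could be handed to Python’s `subprocess.run` if you wanted to try the tools outside Zotero.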

If you want to be able to search the new text in your PDF from Zotero, you might want to rebuild or update your Zotero index (Edit > Preferences > Search > Rebuild Index …).

3. Clean up the leftovers.

If you don’t care to keep the leftovers from the conversion process, you can clean them up at this stage. Just right-click either the new linked file attachment or the original one in your Zotero library, and choose to “Show File.”

You’ll then be shown the Zotero storage folder where your PDFs are stored. Be sure not to touch the .zotero-ft-cache or .zotero-ft-info files. But any leftover text (“.txt”) files you can delete.

And if you’re satisfied with the results of the conversion, you can also delete your original PDF from this folder and rename the “.ocr.pdf” file to omit the “.ocr” portion of its file name. It should then have the same name as your original PDF.

So, the original stored file link in Zotero (the one without the little chain icon) should work to open it. And you can also delete the Zotero link to the “.ocr.pdf” file (which you’ve now renamed).
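
If you do this cleanup often, the file-level part of it could be scripted. The Python sketch below is a hypothetical helper, not something Zotero provides. It assumes the leftovers are “.txt” files in the same storage folder and that each “.ocr.pdf” should replace the PDF of the same name. It defaults to a dry run that only lists what it would do, and it never touches the .zotero-ft-cache or .zotero-ft-info files.

```python
from pathlib import Path


def tidy_ocr_leftovers(folder, dry_run: bool = True) -> list:
    """Delete leftover .txt files and rename each .ocr.pdf over its original."""
    folder = Path(folder)
    actions = []
    for txt_file in sorted(folder.glob("*.txt")):
        actions.append(f"delete {txt_file.name}")
        if not dry_run:
            txt_file.unlink()
    for ocr_pdf in sorted(folder.glob("*.ocr.pdf")):
        # "article.ocr.pdf" becomes "article.pdf".
        target = ocr_pdf.with_name(ocr_pdf.name[: -len(".ocr.pdf")] + ".pdf")
        actions.append(f"rename {ocr_pdf.name} -> {target.name}")
        if not dry_run:
            # replace() overwrites the original PDF, so run this only once
            # you're satisfied with the OCR results.
            ocr_pdf.replace(target)
    return actions
```

Calling `tidy_ocr_leftovers(folder)` with the storage folder’s path lists the planned changes; passing `dry_run=False` carries them out. You’d still remove the extra attachment link inside Zotero yourself.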

Conclusion

Having real text in a PDF makes it possible to search that document. It also makes it easier to mark it up. Older PDFs or PDFs of older sources might not come with this real text already in them, and OCR is rarely perfect.

But you can use Zotero to add a good amount of accurate text to your image-only PDFs, which will make annotating and referencing these files that much easier.


  1. Header image provided by Zotero via Twitter