LIN 4660 - Linguistics and Literature (Young)

Introduction

Antedating will help you better understand the kind of crowdsourcing that made the OED possible, as well as what's involved in lexicographical research generally. Also, this activity has a real-world application--it will let you contribute to the OED, one of the greatest resources on the English language.

You definitely don't need to be an expert linguist to antedate words!  Here is an account of how Nathaniel Sharpe, a 22-year-old amateur genealogist from a small town in North Dakota near the border with Canada, was able to antedate the term scalawag:

 http://www.bostonglobe.com/ideas/2013/03/10/the-original-scalawag/fuFLccvsn4b1T6t18WFvxL/story.html (pdf if you are paywalled). 

And here's an article published by Fred Shapiro, whose antedating hobby dates back to his time in law school: Shapiro, Fred R. "Confessions of the Antedater." Dictionaries: Journal of the Dictionary Society of North America, vol. 39, no. 1, 2018, pp. 23-42. Project MUSE, doi:10.1353/dic.2018.0011.

Things to Look For

Words whose first illustrative quotations date from 1800-1923. Prior to 1800, spelling was variable, and it becomes more variable the farther back in time you look, which makes older words harder to search for.

Words created through affixation; look for prefixes and suffixes. These morphemically complex words are likely to have fewer meanings and they are probably less common. For example, creepiness will be easier to locate than the word creep.

Words for which entries were written or last revised before 1990. Most full-text databases did not exist before 1990, which means that lexicographers would not have been able to consult them.

Words that relate to specialized databases you can access (see below)

Words that the OED has appealed to readers to help locate: https://www.oed.com/information/using-the-oed/contributing-to-the-oed. Right now, the OED is specifically interested in entries from M through R.

Words that the Merriam-Webster Dictionary dates significantly earlier than the OED. The MWD lexicographers found that earlier attestation somewhere, so it must exist.

To get an idea of the kinds of words people have already researched, browse through some Antedating Accounts.

Things to Avoid

You can only antedate words that the OED already contains. Some words will be easier to antedate than others. Think carefully about the evidence available to you and the likelihood that the OED lexicographers have already searched that evidence. For example, you probably should NOT search for

Extremely old words. If the earliest illustrative OED quotation dates from the OE period, you would need to find evidence of the word used in an even earlier manuscript. How many OE manuscripts do you have lying around that the OED lexicographers don't already know about? (Zero.)

Extremely rare words. The rarer the word, the less likely you are to find it used at all, much less to antedate the OED records . . . unless you have access to a specialized database or corpus that is likely to contain that rare word.

Words with quotations that define the word. When a word is new, writers often comment on its newness ("as young people say") or even provide its definition ("He's normcore. Dangerously regular. Dresses only in T-shirts an' jeans.") If the earliest OED quotation defines the word, you're unlikely to find an earlier quotation.

Extremely new entries. If the OED lexicographers have just finished updating the entry for your word, that means they have recently searched intensively through various online databases. Unless you're planning to search in places that they don't know about, you probably won't have any antedating luck.

An unusual sense of an extremely common word. Weeding out the common results will take a very long time, unless you have other words to add to your search query to narrow things down.

Compound words. Words that are compounds, particularly open compounds (spelled with a space in the middle, e.g., hot dog), are more difficult to search for because they require exact phrase searching (which not every database provides) and because they can easily be confounded with the same two words that happen to be next to each other but do not form a compound. For example, a hot dog might be the food item or it might be an overheated canine.

Decide Where / How to Search

1. Make sure you aren't duplicating someone else's research.

As a bare minimum, to minimize the chances that you'll be duplicating the work of a classmate, you MUST

  • Post to Yellowdig to announce your words, using the hashtag #YOURWORD to make your words easier to find
  • Review other people's posts and hashtags 

It's ok to join forces with someone else, if you both really want to research that word! Explain how you divided the work on your Weekly Research Report.

You don't have time to intensively review all of the Antedating Accounts online, but you might want to quickly Google the word or ask an AI chatbot about it (bearing in mind that online sources are often wrong) to get an idea of what has already been discovered.


2. Figure out the date range to search.

Look through the illustrative quotations the OED provides for the word/sense you're looking for. The date of the earliest quotation should be the end of your search window. You're looking for uses that predate the OED quotations.

The beginning of your search window should be a date that is plausible. For example, if you're trying to antedate an automobile-related word, you don't need to search earlier than the date the automobile was invented. You may find that your "earliest" date is constrained by the database you're searching.

For inventions, scholarly concepts, etc., you might find it useful to check a source like Wikipedia to see when the word's referent was invented.

The Google Ngram Viewer can show you when a word is used in Google Books, helping you figure out what date range to search.
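
If it helps to see the arithmetic, here is a minimal Python sketch of the search window described in this step. The function name search_window and the years used in the example are made up for illustration; they are not real OED data.

```python
def search_window(earliest_oed_year, plausible_start, database_start=None):
    """Return (start, end) years worth searching for an antedating.

    earliest_oed_year: year of the earliest OED illustrative quotation (end of window)
    plausible_start:   earliest year the word could plausibly exist
                       (for example, the year its referent was invented)
    database_start:    first year covered by the database you are using, if known
    """
    start = plausible_start
    if database_start is not None:
        # The database cannot show you anything older than its own coverage.
        start = max(start, database_start)
    if start > earliest_oed_year:
        raise ValueError("window is empty: it would start after the OED's earliest quotation")
    return (start, earliest_oed_year)


# Hypothetical example: OED quotation from 1905, referent invented in 1886,
# database coverage beginning in 1890.
print(search_window(1905, 1886, database_start=1890))
```

If the function raises an error, that is a sign the database you picked cannot help with this word and you should look elsewhere.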


3. Gather clues about spellings, source texts, collocations, etc.

The OED "Forms" tab will list alternate spellings that the OED has recorded and tell you when those spellings were in use.

Review the OED illustrative quotations to see what kinds of texts the word appeared in. Newspapers? Advertisements? Novels? etc. You might look in the same kinds of texts, or maybe you will notice a gap in the sources that you want to search.

English-Corpora.org can also give you clues about which registers a word is likely to appear in.


4. Consider which words are likely to occur near the word that you are researching.

Words are used in combination with other words, and that means you can often figure out patterns of usage. Pay attention to nearby words as you review your search results.

Do you notice any patterns in the OED illustrative quotations? In English-Corpora.org? Sometimes it's the nearby words that help you find the word you're seeking. And if you notice your results are confounded with other meanings of your word, you can sometimes narrow results by screening out specific terms.


5. Determine where to look based on what you have learned in the previous steps.

Consider:

  • What text registers are included (newspapers, books, magazines, online discussions, etc.)?
  • How far back does the database go?
  • Does the database let you search the full text of everything at once? Or do you have to identify a likely text by searching metadata, then search inside that specific text?

You might be able to find an online archive related to your word's topic. For example, if your word relates to hot air ballooning, maybe there is an archive of back issues of Ballooning Magazine. (I don't know whether there is--this is just an example.) The New York Public Library has a digitized collection of restaurant menus that might be a good source for food terms.

Don't search where you know that the OED lexicographers have already searched. If the OED's earliest illustrative quotation comes from Punch magazine, it's likely that there aren't any earlier citations in Punch magazine or the lexicographers would have found them.


6. Use specialized search functions wherever possible. IMPORTANT: These commands differ from database to database.

Phrase searching: An "exact text" search can help you find compounds or phrases. Often you do this by putting quotation marks around your search term. For example, suppose you are looking for pace car or glass slipper. If you type the words into the search box without quotation marks, your hits might contain only one of those words, or the words might be near each other but not next to each other. If you put quotation marks around "pace car" or "glass slipper", you'll get those words right next to each other. Some databases handle phrase searching differently (e.g., Library of Congress), so if the quotation mark trick doesn't work, check the help files or ask a librarian.

Excluded terms: You can often disambiguate results by removing particular terms from the search. For example, if I wanted to search for only the lexicographer Samuel Johnson, I might tell the database to exclude results containing the terms "football" and "actor" (to exclude the British footballer and the Australian actor named Samuel Johnson). Databases often let you do this, but the method varies from database to database. Sometimes you can put a - in front of the word, other times you need to use a drop-down menu to specify words to exclude, etc.
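
If a database gives you no way to do phrase searching or to exclude terms, you can sometimes copy or download the hits and filter them yourself. Here is a rough Python sketch of that idea. It does not reproduce any particular database's search syntax; the function name is_relevant is my own invention, and the sample sentences are made up.

```python
import re


def is_relevant(text, phrase, excluded=()):
    """True if text contains phrase as adjacent whole words and none of the excluded terms."""
    lowered = text.lower()
    # \b keeps "pace car" from matching inside "space cars";
    # \s+ lets the phrase span a line break or extra spaces.
    words = [re.escape(w) for w in phrase.lower().split()]
    phrase_pattern = r"\b" + r"\s+".join(words) + r"\b"
    if not re.search(phrase_pattern, lowered):
        return False
    # Excluded terms are matched as whole words, case-insensitively.
    return not any(
        re.search(r"\b" + re.escape(term.lower()) + r"\b", lowered)
        for term in excluded
    )


hits = [
    "Samuel Johnson published his Dictionary in 1755.",
    "Samuel Johnson signed for the football club.",
]
for hit in hits:
    print(is_relevant(hit, "Samuel Johnson", excluded=["football", "actor"]), hit)
```

Because the excluded terms are matched as whole words, "football" would not screen out "footballer"; add each form you want to exclude.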

Wildcards: Often, but not always, an asterisk * can stand in for multiple letters and a question mark ? or a dot . will stand in for a single letter. Some databases allow regex searching (a powerful search method too complex to explain here). Using wildcards can help you locate multiple spellings and/or multiple inflections in one search.
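
To make the wildcard idea concrete, here is a small Python sketch that applies the two most common wildcards to text you have already copied out of a database. It assumes * means "any number of letters" and ? means "exactly one letter," which is only one of several conventions you will meet; the function names are hypothetical and the sample sentences are invented.

```python
import re


def wildcard_to_regex(pattern):
    """Turn a simple wildcard pattern into a whole-word regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(r"\w*")   # zero or more word characters
        elif ch == "?":
            parts.append(r"\w")    # exactly one word character
        else:
            parts.append(re.escape(ch))
    return re.compile(r"\b" + "".join(parts) + r"\b", re.IGNORECASE)


def find_forms(pattern, text):
    """Return every word in text that matches the wildcard pattern."""
    return [m.group(0) for m in wildcard_to_regex(pattern).finditer(text)]


# One pattern catches several spellings and a plural.
print(find_forms("scal*wag*", "The scalawag, the scallawag and the scallywags."))
# ? stands for exactly one extra character.
print(find_forms("creep?", "creep creeps creepy creepiness"))
```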

Other Tips

Don't rely too much on correct spellings! OCR (optical character recognition) software is imperfect, which means that digital databases will often contain misspellings that are not in the original. Some researchers systematically search for likely misspellings; for example, not just Houdini but also Hondini, Houdlni, Houdinl, Hoadini, Hloudini, etc. You don't have to do that unless you want to, but do remember that "zero results" doesn't necessarily prove the word isn't in there somewhere.

Here's a memorable article that illustrates this OCR problem: http://www.theguardian.com/books/booksblog/2014/may/01/scanner-ebook-arms-anus-optical-character-recognition (Once you read that article, you'll never forget it.)
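
If you do want to search for likely misspellings systematically, a short script can list candidates for you. The sketch below swaps one letter at a time using a small table of look-alike letters. The table is my own guess at plausible confusions, based only on the Houdini examples above; it is not measured OCR data, so edit it freely.

```python
# Hypothetical table of letters that OCR software might confuse (an assumption).
CONFUSIONS = {
    "u": ["n", "a"],
    "n": ["u"],
    "i": ["l"],
    "l": ["i"],
}


def ocr_variants(word, confusions=CONFUSIONS):
    """Return spellings of word with exactly one letter swapped for a look-alike."""
    variants = set()
    for i, ch in enumerate(word):
        for sub in confusions.get(ch.lower(), []):
            variants.add(word[:i] + sub + word[i + 1:])
    variants.discard(word)  # never return the correct spelling itself
    return sorted(variants)


for variant in ocr_variants("Houdini"):
    print(variant)
```

This only handles single-letter swaps. It will not produce a variant like Hloudini, where the OCR software inserted an extra letter.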


Watch for signs that a writer thought the word or expression was particularly clever or up-to-date, such as setting it off with quotation marks or italics, or introducing it by saying something like, "as the boys say" or "to use a Kentucky expression" (KY having once been a wild frontier), or [as they] "say on the Bowery" or [to use a] "Bowery expression" (the Bowery having a reputation as a low but colorful street). Unfortunately, typographical tricks can't easily be entered into a search engine, but they can help you find antedatings in texts you're reading for another purpose, or recognize when you are getting close to a word's first attestation.


Consider searching for nearby words instead. Words occur in patterns. If a text has OCR flaws, or if your word occurs with an unexpected spelling or inflection, you can still find it if you look for words that commonly occur nearby (aka collocates).
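
One way to find collocates is to count the words that appear within a few words of your target in the quotations you have already collected. Here is a minimal Python sketch. The function name collocates, the window size, and the short stopword list are all arbitrary choices of mine, and the two sample quotations are invented, not taken from the OED.

```python
import re
from collections import Counter

# A deliberately tiny stopword list; extend it as needed (an assumption).
STOPWORDS = frozenset({"the", "a", "an", "of", "and", "to", "in"})


def collocates(texts, target, window=3, stopwords=STOPWORDS):
    """Count words occurring within `window` words of `target` in the given texts."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        for i, token in enumerate(tokens):
            if token == target:
                before = tokens[max(0, i - window):i]
                after = tokens[i + 1:i + 1 + window]
                counts.update(t for t in before + after if t not in stopwords)
    return counts


# Invented sample quotations, for illustration only.
quotations = [
    "That scalawag of a steer broke the fence.",
    "A scalawag steer is worth little at market.",
]
print(collocates(quotations, "scalawag").most_common(3))
```

The words with the highest counts are candidates to search for in place of, or alongside, your target word.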

Additional information on search strategies:


Devising the right search query, according to lexicographer Ben Zimmer, is like "finding the right incantation to release the proverbial genie in a bottle. When you do find the right incantation, it's a great feeling. It's one of the joys of the competitive (and collaborative) sport of antedating, which I recommend to all word lovers." Here's an account of how Zimmer located the origins of the title "Ms." (It's older than you think!)

Know What Evidence You Need

The OED needs primary sources only: verifiable evidence that the word was used on a particular date. In practice, this means only precisely dated citations, verified from original print sources or reliable facsimile images. (Here is a UCF library guide to primary sources: http://guides.ucf.edu/c.php?g=78169&p=507879)

Things students have submitted in the past that do NOT count:

  • Entries from another dictionary. Maybe the Merriam-Webster Dictionary says the word dates from 1907, earlier than the first illustrative quotation in the OED. I'm sure the MWD has evidence for its date, but unless you have the same evidence (a citation that is dated 1907, verifiable from the original print source or a reliable facsimile), it doesn't count. (The exception would be entries in a very specialized industry glossary.) 
  • Entries from a different sense of the word in the OED. Maybe you think a particular quotation illustrates sense 2a better than sense 2c, but the OED doesn't think so. Find evidence that the OED lexicographers haven't seen, not evidence that you think they have miscategorized.
  • Articles that describe earlier usages. Journalists love to run articles claiming various origins for a given word or expression. But the unsupported claim of a 21st century journalist means little--you need to find the primary source, the word actually in use on the earlier date. An article from 1998 that claims a word was coined in 1898 doesn't count. 
  • Modern editions of older texts. Sometimes the editor will deliberately adjust the wording of an old text to make it more understandable for modern readers. Sometimes the editor will make a mistake. Either way, to show that a word was used in 1901, you need to consult the text (or a reliable facsimile) that was published in 1901, not a modern edition of the 1901 text. Metadata is often wrong, so be sure to look at the image of the printed pages, and scroll around until you see the date that the page was published.
  • Entries where the word does not appear. Maybe you can type the words "Sam Browne" into a search engine and get some great images of a "Sam Browne belt," but unless the images themselves contain the words "Sam Browne," that doesn't count. (After all, it probably was a modern editor who attached the keywords "Sam Browne" to that image.) If you're looking for the expression "Sam Browne belt," you need to find that expression--those words--in use.
  • Texts from other languages. Maybe English borrowed the word from French, but we need to find it used as an English word. Not a French word.
  • Translations of older texts. Often modern translators use modern words when they are translating older texts. The date of that word is the date the translator used it--not the date that the original author wrote the original text.

What does count?

Primary source evidence, verifiable from the original print source or a reliable facsimile, that the word was used on a particular date.

In particular, you need all the information required by the OED submission form, plus a quotation long enough to show how the word is being used: https://www.oed.com/information/using-the-oed/contributing-to-the-oed/making-sure-your-contribution-to-the-oed-is-useful/

The OED prefers evidence drawn from print publications because it is more stable and therefore more easily re-traceable in the future. However, some evidence may exist primarily online, as described in this article about The Origins of "Big Data." (The origins in that article were antedated by the OED the following year.)

Be suspicious of datings in databases, especially if they seem too good to be true. The metadata in databases such as Google Books is notoriously faulty (http://chronicle.com/article/Googles-Book-Search-A/48245/). Always try to confirm datings with a date in an image of the original document.

For more information about what the OED accepts, see here: https://www.oed.com/information/using-the-oed/contributing-to-the-oed

Take Good Notes!

Keep track of what you search for, where you search, how you searched, and why. Detailed accounts of high-quality searches will earn credit even if they do not result in successful antedatings.
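
If you like keeping notes in a spreadsheet, here is one possible way to log each search to a CSV file with Python. The column names are my suggestion for covering the what, where, how, and why; they are not a required format, and you should adapt them to whatever your Weekly Research Report needs.

```python
import csv
import os
from dataclasses import dataclass, asdict, fields


@dataclass
class SearchRecord:
    searched_on: str   # date you ran the search, e.g. "2024-01-15"
    word: str          # the word or sense you are trying to antedate
    database: str      # where you searched
    query: str         # exactly what you typed, including quotes and wildcards
    date_range: str    # the years you limited the search to
    hits: int          # how many results came back
    rationale: str     # why you chose this database and query
    notes: str = ""    # anything worth following up


def append_record(path, record):
    """Append one search to the CSV log, writing a header row if the file is new."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    names = [f.name for f in fields(SearchRecord)]
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=names)
        if is_new:
            writer.writeheader()
        writer.writerow(asdict(record))
```

For example, calling append_record("search_log.csv", SearchRecord("2024-01-15", "creepiness", "Example Newspaper Archive", '"creepiness"', "1800-1830", 0, "affixed word; trying newspapers first")) adds one row to the log. Every value in that call is hypothetical. Logging zero-hit searches matters too, since they show where you have already looked.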

 

Many thanks to Peter Gilliver, Fred Shapiro, Jonathan Lighter, Stephen Goranson, Bonnie Taylor-Blake, W. Brewer, Gerald Cohen, Hugo, Clai Rice, George Thompson, Dan Goncharoff, Katherine Martin, Damien Hall, and all the ADS-L members who helped me compile the original version of this how-to.