Banburyshire Family History

A site designed for you to share your family history with others from the Banbury area


Searching Databases

Soundex, Metaphone, Fuzzy & Boolean Searches; & RootsWeb Message Boards


Googling your ancestors

When you learn how to translate a problem into keywords and symbols to use in a search engine (let's say Google at www.google.com), the keys to the Internet are yours. Here is how it works:

The plus (+) symbol forces a key word to be INCLUDED.
The staple of genealogy research is vital records -- when our ancestors were born, died, and married -- so a search that combines a surname with one of these keywords works well. Example: +ragan +born. You can use the keyword "died" or "married" in place of "born" in the above example.

The minus (-) symbol forces a keyword to be EXCLUDED.
This is an extremely powerful research tool when you learn how to use it properly with the plus symbol.

One of the variant spellings of my surname is REAGAN. What do you think happens when you start doing searches with this name? Many of the "hits" or "results" that come back are about former U.S. President Ronald Reagan. Here is how to fix it: Use: +reagan +born -president. The word "president" will not be in any of my results. Get the idea?

What if the surname were "Morse" or "Cook"? Here are examples of searches that you might do using the symbols:
Search: +morse +married -code
     (the word "code" is excluded because of "morse code").
Try: +cook +died -food -chef
     (the words "food" and "chef" are excluded).
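
If you run many searches of this kind, the plus and minus pattern is easy to generate. Here is a minimal sketch in Python; the function name build_query is made up for this illustration, and whether a given search engine still honors the plus and minus symbols is something to check for yourself.

```python
def build_query(include, exclude=()):
    """Join required (+) and excluded (-) keywords into one search string."""
    # Every word in `include` gets a plus sign, every word in `exclude` a minus.
    parts = [f"+{word.lower()}" for word in include]
    parts += [f"-{word.lower()}" for word in exclude]
    return " ".join(parts)


print(build_query(["Morse", "married"], ["code"]))
print(build_query(["Cook", "died"], ["food", "chef"]))
```

The two calls reproduce the Morse and Cook searches shown above.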

Fine Tune Your Searches: Count on running a search three to five times. Each time you click on the "Search" button, examine the results and see if there are any other keywords you can include or exclude.

Don't worry about a large number of results. What matters is the top 10 or 20 -- ignore the rest and start fine tuning. This way you really can find the needle in the haystack. You will be able to get the good stuff to come to the top like cream floats to the top of Grandpa's milk pail.

You can sometimes use a general search engine for genealogy. My favorite is Google (http://www.google.com), but there are others -- AltaVista, Lycos, MSN, Dogpile, AOL. They all work about the same. The key is what they call an exact phrase, which you enclose in quotation marks. Let's assume you are looking for Eltweed Pomeroy and Malinda McCorkle, married in Pocatello, Idaho in 1888.

This argument in the search engine: Eltweed Pomeroy Malinda McCorkle (without the quotation marks) means "show me all the pages that have the four words Eltweed, Pomeroy, Malinda and McCorkle on them". You might strike pay dirt right away; you might also get a page that listed Eltweed Smith, Pomeroy Murgatroyd, Malinda Smith and Ebeneezer McCorkle.

This argument in the search engine: "Eltweed Pomeroy" "Malinda McCorkle" (with two sets of quotation marks) means "show me all the pages that have the exact phrases 'Eltweed Pomeroy' and 'Malinda McCorkle' on them". Given the rarity of the names, if you got a hit it would almost certainly be useful.

However, if your ancestors are listed last name first, the argument above won't get them. You won't find them if they have middle initials on the page, either. To allow for that, combine an exact phrase with any-match words: "Eltweed Pomeroy" Malinda McCorkle.

It says "show me all the pages with the exact phrase 'Eltweed Pomeroy' and the two words Malinda and McCorkle somewhere on the page." This argument would find a page with the sentence "Eltweed Pomeroy married Malinda, second daughter of Alphonse McCorkle ..." or "Eltweed Pomeroy married Malinda Q. McCorkle ...".
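
The difference between exact phrases and loose words is easier to see when you test it against sample text. The sketch below is a simplified model of the matching, not how Google or any other engine actually works; the function names parse_argument and page_matches are invented for the illustration.

```python
import re


def parse_argument(argument):
    """Split a search argument into terms; quoted text stays together as a phrase."""
    terms = []
    for phrase, word in re.findall(r'"([^"]+)"|(\S+)', argument):
        terms.append((phrase or word).lower())
    return terms


def page_matches(argument, page_text):
    """True when every phrase and every loose word occurs in the page text."""
    # Lower-case and collapse whitespace. Punctuation is kept, so the text
    # "malinda q. mccorkle" does not contain the phrase "malinda mccorkle".
    text = " ".join(page_text.lower().split())
    words = set(re.findall(r"\w+", text))
    for term in parse_argument(argument):
        if " " in term:
            # Simplification: a phrase is a plain substring test.
            if term not in text:
                return False
        elif term not in words:
            return False
    return True


page = "Eltweed Pomeroy married Malinda Q. McCorkle in Pocatello, Idaho in 1888."
print(page_matches('"Eltweed Pomeroy" "Malinda McCorkle"', page))
print(page_matches('"Eltweed Pomeroy" Malinda McCorkle', page))
```

The first argument misses this page because of the middle initial; the combination argument finds it.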

General search engines are not perfect. They don't have a Soundex option, although Google will sometimes suggest alternate spellings for you. Some of them require a plus sign with each word or phrase, although Google doesn't. They work best for relatively uncommon names. If you are looking for John Smith who married Mary Johnson in New York City, you'll get a lot of hits, but your chances of getting the right one are slim.

Most importantly and worth repeating, the phrase "Eltweed Pomeroy" is NOT the same as the phrase "Pomeroy, Eltweed" to a search engine. You get what you ask for. I usually try to use enough words and phrases in the argument that I get 20 hits or fewer. Quite often I don't get any, but I'd rather get a few of the right hits than a thousand wrong ones. In this case I would try all of these arguments:

Four pairs of exact phrases:


Four combination searches:


And, just in case one of them was listed without a spouse:
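
Arguments of these three kinds can be generated mechanically. The sketch below is one guess at what such lists could contain, assuming each name is tried both first-name-first and last-name-first; the particular variants and the function names are assumptions for illustration, not necessarily the arguments the author had in mind.

```python
def name_forms(given, surname):
    """Both exact-phrase spellings of one person's name."""
    return [f'"{given} {surname}"', f'"{surname}, {given}"']


def search_arguments(person_a, person_b):
    """Return (pairs, combinations, singles) for two (given, surname) tuples."""
    forms_a = name_forms(*person_a)
    forms_b = name_forms(*person_b)
    # Pairs of exact phrases: every spelling of A with every spelling of B.
    pairs = [f"{a} {b}" for a in forms_a for b in forms_b]
    # Combinations: one person as an exact phrase, the other as loose words.
    combos = [f"{a} {person_b[0]} {person_b[1]}" for a in forms_a]
    combos += [f"{person_a[0]} {person_a[1]} {b}" for b in forms_b]
    # Singles: each person alone, in case no spouse is listed.
    singles = forms_a + forms_b
    return pairs, combos, singles


pairs, combos, singles = search_arguments(("Eltweed", "Pomeroy"), ("Malinda", "McCorkle"))
for argument in pairs + combos + singles:
    print(argument)
```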


Soundex

This produces a four-character code beginning with the first letter of a surname followed by three numbers assigned according to the Soundex Coding Guide. Soundex keeps together names with the same and similar sounds, regardless of spelling.

But, if the first letter of the surname is written incorrectly or misread, it can be a major search problem.

For example: Huffman and Kauffman share the numerical coding 155, but their Soundex codes are H155 and K155, so a search for one will not find the other.
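
To see how the coding works, here is a minimal Python sketch of the common American Soundex rules (letters grouped into six digit classes, adjacent duplicates collapsed, H and W ignored, result padded to four characters). It is an illustration, not the code any particular database uses, and individual sites may apply slightly different variants.

```python
def soundex(name):
    """Return the four-character Soundex code for a surname."""
    groups = {"BFPV": "1", "CGJKQSXZ": "2", "DT": "3",
              "L": "4", "MN": "5", "R": "6"}
    codes = {ch: digit for letters, digit in groups.items() for ch in letters}

    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""

    result = letters[0]                 # the first letter is kept as written
    previous = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue                    # H and W do not separate equal codes
        digit = codes.get(ch, "")       # vowels and Y have no code
        if digit and digit != previous:
            result += digit
        previous = digit                # a vowel resets the duplicate check
    return (result + "000")[:4]         # pad with zeros, cut to four characters


for surname in ("Huffman", "Kauffman", "Ragan", "Reagan"):
    print(surname, soundex(surname))
```

Ragan and Reagan come out with the same code, which is the point of Soundex; Huffman and Kauffman differ only in the leading letter.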

Metaphone

This is an algorithm to code English words phonetically by reducing them to 16 consonant sounds. It is similar in concept and purpose to Soundex, but more comprehensive in its approach.

Fuzzy searches

"Fuzzy" in geek-speak means "not exact." A fuzzy search finds variant spellings, typically by applying rules like those of Soundex, and is usually reserved for surname searches.

Boolean

A Boolean search gives preference to search results that contain given keywords. For example, the search "gardenias RANK wisteria" will find pages containing the word gardenias, but will rank higher those pages which also contain the word wisteria.


RootsWeb Message Boards

The message board search accepts three symbols:

  *   wildcard, allowed after the first three characters of a word (e.g., users can search for "gehr*" to find "gehring", "gehrig", "gehrke", etc.)
  +   in front of any word means that word must appear in a message for it to be a match
  -   in front of any word means any message containing that word will be excluded

For example, a search for "+gehring heinrich -baden" will return only those hits that contain the word "gehring" but not the word "baden", and in this group any messages containing "heinrich" will be displayed before those that don't.
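
The same rules can be modeled in a few lines of Python. This is only a sketch of the behavior described above, not RootsWeb's actual code; the function names are invented, the three-character minimum for the wildcard is not enforced, and how the boards treat a query with no plus-sign words is a guess.

```python
import re


def score_message(query, message):
    """Return None if the message is excluded, else the number of optional words found."""
    words = set(re.findall(r"\w+", message.lower()))

    def found(term):
        if term.endswith("*"):
            # Wildcard: any word beginning with the part before the star.
            return any(word.startswith(term[:-1]) for word in words)
        return term in words

    score = 0
    for term in query.lower().split():
        if term.startswith("+"):
            if not found(term[1:]):
                return None             # a required word is missing
        elif term.startswith("-"):
            if found(term[1:]):
                return None             # an excluded word is present
        elif found(term):
            score += 1                  # optional word: only affects ranking
    return score


def search(query, messages):
    """Matching messages, those with more optional words first."""
    scored = [(score_message(query, m), m) for m in messages]
    hits = [(s, m) for s, m in scored if s is not None]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in hits]


messages = ["Johann Gehring of Hesse", "Heinrich Gehring, born 1820",
            "Heinrich Gehring of Baden", "Heinrich Schmidt"]
for hit in search("+gehring heinrich -baden", messages):
    print(hit)
```

With these four sample messages, the Baden and Schmidt messages drop out, and the Heinrich Gehring message is listed ahead of Johann Gehring.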

Happy Surfing!


by Ted Pack tedpack@thevision.net
http://www.tedpack.org/
RootsWeb Review Vol 6, No. 16