How Search Engines Work

There are three kinds of resources on the internet that harvest, index and supply information based on what is "on the internet." They are most often referred to generically as search engines.

Crawler-Based Search Engines

Crawler-based search engines, like Google, find and display their information automatically: they "crawl" or "spider" the web. These "crawlers" or "spiders" are software applications that surf the internet one web page at a time, gathering all the information they can read (content that depends on JavaScript or Flash may not be read reliably). The information is then indexed and archived.

This is important to know because when you type a search into a search engine, it does not go to the website at that time. The search engine goes to its own archives to retrieve information that it has previously collected and indexed. If you make changes to your website, crawler-based search engines eventually find those changes, and that can affect how you are listed. Page titles, text content and other elements are all factors.

Crawler-based search engines have three major elements. The first is the spider, also called the crawler. The spider visits a web page, reads it, and then follows links to other pages within the site. This is what it means when someone refers to a site being "spidered" or "crawled." The spider returns to the site on a regular basis, such as every month or two, to look for changes.

Everything the spider finds goes into the second part of the search engine, the index. The index, sometimes called the catalog, is like a giant book containing a copy of every web page that the spider finds. If a web page changes, the book is updated with the new information. It can take a while for new pages or changes that the spider finds to be added to the index, so a web page may have been "spidered" but not yet "indexed." Until it is indexed, it is not available to those searching with the search engine.
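The spider and index described above can be sketched in a few lines of Python. This is only a toy illustration, not how any real search engine is built: the `SITE` dictionary stands in for a website, and the names `spider`, `build_index` and `search` are hypothetical. It shows the point made earlier, that a search is answered from the stored index, so a change to the site is not visible until the spider returns and the page is indexed again.

```python
from collections import deque

# Hypothetical in-memory "website": URL -> (page text, links on that page).
SITE = {
    "/": ("Welcome to our bakery", ["/bread", "/cakes"]),
    "/bread": ("Fresh sourdough bread daily", ["/"]),
    "/cakes": ("Custom birthday cakes", ["/", "/bread"]),
}

def spider(start):
    """Visit pages one at a time, following links to other pages in the site."""
    seen, queue, fetched = {start}, deque([start]), {}
    while queue:
        url = queue.popleft()
        text, links = SITE[url]
        fetched[url] = text  # "spidered", but not yet searchable
        for link in links:
            if link in SITE and link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched

def build_index(fetched):
    """Record which pages contain each word (the "index" or "catalog")."""
    index = {}
    for url, text in fetched.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)
    return index

def search(index, word):
    """Answer from the stored index only; the live site is never contacted."""
    return sorted(index.get(word.lower(), set()))

index = build_index(spider("/"))
SITE["/bread"] = ("Fresh rye bread daily", ["/"])  # the site changes later

print(search(index, "sourdough"))  # the old copy is still what gets searched
print(search(index, "rye"))        # the new wording is not in the index yet
```

Running `build_index(spider("/"))` a second time, after the change, is what would make the new wording searchable.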
Search engine software is the third part of a search engine. This is the program that sifts through the millions of pages recorded in the index to find matches to a search and rank them in order of what it believes is most relevant.

Human-Powered Directories

A human-powered directory, such as the Open Directory, SuperPages, InfoUsa and other internet directories, depends on humans for its listings. You submit a short description of your entire site to the directory, or editors write one for the sites they review. A search looks for matches only in the descriptions submitted. Directories are especially important for "geo-targeting," where the geographic location (address) of a business, home or institution is recorded and sometimes verified. This is useful when a search user is looking for results in their geographic area.

If you make a change to your web pages, it has no effect on your listing. Things that are useful for improving a listing with a search engine have nothing to do with improving a listing in a directory.

Hybrid Search Engines

Hybrid search engines present both crawler-based results and human-powered listings. Usually, a hybrid search engine will favor one type of listing over the other. For example, MSN Search is more likely to present human-powered listings. However, it also presents crawler-based results, especially for more obscure queries.

Major Search Engines: The Same, But Different

All crawler-based search engines have the basic parts described above, but there are differences in how these parts are tuned. That is why the same search on different search engines often produces different results. Keep your content enriching and engaging, and be sure to keep your listings with directories up to date.

To learn more about how we can help guide you, call us at (800) 711-2492 for a free consultation.
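For readers who want to see the difference between the two kinds of listing in concrete terms, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the sample pages, the directory entry, the function names, and above all the scoring rule, which simply counts how often the search words appear. Real search engine software weighs many more factors.

```python
# Hypothetical data: the full page text a crawler has indexed, and the short
# description submitted to a directory for the site as a whole.
PAGES = {
    "bakery.example/": "bakery in Springfield selling bread and cakes",
    "bakery.example/bread": "sourdough bread rye bread and wheat bread",
}
DIRECTORY = {
    "bakery.example": "Family bakery in Springfield",
}

def crawler_search(query):
    """Rank pages by how often the query words appear (a toy relevance score)."""
    words = query.lower().split()
    scores = {}
    for url, text in PAGES.items():
        tokens = text.lower().split()
        score = sum(tokens.count(w) for w in words)
        if score:
            scores[url] = score
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda u: (-scores[u], u))

def directory_search(query):
    """Match only against the submitted descriptions, never the pages."""
    words = query.lower().split()
    return sorted(
        site for site, desc in DIRECTORY.items()
        if any(w in desc.lower().split() for w in words)
    )

print(crawler_search("bread"))          # both pages match; the bread page ranks first
print(directory_search("bread"))        # no match: the description never says "bread"
print(directory_search("springfield"))  # matches the submitted description
```

Editing the page text in `PAGES` changes what `crawler_search` returns, but `directory_search` only changes when the entry in `DIRECTORY` is updated.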

