Navigating the Deep Web
- What is the Deep Web?
- Dynamic v. Static web pages
- What content is part of the Deep Web?
- Why do some people call it the "Invisible Web"?
- Why Search the Deep Web?
- How do I search the Deep Web?
- Links to More Information
What is the Deep Web?
The deep web is made up of dynamic pages, non-textual file formats, and databases of information. It is often referred to as the "invisible web" because it is "invisible" to search engine spiders. Experts on the Internet estimate that the deep web may be 500 times larger than the indexable web. The deep web receives 50% greater traffic than the surface web and is more often linked to than the surface web. 95% of the deep web is publicly available information, not subject to fees.
Dynamic v. Static web pages
The web is made up of two different types of pages: dynamic and static. Dynamic pages are created as the result of a database search. (These are also referred to as database-driven web pages.) Because the content displayed on a dynamic page changes often, the page is regenerated "on the fly" from the information in the database every time it is accessed. A good example of a dynamic web page is Amazon.com. All of the information about what books Amazon has available, how much they cost, etc. is stored in a database. As people make purchases, the information in the database is updated. Amazon wants its customers to have the most up-to-date information about its books, so its pages are run off a database and generated "on the fly". Most e-commerce sites utilize dynamic pages. Another sign that a page is "dynamic" is that it is interactive. These dynamic pages make up what is referred to as the Deep Web.
In contrast, static pages are not constantly changing. These pages are basic HTML files that do not rely on a database for their content. They simply reside on a server waiting to be retrieved. When a change needs to be made to a page, it must be made directly to the HTML code. Memorial Library's homepage is a good example of a static page. Static pages make up what is referred to as the Surface Web.
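The difference between the two kinds of page can be sketched in a few lines of Python. This is only an illustration, not a description of how Amazon or any real site is built: the book title, price, table layout, and function names are all invented for the example. It uses Python's built-in sqlite3 module with a small in-memory database.

```python
import sqlite3

# A static page: fixed HTML that changes only when someone edits the file.
STATIC_PAGE = "<html><body><h1>Library Hours</h1><p>Open 8am to 10pm</p></body></html>"


def make_catalog():
    """Create a tiny in-memory book database (hypothetical data)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE books (title TEXT, price REAL, in_stock INTEGER)")
    db.execute("INSERT INTO books VALUES ('Web Searching Basics', 19.95, 3)")
    db.commit()
    return db


def render_book_page(db, title):
    """Build the HTML for one book from the database at request time."""
    row = db.execute(
        "SELECT title, price, in_stock FROM books WHERE title = ?", (title,)
    ).fetchone()
    if row is None:
        return "<html><body><p>No such book.</p></body></html>"
    name, price, in_stock = row
    return (
        f"<html><body><h1>{name}</h1>"
        f"<p>Price: ${price:.2f}</p><p>In stock: {in_stock}</p></body></html>"
    )


def record_purchase(db, title):
    """Reduce the stock count by one, as a purchase would."""
    db.execute("UPDATE books SET in_stock = in_stock - 1 WHERE title = ?", (title,))
    db.commit()


if __name__ == "__main__":
    db = make_catalog()
    print(STATIC_PAGE)                                   # always the same text
    print(render_book_page(db, "Web Searching Basics"))  # built from the database
    record_purchase(db, "Web Searching Basics")
    print(render_book_page(db, "Web Searching Basics"))  # reflects the purchase
```

The static page is the same string every time it is requested. The dynamic page does not exist until someone asks for it, and its content follows whatever the database holds at that moment.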
What content is part of the Deep Web?
- Non-textual file formats
This includes multimedia files, graphical files, software, and documents in non-standard formats, such as Portable Document Format (PDF).
- Databases of information
This includes subscription-based commercial online information services (InfoTrac, OCLC, Gale, Lexis-Nexis), library catalogs, online encyclopedias, and telephone directories.
- Database/Application Driven pages
This includes dynamically generated pages such as ASP, Cold Fusion, PHP, or CGI.
- Other
This includes breaking news reports, online discussion and chat, and streamed content.
Why do some people call it the "Invisible Web"?
The Deep Web is often referred to as the "Invisible Web" because the content it contains rarely shows up in search engine results. This is because search engine spiders follow links from page to page; they do not go into databases and extract data. As a result, database content is "invisible" to these spiders.
However, the term "invisible web" is not an accurate description of this content. While this content is "invisible" to search engines, other web search technologies and techniques can be used to access it. For example, directories such as Librarians' Index to the Internet frequently index access points to deep web content.
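To illustrate why a spider misses database content, here is a minimal sketch of a link-following crawler in Python. The four-page site is hypothetical and held in a dictionary rather than fetched from the web, and real search engine spiders are far more sophisticated. The point is only that a program which follows links, and never fills in a search form, cannot reach a page that exists solely as the answer to a query.

```python
from html.parser import HTMLParser

# A hypothetical site. The recipe record can only be reached by submitting
# the search form, so no ordinary link points to it.
SITE = {
    "/": '<a href="/about">About</a> <a href="/search">Search recipes</a>',
    "/about": '<p>About this site.</p> <a href="/">Home</a>',
    "/search": '<form action="/results"><input name="q"><input type="submit"></form>',
    "/results?q=hummus": "<h1>Hummus</h1><p>Chickpeas, tahini, lemon, garlic.</p>",
}


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags; forms are ignored."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(site, start="/"):
    """Return the set of pages reachable by following links from the start page."""
    seen, queue = set(), [start]
    while queue:
        path = queue.pop(0)
        if path in seen or path not in site:
            continue
        seen.add(path)
        parser = LinkCollector()
        parser.feed(site[path])
        queue.extend(parser.links)
    return seen


if __name__ == "__main__":
    print(sorted(crawl(SITE)))
```

The crawler finds the home page, the about page, and the search form itself, but never the hummus record behind the form. A directory that lists the search page as an access point gets a person one step closer than the spider can go.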
Why Search the Deep Web?
Experts on the Internet estimate that search engines index only somewhere between 20% and 50% of the surface web. Furthermore, the quality of information in the deep web is usually very high, because the deep web consists of the major, authoritative websites available on the net. Today, there are over 200,000 database-driven websites, and many modern websites use databases to generate pages on the fly. As a result, the deep web is often the best place to look for the answer to your question.
Typically, the deep web is used for "targeted" queries. You should use the deep web if you are looking for information that is likely to be stored in a database. Examples of information that would be stored in a database include addresses and phone numbers. You should also use the deep web if you want to retrieve information whose content changes dynamically. The best example of this is news.
Examples of databases that are part of the Deep Web:
How do I search the Deep Web?
There are several ways you can search the deep web. First, databases that are part of the deep web are often indexed by directories. If you want to find a database containing a certain type of information you might try searching Librarians' Index to the Internet (LII).
Example: I want to find a recipe for hummus.
- Search LII for recipes
- Retrieve results, which are divided into groups; one of these groups is databases.
- Examine the list of databases retrieved and decide that the site RecipeSource looks like a good place to try.
- Click on the hyperlink to RecipeSource
- Search RecipeSource database for hummus
Second, several search engines index deep web content: not every piece of what is in a database, but the databases themselves. Therefore, you can use a search engine such as Google or AltaVista to search for a database of a particular type of information.
Example: I need to find a list of hotels in Washington DC.
- Search Google for database +hotels +"washington dc"
- Examine results list and choose A1 Discount Hotels - Washington DC
- Click on this hyperlink and search their database for hotels in Washington, DC
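The query in the example above follows a simple pattern: the word "database", then each required word or quoted phrase marked with a plus sign. As a rough sketch, the pattern can be written out in Python. The function names are invented for illustration, the search address and its q parameter are assumptions, and search engines today may not treat the plus sign the way this guide describes.

```python
from urllib.parse import urlencode


def build_query(topic_words, exact_phrases=()):
    """Combine 'database', required words, and quoted phrases into one query."""
    parts = ["database"]
    parts += [f"+{word}" for word in topic_words]
    parts += [f'+"{phrase}"' for phrase in exact_phrases]
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """Return a search address with the query encoded in the q parameter.

    The base address and the parameter name are assumptions for this sketch.
    """
    return f"{base}?{urlencode({'q': query})}"


if __name__ == "__main__":
    query = build_query(["hotels"], ["washington dc"])
    print(query)
    print(search_url(query))
```

Building the query from parts makes it easy to reuse the same pattern for another topic, for example build_query(["recipes"]) to look for recipe databases.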
Finally, there are sites that specialize in indexing databases available on the web. These sites can help you find the databases appropriate to your search.
Sources for deep web content include:
Links to More Information
Currently, the deep web is a topic of great discussion in the Internet community. For further information, please see the following links and articles.
- Barker, Joe. "Invisible Web: What it is, Why it exists, How to find it and Its inherent ambiguity." Finding Information on the Internet: A Tutorial 2001. http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/InvisibleWeb.html (16 Sept. 2001)
- Cohen, Laura. "The Deep Web." 2001. http://library.albany.edu/internet/deepweb.html (10 Oct. 2002)
- The Deep Web: Surfacing Hidden Value. http://128.121.227.57/download/deepwebwhitepaper.pdf - a study by BrightPlanet that maintains the Web is 500 times larger than what is indexed by standard search engines.
- Price, Gary and Sherman, Chris (2001). The Invisible Web: Uncovering Information Sources Search Engines Can't See, New York: CyberAge Books
- Sherman, Chris. "The Invisible Web." FreePint. 64. 2000. http://www.freepint.com/issues/080600.htm#feature (22 Oct. 2002)
- Sullivan, Danny. "Invisible Web Gets Deeper." Search Engine Watch. 2000. http://searchenginewatch.com/sereport/00/08-deepweb.html (16 Sept. 2001)

