Abstract:
The Deep Net refers to the thousands of topic-specific search engines on the Internet, including those that are inaccessible to traditional crawler-based search engines. Commercial metasearch engines have been slow to provide a simple, universal interface to these smaller topic-specific search engines. Turbo 10 has developed a commercial metasearch engine that connects to these resources en masse.
Turbo 10 automates the process of creating and maintaining software adapters that connect to, search, and extract results from a multitude of search engines. This poster outlines the functional mechanics of how Turbo 10 searches the Deep Net.
Recent research has highlighted a large number of topic-specific search engines whose contents are inaccessible to crawler-based search engines. These engines have been variously grouped under the umbrella terms ‘invisible web’, ‘deep web’, and ‘hidden web’.
Because crawlers cannot reach the information stored in some of these engines, that information never appears in crawler-built indexes, hence the monikers ‘invisible’ and ‘hidden’.
Turbo10, however, prefers the term ‘Deep Net’ for two reasons: some of these information sources are not web-based (e.g., peer-to-peer networks), and the contents of these databases are neither hidden from nor invisible to metasearch engines. The challenges for a commercial metasearch engine are, first, to connect to these Deep Net sources; second, to select the most relevant sources for each query; and third, to return relevant results as fast as possible.
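The three challenges above can be illustrated with a minimal Python sketch. It is not Turbo10's implementation, which this abstract does not describe. The names `Adapter`, `Result`, `select_adapters`, and `metasearch` are hypothetical, and both the keyword-overlap scoring and the thread-pool fan-out are assumptions chosen only to make the connect, select, and return-fast steps concrete.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from concurrent.futures import TimeoutError as FuturesTimeout
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass(frozen=True)
class Result:
    title: str
    url: str


@dataclass
class Adapter:
    """Hypothetical adapter wrapping one topic-specific engine."""
    name: str
    topics: Set[str]                       # keywords describing the engine's coverage
    search: Callable[[str], List[Result]]  # connects, queries, and extracts results


def select_adapters(query: str, adapters: List[Adapter], k: int) -> List[Adapter]:
    """Rank adapters by keyword overlap with the query and keep the top k matches."""
    terms = set(query.lower().split())
    scored = [(len(terms & a.topics), a) for a in adapters]
    matching = [(score, a) for score, a in scored if score > 0]
    matching.sort(key=lambda pair: pair[0], reverse=True)
    return [a for _, a in matching[:k]]


def metasearch(query: str, adapters: List[Adapter], k: int = 3,
               timeout: float = 2.0) -> List[Result]:
    """Query the selected engines in parallel and merge results, deduplicated by URL."""
    chosen = select_adapters(query, adapters, k)
    if not chosen:
        return []
    merged: List[Result] = []
    seen: Set[str] = set()
    pool = ThreadPoolExecutor(max_workers=len(chosen))
    try:
        futures = [pool.submit(a.search, query) for a in chosen]
        try:
            for fut in as_completed(futures, timeout=timeout):
                try:
                    results = fut.result()
                except Exception:
                    continue  # one failing engine should not sink the whole query
                for r in results:
                    if r.url not in seen:
                        seen.add(r.url)
                        merged.append(r)
        except FuturesTimeout:
            pass  # engines that miss the deadline are left out of the merge
    finally:
        pool.shutdown(wait=False)  # do not block the caller on slow engines
    return merged
```

Results are merged in the order engines respond, so the ordering is not deterministic. A real system would also need a ranking step across engines, which is outside the scope of this sketch.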