Deep Web

The Dark World

deepwebadmin / November 24, 2015

Searching for Hidden Web Databases

Abstract:

Recently, there has been increased interest in the retrieval and integration of hidden Web data with a view to leveraging the high-quality information available in online databases. Although previous works have addressed many aspects of the actual integration, including matching form schemata and automatically filling out forms, the problem of locating relevant data sources has been largely overlooked. Given the dynamic nature of the Web, where data sources are constantly changing, it is crucial to automatically discover these resources.

However, considering the number of documents on the Web (Google already indexes over 8 billion documents), automatically finding tens, hundreds or even thousands of forms that are relevant to the integration task is like looking for a few needles in a haystack. Moreover, since the vocabulary and structure of forms for a given domain are unknown until the forms are actually found, it is hard to define exactly what to look for.

We propose a new crawling strategy to automatically locate hidden-Web databases, which aims to achieve a balance between the two conflicting requirements of this problem: the need to perform a broad search while at the same time avoiding the need to crawl a large number of irrelevant pages.

The proposed strategy achieves this balance in three ways: by focusing the crawl on a given topic; by judiciously choosing, within that topic, the links that are more likely to lead to pages that contain forms; and by employing appropriate stopping criteria.

We describe the algorithms underlying this strategy and an experimental evaluation which shows that our approach is both effective and efficient, leading to larger numbers of forms retrieved as a function of the number of pages visited than other crawlers.
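
To make the three ingredients of the strategy concrete (topic focus, link selection, stopping criteria), here is a minimal sketch in Python. It is not the algorithm from the paper: the keyword-based `topic_score` and `link_score` functions merely stand in for whatever classifiers the authors use, the in-memory `PAGES` dictionary replaces real HTTP fetching, and every name and threshold (`focused_crawl`, `max_pages`, `max_barren`) is a hypothetical choice made for illustration.

```python
import heapq

def topic_score(text, topic_terms):
    """Fraction of topic terms present in the text (stand-in for a page classifier)."""
    words = set(text.lower().split())
    return sum(term in words for term in topic_terms) / len(topic_terms)

def link_score(anchor, form_hints):
    """Number of hint words in the anchor text (stand-in for a link classifier)."""
    words = set(anchor.lower().split())
    return sum(hint in words for hint in form_hints)

def focused_crawl(pages, seed, topic_terms, form_hints, max_pages=50, max_barren=3):
    """Return (urls of on-topic pages with forms, number of pages visited)."""
    frontier = [(0, 0, seed)]  # (negated link score, insertion order, url)
    order = 1
    seen = {seed}
    forms = []
    visited = 0
    barren = 0  # consecutive visited pages that yielded no form
    # Stopping criteria: page budget exhausted, or too many barren pages in a row.
    while frontier and visited < max_pages and barren < max_barren:
        _, _, url = heapq.heappop(frontier)
        page = pages.get(url)
        if page is None:
            continue  # unknown target, nothing to fetch
        visited += 1
        if topic_score(page["text"], topic_terms) == 0:
            barren += 1
            continue  # topic focus: off-topic pages are not expanded
        if page["has_form"]:
            forms.append(url)
            barren = 0
        else:
            barren += 1
        # Link selection: promising anchors are popped from the frontier first.
        for anchor, target in page["links"]:
            if target not in seen:
                seen.add(target)
                heapq.heappush(frontier, (-link_score(anchor, form_hints), order, target))
                order += 1
    return forms, visited

# Toy site graph (entirely made up) used in place of real Web pages.
PAGES = {
    "home": {"text": "used cars dealer", "has_form": False,
             "links": [("about us", "about"), ("search cars", "search"),
                       ("sports news", "news")]},
    "about": {"text": "about our cars dealer", "has_form": False, "links": []},
    "search": {"text": "search cars by make", "has_form": True,
               "links": [("advanced search", "adv")]},
    "adv": {"text": "advanced cars search", "has_form": True, "links": []},
    "news": {"text": "football scores", "has_form": False,
             "links": [("search scores", "scores")]},
    "scores": {"text": "football scores search", "has_form": True, "links": []},
}

forms, visited = focused_crawl(PAGES, "home", ["cars"], ["search"])
print(forms, visited)  # ['search', 'adv'] 5
```

In this toy run the two on-topic forms are reached before the lower-scoring links are tried, and the form behind the off-topic "news" page is never fetched, which is the trade-off the abstract describes: breadth is sacrificed where a page gives no sign of relevance.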

Filed Under: Deep Web Research Papers Tagged With: deep web research papers, focused crawler, hidden web, large scale information integration
