Google passes its one trillionth Web page
http://www.nytimes.com/2009/02/23/technology/internet/23search.html?partner=rss&emc=rss&src=ig
One day last summer, Google’s search engine trundled quietly past a
milestone. It added the one trillionth address to the list of Web
pages it knows about. But as impossibly big as that number may seem,
it represents only a fraction of the entire Web.

Beyond those trillion pages lies an even vaster Web of hidden data:
financial information, shopping catalogs, flight schedules, medical
research and all kinds of other material stored in databases that
remain largely invisible to search engines.

The challenges that the major search engines face in penetrating this
so-called Deep Web go a long way toward explaining why they still
can’t provide satisfying answers to questions like “What’s the best
fare from New York to London next Thursday?” The answers are readily
available — if only the search engines knew how to find them.