Why Writing Your Own Search Engine is Hard
Anna’s focus is on scaling architecture, tackling one of the major problems in search: the exponential growth of the Internet. She was the architect of TeraGoogle, Google’s large search index, which launched in early 2006. While at Google, she was the technical lead of one of the two Web ranking groups, was in charge of GoogleBase, and managed the core piece of Google’s ad-matching technology. She joined Google in 2004 after designing, writing and selling Recall, which at 12 billion pages was the largest search engine in existence at the time. Anna has a PhD in Computer Science from the University of Illinois at Urbana-Champaign, and was a Research Scientist at Stanford University.
The super-stealth search project was founded by highly respected search experts. A husband-and-wife team, the CEO and the VP of Engineering, was joined by Russell Power. Patterson and Power are both ex-Google employees, and the company has been the subject of intense speculation over the last couple of years.
Much of the secret sauce of Cuil is in the way it indexes the web and handles actual queries from users. Both are costly to scale, and Cuil claims to have found a way to massively reduce those costs. That would allow it to run the search engine far more cheaply, even at Google scale should it ever reach that point. By some estimates, Google spends a billion dollars a year to run the back-end infrastructure of its search business.
Cuil also claims to have better search results than Google and others based on how it indexes websites. It does not simply catalog keywords on a site and then rank the site based on its importance. It also works to understand how words are related (France - cheese - wine, for example) in order to return more relevant results to users. This is a semantic approach to search, but very different from a natural language approach. Powerset, for example, uses artificial intelligence to try to understand what sentences on a website actually mean. Cuil, by comparison, simply tries to properly categorize and file a web page, even if the category name doesn’t appear on the site.
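Cuil has not published how it relates words, so the following is only a toy sketch of one generic way such associations could be derived: counting which words appear together in the same document. The function names (`build_cooccurrence`, `related_terms`) and the three-document corpus are invented for illustration and are not Cuil's method.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(documents):
    """Count how often each pair of distinct words shares a document."""
    related = defaultdict(Counter)
    for text in documents:
        # Use the set of words so repeats within one document count once.
        words = sorted(set(text.lower().split()))
        for a, b in combinations(words, 2):
            related[a][b] += 1
            related[b][a] += 1
    return related


def related_terms(related, word, top_n=3):
    """Return up to top_n words that most often co-occur with `word`."""
    counts = related.get(word.lower(), Counter())
    return [term for term, _ in counts.most_common(top_n)]


# Hypothetical toy corpus (assumption: one short string per "page").
docs = [
    "france cheese wine",
    "france wine bordeaux",
    "cheese wine pairing",
]
related = build_cooccurrence(docs)
top_for_wine = related_terms(related, "wine", top_n=2)
```

With this corpus, "france" and "cheese" each share two documents with "wine", so they come out as its two strongest associations. A page could then be filed under a category whose name never appears on it, provided enough of that category's associated words do.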
According to her bio, she was a research associate in the Formal Reasoning Group in the Computer Science Department at Stanford, and her interests included programming languages, concurrency, and the verification and analysis of distributed systems. Her side interest? Visualizing structures using 3-D imaging techniques. (Cuil!) http://www-formal.stanford.edu/annap/www/annap.html
Photo: www.formal.stanford.edu
Disclaimer
The opinions expressed do not represent Educational CyberPlayGround™ views in any way.
© Copyright 2008, edu-cyberpg.com