Monday, July 28, 2008

IT'S BETTER THAN GOOGLE CAUSE IT DOESN'T SPY ON YOU AND COLLECT INFO TO SELL.

Why Writing Your Own Search Engine is Hard

by Anna Patterson, Stanford University

Anna Patterson, President and Founder Of CUIL.com

Anna’s focus is on scaling architecture, tackling one of the major problems in search: the exponential growth of the Internet. Anna was the architect of Google’s large search index, TeraGoogle, which launched in early 2006. While at Google, Anna was the technical lead of one of the two Web ranking groups, in charge of GoogleBase, and the manager for the core piece of Google’s ad-matching technology. She joined Google in 2004 after designing, writing and selling Recall, the largest search engine in existence at the time at 12 billion pages. Anna has a PhD in Computer Science from the University of Illinois at Urbana-Champaign, and was a Research Scientist at Stanford University.

Patterson received her PhD in computer science from the University of Illinois at Urbana-Champaign and was a research scientist at Stanford University working on data mining. She also created a search engine index of 30 billion pages using information from the Internet Archive at Archive.org, after which she was employed by Google from 2004 until 2006.[1] Patterson then founded Cuil and became its president.[2]

At Google she was the architect of Google's large search index, TeraGoogle, which launched in early 2006. Besides her work on serving architecture, she was the technical lead for one of the two Web ranking groups at Google, was in charge of GoogleBase, and managed a core piece of Google's ad-matching technology.

Patterson is the mother of four children and married to Tom Costello.

Anna Patterson, PhD Thesis



TechCrunch
Menlo Park-based Cuil will launch later this evening with an index of 120 billion web pages, making it arguably the most comprehensive search engine on the web (Google doesn’t disclose the size of its index, although it claims to know about a trillion unique web pages). It’s pronounced “cool.”

The super-stealth search project was founded by highly respected search experts. Husband-and-wife team Tom Costello (CEO) and Anna Patterson (VP Engineering) were joined by Russell Power. Patterson and Power are also ex-Google employees, and the company has been the subject of intense speculation over the last couple of years.

Much of the secret sauce of Cuil is in the way they index the web and handle actual queries by users. Both are costly to scale, and Cuil claims to have found a way to massively reduce those costs. That allows them to run the search engine far more cheaply, even at Google scale should it ever reach that point. By some estimates, Google spends a billion dollars a year to run the back-end infrastructure of its search business.

Cuil also claims to have better search results than Google and others based on how they index websites. They do not simply catalog keywords on a site and then rank the site based on its importance. They also work to understand how words are related (France - cheese - wine, for example), to return more relevant results to users. This is a semantic approach to search, but very different from a natural language approach. Powerset uses artificial intelligence to try to understand what sentences on a website actually mean. Cuil, by comparison, simply tries to properly categorize and file a web page, even if the category name doesn’t appear on the site.
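To make the "how words are related" idea concrete, here is a minimal sketch that counts which words tend to appear in the same documents. This is plain co-occurrence counting, not Cuil's actual method, which has not been published. The toy corpus and the function names cooccurrence_counts and related_terms are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count how often each pair of distinct words shares a document."""
    pair_counts = Counter()
    for text in documents:
        # Sort the unique words so each pair is always stored in one order.
        words = sorted(set(text.lower().split()))
        for pair in combinations(words, 2):
            pair_counts[pair] += 1
    return pair_counts


def related_terms(word, pair_counts, min_count=2):
    """Return words that share at least `min_count` documents with `word`."""
    related = {}
    for (a, b), n in pair_counts.items():
        if n >= min_count and word in (a, b):
            related[b if a == word else a] = n
    # Most frequent partners first, ties broken alphabetically.
    return sorted(related, key=lambda w: (-related[w], w))


# Toy corpus, made up for this example (an assumption, not real data).
docs = [
    "france cheese wine",
    "france wine bordeaux",
    "cheese wine pairing",
    "python code search",
]
counts = cooccurrence_counts(docs)
print(related_terms("wine", counts))
```

A real engine would need far more than raw counts (weighting, stemming, phrase detection), but the sketch shows how a page about "wine" could be filed near "france" and "cheese" even when the query never mentions them.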



According to her bio, she was a research associate in the Formal Reasoning Group in the Computer Science Department at Stanford, and her interests included programming languages, concurrency, and the verification and analysis of distributed systems. Her side interest? Visualizing structures using 3-D imaging techniques. (Cuil!)
http://www-formal.stanford.edu/annap/www/annap.html
Photo: www.formal.stanford.edu



Monday, July 28, 2008 5:52:12 PM (Eastern Daylight Time, UTC-04:00)