
2009-02-12 Query and sort

posted Feb 11, 2009, 7:13 PM by Unknown user   [ updated Apr 10, 2009, 2:07 AM by Eddie Woo ]
For further details, please refer to IPT: Overview of Databases

Google uses both "query" and "sort". Do some research and discover how each of them is used when carrying out searches through Google's database.
 
Google applies a query when it matches pages against the search terms, e.g. "bach -vu" (that is, bach NOT vu) returns results that contain "bach" but not "vu". Google applies a sort when it orders the matching results by estimated relevance. PageRank, which scores a page from its incoming links, is one input to that ordering.
 
What relational/logical operators are offered?
 
Relational: CONTAINS, DOES NOT CONTAIN, EQUALS (phrase)
Logical: AND, OR, NOT
Special: DEFINE, SITE, INFORMATION, LINK, CACHE, [#]..[#] (number range search)
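
As a rough illustration of how the relational and logical operators above behave, here is a minimal Python sketch. The PAGES dictionary and the contains, equals and search helpers are hypothetical names made up for this example; this is not how Google implements its search.

```python
# Hypothetical mini "index": page name -> page text (made-up data).
PAGES = {
    "weather": "temperature in switzerland today",
    "exact": "temperature in switzerland",
    "music": "bach cello suites",
}

def contains(text, term):
    """CONTAINS: the term appears somewhere in the text."""
    return term.lower() in text.lower()

def equals(text, phrase):
    """EQUALS: the whole text is exactly the phrase."""
    return text.lower() == phrase.lower()

def search(predicate):
    """Return the sorted names of pages whose text satisfies the predicate."""
    return sorted(name for name, text in PAGES.items() if predicate(text))

# Relational operators
contains_hits = search(lambda t: contains(t, "temperature"))
equals_hits = search(lambda t: equals(t, "temperature in switzerland"))

# Logical operators combine relational tests
and_hits = search(lambda t: contains(t, "temperature") and contains(t, "today"))
or_hits = search(lambda t: contains(t, "bach") or contains(t, "today"))
not_hits = search(lambda t: contains(t, "temperature") and not contains(t, "today"))
```

CONTAINS matches both temperature pages, while EQUALS matches only the page whose entire text is the phrase.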
 
What are the most generally useful and why? What situations would require the "less useful" operators?
 
The most useful operator in a search engine is CONTAINS, because most searches look for pages that contain the search term rather than pages that equal it. A page that consists of "temperature in switzerland" and nothing more is not particularly useful.
 
Very specific searches require the less common operators. For example, the end user may want to filter out popular but unwanted results, e.g. searching for "Bach" while omitting results about the composer.
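
The "Bach without the composer" search can be sketched in Python as below. This assumes the exclusion is typed with a minus prefix (e.g. "bach -composer"), and the parse_query and matches helpers and the two sample documents are hypothetical, made up only to show the filtering idea.

```python
def parse_query(query):
    """Split a query into required terms and excluded ("-term") terms."""
    required, excluded = [], []
    for word in query.lower().split():
        if word.startswith("-") and len(word) > 1:
            excluded.append(word[1:])  # drop the leading minus sign
        else:
            required.append(word)
    return required, excluded

def matches(text, query):
    """True if the text has every required term and no excluded term."""
    required, excluded = parse_query(query)
    words = set(text.lower().split())
    has_required = all(term in words for term in required)
    has_excluded = any(term in words for term in excluded)
    return has_required and not has_excluded

# Made-up sample documents for illustration.
DOCS = [
    "Bach composer of the Brandenburg concertos",
    "Richard Bach wrote Jonathan Livingston Seagull",
]

results = [doc for doc in DOCS if matches(doc, "bach -composer")]
```

Only the second document survives, because the first one contains the excluded word.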
 
What is actually being sorted in a Google search? What field is being sorted, and how?
 
The records that are sorted in a Google search are documents on the internet. The field being sorted is the "PageRank" field, whose value varies for each result depending on the number of incoming links, estimated relevance, etc. Results with a higher value are listed first.
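
To show what sorting records on one field looks like, here is a small Python sketch. It uses a raw count of incoming links as a stand-in for the rank field, which is a simplifying assumption and not the actual PageRank calculation. The URLs and numbers are made up.

```python
from operator import itemgetter

# Each record is one document; "incoming_links" is the field we sort on.
records = [
    {"url": "a.example", "incoming_links": 12},
    {"url": "b.example", "incoming_links": 340},
    {"url": "c.example", "incoming_links": 57},
]

# Sort descending so the document with the most incoming links comes first.
ranked = sorted(records, key=itemgetter("incoming_links"), reverse=True)
ranked_urls = [record["url"] for record in ranked]
```

The original list is left unchanged, and ranked holds the same records in result order.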