Search engine wars…

We all use search engines several times a day. Most of us stick to our favourite engine, most likely Google, and are happy with the results it fetches. But the fact remains that, particularly given how disconnected the web is, we are missing out on many results. Popular things (people, companies, search engines) always have to pay a price. Just as most viruses are written for Windows, most search engine hacks are targeted at Google.

As a simple example, a search for miserable failure leads you to the Biography of President George W. Bush (a search for just failure does too) and the Biography of Jimmy Carter! This is a typical example of a googlebomb:

“A Google bomb or Google washer is a certain attempt to influence the ranking of a given page in results returned by the Google search engine. Due to the way that Google’s PageRank algorithm works, a page will be ranked higher if the sites that link to that page all use consistent anchor text. A Google bomb is created if a large number of sites link to the page in this manner. Google bomb is used both as a verb and a noun.”
Wikipedia entry on Google Bomb.

Wikipedia has an entry on miserable failure, as well.
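
To make the mechanism in the quote concrete, here is a toy sketch in Python. It is not Google's algorithm; it only counts how many distinct sites link to a page with anchor text matching the query, which is the signal a Google bomb inflates. The function name `rank_by_anchor_text` and all the `.example` sites and URLs are made up for illustration.

```python
from collections import defaultdict


def rank_by_anchor_text(links, query):
    """Rank target pages by how many distinct sites link to them
    using anchor text that exactly matches the query (case-insensitive).

    links: iterable of (source_site, target_url, anchor_text) tuples.
    Returns a list of (target_url, site_count), highest count first.
    """
    wanted = query.strip().lower()
    sites_per_target = defaultdict(set)
    for source_site, target_url, anchor_text in links:
        if anchor_text.strip().lower() == wanted:
            # A set means each linking site counts once, however often it links.
            sites_per_target[target_url].add(source_site)
    counts = [(url, len(sites)) for url, sites in sites_per_target.items()]
    # Sort by count (descending), then by URL so ties are deterministic.
    return sorted(counts, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical link data: three blogs agree on the anchor text for one page.
links = [
    ("blog-a.example", "https://target.example/bio", "miserable failure"),
    ("blog-b.example", "https://target.example/bio", "Miserable Failure"),
    ("blog-c.example", "https://target.example/bio", "miserable failure"),
    ("blog-c.example", "https://other.example/page", "miserable failure"),
    ("blog-d.example", "https://target.example/bio", "official biography"),
]

ranking = rank_by_anchor_text(links, "miserable failure")
print(ranking)
```

The more sites that agree on the anchor text, the higher the target climbs for that phrase, even though the phrase never appears on the target page itself.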

It’s no big deal that a search for failure pops up such a result, since you are quite unlikely to google for such terms. But it clearly shows that, despite spam filters and the like, you can still end up seeing results that do not matter to you in the context of your search. Note the word context.

That brings me to clustering search engines. Considering how big Yahoo! (powered by Inktomi) was, it was foolhardy to venture to build another search engine, but that’s what great technology can do, and we ended up with the legendary Google. The next burning question is whether anyone can challenge the hegemony of Google. Every newcomer now probably builds on the Google experience of an extremely uncluttered start page. So that’s good. Next, they try various other things, one of them being clustering. Clustering is certainly useful for disambiguation (you’ve seen this term often if you regularly use Wikipedia!). Searching for ‘Cricket’ would give you entries on the game as well as the insect, but far more of the former, so you would be hard-pressed to find entries on the latter. Cricket is perhaps a poor example, in that there are so many entries on the game that it bulldozes the insect out of the results!
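
As a rough illustration of what clustering buys you for a query like cricket, here is a minimal Python sketch. It is an assumption-laden toy, not how any real clustering engine works: it greedily groups result snippets that share at least two keywords, and the snippets, the stopword list and the `min_shared` threshold are all invented for the example.

```python
# Tiny, hand-picked stopword list; a real system would use a much larger one.
STOPWORDS = {"a", "an", "and", "the", "is", "for", "its", "with", "between", "two"}


def keywords(snippet, query):
    """Lowercased words of a snippet, minus stopwords and the query term."""
    words = {w.strip(".,!?").lower() for w in snippet.split()}
    return words - STOPWORDS - {query.lower()}


def cluster_snippets(snippets, query, min_shared=2):
    """Greedy single-pass clustering: a snippet joins the first cluster
    whose vocabulary shares at least `min_shared` keywords with it,
    otherwise it starts a new cluster. Returns lists of snippet indices."""
    clusters = []  # each entry: {"vocab": set of words, "members": [indices]}
    for index, snippet in enumerate(snippets):
        words = keywords(snippet, query)
        for cluster in clusters:
            if len(words & cluster["vocab"]) >= min_shared:
                cluster["members"].append(index)
                cluster["vocab"] |= words
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append({"vocab": set(words), "members": [index]})
    return [cluster["members"] for cluster in clusters]


# Hypothetical result snippets for the query "cricket".
snippets = [
    "Cricket is a bat and ball game played between two teams",
    "India wins the cricket match with bat and ball heroics",
    "The cricket is an insect known for its chirping song",
    "Male cricket insect chirping attracts mates",
]

groups = cluster_snippets(snippets, "cricket")
print(groups)
```

With the results split into a game group and an insect group, the insect entries no longer have to fight the game entries for a place on the first page.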

Vivisimo is a clustering search engine that does a decent job.

Previewseek (“the world’s most advanced search engine” is their slogan) is a very good-looking search engine that throws up quite interesting results. For example, a search for cricket directly gives you several useful links. Right at the top it says “Previewseek know this about cricket …”, which is of great use if you are looking to understand a term or search (I almost said google!) for information about something. The Wikipedia entry for cricket is also on the first page. It does use some bandwidth by displaying screenshots of the pages (of the parent page of the page you wish to see, I feel), but for broadband users that’s nothing to worry about; you can preview a page before you jump into it. The interface looks AJAX-based, which is cool, and it lets you add search terms based on the results (clicking the plus next to India in the cricket results takes you to cricket india results). Google does not give the Wikipedia entry for cricket in its top ten. In fact, it’s at a rather poor 34 on the results list!

While I am on the topic of result ranks, I must mention this website, synerge, which compares the ranks of results between Yahoo! and Google.
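
The idea behind such a rank comparison is simple enough to sketch in a few lines of Python. This is only a guess at the general approach, not synerge's actual method: given two ordered result lists for the same query, look up where each URL sits in each list. The function name `compare_ranks` and the `.example` URLs are hypothetical, and each list is assumed to contain no duplicate URLs.

```python
def compare_ranks(results_a, results_b):
    """Map every URL to its (rank in A, rank in B); ranks start at 1,
    and None means the URL does not appear in that engine's list."""
    rank_a = {url: i for i, url in enumerate(results_a, start=1)}
    rank_b = {url: i for i, url in enumerate(results_b, start=1)}
    comparison = {}
    for url in rank_a.keys() | rank_b.keys():
        comparison[url] = (rank_a.get(url), rank_b.get(url))
    return comparison


# Hypothetical top results from two engines for the same query.
engine_a = ["https://one.example", "https://two.example", "https://three.example"]
engine_b = ["https://two.example", "https://four.example", "https://one.example"]

comparison = compare_ranks(engine_a, engine_b)

# URLs that both engines returned, in a stable (alphabetical) order.
shared = sorted(url for url, (a, b) in comparison.items()
                if a is not None and b is not None)
for url in shared:
    print(url, comparison[url])
```

A URL that is near the top of one list and missing from the other is exactly the kind of result you lose by sticking to a single engine.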

PageRank is good, no doubt, but spamming and the like have their effect on it. I am certain that any search algorithm, by definition, can be tricked. It’s just like encryption: no key is unbreakable (I’ll ignore quantum cryptography for the moment!). However, 1024-bit encryption does offer practically foolproof security. We must simply continue to strive for a search algorithm that is, likewise, practically foolproof.

There’s another search engine, Kosmix, and this is what they have to say about themselves:

“At Kosmix, we’re passionate about building a world class search engine that lets people search less, and discover more great stuff. There are billions of pages on the web that are useful, but never see the light of day through a standard search engine. We want to help you find those great pages, and make it easy and fun to do in the process.

Right now we’re in the early stages of Kosmix, and at this point only cover a handful of categories. Our list is growing fast, so check back with us to see what’s new.”

While Google ranks pages based on popularity, Kosmix promises to work differently, with categorization based on content. With Kosmix, users will be asked to define a search category, and the search engine will then find web pages that are closely associated in meaning with the search term. Kosmix, which has already started testing a health search on its website, will launch several other search categories over the next year.

The purpose of this article is just to make sure you keep an eye on other search engines, including MSN, and do not blindly run behind Google, however difficult that may be. None of them is way behind the others, except in popularity! Of course, I’d use Google Scholar ahead of any other search engine, but even there, there’s Entrez PubMed (or HubMed), Scirus and the like, which are quite good.
The race is well and truly on, and we, as consumers, are in for a treat, for the problem of plenty is a good one to have!

India to help Boeing fly into future…

This article on rediff today does make one proud of the research in the country. Once again, it goes to show that great people can do great work anywhere.

Read on:

George Iype in Kochi |
October 26, 2005
http://www.rediff.com/money/2005/oct/27spec.htm

In one more example of the world’s discovery of India as the place for cutting edge technology development, most of the designs for building Boeing’s next generation aircraft are going to be created and tested by the Indian Institute of Science, Bangalore.

IISc, India’s premier scientific research institute, has joined hands with Boeing, the leading American manufacturer of satellites, commercial jetliners, and military aircraft, to build next generation aircraft.

Nearly 40 faculty members from various IISc departments — like aerospace, metallurgy, centre for product design and manufacturing and civil engineering — are involved in the Boeing project, which is being managed by the Society for Innovation and Development. SID is IISc’s commercial arm, which was founded more than a decade ago.

SID undertakes research and development projects based on individual or joint proposals from IISc faculty and scientists, in collaboration with national and international organisations and business houses.

SID Chief Executive S Mohan said Boeing signed a memorandum of understanding with the Institute earlier this year.

“IISc is the only Asian institution that Boeing has tied up with for research and transfer of technology,” Mohan told rediff.com

Boeing’s other global partners in research include Carnegie Mellon, Stanford Engineering, Massachusetts Institute of Technology, Caltech, University of Illinois at Urbana Champaign and University of Cambridge.

Under the terms of the IISc-Boeing tie-up, the aerospace major would invest $50,00,000 ($5 million) in research every year for the next five years in the company’s projects with the Institute.

“We have identified nine projects in which we will work with Boeing to build next generation aircraft,” Mohan said.

To build these new planes, the IISc team has proposed the use of smart structures and the application of lightweight components like nano materials, alloys and their composites.

IISc’s areas of focus include developing flaps for the aircraft that are fitted with smart sensors — so that they can direct wind currents better — and use of aluminium alloys in high temperature areas as well as in landing gear boxes.

The designs will be tested in a virtual environment being developed at the Institute.

“The Boeing project involves lots of innovative research. It is going to be interesting and very challenging,” a researcher involved with the project said.

SID will enable innovations in science and technology by helping industries and business establishments compete and prosper in the face of global competition, turbulent market conditions and fast moving technologies, Mohan said.

The Boeing project is one of SID’s many ongoing ones.

IISc launched SID with just one project in the year 1994, and a total financial outlay of Rs 2,25,000. Till date, SID has generated approximately Rs 600 million worth of research projects.

Some of SID’s successful projects have been:

  • Development of a software tool for performance evaluation of ATM switches
  • Development of a 2.7 MW thermal gasifier system
  • Development of dynamic surface force apparatus
  • Development of high voltage power supplies for airborne application
  • High speed oxygen sputtering system
  • Initiation of umbrella R&D programmes with organisations like Nokia, General Motors, Honeywell