How to improve web search

I originally wrote this on January 21, 2010. A few weeks ago I was helping a family member with some research and searched the web on the topic. The results were very bad no matter what query I used. Not only were they irrelevant, but duplicates and noise overshadowed everything. I tried several search engines: Google, Bing, Yahoo, and others.

This is not new, of course. Though search services are very good, there is just a lot of noise out there. Worse, there is an adversarial relationship between search targets and the search services. For an example, see the blog post The Anatomy Of A Bad Search Result.

During the following days I thought about this, and then, while I was driving, the solution popped into my head (very dangerous!): use the results of a search to present possible terms the user can add to augment the search. One technique I thought of using was Tag Clouds. In Bing, Yahoo, or Google, one would just do a normal search, but new panels would then appear with different types of Tag Clouds: terms, news, location, entertainment, etc. The user could click on a term and it would be added to the search string. That would solve not only the problem of not knowing what to add to narrow the search, but also the problem of not knowing whether the search terms are getting closer to the “real” topic. Subsequent searches would allow refinement by adding and removing terms, and new tag clouds would form.
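To make the idea a bit more concrete, here is a minimal sketch in Python. It assumes the result snippets are already available as plain strings; the stop-word list, the function names, and the sample snippets are all made up for illustration, and a real search service would weight terms far more carefully than by raw counts.

```python
from collections import Counter
import re

# Hypothetical stop-word list; a real service would use a much larger one.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "for", "is", "on", "with"}

def suggest_terms(query, snippets, top_n=5):
    """Count words across result snippets and return the most frequent
    ones that are not already in the query, together with their counts."""
    query_terms = set(query.lower().split())
    counts = Counter()
    for snippet in snippets:
        for word in re.findall(r"[a-z]+", snippet.lower()):
            if word not in STOP_WORDS and word not in query_terms:
                counts[word] += 1
    return counts.most_common(top_n)

def refine(query, term):
    """Simulate a click on a tag: append the term to the search string."""
    return f"{query} {term}"

# Made-up snippets standing in for the results of a normal search.
snippets = [
    "Jaguar speed and habitat in the rainforest",
    "Jaguar habitat loss in South America",
    "Jaguar car dealership prices",
]
cloud = suggest_terms("jaguar", snippets)
print(cloud)                          # the counts would drive the tag sizes
print(refine("jaguar", cloud[0][0]))  # the refined search string
```

Running the refined query and calling suggest_terms again on the new snippets would give the next tag cloud, which is the refinement loop described above.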

Well, when I get an idea and know how to embellish it and make it powerful, it’s fun to do so. So I also saw how to extend this to do much more, such as using Semantic tools, RDFa, Topic Maps, and graphical visualizations. While doing research to see if I could write this up, I stumbled upon DeeperWeb. Wow, they use Tag Cloud technology to do just what I wrote about above. Not only did they implement this, they also started to use some Semantic Web technologies like Topic Maps, which I was also thinking about. Oh well, at least I found something that hopefully will improve search.

Unfortunately, the DeeperWeb plugin has not improved my search results. In fact, it has hardly been useful at all; only occasionally was the phrases panel useful. Puzzling. I think this is not possible with a plugin. Instead, the search services themselves must provide the “deep web” results. For example, Google has a ginormous database that could be used to create a tag cloud or another type of search augmentation. I know there were some lab experiments, like a mind-map-like visual explorer (which was lame).

In the meantime, just for fun, I’ll look into the other ideas I had. Tag Clouds were just an interim means; much more can be done. There is a rich storehouse of further information that can be correlated and presented to assist the user in search: location-based filtering, social-network-based topic discovery, cultural cues, and even search augmented by crowdsourcing. Let’s say you are searching for xyz. This search could be broadcast to subscribers of a search service, who could then augment the search terms to provide better results. They would ask pertinent questions, of course, and privacy is paramount.

As a Google employee stated, most of search has been solved; it’s the last part that will take the most effort, the usual 80/20 rule.

I’ve had this blog post in draft for so long now that I’ve lost the momentum. Oh well.

Fixed but fluid ontologies?

I wonder what the relationship is between the Colon Classification system in library science, developed by Shiyali Ramamrita Ranganathan, and other ontological studies like the Top-Level Categories developed by John F. Sowa.

There does not appear to be any, judging from:

“Faceted classification is used in faceted search systems that enable a user to navigate information along multiple paths corresponding to different orderings of the facets. This contrasts with traditional taxonomies in which the hierarchy of categories is fixed and unchanging.” — http://en.wikipedia.org/wiki/Faceted_classification
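As a rough illustration of what the quote means by “multiple paths”, here is a small Python sketch. The catalogue, the facet names, and the helper functions are all hypothetical, and this is only one simple way to model facets (as independent attributes on each item), not how any particular library system does it.

```python
# Hypothetical catalogue: each item carries independent facet values.
ITEMS = [
    {"title": "Intro to Topic Maps", "subject": "semantics", "format": "book", "year": 2008},
    {"title": "RDFa Primer", "subject": "semantics", "format": "article", "year": 2008},
    {"title": "Search Engine Basics", "subject": "search", "format": "book", "year": 2009},
]

def drill_down(items, **facets):
    """Keep only the items that match every given facet value."""
    return [item for item in items
            if all(item.get(name) == value for name, value in facets.items())]

def facet_counts(items, facet):
    """Count how many items fall under each value of one facet."""
    counts = {}
    for item in items:
        counts[item[facet]] = counts.get(item[facet], 0) + 1
    return counts

# Two different orderings of the facets lead to the same item.
path_a = drill_down(drill_down(ITEMS, subject="semantics"), format="book")
path_b = drill_down(drill_down(ITEMS, format="book"), subject="semantics")
print(path_a == path_b)
print(facet_counts(ITEMS, "format"))
```

In a fixed taxonomy the item would live under exactly one branch, say subject first and then format; here neither facet comes first, so the user picks the order.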

I had an idea that bridging that gap would enable some cool new applications. Unfortunately, I don’t remember the details and can’t find the page I wrote it on. Oh well, it was probably nothing.