Search 2.0: what it might be

As a starting point, take two fairly well-known quotes:
  1. "Search engines have become one of the two new wonders of the world, giving Homo sapiens unlimited and instant access to information." Ilya Segalovich, "How search engines work"
  2. "The Internet is like a big dustbin: everything is in there, but finding it is impossible." Folk wisdom

Let us single out three major problems faced by modern search engines.

The main problems of information retrieval


  1. A search for links instead of a search for information.
    Aren't you tired of snippets yet? How much time is spent walking through the links and then searching for the information on the pages themselves? Yes, along the way we spin banners, counters and other visit metrics, pleasing the site owners, but nobody suffers from publishing an RSS feed, do they? In 90 cases out of 100 we are looking for information and receive a link to where it might be located. Instead of instant access, the desired information moves away from us behind yet another barrier.
  2. Depth-first search instead of breadth-first search.
    For the query "Internet" Yandex offers 602 million pages, and this figure continues to grow. If you asked a person "What is the Internet?" and heard in reply "I have heard this word 602 million times, which one are you interested in?", you would hardly be satisfied. Most likely, the person you asked would start talking about protocols if they are a technical specialist, or about social networks otherwise. In any case, you would get a single answer to the question, even if that answer turned out to be absolutely accurate and absolutely useless, as in the famous joke.
  3. Mixed search results.
    Enter the query "Seagull". You will get a heap of information about the watch, the car, the bird and so on, in a single list, all mixed together. Yes, some systems make timid attempts to show related words or a tree of clusters alongside, but that only spawns a new query with its own local chaos. Why can't the results be neatly cut into sections (not related words) and presented section by section? Birds separately, cars separately; otherwise the immediacy of access to information drowns in its heterogeneity.
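The "results by section" idea from the third problem can be pictured with a minimal sketch. It assumes every result already carries a topic label; the `topic` field and the sample data are hypothetical, and how a real engine would assign such labels is left open here.

```python
from collections import defaultdict

def group_by_section(results):
    """Group a flat, mixed result list into sections keyed by topic."""
    sections = defaultdict(list)
    for result in results:
        sections[result["topic"]].append(result["title"])
    return dict(sections)

# Hypothetical mixed results for the query "Seagull"
mixed = [
    {"title": "Seagull watches catalogue", "topic": "watch"},
    {"title": "Seagull: habitat and diet", "topic": "bird"},
    {"title": "Seagull car review", "topic": "car"},
    {"title": "Gull species of Europe", "topic": "bird"},
]

# Print each section separately instead of one mixed list
for topic, titles in group_by_section(mixed).items():
    print(topic)
    for title in titles:
        print("  -", title)
```

Sections appear in the order their first result was seen, so the original ranking still decides which section comes first.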

Developers have a special mantra for all occasions, which can be applied to each of the issues listed: "the user does not know what he wants", so we are forced to do what we do. But excuse me, when you ask a colleague or a passer-by a question, does he understand what you want from him?

Description of Search 2.0


Criticizing is all too easy, so what would I like to see in the search engines of "tomorrow", or 2.0?
  1. Interface: web chat, command-line interface (CLI), instant messenger (IM)
  2. The principle of a dialogue between the search system and the user
  3. Differences:
    • the results are information rather than links
    • support for the context of the conversation ("what's IT worth?")
  4. Bonus:
    • AI that is able to generate new information by inference
    • support for small talk ("hi, how are you?")


Finally, a brief overview of my own attempts to tackle the problems outlined at the start (important note: all the services described below work only with English, because they are all just prototypes). This is not Search 2.0; it is an auxiliary add-on on top of Google that reformulates the search results.

Search explorer


I called my first attempt to impose my own order on search results Search explorer and placed it at newisearch.com. Its main features:
  • identifying key phrases in context
  • clustering them into thematic groups
  • navigation across groups via "Bookmarks"
  • an "I feel lucky" mode, which lets you navigate not through the search results but through "ready" sites that are most relevant to the selected cluster
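How Search explorer actually identifies key phrases and builds its groups is not described here, so the following is only a naive stand-in for the general idea: take the non-stop-word terms of each snippet, and file the snippet under the term it shares with the most other snippets. The stop-word list, function names and sample snippets are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# A tiny, hypothetical stop-word list; a real one would be much longer
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "for", "to", "is", "on", "with"}

def key_terms(text):
    """Return the lowercase non-stop-word terms of a text."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

def cluster_snippets(snippets, query):
    """File each snippet under its most widely shared term (query terms excluded)."""
    query_terms = set(key_terms(query))
    # Count in how many snippets each term occurs
    doc_freq = Counter()
    for snippet in snippets:
        doc_freq.update(set(key_terms(snippet)) - query_terms)
    clusters = defaultdict(list)
    for snippet in snippets:
        terms = set(key_terms(snippet)) - query_terms
        # Most shared term wins; ties are broken alphabetically
        label = min(terms, key=lambda t: (-doc_freq[t], t)) if terms else "other"
        clusters[label].append(snippet)
    return dict(clusters)

snippets = [
    "Seagull watch with automatic movement",
    "Vintage Seagull watch for sale",
    "Seagull bird nesting on the coast",
    "The seagull is a coastal bird",
]
print(cluster_snippets(snippets, "seagull"))
```

The cluster labels produced this way could serve as the "Bookmarks" for navigation; a real implementation would likely need phrase detection and stemming on top.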


Command line interface agent


The second attempt is a Jabber agent, agent@clisearch.net, which selects the five most relevant paragraphs about the subject of the search. This is already a command-line interface, though without context support and with a long response time, because a large number of result pages has to be processed.
An example session:

me: -q internet
agent:
1. The Internet is a global system of interconnected computer networks that use the standard Internet Protocol Suite ( TCP/ IP) to serve billions of users worldwide. It is a network of networks that consists of millions of private, public, academic, business, and government networks of local to global scope that are linked by a broad array of electronic and optical networking technologies. — en.wikipedia.org/wiki/Internet
2. All Categories Communications Computer Industry Companies Computer Science Data Graphics Hardware Internet and Online Services Mobile Computing Multimedia Networks Open Source Operating Systems Programming Software Standards Types of Computers Wireless Computing World Wide Web www.webopedia.com/internet.html
3. Most traditional communications media, such as telephone and television services, are reshaped or redefined using the technologies of the Internet, giving rise to services such as Voice over Internet Protocol ( VoIP) and IPTV. Newspaper publishing has been reshaped into Web sites, blogging, and web feeds. The Internet has enabled or accelerated the creation of new forms of human interactions through instant messaging Internet forums and social networking sites. — en.wikipedia.org/wiki/Internet
4. The origins of the Internet reach back to the 1960s when the United States funded research projects of its military agencies to build robust, fault - tolerant and distributed computer networks. This research and a period of civilian funding of a new U.S. backbone by the National Science Foundation spawned worldwide participation in the development of new networking technologies and led to the commercialization of an international network in the mid 1990s, and resulted in the following popularization of countless applications in virtually every aspect of modern human life. -http://en.wikipedia.org/wiki/Internet
5. The Internet today is a widespread information infrastructure, the initial prototype of what is often called the National ( or Global or Galactic) Information Infrastructure. Its history is complex and involves many aspects — technological, organizational, and community. And its influence reaches not only to the technical fields of computer communications but throughout society as we move toward increasing use of online tools to accomplish electronic commerce, information acquisition, and community operations. — www.isoc.org/internet/history/brief.shtml
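Selecting the "most appropriate" paragraphs from many pages can be sketched as a simple ranking. The scoring used by the agent is not documented here; the version below assumes a naive relevance score (the share of a paragraph's words that are query terms), and the `pages` structure and URLs are hypothetical.

```python
import re

def score(paragraph, query):
    """Share of the paragraph's words that are query terms (a naive relevance score)."""
    words = re.findall(r"[a-z0-9]+", paragraph.lower())
    terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    if not words:
        return 0.0
    return sum(w in terms for w in words) / len(words)

def best_paragraphs(pages, query, limit=5):
    """pages maps a source URL to its paragraphs; return the top (paragraph, url) pairs."""
    candidates = [
        (score(paragraph, query), paragraph, url)
        for url, paragraphs in pages.items()
        for paragraph in paragraphs
    ]
    # Drop paragraphs that never mention the query, then sort best first
    ranked = sorted((c for c in candidates if c[0] > 0), key=lambda c: c[0], reverse=True)
    return [(paragraph, url) for _, paragraph, url in ranked[:limit]]

# Hypothetical downloaded pages
pages = {
    "example.org/a": ["The Internet is a global network of networks.", "Contact us for advertising."],
    "example.org/b": ["Internet protocols power the Internet.", "Cats are popular pets."],
}
for paragraph, url in best_paragraphs(pages, "internet"):
    print(paragraph, "-", url)
```

Every candidate page has to be fetched and split into paragraphs before ranking, which is consistent with the long response time noted above.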

Search wave


The next service was an attempt to build a dialog mode of operation with support for the context of the conversation. Because each discussion thread in the service looked like a wave, it was named Search wave (newisearch.com/wave).
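Context support in a dialog can be illustrated with a very small sketch. It is not how Search wave works internally (that is not described here); it only assumes that the session remembers the last topic and substitutes it for a standalone "it" in a follow-up question. The class and method names are hypothetical.

```python
import re

PRONOUN = re.compile(r"\bit\b", flags=re.IGNORECASE)

class SearchSession:
    """Remembers the topic of the conversation so follow-ups can refer to 'it'."""

    def __init__(self):
        self.topic = None

    def resolve(self, message):
        """Return the query to run, replacing a standalone 'it' with the remembered topic."""
        if self.topic and PRONOUN.search(message):
            # A function is used as the replacement so the topic is inserted literally
            return PRONOUN.sub(lambda match: self.topic, message)
        # No pronoun (or no context yet): the message becomes the new topic
        self.topic = message
        return message

session = SearchSession()
print(session.resolve("Seagull watch"))
print(session.resolve("what's IT worth?"))
```

Each new thread would get its own `SearchSession`, so that the context of one conversation does not leak into another.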

Search summary


The last attempt at search engine optimization covered in this article is Search summary (newisearch.com/sum), which takes an ever-growing number of search results and reduces them to a manageable number (forgive me, optimizers), cutting them up by topic. Its main features:
  • a breakdown of search results into a number of clusters, with several snippets in each and navigation between them
  • the ability to "fall" inside a selected cluster (drill down) and start a new search based on the current keywords
  • further development of the project: showing a summary right away instead of snippets
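The drill-down step can be sketched as building a narrower query from the current one plus the keywords of the chosen cluster. This is only an assumption about the mechanics; the function name, the `max_extra` limit and the sample keywords are hypothetical.

```python
def drill_down(query, cluster_keywords, max_extra=3):
    """Build a narrower query from the current query plus a cluster's top keywords."""
    seen = set(query.lower().split())
    extra = []
    for keyword in cluster_keywords:
        # Skip words already present in the query or already added
        if keyword.lower() not in seen:
            seen.add(keyword.lower())
            extra.append(keyword)
        if len(extra) == max_extra:
            break
    return " ".join([query] + extra)

print(drill_down("seagull", ["watch", "seagull", "automatic", "movement", "price"]))
```

Capping the number of added keywords keeps the new query from becoming so specific that it returns nothing.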


Our homegrown efforts do not end here; the next milestone is semantic search. But that's another story.
Article based on information from habrahabr.ru
