Internet: Search Tools

Welcome to the Internet Guide, which provides basic information about the Internet and how to search it effectively.

Search tools

The Internet, the world's largest library (a library without walls and without librarians), is estimated to have billions of sites, and this number is growing exponentially. Many directories and search engines attempt to bring order to the Web in the form of indexed lists or subject directories that help in retrieving information.

The Web is a dynamic entity: new sites appear every day, some disappear and some change their addresses. Search engines continuously modify their search features, and new search engines appear every now and then. There is no perfect search engine, or anything close to one, because the Web is far too large, disorganized and fast changing.

Internet Search Tools

Keyword Engines
Subject Directories
Meta-Search Tools
Invisible Web
Natural-language search
Image search
Medicine-Specific Search

Keyword Search Engines (e.g. Google)

Keyword searching is similar to searching the index of a book. These search tools are essentially databases compiled by computer programs (robots) that roam the Web and index sites by title, heading, URL and text. A disadvantage of such a search is that it produces a large number of results; this was later improved by introducing advanced search techniques such as Boolean and field searching.
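
The indexing idea above can be sketched in a few lines. This is a minimal illustration, assuming a tiny hand-made set of pages (the example.org URLs and their contents are hypothetical); it is not how any particular search engine is built. It indexes each page by URL, title and text, then answers a Boolean AND query.

```python
import re
from collections import defaultdict

# Hypothetical pages standing in for crawled Web sites.
PAGES = {
    "http://example.org/a": {"title": "Hospital infections",
                             "text": "Preventing infections in hospital wards"},
    "http://example.org/b": {"title": "City hospital",
                             "text": "Admission and visiting hours"},
    "http://example.org/c": {"title": "Garden tips",
                             "text": "Plant infections and how to treat them"},
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of URLs whose URL, title or text contains it."""
    index = defaultdict(set)
    for url, fields in pages.items():
        for word in tokenize(" ".join([url, fields["title"], fields["text"]])):
            index[word].add(url)
    return index


def search_and(index, query):
    """Return the URLs that contain every word of the query (Boolean AND)."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


index = build_index(PAGES)
print(sorted(search_and(index, "hospital infections")))
```

Searching for one word returns every page that mentions it, which is why adding a second word with AND narrows a large result set.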

Google indexes billions of sites, making it one of the largest Web search engines. It offers an advanced search mode and Boolean operators (the default is AND), automatically ignores stop-words (to override this, put a + sign before the stop-word), and allows phrase searching and truncation using *.

Google allows searching for words appearing in the title or URL, for example:
      intitle:hospital infections
      allintitle:hospital infections
      inurl:lb
The first search retrieves any web site that has 'hospital' in its title with the word 'infections' occurring anywhere, whereas the second retrieves sites that have both words 'hospital' and 'infections' present at the same time in their titles; the third retrieves sites that have 'lb' in their URLs.

In addition, Google indexes 12 non-HTML file formats, such as PDF files, from the Internet (part of the "Invisible Web" that is usually not indexed by other search engines). To restrict a search to one of these formats, add the following at the end of the search strategy:
      filetype:pdf

Google allows searching a specific site such as:
      admission site:www.stanford.edu
      admission site:.edu
The first statement searches for the word 'admission' on the Stanford University site. The second statement searches for the word 'admission' on all .edu sites.
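
The operators shown so far (intitle:, filetype: and site:) are simply text added to the query, so a search strategy can be assembled programmatically. The sketch below is an illustration only: the function names are hypothetical, and whether a search engine still honours each operator is an assumption to check against its current help pages.

```python
from urllib.parse import quote_plus


def build_query(terms, intitle=None, site=None, filetype=None):
    """Combine plain terms with the field operators described above."""
    parts = []
    if intitle:
        parts.append(f"intitle:{intitle}")
    parts.extend(terms)
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


def search_url(query):
    """Encode the query so it can be placed in a search URL."""
    return "https://www.google.com/search?q=" + quote_plus(query)


query = build_query(["admission"], site=".edu")
print(query)
print(search_url(query))
```

Keeping the operators as optional parameters makes it easy to broaden or narrow a search without retyping the whole strategy.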

You can also search for Web sites that link to a particular site. For example, to see who links to the AUB home page, enter:
      link:www.aub.edu.lb  

Google can also look up the definition of a term or phrase, for example:
      define:infection
To search for synonyms of a term as well, put a tilde before it:
      ~term

Google applies stemming technology, so a search for "dietary" also retrieves diet, dieting, diets and so on.
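
To make the idea of stemming concrete, here is a deliberately crude sketch that strips a few common suffixes so that related word forms reduce to the same stem. The suffix list and function names are assumptions chosen for this example; real stemmers use far more elaborate rules, and this is not Google's method.

```python
# A tiny, illustrative suffix list; real stemmers use many more rules.
SUFFIXES = ("ing", "ary", "s")


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def same_stem(query_word, page_word):
    """Treat two words as a match when they reduce to the same stem."""
    return naive_stem(query_word) == naive_stem(page_word)


for form in ["dietary", "diet", "dieting", "diets"]:
    print(form, "->", naive_stem(form))
```

A rule this simple also over-strips (for example, it reduces "library" to "libr"), which is one reason production stemmers are more careful.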
Also available are Google Scholar, Images, Blog Search, News, Chrome and other services.

Results are ranked by the number of links a page receives from other Web sites, with links from pages that Google itself ranks highly counting for more. This can be a drawback for Google: very new sites may not appear among its hits, as it takes some time for the creators of other Web pages to link to a new site.
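
Link-based ranking of this kind can be illustrated with a simplified version of the published PageRank idea. The sketch below is an assumption-laden toy (four hypothetical pages, a conventional damping value of 0.85), not Google's actual ranking; it shows why a new page that nobody links to yet ends up at the bottom.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Score pages by incoming links, weighting links from high-scoring pages more.

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    score = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_score = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * score[page] / len(targets)
                for target in targets:
                    new_score[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                for target in pages:
                    new_score[target] += damping * score[page] / n
        score = new_score
    return score


# Hypothetical link graph: "new-site" links out, but nobody links to it yet.
LINKS = {
    "portal": ["hospital", "library"],
    "library": ["hospital"],
    "hospital": ["portal"],
    "new-site": ["hospital"],
}

scores = rank_pages(LINKS)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this toy graph "hospital" receives the most links and ranks first, while "new-site" receives none and ranks last.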

The Cached link is a snapshot of the Web page as it was when Google indexed it. "I'm Feeling Lucky" takes users directly to the first home page, or the top-ranked source, for a query. Google also offers translation services.

Other keyword search engines include:

AltaVista http://www.av.com
HotBot http://www.hotbot.com
Web Crawler http://www.webcrawler.com
Infoseek http://www.infoseek.com
Excite http://www.excite.com

Subject Directories Tools

Using a subject directory is similar to examining the table of contents of a book, and it suits users who want a broader perspective on a topic. Documents are manually screened and reviewed before they enter the directory; as a result, directories are relatively small compared to the keyword search engines. Because you are not searching the full text of each site, unlike with keyword search engines, it helps to describe your topic with broad terms. Many directories also have search engines for searching by keyword (these are called hybrid search tools); others offer browsing only.

Yahoo!: http://www.yahoo.com/
It is one of the oldest and most popular directories; each site is assigned a subject classification by Yahoo! staff. Resources are classified under 14 subject headings, which allow you to go from a broad subject to narrower concepts. Another way is to search the Yahoo! catalog using its search engine, which is useful if you are looking for a concept and are not sure of its place in the hierarchy.
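
The two ways of using a directory, browsing from broad to narrow headings or searching for a heading whose location is unknown, can be sketched with a nested dictionary. The headings and example.org URLs below are hypothetical and do not reproduce the Yahoo! classification.

```python
# Hypothetical directory: headings nest from broad to narrow; lists hold sites.
DIRECTORY = {
    "Health": {
        "Medicine": {
            "Infectious Diseases": ["http://example.org/infections"],
        },
        "Nursing": ["http://example.org/nursing"],
    },
    "Science": {
        "Biology": ["http://example.org/biology"],
    },
}


def browse(directory, path):
    """Follow a list of headings from broad to narrow and return what is there."""
    node = directory
    for heading in path:
        node = node[heading]
    return node


def find_heading(directory, name, trail=()):
    """Search the hierarchy for a heading and return the path that leads to it."""
    if isinstance(directory, dict):
        for heading, child in directory.items():
            here = trail + (heading,)
            if heading.lower() == name.lower():
                return list(here)
            found = find_heading(child, name, here)
            if found:
                return found
    return None


print(browse(DIRECTORY, ["Health", "Medicine", "Infectious Diseases"]))
print(find_heading(DIRECTORY, "nursing"))
```

Searching returns the full path, so a user who finds "Nursing" this way also learns which broader heading it sits under.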

It offers an advanced search mode in which you can specify "include all of the words", "include this exact phrase", "include at least one of these words" or "exclude these words". More options are available, such as language, country, date, keyword location (title, text, URL) and domain (.com, .gov, .edu). You can also choose to search the whole Web or just Yahoo!.
"Search by URL" allows you to "find Web pages similar to" or "find Web pages that link to".

Results listed under "web matches" are a combination of listings from third-party search engine providers and the Yahoo! Directory. When a site is also listed in the Yahoo! Directory, a red arrow appears that takes you to more sites on the same topic in the Directory. "Inside Yahoo!" presents products or services on Yahoo! that match the search request. "Directory category matches" highlights matching categories in the Yahoo! Directory. "Sponsor matches" are relevant hits that are paid for by businesses or organizations.

Other subject arranged directories include:

World Wide Web Virtual Library http://www.w3.org/vl/ or http://vlib.org/
Librarians' Index to the Internet http://lii.org
Open Directory Project http://www.dmoz.org

Meta-Search Tools

Meta-Search tools do not have databases of their own, but they send queries to multiple search engines simultaneously.

Advantages of meta-search engines:

  • You need to access only a single page to do the search.
  • You need to learn only one interface for searching.
  • You need to type the search query only once.
  • You can perform more thorough searching across a wider number of search engines.
  • You get an integrated set of results with, in many cases, the duplicate results stripped out (as in MetaCrawler and Ixquick).
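
The core of a meta-search tool, sending one query to several engines and merging the answers with duplicates removed, can be sketched as follows. The two "engines" here are hypothetical stand-ins that return canned lists; a real tool would query live services and translate the query into each engine's syntax.

```python
# Hypothetical engines: each returns a canned, ranked list of URLs.
def engine_one(query):
    return ["http://example.org/a", "http://example.org/b"]


def engine_two(query):
    return ["http://example.org/b/", "http://example.org/c"]


def normalize(url):
    """Simplification: treat URLs differing only in case or a trailing slash as equal."""
    return url.lower().rstrip("/")


def meta_search(query, engines):
    """Send the query to every engine and merge the results without duplicates."""
    merged, seen = [], set()
    for engine in engines:
        for url in engine(query):
            key = normalize(url)
            if key not in seen:
                seen.add(key)
                merged.append(url)
    return merged


results = meta_search("hospital infections", [engine_one, engine_two])
print(results)
```

The user types the query once and receives a single list, which is the convenience the advantages above describe.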

Ixquick: http://www.ixquick.com
Searches about 12 search engines including Netscape, AltaVista, LookSmart, Yahoo!, Ask Jeeves / Teoma, Go, WiseNut… It translates the search query into each search engine's syntax and removes duplicates from the retrieved results. It accepts natural language queries, "phrase search", + or -, Boolean operators and parentheses, truncation using *, and field searching, including title:"medical libraries", domain:uk, url:uk and related:www.aub.edu.lb (which finds sites with content similar to AUB's). Ixquick returns only those documents that appear in the top 10 of any search engine's results. It uses a star system whereby the number of stars indicates the number of search engines that ranked each result in their top 10.
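
The star system can be pictured as a simple vote count: each result earns one star for every engine that places it in its top 10. The sketch below assumes that reading of the description and uses hypothetical engine names and result lists; it is not Ixquick's actual code.

```python
from collections import Counter

# Hypothetical ranked results from three engines.
RESULTS_BY_ENGINE = {
    "engine_one": ["http://example.org/a", "http://example.org/b", "http://example.org/c"],
    "engine_two": ["http://example.org/b", "http://example.org/a"],
    "engine_three": ["http://example.org/b", "http://example.org/d"],
}


def star_ratings(results_by_engine, top_n=10):
    """Give each URL one star per engine that ranks it within its top_n results."""
    stars = Counter()
    for ranked_urls in results_by_engine.values():
        for url in set(ranked_urls[:top_n]):
            stars[url] += 1
    # Most stars first; ties are broken alphabetically for a stable order.
    return sorted(stars.items(), key=lambda item: (-item[1], item[0]))


for url, count in star_ratings(RESULTS_BY_ENGINE):
    print("*" * count, url)
```

A result that several engines agree on rises to the top, regardless of its exact position in any one engine's list.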

Dogpile: http://www.dogpile.com
Searches 13 search engines such as Ask Jeeves, Google, FAST and LookSmart. In the advanced mode, a user can configure it to use his or her favorite search tools. Duplicates may appear from different search engines. It allows Boolean operators (default is AND), use of + or -, and "phrase search".

Inference Find: http://www.inferencefind.com
Searches ten search engines such as AltaVista, Yahoo!, Lycos, Google and NorthernLight, and removes duplicates from the final results. It allows Boolean operators (default is AND), use of + or -, and "phrase search".

MetaCrawler: http://www.metacrawler.com/
Searches search engines such as About, AskJeeves, FAST, Google, Excite. It supports Boolean, "phrase search" and the use of + or -. It collects hits from each of the search engines used, combines them, provides the search results and removes duplicate hits.

Vivisimo: http://www.vivisimo.com
Searches a number of search engines such as AltaVista, MSN, Netscape, LookSmart, Lycos, CNN… It returns a list of clusters that are further organized into subtopics (this is a clustering technology where documents are organized into subject groups or clusters). It allows use of + or -, AND, OR, ( ), "phrase search", URL:, title:, and host:.

Invisible Web Search Tools


The Deep Web (Invisible Web) is estimated to be 400 times larger than the visible Web. It mainly consists of databases that hold information in tables created by programs such as Access, Oracle or SQL database systems, and also of non-textual files such as multimedia files and documents in non-HTML formats such as PDF. To find these databases, search a general Web directory or search engine, e.g. Yahoo! or Google, combining the term you are looking for with the word database; for example, 'medical database' retrieves medical databases.

In addition to Google, the following tools search the Invisible Web:

Adobe PDF search http://searchpdf.adobe.com
InvisibleWeb http://www.invisibleweb.com
CompletePlanet http://www.completeplanet.com
Invisible-web.net http://www.invisible-web.net
ProFusion http://www.profusion.com
SearchIQ http://www.search.com/subjects
Direct Search http://www.freepint.com/gary/direct.htm
IncyWincy http://www.incywincy.com/

Medicine Specific Search Engines

Medicine Specific Search Engines are those engines that search medical or other allied health sites.

HealthAtoZ http://www.HealthAtoZ.com/
This searches for health and medicine resources on the Internet, including Web sites, FTP and Gopher servers, and newsgroups. The site is mainly for consumer use. You can search directly by keyword or browse by subject category. Retrieved results display rating stars (five stars is the best), a review, accuracy, type, related category and URL.

MedBot http://medworld.stanford.edu/medbot/
This can query up to four databases at one time. Included in the list of searchable medical sites are Achoo, Medical Matrix, MedWeb, Virtual Hospital, Doctor’s Guide, WebDoc, Yahoo! etc…

Medical World http://www.mwsearch.com/
This uses the Unified Medical Language System (UMLS) as its thesaurus, allowing automatic searching for related terms. It is mainly for health professionals. It includes the full text of Medical Matrix resources plus other medical resources. Go to "modify query" (from the results screen) to select a thesaurus term or do a Boolean search (default is OR).
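
Thesaurus-based searching for related terms amounts to expanding the user's term into a list of equivalents and joining them with OR. The sketch below uses a tiny hand-made thesaurus whose entries are illustrative only and are not drawn from the thesaurus named above; the function names are hypothetical.

```python
# Illustrative entries only; a real medical thesaurus is far larger.
THESAURUS = {
    "heart attack": ["myocardial infarction"],
    "cancer": ["neoplasm", "tumor"],
}


def expand_query(term, thesaurus=THESAURUS):
    """Return the term followed by any related terms found in the thesaurus."""
    key = term.lower()
    return [key] + thesaurus.get(key, [])


def to_or_query(terms):
    """Join terms with OR, quoting multi-word terms so they stay together."""
    return " OR ".join(f'"{t}"' if " " in t else t for t in terms)


print(to_or_query(expand_query("Heart attack")))
print(to_or_query(expand_query("cancer")))
```

A term with no thesaurus entry passes through unchanged, so the expansion never removes what the user typed.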

MedExplorer http://www.medexplorer.com
This is a health/medical Internet search engine. There is no need to use Boolean “AND” as it is already incorporated into this search engine. It is divided into categories and subcategories by subject.

MedHunt http://www.hon.ch/MedHunt
This is a full-text index of Internet resources with medical and health content. Retrieved results are shown with a score for each site; a higher score signifies that the page corresponds better to your query. It offers simple and advanced search facilities. In the simple mode, one can search for “all the words”, “any of the words” or “adjacent words”, and can specify the geographical region. In the advanced mode, + or -, truncation (*) and Boolean operators can be used in addition to the options available in the simple search mode.
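
The three simple-mode options differ only in how the query words must occur in a document. The sketch below is one plausible reading of them (the function names and sample sentence are assumptions, and "adjacent words" is interpreted as the words appearing next to each other in the given order); it does not reproduce MedHunt's implementation.

```python
import re


def words_of(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches(text, query, mode="all"):
    """Check a document against a query using one of three simple modes."""
    doc = words_of(text)
    terms = words_of(query)
    if not terms:
        return False
    if mode == "all":
        return all(term in doc for term in terms)
    if mode == "any":
        return any(term in doc for term in terms)
    if mode == "adjacent":
        n = len(terms)
        return any(doc[i:i + n] == terms for i in range(len(doc) - n + 1))
    raise ValueError(f"unknown mode: {mode}")


TEXT = "Infections acquired in hospital wards"
print(matches(TEXT, "hospital infections", "all"))
print(matches(TEXT, "hospital infections", "adjacent"))
print(matches(TEXT, "hospital wards", "adjacent"))
```

"All" and "any" ignore word order, while "adjacent" behaves like a phrase search and is therefore the strictest of the three.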

Evaluated Medical Subject Directories

Evaluated Medical Subject Directories allow users to search authoritative, high-quality resources of medical/health information. Since anyone can publish on the Internet, one has to be careful about the quality and reliability of the content of Internet documents, especially for medical information. The number of resources evaluated is relatively small, and most entries carry a short description of the site.

As Tim Berners-Lee, the inventor of the Web, stated, “a link from a quality source will generally be only to another quality documents”.

  1. Medical Matrix: http://www.medmatrix.org/index.asp
    This was developed by the American Medical Informatics Association. It is mainly designed for the physician, and can be either browsed or searched. It offers CME, official guidelines and current therapeutic standards, a medicine-case database, and references and textbooks such as the Merck Manual, the Family Practice Handbook, etc.

  2. CLINIWEB: http://www.ohsu.edu/cliniweb/
    This was developed by Oregon Health Sciences University. It allows searching by the NLM Medical Subject Headings (MeSH) and provides access to biomedical information on the Web. It is an “index and table of contents to clinical information on the Web” that can be either searched or browsed. Special software maps the text word entered to the correct MeSH term. The information is mainly for use by health care practitioners and students.

  3. CIC Health Web: http://www.healthweb.org/
    This site, developed by about 100 US health science librarians, provides links to evaluated resources on the Web for health care professionals as well as consumers. A user can browse by subject or search by keyword, and results are given scores and are annotated.

  4. OMNI: http://omni.ac.uk
    OMNI (Organizing Medical Networked Information) is a gateway to high-quality biomedical Internet resources worldwide. Boolean operators and phrase search are allowed. Field searching is allowed for title (e.g. title=epilepsy), MeSH heading or description. Remember that searching here covers only the title, MeSH headings and description of each resource, not the text of the source itself.

  5. Hardin Meta Directory: http://www.lib.uiowa.edu/hardin/md/index.html
    This is a good site that lists the best sites on the Internet, arranged by subject.
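
The field searching described for OMNI above (for example title=epilepsy) works on catalogue records rather than on the full text of each resource. The sketch below illustrates that with three hypothetical records; the field name "mesh", the record contents and the function names are assumptions made for the example.

```python
# Hypothetical catalogue records: only these fields are searched.
RECORDS = [
    {"title": "Epilepsy Action", "mesh": ["Epilepsy"],
     "description": "Patient information on seizures."},
    {"title": "Neurology Gateway", "mesh": ["Neurology", "Epilepsy"],
     "description": "Links for neurologists."},
    {"title": "Sleep Resources", "mesh": ["Sleep"],
     "description": "Covers insomnia; mentions epilepsy drugs."},
]


def parse_field_query(query):
    """Split 'field=value' into (field, value); a bare value has no field."""
    if "=" in query:
        field, value = query.split("=", 1)
        return field.strip().lower(), value.strip().lower()
    return None, query.strip().lower()


def search_records(records, query):
    """Return titles of records whose chosen field (or any field) holds the value."""
    field, value = parse_field_query(query)
    fields = [field] if field else ["title", "mesh", "description"]
    hits = []
    for record in records:
        for name in fields:
            content = record.get(name, "")
            text = " ".join(content) if isinstance(content, list) else content
            if value in text.lower():
                hits.append(record["title"])
                break
    return hits


print(search_records(RECORDS, "title=epilepsy"))
print(search_records(RECORDS, "mesh=epilepsy"))
print(search_records(RECORDS, "epilepsy"))
```

Restricting the search to the title field returns fewer, more focused hits than searching all three fields.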

Image Search Engines

Image search engines attempt to provide access to images available on the Internet.
This is offered by a number of search tools:

  • General search engines like AltaVista, Google, Lycos, AllTheWeb.
  • Specialized image search engines devoted to indexing images:
    Amazing Picture http://www.ncrtec.org/picture.htm
    CobionVisoo http://www.visoo.com
    Ditto http://www.ditto.com
  • Meta image search engines that pass a request to a number of engines:
    Mamma http://www.mamma.com
    Fazzle http://www.fazzle.com
  • Collection-based image engines where humans collect and index images:
    Corbis http://www.corbis.com
    Getty http://creative.gettyimages.com

Natural Language Search Engines

Natural-language search engines accept search strategies written in natural language, that is, the way we speak in English, e.g. What is the capital of Lebanon?
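
One simple way to handle such a question is to drop the common stop-words and search on the words that remain. The sketch below shows only that step, with an assumed stop-word list; real natural-language engines do considerably more, and this is not a description of how any named engine works.

```python
import re

# A small, assumed stop-word list for illustration.
STOP_WORDS = {"what", "is", "the", "of", "a", "an", "in",
              "who", "where", "how", "do", "does", "are"}


def to_keywords(question):
    """Reduce a natural-language question to its content words."""
    words = re.findall(r"[a-z0-9]+", question.lower())
    return [word for word in words if word not in STOP_WORDS]


print(to_keywords("What is the capital of Lebanon?"))
```

The remaining keywords can then be passed to an ordinary keyword search.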

One popular natural-language search engine is: 
Ask http://www.ask.com