The Internet, the world's largest library (a library without walls and without librarians), is estimated to have billions of sites, and this number is growing exponentially. Many directories and search engines attempt to bring order to the Web through indexed lists or subject directories that help in retrieving information.
The Web is a dynamic entity: new sites appear every day, some disappear, and some change their addresses. Search engines continuously modify their search features, and new search engines appear every now and then. No search engine is perfect, or anything close to it; the Web is far too large, disorganized and fast-changing.
Internet Search Tools
Keyword searching is similar to searching the index of a book. Keyword search tools are essentially databases compiled by computer programs (robots) that roam the Web and index sites by title, heading, URL and text. A disadvantage of such a search is that it produces a large number of results; this was later improved by introducing advanced search techniques such as Boolean and field searching.
Google indexes billions of sites, making it one of the largest web search engines. It offers an advanced search mode and Boolean operators (the default is AND), automatically ignores stop-words (to override this, put a + sign before the stop-word), and allows phrase search and truncation using *.
Google allows searching for words appearing in the title or URL, for example by searching on:
The first search retrieves any web site that has 'hospital' in its title with the word 'infections' occurring anywhere, whereas the second retrieves sites that have both words 'hospital' and 'infections' present at the same time in their titles; the third retrieves sites that have 'lb' in their URLs.
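The three searches described above correspond to Google's field operators intitle:, allintitle: and inurl:. The following is a minimal sketch in Python, assuming those operators; the helper names are hypothetical, and the functions only build query strings without contacting Google.

```python
def title_query(title_word, *other_words):
    """Require title_word in the page title; other words may occur anywhere."""
    return " ".join([f"intitle:{title_word}", *other_words])


def all_title_query(*words):
    """Require every word to appear in the page title."""
    return "allintitle:" + " ".join(words)


def url_query(fragment):
    """Require the fragment to appear in the page URL."""
    return f"inurl:{fragment}"


# Hypothetical helpers applied to the three cases in the text.
queries = [
    title_query("hospital", "infections"),
    all_title_query("hospital", "infections"),
    url_query("lb"),
]
for query in queries:
    print(query)
```

Each resulting string can be typed directly into the search box.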
In addition, Google indexes 12 non-HTML file types, such as PDF files, from the Internet (part of the "Invisible Web" that is usually not indexed by other search engines). This is achieved by adding the following at the end of the search strategy:
Google allows searching a specific site such as:
The first statement searches for the word admission in the Stanford University Site. The second statement searches for the word admission in all edu sites.
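The site restriction described above can be combined with the file-type restriction mentioned earlier. The sketch below assumes Google's site: and filetype: operators and the public https://www.google.com/search?q=... address form; the helper names restrict and google_url are hypothetical, and the code only builds the address without sending a request.

```python
from urllib.parse import urlencode


def restrict(query, site=None, filetype=None):
    """Append optional site: and filetype: restrictions to a query string."""
    parts = [query]
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


def google_url(query):
    """URL-encode the query into a search address (assumed URL form)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# "admission" on the Stanford site, then on all edu sites, then PDF files only.
print(google_url(restrict("admission", site="stanford.edu")))
print(google_url(restrict("admission", site="edu")))
print(google_url(restrict("admission", site="edu", filetype="pdf")))
```

The encoding step matters because characters such as the colon and the space are not written literally inside the query part of an address.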
You can also search for Web sites that link to a particular site; for example, to see who links to the AUB home page, enter:
It supports definition searches for a term or a phrase; for example, search:
To search for synonyms of a term as well, search:
Google applies stemming technology, so a search for "dietary" also retrieves diet, dieting, diets, etc.
Also available are Google Scholar, images, Blog search, News, Chrome...
Results are ranked by the number of links a page receives from pages that the service itself ranks highly, or by how heavily other web sites link to it. This can be a drawback for Google, as very new sites will not appear among its hits until the creators of other web pages have had time to link to them.
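The idea of ranking by links from highly ranked pages can be illustrated with a simplified, textbook-style PageRank iteration. This is a sketch under stated assumptions, not Google's actual algorithm: the link graph, the page names, the damping factor of 0.85 and the iteration count are all made up for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: a page's rank is fed by the ranks of pages linking to it."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: "new-site" links out, but nobody links to it yet.
links = {
    "portal": ["library", "hospital"],
    "library": ["hospital"],
    "hospital": ["portal"],
    "new-site": ["hospital"],
}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the page with no incoming links ends up with the lowest score, which mirrors the drawback for very new sites noted above.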
The "Cached" link is a snapshot of the web page as it was when Google indexed it. "I'm Feeling Lucky" takes users directly to the first home page, or the top-ranked source, for a query. Google also offers translation services.
Other key-word search engines include:
Searching a subject directory is similar to examining the table of contents of a book, and it suits users who want a broader perspective on a topic. Documents are manually screened and reviewed before entering the directory; as a result, directories are relatively small compared to the keyword search engines. Here it helps to use broad terms to describe your topic, as you are not searching the full text of each site, unlike with keyword search engines. Many directories also have search engines for searching by keyword (these are called hybrid search tools); others have only browsing capability.
Yahoo! is one of the oldest and most popular directories; each site is assigned a subject classification by Yahoo! staff. Resources are classified under 14 subject headings, which allow you to go from a broad subject to narrower concepts. Another way is to search the Yahoo! catalog using its search engine; this can be useful if one is looking for a concept and is not sure of its place in the hierarchy.
It offers an advanced search mode where you can specify "include all of the words", "include this exact phrase", "include at least one of these words" or "exclude these words". More options are available, such as language, country, date, keyword location (title, text, URL) and domain (.com, .gov, .edu). You can also choose to search the whole Web or just Yahoo!.
"Search by URL" allows you to "find Web pages similar to" or "find Web pages that link to".
Results listed under "web matches" are a combination of listings from third-party search engine providers and the Yahoo! Directory. When a site is also listed in the Yahoo! Directory, a red arrow appears that will take you to more sites on the same topic in the directory. "Inside Yahoo!" presents products or services on Yahoo! that match the search request. "Directory category matches" highlights categories in the Yahoo! Directory. "Sponsor matches" are relevant hits that are paid for by businesses or organizations.
Other subject arranged directories include:
World Wide Web
Meta-Search tools do not have databases of their own, but they send queries to multiple search engines simultaneously.
Advantages of meta-search engines:
Ixquick searches about 12 search engines, including Netscape, AltaVista, LookSmart, Yahoo!, Ask Jeeves / Teoma, Go, WiseNut… It translates the search query into each search engine's syntax and removes duplicates from the retrieved results. It accepts natural language queries, "phrase search", + or -, Boolean operators and parentheses, truncation using *, and field searching, including title:"medical libraries", domain:uk, url:uk and related:www.aub.edu.lb (which finds sites with content similar to AUB's). Ixquick returns only those documents that appear in the top 10 of at least one search engine's results. It uses a star system whereby the number of stars indicates how many search engines ranked that result in their top 10.
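The merging, de-duplication and star count described for Ixquick can be sketched as follows. This is an illustration only, not Ixquick's implementation: the engine names, the example.org addresses and the alphabetical tie-break are assumptions.

```python
def merge_results(results_by_engine, top_n=10):
    """Merge ranked result lists, drop duplicates, and count one 'star'
    for each engine that placed a result in its top_n."""
    stars = {}
    for urls in results_by_engine.values():
        # dict.fromkeys removes repeats within one engine while keeping order.
        for url in dict.fromkeys(urls[:top_n]):
            stars[url] = stars.get(url, 0) + 1
    # Most stars first; ties are broken alphabetically (an arbitrary choice).
    return sorted(stars.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical ranked results returned by three engines for the same query.
results_by_engine = {
    "engine_a": ["http://example.org/a", "http://example.org/b", "http://example.org/c"],
    "engine_b": ["http://example.org/b", "http://example.org/d"],
    "engine_c": ["http://example.org/b", "http://example.org/a"],
}
merged = merge_results(results_by_engine)
for url, count in merged:
    print("*" * count, url)
```

A result found by all three engines appears once in the merged list, with three stars.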
Searches 13 search engines such as Ask Jeeves, Google, FAST and LookSmart; users can restrict it to their favorite search tools by selecting them in the advanced mode. Duplicates may appear from different search engines. It allows Boolean operators (the default is AND), use of + or -, and "phrase search".
Inference Find: http://www.inferencefind.com
Searches ten search engines such as AltaVista, Yahoo!, Lycos, Google and NorthernLight, and removes duplicates from the final results. It allows Boolean operators (the default is AND), use of + or -, and "phrase search".
Meta Crawler: http://www.metacrawler.com/
Searches search engines such as About, AskJeeves, FAST, Google, Excite. It supports Boolean, "phrase search" and the use of + or -. It collects hits from each of the search engines used, combines them, provides the search results and removes duplicate hits.
Searches a number of search engines such as AltaVista, MSN, Netscape, LookSmart, Lycos, CNN… It returns a list of clusters that are further organized into subtopics (this is a clustering technology where documents are organized into subject groups or clusters). It allows use of + or -, AND, OR, ( ), "phrase search", URL:, title:, and host:.
The Deep Web (Invisible Web) is estimated to be 400 times larger than the visible web. It consists mainly of databases, whose information is stored in tables created by programs such as Access, Oracle and SQL, and of non-textual files such as multimedia files and documents in non-HTML formats such as PDF. To find these databases, search a general web directory or search engine, e.g. Yahoo! or Google, combining the term you are looking for with the word "database"; for example, 'medical database' retrieves medical databases.
In addition to Google we have the following:
Adobe PDF search: http://searchpdf.adobe.com
Medicine-specific search engines are those that search medical or allied health sites.
This searches for health and medicine on the Internet. It includes Web sites, FTP, Gopher servers and Newsgroups. This site is mainly for consumer use. You can search directly by keyword or browse by subject categories. Retrieved results display rating stars (five stars is the best), review, accuracy, type, related category and URL.
This can query up to four databases at one time. Included in the list of searchable medical sites are Achoo, Medical Matrix, MedWeb, Virtual Hospital, Doctor’s Guide, WebDoc, Yahoo! etc…
Medical World: http://www.mwsearch.com/
This uses the Unified Medical Language System (UMLS) as its thesaurus, allowing automatic searching for related terms. It is intended mainly for health professionals. It includes the full text of Medical Matrix resources plus other medical resources. Go to "modify query" (from the results screen) to select a thesaurus term or do a Boolean search (the default is OR).
This is a health/medical Internet search engine. There is no need to use Boolean “AND” as it is already incorporated into this search engine. It is divided into categories and subcategories by subject.
This is a full-text index of Internet resources with medical and health content. The retrieved results show a score for each site; a higher score signifies that the page corresponds better to your query. It offers simple and advanced search facilities. In the simple mode one can search for “all the words”, “any of the words” or “adjacent words”, and can specify the geographical region. In the advanced mode, + or -, truncation (*) and Boolean operators can be used in addition to the options available in the simple search mode.
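The three simple-mode options can be pictured as three ways of testing a page's text against the query words. The sketch below is hypothetical and does not describe how this search engine works internally; in particular, it assumes that "adjacent words" means the words occur consecutively and in the given order.

```python
def matches(text, words, mode="all"):
    """Test a text against query words in one of three assumed modes."""
    tokens = text.lower().split()
    words = [word.lower() for word in words]
    if mode == "all":
        # Every query word must occur somewhere in the text.
        return all(word in tokens for word in words)
    if mode == "any":
        # At least one query word must occur.
        return any(word in tokens for word in words)
    if mode == "adjacent":
        # The query words must occur next to each other, in order.
        n = len(words)
        return any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1))
    raise ValueError(f"unknown mode: {mode}")


page = "Control of hospital acquired infections in intensive care"
print(matches(page, ["hospital", "infections"], mode="all"))
print(matches(page, ["hospital", "infections"], mode="adjacent"))
print(matches(page, ["surgery", "infections"], mode="any"))
```

The "all" mode corresponds to a Boolean AND and the "any" mode to a Boolean OR, while "adjacent" behaves like a phrase search.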
Evaluated Medical Subject Directories allow users to search authoritative, high-quality resources of medical/health information. Since anyone can publish on the Internet, one has to be careful about the quality and reliability of Internet documents, especially for medical information. The number of resources evaluated is relatively small, and most entries carry a short description of the site.
As Tim Berners-Lee, the inventor of the Web, stated, “a link from a quality source will generally be only to other quality documents”.
Image search engines attempt to provide access to images available on the Internet.
This is offered by a number of search tools:
Natural-language search engines are those that accept search strategies written in natural language, that is, the way we speak in English, e.g. "What is the capital of Lebanon?"
One popular natural-language search engine is: