Web Search 101
Finding Lesson Plans, Activities, Songs, Games, and
Conducting Serious Academic Research
MADE EASIER, FASTER AND MORE ACCURATE

                                          Developed By

                                    William Tweedie
October 2011 & 2012


Table of Contents
Preface....................................................................................................................... 4
Objectives .................................................................................................................. 5
Materials: ................................................................................................................... 5
Timing: ....................................................................................................................... 5
Procedure................................................................................................................... 6
Part 1 – The Surface Web, Search Engines and Directories...................................... 6
A. Activating Prior Knowledge..................................................................................... 6
B. Search Engine – An online (Internet) World Wide Web search program................7
D. Search Queries ..................................................................................................... 8
FRAMING YOUR SEARCH STRATEGY.................................................................... 8
ACTIVITY:.................................................................................................................. 9
E. Basic Boolean Search Operators (AND, OR, NOT).............................................. 10
F. Search Tips, Tricks and Techniques..................................................................... 10
G. Wrap-up of Part 1................................................................................................. 10
Part 2 – The Hidden Web......................................................................................... 10
The Internet, World Wide Web and the Hidden Web................................................ 11
   Scratching the Surface and Digging Deep – Layers of the Web............................ 12
   Education.............................................................................................................. 14
Three Types of Search Engines .............................................................................. 18
   Crawler-based search engines ............................................................................. 18
   Human-powered directories ................................................................................. 19
   Hybrid search engines ......................................................................................... 20
   Table of Search Engine Features ......................................................................... 20
   How do Search Engines Work?............................................................................ 22
   Table of Directory Features................................................................................... 23
   Subject Directories (Contain Databases), and Portals ......................................... 24
   How to Find Subject-Focused Directories for a Specific Topic, Discipline, or Field
   .............................................................................................................................. 24
   What Are "Meta-Search" Engines? How Do They Work? ..................................... 25
   Are "Smarter" Meta-Searchers Still Smarter?....................................................... 25
   Better Meta-Searchers.......................................................................................... 25




Meta-Search Engines for SERIOUS Deep Digging .............................................. 26
Search Basics: Constructing a Google Query .......................................................... 26
   Where does the term Boolean originate from?...................................................... 27
   Is Boolean Search Complicated?.......................................................................... 27
   Boolean Search And / Or / Not.............................................................................. 27
   Boolean Search Examples Boolean Connectors:.................................................. 28
   Interactive Text Equivalent.................................................................................... 28
   How the Search Engines Differ............................................................................. 30
   Search Engine Syntax & Features Comparison Chart ......................................... 30
   Some Search Tips, Tricks, & Techniques ............................................................ 33
   Invisible or Deep Web: What it is, How to find it, and its inherent ambiguity.........34
   Why isn't everything visible?................................................................................. 34
   How to Find the Invisible Web .............................................................................. 35
   The Ambiguity Inherent in the Invisible Web: ....................................................... 35
   Want to learn more about the Invisible Web?........................................................ 35
   10 Search Engines to Explore the Invisible Web................................................... 36
   How do we get to this mother lode of information?................................................ 36
The Invisible Web Databases................................................................................... 41
Dictionaries, Translators, & Other Language & Reference Tools ............................. 44
Web directories ........................................................................................................ 48
Internet Gateways, Jumplists, & Specialized Link Collections................................... 48
   Finding Jumplists & Gateways.............................................................................. 49
www.invisible-web.net.............................................................................................. 49
   Saving pages with Microsoft Internet Explorer ..................................................... 50
   Peer-to-Peer Computing ...................................................................................... 50
   Education ............................................................................................................. 50
   Subject-orientated search services....................................................................... 52
   Additional information about search engines, their use, and how they find
   resources.............................................................................................................. 52
   Data services requiring registration ...................................................................... 52
   Data services with unrestricted access................................................................. 54
   Search Engines .................................................................................................... 55
   Subject-orientated search services....................................................................... 56
   Dictionaries and Thesauri .................................................................................... 57
   Reference Works ................................................................................................. 58
General Tips for Searching the Web......................................................................... 60




Carefully Select Your Search Terms..................................................................... 60
   Framing your search strategy............................................................................... 60
International Educational Research Links................................................................. 62
Education databases................................................................................................ 64
Teaching websites.................................................................................................... 64
Journals.................................................................................................................... 65
Newsletters............................................................................................................... 65
New Educational Technology Standards for Teachers and Students.......................65
   NETS for Teachers 2008...................................................................................... 65
   NETS for Students 2007....................................................................................... 67
   Glossary ............................................................................................................... 69
   A to Z Computer/Internet Terms............................................................................ 69
Appendix A............................................................................................................... 74




Preface
The Internet and its World Wide Web are growing, developing and adding new
features at an explosive rate. As you read this, new technologies are being
developed and implemented to make ‘surfing’ the Internet for useful information
of all types easier and more accurate, from the traditional document to Flash
videos and file types that were previously inaccessible. These types of pages
used to be invisible but can now be found in most search engine results:
     •     Pages in non-HTML formats (PDF, Word, Excel, PowerPoint), now converted
           into HTML.
     •     Script-based pages, whose URLs contain a ? or other script coding.
     •     Pages generated dynamically by other types of database software (e.g.,
           Active Server Pages, Cold Fusion). These can be indexed if there is a stable
           URL somewhere that search engine crawlers can find.
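To make the "script-based pages" point concrete, here is a minimal sketch in Python of how a program might tell a static address from a script-based one by checking for the part of the URL after the ?. The two addresses are hypothetical examples, and the function name describe_url is made up for illustration; real crawlers apply many more rules than this.

```python
from urllib.parse import urlsplit, parse_qs

def describe_url(url):
    """Return (is_script_based, parameters) for a URL."""
    parts = urlsplit(url)
    # A non-empty query string (the text after '?') suggests a script-based page.
    return bool(parts.query), parse_qs(parts.query)

# Hypothetical example addresses, not real pages.
static_url = "http://www.example.com/guides/search.html"
script_url = "http://www.example.com/catalog.asp?subject=education&page=2"

print(describe_url(static_url))
print(describe_url(script_url))
```

The second address carries two parameters (subject and page), which is the kind of "script coding" the bullet above refers to.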
The "visible web" is what you can find using general web search engines. It's also
what you see in almost all subject directories. The "invisible web" is what you
cannot find using these types of tools.
Search engines' crawlers and indexing programs have overcome many of the
technical barriers that made it impossible for them to find "invisible" web pages.
Search engines use computer robot programs, sometimes referred to as "crawlers",
"spiders", "knowledge-bots" or "knowbots", to roam the World Wide Web via the
Internet, visit sites and databases, and keep the search engine database of web
pages up to date. They obtain new pages, update known pages, and delete obsolete
ones. Their findings are then integrated into the "home" database. Most large search
engines operate several robots all the time. Even so, the Web is so enormous that it
can take six months for spiders to cover it, resulting in a certain degree of
"out-of-datedness" (link rot) in all the search engines.
http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/Glossary.html
Therefore, this book is truly just a starting point for the serious researcher, whether in
academia or as a consumer of goods and services.
Objectives
In this brief overview we will look at and explore the elements that make for effective
research on the Internet.
   1. You will learn the Internet is composed of the “Surface Web” and the “Deep” or
      “Hidden Web”.
   2. You will learn how to access information on both in the most expedient way
      through Search Engines, Meta-search engines and other Internet tools.
             a. You will learn what Search Engines are and the various types
                available.
             b. You will learn what Subject Directories, Portals, and Databases are.
   3. You will learn how to construct a search strategy.
   4. You will learn the basics of Boolean parameters which narrow search results.
   5. You will be provided special resources for academic research.


Materials:
This workshop needs to be conducted in a computer lab with very good Internet
access. Participants will follow specific areas of this reference book throughout the
workshop.
These areas can be changed according to the needs of the group. This reference
book is as comprehensive a guide as possible at the time of production.


Timing:



This workshop is designed to give a brief introduction to the complex world of the
‘Surface’ and ‘Hidden’ Webs with a focus on helping make searches more effective
and productive. Normal time allotted is 2 hours but it can be extended according to
time availability and the group’s level of expertise and interest. It is fully expected that
participants will regularly refer to this book and refine their search skills
independently.


DISCLAIMER: Changes on the Internet and in the Hidden Web occur at a rapid pace
so some of the search engines, sites, directories and databases may no longer be
available at the web addresses provided and some may no longer exist. Be prepared
to move quickly to the next point of interest. Broken links and inaccessible web-sites
can be researched at a later date.


Procedure
It is preferable to distribute this reference book well in advance of the workshop so
participants can familiarize themselves with the terms and content and explore a few
of the sites.




Part 1 – The Surface Web, Search Engines and Directories


A. Activating Prior Knowledge


ACTIVITY: PRIME TASK: Q & A


1. The Surface Web (WWW) – What is it composed of?




Write as many types of information or components of the World Wide Web as you
can.
Time: 10 minutes
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________


2. How can you access this information?
Write as many ways as you can.
Time: 10 minutes
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________


3. How many Search Engines can you name? What is your favorite search engine?
Do you use more than one?
Write your answers.
Time: 5 minutes
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________


4. How often do you use a search engine in a day? Week? What do you search for?
How long do you spend per search? Do you get the results you need or want?
Write your answers.
Time: 5 minutes
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________


B. Search Engine – An online (Internet) World Wide Web search program.



1. There are 3 types of Search Engine:
a). Crawler-based (e.g. Google) – these create their listings automatically through
special programs that crawl or ‘spider’ the web. The crawler follows links in web
pages the engine already has in its collection of sources and retrieves the
information found there; the key words are stored on index servers. The engine’s
doc servers then retrieve the entire document and create snippets that describe it
and contain the key words that might be the subject of a search query. – Very fast.
b). Human-powered Directories (e.g. Open Directory Project) – these get their
information from visitor submissions, each of which includes a short description
that is the source of any key words in a search. – Also fast.
c). Hybrid Engines – combine results from the first two, though an engine may favor
one type over the other. – Depends on the engine.
Search engines rely on their own ‘cache’ of web pages they have harvested, but
when you click a result you are taken to the latest version of the page at its
source. If no other page links to a page, it cannot be indexed.
The pages indexed are visible pages only. We’ll look at the Invisible web in Part 2.
2. How many search engines do you think there are? 80% of web pages in a major
search engine exist only on that engine; so, it is worth taking a look at some of the
others for a ‘second opinion’.
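The crawler-based approach described in item 1 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "web" here is a made-up table of five page names (LINKS) rather than real sites, and the function name crawl is invented for the example. It shows why a page that nothing links to is never found.

```python
from collections import deque

# Hypothetical miniature web: each page maps to the pages it links to.
LINKS = {
    "home": ["lessons", "songs"],
    "lessons": ["games", "home"],
    "songs": [],
    "games": ["lessons"],
    "orphan": ["home"],  # links out, but no other page links to it
}

def crawl(seed, links):
    """Breadth-first crawl: return pages in the order they are discovered."""
    found = [seed]
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in found:
                found.append(target)
                queue.append(target)
    return found

indexed = crawl("home", LINKS)
print(indexed)                               # pages the crawler reached
print(sorted(set(LINKS) - set(indexed)))     # pages it never saw
```

Starting from "home", the crawler reaches four pages by following links; "orphan" stays out of the index because no link leads to it.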


ACTIVITY:
Choose a topic and search for it on Google or www.DuckDuckGo.com. Then do the
same search on Exalead (www.exalead.com/search/). Compare the number of results
and the sources of these results.
Time: 20 minutes


C. Meta-search Engines – combine the results of many search engines.
(www.dogpile.com), (www.surfwax.com)
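As a rough illustration of what "combining results" can mean, the Python sketch below merges two ranked result lists. It is an assumption made for teaching purposes, not a description of how Dogpile or SurfWax actually rank: the engine names, addresses and the rule used (pages listed by more engines come first, then pages with the best position) are all hypothetical.

```python
def merge_results(results_by_engine):
    """Merge ranked URL lists; rank by engine count, then best position."""
    seen = {}  # url -> [number of engines listing it, best (lowest) position]
    for ranked_urls in results_by_engine.values():
        for position, url in enumerate(ranked_urls):
            if url in seen:
                seen[url][0] += 1
                seen[url][1] = min(seen[url][1], position)
            else:
                seen[url] = [1, position]
    # More engines first, then better position, then alphabetical for ties.
    return sorted(seen, key=lambda u: (-seen[u][0], seen[u][1], u))

# Hypothetical results from two imaginary engines.
results = {
    "engine_a": ["a.example/plans", "b.example/songs", "c.example/games"],
    "engine_b": ["b.example/songs", "d.example/esl", "a.example/plans"],
}
merged = merge_results(results)
print(merged)
```

Each address appears once in the merged list, even when both engines returned it.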


ACTIVITY:
Use the same search term as in the previous activity and compare the results again.
Time: 10 minutes


D. Search Queries


FRAMING YOUR SEARCH STRATEGY
To get a successful search result, you must ask the right search question. Framing a
good question requires you to think strategically about exactly what you need.
"By taking the time to identify key phrases and visualize the ideal answer, you will be
more likely to recognize that answer when you find it online." (Nora Paul)

Her guidelines are based on the standard journalist approach of "who, what, when,
where, why and how" reporting and include these tips, among others:



Who:
   •    Who is the research about: a politician, a businessperson, a scientist, a
        criminal?
   •    Who is key to the topic you are researching? Are there any recognized
        experts or spokespersons you should know about?
What:
   •    What kind of information do you need: statistics, sources, background?
   •    What kind of research are you doing: an analysis, a background report, a
        follow-up?
   •    What would the ideal answer look like?
When:
   •    When did the event being researched take place? This will help determine the
        source to use, particularly, which information source has resources dating far
        enough back.
   •    Do you know when you should stop searching?
Where:
   •    Where did the event you are researching take place?
   •    Where have you already looked for information?
   •    Where might there have been previous coverage: newspapers, broadcasts,
        trade publications, court proceedings, discussions?
Why:
   •    Why do you need the research: seeking a source to interview, surveying a
        broad topic, pinpointing a fact?
   •    Why must you have the research: to make a decision, to corroborate a
        premise?
How:
   •    How much information do you need: a few good articles for background,
        everything in existence on the topic, just the specific fact?
   •    How are you going to use the information: for an anecdote, for publication?


"Today," Schlein says, "so much data is available that, without a plan, you can easily
find yourself swimming in an ocean of information…A good, clear question will save
you hours of work." Find Paul's complete checklist and other good search
suggestions from Schlein in Find It Online (Tempe, AZ: Facts on Demand Press,
2004).


ACTIVITY:


Reframe the above criteria for research on an academic topic.
Time: 15 minutes


E. Basic Boolean Search Operators (AND, OR, NOT)


ACTIVITY:
Complete the 4 activities on the “Boolify” worksheets.
Time: 30 minutes
See Appendix A
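For participants who like to see the logic spelled out, the three operators can be mimicked with Python sets. This is a minimal sketch, assuming a made-up collection of three short documents (DOCS); the helper names build_index and lookup are invented for the example, and real search engines are far more elaborate.

```python
# Hypothetical mini-collection: document name -> text.
DOCS = {
    "doc1": "esl songs for young learners",
    "doc2": "esl grammar games",
    "doc3": "french songs and games",
}

def build_index(docs):
    """Map each word to the set of documents that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.split():
            index.setdefault(word, set()).add(doc_id)
    return index

INDEX = build_index(DOCS)

def lookup(word):
    """Return the documents containing the word (empty set if none)."""
    return INDEX.get(word, set())

both = lookup("esl") & lookup("songs")        # esl AND songs
either = lookup("songs") | lookup("games")    # songs OR games
without = lookup("songs") - lookup("french")  # songs NOT french

print(both, either, without)
```

AND narrows the results to documents containing both words, OR widens them to documents containing either word, and NOT removes documents containing the unwanted word.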


F. Search Tips, Tricks and Techniques
See page 25 below
Time: 5 minutes


G. Wrap-up of Part 1
Reflection and Feedback




Part 2 – The Hidden Web
Look at 10 Search Engines to Explore the Invisible Web on pages 28 – 33


Experiment and explore some of the Web Portals, Directories and Databases
A. List the Categories you find in each
B. Try Boolean searching for a specific topic you currently are researching for a
paper or lesson


Time: 1 hour


You may make notes below:




The Internet, World Wide Web and the Hidden Web
The Internet is a network of computers connected together ('External net') to share
information with others by means of the World Wide Web (WWW).
The World Wide Web (WWW) is the part of the Internet where text and graphics are
placed together to form a Web Page, along with links to different documents or other
places (Hypertext or Hyperlinks), and where information can be easily accessed and
shared with others.
                      -   From the Glossary Section at the end of this reference book




The World Wide Web is also known as the ‘Surface Web’ – available to anyone who
has a computer and internet connection.

Scratching the Surface and Digging Deep – Layers of the Web
"The Invisible Web"
By Chris Sherman
There's a big problem with most search engines, and it's one many people aren't
even aware of. The problem is that vast expanses of the Web are completely
invisible to general purpose search engines like AltaVista, HotBot and Google. Even
worse, this "Invisible Web" is in all likelihood growing significantly faster than the
visible Web you're familiar with.
So what is this Invisible Web and why aren't search engines indexing it? To answer
this question, it's important to first define the "visible" Web, and describe how search
engines compile their indexes.
The Web was created a little over twenty-two years ago by Tim Berners-Lee, a
researcher at the European Organization for Nuclear Research (CERN), a high-energy
physics laboratory in Switzerland. The name CERN is derived from the acronym for the
French Conseil Européen pour la Recherche Nucléaire.
Berners-Lee designed the Web to be platform-independent, so that researchers at
CERN could share materials residing on any type of computer system, avoiding
cumbersome and potentially costly conversion issues. To enable this cross-platform
capability, Berners-Lee created HTML, or HyperText Markup Language - essentially
a dramatically simplified version of SGML (Standard Generalized Markup Language).
HTML documents are simple: they consist of a "head" portion, with a title and
perhaps some additional meta-data describing the document, and a "body" portion,
the actual document itself. The simplicity of this format makes it easy for search
engines to retrieve HTML documents, index every word on every page, and store
them in huge databases that can be searched on demand.
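The "head" and "body" structure, and the idea of indexing every word, can be sketched with Python's built-in HTML parser. This is an illustration under stated assumptions: the page text (PAGE) is invented, the class name PageReader is made up, and the "index" is just a table of word positions in one page rather than a real search engine database.

```python
from html.parser import HTMLParser

class PageReader(HTMLParser):
    """Collect the title (from the head) and the words of the body."""
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.in_body = False
        self.title = ""
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "body":
            self.in_body = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False
        elif tag == "body":
            self.in_body = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data.strip()
        elif self.in_body:
            for raw in data.lower().split():
                word = raw.strip(".,;:!?")  # drop trailing punctuation
                if word:
                    self.words.append(word)

# Hypothetical page, written for this example.
PAGE = """<html>
<head><title>Lesson Plans</title></head>
<body><h1>Songs and Games</h1><p>Songs for the classroom.</p></body>
</html>"""

reader = PageReader()
reader.feed(PAGE)

# Index every word: word -> positions where it occurs in the body.
index = {}
for position, word in enumerate(reader.words):
    index.setdefault(word, []).append(position)

print(reader.title)
print(index)
```

Looking up a word in the index immediately tells you where it occurs, which is what makes searching on demand fast.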
What's less easy is the task of actually finding all the pages on the Web. Search
engines use automated programs called spiders or robots to "crawl" the Web and
retrieve pages. Spiders function much like a hyper-caffeinated Web browser - they
rely on links to take them from page to page.
Crawling is a resource-intensive operation. It also puts a certain amount of demand
on the host computers being crawled. For these reasons, search engines will often
limit the number of pages they retrieve and index from any given Web site. It's
tempting to think that these unretrieved pages are part of the Invisible Web, but they
aren't. They are visible and indexable, but the search engines have made a
conscious decision not to index them.
In recent months, much has been made of these overlooked pages. Many of the
major engines are making serious efforts to include them and make their indexes
more comprehensive. Unfortunately, the engines have also discovered through their
"deep crawls" that there's a tremendous amount of duplication and spam on the Web.
Current estimates put the Web at about 1.2 to 1.5 billion indexable pages. Both
Inktomi and AltaVista have claimed that they've spidered most of these documents,
but have been forced to cull their indexes to cope with duplicates and spam. Inktomi
puts the size of the distilled Web at about 500 million pages; AltaVista at about 350
million.
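The article does not say how Inktomi or AltaVista culled duplicates. Purely as an illustration, one simple way to spot exact copies is to compare a "fingerprint" of each page's text, as in the Python sketch below. The page addresses and texts are hypothetical, and the function names fingerprint and cull_duplicates are invented for the example.

```python
import hashlib

def fingerprint(text):
    """Hash of the text after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def cull_duplicates(pages):
    """Keep the first URL seen for each distinct fingerprint."""
    kept = {}
    for url, text in pages.items():
        kept.setdefault(fingerprint(text), url)
    return list(kept.values())

# Hypothetical pages: the second is a copy of the first with different
# capitalization and spacing.
PAGES = {
    "a.example/songs": "Songs for the ESL classroom",
    "mirror.example/songs": "songs for the   ESL classroom",
    "b.example/games": "Games for the ESL classroom",
}
unique = cull_duplicates(PAGES)
print(unique)
```

Three pages go in and two come out, because the mirror copy has the same fingerprint as the original.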
But these numbers don't include Web pages that can't be indexed, or information
that's available via the Web but isn't accessible by the search engines. This is the
stuff of the Invisible Web.
Why can't some pages be indexed? The most basic reason is that there are no links
pointing to a page that a search engine spider can follow. Or, a page may be made
up of data types that search engines don't index - graphics, CGI scripts, Macromedia
Flash or PDF files, for example.
But the biggest part of the Invisible Web is made up of information stored in
databases. When an indexing spider comes across a database, it's as if it has run
smack into the entrance of a massive library with securely bolted doors. Spiders can
record the library's address, but can tell you nothing about the books, magazines or
other documents it contains.
There are thousands - perhaps millions - of databases containing high-quality
information that are accessible via the Web. But in order to search them, you
typically must visit the Web site that provides an interface to the database. The
advantage to this direct approach is that you can use search tools that were
specifically designed to retrieve the best results from the database. The
disadvantage is that you need to find the database in the first place, a task the
search engines may or may not be able to help you with.
Another problem is that content in some databases isn't designed to be directly
searchable. Instead, Web developers are taking advantage of database technology
to offer customized content that's often assembled on the fly. Search engine results
pages are an example of this type of dynamically generated content - so are services
like My Excite and My Yahoo. As Web sites get more complex and users demand
more personalization, this trend toward dynamically generated content will
accelerate, making it even harder for search engines to create comprehensive Web
indexes.
In a nutshell, the Invisible Web is made up of unindexable content that search
engines either can't or won't index. It's a huge part of the Web, and it's growing.
Fortunately, there are several reasonably thorough guides to the Invisible Web.
Gary Price, Reference Librarian at the Gelman Library at George Washington
University, is considered one of the foremost authorities on online databases and
other invaluable search resources on the Invisible Web.
http://www.resourceshelf.com/
Price's List of Lists (LOL) was started around 1998 and maintained by Gary Price for
many years. The LOL grew, and Gary's commitment to other projects and speaking
engagements made the upkeep of the LOL impossible. In late 2000, Gary
approached Trip Wyckoff, of Specialissues.com, about taking over the upkeep and
expansion of the LOL. By 2002 the online database and structure to maintain and
organize the LOL was in place and in October 2002 the LOL was transferred to
www.Specialissues.com.
"By the way, do not mistake an interest in the Invisible Web as a slam on the general
search engines because it is NOT," says Price. "General search tools are still 100%
essential for accessing material on the Internet."


One of the largest gateways to the Invisible Web is the aptly named Invisibleweb.com
<http://www.invisibleweb.com> from Intelliseek.
"Invisible Web sources are critical because they provide users with specific, targeted
information, not just static text or HTML pages," says Sundar Kadayam, CTO and
Co-Founder, Intelliseek.
"InvisibleWeb.com is a Yahoo-like directory. It is a high quality, human edited and
indexed, collection of highly targeted databases that contain specific answers to
specific questions," says Kadayam.
Intelliseek also makes BullsEye, a desktop based metasearch engine that can also
access many of the sites included in InvisibleWeb.com. More information can be
found at <http://www.intelliseek.com/prod/bullseye.htm>.
"A good librarian would not start looking for a phone number (specialized, Invisible
Web info) by searching the Encyclopaedia Britannica (general knowledge resource),"
says Price. "Both professional and casual searchers should at least be aware that
they could be missing some information or wasting time finding what could be found
more easily if the right tool for the job is easily accessible. This is very similar to a
good reference librarian 'knowing' the major reference tools in his or her collection."
Chris Sherman is the Web Search Guide for About.com.
                          -     Extracted from http://web.freepint.com/go/newsletter/64
Gary Price's List of Lists

Agriculture, Forestry, Fishing and Hunting, Petroleum & Mining, Utilities,
Construction, Manufacturing, Wholesale Trade, Retail Trade, Transportation and
Warehousing, Information, Finance & Insurance, Real Estate Rental & Leasing,
Professional, Scientific, and Technical Services, Business & Industry Management,
Administrative & Support Services, Education, Health Care and Social Assistance,
Arts, Entertainment and Recreation, Accommodation and Food Services, Repairs,
Religious, Civic, Professional, and Similar Organizations, Public Administration &
Public Works, Country/Region Specific, Executives…
                                    -   extracted from http://www.specialissues.com/lol/




Education

Magazine                  Article                                                     Year

American School &         Top 10 Issue (biggest, best and most popular in education   2005
University Magazine       facilities and business)

American School &         Top 10 Issue (biggest, best and most popular in education   2003
University Magazine       facilities and business)

American School &         Top 100 School Districts and Colleges Facilities (ranked    2003
University Magazine       by size of facilities)

American School &         Top 10 Issue (biggest, best and most popular in education   2004
University Magazine       facilities and business)

American School &         Top 100 School Districts and Colleges Facilities (ranked    2004
University Magazine       by size of facilities)

American School &         Top 100 School Districts and Colleges Facilities (ranked    2002
University Magazine       by size of facilities)

American School &         Top 10 Issue (biggest, best and most popular in education   2006
University Magazine       facilities construction, operations and management)

American School &         Top 100 School Districts and Colleges Facilities (ranked    2006
University Magazine       by size of facilities)

Business Week             Best Business Schools (ranking and review of the world's    2002
(Global edition)          leading business schools) (1986)
(formerly North
America edition)

Business Week             Best Executive Education/Business Schools (ranking and      2005
(Global edition)          review of the world's leading business schools) (1986)
(formerly North
America edition)

Business Week             Best Executive Education/Business Schools (ranking and      2004
(Global edition)          review of the world's leading business schools) (1986)
(formerly North
America edition)

Business Week             Young Professionals: Best Undergrad B-Schools               2007
(Global edition)
(formerly North
America edition)

Business Week             Young Professionals: Best Undergrad B-Schools               2008
(Global edition)
(formerly North
America edition)

Canadian Business         MBA Report (annual look at master of business               2003
                          administration education, we've decided to forgo our
                          traditional ranking of Canada's MBA programs and instead
                          examine the ever-increasing variety of choices Canadian
                          schools are offering) (1991)

Chief Executive           Annual Best Business Schools for Executive Education        2006
                          (2004)

Chronicle of Higher       Almanac of Higher Education (statistical/demographic        2002
Education, The            databook on education covering four major topical areas:
                          students, faculty and staff, resources, and institutions)
                          (separate issue)

Expansion                 Metro With the Best Public Education Systems                2005
Management

Foodservice Director      College Census (2001 performance report for 100 top         2002
                          self-op colleges)

Foodservice Director      School Census (performance report for top 100 school        2002
                          districts)

Forbes                    Best Business Schools (ranked by return on investment)      2008
                          (2001, biennial)

Forbes                    Best Business Schools (ranked by return on investment)      2007
                          (2001, biennial)

Fortune                   Top 50 MBA Employers                                        2007

Fortune (International    20 Great Employers for New Grads                            2007
Version: Asia, Europe,
Latin America)

Fortune Small             10 Cool Colleges for Entrepreneurs                          2006
Business: FSB

Fortune Small             Best Colleges for Entrepreneurs                             2007
Business: FSB

Maclean's                 Canada's Best Schools                                       2004

Maclean's                 Annual University Ranking (1990)                            2004

Meat & Poultry            Scholastic Top 10 (top 10 universities ranked by the        2004
                          quality and variety of workshops, conferences and short
                          courses available at universities throughout the U.S.)
                          (2000)

Meat & Poultry            Top 10 Universities (top 10 universities ranked by the      2007
                          quality and variety of workshops, conferences and short
                          courses available at universities throughout the U.S.)
                          (2000)

National Law Journal      NLJ Law Schools Report                                      2008

Progress Magazine (CA)    The High School Report Card (the AIMS Ranking of High       2009
(formerly Atlantic        School Performance in Every District in Atlantic Canada
Progress Magazine)        and Maine) (2002)

Quirk's Marketing         University Degree Programs in Marketing Research            2008
Research Review

School Bus Fleet          Statistics & Top Rankings                                   2003

School Bus Fleet          Top 50 Contractor Fleets                                    2002

School Bus Fleet          Top 100 School District Fleets                              2002

School Planning &         Leading the Way: America's Fastest Growing Districts        2007
Management

Technology Review         University Research Scorecard (ranking and analysis of      2002
(formerly MIT             intellectual property and research revenues and
Technology Review)        spin-offs, includes profiles of hot start-ups)

U.S. News and             Best Graduate Schools Guide                                 2002
World Report

U.S. News and             America's Best Colleges Guide                               2002
World Report

U.S. News and             Colleges (1,400+ schools)                                   2002
World Report

U.S. News and             Community Colleges (1,200+ schools)                         2002
World Report

U.S. News and             Corporate E-learning vendors (600+ providers)               2002
World Report

U.S. News and             E-learning courses and degrees (1,000+ institutions)        2002
World Report

U.S. News and             Graduate Schools (1,000+ programs)                          2002
World Report

U.S. News and             Scholarships (600,000+ awards)                              2002
World Report

U.S. News and             Best Graduate Schools                                       2005
World Report

U.S. News and             Best Colleges                                               2004
World Report

Virginia Business         Special Report: Business Schools Directory                  2006

Virginia Business         Private Schools Directory                                   2006

Virginia Business         Special Report: Community Colleges Directory                2006

Virginia Business         Education: Engineering/IT Schools Directory                 2006




Three Types of Search Engines
The term "search engine" is often used generically to describe crawler-based search
engines, human-powered directories, and hybrid search engines. Each of these three
types gathers its listings in a different way, as described below.


Crawler-based search engines
Crawler-based search engines, such as Google (http://www.google.com), create their
listings automatically. They "crawl" or "spider" the web, then people search through
what they have found. If web pages are changed, crawler-based search engines
eventually find these changes, and that can affect how those pages are listed. Page
titles, body copy and other elements all play a role.
A typical web query takes less than half a second from start to finish, yet it
involves a number of different steps that must be completed before results can be
delivered to the person seeking information. The following graphic (Figure 1) illustrates
this life span (from http://www.google.com/corporate/tech.html):




1. The web server sends the query to the index servers. The content inside the index
   servers is similar to the index in the back of a book - it tells which pages contain
   the words that match the query.
2. The query travels to the doc servers, which actually retrieve the stored documents.
   Snippets are generated to describe each search result.
3. The search results are returned to the user in a fraction of a second.
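
The three steps in the graphic can be sketched in a few lines of Python. This is only a toy illustration, not how Google is actually built: the three stored documents, the function names, and the 30-character snippet length are all assumptions made up for the example. The "index" maps each word to the documents that contain it, like the index in the back of a book.

```python
import re
from collections import defaultdict

# Hypothetical stored documents (the "doc servers" in the graphic).
DOCS = {
    1: "Vancouver florists deliver fresh flowers daily",
    2: "Welders in Vancouver repair steel bridges",
    3: "Florists in Hungary grow tulips and roses",
}

def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

def snippet(text, width=30):
    """Return a short description of a stored document."""
    return text if len(text) <= width else text[:width].rstrip() + "..."

def search(query, index, docs):
    """Step 1: look up every query word in the index.
    Step 2: fetch the matching documents and build snippets.
    Step 3: return the results."""
    words = tokenize(query)
    if not words:
        return []
    matching = set.intersection(*(index.get(w, set()) for w in words))
    return [(doc_id, snippet(docs[doc_id])) for doc_id in sorted(matching)]

INDEX = build_index(DOCS)
results = search("vancouver florists", INDEX, DOCS)
print(results)
```

Only document 1 contains both words, so it is the only result. Note that the search never reads the documents until the index has already said which ones match.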




Human-powered directories
A human-powered directory, such as the Open Directory Project
(http://www.dmoz.org/about.html) depends on humans for its listings. (Yahoo!, which
used to be a directory, now gets its information from the use of crawlers.) A directory
gets its information from submissions, in which site owners send the directory a short
description of the entire site, or from editors who write one for sites they review. A
search looks for matches only in the descriptions submitted. Changing web pages,
therefore, has no effect on how they are listed. Techniques that are useful for
improving a listing with a search engine have nothing to do with improving a listing in
a directory. The only exception is that a good site, with good content, might be more
likely to get reviewed for free than a poor site.




Hybrid search engines
Today, it is extremely common for crawler-type and human-powered results to be
combined when conducting a search. Usually, a hybrid search engine will favor one
type of listing over the other. For example, MSN Search
(http://www.imagine-msn.com/search/tour/moreprecise.aspx) is more likely to present
human-powered listings from LookSmart (http://search.looksmart.com/). However, it
also presents crawler-based results, especially for more obscure queries.

Recommended Search Engines
UC Berkeley - Teaching Library Internet Workshops
Google is currently the most used search engine. It has one of the largest databases
of Web pages, including many other types of web documents (blog posts, wiki pages,
group discussion threads) and document formats (e.g., PDFs, Word or Excel
documents, PowerPoints). Despite the presence of all these formats, Google's
popularity ranking often places worthwhile pages near the top of search results.
Google alone is not always sufficient, however. Not everything on the Web is fully
searchable in Google. Overlap studies show that more than 80% of the pages in a
major search engine's database exist only in that database. For this reason, getting a
"second opinion" can be worth your time. For this purpose, we recommend Yahoo!
Search or Exalead. We do not recommend using meta-search engines as your
primary search tool.

Table of Search Engine Features
Some common techniques will work in any search engine. However, in this very
competitive industry, search engines also strive to offer unique features. When in
doubt, look for "help", "FAQ", or "about" links.



  Search           Google              Yahoo! Search          Exalead
  Engine        www.google.com        search.yahoo.com www.exalead.com/search/

 Links to    Google help             Yahoo! help          Exalead help and FAQ
   help

Size, type  IMMENSE. Size not         HUGE. Claims over      LARGE. Claims to have
            disclosed in any way      20 billion total "web  over 8 billion searchable
            that allows               objects."              pages.
            comparison. Probably
            the biggest.

Noteworthy  PageRank™ system          Shortcuts give         Truncation lets you search
features    includes hundreds of      quick access to        by the first few letters of a
            factors, emphasizing      dictionary,            word.
            pages most heavily        synonyms, patents,     Proximity search lets you
            linked from other         traffic, stocks,       find terms NEAR each
            pages.                    encyclopedia, and      other or NEXT to each
            Many additional           more.                  other.
            databases including                              Thumbnail page previews.
            Book Search, Scholar                             Extensive options for
            (journal articles), Blog                         refining and limiting your
            Search, Patents,                                 search.
            Images, etc.

Phrase      Enclose phrase in         Enclose phrase in      Enclose phrase in "double
searching   "double quotes".          "double quotes".       quotes".


Boolean     Partial. AND assumed      Accepts AND, OR,       Partial. AND assumed
logic       between words.            NOT or AND NOT.        between words.
            Capitalize OR.            Must be                Capitalize OR.
            ( ) accepted but not      capitalized.           ( ) accepted.
            required.                 ( ) accepted but not   See Web Search Syntax
            In Advanced Search,       required.              for more options.
            partial Boolean
            available in boxes.

+Requires/ - excludes                 - excludes             - excludes
-Excludes + retrieves "stop           + will allow you to    + retrieves "stop words"
           words" (e.g., +in)         search common          (e.g., +in)
                                      words: "+in truth"

Sub-        The search box at the     The search box at      The search box at the top
Searching   top of the results page   the top of the         of the results page shows
            shows your current        results page shows     your current search. Modify
            search. Modify this       your current           this (e.g., add more terms
            (e.g., add more terms     search. Modify this    at the end.)
            at the end.)              (e.g., add more
                                      terms at the end.)

Results     Based on page             Automatic Fuzzy        Popularity ranking
Ranking     popularity measured       AND.                   emphasizes pages most
            in links to it from other                        heavily linked from other
            pages: high rank if a                            pages.
            lot of other pages link
            to it.
            Fuzzy AND also
            invoked.
            Matching and ranking
            based on "cached"
            version of pages that
            may not be the most
            recent version.

Field        link:                  link:                  intitle:
limiting     site:                  site:                  inurl:
             intitle:               intitle:               site:
             inurl:                 inurl:                 after:[time period]
             Offers U.S.Gov't       url:                   before:[time period]
             Search and other       hostname:              (For details, click on
             special searches.      (Explanation of        "Advanced search")
             Patent search.         these distinctions.)

Truncation, No truncation within   Neither. Search         Use *
Stemming    words. Automatically   with OR as in           example: messag*
            stems some words.      Google.
            Search variant
            endings and
            synonyms separately,
            separating with OR
            (capitalized):
            airline OR airlines
            Use * or _ as
            wildcards substituting
            for initials or words:
            sickle * anemia
            george _ bush

Language     Yes. Major             Yes. Major             Extensive language and
             Romanized and non-     Romanized and          geographic options. Use
             Romanized languages    non-Romanized          "Advanced Search".
             in Advanced Search.    languages.

Translation Yes, in "Translate this Available as a         Yes, in "Translate this
            page" link following    separate service.      page" link following some
            some pages. To and                             pages.
            sometimes from
            English and major
            European languages
            and Chinese,
            Japanese, Korean.
            Uses its own translation
            software with user
            feedback.
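
The truncation row of the table can be illustrated with a short Python sketch. It uses the standard-library fnmatch module to imitate a trailing * wildcard such as messag*; the word list is made up, and real search engines implement truncation in their own way. The last line shows the workaround the table suggests for engines without truncation: list the variant endings yourself and join them with a capitalized OR.

```python
from fnmatch import fnmatchcase

def truncation_matches(pattern, words):
    """Return the words that match a truncation pattern such as 'messag*'."""
    return [w for w in words if fnmatchcase(w.lower(), pattern.lower())]

# Hypothetical vocabulary, for illustration only.
vocabulary = ["message", "messages", "messaging", "messenger", "massage"]

matches = truncation_matches("messag*", vocabulary)
or_query = " OR ".join(matches)

print(matches)   # ['message', 'messages', 'messaging']
print(or_query)  # message OR messages OR messaging
```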



How do Search Engines Work?
Search engines do not really search the World Wide Web directly. Each one
searches a database of web pages that it has harvested and cached. When you use
a search engine, you are always searching a somewhat stale copy of the real web
page. When you click on links provided in a search engine's search results, you
retrieve the current version of the page.



Search engine databases are selected and built by computer robot programs called
spiders. These "crawl" the web, finding pages for potential inclusion by following the
links in the pages they already have in their database. They cannot use imagination
or enter terms in search boxes that they find on the web.
If a web page is never linked from any other page, search engine spiders cannot find
it. The only way a brand new page can get into a search engine is for other pages to
link to it, or for a human to submit its URL for inclusion. All major search engines offer
ways to do this.
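
A minimal sketch of this link-following behaviour, assuming a made-up miniature web of six pages (the page names and the crawl function are hypothetical, not any real spider's code). The spider starts from pages it already knows and can only discover pages that something links to.

```python
from collections import deque

# Hypothetical miniature web: each page maps to the pages it links to.
LINKS = {
    "home": ["about", "lessons"],
    "about": ["home"],
    "lessons": ["songs", "games"],
    "songs": [],
    "games": ["home"],
    "orphan": ["home"],  # links out, but no other page links to it
}

def crawl(seed_pages, links):
    """Follow links breadth-first from the seeds; return pages in discovery order."""
    found = list(seed_pages)
    seen = set(seed_pages)
    queue = deque(seed_pages)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                found.append(target)
                queue.append(target)
    return found

discovered = crawl(["home"], LINKS)
after_submission = crawl(["home", "orphan"], LINKS)

print(discovered)        # 'orphan' is never reached
print(after_submission)  # submitting its URL adds it as a seed
```

The page called "orphan" is missed in the first crawl because nothing links to it. Adding it as a seed plays the role of a human submitting its URL.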
After spiders find pages, they pass them on to another computer program for
"indexing." This program identifies the text, links, and other content in the page and
stores it in the search engine database's files so that the database can be searched
by keyword and whatever more advanced approaches are offered, and the page will
be found if your search matches its content.
Many web pages are excluded from most search engines by policy. The contents of
most of the searchable databases mounted on the web, such as library catalogs and
article databases, are excluded because search engine spiders cannot access them.
All this material is referred to as the "Invisible Web" -- what you don't see in search
engine results.
Recommended Subject Directories
UC Berkeley - Teaching Library Internet Workshops
                                                                    - extracted from
        http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/SubjDirectories.html
Recommended General Subject Directories:

Table of Directory Features

Web         ipl2                Infomine              About.com           Yahoo!
Directories www.ipl.org         infomine.ucr.edu      www.about.com       dir.yahoo.com

Size, type   Over 40,000.       Over 125,000.         Over 2 million.     About 4 million.
             Highest quality    Useful, reliable      Generally good      Very short
             sites only.        annotations.          annotations done    descriptions.
             Useful, reliable   Compiled by           by "Guides" with    Often useful,
             annotations.       academic librarians   various levels of   especially for
             Formed by a        from the University   expertise.          popular and
             merger of the      of California and                         commercial
             Librarians'        elsewhere.                                topics.
             Internet Index
             and the
             Internet Public
             Library.

Phrase       No.                Yes. Use " "         Yes. Use " "         Yes. Use " "
searching                       |term term| requires
                                exact match

Boolean      OR implied         AND implied           No.                 Yes, as in
logic        between            between words.                            Yahoo! Search
             words. Also        Also accepts OR,                          web search
             accepts AND        NOT, and ( ).                             engine.
             and NOT.
             Nesting with ( )
             does not work.

Truncation   No.                Use *. Also stems.    Use *.              No.
                                Can turn stemming     Not accepted
                                off. Use " " or | |   consistently.
                                to search exact
                                terms.

Field        No.              Limit to Author,      No.                As in Yahoo!
searching                     Title, Subject,                          Search web
                              Keyword,                                 search engine.
                              Description, and
                              more.

Subject Directories (Contain Databases), and Portals

How to Find Subject-Focused Directories for a Specific Topic, Discipline, or Field
There are thousands of specialized directories on practically every subject. If you
want an overview, or if you feel you've searched long enough, try to find one. Often
they are done by experts -- self-proclaimed or heavily credentialed. Here are some
ways to find them:
Use any of the Subject Directories above to find more specific directories. Here are
some tips:
    •   In ipl2 or Infomine, look for your subject as you would for any other purpose,
        and keep your eyes open for sites that look like directories. Read through the
        descriptions. Sometimes these resources are identified as "Directories,"
        "Virtual Libraries," or "Gateway Pages."
    •   In About.com (a portal, that is, a site that links to many other sites organized
        by its own directory structure) or the Yahoo! directory, try adding the terms
        web directories to your subject keyword:
EXAMPLES:
civil war web directories
weddings web directories
    •   In About.com, search by topic and look for pages that are described as "101"
        or "guides" or a "directory." About.com is written by "Guides" who,
        themselves, often are experts in the sections they manage. Sometimes they
        write excellent overviews of a topic.



Meta-Search Engines
UC Berkeley - Teaching Library Internet Workshops

What Are "Meta-Search" Engines? How Do They Work?
In a meta-search engine, you submit keywords in its search box, and it transmits your
search simultaneously to several individual search engines and their databases of
web pages. Within a few seconds, you get back results from all the search engines
queried. Meta-search engines do not own a database of Web pages; they send your
search terms to the databases maintained by search engine companies.
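
The idea can be sketched as follows. The two "engines" here are hypothetical stand-in functions that return fixed lists of made-up URLs; a real meta-searcher would query live search engines, usually in parallel rather than one after another as in this sketch. The merging rule (interleave by rank, drop duplicates) is also just one plausible choice, not a description of any particular tool.

```python
from itertools import zip_longest

def engine_a(query):
    """Hypothetical stand-in for one search engine's ranked results."""
    return ["example.org/a1", "example.org/shared", "example.org/a2"]

def engine_b(query):
    """Hypothetical stand-in for a second search engine's ranked results."""
    return ["example.org/shared", "example.org/b1"]

def meta_search(query, engines):
    """Send the query to every engine, then interleave results and drop duplicates."""
    result_lists = [engine(query) for engine in engines]
    merged, seen = [], set()
    for rank_group in zip_longest(*result_lists):
        for url in rank_group:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

merged = meta_search("lesson plans", [engine_a, engine_b])
print(merged)
```

The meta-searcher holds no pages of its own: everything in the merged list came from the engines it queried.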

Are "Smarter" Meta-Searchers Still Smarter?
"Smarter" meta-searcher technology includes clustering and linguistic analysis that
attempts to show you themes within results, and some fancy textual analysis and
display that can help you dig deeply into a set of results. However, neither of these
technologies is any better than the quality of the search engine databases they
obtain results from.
Few meta-searchers allow you to delve into the largest, most useful search engine
databases. They tend to return results from smaller and/or free search engines and
miscellaneous free directories, often small and highly commercial.
Although we respect the potential of textual analysis and clustering technologies, we
recommend directly searching individual search engines to get the most precise
results, and using meta-searchers if you want to explore more broadly.
The meta-search tools listed here are "use at your own risk." We are not
endorsing or recommending them.

Better Meta-Searchers

Meta-Search       What's Searched               Complex           Results Display
Tool              (As of date at bottom of      Search Ability
                  page. They change often.)

Yippy             Searches Bing, Ask,           Accepts Boolean   Results accompanied with
yippy.com         Open Directory, and           operators AND,    subdivisions based on
(formerly         Yahoo (as of 6/15/10).        OR, NOT, and      words in search results,
Clusty)                                         limiting by       intended to give the major
                                                "filetype:" and   themes. Click on these to
                                                "site:".          search within results on
                                                                  each theme.

Dogpile           Searches Google, Yahoo,
www.dogpile.com   Bing, and Ask.com (as of
                  6/15/10). Sites that have
                  purchased ranking and
                  inclusion are mixed into the
                  results. Watch for
                  "Sponsored:".




            Meta-Search Engines for SERIOUS Deep Digging

  Meta-Search     What's               Complex Search         Results Display
  Tool            Searched             Ability

  SurfWax         A better than        Accepts " ", +/-.      Click on source link to
  www.surfwax.com average set of       Default is AND         view complete search
                  search engines.      between words. I       results there.
                  Can mix with         recommend fairly       Click on      to view
                  educational, US      simple searches,       helpful "SiteSnap™"
                  Govt tools, and      allowing SurfWax's     extracted from most
                  news sources,        SiteSnaps and other    sites in frame on right.
                  or many other        features to help you   Many additional
                  categories.          dig deeply into        features for probing
                                       results.               within a site.

  Copernic Agent   Select from list    ALL, ANY, Phrase,      Must be downloaded
  www.copernic.com of search           and more. Also         and installed, but Basic
                   engines by          Boolean searching      version is free of
                   clicking on         within results under   charge. Table
                   Advanced, then      "Find in results" >    comparing versions.
                   "Modify search      "Advanced Find"
                   engine              (powerful!).
                   settings".

Search Basics: Constructing a Google Query
Search engines work by providing you with a screen form containing one or more
fields into which you type your search term (a combination of words and/or phrases).
Single words are quick and easy, but produce much too general a result. With Google,
for example, looking for florists yields 24 million hits (search results). If we narrow
the search to florists in Vancouver (i.e. type florists Vancouver), we come up with
1.7 million results. Narrow further by making your search term a phrase. To do this,
enclose the words in double quotation marks, as in "Vancouver florists". In Google,
this example produces just 27,000 hits, because Google is making a match for the
exact string of characters we typed.
Some search engines provide radio buttons that allow you to specify whether the
search must match Any or All of the terms you type. Most default to All, returning
pages that contain every word used in your search. Choose Any to retrieve pages
that contain one or more of your search words. This AND versus OR distinction is
called Boolean logic, and it's the key to controlling the search engines. To specify an
OR in Google, you must type the word OR between words. In our Vancouver florists
scenario, for example, typing florists OR vancouver results in 85 million hits
because it returns all pages containing either the word florists or the word Vancouver.
Thus, you might get florists in Hungary and welders in Vancouver! By combining
ANDs, ORs, and phrases, you can begin to build truly powerful queries. Learn these
techniques and many more powerful search strategies in our popular Internet research
course.
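
As a rough sketch, the three kinds of query described above (plain words, an exact phrase, and an OR search) can be assembled in Python. The helper names are made up for this example, and the base address https://www.google.com/search with a q parameter is an assumption about how the query is passed in the URL. The only library call is urlencode from the standard library, which makes the query safe to place in a web address.

```python
from urllib.parse import urlencode

def phrase(words):
    """Wrap words in double quotes so they are matched as an exact phrase."""
    return '"' + words + '"'

def any_of(*terms):
    """Join terms with a capitalized OR, as Google requires."""
    return " OR ".join(terms)

def search_url(query, base="https://www.google.com/search"):
    """Build a search address; the base URL and 'q' parameter are assumptions."""
    return base + "?" + urlencode({"q": query})

broad = "florists Vancouver"           # all words, anywhere on the page
exact = phrase("Vancouver florists")   # the exact phrase
either = any_of("florists", "vancouver")

print(exact)              # "Vancouver florists"
print(either)             # florists OR vancouver
print(search_url(exact))
```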

Where does the term Boolean come from?
Boolean searching is built on a method of symbolic logic developed by George
Boole, a 19th century English mathematician. Most online databases and search
engines support Boolean searches. Boolean search techniques can be used to carry
out effective searches, cutting out many unrelated documents.

Is Boolean Search Complicated?
Using Boolean Logic to broaden and/or narrow your search is not as complicated as
it sounds; in fact, you might already be doing it. Boolean logic is just the term used to
describe certain logical operations that are used to combine search terms in many
search engine databases and directories on the Net. It's not rocket science, but it
sure sounds fancy (try throwing this phrase out in common conversation!).
Basic Boolean Search Operators - AND
Using AND narrows a search by combining terms; it will retrieve documents that use
both the search terms you specify, as in this example:
   •   Portland AND Oregon
Basic Boolean Search Operators - OR
Using OR broadens a search to include results that contain either of the words you
type in. OR is a good tool to use when there are several common spellings or
synonyms of a word, as in this example:
   •   liberal OR democrat
Basic Boolean Search Operators - NOT
Using NOT will narrow a search by excluding certain search terms. NOT retrieves
documents that contain the first search term but not the second, as in
this example:
   •   Oregon NOT travel
Keep in mind that not all search engines and directories support Boolean terms.
However, most do, and you can easily find out if the one you want to use supports
this technique by consulting the FAQs (Frequently Asked Questions) on a search
engine or directory's home page.
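
The three operators can be tried out on a tiny, made-up collection of page titles. This is only a sketch of the logic; the page texts and function names are invented for the example, and real search engines match words in far more sophisticated ways.

```python
# Hypothetical pages, for illustration only.
PAGES = {
    "p1": "Portland Oregon travel guide",
    "p2": "Portland Maine lobster festival",
    "p3": "Oregon state history",
}

def has(term, text):
    """True if the term appears as a whole word in the text (case-insensitive)."""
    return term.lower() in text.lower().split()

def search_and(a, b, pages):
    """Pages that contain both terms."""
    return sorted(k for k, t in pages.items() if has(a, t) and has(b, t))

def search_or(a, b, pages):
    """Pages that contain either term (or both)."""
    return sorted(k for k, t in pages.items() if has(a, t) or has(b, t))

def search_not(a, b, pages):
    """Pages that contain the first term but not the second."""
    return sorted(k for k, t in pages.items() if has(a, t) and not has(b, t))

print(search_and("Portland", "Oregon", PAGES))  # ['p1']
print(search_or("Portland", "Oregon", PAGES))   # ['p1', 'p2', 'p3']
print(search_not("Oregon", "travel", PAGES))    # ['p3']
```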

Boolean Search And / Or / Not
This is an algebraic concept, but don't let that scare you away. Boolean connectors
are all about sets. There are three little words that are used as Boolean connectors:
   •   and
   •   or
   •   not



Think of each keyword as having a "set" of results that are connected with it. These
sets can be combined to produce a different "set" of results. You can also exclude
certain "sets" from your results by using a Boolean connector.
AND is a connector that requires both words to be present in each record in the
results. Use AND to narrow your search.

                           Search Term                                 Hits

     Television                                                999 hits

     Violence                                                  876 hits

     Television and violence                                   123 hits

The words 'television' and 'violence' will both be present in each record.

OR is a connector that allows either word to be present in each record in the results.
Use OR to expand your search.

     Search Term                          Hits

     Adolescents                          97 hits
     Teenagers                            75 hits
     Adolescents OR teenagers             172 hits

Either 'adolescents' or 'teenagers' (or both) will be present in each record.

NOT is a connector that requires the first word be present in each record in the
results, but only if the record does not contain the second word.

     Search Term                          Hits

     High school                          423 hits
     Elementary                           652 hits
     High school NOT elementary           275 hits

Each record contains the words 'high school', but not the word 'elementary'.
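
The "sets" view of Boolean connectors described above can be sketched in a few lines of Python. This is only an illustration: the record IDs below are invented, and a real search engine works on a far larger index.

```python
# Toy illustration: each keyword maps to the set of record IDs that contain it.
# The IDs are made up for this sketch and are not real search results.
television = {1, 2, 3, 4, 5}
violence = {4, 5, 6, 7}

# AND: records containing both words (set intersection); narrows the search.
both = television & violence

# OR: records containing either word, or both (set union); broadens the search.
either = television | violence

# NOT: records containing the first word but not the second (set difference).
only_television = television - violence

print("television AND violence:", sorted(both))
print("television OR violence: ", sorted(either))
print("television NOT violence:", sorted(only_television))
```

AND can only shrink a result set and OR can only grow it, which is why the first is used to narrow a search and the second to expand it.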

Boolean Search Examples: Boolean Connectors

Interactive Text Equivalent
This Boolean demonstration provides a simple example of how Boolean connectors
can help focus your search as precisely as possible.



THE SCENARIO
Your research topic: television violence
You do a separate search for each keyword and get back the following results:
Television = 999
Violence = 876
That's a lot to wade through. The sections below show how each Boolean connector
('AND', 'OR', and 'NOT') affects this search.

AND
You use 'AND' to connect terms or phrases.
We have two words, 'television' and 'violence'. To connect them we use the Boolean
connector 'AND'. Compare the results of the search options below:
SEARCH #1: television
Result: A circle balloons until it fills about half the play area. As it gets bigger, we see
the word 'television' appear. When it has finished generating, the result shows up:
'= 999 results'.
SEARCH #2: violence
Result: A circle balloons until it fills about half the play area. As it gets bigger, we see
the word 'violence' appear. When it has finished generating, the result shows up:
'= 876 results'.
SEARCH #3: television AND violence
Result: The two circles balloon until they fill the play area, as in the searches above. As
they get bigger, we see the words 'television' and 'violence' appear. When they have
finished generating, the results show up as above; in addition, the area where the two
circles overlap is a different color and reads as follows:
AND = 123 results

OR
You use 'OR' to search for multiple terms or phrases.
You've decided to focus on how violence on television affects a specific age group:
teenagers. But in your searches you've encountered another term that's
frequently used: 'adolescents'.
So, in order to get information that uses either term, you'd use the OR connector.
SEARCH: teenager OR adolescent
Result: Both circles balloon until they fill the play area as above. As they get bigger,
we see the words 'teenager' and 'adolescent' appear. When they have finished
generating, the results show up as above.
Next, 'OR' appears between them, and the two circles move towards one another.
The text 'teenager = 75 results' and 'adolescent = 97 results' stays where it is. As the
circles merge (and change into a new color), the 'OR' disappears behind them. When
the merging has finished, the following text appears in the middle of the new circle:
Teenager OR Adolescent
75 + 97 = 172 results
The 'teenager = 75 results' and 'adolescent = 97 results' labels should now be outside
the circle, to the left and right.

NOT
You use 'NOT' to exclude terms or phrases.
In one of your searches you use "high school" as a keyword phrase. You notice that
you get many results which cover both high school and elementary school. The main
emphasis of your research, as you've followed the process, has turned towards how
television violence affects students in high school.
So, in order to eliminate unwanted results, you use the NOT connector.
SEARCH: high school
The circle to the left balloons. As it gets bigger, we see the words 'high school'
appear. When it has finished generating, the results show up as follows: High school =
423 results.
SEARCH: elementary
The circle to the right balloons. As it gets bigger, we see the word 'elementary'
appear. When it has finished generating, the results show up as follows: Elementary =
652 results.
SEARCH: high school NOT elementary
Both circles balloon until they fill the play area as above. When they have finished
generating, the results appear as above, but where the circles overlap it reads: NOT =
148 exclusions.
Next, the 'elementary' circle and the NOT overlap move away from the 'high school'
circle. The NOT area looks like a bite taken out of the 'high school' circle.
When the elementary circle and the NOT bite stop, the results in the high school
circle change to:
High school NOT elementary: 423 - 148 exclusions = 275
By excluding every record that mentions 'elementary' alongside 'high school', you get
275 results that mention high school only.
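
The hit counts in this scenario follow simple set arithmetic, which the sketch below reproduces. The helper names are hypothetical. Note that the OR total of 172 only holds if no record mentions both 'teenager' and 'adolescent'; a zero overlap is assumed here because the scenario does not state one.

```python
def or_hits(a_hits, b_hits, overlap):
    """Records matching A OR B: add both counts, subtract records counted twice."""
    return a_hits + b_hits - overlap


def not_hits(a_hits, overlap):
    """Records matching A NOT B: remove the records that also contain B."""
    return a_hits - overlap


# Numbers taken from the scenario; the zero overlap for OR is an assumption.
teen_or_adolescent = or_hits(75, 97, overlap=0)
high_school_not_elementary = not_hits(423, overlap=148)

print(teen_or_adolescent)          # 172
print(high_school_not_elementary)  # 275
```

If some records used both terms, the OR total would be smaller than the plain sum, because each of those records would be counted only once.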

How the Search Engines Differ
The Web puts a variety of powerful search engines at your disposal, including
Altavista, Google, All The Web, Teoma, Wisenut, and many more. Which is best?
These tools vary in ease of use, not to mention features. Your choice of search
engine should be driven by the research challenge you face. Some search engines
are better than others for particular purposes. See below for brief descriptions of
today's major players, their respective strengths and weaknesses, and their
affiliations:

Search Engine Syntax & Features Comparison Chart
An understanding of the syntax differences among search engines is essential to
mastery of these tools and the ability to force them to return the precise results you
want. Many of these sites appear to operate similarly, at least on the surface. Yet
they can differ substantially in how they understand queries and allow you to filter
results, as well as how they rank the hits returned. Consult our search basics page
for information on syntax and operators, then experiment with the search engines in
the chart below.

Altavista
   Boolean:          + - ( ), AND, OR, AND NOT, NEAR ( ) (Simple Srch)
   Default:          Phrase, then AND
   Phrase:           ""
   Wildcards:        Yes; * 1-5 characters, must type first 3 characters
   Case sensitive:   No
   Prefixes:         anchor, applet, domain, host, image, like, link, text, title, url
   Family filter:    Yes. Password protected.

Google
   Boolean:          OR, -, + to include stop words
   Default:          AND
   Phrase:           ""
   Wildcards:        Whole word wildcard (*)
   Case sensitive:   No
   Prefixes:         filetype, daterange, cache, link, related, info, spell, stocks,
                     site, intitle, allintitle, inurl, allinurl
   Family filter:    Yes

All The Web
   Boolean:          AND, OR, ANDNOT, ( ), +, -; ( ) means OR
   Default:          AND
   Phrase:           ""
   Wildcards:        No
   Case sensitive:   No
   Prefixes:         site, url, link, title, language, filesize, filetype
   Family filter:    Yes

Wisenut
   Boolean:          +, -
   Default:          AND
   Phrase:           ""
   Wildcards:        No
   Case sensitive:   No
   Prefixes:         language
   Family filter:    Yes

Teoma
   Boolean:          -, OR, + to include stop words
   Default:          AND
   Phrase:           ""
   Wildcards:        No
   Case sensitive:   No
   Prefixes:         intitle, inurl, site, inlink, lang, afterdate, beforedate,
                     betweendate
   Family filter:    No

Google: Google is the world's most popular search engine. Claiming to search 3.3
billion pages (that's practically the entire Web!), this search engine remains
undisputed king in terms of size. Google produces highly relevant results, using link
popularity for ranking. Google's original claim to fame was its speed, although its
clean, uncluttered interface has also won fans. Google defaults to AND when
processing queries containing two or more words (returning pages that match all
words specified). If you want either word (as in alternate spellings of color), you must
actually force Google to see your search this way, by specifying the Boolean OR
operator, as in color OR colour. Google supports exact phrase searching plus the
ability to exclude words (use the minus sign) and to constrain by domain and other
criteria. Alliances: Google has taken over the Deja newsgroup archive. It powers
hundreds of other search engines and the web search feature of directories like
Yahoo. Google's Web directory is provided by DMOZ.
Altavista: Still the champ in terms of raw search power, Altavista was recently
purchased by Overture, the Net's major pay-per-click search company. Altavista's
index is respectable, at 1 billion pages. It defaults to OR, ordering search results
according to number, location and proximity of search term occurrences. Use
Altavista when you need to construct complex queries containing nested
combinations of AND and OR. Altavista supports the quasi-Boolean operators (+, -)
and the formal Boolean operators (AND, OR, AND NOT, NEAR). This search engine
allows you to constrain your search by domain, location within page, date, and
numerous other criteria. Drawbacks include notoriously buggy hit counts and an
interface that could stand some usability improvements. Alliances: Altavista, too,
powers hundreds of other sites. Its web directory is provided by DMOZ.
All The Web: At first glance, All The Web looks much like Google, providing the
clean look and user-friendliness of the industry leader. All The Web defaults to AND,
with a convenient tick box that allows you to specify a phrase. Its index rivals
Google's, at 3.2 billion documents. It does not recognize formal Boolean arguments,
although it supports quasi-Boolean operators (+, -) and the ability to constrain by
domain, location within page, and several other criteria. Alliances: All The Web was
also recently taken over by Overture.
Wisenut: Known for its clean screen and speedy performance, Wisenut set out to
rival Google. A "clustering" search engine, Wisenut groups results into categories it
calls "WiseGuide." Small plus and minus signs allow you to collapse and expand
these categories. Like Google, Altavista, and other major players, Wisenut is a
spider-based search engine that crawls links and indexes page contents. Wisenut
claims to have an index of 1.5 billion pages. Wisenut defaults to AND, and supports
phrase searching and the + and - operators, though it offers no advanced search
features as yet. Alliances: Wisenut is owned by Looksmart.
Teoma: Like Wisenut, Teoma set out to emulate Google's clean screen and fast
performance. It too defaults to AND. Teoma's index is a respectable 1.5 billion
pages. Like Google, Teoma evaluates page popularity, using complex relevance and
link popularity algorithms to rank results. Teoma clusters search results at the top of
the screen and displays a list of what it calls "Expert Link Collections" at bottom right.
These listings point to sites Teoma considers authoritative link collections relevant to
the subject of your search. Sometimes called jumplists, link collections can be among
the Web's hidden treasures. Teoma is one of the few search engines to identify
them. This feature alone makes it a valuable addition to your bookmark list.
Alliances: Teoma was acquired by Ask Jeeves in 2001.
Site contents Copyright © 1994-2005 Pam Blackstone. All rights reserved.
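
The Google syntax described above (AND by default, the OR operator, quoted phrases, the minus sign, and the site prefix) can be combined programmatically. The following is a minimal sketch, not an official Google interface: the function name and its parameters are hypothetical, and the base search URL is an assumption used only for illustration.

```python
from urllib.parse import quote_plus


def build_query(all_words=(), any_words=(), phrase=None, exclude=(), site=None):
    """Compose a query string from the operators described in the text."""
    parts = list(all_words)                        # plain words are ANDed by default
    if any_words:
        parts.append(" OR ".join(any_words))       # OR accepts alternate spellings
    if phrase:
        parts.append(f'"{phrase}"')                # exact phrase in double quotes
    parts.extend(f"-{word}" for word in exclude)   # minus sign excludes a word
    if site:
        parts.append(f"site:{site}")               # constrain results to one domain
    return " ".join(parts)


query = build_query(any_words=["color", "colour"], phrase="paint chart",
                    exclude=["car"], site="example.com")
print(query)

# URL-encode the query so it can be pasted into a browser address bar.
# The base URL below is an assumption for illustration.
print("https://www.google.com/search?q=" + quote_plus(query))
```

Other engines in the chart use different operators and defaults, so a helper like this would need adjusting for each one.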

Some Search Tips, Tricks, & Techniques
There's more to search success than simply typing a few words into a search engine.
Here are a few points to keep in mind for your next search.
   •   Choose the right tool for the job. It's not all about search engines!
       Choosing the appropriate research tool is half the battle. Know when to use a
       specialized resource such as a telephone directory, a regional directory, or a
       reference work like those you'd find at the Library.
   •   Familiarize yourself with search engine syntax. The search engines all
       differ in the rules they apply when processing your query. Did you know, for
       example, that Google limits queries to ten words? If you type more than ten
       words, Google simply truncates your query, dropping excess words off the
       end. That's one good reason to plan your search strategy carefully! Check
       search engine sites for a link labelled Help or Search Tips for syntax
       information, and see our search basics page and feature comparison chart for
       more on this important success factor.
   •   Think outside the box when specifying your search term. It's very much a
       trial and error process. Think about how the information you're after might be
       indexed. If you did not get results with one word, try a synonym. If, for
       example, you're seeking information about sailing, you might want to try both
       the words sailing and yachting. If a word has alternate spellings, specify it
       both ways (colour and color, for example).
   •   Understand results ranking. Search engines use complicated formulas to
       order results. Most search engines evaluate web documents against your
       keywords, ordering results by relevance. They do this by assigning a numeric
       score to each hit, based on how closely it matches the specified term. They
       all use different criteria for arriving at this score. Some search engines also
       factor popularity with users into how they order results, and they measure this
       in different ways as well. Be aware that advertising may also influence results
       ranking.
   •   Take advantage of collective human experience. Know when to tap into
       archived discussions. Look on the Web for facts; ask in discussion groups for
       opinions. Turn to newsgroups, mailing lists, or web forums for solutions to
       problems or for answers to obscure or esoteric questions. Google maintains a
       handy searchable archive of online discussions. Chances are, someone's
       already answered your question!
   •   Let someone else do the work. Sometimes, the fastest way to the
       information you're after is to locate a jumplist. Specialized collections of links
       on one subject or theme, jumplists are the hidden treasure of the Web. To
       find them, try adding words like links, resources, collection, or list to your
       search term. Yahoo can be useful for finding jumplists, which you can locate
       by selecting "Web Directories" from many of its menus and sub-menus. The
       Teoma search engine is also useful in locating jumplists, which it calls "expert
       link collections."
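
The results-ranking tip above says that each hit receives a numeric score based on how closely it matches the query. The sketch below illustrates that idea with a deliberately naive rule (counting how often the query words occur). Real search engines use far more elaborate formulas; the scoring rule, page names, and page texts here are invented for illustration.

```python
def score(document, query_terms):
    """Toy relevance score: how many times the query terms occur in the text."""
    words = document.lower().split()
    return sum(words.count(term.lower()) for term in query_terms)


# Hypothetical pages; the query tries both synonyms suggested in the tips.
documents = {
    "page_a": "Sailing tips for beginners and sailing safety",
    "page_b": "Yachting news and regattas",
    "page_c": "Gardening advice for spring",
}
query = ["sailing", "yachting"]

# Order the pages from highest to lowest score.
ranked = sorted(documents, key=lambda name: score(documents[name], query), reverse=True)
for name in ranked:
    print(name, score(documents[name], query))
```

Searching for both synonyms means page_b is still found even though it never uses the word "sailing".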
Finding Information on the Internet: A Tutorial
http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/InvisibleWeb.html

Invisible or Deep Web: What it is, How to find it, and its inherent ambiguity
What is the "Invisible Web", a.k.a. the "Deep Web"?




Why isn't everything visible?
There are still some hurdles search engine crawlers cannot leap. Here are some
examples of material that remains hidden from general search engines:
    •   The Contents of Searchable Databases. When you search in a library
        catalog, article database, statistical database, etc., the results are generated
        "on the fly" in answer to your search. Because the crawler programs cannot
        type or think, they cannot enter passwords on a login screen or keywords in a
        search box. Thus, these databases must be searched separately.
            o   A special case: Google Scholar is part of the public or visible web. It
                contains citations to journal articles and other publications, with links
                to publishers or other sources where one can try to access the full text
                of the items. This is convenient, but results in Google Scholar are only
                a small fraction of all the scholarly publications that exist online. Much
                more - including most of the full text - is available through article
                databases that are part of the invisible web. The UC Berkeley Library
                subscribes to over 200 of these, accessible to our students, faculty,
                staff, and on-campus visitors through our Find Articles page.


    •   Excluded Pages. Search engine companies exclude some types of pages by
        policy, to avoid cluttering their databases with unwanted content.
            o   Dynamically generated pages of little value beyond single use.
                Think of the billions of possible web pages generated by searches for
                books in library catalogs, public-record databases, etc. Each of these
                is created in response to a specific need. Search engines do not want
                all these pages in their web databases, since they generally are not of
                broad interest.


            o   Pages deliberately excluded by their owners. A web page creator
                who does not want his/her page showing up in search engines can
                insert special "meta tags" that will not display on the screen, but will
                cause most search engines' crawlers to avoid the page.
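
The text above does not name the "meta tags" involved. A widely used convention is a robots meta tag whose content includes "noindex". The sketch below assumes that convention and checks whether a page carries such a tag; the class and function names, and the sample page, are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of any <meta name="robots" ...> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, nofollow".
            self.directives.update(part.strip().lower() for part in content.split(","))


def asks_not_to_be_indexed(html):
    """Return True if the page asks crawlers not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


page = '<html><head><meta name="robots" content="noindex, nofollow"></head><body>Hi</body></html>'
print(asks_not_to_be_indexed(page))
```

The tag is only a request: well-behaved crawlers honor it, but nothing in the page itself enforces it.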



How to Find the Invisible Web
Simply think "databases" and keep your eyes open. You can find searchable
databases containing invisible web pages in the course of routine searching in most
general web directories. Of particular value in academic research are:
    •   ipl2
    •   Infomine
Use Google and other search engines to locate searchable databases by searching a
subject term and the word "database". If the database uses the word database in its
own pages, you are likely to find it in Google. The word "database" is also useful in
searching a topic in the Google Directory or the Yahoo! directory, because they
sometimes use the term to describe searchable databases in their listings.
Examples:
plane crash database
languages database
toxic chemicals database
Remember that the Invisible Web exists. In addition to what you find in search
engine results (including Google Scholar) and most web directories, there are other
gold mines you have to search directly. This includes all of the licensed article,
magazine, reference, news archives, and other research resources that libraries and
some industries buy for those authorized to use them.
As part of your web search strategy, spend a little time looking for databases in your
field or topic of study or research. The contents of these may not be freely available:
libraries and corporations buy the rights for their authorized users to view the
contents. If they appear free, it's because you are somehow authorized to search and
read the contents (library card holder, company employee, etc.).

The Ambiguity Inherent in the Invisible Web:
It is very difficult to predict what sites or kinds of sites or portions of sites will or won't
be part of the Invisible Web. There are several factors involved:
               o   Which sites replicate some of their content in static pages (hybrid of
                   visible and invisible in some combination)?
               o   Which replicate it all (visible in search engines if you construct a
                   search matching terms in the page)?
               o   Which databases replicate none of their dynamically generated pages
                   in links and must be searched directly (totally invisible)?
               o   Search engines can change their policies on what they exclude and
                   include.

Want to learn more about the Invisible Web?
    •   The Wikipedia "Deep Web" article provides a fairly up-to-date summary, with
        links to other resources.



10 Search Engines to Explore the Invisible Web
by Saikat Basu March 14, 2010


The Invisible Web refers to the part of the WWW that’s not indexed by the search
engines. Most of us think that search powerhouses like Google and Bing are like
the Great Oracle… they see everything. Unfortunately, they can’t, because they aren’t
divine at all; they are just web spiders that index pages by following one hyperlink
after the other.
But there are some places where a spider cannot enter. Take library databases
which need a password for access. Or even pages that belong to private networks of
organizations. Web pages generated dynamically in response to a query are often
left un-indexed by search engine spiders.
Search engine technology has progressed by leaps and bounds. Today, we have
real time search and the capability to index Flash based and PDF content. Even
then, there remain large swathes of the web which a general search engine cannot
penetrate. The terms Deep Net, Deep Web and Invisible Web linger on.
To get a more precise idea of the nature of this ‘Dark Continent’ involving the
invisible and web search engines, read what Wikipedia has to say about the Deep
Web. The figures are attention grabbers: the size of the open web is 167 terabytes.
The Invisible Web is estimated at 91,000 terabytes. Check this out: the Library of
Congress, in 1997, was figured to have close to 3,000 terabytes!
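
To put the two size estimates quoted above side by side, a quick calculation helps. The figures are taken as given in the text; the variable names are only for illustration.

```python
open_web_tb = 167          # size of the open web quoted above, in terabytes
invisible_web_tb = 91_000  # estimated size of the Invisible Web quoted above

# How many times larger the Invisible Web estimate is than the open web figure.
ratio = invisible_web_tb / open_web_tb
print(f"The Invisible Web estimate is about {ratio:.0f} times the open web figure.")
```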

How do we get to this mother lode of information?
That’s what this post is all about. Let’s get to know a few resources which will be our
deep diving vessel for the Invisible Web. Some of these are invisible web search
engines with specifically indexed information.


Infomine




Infomine has been built by a pool of libraries in the United States. Among them are
the University of California, Wake Forest University, California State University, and the
University of Detroit. Infomine ‘mines’ information from databases, electronic
journals, electronic books, bulletin boards, mailing lists, online library card catalogs,
articles, directories of researchers, and many other resources.
You can search by subject category and further tweak your search using the search
options. Infomine is not only a standalone search engine for the Deep Web but also a
staging point for a lot of other reference information. Check out its Other Search
Tools and General Reference links at the bottom.


The WWW Virtual Library




This is considered to be the oldest catalog on the web and was started by
Tim Berners-Lee, the creator of the web. So, isn’t it strange that it finds a place in the
list of Invisible Web resources? Maybe, but the WWW Virtual Library lists quite a lot
of relevant resources on quite a lot of subjects. You can go vertically into the
categories or use the search bar.


Intute




Intute is UK centric, but it has some of the most esteemed universities of the region
providing the resources for study and research. You can browse by subject or do a
keyword search for academic topics ranging from agriculture to veterinary medicine. The
online service has subject specialists who review and index other websites that cater
to the topics for study and research.
Intute also provides over 60 free online tutorials for learning effective internet
research skills. The tutorials are step-by-step guides arranged around specific
subjects.


Complete Planet




Complete Planet calls itself the ‘front door to the Deep Web’. This free and well
designed directory resource makes it easy to access the mass of dynamic databases
that are cloaked from a general purpose search. The databases indexed by
Complete Planet number around 70,000 and range from Agriculture to Weather. Also
thrown in are databases like Food & Drink and Military.
For a really effective Deep Web search, try out the Advanced Search options where,
among other things, you can set a date range.
Infoplease




Infoplease is an information portal with a host of features. Using the site, you can tap
into a good number of encyclopedias, almanacs, an atlas, and biographies.
Infoplease also has a few nice offshoots like Factmonster.com for kids and Biosearch,
a search engine just for biographies.
DeepPeep




DeepPeep aims to enter the Invisible Web through forms that query databases and
web services for information. Typed queries open up dynamic but short lived results
which cannot be indexed by normal search engines. By indexing databases,
DeepPeep hopes to track 45,000 forms across 7 domains.
The domains covered by DeepPeep (Beta) are Auto, Airfare, Biology, Book, Hotel,
Job, and Rental. Being a beta service, there are occasional glitches as some results
don’t load in the browser.


IncyWincy




IncyWincy is an Invisible Web search engine and it behaves as a meta-search
engine by tapping into other search engines and filtering the results. It searches the
web, directory, forms, and images. With a free registration, you can track search
results with alerts.
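
The general idea behind a meta-search engine (collect results from several engines, then filter them) can be sketched as follows. This is not IncyWincy's actual method, which the text does not describe; the function name and the result lists are hypothetical.

```python
def merge_results(*result_lists):
    """Interleave ranked result lists from several engines, dropping duplicate URLs."""
    merged, seen = [], set()
    longest = max((len(results) for results in result_lists), default=0)
    for rank in range(longest):
        # Take the result at this rank from each engine in turn.
        for results in result_lists:
            if rank < len(results) and results[rank] not in seen:
                seen.add(results[rank])
                merged.append(results[rank])
    return merged


# Invented result lists standing in for two different search engines.
engine_a = ["http://example.org/a", "http://example.org/b", "http://example.org/c"]
engine_b = ["http://example.org/b", "http://example.org/d"]

print(merge_results(engine_a, engine_b))
```

Interleaving by rank keeps each engine's top results near the top of the merged list, while the duplicate check ensures a page returned by both engines appears only once.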


DeepWebTech




DeepWebTech gives you five search engines (and browser plugins) for specific
topics. The search engines cover science, medicine, and business. Using these topic
specific search engines, you can query the underlying databases in the Deep Web.




Scirus




Scirus has a pure scientific focus. It is a far reaching research engine that can scour
journals, scientists’ homepages, courseware, pre-print server material, patents and
institutional intranets.
TechXtra




TechXtra concentrates on engineering, mathematics and computing. It gives you
industry news, job announcements, technical reports, technical data, full text eprints,
teaching and learning resources along with articles and relevant website information.
Just like general web search, searching the Invisible Web is also about looking for
the needle in the haystack. Only here, the haystack is much bigger. The Invisible
Web is definitely not for the casual searcher. It is deep but not dark, because if you
know what you are searching for, enlightenment is a few keywords away.
The Invisible Web Databases



Which database might have       Turbo10                  Search user-selected deep
the information I need?                                  Web resources

                                Resource Discovery       Keyword search
                                Network

                                Complete Planet          Deep Web directory

                                Digital Librarian and    Uncover databases
                                Librarians Guide to
                                the Internet

News and magazines              Google News              Search 30 day news archive
                                                         (for US, UK, others)

                                AltaVista News           Includes New York Times

                                1st Headlines            Breaking news in categories
                                                         (US & World; Business;
                                                         Health; Lifestyles; Sports;
                                                         Technology; Weather)

                                New York Times           Full-text newspaper archive
                                Washington Post          search (14 or 30 day trials
                                Seattle Times            available)
                                San Francisco
                                Chronicle

                                HeadlineSpot             Search news directory by
                                                         media, region, subject,
                                                         opinion



Directory of Open      Search or browse by subject
                        Access Journals        for peer-reviewed, scientific
                        (DOAJ)                 and scholarly titles

                        HeadlineSpot:          Search magazine directory
                        Magazines              by subject

Public Radio webcasts   PublicRadioFan.com     Search database of program
                                               listings

History                 Guide to History on    Database of more than
                        the Web                5,000 US and world history
                                               sites

Biography               Galileo Project,       Individuals
                        Thomas A. Edison
                        Papers

                        Biography.com          25,000 people

                        Biographical           28,000 short identification
                        Dictionary             information

Countries               Nations Online         Alphabetical index to
                        Project, Thomas A.     government Web pages
                        Edison Papers

                        Portals to the World   From the Library of
                                               Congress

                        World Fact Book        From the CIA

                        Infonation             U.N. member nations

                        Country Profiles       From the BBC

Data                    Finding and Using
                        Statistical Data

Books (full text)       Online Books Page      Free e-books




Outstanding literature     Literature, Math and   CA Dept. of Ed.
                           Science Literature     recommended literature for
                                                  K-12

                           HAISLN                 Recommended reading lists

                           YALSA (ALA)            Outstanding Books for the
                                                  College Bound

Photographs                Digital Library Photos 80,000 images of California
                                                  and natural world

                           Time Life Pictures     Historical and current (Getty
                                                  Images)

Fine Arts                  National Gallery of    Search 17,000 images
                           Art                    (check "images only")

                           ImageBase              Search 85,000 images in the
                                                  Fine Arts Museums of SF

                           Artcyclopedia          Fine arts search engine

                           Contemporary Art       Search by medium and
                                                  theme

Cross-disciplinary         Literature, Arts and   Browse or search an annotated
                           Medicine Database      bibliography of prose, poetry,
                                                  film, video and art; a
                                                  comprehensive resource (adult
                                                  and young adult fiction) for
                                                  medical humanities

Education                  ERIC                   Education journals and other
                                                  resources; check "full-text"
                                                  and limit by publication type
                                                  in advanced search

K-12 curriculum projects   Blue Web'n             PacBell project

                           American Memory        Lessons using primary




Web Search 101

  • 1. Web Search 101 Finding Lesson Plans, Activities, Songs, Games, and Conducting Serious Academic Research MADE EASIER, FASTER AND MORE ACCURATE Developed By William Tweedie
  • 2. October 2011 & 2012 Table of Contents Preface....................................................................................................................... 4 Objectives .................................................................................................................. 5 Materials: ................................................................................................................... 5 Timing: ....................................................................................................................... 5 Procedure................................................................................................................... 6 Part 1 – The Surface Web, Search Engines and Directories...................................... 6 A. Activating Prior Knowledge..................................................................................... 6 B. Search Engine – An online (Internet) World Wide Web search program................7 D. Search Queries ..................................................................................................... 8 FRAMING YOUR SEARCH STRATEGY.................................................................... 8 ACTIVITY:.................................................................................................................. 9 E. Basic Boolean Search Operators (AND, OR, NOT).............................................. 10 F. Search Tips, Tricks and Techniques..................................................................... 10 G. Wrap-up of Part 1................................................................................................. 10 Part 2 – The Hidden Web......................................................................................... 10 The Internet, World Wide Web and the Hidden Web................................................ 11 Scratching the Surface and Digging Deep – Layers of the Web............................ 
12 Education.............................................................................................................. 14 Three Types of Search Engines .............................................................................. 18 Crawler-based search engines ............................................................................. 18 Human-powered directories ................................................................................. 19 Hybrid search engines ......................................................................................... 20 Table of Search Engine Features ......................................................................... 20 How do Search Engines Work?............................................................................ 22 Table of Directory Features................................................................................... 23 Subject Directories (Contain Databases), and Portals ......................................... 24 How to Find Subject-Focused Directories for a Specific Topic, Discipline, or Field .............................................................................................................................. 24 What Are "Meta-Search" Engines? How Do They Work? ..................................... 25 Are "Smarter" Meta-Searchers Still Smarter?....................................................... 25 Better Meta-Searchers.......................................................................................... 25 2
  • 3. Meta-Search Engines for SERIOUS Deep Digging .............................................. 26 Search Basics: Constructing a Google Query .......................................................... 26 Where does the term Boolean originate from?...................................................... 27 Is Boolean Search Complicated?.......................................................................... 27 Boolean Search And / Or / Not.............................................................................. 27 Boolean Search Examples Boolean Connectors:.................................................. 28 Interactive Text Equivalent.................................................................................... 28 How the Search Engines Differ............................................................................. 30 Search Engine Syntax & Features Comparison Chart ......................................... 30 Some Search Tips, Tricks, & Techniques ............................................................ 33 Invisible or Deep Web: What it is, How to find it, and its inherent ambiguity.........34 Why isn't everything visible?................................................................................. 34 How to Find the Invisible Web .............................................................................. 35 The Ambiguity Inherent in the Invisible Web: ....................................................... 35 Want to learn more about the Invisible Web?........................................................ 35 10 Search Engines to Explore the Invisible Web................................................... 36 How do we get to this mother lode of information?................................................ 36 The Invisible Web Databases................................................................................... 41 Dictionaries, Translators, & Other Language & Reference Tools ............................. 
44 Web directories ........................................................................................................ 48 Internet Gateways, Jumplists, & Specialized Link Collections................................... 48 Finding Jumplists & Gateways.............................................................................. 49 www.invisible-web.net.............................................................................................. 49 Saving pages with Microsoft Internet Explorer ..................................................... 50 Peer-to-Peer Computing ...................................................................................... 50 Education ............................................................................................................. 50 Subject-orientated search services....................................................................... 52 Additional information about search engines, their use, and how they find resources.............................................................................................................. 52 Data services requiring registration ...................................................................... 52 Data services with unrestricted access................................................................. 54 Search Engines .................................................................................................... 55 Subject-orientated search services....................................................................... 56 Dictionaries and Thesauri .................................................................................... 57 Reference Works ................................................................................................. 58 General Tips for Searching the Web......................................................................... 60 3
Carefully Select Your Search Terms .......... 60
Framing your search strategy .......... 60
International Educational Research Links .......... 62
Education databases .......... 64
Teaching websites .......... 64
Journals .......... 65
Newsletters .......... 65
New Educational Technology Standards for Teachers and Students .......... 65
NETS for Teachers 2008 .......... 65
NETS for Students 2007 .......... 67
Glossary .......... 69
A to Z Computer/Internet Terms .......... 69
Appendix A .......... 74

Preface
The Internet and its World Wide Web are growing, developing and adding new features at an explosive, exponential rate.
As you read this, new technologies are being developed and implemented to make ‘surfing’ the Internet for useful information of all types easier and more accurate, from the traditional document to Flash videos and file types that were previously inaccessible.
These types of pages used to be invisible but can now be found in most search engine results:
• Pages in non-HTML formats (pdf, Word, Excel, PowerPoint), now converted into HTML.
• Script-based pages, whose URLs contain a ? or other script coding.
• Pages generated dynamically by other types of database software (e.g., Active Server Pages, Cold Fusion). These can be indexed if there is a stable URL somewhere that search engine crawlers can find.

The "visible web" is what you can find using general web search engines. It's also what you see in almost all subject directories. The "invisible web" is what you cannot find using these types of tools. Search engines' crawlers and indexing programs have overcome many of the technical barriers that made it impossible for them to find "invisible" web pages.

Search engines use computer robot programs, sometimes referred to as "crawlers", "knowledge-bots" or "knowbots", to roam the World Wide Web via the Internet, visit sites and databases, and keep the search engine database of web pages up to date. They obtain new pages, update known pages, and delete obsolete ones. Their findings are then integrated into the "home" database. Most large search engines operate several robots all the time. Even so, the Web is so enormous that it can take six months for spiders to cover it, resulting in a certain degree of "out-of-datedness" (link rot) in all the search engines.
http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/Glossary.html

This reference book is therefore just a starting point for the serious researcher, whether in academia or as a consumer of goods and services.

Objectives
In this brief overview we will explore the elements that make for effective research on the Internet.
1. You will learn that the Internet is composed of the “Surface Web” and the “Deep” or “Hidden” Web.
2. You will learn how to access information on both in the most expedient way through Search Engines, Meta-search engines and other Internet tools.
a. You will learn what Search Engines are and the various types available.
b. You will learn what Subject Directories, Portals, and Databases are.
3. You will learn how to construct a search strategy.
4. You will learn the basics of Boolean parameters, which narrow search results.
5. You will be provided with special resources for academic research.

Materials:
This workshop needs to be conducted in a computer lab with very good Internet access. Participants will follow specific areas of this reference book throughout the workshop. These areas can be changed according to the needs of the group. This reference book is as comprehensive a guide as possible at the time of production.

Timing:
This workshop is designed to give a brief introduction to the complex world of the ‘Surface’ and ‘Hidden’ Webs with a focus on helping make searches more effective and productive. Normal time allotted is 2 hours but it can be extended according to time availability and the group’s level of expertise and interest. It is fully expected that participants will regularly refer to this book and refine their search skills independently.

DISCLAIMER: Changes on the Internet and in the Hidden Web occur at a rapid pace, so some of the search engines, sites, directories and databases may no longer be available at the web addresses provided and some may no longer exist. Be prepared to move quickly to the next point of interest. Broken links and inaccessible web-sites can be researched at a later date.

Procedure
It is preferable to distribute this reference book well in advance of the workshop so participants can familiarize themselves with the terms and content and explore a few of the sites.

Part 1 – The Surface Web, Search Engines and Directories
A. Activating Prior Knowledge
ACTIVITY: PRIME TASK: Q & A
1. The Surface Web (WWW) – What is it composed of?
Write as many types of information or components of the World Wide Web as you can.
Time: 10 minutes
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
2. How can you access this information? Write as many ways as you can.
Time: 10 minutes
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
3. How many Search Engines can you name? What is your favorite search engine? Do you use more than one? Write your answers.
Time: 5 minutes
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
4. How often do you use a search engine in a day? In a week? What do you search for? How long do you spend per search? Do you get the results you need or want? Write your answers.
Time: 5 minutes
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________
___________________________________________________________________

B. Search Engine – An online (Internet) World Wide Web search program.
1. There are 3 types of Search Engine:
a) Crawler-based (e.g. Google) – these create their listings automatically through special programs that "crawl" or "spider" the web. The crawlers follow the links in web pages the engine already has in its collection to find new pages. The key words from those pages are held on the engine's index servers; when you search, its doc servers retrieve the stored documents and create snippets that describe each document and contain the key words of the search query. – Very fast.
b) Human-powered Directories (e.g. Open Directory Project) – these get their information from submissions, which include a short description; that description is the source of any key words matched in a search. – Also fast.
c) Hybrid Engines – these combine results from the first two, though an engine may favor one type over the other. – Depends on the engine.

Search engines search their own ‘cache’ of web pages they have harvested, but when you click a result you are taken to the source’s latest page. If a page is never linked from any other page, spiders cannot find it and it cannot be indexed. The pages indexed are visible pages only. We’ll look at the Invisible Web in Part 2.

2. How many search engines do you think there are?
Overlap studies show that more than 80% of the pages in a major search engine's database exist only in that database, so it is worth taking a look at some of the others for a ‘second opinion’.

ACTIVITY: Choose a topic and search for it on Google or www.DuckDuckGo.com. Then do the same search on Exalead (www.exalead.com/search/). Compare the number of results and the sources of these results.
Time: 20 minutes

C. Meta-search Engines – combine the results of many search engines (www.dogpile.com, www.surfwax.com).
ACTIVITY: Use the same search term as in the previous activity and compare the results again.
Time: 10 minutes

D. Search Queries
FRAMING YOUR SEARCH STRATEGY
To get a successful search result, you must ask the right search question.
Framing a good question requires you to think strategically about exactly what you need. "By taking the time to identify key phrases and visualize the ideal answer, you will be more likely to recognize that answer when you find it online." (Nora Paul). Her guidelines are based on the standard journalistic approach of "who, what, when, where, why and how" reporting and include these tips, among others:
Who:
• Who is the research about: a politician, a businessperson, a scientist, a criminal?
• Who is key to the topic you are researching? Are there any recognized experts or spokespersons you should know about?
What:
• What kind of information do you need: statistics, sources, background?
• What kind of research are you doing: an analysis, a background report, a follow-up?
• What would the ideal answer look like?
When:
• When did the event being researched take place? This will help determine which source to use, in particular which information source has resources dating far enough back.
• Do you know when you should stop searching?
Where:
• Where did the event you are researching take place?
• Where have you already looked for information?
• Where might there have been previous coverage: newspapers, broadcasts, trade publications, court proceedings, discussions?
Why:
• Why do you need the research: seeking a source to interview, surveying a broad topic, pinpointing a fact?
• Why must you have the research: to make a decision, to corroborate a premise?
How:
• How much information do you need: a few good articles for background, everything in existence on the topic, just the specific fact?
• How are you going to use the information: for an anecdote, for publication?

"Today," Schlein says, "so much data is available that, without a plan, you can easily find yourself swimming in an ocean of information…A good, clear question will save you hours of work."
Find Paul's complete checklist and other good search suggestions from Schlein in Find It Online (Tempe, AZ: Facts on Demand Press, 2004).

ACTIVITY: Reframe the above criteria for research on an academic topic.
Time: 15 minutes
E. Basic Boolean Search Operators (AND, OR, NOT)
ACTIVITY: Complete the 4 activities on the “Boolify” worksheets.
Time: 30 minutes
See Appendix A

F. Search Tips, Tricks and Techniques
See page 25 below
Time: 5 minutes

G. Wrap-up of Part 1
Reflection and Feedback

Part 2 – The Hidden Web
Look at 10 Search Engines to Explore the Invisible Web on pages 28 – 33
Experiment and explore some of the Web Portals, Directories and Databases
A. List the Categories you find in each
B. Try Boolean searching for a specific topic you are currently researching for a paper or lesson
Time: 1 hour
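For participants who want to see what AND, OR and NOT do behind the scenes, the operators can be pictured as set operations on the lists of pages that contain each word. The following is a minimal Python sketch under that assumption. The four sample documents and the helper names (build_index, pages_with, boolean_and, boolean_or, boolean_not) are invented for illustration and are not taken from any real search engine.

```python
# Toy collection of "web pages"; the ids and texts are made up for illustration.
DOCS = {
    1: "lesson plans for teaching english grammar",
    2: "songs and games for young english learners",
    3: "grammar games for the classroom",
    4: "academic research on language teaching",
}


def build_index(docs):
    """Map every word to the set of page ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index


INDEX = build_index(DOCS)


def pages_with(word):
    """Return the set of page ids containing the word (empty set if none)."""
    return INDEX.get(word.lower(), set())


def boolean_and(a, b):
    """AND narrows a search: pages must appear in both sets."""
    return a & b


def boolean_or(a, b):
    """OR widens a search: pages may appear in either set."""
    return a | b


def boolean_not(a, b):
    """NOT excludes: pages in the first set that are not in the second."""
    return a - b


grammar_and_games = boolean_and(pages_with("grammar"), pages_with("games"))
songs_or_games = boolean_or(pages_with("songs"), pages_with("games"))
english_not_grammar = boolean_not(pages_with("english"), pages_with("grammar"))

print("grammar AND games  ->", sorted(grammar_and_games))
print("songs OR games     ->", sorted(songs_or_games))
print("english NOT grammar ->", sorted(english_not_grammar))
```

Notice that AND returns the fewest pages and OR the most, which is why AND and NOT are the operators to reach for when a search returns too many results.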
You may make notes below:

The Internet, World Wide Web and the Hidden Web
The Internet is a network of computers connected together ('External net') to share information with others by means of the World Wide Web (WWW).
The World Wide Web (WWW) is the part of the Internet where text and graphics are placed together to form a Web Page, along with links to different documents or other places (Hypertext or Hyperlinks), and where information can be easily accessed and shared with others.
- From the Glossary Section at the end of this reference book
The World Wide Web is also known as the ‘Surface Web’ – available to anyone who has a computer and an Internet connection.

Scratching the Surface and Digging Deep – Layers of the Web
"The Invisible Web"
By Chris Sherman

There's a big problem with most search engines, and it's one many people aren't even aware of. The problem is that vast expanses of the Web are completely invisible to general purpose search engines like AltaVista, HotBot and Google. Even worse, this "Invisible Web" is in all likelihood growing significantly faster than the visible Web you're familiar with.

So what is this Invisible Web and why aren't search engines indexing it? To answer this question, it's important to first define the "visible" Web, and describe how search engines compile their indexes.

The Web was created a little over twenty-two years ago by Tim Berners-Lee, a researcher at the European Organization for Nuclear Research (CERN), a high-energy physics laboratory in Switzerland. (The name CERN is derived from the acronym for the French Conseil Européen pour la Recherche Nucléaire.) Berners-Lee designed the Web to be platform-independent, so that researchers at CERN could share materials residing on any type of computer system, avoiding cumbersome and potentially costly conversion issues. To enable this cross-platform capability, Berners-Lee created HTML, or HyperText Markup Language - essentially a dramatically simplified version of SGML (Standard Generalized Markup Language).

HTML documents are simple: they consist of a "head" portion, with a title and perhaps some additional meta-data describing the document, and a "body" portion, the actual document itself. The simplicity of this format makes it easy for search engines to retrieve HTML documents, index every word on every page, and store them in huge databases that can be searched on demand.

What's less easy is the task of actually finding all the pages on the Web.
Search engines use automated programs called spiders or robots to "crawl" the Web and retrieve pages. Spiders function much like a hyper-caffeinated Web browser - they rely on links to take them from page to page.

Crawling is a resource-intensive operation. It also puts a certain amount of demand on the host computers being crawled. For these reasons, search engines will often limit the number of pages they retrieve and index from any given Web site. It's tempting to think that these unretrieved pages are part of the Invisible Web, but they aren't. They are visible and indexable, but the search engines have made a conscious decision not to index them.

In recent months, much has been made of these overlooked pages. Many of the major engines are making serious efforts to include them and make their indexes more comprehensive. Unfortunately, the engines have also discovered through their "deep crawls" that there's a tremendous amount of duplication and spam on the Web.

Current estimates put the Web at about 1.2 to 1.5 billion indexable pages. Both Inktomi and AltaVista have claimed that they've spidered most of these documents, but have been forced to cull their indexes to cope with duplicates and spam. Inktomi puts the size of the distilled Web at about 500 million pages; AltaVista at about 350 million.

But these numbers don't include Web pages that can't be indexed, or information that's available via the Web but isn't accessible by the search engines. This is the stuff of the Invisible Web.

Why can't some pages be indexed? The most basic reason is that there are no links pointing to a page that a search engine spider can follow. Or, a page may be made up of data types that search engines don't index - graphics, CGI scripts, Macromedia Flash or PDF files, for example.

But the biggest part of the Invisible Web is made up of information stored in databases. When an indexing spider comes across a database, it's as if it has run smack into the entrance of a massive library with securely bolted doors. Spiders can record the library's address, but can tell you nothing about the books, magazines or other documents it contains.

There are thousands - perhaps millions - of databases containing high-quality information that are accessible via the Web. But in order to search them, you typically must visit the Web site that provides an interface to the database. The advantage to this direct approach is that you can use search tools that were specifically designed to retrieve the best results from the database. The disadvantage is that you need to find the database in the first place, a task the search engines may or may not be able to help you with.

Another problem is that content in some databases isn't designed to be directly searchable. Instead, Web developers are taking advantage of database technology to offer customized content that's often assembled on the fly. Search engine results pages are an example of this type of dynamically generated content - so are services like My Excite and My Yahoo.
As Web sites get more complex and users demand more personalization, this trend toward dynamically generated content will accelerate, making it even harder for search engines to create comprehensive Web indexes.

In a nutshell, the Invisible Web is made up of content that search engines either can't or won't index. It's a huge part of the Web, and it's growing.

Fortunately, there are several reasonably thorough guides to the Invisible Web. Gary Price, Reference Librarian at the Gelman Library at George Washington University, is considered one of the foremost authorities on online databases and other invaluable search resources on the Invisible Web.
http://www.resourceshelf.com/

Price's List of Lists (LOL) was started around 1998 and maintained by Gary Price for many years. The LOL grew, and Gary's commitment to other projects and speaking engagements made the upkeep of the LOL impossible. In late 2000, Gary approached Trip Wyckoff, of Specialissues.com, about taking over the upkeep and expansion of the LOL. By 2002 the online database and the structure to maintain and organize the LOL were in place, and in October 2002 the LOL was transferred to www.Specialissues.com.

"By the way, do not mistake an interest in the Invisible Web as a slam on the general search engines because it is NOT," says Price. "General search tools are still 100% essential for accessing material on the Internet."
One of the largest gateways to the Invisible Web is the aptly named Invisibleweb.com <http://www.invisibleweb.com> from Intelliseek. "Invisible Web sources are critical because they provide users with specific, targeted information, not just static text or HTML pages," says Sundar Kadayam, CTO and Co-Founder, Intelliseek.

"InvisibleWeb.com is a Yahoo-like directory. It is a high quality, human edited and indexed, collection of highly targeted databases that contain specific answers to specific questions," says Kadayam.

Intelliseek also makes BullsEye, a desktop-based metasearch engine that can also access many of the sites included in InvisibleWeb.com. More information can be found at <http://www.intelliseek.com/prod/bullseye.htm>.

"A good librarian would not start looking for a phone number (specialized, Invisible Web info) by searching the Encyclopaedia Britannica (general knowledge resource)," says Price. "Both professional and casual searchers should at least be aware that they could be missing some information or wasting time finding what could be found more easily if the right tool for the job is easily accessible. This is very similar to a good reference librarian 'knowing' the major reference tools in his or her collection."

Chris Sherman is the Web Search Guide for About.com.
- Extracted from http://web.freepint.com/go/newsletter/64

Gary Price's List of Lists
Agriculture, Forestry, Fishing and Hunting, Petroleum & Mining, Utilities, Construction, Manufacturing, Wholesale Trade, Retail Trade, Transportation and Warehousing, Information, Finance & Insurance, Real Estate Rental & Leasing, Professional, Scientific, and Technical Services, Business & Industry Management, Administrative & Support Services, Education, Health Care and Social Assistance, Arts, Entertainment and Recreation, Accommodation and Food Services, Repairs, Religious, Civic, Professional, and Similar Organizations, Public Administration & Public Works, Country/Region Specific, Executives…
- extracted from http://www.specialissues.com/lol/

Education
Magazine | Article | Year
American School & University Magazine | Top 10 Issue (biggest, best and most popular in education facilities and business) | 2005
American School & University Magazine | Top 10 Issue (biggest, best and most popular in education facilities and business) | 2003
American School & University Magazine | Top 100 School Districts and Colleges Facilities (ranked by size of facilities) | 2003
American School & University Magazine | Top 10 Issue (biggest, best and most popular in education facilities and business) | 2004
American School & University Magazine | Top 100 School Districts and Colleges Facilities (ranked by size of facilities) | 2004
American School & University Magazine | Top 100 School Districts and Colleges Facilities (ranked by size of facilities) | 2002
American School & University Magazine | Top 10 Issue (biggest, best and most popular in education facilities construction, operations and management) | 2006
American School & University Magazine | Top 100 School Districts and Colleges Facilities (ranked by size of facilities) | 2006
Business Week (Global edition) (formerly North America edition) | Best Business Schools (ranking and review of the world's leading business schools) (1986) | 2002
Business Week (Global edition) (formerly North America edition) | Best Executive Education/Business Schools (ranking and review of the world's leading business schools) (1986) | 2005
Business Week (Global edition) (formerly North America edition) | Best Executive Education/Business Schools (ranking and review of the world's leading business schools) (1986) | 2004
Business Week (Global edition) (formerly North America edition) | Young Professionals: Best Undergrad B-Schools | 2007
Business Week (Global edition) (formerly North America edition) | Young Professionals: Best Undergrad B-Schools | 2008
Canadian Business | MBA Report (annual look at master of business administration education, we've decided to forgo our traditional ranking of Canada's MBA programs and instead examine the ever-increasing variety of choices Canadian schools are offering) (1991) | 2003
Chief Executive | Annual Best Business Schools for Executive Education (2004) | 2006
Chronicle of Higher Education, The | Almanac of Higher Education (statistical/demographic databook on education covering four major topical areas: students, faculty and staff, resources, and institutions) (separate issue) | 2002
Expansion Management | Metro With the Best Public Education Systems | 2005
Foodservice Director | College Census (2001 performance report for 100 top self-op colleges) | 2002
Foodservice Director | School Census (performance report for top 100 school districts) | 2002
Forbes | Best Business Schools (ranked by return on investment) (2001, biennial) | 2008
Forbes | Best Business Schools (ranked by return on investment) (2001, biennial) | 2007
Fortune | Top 50 MBA Employers | 2007
Fortune (International Version: Asia, Europe, Latin America) | 20 Great Employers for New Grads | 2007
Fortune Small Business: FSB | 10 Cool Colleges for Entrepreneurs | 2006
Fortune Small Business: FSB | Best Colleges for Entrepreneurs | 2007
Maclean's | Canada's Best Schools | 2004
Maclean's | Annual University Ranking (1990) | 2004
Meat & Poultry | Scholastic Top 10 (top 10 universities ranked by the quality and variety of workshops, conferences and short courses available at universities throughout the U.S.) (2000) | 2004
Meat & Poultry | Top 10 Universities (top 10 universities ranked by the quality and variety of workshops, conferences and short courses available at universities throughout the U.S.) (2000) | 2007
National Law Journal | NLJ Law Schools Report | 2008
Progress Magazine (CA) (formerly Atlantic Progress Magazine) | The High School Report Card (the AIMS Ranking of High School Performance in Every District in Atlantic Canada and Maine) (2002) | 2009
Quirk's Marketing Research Review | University Degree Programs in Marketing Research | 2008
School Bus Fleet | Statistics & Top Rankings | 2003
School Bus Fleet | Top 50 Contractor Fleets | 2002
School Bus Fleet | Top 100 School District Fleets | 2002
School Planning & Management | Leading the Way: America's Fastest Growing Districts | 2007
Technology Review (formerly MIT Technology Review) | University Research Scorecard (ranking and analysis of intellectual property and research revenues and spin-offs, includes profiles of hot start-ups) | 2002
U.S. News and World Report | Best Graduate Schools Guide | 2002
U.S. News and World Report | America's Best Colleges Guide | 2002
U.S. News and World Report | Colleges (1,400+ schools) | 2002
U.S. News and World Report | Community Colleges (1,200+ schools) | 2002
U.S. News and World Report | Corporate E-learning vendors (600+ providers) | 2002
U.S. News and World Report | E-learning courses and degrees (1,000+ institutions) | 2002
U.S. News and World Report | Graduate Schools (1,000+ programs) | 2002
U.S. News and World Report | Scholarships (600,000+ awards) | 2002
U.S. News and World Report | Best Graduate Schools | 2005
U.S. News and World Report | Best Colleges | 2004
Virginia Business | Special Report: Business Schools Directory | 2006
Virginia Business | Private Schools Directory | 2006
Virginia Business | Special Report: Community Colleges Directory | 2006
Virginia Business | Education: Engineering/IT Schools Directory | 2006

Three Types of Search Engines
The term "search engine" is often used generically to describe crawler-based search engines, human-powered directories, and hybrid search engines. These types of search engines gather their listings in different ways, through crawler-based searches, human-powered directories, and hybrid searches.

Crawler-based search engines
Crawler-based search engines, such as Google (http://www.google.com), create their listings automatically. They "crawl" or "spider" the web, then people search through what they have found. If web pages are changed, crawler-based search engines eventually find these changes, and that can affect how those pages are listed. Page titles, body copy and other elements all play a role. A typical web query normally lasts less than half a second, yet involves a number of different steps that must be completed before results can be delivered to a person seeking information. The following graphic (Figure 1) illustrates this life span (from http://www.google.com/corporate/tech.html):
1. The web server sends the query to the index servers. The content inside the index servers is similar to the index in the back of a book - it tells which pages contain the words that match the query.
2. The query travels to the doc servers, which actually retrieve the stored documents. Snippets are generated to describe each search result.
3. The search results are returned to the user in a fraction of a second.

Human-powered directories
A human-powered directory, such as the Open Directory Project (http://www.dmoz.org/about.html), depends on humans for its listings. (Yahoo!, which used to be a directory, now gets its information from the use of crawlers.) A directory gets its information from submissions, which include a short description to the directory for the entire site, or from editors who write one for sites they review. A search looks for matches only in the descriptions submitted. Changing web pages, therefore, has no effect on how they are listed. Techniques that are useful for improving a listing with a search engine have nothing to do with improving a listing in a directory. The only exception is that a good site, with good content, might be more likely to get reviewed for free than a poor site.
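The three-step life span of a query described above (index servers, then doc servers, then results with snippets) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the two stored pages, their addresses, and the function names build_index, make_snippet and search are invented here, and a real engine spreads this work over many machines and ranks its results.

```python
# Stand-in for the "doc servers": stored copies of pages (invented examples).
DOC_STORE = {
    "a.example/plans": "Free lesson plans for teachers. Download grammar "
                       "lesson plans and classroom games.",
    "b.example/songs": "Songs for young learners. Sing along with classroom "
                       "songs and chants.",
}


def build_index(store):
    """Stand-in for the "index servers": word -> set of page addresses."""
    index = {}
    for url, text in store.items():
        for word in text.lower().replace(".", " ").split():
            index.setdefault(word, set()).add(url)
    return index


def make_snippet(text, word, width=30):
    """Cut a short excerpt of the stored page around the first match."""
    pos = text.lower().find(word)
    start = max(0, pos - width)
    end = min(len(text), pos + len(word) + width)
    return text[start:end].strip()


def search(query, index, store):
    """Step 1: look words up in the index. Step 2: fetch docs, build snippets."""
    words = query.lower().split()
    matches = None
    for word in words:  # AND is assumed between words
        urls = index.get(word, set())
        matches = urls if matches is None else matches & urls
    results = []
    for url in sorted(matches or set()):
        results.append((url, make_snippet(store[url], words[0])))
    return results  # Step 3: results go back to the user


INDEX = build_index(DOC_STORE)
for url, snippet in search("grammar games", INDEX, DOC_STORE):
    print(url, "->", snippet)
```

The point of the sketch is that the query never touches the live web: both the word lookup and the snippet come from copies the engine stored earlier.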
  • 20. Hybrid search engines
Today, it is extremely common for crawler-type and human-powered results to be combined when conducting a search. Usually, a hybrid search engine will favor one type of listings over another. For example, MSN Search (http://www.imagine-msn.com/search/tour/moreprecise.aspx) is more likely to present human-powered listings from LookSmart (http://search.looksmart.com/). However, it also presents crawler-based results, especially for more obscure queries.
Recommended Search Engines
UC Berkeley - Teaching Library Internet Workshops
Google is currently the most used search engine. It has one of the largest databases of Web pages, including many other types of web documents (blog posts, wiki pages, group discussion threads) and document formats (e.g., PDFs, Word or Excel documents, PowerPoints). Despite the presence of all these formats, Google's popularity ranking often places worthwhile pages near the top of search results.
Google alone is not always sufficient, however. Not everything on the Web is fully searchable in Google. Overlap studies show that more than 80% of the pages in a major search engine's database exist only in that database. For this reason, getting a "second opinion" can be worth your time. For this purpose, we recommend Yahoo! Search or Exalead. We do not recommend using meta-search engines as your primary search tool.
Table of Search Engine Features
Some common techniques will work in any search engine. However, in this very competitive industry, search engines also strive to offer unique features. When in doubt, look for "help", "FAQ", or "about" links.
Search engines compared: Google (www.google.com), Yahoo! Search (search.yahoo.com), Exalead (www.exalead.com/search/)
Links to help
  • Google: Google help
  • Yahoo! Search: Yahoo! help
  • Exalead: Exalead help and FAQ
Size, type
  • Google: IMMENSE. Size not disclosed in any way that allows comparison. Probably the biggest.
  • Yahoo! Search: HUGE. Claims over 20 billion total "web objects."
  • Exalead: LARGE. Claims to have over 8 billion searchable pages.
Noteworthy features
  • Google: PageRank™ system includes hundreds of factors, emphasizing pages most heavily linked from other pages. Many additional databases including Book Search, Scholar (journal articles), Blog Search, Patents, Images, etc.
  • Yahoo! Search: Shortcuts give quick access to dictionary, synonyms, patents, traffic, stocks, encyclopedia, and more.
  • Exalead: Truncation lets you search by the first few letters of a word. Proximity search lets you find terms NEAR each other or NEXT to each other. Thumbnail page previews. Extensive options for refining and limiting your search.
Phrase searching
  • All three: Enclose phrase in "double quotes".
Boolean logic
  • Google: Partial. AND assumed between words. Capitalize OR. ( ) accepted but not required. In Advanced Search, partial Boolean available in boxes.
  • Yahoo! Search: Accepts AND, OR, NOT or AND NOT. Must be capitalized. ( ) accepted but not required.
  • Exalead: Partial. AND assumed between words. Capitalize OR. ( ) accepted. See Web Search Syntax for more options.
+Requires / -Excludes
  • Google: - excludes. + retrieves "stop words" (e.g., +in).
  • Yahoo! Search: - excludes. + will allow you to search common words: "+in truth".
  • Exalead: - excludes. + retrieves "stop words" (e.g., +in).
Sub-Searching
  • All three: The search box at the top of the results page shows your current search. Modify this (e.g., add more terms at the end).
Results Ranking
  • Google: Based on page popularity measured in links to it from other pages: high rank if a lot of other pages link to it. Fuzzy AND also invoked. Matching and ranking based on "cached" version of pages that may not be the most recent version.
  • Yahoo! Search: Automatic Fuzzy AND.
  • Exalead: Popularity ranking emphasizes pages most heavily linked from other pages.
Field limiting
  • Google: link: site: intitle: inurl: Offers U.S. Gov't Search and other special searches. Patent search.
  • Yahoo! Search: link: site: intitle: inurl: url: hostname: (Explanation of these distinctions.)
  • Exalead: intitle: inurl: site: after:[time period] before:[time period] (For details, click on "Advanced search".)
Truncation, Stemming
  • Google: No truncation within words. Automatically stems some words. Search variant endings and synonyms separately, separating with OR (capitalized): airline OR airlines. Use * or _ as wildcards substituting for initials or words: sickle * anemia, george _ bush.
  • Yahoo! Search: Neither. Search with OR as in Google.
  • Exalead: Use * (example: messag*).
Language
  • Google: Yes. Major Romanized and non-Romanized languages in Advanced Search.
  • Yahoo! Search: Yes. Major Romanized and non-Romanized languages.
  • Exalead: Extensive language and geographic options. Use "Advanced Search".
Translation
  • Google: Yes, in "Translate this page" link following some pages. To and sometimes from English and major European languages and Chinese, Japanese, Korean. Uses its own translation software with user feedback.
  • Yahoo! Search: Available as a separate service.
  • Exalead: Yes, in "Translate this page" link following some pages.
How do Search Engines Work?
Search engines do not really search the World Wide Web directly. Each one searches a database of web pages that it has harvested and cached. When you use a search engine, you are always searching a somewhat stale copy of the real web page. When you click on links provided in a search engine's search results, you retrieve the current version of the page.
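The point that a search engine searches its own harvested copy, not the live Web, can be illustrated with a very small sketch. Everything here is an assumption made for illustration: the two example.org pages, their text, and the `build_index` and `search` helpers are invented, and a real engine's index is far more elaborate.

```python
# Hypothetical "live web": URL -> the text currently on that page.
live_web = {
    "example.org/a": "cheap flights to vancouver",
    "example.org/b": "vancouver florists and gardens",
}

def build_index(pages):
    """Snapshot the pages: map each word to the URLs whose copy contains it."""
    index = {}
    for url, text in pages.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)
    return index

def search(index, word):
    """Look the word up in the stored index; the live web is never consulted."""
    return sorted(index.get(word.lower(), set()))

# The engine harvests and indexes the pages once.
index = build_index(live_web)

# Later the live page changes, but the index has not been rebuilt yet.
live_web["example.org/b"] = "vancouver bakeries"

print(search(index, "florists"))                # ['example.org/b'] (stale match)
print("florists" in live_web["example.org/b"])  # False: the current page no longer says it
```

The search still returns example.org/b because it matches the stored copy; only clicking through would reveal that the current page has changed.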
  • 23. Search engine databases are selected and built by computer robot programs called spiders. These "crawl" the web, finding pages for potential inclusion by following the links in the pages they already have in their database. They cannot use imagination or enter terms in search boxes that they find on the web.
If a web page is never linked from any other page, search engine spiders cannot find it. The only way a brand new page can get into a search engine is for other pages to link to it, or for a human to submit its URL for inclusion. All major search engines offer ways to do this.
After spiders find pages, they pass them on to another computer program for "indexing." This program identifies the text, links, and other content in the page and stores it in the search engine database's files so that the database can be searched by keyword and whatever more advanced approaches are offered, and the page will be found if your search matches its content.
Many web pages are excluded from most search engines by policy. The contents of most of the searchable databases mounted on the web, such as library catalogs and article databases, are excluded because search engine spiders cannot access them. All this material is referred to as the "Invisible Web" -- what you don't see in search engine results.
Recommended Subject Directories
UC Berkeley - Teaching Library Internet Workshops - extracted from http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/SubjDirectories.html
Recommended General Subject Directories: Table of Directory Features
Web directories compared: ipl2 (www.ipl.org), Infomine (infomine.ucr.edu), About.com (www.about.com), Yahoo! (dir.yahoo.com)
Size, type
  • ipl2: Over 40,000. Highest quality sites only. Useful, reliable annotations. Formed by a merger of the Librarians' Internet Index and the Internet Public Library.
  • Infomine: Over 125,000. Useful, reliable annotations. Compiled by academic librarians from the University of California and elsewhere.
  • About.com: Over 2 million. Generally good annotations done by "Guides" with various levels of expertise.
  • Yahoo!: About 4 million. Very short descriptions. Often useful, especially for popular and commercial topics.
Phrase searching
  • ipl2: No.
  • Infomine: Yes. Use " ". |term term| requires exact match.
  • About.com: Yes. Use " ".
  • Yahoo!: Yes. Use " ".
Boolean logic
  • ipl2: OR implied between words. Also accepts AND and NOT. Nesting with ( ) does not work.
  • Infomine: AND implied between words. Also accepts OR, NOT, and ( ).
  • About.com: No.
  • Yahoo!: Yes, as in Yahoo! Search web search engine.
Truncation
  • ipl2: No.
  • Infomine: Use *. Also stems. Can turn stemming off. Use " " or | | to search exact terms.
  • About.com: Use *. Not accepted consistently.
  • Yahoo!: No.
Field searching
  • ipl2: No.
  • Infomine: Limit to Author, Title, Subject, Keyword, Description, and more.
  • About.com: No.
  • Yahoo!: As in Yahoo! Search web search engine.
Subject Directories (Contain Databases), and Portals
How to Find Subject-Focused Directories for a Specific Topic, Discipline, or Field
There are thousands of specialized directories on practically every subject. If you want an overview, or if you feel you've searched long enough, try to find one. Often they are done by experts -- self-proclaimed or heavily credentialed. Here are some ways to find them:
Use any of the Subject Directories above to find more specific directories. Here are some tips:
• In ipl2 or Infomine, look for your subject as you would for any other purpose, and keep your eyes open for sites that look like directories. Read through the descriptions. Sometimes these resources are identified as "Directories," "Virtual Libraries," or "Gateway Pages."
• In About.com (a portal, that is, a site that links to many other sites organized according to its own structure or directory) or the Yahoo! directory, try adding the terms web directories to your subject keyword term:
EXAMPLES:
civil war web directories
weddings web directories
• In About.com, search by topic and look for pages that are described as "101" or "guides" or a "directory." About.com is written by "Guides" who, themselves, often are experts in the sections they manage. Sometimes they write excellent overviews of a topic.
  • 25. Meta-Search Engines
UC Berkeley - Teaching Library Internet Workshops
What Are "Meta-Search" Engines? How Do They Work?
In a meta-search engine, you submit keywords in its search box, and it transmits your search simultaneously to several individual search engines and their databases of web pages. Within a few seconds, you get back results from all the search engines queried. Meta-search engines do not own a database of Web pages; they send your search terms to the databases maintained by search engine companies.
Are "Smarter" Meta-Searchers Still Smarter?
"Smarter" meta-searcher technology includes clustering and linguistic analysis that attempts to show you themes within results, and some fancy textual analysis and display that can help you dig deeply into a set of results. However, neither of these technologies is any better than the quality of the search engine databases they obtain results from.
Few meta-searchers allow you to delve into the largest, most useful search engine databases. They tend to return results from smaller and/or free search engines and miscellaneous free directories, often small and highly commercial.
Although we respect the potential of textual analysis and clustering technologies, we recommend directly searching individual search engines to get the most precise results, and using meta-searchers if you want to explore more broadly. The meta-search tools listed here are "use at your own risk." We are not endorsing or recommending them.
Better Meta-Searchers
(What's Searched is as of the date at the bottom of the page. They change often.)
Yippy, yippy.com (formerly Clusty)
  • What's Searched: Searches Bing, Ask, Open Directory, and Yahoo (as of 6/15/10).
  • Complex Search Ability: Accepts Boolean operators AND, OR, NOT, and limiting by "filetype:" and "site:".
  • Results Display: Results accompanied with subdivisions based on words in search results, intended to give the major themes. Click on these to search within results on each theme.
Dogpile, www.dogpile.com
  • What's Searched: Searches Google, Yahoo, Bing, and Ask.com (as of 6/15/10). Sites that have purchased ranking and inclusion are mixed into the results. Watch for "Sponsored:".
Meta-Search Engines for SERIOUS Deep Digging
SurfWax, www.surfwax.com
  • What's Searched: A better than average set of search engines. Can mix with educational, US Govt tools, and news sources, or many other categories.
  • Complex Search Ability: Accepts " ", +/-. Default is AND between words. I recommend fairly simple searches, allowing SurfWax's SiteSnaps and other features to help you dig deeply into results.
  • Results Display: Click on source link to view complete search results there. Click on the icon to view helpful "SiteSnap™" extracted from most sites in frame on right. Many additional features for probing within a site.
Copernic Agent, www.copernic.com
  • What's Searched: Select from list of search engines by clicking on Advanced, then "Modify search engine settings".
  • Complex Search Ability: ALL, ANY, Phrase, and more. Also Boolean searching within results under "Find in results" > "Advanced Find" (powerful!).
  • Results Display: Must be downloaded and installed, but Basic version is free of charge. Table comparing versions.
Search Basics: Constructing a Google Query
Search engines work by providing you with a screen form containing one or more fields into which you type your search term (a combination of words and/or phrases). Single words are quick and easy, but produce much too general a result. With Google, for example, looking for florists yields 24 million hits (search results). If we narrow the search to florists in Vancouver (i.e. type florists Vancouver), we come up with 1.7 million results.
Narrow further by making your search term a phrase. To do this, enclose the words in double quotation marks, as in "Vancouver florists". In Google, this example produces just 27,000 hits, because Google is making a match for the exact string of characters we typed.
Some search engines provide radio buttons that allow you to specify whether the search must match Any or All of the terms you type. Most default to All, returning pages that contain every word used in your search.
Choose Any to retrieve pages that contain one or more of your search words. This AND versus OR distinction is called Boolean logic, and it's the key to controlling the search engines. To specify an OR in Google, you must type the word OR between words. In our Vancouver florists scenario, for example, typing florists OR vancouver results in 85 million hits because it returns all pages containing either the word florists or the word Vancouver. 26
  • 27. Thus, you might get florists in Hungary and welders in Vancouver! By combining ANDs, ORs, and phrases, you can begin to build truly powerful queries. Learn these techniques and many more powerful search strategies in our popular Internet research course.
Where does the term Boolean come from?
Boolean searching is built on a method of symbolic logic developed by George Boole, a 19th century English mathematician. Most online databases and search engines support Boolean searches. Boolean search techniques can be used to carry out effective searches, cutting out many unrelated documents.
Is Boolean Search Complicated?
Using Boolean logic to broaden and/or narrow your search is not as complicated as it sounds; in fact, you might already be doing it. Boolean logic is just the term used to describe certain logical operations that are used to combine search terms in many search engine databases and directories on the Net. It's not rocket science, but it sure sounds fancy (try throwing this phrase out in common conversation!).
Basic Boolean Search Operators - AND
Using AND narrows a search by combining terms; it will retrieve documents that use both the search terms you specify, as in this example:
• Portland AND Oregon
Basic Boolean Search Operators - OR
Using OR broadens a search to include results that contain either of the words you type in. OR is a good tool to use when there are several common spellings or synonyms of a word, as in this example:
• liberal OR democrat
Basic Boolean Search Operators - NOT
Using NOT will narrow a search by excluding certain search terms. NOT retrieves documents that contain the first search term but not the second, as in this example:
• Oregon NOT travel
Keep in mind that not all search engines and directories support Boolean terms. However, most do, and you can easily find out whether the one you want to use supports this technique by consulting the FAQ (Frequently Asked Questions) on the search engine's or directory's home page.
Boolean Search: And / Or / Not
This is an algebraic concept, but don't let that scare you away. Boolean connectors are all about sets. Three little words are used as Boolean connectors:
• and
• or
• not
  • 28. Think of each keyword as having a "set" of results that are connected with it. These sets can be combined to produce a different "set" of results. You can also exclude certain "sets" from your results by using a Boolean connector.
AND is a connector that requires both words to be present in each record in the results. Use AND to narrow your search.
Search Term                   Hits
Television                    999
Violence                      876
Television AND violence       123
The words 'television' and 'violence' will both be present in each record.
OR is a connector that allows either word to be present in each record in the results. Use OR to expand your search.
Search Term                   Hits
Adolescents                   97
Teenagers                     75
Adolescents OR teenagers      172
Either 'adolescents' or 'teenagers' (or both) will be present in each record.
NOT is a connector that requires the first word to be present in each record in the results, but only if the record does not contain the second word.
Search Term                   Hits
High school                   423
Elementary                    652
High school NOT elementary    275
Each record contains the words 'high school', but not the word 'elementary'.
Boolean Search Examples
Boolean Connectors: Interactive Text Equivalent
This Boolean demonstration provides a simple example of how Boolean connectors can help focus your search as precisely as possible.
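The set view of AND, OR and NOT described above can be sketched in a few lines of Python. This is only an illustration: the record numbers and the tiny term index are made up, not taken from any real database. The last lines also show a counting rule worth keeping in mind: an OR count equals the plain sum of the two individual counts (as in 97 + 75 = 172) only when no record contains both terms.

```python
# Hypothetical mini-index: each search term maps to the set of record IDs containing it.
index = {
    "television": {1, 2, 3, 4, 5},
    "violence": {4, 5, 6, 7},
    "teenagers": {2, 8},
    "adolescents": {5, 9, 10},
}

def search_and(a, b):
    """AND narrows: records containing both terms (set intersection)."""
    return index[a] & index[b]

def search_or(a, b):
    """OR broadens: records containing either term, or both (set union)."""
    return index[a] | index[b]

def search_not(a, b):
    """NOT excludes: records containing a but not b (set difference)."""
    return index[a] - index[b]

print(sorted(search_and("television", "violence")))   # [4, 5]
print(sorted(search_or("teenagers", "adolescents")))  # [2, 5, 8, 9, 10]
print(sorted(search_not("television", "violence")))   # [1, 2, 3]

# Counting rule: |A OR B| = |A| + |B| - |A AND B|
a, b = index["television"], index["violence"]
print(len(a | b) == len(a) + len(b) - len(a & b))     # True
```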
  • 29. THE SCENARIO
Your research topic: television violence
You do a separate search for each keyword and get back the following results:
Television = 999
Violence = 876
That's a lot to wade through. Select 'AND,' 'OR,' or 'NOT' to see how that Boolean connector will affect this search.
AND
You use 'AND' to connect terms or phrases. We have two words, 'television' and 'violence.' To connect them we use the Boolean connector 'AND'. Compare the results of the search options below:
SEARCH #1: television
Result: A circle balloons until it fills about half the play area. As it gets bigger we see the word 'television' appear. When it has finished generating, the results show up: '= 999 results'.
SEARCH #2: violence
Result: A circle balloons until it fills about half the play area. As it gets bigger we see the word 'violence' appear. When it has finished generating, the results show up: '= 876 results'.
SEARCH #3: television AND violence
Result: The two circles balloon until they fill the play area as in those above. As they get bigger we see the words 'television' and 'violence' appear. When they have finished generating, the results show up as above; in addition, the area where the two circles overlap is a different color and reads as follows: AND = 123 results
OR
You use 'OR' to search for multiple terms or phrases. You've decided to focus on how violence on television affects a specific age group, namely teenagers. But in your searches you've encountered another term that's frequently used: 'adolescents.' So, in order to get information that uses either term, you'd use the OR connector.
SEARCH: teenager OR adolescent
Result: Both circles balloon until they fill the play area as above. As they get bigger we see the words 'teenager' and 'adolescent' appear. When they have finished generating, the results show up as above. Next, 'OR' appears between them, and the two circles move towards one another. The text 'teenager = 75 results' and 'adolescent = 97 results' stays where it is. As the circles merge (and change to a new color), the 'OR' disappears behind them. When the merging has finished, the following text appears in the middle of the new circle:
Teenager OR Adolescent 75 + 97 = 172 results
The 'teenager = 75 results' and 'adolescent = 97 results' labels should now be outside the circle to the left and right.
NOT
You use 'NOT' to exclude terms or phrases. In one of your searches you use "high school" as a keyword phrase. You notice that you get many results which cover both high school and elementary school. The main emphasis of your research, as you've followed the process, has turned towards how television violence affects students in high school. So, in order to eliminate unwanted results, you use the NOT connector.
SEARCH: high school
The circle to the left balloons. As it gets bigger we see the words 'high school' appear. When it has finished generating, the results show up as follows: High school = 423 results.
SEARCH: elementary
The circle to the right balloons. As it gets bigger we see the word 'elementary' appear. When it has finished generating, the results show up as follows: Elementary = 652 results.
SEARCH: high school NOT elementary
Both circles balloon until they fill the play area as above. When they have finished generating, the results appear as above, but where the circles overlap it reads: NOT = 148 exclusions. Next, the 'elementary' circle and the NOT overlap move away from the 'high school' circle. The NOT area is like a bite taken out of the 'high school' circle. When the elementary circle and the NOT bite stop, the results in the high school circle change to:
High school NOT elementary 423 - 148 exclusions = 275
By excluding all records that mention 'high school' together with 'elementary', you get 275 results that mention high school only.
How the Search Engines Differ
The Web puts a variety of powerful search engines at your disposal, including Altavista, Google, All The Web, Teoma, Wisenut, and many more. Which is best? These tools vary in ease of use, not to mention features. Your choice of search engine should be driven by the research challenge you face.
Some search engines are better than others for particular purposes. See below for brief descriptions of today's major players, their respective strengths and weaknesses, and their affiliations:
Search Engine Syntax & Features Comparison Chart
An understanding of the syntax differences among search engines is essential to mastery of these tools and the ability to force them to return the precise results you want. Many of these sites appear to operate similarly, at least on the surface. Yet they can differ substantially in how they understand queries and allow you to filter results, as well as how they rank the hits returned. Consult our search basics page for information on syntax and operators, then experiment with the search engines in the chart provided.
To click through to the various search engines, use the HTML chart below. We have also provided a PDF version of the chart for printing.
Altavista
  • Boolean: + - ( ); AND, OR, AND NOT, NEAR ( ) (Simple Srch)
  • Default: Phrase, then AND
  • Phrase: ""
  • Wildcards: Yes. * 1-5 characters, must type first 3 characters
  • Case sensitive: No
  • Prefixes: anchor, applet, domain, host, image, like, link, text, title, url
  • Family filter: Yes. Password protected.
Google
  • Boolean: OR; -; + to include stop words
  • Default: AND
  • Phrase: ""
  • Wildcards: Whole word wildcard (*)
  • Case sensitive: No
  • Prefixes: filetype, daterange, cache, link, related, info, spell, stocks, site, intitle, allintitle, inurl, allinurl
  • Family filter: Yes
All The Web
  • Boolean: AND, OR, ANDNOT, ( ), +, -; ( ) means OR
  • Default: AND
  • Phrase: ""
  • Wildcards: No
  • Case sensitive: No
  • Prefixes: site, url, link, title, language, filesize, filetype
  • Family filter: Yes
Wisenut
  • Boolean: +, -
  • Default: AND
  • Phrase: ""
  • Wildcards: No
  • Case sensitive: No
  • Prefixes: language
  • Family filter: Yes
Teoma
  • Boolean: -, OR; + to include stop words
  • Default: AND
  • Phrase: ""
  • Wildcards: No
  • Case sensitive: No
  • Prefixes: intitle, inurl, site, inlink, lang, afterdate, beforedate, between date
  • Family filter: No
  • 32. Google: Google is the world's most popular search engine. Claiming to search 3.3 billion pages (that's practically the entire Web!), this search engine remains undisputed king in terms of size. Google produces highly relevant results, using link popularity for ranking. Google's original claim to fame was its speed, although its clean, uncluttered interface has also won fans. Google defaults to AND when processing queries containing two or more words (returning pages that match all words specified). If you want either word (as in alternate spellings of color), you must actually force Google to see your search this way, by specifying the Boolean OR operator, as in color OR colour. Google supports exact phrase searching plus the ability to exclude words (use the minus sign) and to constrain by domain and other criteria. Alliances: Google has taken over the Deja newsgroup archive. It powers hundreds of other search engines and the web search feature of directories like Yahoo. Google's Web directory is provided by DMOZ. Altavista: Still the champ in terms of raw search power, Altavista was recently purchased by Overture, the Net's major pay-per-click search company. Altavista's index is respectable, at 1 billion pages. It defaults to OR, ordering search results according to number, location and proximity of search term occurrences. Use Altavista when you need to construct complex queries containing nested combinations of AND and OR. Altavista supports the quasi-Boolean operators (+, -) and the formal Boolean operators (AND, OR, AND NOT, NEAR). This search engine allows you to constrain your search by domain, location within page, date, and numerous other criteria. Drawbacks include notoriously buggy hit counts and an interface that could stand some usability improvements. Alliances: Altavista, too, powers hundreds of other sites. Its web directory is provided by DMOZ. 
All The Web: At first glance, All The Web looks much like Google, providing the clean look and user-friendliness of the industry leader. All The Web defaults to AND, with a convenient tick box that allows you to specify a phrase. Its index rivals Google's, at 3.2 billion documents. It does not recognize formal Boolean arguments, although it supports quasi-Boolean operators (+, -) and the ability to constrain by domain, location within page, and several other criteria. Alliances: All The Web was also recently taken over by Overture. Wisenut: Known for its clean screen and speedy performance, Wisenut set out to rival Google. A "clustering" search engine, Wisenut groups results into categories it calls "WiseGuide." Small plus and minus signs allow you to collapse and expand these categories. Like Google, Altavista, and other major players, Wisenut is a spider-based search engine that crawls, links and indexes page contents. Wisenut claims to have an index of 1.5 billion pages. Wisenut defaults to AND, and supports phrase searching and the + and - operators, though it offers no advanced search features as yet. Alliances: Wisenut is owned by Looksmart. Teoma: Like Wisenut, Teoma set out to emulate Google's clean screen and fast performance. It too defaults to AND. Teoma's index is a respectable 1.5 billion pages. Like Google, Teoma evaluates page popularity, using complex relevance and link popularity algorithms to rank results. Teoma clusters search results at the top of the screen and displays a list of what it calls "Expert Link Collections" at bottom right. These listings point to sites Teoma considers authoritative link collections relevant to the subject of your search. Sometimes called jumplists, link collections can be among the Web's hidden treasures. Teoma is one of the few search engines to identify 32
  • 33. them. This feature alone makes it a valuable addition to your bookmark list. Alliances: Teoma was acquired by Ask Jeeves in 2001. Site contents Copyright © 1994-2005 Pam Blackstone. All rights reserved. Some Search Tips, Tricks, & Techniques There's more to search success than simply typing a few words into a search engine. Here are a few points to keep in mind for your next search. • Choose the right tool for the job. It's not all about search engines! Choosing the appropriate research tool is half the battle. Know when to use a specialized resource such as telephone directory , a regional directory, or a reference work like those you'd find at the Library. • Familiarize yourself with search engine syntax. The search engines all differ in the rules they apply when processing your query. Did you know, for example, that Google limits queries to ten words? If you type more than ten words, Google simply truncates your query, dropping excess words off the end. That's one good reason to plan your search strategy carefully! Check search engine sites for a link labelled Help or Search Tips for syntax information, and see our search basics page and feature comparison chart for more on this important success factor. • Think outside the box when specifying your search term. It's very much a trial and error process. Think about how the information you're after might be indexed. If you did not get results with one word, try a synonym. If, for example, you're seeking information about sailing, you might want to try both the words sailing and yachting. If a word has alternate spellings, specify it both ways (colour and color, for example). • Understand results ranking. Search engines use complicated formulas to order results. Most search engines evaluate web documents against your keywords, ordering results by relevance. They do this by assigning a numeric score to each hit, based on how closely it matches the specified term. 
They all use different criteria for arriving at this score. Some search engines also factor popularity with users into how they order results, and they measure this in different ways as well. Be aware that advertising may also influence results ranking. • Take advantage of collective human experience. Know when to tap into archived discussions. Look on the Web for facts; ask in discussion groups for opinions. Turn to newsgroups, mailing lists, or web forums for solutions to problems or for answers to obscure or esoteric questions. Google maintains a handy searchable archive of online discussions. Chances are, someone's already answered your question! • Let someone else do the work. Sometimes, the fastest way to the information you're after is to locate a jumplist. Specialized collections of links on one subject or theme, jumplists are the hidden treasure of the Web. To find them, try adding words like links, resources, collection, or list to your search term. Yahoo can be useful for finding jumplists, which you can locate by selecting "Web Directories" from many of its menus and sub-menus. The 33
  • 34. Teoma search engine is also useful in locating jumplists, which it calls "expert link collections." • Sign up for our popular Internet research course to find out more. Among the many topics covered, you'll learn some little-known but potent Google techniques for ferreting out the Net's most stubbornly elusive information! Finding Information on the Internet: A Tutorial http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/InvisibleWeb.html Invisible or Deep Web: What it is, How to find it, and its inherent ambiguity What is the "Invisible Web", a.k.a. the "Deep Web"? Why isn't everything visible? There are still some hurdles search engine crawlers cannot leap. Here are some examples of material that remains hidden from general search engines: • The Contents of Searchable Databases. When you search in a library catalog, article database, statistical database, etc., the results are generated "on the fly" in answer to your search. Because the crawler programs cannot type or think, they cannot enter passwords on a login screen or keywords in a search box. Thus, these databases must be searched separately. o A special case: Google Scholar is part of the public or visible web. It contains citations to journal articles and other publications, with links to publishers or other sources where one can try to access the full text of the items. This is convenient, but results in Google Scholar are only a small fraction of all the scholarly publications that exist online. Much more - including most of the full text - is available through article databases that are part of the invisible web. The UC Berkeley Library subscribes to over 200 of these, accessible to our students, faculty, staff, and on-campus visitors through our Find Articles page. • Excluded Pages. Search engine companies exclude some types of pages by policy, to avoid cluttering their databases with unwanted content. o Dynamically generated pages of little value beyond single use. 
Think of the billions of possible web pages generated by searches for books in library catalogs, public-record databases, etc. Each of these is created in response to a specific need. Search engines do not want all these pages in their web databases, since they generally are not of broad interest.
o Pages deliberately excluded by their owners. A web page creator who does not want his/her page showing up in search engines can insert special "meta tags" that will not display on the screen, but will cause most search engines' crawlers to avoid the page.
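The "meta tags" mentioned above are usually a robots meta tag placed in the head section of the page. The following is a minimal sketch, assuming the widely used `<meta name="robots" content="noindex">` convention; the sample pages and the `asks_not_to_be_indexed` helper are invented for illustration, and real crawlers apply more rules than this.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects the directives found in any <meta name="robots" ...> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, nofollow".
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )

def asks_not_to_be_indexed(html):
    """Return True if the page carries a robots meta tag with 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives

private_page = (
    '<html><head><meta name="robots" content="noindex, nofollow">'
    "<title>Private</title></head><body>Hello</body></html>"
)
public_page = "<html><head><title>Public</title></head><body>Hi</body></html>"

print(asks_not_to_be_indexed(private_page))  # True
print(asks_not_to_be_indexed(public_page))   # False
```

The tag is invisible to a human reader of the page, which is why such pages simply never appear in search results.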
  • 35. How to Find the Invisible Web Simply think "databases" and keep your eyes open. You can find searchable databases containing invisible web pages in the course of routine searching in most general web directories. Of particular value in academic research are: • ipl2 • Infomine Use Google and other search engines to locate searchable databases by searching a subject term and the word "database". If the database uses the word database in its own pages, you are likely to find it in Google. The word "database" is also useful in searching a topic in the Google Directory or the Yahoo! directory, because they sometimes use the term to describe searchable databases in their listings. Examples: plane crash database languages database toxic chemicals database Remember that the Invisible Web exists. In addition to what you find in search engine results (including Google Scholar) and most web directories, there are other gold mines you have to search directly. This includes all of the licensed article, magazine, reference, news archives, and other research resources that libraries and some industries buy for those authorized to use them. As part of your web search strategy, spend a little time looking for databases in your field or topic of study or research. The contents of these may not be freely available: libraries and corporations buy the rights for their authorized users to view the contents. If they appear free, it's because you are somehow authorized to search and read the contents (library card holder, company employee, etc.). The Ambiguity Inherent in the Invisible Web: It is very difficult to predict what sites or kinds of sites or portions of sites will or won't be part of the Invisible Web. There are several factors involved: o Which sites replicate some of their content in static pages (hybrid of visible and invisible in some combination)? o Which replicate it all (visible in search engines if you construct a search matching terms in the page)? 
o Which databases replicate none of their dynamically generated pages in links and must be searched directly (totally invisible)?
o Search engines can change their policies on what they exclude and include.
Want to learn more about the Invisible Web?
• The Wikipedia "Deep Web" article provides a fairly up-to-date summary, with links to other resources.
  • 36. 10 Search Engines to Explore the Invisible Web
by Saikat Basu, March 14, 2010
The Invisible Web refers to the part of the WWW that's not indexed by the search engines. Most of us think that search powerhouses like Google and Bing are like the Great Oracle…they see everything. Unfortunately, they can't, because they aren't divine at all; they are just web spiders that index pages by following one hyperlink after the other.
But there are some places where a spider cannot enter. Take library databases which need a password for access. Or even pages that belong to private networks of organizations. Dynamically generated web pages in response to a query are often left un-indexed by search engine spiders.
Search engine technology has progressed by leaps and bounds. Today, we have real time search and the capability to index Flash based and PDF content. Even then, there remain large swathes of the web which a general search engine cannot penetrate. The terms Deep Net, Deep Web and Invisible Web linger on.
To get a more precise idea of the nature of this 'Dark Continent' involving the invisible and web search engines, read what Wikipedia has to say about the Deep Web. The figures are attention grabbers – the size of the open web is 167 terabytes. The Invisible Web is estimated at 91,000 terabytes. Check this out – the Library of Congress, in 1997, was figured to have close to 3,000 terabytes!
How do we get to this mother lode of information? That's what this post is all about. Let's get to know a few resources which will be our deep diving vessel for the Invisible Web. Some of these are invisible web search engines with specifically indexed information.
Infomine
  • 37. Infomine has been built by a pool of libraries in the United States. Some of them are University of California, Wake Forest University, California State University, and the University of Detroit. Infomine ‘mines’ information from databases, electronic journals, electronic books, bulletin boards, mailing lists, online library card catalogs, articles, directories of researchers, and many other resources.

You can search by subject category and further tweak your search using the search options. Infomine is not only a standalone search engine for the Deep Web but also a staging point for a lot of other reference information. Check out its Other Search Tools and General Reference links at the bottom.

The WWW Virtual Library

This is considered to be the oldest catalog on the web and was started by Tim Berners-Lee, the creator of the web. So, isn’t it strange that it finds a place in the list of Invisible Web resources? Maybe, but the WWW Virtual Library lists quite a lot of relevant resources on quite a lot of subjects. You can go vertically into the categories or use the search bar. The screenshot shows the alphabetical arrangement of subjects covered at the site.

Intute
  • 38. Intute is UK-centric, but it has some of the most esteemed universities of the region providing the resources for study and research. You can browse by subject or do a keyword search for academic topics ranging from agriculture to veterinary medicine. The online service has subject specialists who review and index other websites that cater to the topics for study and research.

Intute also provides over 60 free online tutorials for learning effective internet research skills. The tutorials are step by step guides and are arranged around specific subjects.

Complete Planet

Complete Planet calls itself the ‘front door to the Deep Web’. This free and well designed directory resource makes it easy to access the mass of dynamic databases that are cloaked from a general purpose search. The databases indexed by Complete Planet number around 70,000 and range from Agriculture to Weather. Also thrown in are databases like Food & Drink and Military.

For a really effective Deep Web search, try out the Advanced Search options where, among other things, you can set a date range.

Infoplease
  • 39. Infoplease is an information portal with a host of features. Using the site, you can tap into a good number of encyclopedias, almanacs, an atlas, and biographies. Infoplease also has a few nice offshoots like Factmonster.com for kids and Biosearch, a search engine just for biographies.

DeepPeep

DeepPeep aims to enter the Invisible Web through forms that query databases and web services for information. Typed queries open up dynamic but short-lived results which cannot be indexed by normal search engines. By indexing databases, DeepPeep hopes to track 45,000 forms across 7 domains.

The domains covered by DeepPeep (Beta) are Auto, Airfare, Biology, Book, Hotel, Job, and Rental. Being a beta service, it has occasional glitches, as some results don’t load in the browser.

IncyWincy

IncyWincy is an Invisible Web search engine that behaves as a meta-search engine by tapping into other search engines and filtering the results. It searches the web, directory, forms, and images. With a free registration, you can track search results with alerts.

DeepWebTech
  • 40. DeepWebTech gives you five search engines (and browser plugins) for specific topics. The search engines cover science, medicine, and business. Using these topic specific search engines, you can query the underlying databases in the Deep Web.

Scirus

Scirus has a pure scientific focus. It is a far-reaching research engine that can scour journals, scientists’ homepages, courseware, pre-print server material, patents and institutional intranets.

TechXtra
  • 41. TechXtra concentrates on engineering, mathematics and computing. It gives you industry news, job announcements, technical reports, technical data, full text eprints, teaching and learning resources along with articles and relevant website information.

Just like general web search, searching the Invisible Web is also about looking for the needle in the haystack. Only here, the haystack is much bigger. The Invisible Web is definitely not for the casual searcher. It is deep but not dark, because if you know what you are searching for, enlightenment is a few keywords away.

Do you venture into the Invisible Web? Which is your preferred search tool?

The Invisible Web Databases

Category | Resource | Description
Which database might have the information I need? | Turbo10 | Search user-selected deep Web resources
Which database might have the information I need? | Resource Discovery Network | Keyword search
Which database might have the information I need? | Complete Planet | Deep Web directory
Which database might have the information I need? | Digital Librarian and Librarians Guide to the Internet | Uncover databases
News and magazines | Google News | Search 30 day news archive (for US, UK, others)
News and magazines | AltaVista News | Includes New York Times
News and magazines | 1st Headlines | Breaking news in categories (US & World; Business; Health; Lifestyles; Sports; Technology; Weather)
News and magazines | New York Times; Washington Post; Seattle Times; San Francisco Chronicle | Full-text newspaper archive search (14 or 30 day trials available)
News and magazines | HeadlineSpot | Search news directory by media, region, subject, opinion
  • 42. News and magazines | Directory of Open Access Journals (DOAJ) | Search or browse by subject for peer-reviewed, scientific and scholarly titles
News and magazines | HeadlineSpot: Magazines | Search magazine directory by subject
Public Radio webcasts | PublicRadioFan.com | Search database of program listings
History | Guide to History on the Web | Database of more than 5,000 US and world history sites
Biography | Galileo Project; Thomas A. Edison Papers | Individuals
Biography | Biography.com | 25,000 people
Biography | Biographical Dictionary | 28,000 short identification information
Countries | Nations Online Project | Alphabetical index to government Web pages
Countries | Portals to the World | From the Library of Congress
Countries | World Fact Book | From the CIA
Countries | Infonation | U.N. member nations
Countries | Country Profiles | From the BBC
Data | Finding and Using Statistical Data |
Books (full text) | Online Books Page | Free e-books
  • 43. Outstanding literature | Literature, Math and Science Literature | CA Dept. of Ed. recommended literature for K-12
Outstanding literature | HAISLN | Recommended reading lists
Outstanding literature | YALSA (ALA) | Outstanding Books for the College Bound
Photographs | Digital Library Photos | 80,000 images of California and natural world
Photographs | Time Life Pictures (Getty Images) | Historical and current
Fine Arts | National Gallery of Art | Search 17,000 images (check "images only")
Fine Arts | ImageBase | Search 85,000 images in the Fine Arts Museums of SF
Fine Arts | Artcyclopedia | Fine arts search engine
Fine Arts | Contemporary Art | Search by medium and theme
Cross-disciplinary | Literature, Arts and Medicine Database | Browse or search annotated bibliography of prose, poetry, film, video and art -- comprehensive (adult and young adult fiction) resource for medical humanities
Education | ERIC | Education journals and other resources; Check "full-text," limit by publication type in advanced search
K-12 curriculum projects | Blue Web'n | PacBell project
K-12 curriculum projects | American Memory | Lessons using primary