SlideShare uma empresa Scribd logo
1 de 58
Baixar para ler offline
Chapter 5
Locating Information on the WWW
Learning Objectives
• Explain how a Web search engine works
• Find information by using a search engine
• Use logical operators, filtering to express complex
queries
• Recognize and find authoritative information sources
• Decide whether Web information is truth or fiction

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Web Search Fundamentals
• A search engine is a collection of
computer programs that help us find
information on the Web
• No one organizes the information posted
on the Web
• Search Engines must look around to find
out what’s out there and then organize
what is found
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
How a Search Engine Works
• The first step, crawling, visits every Web
page that it can find
• How are the pages found?
– The crawler has a todo list that is loaded with
a set of pages to start
– When a URL is found while crawling a page, it
adds that URL to the todo list

• The main work of the crawler is to build an
index
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
How a Search Engine Works
• The index is a list of tokens (or words) that
are associated with the page
• The token might be part of the page’s title
• There are other ways for a token to be
associated with a page
• For each token, the crawler creates a list
of the URLs associated with that token

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
How a Search Engine Works
• The second step is query processing
• The user presents tokens (aka search
terms) to the query processor
• The search engine then looks up the word
in the index and returns to you a hit list
• By creating the index ahead of
time, search engines are able to answer
user queries very quickly
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Multiword Searches
• With a multiple-word query, the pages
returned should be appropriate for all of
the queried words
• AND-query
– Each page returned should be associated
with all the words
– There is no index entry corresponding to a set
– There is only a list for the individual words

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Intersecting Queries
• For multiple words, the query processor
fetches the index lists for each of the
terms
• URLs that are in all of the lists are looked
at and compared
• The query processor intersects the lists
• The URL lists are alphabetized to speed
up the processing … it is easier to notice
when the same URL is on multiple lists
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Rules for Intersecting
Alphabetized Lists
• To intersect several alphabetized lists:
1. Put a marker (arrow) at the start of each
token’s index list.
2. If all markers point to the same URL, save
it, because all tokens are associated with the
page.
3. Move the marker(s) to the next position for
whichever URL is earliest in the alphabet.
4. Repeat Steps 2–3 until some marker
reaches the end of the list.
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Power of an Indexed Search
• The computer:
– takes the time to crawl the data (Web pages)
– build an index first
– find the index entries for each word
– intersect the lists to find the information for an
AND-query

• Search engines can look at billions of Web
pages and return an answer in less than a
fifth of a second
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Descriptive Terms
• ―Hits‖ on a page means the search term is
―associated‖ with the page
• This does not mean the word is ―on‖ the
page
• Web page structure helps a lot to identify
descriptive text

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Descriptive Terms
• Descriptive text:
– Title—The <title> encloses a short phrase
describing the whole page
– Anchor text—The highlighted link text, inside
<a . . . > tags, describes the page it links to
– Meta—A <meta . . . > tag in the head section
can hold a several sentence description of the
page
– Alt attributes—The <img . . . > tag has an alt
attribute that gives a textual description
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Page Rank
• Why, when the hit list is returned, the page
you’re looking for is often first on the hit list
or in the top 10?
• The order in which hits are returned to a
query is determined by a number called
the PageRank
• The higher the PageRank, the closer to
the top of the list
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Links to Other Pages
• Google pioneered page ranking as a way
to determine which pages are likely to be
most important
• PageRanking works like a voting system:
– If page A links to page B, A’s link adds to B’s
importance

• Pages that are linked-to by many pages
have a higher page ranking and are
assumed to be more important
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Links to Other Pages
• Links from pages with a high page ranking
are also viewed as more
• PageRank is computed by the crawler:
– The crawler looks at page A
– It notices the links to page B
– It scores one for B

• Counting the number of links to a page is
not sufficient
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Links to Other Pages
• After the crawling is completed, the
PageRank computation is completed
• The query processor is puts together the
hit list
• The URLs are sorted by their page
ranking, highest to lowest, and returned in
that order

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Advanced Searches
human AND powered AND flight
• The Logical Operator AND
– Basic queries are AND-queries
– All words given must be associated with the
page for a hit
– The word AND is a logical operator
– Logical operators specify a logical
relationship between the words it connects

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Advanced Searches
human AND powered AND flight
• The Logical Operator AND
– Search engines treat search words as three
independent words
– The words can appear anywhere on the page
in any order
– Use of quotes mean the exact phrase must
appear as given…this is more than an AND
query
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Complex Queries
marshmallow OR strawberry OR chocolate

• Another logical operator is OR
– OR-queries hit on pages that are associated
with at least of the words

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Complex Queries
tigers AND NOT baseball

• Another logical operator is NOT
– NOT queries specify words that are not to be
associated with the page
– AND is included because we want both
requirements to be true

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Combining Logical Operators
(marshmallow OR strawberry) AND sundae

• The logical operators work like arithmetic
• They can be combined and grouped using
parentheses
• Google also uses a minus (–) as an
abbreviation for NOT

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Restricting Global Search
• Many sites offer the opportunity to perform
a site search
• A site search means looking only on the
current site
• The site search is usually offered on the
homepage with a search window and a Go
button

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Filtered Searches
• Constraints can be used to help pinpoint
specific pages

• Using these areas on the advanced
search, can limit our search hits

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Filtered Searches
• Site searches don’t necessarily use
PageRanking to order the hits
• Use of the advanced search filters can be
used to get PageRanked hits

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Web Searching
• Research on the web requires serious
consideration and strategies:
1. Selecting Search Terms
choosing good words to include in a query
2. The Anatomy of a Hit
how to use the information returned
3. Using the Hit List
skimming to find what you want quickly
4. Once You Find a Likely Page
locating the desired data on the page
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
1. Selecting Search Terms
• How do you find the best search terms?
• It is a sequence of finding ever-more
precise terms:
I.
II.
III.
IV.
V.
VI.

Advanced Search
General Topic
Descriptive Terms
Refining (Adding Words)
Avoiding Over Constraining
Removing Words

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
1. Selecting Search Terms
I. Use Advanced Search
– Google’s Advanced Search gives control over
the results returned
– You can do complex queries, but Advanced
Search provides control

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
1. Selecting Search Terms
II. Begin with the General Topic
– Words have multiple meanings
– Giving the topic can eliminate most of those
conflicting words
– Many hits can be eliminated
– Start with the topic word(s)

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
1. Selecting Search Terms
III. Choose Descriptive Words
– Be picky about the words we select
– Select precise terms
– Take care using terms that have many other
meanings—they become less useful

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
1. Selecting Search Terms
IV. Refine by Adding Words
– Begin with a ―first guess‖ search
– Check the results
– Check the initial hit list to see what you’ve
found
– The initial list often suggests additional terms
to search
– This may require several rounds of adding a
word to the query each time
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
1. Selecting Search Terms
V. Avoid Over Constraining
– Adding more words one at a time works best
because interesting pages might be
accidentally overlooked
– Care is required in this process:
Add a word only if you are sure the pages you
want will have it

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
1. Selecting Search Terms
VI. Remove Specific Words (Minus or NOT)
– It is useful to consider eliminating pages with
certain words
– It is the opposite of adding words that may/will
appear on a page
– The minus sign is a good way to eliminate
wrong interpretations of words

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
2. Anatomy of a Hit
• What is displayed with each hit?
– Title
• This is the text between the page’s <title> and
</title> tags.

– Snippet
• This information is a preview of what might be on
the page
• Usually a short phrase from the page containing
one or more of the searched words

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
2. Anatomy of a Hit
• What is displayed with each hit?
– URL
• The URL that is linked from the title line

– Site Links
• These are useful links from the site, which are
basically shortcuts
• These links are found algorithmically
• They are not ―sponsored‖ links – no one is paying
for them

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
3. Using the Hit List
• Checking the hits is a process of filtering
• Skim the top level of information, and look
deeper if it looks promising
• If not, continuing skimming

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
3. Using the Hit List
• ―Looking deeper‖ ranks the information:
– Title
• It’s the first source of information
• When the title checks out, look at the . . .

– Snippet
• Search terms are shown in bold
• The context around the occurrence is given
• If the snippet checks out, look at the . . .

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
3. Using the Hit List
• ―Looking deeper‖ ranks the information:
– URL
• The domain of the site hosting the page is given
• The site name is the first check of how
authoritative the information is
• If the URL checks out, look at the . . .

– Pageitself

• At this point, there is some likelihood that
the page includes information you need
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
4. Once You Find a Likely Page
• A page has been found!
– A suggestive title
– A promising snippet
– A reliable domain

• How close are we to what is wanted?
– First, ―roll through the page‖ checking out its main
features
– Next, look for a date to see how current it is
– Finally, find the location of one of the keywords

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
4. Once You Find a Likely Page
• The goal is still to determine whether you
want to stay on the page.
– If you’ve found what you want, you’re done
– If not, return to the hit list

• If the site is good, decide how important it
is that the information is perfectly correct:
• If the importance is high, cross-check the
information with another site
• Find corroborating information!
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Authoritative Information
• Don’t Believe Everything You Read
– No one in charge of the WWW, so no one
checks to see that the information is correct
– Some information is inaccurate
– Most information on the Web should be
considered suspect
– Anyone can post a Web page and make
random statements or claims…true, partly
true, and completely false
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Wikipedia
• Wikipedia is an open source document
created by knowledgeable and
community-minded Internet users
• Anyone can contribute to Wikipedia
• Wikipedia covers an enormous number of
topics
• The information contained might not be
included in a printed encyclopedia
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Wikipedia
• Its coverage and timeliness make it a
valuable resource
• The fact that many people contribute to
Wikipedia is both a weakness and a
strength
– Anyone can add anything to or edit an entry
without control
– Use Wikipedia as a starting point for research

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Using the Web for Research
• Applying good research practices will
ensure highly reliable information:
– Question the information
• Does it make sense? Is it believable? Is it
consistent? Does the information fit with everything
you already know?
– Never rely on a single source; always use multiple
sources.
– Assess the site’s authoritativeness
– Vary the kinds of resources you use, including off-line
resources.
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
What Is Authoritative?
• Authoritative means that we are looking for
what experts say
– We assume that experts are well informed on
the topic
– We assume that what experts say is true

• There is no way to verify that what ―they‖
say is true, so we accept that it’s the best
available information

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Respected Sources
• Authoritative information can be gotten
from respected organizations
• Thousands of professional organizations
host trustworthy Web sites
• Individuals are also good sources of
information as long as we have some
reason to believe them

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Primary Sources
• A primary source is a person who has direct
knowledge of the information
• People who interview primary sources are
secondary sources
– They are not as reliable as primary sources

• People who watch journalists on TV or read
newspaper reports are tertiary sources
– They are three levels removed from direct
knowledge of the event
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Primary Sources
• A secondary source does not mean that
the information is wrong, only that the
possibility exists
• There is no guarantee that even primary
sources tell the truth

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Authoritative Sources
• The easiest way to get authoritative
information is to go to a site that you know
to be authoritative
• Many agencies and organizations publish
information you can depend on
• By going directly to an authoritative
source, you are ensured that the
information is reliable
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Site Analysis… Good? Bad?
• Possible issues for bad sites:
–
–
–
–
–
–

Broken links
Failure to give contact information
Failure to have a non-Web identity
Simplistic design
No recent updates or blog entries
Spelling mistakes

• Legitimate sites may have these issues, too!
• To decide if a site is legitimate, do a little direct
research on the information found there
Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Summary
• In this chapter we have learned the following:
– We need software and our own intelligence to search
the Internet effectively.
– Search engines are composed of a crawler and a
query processor.
– We create queries using the logical operators
AND, OR, and NOT, and specific terms to pinpoint the
information we seek.
– Filtering and ―subtracting‖ search terms removes
extraneous hits.

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Summary
• In this chapter we have learned the following:
– Once we’ve found information, we must judge
whether it is correct by investigating the organization
that publishes the page, and examining the facts
claimed on the page.
– We must cross-check the information with other
sources, especially when the information is important.

Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley

Mais conteúdo relacionado

Mais procurados

Search Engine Optimization for the Research Librarian, or, How Librarians Can...
Search Engine Optimization for the Research Librarian, or, How Librarians Can...Search Engine Optimization for the Research Librarian, or, How Librarians Can...
Search Engine Optimization for the Research Librarian, or, How Librarians Can...melissagasparotto
 
How to search the Internet, a guide to save time and effort
How to search the Internet, a guide to save time and effortHow to search the Internet, a guide to save time and effort
How to search the Internet, a guide to save time and effortPete S
 
International SEO - Search Love London 2012
International SEO - Search Love London 2012International SEO - Search Love London 2012
International SEO - Search Love London 2012Lisa Myers
 
Seo Training For Beginners By Sandeep Dubey
Seo Training For Beginners By Sandeep DubeySeo Training For Beginners By Sandeep Dubey
Seo Training For Beginners By Sandeep DubeySandeep Dubey
 
Spanning The Globe With Search
Spanning The Globe With SearchSpanning The Globe With Search
Spanning The Globe With SearchTim Aldiss
 
Search Engine Optimization Primer
Search Engine Optimization PrimerSearch Engine Optimization Primer
Search Engine Optimization PrimerSimobo
 
On-Page SEO Checklist
On-Page SEO ChecklistOn-Page SEO Checklist
On-Page SEO ChecklistWebFX
 
SEO services in pune,SEO expert in pune,Digital marketing in pune
  SEO services in pune,SEO expert in pune,Digital marketing in pune  SEO services in pune,SEO expert in pune,Digital marketing in pune
SEO services in pune,SEO expert in pune,Digital marketing in puneSwami Solutions
 
Seo top amazing ppt
Seo  top amazing pptSeo  top amazing ppt
Seo top amazing pptMamthaz M
 
Search Engine Optimization Tutorial
Search Engine Optimization TutorialSearch Engine Optimization Tutorial
Search Engine Optimization TutorialTahasin Chowdhury
 
Control What You Can Control in WordPress. On Page SEO FTW!
Control What You Can Control in WordPress. On Page SEO FTW!Control What You Can Control in WordPress. On Page SEO FTW!
Control What You Can Control in WordPress. On Page SEO FTW!Mike Zielonka
 
SEO: Optimizing Sites for People (and search engines)
SEO: Optimizing Sites for People (and search engines)SEO: Optimizing Sites for People (and search engines)
SEO: Optimizing Sites for People (and search engines)kdmcBerkeley at UC Berkeley
 
TechFuse 2013 - Break down the walls SharePoint 2013
TechFuse 2013 - Break down the walls SharePoint 2013TechFuse 2013 - Break down the walls SharePoint 2013
TechFuse 2013 - Break down the walls SharePoint 2013Avtex
 

Mais procurados (20)

Search Engine Optimization
Search Engine OptimizationSearch Engine Optimization
Search Engine Optimization
 
Webmaster guide-en
Webmaster guide-enWebmaster guide-en
Webmaster guide-en
 
Lvr ppt
Lvr pptLvr ppt
Lvr ppt
 
Search Engine Optimization for the Research Librarian, or, How Librarians Can...
Search Engine Optimization for the Research Librarian, or, How Librarians Can...Search Engine Optimization for the Research Librarian, or, How Librarians Can...
Search Engine Optimization for the Research Librarian, or, How Librarians Can...
 
Seo onpage & offpage, Search Engine Optimization, SEO
Seo onpage & offpage, Search Engine Optimization, SEOSeo onpage & offpage, Search Engine Optimization, SEO
Seo onpage & offpage, Search Engine Optimization, SEO
 
How to search the Internet, a guide to save time and effort
How to search the Internet, a guide to save time and effortHow to search the Internet, a guide to save time and effort
How to search the Internet, a guide to save time and effort
 
International SEO - Search Love London 2012
International SEO - Search Love London 2012International SEO - Search Love London 2012
International SEO - Search Love London 2012
 
Seo Training For Beginners By Sandeep Dubey
Seo Training For Beginners By Sandeep DubeySeo Training For Beginners By Sandeep Dubey
Seo Training For Beginners By Sandeep Dubey
 
Spanning The Globe With Search
Spanning The Globe With SearchSpanning The Globe With Search
Spanning The Globe With Search
 
Search Engine Optimization Primer
Search Engine Optimization PrimerSearch Engine Optimization Primer
Search Engine Optimization Primer
 
Search engine optimization
Search engine optimizationSearch engine optimization
Search engine optimization
 
On-Page SEO Checklist
On-Page SEO ChecklistOn-Page SEO Checklist
On-Page SEO Checklist
 
SEO services in pune,SEO expert in pune,Digital marketing in pune
  SEO services in pune,SEO expert in pune,Digital marketing in pune  SEO services in pune,SEO expert in pune,Digital marketing in pune
SEO services in pune,SEO expert in pune,Digital marketing in pune
 
Seo top amazing ppt
Seo  top amazing pptSeo  top amazing ppt
Seo top amazing ppt
 
Search Engine Optimization Tutorial
Search Engine Optimization TutorialSearch Engine Optimization Tutorial
Search Engine Optimization Tutorial
 
Prasenjit's 13 seo techniques
Prasenjit's 13 seo techniquesPrasenjit's 13 seo techniques
Prasenjit's 13 seo techniques
 
Control What You Can Control in WordPress. On Page SEO FTW!
Control What You Can Control in WordPress. On Page SEO FTW!Control What You Can Control in WordPress. On Page SEO FTW!
Control What You Can Control in WordPress. On Page SEO FTW!
 
SEO: Optimizing Sites for People (and search engines)
SEO: Optimizing Sites for People (and search engines)SEO: Optimizing Sites for People (and search engines)
SEO: Optimizing Sites for People (and search engines)
 
TechFuse 2013 - Break down the walls SharePoint 2013
TechFuse 2013 - Break down the walls SharePoint 2013TechFuse 2013 - Break down the walls SharePoint 2013
TechFuse 2013 - Break down the walls SharePoint 2013
 
People Search
People SearchPeople Search
People Search
 

Destaque

CThompson Management Resume (1)
CThompson Management Resume (1)CThompson Management Resume (1)
CThompson Management Resume (1)Chris Thompson
 
Saurav S. Biswas - CV
Saurav S. Biswas - CVSaurav S. Biswas - CV
Saurav S. Biswas - CVSaurav Biswas
 
My understanding of microworlds
My understanding of microworldsMy understanding of microworlds
My understanding of microworldsYr05
 
D link magazine-reduce
D link magazine-reduceD link magazine-reduce
D link magazine-reduceteamlewis
 
Використання інтерактивних технологій на уроках світової літератури
Використання інтерактивних технологій на уроках світової літературиВикористання інтерактивних технологій на уроках світової літератури
Використання інтерактивних технологій на уроках світової літературиОльга Ксьонз
 
EACD Lisbon Debate 2015 Luxury Communications Nuno Duarte Lopes The Luxury Ne...
EACD Lisbon Debate 2015 Luxury Communications Nuno Duarte Lopes The Luxury Ne...EACD Lisbon Debate 2015 Luxury Communications Nuno Duarte Lopes The Luxury Ne...
EACD Lisbon Debate 2015 Luxury Communications Nuno Duarte Lopes The Luxury Ne...Rui Martins PR & Marketing Strategy
 
Latest Trend: Be Your Boss
Latest Trend: Be Your BossLatest Trend: Be Your Boss
Latest Trend: Be Your Bossswatikhanna09
 
Communiqué de presse - Oct 2015
Communiqué de presse - Oct  2015Communiqué de presse - Oct  2015
Communiqué de presse - Oct 2015Matthieu Boye
 
Effects of Globaliztion on Transgenders
Effects of Globaliztion on TransgendersEffects of Globaliztion on Transgenders
Effects of Globaliztion on TransgendersReshma Thomas
 
Efficient Signage system for IIT Guwahati
Efficient Signage system for IIT GuwahatiEfficient Signage system for IIT Guwahati
Efficient Signage system for IIT GuwahatiAbhinav Krishna
 

Destaque (17)

CThompson Management Resume (1)
CThompson Management Resume (1)CThompson Management Resume (1)
CThompson Management Resume (1)
 
Carnaval ayacuchano.
Carnaval ayacuchano. Carnaval ayacuchano.
Carnaval ayacuchano.
 
Saurav S. Biswas - CV
Saurav S. Biswas - CVSaurav S. Biswas - CV
Saurav S. Biswas - CV
 
My understanding of microworlds
My understanding of microworldsMy understanding of microworlds
My understanding of microworlds
 
Tips Twitter by Romain Simonin
Tips Twitter by Romain SimoninTips Twitter by Romain Simonin
Tips Twitter by Romain Simonin
 
Feliz navidad 2013
Feliz navidad 2013Feliz navidad 2013
Feliz navidad 2013
 
D link magazine-reduce
D link magazine-reduceD link magazine-reduce
D link magazine-reduce
 
Використання інтерактивних технологій на уроках світової літератури
Використання інтерактивних технологій на уроках світової літературиВикористання інтерактивних технологій на уроках світової літератури
Використання інтерактивних технологій на уроках світової літератури
 
EACD Lisbon Debate 2015 Luxury Communications Nuno Duarte Lopes The Luxury Ne...
EACD Lisbon Debate 2015 Luxury Communications Nuno Duarte Lopes The Luxury Ne...EACD Lisbon Debate 2015 Luxury Communications Nuno Duarte Lopes The Luxury Ne...
EACD Lisbon Debate 2015 Luxury Communications Nuno Duarte Lopes The Luxury Ne...
 
Latest Trend: Be Your Boss
Latest Trend: Be Your BossLatest Trend: Be Your Boss
Latest Trend: Be Your Boss
 
Communiqué de presse - Oct 2015
Communiqué de presse - Oct  2015Communiqué de presse - Oct  2015
Communiqué de presse - Oct 2015
 
COMPUTER NETWORKS
COMPUTER  NETWORKSCOMPUTER  NETWORKS
COMPUTER NETWORKS
 
Chapter1
Chapter1Chapter1
Chapter1
 
Effects of Globaliztion on Transgenders
Effects of Globaliztion on TransgendersEffects of Globaliztion on Transgenders
Effects of Globaliztion on Transgenders
 
Ppt luh murniasih
Ppt luh murniasihPpt luh murniasih
Ppt luh murniasih
 
Efficient Signage system for IIT Guwahati
Efficient Signage system for IIT GuwahatiEfficient Signage system for IIT Guwahati
Efficient Signage system for IIT Guwahati
 
Bye bye jenkins welcome bots
Bye bye jenkins welcome botsBye bye jenkins welcome bots
Bye bye jenkins welcome bots
 

Semelhante a Chapter05

Lost in the Net: Navigating Search Engines
Lost in the Net:  Navigating Search EnginesLost in the Net:  Navigating Search Engines
Lost in the Net: Navigating Search EnginesJohan Koren
 
Search Engine Optimization
Search Engine OptimizationSearch Engine Optimization
Search Engine OptimizationSubbu Lakshmanan
 
How to write killer blog article - Freelance Web Designer Malaysia
How to write killer blog article - Freelance Web Designer MalaysiaHow to write killer blog article - Freelance Web Designer Malaysia
How to write killer blog article - Freelance Web Designer MalaysiaManju Hariharan
 
Search Engine Optimization
Search Engine OptimizationSearch Engine Optimization
Search Engine Optimizationchintanchheda
 
Getting Started with SEO
Getting Started with SEOGetting Started with SEO
Getting Started with SEOBiznet IIS
 
Search Engine Optimization
Search Engine OptimizationSearch Engine Optimization
Search Engine OptimizationTechLekh Nepal
 
SEARCH ENGINE OPTIMIZATION
SEARCH ENGINE OPTIMIZATIONSEARCH ENGINE OPTIMIZATION
SEARCH ENGINE OPTIMIZATIONShubham Sharma
 
Best marketing agency.pptx
Best marketing agency.pptxBest marketing agency.pptx
Best marketing agency.pptxMatchstickmedia
 
Copywriting for SEO
Copywriting for SEO Copywriting for SEO
Copywriting for SEO Brian Pereira
 
Webpowerpoint
WebpowerpointWebpowerpoint
Webpowerpointtonideegs
 
FIT5 Ch. 5, CIS 110 13F
FIT5 Ch. 5, CIS 110 13FFIT5 Ch. 5, CIS 110 13F
FIT5 Ch. 5, CIS 110 13Fmh-108
 
Google Algorithm Update: Pandas, Penguins, and Recovering from Penalties
Google Algorithm Update: Pandas, Penguins, and Recovering from Penalties Google Algorithm Update: Pandas, Penguins, and Recovering from Penalties
Google Algorithm Update: Pandas, Penguins, and Recovering from Penalties Bill Hartzer
 
Copywriting for seo
Copywriting for seoCopywriting for seo
Copywriting for seobrian9p
 
Academic Search Techniques.pptx
Academic Search Techniques.pptxAcademic Search Techniques.pptx
Academic Search Techniques.pptxRADHIKAB20
 

Semelhante a Chapter05 (20)

Lost in the Net: Navigating Search Engines
Lost in the Net:  Navigating Search EnginesLost in the Net:  Navigating Search Engines
Lost in the Net: Navigating Search Engines
 
The Power of SEO
The Power of SEOThe Power of SEO
The Power of SEO
 
Search Engine Optimization
Search Engine OptimizationSearch Engine Optimization
Search Engine Optimization
 
How to write killer blog article - Freelance Web Designer Malaysia
How to write killer blog article - Freelance Web Designer MalaysiaHow to write killer blog article - Freelance Web Designer Malaysia
How to write killer blog article - Freelance Web Designer Malaysia
 
Search Engine Optimization
Search Engine OptimizationSearch Engine Optimization
Search Engine Optimization
 
Getting Started with SEO
Getting Started with SEOGetting Started with SEO
Getting Started with SEO
 
Search Engine Optimization
Search Engine OptimizationSearch Engine Optimization
Search Engine Optimization
 
SEARCH ENGINE OPTIMIZATION
SEARCH ENGINE OPTIMIZATIONSEARCH ENGINE OPTIMIZATION
SEARCH ENGINE OPTIMIZATION
 
Best marketing agency.pptx
Best marketing agency.pptxBest marketing agency.pptx
Best marketing agency.pptx
 
Copywriting for SEO
Copywriting for SEO Copywriting for SEO
Copywriting for SEO
 
Copywriting for SEO
Copywriting for SEOCopywriting for SEO
Copywriting for SEO
 
Seo basic
Seo basicSeo basic
Seo basic
 
Webpowerpoint
WebpowerpointWebpowerpoint
Webpowerpoint
 
SEO
SEOSEO
SEO
 
Seo
SeoSeo
Seo
 
FIT5 Ch. 5, CIS 110 13F
FIT5 Ch. 5, CIS 110 13FFIT5 Ch. 5, CIS 110 13F
FIT5 Ch. 5, CIS 110 13F
 
Google Algorithm Update: Pandas, Penguins, and Recovering from Penalties
Google Algorithm Update: Pandas, Penguins, and Recovering from Penalties Google Algorithm Update: Pandas, Penguins, and Recovering from Penalties
Google Algorithm Update: Pandas, Penguins, and Recovering from Penalties
 
Search engines
Search enginesSearch engines
Search engines
 
Copywriting for seo
Copywriting for seoCopywriting for seo
Copywriting for seo
 
Academic Search Techniques.pptx
Academic Search Techniques.pptxAcademic Search Techniques.pptx
Academic Search Techniques.pptx
 

Último

Optical Fibre and It's Applications.pptx
Optical Fibre and It's Applications.pptxOptical Fibre and It's Applications.pptx
Optical Fibre and It's Applications.pptxPurva Nikam
 
SOLIDE WASTE in Cameroon,,,,,,,,,,,,,,,,,,,,,,,,,,,.pptx
SOLIDE WASTE in Cameroon,,,,,,,,,,,,,,,,,,,,,,,,,,,.pptxSOLIDE WASTE in Cameroon,,,,,,,,,,,,,,,,,,,,,,,,,,,.pptx
SOLIDE WASTE in Cameroon,,,,,,,,,,,,,,,,,,,,,,,,,,,.pptxSyedNadeemGillANi
 
Slides CapTechTalks Webinar March 2024 Joshua Sinai.pptx
Slides CapTechTalks Webinar March 2024 Joshua Sinai.pptxSlides CapTechTalks Webinar March 2024 Joshua Sinai.pptx
Slides CapTechTalks Webinar March 2024 Joshua Sinai.pptxCapitolTechU
 
AUDIENCE THEORY -- FANDOM -- JENKINS.pptx
AUDIENCE THEORY -- FANDOM -- JENKINS.pptxAUDIENCE THEORY -- FANDOM -- JENKINS.pptx
AUDIENCE THEORY -- FANDOM -- JENKINS.pptxiammrhaywood
 
Riddhi Kevadiya. WILLIAM SHAKESPEARE....
Riddhi Kevadiya. WILLIAM SHAKESPEARE....Riddhi Kevadiya. WILLIAM SHAKESPEARE....
Riddhi Kevadiya. WILLIAM SHAKESPEARE....Riddhi Kevadiya
 
How to Manage Cross-Selling in Odoo 17 Sales
How to Manage Cross-Selling in Odoo 17 SalesHow to Manage Cross-Selling in Odoo 17 Sales
How to Manage Cross-Selling in Odoo 17 SalesCeline George
 
What is the Future of QuickBooks DeskTop?
What is the Future of QuickBooks DeskTop?What is the Future of QuickBooks DeskTop?
What is the Future of QuickBooks DeskTop?TechSoup
 
KARNAADA.pptx made by - saransh dwivedi ( SD ) - SHALAKYA TANTRA - ENT - 4...
KARNAADA.pptx  made by -  saransh dwivedi ( SD ) -  SHALAKYA TANTRA - ENT - 4...KARNAADA.pptx  made by -  saransh dwivedi ( SD ) -  SHALAKYA TANTRA - ENT - 4...
KARNAADA.pptx made by - saransh dwivedi ( SD ) - SHALAKYA TANTRA - ENT - 4...M56BOOKSTORE PRODUCT/SERVICE
 
CapTechU Doctoral Presentation -March 2024 slides.pptx
CapTechU Doctoral Presentation -March 2024 slides.pptxCapTechU Doctoral Presentation -March 2024 slides.pptx
CapTechU Doctoral Presentation -March 2024 slides.pptxCapitolTechU
 
ARTICULAR DISC OF TEMPOROMANDIBULAR JOINT
ARTICULAR DISC OF TEMPOROMANDIBULAR JOINTARTICULAR DISC OF TEMPOROMANDIBULAR JOINT
ARTICULAR DISC OF TEMPOROMANDIBULAR JOINTDR. SNEHA NAIR
 
How to Send Emails From Odoo 17 Using Code
How to Send Emails From Odoo 17 Using CodeHow to Send Emails From Odoo 17 Using Code
How to Send Emails From Odoo 17 Using CodeCeline George
 
How to Make a Field read-only in Odoo 17
How to Make a Field read-only in Odoo 17How to Make a Field read-only in Odoo 17
How to Make a Field read-only in Odoo 17Celine George
 
Ultra structure and life cycle of Plasmodium.pptx
Ultra structure and life cycle of Plasmodium.pptxUltra structure and life cycle of Plasmodium.pptx
Ultra structure and life cycle of Plasmodium.pptxDr. Asif Anas
 
Over the counter (OTC)- Sale, rational use.pptx
Over the counter (OTC)- Sale, rational use.pptxOver the counter (OTC)- Sale, rational use.pptx
Over the counter (OTC)- Sale, rational use.pptxraviapr7
 
EBUS5423 Data Analytics and Reporting Bl
EBUS5423 Data Analytics and Reporting BlEBUS5423 Data Analytics and Reporting Bl
EBUS5423 Data Analytics and Reporting BlDr. Bruce A. Johnson
 
Work Experience for psp3 portfolio sasha
Work Experience for psp3 portfolio sashaWork Experience for psp3 portfolio sasha
Work Experience for psp3 portfolio sashasashalaycock03
 
Easter in the USA presentation by Chloe.
Easter in the USA presentation by Chloe.Easter in the USA presentation by Chloe.
Easter in the USA presentation by Chloe.EnglishCEIPdeSigeiro
 
How to Solve Singleton Error in the Odoo 17
How to Solve Singleton Error in the  Odoo 17How to Solve Singleton Error in the  Odoo 17
How to Solve Singleton Error in the Odoo 17Celine George
 
The Stolen Bacillus by Herbert George Wells
The Stolen Bacillus by Herbert George WellsThe Stolen Bacillus by Herbert George Wells
The Stolen Bacillus by Herbert George WellsEugene Lysak
 

Último (20)

Optical Fibre and It's Applications.pptx
Optical Fibre and It's Applications.pptxOptical Fibre and It's Applications.pptx
Optical Fibre and It's Applications.pptx
 
SOLIDE WASTE in Cameroon,,,,,,,,,,,,,,,,,,,,,,,,,,,.pptx
SOLIDE WASTE in Cameroon,,,,,,,,,,,,,,,,,,,,,,,,,,,.pptxSOLIDE WASTE in Cameroon,,,,,,,,,,,,,,,,,,,,,,,,,,,.pptx
SOLIDE WASTE in Cameroon,,,,,,,,,,,,,,,,,,,,,,,,,,,.pptx
 
Slides CapTechTalks Webinar March 2024 Joshua Sinai.pptx
Slides CapTechTalks Webinar March 2024 Joshua Sinai.pptxSlides CapTechTalks Webinar March 2024 Joshua Sinai.pptx
Slides CapTechTalks Webinar March 2024 Joshua Sinai.pptx
 
March 2024 Directors Meeting, Division of Student Affairs and Academic Support
March 2024 Directors Meeting, Division of Student Affairs and Academic SupportMarch 2024 Directors Meeting, Division of Student Affairs and Academic Support
March 2024 Directors Meeting, Division of Student Affairs and Academic Support
 
AUDIENCE THEORY -- FANDOM -- JENKINS.pptx
AUDIENCE THEORY -- FANDOM -- JENKINS.pptxAUDIENCE THEORY -- FANDOM -- JENKINS.pptx
AUDIENCE THEORY -- FANDOM -- JENKINS.pptx
 
Riddhi Kevadiya. WILLIAM SHAKESPEARE....
Riddhi Kevadiya. WILLIAM SHAKESPEARE....Riddhi Kevadiya. WILLIAM SHAKESPEARE....
Riddhi Kevadiya. WILLIAM SHAKESPEARE....
 
How to Manage Cross-Selling in Odoo 17 Sales
How to Manage Cross-Selling in Odoo 17 SalesHow to Manage Cross-Selling in Odoo 17 Sales
How to Manage Cross-Selling in Odoo 17 Sales
 
What is the Future of QuickBooks DeskTop?
What is the Future of QuickBooks DeskTop?What is the Future of QuickBooks DeskTop?
What is the Future of QuickBooks DeskTop?
 
KARNAADA.pptx made by - saransh dwivedi ( SD ) - SHALAKYA TANTRA - ENT - 4...
KARNAADA.pptx  made by -  saransh dwivedi ( SD ) -  SHALAKYA TANTRA - ENT - 4...KARNAADA.pptx  made by -  saransh dwivedi ( SD ) -  SHALAKYA TANTRA - ENT - 4...
KARNAADA.pptx made by - saransh dwivedi ( SD ) - SHALAKYA TANTRA - ENT - 4...
 
CapTechU Doctoral Presentation -March 2024 slides.pptx
CapTechU Doctoral Presentation -March 2024 slides.pptxCapTechU Doctoral Presentation -March 2024 slides.pptx
CapTechU Doctoral Presentation -March 2024 slides.pptx
 
ARTICULAR DISC OF TEMPOROMANDIBULAR JOINT
ARTICULAR DISC OF TEMPOROMANDIBULAR JOINTARTICULAR DISC OF TEMPOROMANDIBULAR JOINT
ARTICULAR DISC OF TEMPOROMANDIBULAR JOINT
 
How to Send Emails From Odoo 17 Using Code
How to Send Emails From Odoo 17 Using CodeHow to Send Emails From Odoo 17 Using Code
How to Send Emails From Odoo 17 Using Code
 
How to Make a Field read-only in Odoo 17
How to Make a Field read-only in Odoo 17How to Make a Field read-only in Odoo 17
How to Make a Field read-only in Odoo 17
 
Ultra structure and life cycle of Plasmodium.pptx
Ultra structure and life cycle of Plasmodium.pptxUltra structure and life cycle of Plasmodium.pptx
Ultra structure and life cycle of Plasmodium.pptx
 
Over the counter (OTC)- Sale, rational use.pptx
Over the counter (OTC)- Sale, rational use.pptxOver the counter (OTC)- Sale, rational use.pptx
Over the counter (OTC)- Sale, rational use.pptx
 
EBUS5423 Data Analytics and Reporting Bl
EBUS5423 Data Analytics and Reporting BlEBUS5423 Data Analytics and Reporting Bl
EBUS5423 Data Analytics and Reporting Bl
 
Work Experience for psp3 portfolio sasha
Work Experience for psp3 portfolio sashaWork Experience for psp3 portfolio sasha
Work Experience for psp3 portfolio sasha
 
Easter in the USA presentation by Chloe.
Easter in the USA presentation by Chloe.Easter in the USA presentation by Chloe.
Easter in the USA presentation by Chloe.
 
How to Solve Singleton Error in the Odoo 17
How to Solve Singleton Error in the  Odoo 17How to Solve Singleton Error in the  Odoo 17
How to Solve Singleton Error in the Odoo 17
 
The Stolen Bacillus by Herbert George Wells
The Stolen Bacillus by Herbert George WellsThe Stolen Bacillus by Herbert George Wells
The Stolen Bacillus by Herbert George Wells
 

Chapter05

  • 2. Learning Objectives • Explain how a Web search engine works • Find information by using a search engine • Use logical operators, filtering to express complex queries • Recognize and find authoritative information sources • Decide whether Web information is truth or fiction Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 3. Web Search Fundamentals • A search engine is a collection of computer programs that help us find information on the Web • No one organizes the information posted on the Web • Search Engines must look around to find out what’s out there and then organize what is found Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 4. How a Search Engine Works • The first step, crawling, visits every Web page that it can find • How are the pages found? – The crawler has a todo list that is loaded with a set of pages to start – When a URL is found while crawling a page, it adds that URL to the todo list • The main work of the crawler is to build an index Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 5. Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 6. How a Search Engine Works • The index is a list of tokens (or words) that are associated with the page • The token might be part of the page’s title • There are other ways for a token to be associated with a page • For each token, the crawler creates a list of the URLs associated with that token Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 7. How a Search Engine Works • The second step is query processing • The user presents tokens (aka search terms) to the query processor • The search engine then looks up the word in the index and returns to you a hit list • By creating the index ahead of time, search engines are able to answer user queries very quickly Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 8. Multiword Searches • With a multiple-word query, the pages returned should be appropriate for all of the queried words • AND-query – Each page returned should be associated with all the words – There is no index entry corresponding to a set – There is only a list for the individual words Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 9. Intersecting Queries • For multiple words, the query processor fetches the index lists for each of the terms • URLs that are in all of the lists are looked at and compared • The query processor intersects the lists • The URL lists are alphabetized to speed up the processing … it is easier to notice when the same URL is on multiple lists Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 10. Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 11. Rules for Intersecting Alphabetized Lists • To intersect several alphabetized lists: 1. Put a marker (arrow) at the start of each token’s index list. 2. If all markers point to the same URL, save it, because all tokens are associated with the page. 3. Move the marker(s) to the next position for whichever URL is earliest in the alphabet. 4. Repeat Steps 2–3 until some marker reaches the end of the list. Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 12. Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 13. Power of an Indexed Search • The computer: – takes the time to crawl the data (Web pages) – build an index first – find the index entries for each word – intersect the lists to find the information for an AND-query • Search engines can look at billions of Web pages and return an answer in less than a fifth of a second Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 14. Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 15. Descriptive Terms • ―Hits‖ on a page means the search term is ―associated‖ with the page • This does not mean the word is ―on‖ the page • Web page structure helps a lot to identify descriptive text Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 16. Descriptive Terms • Descriptive text: – Title—The <title> encloses a short phrase describing the whole page – Anchor text—The highlighted link text, inside <a . . . > tags, describes the page it links to – Meta—A <meta . . . > tag in the head section can hold a several sentence description of the page – Alt attributes—The <img . . . > tag has an alt attribute that gives a textual description Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 17. Page Rank • Why, when the hit list is returned, the page you’re looking for is often first on the hit list or in the top 10? • The order in which hits are returned to a query is determined by a number called the PageRank • The higher the PageRank, the closer to the top of the list Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 18. Links to Other Pages • Google pioneered page ranking as a way to determine which pages are likely to be most important • PageRanking works like a voting system: – If page A links to page B, A’s link adds to B’s importance • Pages that are linked-to by many pages have a higher page ranking and are assumed to be more important Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 19. Links to Other Pages • Links from pages with a high page ranking are also viewed as more • PageRank is computed by the crawler: – The crawler looks at page A – It notices the links to page B – It scores one for B • Counting the number of links to a page is not sufficient Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 20. Links to Other Pages • After the crawling is completed, the PageRank computation is completed • The query processor is puts together the hit list • The URLs are sorted by their page ranking, highest to lowest, and returned in that order Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 21. Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 22. Advanced Searches human AND powered AND flight • The Logical Operator AND – Basic queries are AND-queries – All words given must be associated with the page for a hit – The word AND is a logical operator – Logical operators specify a logical relationship between the words it connects Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 23. Advanced Searches human AND powered AND flight • The Logical Operator AND – Search engines treat search words as three independent words – The words can appear anywhere on the page in any order – Use of quotes mean the exact phrase must appear as given…this is more than an AND query Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 24. Complex Queries marshmallow OR strawberry OR chocolate • Another logical operator is OR – OR-queries hit on pages that are associated with at least of the words Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 25. Complex Queries tigers AND NOT baseball • Another logical operator is NOT – NOT queries specify words that are not to be associated with the page – AND is included because we want both requirements to be true Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 26. Combining Logical Operators (marshmallow OR strawberry) AND sundae • The logical operators work like arithmetic • They can be combined and grouped using parentheses • Google also uses a minus (–) as an abbreviation for NOT Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 27. Restricting Global Search • Many sites offer the opportunity to perform a site search • A site search means looking only on the current site • The site search is usually offered on the homepage with a search window and a Go button Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 28. Filtered Searches • Constraints can be used to help pinpoint specific pages • Using these areas on the advanced search, can limit our search hits Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 29. Filtered Searches • Site searches don’t necessarily use PageRanking to order the hits • Use of the advanced search filters can be used to get PageRanked hits Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 30. Web Searching • Research on the web requires serious consideration and strategies: 1. Selecting Search Terms choosing good words to include in a query 2. The Anatomy of a Hit how to use the information returned 3. Using the Hit List skimming to find what you want quickly 4. Once You Find a Likely Page locating the desired data on the page Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 31. 1. Selecting Search Terms • How do you find the best search terms? • It is a sequence of finding ever-more precise terms: I. II. III. IV. V. VI. Advanced Search General Topic Descriptive Terms Refining (Adding Words) Avoiding Over Constraining Removing Words Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 32. 1. Selecting Search Terms I. Use Advanced Search – Google’s Advanced Search gives control over the results returned – You can do complex queries, but Advanced Search provides control Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 33. 1. Selecting Search Terms II. Begin with the General Topic – Words have multiple meanings – Giving the topic can eliminate most of those conflicting words – Many hits can be eliminated – Start with the topic word(s) Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 34. 1. Selecting Search Terms III. Choose Descriptive Words – Be picky about the words we select – Select precise terms – Take care using terms that have many other meanings—they become less useful Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 35. 1. Selecting Search Terms IV. Refine by Adding Words – Begin with a ―first guess‖ search – Check the results – Check the initial hit list to see what you’ve found – The initial list often suggests additional terms to search – This may require several rounds of adding a word to the query each time Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 36. Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 37. 1. Selecting Search Terms V. Avoid Over Constraining – Adding more words one at a time works best because interesting pages might be accidentally overlooked – Care is required in this process: Add a word only if you are sure the pages you want will have it Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 38. 1. Selecting Search Terms VI. Remove Specific Words (Minus or NOT) – It is useful to consider eliminating pages with certain words – It is the opposite of adding words that may/will appear on a page – The minus sign is a good way to eliminate wrong interpretations of words Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 39. 2. Anatomy of a Hit • What is displayed with each hit? – Title • This is the text between the page’s <title> and </title> tags. – Snippet • This information is a preview of what might be on the page • Usually a short phrase from the page containing one or more of the searched words Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 40. 2. Anatomy of a Hit • What is displayed with each hit? – URL • The URL that is linked from the title line – Site Links • These are useful links from the site, which are basically shortcuts • These links are found algorithmically • They are not ―sponsored‖ links – no one is paying for them Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 41. 3. Using the Hit List • Checking the hits is a process of filtering • Skim the top level of information, and look deeper if it looks promising • If not, continuing skimming Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 42. 3. Using the Hit List • ―Looking deeper‖ ranks the information: – Title • It’s the first source of information • When the title checks out, look at the . . . – Snippet • Search terms are shown in bold • The context around the occurrence is given • If the snippet checks out, look at the . . . Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 43. 3. Using the Hit List • ―Looking deeper‖ ranks the information: – URL • The domain of the site hosting the page is given • The site name is the first check of how authoritative the information is • If the URL checks out, look at the . . . – Pageitself • At this point, there is some likelihood that the page includes information you need Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 44. 4. Once You Find a Likely Page • A page has been found! – A suggestive title – A promising snippet – A reliable domain • How close are we to what is wanted? – First, ―roll through the page‖ checking out its main features – Next, look for a date to see how current it is – Finally, find the location of one of the keywords Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 45. 4. Once You Find a Likely Page • The goal is still to determine whether you want to stay on the page. – If you’ve found what you want, you’re done – If not, return to the hit list • If the site is good, decide how important it is that the information is perfectly correct: • If the importance is high, cross-check the information with another site • Find corroborating information! Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 46. Authoritative Information • Don’t Believe Everything You Read – No one in charge of the WWW, so no one checks to see that the information is correct – Some information is inaccurate – Most information on the Web should be considered suspect – Anyone can post a Web page and make random statements or claims…true, partly true, and completely false Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 47. Wikipedia • Wikipedia is an open source document created by knowledgeable and community-minded Internet users • Anyone can contribute to Wikipedia • Wikipedia covers an enormous number of topics • The information contained might not be included in a printed encyclopedia Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 48. Wikipedia • Its coverage and timeliness make it a valuable resource • The fact that many people contribute to Wikipedia is both a weakness and a strength – Anyone can add anything to or edit an entry without control – Use Wikipedia as a starting point for research Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 49. Using the Web for Research • Applying good research practices will ensure highly reliable information: – Question the information • Does it make sense? Is it believable? Is it consistent? Does the information fit with everything you already know? – Never rely on a single source; always use multiple sources. – Assess the site’s authoritativeness – Vary the kinds of resources you use, including off-line resources. Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 50. What Is Authoritative? • Authoritative means that we are looking for what experts say – We assume that experts are well informed on the topic – We assume that what experts say is true • There is no way to verify that what ―they‖ say is true, so we accept that it’s the best available information Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 51. Respected Sources • Authoritative information can be gotten from respected organizations • Thousands of professional organizations host trustworthy Web sites • Individuals are also good sources of information as long as we have some reason to believe them Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 52. Primary Sources • A primary source is a person who has direct knowledge of the information • People who interview primary sources are secondary sources – They are not as reliable as primary sources • People who watch journalists on TV or read newspaper reports are tertiary sources – They are three levels removed from direct knowledge of the event Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 53. Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 54. Primary Sources • A secondary source does not mean that the information is wrong, only that the possibility exists • There is no guarantee that even primary sources tell the truth Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 55. Authoritative Sources • The easiest way to get authoritative information is to go to a site that you know to be authoritative • Many agencies and organizations publish information you can depend on • By going directly to an authoritative source, you are ensured that the information is reliable Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 56. Site Analysis… Good? Bad? • Possible issues for bad sites: – – – – – – Broken links Failure to give contact information Failure to have a non-Web identity Simplistic design No recent updates or blog entries Spelling mistakes • Legitimate sites may have these issues, too! • To decide if a site is legitimate, do a little direct research on the information found there Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 57. Summary • In this chapter we have learned the following: – We need software and our own intelligence to search the Internet effectively. – Search engines are composed of a crawler and a query processor. – We create queries using the logical operators AND, OR, and NOT, and specific terms to pinpoint the information we seek. – Filtering and ―subtracting‖ search terms removes extraneous hits. Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
  • 58. Summary • In this chapter we have learned the following: – Once we’ve found information, we must judge whether it is correct by investigating the organization that publishes the page, and examining the facts claimed on the page. – We must cross-check the information with other sources, especially when the information is important. Copyright © 2013 Pearson Education, Inc. Publishing as Pearson Addison-Wesley