6. • Google.com indexes 40 billion public web pages.
• 100+ billion static web pages are publicly available. These pages can
easily be found by Google and other search engines.
• 11+ billion static pages are hidden from the public. As private intranet
content, these are the corporate pages that are only open to
employees of specific companies.
• 450+ billion database-driven pages are completely invisible to
Google.
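As a rough check on these figures, the sketch below works out what share of all pages Google's index would cover. It assumes the three page counts above are the whole web and treats the "+" figures as exact, so the result is only an upper estimate; the variable names are illustrative.

```python
# Figures from the slide, in billions of pages (the "+" lower bounds are
# treated as exact values here, which is an assumption).
indexed_by_google = 40
public_static = 100
private_static = 11
database_driven = 450

# Assumed total: the sum of the three categories listed on the slide.
total_pages = public_static + private_static + database_driven

# Fraction of that total which Google's index would cover.
google_share = indexed_by_google / total_pages

print(f"Total pages: {total_pages} billion")
print(f"Share indexed by Google: {google_share:.1%}")
```

On these numbers, Google's index covers well under a tenth of the total.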
7. Beyond Google
• Google Scholar
• Metasearch engines:
o Yippy – n.b. Shakespeare
o Dogpile
• Invisible (or Deep) Web:
o Databases:
• JSTOR
• Expanded Academic
• Project Muse
• The Literary Encyclopedia
• Austlit n.b. Indigenous Literature
o E-books:
• Cambridge companions:
• Invisible web search engines
8. • Search for scholarly literature.
• Articles, theses, books, abstracts
• From academic publishers, professional societies,
online repositories, universities and other websites.
13. Why Yippy?
• Metasearch: searches several top search engines
• Combines results based on comparative ranking
• Best results go to the top
• ‘Clouds’ group similar results
• Yippy does not track or sell your personal information
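To make "combines results based on comparative ranking" concrete, here is a minimal sketch of one common way to merge ranked lists, reciprocal rank fusion. This is an illustrative technique, not Yippy's actual algorithm, which the slides do not describe. The function name `fuse_rankings`, the constant `k=60`, and the example URLs are all assumptions.

```python
from collections import defaultdict


def fuse_rankings(rankings, k=60):
    """Merge several ranked result lists with reciprocal rank fusion.

    Each result earns 1 / (k + position) from every engine that returns it,
    so results ranked highly by several engines rise to the top.
    """
    scores = defaultdict(float)
    for results in rankings:
        for position, url in enumerate(results, start=1):
            scores[url] += 1.0 / (k + position)
    # Highest combined score first; ties broken alphabetically.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical result lists from three search engines for the same query.
engine_a = ["a.example/hamlet", "b.example/macbeth", "c.example/othello"]
engine_b = ["b.example/macbeth", "a.example/hamlet", "d.example/lear"]
engine_c = ["b.example/macbeth", "c.example/othello", "a.example/hamlet"]

merged = fuse_rankings([engine_a, engine_b, engine_c])
for rank, url in enumerate(merged, start=1):
    print(rank, url)
```

A page that two engines rank first outranks a page that only one engine ranks first, which matches the "best results go to the top" idea.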
15. Domain names
.ac = academic
.au = Australia
.com = commercial
.edu = educational institution
.edu.au = Australian educational institution
.gov = government
.org = organisation, e.g. Association of Professional Engineers, Scientists and
Managers, Australia
.uk = UK
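The table above can be applied mechanically when judging where a page comes from. The following sketch reads the trailing labels of a hostname and looks each one up. It covers only the suffixes listed on this slide, and the function name `describe_domain` and the example hostnames are hypothetical.

```python
# Suffix meanings copied from the slide; this is not a complete list.
SUFFIX_MEANINGS = {
    "ac": "academic",
    "au": "Australia",
    "com": "commercial",
    "edu": "educational institution",
    "gov": "government",
    "org": "organisation",
    "uk": "UK",
}


def describe_domain(hostname):
    """Return the meanings of the known suffix labels at the end of a hostname."""
    labels = hostname.lower().rstrip(".").split(".")
    meanings = []
    # Walk from the right-hand end and stop at the first unknown label.
    for label in reversed(labels):
        if label not in SUFFIX_MEANINGS:
            break
        meanings.append(SUFFIX_MEANINGS[label])
    # Restore left-to-right order, e.g. "edu" before "au".
    return list(reversed(meanings))


for host in ["www.example.edu.au", "www.example.ac.uk", "example.org"]:
    print(host, "->", describe_domain(host))
```

Reading from the right matters because the country code comes last, so .edu.au is an educational institution in Australia.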
17. What is the invisible web?
• Search engines index less than 10% of the web
• The rest is called the Invisible Web or the Deep Web
• Massive content hidden from search engines
18. Invisible (Deep) Web
• Library databases:
o Austlit
o JSTOR
o Expanded Academic ASAP
o Project Muse
• E-books:
o Cambridge Companions Online
• Invisible Web search engines:
o Complete Planet
o ScienceResearch.com
21. • Scholarly journals, news magazines & newspapers
• All academic disciplines
• 25,152,618 articles as at 24/10/13
• All published between 1980 and 2013
29. “One Search. Superior Science.”
• High-quality results from the Deep Web
• Slower than Google because it searches several search engines, then
collates and ranks the results
• For science and technology
According to Eric Schmidt of Google, every two days the human race now creates as much information as it did from the dawn of civilisation until 2003. That's about five exabytes of data every two days, for those of you keeping score. The challenge becomes not finding a scarce plant growing in the desert, but finding a specific plant growing in a jungle. We are going to need help navigating that information to find the thing we actually need.