When it comes to
many questions (products, science, etc.), I refer people to http://electronics.howstuffworks.com/
This fantastic site now has a new search engine.
When it comes to encyclopedia-type questions my next favorite referral is http://en.wikipedia.org/wiki/Main_Page
If you don’t like something in a Wiki module, you can change it yourself
from your browser. If you don’t
find a module, you can perform a service for the world by writing a module.
Sometimes wandering through the wilds
of Wikipedia can result in confusion. For Dennis Lorson, his wandering
led him to create this handy application. With Pathway 1.0.3 visitors
can retrace their own steps through Wikipedia by creating a graphical
network representation of article pages. It’s worth a try, and it will
work with all computers running Mac OS X 10.4.
CatsCradle 3.5 ---
http://www.stormdance.net/software/catscradle/overview.htm
Many websurfers enjoy going to sites that might be
based in other countries, and as such, they might very well encounter a
different language. With CatsCradle 3.5, these persons need worry no more,
as this application can be used to translate entire websites in such
languages as Thai, Chinese, Japanese, and Russian. This version is
compatible with all computers running Windows XP or 2000. (Scout
Report, September 1, 2006)
Congoo, a search engine launched this month and
partnered with Google, gives registered users free online access to a
selection of publications that normally require a subscription or a
pay-per-view fee to read. After downloading the Congoo plug-in and
registering, users can get access to "between four and 15 articles per
month per publisher." Publications available include the Encyclopaedia
Britannica Online, Financial Times, BusinessWire, Editor & Publisher,
The New Republic, The Boston Globe, The Chicago Tribune, The Denver
Post, The Philadelphia Inquirer and other major U.S. newspapers. Congoo
is available at http://www.congoo.com/.
Critics of Congoo note that many public
libraries, such as the San Francisco Public Library
(http://www.sfpl.org/sfplonline/dbcategories.htm),
also offer free access to subscription databases.
And your own college and university library may also have online
subscriptions that you can access at no additional fee.
No A grades for 83.33% of search engine users. They say they trust their favorite search engines, but
there’s a distressing lack of understanding of how engines rank and present
pages -- only 38 percent of users are aware of the distinction between paid or
“sponsored” results and unpaid results. And only one in six say they can
always tell which results are paid or sponsored and which are not. The
funny part about this last bit is that nearly half of users say they would stop
using search engines if they thought the engines were being unclear about how
they presented paid results.
David Appell, "Search Engines," MIT's Technology Review,
February 11, 2005 --- http://www.technologyreview.com/blog/blog.asp?blogID=1732&trk=nl
Borrowing a page from the popular video-sharing
site YouTube, a new online service lets people upload and share their papers
or entire books via a social-network interface. But will a format that works
for videos translate to documents?
It’s called
iPaper,
and it uses a Flash-based document reader that can be
embedded into a Web page. The experience of reading neatly formatted text
inside a fixed box feels a bit like using an old microfilm reader, except
that you can search the documents or e-mail them to friends.
The company behind the technology, Scribd, also
offers a
library of iPaper documents and invites users to
set up an account to post their own written works. And, just like on
YouTube, users can comment about each document, give it a rating, and view
related works.
Also like on YouTube, some of the most popular
items in the collection are on the lighter side. One document that is in the
top 10 “most viewed” is called
“It seems this essay was written while the guy was high, hilarious!”
It is a seven-page paper that appears to have been
written for a college course but is full of salty language. The document
includes the written comments of the professor who graded it, and it ends
with a handwritten note: “please see after class to discuss your paper.”
LocateTV will search over 3 million TV listings
across all channels in your area
Type in the name of a TV show, movie, or actor
Locate TV will find channels and times in your locale http://www.locatetv.com/
Songza
Search for a song or band and play the selection ---
http://songza.com/
I tried it for Arturo Toscanini, Stan Kenton, and Jim Reeves.
The results were absolutely amazing!
SpiralFrog.com, an ad-supported Web site with a terrible
name that allows visitors to download music and videos free of charge, launched
on September 17, 2007 in the U.S. and Canada after months of "beta"
testing. At launch, the service was offering more than 800,000 tracks and 3,500
music videos for download ---
http://www.spiralfrog.com/
Zaba Search is a free database of names, addresses, birth dates, and phone numbers.
Social security numbers and background checks are also available for a fee ---
http://www.zabasearch.com/
Google is a great search engine, but it's also more
than that. Google has tons of hidden features, some of which are quite fun
and most of which are extremely useful -- if you know about them. How do you
discover all these hidden features within the Google site?
See
http://www.informit.com/articles/article.asp?p=675528&rl=1
Amid the flurry of news over Microsoft's bid for Yahoo and Google's
rebuttal, a research announcement by Google went largely unnoticed. Last week, the search giant began a public
experiment in which users can make their search results look a little
different from the rest of the world's. Those who sign up are able to switch
between different views, so instead of simply getting a list of links (and
sometimes pictures and YouTube videos, a relatively recent addition to the
Google results), they can choose to see their results mapped, put on a
timeline, or narrowed down by informational filters. Dan Crow, product
manager at Google, says that the results of the experiment could eventually
help the company improve everyone's search experience.
Kate Greene, MIT's Technology Review, February 6, 2008 ---
http://www.technologyreview.com/Infotech/20162/?nlid=857
Jensen Comment
You can read more about this experiment at
http://www.google.com/experimental/index.html
Google added historic map overlays to its free interactive online globe of
the world to provide views of how places have changed with time.
"Google Earth maps history," PhysOrg, November 14, 2006 ---
http://physorg.com/news82706337.html
"Finding Yourself without GPS: Google's new
technology could enable location-finding services on cell phones that lack GPS,"
by Kate Greene, MIT's Technology Review, December 4, 2007 ---
http://www.technologyreview.com/Infotech/19809/?nlid=716&a=f
As more mobile phones tap into the Internet, people
increasingly turn to them for location-centric services like getting
directions and finding nearby restaurants. While Global Positioning System
(GPS) technology provides excellent accuracy, only a fraction of phones have
this capability. What's more, GPS coverage is spotty in dense urban
environments, and in-phone receivers can be slow and drain a phone's
battery.
To sidestep this problem, last week Google added a
new feature, called My Location, to its Web-based mapping service. My
Location collects information from the nearest cell-phone tower to estimate
a person's location within a distance of about 1,000 meters. This resolution
is obviously not sufficient for driving directions, but it can be fine for
searching for a restaurant or a store. "A common use of Google Maps is to
search nearby," says Steve Lee, product manager for Google Maps, who likened
the approach to searching for something within an urban zip code, but
without knowing that code. "In a new city, you might not know the zip code,
or even if you know it, it takes time to enter it and then to zoom in and
pan around the map."
Many phones support software that can read
the unique identification of a cell-phone tower; the coverage area that
surrounds a tower is usually split into three regions. Lee explains that My
Location uses such software to learn which tower is serving the phone--and
which coverage area the cell phone is operating in. Google also uses data
from cell phones in the area that do have GPS to help estimate the locations
of the devices without it. In this way, Google adds geographic information
to the cell-phone tower's identifiers that the company stores in a database.
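The estimation Lee describes can be sketched roughly as follows. The tower IDs, coordinates, and simple averaging below are illustrative assumptions for the sake of the example, not Google's actual method:

```python
# Estimate a phone's position from its serving cell tower, using GPS fixes
# previously reported by other phones on the same tower (illustrative only).

def estimate_location(tower_id, gps_reports):
    """Average the GPS fixes seen on a tower to approximate its coverage center."""
    fixes = [pos for tid, pos in gps_reports if tid == tower_id]
    if not fixes:
        return None  # no crowd-sourced data for this tower
    lat = sum(p[0] for p in fixes) / len(fixes)
    lon = sum(p[1] for p in fixes) / len(fixes)
    return (lat, lon)

# Hypothetical crowd-sourced reports: (tower_id, (latitude, longitude))
reports = [
    ("tower_42", (40.7580, -73.9855)),
    ("tower_42", (40.7590, -73.9845)),
    ("tower_07", (34.0522, -118.2437)),
]

print(estimate_location("tower_42", reports))
```

A real system would weight fixes by signal strength and recency, and the roughly 1,000-meter resolution quoted above reflects how coarse a single tower's coverage area is.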
People are flocking to online social networks.
Facebook,
for example, claims an average of 250,000 new
registrations per day. But companies are still hunting for ways to make
these networks more useful--and profitable. In the past year, Facebook has
introduced new services aimed at taking advantage of users' online contacts
(see "Building
onto Facebook's Platform"), and Yahoo announced
plans for an
e-mail service that shares data with
social-networking sites. (See "Yahoo's
Plan for a Smarter In-Box.") Now a company called
Delver,
which presented at
Demo
earlier this week, is working on a search engine that
uses social-network data to return personalized results from the larger Web.
Liad Agmon, CEO of Delver, says that the site
connects information about a user's social network with Web search results,
"so you are searching the Web through the prism of your social graph." He
explains that a person begins a search at Delver by typing in her name.
Delver then crawls social-networking websites for widely available data
about the user--such as a public
LinkedIn profile--and builds a network of
associated institutions and individuals based on that information. When the
user enters a search query, results related to, produced by, or tagged by
members of her social network are given priority. Lower down are results
from people implicitly connected to the user, such as those relating to
friends of friends, or people who attended the same college as the user.
Finally, there may be some general results from the Web at the bottom. The
consequence, says Agmon, is that each user gets a different set of results
from a given query, and a set quite different from those delivered by
Google.
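The prioritization Agmon describes can be sketched as a simple re-ranking pass over ordinary web results; the three tiers and the example data below are assumptions for illustration, not Delver's actual algorithm:

```python
# Re-rank web results by social distance: direct connections first,
# friends-of-friends next, then the general web (illustrative sketch).

def social_rank(results, friends, friends_of_friends):
    """Sort results so those produced or tagged by closer social ties come first."""
    def tier(result):
        author = result["author"]
        if author in friends:
            return 0  # produced/tagged by a direct connection
        if author in friends_of_friends:
            return 1  # implicitly connected, e.g. a friend of a friend
        return 2      # general result from the larger Web
    return sorted(results, key=tier)

results = [
    {"url": "http://example.com/a", "author": "stranger"},
    {"url": "http://example.com/b", "author": "alice"},
    {"url": "http://example.com/c", "author": "bob"},
]

ranked = social_rank(results, friends={"alice"}, friends_of_friends={"bob"})
print([r["url"] for r in ranked])
# alice's page first, then bob's, then the general web result
```

Because the sort key is the user's own graph, two users issuing the identical query get different orderings, which is the effect Agmon claims.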
"We have no intention of competing with the Googles
of the world, because Google is doing a very good job of indexing the Web
and bringing you the
Wikipedia
page of every search query you're looking for," says
Agmon. He says that Delver will free general search queries such as "New
York" or "screensaver" from the heavy search-engine optimization that tends
to make those kinds of queries return generic, ad-heavy results on Google.
"[As a user], you're always thinking, how can I trick Google into bringing
me the real results rather than the commercial results?" Agmon says. "With
this engine, we don't need to trick it at all. You can go back to these very
naive and simple queries because the results come from your network. Your
network is not trying to optimize results; they just publish or bookmark
pages which they find interesting." As a consequence, the results lean
toward user-generated content and items tagged through sites such as
del.icio.us.
March 18, 2008 (PC World) If
you dig around the Web long enough, you're bound to find
things somebody might not want you to know. (Maybe, like
me, you hang your laundry out in the backyard.) This
week I have a bunch of sites to help you dig up the dirt
and do some serious research.
Find the Dirt on Your Neighbor
With two free Web services, I found the address of a
neighbor, his first and last name, his phone number and
how much his home is worth. If
Zillow
would only update its images, I could even tell you if
he hangs his laundry out in the backyard.
I met a
neighbor while walking the dogs, and we chatted a while.
When I got home, I decided to pop something in the mail.
(It was some census tract stuff if you must know.) He
lives about two blocks down the road, but for the life
of me, I couldn't remember the guy's name or his street
address. Okay, sure, I could've just dropped by his
house. But what would I have to write about today, eh?
I popped
open Zillow and searched on my neighborhood until I
found the image of his house, then clicked on it. Zillow
told me lots of stuff about the value of his home. What
I needed--and got--was his street address.
Now that I had his street address, I went to the Reverse
Lookup tab at
411Locate, entered info in the
Reverse Address Lookup section, and got lucky. In a
second, I had Jess's name. You might not be so
fortunate--411Locate doesn't always come up with the
right name.
Dig This: Tempted to buy a set of those newfangled
color-pencil input devices? Be sure to
read the review first--it
details advanced features, usability, and, no surprise,
bugs.
Trulia's Hindsight: Watch Cities Grow
If
you enjoyed Zillow, you might also like
Trulia.
But there's more to this
real-estate site than you might expect. I was poking
around the other day and discovered
Trulia Hindsight, which shows
annual population growth in most parts of the U.S.
Once
you're on Trulia Hindsight, click on Plano, Texas.
You'll see a city map paint itself on the screen, and a timeline
at the bottom of the page will begin to advance. The map
begins to populate, showing how the area developed over
time.
Use the
contrast slider on the bottom right to adjust how much
of the background you want to see and the slider on the
bottom left to zoom in or out of the map.
Once you get your bearings, grab the timeline slider,
move it to the left, then slowly move it to the right.
Type a city and state into the search field at the top
to find your hometown. Unfortunately, the site doesn't
have data for every area. If your town isn't on Trulia's
radar, try
downtown Los Angeles.
Dig This: You've gotta watch
The Front Fell Off. My editor
started kvetching that while hilarious, it also looks
quite plausible. And she complained that the actors
aren't getting credit even though there are lots of
clips floating around the Internet. Okay, so here it
goes: The guys are Australian comedy team
Clarke and Dawe.
Top 5 Little-Known Research Web Sites
OWL, the Online Writing Lab,
lets you look up the whys and wherefores of grammar. The
Phrase Finder is a handy
thesaurus for phrases. Need a fact checker?
Refdesk.com has all the
facts--or links to them--you'll ever need. Visiting the
LibrarySpot is like walking into the local library and straight
into the reference room. The site's part of the
StartSpot Network, which includes HomeworkSpot and
MuseumSpot.
Dig This: Whenever I
go to CES in Las Vegas, my first stop is the craps table
for some fast action--and maybe a chance to make a
couple of bucks. Yet after watching these
videos of Texas Hold'em--the
game that "takes five minutes to learn and a lifetime to
master"--I may have to find a low-stakes game.
Dig This, Too: Need a change of pace? Try
Reel Fishing. You'll need
patience and a steady hand.
Digg is perhaps one of the web’s best known sites,
and it contains various content submitted by users from all over the world.
Dugg 1.1.5 is a tiny widget that can help Digg devotees (and Digg neophytes)
search and find content on Digg quickly. Visitors can view stories for
specific topics or users and also check out what friends might be “digging”.
This version of Dugg is compatible with computers running Mac OS X 10.3.
Question
What does Walt Mossberg think about the Ask3D search engine?
But Ask's new system, called "Ask3D," is a much
bolder and better advance in unifying different kinds of results and presenting
them in a more effective manner. It shows, once again, that Ask places a higher
priority than its competitors do on making search results easy to navigate and
use. Both new systems are now the defaults on the search sites. You don't have
to do anything special to use them. Indeed, Google's change is so subtle you may
not even notice it for some searches.
Walter S. Mossberg, "Ask.com Takes Lead In Designing Display Of Search Results,"
The Wall Street Journal, June 28, 2007; Page B1 ---
http://online.wsj.com/article/SB118298543501150751.html
Apple Macintosh - Search for all things Mac
BSD Unix - Search web pages about the BSD operating
system
Linux - Search all penguin-friendly pages
Microsoft - Search Microsoft-related pages
U.S. Government - Search all U.S. federal, state and
local government sites
Universities - Search a specific school's website
In addition
to providing easy access to billions of web pages, Google has many
special features to help you find exactly what you're looking
for. Click the title of a specific feature to learn more about it.
• Book Search --- Use Google to search the full text of books.
• Cached Links --- View a snapshot of each page as it looked when we indexed it.
• Calculator --- Use Google to evaluate mathematical expressions.
• Currency Conversion --- Easily perform any currency conversion.
• Definitions --- Use Google to get glossary definitions gathered from various online sources.
• File Types --- Search for non-HTML file formats including PDF documents and others.
• Froogle --- To find a product for sale online, use Froogle, Google's product search service.
• Groups --- See relevant postings from Google Groups in your regular web search results.
• I'm Feeling Lucky --- Bypass our results and go to the first web page returned for your query.
• Images --- See relevant images in your regular web search results.
• Local Search --- Search for local businesses and services in the U.S., the U.K., and Canada.
• Movies --- Use Google to find reviews and showtimes for movies playing near you.
• Music Search --- Use Google to get quick access to a wide range of music information.
• News Headlines --- Enhances your search results with the latest related news stories.
• PhoneBook --- Look up U.S. street address and phone number information.
• Q&A --- Use Google to get quick answers to straightforward questions.
• Refine Your Search (New!) --- Add instant info and topic-specific links to your search in order to focus and improve your results.
• Results Prefetching --- Makes searching in Firefox faster.
• Search By Number --- Use Google to access package tracking information, US patents, and a variety of online databases.
• Similar Pages --- Display pages that are related to a particular result.
• Site Search --- Restrict your search to a specific site.
• Spell Checker --- Offers alternative spellings for queries.
• Stock and Fund Quotes --- Use Google to get up-to-date stock and mutual fund quotes and information.
• Street Maps --- Use Google to find U.S. street maps.
• Travel Information --- Check the status of an airline flight in the U.S. or view airport delays and weather conditions.
• Weather --- Check the current weather conditions and forecast for any location in the U.S.
• Web Page Translation --- Provides access to web pages in other languages.
The yet-to-be-developed technology detailed in the
patent application carries serious implications for the future of search
technology, particularly in regard to the Google Book Search project.
What could that mean for the future of academic
research and the role of libraries? In an interview, Wendy P. Lougee,
University of Minnesota librarian, frames the would-be technology in the
context of “discoverability” — the ease with which an item can be found
through a search.
“With respect to images, the challenges have been
in the metadata,” or the data that contextualizes items in a database, she
says, and the potential technology “could significantly enhance” librarians’
ability to catalogue and retrieve information.
A new application lets Facebook users start their
library research in the popular social-networking system. The
plug-in
provides an interface in Facebook for searching the
popular Worldcat database, operated
by the nonprofit OCLC. The group’s Web site says
the index includes more than a billion items in more than 10,000 libraries.
So far the application does not seem to be listed
in Facebook’s official directory. But a quick search of Facebook’s other
applications shows that more than a dozen other academic libraries have
created their own search tools for the social-networking platform. The
University of Notre Dame
has one, for instance, as does
Elmhurst College,
Pace University, and
Ryerson University. JSTOR,
the popular, nonprofit digital archive of scholarly publications, also
offers
a Facebook application.
One thing I discovered when
I invited Wired Campus readers to join my Facebook friend group
is that librarians are some of the most enthusiastic
nonstudent users of social networks. But can Facebook, known as a place for
socializing, become part of the research process as well?
The University of Illinois at Urbana-Champaign announces
the availability of a newly-digitized collection of Abraham Lincoln books
accessible through the Open Content Alliance and displayed on the University
Library's own web site, as the first step of a digitization project of
Lincoln books from its collection. View the first set of books digitized at:
http://varuna.grainger.uiuc.edu/oca/lincoln/
The University of California's eScholarship Repository has recently
exceeded five million full-text downloads, according to the university. The eScholarship Repository, a service of the
California Digital Library, allows scholars in the University of California
system to submit their work to a central location where any users may easily
access it free of charge. The idea is to ease communication between
researchers. Catherine Mitchell, acting director of the CDL publishing
group, says the number shows that both content seekers and creators have
embraced the service, allaying concerns among researchers that others
wouldn't contribute to the repository.
Hurley Goodall, Chronicle of Higher Education, January 16, 2008 ---
http://chronicle.com/wiredcampus/index.php?id=2667&utm_source=wc&utm_medium=en
How It Works ---
http://snurl.com/BookSearch
A significant extension of our groundbreaking Look Inside the Book
feature, Search Inside the Book allows you to search millions of pages
to find exactly the book you want to buy. Now instead of just displaying
books whose title, author, or publisher-provided keywords that match
your search terms, your search results will surface titles based on
every word inside the book. Using Search Inside the Book is as simple as
running an Amazon.com search.
A new search engine from TigerLogic Corporation, of
Irvine Calif., is being pushed to scholars and researchers, among others.
Called ChunkIt, the search engine refines results from
other search engines and databases, and displays chunks of text surrounding the
key words. In one of the company's
promotional videos, shown below, a stressed-out
college student uses ChunkIt to narrow a search on the Russian Revolution via
the Lexis/Nexis database. The student sports an Oberlin College sweatshirt and
gripes about meeting a deadline for a research paper in two hours. Steven J.
Bell, a research librarian at Temple University,
picks apart the video on a blog from the
Association of College and Research Libraries, noting that it gives short shrift
to the skills of librarians. He questions why the student would need ChunkIt to
refine his search when Lexis/Nexis already has tools available to narrow search
results. His conclusion? ChunkIt is appropriate for use with other search
engines like Google, but not with library databases.
Andrea L. Foster, Chronicle of Higher Education, July 15, 2008 ---
http://chronicle.com/wiredcampus/index.php?id=3166&utm_source=wc&utm_medium=en
Experts vs. Amateurs Searching the Web
The credibility war rages on in the world of Web 2.0.
Those who say information provided by Internet
research tools needs to be vetted have
made their case in several ways.
Knol, for example, appears to be Google's answer to
Wikipedia. And for now, while the project is under development, authors can
contribute content by invitation only. The plan is to let users rank the wheat
among the chaff; the highest-ranking articles would pop up first in a Google
search. A clear example is
Mahalo. It's essentially a search engine run by
staff members, who hand-pick links for popular search terms. That's a familiar
concept for
academic libraries. There
is resistance to the idea that experts have lost their place in the
indiscriminate, user-generated Web 2.0. John Connell, an education-business
manager at Cisco Systems, writes in his
blog that experts and laymen can coexist on the
Web: "We are not dealing with a zero-sum game of any kind -- the rise of one
source of information does not (necessarily) cause the dissipation of another.
Why then do those who espouse the ‘cult of the expert,’ for want of a better
term, feel it necessary not just to have access to the authoritative information
(in their terms) that they seek, but to deny those who want access to the ...
trivial information they want? "It is elitism, pure and simple." The question
is, do users need someone else to filter information for them? We know from past
reports that the
"Google Generation"has a hard time sorting the
relevant from the trivial. But isn't it better to teach them how?
Hurley Goodall, Chronicle of Higher Education, March 14, 2008 ---
http://chronicle.com/wiredcampus/index.php?id=2818&utm_source=wc&utm_medium=en
Is banning of Wikipedia/Google
for coursework both stupid and wasted effort?
Some professors
ban their students from citing Wikipedia
in papers. Tara Brabazon of the University of Brighton, bars her students from
using not only Wikipedia, but Google as well, The Times
of London reported. Google is “white bread for the mind,”
Brabazon said. “Google offers easy answers to difficult questions. But students
do not know how to tell if they come from serious, refereed work or are merely
composed of shallow ideas, superficial surfing and fleeting commitments,” she
said. “Google is filling, but it does not necessarily offer nutritional
content.” Inside Higher
Education, January 14, 2008 ---
http://www.insidehighered.com/news/2008/01/14/qt
"The University of Google," by Andrea L.
Foster, Chronicle of Higher Education, January 17, 2008 ---
Click Here
Tara Brabazon, professor of media studies at
Britain’s University of Brighton, was expected Wednesday to criticize Google
and what she sees as students’ over-reliance on the search engine and
Wikipedia in an inaugural lecture at the university. She calls the trend
“The University of Google,” according to an article Monday in The Times, and
labels the search engine “white bread for the mind.” The professor bans her
own students from using Wikipedia and Google in their first year of study.
A columnist for the paper responded in a piece that
accuses Ms. Brabazon of snobbery. “Curiosity, it seems, can only be
stimulated by trawling library shelves or by shelling out substantial
amounts of money,” he writes, sarcastically.
January 17, 2008 reply from Derek
Very interesting. I understand Brabazon’s point
about students’ over-reliance on Google and Wikipedia, but I don’t know if
banning those web sites helps to improve students’ information literacy. I
think students need to know how to use these kinds of web sites wisely.
If I can make a plug here, our teaching center just
started a new podcast series featuring interviews with faculty about issues
of teaching and learning. The first episode, available
here, features an interview with a
(Vanderbilt) history professor who uses Wikipedia to
teach the undergraduate history majors in his class how to think like
historians. He’s a great teacher and interviewee, and I think he offers an
effective way to use Wikipedia to help him accomplish his course goals.
Jensen Question
How will Professor Brabazon deal with the new and authoritative
Google Knol?
Jensen Comment
So how might a student find refereed journal or scholarly book references using
Wikipedia?
Most scholarly Wikipedia modules have footnotes and
references that can be traced back directly, leaving no evidence that you
ever went to Wikipedia.
For example, note the many scholarly references and links at
http://en.wikipedia.org/wiki/Jung
Don't overlook the Discussion tab in Wikipedia. Here's
where some information is turned into knowledge by scholars.
If there is not a footnote or a reference, look for a
unique phrase in Wikipedia and then insert that phrase in Google Scholar or
one of the other sites below:
For example,
Wikipedia describes how Jung proposed
spiritual guidance as treatment for chronic alcoholism ---
http://en.wikipedia.org/wiki/Jung#Spirituality_as_a_cure_for_alcoholism
Professor Brabazon might give a student an F grade for citing the above link.
Instead the student is advised to enter the phrase [ "Jung" AND "Alcoholism"
AND "Spiritual Guidance" ] into the exact phrase search box at
http://scholar.google.com/advanced_scholar_search?hl=en&lr=
Hundreds of scholarly references will emerge that Professor Brabazon will accept
as authoritative. But never mention to Professor
Brabazon that you got the idea for spiritual guidance
as a treatment of alcoholism from Wikipedia.
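For readers who like to script such lookups, the same advanced query can be assembled as a URL. The parameter names (as_q for all-words terms, as_epq for the exact phrase) match what Google Scholar's advanced-search form used at the time, but treat them as assumptions and verify against the live form:

```python
# Build a Google Scholar query URL for an exact-phrase search
# (parameter names observed on the advanced-search form; verify before relying on them).
from urllib.parse import urlencode

def scholar_url(all_words="", exact_phrase=""):
    params = {"hl": "en"}
    if all_words:
        params["as_q"] = all_words        # results must contain all of these words
    if exact_phrase:
        params["as_epq"] = exact_phrase   # results must contain this exact phrase
    return "http://scholar.google.com/scholar?" + urlencode(params)

url = scholar_url(all_words="Jung alcoholism", exact_phrase="spiritual guidance")
print(url)
```

Pasting the resulting URL into a browser reproduces the manual advanced-search steps described above.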
Also there's a question of how Professor Brabazon
will deal with the new Google Knol
"Google's Answer to Wikipedia: Google's Knol project aims to
make online information easier to find and more authoritative," MIT's Technology
Review, January 15, 2008 ---
http://www.technologyreview.com/Biztech/20065/?nlid=806
Google recently announced Knol, a new experimental
website that puts information online in a way that encourages authorial
attribution. Unlike articles for the popular online encyclopedia Wikipedia,
which anyone is free to revise, Knol articles will have individual authors,
whose pictures and credentials will be prominently displayed alongside their
work. Currently, participation in the project is by invitation only, but
Google will eventually open up Knol to the public. At that point, a given
topic may end up with multiple articles by different authors. Readers will
be able to
rate the articles, and the better an article's
rating, the higher it will rank in Google's search results.
Google coined the term "knol" to denote a unit of
knowledge but also uses it to refer to an authoritative Web-based article on
a particular subject. At present, Google will not describe the project in
detail, but Udi Manber, one of the company's vice presidents of engineering,
provided a cursory sketch on the company's blog site.
"A knol on a particular topic is meant to be the first
thing someone who searches for this topic for the first time will want to
read," Manber writes. And in a departure from Wikipedia's model of community
authorship, he adds that "the key idea behind the Knol project is to
highlight authors."
Noah Kagan,
founder of the premier conference about online communities,
Community Next,
sees an increase in authorial attribution as a change
for the better. He notes the success of the review site
Yelp,
which has risen to popularity in the relatively short span of three years.
"Yelp's success is based on people getting attribution for the reviews that
they are posting," Kagan says. "Because users have their reputation on the
line, they are more likely to leave legitimate answers." Knol also has
features intended to establish an article's credibility, such as references
to its sources and a listing of the title, job history, and institutional
affiliation of the author. Knol may thus attract experts who are turned off
by group editing and prefer the style of attribution common in journalistic
and academic publications.
Manber writes that "for many topics, there will
likely be competing knols on the same subject. Competition of ideas is a
good thing." But
Mark
Pellegrini, administrator and featured-article
director at Wikipedia and a member of its press committee, sees two problems
with this plan. "I think what will happen is that you'll end up with five or
ten articles," he says, "none of which is as comprehensive as if the people
who wrote them had worked together on a single article." These articles may
be redundant or even contradictory, he says. Knol authors may also have less
incentive to link keywords to competitors' articles, creating "walled
gardens." Pellegrini describes the effect thus: "Knol authors will tend to
link from their articles to other articles they've written, but not to
articles written by others."
Google, Inc. recently announced two new services as
part of its Google Research University program.
Google Search "is designed to give university
faculty and their research teams high-volume programmatic access to Google
Search, whose huge repository of data constitutes a valuable resource for
understanding the structure and contents of the web." For more information
and to register for the service, go to
http://research.google.com/university/search/
Google Translate "offers tools to help researchers
in the field of automatic machine translation compare and contrast with, and
build on top of, Google's statistical machine translation system." For more
information and to register for the service, go to http://research.google.com/university/translate/.
Flickr has
unveiled a new project, dubbed
The Commons,
which will
give Flickr members an
opportunity to browse and tag
photos from Library of Congress
archives. The goal is to create
what
Flickr
likes to call
an "organic information system,"
in other words, a searchable
database of tags that makes it
easier for researchers to find
images.
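The “organic information system” Flickr describes is, at bottom, an inverted index from tags to photos: each tag maps to the set of images that carry it, so researchers can look images up by subject. Here is a minimal sketch of that idea; the photo IDs and tags are made up for illustration, and this is not Flickr’s actual implementation:

```python
from collections import defaultdict

# Inverted index: tag -> set of photo IDs that carry the tag.
tag_index = defaultdict(set)

def add_tags(photo_id, tags):
    """Record a member's tags for one photo (case-insensitive)."""
    for tag in tags:
        tag_index[tag.lower()].add(photo_id)

def find(tag):
    """Return the photo IDs tagged with `tag`, in sorted order."""
    return sorted(tag_index.get(tag.lower(), set()))

# Hypothetical Library of Congress photos and member-supplied tags.
add_tags("loc_00142", ["Great Depression", "rural", "color"])
add_tags("loc_00587", ["New York City", "portrait"])
add_tags("loc_00201", ["rural", "migrant labor"])

print(find("rural"))  # ['loc_00142', 'loc_00201']
```

As more members tag the same photos, the index grows without any curation step, which is the “organic” part of the pitch.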
The pilot
project features a
small sampling
of the
Library of Congress’s roughly 14
million images. For now you’ll
find two collections. The first
is called “American Memory:
Color photographs from the Great
Depression” and features color
photographs of the Farm Security
Administration-Office of War
Information Collection including
“scenes of rural and small-town
life, migrant labor, and the
effects of the Great
Depression.”
The second collection is The
George Grantham Bain Collection
which features “photos produced
and gathered by George Grantham
Bain for his news photo service,
including portraits and
worldwide news events, but with
special emphasis on life in New
York City.” The Bain collection
images date from around
1900-1920.
In effect, the Library of Congress has become a
Flickr user, complete with its own photo stream.
And while it’s great to see these images
available to a much wider audience, we’re not so
sure how much it’s going to help researchers.
If you’re looking for historical photographs, do
you want to search through comments from
self-appointed experts criticizing the
composition skills of photography pioneers, or
adding the ever-insightful “wow”?
Then there are the inevitable comments
soliciting photos to be added to whatever banal
and increasingly inane groups and pools Flickr
members have come up with.
The tagging aspect will no doubt
produce something of value, but
pardon our cynicism, this may
well turn out to be a good test
of whether the positive aspects
of the Flickr community outweigh
the negative.
Google, Yahoo, Wikipedia, Open
Encyclopedia, and YouTube as
Knowledge Bases
A professor wrote to me drawing a fine line between
information and knowledge. Information is just organized data that can be
right or wrong or unknown in terms of being fact versus fiction. Knowledge
generally is information that is more widely accepted as being "true"
although academics generally hate the word "true" because it is either too
demanding or too misleading in terms of being set in stone. Generally
accepted "knowledge" can be proven wrong at later points in time just like
Galileo purportedly demonstrated that heavy balls fall at the same rate
as their lighter counterparts, thereby proving that what was generally
accepted knowledge until then was false. "Galileo
Galilei is said to have dropped two
cannon balls of different masses from the tower to demonstrate that
their descending
speed was independent of their
mass. This is
considered an apocryphal tale, and the only source for it comes from
Galileo's secretary." Quoted from
http://en.wikipedia.org/wiki/Leaning_Tower_of_Pisa#History
In my opinion there is a spectrum along the lines of data to
information to knowledge. Researchers attempt to add something new and
creative at any point along the spectrum. Scholars learn from most any point
on the spectrum and usually attempt to share their scholarship in papers,
books, Websites, blogs, and online or onsite classrooms.
The professor mentioned above then asserted that
Wikipedia
and YouTube were
information databases but not knowledge bases. He then mentioned the problem
of students knowing facts but not organizing these facts in a scholarly
manner. He conjectured that this was perhaps due to increased virtual
learning in their development. My December 5, 2007 reply to him was as
follows (off the cuff, so to speak).
Although I see your point about information versus knowledge, the
addition of the “Discussion tab” in Wikipedia changed the name of
the game. As “information” gets discussed and debated and critiqued
it’s beginning to look a whole lot more like knowledge in Wikipedia.
For example, note the Discussion tab at
http://en.wikipedia.org/wiki/Intelligent_Design
And
when UC Berkeley puts 177 science courses on YouTube (some of them
in biology), it’s beginning to look a lot more like YouTube
knowledge ---
http://www.jimmyr.com/free_education.php
With
respect to virtual learning, my best example is Stanford’s million+
dollar virtual surgery cadaver that can do more than a real cadaver.
For one thing it can have blood pressure such that a nicked artery
can hemorrhage. Learning throughout time is based on models and
simulations of sorts. Our models and simulations keep getting better
and better to a point where the line between the virtual and real worlds
becomes very blurred, much like pilots in virtual reality begin to
think they are in reality.
Much
depends on the purpose and goals of virtual learning. Sometimes
edutainment is important to both motivate and make learners more
attentive (like wake them up). But this also has drawbacks when it
makes learning too easy. I’m a strong believer in blood, sweat, and
tears learning ---
http://www.trinity.edu/rjensen/265wp.htm
When I put it into practice it was not popular with students of this
generation who want it to be easy.
You note that: “These students have prepared but it is poorly arranged,
planned, and articulated.” One thing we’ve noted in Student Managed Funds
(as in Phil Cooley’s course, where students actually control the
investments of a million dollars or more of Trinity University’s endowment)
is that requiring students to make presentations before the Board of
Trustees greatly improves their planning and articulation. You can read
more about this at the University of XXXXX (December 4) at
http://financialrounds.blogspot.com/
Note that the portfolios in these courses are not virtual
portfolios. They’re the real thing with real dollars! Students adapt
to higher levels of performance when the hurdles require higher
ordered performance.
Much
of the focus in metacognitive learning is how to examine/discover
what students have learned on their own and how to control cheating
when assessing discovery and concept learning ---
http://www.trinity.edu/rjensen/assess.htm
We
studied whether instructional material that connects accounting
concept discussions with sample case applications through hypertext
links would enable students to better understand how concepts are to
be applied to practical case situations.
Results from a laboratory experiment indicated that students who
learned from such hypertext-enriched instructional material were
better able to apply concepts to new accounting cases than those who
learned from instructional material that contained identical content
but lacked the concept-case application hyperlinks.
Results also indicated that the learning benefits of concept-case
application hyperlinks in instructional material were greater when
the hyperlinks were self-generated by the students rather than
inherited from instructors, but only when students had generated
appropriate links.
I look forward to your
writings on this subject when you get things sorted out. You’re a
good writer. Scientists aren't meant to be such good writers.
Anna Patterson's last Internet search engine
was so impressive that industry leader Google Inc. bought the technology
in 2004 to upgrade its own system.
She believes her latest invention is even more
valuable - only this time it's not for sale.
Patterson instead intends to upstage Google,
which she quit in 2006 to develop a more comprehensive and efficient way
to scour the Internet.
The end result is Cuil, pronounced "cool."
Backed by $33 million in venture capital, the search engine plans to
begin processing requests for the first time Monday.
Cuil had kept a low profile while Patterson,
her husband, Tom Costello, and two other former Google engineers -
Russell Power and Louis Monier - searched for better ways to search.
Now, it's boasting time.
Web index: For starters, Cuil's search index
spans 120 billion Web pages.
Patterson believes that's at least three times
the size of Google's index, although there is no way to know for
certain. Google stopped publicly quantifying its index's breadth nearly
three years ago when the catalog spanned 8.2 billion Web pages.
Ex-Googlers: Where are they now? Cuil won't
divulge the formula it has developed to cover a wider swath of the Web
with far fewer computers than Google. And Google isn't ceding the point:
Spokeswoman Katie Watson said her company still believes its index is
the largest.
After getting inquiries about Cuil, Google
asserted on its blog Friday that it regularly scans through 1 trillion
unique Web links. But Google said it doesn't index them all because they
either point to similar content or would diminish the quality of its
search results in some other way. The posting didn't quantify the size
of Google's index.
A search index's scope is important because
information, pictures and content can't be found unless they're stored
in a database. But Cuil believes it will outshine Google in several
other ways, including its method for identifying and displaying
pertinent results.
Content analysis: Rather than trying to mimic
Google's method of ranking the quantity and quality of links to Web
sites, Patterson says Cuil's technology drills into the actual content
of a page. And Cuil's results will be presented in a more magazine-like
format instead of just a vertical stack of Web links. Cuil's results are
displayed with more photos spread horizontally across the page and
include sidebars that can be clicked on to learn more about topics
related to the original search request.
Finally, Cuil is hoping to attract traffic by
promising not to retain information about its users' search histories or
surfing patterns - something that Google does, much to the consternation
of privacy watchdogs.
Cuil is just the latest in a long line of
Google challengers.
Other contenders: The list includes swaggering
startups like Teoma (whose technology became the backbone of Ask.com),
Vivisimo, Snap, Mahalo and, most recently, Powerset, which was acquired
by Microsoft Corp. (MSFT, Fortune 500) this month.
Even after investing hundreds of millions of
dollars on search, both Microsoft and Yahoo Inc. (YHOO, Fortune 500)
have been losing ground to Google (GOOG, Fortune 500). Through May,
Google held a 62% share of the U.S. search market followed by Yahoo at
21% and Microsoft at 8.5%, according to comScore Inc.
Google has become so synonymous with Internet
search that it may no longer matter how good Cuil or any other
challenger is, said Gartner Inc. analyst Allen Weiner.
"Search has become as much about branding as
anything else," Weiner said. "I doubt [Cuil] will be keeping anyone at
Google awake at night."
Google welcomed Cuil to the fray with its usual
mantra about its rivals. "Having great competitors is a huge benefit to
us and everyone in the search space," Watson said. "It makes us all work
harder, and at the end of the day our users benefit from that."
But this will be the first time that Google has
battled a general-purpose search engine created by its own alumni. It
probably won't be the last time, given that Google now has nearly 20,000
employees.
Patterson joined Google in 2004 after she built
and sold Recall, a search index that probed old Web sites for the
Internet Archive. She and Power worked on the same team at Google.
Although he also worked for Google for a short
time, Monier is best known as the former chief technology officer of
AltaVista, which was considered the best search engine before Google
came along in 1998. Monier also helped build the search engine on eBay's
(EBAY, Fortune 500) online auction site.
The trio of former Googlers are teaming up with
Patterson's husband, Costello, who built a once-promising search engine
called Xift in the late 1990s. He later joined IBM Corp. (IBM, Fortune
500), where he worked on an "analytic engine" called WebFountain.
Costello's Irish heritage inspired Cuil's odd
name. It was derived from a character named Finn McCuill in Celtic
folklore.
Patterson enjoyed her time at Google, but
became disenchanted with the company's approach to search. "Google has
looked pretty much the same for 10 years now," she said, "and I can
guarantee it will look the same a year from now."
Jensen Comment on July 28, 2008
Thus far the hype exceeds the performance on this first
day of trials. For example, I typed the following into both Cuil and Google:
"Basis Adjustment" AND "FAS 133"
Google gave me hundreds of hits and many of them were quite relevant to
my research.
Cuil gave me four hits and most of them were irrelevant to my research. Cuil
said it had 1,116,835,248 hits, but I could only find a way to list four of
these hits.
Go figure! Thus far the "World's Largest Search Engine" has a ways to go.
Another limitation is that Google has many cached documents where the
original link is no longer active. Cuil does not mention a caching service.
First turn your speakers on and type "Excel Magic Trick #73" into Cuil.
Results: Nothing!
Next type "Excel Magic Trick #73" into Google.
Google's cached version takes you to an interesting video on the
significant-digits bound in Excel.
Please let me know when and where Cuil is better than Google.
Also, is Cuil like Yahoo in that early listing priority of hits goes to
advertisers' sites?
If that's the case, Cuil will be a bummer. It does have a Preferences
button, but thus far that seems to be inactive.
I do a great deal of Google searching almost
every day, and so this is of great interest. To run a quick test, I went
to cuil.com (which is supposed to stand for "cool") and entered "audit
simulation." I received nine rather large blocks of information relating
to web sites that I found to be mostly irrelevant. I then tried
"auditing simulation" and got pretty much the same thing. I also noticed
that it was looking for "audit" and "simulation" separately and that
there was no option for an advanced search, which on google allows you
to combine words into phrases and sentences. I then tried "audit
simulation" again, but this time with the quotes. This improved the
results slightly, but most of the hits were still not very relevant. The
links did have more information attached to them, but the information
seemed to take up too much space. When I type "audit simulation" or "auditing
simulation" into the basic Google search page or toolbar, I get http://realaudit.com
as most relevant. This makes more sense to me and when this link does
not come up in cuil.com at all, it leaves me thinking that cuil still
has a long way to go. Thanks again, for the tip,
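The quoting behavior this reader describes can be illustrated with a toy matcher: an unquoted query matches when every word appears anywhere in the document, while a quoted phrase must appear as consecutive words in order. This is a simplification for illustration only; real engines use positional inverted indexes, and this is not Cuil's or Google's actual code:

```python
def matches_all_words(doc, words):
    """Unquoted search: every query word appears somewhere in the document."""
    tokens = doc.lower().split()
    return all(w.lower() in tokens for w in words)

def matches_phrase(doc, phrase):
    """Quoted search: the phrase words appear consecutively, in order."""
    tokens = doc.lower().split()
    target = phrase.lower().split()
    n = len(target)
    return any(tokens[i:i + n] == target for i in range(len(tokens) - n + 1))

doc = "a simulation used to audit financial statements"
print(matches_all_words(doc, ["audit", "simulation"]))  # True
print(matches_phrase(doc, "audit simulation"))          # False
```

The second check is strictly narrower than the first, which is why adding quotes trims the result set, as the reader observed.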
We've been testing the engine for the last
hour. Based on our test queries Cuil is an excellent search engine,
particularly since it is all of an hour old. But it doesn't appear to
have the depth of results that Google has, despite their claims. And the
results are not nearly as relevant.
. . .
It seems pretty clear that Google's index of
web pages is significantly larger than Cuil's unless we're randomly
choosing the wrong queries. Based on the queries above, Google is
averaging nearly 10x the number of results of Cuil.
And Cuil's ranking isn't as good as Google's
based on the pure results returned from both queries. Where Cuil excels
is with the related categories, which return results that are extremely
relevant. With Google, we've all gotten used to trying a slightly
different search to get the refined results we need. Cuil does a good
job of guessing what we'll want next and presents that in the top right
widget. That means Cuil saves time for more research based queries.
And I want to reemphasize that Cuil is only an
hour old at this point; Google has had a decade to perfect its search
engine.
Question
How does Google's new Wikipedia-like online Encyclopedia differ from the
real Wikipedia?
Hint
Colleges may one day give scholarly performance credit for authoring a
module in Knol. In a sense it's like exposing your scholarship and research
in such a way that the entire world may become "referees" of your
contribution. Of course most of the modules fall into the realm of
scholarship (mastery of existing knowledge) rather than research
(contribution to new knowledge). The catch, of course, is that the author
must approve the reviewer's call. Darn! The rejected reviews may, in most
instances, be published in Wikipedia. In that sense Wikipedia is more
academic.
"Google Presents
Wikipedia Competitor," by Andrea L. Foster, Chronicle of Higher Education,
July 23, 2008 ---
Click Here
Google today
launched Knol, an online encyclopedia that, in
many ways, mimics Wikipedia, the popular encyclopedia that anyone can
edit. As in Wikipedia, anyone can create a page in Knol. But changes to
the page become active only after they are approved by the page’s author
or authors. And unlike Wikipedia, the author’s name is featured
prominently on Knol articles.
Among the featured articles on the
Knol
site today are “How to Backpack,” “Lung Cancer,”
and “Toilet Clogs.”
Daniel Colman, director and associate dean of
Stanford University’s continuing-studies program and author of the blog
OpenCulture,
predicted in December that Knol would have a
hard time attracting experts to write articles.
'I get free online access to Encyclopaedia Britannica': Is this my
just reward?
Encyclopaedia Britannica, which apparently
fears being nudged into irrelevance by the proliferation of free online
reference sources, has started giving bloggers free access to its
articles, TechCrunch reports.
Reference sites such as Wikipedia, which are
often criticized for their amateur (if zealous) authorship sources, have
made the expensive, expert-vetted, hard-bound book set a less popular
purchase. (Comscore analysis, also reported on TechCrunch, found that
“[f]or every page viewed on Brittanica.com, 184 pages are viewed on
Wikipedia,” or 3.8 billion v. 21 million page views per month).
Under a new program entitled Britannica
WebShare, the encyclopedia publisher is allowing “people who publish
with some regularity on the Internet, be they bloggers, webmasters, or
writers,” to read and link to the encyclopedia’s online articles. The
company seems to hope that by offering its services free to Web
publishers, links to Britannica articles will proliferate across the
Internet and will persuade regular Web surfers to cough up $1,400 for
the encyclopedia’s 32-volume set, or perhaps $70 for an annual online
subscription.
Posted Comments as of April 21, 2008
“What’s that laugher?” Sir Colin wondered aloud to no one in
particular. The entire room sat in nervous silence.
“I say, what is that laughter?”
— S. Britchky Apr 21, 12:50 PM #
The Encyclopedia Britannica print edition is worth every penny of the
$1400 I paid for it. Other readers should note that the print edition of
the set is marked down each year, to below $1000, near the end of its
run, as the next year’s edition approaches publication. I don’t work for
Britannica, but in my opinion, every home library should have a set. I’d
be lost without it, even though I have full access to the Internet.
— Richard Apr 21, 08:49 PM
Jensen Comment
Woe is me! Should I continue to be one of the billions or join the millions?
This is the classic issue of open source versus refereed publishing.
Refereed articles, including Encyclopaedia Britannica, assign a few highly
qualified referees to pass judgment on the accuracy and relevance of each
module once and some modules are not reviewed again for many years.
Wikipedia freely allows the entire online world to edit each module in real
time. Do you have more faith in one-time decisions of experts or real-time
decisions of possibly millions of people with expertise ranging from dunder
heads to the best experts in the world on a given topic?
What Encyclopaedia Britannica has going for it is that it prevents dunder
heads from messing up the module. What Wikipedia has going for it is that
experts generally override the dunder heads on most topics, although errors
may remain indefinitely in modules that nobody online is particularly
interested in to a point of searching for the module on Wikipedia.
There also is the "problem" in Wikipedia that organizations and
individuals such as the CIA, FBI, IRS, Israel, Russia, Barack Obama, Hillary
Clinton, John McCain, and the Fortune 500 largest corporations are
"maintaining" certain modules about themselves and sensitive terms. This is
both good and bad. It prevents kooks from spreading lies about these
organizations/individuals, but it also affords these
organizations/individuals the opportunity to present their own biased accounts of
themselves. Fortunately Wikipedia added a Discussion Tab to each module
where even the kooks are allowed to express opinions on the modules. Readers
can then choose whether to read the discussions or not.
Now what about scholarly journals? Should the refereeing be done by two
or three experts (sometimes cronies) selected by the Editor or should the
working papers be exposed open source to online people of the world who can
then publish feedback regarding the strengths and weaknesses of the research
paper or other scholarly work? Me, I'm an open source kinda guy!
A researcher at Trinity College Dublin has
software that lets users map the links between Wikipedia pages. His Web
site is called “Six Degrees of Wikipedia,” modeled after the trivia game
“Six Degrees of Kevin Bacon.” Instead of the
degrees being measured by presence in the same film, degrees are
determined by articles that link to each other.
For example, how many clicks through Wikipedia
does it take to get from “Gatorade” to “Genghis Khan”? Three: Start at
“Gatorade,” then click to “Connecticut,” then “June 1,” then “Genghis
Khan.”
Stephen Dolan, the researcher who created the
software, has also used the code to determine which Wikipedia article is
the “center” of Wikipedia—that is, which article is the hub that most
other articles must go through in the “Six Degrees” game. Not including
the articles that are just lists (e.g., years), the article closest to
the center is “United Kingdom,” at an average of 3.67 clicks to any
other article. “Billie Jean King” and “United States” follow, with an
average of 3.68 clicks and 3.69 clicks, respectively.
More detailed information can be found on Mr.
Dolan’s Web site.
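The "Six Degrees" computation described above is a shortest-path search over the article link graph, and the "center" is the article with the smallest average distance to every other article. A minimal sketch using breadth-first search; the link graph below is a made-up toy reproducing the Gatorade example, not Mr. Dolan's actual code or data:

```python
from collections import deque

def clicks(graph, start, goal):
    """Breadth-first search: minimum number of link clicks from one
    article to another, or None if no path exists."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, dist = queue.popleft()
        for nxt in graph.get(page, ()):
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def average_clicks(graph, source):
    """Average BFS distance from `source` to every reachable article.
    The article with the lowest average is the 'center' of the graph."""
    dists = [clicks(graph, source, t) for t in graph if t != source]
    dists = [d for d in dists if d is not None]
    return sum(dists) / len(dists) if dists else float("inf")

# Toy link graph (article -> articles it links to).
links = {
    "Gatorade":      ["Connecticut"],
    "Connecticut":   ["June 1", "United States"],
    "June 1":        ["Genghis Khan"],
    "Genghis Khan":  ["United States"],
    "United States": ["Connecticut"],
}
print(clicks(links, "Gatorade", "Genghis Khan"))  # 3, matching the example
```

Note the links are directed: you can click from "Gatorade" to "Genghis Khan" in three steps, but nothing links back to "Gatorade," so the reverse search fails. Computing the center this way runs one BFS per article, which is why it is a heavy computation over Wikipedia's millions of pages.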
When the online, anyone-can-edit
Wikipedia appeared in 2001, teachers, especially college professors,
were appalled. The Internet was already an apparently limitless source
of nonsense for their students to eagerly consume — now there was a Web
site with the appearance of legitimacy and a dead-easy interface that
would complete the seduction until all sense of fact, fiction, myth and
propaganda blended into a popular culture of pseudointelligence masking
the basest ignorance. An Inside Higher Ed article just last year on
Wikipedia use in the academy drew a huge and passionate response, much
of it negative.
Now the English version
of Wikipedia has over 2 million articles, and it has been translated
into over 250 languages. It has become so massive that you can type
virtually any noun into a search engine and the first link will be to a
Wikipedia page. After seven years and this exponential growth, Wikipedia
can still be edited by anyone at any time. A generation of students was
warned away from this information siren, but we know as professors that
it is the first place they go to start a research project, look up an
unfamiliar term from lecture, or find something disturbing to ask about
during the next lecture. In fact, we learned too that Wikipedia is
indeed the most convenient repository of information ever invented, and
we go there often — if a bit covertly — to get a few questions answered.
Its accuracy, at least for science articles, is actually as high as the
revered Encyclopedia Britannica, as shown by a test published in the
journal Nature.
It is time for the
academic world to recognize Wikipedia for what it has become: a global
library open to anyone with an Internet connection and a pressing
curiosity. The vision of its founders, Jimmy Wales and Larry Sanger, has
become reality, and the librarians were right: the world has not been
the same since. If the Web is the greatest information delivery device
ever, and Wikipedia is the largest coherent store of information and
ideas, then we as teachers and scholars should have been on this train
years ago for the benefit of our students, our professions, and that
mystical pool of human knowledge.
What Wikipedia too often
lacks is academic authority, or at least the perception of it. Most of
its thousands of editors are anonymous, sometimes known only by an IP
address or a cryptic username. Every article has a “talk” page for
discussions of content, bias, and organization. “Revert” wars can rage
out of control as one faction battles another over a few words in an
article. Sometimes administrators have to step in and lock a page down
until tempers cool and the main protagonists lose interest. The very
anonymity of the editors is often the source of the problem: how do we
know who has an authoritative grasp of the topic?
That is what academics
do best. We can quickly sort out scholarly authority into complex
hierarchies with a quick glance at a vita and a sniff at a publication
list. We make many mistakes doing this, of course, but at least our
debates are supported with citations and a modicum of civility because
we are identifiable and we have our reputations to maintain and friends
to keep. Maybe this academic culture can be added to the Wild West of
Wikipedia to make it more useful for everyone?
I propose that all
academics with research specialties, no matter how arcane (and nothing
is too obscure for Wikipedia), enroll as identifiable editors of
Wikipedia. We then watch over a few wikipages of our choosing, adding to
them when appropriate, stepping in to resolve disputes when we know
something useful. We can add new articles on topics which should be
covered, and argue that others should be removed or combined. This is
not to displace anonymous editors, many of whom possess vast amounts of
valuable information and innovative ideas, but to add our authority and
hard-won knowledge to this growing universal library.
The advantages should be
obvious. First, it is another outlet for our scholarship, one that may
be more likely to be read than many of our journals. Second, we are
directly serving our students by improving the source they go to first
for information. Third, by identifying ourselves, we can connect with
other scholars and interested parties who stumble across our edits and
new articles. Everyone wins.
I have been an
open Wikipedia editor now for several months. I have enjoyed it
immensely. In my teaching I use a “living syllabus” for each course,
which is a kind of academic blog. (For example, see my History of Life
course
online
syllabus.) I connect students through links to
outside sources of information. Quite often I refer students to
Wikipedia articles that are well-sourced and well written. Wikipages
that are not so good are easily fixed with a judicious edit or two, and
many pages become more useful with the addition of an image from my
collection (all donated to the public domain). Since I am open in my
editorial identity, I often get questions from around the world about
the topics I find most fascinating. I’ve even made important new
connections through my edits to new collaborators and reporters who want
more background for a story.
For example, this year I
met online a biology professor from Centre College who is interested in
the ecology of fish on Great Inagua Island in the Bahamas. He saw my
additions and images on that Wikipedia page and had several questions
about the island. He invited me to speak at Centre next year about
evolution-creation controversies, which is unrelated to the original
contact but flowed from our academic conversations. I in turn have been
learning much about the island’s living ecology I did not know. I’ve
also learned much about the kind of prose that is most effective for a
general audience, and I’ve in turn taught some people how to properly
reference ideas and information. In short, I’ve expanded my teaching.
Wikipedia as we know it
will undoubtedly change in the coming years as all technologies do. By
involving ourselves directly and in large numbers now, we can help
direct that change into ever more useful ways for our students and the
public. This is, after all, our sacred charge as teacher-scholars: to
educate when and where we can to the greatest effect.
How helpful is Wikipedia to scholarship? Jimmy Wales, co-founder of Wikipedia, told
educators last year that students shouldn't cite his sprawling Web site:
"For God's sake, you’re in college," he said. "Don’t cite the encyclopedia.”
It's a safe bet that most professors agreed with that assessment. But
according to BBC News, Mr. Wales has now modified his message. He told
attendees at a London IT conference this week that he doesn't object to
Wikipedia citations, although he admitted that scholars would "probably be
better off doing their own research." From the BBC report, it's hard to tell
how gung-ho Mr. Wales is about Wikipedia's academic value. But the online
encyclopedia's efforts to improve the quality of its articles might be
starting to pay dividends: A German magazine recently compared 50 Wikipedia
articles with similar pieces in Brockhaus, a commercial encyclopedia.
According to the study, the Wikipedia articles were generally more
informative.
Brock Read, Chronicle of Higher Education, December 7, 2007 ---
http://chronicle.com/wiredcampus/index.php?id=2598&utm_source=wc&utm_medium=en
See, I'm not the only one!
University of Texas Professor Praises Wikipedia
Scholars often take swipes at Wikipedia, claiming that
it dumbs down education and encourages intellectual laziness.
Some professors have even banned their students
from using it for research. But in an
article this week in Science Progress, a
scholar at the University of Texas at Dallas argues that such bans are
irresponsible. David Parry, an assistant professor of emerging media and
communications at the university, writes that students need to become
familiar with new and non-static forms of communication. He encourages his
students to read Wikipedia’s “history” and “discussion” pages, saying they
explain how articles were produced. And he says the online encyclopedia’s
entry on global warming does a good job of explaining both the controversy
and the science surrounding the issue. "Like it or not, the networked digital
archive changes our basis of knowledge," Mr. Parry writes, "and training
people for the future is about training them for this shift."
Andrea L. Foster, Chronicle of Higher Education, February 14, 2008 ---
Click Here
It goes without saying that Wikipedia modules are always
suspect, but it is easy to make corrections for the world. I
think this particular module requires registration to
discourage anonymous edits.
What is often better about Wikipedia is to read the
discussion and criticisms of any module. For example, some
facts in dispute in this particular module are mentioned in
the “Discussion” or “talk” section about the module ---
http://en.wikipedia.org/wiki/Talk:Mahmoud_Ahmadinejad
Perhaps some of the disputed facts have already been pointed
out in the “Discussion” section. Of course pointing out
differences of opinion about “facts” does not, in and of
itself, resolve these differences. I did read the
“Discussion” section on this module before suggesting the
module as a supplementary link. I assumed others would also
check the “Talk” section before assuming what is in dispute.
Since Wikipedia is so widely used by so many students and
others like me, it’s important to try to correct the record
whenever possible. This can be done quite simply from your
Web browser and does not require any special software. It
requires registration for politically sensitive modules.
Wikipedia modules are often “corrected” by the FBI, CIA,
corporations, foreign governments, professors of all
persuasions, butchers, bakers, and candlestick makers. This
makes them fun and suspect at the same time. It’s like
having a paper refereed by the world instead of a few, often
biased or casual, journal referees. What I like best is that
“referee comments” are made public in Wikipedia’s
“Discussion” sections. You don’t often find this in
scholarly research journals where referee comments are
supposed to remain confidential.
Reasons for flawed journal peer reviews were recently
brought to light at
http://www.trinity.edu/rjensen/HigherEdControversies.htm#PeerReviewFlaws
The biggest danger in Wikipedia is generally for modules
that are rarely sought out. For example, Bill Smith might
write a deceitful module about John Doe. If nobody’s
interested in John Doe, it may take forever and a day for
corrections to appear. Generally modules that are of great
interest to many people, however, generate a lot of “talk”
in the “Discussion” sections. For example, the Discussion
section for George W. Bush is at
http://en.wikipedia.org/wiki/Talk:George_W._Bush
You already know about Wikipedia -- or
think you do. It's the online encyclopedia that anyone can edit,
the one that by dint of its 1.9 million English-language entries
has become the Internet's main information source and the 17th
busiest U.S. Web site.
But that's just the half of it.
Most people are familiar with
Wikipedia's collection of articles. Less well-known,
unfortunately, are the discussions about these articles. You can
find these at the top of a Wikipedia page under a separate tab
for "Discussion."
Reading these discussion pages is a
vastly rewarding, slightly addictive, experience -- so much so
that it has become my habit to first check out the discussion
before going to the article proper.
At Wikipedia, anyone can be an editor
and all but 600 or so articles can be freely altered. The
discussion pages exist so the people working on an article can
talk about what they're doing to it. Part of the discussion
pages, the least interesting part, involves simple housekeeping
-- editors noting how they moved around the sections of an
article or eliminated duplications. And sometimes readers seek
answers to homework-style questions, though that practice is
discouraged.
But discussion pages are also where
Wikipedians discuss and debate what an article should or
shouldn't say.
This is where the fun begins. You'd be
astonished at the sorts of things editors argue about, and the
prolix vehemence they bring to stating their cases. The
9,500-word article "Ireland," for example, spawned a 10,000-word
discussion about whether "Republic of Ireland" would be a better
name for the piece. "I know full well that many Unionist editors
would object completely to my stance on this subject," wrote one
person.
A ferocious back and forth ensued over
whether Antonio Meucci or Alexander Graham Bell invented the
telephone. One person from the Meucci camp taunted the Bell side
by saying, "'Nationalistic pride' stop you and people like you
to accept the truth. Bell was a liar and thief. He invented
nothing."
As for the age-old philosophical
question, "What is truth," it's an issue Wikipedia editors have
spent 242,000 words trying to settle, an impressive feat
considering how Plato needed only 118,000 words to write "The
Republic."
These debates extend to topics most
people wouldn't consider remotely controversial. The article on
calculus, for instance, was host to some sparring over whether
the concept of "limit," central to calculus, should be better
explained as an "average."
Wikipedia editors are always on the
prowl for passages in articles that violate Wikipedia policy,
such as its ban on bias. Editors use the discussion pages to
report these sightings, and reading the back and forth makes it
clear that editors take this task very seriously.
On one discussion page is the comment:
"I am not sure that it does not present an entirely Eurocentric
view, nor can I see that it is sourced sufficiently well so as
to be reliable."
Does it address a polarizing topic from
politics or religion? Hardly. The article was about kittens. The
editor was objecting to the statement that most people think
kittens are cute.
These debates are not the only
treasures in the discussion pages. You can learn a lot of stray
facts, facts that an editor didn't think were important enough
for the main article. For example, in the discussion
accompanying the article about diets, it's noted that potatoes,
eaten raw, can be poisonous. The National Potato Council didn't
believe this when asked about it last week, but later called
back to say that it was true, on account of the solanine in
potatoes. Of course, you'd have to eat many sackfuls of raw
potatoes to be done in by them.
The discussion about "biography"
included random facts from sundry biographies, including that
Marshall McLuhan believed his ideas about mass media and the
rest to have been inspired by the Virgin Mary. This is true,
said McLuhan biographer Philip Marchand. (Mr. Marchand also said
McLuhan believed that a global conspiracy of Freemasons was
seeking to hinder his career.)
Remember, though, this is Wikipedia,
and while it tends to get things right in the long run, it can
goof up along the way. A "tomato" article contained a lyrical
description of the Carolina breed, said to be "first noted by
Italian monk Giacomo Tiramisunelli" and "considered a rare
delicacy amongst tomato-connoisseurs."
That's all a complete fabrication, said
Roger Chetelat, tomato expert at the University of California,
Davis. While now gone from Wikipedia, the passage was there long
enough for "Giacomo Tiramisunelli" to turn up now in search
engines as a key figure in tomato history.
Wikipedia is very self-aware. It has a
Wikipedia article about Wikipedia. But this meta-analysis
doesn't extend to "Wikipedia discussions." No article on the
topic exists. Search for "discussion," and you are sent to
"debate."
But, naturally, that's controversial.
The discussion page about debate includes a debate over whether
"discussion" and "debate" are synonymous. Emotions run high; the
inability to distinguish the two, said one participant, is "one
of the problems with Western Society."
Maybe I have been reading too many
Wikipedia discussion pages, but I can see the point.
Jensen Comment
This may be more educational than what we teach in class. Try it by
clicking on the Discussion tab for the following:
"CIA, FBI Computers Used for Wikipedia Edits," by
Randall Mikkelsen, The Washington Post, August 16, 2007 ---
Click Here
"CIA and Vatican Edit Wikipedia Entries," TheAge.com, August 18,
2007 ---
Click Here
Jensen Comment
Wikipedia installed software to trace the source of edits and new
modules.
On Wikipedia, you never really know who wrote
the article you're reading. Some are written by experts, but others are
written by people with time on their hands who may or may not know what
they're talking about. Actually, most Wikipedia articles are written by
a combination of the two. But Google's new Web encyclopedia,
announced last week,
will put the authors of articles front and center, so you'll always know
who is talking and what their qualifications are. The question is, which
model will produce a better quick-reference guide? Daniel Colman,
director and associate dean of Stanford University's continuing-studies
program and author of the blog OpenCulture, picks Wikipedia to win this
face-off. He thinks that Google's planned encyclopedia
will have a hard time attracting experts to write articles,
whereas Wikipedia works by letting everyone write
articles that are then often corrected by experts. "Take my word for
it," writes Mr. Colman. "I’ve spent the past five years trying to get
scholars from elite universities, including Stanford, to bring their
ideas to the outside world, and it’s often not their first priority.
They just have too many other things competing for their time." Others
have pointed out that Google's project, called knol, is similar to other
efforts to create authoritative topic pages, like
Squidoo. There is
at least one key factor in Google's favor though. Knol authors stand to
make money for their efforts. "At the discretion of the author, a knol
may include ads," Google's Udi Manber, said in a statement announcing
the service. "If an author chooses to include ads, Google will provide
the author with substantial revenue share from the proceeds of those
ads." Those ad dollars would be more than professors make for writing
journal articles, which are usually written for no compensation at all.
CiteBase
Citebase is a trial service that allows researchers
to search across free, full-text research literature
ePrint archives, with results ranked according to
criteria such as citation impact.
Gateway to ePrints
A listing of ePrint servers and open access
repository search tools.
Google Scholar
A search tool for scholarly citations and abstracts,
many of which link to full text articles, book
chapters, working papers and other forms of
scholarly publishing. It includes content from many
open access journals and repositories.
OAIster
A search tool for cross-archive searching of more
than 540 separate digital collections and archives,
including arXiv, CiteBase, ANU ePrints, ePrintsUQ,
and others.
Scirus
A search tool for online journals and Web sites in
the sciences.
Borrowing a page from the popular video-sharing
site YouTube, a new online service lets people upload and share their
papers or entire books via a social-network interface. But will a format
that works for videos translate to documents?
It’s called
iPaper,
and it uses a Flash-based document reader that can
be embedded into a Web page. The experience of reading neatly formatted
text inside a fixed box feels a bit like using an old microfilm reader,
except that you can search the documents or e-mail them to friends.
The company behind the technology, Scribd, also
offers a
library of
iPaper documents and invites users to set up
an account to post their own written works. And, just like on YouTube,
users can comment about each document, give it a rating, and view
related works.
Also like on YouTube, some of the most popular
items in the collection are on the lighter side. One document that is in
the top 10 “most viewed” is called
“It seems this essay was written while the guy was high, hilarious!”
It is a seven-page paper that appears to have been
written for a college course but is full of salty language. The document
includes the written comments of the professor who graded it, and it
ends with a handwritten note: “please see after class to discuss your
paper.”
Social scientists and business scholars often use SSRN (not free) ---
http://www.ssrn.com/
If you have access to a college library, most colleges generally have
paid subscriptions to enormous scholarly literature databases that are not
available freely online. Serious scholars obtain access to these vast
literature databases.
Zotero is a
free,
open source extension
for the
Firefox browser, that
enables users to collect, manage, and cite
research from all types of sources from the
browser. It is partly a piece of
reference management software,
used to manage
bibliographies and
references
when writing essays and articles. On many major
research websites such as digital libraries,
Google Scholar, or
even
Amazon.com, Zotero
detects when a book, article, or other resource
is being viewed and with a mouse click finds and
saves the full reference information to a local
file. If the source is an online article or web
page, Zotero can optionally store a local copy
of the source. Users can then add notes, tags,
and their own
metadata through the
in-browser interface. Selections of the local
reference library data can later be exported as
formatted bibliographies.
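The capture-and-export workflow described above can be sketched in a few lines. This is only an illustration of the idea, not Zotero's actual data model: the field names, the JSON library file, and the author-date format string are all hypothetical.

```python
# Sketch of a Zotero-style workflow: save a captured reference to a
# local library file, then render it as a formatted bibliography line.
import json

def save_reference(library_path, item):
    """Append one captured reference to a local JSON library file."""
    try:
        with open(library_path) as f:
            library = json.load(f)
    except FileNotFoundError:
        library = []
    library.append(item)
    with open(library_path, "w") as f:
        json.dump(library, f, indent=2)

def format_entry(item):
    """Render one item roughly in author-date style."""
    return "{author} ({year}). {title}. {source}.".format(**item)

item = {
    "author": "Doe, J.",
    "year": 2007,
    "title": "An Example Article",
    "source": "Journal of Examples",
    "tags": ["example"],          # user-added tags
    "notes": "Found via library catalog.",
}
save_reference("library.json", item)
print(format_entry(item))
# Doe, J. (2007). An Example Article. Journal of Examples.
```

The real tool does the capture step automatically from the browser; the point here is just that a saved record plus a format template is all a bibliography export needs.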
The program is produced by
the
Center for History and New Media
of
George Mason University
and is currently available
in public beta. It is open and extensible,
allowing other users to contribute citation
styles and site translators, and more generally
for others who are building digital tools for
researchers to expand the platform. The
name comes from an Albanian word meaning "to master."
It is aimed at replacing
the more cumbersome traditional
reference management software,
originally designed to
meet the demands of offline research.
Zotero is a tool for
storing, retrieving, organizing, and annotating
digital documents. It has been available for not
quite a year. I started using it about six weeks
ago, and am still learning some of the fine
points, but feel sufficient enthusiasm about
Zotero
to recommend it to anyone doing research online.
If very much of your work involves material from
JSTOR, for example – or if you find it necessary
to collect bibliographical references, or to
locate Web-based publications that you expect to
cite in your own work — then Zotero is worth
knowing how to use. (You can install it on your
computer for free; more on that in due course.)
Now, my highest qualification for
testing a digital tool is, perhaps, that I have no
qualifications for testing a digital tool. That is not as
paradoxical as it sounds. The limits of my technological
competence are very quickly reached. My command of the laptop
computer consists primarily of the ability to (1) turn it on and
(2) type stuff. This condition entails certain disadvantages
(the mockery of nieces and nephews, for example) but it makes
for a pretty good guinea pig.
And in that respect, I can report that
the folks at George Mason University’s Center for History and
New Media have done an exemplary job in designing Zotero. A
relatively clueless person can learn to use it without
exhaustive effort.
Still, it seems as if institutions that
do not currently do so might want to offer tutorials on Zotero
for faculty and students who may lack whatever gene makes for an
intuitive grasp of software. Academic librarians are probably
the best people to offer instruction. Aside from being digitally
savvy, they may be the people at a university in the best
position to appreciate the range of uses to which Zotero can be
put.
For the absolute newbie, however, let
me explain what Zotero is — or rather, what it allows you to do.
I’ll also mention a couple of problems or limitations. Zotero is
still under development and will doubtless become more powerful
(that is, more useful) in later releases. But the version now
available has numerous valuable features that far outweigh any
glitches.
Suppose you go online to gather
material on some aspect of a book you are writing. In the course
of a few hours, you might find several promising titles in the
library catalog, a few more with Amazon, a dozen useful papers
via JSTOR, and three blog entries by scholars who are thinking
aloud about some matter tangential to your project.
Continued in article
Using Speech Recognition in a Search Engine
Boston-based startup EveryZing
has launched a search engine that it hopes will change the
way that people search for audio and video online. Formerly known as PodZinger,
a podcast search engine, EveryZing is leveraging speech systems developed by
technology company BBN
that can convert spoken words into searchable text with about 80 percent
accuracy. This bests other commercially available systems, says EveryZing CEO
Tom Wilde.
Kate Greene, "More-Accurate Video Search: Speech-recognition software
could improve video search," MIT's Technology Review, June 12, 2007 ---
http://www.technologyreview.com/Infotech/18847/
The University Channel makes videos of
academic lectures and events from all over the world available to the
public. It is a place where academics can air their ideas and present
research in a full-length, uncut format. Contributors with greater video
production capabilities can submit original productions.
The University Channel presents ideas in a
way commercial news or public affairs programming cannot. Because it is
neither constrained by time nor dependent upon commercial feedback, the
University Channel's video content can be broad and flexible enough to cover
the full gamut of academic investigation.
While it has unlimited potential, the
University Channel begins with a focus on public and international affairs,
because this is an area which lends itself most naturally to a many-sided
discussion. Perhaps of greatest advantage to universities who seek to expand
their dialog with overseas institutions and international affairs, the
University Channel can "go global" and become a truly international forum.
The University Channel aims to become,
literally, a "channel" for important thought, to be heard in its entirety.
Television has become so much a part of the fabric of our world that it
should be more than an academic interest. It should be an academic tool.
The University Channel project is an
initiative of Princeton University's Woodrow Wilson School of Public and
International Affairs, which is leading the effort to build university
membership and distribution partners. Technical support, advice and services
are provided through the generosity of Princeton University's Office of
Information Technology. Digital video solutions courtesy of Princeton Server
Group.
For those users who are finding their current RSS
feed software a bit unruly, they may wish to check out this latest version
of the Advanced RSS Mixer. The application can be used to combine different
RSS feeds into one aggregate feed, and it also contains a built-in RSS
keyword filter. The basic interface is quite easy to use, and for keeping
track of RSS feeds, this application is most handy. This version is
compatible with computers running Windows 95 and newer.
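The two features the paragraph mentions, merging several feeds into one and filtering by keyword, reduce to a short loop. Here is a minimal sketch assuming plain RSS 2.0 input supplied as strings; a real mixer would fetch the feeds over HTTP first.

```python
# Merge items from several RSS feeds, keeping only titles that match
# a keyword filter -- the core of what an "RSS mixer" does.
import xml.etree.ElementTree as ET

def mix_feeds(feed_xml_strings, keyword):
    """Return titles from all feeds whose title contains `keyword`."""
    mixed = []
    for xml_text in feed_xml_strings:
        root = ET.fromstring(xml_text)
        for item in root.iter("item"):
            title = item.findtext("title", default="")
            if keyword.lower() in title.lower():
                mixed.append(title)
    return mixed

feed_a = """<rss><channel>
  <item><title>Python tips</title></item>
  <item><title>Cooking news</title></item>
</channel></rss>"""
feed_b = """<rss><channel>
  <item><title>More Python tricks</title></item>
</channel></rss>"""

print(mix_feeds([feed_a, feed_b], "python"))
# ['Python tips', 'More Python tricks']
```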
Question
What new online people finders are making it easier to find the whereabouts of
people in your past? Hint: One of the sites has very large and pointed ears.
Zaba Search is a free database of names, addresses, birth dates,
and phone numbers. Social Security numbers and background checks are also
available for a fee ---
http://www.zabasearch.com/
"Searching for Humans: Various websites are trying to make it easier to
find friends and colleagues online," by Erica Naone, MIT's Technology Review,
August 20, 2007 ---
http://www.technologyreview.com/Infotech/19270/
Jaideep Singh,
cofounder of the new people-search
engine
Spock,
says he wants to build a profile for
every person in the world. To do
this, he plans to combine the power
of search algorithms with online
social networks.
Singh says he got the idea for Spock
while looking for people with
specific areas of expertise among
his contacts in Microsoft Outlook.
Although he has two or three
thousand people listed, he could
only find people he was already
thinking about.
Spock is designed to solve that
problem by allowing users to search
for tags--such as "saxophonist" or
"venture capitalist"--and then view
a list of people associated with
those tags. Singh could have
manually entered tags for each of
his contacts into Microsoft Outlook,
but capturing every interest of each
particular individual would be
time-consuming. Spock uses a
combination of human and machine
intelligence to automatically come
up with the tags: search algorithms
identify possible tags, and users
can vote on their relevance or add
new tags. Registered users can add
private tags to another person's
profile to organize their contacts
based on information that they don't
want to share. For example, a
contentious associate might be
privately labeled as such.
The
social-network component of the
website introduces an element of
crowd commentary into the search
process.
George W. Bush
is tagged
"miserable failure," with a vote of
87 to 31 in favor of the tag's
relevance as of this writing. Users
aren't allowed to vote anonymously,
and the tag links to the profiles of
people who voted.
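The tag-and-vote mechanics described above can be illustrated with a toy index. This is not Spock's real implementation, just a sketch of the idea: tags attach to people, users vote on each tag's relevance, and a search returns people whose net vote on that tag is positive.

```python
# Toy tag index with crowd voting on tag relevance.
from collections import defaultdict

class TagIndex:
    def __init__(self):
        # (person, tag) -> [up_votes, down_votes]
        self.votes = defaultdict(lambda: [0, 0])

    def vote(self, person, tag, relevant):
        """Record one vote on whether `tag` fits `person`."""
        self.votes[(person, tag)][0 if relevant else 1] += 1

    def search(self, tag):
        """People for whom this tag has more up-votes than down-votes."""
        hits = [(up - down, person)
                for (person, t), (up, down) in self.votes.items()
                if t == tag and up > down]
        return [person for _, person in sorted(hits, reverse=True)]

idx = TagIndex()
for _ in range(87):
    idx.vote("George W. Bush", "miserable failure", True)
for _ in range(31):
    idx.vote("George W. Bush", "miserable failure", False)
idx.vote("Jane Doe", "saxophonist", True)

print(idx.search("saxophonist"))  # ['Jane Doe']
```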
Singh hopes
social networks will also help with
one of the main problems in people
search: teaching the system to
recognize that two separate entries
refer to a single person--a problem
called entity resolution. For
example, a single person might have
a
MySpace
page, a
Linked In
profile, and a write-up on a company
website.
Steven Whang,
an
entity-resolution researcher at
Stanford University, says that there
are several aspects to the problem:
getting the system to compare two
entries and decide whether they are
related, merging related entries
without repetition, and comparing
information from a myriad of
possible sources online. Finally,
Whang says, there is a risk of
merging two entries that should not
be merged, as in the case of a name
like Robin, which is used by both
men and women.
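The steps Whang describes, pairwise comparison and merging without repetition, can be shown with a deliberately naive sketch. The similarity rule here (same name plus at least one overlapping attribute) is a stand-in for a real entity-resolution algorithm, and it also illustrates the "Robin" risk: a shared name alone never triggers a merge.

```python
# Toy entity resolution: decide whether two profile entries describe
# the same person, then merge them without duplicating attributes.
def related(a, b):
    """Compare two entries; require corroboration beyond the name."""
    if a["name"] != b["name"]:
        return False
    # Name alone is not enough -- avoids merging two different Robins.
    return bool(set(a["attrs"]) & set(b["attrs"]))

def merge(a, b):
    """Combine two related entries into one, de-duplicating attributes."""
    return {"name": a["name"],
            "attrs": sorted(set(a["attrs"]) | set(b["attrs"]))}

myspace  = {"name": "Robin", "attrs": ["guitarist", "chicago"]}
linkedin = {"name": "Robin", "attrs": ["chicago", "engineer"]}
other    = {"name": "Robin", "attrs": ["florist"]}

assert related(myspace, linkedin)   # shared city corroborates the match
assert not related(myspace, other)  # same name only: keep them separate
print(merge(myspace, linkedin))
# {'name': 'Robin', 'attrs': ['chicago', 'engineer', 'guitarist']}
```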
Many of the
people-search engines try to get
around these problems by encouraging
people to claim and manage their own
profiles, although Whang notes that
this is a labor-intensive approach.
Although there are many sites where
people could claim their profiles,
Singh says he thinks one engine will
eventually dominate, and people will
make the effort to claim profiles
there. Bryan Burdick, chief
operating officer of the
business-search site
Zoominfo,
says that 10,000 people a week claim
their profiles on Zoom, in spite of
having to provide their credit-card
numbers to do so.
Singh has also
introduced the
Spock Challenge, a
competition to design a better
entity-resolution algorithm. He says
that 1,400 researchers have already
downloaded the data set, and they
will compete for a $50,000 prize,
which will be awarded in November.
The Accoona Super Target search engine is at http://www.accoona.com/
That being said, Accoona looks, at first glance, not much different than
other search engines — including Google itself. Its bare-bones initial
interface follows the same design: a central search field with buttons that
let you search the entire Web or confine your search to news or business
sources.
Searching On Scott
I started with a general Web search on "Scott Joplin" on Accoona and Google,
and found quite a bit of disparity in the results (112,393 for Accoona and
4,130,000 for Google). When I did a search on the phrase "mp3 players," I got
similar results: Accoona came up with 6,031,343 results, while Google boasted
187,000,000. Quite frankly, while I appreciated Google's higher numbers, that
alone wouldn't have made Google my preferred search engine — how many people
go past the fifth page of results, anyway? There was also some variation in
which sites came up in what order, but again, there were no really important
differences.
Barbara Krasnoff, "Accoona: A New Google Alternative? The latest search engine
to hit the Web, Accoona offers additional business info and a nice filtering
ability. But is that enough?" InternetWeek, March 20, 2006 ---
http://internetweek.cmp.com/handson/183700172
Academics should remember that Google Scholar greatly narrows down the search
hits ---
http://scholar.google.com/
Fee Based Google Specialized Services (including an enterprise-level search appliance)
Google Inc. added two beefier Minis to
its line of business search appliances. The Mountain View,
Calif., company said Minis
are now available with
capacities of 200,000 documents and 300,000 documents
for $5,995 and $8,995, respectively. The new versions
were in addition to the current 100,000-document
appliance that sells for $2,995. Google also sells an
enterprise-level appliance that can search up to 15
million documents. The device starts at $30,000 for
searching up to 500,000 documents.
Antone Gonsalves, "Google Unveils Two Search
Appliances," InternetWeek, January 12, 2006 ---
http://www.internetweek.cmp.com/showArticle.jhtml?sssdmh=dm4.163237&articleId=175804113
Question
What is Boxxet (box set) and why might it be the next big thing when
searching on the Web in your discipline?
At the O'Reilly
Emerging Technology Conference in San
Diego this week, a new software
application was introduced, called
Boxxet
(pronounced "box set"), which allows
online interest groups to form by
aggregating content from users, instead
of the more traditional way of
networking around a person or event. The
software is meant to build communities
by allowing users to gather and rate
search information. It operates on the
assumption that in a group of 100
people, at least three will rate items
for relevance. Boxxet won't be
available to the public for another
couple of months, but free invitations
to try it out are available on their
website. Conference organizer Tim
O'Reilly, who cited Boxxet in his
keynote address, says he's big on the
company because it solves a fundamental
issue with social software. "The problem
with social networks is they're
artificial -- they aren't 'your'
network," he says. "Boxxet is an
infrastructure to let you develop your
own social network."
Michael Fitzgerald, "Beyond Google:
Collective Searching A new kind of
search engine could make the act of Web
searching more sociable," MIT's
Technology Review, March 9, 2006 ---
http://www.technologyreview.com/InfoTech/wtr_16526,258,p1.html
Beyond Google with Specialized Search Engines
Instead of trawling through
billions of Web pages to find results, the way the big
engines do, vertical engines limit their searches to
industry-specific sites. And they usually serve up lists
of actual things -- such as houses for sale or open jobs
-- instead of links to pages where you might find them.
So you spend less time skimming through irrelevant links
to find what you want. On top of that, the sites let you
filter the results by factors such as salary, price or
location. "Often, a specialized database can take you
directly" to the most useful information and save you
time, says Gary Price, news editor of the Search Engine
Watch site. "Every useful result can't be in the first
few results from a major Web engine, and that's where
most people look."
Kevin J. Delaney, "Beyond Google: Yes, there are
other search engines. And some may even work better for
you," The Wall Street Journal, December 19, 2005;
Page R1 ---
http://online.wsj.com/article/SB113459260842822579.html?mod=todays_us_the_journal_report
Here's
a look at some common search tasks -- and a
sampling of specialized search engines that
will get you what you're looking for.
If you go to a big search engine
looking for background on a certain
topic, you'll usually get a series
of links to other pages -- which
means more surfing to get what you
want. Answers.com, formerly known as
GuruNet, cuts out the middleman by
collecting all the information and
organizing it into a Web page.
Type "Internet" into the site, for
example, and it displays a
comprehensive history and
explanation of the Internet, with
entries culled from the Computer
Desktop Encyclopedia, Columbia
University Press Encyclopedia,
Wikipedia and other sources. The top
results from Google on a recent day,
by contrast, included the sites of
Microsoft's Internet Explorer
software and an online movie
database.
"We see
ourselves as complementary to search
engines," says Bob Rosenschein,
chairman and chief executive of
Answers
Corp. in
Jerusalem, which offers the service.
Indeed, Google's results page for
some queries includes a "definition"
link that takes users to the
Answers.com results for the same
query.
Conduct U.S. Government Searches
(including sites for buying goods and services from the Feds) ---
http://www.firstgov.gov/
"Federal Web Search Upgraded:
Contractor-Run Service Boasts Answers in a Click
or Two." by Caroline E. Mayer, The Washington
Post, February 18, 2006 ---
Click Here
Need to know how
many calories are in that margarita you
drank last night?
The temperature at
the beach you hope to go to this weekend?
How many minutes
your flight will be delayed because of high
winds in Newark?
Or the winning
number in the Pennsylvania lottery?
The answers are at
a comprehensive one-stop federal Web site,
FirstGov.gov, the official gateway to
federal, state and local government Web
sites and documents.
The nearly
six-year-old Web site, which has won
innovation awards for being
consumer-friendly, has just been updated to
make it easier for consumers, businesses and
federal employees to find a mind-boggling
array of information from A (airline
complaints) to Z (Zip codes). With a click
or two of the mouse, users can download tax
forms, collect all sorts of economic trivia
or play educational online games to learn
about consumer scams and how to avoid them,
of course.
FirstGov launched a
powerful new search engine last month,
expanding the number of accessible documents
from 8 million to 40 million, including more
state and local Web sites. Perhaps equally
significant for time-constrained browsers,
the new search engine uses improved
algorithms to provide more relevant results.
With the old search
engine, for example, a search for "baseball"
brought up the Web site Afterschool.gov
because it features a picture of a boy
holding a baseball bat. With the new search
engine, that same search steers you to a
list of World Series winners. (Who knew the
government even had such information?)
Using the old
search engine, a person who typed "Social
Security" in the search box would get a link
to the Social Security Administration and
related Web sites, including the President's
Commission to Strengthen Social Security.
The same search
today turns up a list of frequently asked
questions, such as "What are the Social
Security and Medicare increases the
government has in store for 2006?" or "How
do I contact Social Security's nationwide
Toll-Free Hotline?" There is also a special
section where a browser can further refine
the field of research by choosing retirement
or disability, as well as a tab to easily
download federal forms.
Consumers in the
market for a new car can just enter a make
and model to get gas mileage and crash test
results on a single page. In the past, it
would have taken visits to two different
government Web sites (one by the
Environmental Protection Agency, the other
by the National Highway Traffic Safety
Administration) to get the data.
"All this
information is out in the government, but it
does you no good if you can't find it," said
M.J. Pizzella, associate administrator of
the General Services Administration's Office
of Citizen Services and Communications,
which oversees FirstGov.gov.
The government had
been running its own search operations -- at
an estimated annual cost of $3.2 million.
The new search engine is being operated for
$1.8 million a year under a contract with
two private companies: Microsoft Corp. and
Vivisimo Inc.
FirstGov.gov also
offers podcasts, as well as Espanol.gov for
Spanish-speaking consumers.
By presenting
frequently asked questions and special
sections to allow consumers to refine their
initial search, FirstGov is more than a
Google for government, said Larry Freed,
president of ForeSee Results, a Michigan
firm that measures customer satisfaction of
Internet sites. "It's sort of a Google-plus"
because you do not have to rely on scrolling
through pages and pages of search results to
find what you want, Freed said.
Launched in the
last days of the Clinton administration,
FirstGov was revamped in 2002 to make it
easier to use. At that time, the goal was
"three clicks to service." But under the
latest redesign, just one or two clicks may
be all that is needed.
When the site began, "customer satisfaction
was fairly low," Freed said. But the
government "has made great strides," he
said. "Is the government really building Web
sites for me? They are, and it's a win-win
for the government and consumers. Consumers
are getting more information, and the
government is lowering its cost by making it
easier to get information off the Web,"
helping reduce calls to call centers.
IBM Corp. and Yahoo
Inc. are teaming up to offer a free
data-search tool for businesses, a quirky
move challenging Google Inc. and other
corporate-search specialists in a blossoming
market.
IBM already sells a
business-focused search product, OmniFind,
that lets organizations comb through
internal documents. This free new edition of
OmniFind will be limited in the number of
documents it can query, but it will combine
the results with Web searches powered by
Yahoo.
IBM hopes the
service, being announced Wednesday, bolsters
its overall efforts to improve its dealings
with small companies.
More broadly,
though, Yahoo and IBM expect their
partnership to shake up the field of
"enterprise search," in which leading
providers such as Google, Autonomy Corp. and
Norway-based FAST are seeing forays from
business software giants such as Microsoft
Corp., Oracle Corp. and SAP AG.
Google has been
dominant at the lower end of the market
selling "search appliances" that begin at
$2,000 and range up to $30,000. The
top-of-the-line version can comb through
500,000 documents. Not coincidentally, that
is the same limit that IBM and Yahoo have
set for their free software -- although
Google's product includes hardware that
operates the search service.
"They're going to
create a real headache for Google at that
tier," said Forrester Research analyst
Matthew Brown.
Of course, whatever
pain Google feels ought to be put in context
-- it gets 99 percent of its revenue from
advertising, not from selling search
appliances.
While Yahoo and IBM
may eventually expand their partnership,
Yahoo will focus on the Web-search aspect of
the equation and not venture into enterprise
search, said Eckart Walther, Yahoo's vice
president of product management for search.
That would be in keeping with Yahoo's recent
pledge to stay focused on its consumer
audience and advertising network -- a step
aimed at resolving internal strife over a
muddled strategy.
Indeed, Forrester's
Brown said it appears that Yahoo is most
interested in using the IBM deal to
strengthen its brand in corporate
environments and get people using Yahoo Web
search at work more often.
Touch User Interface Links Podcasts To Printed Text
Somatic Digital LLC said Friday
it has developed technology that lets publishers
integrate podcasts into their paper and ink content. The
tool is offered through the BookDesigner software suite.
The software tool allows publishers to tie a
podcast
to a paper-based text, supplement or magazine, the
company said. The reader touches the page in a printed
book and a podcast is directed to the reader’s computer
or downloaded to an MP3 player through Bluetooth
technology. The podcast can serve as a supplement to the
paper-based product, bringing new revenue opportunities
to publishers and authors, the company said.
Laurie Sullivan, "Touch User Interface Links Podcasts
To Printed Text," Information Week, December 16,
2005 ---
http://www.internetweek.cmp.com/showArticle.jhtml?sssdmh=dm4.161133&articleId=175004719
Everyone knows a lot about
something, whether it's quasars, quilting,
or crayons. But the converse is also true: there
are a lot of things that most people know
nothing about. And unfortunately, that doesn't
seem to stop them from sharing their opinions.
That's one lesson I took away from my recent
survey of the growing collection of social
question-and-answer websites, where members can
post questions, answer other members' questions,
and rate other members' answers to their
questions--all for free. The Wikipedia-like,
quintessentially Web 2.0
premise of these ventures--which include
Yahoo Answers,
Microsoft's
Live QnA,
AnswerBag,
Yedda,
Wondir, and Amazon's new
Askville--is
that the average citizen is an untapped well of
wisdom.
But
it takes a lot of sifting to get truly useful
information from these sites. Each boasts a core
of devoted members who leave thorough and
well-documented answers to the questions they
deem worthy. And most of the sites have systems
for rating the performance or experience of
answerers, which makes it easier to assess their
reliability, while also inspiring members to
compete with one another to give the best
answers. But not all of the Q&A sites do this
equally well; after all, the companies that run
these sites are selling advertising space, not
information.
In
an attempt to flush out the best of the bunch,
I've spent the past few days trying to identify
what unique advantages each one offers. I also
devised a diabolically difficult, two-part test.
First, I searched each site's archive for
existing answers to the question "Is there any
truth to the five-second rule?" (I meant the
rule about not eating food after it's been on
the floor for more than five seconds, not the
basketball rule about holding.)
Second, I posted the same two original questions
at each site: "Why did the Mormons settle in
Utah?" and "What is the best way to make a
grilled cheese sandwich?" The first question
called for factual, historical answers, while
the second simply invited people to share their
favorite sandwich-making methods and recipes. I
awarded each site up to three points for the
richness and originality of its features, and up
to three points for the quality of the answers
to my three questions, for a total of 12
possible points.
Features:
Launched in 2003,
AnswerBag is one of
the oldest Q&A
sites. Members get
points for asking
and answering
questions as well as
for rating other
members' questions
and answers. After
earning a certain
number of points,
members "level up"
from Beginner to
Novice, Contributor,
Wiz, Authority,
Expert, and
ultimately
Professor. Bloggers
or webmasters can
embed customized
AnswerBag "widgets"
in their own pages,
so that visitors to
a site about
restoring antiques,
for example, can ask
AnswerBag members
questions about
restoration.
Points: 1
Is
there any truth to
the five-second
rule?
All of AnswerBag's
answers about the
five-second rule
pertained to
basketball.
Points: 0
Why
did the Mormons
settle in Utah?
By press time--two
and a half days
after I posted the
question--I had
received only one
answer at AnswerBag.
Here it is, edited
for brevity (like
all the answers
quoted here): "The
church believes that
God directed Brigham
Young, Joseph
Smith's successor as
President of the
Church, to call for
the Mormons to
organize and migrate
west, beyond the
western frontier of
the United States to
start their own
community away from
traditional American
society." That's
more or less in line
with the best
answers to this
question at other
sites.
Points: 1
What
is the best way to
make a grilled
cheese sandwich?
I rated the answers
to this question
purely according to
their
mouthwateringness.
The best AnswerBag
answer, out of six:
"Grate cheddar
cheese or similiar
[sic] and then add
about a quarter of
the same amount of
Lancashire, cheshire
or similiar [sic]
crumbly white
cheese. Mix them
together with a
couple of spoonfuls
of milk until the
consistency goes
like thick cottage
cheese. Add lots of
black pepper. Spread
on lightly toasted
buttered bread and
put back under the
grill until the
cheese melts and is
golden brown.
Delish."
Points: 2
Continued in article
Jensen Comment
None of these free services is very good for accounting questions. For me,
Wondir did better with accounting questions than the other alternatives, but
none of these sites would be very helpful in answering questions about
accounting and tax rules.
Magellan is a
Perl, CGI-based meta search engine, designed to be highly extensible. It provides
an extended query language that enables it to perform complex requests and check
the results before showing them.
Current state of scholarly cyberinfrastructure in the humanities and social
sciences
From the University of Illinois Issues in Scholarly Communication Blog
"Our Cultural Commonwealth"
The American Council of Learned Societies has just
issued a report, "Our Cultural Commonwealth," assessing the current state of
scholarly cyberinfrastructure in the humanities and social sciences and
making a series of recommendations on how it can be strengthened, enlarged
and maintained in the future.
John Unsworth, Dean and Professor, Graduate School
of Library and Information Science here at Illinois, chaired the Commission
that authored the report.
Free pass to the "most comprehensive online research storehouse"
It's a lofty ambition -- the Internet equivalent of
nonprofit public television: a user-supported resource that pays top academics
to create authoritative maps, articles, and links to third-party content related
to virtually any scholarly topic. But the vast scope of the project hasn't
stopped former high-flying Silicon Valley entrepreneur Joe Firmage from building
Digital Universe, a commercial-free storehouse of information four years in the
making.
"A Free Online Encyclopedia: Digital Universe, a nonprofit website, aims
to be the most comprehensive online research storehouse," MIT's Technology
Review, March 6, 2006 ---
http://www.technologyreview.com/TR/wtr_16512,323,p1.html
Chinese-language version of Wikipedia
China's biggest Internet search site, Baidu.com, has
launched a Chinese-language encyclopedia inspired by the cooperative reference
site Wikipedia, which the communist government bars China's Web surfers from
seeing. The Chinese service, which debuted in April, carries entries written by
users, but warns that it will delete content about sex, terrorism and attacks on
the communist government. Government censors blocked access last year to
Wikipedia, whose registered users have posted more than 1.1 million entries,
apparently due to concern about its references to Tibet, Taiwan and other
topics. The emergence of Baidu's encyclopedia reflects efforts by Chinese
entrepreneurs to take advantage of conditions created by the government's
efforts to simultaneously promote and control Internet use.
"Baidu, the most popular search engine in China, has launched a Chinese-language
version of Wikipedia," MIT's Technology Review, May 18, 2006 ---
http://www.technologyreview.com/read_article.aspx?id=16896
"Co-Founder of Wikipedia Starts Spinoff With Academic Editors,"
University of Illinois Issues in Scholarly Communications blog, October
18, 2006 ---
http://www.library.uiuc.edu/blog/scholcomm/
Can scholars build a better version of Wikipedia?
Larry Sanger, a co-founder who has since become a critic of the open-source
encyclopedia, intends to find out.
This week Mr. Sanger announced the creation of the
Citizendium, an online, interactive encyclopedia that will be open to public
contributors but guided by academic editors. The site aims to give academics
more authorial control -- and a less combative environment -- than they find
on Wikipedia, which affords all users the same editing privileges, whether
they have any proven expertise or not.
The Citizendium, whose name is derived from
"citizen's compendium," will soon start a six-week pilot project to
determine many of its basic rules and operating procedures.
Mr. Sanger left Wikipedia at the end of 2002
because he felt it was too easy on vandals and too hard on scholars. There
is a lot to like about Wikipedia, he said, starting with the site's
open-source ethics and its commitment to "radical collaboration."
But in operation, he said, Wikipedia has flaws --
like its openness to anonymous contributors and its rough-and-tumble editing
process -- that have driven scholars away. With his new venture, Mr. Sanger
hopes to bring those professors back into the fold.
He plans to create for the site a "representative
democracy," in which self-appointed experts will oversee the editing and
shaping of articles. Any Web surfer, regardless of his or her credentials,
will be able to contribute to the Citizendium. But scholars with "the
qualifications typically needed for a tenure-track academic position" will
act as editors, he said, authorizing changes in articles and approving
entries they deem to be trustworthy.
A team of "constables" -- administrators who must
be more than 25 years old and hold at least a bachelor's degree, according
to the project's Web site -- will enforce the editors' dictates. "If an
editor says the article on Descartes should put his biography before his
philosophy, and someone changes that order, a constable comes in and changes
it back," said Mr. Sanger.
Of course the Wikipedia link to an unbelievably large (nearly 1.5 million articles
to date) database of information (and some misinformation) is at
http://en.wikipedia.org/wiki/Main_Page
"The Dangerous Side of Search Engines: Popular search engines
may lead you to rogue sites. Here's what you need to know to avoid dangerous
downloads, bogus sites, and spam," by Tom Spring, PC World via
The Washington Post, May 27, 2006 ---
Click Here
Who knew an innocent search for "screensavers"
could be so dangerous? It may actually be the riskiest word to type into
Google's search engine. Odds are, more than half of the links that Google
returns take you to Web sites loaded with either spyware or adware. You
might also face getting bombarded with spam if you register at one of those
sites with your e-mail address.
A recently released study, coauthored by McAfee and
anti-spyware activist Ben
Edelman, found that sponsored results from top
search engines AOL, Ask.com, Google, MSN, and Yahoo can often lead to Web
sites that contain spyware and scams, and are operated by people who love to
send out spam.
The
study concluded that an average of 9 percent of
sponsored results and 3 percent of organic search results link to questionable Web
sites. The study was based on analysis of the first five pages of search
results for each keyword tested.
According to the results of the study, the top four
most dangerous searches on Google are:
The study defined dangerous sites as those that
have one or a combination of the following characteristics: its downloads
contain spyware and/or adware; its pages contain embedded code that performs
browser exploits; the content is meant to deceive visitors in some way; it
sends out inordinate amounts of spam to e-mail accounts registered at the
site.
These results are a sobering wake-up call to Web
surfers, and they illustrate the changing nature of Internet threats today.
It used to be that most viruses and scams made their way to our PCs
via our inboxes. But thanks to security software
that's getting better at filtering out viruses, spam, and phishing attacks
from our e-mail, rogue elements are
having a difficult time booby-trapping our PCs.
"Scammers and spammers have clearly turned to
search engines to practice their trade," says Shane Keats, market strategist
for McAfee.
McAfee says that of the 1,394 popular keywords it
typed into Google and AOL alone, 5 percent of the results returned links to
dangerous Web sites. Overall, MSN search results had the lowest percentage
of dangerous sites (3.9 percent) while Ask search results had the highest
percentage (6.1 percent).
Given the study's findings, it shouldn't come as a
big surprise that the company has a free tool, called McAfee SiteAdvisor,
for tackling the problems. In my tests I found it does a great job of
protecting you from the Web's dark side.
Since March McAfee has been offering a
browser plug-in that works with Mozilla Firefox
and
Microsoft Internet Explorer. SiteAdvisor puts a
little rectangular button in the bottom corner of the browser. If a site
you're visiting is safe, the SiteAdvisor button stays green. When you visit
a questionable Web site the button turns red or yellow (depending on the
risk level) and a little balloon expands with details on why SiteAdvisor has
rated the site as such.
SiteAdvisor ratings are based on threats that
include software downloads loaded with adware or spyware, malicious code
embedded in Web pages, phishing attempts and scams, and the amount of spam
that a registered user gets.
SiteAdvisor takes it a step further with Google,
MSN, and Yahoo. With these search engines, it puts a rating icon next to
individual results. This is a great safety feature and time saver, steering
you clear of dangerous sites before you make the mistake of clicking on a
link.
"Kid-Friendly Search Engines Filter Content," by Akeya Dickson, The
Washington Post, May 8, 2006 ---
Click Here
It's not unheard of these days for a child doing
online research for a school project to accidentally stumble into a porn
site or someplace else that's too dicey for a parent's comfort level.
Between e-mail filters, parental controls and
special software, there are plenty of tools meant to help parents keep their
children safe. The next target for fed-up parents: Internet search engines
such as Google and Yahoo.
The upside of the modern-day search engine -- an
index of Web sites on the Internet -- is also the downside. And when kids
research a report by tapping search words in Google or Yahoo, chances are
good that they may run across something they shouldn't see.
Christine Willig, president of Cincinnati-based
Thinkronize, said that one in four children across the country is exposed to
pornography by age 11 -- often over the Internet.
Her company's flagship product, NetTrekker, a
child-safe search engine featuring 180,000 sites that are regularly reviewed
by 400 volunteer teachers, has been in schools since 2000, including many in
Virginia, Maryland and the District.
Now, the product is being made available for home
users for $9.95 ( http://www.netrekker.com/ ).
Willig, the mother of seven, said children's
potential exposure to questionable Internet content was the primary reason
she left her job as a textbook publisher and joined the start-up Thinkronize.
"My decision to leave was driven by my own
experiences with my own children and stories I've heard from other parents
and teachers," she said.
Since then, the product has been used in 12,000
schools across the United States -- reaching an estimated 7 million
students. School administrators and parents in other countries -- including
Hong Kong, Turkey and Nigeria -- also have expressed an interest in the
product, she said.
In Pennsylvania, the search engine was adopted in
school districts across the state.
Exposure to inappropriate sites "was definitely a
huge concern with teachers," said Mary Schwander, a library media specialist
at New Hope-Solebury High School in New Hope, Pa. "Some kids did a
comparison between Google and NetTrekker and found that NetTrekker was more
favorable to use and quicker."
Willig acknowledges that offensive and
inappropriate sites have been found -- but usually by teachers and specialty
software that constantly scan the sites, not the students.
"With our tools in place, we have found porn sites,
and we have found them before users," Willig said. "There's a Martin Luther
King site that's now a hate site, really a KKK thing in disguise. There are
those things that we have to look out for with a combination of technology
and human review."
That is the main challenge constantly facing John
Stewart and Ryan Krupnik, the guys behind the family-safe search engine
RedZee. The site filters out pornographic results and delivers targeted
searches.
"Ryan and I have put a great deal of time and money
to make sure things are blocked, but we're really coming to a point where we
need the general public to help us," said Stewart. "We can't possibly catch
all of it. I would love to say we're 200 percent, but we're not."
"Please Do Not Use These Programs for Illegal Purposes:
Powerful new tools let you search for free software and music, zoom in on
landmarks and buildings, and add comments to news stories," by Steve Bass, PC
World via The Washington Post, August 21, 2007 ---
Click Here
I don't know what Google was thinking
when it allowed Google Hacks to be posted on the Google Code site. But it's
a sure bet most people won't abide by the "Please do not use this program
for illegal uses" disclaimer you'll find on the download site.
Google Hacks is a front-end GUI you can use as a
stand-alone app or as a browser toolbar. It performs searches you can
already do--if you know the syntax. For instance, if I wanted to search for
Dave Brubeck, I could pop the following into Google's search field:
But it's obviously a heck of a lot easier to type
into Google Hacks and choose the music category.
Google Hacks lets you search in any one of 12
categories--music, applications, video, books, lyrics, and others. But
there's a catch. The searches are indexes--Web site directories that haven't
been protected. Translation: You have to sort through lists of files and
some, if not most, could be unrelated to what you're searching for.
At the same time, you might hit the jackpot--loads
of files with just the content you're looking for. The showstopper is that
the content belongs to someone else who doesn't know how to hide it from
prying eyes. (And yes, I know, that person may have downloaded the music
illegally as well.)
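The article's actual example query is omitted above, but front-ends like Google Hacks typically generate "open directory" queries of the following general shape. The operator mix here (`intitle:`, `-inurl:`) is my reconstruction from common advanced-search syntax, not Google Hacks' literal output.

```python
def open_directory_query(artist, extensions=("mp3", "wma")):
    """Build a typical 'unprotected index' music search query.

    Such queries look for auto-generated directory listings
    ('index of' pages) that mention the artist and a media file
    extension, while excluding ordinary HTML pages. Illustrative
    only; the exact operators a given tool emits may differ.
    """
    ext = "|".join(extensions)
    return (f'intitle:"index of" "{artist}" ({ext}) '
            f'-inurl:(htm|html|php)')

# e.g. open_directory_query("Dave Brubeck")
```

Pasting the resulting string into an ordinary search box performs the same kind of search the tool hides behind its "music" category, which is why the results are raw file listings rather than normal web pages.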
BTW, credit for this masterpiece goes to Jason
Stallings, the author of Google Hacks. Jason doesn't work for Google, but
his program was released using Google's free code hosting service. You can
find more of Jason's code on his Web site.
Dig This: Microsoft's entry into the mobile phone
arena is sure to give Apple a run for the money--and promises to take the
nerd world by storm.
Microsoft's Photosynth is awesome--and addictive.
You can travel to Rome, zoom in on St. Peter's Basilica, and see
details--and I mean close, close up--that I guarantee will amaze you. (The
hardware requirements are stringent--more in a sec.) Don't believe me? Watch
this 7-minute demonstration.
But wait a minute: Unless you have a heavy-duty
PC--you need Windows XP and the hardware needs to be Vista ready--save your
time. You just won't be able to use Photosynth. (My wife's out of luck;
she's been playing with Photosynth on my machine.) If you have the system
requirements, you'll also need to download a small ActiveX plug-in available
at the Photosynth site.
Photosynth is now up and running. (My friend Bill
Webb has a good write-up about it.)
Continued in article
Google is a great search engine, but it's also more
than that. Google has tons of hidden features, some of which are quite fun
and most of which are extremely useful— if you know about them. How do you
discover all these hidden features within the Google site?
See
http://www.informit.com/articles/article.asp?p=675528&rl=1
Maybe
my mind is drifting—or maybe 2 plus 2 does equal 4.
Terminator
3 has been playing recently on cable. [Don’t read further if you don’t want
to know the ending!]
At
the end of Terminator 3, we learn that Skynet (which takes over the world in the
future and tries to kill all humans) is not controlled by just one major
computer as we thought in Terminators 1 and 2, but instead, Skynet is all the
computers on earth connected together—acting as one giant computer brain.
Tonight
I was watching 60 Minutes on TV and they dedicated 30 minutes to Google. Google
is able to search all computers connected to the Internet. Recently Google
released software that will search all the computers on LANS. Now you can Google
on your cell phone, search libraries, etc. etc. etc. Now they are working on a
universal translator (Star Trek, anyone?) that will automatically search and
translate any document in any language.
Is
Google Skynet? Think about it.
Glen L. Gray, PhD, CPA
Dept. of Accounting & Information Systems
College of Business & Economics
California State University, Northridge
Northridge, CA 91330
http://www.csun.edu/~vcact00f
January 3, 2005 reply
from Bob Jensen
Hi
Glen,
I
also watched the excellent 60 Minutes module. Google
is amazing in almost every aspect, including how it is managed. I think that all business policy and organization behavior students
should watch this module. It will be
interesting to see how long the company can continue to grow at an exponential
pace and maintain its long-standing motto to “Do No Evil.” These
guys really believe in that motto. Google is probably the most cautious
firm in the world about who gets hired and promoted.
There
has never been anything quite like Google in terms of management, except SAS
probably comes a little bit close.
Yes, I think Google could become Skynet if it were not for the
serious policy of Google to not be a monopolist (except by default), which is the
antithesis of Microsoft Corporation. Also,
there is the black cloud of Microsoft hanging over Google to pull down
Google’s Skynet even if it takes a trillion dollars.
There
were some very fascinating things that I learned from the 60 Minutes module. For one thing, Google is getting closer to scanning the documents in
alternate languages around the world and then translating each hit into a
language of choice (probably English to begin with). Secondly,
I knew that Google bought Keyhole, but I had not played in recent years with the
amazing keyhole (not Google Views) --- http://www.keyhole.com/
I
might also add that this module was followed by another module on The World’s
Most Beautiful Woman --- http://www.cbsnews.com/stories/2004/12/29/60minutes/main663862.shtml
She’s very articulate and a pure delight in this world of sinking morality
even though her movie roles to date have been Bombay frivolous.
Bob
Jensen
"Google's Cloud Looms Large: How might expanding Google's
cloud-computing service alter the digital world?," by Kate Greene, MIT's
Technology Review, December 3, 2007 ---
http://www.technologyreview.com/Biztech/19785/?nlid=701
To know how you'll be using computers and the
Internet in the coming years, it's instructive to consider the Google
employee: most of his software and data--from pictures and videos, to
presentations and e-mails--reside on the Web. This makes the digital stuff
that's valuable to him equally accessible from his home computer, a public
Internet café, or a Web-enabled phone. It also makes damage to a hard drive
less important. Recently, Sam Schillace, the engineering director in charge
of collaborative Web applications at Google, needed to reformat a defunct hard
drive from a computer that he used for at least six hours a day.
Reformatting, which completely erases all the data from a hard drive, would
cause most people to panic, but it didn't bother Schillace. "There was
nothing on it I cared about" that he couldn't find stored on the Web, he
says.
Schillace's digital life, for the most part, exists
on the Internet; he practices what is considered by many technology experts
to be cloud computing. Google already lets people port some of their
personal data to the Internet and use its Web-based software. Google
Calendar organizes events, Picasa stores pictures, YouTube holds videos,
Gmail stores e-mails, and Google Docs houses documents, spreadsheets, and
presentations. But according to a Wall Street Journal story, the company is
expected to do more than offer scattered puffs of cloud computing: it will
launch a service next year that will let people store the contents of entire
hard drives online. Google doesn't acknowledge the existence of such a
service. In an official statement, the company says, "Storage is an
important component of making Web apps fit easily into consumers' and
business users' lives ... We're always listening to our users and looking
for ways to update and improve our Web applications, including storage
options, but we don't have anything to announce right now." Even so, many
people in the industry believe that Google will pull together its disparate
cloud-computing offerings under a larger umbrella service, and people are
eager to understand the consequences of such a project.
To be sure, Google isn't the only company invested
in online storage and cloud computing. There are other services today that
offer a significant amount of space and software in the cloud. Amazon's
Simple Storage Service, for instance, offers unlimited and inexpensive
online storage ($0.15 per gigabyte per month). AOL provides a service called
Xdrive with a capacity of 50 gigabytes for $9.95 per month (the first five
gigabytes are free). And Microsoft offers Windows Live SkyDrive, currently
with a one-gigabyte free storage limit.
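The price points quoted above invite a quick comparison. Here is a hedged sketch of the monthly cost of each plan at a given storage size; the dollar figures are the 2007 prices from the article, while the function names and the flat-rate reading of Xdrive's plan are my assumptions.

```python
# Sketch only: comparing the 2007 storage plans quoted above.
# Prices come from the article; everything else is illustrative.

def s3_cost(gb):
    """Amazon S3: $0.15 per gigabyte per month, pay for what you use."""
    return 0.15 * gb

def xdrive_cost(gb):
    """AOL Xdrive: first 5 GB free, then a flat $9.95/month up to 50 GB."""
    if gb <= 5:
        return 0.0
    if gb <= 50:
        return 9.95
    raise ValueError("Xdrive plan capped at 50 GB")

def skydrive_cost(gb):
    """Windows Live SkyDrive: free up to its 1 GB limit."""
    if gb <= 1:
        return 0.0
    raise ValueError("SkyDrive capped at 1 GB")

# Break-even: S3's metered price beats Xdrive's flat fee below
# 9.95 / 0.15, about 66 GB, but Xdrive is capped at 50 GB anyway.
for gb in (5, 20, 50):
    print(f"{gb:>2} GB  S3 ${s3_cost(gb):5.2f}  Xdrive ${xdrive_cost(gb):5.2f}")
```

Under these assumptions S3 is the cheaper choice at every size Xdrive actually offers, which helps explain why metered pricing caught on.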
But Google is better positioned than most to push
cloud computing into the mainstream, says Thomas Vander Wal, founder of
Infocloud Solutions, a cloud-computing consultancy. First, millions of
people already use Google's online services and store data on its servers
through its software. Second, Vander Wal says that the culture at Google
makes it easier for the company to tie together the pieces of cloud computing
that today might seem a little scattered. He notes that Yahoo, Microsoft,
and Apple are also sitting atop huge stacks of people's personal information
and a number of online applications, but there are barriers within each
organization that could slow down the process of integrating these pieces.
"It could be," says Vander Wal, "that Google pushes the edges again where
everybody else has been stuck for a while."
Continued in article
How to search for academic videos
Answer: First go to YouTube and search for professors or courses if you have
the names.
One Web site that opened this week, Big Think, hopes to be "a YouTube for
ideas." The site offers
interviews with academics, authors, politicians, and other thinkers. Most of
the subjects are filmed in front of a plain white background, and the
interviews are chopped into bite-sized pieces of just a few minutes each.
The short clips could have been served up as text quotes, but Victoria R. M.
Brown, co-founder of Big Think, says video is more engaging. "People like to
learn and be informed of things by looking and watching and learning," she
says.
YouTube itself wants to be a venue for academe. In
the past few months, several colleges have signed agreements with the site
to set up official "channels." The University of California at Berkeley was
the first, and the University of Southern California, the University of New
South Wales, in Australia, and Vanderbilt University soon followed.
It remains an open question just how large the
audience for talking eggheads is, though. After all, in the early days of
television, many academics hoped to use the medium to beam courses to living
rooms, with series like CBS's Sunrise Semester, which began in 1957.
Those efforts are now a distant memory.
Things may be different now, though, since the
Internet offers a chance to connect people with the professors and topics
that most interest them.
Even YouTube was surprised by how popular the
colleges' content has been, according to Adam Hochman, a product manager at
Berkeley's Learning Systems Group. Lectures are long, after all, while most
popular YouTube videos run just a few minutes. (Lonelygirl, the diary of a
teenage girl, had episodes that finished in well under a minute. Many other
popular shorts involve cute animals or juvenile stunts). Yet some lectures
on Berkeley's channel scored 100,000 viewers each, and people were sitting
through the whole talks. "Professors in a sense are rock stars," Mr. Hochman
concludes. "We're getting as many hits as you would find with some of the
big media players."
YouTube officials insist that they weren't
surprised by the buzz, and they say that more colleges are coming forward.
"We expect that education will be a vibrant category on YouTube," said
Obadiah Greenberg, strategic partner manager at YouTube, in an e-mail
interview. "Everybody loves to learn."
To set up an official channel on YouTube, colleges
must sign an agreement with the company, though no money changes hands. That
allows the colleges to brand their section of the site, by including a logo
or school colors, and to upload longer videos than typical users are
allowed.
The company hasn't exactly made it easy to find the
academic offerings, though. Clicking on the education category shows a mix
of videos, including ones with babes posing in lingerie and others on the
lectures of Socrates. But that could change if the company begins to sign up
more colleges and pay more attention to whether videos are appearing in the
correct subject areas, says Dan Colman, director and associate dean of
Stanford University's continuing-studies program, who runs a blog tracking
podcasts and videos made by colleges and professors.
In many cases, the colleges were already offering
the videos they are putting on YouTube on their own Web sites, or on Apple's
iTunes U, an educational section of the iTunes Store. But college officials
say that teaming up with YouTube is greatly expanding their audiences
because so many people are poking around the service already.
'YouTube for Intellectuals' Goes Live
Amy Gutmann,
president of the University of Pennsylvania, talks about the importance of
racial, socioeconomic, and religious diversity at colleges in a
video on bigthink,
a new Web site that is meant to be a YouTube for intellectuals. In addition
to featuring academics, the site includes one- to two-minute videos from
politicians, artists, and business people.
According to an article in Monday's New York Times, the site was
started by Peter Hopkins, a 2004 graduate of Harvard University. He said he
hopes bigthink becomes popular among college students. David Frankel, a
venture capitalist, put up most of the money for the enterprise. Lawrence H.
Summers, a former president of Harvard, has invested tens of thousands of
dollars as well.
How many videos are on YouTube at this moment?
How many new videos are added (uploaded) on average each day?
The content on both
YouTube.com and
YouTube.ca will be the same,
but the Canadian site will highlight homegrown material, said international
product manager Luis Garcia. The site becomes the 15th country-specific site,
Garcia said. ''The only thing that's different is that this is just a Canadian
lens into that content, so if a user wants to get the Canada point of view into
that global body of content, then they're able to do that,'' Garcia told
reporters at the YouTube.ca launch event Tuesday in Toronto. That means that
content uploaded by users in Canada will show up as ''top favorites'' and
''recommended content'' on the site. . . .
YouTube, which was founded in February 2005, hosts more
than 100 million video views every day with 65,000 new videos uploaded daily.
Within a year after its launch, YouTube made headlines when Google Inc. acquired
the company for US$1.65 billion worth of stock.
"Popular video-sharing site YouTube launches Canadian version," MIT's
Technology Review, November 6, 2007
http://www.technologyreview.com/Wire/19682/?nlid=653
Recall that UC Berkeley has over 300 lectures (mostly in science) on YouTube ---
http://www.youtube.com/ucberkeley
Other Open Courseware videos ---
http://www.trinity.edu/rjensen/000aaa/updateee.htm#OKI
Jensen Comment
With 15 or more nations having their own YouTube videos, it will make it more
difficult to search for given topics since the videos will not be maintained in
a single archive. Hopefully, YouTube will one day have a search engine for
searching all of its archives at the same time. Of course this will not overcome
language barriers.
SpiralFrog.com, an ad-supported Web site with a terrible name, which
allows visitors to download music and videos free of charge, commenced on
September 17, 2007 in the U.S. and Canada after months of "beta" testing.
At launch, the service was offering more than 800,000 tracks and 3,500 music
videos for download ---
http://www.spiralfrog.com/
This week, I tested four
video-search engines, including revamped entrant Truveo.com, a smartly
designed site that combs through Web video from all sorts of sources ranging
from YouTube to broadcasting companies. Truveo, a subsidiary of AOL, is
stepping out on its own again after spending three years in the background,
powering video search for the likes of Microsoft, Brightcove and AOL itself.
It unveiled its new site last week, though I've been playing with it for a
few weeks now.
This Web site,
www.truveo.com, operates under the idea
that users don't merely search for video by entering specific words
or phrases, like they would when starting a regular Web search.
Instead, Truveo thinks that people don't often know what they're
looking for in online video searches, and browsing through content
helps to retrieve unexpected and perhaps unintended (but welcome)
results. I found that, compared with other sites, Truveo provided
the most useful interface, which showed five times as many results
per page as the others and encouraged me to browse other clips.
In effect, Truveo combines
the browsing experience of a YouTube with the best Web-wide
video-search engine I've seen.
The other video-search sites
I tested included Google's (www.google.com/video)
and Yahoo's (www.video.yahoo.com),
as well as Blinkx.com (www.blinkx.com).
None of these three sites do much to encourage
browsing; by default they display as many as 10 results per search
on one page and display the clips in a vertical list, forcing you to
scroll down to see them all. The majority of clips watched on Truveo,
Yahoo and Blinkx direct you to an external link to play the video on
its original content provider's site -- which takes an extra step
and often involves watching an advertisement.
Searching on Google video
almost always displays only content from Google and its famously
acquired site, YouTube. The giant search company is working on
improving its search results to show a better variety of content
providers. Still, the upside here is that clips play right away in
the search window rather than through a link to the site where the
video originated. YouTube works this way because its clips are
user-generated -- either made by users and posted to the site or
copied from original host sites and posted to YouTube, saving a trip
to the original content provider's site.
Yahoo's video-searching page
looks clean and uncluttered, with a large box for entering terms or
phrases with which to conduct searches. Two options -- labeled "From
Yahoo! Video" and "From Other Sites" -- help you sort results in one
step. But the clips that I found on Yahoo video seemed less
relevant, overall, and included more repeated clips. One search for
the Discovery Channel's "Man Versus Wild" show returned seven clips,
four of which were identical.
Blinkx, a three-year-old
site, distinguishes itself with its "wall" feature -- a visually
stimulating grid of moving video thumbnails. It is like Truveo in
that it also works behind the scenes for bigger companies, including
Ask.com. Blinkx says it uses speech recognition and analysis to
understand what the video is about, while the others stick to
text-based searching. And this seemed to hold true: I rarely got
results that were completely off-base using Blinkx.
But Truveo's focus on
browsing and searching worked well. It repeatedly displayed spot-on
results when I was looking for a video about a specific subject, or
provided a variety of other videos that were similar, requiring less
overall effort on my part. Its most useful feature is the way it
shows results: by sorting clips into neatly organized buckets, or
categories, such as Featured Channels, Featured Tags and Featured
Categories. These buckets spread out on the page in a gridlike
manner, giving your eye more to see in a quick glance.
. . .
With so many videos
added to the Web each day, the search for online clips can be
fruitless and tiresome. Truveo starts users out with enough relevant
clips right away so that they can more easily find what they're
looking for. And its organizational buckets encourage browsing and,
therefore, entertainment -- one of the reasons for Web video's
popularity.
Truveo takes a
refreshing look at video search, and as long as you have the
patience to travel to sites where content originated, you'll find it
useful. It stands apart from other search engines in looks and
functionality.
Welcome to CogPrints, an electronic archive for self-archived papers in any
area of psychology, neuroscience, and linguistics, and many areas of computer
science (e.g., artificial intelligence, robotics, vision, learning, speech,
neural networks), philosophy (e.g., mind, language, knowledge, science,
logic), biology (e.g., ethology, behavioral ecology, sociobiology, behaviour
genetics, evolutionary theory), medicine (e.g., psychiatry, neurology, human
genetics, imaging), anthropology (e.g., primatology, cognitive ethnology,
archeology, paleontology), as well as any other portions of the physical,
social and mathematical sciences that are pertinent to the study of
cognition.
". . . the crisis in the scholarly communication
system not only threatens the well being of libraries, but also it threatens
our academic faculty's ability to do world-class research. With current
technologies, we now have, for the first time in history, the tools
necessary to effect change ourselves. We must do everything in our power to
change the current scholarly communication system and promote open access to
scholarly articles."
Paul G. Haschak's webliography provides resources
to help effect this change. "Reshaping the World of Scholarly Communication
-- Open Access and the Free Online Scholarship Movement: Open Access
Statements, Proposals, Declarations, Principles, Strategies, Organizations,
Projects, Campaigns, Initiatives, and Related Items -- A Webliography" (E-JASL,
vol. 7, no. 1, spring 2006) is available online at
http://southernlibrarianship.icaap.org/content/v07n01/haschak_p01.htm
E-JASL: The Electronic Journal of Academic and
Special Librarianship [ISSN 1704-8532] is an independent, professional,
refereed electronic journal dedicated to advancing knowledge and research in
the areas of academic and special librarianship. E-JASL is published by the
Consortium for the Advancement of Academic Publication (ICAAP), Athabasca,
Canada. For more information, contact: Paul Haschak, Executive Editor, Board
President, and Founder, Linus A. Sims Memorial Library, Southeastern
Louisiana University, Hammond, LA USA;
email: phaschak@selu.edu
Web:
http://southernlibrarianship.icaap.org/
The October/November 2006 issue (vol. 3, issue 1)
of INNOVATE is devoted to open source and the "potential of open source
software and related trends to transform educational practice." Papers
include:
"Getting Open Source Software into Schools:
Strategies and Challenges" by Gary Hepburn and Jan Buley
"Looking Toward the Future: A Case Study of Open
Source Software in the Humanities" by Harvey Quamen
"Harnessing Open Technologies to Promote Open
Educational Knowledge Sharing" by Toru Iiyoshi, Cheryl Richardson, and Owen
McGrath
Innovate [ISSN 1552-3233] is a bimonthly,
peer-reviewed online periodical published by the Fischler School of
Education and Human Services at Nova Southeastern University. The journal
focuses on the creative use of information technology (IT) to enhance
educational processes in academic, commercial, and government settings.
Readers can comment on articles, share material with colleagues and friends,
and participate in open forums. For more information, contact: James L.
Morrison, Editor-in-Chief, Innovate; email:
innovate@nova.edu ; Web:
http://www.innovateonline.info/ .
Is the increasing availability of documents
diminishing our reliance on colleagues for resource information? In 2004,
Pertti Vakkari and Sanna Talja surveyed 900 faculty members and PhD students
in Finnish universities to answer the question, "How are academic status and
discipline associated with the patterning of search methods used by
university scholars for finding materials for teaching, research, and
keeping up to date in their field?" They report their findings in "Searching
for Electronic Journal Articles to Support Academic Tasks. A Case Study of
the Use of the Finnish National Electronic Library (FinELib)" (INFORMATION
RESEARCH, vol. 12 no. 1, October 2006). One interesting discovery was that,
in contradiction to earlier studies, colleagues were considered "unimportant
sources for discovering needed [electronic] materials." However, the authors
believe that, while this role for colleagues is diminishing, their role as
"discussion partners concerning matters of research is considerably more
important than their role as providers of information about literature."
Information Research [ISSN 1368-1613] is a freely
available, international, scholarly journal, dedicated to making accessible
the results of research across a wide range of information-related
disciplines. It is privately published by Professor T.D. Wilson, Professor
Emeritus of the University of Sheffield, with in-kind support from the
University and its Department of Information Studies. For more information,
contact: Tom Wilson, Department of Information Studies, University of
Sheffield, Sheffield S10 2TN, UK; tel: +44 (0)114-222-2642; fax: +44
(0)114-278-0300;
email: t.d.wilson@shef.ac.uk ;
Web:
http://informationr.net/ir/ .
Search for Terms on Book Pages
The Absolutely Fantastic New Search Tool From Amazon
Google now has a new service (Google Print) for reading parts and searching
among pages of new books that is both similar to and different from the
groundbreaking Amazon free service.
DISPLAYING FIRST 50 OF 790 WORDS - Google Print, the
new search engine that allows consumers to search the content of books online,
could help touch off an important shift in the balance of power between
companies that produce books and those that sell them, publishing executives
said here on ... Google announced the introduction of the...
Google's mission is to organize the world's
information and make it universally accessible and useful. Since a lot of the
world's information isn't yet online, we're helping to get it there. Google
Print puts the content of books where you can find it most easily: right in
Google search results.
To use Google Print, just search on Google as you
normally would. For example, do a search on a subject such as "Books
about Ecuador Trekking," or search on a title like "Romeo and
Juliet." Whenever a book contains content that matches your search terms,
we'll show links to that book in your search results. Click on the book title
and you'll see the page that contains your search terms, as well as other
information about the book. You can also search for other topics within the
book. Click "Buy this Book" and you'll go straight to a bookstore
selling the book online.
Frequently Asked Questions
How does it work? What types of books are available? Can I read an entire book
online? Where does the book content come from? What can I do with a book that
I find using Google Print? Does Google keep track of the pages I'm viewing?
I'm searching for a specific book – why can't I find it? Does Google profit
when I buy a book from a Google Print page? I think I found a bug – who can
consign it to oblivion?
Google provides examples here!
You can read more about the competing Amazon book search and sample page
reading service below.
I find the Google service a bit easier to use, but I found that Amazon gave
me greater coverage of new books. Google will probably get better and
better over time. Neither service covers books that publishers have not
allowed surfers to search inside. In many instances this is a mistake on
the part of the publishing firms since finding a book by searching for a phrase
may greatly improve sales of the book.
Amazon’s ability to search through millions of
book pages to unearth any tidbit is part of a search revolution that will change
us all. Steven Levy, MSNBC, November 10, 2003 --- http://www.msnbc.com/news/987697.asp?0dm=s118k
Hints from Bob Jensen
Be sure you note
the Previous Page and the Next Page options when you bring up a page of
text.
Note the option
at the top to "See all references" to your search term within a
given book (this is a wonderful search utility).
When you hit the
end of the allowed pages of reading, you might be able to find a phrase on
that last page that you can enter as a search term. I've done this and
have been able to bring up another five pages, etc. This is a
cumbersome way to read large portions of the book.
However, soon Amazon puts up a message that you have reached a limit of your
searches on the book and will deny you further searches. This software
is amazingly sophisticated.
The pages are
scanned pages and will sometimes show images as well as text in the original
colors. For example, search for "gnp graph" and note the
second hit to The Third World Atlas by Alan Thomas.
How It Works ---
http://snurl.com/BookSearch
A significant extension of our groundbreaking Look Inside the Book
feature, Search Inside the Book allows you to search millions of pages
to find exactly the book you want to buy. Now instead of just displaying
books whose title, author, or publisher-provided keywords match
your search terms, your search results will surface titles based on
every word inside the book. Using Search Inside the Book is as simple as
running an Amazon.com search.
Amazon.com Inc. said
a new program that allows customers to search the contents of some books has
boosted sales growth by 9% for titles in the program above other titles that
can't be searched.
The news from the
Seattle-based Internet retailer suggests that concerns among some book
publishers that the search service might hurt sales haven't materialized.
Amazon last Thursday introduced the service, called Search Inside the Book,
which gave its customers a way to scour complete copies of 120,000 books from
190 publishers, a major advance over the searches customers were previously
limited to, such as searches by title and author name.
Some book publishers
have stayed out of the new Amazon search service because of concerns that
users can easily scan Amazon's electronic copies instead of buying the books.
In the days since the service launched, though, Amazon has monitored sales of
120,000 book titles that can be searched through its new service and says
growth in sales of those books significantly outpaced the growth of all other
titles on the site. Amazon said 37 additional publishers have contacted the
company since the search service launched asking to have their books included
in the program.
"It's helping
people find things they couldn't otherwise find," Steve Kessel, vice
president of Amazon's North American books, music and video group, said in an
interview. "There are people who love authors and who are finding things,
not just by the author, but about the author."
Although its
customers can search entire books with the new service, Amazon has
restrictions that limit the ability to browse entire books online. Once a user
clicks to a book page containing terms that they've searched for -- "Gulf
War," for instance -- Amazon doesn't let them browse more than two pages
forward or back. Users may jump to other pages containing the terms, but the
same restrictions on browsing apply.
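The restriction described above (jump to any page that matched your search, but browse no more than two pages forward or back from it) can be sketched as a simple set computation. Everything here, the function name and the page model alike, is an illustrative assumption, not Amazon's actual implementation.

```python
def browsable_pages(hit_pages, total_pages, window=2):
    """Pages a user could view: each page that matched the search,
    plus up to `window` pages on either side of it."""
    allowed = set()
    for p in hit_pages:
        lo = max(1, p - window)
        hi = min(total_pages, p + window)
        allowed.update(range(lo, hi + 1))
    return sorted(allowed)

# A search hitting pages 10 and 40 of a 300-page book exposes only
# two small windows of the text:
print(browsable_pages([10, 40], 300))
# -> [8, 9, 10, 11, 12, 38, 39, 40, 41, 42]
```

Modeled this way, a reader sees at most five pages per hit, which is why the phrase-hopping trick Bob Jensen describes above can stitch windows together until the per-book limit kicks in.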
Search technology is
becoming an increasingly important focus for Amazon and for online shopping in
general. The company recently established a new division in Silicon Valley,
called A9, which is developing searching technology for finding products to
purchase on the Internet. The project is getting underway at a
time when more shoppers are using search engines like Google and comparison
shopping sites like BizRate.com to locate products.
Amazon has a head
start on another big Internet company in the book search department. Google
Inc. is also talking to publishers about allowing searches of the contents of
books, according to people familiar with the matter. A Google spokesman
declined to comment.
Google's Scholarly Search Engine and Some Publisher-Ripoff Reasons Why It
Has Big Problems
"Google to Launch Scholarly
Search," The Wall Street Journal, November 18, 2004, Page A8
---
Google
Inc. today is set to introduce a service allowing computer users to search the
content of scholarly publications. The free service, called Google Scholar,
searches academic literature available on the Web or through Google's
agreements with publishers. Search results will include dissertations,
peer-reviewed papers, articles and books. To rank the results, Google will
consider such factors as where a document was published and how many other
scholarly works cite it, factors that aren't a part of its usual ranking
system for Web pages. In some cases, publishers require consumers to pay a fee
to see the full text of a document. In Google's current test version, the
service doesn't include advertisements.
Online search engine leader Google Inc. is setting
out to make better sense of all the scholarly work stored on the Web.
The company's new service, unveiled late Wednesday at
http://scholar.google.com,
draws upon newly developed algorithms to list the academic research that
appears to be most relevant to a search request. Mountain View-based Google
doesn't plan to charge for the service nor use the feature to deliver
text-based ads - the primary source of its profits.
"Google has benefited a lot from scholarly
research, so this is one way we are giving back to the scholarly
community," said Anurag Acharya, a Google engineer who helped develop the
new search tools.
Although Google already had been indexing the reams
of academic research online, the company hadn't been able to separate the
scholarly content from commercial Web sites.
By focusing on the citations contained in academic
papers, Google also engineered its new system to provide a list of potentially
helpful material available at libraries and other offline sources.
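The articles say only that Google Scholar's ranking weighs where a paper was published and how many other works cite it, alongside ordinary text relevance. A minimal sketch of such a citation-weighted score, with entirely invented weights, venue labels, and function names (none of this is Google's actual formula), might look like:

```python
import math

# Illustrative only: rank blends text relevance with a log-damped
# citation count and a venue prior. All numbers are made up.
VENUE_WEIGHT = {"top journal": 1.0, "conference": 0.7, "preprint": 0.4}

def scholar_score(citations, venue, text_relevance):
    """Higher citations and better venues boost a relevant document."""
    citation_term = math.log1p(citations)      # damp runaway citation counts
    venue_term = VENUE_WEIGHT.get(venue, 0.5)  # default for unknown venues
    return text_relevance * (1.0 + citation_term) * venue_term

papers = [
    ("A", 250, "top journal", 0.80),
    ("B",  10, "preprint",    0.90),
    ("C",  40, "conference",  0.85),
]
ranked = sorted(papers, key=lambda p: scholar_score(*p[1:]), reverse=True)
print([name for name, *_ in ranked])
# -> ['A', 'C', 'B']: citations and venue outweigh a small relevance edge
```

The log damping is the interesting design choice: without it, one heavily cited classic would drown out every recent paper on the same topic.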
The scholarly search effort continues Google's effort
to probe even deeper into content available online and offline. Just last
month, Google expanded a program that invites publishers to scan their books
into the search engine's index, enabling people to peek at the contents online
before deciding whether to buy a copy.
I did a search on
XBRL and found that Google did an excellent job of finding research on this
specialist area. I will be recommending this site to my students in future,
Roger
November 19, 2004 reply from Clifford
Budge
I have just screened
through its offerings in relation to a single topic: Cash Flow. 100 screens
full of references - must be 800 or more. It took over an hour to screen
through all the titles!
Let me give a very
rough impression of what came out of screening through the topic.
For a Google search
approach, there are very few ref's that seem to be totally irrelevant to the
title "Cash Flow".
Most of the articles
are from journals with a wide, business interest.
Many report
possibilities to implement academic studies for practical use, on topics
probably of interest to the financial markets, specific industries.
Some focus on
developing methods of forecasting cash flows - for control, or calculating
investment opportunities etc.
There are at least a
dozen articles of academic research in the area, up to 12 or so years old.
Most of them discuss theory of applying various aspects of CF in
investing/business situations.
Academics looking for
a research area in the field might well locate something with a potential for
closer consideration.
OVERALL, this topic
has probably been well-served by Google.
Clifford Budge
Macquarie University,
Sydney Australia
Email: cbudge@efs.mq.edu.au
As you may have read on AECM, I've already used this
new Google search to assess it against my own interests in Cash Flow
Statements.
It would be wise for us "wise men" to put
Google to the test:
Could a number of readers, in different aspects of
accounting research, put the system to the test?
My own very quick test on Cash Flow research
presented a huge majority of articles from magazines without a research
focus. - Some of them considered the possible application of research
articles to business situations - which isn't the same thing, is it? - I was
mystified at the low proportion of articles from the "recognised"
research journals: perhaps someone might correlate the "hit rate"
for their topic of interest back to the journals? (I have records of
articles over the period reported that did not reach their site).
The whole job took me less than two hours! What
about some other analyses to spread our knowledge?
One of the real problems of scholarly research is
that scholarly research journals think the only way they can make money and
control copying losses is to restrict publications to hard copy. This
prevents Internet search crawlers like Google from finding key words buried
in text.
Pogo got it right: "The enemy is us." In
particular our worst enemies are faculties who still insist on publication
in "elite" journals that shut out easy searches for literature via
the Internet. What is worse is
that scholarly journal publication has become a monopoly of the worst kind
(rip-off pricing of libraries) that some universities and virtually all
librarians are fighting as best they can.
Ted Bergstrom, an economist at UCSB, explains this
phenomenon (where free entry and the existence of free or cheaper non-profit
journals does not preclude monopoly profits by academic journal peddlers)
via a parable that illustrates the well-known coordination game in the
theory of games. The equilibrium is a situation where everyone is worse off.
You can see the paper at http://www.econ.ucsb.edu/~tedb/Journals/jeprevised.pdf
. I am giving below just a snippet that explains the concept through a
parable.
Bob, we do not have to go very far to find the
effects of this. Look at AAA and how it extracts monopoly rents by
restricting knowledge, if there is much of it, in its journals.
Jagdish
_______________________________________________
The Anarchists' Annual Meeting: A Parable
This tale is intended to illustrate the workings of
coordination games, and to show that in such games, the presence of
potential competitors does not necessarily prevent monopoly pricing.
A large number of anarchists find it valuable to
attend an annual meeting of like-minded people. The meeting is more valuable
to each of them, the greater the number of other anarchists who attend. A
meeting attended by only a few is of little value to any of them. At some
time in the past, the anarchists started to gather on a particular day of
the year in one hotel in a certain city. Other hotels in this and other
cities would have served equally well for the meeting, but since each
anarchist expects the others to appear at the usual hotel, they return every
year to the same hotel on the day of the meeting.
A few years after the anarchists had established
their routine, the hotel that served as their meeting-place increased its
prices for the day of their annual meeting. Most anarchists valued the
annual meeting so highly that they continued to attend, despite the price
increase. A few decided that at the higher price, they would rather stay
home. The hotel owner observed that although attendance was slightly
reduced, the fall in attendance was less than proportional to the price
increase and thus his revenue and his profits increased. In subsequent
years, after some experimentation, the hotel owner learned that he could
maximize his annual profit by setting a price on the anarchists' meeting day
that was much higher than that of other hotels. After setting this price,
the hotel owner proclaimed that he was offering a uniquely valuable service
to the anarchists.
The anarchists were annoyed at having to pay
tribute to the hotel owner for services no better than other hotels offered
more cheaply. Moreover, since all of the anarchists prefer larger attendance
to smaller, they were all made worse off by the fact that high prices caused
some of their number to stay home. But what else could they do? Each
anarchist was aware that he or she would be better off if they could all
meet at one of the many other hotels offering equal physical facilities at a
lower price. Given their beliefs and temperaments, the anarchists were
resistant to making and obeying centralized decisions. Lacking central
direction, the anarchists were unable to coordinate a move to another hotel.
No individual, nor even any small group of anarchists, could gain by moving
to another hotel because small meetings, however cheap, are not worth much
to any of them.
Pessimistic anarchists speculated that even if they
were somehow able to re-coordinate at a cheaper hotel, this victory would be
short-lived. The new hotel, like its predecessor, would raise its prices to
take advantage of the anarchists' disorderly ways. More optimistic
anarchists suggested that the problem of organizing a meeting at a new hotel
is not insurmountable, even for anarchists. Therefore, argued the optimists,
once it is demonstrated that the anarchists will move their meeting if
prices become excessive, the hotel at which they settle will moderate its
prices rather than provoke another mass defection.
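The parable's incentive structure can be sketched numerically. In this toy model, each attendee's payoff grows with total attendance and falls with the hotel's price; the linear value function and all the numbers are illustrative assumptions, not figures from the paper:

```python
def utility(attendance, price, value_per_attendee=1.0):
    """Payoff to one attendee: grows with total attendance, net of the price."""
    return value_per_attendee * attendance - price

# 100 anarchists at the incumbent hotel charging a monopoly price
incumbent = utility(attendance=100, price=60)    # 40.0
# a splinter group of 5 at a cheap rival hotel: small meetings aren't worth much
splinter = utility(attendance=5, price=10)       # -5.0
# everyone coordinating on the cheap hotel would beat both
coordinated = utility(attendance=100, price=10)  # 90.0
```

Even at the monopoly price, attending beats defecting in a small group, which is exactly why the incumbent hotel can keep charging it.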
How It Works ---
http://snurl.com/BookSearch
A significant extension of our groundbreaking Look Inside the Book
feature, Search Inside the Book allows you to search millions of pages
to find exactly the book you want to buy. Now, instead of just displaying
books whose title, author, or publisher-provided keywords match
your search terms, your search results will surface titles based on
every word inside the book. Using Search Inside the Book is as simple as
running an Amazon.com search.
CiteBase
Citebase is a trial service that allows researchers
to search across free, full-text research literature
ePrint archives, with results ranked according to
criteria such as citation impact.
Gateway to ePrints
A listing of ePrint servers and open access
repository search tools.
Google Scholar
A search tool for scholarly citations and abstracts,
many of which link to full text articles, book
chapters, working papers and other forms of
scholarly publishing. It includes content from many
open access journals and repositories.
OAIster
A search tool for cross-archive searching of more
than 540 separate digital collections and archives,
including arXiv, CiteBase, ANU ePrints, ePrintsUQ,
and others.
Scirus
A search tool for online journals and Web sites in
the sciences.
Social scientists and business scholars often use SSRN (not free) ---
http://www.ssrn.com/
If you have access to a college library, most colleges generally have
paid subscriptions to enormous scholarly literature databases that are not
available freely online. Serious scholars obtain access to these vast
literature databases.
MIT's Video Lecture Search
Engine: Watch the video at ---
http://web.sls.csail.mit.edu/lectures/
Researchers at MIT have released a video and audio search tool that solves one
of the most challenging problems in the field: how to break up a lengthy
academic lecture into manageable chunks, pinpoint the location of keywords, and
direct the user to them. Announced last month, the MIT
Lecture Browser website gives the general public
detailed access to more than 200 lectures publicly available through the
university's OpenCourseWare initiative. The search engine
leverages decades' worth of speech-recognition research at MIT and other
institutions to convert audio into text and make it searchable.
Kate Greene, MIT's Technology Review, November 26, 2007 ---
http://www.technologyreview.com/Infotech/19747/?nlid=686&a=f
Once again, the Lecture Browser link (with video) is at
http://web.sls.csail.mit.edu/lectures/
Bob Jensen's search helpers are at
http://www.trinity.edu/rjensen/Searchh.htm
The web is easy to use, but using it well is not
easy. We are inventing new ways to take search one step farther and make it
more effective. We provide a unique set of powerful features to find
information, organize it, and remember it—all in one place. A9.com is a
powerful search engine, using web search and image search results enhanced by
Google, Search Inside the Book™ results from Amazon.com, reference results
from GuruNet, movies results from IMDb, and more.
A9.com remembers your information. You can keep your
own notes about any web page and search them; it is a new way to store and
organize your bookmarks; it even recommends new sites and favorite old sites
specifically for you to visit. With the A9 Toolbar all your web browsing
history will be stored, allowing you (and only you!) to retrieve it at any
time and even search it; it will tell you if you have any new search results,
or the last time you visited a page.
I don't think A9.com will be the search engine of choice for some time to
come. It also has a long way to go in terms of luring advertising
revenue.
Features of the Amazing Google
Did you know that Google will calculate equations?
In addition to providing easy access to more than 4
billion web pages, Google has many special features to help you to find
exactly what you're looking for. Click the title of a specific feature to
learn more about it.
A man in Southern California is irate over the
results of “Googling” his name. Mark Maughan, certified public accountant of
the Brown & Maughan firm, believes the search results for “Mark Maughan”
contained “alarming, false, misleading and injurious results.”
Maughan discovered that Google’s results
about him and his company made false claims that, according to NBC4News,
“the search results falsely represent that plaintiffs Maughan and/or Brown
& Maughan have been disciplined for gross negligence, for failing to timely
submit a client's claim for refund of overpayment of taxes, and for practicing
as a CPA without a permit.”
Plaintiff attorney John A. Girardi believes that Google’s PageRank system is
what caused this misinformation. In the suit, Girardi states that Google PageRank
“reformats information obtained from accurate sources, resulting in changing
of the context in which information is presented.”
While it’s true that Google results pages alter the context of information,
PageRank does not actually determine search result descriptions.
The attorney stated that a literal reprint would be suitable, but that the
reformat gives misinformation. He is asking that Google discontinue using
PageRank. Girardi is asking for unspecified monetary damages, as well.
Also named in the lawsuit are Yahoo, AOL, and Time Warner.
Google Will Generate a Map to An Address From a Telephone Number
As I see the new Google service (see
below), its main attraction to me is in finding a quick map when I know a
person's home or a business phone number. Often I have a phone number but
do not have an address. Even if I have an address, it takes more time to
bring up a mapping service (like Mapquest) and then type in an address.
Google has implemented an address/map
service. If you type a phone number in the format (210)555-5555 you will
then be given the address and links to a map of where this phone number is
located. Scary! But this type of service has been available from
some other services for years (although not necessarily with the quick map
service).
It works for home phones and most
business phones. It will give you an address and map for some business
phone numbers but not others. It did not work for the main Trinity
University phone number (210)999-7701. It also does not work for my office
phone or my cell phone. It also does not work for unlisted numbers.
The phone numbers are not entirely up to date. When I type in a phone number
(210)653-5055 that I cancelled in June, it still brings up my former address
where I no longer live. My wife and I got a new phone number in New Hampshire
in June. It does not find our NH address, but other services like Switchboard
are also not up to date in terms of "new" listings.
Note that if you have online documents
with your phone number on them (e.g., a resume), Google will also find those
documents like it does with any other search term.
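Since the lookup only works when the number is typed in the (210)555-5555 format described above, a small helper can normalize numbers before pasting them into the search box. The target format comes from the passage; the normalization logic itself is a hypothetical convenience, not anything Google provides:

```python
import re

def to_google_phone_format(raw):
    """Normalize a U.S. phone number to the (210)555-5555 lookup format.

    Accepts numbers with dots, dashes, spaces, or a leading country code.
    """
    digits = re.sub(r"\D", "", raw)          # strip everything but digits
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]                  # drop a leading U.S. country code
    if len(digits) != 10:
        raise ValueError("expected a 10-digit U.S. number")
    return f"({digits[:3]}){digits[3:6]}-{digits[6:]}"
```

For example, `to_google_phone_format("1-210-555-5555")` yields `"(210)555-5555"`.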
The empire of Google Inc. is officially going interplanetary. Working with researchers from NASA at Arizona State
University, the search engine has compiled images of Mars on a map
Web site, making it possible to view the dunes, canyons
and craters of the red planet as easily as the cul-de-sacs and cityscapes of
Earth. Infrared images at http://mars.google.com even pull up things normally invisible to the naked
eye. Having mapped the Earth and the relatively nearby moon, Google said seeking
out farther-flung planetary conquests is a natural progression.
"Need to Find Your Way on Mars? Google It," by Yuki Noguchi, The Washington
Post, March 14, 2006 ---
Click Here
Google added historic map overlays to its free interactive online globe of
the world to provide views of how places have changed with time.
"Google Earth maps history," PhysOrg, November 14, 2006 ---
http://physorg.com/news82706337.html
Google recently published its Web Services interface
at http://www.google.com/apis (tech explanation). We've built an email
interface to Google. Actually, the folks in Marketing built it, which says a
lot about the simplicity of Web services. Just email google@capeclear.com
and put the text of your query in the "Subject" line. You'll receive
your search results via email.
It's not going to take the world by storm, but maybe
it'll kick start some thought processes on the power of Web Services. It might
be useful for PDAs, mobile phones, offline laptop users, and generally people
who have infrequent, low quality access to the Internet. Some people may find
it easier to use email rather than launch a browser, or maybe you could just
use it to remind yourself to do something on the Internet...
There are some interesting queries that you can do on
google, that transfer nicely to CapeMail. One trick is to do the query "
site:www.capeclear.com ceo " to find out Cape Clear's CEO. Send this
query to CapeMail - and find out who our CEO is...
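Constructing such a query email takes only a few lines. The gateway address (google@capeclear.com) and the Subject-line convention come from the passage above; the builder function, the sender address, and any SMTP delivery details are illustrative assumptions:

```python
from email.message import EmailMessage

def build_capemail_message(query, sender="you@example.com"):
    """Build a CapeMail query email: the search terms go in the Subject line."""
    msg = EmailMessage()
    msg["To"] = "google@capeclear.com"   # gateway address from the article
    msg["From"] = sender                 # placeholder sender
    msg["Subject"] = query               # e.g. "site:www.capeclear.com ceo"
    return msg
```

The finished message could then be handed to `smtplib.SMTP(...).send_message(msg)` to deliver it through your own mail server.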
International: Are you French, Dutch, or Russian, or otherwise
outside the general '.com' arena? To see sites in just your region, append the
text "site:.XX" to the end of your subject query, where XX is your
domain of interest. For example, to see all occurrences of CapeClear in Denmark
do the following query: CapeClear site:.dk; for a similar query on French
websites try this
Shortcut: More useful is the following idea: Store
this link on your desktop. (How?: hover over this link, right mouse
click->'Copy Shortcut', then on your Windows desktop, right mouse
click->'Paste Shortcut'). A handy shortcut for CapeMail access.
Discuss CapeMail in the CapeScience forum or email ed@capeclear.com
and check out our sister offering CapeSpeller
Tutorials and Books on How to Use Google
Google has become so huge that
learning what you can do, and remembering how to use what you once learned,
is as complex as running a Microsoft Office product. How
many of us know and use all of the features in MS Word? How many of us know
and use all of the features in Excel, such as Goal Seek, Solver, Pivoting, and 3D
graphing? How many of us know how to use the new exotic features of PowerPoint?
There are books, videos, and online
tutorials that will illustrate how to use MS Office features.
Although I have not yet found online
video tutorials on Google features, there are now books that you can buy such as
How to Do Everything With Google by Fritz Schneider, Nancy Blachman, and Eric
Fredricksen (McGraw-Hill, 2004) --- http://books.mcgraw-hill.com/cgi-bin/pbg/0072231742.html
A
drawback of books and tutorials for Google vis-a-vis MS Office products is that
Google seems to add new features monthly whereas Microsoft adds new features at
a slower pace.
Barry Rice tells us how to search for PowerPoint and other file types
July 15, 2007 message from Barry Rice
[brice@LOYOLA.EDU]
I just read in PC Magazine that
you can Google by file type by entering in the search box
"filetype:<filetype> <search term>"
e.g., entering the following in the search
box returns 374,000 hits [quotes left out to minimize confusion]:
filetype:ppt accounting
I get 27,800 links to PowerPoint files when
I search for:
filetype:ppt accounting auditing
I get 969 links to PowerPoint files when I
search for:
filetype:ppt accounting derivatives
I get 15 links to PowerPoint files when I
search for the following, a couple of which, amazingly, are not Bob:
When I typed the phrase "filetype:ppt
accounting derivatives" (without quote marks) into the
"Advanced Search" box it would not work properly. The phrase must be typed
in the "All the words" search box to work properly. This makes sense
in retrospect --- Dahh!
When I typed the phrase "filetype:ppt
accounting derivatives AND Jensen" (without quote marks) into
the "All the words" search box I got some but not all of my PowerPoint files
on derivatives that are listed at
http://www.cs.trinity.edu/~rjensen/Calgary/CD/JensenPowerPoint/
When I typed the phrase "filetype:ppt
"accounting derivatives" AND Jensen" without the outer quote marks it
reduced the number of hits, but it also missed more of my PowerPoint files
on this topic.
When I typed the phrase "filetype:xls
accounting derivatives AND Jensen" I did find some of my Excel workbooks on
this topic but not all Excel workbooks under the following URL ---
http://www.cs.trinity.edu/~rjensen/Calgary/CD/
My conclusion is that if you want your PowerPoint (ppt) files or other file
types like xls to be found on some topic like "accounting derivatives," it is
best to be very careful to use that phrase in the title or in the list of key
words for each file.
When I typed the phrase "filetype:ppt
accounting "FAS 157" AND Jensen" (without the outer quote marks) I found my
most recent PowerPoint file on FAS 157 ---
Click Here
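For readers who prefer to script these searches rather than retype them, a small helper can assemble a Google query URL that combines the filetype: operator (and optionally site:) with search terms. The operators are Google's syntax as described above; the helper function itself is a hypothetical sketch:

```python
from urllib.parse import urlencode

def google_filetype_query(terms, filetype="ppt", site=None):
    """Assemble a Google search URL restricted to one file type.

    terms: the search words; filetype: e.g. "ppt" or "xls";
    site: optional domain restriction such as ".dk" or "www.capeclear.com".
    """
    query = f"filetype:{filetype} {terms}"
    if site:
        query += f" site:{site}"
    return "https://www.google.com/search?" + urlencode({"q": query})

url = google_filetype_query("accounting derivatives")
```

Opening the resulting URL in a browser runs the same search Barry Rice describes, without having to remember where the operator goes.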
Google
Inc. announced a low-cost hardware and software package
that small- and medium-size organizations can use for searching their own Web
sites and other information.
The Internet search company is selling
the $4,995 Google Mini, which includes a computer server and software,
exclusively through its online store. Organizations can use the Google Mini to
let staff search for shared documents and information on internal Web sites
and permit the public to search their external Web sites.
Google says it has over 800 customers
for the Google Search Appliance, a more powerful but similar product. The
Google Search Appliance, with a minimum price tag of $32,000, represented less
than 2% of Google's $2.2 billion in revenue during the first nine months of
2004.
Question
What can you do to prevent being taken on eBay?
(Word of Caution: Never open an email message that pretends to be from Pay-Pal)
Two brothers have published a book of "true tales of
treachery, lies and fraud" from eBay. "Dawn of the eBay Deadbeats" contains
stories written by eBay buyers and sellers. From stories of disappointing
purchases to out-and-out fraud, the book is a manual of what can go wrong when
buying and selling on auction sites. Brothers Stephen and Edward Klink co-wrote
the book, illustrated by Clay Butler. The idea for the book sprang from a
website Stephen Klink had created. A New Jersey police officer, he founded
eBayersThatSuck.com - a site that aims to help people avoid auction scams -
after he himself was ripped off online.
Ina Steiner, "Dawn of the eBay Deadbeats: New Book Uncovers Online Auction
Treachery," AuctionBytes.com, December 28, 2005 ---
http://www.auctionbytes.com/cab/abn/y05/m12/i28/s01
Imagine buying vintage Spiderman
comics for $16,000 and receiving, instead, a box of printer paper, or
losing a whopping $27,000 purchasing a big rig that didn't exist in
the first place. These are just a few of the many online auction fraud horror
stories that brothers Edward and Steve Klink compiled from their eBay
watchdog Web site eBayersThatSuck.com (E.T.S.).
In their book "Dawn of the eBay Deadbeats," some
70 strange-but-true stories were collected and retold with the help of
illustrator Clay Butler.
The December 2005 publishing of the book comes just in time as the
online auction giant has been criticized by consumer groups, most
recently by the U.K. magazine "Computing Which?" for its passive and
sometimes delayed approach in handling fraud reports.
At any given time, the site has 78 million listings, and 6 million new
listings are added each day.
And while eBay maintains that less than 0.01 percent of all listings end
in a confirmed case of fraud, even that rate would mean that of the 1.9 billion
listings reported by eBay in 2005, as many as 190,000 cases were confirmed
frauds in the last year.
Currently, almost 900 horror stories from eBay fraud victims
are on the E.T.S. site, whose motto is "Winning the war on deadbeats."
And already the brothers are working on the next volume of horror
stories, encouraging victims who want their tales told to
get in contact with them.
United Press International spoke with Edward Klink about the recent
book, their watchdog
Web site, and the current state of eBay.
"We had collected hundreds of stories
on the Web site
and figured it was time to take these stories to a wider audience and
let the victims have their say," Edward Klink said. "Plus with our
combined backgrounds, Steve is a police officer and I'm a
business writer,
we felt we were ideally suited to get the job done."
Fraud on eBay can take on many forms including items paid for that vary
from the description in the sale, unpaid items, and spoof eBay or
Pay-Pal e-mails.
And like the many victims on their site, the brothers too have
encountered the problem of auction fraud.
In 2003, Steve, a New Jersey police officer, won a set of "new"
speakers, only to find that it looked as if they were "gnawed on by a
wild animal."
"The seller said they weren't that way when mailed, and eBay said there
was nothing they could do," Klink said. "Annoyed that he was stuck with
the merchandise and given no recourse, Steve started
www.ebayersthatsuck.com and stories began pouring in from around the
world."
And the site has received a positive response since it's been up and
running.
"People love it," Klink said. "On eBay, their official boards are
closely monitored and talk about problems and scams and eBay's failings
are not generally tolerated. So E.T.S. gives them an outlet. When it
first came out Ebayersthatsuck.com was featured on Courttv.com and
newspapers as far away as South Africa."
According to Klink, while eBay has what could be considered -- "the ultimate
business model" -- of collecting fees and
delegating the marketing, selling, packaging, shipping, and
customer service
to eBay users, it's very easy for these same users to fall victim to
fraud.
"I think consumers let their guard down when they are sitting at home
and surfing the Web with their coffee," he said. "If a stranger offered
them a $1,400 antique vase on the street they'd most likely walk away,
but when that same vase is on
the Internet for some reason the reaction is
more, 'Say, now that looks interesting.'"
And have the brothers seen any improvements in eBay's handling of the
fraud issue?
"eBay says it is a tiny fraction of all auctions," Klink said, "but the
hundreds of people who told us their stories hate being in that tiny
group and never thought they would be. Lots of fraud is underreported,
too. EBay encourages users to settle it among themselves, and if they
can't, then they are directed to pay $20.00 to have SquareTrade, a third
party, mediate the dispute. But it's not often a scammer shows up for
mediation!"
. . .
"We want people on eBay to have a good buying and
selling experience - transparent, well-lit, and safe," the spokesperson
said. "Fraud on all levels is something we take seriously."
The company also has a team dedicated to working
with law enforcement, whether it be educating them on fraudulent cases,
proactively taking information on specific cases to them, or
cooperating with investigations.
"We would invite anyone to visit the site and read
more," said the spokesperson, who also emphasized that the No. 1 issue for
online shoppers is to pay safely, using Pay-Pal or a credit card rather than
any other form of payment.
In many cases, consumers are able to get their
money back: Pay-Pal offers up to $1,000 back with buyer protection, and
credit card programs usually have a chargeback program in cases of fraud.
Pay-Pal also offers a way for consumers to make purchases without
providing personal information while protecting their money.
"Dawn of the eBay Deadbeats" ($12.95) is
available on Amazon, eBay, and in select bookstores.
Click Fraud Gets Smarter
Internet ad-traffic scams could be ripping off as much as $1 billion annually.
Are Web companies like Google doing enough to foil them?
"Click Fraud Gets Smarter," by Burt Helm, Business Week, February 27,
2006 ---
Click Here
Internet ad-traffic scams could be ripping off
as much as $1 billion annually. Are Web companies like Google doing enough
to foil them?
Web consultant Greg Boser has an ingenious method
for sending loads of traffic to clients' Internet sites. Last month he
began using a software program known as a clickbot to create the impression
that users from around the world were visiting sites by way of ads
strategically placed alongside Google search results. The trouble is, all
the clicks are fake. And because Google charges advertisers on a per-click
basis, the extra traffic could mean sky-high bills for Boser's clients.
But Boser's no fraudster. He cleared the procedure
with clients beforehand and plans to reimburse any resulting charges.
What's he up to? Boser wants to get to the bottom of a blight that's
creating growing concern for online advertisers and threatens to wreak havoc
across the Internet: click fraud.
BILLION-DOLLAR QUESTION. The practice can
wildly skew statistics on the popularity of an ad, drain marketing budgets,
and enrich the scam artists behind it. While click fraud isn't new, the
methods for carrying it out--take Boser's clickbot software--are getting
increasingly sophisticated. And some advertisers, analysts and consultants
question whether Web companies such as Google (GOOG) and Yahoo (YHOO) are
doing enough to nip click fraud in the bud. "No one has any idea how much
of this is actually going on," says Boser. "So we're going to see how well
[the search engines] actually try to protect advertisers."
One of Boser's biggest challenges is putting a
finger on exactly how widespread the practice is. Some search consultants
say click fraud accounts for upwards of 20% of all traffic, and may generate
more than $1 billion in dubious sales a year. Others say those stats vastly
overstate the problem.
Now, one of the biggest players in fraud detection
aims to end the guessing. Fair Isaac (FIC), which analyzes 85% of U.S.
credit card transactions, in partnership with Web search consultancy
Alchemist Media, will unveil plans at this week's Search Engine Strategies
Conference for what it says is the most rigorous study ever of click fraud.
Fair Isaac will invite companies to submit traffic data that can be mined
for aberrations that may signify fraud. "We've seen indications that the
overall losses due to click fraud could equal more than $1 billion [a
year]--larger than the total magnitude of credit card fraud in the U.S.,"
says Kandathil Jacob, Fair Isaac's director of product marketing. "It's
certainly worth our effort to look at it."
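The kind of aberration-mining described above can be illustrated with a deliberately simple sketch: flag any IP address whose click count in a traffic log exceeds a threshold. The fixed threshold and the per-IP heuristic are our simplifying assumptions for illustration, not Fair Isaac's actual methodology:

```python
from collections import Counter

def flag_suspicious_ips(clicks, threshold=50):
    """Flag IPs whose click counts look anomalous.

    clicks: iterable of (ip, ad_id) pairs from an ad-traffic log.
    Returns the set of IPs at or above the click threshold.
    """
    counts = Counter(ip for ip, _ad in clicks)
    return {ip for ip, n in counts.items() if n >= threshold}
```

Real detection systems weigh many more signals (click timing, geography, conversion rates), which is why studies like Fair Isaac's require advertisers' full traffic data.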
MORE CLICKS, MORE DOLLARS. A rising number
of companies would agree. The percentage of advertisers listing click fraud
as a "serious" problem tripled in 2005, to 16%, according to a survey by the
Search Engine Marketing Professional Organization. Advertisers have filed
at least two class-action suits saying Google, Yahoo, and other search
engines ought to be more up-front about methods for combating the practice.
Google says the suits are meritless. Yahoo declines to comment.
And in January, Standard & Poor's equity analyst
Scott Kessler downgraded Google stock in part because he considers click
fraud a "notable risk" (see BW Online, 1/17/06, "S&P Downgrades Google to
Sell"). Among his concerns: the prospect of false clicks may sour companies
on placing ads with Google. He too says Google needs to be more forthcoming
on the issue. "No one has any idea as to what Google assesses [as] its own
percentage of clicks that are generated by fraud, no idea what that process
consists of, and all the things that are being done to battle it," he says.
Question
What is so special about the new FactSpotter semantics-based search engine from
Xerox?
Xerox Rolls Out Semantics-Based Search
Xerox Corp. says its new search engine based on semantics will analyze the
meaning behind questions and documents to help researchers find information more
quickly. Developing the search engine is similar to understanding how brains
process information, said Frederique Segond, manager of parsing and semantics
research at Xerox Research Center Europe in Grenoble, France. "Many words can be
different things at the same time. The context makes the difference," she said.
"The tricky things here are not the words together but how are they linked." For
example, common searches using keywords "Lincoln" and "vice president" likely
won't reveal President Abraham Lincoln's first vice president. A semantic search
should yield the answer: Hannibal Hamlin. Segond, whose background is in math
and linguistics, said Stamford-based Xerox has been working on the project for
four years. FactSpotter was introduced in Grenoble on Wednesday and will launch
next year, initially to help lawyers and corporate litigation departments plow
through thousands of pages of legal documents. Xerox expects the technology to
eventually be used in health care, manufacturing and financial services. Xerox's
technology is part of a growing field in which researchers are trying to adapt
the complex workings of the brain to a computer.
Stephen Singer, PhysOrg, June 21, 2007 ---
http://physorg.com/news101560663.html
The Older AskOnce Search Engine from Xerox
Stuck in a search rut? Online search engines aren't your only option. AskOnce
from Xerox (www.xerox.com) aims to refine searching by allowing access to all
the information available to you via a single query. The program's simple and
advanced searches scour the Internet, your intranets, DocuShare (Xerox's
Web-based storage space), tech magazines and specific databases. The simple
search resembles a typical search engine but accesses more information. The
advanced search is less intuitive but more robust-- offering tools such as
scheduled searches. For finding documents buried in your network, AskOnce is
handy, but its online capabilities fall short of a good Web meta-search engine.
The price for 50-user licenses starts at $7,000 (street).
Liane Gouthro, "Search Me - AskOnce from Xerox - search service,"
LookSmart, Sept, 2001 ---
http://findarticles.com/p/articles/mi_m0DTI/is_9_29/ai_79756063
The New LinkedIn Platform Shows Facebook How It's Done
A social network showdown is coming. LinkedIn, which
aims to track your business and professional connections, has rolled out a new
developer platform, and already the majority of the web press is comparing
LinkedIn's efforts to Facebook's platform. It's a fair comparison, but there's one
key difference between the two — LinkedIn's platform is actually useful. Where
Facebook’s platform provides a proprietary programming language for developers
to build applications that run inside the site (so you can send your friends
fresh pair of virtual diapers or whatever), LinkedIn has created a platform in
the sense of what the word used to mean — a way of mixing, mashing, repurposing
and sharing your data. Think Flickr, not Facebook. The LinkedIn platform, known
as the LinkedIn Intelligent Application Platform, consists of two parts, a way
for developers to build applications that run inside your LinkedIn account (via
OpenSocial) and the far more useful and interesting part — ways to pull your
LinkedIn data out and use it elsewhere . . . As an example of the second half of
LinkedIn’s new platform, the company has announced a partnership with
Business Week which will see LinkedIn data pulled
into the Business Week site. For instance, if you land on a Business Week
article about IBM, the site will then look at your LinkedIn profile (assuming
you’ve given it permission to do so) and highlight the people you know at IBM.
Call it six degrees of Business Week, but it does something Facebook has yet to
do — it connects your data with the larger web. With Beacon having recently
blown up in Facebook’s face — something that’s become a trend for the site:
violate privacy, weather user backlash — LinkedIn’s new platform couldn’t come
at a better time. Frankly, it reminds us of the good old days when the data you
stored on websites was actually yours and you could pull it out and do
interesting things with it.
Scott Gilbertson, Wired News, December 10, 2007 ---
http://blog.wired.com/monkeybites/2007/12/the-new-linkedi.html
A rising tide of companies are tapping Semantic Web technologies to
unearth hard-to-find connections between disparate pieces of online data
"Social Networks: Execs Use Them Too: Networking technology gives companies a
new set of tools for recruiting and customer service—but privacy questions
remain," by Rachael King, Business Week, September 11, 2007 ---
Click Here
Encover Chief
Executive Officer Chip Overstreet was on the hunt for a new
vice-president for sales. He had homed in on a promising
candidate and dispensed with the glowing but unsurprising
remarks from references. Now it was time to dig for any
dirt. So he logged on to LinkedIn, an online business
network. "I did 11 back-door checks on this guy and found
people he had worked with at five of his last six
companies," says Overstreet, whose firm sells and manages
service contracts for manufacturers. "It was incredibly
powerful."
So
powerful, in fact, that more than a dozen sites like
LinkedIn have cropped up in recent years. They're responding
to a growing impulse among Web users to build ties,
communities, and networks online, fueling the popularity of
sites like News Corp.'s (NWS)
MySpace (see BusinessWeek.com,
12/12/05
"The MySpace Generation"). As of
April, the 10 biggest social-networking sites, including
MySpace, reached a combined unique audience of 68.8 million
users, drawing in 45% of active Web users, according to
Nielsen/NetRatings.
Of course,
corporations and smaller businesses haven't embraced online
business networks with nearly the same abandon as teens and
college students who have flocked to social sites. Yet
companies are steadily overcoming reservations and using the
sites and related technology to craft potentially powerful
business tools.
PASSIVE SEARCH.
Recruiters at Microsoft (MSFT)
and Starbucks (SBUX),
for instance, troll online networks
such as LinkedIn for potential job candidates. Goldman Sachs
(GS)
and Deloitte run their own online alumni networks for hiring
back former workers and strengthening bonds with
alumni-cum-possible clients. Boston Consulting Group and law
firm Duane Morris deploy enterprise software that tracks
employee communications to uncover useful connections in
other companies. And companies such as Intuit (INTU)
and MINI USA have created customer
networks to build brand loyalty.
Early
adopters notwithstanding, many companies are leery of online
networks. Executives don't have time to field the possible
influx of requests from acquaintances on business networks.
Employees may be dismayed to learn their workplace uses
e-mail monitoring software to help sales associates target
pitches. Companies considering building online communities
for advertising, branding, or marketing will need to cede
some degree of control over content.
None of
those concerns are holding back Carmen Hudson, manager of
enterprise staffing at Starbucks, who says she swears by
LinkedIn. "It's one of the best things for finding mid-level
executives," she says.
The Holy
Grail in recruiting is finding so-called passive candidates,
people who are happy and productive working for other
companies. LinkedIn, with its 6.7 million members, is a
virtual Rolodex of these types. Hudson says she has hired
three or four people this year as a result of connections
through LinkedIn. "We've started asking our hiring managers
to sign up on LinkedIn and help introduce us to their
contacts," she says. "People have concerns about privacy,
but once we explain how we use it and how careful we would
be with their contacts, they're usually willing to do it."
BOOMERANGS.
Headhunters
and human-resources departments are taking note. "LinkedIn
is a tremendous tool for recruiters," says Bill Vick, the
author of LinkedIn for Recruiting. So are sites
such as Ryze, Spoke, OpenBC, and Ecademy.
Continued in article
"Taming the World Wide Web: A rising tide of companies are tapping Semantic
Web technologies to unearth hard-to-find connections between disparate pieces of
online data," by Rachael King, Business Week, April 9, 2007 ---
Click Here
When Eli Lilly
scientists try to develop a new drug, they face a Herculean
task. They must sift through vast quantities of information
such as data from lab experiments, results from past
clinical trials, and gene research, much of it stored in
disparate, unconnected databases and software programs. Then
they've got to find relationships among those pieces of
data. The enormity of the challenge helps explain why it
takes an average of 15 years and $1.2 billion to get a new
drug to market.
Eli
Lilly (LLY)
has vowed to bring down those costs.
"We have set the goal of reducing our average cost of R&D
per new drug by fully one-third, about $400 million, over
the next five years," Lilly Chairman and Chief Executive
Officer Sidney Taurel told the American Chamber of Commerce
in Japan last August.
As part of
its cost-cutting campaign, the drugmaker is experimenting
with new technologies designed to make it easier for
scientists to unearth and correlate scattered, unrelated
morsels of online data. Outfitted with this set of tools,
researchers can make smarter decisions earlier in the
research phase—where scientists screen thousands of chemical
compounds to see which ones best treat symptoms of a given
disease. If all goes according to plan, the company will get
new pharmaceuticals to patients sooner, and at less cost.
Found in Space
Those tools
are the stuff of the Semantic Web, a method of tagging
online information so it can be better understood in
relation to other data—even if it's tucked away in some
faraway corporate database or software program. Today's
prominent search tools are adept at quickly identifying and
serving up reams of online information, though not at
showing how it all fits together. "When you get down to it,
you have to know whatever keyword the person used, or you're
never going to find it," says Dave McComb, president of
consulting firm Semantic Arts.
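The contrast with keyword search can be sketched with a toy triple store, in which facts are explicit subject-predicate-object statements that can be traversed rather than matched by keyword. The entities and predicates below are invented for illustration and are not anyone's actual data model:

```python
# Toy triple store: each fact is a (subject, predicate, object) statement,
# so relationships can be followed instead of guessed from keywords.
triples = [
    ("compound_42", "targets", "protein_X"),
    ("protein_X", "implicated_in", "disease_Y"),
    ("trial_17", "tested", "compound_42"),
    ("trial_17", "outcome", "reduced_symptoms"),
]

def related(subject, predicate=None):
    """Return objects linked to `subject`, optionally filtered by predicate."""
    return [o for s, p, o in triples
            if s == subject and (predicate is None or p == predicate)]

def chain(start, *predicates):
    """Follow a chain of predicates, e.g. compound -> protein -> disease."""
    frontier = [start]
    for pred in predicates:
        frontier = [o for s in frontier for o in related(s, pred)]
    return frontier

# Which diseases might compound_42 be relevant to, via its protein target?
print(chain("compound_42", "targets", "implicated_in"))  # → ['disease_Y']
```

A keyword engine would find this connection only if one page happened to mention both the compound and the disease; the triple traversal finds it even when the two facts live in separate sources.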
Researchers in a growing number of industries are sampling
Semantic Web knowhow. Citigroup (C)
is evaluating the tools to help
traders, bankers, and analysts better mine the wealth of
financial data available on the Web. Kodak (EK)
is investigating whether the
technologies can help consumers more easily sort digital
photo collections. NASA is testing ways to correlate
scientific data and maps so scientists can more efficiently
carry out planetary exploration simulation activities.
The Semantic
Web is in many ways in its infancy, but its potential to
transform how businesses and individuals correlate
information is huge, analysts say. The market for the
broader family of products and services that encompasses the
Semantic Web could surge to more than $50 billion in 2010
from $2.2 billion in 2006, according to a 2006 report by
Mills Davis at consulting firm Project10X.
Data Worth a Thousand Pictures
While
other analysts say it will take longer for the market to
reach $50 billion, most agree that the impact of the
Semantic Web will be wide-ranging. The Project10X study
found that semantic tools are being developed by more than
190 companies, including Adobe (ADBE),
AT&T (T),
Google (GOOG),
Hewlett-Packard (HPQ),
Oracle (ORCL),
and Sony (SNE).
Among the
enthusiasts is Patrick Cosgrove, director of Kodak's
Photographic Sciences & Technology Center, who is, not
surprisingly, also a photo aficionado. He boasts more than
50,000 digital snapshots in his personal collection. Each
year he creates a calendar for his family that requires him
to wade through the year's photos, looking for the right
image for each month. It's a laborious task, but he and his
colleagues aim to make it easier.
One project
involves taking data captured when a digital photo is taken,
such as date, time, and even GPS coordinates, and using it
to help consumers find specific images—say a photo of mom at
last year's Memorial Day picnic at the beach. Right now,
much of that detail, such as GPS coordinates, is expressed
as raw data. But Semantic Web technologies could help Kodak
translate that information into something more useful, such
as what specific GPS coordinates mean—whether it's
Yellowstone National Park or Grandma's house up the street.
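Turning a raw GPS fix into a meaningful place label can be sketched as a nearest-landmark lookup. This is a minimal illustration, not Kodak's method; the place table is hypothetical, and a real system would query a reverse-geocoding service instead:

```python
import math

# Hypothetical gazetteer of known places -> (latitude, longitude).
PLACES = {
    "Yellowstone National Park": (44.6, -110.5),
    "Grandma's house": (43.1, -77.6),
}

def nearest_place(lat, lon, max_km=50):
    """Label a photo's GPS fix with the closest known place, if close enough."""
    def haversine_km(a, b):
        lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
        h = (math.sin((lat2 - lat1) / 2) ** 2
             + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
        return 2 * 6371 * math.asin(math.sqrt(h))
    best = min(PLACES, key=lambda name: haversine_km((lat, lon), PLACES[name]))
    return best if haversine_km((lat, lon), PLACES[best]) <= max_km else None

print(nearest_place(44.58, -110.45))  # a fix a few kilometers inside Yellowstone
```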
Continued in article
A new natural-language system
is based on 30 years of research at PARC.
"Building a Better Search Engine," by Michael Reisman, MIT's
Technology Review, July 27, 2007 ---
http://www.technologyreview.com/Biztech/19109/?a=f
Powerset, Inc.,
based in San Francisco, is on the verge of offering an innovative
natural-language search engine, based on linguistic research at the
Palo Alto Research Center (PARC). The
engine does more than merely accept queries asked in the form of a question.
The company claims that the engine finds the best answer by considering the
meaning and context of the question and related Web pages.
"Powerset extracts deep concepts and relationships from the texts, and the
users query and match them efficiently to deliver a better search," Powerset
CEO Barney Pell says.
Even though attempts have been made at natural-language search for decades,
Powerset says that its system is different because it has solved some of the
fundamental technological problems that have existed with this kind of
search. It has done so by developing a product that is deep, computationally
advanced, and still economically viable.
Pell says that it's difficult to pinpoint one particular technological
breakthrough, but he believes that Powerset's superiority lies in the three
decades of hard work by
scientists at PARC. (PARC licensed much of its
natural-language search technology to Powerset in February.) There was not
one piece of technology that solved the problem, Pell says, but instead, it
was the unification of many theories and fragments that pulled the project
together.
"After 30 years, it's finally reached a point where it can be brought into
the world," he says.
A key component of the search engine is a deep natural-language processing
system that extracts the relationships between words; the system was
developed from PARC's Xerox Linguistic
Environment (XLE) platform. The framework that
this platform is based on, called Lexical Functional Grammar, enabled the
team to write different grammar engines that help the search engine
understand text. This includes a robust, broad-coverage grammar engine
written by PARC. Pell also claims that the engine is better than others at
dealing with ambiguity and determining the real meaning of a question or a
sentence on a Web page. All these innovations make the system more
adaptable, he says, so that it can extract deep relationships from text.
Continued in Article
Online Networking Site for Scientists Debuts
BiomedExperts.com, a
social-networking Web site for health-care and life-science experts, was
unveiled today at the American Library Association’s midwinter meeting, in
Philadelphia. The site includes profiles of more than 1.4 million biomedical
experts in 120 countries. Researchers can gain access to the site for free and
search for colleagues based on their areas of expertise, where they live, or
other variables. The site also allows scientists to share data and analyses, and
view summaries of their colleagues' research papers. The site is a collaboration
between Collexis Holdings Inc., a Dutch software company, and Dell, a computer
manufacturer.
Andrea L. Foster, Chronicle of Higher Education, January 11, 2008 ---
http://chronicle.com/wiredcampus/index.php?id=2656&utm_source=wc&utm_medium=en
New search tool from Google: Putting order into the wild west of the
blogosphere
It's
tough to make money in a chaotic environment, and things don't get more
rough-and-tumble than in today's blogosphere. The universe of blogs has
everything from little Johnny's web diary to serious journalism and
corporate marketing. Nevertheless, there's money to be made, and Google
is taking the first step to finding that pot of gold. The Mountain View,
Calif., company has launched a
blog-search tool that looks to bring order
to the unruly blogosphere. Experts say some blogs, such as those doing
credible work in journalism and commentary, are beginning to show
commercial potential. The problem, however, is to find and categorize
them, which is something Google does better than anyone.
InternetWeek Newsletter, September 15, 2005
Also see
http://www.internetweek.com/showArticle.jhtml?articleId=170703264
I fit into the original NWAL blogger category, meaning that I'm a Nerd
Without A Life blogger. Now of course there are millions of bloggers who
also have a life. I'm still stuck in the NWAL category.
The WSJ blogiversary highlights the impact of some selected blogs.
Christopher Cox, Chairman of the SEC, recommends searching for blogs at
Google and Blogdigger ---
http://www.blogdigger.com/index.html
He points out that Sun Microsystems CEO Jonathan Schwartz in his own blog challenged
the SEC to consider blogs as a means of corporate sharing of public information.
Christopher Cox, a strong advocate of
XBRL,
gives a high recommendation to the following XBRL blog:
For fast financial reporting, a recommended blog is Hitachi America, Ltd XBRL
Business Blog ---
http://www.hitachixbrl.com/
One of the great bloggers, and one of the all-time great CEOs, is Jack Bogle,
who founded what is probably the most ethical mutual fund business in the
world, called
Vanguard. He maintains his own blog (without a ghost blogger) called The
Bogle eBlog ---
http://johncbogle.com/wordpress/
Tom Wolfe (popular novelist) grew "weary of narcissistic shrieks and
baseless information."
Xiao Qiang, the founder of China Digital Times, recommends the
following blogs:
ZonaEuropa for global news with a focus on China ---
http://www.zonaeuropa.com/weblog.htm
Howard Rheingold's tech commentaries on the social revolution at
http://www.smartmobs.com/
DoNews from Keso (in Chinese) ---
http://blog.donews.com/keso
(Search engines like Google will translate pages into English)
For Newspapers and Magazines I highly recommend
Drudge Links ---
http://www.trinity.edu/rjensen/DrudgeLinks.htm
In particular I track Reason Magazine, The Nation, The New
Yorker, Sydney Morning Herald, Sky, Slate, BBC, Jewish World Review, and
The Economist
For financial news I like The Wall Street
Journal and the Business sub-section of The New York Times
Much more of my news and commentaries comes from online newsletters such as
MIT's Technology Review, AccountingWeb, SmartPros, Opinion Journal, The
Irascible Professor, T.H.E. Journal, and more too numerous to mention.
And I also get a great deal of information from
various listservs and private messages that people just send to me, many of whom
I've never met.
Better, More Accurate Image Search
"By modifying a common type of machine-learning technique, researchers have found
a better way to identify pictures," by Kate Greene, MIT's Technology
Review, April 9, 2007 ---
http://www.technologyreview.com/Infotech/18501/
HooRay for
Google! Down With Yahoo!
Yahoo is expanding a program that
lets advertisers pay to ensure that their sites are included in its search
results.
Is this new Yahoo policy an abuse of
advertising? I don't seem to mind the tiny advertising boxes that appear
on many Google searches, because I know they are advertisements, and they are
not obtrusive. But I can't say that I go along with the following
new policy of Yahoo. It's just one step away from
the highly abusive policy of listing all advertiser sites before listing the
most relevant sites in a search outcome. That is really abusive in what I
call CFO --- Crap First Out.
The
new Yahoo policy is CAO --- Crap Always Out
You may or
may not like Google's search results. You may disagree with its search methods.
But with Google, the search results you see are strictly those that its search
methodology yields. By contrast, at major competitors like Yahoo
and Microsoft's
MSN, the first search results you see are there, at least in part, because
companies paid to place them there.
Walter Mossberg (see below)
Microsoft and Ask Jeeves are dropping paid-inclusion links from their search
engines, a move that's winning praise. Yahoo is the last major search engine
that champions paid inclusion, but for how much longer?
"Paid Inclusion Losing Charm?" by Chris Ulbrich, Wired News,
July 5, 2004 --- http://www.wired.com/news/business/0,1367,64092,00.html?tw=newsletter_topstories_html
"Say Cheerio to Jeeves," by Arik Hesseldahl, Business Week,
February 27, 2006 ---
Click Here
AskJeeves' signature butler, borrowed from
novelist P.G. Wodehouse, is being dropped, as the search site switches names
to Ask.com and revamps its format
After nearly a decade as the search engine with a
human face, AskJeeves.com is dumping the cheerful visage of the butler that
has graced its pages. Starting on Feb. 27, the site will become known
simply as Ask.com.
The character had been used under an agreement
reached in 2000 with the estate of the late British novelist, P.G.
Wodehouse, who penned a series of novels involving the adventures of the
butler Jeeves and his master Bertie Wooster. When initially launched,
AskJeeves.com allowed users to phrase their search terms as questions, such
as "What is the capital of Ohio?" or "How many cups are in a gallon?"
Those days are over, says Daniel Read,
vice-president for consumer products at the new Ask.com, which for nearly a
year has been part of IAC Search & Media, a unit of IAC/Interactive (IACI),
Barry Diller's $5.7 billion (2005 sales) Internet concern: "The old name
hearkened back to what we were five to seven years ago and not what we are
now. And while we found there were some customers who were loyal to the
AskJeeves name, most of our users were ambivalent about it." IAC paid $1.85
billion for the site, which first launched in 1996.
PHASED OUT. The question approach worked
for a few years, and initially the company found a business building
customer-support Web sites that would allow customers to ask questions on
the Web. The business model changed when, in 2001, AskJeeves acquired
Teoma.com, itself once dubbed a "Google-killer," and built the Teoma search
technology into the AskJeeves site. Starting on Feb. 27, Teoma.com will
redirect users to Ask.com.
Jeeves' "retirement" hasn't been much of a secret.
Diller has been quoted several times over the last year as saying that the
character would be phased out.
"Lycos Europe's Survival Instincts," by Jack Ewing, Business Week,
February 24, 2006 ---
Click Here
The Web outfit is relying on a new, more
focused, search engine and cost cuts to deliver growth. Does it stand a
chance against Google?
By most conventional financial measures, Lycos
Europe should probably not exist. The Internet portal, search engine, and
Web services provider, part owned by German media giant Bertelsmann and
Spanish telco Telefónica (TEF), has had
only two profitable quarters in its six-year history. It's competing in a
business dominated by Google (GOOG) and Yahoo! (YHOO). And it must cope
with the European market, where economies of scale are undercut by the need
to offer content tailored to national tastes and languages.
Yet on Feb. 22, Lycos Europe Chief Executive
Christoph Mohn stood bravely before a handful of reporters and analysts and
explained why he believes the company will finally be profitable in 2006.
"In a couple of years people will see that we're one of the few global
players," said Mohn, while standing before a laptop at a Frankfurt hotel
conference room and paging through a PowerPoint presentation on the
company's 2005 results. Lycos' loss narrowed to $24 million on sales of
$149 million last year, vs. a loss of $54 million in 2004.
Somebody believes Mohn. Shares of Lycos Europe,
which is a separate company from U.S.-based Lycos, rose more than 60% last
year. True, the recent price of 1.07 euros ($1.27) was still a long way
from the 2000 Internet bubble price of more than 23 euros. And the shares
fell sharply after the 2005 results were announced. But long after most
highfliers from those days have been forgotten, Lycos still employs almost
700 people, primarily in Gütersloh,
Germany, and offers services in eight European countries, plus the former
Soviet republic of Armenia. It's also the largest chat service in Europe,
with 5 million users.
WORLD AT YOUR FINGERTIPS. Mohn is clearly
true believer No. 1. With an enthusiasm that recalls the Internet euphoria
of a few years ago, he describes the new technologies that he argues will
someday allow Lycos to earn a decent return. The newest service, just
introduced in Germany and currently being rolled out across Europe, is a
search engine called Lycos iQ that's supposed to give more focused results
than Google does.
Users can type in a question, which other users
answer. Users rank answers the same way that eBay (EBAY) users rate sellers
of goods. The idea is to build a database of questions and answers, with
the best answers rising to the top of the list. (It works: This writer
asked for advice on the best places to cross-country ski around Frankfurt,
and within a few minutes received an e-mail with a link to a Web site
devoted to the topic.) "Lycos allows you to tap into the knowledge of the
whole population," Mohn said in an interview.
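The mechanism described above, answers rising or falling by user ratings, can be sketched in a few lines. This is an illustrative toy, not Lycos iQ's actual scoring; the answers and vote counts are made up:

```python
# Toy community Q&A ranking: answers rise by net user rating, eBay-style.
answers = [
    {"text": "Try the Taunus hills north of Frankfurt.", "up": 14, "down": 2},
    {"text": "Use the city park loop.", "up": 3, "down": 5},
]

def rank(answers):
    """Sort answers so the best-rated rise to the top of the list."""
    return sorted(answers, key=lambda a: a["up"] - a["down"], reverse=True)

for a in rank(answers):
    print(a["up"] - a["down"], a["text"])
```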
Will that be enough to compete against the huge
resources of Google? "I don't think [Lycos Europe has] a chance, to be
honest," says Hellen K. Omwando, an analyst at Forrester Research in
Amsterdam. "Look at Yahoo and MSN--even they can't manage to siphon away
Google users."
"ONE OF THE SURVIVORS." Omwando praises
Lycos' cost-cutting measures, which included eliminating more than 200 jobs
last year. She also has kudos for some of Lycos' business-to-business
services, such as software that allows small businesses to easily set up
online shops. But Omwando says the individual assets don't add up to
long-term growth. "Lycos is one of those players waiting to be sold," she
says.
Truckloads of ink and gigabytes of
Internet space are being devoted these days to discussing the merits of Google,
the Web's leading search engine. Most of these articles aren't focusing on how
Google functions for its users but on its value as an investment in light of
the company's announcement last week that it is going public.
I don't give stock tips, and I have no
idea whether investing in Google is a good idea. But I want to focus for a few
moments here on why Google's stock offering is a big deal in the first place:
It's because the company has created a service that works brilliantly for
consumers.
Google's initial success was built on
its breakthrough search technology, which produced more useful search results,
much more quickly, than anyone else. Some analysts believe that edge is waning
or is gone. I still think Google is the best, but in any case, there's another
secret to Google's success: honesty.
Of all the major search engines, Google
is the only one that's truly, scrupulously honest. It's the only one that
doesn't rig its search results in some manner to make money.
You may or may not like Google's search
results. You may disagree with its search methods. But with Google, the search
results you see are strictly those that its search methodology yields. By
contrast, at major competitors like Yahoo
and Microsoft's
MSN, the first search results you see are there, at least in part, because
companies paid to place them there.
Google makes money in a traditional way
that users understand. It sells ads. These ads are clearly labeled and easily
distinguished from the real, unbiased search results. They are triggered by
whatever search term a user enters, and they run down the side of the page
and, occasionally, across the top. The ones across the top are shaded in
color, just to make extra sure nobody confuses them with search results.
This separation of advertising and
editorial content is the same one that has been used for a couple of hundred
years in newspapers and magazines. People get the distinction.
Approach Means Surfers Won't Be Able to Tell
Which Sites Made Payments to Be Included
Yahoo
Inc., the nation's second largest search engine, is aggressively expanding a
program that lets advertisers pay to ensure that their sites are included in
search results.
Yahoo executives say the payments won't
improve a site's ranking on the list of results that appear after a search.
But at the same time, Yahoo acknowledged that there will be no distinguishing
marks to alert Web surfers that a company had paid to be included.
Yahoo's new approach is expected to
begin Tuesday. The Sunnyvale, Calif., Internet company has already been using
a similar approach on its shopping-oriented Web pages, but it's now expanding
the program to its entire site.
The move is likely to add fuel to the
growing battle between Yahoo and its main rival, Google Inc., which has
surpassed Yahoo to become the nation's most popular search site.
Google (www.google.com),
of Mountain View, Calif., says it doesn't let advertisers pay to be included
in its traditional search results. Google does allow advertisers to pay for
promotions that appear alongside search results, but these are clearly labeled
as "sponsored links." Google executives say their users favor this
neutral, technology-driven approach. (Yahoo also continues to have a separate
"sponsored" section for advertisers.)
Google co-founder Larry Page said Google separates
and labels advertising, much the way newspapers distinguish between news
stories and advertising. He questioned whether Yahoo would prevent advertisers
from influencing search rankings, as well as results. "It's really tricky
when people start putting things in the search results," he said.
The problem for Yahoo users is that they won't be
able to tell which results are paid for and which aren't. Currently, search
results are divided into two parts: For example, type in "dog
walkers" and hit "return." At the top of the page that then
pops up -- and also in the right-hand column -- are "sponsored"
links, listing dog walkers or related businesses that paid for the premium
position. Below that are what until now have been unsponsored findings listed
under the heading, "Top 20 Web Results."
Under the new system, that second layer of findings
will include both paid and unpaid links. But there is no way to find out if a
specific company that comes up has paid or not. Yahoo will include only a
general disclosure about the new program, on a separate page. (To read it, Web
surfers must click on the phrase "What's this?")
If Web site operators want to be included in the new
program, they must pay an annual subscription fee of $49 to list one Internet
address and $29 each for their next nine addresses. On top of that, companies
must pay Yahoo a fee for each person that clicks on their search listing.
The move comes two weeks after Yahoo dropped search
technology from Google in favor of its own technology. Google is the
top-ranked site that Internet users visit when conducting Web searches. About
35% of all Web searches in the U.S. are conducted on Google's sites, while 28%
of them are done on Yahoo's sites, according to comScore Media Metrix, a unit
of comScore Networks Inc., a market-research firm.
Analysts say Yahoo's move may arouse suspicions among
computer users that the search results, and rankings, are being influenced by
advertisers. It's a "trust issue," said Charlene Li, an analyst at
market-research firm Forrester Research Inc. "Is this really the most
relevant result or not?"
Yahoo says the program helps users by delivering
information that its own or other search technology might miss. "Our goal
is to deliver the highest quality search results," said Tim Cadogan,
Yahoo's vice president of search. "We're going to gain users," he
says, because "we're delivering better results."
Under Yahoo's "content acquisition
program," advertisers pay to have their sites surveyed by Yahoo software
that "crawls" the Web periodically, looking for new or updated Web
pages.
Forrester's Ms. Li said she thinks consumers
ultimately will accept the program, because they will come to understand
Yahoo's policy of including advertisers in searches, but not allowing
advertising to influence search rankings. (Do
you really think this constraint will remain?)
Search Engine Watch 2003 Award Winners, Part 1
ClickZ's sister site, Search Engine Watch, released its annual list of
outstanding Web search services for 2003. Your favorites are among them, but
there were also surprises and controversial predictions for the coming year. http://nl.internet.com/ct.html?rtr=on&s=1,pvi,1,ctxf,667h,3zob,3pvb
They named it after the biggest number they could imagine. But it wasn't
big enough. On the eve of a very public stock offering, here's everything you
ever wanted to know about Google. A Wired Magazine special report --- http://www.wired.com/wired/archive/12.03/google.html
February 25, 2004 reply from Jim
Borden
Bob,
Here is another good article on Google from Fast
Company (April 2003):
Search Engines 101 --- http://www.searchengines.com/
This website provides some broad categories for searching. It also
provides a tutorial on how search engines work and how to improve your
searches.
How do search engines work? Search engines help people find relevant
information on the Internet. Major search engines have huge databases of web
sites that surfers can search by typing in some text. Learn more about search
engines and effective searching here.
Search engines send out spiders or robots, which
follow links from web sites and index all pages they come across. Each
search engine has its own formula for indexing pages; some index the whole
site, while others index only the main page.
Search engines decide the amount of weight that
will be placed on various factors that influence results. Some want link
popularity to be the most important criterion, while others prefer meta
tags. Search engines use a combination of factors to devise their
formulas.
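The weighting idea above can be sketched as a toy ranking function. The factor names and weights below are purely illustrative; no engine publishes its actual formula:

```python
# Toy ranking: each engine chooses its own weights for the factors that
# influence a page's score (factors and weights here are illustrative).
WEIGHTS = {"link_popularity": 0.6, "meta_tag_match": 0.3, "title_match": 0.1}

pages = {
    "site-a.example": {"link_popularity": 0.9, "meta_tag_match": 0.2, "title_match": 1.0},
    "site-b.example": {"link_popularity": 0.4, "meta_tag_match": 0.9, "title_match": 0.0},
}

def score(factors):
    """Weighted sum of a page's factor values under this engine's formula."""
    return sum(WEIGHTS[f] * factors.get(f, 0.0) for f in WEIGHTS)

ranked = sorted(pages, key=lambda url: score(pages[url]), reverse=True)
print(ranked)
```

An engine that favors link popularity ranks site-a first; shift the weights toward meta tags and the order flips, which is why the same query returns different orderings on different engines.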
Directories - a whole different ballgame
Often confused with search engines, directories are completely
different. Unlike search engines, directories use "human
indexing;" people review and index links. Directories
have rigid guidelines that sites must meet before being added to their
index. Therefore, they have a smaller, but cleaner index.
Yahoo!,
LookSmart, MSN,
Go and others are
directories. Factors that influence search engine rankings are irrelevant to
directory rankings. Since people review sites, more attention is placed on
the quality of a site: its functionality, content and design. Directories
strive to categorize sites accurately and often correct categories suggested
by a site's webmaster.
You can learn
more about directories here.
Hybrid search engines: The new generation
Hybrid search engines combine a directory and a search engine to give
their visitors the most relevant and complete results. The Top 10 search
engines/directories today are hybrid. Yahoo!, for example, is a directory,
which uses results from Google (a search engine) for its secondary results.
At the same time, Google uses Open Directory
Project's directory to supplement its own search engine. Other search
engines work the same way. Learn more about search
engine partnerships here.
Your mission, if you choose to accept it
As someone trying to achieve higher rankings, it's your goal to learn more
about influencing
factors and how each engine uses them. After completing your research,
you will have a better understanding of search engines and directories. You
will also have a better understanding of what it takes to achieve a Top 20
ranking on major search engines.
This section offers detailed explanations of
factors used by search engines and directories, as well as tips for their
implementation.
An Overview
Search engines and directories:
Search engines use robots
Directories use people
Top search engines have both robots and people
Different factors are used for search engines than for directories
FindSame is an entirely new kind of search engine that
looks for content, not keywords. You submit an entire document, and FindSame
returns a list of Web pages that contain any fragment of that document longer
than about one line of text. Enter a URL or paste some text in one of the boxes
below, or upload a file. Then click the "search" button and FindSame
will show you where on the Web any piece of the text at that URL appears.
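Matching fragments "longer than about one line" is commonly done with shingles: overlapping k-word windows that can be compared as sets. This is a minimal sketch of that general technique, not FindSame's actual algorithm, and the sample texts are invented:

```python
def shingles(text, k=8):
    """Overlapping k-word windows; any shared shingle flags shared text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}

def shares_fragment(doc, page, k=8):
    """True if doc and page share any run of at least k consecutive words."""
    return bool(shingles(doc, k) & shingles(page, k))

doc = "the quick brown fox jumps over the lazy dog near the river bank"
page = "witnesses saw the quick brown fox jumps over the lazy dog yesterday"
print(shares_fragment(doc, page))  # → True
```

At Web scale the shingles would be hashed and stored in an inverted index, so a submitted document can be checked against billions of pages without comparing raw text.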
Search engine for education sites --- http://www.searchedu.com/
My gosh, there were 421 hits for "Bob Jensen," 52 hits for
"FAS 133," and 109 hits for "SFAS 133"! I am truly
impressed.
Over 20 million university and education pages
indexed and ranked in order of popularity.
Search for finance and investor news.
TheLion.com http://www.thelion.com/
This is a search engine focused on financial and investment news.
Hoover's, Inc. has acquired Powerize, Inc. and is
pleased to welcome you to the benefits of Hoover's Online. Your favorite
Powerize search features are now available to you in this Archived News
section. Questions? View
our FAQ.
Research Links. Language Translators, etc. --- http://sls-partnership.com/Research_Links.htm
(Links to research, dictionary, thesaurus, dictionaries, thesauri, references,
glossaries, online, language translators, researchers, technical, financial,
medical, engineering, multi-lingual, bilingual --- Japanese and English)
My favorite search engine today is Google.com.
Google consistently produces results that are on target with the
context of my Internet searches, which puts the tool head and shoulders above the other
search engines. Here's what Iconocast recently had to say about
Google:
"With the entrance of Google, which greatly
improves search effectiveness and which was recently anointed by Yahoo! as the
Net's definitive search technology, the search-engine game once again looks
promising [we particularly like the "I'm feeling lucky" button].
Google claims to have cataloged 1.06 billion Web pages, no mean feat."
Source: 2000 ICONOCAST http://www.iconocast.com
IFACnet, the global, multilingual search engine developed by the International
Federation of Accountants (IFAC), has expanded its resources to address the
needs of small and medium accounting practices (SMPs), in addition to
professional accountants in business. IFACnet enables SMPs to easily locate
information on a wide range of technical, marketing, human resource and other
matters, including such topics as succession planning, managing a small firm,
staff recruitment and retention, and promoting firm services.
IFACnet has also added
three new features to help accountants worldwide stay current on technical,
professional and marketplace issues and to make the search engine more user
friendly. These include a "Latest News" page with links to a variety of
business, management and accounting media and other websites; a search box that
enables users to search IFACnet directly from their Internet browser; and a
"What's News" section to inform visitors of new IFACnet features and content.
"There are many high quality resources
available from within IFAC as well as through collaboration with our members
that can help the global accountancy community carry out their professional
responsibilities," states Ian Ball, IFAC Chief Executive Officer. "IFACnet's
customised search features provide an efficient means to give professional
accountants, including SMPs and professional accountants in business, in
every part of the world, access to these timely and relevant resources."
Launched in October 2006, IFACnet provides one-stop access to free, high
quality guidance, management tools and articles developed by professional
accountancy bodies from around the world. Since its launch, IFACnet has
attracted nearly 42,000 individuals from more than 190 countries worldwide.
Currently, IFAC and twenty-three of its members (see attachment) provide
IFACnet with access to information from their websites. In the coming
months, new content will continue to be added to IFACnet as it expands the
number of participating organisations.
IFACnet can be accessed free-of-charge at
http://www.ifacnet.com and on the websites of
participating organisations.
IFAC is the worldwide organisation for the accountancy profession dedicated
to serving the public interest by strengthening the profession and
contributing to the development of strong international economies. IFAC is
comprised of 155 members and associates in 118 countries, representing more
than 2.5 million accountants in public practice, education, government
service, industry and commerce. Through its independent standard-setting
boards, IFAC sets ethics, auditing and assurance, education, and public
sector accounting standards. It also issues guidance to encourage high
quality performance by professional accountants in business.
Several add-ins are available, and are a necessity
to be able to search the files most of us work with.
I've tried it on a workstation, and unlike the
Google product it will index and search large files - I was able to find a
phrase in page 388 of a 37.6 MB PDF file with it. There is even some control
over which folders are included in the search indexes.
Its chief recommendation, however, may simply be that it is free.
As you might expect, it steers you toward using more Microsoft
products, although you can turn some of those features off.
The X1 search tool has it beat in being useful,
though. The default view when searching lets you specify several
characteristics simultaneously including filename, type, date/time, path and
size. At the same time you can search for words or other information within
the files that are indexed. You can set limits on what folders are indexed,
and the size of the files that are indexed as well.
If your files are organized into folders, no matter
what criteria you use, you can narrow the search to folders at any level in
the directory tree. When searching for common words that helps immensely in
preventing an overwhelming list of results.
The new MSN toolbar's Windows Desktop Search feature is better than
Google's Desktop Search toolbar
Windows won't have integrated desktop search until the
fall of 2006, and IE won't have built-in tabbed browsing until this summer. But
Microsoft has just released a free product that adds both features to Windows
computers. These add-on versions of desktop search and tabbed browsing aren't as
good as their built-in counterparts, but they get the basic job done.
Microsoft's new, free utility goes by the ridiculously long name of MSN Search
Toolbar With Windows Desktop Search, and it can be downloaded at http://toolbar.msn.com/. When you download the toolbar, it adds a
new row of icons and drop-down menus to the IE browser. Many of these are aimed
at driving users to other MSN products, like its Hotmail email service. But you
can also use the toolbar to turn on tabbed browsing and to perform desktop
searches . . . The MSN toolbar's Windows Desktop Search feature is better. It
beats the most popular add-in desktop search product for Windows, Google Desktop
Search, but it's slower and more cumbersome than the integrated search in
Apple's new operating system.
Walter Mossberg, "Free Microsoft Stopgap Offers Tabbed Browsing And Desktop
Searching," The Wall Street Journal, June 16, 2005 ---
http://ptech.wsj.com/ptech.html
Search Inside a Given Computer (Google's Web Desktop Search)
The glitch, which could permit an attacker to
secretly search the contents of a personal computer via the Internet, is what
computer scientists call a composition flaw - a security weakness that emerges
when separate components interact. "When you put them together, out jumps a
security flaw," said Dan Wallach, an assistant professor of computer
science at Rice in Houston, who, with two graduate students, Seth Fogarty and
Seth Nielson, discovered the flaw last month. "These are subtle problems,
and it takes a lot of experience to ferret out this kind of flaw,"
Professor Wallach said.
John Markoff, "Rice University Computer Scientists Find a Flaw in Google's
New Desktop Search Program," The New York Times, December 20, 2004
--- http://www.nytimes.com/2004/12/20/technology/20flaw.html
The glitch applies only to the Desktop Search tool for internal documents. It
does not apply to other Google search tools.
Google's newly released desktop search application
creates profound security and privacy risks for any companies with public
access PCs, security experts have warned.
"In a shared environment people can use this
powerful Google search tool to deeply mine data from public access
terminals," John McIntosh, managing consultant with IT and security
consultancy Heulyn, told vnunet.com.
"Firms need to be aware of ways in which this
type of software is used and what impact it may have. Credit card details can
be easily unearthed, together with other personal data.
"This can easily lead to identity theft and this
is clearly a fast-growing problem. There is no skill needed to do it, and it
makes it very easy to gain access to potentially sensitive data."
Unveiled last week in a beta test version, the free Google
Desktop search application is designed to enable users to search local
email, files, web history and chat details.
In spite of all the concerns, I think I am going to download the beta
version.
Google Desktop Search is how our brains would work if
we had photographic memories. It's a desktop search application that provides
full text search over your email, computer files, chats, and the web pages
you've viewed. By making your computer searchable, Google Desktop Search puts
your information easily within your reach and frees you from having to
manually organize your files, emails, and bookmarks.
After downloading Google Desktop Search, you can
search your personal items as easily as you search the Internet using Google.
Unlike traditional computer search software that updates once a day, Google
Desktop Search updates continually for most file types, so that when you
receive a new email in Outlook, for example, you can search for it within
seconds. The index of searchable information created by Desktop Search is
stored on your own computer.
In addition to basic search, Google Desktop Search
introduces new ways to access relevant and timely information. When you view a
web page in Internet Explorer, Google Desktop Search "caches" or
stores its content so that you can later look at that same version of the
page, even if its live content has changed or you're offline. Google Desktop
Search organizes email search results into conversations, so that all email
messages in the same thread are grouped into a single search result.
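The conversation grouping described above can be sketched in a few lines. This is an illustrative approximation, not Google's actual method: here thread identity is inferred simply by stripping "Re:" prefixes from subject lines, whereas a real mail client would also use message-ID headers.

```python
from collections import defaultdict

def group_by_thread(messages):
    """Group email search hits into conversations.

    Thread identity is approximated by the subject line with any
    'Re:' prefixes stripped (an illustrative assumption only).
    """
    threads = defaultdict(list)
    for msg in messages:
        subject = msg["subject"]
        # Peel off repeated reply prefixes: "Re: Re: Budget" -> "Budget"
        while subject.lower().startswith("re:"):
            subject = subject[3:].strip()
        threads[subject.lower()].append(msg)
    return threads
```

Each key in the returned mapping then becomes a single search result, with all messages in that thread collapsed underneath it.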
We're currently working to fine tune our algorithms
and to add more capabilities to Google Desktop Search, including the ability
to search for more types of information on your computer. Your opinions and
feedback can help us with this process. What types of files or other
information would you like to be able to search? What new features would be
helpful? Please contact us and let us know.
Frequently Asked Questions
What does Google Desktop Search do? Why is this
useful? Where do I go to do a Desktop Search? What are the system requirements
for running Google Desktop Search? How long will the download take? How easy
is it to install? After installing, how soon can I search my computer? Will
Google Desktop Search affect my computer's performance? What about my privacy?
Does Google Desktop Search share my content with anyone? How come Google
Desktop Search doesn't search all my communications and files? Is Google
Desktop Search available in languages other than English? How do I uninstall?
Yahoo's Desktop Search (for searching text in files within a single
computer) information is at http://desktop.yahoo.com/
Toolbar now lets users navigate the Web without using
URLs.
Scarlet Pruitt, IDG News Service Thursday, July 15,
2004 In its constant quest to court Web surfers, Google added a new feature to
its toolbar this week that allows users to navigate the Web by typing in a
name instead of a URL.
With the new Browse By Name
feature, users of Google's Toolbar can, for example, type "Grand
Canyon" into their Internet Explorer browser window and land on the Grand
Canyon homepage without having to type the somewhat cumbersome www.nps.gov/grca/
URL for the national park.
If users type in a name that isn't
specific or well recognized, the toolbar automatically performs a Google
search on the subject, giving users a choice of destinations,
the company says.
Typing "bicycles" in
the browser window, for instance, brought up a litany of bicycle-related
search results. Google says that the tool is aimed at helping users save time
when browsing the Web.
Using the browser window as a
convenient search bar may not always be the best approach when searching for
general terms, however, because the most specific or obvious destinations tend
to appear first. Typing "apple" takes users directly to Apple
Computer's homepage, for instance, and does not bring up results on the fruit.
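The routing decision described above can be sketched as a small function. This is a hypothetical reconstruction of the idea, not Google's actual algorithm: the assumption here is that a name counts as "well recognized" when the top search hit clearly outscores the runner-up.

```python
def browse_by_name(query, top_hits):
    """Decide whether a typed name should navigate directly or fall
    back to a results page.

    top_hits: ranked list of (url, score) pairs from a search backend.
    The 2x dominance threshold is chosen purely for illustration.
    """
    if not top_hits:
        return ("results_page", [])
    best_url, best_score = top_hits[0]
    runner_up = top_hits[1][1] if len(top_hits) > 1 else 0.0
    # A dominant top hit means the name maps confidently to one site.
    if best_score >= 2 * runner_up:
        return ("navigate", best_url)
    return ("results_page", [url for url, _ in top_hits])
```

With this logic, "Grand Canyon" (one dominant hit) navigates straight to the park site, while an ambiguous term like "bicycles" falls through to an ordinary results page.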
Automatic Updates
Google's
Toolbar automatically updates to include new features without users having
to install new versions, though as of Thursday, not all users had received the
update.
A spokesperson for the company in
London says that it will take a few days for the update to be delivered to
all toolbar users, and recommends that users hungry for the new feature
uninstall their toolbar and reinstall the updated version from Google's site.
The new Browse By Name feature is
available in 12 languages, including French, Russian, German, Italian,
Chinese, and Japanese, Google says.
The toolbar update is just the
latest tool introduced by the Mountain View, California, search company, which
has been rolling out new products and services at a clipped rate over the last
year, as it prepares for a much
anticipated IPO.
Audio, Video, Movie, and Television Show Dialog Search Services from Google and Yahoo
LocateTV will search over 3 million TV listings across all channels in your
area
Type in the name of a TV show, movie, or actor
Locate TV will find channels and times in your locale http://www.locatetv.com/
As Cuil has demonstrated very well, it doesn't help you to look through the
entire haystack if it gets dumped on your head and all you can see is a bunch
of hay out there ---
http://www.cuil.com/info/
Boasting big plans, startup search engine Cuil
(pronounced "cool") launched on Monday. The company sold itself on having
indexed more pages than Google, ranking based on context rather than on
popularity, and displaying results organized by concept within a beautiful
user interface. There was just one problem: when the search engine launched,
it didn't work very well.
Cuil's site was down intermittently throughout the
day on Monday, and even when the site was up, it sometimes returned no
results for common queries, or failed to produce the most relevant or
up-to-date results. For example, as of Wednesday morning, searching Cuil for
its own name returns nothing on the first results page that is related to
the engine itself, in spite of the buckets of press it got this week.
"I've seen these sorts of things for all sorts of
startups that get launched," says search-engine expert Danny Sullivan, who
runs Search Engine Land. "You have issues with how it's displaying results;
you have spam showing; you have a lot of duplicate results." But Cuil wasn't
supposed to suffer from the common problems that all sorts of startups
encounter. Its founders have impressive credentials: Anna Patterson and
Russell Power both had major roles in building Google's large search index,
and Tom Costello researched search architecture and relevance methods for
Stanford University and IBM. On top of the company's talent, Cuil raised a
reported $33 million in venture capital. "In many ways, Cuil was the
exception," Sullivan says. "They were one of the few people or companies out
there where you would say, 'Well, all right, I'd be dubious about anyone
else, but if anyone's going to have a chance, you should have a chance.' But
they didn't deliver, and I think that makes it even harder now for startups
to come along."
One of Cuil's main selling points is the size of
its index. Claiming to have indexed 120 billion Web pages, which it states
is three times more than any other search engine, the company says, "Size
matters because many people use the Internet to find information that is of
interest to them, even if it's not popular." But Sullivan notes that
relevance may be the most important quality of search. "When you come into
the idea of size, that starts getting into the question of obscure search,"
he says. "The needle-in-the-haystack search sounds so very compelling--the
idea that if you don't have a lot of pages, you can't search through the
entire haystack. But, as Cuil has demonstrated very well, it doesn't help
you to look through the entire haystack if it gets dumped on your head, and
all you can see is a bunch of hay out there."
Investor
Azeem Azhar,
who incubated the startup search engine
True Knowledge,
notes that while it's useful to have a large base of knowledge, sometimes
the sample that's selected matters more. "There are certain things that
people expect to have, and there are certain facts that are more useful than
others," he says. True Knowledge, which aims at the subset of searchers who
are looking for answers to direct questions, is currently working on
building up a database of relevant facts that can be used to answer
questions such as, "Who was president when Barack Obama was a teenager?" The
company hopes that by focusing on facts of broad interest, such as those
relating to famous people and places, it will be useful to people even as it
solicits responses from them by way of rounding out its database. When a user
asks a question that the system can't answer, it returns, "If there are any
answers, I couldn't find any"; invites the user to add to the database; and
points to traditional search results.
Continued in article
Jensen Comment
I'm still upset that Cuil adds its own pictures to hits that have nothing
whatsoever to do with the author or the documents. Jagdish is probably correct
in saying that Cuil scans part of the document and tries to link a photo from
its own archives that might possibly relate to content of the document. In this
respect Cuil is doing a poor job picking relevant photographs. If I were a
fundamentalist Christian or Muslim, I'd really be upset when Cuil added a
bikini-clad porn star or an aardvark to my serious document about my religion.
As for me I have a sense of humor, but I still contend that adding such useless
pictures is a waste of bandwidth.
The theory is probably that, relative to text, a picture is worth a thousand
words. But the wrong picture on a search hit relates to the wrong thousand
words. And when it comes to searching, trying to search through a million
photographs is certainly not as efficient as trying to search through a billion
words for needles called "key words" or "search phrases." One can't search
through a million pictures for such a thing as "FAS 133." It's pretty difficult
to even sort a million faces for those with big noses. In Internet Explorer when
I have a search page outcome listing 20 hits, I can quickly search the text on
the page by hitting Edit, Find and typing in a search word. I cannot search the
attached pictures for FAS 133. I suppose I could try to scan by eyesight for big
noses. But what would this have to do with my search for FAS 133?
The only real answer to searching for needles in haystacks is indexing in a
way that certain words in different terminologies (e.g., "cash" versus "money"
versus "currency") or certain pictures (e.g., pictures with mountains) are given
useful index magnets. More importantly, a good index system allows you to search
for derivative financial instruments without getting millions of unwanted hits
about mathematics derivatives or chemical derivatives.
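The "index magnet" idea above can be sketched as a synonym-aware inverted index. This is a minimal illustration of the concept, with a made-up synonym table: any term in a group ("cash," "money," "currency") indexes a document under one canonical key, so a search for any of them finds all of them.

```python
from collections import defaultdict

# Hypothetical synonym groups acting as "index magnets": every member
# term is mapped to one canonical key before indexing or searching.
SYNONYMS = {"cash": "money", "currency": "money"}

def canonical(term):
    term = term.lower().strip('.,"')
    return SYNONYMS.get(term, term)

def build_index(docs):
    """Build an inverted index keyed on canonical terms.

    docs: mapping of doc_id -> text.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[canonical(term)].add(doc_id)
    return index

def search(index, term):
    return sorted(index.get(canonical(term), set()))
```

Disambiguating "derivative financial instruments" from mathematical derivatives would need the reverse trick: splitting one surface word into several sense-specific index keys based on surrounding context.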
TR: Which
research has the most people and
funding?
PN: The two biggest
projects are machine translation and
the speech project. Translation and
speech went all the way from one or
two people working on them to, now,
live systems.
TR:
Like the Google Labs project called
GOOG-411
[a free service that lets people
search for local businesses by
voice, over the phone]. Tell me more
about it.
PN: I think it's
the only major [phone-based
business-search] service of its kind
that has no human fallback. It's 100
percent automated, and there seems
to be a good response to it. In
general, it looks like things are
moving more toward the mobile
market, and we thought it was
important to deal with the market
where you might not have access to a
keyboard or might not want to type
in search queries.
TR:
And speech recognition can also be
important for video search, isn't
it?
Blinkx and
Everyzing
are two examples of startups that
are using the technology to search
inside video. Is Google working on
something similar?
PN: Right now,
people aren't searching for video
much. If they are, they have a very
specific thing in mind like "Coke"
and "Mentos." People don't search
for things like "Show me the speech
where so-and-so talks about this
aspect of Middle East history." But
all of that information is there,
and with speech recognition, we can
access it.
We wanted speech technology that
could serve as an interface for
phones and also index audio text.
After looking at the existing
technology, we decided to build our
own. We thought that, having the
data and computational resources
that we do, we could help advance
the field. Currently, we are up to
state-of-the-art with what we built
on our own, and we have the
computational infrastructure to
improve further. As we get more data
from more interaction with users and
from uploaded videos, our systems
will improve because the data trains
the algorithms over time.
"Video Searching by Sight and Script: Researchers have designed
an automated system to identify characters in television shows, paving the way
for better video search," by Brendan Borrell, MIT's Technology Review,
October 11, 2006 ---
http://www.technologyreview.com/read_article.aspx?id=17604&ch=infotech
Google's acquisition this week of YouTube.com has
raised hopes that searching for video is going to improve. More than 65,000
videos are uploaded to YouTube each day, according to the website. With all
that content, finding the right clip can be difficult.
Now researchers have developed a system that uses a
combination of face recognition, closed-captioning information, and original
television scripts to automatically name the faces that appear on screen,
making episodes of the TV show Buffy the Vampire Slayer searchable.
"We basically see this work as one of the first
steps in getting automated descriptions of what's happening in a video,"
says Mark Everingham, a computer scientist now at the University of Leeds
(formerly of the University of Oxford), who presented his research at the
British Machine Vision Conference in September.
Currently, video searches offered by AOL Video,
Google, and YouTube do not search the content of a video itself, but instead
rely primarily on "metadata," or text descriptions, written by users to
develop a searchable index of Web-based media content.
Users frequently (and illegally) upload bits and
pieces of their favorite sitcoms to video-sharing sites such as YouTube. For
instance, a recent search for "Buffy the Vampire Slayer" turned up nearly
2,000 clips on YouTube, many of them viewed thousands of times. Most of
these clips are less than five minutes and the descriptions are vague. One
titled "A new day has come," for instance, is described by a user thusly:
"It mostly contains Buffy and Spike. It shows how Spike was there for Buffy
until he died and she felt alone afterward."
Everingham says previous work in video search has
used data from subtitles to find videos, but he's not aware of anyone using
his method, which combines--in a technical tour de force--subtitles and
script annotation. The script tells you "what is said and who said it" and
subtitles tell you "what time something is said," he explains. Everingham's
software combines those two sources of information with powerful tools
previously developed to track faces and identify speakers without the need
for user input.
What made the Buffy project such a challenge,
Everingham says, is that in film and television, the person speaking is not
always in the shot. The star, Buffy, may be speaking off-screen or facing
away from the camera, for instance, and the camera will be showing you the
listener's reactions. Other times, there may be multiple actors on the
screen or the actor's face is not directly facing the camera. All of these
ambiguities are easy for humans to interpret, but difficult for
computers--at least until now. Everingham says their multimodal system is
accurate up to 80 percent of the time.
A single episode of
Buffy can have up to 20,000 instances of detected faces, but most of
these instances arise from multiple frames of a single character in any
given shot. The software tracks key "landmarks" on actors' faces--nostrils,
pupils, and eyes, for instance--and if one of them overlaps with the next
frame, the two faces are considered part of a single track. If these
landmarks are unclear, though, the software uses a description of clothing
to unite two "broken" face tracks. Finally, the software also watches
actors' lips to identify who's speaking or if the speaker is off screen.
Ultimately, the system produces a detailed, play-by-play annotation of the
video.
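The track-linking step described above can be sketched as follows. This is a hypothetical simplification of the idea, not the researchers' actual system: detections in consecutive frames whose bounding boxes overlap enough (measured by intersection-over-union) are chained into one track. The clothing-descriptor repair step for broken tracks is omitted here.

```python
def link_face_tracks(detections, iou_threshold=0.5):
    """Chain per-frame face detections into tracks.

    detections: list of dicts with 'frame' (int) and 'box'
    (x1, y1, x2, y2), ordered by frame number.
    """
    def iou(a, b):
        # Intersection-over-union of two axis-aligned boxes.
        x1, y1 = max(a[0], b[0]), max(a[1], b[1])
        x2, y2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, x2 - x1) * max(0, y2 - y1)
        area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
        union = area(a) + area(b) - inter
        return inter / union if union else 0.0

    tracks = []
    for det in detections:
        for track in tracks:
            last = track[-1]
            # Same track if the next frame's box overlaps the last one.
            if det["frame"] == last["frame"] + 1 and \
                    iou(det["box"], last["box"]) >= iou_threshold:
                track.append(det)
                break
        else:
            tracks.append([det])  # start a new track
    return tracks
```

A real system would add the clothing check: when two short tracks are separated by missed detections, similar clothing descriptors on either side justify merging them.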
"The general idea is
that you want to get more information without having people capture it,"
says Alex Berg at the
Computer Vision Group at the University of California,
Berkeley. "If you want to find a particular scene with a character, you have
to first find the scenes that contain that character." He says that
Everingham's research will pave the way for more complex searches of
television programming.
Computer scientist Josef
Sivic at Oxford's
Visual Geometry
Group, who contributed to the Buffy
project, says that in the future it will be possible to search for
high-level concepts like "Buffy and Spike walking toward the camera
hand-in-hand" or all outdoor scenes that contain Buffy.
Timothy Tuttle, vice
president of AOL Video, says, "It seems like over the next five to ten
years, more and more people will choose what to watch on their own schedule
and they will view content on demand." He also notes that the barrier to
adapting technologies like Everingham's may no longer be technical, but
legal.
These legal barriers
have been coming down with print media because companies have reaped the
financial benefits of searchable content--Google's Book Scan and Amazon's
search programs have been shown to
boost book sales over the last two years.
Video on the Web is all the
rage now, the subject of an endless stream of articles and speculation that
it's the next big thing. And there's some evidence to back that up. Apple
Computer Inc. sold 12 million video clips at $1.99 each from its popular
iTunes Music Store in just a few months. Google has made a splash with a
similar video download store. According to AccuStream iMedia Research, about
18 billion video streams were online in 2005 and that number is expected to
grow by more than 30% in 2006.
But how do you find the
video clips you'd like to see, or download? Normal search engines like
Google's can sometimes point you to video clips, but they aren't optimized
for that task.
So, this week, we dived into
the world of online videos, looking for the best ways to find clips. We were
impressed by how much material is out there -- much of it free. We used
about 10 different video searching/hosting sites to find videos related to
TV shows, including "Grey's Anatomy," Hollywood actors, like Matthew
McConaughey, and musicians, like Brad Paisley. We also searched for news
videos, ads and amateur videos. We even looked for a famous "Saturday Night
Live" mock music video, and its imitators.
Our
results: AOL Video Search, Yahoo Video Search, and Blinkx TV earned our
appreciation because each searches the entire Internet for material, and
does a decent job.
Google Video and iTunes also perform video searches, but they search only
among the material they host on their own servers, and which they offer for
sale, or for free downloading. They don't search across the entire Web.
Sites like YouTube.com and GoFish.com have sprung up as central download
sites for all sorts of video clips, some by amateurs and some by pros. But
they, too, search only the material they offer themselves.
The
technology for searching the actual spoken words in a video exists, but is
in its infancy. So, most video searches are done by looking for words in a
video's title text, or in descriptions or other information embedded in a
video file in the form of "metadata" or "tags" -- kind of like the embedded
title, artist and album information in a music file. Some TV shows stored on
the Web also contain closed captioning data, and that can be searched in
some cases.
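The metadata-driven search described above can be sketched simply. This is an illustrative toy, not any vendor's implementation: query words are matched against a clip's title, description, tags, and any closed-caption text, since the video frames themselves are never examined.

```python
def search_videos(videos, query):
    """Rank video clips by how many query words hit their metadata.

    videos: list of dicts with optional 'title', 'description',
    'tags' (list), and 'captions' fields.
    """
    words = set(query.lower().split())
    hits = []
    for v in videos:
        # Concatenate every searchable text field into one haystack.
        haystack = " ".join([
            v.get("title", ""),
            v.get("description", ""),
            " ".join(v.get("tags", [])),
            v.get("captions", ""),
        ]).lower()
        score = sum(1 for w in words if w in haystack)
        if score:
            hits.append((score, v["title"]))
    return [title for _, title in sorted(hits, reverse=True)]
```

The weakness the article points out falls straight out of this sketch: a vaguely described clip ("It mostly contains Buffy and Spike") simply has a poor haystack, so no query will rank it well.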
AOL Video Search (www.aol.com/video)
uses the search engines of two smaller, yet powerful,
companies that it owns:
Truveo.com and
Singingfish.com. As you use AOL Video Search, your
past search topics are saved in a left-hand column and videos can be saved
into a special AOL playlist. An adult content filter is used on AOL's
server, meaning users can't turn the filter on or off.
Using
AOL, we found and watched the "Saturday Night Live" mock music video called
"Lazy Sunday," set in New York, and its West Coast response, "Lazy Monday,"
set in Los Angeles.
Yahoo Video Search (http://video.search.yahoo.com)
can display results in a visually attractive grid of
images from each video clip. Unlike AOL, which displays advertisements on
its search start and results page, Yahoo doesn't show ads on either page --
though ads will display if they're linked to videos from outside sources. A
SafeSearch filter can be used for blocking adult material as you search
videos.
Using
Yahoo's video search, we turned up clips of a forgettable 1998 appearance
Walt made in an East Coast vs. West Coast computer trivia contest held in
Boston. Not only was his East Coast team crushed, but they wore puffy
colonial shirts while being crushed.
Blinkx TV (www.Blinkx.tv)
uses a simple interface and makes searching easy -- an
empty box placed on the left of the screen with a collage of 100 tiny clip
images playing on the right. After results are returned, you can adjust a
horizontal slider between "date" or "relevance," depending on your
preference. Our results weren't always as accurate with Blinkx as they were
with other video-search sites -- one search returned spreadsheets rather
than videos -- but we liked how the results page played animated clips of
each video in the same window. Blinkx offers a prominent filtering button to
hide adult results.
Google Video (http://video.google.com),
which is still in its beta (or prerelease) version,
also offers video searching through free videos -- but allows you to search
only through material that Google hosts, or streams from its servers. This
site eliminates ads -- including Google's word-only ads -- entirely, which
is refreshing.
Google and Yahoo are introducing services that will
let users search through television programs based on words spoken on the air.
The services will look for keywords in the closed captioning information that
is encoded in many programs, mainly as an aid to deaf viewers.
Google's service, scheduled to be introduced January
25, does not actually permit people to watch the video on their computers.
Instead, it presents them with short excerpts of program transcripts with text
matching their search queries and a single image from the program. Google
records TV programs for use in the service.
Google's vice president for product management,
Jonathan Rosenberg, said offering still images was somewhat limited but was a
first step toward a broader service.
"The long-term business model is complicated and
will evolve over time," Mr. Rosenberg said. Eventually, Google may offer
video programming on its site or direct people to video on other Web sites.
But for now, the issues relating to the rights and business interests of
program owners are very complex, he said.
A Google spokesman, Nate Tyler, said the service
would include "most of the major networks," including ABC, PBS, Fox
News and C-Span. Mr. Rosenberg said Google did not think it needed the
permission of network and program owners to include them in the index but
would remove any program or network if the owner requests it. He declined to
discuss any business arrangements between the program owners and Google.
Brian Lamb, the chief executive of C-Span, said he
met with representatives of Google and approved of their service but no money
changed hands between the two organizations.
Yahoo introduced a test version of a different sort
of video search last year, available from a section of its site, that lets
users comb through video clips from various Web sites.
Today, Yahoo will move the video search to its home
page. In the next few weeks, it will introduce the ability to search the
closed-captioning text for programs from some networks, including Bloomberg
and the BBC. Unlike the Google service, Yahoo's offering will let users watch
60-second video clips.
David Ives, the chief executive of TV Eyes, which is
providing that part of Yahoo's service, said some broadcasters were paying to
have their programs included in the search. In other cases, he said, the
broadcaster and TV Eyes will split revenue from advertisements placed next to
the video clips.
Yahoo Inc. plans to release today (February 3)
a service designed to make it easier to conduct Web searches, its latest sally
in the heated battle with Google Inc. and Microsoft Corp. to make search
results more relevant for individual users.
The service, dubbed "Y!Q," uses keywords
automatically extracted from Web pages to conduct Web searches and also to
find related content on Yahoo's own Web sites. The search companies have long
complained about the difficulty of delivering exactly the search results users
want since the average search query a user enters is just a few words long.
Yahoo's new service, which it plans to release on its test-services site (
http://www.next.yahoo.com/
), partly addresses that problem by creating a list of search keywords itself
based on the text a user is looking at.
. . .
If an individual is reading a news article on Yahoo's
site about plans for changing Social Security, for example, clicking on a button
marked "Search Related Info" generates links to several Web sites
discussing the same topic. In that case, the service extracts a string of
keywords including "President Bush" and "Social Security"
from the original article and uses them as the basis for the new search. The
service works on sites other than Yahoo's own and allows users to add or exclude
search terms from those generated automatically.
StumbleUpon is an intelligent browsing tool for
sharing and discovering great websites. As you click
Stumble!,
you'll get high-quality pages matched to your personal
preferences. These pages have been explicitly recommended (rated
I like it) by friends and other SU members with
similar interests. Rating these sites shares them with your friends and
peers – you will automatically 'stumble upon' each other's favorite sites.
In effect,
StumbleUpon's members collectively share the best
sites on the web. You can share any site by simply
clicking I like it. This passes the page on to
friends and like-minded people – letting them "stumble upon" all the great
sites you discover.
Selecting Your Interests
After you join you will be asked to select topics which are of interest to
you. Nearly 500 topics are available and you can select as many as you wish
to help determine your preferences in web content. The more interests you
select, the better StumbleUpon will be able to determine which sites you
will like best. This lets StumbleUpon provide you with sites rated highly by
other members with similar interests. You can also add, remove or modify
your interests at any time.
Jensen Comment: I found this site a little confusing to use, but I
think I got the hang of it. Now I find it quite useful for finding good
sites. Many of the hits are commercial sites. It does clutter your
browser window with yet another toolbar, although if you click on the View
option in your browser you can choose to hide this and other browser toolbars.
KartOO Searching ---
http://www.kartoo.com/
KartOO is a metasearch engine with visual display interfaces. When you
click on OK, KartOO launches the query to a set of search engines, gathers
the results, compiles them, and represents them in a series of interactive
maps through a proprietary algorithm.
Jensen Comment: As the name StumbleUpon suggests in the module above,
StumbleUpon more or less randomly brings up "good" sites under a given topic
area. Another search engine called KartOO brings up "good" sites a little
less randomly due to the ability to fine-tune with subtopics.
For example, enter "Accounting" and note the many subtopics. This is a
very good search site when you want to drill down to details on a topic.
Try it again with "Accounting Education." However, I find StumbleUpon a
bit more imaginative in terms of interesting and varying sites.
Speegle: Listen to Your Search Outcomes
Human eyes can scan a page in a fraction of the time it takes to hear the
page read aloud. I can't for the life of me see much advantage to having a
search page read aloud except for blind people or for other people who are
focusing on other things such as driving a car. You can choose a male or
female voice without a heavy Scottish accent. See http://www.speegle.co.uk/
It's fun to try this out. I did so using the search term Enron and
found some interesting outcomes that I had not found on other search
engines. Hence I might use Speegle more as a visual search
engine.
A Scottish firm is looking to attract web surfers with
a search engine that reads out results. Called Speegle, it has the look and feel
of a normal search engine, with the added feature of being able to read out the
results.
Scottish speech technology firm CEC Systems launched
the site in November.
But experts have questioned whether talking search
engines are of any real benefit to people with visual impairments.
'A bit robotic'
The Edinburgh-based firm CEC has married speech
technology with ever-popular internet search.
The ability to search is becoming increasingly
crucial to surfers baffled by the huge amount of information available on the
web.
Continued in the article
Cell Phone Search Engines
From The Washington Post on October 13, 2005
Three search engines now allow cell phone users
to text-message queries from their cell phones. Which of the following is not
one of the three?
Answer: Only AltaVista does not allow cell phone queries.
People who visit www.intelius.com
can enter a person's name to get a cell phone number, or do the reverse by
entering a number to get the subscriber's name. Each search costs $15. They can
also download a raft of personal information about the subscriber. This was a
feature on ABC evening news, August 14, 2007.
"Free Cell Phone Number Search - How To Find Free Cell Phone Numbers," ---
Click Here
The freebies are not really very worthwhile relative to the fee-based services.
Jensen Comment
This will be terribly frustrating if telemarketers and crank callers begin to
use up your allotted free minutes of cell phone time each month.
You may enter your cell phone numbers into the "Do Not Call" registry the
same as you probably did for your landline phone ---
https://www.donotcall.gov/default.aspx
However, telemarketers are not supposed to call cell phones with automatic
dialers ---
https://www.donotcall.gov/default.aspx
This is no protection, however, from crank callers or telemarketers who take the
trouble to dial in your cell phone number. Of course, being in the "Do Not Call"
registry does not protect you from telemarketing charitable organizations that
are typically the biggest nuisance these days. Also, the "Do Not Call" registry
provides no guarantee that you will not get calls from commercial telemarketers,
especially those who fly by night.
It might just pay to get the cell phone numbers of your state Senators and
local Congressional representative and call them late at night at home on their
supposedly "personal" cell phones. Better yet, call their children and ask them
to tell their parents how you got their phone numbers.
Note that if you've never given a cell phone number out to any organization
other than your phone company, Intelius may not have your cell phone number in
its dastardly database. You should make your children aware of this. Even
emergency calls to 911 may result in Intelius getting your cell phone number
according to the fine print in my Verizon Wireless contract.
To my knowledge there's no unlisted-number service for cell phones like the
one that you can pay for monthly on your landline number.
Using Google to "define" versus define: words
Question:
How can you troll the Web for the definition of a word?
Answer:
Go to Google --- http://www.google.com/advanced_search?hl=en
In either the "Type all the words" box or the "With the exact
phrase" box, type the word "define" with the quotation marks,
then a space, and the word or phrase you want defined. At the top of all
the search hits, you will get the definition you were seeking plus a link to
additional definitions.
For example, type "define" love
Interestingly, Google suggests typing "define" carcooning
However, Google cannot seem to find a definition of that word (which
appears to mean customizing one's car for travel comfort).
Note
that you get a different result in Google when you use “define” with
quotation marks versus define: with a colon.
It does not matter whether you are in Google’s main page or in Google’s
Advanced Page.
Baffled by
bling-bling? Perplexed by prairie-dogging? Confused by carcooning? Google can
help.
The search engine
powerhouse has introduced a glossary feature to troll the Web for definitions.
The Mountain View, Calif., company says it's particularly well-suited for slang
and newer terms such as "search engine," that are likely to appear
online before they do in print.
The technology was
developed by Google Labs, a unit dedicated to new technology, and has been in
testing for 18 months. International versions will be introduced in coming
months.
"(A search
command) emerges from testing when we feel it's ready for prime time," a
Google spokesman told internetnews.com. "Certainly, the quality
and reliability have to be there."
Users type the word
"define," then a space, and the word or phrase they want defined
into the Google.com search pane. If Google has seen a definition on the Web,
it retrieves and displays it on a results page. The commands "what
is" and "definition" also work.
Results are
highlighted as "Web Definition" followed by the text of the
Web-generated definition. If Google finds several entries, users are presented
with a link to a complete list.
Google still has a
deal with dictionary.com to provide its content. On the results page, users
can click on the word they entered in the blue results bar and access the
dictionary.com definition.
Of course, rival
search engines routinely include definitions as part of their results. And
there are other sites specializing in slang and new terms, including Urban
Dictionary, which allows users to submit their own words, and Word Spy, which
compiles and defines words and phrases popping up in the media.
Earlier this year,
Word Spy ran
afoul of Google's intellectual property lawyers who wanted to be sure when
people "use 'Google,' they are referring to the services our company
provides and not to Internet searching in general."
Lawyers weren't as
upset with the definition as they were the lack of mention of the corporate
entity. Word Spy's editor modified the entry by inserting trademark
information, which satisfied Google.
GOOGLE expands services for the following: area codes, product codes,
flight information, vehicle identification numbers, and U.S. Postal Service
tracking numbers.
"Google Expands Search
Features," by Mylene Mangalindan, The Wall Street Journal, January
13, 2004 ---
Google Inc. expanded the types of
information that Internet users can search for on its Web site to include such
things as area codes, product codes, flight information, vehicle
identification numbers and U.S. Postal Service tracking numbers.
The closely held Web-search technology
company introduced the new features to its Web site Monday. Google (www.google.com)
sees its mission as connecting Internet users to the world's information,
which it hopes to organize and make more accessible.
The Mountain View, Calif., company,
which is the leading destination for Internet users on the Web for search, has
been in the news lately because it is expected to sell shares to the public
this year, according to people familiar with the situation. Many bigger public
companies such as Yahoo
Inc. and Microsoft
Corp. have also made it clear that they intend to challenge the start-up in
search technology.
Google's announcement Monday introduces
several new innovations. Computer users, for example, can type in an area code
in the search query bar and the top result will show a map of that geographic
area. Users can also plug in a vehicle identification number into the search
query box to get a link for a Web page with more information about the year,
make and model of a specific type of car.
"Google goes local: Search giant to tap into huge local advertising
market," by Stephanie Olsen, MSNBC News, March 17, 2004 --- http://www.msnbc.msn.com/id/4547867/
Internet darling Google is taking search to the
streets, helping Web surfers find cafes, parks or even Wi-Fi hot spots in
their area.
On Wednesday, the Web search company unveiled Google
Local, which has been tested in the company's research and development lab for
the last 8 months. Type a keyword along with an address or city name into the
search box at Google.com or at its newly designated site, Local.google.com ( http://www.local.google.com/
), to find maps, locally relevant Web sites and listings from businesses in
the area.
"A lot of times when people are looking for
something, they want to do it on a local level...This is a core search
promise," said Marissa Mayer, Google's director of consumer Web products,
who helped build the service with a team of engineers from Google's New York
office.
Prime real estate
Mountain View, Calif.-based Google
is giving prominence to local search at a time when it's one of the most hyped
areas of development in the industry. Financial analysts and industry
executives say geographically targeted search listings are prime real estate
for local advertising, an estimated $12 billion annual business in the United
States. In 2004, less than $50 million of that market will go toward ads
related to local Net searches, but over time, the dollars will find their way
to the virtual world, analysts say.
It will be "worth a lot more online. That is,
merchants will pay more," said Safa Rashtchy, Piper Jaffray's Internet
analyst. "Integration of that with search will make it very convenient
for searchers and extremely useful for local merchants."
For now, search engines including Google, Yahoo, Ask
Jeeves, MSN and CitySearch are working to perfect local search for consumers.
Google's chief rival, Yahoo, recently improved
visitors' chances of finding local restaurants, ATMs, shops and bus routes
through its map service. With its new SmartView feature, Yahoo now
incorporates points of interests like restaurants into local maps, allowing
Web surfers to refine what they're looking for (for example, Italian or Indian
food) and see where a particular spot is located in the neighborhood.
Google, which fields about 200 million queries a day,
said its local service improves people's access to relevant information, its
long-time mission. Using the local service, people will find business
addresses, phone numbers and "one-click" driving directions to
places of interest.
To deliver the results, Google draws on business
listings provided by third-party companies. It also uses technology to collect
and analyze data on the physical location of a Web page and then matches that
data to specified queries and their designated addresses.
For now, Google will not display local advertisements
on the service, but it plans to do so in the future. However, the company
currently sells advertisers the ability to target people by region on the main
Web site. Google makes money by letting advertisers bid for placement on
results pages for related search terms. Ads appear adjacent to or atop search
results.
Google Catalogs - catalogs.google.com
Search and browse mail-order catalogs online. More...
Google Groups - groups.google.com
Post and read comments in Usenet discussion forums. More...
Google Image Search - images.google.com
The most comprehensive image search on the web with 425 million images.
More...
"A Research Paper
Introduces Better Google Image-Search Technology," by Hurley
Goodall, Chronicle of Higher Education, April 28, 2008 ---
Click Here
Google
unveiled a prototype algorithm at a conference in Beijing last
week that will add precision to the search engine’s image-search
technology, The New York Times says.
Two
Google researchers presented a paper describing the prototype,
which is called VisualRank. It uses image-recognition technology
to help rank the relevance of images found in a search.
Currently, Google Image Search results are ranked using the text
around the image on the page. The new method will use the visual
characteristics of the image itself, and rank search results by
comparing similarities among them.
Google Labs - labs.google.com
Prototypes and projects in development by Google engineers, including: Google
Viewer - Google WebQuotes - Google Glossary - Google Sets - Voice Search -
Keyboard Shortcuts. More...
Google News - news.google.com
Search and browse 4,500 continuously updated news sources. More...
Google Special Searches - www.google.com/options/specialsearches.html
Narrow your search to a specific topic, such as BSD, Apple, and Microsoft.
Google Browser Buttons - www.google.com/options/buttons.html
Access Google's search technology by adding our buttons to your browser's
personal toolbar.
Google in Your Language - services.google.com/tc/Welcome.html
Volunteer to translate Google's help information and search interface into
your favorite language.
Google Toolbar - toolbar.google.com
Take the power of Google with you by adding the toolbar to your IE browser.
More...
Google Translate Tool - www.google.com/language_tools
Translate text or entire web pages.
Google Web APIs - www.google.com/apis/
A tool for software developers to automatically query Google. More
Google makes a lot of money from its services to advertisers and other
business firms:
Google sorts billions
of bits of information for its users. Here are some little-known bits of
information about Google:
Google's name is a
play on the word googol, which refers to the number 1 followed by
one hundred zeroes. The word was coined by the nine-year-old nephew of
mathematician Edward Kasner.
Google receives more
than 200 million search queries a day, more than half of which come
from outside the United States. Peak traffic hours to google.com are
between 6 a.m. and noon PST, when more than 2,000 search queries are
answered a second.
Google started as
a research project at Stanford University, created by Ph.D.
candidates Larry Page and Sergey Brin when they were 24 years old and 23
years old respectively (a combined 47 years old).
Google's index of
web pages is the largest in the world, comprising more than 3 billion
web pages, which, if printed, would result in a stack of paper 130
miles high. Google searches this immense collection of web pages often in
less than half a second.
Google receives
daily search requests from all over the world, including places as
far away as Antarctica and Ghana.
Users can restrict
their searches for content in 35 non-English languages, including
Chinese, Greek, Icelandic, Hebrew, Hungarian and Estonian. To date, no
requests have been received from beyond the earth's orbit, but Google is
working on a Klingon interface just in case.
Google has a
world-class staff of more than 1000 employees known as Googlers.
The company headquarters is called the Googleplex.
Google translates
more than 3 billion HTML web pages into a display format for WAP and i-mode
phones and wireless handheld devices, and has made it possible to
enter a search using only one phone pad keystroke per letter, instead of
multiple keystrokes.
Google Groups
comprises more than 800 million Usenet messages, which is the world's
largest collection of messages or the equivalent of more than a terabyte
of human conversation.
The basis of
Google's search technology is called PageRank™, and assigns an
"importance" value to each page on the web and gives it a rank
to determine how useful it is. However, that's not why it's called
PageRank. It's actually named after Google co-founder Larry Page.
Googlers are
multifaceted.
One operations manager, who keeps the Google network in good health, is a
former neurosurgeon. One software engineer is a former rocket scientist,
while another's first job title at Google was the Spiderman. And the
company's chef formerly prepared meals for members of The Grateful Dead
and funkmeister George Clinton.
3 billion web
pages translates to approximately 3 trillion words in Google’s
index. If a person averages about 1 page per minute, it would take 6,000
years to read the Google index. If this person reads on an 8-hour daily
schedule, it would take 18,000 years. Want weekends off? Add another 2,000
years.
What a lot of folks do not know about is the commercial Google Search
Appliance --- http://www.google.com/appliance/features.html
Among other things this allows management to track employee searches and track
incoming data when the outside world seeks employee Web pages.
Universities are now using this (by paying $28,000 or more per year) to track
information about searches of university Web servers. See the following
reference:
"Universities Discover a New Use for Google:
Finding Out What People Want," by Dan Carnevale, The Chronicle of Higher
Education, October 24, 2003, Page A37.
New tutorial detailing 20 ways to get more out of
Google.
Find out how to search with a date range (which avoids all those dead dot-com
pages cluttering up the web) and what intext means.
"20 Great Google Tips," by Tara Calishain, PC Magazine, October
28, 2003 --- http://www.pcmag.com/article2/0,4149,1306756,00.asp
Google offers several services that give you a head
start in focusing your search. Google Groups (http://groups.google.com)
indexes literally millions of messages from decades of discussion on Usenet.
Google even helps you with your shopping via two tools: Froogle (http://froogle.google.com),
which indexes products from online stores, and Google Catalogs (http://catalogs.google.com),
which features products from more than 6,000 paper catalogs in a searchable
index. And this only scratches the surface. You can get a complete list of
Google's tools and services at www.google.com/options/index.html.
You're probably used to using Google in your
browser. But have you ever thought of using Google outside your browser?
Google Alert (www.googlealert.com)
monitors your search terms and e-mails you information about new additions
to Google's Web index. (Google Alert is not affiliated with Google; it uses
Google's Web services API to perform its searches.) If you're more
interested in news stories than general Web content, check out the beta
version of Google News Alerts (www.google.com/newsalerts).
This service (which is affiliated with Google) will monitor up to 50 news
queries per e-mail address and send you information about news stories that
match your query. (Hint: Use the intitle: and source: syntax elements with
Google News to limit the number of alerts you get.)
Google on the telephone? Yup. This service is
brought to you by the folks at Google Labs (http://labs.google.com),
a place for experimental Google ideas and features (which may come and go,
so what's there at this writing might not be there when you decide to check
it out). With Google Voice Search (http://labs1.google.com/gvs.html),
you dial the Voice Search phone number, speak your keywords, and then click
on the indicated link. Every time you say a new search term, the results
page will refresh with your new query (you must have JavaScript enabled for
this to work). Remember, this service is still in an experimental phase, so
don't expect 100 percent success.
In 2002, Google released the Google API
(application programming interface), a way for programmers to access
Google's search engine results without violating the Google Terms of
Service. A lot of people have created useful (and occasionally not-so-useful
but interesting) applications not available from Google itself, such as
Google Alert. For many applications, you'll need an API key, which is
available free from www.google.com/apis.
See the figures for two more examples, and visit www.pcmag.com/solutions
for more.
Thanks to its many different search properties,
Google goes far beyond a regular search engine. Give the tricks in this
article a try. You'll be amazed at how many different ways Google can
improve your Internet searching.
Daterange: (start date–end date). You can
restrict your searches to pages that were indexed within a certain time
period. Daterange: searches by when Google indexed a page, not when the page
itself was created. This operator can help you ensure that results will have
fresh content (by using recent dates), or you can use it to avoid a topic's
current-news blizzard and concentrate only on older results. Daterange: is
actually more useful if you go elsewhere to take advantage of it, because
daterange: requires Julian dates, not standard Gregorian dates. You can find
converters on the Web (such as http://aa.usno.navy.mil/data/docs/JulianDate.html),
but an easier way is to do a Google daterange: search by filling in a form
at www.researchbuzz.com/toolbox/goofresh.shtml
or www.faganfinder.com/engines/google.shtml.
If one special syntax element is good, two must be better, right? Sometimes.
Though some operators can't be mixed (you can't use the link: operator with
anything else) many can be, quickly narrowing your results to a less
overwhelming number.
More Google API Applications
Staggernation.com offers three tools based on the
Google API. The Google API Web Search by Host (GAWSH) lists the Web hosts of
the results for a given query (www.staggernation.com/gawsh/).
When you click on the triangle next to each host, you get a list of results
for that host. The Google API Relation Browsing Outliner (GARBO) is a little
more complicated: You enter a URL and choose whether you want pages
related to the URL or pages linked to the URL (www.staggernation.com/garbo/).
Click on the triangle next to a URL to get a list of pages linked or
related to that particular URL. CapeMail
is an e-mail search application that allows you to send an e-mail to google@capeclear.com
with the text of your query in the subject line and get the first ten
results for that query back. Maybe it's not something you'd do every day,
but if your cell phone does e-mail and doesn't do Web browsing, this is a
very handy address to know.
April 7, 2003 from Wired News
Rolling out a souped-up search engine Monday, Yahoo makes a bid to supplant its
business partner, Google, as the most popular place to find things on the
Internet. The company says its search engine will be more useful and simpler to
use than Google --- http://www.wired.com/news/business/0,1367,58368,00.html
According to the industry newsletter, Google handles
an average of 112 million searches a day and Yahoo handles about 42 million.
Most of Yahoo's results are generated by Google's software.
With its success, Google has introduced other
services, such as news and shopping pages, that traverse Yahoo's turf.
To lessen its dependence on Google, Yahoo last month
bought search engine specialist Inktomi for $279.5 million. Yahoo plans to
incorporate Inktomi's tools into its search engine by year's end.
Success also has thrust privately held Google into
the cross-hairs of Microsoft, which last week said it would improve its online
search prowess.
Ask Eric Schmidt,
chief executive of Google, when Silicon Valley and the technology industry
will return to robust growth. All he knows is it won't be in the immediate
future, and he makes a persuasive case.
That's the big
picture. But Schmidt's smaller picture, Google itself, is one of those grand
exceptions that proves the valley's longstanding rule -- that technological
innovation continues no matter what the larger economy is doing.
I caught up with him
at the annual Agenda conference, a gathering that has been a staple of the
tech elite's autumn schedule. This year's gathering, a drastically downsized
affair, reflected overall industry trends.
A grim, confused mood
prevailed here, and the momentary pleasure of Tuesday's market surge was
doused by Intel's disappointing earnings report after the market's close. Few
people here seemed willing even to speculate on when technology spending would
rebound.
Schmidt isn't
predicting any immediate boosts.
Is the current gloom
overdone? That depends, Schmidt says, on "whether you think we are at a
bottom." Are we? "We're nearing one."
Not Google, which has
become one of the Internet's essential services. A couple of years ago, when I
spoke publicly, I started asking people in the audience who was not using
Google as their primary search engine. It has been a long time since more than
several people in any crowd raised their hands.
One smart idea has
followed another at the Mountain View company. Recently, Google created a
news-oriented search, culling and ranking news stories from a variety of
sources. Google works in teams of three people, and one of those teams created
Google News (http://news.google.com),
which is rapidly becoming one of my online addictions.
Innovation happens no
matter what markets do, Schmidt says, a common refrain. "Innovation comes
from universities," he says, "and it's producing enormous step-ups in
wireless, chip design, Linux and information mining," among other areas. But
most of the innovations he sees tend to be interesting technologies without a
persuasive business case.
Google gets its share
-- its pick, really -- of smart university graduates, Schmidt says. The
company is doing cool projects. It's probably at the beginning, not the end,
of its serious growth.
Google doesn't give
out the precise numbers, but Schmidt says it has been profitable since March
2001. Its principal business is what he calls a "positive surprise" -- the
effectiveness of the little advertisements that appear on the pages showing
the result of users' searches.
A fascinating wrinkle
is how the company sells the ads it places on the right side of the Web page.
These are auctioned off, unlike the ads that show up above the search results.
It would not surprise me to learn that Google is making more money on the Web
than any auction site except eBay. Google sells fewer items, but it keeps all
the money from these auctions.
Many have wondered
how long it will take for Google to do what so many other valley companies
have done in recent years -- sell shares to the public. Schmidt has bad news
for those who want it to be soon: "We have no plans to go public," he says.
Is Google even talking with investment banks? "No."
This is a lousy time
for an initial public offering. The economy is stagnant, and a couple of weeks
of stock-market surges should meet more suspicion than joy. The murky
financial landscape gets worse when you consider problems that transcend the
economy.
War is one, Schmidt
notes. "The papers are writing about war as opposed to the economy," he
says. "As long as that goes on, there is a sense of not being focused on the
problems at hand."
Only when America
gets beyond its showdown with Iraq will government and businesses refocus on
economic matters. And only then will businesses start buying technology again
in a serious way, he says.
But don't imagine
that they'll restore the industry to the joy ride of the late 1990s. The tech
sector is maturing, Schmidt says, and maturing businesses cannot sustain rates
of growth that younger growth businesses expect. Large sectors can't
continually grow faster than the overall economy, because -- as is happening
with tech -- they effectively become the economy. Regulatory and political
influences, for good and bad reasons, become a larger part of the action.
Search engine Google
is virtually revered by the Internet community and is often profiled as a pure
technology company that does not take commercial interests to heart. But those
days are over. In the past two years, Google has inked revenue-generating
deals with almost every major player on the Internet, stepped up efforts to
secure the lion's share of Internet advertising dollars, and tested the waters
in the news and e-commerce sectors.
Where are these ventures taking Google, and where is
Google taking the Internet? It is more than an academic question: Google
processes more than 150 million Web searches per day. By some accounts, 75
percent of the outside traffic to any given Web site originates on Google.
Where Google goes, so goes the Web.
Google's primary emphasis in the past year has been
on developing its offerings and reach in Internet advertising. The company's
text-based AdWords program has been a big success since its inception a year
ago. And it is easy to see why: For text-based ads related to searches,
click-through rates tend to be four to five times higher than for traditional
banner advertising.
To place a text ad, advertisers choose which keywords
they want to target. On Google, keywords are auctioned off: The higher the
bid, the higher the ad will appear on the search results page. Click-through
performance also factors in. The higher the click-through rate, the less
costly the text link. "Irrelevant ads are dropped down the page, and
advertisers who are more relevant will save money," Andrew Goodman,
principal at search engine optimization firm Page
Zero Media, told the E-Commerce Times.
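The bid-times-clickthrough scheme Goodman describes can be sketched in a few lines of Python. This is a toy illustration of the idea, not Google's actual auction; the ad names, bids, and click-through rates below are invented.

```python
# Rank ads the way the article describes: position depends on the bid,
# but a strong click-through rate can outrank a higher bid, so
# "advertisers who are more relevant will save money."

def rank_ads(ads):
    """Order ads by bid weighted by click-through rate (highest first)."""
    return sorted(ads, key=lambda ad: ad["bid"] * ad["ctr"], reverse=True)

ads = [
    {"name": "A", "bid": 1.00, "ctr": 0.010},   # high bid, weak CTR
    {"name": "B", "bid": 0.60, "ctr": 0.025},   # lower bid, strong CTR
    {"name": "C", "bid": 0.80, "ctr": 0.005},   # irrelevant ad drops down
]

for ad in rank_ads(ads):
    print(ad["name"], round(ad["bid"] * ad["ctr"], 4))
```

Note how ad B, despite bidding less than A, wins the top spot because its relevance (click-through rate) more than makes up the difference.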
But for revenue, nothing beats Google's premium
sponsorship program, in which advertisers purchase prime real estate at the
top of a search results page. "Google's probably looking to get 40
percent to 50 percent of [its] revenue [by] targeting big companies with those
premium spots," Goodman said.
In addition, the company has expanded its advertising
offerings by placing cost-per-click advertising on content-targeted Web pages.
So far, the program has been piloted on such sites as HowStuffWorks and
Knight-Ridder properties the San Jose Mercury News and The Philadelphia
Inquirer. Amazon.com (Nasdaq: AMZN)
is the latest partner to sign up, giving Google advertisers a prime
high-traffic site on which to attract customers. But the jury is still out on
how effective this marketing program will be.
"I think the content targeting will be less
lucrative because click-through rates are much lower," Goodman said.
Google also has moved into syndicating ads to ad
networks such as FastClick and Burst Media, which serve smaller clients. With
all this activity, Google and its competitors actually are running out of
spots to place ads, according to Goodman. "There's a finite pie they're
all fighting over."
I doubt that anything soon will replace Google as the main search engine for
the world. However, on August 12, Time Magazine featured some
interesting alternatives.
"Searching for Perfection: Google's still great, but newer search
engines make finding things on the Web easier — and more fun," by Anita
Hamilton, Time, August 12, 2002, Page 66. She gives her highest grade
to alltheweb.com
Not only does ALLTHEWEB index more pages (2.1 billion!) than
any other site, but it may have the smartest approach to turning up relevant
results.
Whereas Google runs a virtual popularity contest that pushes
to the top of its list pages that are most frequently and prominently linked to
by other sites, AlltheWeb, based in Oslo, Norway, further tries to decipher the
intent of the query by analyzing its language patterns and identifying common
phrases.
If there were a beauty contest for search engines, Kartoo--developed
in France by cousins Laurent and Nicolas Baleydier--would win the crown.
Rather than display its results in the usual dreary list format, Kartoo scatters
them across a pretty blue background like stars dotting the evening sky.
(Color-coded links suggest how the results interconnect.) Each dot
represents a relevant Web page; when you rest your mouse over a site, it
displays a brief description of the contents.
Wouldn't it be great if a search engine could read your mind
and understand that when you type the word Madonna, you mean the pop star, not
the mother of Jesus? Teoma, which means "expert" in Gaelic,
attempts to do just that by breaking queries into categories grouped by
theme. A search on seals, for example, returned results clustered in 15
categories, from Easter Seals to elephant seals.
Top dog among search
engines in the Net's early years, AltaVista wants that distinction back. A new
look and a promise to provide more relevant results could be steps in the right
direction --- http://www.wired.com/news/business/0,1367,56335,00.html
November
13, 2002
In a bid to recapture its former status as the Web's top-ranked search engine,
the Palo Alto, California, company rolled out a dramatic overhaul of its site
and indexing methodology this week.
Executives said the
revamped site, which includes a pared-down front page and more frequent
updates of indexed links, is part of a broader effort to restructure the
company.
"The company
tried to become a portal too late in the game, and lost focus," said Jim
Barnett, AltaVista's CEO. "What we've done over the past year is focus
the company back on our core strength and our roots, which is search."
The redesign comes
amid a difficult period for AltaVista,
a company that SearchDay
editor Chris Sherman said "was once considered the king of search
engines."
While the company
enjoyed a brief spell of Internet stardom in the late 1990s, its fortunes
abruptly reversed when the dot-com bubble burst. Over the past two years,
AltaVista has weathered multiple rounds of layoffs and withdrawn a stock
offering once expected to net $300 million.
Meanwhile, the
company's popularity among search engine users is also slipping. Although
AltaVista still has a large following, with an estimated 33 million visitors a
month worldwide, it trails behind rivals Google and Yahoo.
In the November
ranking of most-visited U.S.-run Internet sites, tabulated by NetRatings,
AltaVista did not make the top 25.
Still, search engine
experts say it's not too late for AltaVista to make a comeback.
"They've had a
history of making changes and hyping the changes and not really living up to
the hype," Sherman said. "But this time it feels different. I get
the impression they really are serious about getting back to being a serious
player in the search industry."
Sherman said that
while he's only done a few searches on the new AltaVista, he's getting better
results than he used to. AltaVista now does a better job separating paid
listings from genuinely relevant search results, he added.
AltaVista's Barnett
believes the revamped site will help bring back many of the search engine's
former fans.
In addition to a
feature that refreshes more than half of its search results daily, the company
is offering an established advanced search tool, Prisma, in four additional
languages: French, German, Italian and Spanish.
AltaVista also claims
to have vastly improved its ability to weed out duplicate pages, spam and dead
links.
But Shari Thurow,
marketing director at Grantastic Designs and author of the upcoming book Search
Engine Visibility, isn't so sure.
"The look and
feel is a million times better," Thurow said. "But I'm hoping their
search results are more relevant, too, because the look and feel doesn't
change that."
Like many early Web
junkies, Thurow was an avid user of AltaVista. She claims that back in 1997 it
was her favorite site.
Nowadays, Thurow says
she usually prefers to conduct searches on Google,
Fast and AskJeeves.
She still uses AltaVista, but largely for software-related research and to
find images, for which the site has a dedicated search function.
NEW INTERFACE is more compact (the ad banner space has been completely
removed), allowing users to view more results, and matches Microsoft's new
Windows XP interface.
SUGGESTED KEYWORDS based on the original query provide users with follow-up
search suggestions.
COLUMNS are adjustable/selectable, including showing abstracts with results.
VALIDATION and RANKING allow users to verify that the page exists and/or that
the query is on the page before visiting it. [PRO only feature]
DEEP SEARCH has been added to allow users searching, for example, for 'truck'
in an initial search to then re-search (1-all) of the results for 'red',
independent of the search engines. The second follow-up search goes directly
to each selected page and checks the page for 'red'. [PRO only feature]
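The DEEP SEARCH idea is a two-pass filter: an initial engine query, then a direct check of each result page for the second term. A rough sketch of that pattern, with pages stubbed as local strings instead of fetched URLs (all names and pages below are invented):

```python
# Two-pass "deep search": first a keyword match over indexed pages,
# then a re-check of each result page itself for a second term,
# independent of the search engine's index.

pages = {
    "lot.example/1": "red truck for sale, low mileage",
    "lot.example/2": "blue truck, new tires",
    "lot.example/3": "red sedan, one owner",
}

def initial_search(term):
    """First pass: engine-style keyword match over indexed pages."""
    return [url for url, text in pages.items() if term in text]

def deep_search(urls, term):
    """Second pass: visit each result and check the page for the term."""
    return [url for url in urls if term in pages[url]]

hits = deep_search(initial_search("truck"), "red")
print(hits)
```

A real tool would fetch each URL over the network in the second pass; the point is that the follow-up check happens against the live page, not the engine's stale index.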
RIGHT CLICKING the mouse provides quick access to options and advanced
features.
MANY MORE tweaks and features have been added. Go to View -> Options on the
menu for advanced configuration. The help file provides full explanations of
features and tips for efficient searching.
Blinkx finds links of possible interest to you based upon what you are
reading.
Whenever you browse a website, read a news story,
check your e-mail or write a document, blinkx automatically delivers
suggestions from the Web, news or your local files, which you can view by
simply clicking the links or rolling over them to get a summary of the information
found. If you want to search, blinkx understands your question and presents
you with links as you search.
In every case, blinkx provides an answer that is
appropriate, faster than using a search engine and personalized just for you.
Current state of scholarly cyberinfrastructure in the humanities and social
sciences
From the University of Illinois Issues in Scholarly Communication Blog
"Our Cultural Commonwealth"
The American Council of Learned Societies has just
issued a report, "Our Cultural Commonwealth," assessing the current state of
scholarly cyberinfrastructure in the humanities and social sciences and
making a series of recommendations on how it can be strengthened, enlarged
and maintained in the future.
John Unsworth, Dean and Professor, Graduate School
of Library and Information Science here at Illinois, chaired the Commission
that authored the report.
The search company, which is expected to go public
this year, is flexing its power with its Internet fans by constantly offering
new services, including comparison shopping and news search. Orkut could be
the clearest signal that Google's aspirations don't end with search.
"Orkut is an online trusted community Web site
designed for friends. The main goal of our service is to make the social life
of yourself and your friends more active and stimulating," according to
the Web site, which states that the service is "in affiliation with
Google."
A Google representative said that the site is the
independent project of one of its engineers, Orkut Buyukkokten, who works on
user interface design for Google. Buyukkokten, a computer science doctoral
candidate at Stanford University before joining Google, created Orkut.com in
the past several months by working on it about one day a week--an amount that
Google asks all of its engineers to devote to personal projects. Buyukkokten,
with the help of a few other engineers, developed Orkut out of his passion for
social networking services.
Google spokeswoman Eileen Rodriquez said that despite
Orkut's affiliation, the service is not part of Google's product portfolio at
this time. "We're always looking at opportunities to expand our search
products, but we currently have no plans in the social networking
market."
Still, Google owns the technology developed by its
employees, Rodriquez said.
Orkut is a "trusted" social network,
meaning that you must be invited to join. The service sent out thousands of
invitations Thursday to welcome individuals, according to Google.
Google regularly throws out new products and services
to see if they stick. Google News, for example, began as the personal project
of Google engineer Krishna Bharat in 2002. While Google still runs news search
in "beta" form, it is gaining a wide audience on the Internet and is
prominently promoted on Google's home page.
Some scary statements have been made about the
privacy of search requests. You may have heard Google was nominated for a Big
Brother Award. You may also have read that Google knows everything you
ever searched for. Should you be afraid? Is it time to boycott
Google, as blogger Gavin Sheridan called for?
Relax. Yes, there are privacy issues when you do a
search at Google. These are concerns at other search engines, too. Fear that
you, personally, will be tracked isn't realistic for the vast majority of
users.
What exactly does Google know about you when you come
to search? You needn't be worried -- for the moment. Next week, we'll continue
the privacy discussion with a look at Yahoo! and search engine privacy
policies.
Fact or Fiction?
No wonder people worry about search privacy after
reading statements like these:
Google builds up a detailed profile of your search
terms over many years. Google probably knew when you last thought you were
pregnant, what diseases your children have had, and who your divorce lawyer
is.--BBC technology commentator Bill Thompson, February
21, 2003
I don't like that its cookies expire 35 years from now, and that it
records all my searches, including the embarrassing ones.--Technology
writer and blogger Chris Gulker, March
7, 2003
Reality: Google doesn't know who you are as an
individual. Its use of cookies, hardly unique, doesn't give it a magical
ability to see your face and know your name through your monitor.
All Google knows is specific browser software, on a
particular computer, made a request. A cookie gives it the ability to
potentially see all requests made by that browser over time. Google doesn't
know who was at the browser when the request was made.
When I search at Google, this is how it identifies
me: 740674ce2123e969.
No name, no address, no phone number. If someone else
is at my computer, Google can't tell someone new is searching.
What Does Google Record?
Here's how that unique cookie number is given to you
and why it tells Google nothing about who you are.
Assume you've never been to Google before. You visit
the site and search for "cars." What's recorded?
As stated in its privacy
policy, Google records the time you visited, your Internet address, and
your browser type in a log
file. It's standard practice for Web servers to keep track of this
information.
Here's a simplified example of how a search for
"cars" might appear in Google's logs:
inktomi1-lng.server.ntl.com - 25/Mar/2003 10:15:32 -
http://www.google.com/search?q=cars - MSIE 6.0; Windows NT 5.1 -
740674ce2123e969
When broken down:
inktomi1-lng.server.ntl.com -- my Internet
address, resolved to a domain name
MSIE 6.0; Windows NT 5.1 -- the browser and
operating system I used, MS Internet Explorer 6 on Windows XP
740674ce2123e969 -- my unique cookie ID, assigned
to my browser the first time I visited
My Internet Address
If Google wants to know who I am, the most important
element is my IP
address. That address says nothing about me as Danny Sullivan. NTL is a large
UK Internet access provider. The IP address represents the NTL computer
serving my requests. (Inktomi is mentioned probably as a remnant from when it
provided Internet caching services to ISPs.)
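For the curious, here is roughly what pulling those fields out of the simplified log line above might look like. The " - " separated layout follows the article's simplified example, not Google's real log format.

```python
# Break the sample log entry into its parts: host, timestamp,
# requested URL, browser/OS string, and the cookie ID.

line = ("inktomi1-lng.server.ntl.com - 25/Mar/2003 10:15:32 - "
        "http://www.google.com/search?q=cars - MSIE 6.0; Windows NT 5.1 - "
        "740674ce2123e969")

host, timestamp, url, agent, cookie_id = line.split(" - ")

query = url.split("q=")[1]          # the search term, "cars"
print(host, query, cookie_id)
```

Notice that nothing in the parsed record identifies a person: the cookie ID ties requests to a browser, and the host resolves to the access provider, just as the article explains.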
RSS is defined as Rich Site Summary or RDF Site Summary, where RDF in this
context is an XML markup that allows you to find topics in documents that do
not necessarily use your search terminology and to exclude documents that use
your terminology in a different context. Unfortunately, the same term in
English may have vastly different meanings, which leads to getting thousands or
millions of unwanted "hits" in traditional HTML text searches.
An RSS site allows users to add content to the
site. In this sense it is like a Wiki,
but it is much more efficient and popular than a Wiki for news feeds (although
Wikipedia has just started a news feed feature). Wikis do not have
the same deep RDF metadata features. Webopedia defines RSS as
follows at http://www.webopedia.com/TERM/R/RSS.html
Short for RDF Site Summary or Rich Site Summary,
an XML
format for syndicating
Web content. A Web site that wants to allow other sites to publish some of
its content creates an RSS document and registers the document with an RSS
publisher. A user that can read RSS-distributed content can use the content
on a different site. Syndicated content includes such data as news feeds,
events listings, news stories, headlines, project updates, excerpts from
discussion forums or even corporate information.
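To make the definition concrete, here is a minimal RSS 2.0 document read with Python's standard library. The feed content below is invented for illustration.

```python
# A tiny RSS 2.0 feed: a <channel> with a title and a list of <item>
# entries, each carrying a headline and a link, parsed with the
# standard-library XML module.
import xml.etree.ElementTree as ET

feed = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.com/</link>
    <item>
      <title>Headline one</title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title>Headline two</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>"""

root = ET.fromstring(feed)
for item in root.iter("item"):
    print(item.findtext("title"), "->", item.findtext("link"))
```

This is all a news aggregator like Pluck fundamentally does: fetch such documents on a schedule and present the accumulated items.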
RSS/RDF feeds are commonly
available ways of distributing or syndicating the latest news about a given
web site. Weblog (blog) sites in particular are prolific generators of RSS
feeds. Free software that integrates well with Internet Explorer and is
very simple to install is Pluck from http://www.pluck.com/
The following are RSS search
advantages described by Pluck:
For Hunters and Gatherers, a New Way to
Compare
"With one click, users of Pluck can save Web bookmarks into an online
folder or email them to others."
Blurring the Line Between Affiliate and
Developer
"Pluck not only integrates eBay searching into the browser, but it
improves on features built into eBay.com..."
Question
Is RSS really the next big thing on the Internet?
Answer
Actually, RDF is a long-run huge thing for meta searches, and RSS is
probably the next big thing as an early part of RDF. Major Internet
players such as Yahoo, Amazon, and eBay are already providing RSS feeds
distributing or syndicating the latest news about their sites. Weblog (blog)
sites in particular are prolific sources of RSS feeds.
You should probably download Pluck
and begin to play around with RSS feeds and searches. There are,
however, drawbacks.
If you feed too much too often,
there is a high risk of information overload. It is something like email
from Bob Jensen magnified 1,000 times. Also be aware that any summarization or
abstract of a complete article must by definition omit many things. What
you are most interested in may have been left out unless you go to the main
source document.
Another limitation is that our
libraries are just beginning to learn about RDF and its helper RSS
sites. This technology is on the cutting edge, and you can still get
lost without the help of your friendly librarian. This is still more
in the XML techie domain and is not as user friendly to date as most of us
amateurs would prefer.
I will be interested in reader
comments, because I still feel very ignorant in this domain.
Bob Jensen
THE FUTURE OF SEARCH (or so says IBM)
This may have very serious implications for Internet searching, XBRL, and
academe!
I.B.M. says that its tools will make possible a
further search approach, that of "discovery systems" that will extract
the underlying meaning from stored material no matter how it is structured
(databases, e-mail files, audio recordings, pictures or video files) or even
what language it is in. The specific means for doing so involve steps that will
raise suspicions among many computer veterans. These include "natural
language processing," computerized translation of foreign languages and
other efforts that have broken the hearts of artificial-intelligence researchers
through the years. But the combination of ever-faster computers and
ever-evolving programming allowed the systems I saw to succeed at tasks that
have beaten their predecessors.
James Fallows (See below)
SUDDENLY, the computer world is interesting again.
The last three months of 2004 brought more innovation, faster, than users have
seen in years. The recent flow of products and services differs from those of
previous hotly competitive eras in two ways. The most attractive offerings are
free, and they are concentrated in the newly sexy field of "search."
Google, current heavyweight among systems for
searching the Internet, has not let up from its pattern of introducing
features and products every few weeks. Apart from its celebrated plan to index
the contents of several university libraries, Google has recently released
"beta" (trial) versions of Google Scholar, which returns abstracts
of academic papers and shows how often they are cited by other scholars, and
Google Suggest, a weirdly intriguing feature that tries to guess the object of
your search after you have typed only a letter or two. Give it "po"
and it will show shortcuts to poetry, Pokémon, post office, and other popular
searches. (If you stop after "p" it will suggest "Paris
Hilton.") In practice, this is more useful than it sounds.
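The guessing trick behind Google Suggest can be approximated as prefix matching against a popularity-ranked query log. A minimal sketch of that idea; the query log and counts below are invented.

```python
# Suggest completions for a partial query: find logged searches that
# start with the typed prefix and return the most popular ones first.

popular = {                 # invented query log with popularity counts
    "poetry": 90, "pokemon": 80, "post office": 70,
    "paris hilton": 120, "power tools": 15,
}

def suggest(prefix, limit=3):
    """Most popular logged queries starting with the prefix."""
    matches = [q for q in popular if q.startswith(prefix)]
    return sorted(matches, key=popular.get, reverse=True)[:limit]

print(suggest("po"))        # poetry, pokemon, post office
print(suggest("p")[0])      # the single most popular "p" query
```

A production system would use a trie or a sorted index rather than a linear scan, but the behavior, narrowing the suggestions letter by letter, is the same.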
Microsoft, heavyweight of the rest of computerdom,
has scrambled to catch up with search innovations from Google and others. On
Dec. 10, a company official made a shocking disclosure. For years Microsoft
had emphasized the importance of "WinFS," a fundamentally new file
system that would make it much easier for users to search and manage
information on their own computers. Last summer, the company said that WinFS
would not be ready in time for inclusion with its next version of Windows,
called Longhorn. The latest news was that WinFS would not be ready even for
the release after that, which pushed its likely delivery at least five years
into the future. This seemed to put Microsoft entirely out of the running in
desktop search. But within three days, it had released a beta version of its
new desktop search utility, which it had previously said would not be
available for months.
Meanwhile, a flurry of mergers, announcements and
deals from smaller players produced a dazzling variety of new search
possibilities. Early this month Yahoo said it would use the excellent indexing
program X1 as the basis for its own desktop search system, which it would
distribute free to its users. The search company Autonomy, which has
specialized in indexing corporate data, also got into the new competition, as
did Ask Jeeves, EarthLink, and smaller companies like dtSearch, Copernic,
Accoona and many others.
I have most of these systems running all at once on
my computer, and if they don't melt it down or blow it up I will report later
on how each works. But today's subject is the virtually unpublicized search
strategy of another industry heavyweight: I.B.M.
Last week I visited the Thomas J. Watson Research
Center in Hawthorne, 20 miles north of New York, to hear six I.B.M.
researchers describe their company's concept of "the future of
search." Concepts and demos are different from products being shipped and
sold, so it is unfair to compare what I.B.M. is promising with what others are
doing now. Still, the promise seems great.
Two weeks before our meeting, I.B.M. released
OmniFind, the first program to take advantage of its new strategy for solving
search problems. This approach, which it calls unstructured information
management architecture, or UIMA, will, according to I.B.M., lead to a third
generation in the ability to retrieve computerized data. The first generation,
according to this scheme, is simple keyword match - finding all documents that
contain a certain name or address. This is all most desktop search systems can
do - or need to do, because you're mainly looking for an e-mail message or
memorandum you already know is there. The next generation is the Web-based
search now best performed by Google, which uses keywords and many other
indicators to match a query to a list of sites.
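The "first generation" keyword match in I.B.M.'s scheme is easy to sketch with an inverted index, the basic data structure behind most desktop search tools. The documents below are invented for illustration.

```python
# First-generation retrieval: an inverted index mapping each word to
# the documents that contain it, answering exact keyword queries.
from collections import defaultdict

docs = {
    1: "quarterly report with revenue figures",
    2: "memo about the quarterly meeting",
    3: "travel itinerary and receipts",
}

index = defaultdict(set)            # word -> ids of docs containing it
for doc_id, text in docs.items():
    for word in text.split():
        index[word].add(doc_id)

def keyword_search(word):
    """Return every document containing the exact keyword."""
    return sorted(index.get(word, set()))

print(keyword_search("quarterly"))
```

The second and third generations layer on top of this: Google adds link-based relevance signals, and I.B.M.'s proposed "discovery systems" would add meaning extraction, but the keyword index remains the foundation.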
I.B.M. says that its tools will make possible a
further search approach, that of "discovery systems" that will
extract the underlying meaning from stored material no matter how it is
structured (databases, e-mail files, audio recordings, pictures or video
files) or even what language it is in. The specific means for doing so involve
steps that will raise suspicions among many computer veterans. These include
"natural language processing," computerized translation of foreign
languages and other efforts that have broken the hearts of
artificial-intelligence researchers through the years. But the combination of
ever-faster computers and ever-evolving programming allowed the systems I saw
to succeed at tasks that have beaten their predecessors.
Dead Link Archive --- http://ejw.i8.com/copy.htm#dead
DEAD LINK ARCHIVE
For Dead Links, use Internet Archive to find a version
of these sites. Highlight and copy the URL, then go to the Way Back Machine
at http://www.archive.org/index.html
and then paste the URL into the web address box. Often icons are not
available, and the most recent listed version may not bring up the page. Go
to an earlier date on the archive list for that site. Also, if you do not
find it archived, try the Google Search Engine at http://www.google.com
and check their archive.
Songwriter and Music Copyright Resources --- http://www.npsai.com/resources.htm
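The copy-and-paste steps above can also be collapsed into a single address, using the commonly used web.archive.org URL pattern; a "*" in the timestamp position lists all archived snapshots of a page.

```python
# Turn a dead link into a Wayback Machine lookup address.

def wayback_url(dead_link, timestamp="*"):
    """Build an archive.org lookup; '*' lists all archived snapshots."""
    return "http://web.archive.org/web/%s/%s" % (timestamp, dead_link)

print(wayback_url("http://ejw.i8.com/copy.htm"))
```

Replacing the "*" with a date stamp (e.g. a year such as 2001) asks the archive for the snapshot nearest that date instead of the full list.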
Since you mention greatdomains.com on your Search Helpers, I was wondering
if your visitors would also like our site www.Register-DomainNames.com.
We have some awesome domain finder and information tools. Check it out to
see if your visitors would like it. You can add your site to our directory if
you are interested in exchanging links, at
http://www.register-domainnames.com/cgi-bin/links/add.cgi
Amazon Elbows Into Online Yellow Pages
Hiking the stakes in this hot field, the new service from its A9 unit features
photo-rich listings that let you wander around near a destination
January 28, 2005 message from BusinessWeek Online's Insider
[BW_Insider@newsletters.businessweek.com]. The A9.com home page is at
http://a9.com/?c=1&src=a9
People Search Engine
Amazon.com Inc.'s A9.com search engine has incorporated
Zoom Information Inc.'s index of businesspeople as the default source for people
information, the companies said Tuesday. The ZoomInfo service can be accessed by
selecting the "people" box on the A9.com homepage. ZoomInfo provides summaries
describing the person's work history, education and accomplishments. ZoomInfo
also allows registered users to monitor and manage their own summaries. "ZoomInfo
has collected and organized hundreds of millions of random bits of information
about people from across the Web," Florian Brody, director of marketing for
A9.com, said in a statement. "This information is useful in lots of different
ways, which we are excited to make it available for our users on A9.com." Zoom's
search technology scans millions of Web sites, press releases, electronic news
services, Securities and Exchange Commission filings and other online sources;
and then summarizes the information, the Waltham, Mass., company said.
Antone Gonsalves, "Through a new deal with Zoom Information, Amazon.com's A9
search engine provides free summaries describing a person's work history,
education, and accomplishments," Information Week, January 17, 2006 ---
http://www.informationweek.com/news/showArticle.jhtml?articleID=177101278
Search for Online Communities on
Over 700,000 Topics
Are there topics in life that you would like to discuss or read about
amidst over 700,000 online communities? An index of these communities is
provided at ezboard at http://www.ezboard.com/
[ezboard is the] leading online community service on
the Net, consisting of over 700,000 communities and over 5 MILLION registered
users!
A Weblog (which is sometimes written as "web
log" or shortened to "blog") is a Web site of personal or non-commercial
origin that uses a dated log format that is updated on a daily or very
frequent basis with new information about a particular subject or range of
subjects. The information can be written by the site owner, gleaned from other
Web sites or other sources, or contributed by users. A
Web log often has the quality of being a kind of
"log of our times" from a particular point-of-view. Generally,
Weblogs are devoted to one or several subjects or themes, usually of topical
interest, and, in general, can be thought of as developing commentaries,
individual or collective on their particular themes. A Weblog may consist of
the recorded ideas of an individual (a sort of diary) or be a complex
collaboration open to anyone. Most of the latter are moderated discussions.
Weblog software use grows daily -- but bloggers abandon sites and launch new
ones as frequently as J.Lo goes through boyfriends. Which makes taking an
accurate blog count tricky --- http://www.wired.com/news/culture/0,1284,54740,00.html
In 1998 there were just a handful of sites of the
type that are now identified as weblogs (so named by Jorn
Barger in December 1997). Jesse James Garrett, editor of Infosift,
began compiling a list of "other sites like his" as he found them in
his travels around the web. In November of that year, he sent that list to
Cameron Barrett. Cameron published the list on Camworld,
and others maintaining similar sites began sending their URLs to him for
inclusion on the list. Jesse's 'page
of only weblogs' lists the 23 known to be in existence at the beginning of
1999.
Suddenly a community sprang up. It was easy to read
all of the weblogs on Cameron's list, and most interested people did. Peter
Merholz announced in early 1999 that he was going to pronounce it 'wee-blog'
and inevitably this was shortened to 'blog' with the weblog editor referred to
as a 'blogger.'
At this point, the bandwagon jumping began. More and
more people began publishing their own weblogs. I began mine in April of 1999.
Suddenly it became difficult to read every weblog every day, or even to keep
track of all the new ones that were appearing. Cameron's list grew so large
that he began including only weblogs he actually followed himself. Other
webloggers did the same. In early 1999 Brigitte
Eaton compiled a list of every weblog she knew about and created the Eatonweb
Portal. Brig evaluated all submissions by a simple criterion: that the
site consist of dated entries. Webloggers debated what was and what was not a
weblog, but since the Eatonweb Portal was the most complete listing of weblogs
available, Brig's inclusive definition prevailed.
This rapid growth continued steadily until July 1999
when Pitas, the first free build-your-own-weblog
tool launched, and suddenly there were hundreds. In August, Pyra
released Blogger, and Groksoup
launched, and with the ease that these web-based tools provided, the
bandwagon-jumping turned into an explosion. Late in 1999 software developer
Dave Winer introduced Edit This Page,
and Jeff A. Campbell launched Velocinews. All of these services are free, and
all of them are designed to enable individuals to publish their own weblogs
quickly and easily.
The original weblogs were link-driven sites. Each was
a mixture in unique proportions of links, commentary, and personal thoughts
and essays. Weblogs could only be created by people who already knew how to
make a website. A weblog editor had either taught herself to code HTML for
fun, or, after working all day creating commercial websites, spent several
off-work hours every day surfing the web and posting to her site. These were
web enthusiasts.
Many current weblogs follow this original style.
Their editors present links both to little-known corners of the web and to
current news articles they feel are worthy of note. Such links are nearly
always accompanied by the editor's commentary. An editor with some expertise
in a field might demonstrate the accuracy or inaccuracy of a highlighted
article or certain facts therein; provide additional facts he feels are
pertinent to the issue at hand; or simply add an opinion or differing
viewpoint from the one in the piece he has linked. Typically this commentary
is characterized by an irreverent, sometimes sarcastic tone. More skillful
editors manage to convey all of these things in the sentence or two with which
they introduce the link (making them, as Halcyon
pointed out to me, pioneers in the art and craft of microcontent).
Indeed, the format of the typical weblog, providing only a very short space in
which to write an entry, encourages pithiness on the part of the writer;
longer commentary is often given its own space as a separate essay.
These weblogs provide a valuable filtering function
for their readers. The web has been, in effect, pre-surfed for them. Out of
the myriad web pages slung through cyberspace, weblog editors pick out the
most mind-boggling, the most stupid, the most compelling.
But this type of weblog is important for another
reason, I think. In Douglas Rushkoff's Media Virus,
Greg Ruggerio of the Immediast
Underground is quoted as saying, "Media is a corporate
possession...You cannot participate in the media. Bringing that into the
foreground is the first step. The second step is to define the difference
between public and audience. An audience is passive; a public is
participatory. We need a definition of media that is public in its
orientation."
By highlighting articles that may easily be passed
over by the typical web user too busy to do more than scan corporate news
sites, by searching out articles from lesser-known sources, and by providing
additional facts, alternative views, and thoughtful commentary, weblog editors
participate in the dissemination and interpretation of the news that is fed to
us every day. Their sarcasm and fearless commentary remind us to question the
vested interests of our sources of information and the expertise of individual
reporters as they file news stories about subjects they may not fully
understand.
Weblog editors sometimes contextualize an article by
juxtaposing it with an article on a related subject; each article, considered
in the light of the other, may take on additional meaning, or even draw the
reader to conclusions contrary to the implicit aim of each. It would be too
much to call this type of weblog "independent media," but clearly
their editors, engaged in seeking out and evaluating the "facts"
that are presented to us each day, resemble the public that Ruggerio speaks
of. By writing a few lines each day, weblog editors begin to redefine media as
a public, participatory endeavor.
But then personal sites went from being static
collections of bad poetry and award banners to constantly updated snippets of
commentary, photography, sounds, bad poetry, and links. The popularity of this
format grew (for a good primer on where weblogs came from and how they
evolved, try Rebecca Blood's Weblogs:
A History and Perspective), and people started building applications to
simplify the process of maintaining a content-heavy personal site.
These applications have grown in number and
sophistication over the years, and with some major upgrades appearing over the
past few months (Blogger Pro, Movable Type 2.0, Radio UserLand 8.0), I thought
the time was nigh to talk about what they do, why you might care, which one
would best suit your needs, and how they can keep you company on those long,
lonely nights, so empty since you were abandoned for someone who could write
Perl scripts.
Weblogs continue to
grow in popularity, no doubt due in part to their immediacy. Denizens of the
Internet enjoy the opportunity to drop by and catch an up-to-the-minute
account on their favorite blog. However, nothing is more frustrating than
encountering a cobwebbed blog that hasn't been updated in weeks. To remedy
such situations, this site offers a minute-by-minute account of over 50,000
weblogs. It doesn't get fresher than this! For utility's sake, the site offers
a tiny Java applet that sits on your desktop and continually refreshes,
keeping the weblogs whirring. You can also stop by the most popular blogs to
see what kind of content is piquing the interest of others. Whether you're a
neophyte or veteran blogger, you're sure to find an intriguing site or two to
scour.
Some time ago, Glenn
Reynolds hardly qualified as plankton on the punditry food chain. The
41-year-old law professor at the University of Tennessee would pen the
occasional op-ed for the L.A. Times, but his name was unfamiliar to even the
most fanatical news junkie. All that began to change on Aug. 5 of last
year, when Reynolds acquired the software to create a "Weblog," or
"blog." A blog is an easily updated Web site that works as an
online daybook, consisting of links to interesting items on the Web,
spur-of-the-moment observations and real-time reports on whatever captures the
blogger's attention. Reynolds's original goal was to post witty
observations on news events, but after September 11, he began providing links to
fascinating articles and accounts of the crisis, and soon his site, called
InstaPundit, drew thousands of readers--and kept growing. He now gets more
than 70,000 page views a day (he figures this means 23,000 real people).
Working at his two-year-old $400 computer, he posts dozens of items and links a
day, and answers hundreds of e-mails. PR flacks call him to cadge
coverage. And he's living a pundit's dream by being frequently cited--not
just by fellow bloggers, but by media bigfeet. He's blogged his way into
the game.
Some say the game itself
has changed. InstaPundit is a pivotal site in what is known as the
Blogosphere, a burgeoning samizdat of self-starters who attempt to
provide in the aggregate an alternate media universe. The putative
advantage is that this one is run not by editors paid by corporate giants, but
by unbespoken outsiders--impassioned lefties and righties, fine-print-reading
wonks, indignant cranks and salt-of-the-earth eyewitnesses to the
"real" life that the self-absorbed media often miss. Hard-core
bloggers, with a giddy fever not heard of since the Internet bubble popped, are
even predicting that the Blogosphere is on a trajectory to eclipse the
death-star-like dome of Big Media. One blog avatar, Dave Winer (who
probably would be saying this even if he didn't run a company that sold blogging
software), has formally wagered that by 2007, more readers will get news from
blogs than from The New York Times. Taking him up on the bet is Martin
Nisenholtz, head of the Times's digital operations.
My guess is that
Nisenholtz wins. Blogs are a terrific addition to the media universe.
But they pose no threat to the established order.
How does your site rate in terms of popularity among
large numbers of users?
Google Monitor is a simple application that allows you to find
and track the ranking of your Web site or any given URL in Google search
results. It offers two modes of operation: you can enter a URL and a keyword to
find the top results and where your site ranks among them, or select a URL and
find its ranking for several keywords at once. You may store statistics for all
URLs and keywords, and keep notes to further track search trends and the
performance of your Web site. --- http://download.com.com/3000-2181-1015
My colleagues in the language department tell me that language translation
software leaves a lot to be desired. Having said this, however, I provide
a few links below:
As the need for global communication increases,
online translation services are in greater demand. Users are attracted to
the breakneck speed at which online translation is done and the price. Those
that aren't free are still fairly inexpensive.
New languages have been added to the traditional
lists and Arabic, in particular, has been in demand recently. I spent the
past few weeks tinkering with four free online services, translating various
texts from English to Arabic and vice versa to test their speed and
accuracy. I tested Google's Language Tools and services from Applied
Language Solutions, WorldLingo Translations and Systran.
Customers who have been waiting for such services
to be perfected will find improvements are slow in coming. Overall, I found
the Arabic-English translations rife with syntactic and semantic errors --
from the merely too-literal to the laughably bad.
For the purposes of my test, I selected different
texts: conversation, news stories, and legal and scientific documents.
First, I picked an Associated Press story that started with the sentence: "A
wintry storm caked the center of the nation with a thick layer of ice
Monday..."
I got a variety of imprecise translations into
Arabic (which I'm interpreting below).
Applied Language and WorldLingo offered identical
translations, which were slightly better than the other two: "A storm
covered the center's storm from the nation with a thick layer snow Monday."
Systran: "A stormy storm covered the center for the
mother with a thick layer snow Monday."
Language Tools: "The storm grilled bloc in the
middle of the nation with a thick layer of snow Monday."
The translations would have been nearly impossible
to understand were I not fluent in both languages. It's worse in Arabic than
it seems above. Arabic has masculine and feminine nouns, verbs and
adjectives that have to agree in a sentence; otherwise, the sentence makes a
native speaker wince.
Next, I processed some longer news stories. Only
Language Tools didn't set text limits. WorldLingo and Applied Language each
had a 150-word limit. Systran didn't specify a limit, but it rendered only a
short part of the text.
Language Tools came out ahead this time. It was the
only one to translate the word "Taliban" from Arabic to English contextually
correct, as a movement. The other services translated it literally from the
Arabic as "two students."
The services were better at translating everyday
phrases, but even these sometimes came out missing a word, or were
scrambled.
In this category, I again found translations by
Google's Language Tools closest to the original texts. Still, there is much
room for improvement. Google, for example, translated from Arabic to English
the simple question, "Do you speak English?" as "Do they speak English?"
Other services got the pronoun right but botched
other parts of the sentence. With the exception of Google, all three
services, oddly, attempted to write the Arabic word for "English" in the
Roman alphabet (aalaanklyzyh) in the middle of an Arabic sentence.
All the services did a terrible job with metaphors
and other figurative uses of the language, whether Arabic or English.
The weakest performance by all the services was the
translation of legal and scientific texts. Only Language Tools correctly
translated the word "noncompliance" in a legal text, for example. Instead of
using the proper word in Arabic, the other services transliterated it
phonetically into a meaningless word.
All four services have an interface that is easy to
use, with a pull-down menu listing several languages. Each has two text
boxes, one for the original language and the other for the desired
translation. They also translate entire Web sites, but the translation again
tended to be awkwardly verbatim.
Google also has a feature that lets you translate
search results free. (It also offers users an option to send in a better
translation.) The others require you to become a paid subscriber. English
and Arabic results appeared side-by-side.
I also liked WorldLingo and Applied Language's
email-translation feature. After clicking the email button, a window with
two text boxes pops up. You enter your name and email address, and the
recipient's name and address. When you send the message with WorldLingo,
both recipient and sender see the message in both languages. Neither Google
nor Systran has this feature.
Systran has a convenient swap button that lets
users easily flip the source and target languages. This saves time when
going back-and-forth between two languages. The other services have you use
pull-down menus. Systran's interface also allows prompt translation of a
text as soon as it's pasted in a text box, without the need to click a
"translate" button.
Free online translation tools help travelers or
those curious about languages, but I found them unreliable for important
documents. Use with caution.
What happens when an English phrase is
translated (by computer) back and forth between 5 different languages? The
authors of the Systran translation software probably never intended this
application of their program. As of April 2002, translation software is almost
good enough to turn grammatically correct, slang-free text from one language
into grammatically incorrect, barely readable approximations in another. But the
software is not equipped for 10 consecutive translations of the same piece of
text. The resulting half-English, half-foreign, and totally non sequitur
response bears almost no resemblance to the original. Remember the old game of
"Telephone"? Something is lost, and sometimes something is gained. Try
it for yourself!
Set the Google homepage, messages, and buttons to display in your
selected language via our Preferences
page.
Google currently offers the following interface languages:
Lina Douglas
Smart Link Corp.
www.smartlinkcorp.com
www.paralink.com
800-256-4814
fax 949-552-1699
18401 Von Karman Ave. Ste 450
Irvine, California 92612
infoScope is a handheld device equipped with a
digital camera that can take snapshots of text in English, French, German,
Spanish, Italian and Chinese and translate the image to another language in a
matter of seconds. The device displays characteristics of augmented reality,
by presenting the real world in the form of a captured image, such as a
restaurant sign, and merging it with virtual data, by providing a translation
of the image as an overlay to the PDA's screen. infoScope is not intended for
lengthy translations, but more for speedy hits of three or four lines of text.
Abstract.
We describe an information augmentation system (infoScope) and applications
that integrate handheld devices, such as personal digital assistants, with a
color camera to provide enhanced information perception services to users.
InfoScope uses the color camera as an input device, capturing scene images
from the real world, using computer vision techniques to extract information
from them, converting that information into digital text, and augmenting it
back onto its original location, so that the user can see both the real world
and the information together on the handheld device's display. We implemented
two applications. The first is automatic sign/text translation for foreign
travelers, which lets users view texts or signs in their own language when the
originals in the scene are written in a foreign language; the text is
extracted from scene images automatically using computer vision techniques.
The second is Information Augmentation in the City, in which users can see
information associated with buildings, places, or attractions overlaid onto
real scene images on their PDA's display. We would like to present our system
in a five-minute presentation with a possible demonstration.
Maps, Travel Information, and Local Area Searches for Businesses and Places
of Interest
Google Inc. on Thursday launched in beta a
trip-planning service for people who prefer to take public transit rather
than drive. The Google Transit Trip Planner, which is initially available
only for the Portland, Ore., metro area, provides directions for public
transportation from a starting location to a destination. Besides showing a
road map of the route, the service provides transportation schedules and
other information to help plot a step-by-step itinerary. In addition, the
service compares the cost of the trip with the cost of driving.
Use of the service is similar to Google Local, a
mapping and local search service that lets users find businesses and other
locations in a city, and get driving directions. Locations and directions
are shown over a roadmap or an aerial view of the area. The new transit
service offers the same views.
Google Transit, which is currently a Google Labs
product, has not been integrated with Google Local, because the company said
it needed time to develop the product further. Google, based in Mountain
View, Calif., did not have any definite plans for which cities would be
added or when.
Engineers in San Francisco, New York, and Zurich
who use public transportation often started the project, the company said.
Innovative applications of Google maps
Tracking sexual predators in Florida. Guiding travelers
to the cheapest gas nationwide. Pinpointing $1,500 studio apartments for rent in
Manhattan. Geeks, tinkerers and innovators are crashing the Google party, having
discovered how to tinker with the search engine's mapping service to graphically
illustrate vital information that might otherwise be ignored, overlooked or not
perceived as clearly. "It's such a beautiful way to look at what could be a
dense amount of information," said Tara Calishain, editor of Research Buzz and
co-author of "Google Hacks," a book that offers tips on how to get the most out
of the Web's most popular search engine. Yahoo and other sites also offer maps,
but Google's four-month-old mapping service is more easily accessible and
manipulated by outsiders, the tinkerers say. As it turns out, Google charts each
point on its maps by latitude and longitude - that's how Google can produce.
"Google Maps Make Demographics Come Alive," Forbes, June 8, 2005 ---
http://www.forbes.com/technology/ebusiness/feeds/ap/2005/06/08/ap2083551.html
Also see
http://www.technologyreview.com/articles/05/06/ap/ap_060905.asp?trk=nl
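Because each mapped point reduces to a (latitude, longitude) pair, mashup builders can compute things like straight-line distance directly, which is how "cheapest gas near me" style overlays work. A standard haversine sketch follows; the coordinates (roughly Manhattan and Boston) are illustrative.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/long points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# Roughly Manhattan to Boston: about 300 km as the crow flies.
print(round(haversine_km(40.7831, -73.9712, 42.3601, -71.0589)))
```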
Google added historic map overlays to its free interactive online globe of
the world to provide views of how places have changed with time.
"Google Earth maps history," PhysOrg, November 14, 2006 ---
http://physorg.com/news82706337.html
Yahoo was last to the game with its Yahoo! Desktop
Search, putting it squarely behind the eight ball. While Microsoft and Google
have emerged as the current leaders (see What's
Next for Google?), Yahoo has turned its attention to indexing the
world by building out its mobile search technologies.
Yahoo!'s SmartView -- a competitor to America
Online's MapQuest -- launched last March, and enables users to find business
locations, phone numbers, and directions.
Squaring off with AOL won't be an easy task. MapQuest
has done a good job outflanking Microsoft's MapPoint, which recently made
headlines for its odd search queries results.
But Yahoo hopes to hone SmartView, and integrate that
into its Web and desktop search capabilities. Recently, a "Real
Time" traffic report was added, where people can locate traffic jams,
construction sites, speed zones, and accident reports. The service also
includes Yahoo! Maps enhancements which include faster panning and zooming,
larger views, and turn-by-turn maps with driving directions.
. . .
In the competitive field of search services, Yahoo is
gaining ground on its top competitors, according to a survey by market
researcher Keynote. While Google, Yahoo and MSN were the top choices of the
2,000 consumers surveyed, Yahoo maintained the
highest user loyalty, primarily because of its localized search.
All three companies are targeting local search moving
into the future, but Google and Microsoft haven't yet found their groove.
Yahoo's localized search for businesses and other sites of interest in a
localized area is called Yahoo Local --- http://local.yahoo.com/
Keep in mind that Yahoo gives output priority to
companies that pay to be listed near the top of a search outcome.
Check out this site.
I've had a lot of fun with it. Enter an address in the
right-hand search box at the top and then see an aerial photo of that address. The
slider on the left lets you zoom in quite a ways. I just wonder how
up to date these pictures are.
Better, More Accurate Image Search
"By modifying a common type of machine-learning technique, researchers have found
a better way to identify pictures," by Kate Greene, MIT's Technology
Review, April 9, 2007 ---
http://www.technologyreview.com/Infotech/18501/
See more than rooftops:
Free satellite photos at 45-degree angles (Bird's Eye Images) ---
http://local.live.com/
Use the slider to zoom and the arrows to relocate.
In battling Google in local search,
Microsoft is falling back on its familiar strategy: copy and then go
one better. The software giant has released in beta a new online
service that's similar to Google Local, but has some impressive
innovations.
Windows Live Local combines Microsoft's
local search engine and Virtual Earth aerial-imaging service. In
providing the new tool, Microsoft is going a step further than
Google by providing 45-degree aerial views of locations. This
so-called "bird's-eye view" is only available for places in New
York, Los Angeles, San Francisco, Boston, Seattle and Las Vegas, but
more cities are expected to be added over time . . . Besides its
bird's-eye views, Microsoft is offering step-by-step driving
directions using either the angular views or straight-down satellite
views, identification of construction areas along a specific route
and several print options, such as the ability to only print
directions or to include thumbnail pictures of each turn in the
route. Users also can print directions that include their personal
notes.
Antone Gonsalves, InformationWeek Newsletter, December 8,
2005 ---
http://www.techweb.com/article/showArticle.jhtml?sssdmh=dm4.160172&articleId=174904438&pgno=2
Question
What is the Mechanical Turk from Amazon.com?
Aviation adventurer Steve Fossett went missing while
flying over Nevada a week ago Monday. The cops can't find him. The Air Force
can't find him. (They did
spot 6 other previously unknown wrecks, though.)
But maybe, just maybe, a geek sitting at his computer succeeded where the
government failed. Using an Amazon.com service called
Mechanical Turk, web users have been
scouring massive amounts of satellite imagery in an
effort to assist rescue workers. And one of them may have spotted Fossett's
plane, according to
AVweb (registration required).
David Axe, "Geeks Spot Fossett?" Wired News, September 12, 2007 ---
http://blog.wired.com/defense/
Flickr has unveiled a new project, dubbed The Commons, which will give Flickr
members an opportunity to browse and tag photos from Library of Congress
archives. The goal is to create what Flickr likes to call an "organic
information system," in other words, a searchable database of tags that makes
it easier for researchers to find images.
The pilot project features a small sampling of the Library of Congress' some
14 million images. For now you'll find two collections. The first is called
"American Memory: Color photographs from the Great Depression" and features
color photographs of the Farm Security Administration-Office of War
Information Collection, including "scenes of rural and small-town life,
migrant labor, and the effects of the Great Depression."
The second collection is the George Grantham Bain Collection, which features
"photos produced and gathered by George Grantham Bain for his news photo
service, including portraits and worldwide news events, but with special
emphasis on life in New York City." The Bain collection images date from
around 1900-1920.
In effect the Library of Congress has become a Flickr user, complete with its
own stream, and while it's great to see these images available to a much wider
audience, we're not so sure how much it's going to help researchers.
If you're looking for historical photographs, do you want to search through
comments from self-appointed experts criticizing the composition skills of
photography pioneers or adding the ever insightful "wow"?
Then there are the inevitable comments soliciting photos to be added to
whatever banal and increasingly inane groups and pools Flickr members have
come up with.
The tagging aspect will no doubt produce something of value, but, pardon our
cynicism, this may well turn out to be a good test of whether the positive
aspects of the Flickr community outweigh the negative.
Google Deskbar (Beta) makes searching much faster and easier than anything
I’ve ever seen before!
Google Deskbar: Search Without Having to Use Internet Explorer or
Any Other Internet Browser
A great free download from Google ---
http://toolbar.google.com/deskbar/
Google Deskbar enables you to search with Google from
any application without lifting your fingers from the keyboard. Installs
easily in your Windows taskbar.
Key Features: Search using Google, even when your
browser isn't running
Preview search results in a small inset window
that closes automatically
Access Google from any application by typing
Ctrl+Alt+G
Use keyboard shortcuts for multiple Google
searches, i.e., Google News (Ctrl+N), Google Images (Ctrl+I), or Froogle (Ctrl+F)
Jensen Comment The free Beta download is fast and easy.
But there is a confusing part. After it is downloaded, it will say "Wait,
you are not done yet!". Now you have to jump in and do exactly what the
looped instructions tell you to do. You right click on the clock at the
bottom of your screen (hopefully you have a clock showing). Then you have
to click on Toolbars and then Google Search like the looped instructions explain
to you. Then you are done. It's neat!
When you are done, you will get a permanent Google
search box at the bottom of your screen.
You type in or paste in the URL and click the button to the right. After you type in the first few letters of a site that you have visited
before, you will see a pop-up option that allows you to click on the URL without
having to type the rest of it in the search box.
SAN JOSE, California -- Internet search engine Google
has unveiled free software that lets people search the Web quickly --
without launching a Web browser.
Google Deskbar (http://toolbar.google.com/deskbar/),
released Thursday, appears as a search box in the Windows toolbar. After
the search words are entered, a resizable miniviewer pops up with the
results. Users can jump to the site within the miniviewer or launch their
browser.
Unless a program is filling the screen or the user
has set the taskbar to automatically hide, the search box is always visible.
With a keyboard shortcut, the cursor can be moved to it without moving the
mouse.
Though the software is free, Google does get some
exposure on the desktop: The company's logo appears faintly in the search
box when words aren't being typed into it.
Beyond Google's main search, the box can be set to
search Google non-U.S. sites, Google News, Google Images and others. There
are options to find stock quotes, movie reviews, word definitions and
synonyms. Users can add custom sites to search, too.
The software, which is about 400 kilobytes, requires
a personal computer with Windows XP or Windows 2000 software, at least
Internet Explorer 5.5 and an Internet connection. Windows 95, 98 and ME
aren't supported. Google Deskbar doesn't run on Macintosh or Linux
computers.
Written by Patrick Crispen, it starts with a basic
explanation of how Google works, then goes into some of the advanced
features such as Boolean searches, and ends with explanations of a lot of the
special operators. I've quickly looked through it and it seems pretty
comprehensive and suitable as supplementary material for students. I'm not
familiar with Patrick Crispen, but apparently he has done a lot of writing
on Internet topics, and he does seem to have a sense of humor, which is a
plus when dealing with this type of topic.
There are also other free (that magic word again)
downloads on this website.
Charlie Betts
Delaware Technical & Community College
Terry Campus 100 Campus Drive Dover DE 19904 cbetts@college.dtcc.edu
Questions
What is the literal definition of Googol? (the source of the trade name
Google)
Who were the two Stanford University graduate students who invented Google?
How does Google make its profits by providing a free search engine to the world?
Hint: It's not the advertising revenue.
Answers:
“Googol” is the mathematical term for the number one followed by a hundred
zeros.
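Python's arbitrary-precision integers make it easy to check the definition directly:

```python
# A googol is 10 to the 100th power: a 1 followed by 100 zeros.
googol = 10 ** 100
print(len(str(googol)))      # -> 101 digits total
print(str(googol).count("0"))  # -> 100 zeros after the leading 1
```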
The geeks who invented Google
were the following 22-year-old graduate students at Stanford University:
Larry Page was an
all-American type (geek variety) whose dad taught computer science in
Lansing, Mich.
Sergey Brin, with the
dark brooding looks of a chess prodigy, emigrated from Russia at the age of
6: his father was a math professor.
The main source of revenue is from
licensing fees to huge companies like Yahoo and AOL who in turn use Google's
licensed corporate services.
Its performance is the envy of executives and
engineers around the world ... For techno-evangelists, Google is a marvel of Web
brilliance ... For Wall Street, it may be the IPO that changes everything (
again ) ... But Google is also a case study in savvy management -- a company
filled with cutting-edge ideas, rigorous accountability, and relentless
attention to detail ... Here's a search for the growth secrets of one of the
world's most exciting young companies -- a company from which every company can
learn. Keith H. Hammonds ---
http://www.fastcompany.com/magazine/69/google.html
THE DESKTOP ORACLE
OF DELPHI
Internet-search engines have been around for the better
part of a decade, but with the emergence of Google, something profound has
happened. Because of its seemingly uncanny ability to provide curious minds with
the exact information they seek, a dot-com survivor has supercharged the entire
category of search, transforming the masses into data-miners and becoming a
cultural phenomenon in the process. By a winning combination of smart
algorithms, hyperactive Web crawlers and 10,000 silicon-churning computer
servers, Google has become a high-tech version of the Oracle of Delphi,
positioning everyone a mouseclick away from the answers to the most arcane
questions—and delivering simple answers so efficiently that the process becomes
addictive. Google cofounder Sergey Brin puts it succinctly: “I’d like to get to
a state where people think that if you’ve Googled something, you’ve researched
it, and otherwise you haven’t and that’s it.” We’re almost there now. With
virtually no marketing, Google is now the fourth most popular Web site in the
world—and the Nos. 1 and 3 sites (AOL, Yahoo) both license Google technology for
their Web searches. About half of all Web searches in the world are
performed with Google, which has been translated into 86 languages. The big
reason for the success? It works. Not only does Google dramatically speed
the process of finding things in the vast storehouse of the Web, but its power
encourages people to make searches they previously wouldn’t have bothered with.
Getting the skinny from Google is so common that the company name has become a
verb. The usage has even been anointed by an instantly renowned New Yorker
cartoon, where a barfly admits to a friend that “I can’t explain it—it’s just a
funny feeling I’m being Googled.”
. . .
THE GOOGLE MYSTIQUE
When Judge Richard Posner wrote a book recently to identify the world's leading
intellectuals, he used Google hits as a key criterion. When the Chinese
government decided that the Web offered its citizenry an overly intimate view of
the world outside its borders, what better way to pull down the shades than to
block Google? (Within a week the Chinese changed direction; Google was too
useful to withhold.) Companies that do business online have become justifiably
obsessed with Google’s power. “If you drop down on Google, your business can
come to a screeching halt,” says Greg Boser of WebGuerilla, an Internet
consultancy. And if two clashing egos want to see whose Google is bigger, they
need only venture to a Web site like GoogleFight to compare results.
Google was the brainchild of two
Stanford graduate students who refused to accept the conventional wisdom that
Internet searching was either a solved problem or not very interesting. Larry
Page was an all-American type (geek variety) whose dad taught computer science
in Lansing, Mich. Sergey Brin, with the dark brooding looks of a chess prodigy,
emigrated from Russia at the age of 6; his father was a math professor. Brin and
Page, who met as 22-year-old doctoral candidates in computer science in 1995,
began with an academic research project that morphed into an experiment on Web
searching.
Their big idea was something they
called PageRank (named after Larry), which took into account not just the title
or text on a Web site but the other sites linked to it. “Our intention of doing
the ranking properly was that you should get the site you meant to get,” says
Page. Basically, the system exploited the dizzyingly complex linking network of
the Web itself—and the collective intelligence of the millions who surfed the
Web—so that when you searched, you could follow in the pathways of others who
were interested in that same information.
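The linking idea described above can be sketched in a few lines. This is a toy illustration of link-based ranking, not Google's actual implementation; the three-page graph and the 0.85 damping factor are invented for the example.

```python
# A minimal PageRank sketch: a page's score depends on the scores of the
# pages that link to it, iterated until the values settle.

def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:              # dangling page: spread rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
        rank = new_rank
    return rank

graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
# "c" is linked to by both "a" and "b", so it ends up with the highest score
```

The point of the example is the feedback loop Page describes: a page's importance is inherited from the importance of the pages pointing at it.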
. . .
TOO MUCH OF A GOOD THING?
For researchers, of course, Google is a dream
tool. “I can’t imagine writing a nonfiction book without it,” says author
Steven Johnson. Some even wonder if Google might be too much of a good
thing. “I use it myself, every day,” says Joe Janes, assistant professor in
the information school of the University of Washington. “But I worry about
how overreliance on it might affect the skill set of librarians.”
New uses emerge almost as quickly
as the typical 0.3 seconds it takes to get Google results. People find
long-lost relatives, recall old song lyrics and locate parts for old MGs.
College instructors sniffing for plagiarism type in suspiciously
accomplished phrases from the papers of otherwise inarticulate students.
Computer programmers type in error-code numbers to find out which Windows
function crashed their program. Google can even save your life. When Terry
Chilton, of Plattsburgh, N.Y., felt a pressure in his chest one morning, he
Googled heart attacks, and quickly was directed to a detailed list of
symptoms on the American Heart Association site. “I better get my butt to
the hospital,” he told himself, and within hours he was in life-saving
surgery.
Eleven years ago computer scientist
David Gelernter wrote of the emergence of “mirror worlds,” computer-based
reflections of physical reality that can increase our understanding and
mastery of the real world. Google is the ultimate mirror world, reflecting
the aggregate brilliance of the World Wide Web, on which is stored
everything: cookie-bake results, Weblogs, weather reports and the
Constitution. And because Google is now the default means of accessing such
information, the contents of Google’s world matter very much in the real
world.
Best place to start a search --- Google at http://www.google.com/advanced_search
Scroll down the entire page to see the wonderful new features that Google
keeps adding!
Note that Google will search databases as well as Webpages.
Google's WebSearch service searches the Internet to
return the best search results for a specific query. Google's SiteSearch
service searches only the specific university domain(s) that you listed. You
can give your users the option to toggle between these different search
options.
Google Offers Free Search Services to Educational Institutions
Google offers free SiteSearch (enables users to
search your university website) and optional WebSearch (enables users to
search the Internet) to universities and educational organizations
worldwide.
Note: This service is intended for educational
organizations only. Google reserves the right to terminate accounts used for
other purposes.
Business.com: The leading business search engine and directory
designed to help its users find the companies, products, services, and
information they need to make the right business decision --- http://www.business.com/
Browse
our comprehensive business directory . . . .
AllTheWeb
A very fast search engine which aims to index the entire Web. Includes sites
that are very up-to-date and is easy to use. Good for distinctive phrases
or terms. Also has pictures, audio and video.
AltaVista
One of the largest, most comprehensive engines. Includes audio, video and
pictures.
Amazing Picture Machine
An engine to use when pictures are needed.
Ask Jeeves
A human-powered directory which aims to direct you to the exact page that
answers your question. If it can't find an answer within its own database,
it will search other search engines. Queries can be entered in
common-language question form.
CompletePlanet
A directory which searches the "Deep Web" where most other search
tools do not and rates the results according to relevance and popularity.
Dogpile
An excellent metasearch engine which searches several search engines at once.
Excite
A popular portal which searches for web sites containing the search terms
entered as well as for sites with terms closely related to them. Also has
pictures. One of the best places on the Web to find news articles.
FindArticles
A site which has news articles from more than 300 magazines dating back to
1998.
Google
Considered by many to be the best search tool, with comprehensive
searching across the Web that returns high-quality, relevant results.
Search both singular and plural forms of a search term to get the best
results. All words entered are searched automatically. Look in advanced
search for pictures.
Lor
An excellent search engine which is quick, efficient and very user friendly.
A good place to search for free online magazine articles which were printed
within the last 4 years.
Search
Excellent metasearch engine which works by collecting the results of other
search engines. Includes 800 search engines which are searched according
to subject area. A good tool to locate news articles.
Search Online
A metasearch engine which searches eight search engines and has several
search options. Free registration gives full access to all features.
Excellent web resources which are already sorted by subject and grade level.
Teoma
A new search engine which groups sites by topics.
Vivisimo
An excellent metasearch tool which searches 20 search engines and organizes
the results in clusters.
Wisenut
A new search engine which is large and updates web pages often.
Yahoo!
Most popular search tool even though it is quite small. A directory whose
web sites are selected by people who evaluate each site for relevancy. An
excellent place to find news articles. Does not support Boolean searching.
Pictures can be found by typing "Picture Gallery" in the search box and
then clicking on "Browse Thousands of Pictures on Yahoo!
Picture Gallery".
From InternetWeek Newsbreak on June 19, 2002
TODAY'S INTERNET INSIGHT:
Who doesn't love Google? Most people I know use it incessantly. The Google
toolbar has been a godsend. The company's recent geek experiment with a public
SOAP interface was the talk of the Web for a few (slow) days. In today's
Leading Off, we talk with several enterprise users who have been using the new
rack-mounted Google Search Appliance. They have interesting stories to tell.
They like its relatively low cost and ease of use. More intriguing is the fact
that they are finding -- or contemplating -- ways to go under the covers of
their Google boxes and work all sorts of magic on its XML data feeds. The
earlier excitement about the public SOAP interface into Google was tempered by
the fact that no instant killer app emerged (outside of so-called Google boxes
that popped up on now-ubiquitous Weblogs). Look for enterprises to find even
better ways to interact with their Google search results internally, blending
Web services and knowledge management in intriguing new ways.
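The appeal of those XML data feeds is that enterprises can reshape search results however they like. As a hedged sketch, the snippet below parses a results feed with Python's standard library; the schema shown is a simplified, hypothetical stand-in, not the Google Search Appliance's actual XML format.

```python
# Parse a (hypothetical) XML search-results feed into (title, url) pairs.
import xml.etree.ElementTree as ET

SAMPLE_FEED = """
<results>
  <result><title>Intranet HR policy</title><url>http://intranet/hr</url></result>
  <result><title>Expense form</title><url>http://intranet/expenses</url></result>
</results>
"""

def parse_results(xml_text):
    """Return a list of (title, url) tuples from the feed."""
    root = ET.fromstring(xml_text)
    return [(r.findtext("title"), r.findtext("url"))
            for r in root.findall("result")]

for title, url in parse_results(SAMPLE_FEED):
    print(title, "->", url)
```

Once the results are plain data like this, blending them into a portal or knowledge-management system is ordinary application code.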
Google Takes Aim At The Enterprise
By focusing on
what it does best -- search -- Google is winning IT converts with its new
search appliance. The flexibility of open standards and XML data feeds is just
icing on the cake. -- Richard Karpinski http://update.internetweek.com/cgi-bin4/flo?y=eHkA0Bdl6n0V30BeQZ0Ai
Beyond HTML: Security concerns with Google
Now that Google is indexing a wide
range of document types beyond HTML and plain text formats, potential security
concerns are cropping up, both for searchers and webmasters. http://www.newmedia.com/default.asp?articleID=3297
Since Bob, myself,
and a few others on the list appear to be fans of Google searches, I thought
some on the list might appreciate knowing the technology behind Google
searches and rankings.
If you're curious and
have a few idle minutes, check it out at:
Perhaps this
technology might find some application in auditing circles? ... possibly
smoothing the ruffled feathers of congressional investigators and SEC members?
;-)
David Fordham
James Madison University
LinksNet --- http://www.linxnet.com/
What I like about LinksNet are the categories (Directories)
Google, LookSmart, Metacrawler,
Northern Light, Alta Vista, Go.com, and various other search engines are
compared in "So Much Information, So Little Time: Evaluating Web
Resources With Search Engines," by K.A. Killmer and N.B. Koppel, T.H.E.
Journal, August 2000, pp. 21-29 --- http://www.thejournal.com/magazine/vault/A4101.cfm
The above article reveals the search engine preferences of students at
Montclair State University. Interestingly, Google is not mentioned in
the comparison table on Page 28 (Figure 2). Without Google in the
running, AltaVista was the most popular.
Live Person Search Help and Other
Fee-Based Consulting from Google Brokers
Google is operating as a broker between the questioner and researcher, taking
commissions in return for providing an information service much like eBay offers
for items and collectibles. In order to get a question answered, users
must pay a price that they deem sufficient (between $4 and $50), with 75 percent
going to the researcher and 25 percent going to Google. Google also provides
online facilities for researchers to get paid answering questions.
A colleague of mine recently showed me a great new
search engine called Vivisimo ( http://www.vivisimo.com
). I use Google most of the time . . ., but I now use Vivisimo when the first
few Google hits are not fruitful. I find that the classification structure
(similar to file explorer) really helps me to "zero-in" on relevant
links.
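Clustering of the kind Rick describes can be approximated crudely by grouping result titles under a shared significant word. Real engines like Vivisimo use far more sophisticated methods; the stopword list and sample titles below are invented for illustration.

```python
# Toy result clustering: bucket each title under its first non-stopword.
from collections import defaultdict

STOPWORDS = {"the", "a", "of", "and", "for", "in", "to"}

def cluster_by_keyword(titles):
    """Group titles by their first significant word, lowercased."""
    clusters = defaultdict(list)
    for title in titles:
        words = [w.lower() for w in title.split() if w.lower() not in STOPWORDS]
        key = words[0] if words else "misc"
        clusters[key].append(title)
    return dict(clusters)

results = ["Accounting standards", "Accounting software reviews", "Tax forms"]
print(cluster_by_keyword(results))
```

Even this crude grouping shows why a classification structure helps you "zero in": related hits land in one folder instead of being scattered through a flat list.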
Rick
-------------------------
Richard Newmark, Assistant Professor of Accounting, University of Northern
Colorado, Kenneth W. Monfort College of Business, Campus Box 128, Greeley, CO
80639, (970) 351-1213 Office, (801) 858-9335 Fax (free e-mail fax at efax.com)
richard.newmark@PhDuh.com
http://PhDuh.com
The Art of Selecting Good Keywords Keywords are important in determining
which users a search engine sends to your Web site, so care should be taken
with their selection. http://www.newmedia.com/default.asp?articleID=3363
You can now search for digital resources in OAIster!
The novelty of our service is multi-fold:
Our service will reveal digital resources previously "hidden" from users
behind web scripts (how are they hidden?). The OAI harvesting protocol
we're using makes this possible.
There won't be any dead ends. Users will not be retrieving merely
information (metadata) about resources -- they will have access to the
real things. For instance, instead of just the catalog records of a slide
collection of Van Gogh's works, users will be able to view images of the
actual works.
The service will provide one-stop "shopping" for users interested in
useful digital resources.
Digital resources will be easily findable and viewable through our
service. The middleware we use to index these resources makes this
possible.
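The OAI protocol mentioned above works over plain HTTP: a harvester issues requests like ListRecords and reads Dublin Core fields from the XML response. The sketch below shows the shape of that exchange; the repository URL is a placeholder and the response is a trimmed sample rather than a live fetch.

```python
# Build an OAI ListRecords request URL and read Dublin Core titles
# from a (sample) response.
import urllib.parse
import xml.etree.ElementTree as ET

def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Form the harvesting request a client would send to a repository."""
    query = urllib.parse.urlencode(
        {"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"

DC = "{http://purl.org/dc/elements/1.1/}"

SAMPLE_RESPONSE = """
<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Self-Portrait</dc:title>
  <dc:identifier>http://example.org/image/42</dc:identifier>
</record>
"""

def harvest_titles(xml_text):
    """Pull every Dublin Core title out of a harvested record."""
    root = ET.fromstring(xml_text)
    return [e.text for e in root.iter(DC + "title")]

print(list_records_url("http://example.org/oai"))
```

Because every repository answers the same small set of verbs, one harvester can aggregate many collections, which is what makes the one-stop service possible.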
If you're interested in making your
collection available for harvesting, please contact Kat
Hagedorn.
Search engines and directories for education sites
College Is Possible (CIP) is the American Council
on Education's K–16 youth development program that motivates middle and high
school students from underserved communities to seek a college education. As
the umbrella organization for higher education and a presidential
association, the American Council on Education (ACE) is uniquely positioned
to build a bridge between colleges and universities and their local K-12
community with commitment at the executive level.
Resources:
Paying for College
A Brief Look at Student Financial Aid Programs
Basic Facts About College Prices and Student Aid
Financial Aid Glossary
Myths and Realities About Paying for College
Recommended Web sites, Books, and Brochures
Preparing for College
A Guide for Parents: Ten Steps to Prepare Your Child for College
Courses Students Should Take in Middle, Junior, and High School to Prepare
for College
Probably the best way to search for a college, university, or any other
school is to enter the name into the Advanced Search box at http://www.google.com/advanced_search?hl=en
AltaVista Education Search http://www.altavista.com/sites/search/edu
This new page from AltaVista allows users to search for
terms from within the university and college sites in AltaVista. Searches from
this page will cover the more-than-20 million university and college sites
held here. Users can also browse the three categories, Education, Colleges
& Universities, and K-12 Education, though admittedly the links here,
while annotated, are not extensive.
Two months ago, Gregory Waldorf and his mother, Toby,
began a start-up whose time, they hoped, had come.
The company, Destination-U.com,
helps high school students identify the colleges that might be best for them.
For $39.95, students fill out a 10-minute questionnaire that focuses on
matching their personalities with colleges and also considers grades and
extracurricular activities. In seconds, the students receive a list of about
15 four-year colleges they might want to consider.
In devising the questionnaire, the Waldorfs relied on
three things - the expertise of Ms. Waldorf, a longtime independent college
counselor for high school students; interviews with some 18,000 college
juniors and seniors; and the insights of eight college counselors.
Mr. Waldorf, a venture capitalist with an interest in
businesses that provide Web-based matching services, says the company, based
in Menlo Park, Calif., can take advantage of overlapping social trends: the
fluency of teenagers in using the Web and the competitive frenzy among
students to get into the best colleges.
Continued in the article
For a Fee: The Odds of Admission
Among 80 Elite Universities
"Thick Envelope," The New York Times --- http://thickenvelope.com/
Cool Sites (Timelines) as Forwarded by Chris Nolan
-----Original Message-----
From: Richard Wiggins [mailto:rich@richardwiggins.com]
Sent: Friday, December 21, 2001 11:23 AM
To: Multiple recipients of list
Subject: [WEB4LIB] Google timeline and zeitgeist, 2001
Visit Amazon Light at www.kokogiak.com/amazon4,
and you’ll see a plain search box that allows you to locate any product in
Amazon.com’s database. Click on an item, and you’ll be taken to a page with
the usual product image, price information, and customer reviews, and, of
course, the familiar “Buy This” button. Amazon Light’s pages are
deliberately less cluttered than those at Amazon itself, but the family
relationship is obvious.
Look closer, however, and you’ll spot some
distinctly non-Amazonian features. If the item you’re viewing is a DVD, for
example, there will be a button that lets you see in a single click whether
the same disc is for rent at Netflix. If it’s a CD, you can check whether
Apple’s iTunes music store has a downloadable version. And if it’s a book,
Amazon Light will even tell you whether it’s on the shelf at your local
public library.
What’s going on here? Surely, executives at
Seattle-based Amazon would never condone an online service that encourages
people to buy things from sites other than Amazon?
Actually, they would. Amazon Light, created by former
Amazon programmer Alan Taylor and hosted on his personal website, kokogiak.com,
is one of thousands of independent sites incorporating the product data and
programming tools that Amazon has been sharing freely since July 16, 2002.
That’s the day Amazon celebrated its seventh anniversary—and unveiled a
startling new project, called Amazon Web Services, that promises to change,
once again, the way retailers of all stripes think about reaching their
customers.
While companies such as Google and Microsoft are also
experimenting with the idea of letting outsiders tap into their databases and
use their content in unpredictable ways (see “What’s
Next for Google?”), none is proceeding more aggressively than Amazon.
The company has, in essence, outsourced much of its R&D, and a growing
portion of its actual sales, to an army of thousands of software developers,
who apparently enjoy nothing more than finding creative new ways to give Web
surfers access to Amazon merchandise—and earning a few bucks in the process.
The result: a syndicate of mini-Amazons operating at very little cost to
Amazon itself and capturing customers who might otherwise have gone elsewhere.
It’s as if Starbucks were to recruit 50,000 of its most loyal caffeine
addicts to strap urns of coffee to their backs each morning and, for a small
commission, spend the day dispensing the elixir to their officemates.
“Amazon is pouring so many resources into their Web
services that it’s almost frightening,” says Paul Bausch, one of the
inventors of the well-known weblogging tool Blogger and, more recently, the
author of O’Reilly Media’s Amazon Hacks, a collection of tips for tapping
into Amazon’s rich database. “They are extremely aggressive, and that
separates them from Google and from other people who are still just
experimenting with the technology. They really believe that this is where
their business is heading.”
Continued for three more pages in the article
Marketing and Purchasing Searches
ClickZ's Search Engine Watch released its annual list of outstanding Web search
services for 2003. Your favorites are among them, but there were also surprises
and controversial predictions for the coming year --- http://www.clickz.com/experts/search/opt/article.php/3319991
Note that Yahoo is Number One for marketing searches.
As if there
wasn't enough to find on Google, the search engine offers a service that allows
users to peruse the goods from more than 600 catalogs --- http://catalogs.google.com/
Anyone who's ever
marked a page in a catalog to remember to buy that perfect sweater -- only to
lose that sweater among dozens of catalogs with dog-eared pages strewn on the
living room floor -- will appreciate this:
Google's catalog
search combs the pages of more than 600 current catalogs -- 1,500
including back issues -- to help both consumers and corporations find
everything from apple butter to zipper doodles.
"Google's
mission is to organize the world's information and make it universally
accessible and useful," said David Krane, a Google spokesman.
Pages were fed into a
bulk scanner -- similar to a copy machine -- then run through an optical
character recognition process, which extracts the text from each page. Google
then crawls and indexes the text from the pages.
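Once OCR has turned each scanned page into text, indexing it is the familiar inverted-index problem: map each word to the pages it appears on. The sketch below illustrates the idea; the page texts are invented stand-ins for OCR output, not Google's actual pipeline.

```python
# Build a toy inverted index over OCR'd catalog pages.
from collections import defaultdict

def build_index(pages):
    """pages maps a page id to its OCR'd text; returns word -> set of page ids."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word.strip(".,")].add(page_id)
    return index

pages = {1: "apple butter gift box", 2: "zipper doodles and apple pie"}
index = build_index(pages)
# A search for "apple" now returns both pages instantly
```

Crawling the extracted text, as the article puts it, amounts to running this kind of indexing at scale so that a query never has to rescan the page images.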
Both consumer and
corporate catalogs are accessible through the search. Catalogs were initially
collected by Google staff members, but vendors and users may also suggest a
catalog.
"The sheer
comprehensiveness of the service makes it incredibly useful," Krane said.
A beta version of the
site launched last Thursday. There's no link to the catalog search from the
Google homepage, but it's accessible through Google's advanced search page or
catalogs.google.com.
The catalog company's
URL, phone number and catalog code are listed at the top of each page, as well
as the catalog page number. People may also search within a particular catalog
using the Google search.
Several catalog
companies support the new feature.
"Google is just
a great search engine, and we're happy to be there," said Bill Ihle, a
spokesman for gourmet food purveyor Harry
and David. "Any exposure that we can get is good exposure. It's an
additional way to reach the customers."
"I would think
that a few more people will find us," said Michael Beard, managing
partner of Raven Maps. "We do one
big mailing per year, so in that regard that will extend the life of the
catalog."
But will holiday
shoppers find a use for it at this late date?
"There was no
strategic effort to time the beta release of this service for the holiday
shopping season," Krane said. "It's more the inverse of that: The
tail end of the shopping season provides us a good opportunity to secure
feedback."
Daypop (a search site for links to
daily newscasts) --- http://www.daypop.com/
Search 5800 News Sites and Weblogs for Current Events and Breaking News
Many of the top search engines now accept payment for improved listings or
fast appraisal of your site for inclusion in their directories. One expert
believes the payments are commercially worthwhile. Here he looks at the options
for six top search engines. http://www.newmedia.com/default.asp?articleID=2527
Google.com is one of the largest and
most used engines. It is fast, accurate, and one of the most visited Web sites
today. Literally thousands of searches are conducted on Google every day for
keywords related to your site. You can now advertise on Google very
affordably using their AdWords program. Your AdWords text ads appear on search
result pages for the keywords you buy, and can be targeted by language and
country. So, to reach collectors of tin toys you might buy the keywords
"toy collector," "tin toys," etc.
Pricing for AdWords is based on the
position in which they're shown. Google positions your ad based on how many
users click on it over time. Current rates are $15, $12, $10 (per thousand ads
shown) for positions 1, 2, and 3 respectively, and $8 per thousand for
positions 4 through 8. Accounts are opened with a credit card and no minimum
deposit is required.
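The CPM rates quoted above reduce to simple arithmetic. This sketch just encodes those published figures ($15, $12 and $10 per thousand impressions for positions 1-3, $8 per thousand for positions 4 through 8); the function name is my own.

```python
# Cost of showing an AdWords text ad, using the per-thousand rates quoted.

RATES_PER_THOUSAND = {1: 15.0, 2: 12.0, 3: 10.0}

def adwords_cost(position, impressions):
    """Dollar cost for the given ad position and number of impressions."""
    if not 1 <= position <= 8:
        raise ValueError("ads were shown in positions 1 through 8")
    rate = RATES_PER_THOUSAND.get(position, 8.0)  # positions 4-8 pay $8 CPM
    return rate * impressions / 1000.0

# 5,000 impressions in the top spot cost 5 * $15 = $75
print(adwords_cost(1, 5000))
```

This also explains the article's later remark that after owing $50 an ad will have run roughly 5,000 times: at the mid-tier rates, $50 buys about five thousand impressions.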
The trick is to choose very targeted
keywords that will trigger your ad (yes, you can do that), which means that
only very targeted buyers will ever see your ad and your conversion ratio
will be incredibly high. You can set how much you wish to spend. Google takes
the money from your credit card after you owe $50, by which time your ad will
have been displayed roughly 5,000 times. If your keywords are highly targeted,
many of the people who see your ad will become buyers and you will get your
return before you even pay Google!
Because there are thousands of searches
a day, Google alone can be one of your biggest sales drivers with its great
AdWords program. For more details, see the Google AdWords pages - they have
plenty of tips. Your listing shows up in about an hour. Remember, it is more
effective to target about 20 keywords, specific and related to your site, than
it is to use just one.
Goto.com is another powerful
pay-per-listing search engine. The trick is to pay for the top position, or at
least a top three position. Depending on your product, how much gross profit
you make out of it and what your conversion ratio is, you may be able to
profit from the top spot.
Why is the top spot so important?
Goto.com is actually a relatively small search engine compared to the others.
Its power does not come from people searching the Goto.com site itself. It
comes from the Goto.com partners. Their top search results reach 75% of all
Internet users through their affiliate partner network, which includes America
Online, Microsoft Internet Explorer, EarthLink, Lycos, and AltaVista. But
these partner sites only show Goto.com's top one to three results for any
search. So look for the top spot if your gross margin can allow it.
The good thing about Goto.com is that
they only charge you for a click-through, so you only pay when someone clicks
on your link. Top spots can cost you anything from $0.01 to over $4
depending on the keyword. Your listing shows up in about 3 days. Again, it is
more effective to have about 20 keywords than just one.
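The article's point about gross profit and conversion ratio can be put as arithmetic: a per-click bid pays off only while it stays below profit-per-sale times the conversion rate. The figures below are illustrative, not from the article.

```python
# Break-even analysis for a pay-per-click listing like Goto.com's.

def break_even_bid(gross_profit_per_sale, conversion_rate):
    """Highest per-click price at which clicks still break even."""
    return gross_profit_per_sale * conversion_rate

# $40 gross profit and a 2% conversion rate support bids up to $0.80
print(break_even_bid(40.0, 0.02))
```

So whether "your gross margin can allow" the top spot is a one-line calculation: if the top bid exceeds this break-even figure, the position loses money on every click.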
Although strictly speaking it is a
directory rather than an engine, DMOZ.org, or the open directory project,
powers the search results of several of the top search engines. It is free to
get listed and takes about 3 weeks to get indexed once you submit, then a
couple of months for your listing to start showing up on the engines that use
DMOZ. The most important thing to try to do is to have a domain name that is
high on the alphabetic order (starts with a number or an 'a') and also include
your primary keyword phrase - the one most people use to find your site - in
your Web site name (title) and its description.
Yahoo has immense reach. Without doubt,
you must be in Yahoo. It can bring you up to 50% of your traffic or more!
Fortunately, you can now get listed in Yahoo in seven days for a cost of just
$199 - often worthwhile. You should get back your investment in a matter of
days. The most important thing is to have a domain name that is high on the
alphabetic order (starts with a number or an 'a') and also to include your
primary keyword phrase - the one most people use to find your site - in your
Web site name (title) and its description.
Once you get listed, you should also
sign up to have your site become a sponsored site within Yahoo. It costs $25
to $300 or more a month at the time of writing, depending on the category.
Sponsored sites appear in a separate, clearly demarcated listing box, located
on appropriate category pages in the Directory at the top - which means more
traffic.
LookSmart may not be used much
directly, but its listings reach over 83% of the Internet through its partner
network. Its listings actually reach a much wider audience than Yahoo!.
LookSmart currently provides its search solutions to leading Internet portals,
370 ISPs and 600,000 Web sites including the Microsoft Network, AltaVista,
Excite@Home, iWon, Time Warner, Sony, British Telecom, US West, AltaVista,
Netscape Netcenter and NetZero. Now that is power! Again, without doubt, you
must be in LookSmart, and being in it can bring up to 50% of your traffic or
more.
Fortunately, you can now be listed in
LookSmart in 2 days for a cost of $199 - very worthwhile for what LookSmart
will give you. Its partner sites will pick your listing up shortly after you
are listed, usually within a few days or weeks. Again it's best to have a
domain name that is high on the alphabetic order (starts with a number or an
'a') and also to include your primary keyword phrase in your Web site name and
its description.
DirectHit/AskJeeves also has a paid
text ads system similar to Google's. Your link appears alongside their search
results for every search topic you sponsor, right where Ask Jeeves users are
looking for the best link to follow. Your link also appears alongside search
results on Web sites that participate in the Jeeves Text Sponsorship Network,
including MSN, Searchalot, Bomis.com, SuperCyberSearch, and Direct Hit. Your
ads appear in a few days and you just need a minimum deposit of $25 to start.
Paid search engines are faring well during our soft online ad market because
they are among the few that have successfully aligned the needs of both
consumers and marketers, according to new research. http://www.newmedia.com/default.asp?articleID=2893
The Taxonomy Warehouse is a
fantastic search engine in terms of helpful categories (including companies) ---
http://www.taxonomywarehouse.com/
• Image Search
• Web Page Translation
• PhoneBook
• PDF Files
• Stock Quotes
• Cached Links
• Similar Pages
• Who links to you?
• Site Search
• I'm Feeling Lucky
• Dictionary Definitions
For example, the search term "accounting" led me (using http://images.google.com/
) to exciting images
of accountants, spreadsheets, etc.
The phrase "Bob Jensen" led me to a picture of me (along with 25
images that were not about me). One of the images not about me was a page
from the Cedar Falls 1959 High School Yearbook. This suggests how far
Google has gone to put images in the image database.
Try the term "Rembrandt" and see what pops up.
The term "porn" leads to the Seal of the President of the United
States --- which in context has a link to pledge to protect children from
pornography. There are also some soft porn images even when Google's
Mature Content Filter is turned on (the default setting is to turn on this
filter). Google warns that the filter will not necessarily block all
pornographic images.
What I am impressed with is how much faster it is to search hits in image
form rather than text hits. A picture is worth 1,000 words.
Ditto's visual search
is easy, fun and fast. Through images, Ditto has always provided you an
excellent way to visually navigate the richness of the Web. Now we would like
to announce a fabulous way to visually find products and services.
It's quite simple. At
the top of selected search results pages you will see Featured Products and
Services listings from an extraordinary variety of merchants. We think this is
a great new way to shop the web. Go ahead, search and give it a try. On Ditto
you see what you want to buy!
Our Company -- The
Ditto site is a TLS Technologies property. We are the world’s leading visual
search engine. Ditto enables people to navigate the Web through pictures. It
is far easier for people to find what they are looking for when they are
looking at it! Pictures are fun, fast and intuitive.
The premise behind
our company is twofold: deliver highly relevant thumbnail images AND the
highly relevant web sites underlying these images. In accomplishing this task
we have compiled the largest searchable index of visual content on the
internet via proprietary processes. These are accessible on our standalone web
site or via the web sites of our Visual Search Partners.
Our Market -- The
internet is increasingly becoming a visual medium as millions of images are
being posted by businesses, schools, organizations and individuals. Visuals
provide a unique and well-understood method of accessing information around
the world. In addition to providing more relevant information for many
searches, visual searching is often more intuitive, interesting and enjoyable
to work with than traditional text-based search engines. Therefore, as more
information is displayed visually, there is a greater need to enhance
traditional text-based searches with visually enhanced search. That is where
we come in. Market research reveals that experienced Internet users, when
exposed to visual search, have an extremely positive response and would refer
the site to their friends at a rate substantially above average.
Our Business -- Our
web site is a hugely popular destination for visual searching among all age
groups, offering advertisers the opportunity to reach their audiences with
targeted keyword or category placement. Additionally, our search technology is
offered commercially to other sites through licensing partnerships in what we
term our "Visual Search Partnerships".
Our Technology -- We
believe relevancy is the key difference in our technology, and the following 3
elements are equally important in the overall makeup of that technology:
Indexing: We identify
websites containing media via our automated crawler. We then select, rank,
weight, filter and rate pictures, illustrations, clipart, photographs,
drawings and other image-related material. Next, we index the images from
these websites. Finally, these results are ranked and displayed in order of
relevance.
Relevance: In order
to achieve an exceptional level of relevance to a user’s search, we have
developed a proprietary filtering process that combines sophisticated
automated filtering with human editors.
Verification:
Highly relevant results require ongoing maintenance. We continually review
current images and links to ensure that our database contains accurate and
up-to-date results.
"Upstream: Video Searching," By David Voss, Technology Review,
July/August 2001
With text documents, you can type in a query, and a
piece of software finds the matching text strings. Searching video is much
tougher. Unless someone has gone back and somehow marked the video data, it's
nearly impossible to find a specific image. A content provider like CNN
has more than a hundred thousand hours of tape in its video archive—far too
much for any human to view and annotate manually. Now a small but growing
number of labs are searching for novel ways to better navigate the video glut.
These are still early days for video indexing and
retrieval. A few existing Web search engines like AltaVista can find some
video clips, but they only return those that are on Web pages with text that
can be searched by keywords. Likewise, San Mateo, CA-based Virage has
developed a search engine for ABCNews.com that allows the transcript of a
broadcast to be searched; the search, however, is also by keywords, and the
video is played from the point at which the specified word occurred. None of
these systems provides direct image searches—in other words, a video answer
to the command "give me all the clips of an astronaut outside the space
station Mir."
Video-search and database tools that directly find
images can be far more powerful than keyword searches. At Columbia University,
a team led by Shih-Fu Chang is developing software that can search a video for
particular features in the images—such as shape, color and motion. For
example, you could select a static image from a catalogue and have the
software find close matches in the video frames. Or you could make a simple
sketch of a blob, with a few arrows to show how it moves, and the system finds
video segments that match these features. For instance, you could roughly
sketch the shape of the Mir space station and a human figure moving outside
it.
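One simple instance of feature-based matching is comparing coarse color histograms between a query image and candidate frames. This is a minimal sketch of the general idea, far simpler than Chang's system; the frame representation and bin count are assumptions.

```python
# Compare a query image to video frames by coarse color histograms.
# Frames are lists of (r, g, b) pixels; each channel is bucketed into
# 4 bins, giving a 12-number signature per image. Smaller L1 distance
# between normalized histograms means more similar color content.

def histogram(pixels, bins=4):
    hist = [0] * (3 * bins)
    for r, g, b in pixels:
        for i, ch in enumerate((r, g, b)):
            hist[i * bins + min(ch * bins // 256, bins - 1)] += 1
    n = float(len(pixels))
    return [h / n for h in hist]

def l1_distance(h1, h2):
    return sum(abs(a - b) for a, b in zip(h1, h2))

def closest_frames(query_pixels, frames, k=1):
    """Return indices of the k frames whose color content best matches."""
    q = histogram(query_pixels)
    dists = sorted((l1_distance(q, histogram(f)), i) for i, f in enumerate(frames))
    return [i for _, i in dists[:k]]

red = [(250, 10, 10)] * 100    # an all-red query image
blue = [(10, 10, 250)] * 100   # an all-blue frame
print(closest_frames(red, [blue, red]))  # [1]: the red frame matches
```

Shape and motion features would be extra signature components scored the same way.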
This kind of direct image query could be especially
useful for large databases of video records. Chang's group has been
researching ways to extract information from medical-exam videos. Every year
at Columbia-Presbyterian Medical Center, "ten thousand echocardiograms
[ultrasound movies of the heart] are performed," he explains. "Each
is about a half-hour long, and they get put into a tape library." A
cardiologist then has to look up the ultrasound to make a diagnosis, wasting a
lot of time fast-forwarding and rewinding through the tape. Much better would
be an automatic way of detecting signs of heart ailment in the video stream.
Chang's software first parses the ultrasound video into segments by looking
for sharp changes in image content—when the view on the ultrasound display
is switched to another angle, for instance. Each segment is then processed by
a "view recognizer" that matches the images to known images of
abnormal events and flags any suspected heart conditions.
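The first stage described, parsing the video into segments wherever image content changes sharply, is classic shot-boundary detection. A minimal sketch, assuming each frame has already been reduced to a single intensity value; the threshold is an invented parameter.

```python
# Split a video into segments by detecting abrupt content changes.
# Each frame is reduced to a signature (here just its average gray
# level); a new segment begins wherever consecutive signatures differ
# by more than a threshold. Real systems use much richer features.

def segment_video(frames, threshold=50):
    """frames: list of average-intensity values, one per frame.
    Returns a list of (start, end) index pairs, end exclusive."""
    if not frames:
        return []
    segments, start = [], 0
    for i in range(1, len(frames)):
        if abs(frames[i] - frames[i - 1]) > threshold:
            segments.append((start, i))   # cut before frame i
            start = i
    segments.append((start, len(frames)))
    return segments

# Two stable shots with a hard cut between frames 2 and 3:
print(segment_video([20, 22, 21, 200, 198, 199]))  # [(0, 3), (3, 6)]
```

Each resulting segment would then be handed to the "view recognizer" the article describes.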
At Carnegie Mellon University, researchers are
creating a digital library that combines natural-language processing, speech
recognition and image analysis. "The integration of these different
technologies is the key," says Howard D. Wactlar, director of the
Informedia Digital Video Library Project at Carnegie Mellon. A prototype
captures news broadcasts from around the world and stores them, along with
summaries or storyboards. Someone can then type in a question, or just utter
the question aloud: "Tell me about oxygen problems on the Russian space
station Mir." All the relevant news clips are displayed as frame icons
you can click on. The system is also incorporating face recognition to make it
possible to call up all the clips of a particular person (see graphic above).
It will be some time before direct video searches
become routine. But if today's research pays off, finding a video needle in
the immense multimedia haystack will be no more difficult than typing in a few
words—or maybe sketching out a simple image.
Starpond was founded by Terry Bowers and Tom Folkes,
a brother-sister team who pooled their talents to bring a functional product
to market in under a year. The company's flagship Collaborative Use Research
Engine (CURE) is the brainchild of Folkes, an independent computer consultant
who had been knocking the idea around in his head for nine years.
CURE operates within a preestablished field of data,
professional journals in the student's case, that is customized to fit the
needs of a group. When a topic is searched, the system produces a list of
results and ranks them by how frequently each is used by others in the group.
So, the student would see which journal articles are used most often by other
students and professors researching immigration cases.
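The ranking rule described, ordering matches by how often others in the group have used them, can be expressed directly. CURE itself is proprietary; the data shapes below are invented for illustration.

```python
# Rank topic-matched documents by how often the user's group has used
# them. usage_log maps document id -> number of times group members
# opened it; documents the group has never used sort last.

def rank_by_group_usage(matches, usage_log):
    """matches: document ids returned by a topic search.
    Returns the ids ordered by descending group usage count."""
    return sorted(matches, key=lambda doc: usage_log.get(doc, 0), reverse=True)

usage_log = {"jrnl-14": 57, "jrnl-02": 3, "jrnl-31": 12}
matches = ["jrnl-02", "jrnl-31", "jrnl-14", "jrnl-99"]
print(rank_by_group_usage(matches, usage_log))
# ['jrnl-14', 'jrnl-31', 'jrnl-02', 'jrnl-99']
```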
CURE costs $400 to $5,000 a month to lease, depending
on the number of users, range of data sources and amount of customer support
the client requires.
Folkes and Bowers believe the system surpasses most
research and analysis tools now in use and that it eventually will be used in
settings ranging from government think tanks to junior high science classes.
"This so accelerates research and knowledge
sharing," said Bowers, the president and chief executive. "We've
already seen germs of its ability to affect public policy and foster
growth."
"Starpond and WebSurveyor Team to Deliver Next Generation Research
Capabilities to WebSurveyor Customers," Web Surveyor, May 8, 2001
--- http://www.websurveyor.com/about_news.asp#SP
Chevy Chase, MD and Herndon, VA. Starpond, Inc, a
leading developer of next generation search technologies and applications, and
WebSurveyor, the leading provider of online customer research software and
services, today announced a strategic partnership to provide WebSurveyor
customers with access to Starpond's newest research tools, including the soon
to be released Collaborative Use Research Engine (CURE)™.
WebSurveyor's online services provide business
consumers innovative and effective survey technology to acquire actionable
knowledge from customers, prospects, employees, partners, students,
constituents, and web site visitors. Through its partnership with Starpond,
WebSurveyor will add the search capabilities of the CURE™, giving business
consumers even more versatility in conducting customer research.
"The CURE™ will help WebSurveyor clients in
two significant ways," said WebSurveyor CEO Bruce Mancinelli.
"First, it will speed up analyzing responses from open-ended questions,
which, depending on the amount of data, may be very time-consuming. Second, by
using the technology to discover relationships within and between text
responses, WebSurveyor clients will find open-ended responses an even more
valuable source of business intelligence. Besides creating added-value to our
survey applications, there are endless opportunities for StarPond and
WebSurveyor to work together in developing additional unique solutions and
services."
Starpond's first product, the CURE combines an
advanced search application with high-level knowledge management
functionality. This combination provides the ability to create real time
parabolic knowledge available from tacit and explicit sources as well as
aggregate data that have been mined from the web.
To enable organizations to gain competitive advantage
one needs to have access to relevant comprehensive information and analysis
tools to make better decisions, faster.
- Provide the ability to extract, cleanse, and aggregate data from multiple
operational systems into a separate data mart or warehouse.
- Store data in a repository to enable rapid delivery of summarized
information and drill-down to detail.
- Deliver personalized, relevant informational views, querying, reporting
and analysis capabilities that go beyond the standard reporting capabilities
of transaction-based systems, a requirement for gaining better business
understanding and making better decisions, faster.
BoardReader.com was developed to address the
shortcomings of current search engine technology to accurately find and display
information contained on the Web’s forums and message boards. Founded in May
2000 by engineers and students from the University of Michigan, Boardreader.com
uses proprietary software that allows users to search the forums and message
boards in a particular topic area, thus allowing users to share information in a
truly global sense...
From UC Berkeley:
Free Digital Library of Books, Audio, and Films (thank you, Richard Campbell,
for leading me to this great site)
WayBack Machine --- http://www.archive.org/
The Internet Archive,
working with Alexa Internet, has created
the Wayback Machine. The Wayback Machine
makes it possible to surf more than 10 billion pages stored in the Internet
Archive's web archive. The Wayback Machine was unveiled on October 24th, 2001
at U.C. Berkeley's Bancroft
Library. Visit the Wayback Machine by entering a URL or clicking on one of
its specific collections.
The Internet Archive is
collaborating with various collectors, community members, and film-makers to
provide easy access to a rich and fascinating core collection of archival
films.
www.legal-definitions.com CPAs who need help deciphering “lawyerspeak” can find
concise definitions of legal terminology at this e-stop as well as the meaning
of general business terms such as bankruptcy.
www.commerce-database.com Need to know the difference between an act of God and
an act of nature? The legal terms section of this online business
dictionary defines them as one and the same. The Commerce Database categorizes
words into separate business and legal dictionaries: The business one offers
categories such as accounting.
www.computer-acronyms.com This Web site offers visitors short definitions of technical
terminology such as cable modem. Users can also find
brief explanations of acronyms for high-speed Internet concepts such as
DSL—digital subscriber line.
www.legal-database.com CPAs interested in legal topics such as bankruptcy, civil
rights, employment, labor and tax laws can find various terms explained in the
articles section for each category at this Web stop. In addition visitors can
register for free monthly newsletters on bankruptcy, employment, family and
tax law.
Internet Public Library
--- http://www.ipl.org
The Internet Public Library, or IPL, was one of the first public libraries "of and for the Internet
community." Some available collections are general reference,
associations, literary criticism, newspapers, youth, and teens.
Library of Congress Online
Catalog --- http://catalog.loc.gov
The catalog's records represent the holdings of the library, including books, computer
files, manuscripts, cartographic materials, music, sound recordings, and
visual materials; it also includes searching aids for users.
Search for Terms on Book Pages
The Absolutely Fantastic New Search Tool From Amazon
Amazon’s ability to search through millions of
book pages to unearth any tidbit is part of a search revolution that will change
us all. Steven Levy, MSNBC, November 10, 2003 --- http://www.msnbc.com/news/987697.asp?0dm=s118k
Hints from Bob Jensen
Be sure you note
the Previous Page and the Next Page options when you bring up a page of
text.
Note the option
at the top to "See all references" to your search term within a
given book (this is a wonderful search utility).
When you hit the
end of the allowed pages of reading, you might be able to find a phrase on
that last page that you can enter as a new search term. I've done this and
have been able to bring up another five pages, and so on. This is a
cumbersome way to read large portions of a book, however, and
soon Amazon puts up a message that you have reached the limit of your
searches on the book and denies you further searches. This software
is amazingly sophisticated.
The pages are
scanned pages and will sometimes show images as well as text in the original
colors. For example, search for "gnp graph" and note the
second hit to The Third World Atlas by Alan Thomas.
How It Works ---
http://snurl.com/BookSearch
A significant extension of our groundbreaking Look Inside the Book
feature, Search Inside the Book allows you to search millions of pages
to find exactly the book you want to buy. Now instead of just displaying
books whose title, author, or publisher-provided keywords that match
your search terms, your search results will surface titles based on
every word inside the book. Using Search Inside the Book is as simple as
running an Amazon.com search.
Note from Jensen
Be sure you note the Previous Page and the Next Page options when you bring up a
page of text.
Amazon.com Inc. said
a new program that allows customers to search the contents of some books has
boosted sales growth by 9% for titles in the program above other titles that
can't be searched.
The news from the
Seattle-based Internet retailer suggests that concerns among some book
publishers that the search service might hurt sales haven't materialized.
Amazon last Thursday introduced the service, called Search Inside the Book,
which gave its customers a way to scour complete copies of 120,000 books from
190 publishers, a major advance over the searches customers were previously
limited to, such as searches by title and author name.
Some book publishers
have stayed out of the new Amazon search service because of concerns that
users can easily scan Amazon's electronic copies instead of buying the books.
In the days since the service launched, though, Amazon has monitored sales of
120,000 book titles that can be searched through its new service and says
growth in sales of those books significantly outpaced the growth of all other
titles on the site. Amazon said 37 additional publishers have contacted the
company since the search service launched asking to have their books included
in the program.
"It's helping
people find things they couldn't otherwise find," Steve Kessel, vice
president of Amazon's North American books, music and video group, said in an
interview. "There are people who love authors and who are finding things,
not just by the author, but about the author."
Although its
customers can search entire books with the new service, Amazon has
restrictions that limit the ability to browse entire books online. Once a user
clicks to a book page containing terms that they've searched for -- "Gulf
War," for instance -- Amazon doesn't let them browse more than two pages
forward or back. Users may jump to other pages containing the terms, but the
same restrictions on browsing apply.
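The browsing rule just described, hit pages plus at most two pages forward or back from any hit, amounts to a simple predicate. The window size comes from the article; everything else in this sketch is an assumption about how such a restriction might be checked.

```python
# Decide whether a reader may view a given page: a page is viewable
# only if it lies within two pages of some page that matched the search.

BROWSE_WINDOW = 2   # pages forward or back from a hit, per the article

def viewable(page, hit_pages, window=BROWSE_WINDOW):
    return any(abs(page - hit) <= window for hit in hit_pages)

hits = [40, 210]           # pages containing the searched term
print(viewable(42, hits))  # True: within 2 pages of hit 40
print(viewable(44, hits))  # False: outside every hit's window
print(viewable(209, hits)) # True: jumping near another hit is allowed
```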
Search technology is
becoming an increasingly important focus for Amazon and for online shopping in
general. The company recently established a new division in Silicon Valley,
called A9, which is developing searching technology for finding products to
purchase on the Internet. The project is getting underway at a
time when more shoppers are using search engines like Google and comparison
shopping sites like BizRate.com to locate products.
Amazon has a head
start on another big Internet company in the book search department. Google
Inc. is also talking to publishers about allowing searches of the contents of
books, according to people familiar with the matter. A Google spokesman
declined to comment.
Open Content Sites Allow You to Add and Edit Content
and Share
Aaron Konstam mentioned the following open source (search, encyclopedia,
history, culture, science) link --- http://www.wikipedia.org/
Wikipedia
is a multilingual
project to create a complete and accurate open
content encyclopedia. We started on 15 January 2001 and are already working
on 99391 articles
in the English version. Visit the help
page and experiment in the sandbox
to learn how you can edit any article right
now
eContent
Distribution – We give publishers the ability to tap the
Internet to increase sales and distribution.
ebrarian™
– Our ebrarian solution helps online community aggregators retain
customers, create eCommerce opportunities and build brands.
ebrarian
Pro – Fast and accurate, ebrarian Pro helps libraries and
information professionals make the business of performing research easy
and cost effective.
ebrarian
A+ – For eLearning properties, ebrarian A+ makes word-level
content interaction a reality, generating new comprehension and commerce
opportunities.
If you haven't been to LibrarySpot lately, you should
stop by for a visit. We've added many new resources. Also, you might be
interested in several of our other "spots" - GovSpot.com ( http://www.govspot.com
) simplifies the search for government resources online and our new
"spot" HeadlineSpot.com ( http://www.headlinespot.com
) is a one stop shop for thousands of the best U.S. and international news
sources online.
ebrary and Learning Network recently launched the
first public beta of ebrarian, ebrary's new e-content solution for partners
and customers. The co-branded site is available now at http://learningnetwork.ebrary.com
. Using a model similar to photocopying, Learning Network visitors pay only to
print or copy the information they need. Searching and browsing are free, and
there are no membership or subscription fees. Learning Network has also
customized ebrary InfoTools to link users to relevant information on the
Infoplease site. With InfoTools, any word or phrase a user selects can link to
research materials such as articles, definitions, biographical information,
and statistics.
Questia 2.0 Nearly
Doubles the Size of Its Collection
Recently,
Questia—provider of an online library complete with search and writing
tools--launched its version 2.0. Version 2.0 includes a collection of more
than 60,000 full-text titles—nearly double the size of its version 1.0
collection launched in January 2001. Version 2.0 also improves Questia's tools,
which enable users to personalize books by electronically highlighting and
making notes in them and to write better papers by automatically creating
footnotes and bibliographies in various formats. New features include new
tools for subscribers, including an automatic view of the most recently used
books, a personal bookshelf for storing and retrieving favorite books, and a
customizable home page; re-organization of tools and functions around the
three main areas of search, read, and work to improve the site's usability;
and faster search and navigation between books and within books. The Questia
service is also useful, both as a source for teaching materials and as an
effective anti-plagiarism tool. Using the search function to look for a
phrase, professors can check a student's paper for material copied but not
cited. For more information, visit http://www.questia.com
.
As of November 2002, Questia
claims to be "The World's Largest Online Library."
Thank you Paula. The links at refdesk.com look great. It even has
an "Ask Bob" form to provide expert help in finding reference
material. This is a great alternative to LibrarySpot.
Bob,
You may already know about this, but I couldn't find
it on your website of search helpers.
In addition to the website, you can subscribe to
"Today's Reference Pick of the Day" which is always interesting (see
example below).
Paula
Paula Kelley Ward Development Information Systems
Manager Trinity University, San Antonio Texas Phone 210.999.7432 FAX
210.999.7433 Benefactor 5.0, Colleague 16.0 PWard@Trinity.edu
I would appreciate it
if you would consider ABYZ News Links for inclusion as a link on your website
on the page http://www.trinity.edu/rjensen/bookbob3.htm
under the heading "Newspapers/Magazines".
ABYZ News Links at http://www.abyznewslinks.com/
contains links to more than 17,000 newspapers and other news sources from
around the world.
Some of the notable
features of ABYZ News Links follow: -It is large, well organized, detailed,
and accurate. -The emphasis is on strong content. -It is particularly strong
in non-English news media. -Simple design aids quick downloading of the site.
-Links are checked regularly with a link checking program resulting in a much
lower than average link failure rate. -The site is worked upon and updated
daily.
I believe ABYZ News
Links would be a very useful resource for your website visitors. Thank you for
considering it for inclusion on your site.
I was intrigued by your letter to Professor Robert
Jensen (published in http://www.trinity.edu/rjensen/book01q3.htm
concerning ABYZ News Links. I thought it might be a link I could use in my
work. Before I checked the links, however, I tried the search engine.
Unfortunately, it had less than impressive results. When I asked for
"George Bush," the following were representative results. Although I
will bookmark your site for the news links, I was disappointed that the search
engine was less than useful.
Researchers at MIT
have released a video and audio search tool that solves one of the most
challenging problems in the field: how to break up a lengthy academic
lecture into manageable chunks, pinpoint the location of keywords, and
direct the user to them. Announced last month, the MIT
Lecture Browser website gives the general public
detailed access to more than 200 lectures publicly available through the
university's OpenCourseWare initiative. The search engine
leverages decades' worth of speech-recognition research at MIT and other
institutions to convert audio into text and make it searchable.
The Lecture Browser arrives at a time when more and
more universities, including Carnegie Mellon University and the University
of California, Berkeley, are posting videos and podcasts of lectures online.
While this content is useful, locating specific information within lectures
can be difficult, frustrating students who are accustomed to finding what
they need in less than a second with Google.
"This is a growing issue for universities around
the country as it becomes easier to record classroom lectures," says Jim
Glass, research scientist at MIT. "It's a real challenge to know how to
disseminate them and make it easier for students to get access to parts of
the lecture they might be interested in. It's like finding a needle in a
haystack."
The fundamental elements of the Lecture Browser
have been kicking around research labs at MIT and places such as BBN
Technologies in Boston, Carnegie Mellon, SRI International in Palo Alto, CA,
and the University of Southern California for more than 30 years. Their
efforts have produced software that's finally good enough to find its way to
the average person, says Premkumar Natarajan, scientist at BBN. "There's
about three decades of work where many fundamental problems were addressed,"
he says. "The technology is mature enough now that there's a growing sense
in the community that it's time [to test applications in the real world].
We've done all we can in the lab."
A handful of companies, such as online audio and
video search engines Blinkx and EveryZing (which has licensed technology
from BBN) are making use of software that converts audio speech into
searchable text. (See "Surfing TV on the Internet" and "More-Accurate Video
Search".) But the MIT researchers faced particular challenges with academic
lectures. For one, many lecturers are not native English speakers, which
makes automatic transcription tricky for systems trained on American English
accents. Second, the words favored in science lectures can be rather
obscure. Finally, says Regina Barzilay, professor of computer science at
MIT, lectures have very little discernible structure, making them difficult
to break up and organize for easy searching. "Topical transitions are very
subtle," she says. "Lectures aren't organized like normal text."
To tackle these problems, the researchers first
configured the software that converts the audio to text. They trained the
software to understand particular accents using accurate transcriptions of
short snippets of recorded speech. To help the software identify uncommon
words--anything from "drosophila" to "closed-loop integrals"--the
researchers provided it with additional data, such as text from books and
lecture notes, which assists the software in accurately transcribing as many
as four out of five words. If the system is used with a nonnative English
speaker whose accent and vocabulary it hasn't been trained to recognize, the
accuracy can drop to 50 percent. (Such a low accuracy would not be useful
for direct transcription but can still be useful for keyword searches.)
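The parenthetical point above is worth making concrete: even at 50 percent word accuracy, a keyword search over a timestamped transcript works, because a hit needs only one correctly transcribed occurrence of the word. A minimal sketch, assuming the recognizer's output is a list of (seconds, word) pairs; this is not the Lecture Browser's actual data model.

```python
# Find where a keyword occurs in a lecture, given a timestamped
# transcript produced by a speech recognizer. Returns the times (in
# seconds) at which the word was spoken, so a player can jump there.

def find_keyword(transcript, keyword):
    """transcript: list of (time_seconds, word) pairs."""
    kw = keyword.lower()
    return [t for t, w in transcript if w.lower().strip(".,") == kw]

transcript = [
    (0.0, "today"), (0.4, "we"), (0.7, "discuss"),
    (1.2, "drosophila"), (95.0, "the"), (95.3, "drosophila"),
    (95.8, "genome."),
]
print(find_keyword(transcript, "drosophila"))  # [1.2, 95.3]
```

Misrecognized words simply fail to match; as long as some occurrences survive transcription, the lecture is still reachable by search.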
MIT's Video Lecture Search Engine: Watch the
video at ---
http://web.sls.csail.mit.edu/lectures/
Kate Greene, MIT's Technology Review, November 26, 2007 ---
http://www.technologyreview.com/Infotech/19747/?nlid=686&a=f
Once again, the Lecture Browser link (with video) is at
http://web.sls.csail.mit.edu/lectures/
Bob Jensen's search helpers are at
http://www.trinity.edu/rjensen/Searchh.htm
Listen to the classics: Download audio books from the NY Public
Library The New York Public Library announced Monday that it is
making 700 books, from classics to current best sellers, available to members
in digital audio form for downloading onto PCs, CD players and portable
listening devices.
"N.Y. Public Library Starts Digital Library," The Washington Post, June
13, 2005 ---
http://www.washingtonpost.com/wp-dyn/content/article/2005/06/13/AR2005061301093.html?referrer=email
Audio Speeches from Past Leaders and Famous People (the entire set of
recordings is too large to make available online)
The voices include every U.S. president since Herbert
Hoover, foreign leaders from Charles de Gaulle to Corazon Aquino, scientists
like Jacques Cousteau and Carl Sagan, entertainers like Joan Baez and Cecil B.
DeMille. They are all part of a collection of speeches spanning nearly 100 years
recently acquired from the Commonwealth Club by the Hoover Institution.
"Commonwealth Club archives feature key 20th-century
voices," Stanford Magazine --- http://www.stanfordalumni.org/news/magazine/2004/marapr/show/archives.html
Some audio clips are available at http://sfgate.com/cgi-bin/article.cgi?f=/c/a/2003/06/17/MN157028.DTL
Other links about these recordings are available at http://www-hoover.stanford.edu/hila/cwclub.htm
I found an excellent search engine for audio and
video clips called singingfish - http://www.singingfish.com
I did a search for "Worldcom accounting" and found an interview by an
NPR commentator and Denny Beresford. It would be easy to supplement an
accounting lecture with appropriate sound bites from a variety of sources.
Richard Campbell, e-Mail Message on March 25, 2004
Bob Jensen's audio search helpers are at
Listen to past arguments before the U.S. Supreme Court
Years ago I made the Wow Site of the Week the Oyez site at Northwestern
University. Under funding from the U.S. Government, the Oyez site enabled
anyone in the world to download the audio of actual oral arguments of lawyers
standing before the U.S. Supreme Court --- http://www.oyez.org/oyez/frontpage
Audible's on-demand
audio files include top national newspapers and magazines, and both classic
and best-selling novels. They offer more than 32,000 hours of audio programs
and 165 content partners.
Audible hopes the
campaign, appropriately called Spread the Word, will increase its customer
base by 60,000 to 90,000 users.
To achieve this goal,
Audible has sent marketing kits to about 30,000 of its most dedicated
customers. In return for their customers' free marketing efforts, Audible will
give away free audio files and $5,000 worth of tech prizes.
Spread the Word
builds on the customer-referral volume the company has experienced informally.
"Our current
customers have already played an essential role in our rapid growth, which has
almost tripled our customer base within a year," said Donald Katz, CEO of
Audible, Inc.
Customers who spread
the word about Audible deserve to be rewarded, Katz said. In fact the kernel
of the Spread the Word idea came from a customer and shareholder.
TextArc is a visual representation of a text—the
entire text (twice!) on a single page. Some funny combination of an index,
concordance, and summary, it uses the viewer's eye to help uncover meaning. A
more detailed overview is available.
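TextArc is described as a combination of index and concordance. A plain concordance, mapping each word to every position it occupies in the text, is easy to build and is the raw material such a visualization draws on. This sketch illustrates the general technique only; it is an assumption, not TextArc's implementation.

```python
# Build a concordance: map each word of a text to the list of its
# positions (word offsets). A visualization like TextArc draws each
# word once and connects it to all the places it appears.

import re
from collections import defaultdict

def concordance(text):
    positions = defaultdict(list)
    for i, word in enumerate(re.findall(r"[a-z']+", text.lower())):
        positions[word].append(i)
    return dict(positions)

text = "It was the best of times, it was the worst of times."
c = concordance(text)
print(c["it"])     # [0, 6]
print(c["times"])  # [5, 11]
```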
Rarely do value, aesthetics and innovation come
together so effectively. It shows what can happen when designers go beyond
thinking about displays as just electronic paper. —Bill Buxton Chief
Scientist, Alias|Wavefront & SGI [TextArc] frees you to see the text in a
nonlinear way, and to make connections that you would not have otherwise made.
It makes a text richer and more interpretable. —Bruce Ferguson Dean,
Columbia University School of the Arts
TextArc evolves from an academic tool into a
full-fledged work of digital art. ... This is the reading process made
visible. As the eye arrives at each word, it glows in the mind while
generating a skein of other associations, similar to what happens when one
reads a book. Then it lingers a bit before receding from consciousness. And
all the while, the greater whole of the story is present in the imagination
and beautifully vivid on the screen. (Here's the whole article.) —Matthew
Mirapaul Arts Columnist, The New York Times
Really enjoyed hearing your ideas and seeing your
work. Your multivalent approach is rare and welcome. I look forward to seeing
more anytime. —Larry Rinder Curator, The Whitney Biennial & Bitstreams
TextArc is just taking its first steps with the site
launch on Monday, April 15. There are major new releases just around the
corner, including the ability to search for word associations in any text,
some real-world applications, and fine art prints that make it possible for
you to live with what's been called a "mindprint" of your favorite
work. Please drop a note to email-list@textarc.org and we'll keep you
informed. (This list will never be sold or used for purposes unrelated to
TextArc; and the e-mail update is free, of course.)
Project Gutenberg ---
http://www.promo.net/pg/
Project Gutenberg is the brainchild of Michael Hart, who in 1971 decided that it
would be a really good idea if lots of famous and important texts were freely
available to everyone in the world. Since then, he has been joined by hundreds
of volunteers who share his vision. Now, more than thirty years later, Project
Gutenberg has the following figures (as of November 8th 2002): 203 New eBooks
released during October 2002, 1975 New eBooks produced in 2002 (they were 1240
in 2001) for a total of 6267 Total Project Gutenberg eBooks. 119 eBooks have
been posted so far by Project Gutenberg of Australia.
All Free Magazines (links to free magazines) ---
http://www.all-freemagazines.com/mag.html
These are classified by subject matter.
Many offer free trial subscriptions for one year.
BookMooch allows you to trade books on your
shelf for other books --- http://bookmooch.com/
"Only minutes after creating a list of books I am
willing to give away on Bookmooch, I already had enough points to request
free books from others. Tomorrow, I am mailing two complete strangers some
old books. And four strangers have promised to send me books I was planning
to buy on Amazon. An excellent trade! Bookmooch works!"
- Solana Larsen (a BookMooch member)
See Joanne Kaufman, "Clear the Bookshelf and Fill It Up Again, All
Online," The New York Times, October 15, 2007 ---
Click Here
. . . I would like to introduce
you to our service and web site Hitflip that might be an interesting
addition to your links for books and education. Hitflip is a community
to swap used books and other original media. It is therefore an easy and
cheap alternative to the existing online book stores. You can find
hitflip at
http://www.hitflip.de .
The just recently launched English version can be found at
http://www.hitflip.co.uk
Charles W. Bailey, Jr., compiler of SCHOLARLY
ELECTRONIC PUBLISHING BIBLIOGRAPHY (now in its 70th edition), has recently
published "Institutional Repositories, Tout de Suite", a work "designed to
give the reader a very quick introduction to key aspects of institutional
repositories and to foster further exploration of this topic through liberal
use of relevant references to online documents and links to pertinent
websites." The document covers definitions of institutional repositories,
why institutions should have them, and the issues authors face when
contributing to repositories.
"Institutional Repositories, Tout de Suite" is
available at
http://www.digital-scholarship.org/ts/irtoutsuite.pdf.
The work is licensed under a Creative Commons
Attribution-Noncommercial 3.0 United States License, and it can be freely
used for any noncommercial purpose in accordance with the license.
There are a number of places to get books online, but
this recent addition to that cadre of websites is definitely worth a look. The
staff members at Manybooks.net have adapted the e-texts created by the Project
Gutenberg DVD and placed them online in a host of formats, including pdf,
eReader, and as Palm document files. Visitors can begin by browsing by author,
title, category, or language. Some of the languages covered in the database
include Dutch, Esperanto, Swedish, Tagalog, and Welsh. Satisfied visitors can
also submit a list of five of their favorite books so that other users may
take advantage of their favorite reads. Some of the recently recommended
titles include Jude the Obscure, Silas Marner, Ecce Homo, and New Grub Street.
Persons attracted to this site should also take a look at the ebook cover
page, where they can peruse the covers of some of the many books contained
within the archive. Some of the more compelling covers include those for As a
Man Thinketh authored by James Allen and a rather lovely cover for Under the
Lilacs by Louisa May Alcott.
My name is Lucy. I would like to tell you about a new web site that I
hope you will find interesting for you and for the 'Bob Jensen's Links to
Electronic Literature' page, and to ask you to add our link to the other
book-related links in the 'Online Book and Table of Contents Finders'
section.
http://www.booksprice.com is a free,
innovative service for finding the best price when buying several books
together. This service is more useful than standard services that compare
one book at a time: buying several books together can reduce the total
shipping cost, since the shipping rate for each additional book is usually
lower than the rate for the first.
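The shipping saving described in the message can be shown with a toy calculation. The rates below are invented for illustration and are not BooksPrice figures:

```python
# Invented example rates; real shipping rates vary by seller.
FIRST_BOOK_SHIPPING = 3.99   # rate charged for the first book in an order
EXTRA_BOOK_SHIPPING = 0.99   # lower rate for each additional book

def shipping_cost(num_books):
    """Shipping when all books ship together from one seller."""
    if num_books <= 0:
        return 0.0
    return FIRST_BOOK_SHIPPING + EXTRA_BOOK_SHIPPING * (num_books - 1)

# Three books bought separately vs. together:
separate = 3 * shipping_cost(1)   # three first-book rates
together = shipping_cost(3)       # one first-book rate plus two extras
```

With these sample rates, buying together costs $5.97 in shipping against $11.97 separately, which is the kind of saving a multi-book comparison can surface.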
I will really appreciate adding a link to our site.
You can use this html to create the link to our site:
I hope that you find the service interesting. If
you have any queries or you'd like more information, kindly contact me.
My name is Ivy Carla, and I work for ECNext, Inc. After reviewing your
website, specifically the Helpers for Searching the Web section,
http://www.trinity.edu/rjensen/searchh.htm, I
wanted to propose you consider adding a new online textbooks site,
iChapters.com.
iChapters.com offers brand new textbooks, in electronic & print formats.
Electronic versions of college textbooks, including individual chapters, are
available for immediate download at affordable prices. Only at iChapters.com
can you choose to buy just what you need at the price you want to pay.
Students who frequent your website, especially those with a tight budget,
will surely benefit from iChapters. I am hoping that you can help them find
us by including iChapters (http://www.iChapters.com)
on your Helpers for Searching the Web section.
Please don’t hesitate to contact me (ivy@ecnext.com) if you have any
questions.
BookBrowse.com --- http://www.bookbrowse.com/
This site is very efficient for finding the latest and greatest books on a wide
range of topics.
Before seeking the best deals on a
book, look up its ISBN and Barnes & Noble price at www.bn.com
Then use the ISBN to compare new and used prices at the following sites:
www.amazon.com
(This is a great site because you can download sample pages of text and
pictures)
Helper Site if You Are Looking for a Book to Read
Whichbook --- http://www.whichbook.net/index.jsp
Categories include the following (Note that you click
on a category and then slide a pointer):
Happy - Sad
Funny - Serious
Safe - Disturbing
Expected - Unpredictable
Larger than Life - Down to Earth
Beautiful - Disgusting
Gentle - Violent
No Sex - Sex
Conventional - Unusual
Optimistic - Bleak
Short - Long
Eye on Books (includes photographs and audio) --- http://www.eyeonbooks.com/
Read and listen to reviews of top books --- separate the wheat from the chaff.
Barnes & Noble Textbook Home Page: We price all of our
books below suggested retail price. Look for books that have our Guaranteed Buy
Back stamp and save even more! http://www.gis.net/~catb/textbooks.html
My name is Nathan Letourneau, and as a U of M
student I created a no cost textbook price comparison service for students
(It's like Expedia, but for textbooks). It allows students to find the
cheapest prices on college books by comparing prices across online
retailers. I see you have Amazon.com listed on your website (
http://www.trinity.edu/rjensen/ElectronicLiterature.htm ).
Amazon is one of the many sites I search to find the
cheapest prices. Would you be willing to add a link to my site on your
website or tell your class about it?
Also you can add your own small price comparison
right on your website by adding a small snippet of code. You can see what it
looks like by visiting
http://www.campusbooks4less.com/professors.html
If you would like me to send you a reminder next
semester, please reply back with the email you would like me to send the
reminder to.
Thanks for your consideration and time!
Nathan Letourneau
Nathan@CampusBooks4Less.com
leto0023@tc.umn.edu
I
did not buy your email or contact information. I found it myself on the web
by looking through current syllabi and emailed you on my own. I am not going
to distribute or sell your email address in any way. I am providing the
following information to comply with current regulations.
To unsubscribe (even though you haven't been signed
up for anything), please send an email to
campusbooks4less@campusbooks4less.com. Please provide your email address in
the message along with the word "unsubscribe" and I will make sure to never
email you again.
Hungry Minds --- http://www.hungryminds.com/
Over 17,000 training and education courses (mostly from top universities) and
links to books
Our online campus, hungrymindsuniversity.com , offers
up to 17,000 courses from top universities like UC-Berkeley, UCLA, NYU, as
well as leading training companies and subject experts.
Our famous brands, including For Dummies, CliffsNotes,
and Frommer's, are all more than books. When you want to know, or know-how,
you can get immediate answers by visiting cliffsnotes.com, dummies.com, and
frommers.com. You can even subscribe to free e-newsletters filled with tips
delivered direct to your desktop. Our 42 Dummies Daily newsletters make over
14.5 million deliveries per month.
And for all our books -- including the award-winning
series that have made us the best-selling computer books publisher -- check
out our online bookstore.
Hungry Minds is here to feed your appetite for
knowledge with a full range of trusted, timely content. Whether it's to find a
restaurant on a wireless Palm, to study Shakespeare at 2 a.m. with a
downloaded Cliffs Note, to solve a computing problem with Dummies Answer
Network, to fulfill an ambition via the UC Berkeley certificate program --
with Hungry Minds it's all possible!
Great electronic "books" from the University of Texas and Princeton University
Dante's Inferno, Purgatory, and Paradise (a multimedia learning experience) ---
http://danteworlds.laits.utexas.edu/
Also see Princeton University's contribution (in Italian or English) ---
http://etcweb.princeton.edu/dante/pdp/
Princeton's version has both lectures and multimedia!
An electronic library that teaches children how to read better
The $40 annual subscription provides families with
unlimited access to the site and to several dozen books for children ages 2 to
9. The company plans to unveil the complete 108-book library next year. "They're
beautifully illustrated with interesting stories that hold a child's attention,"
Teitelbaum says. "The original illustrations with text and 3-D figures reinforce
that this is a book, not a video game or TV. We want kids to feel inspired to go
from reading the screen to reading the hard copy." While not designed as a
reading instruction program, One More Story does have features for emerging
readers, such as the "I can read it" function, in which the words will be read
aloud only when the child clicks the mouse there. By highlighting narrated
words, the site can help children make the link between written and spoken
language, Roth says.
Chelsea Waugaman, "Read the story again? Sure. Computers don't get tired,"
The Christian Science Monitor, July 11, 2005 ---
http://www.csmonitor.com/2005/0711/p12s01-stin.html
One More Story is an interactive online library for children
that was founded in 2000 ---
http://www.onemorestory.com/
TheFreeDictionary.com has about 2,000,000 articles
and definitions from leading dictionaries and encyclopedias. Please take a
look at our site and help your visitors find out about us.
Thank you in advance for taking a look at our
website.
Sincerely,
Valerie Schaeffer
P.S.
Also, if you are interested, we recently created a
new "dictionary search" box and “Word of the Day” feature that can be used
on your web page. The instructions can be found at
http://www.thefreedictionary.com/lookup.htm
Charles W. Bailey, Jr., compiler of SCHOLARLY
ELECTRONIC PUBLISHING BIBLIOGRAPHY (now in its 57th edition), has a new
publication. DigitalKoans is a weblog that provides commentary on scholarly
electronic publishing and digital culture issues. It is available at
http://www.escholarlypub.com/digitalkoans/
.
Since 2001, Bailey has also published another
weblog, The Scholarly Electronic Weblog, an exhaustive compilation of
citations to articles dealing with all aspects of scholarly communication.
The weblog is online at
http://info.lib.uh.edu/sepb/sepw.htm .
Scholarly Electronic Publishing Bibliography is a
searchable resource that cites selected articles, books, electronic
documents, and other sources that are useful in understanding scholarly
electronic publishing efforts on the Internet and other networks. The latest
version is available at
http://info.lib.uh.edu/sepb/sepb.html .
Bailey is the Assistant Dean for Digital Library
Planning and Development at the University of Houston Libraries. In 1989,
Bailey established PACS-L, a mailing list about public-access computers in
libraries, and The Public-Access Computer Systems Review, one of the first
scholarly electronic journals published on the Internet. For more
information, contact Charles W. Bailey, Jr., University of Houston, Library
Administration, 114 University Libraries, Houston, TX 77204-2000 USA; tel:
713-743-9804; fax: 713-743-9811; email: cbailey@uh.edu; Web:
http://info.lib.uh.edu/cwb/bailey.htm
.
A great index of electronic journals (although admittedly not
comprehensive) --- http://ejw.i8.com/
Book publisher HarperCollins and OverDrive have
created HarperCollins Private Reserve, a digital warehouse for HarperCollins
e-books worldwide. Using OverDrive servers and technology, HarperCollins
Private Reserve allows the publishing company's divisions in the United
States, Canada, the United Kingdom, Australia and New Zealand to manage and
distribute e-book titles and marketing information directly.
The warehouse supplies online retailers with e-book
catalog information, and fulfills e-book purchases to their customers in
Microsoft Reader and Adobe Acrobat eBook Reader formats. In addition,
OverDrive's technology allows HarperCollins to use its growing e-book library
to promote the sale of both print and electronic titles. For example,
HarperCollins can now offer electronic review copies or e-books bundled with
print titles. The initiative includes HarperCollins' e-book imprint,
PerfectBound, and e-books from its Christian publishing group, Zondervan.
HarperCollins Publishers, New York, NY, www.harpercollins.com
.
This bibliography presents selected
English-language articles, books, and other printed and electronic sources
that are useful in understanding scholarly electronic publishing efforts on
the Internet. Most sources have been published between 1990 and the present;
however, a limited number of key sources published prior to 1990 are also
included. Where possible, links are provided to sources that are freely
available on the Internet.
Announcements for new versions of the
bibliography are distributed on PACS-P
and other mailing lists.
An archive
of prior versions of the bibliography is available.
Here is another search engine for your page:
Multimeta: ( http://www.multimeta.com/
)
This is a fast meta search engine (simultaneous searches in the major
search engines, free URL submission service). There is even an
e-mail feature that allows you to receive the search results by e-mail.
Best regards,
Mehul Trivedi
I was browsing your
site with interest, and wondered if you might be interested in adding our
Web-site, at http://www.bublos.com
to your list of resources. Bublos is primarily a book price comparison site,
though it also offers many other useful resources, including a growing
book-review archive and lots of other literary bits and pieces. We also have
useful information regarding used and rare books at http://www.bublos.com/library/rare.books.html
I hope you'll have a
few moments to pay us a visit, and would be very grateful for your
consideration in adding our Web-site to your list of other resources.
The owners of this lucrative URL address have
sponsored a Web directory created by a "team of 50 research analysts
[that] has sifted through the Web to find relevant sites for our handcrafted
Directory." All Websites in this 30-category directory have been
annotated. The annotations, however, tend to be very terse and a bit vague.
First time users are encouraged to skim over the excellent site guide, which
gives a step-by-step manual for using the site as well as in-depth
explanations of the terminology and taxonomy.
Adobe
needs to improve its PDF search engine. It misses far more than it hits. I
tried the above search service by entering "FAS 133" with the
quote marks into the Search box at http://searchpdf.adobe.com/.
I only got nine hits, and the search engine failed to detect many PDF documents
that I know are available on the web, particularly PDF documents from the FASB
and IASC. The link to "Help With Searching" was of no help since it
only links to AltaVista Help. For threads on this topic, go to http://www.trinity.edu/rjensen/acrobat.htm
Feeling
the pressure from sleek newcomer Google
perhaps, or maybe just caught up in 5K
fever, AltaVista has
unveiled a new, lean, ad-free
interface to
their powerful search engine at Raging.com.
Mysteriously, though, Raging detects and blocks Lynx users.
What makes Raging
Search results the best on the Web? At Raging Search, we use the most advanced
technology to sort through the ever-expanding World Wide Web, and provide you,
the user, with only the most relevant Web pages for your query.
In order to do that,
we start with the biggest and freshest index of Web pages. Our index contains
every word found on more than 350 million unique Web pages, and we are
constantly updating our index by removing dead links and adding new pages. Our
goal is to create an index of the entire World Wide Web!
But having the best
index is only half the battle. We also rely on sophisticated technology to
sort through that index and find exactly the pages you want. There are many
ways to judge whether a Web page is relevant to your query, and Raging Search
combines many different factors to find the best matches, including text
relevance and link analysis.
Text relevance
searches every Web page for exactly the words you enter. Many factors enter
into text relevance, such as how important the words are on the page, how many
times the words appear, where on the page they appear, and how many other
pages contain those words.
Link analysis uses
the many connections from one page to another to rank the quality and/or
usefulness of each page. In other words, if many Web pages are linking to a
page X, then page X is considered a high-quality page. In this way, Raging
Search technology uses the judgment of actual people across the Web to improve
our rankings.
These and many other
factors are combined to ensure that you receive the results you need right
away. Raging Search is the only search site you will ever need.
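The two ingredients Raging Search describes, text relevance and link analysis, can be combined in a toy ranker. The pages, query, and inbound-link counts below are invented for illustration; real engines weigh many more factors:

```python
# Toy corpus: page URL -> page text, plus invented inbound-link counts.
pages = {
    "a.html": "solar power and wind power for homes",
    "b.html": "solar panels solar cells solar power",
    "c.html": "recipes for tamale pie",
}
links_to = {"a.html": 5, "b.html": 1, "c.html": 0}

def score(url, query):
    """Text relevance (query-word counts) weighted by link popularity."""
    words = pages[url].split()
    text_relevance = sum(words.count(w) for w in query.split())
    return text_relevance * (1 + links_to[url])

ranked = sorted(pages, key=lambda u: score(u, "solar power"), reverse=True)
```

Page b mentions "solar" most often, but page a's five inbound links lift it to the top, mirroring the "judgment of actual people" idea in the blurb.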
I t
hought that some of you might
be interested in how InfraSearch uses Gnutella. The
following appears in the June 1 edition of IWNews.
Search Site
InfraSearch Tries to Re-Knit the Web
By Brian Caulfield
Software that made
its mark as a music-sharing utility is now being used to create a potentially
powerful search tool by InfraSearch ( http://www.infrasearch.com/
), a prototype search site launched Tuesday.
The site's technology
is based on file-swapping software Gnutella
( http://gnutella.wego.com/
), created by NullSoft, the developers who created the Winamp MP3 player
while working for America Online. Unlike the better-known Napster, the
Gnutella software enables users to form ad hoc networks for sharing the
contents of their hard drives.
Although its
potential for distributing music online has received the most attention, this
only scratches the surface of what the software can do. Users can search for
any kind of information other Gnutella users make available -- whether image
files or recipes for tamale pie.
InfraSearch lets
users run the software inside their databases and makes the contents
searchable at its Web site. InfraSearch makes it possible to index Web pages
created on the fly (such as news and shopping sites), something traditional
search services find difficult. The developers behind InfraSearch remain
unincorporated, but they are actively talking to investors and exploring
business models for the technology. Though AOL distanced itself from Gnutella
shortly after its release, open-source software enthusiasts outside AOL have
continued to improve the software.
On the InfraSearch
Web site, both NullSoft and the University of California Berkeley's
Experimental Computing Facility (XCF) are credited with developing the
technology behind the site. XCF is an undergraduate group developing
open-source technology, such as GIMP (Gnu Image Manipulation Program).
InfraSearch
remains in development, but the idea behind it may have major implications for
the future of the Web, according to Gene Kan, a programmer involved with the
project. Potentially, users could earmark their content that may be exposed to
outside searches, rather than relying on the sometimes arbitrary results of
popular search engines. "There is a top-down effect where the only way to
get on the Web is AltaVista or Lycos; the Web is not a Web anymore," said
Kan.
Here is another one on the Gnutella
paradigm shift --- http://news.cnet.com/news/0-1005-200-1983259.html?tag=st
Napster-like technology takes Web search to new level
By John Borland <mailto:jborland@cnet.com>
Staff Writer, CNET News.com
May 31, 2000, 4:00 a.m. PT
"It's a big deal," said Andreessen, who met with
Gnutella developers last week and quickly became an admirer. "It will be
a way for businesses to expose what they want people to find more
easily."
It also is one of the first moves by what has been
hugely controversial file-swapping software into the realm of unquestionably
legitimate Web business. That's likely to take some of the legal shadows off
the technology and could spur a new phase in development.
Then there are the
standard search engines. Popular ones include AltaVista (http://www.altavista.com),
Excite (http://www.excite.com), Go Network
(http://infoseek.go.com), and HotBot (http://hotbot.lycos.com).
Unlike hierarchical indexes, standard search engines send out software
“robots” or “spiders” to search the Web and index the pages in each
site they encounter.
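A spider's inner loop is just fetching a page and extracting the links to crawl next. Here is a minimal sketch using Python's standard-library HTML parser; the page below is a hard-coded stand-in for one fetched over the network:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag, as a crawler's frontier would."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

page = ('<html><body><a href="/about.html">About</a>'
        '<a href="http://example.com/">Example</a></body></html>')
extractor = LinkExtractor()
extractor.feed(page)
# extractor.links now holds the URLs the spider would visit next.
```

A real spider would then fetch each of those URLs, index the words on the page, and repeat.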
Northern Light (http://www.northernlight.com),
for instance, ranks Web pages as a standard search engine does. But instead of
displaying all of its results in a single listing, it sorts pages into
categories and groups the results into folders. As an example, a search for
“alternative energy” creates folders with labels such as “solar
power,” “air pollution,” and “National Technical Information
Service,” which includes documents from that agency.
Ask Jeeves (http://www.askjeeves.com)
takes an altogether different approach. You don’t enter keywords, but type a
question in plain English — perhaps “Is there evidence of life on Mars?”
Ask Jeeves has recorded millions of questions that users have asked it, and
has found Web sites that answer those questions.
Google (http://www.google.com)
takes yet another tack. Like other search engines, it first matches up your
keywords to the pages it has collected in its index. Then, however, it ranks
each page based on how many other pages link to it—and how many link to
those pages in turn.
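That recursive idea, a page matters if pages that matter link to it, is the PageRank recurrence. Below is a sketch on an invented three-page graph; this is the textbook iteration, not Google's production code:

```python
# page -> list of pages it links to (every page here links somewhere,
# so no special handling of dangling pages is needed)
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}

def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # Every page keeps a small base score, then receives a share
        # of the rank of each page linking to it.
        new = {p: (1 - damping) / len(pages) for p in pages}
        for p, outgoing in links.items():
            share = damping * rank[p] / len(outgoing)
            for q in outgoing:
                new[q] += share
        rank = new
    return rank

rank = pagerank(links)
```

Page c, linked from both a and b, ends up ranked highest, exactly the "pages that link to it, and pages that link to those" effect the paragraph describes.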
Oingo (http://www.oingo.com)
has an even more radical approach. The site’s slogan is “We know what you
mean,” and Oingo conducts a “conceptual search” to make sure that it
understands your request. Ask it to search for “china,” for example, and
it will ask you to choose “porcelain” or any of the various geographical
Chinas.
Search engines that
search other engines are called meta search engines. Among the popular ones
are Dogpile (http://www.dogpile.com),
Inference Find (http://www.inferencefind.com),
and MetaCrawler (http://www.metacrawler.com).
The concept here is that because no single search engine indexes the entire
Web, using a meta search engine allows a researcher to scan more sites. The
downside is that such an engine needs to use a “lowest common denominator”
search statement, so that all of the search engines that it searches
understand the request.
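The merging step of a meta search engine can be sketched by interleaving the ranked lists returned by each engine and dropping duplicates. The result lists below are hard-coded stand-ins for live engine responses:

```python
def merge_results(*result_lists):
    """Interleave ranked lists, keeping each URL's first appearance."""
    merged, seen = [], set()
    for rank in range(max(len(r) for r in result_lists)):
        for results in result_lists:
            if rank < len(results) and results[rank] not in seen:
                seen.add(results[rank])
                merged.append(results[rank])
    return merged

engine_a = ["http://x.com/", "http://y.com/", "http://z.com/"]
engine_b = ["http://y.com/", "http://q.com/"]
combined = merge_results(engine_a, engine_b)
```

A URL returned by several engines appears only once in the merged list, which is one reason a meta engine can still return fewer hits than the sum of its sources.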
The many new, free
databases on the Web can also be helpful. A site that does an excellent job of
identifying and sorting free databases is The BigHub (http://www.thebighub.com).
Through its “specialty search categories,” it allows you to search more
than 1,500 databases on the Web, many of which are oriented toward academics.
What new tools for
searching the Web are on the horizon?
At a recent conference, I heard about “vortals,” vertical portals that
provide information from only a designated slice of the Web. For example, a
vortal might search only those sites and pages that have to do with health
care. VerticalNet (http://www.verticalnet.com)
offers portals to industries including communications and advanced
technologies. Although the concept is a good one, the jury is still out on
vortals’ usefulness.
Farther down the
road are visual representations of search results. Those search tools display
their results graphically, allowing you to see at a glance which items are the
most relevant. A service called NewsMaps (http://www.newsmaps.com),
for example, displays the results of your search as a thematic map.
Topographical markers indicate clusters of similar documents—the most
similar ones are piled up into little hills. According to Cartia, the company
behind the technology, the maps are created automatically by an algorithm that
“reads documents, extracts the content, and organizes the collection into a
map.” You can view some sample maps at the site.
WebBrain can be placed on top of any indexable and
searchable database. It's not limited to HTML links and files, but can also
index Microsoft Office documents and other files as well, making it usable on
a corporate intranet.
"Our objective here is to demonstrate a superior
navigation, search, and discovery capability," said Peter Fuchs, CEO of
Santa Monica, California-based TheBrain.com. "The technology is designed
to separate the navigation from the Web pages. Instead of the typical search,
where you have long lists of textual information where you could get hundreds
or thousands of search results, now you see it in a visual form."
The WebBrain.com interface is split in half, with the
top part completely written in Java. It gives a Star Trek-like visual
representation of the search results by category and shows all of the threads
and branches from that category. As you select categories, links appear in the
lower half of the screen and submenus are drawn in the upper half.
Instead of building its own database, TheBrain.com
used the ODP database to demonstrate its support for others. The technology
can work with any database from major vendors, including Oracle and IBM; all
that's necessary is to build the connection between the interface and the
data. The software is available from the single-user, Windows version, called
PersonalBrain, up to enterprise-scale search engines, such as the one powering
WebBrain.com.
Introducing SearchNetworking.com , the
networking-specific search site focused on enterprise network issues.
Register at http://www.SearchNetworking.com
for FREE and you might win a new Palm Pilot VII. As an added bonus, you can
download the 56-page report on virtual enterprise networks -- "The
Network Services Model: New Infrastructure for New Business Models" by
The Burton Group -- no purchase necessary!
Networking-related news and technical resources are
identified by our expert editorial team led by Paul Gillin, former
Editor-in-Chief of Computerworld. SearchNetworking.com helps you: 1)
Efficiently search the Web for enterprise networks info! Search against 2,000+
sites hand-picked by our editorial team so you only get relevant results. 2)
Stay current on networking issues and new technologies! Get email newsletters
based on your specific interests -- choose from industry news, network
administration tips, and career tips newsletters. 3) Learn from the experts!
Participate in Live Expert Q&A with industry experts. AND MORE!
Go to http://www.SearchNetworking.com
NOW and become a registered member -- ABSOLUTELY FREE! We'll enter you to win
a PalmVII and you can download The Burton Group's 56-page report on virtual
enterprise networks - all FREE! Don't delay. Quantities are limited so
register today before it's too late.
SearchNetworking.com is a TechTarget.com community.
Other TechTarget.com sites include:
Note from Bob Jensen: In a
recent PBS broadcast, the Digital Duo warns that advertising on search engine
pages is to be expected if the search engines are free. Somebody has to
pay for the resources needed to construct and maintain a search engine.
However, they warn us to watch out for search engines that give priority to
hits from sources that pay for such placement. For example, it would worry
me if I searched for "portfolio management" and the listed hits were only
websites that paid to appear for this topic. There may be far better sites
on this topic that simply did not see a need to pay for placement.
A transcript of the Digital Duo segment on search engines is given at http://www.digitalduo.com/210_dig.html
Search engines made
the Web truly useful. But they're anything but perfect. Sometimes they give
you way more than you're looking for; sometimes much less. And often they
don't ferret out the information you really need.
Many people get used
to a single search engine and never sample the other ones. The differences are
pretty amazing. Nor do most people know all the ways they can make their
favorite search engines work better.
Let's start by
telling you where NOT to go. Probably
the most useless search engine around is called goto.com
. For many searches, the top of the results list shows you sites that have
paid to be listed. So
when you search "Seattle Mariners," you get an offshore Internet
gambling site or a site that sells baseball flags. Even after the paid
listings, the Mariners' official Website is nowhere to be found. The only
thing to say about goto.com is never go to goto.com.
As for where you
SHOULD go, Steve's personal fave is AltaVista
. We recommend taking a few minutes to learn to improve its capability. For
instance, when you do a search for your own name, you may get tons of results!
That's because each result must contain EITHER your first or last name. But if
you put quotation marks around your name, your results will be much closer.
That's because you've told it to find BOTH your first and last names together,
in that order. AltaVista's advanced search can make things even more
manageable by letting you combine and exclude terms, and limit the range of
dates.
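The difference between the unquoted and quoted search is easy to demonstrate on a toy set of pages (the snippets are invented):

```python
pages = [
    "John Smith wrote the report",
    "Smith College admissions",
    "John Deere tractors",
]

# Unquoted: a page matches if it contains EITHER word.
either = [p for p in pages if "John" in p or "Smith" in p]

# Quoted: the exact phrase must appear, in that order.
exact = [p for p in pages if "John Smith" in p]
```

All three pages match the unquoted search, but only the first matches the quoted phrase, which is why quotation marks shrink a name search so dramatically.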
It's been reported
that AltaVista will also sell their rankings, which is guaranteed to muck up a
great service. It's one thing to sell ads that match a search term but selling
the actual results is out of bounds.
Susie likes HotBot.
The site's checkboxes make it easy to construct your search, and then make it
easy to narrow it down.
But when you're
looking for a general term, Steve's pick is Yahoo.
Since Yahoo actually has human beings organizing data into categories, you
often find the right general area at the top of the list, with links to the
sites you're looking for just a click away.
Susie also likes about.com
(formerly The Mining Company), because its human beings not only organize
results into categories, but also rate sites to help you get what you want.
And if you still can't find it, you can e-mail their staff for other
suggestions. But Steve hasn't gotten good results from about.com.
Metacrawler
searches several popular search engines and delivers the results on one page.
But it often delivers fewer results than if you searched just one. Same goes
for Dogpile.
The Duo tried a new
engine, google.com, while it was in beta
testing. It ranks pages by how many pages point to them. It's sort of like a
secret popularity contest. But the amazing thing is that a lot of the time, it
actually works.
All of the search engines above can be used to search for plagiarized
phrases or entire works. Searchers should keep in mind that some search
engines require that quotation marks be placed around the phrase being
searched.
This is part of a message sent by a university professor on May 12, 2000 about
plagiarism:
My second comment concerns the program I used to find
conclusive evidence that the paper had been plagiarized. The paper had a
number of telltale signs when I read it (no reference to class readings, no
page references to sources, a very unstudent-like, encyclopedia-ish prose
style, jumping around from one topic to another, etc.). I had received the
paper via an email attachment, so I used it to try out a program called EVE2.
If you convert a document into plain text format, EVE2 will search the
internet to try to find matching text. It then displays the paper with the
text it was able to match highlighted and gives a set of links to sites with
matching material. It also gives the percentage of the paper for which it
found matching material. In my case it found pages at two sites (infoplease.com
and britannica.com) from which about a third of the paper had been stolen
(combination of word-for-word borrowing and very close paraphrase). (I'm sure
the rest of the paper was also plagiarized but it didn't turn up anything for
the rest.) (If you don't convert the document to plain text, the program will
still find sites with matching material but won't estimate the percentage or
display the paper with matching portions highlighted.)
The program wasn't perfect -- to my puzzlement it
highlighted some parts of the paper that I couldn't find in the sites it
listed, while other parts of the paper that clearly *were* lifted from those
sites weren't highlighted. Nevertheless I found it extremely useful. I had
already tried using a couple of the major search engines on bits of text from
the paper without finding anything, and I'm not sure I ever would have tracked
it down on my own.
This was a trial version of the program, but I think
I'm going to pay the $20 to register it. If anyone's interested it can be
downloaded from http://www.canexus.com/eve/.
Signed XXXXXXXXX
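EVE2's internals aren't published, but the general technique such detectors use — matching overlapping word n-grams from the paper against candidate source texts and reporting the overlap as a percentage — can be sketched in Python. The function names and sample texts below are invented for illustration:

```python
def ngrams(text, n=6):
    """Set of n-word shingles; runs of six consecutive words
    rarely coincide by chance between independent authors."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def plagiarism_percentage(paper, source, n=6):
    """Rough percentage of the paper's n-grams also found in the
    source text (word-for-word borrowing; close paraphrase would
    need fuzzier matching)."""
    paper_grams = ngrams(paper, n)
    if not paper_grams:
        return 0.0
    overlap = paper_grams & ngrams(source, n)
    return 100.0 * len(overlap) / len(paper_grams)

paper = "the quick brown fox jumps over the lazy dog every day"
source = "he noted that the quick brown fox jumps over the lazy dog often"
print(round(plagiarism_percentage(paper, source), 1))  # 66.7
```

This also suggests why the professor's results were imperfect: exact shingle matching misses paraphrase and is thrown off by small edits near the boundaries of copied passages.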
In the
November 23 edition of New Bookmarks, I reported the following: Plagiarism.org
will soon be an important web site for many educators and many others, including
investigators and journal editors, who want to check if any writer's work is
authentic. Entire schools may be interested in paying for this service.
This site was featured on November 22 on CNN television. I discovered it
at breakfast while watching the news. Go to http://www.plagiarism.org/
Paul Myers
replied as follows:
The
following URL -- with spelling error and all -- provides background on the
Berkeley plagiarism-detection program --- http://www.cnn.com/TECH/computing/9911/21/plagerism.detective/index.html
J. Paul Myers, Jr. Associate Professor Department of Computer Science Trinity
University 715 Stadium Drive San Antonio, Texas 78212
Database
Searching (including literature searches)
Google, Yahoo, Wikipedia, and YouTube as
Knowledge Bases
A professor wrote to me drawing a fine line between information
and knowledge. Information is just organized data that can be right, wrong, or
unknown in terms of being fact versus fiction. Knowledge generally is information
that is more widely accepted as being "true," although academics generally hate
the word "true" because it is either too demanding or too misleading in terms of
being set in stone. Generally accepted "knowledge" can be proven wrong at later
points in time, just as Galileo purportedly proved that heavy balls fall at the
same rate as their lighter counterparts, thereby showing that what was
generally accepted knowledge until then was false. "Galileo
Galilei is said to have dropped two cannon balls of different masses from the
tower to demonstrate that their descending speed was independent of their
mass. This is considered an apocryphal tale, and the only source for it comes
from Galileo's secretary." Quoted from
http://en.wikipedia.org/wiki/Leaning_Tower_of_Pisa#History
In my opinion there is a spectrum along the lines of data to
information to knowledge. Researchers attempt to add something new and creative
at any point along the spectrum. Scholars learn from most any point on the
spectrum and usually attempt to share their scholarship in papers, books,
Websites, blogs, and online or onsite classrooms.
The professor mentioned above then asserted that Wikipedia and YouTube were
information databases but not knowledge bases. He then mentioned the problem of
students knowing facts but not organizing those facts in a scholarly manner. He
conjectured that this was perhaps due to increased virtual learning in their
development. My December 5, 2007 reply to him was as follows (off-the-cuff, so to
speak).
Although
I see your point about information versus knowledge, the addition of the
“Discussion tab” in Wikipedia changed the name of the game. As
“information” gets discussed and debated and critiqued it’s beginning to
look a whole lot more like knowledge in Wikipedia. For example, note the
Discussion tab at
http://en.wikipedia.org/wiki/Intelligent_Design
And when
UC Berkeley puts 177 science courses on YouTube (some of them in
biology), it’s beginning to look a lot more like YouTube knowledge ---
http://www.jimmyr.com/free_education.php
With
respect to virtual learning, my best example is Stanford’s million-dollar-plus
virtual surgery cadaver that can do more than a real cadaver. For
one thing, it can have blood pressure, such that a nicked artery can
hemorrhage. Learning throughout time is based on models and simulations
of sorts. Our models and simulations keep getting better and better, to the
point where the line between the virtual and real world becomes very blurred,
much as pilots in virtual reality begin to think they are in reality.
Much
depends on the purpose and goals of virtual learning. Sometimes
edutainment is important to both motivate and make learners more
attentive (like wake them up). But this also has drawbacks when it makes
learning too easy. I’m a strong believer in blood, sweat, and tears
learning ---
http://www.trinity.edu/rjensen/265wp.htm
When I put it into practice it was not popular with students of this
generation who want it to be easy.
You note
that: “These
students have prepared but it is poorly arranged, planned, and
articulated.” One thing
we’ve noted in Student Managed Funds (like Phil Cooley’s course, where
students actually control the investments of a million dollars or more
of Trinity University’s endowment) is that having students make
presentations before the Board of Trustees greatly improves their
planning and articulation. You can read more about this at the University
of XXXXX (December 4) at
http://financialrounds.blogspot.com/
Note that the portfolios in these courses are not virtual portfolios.
They’re the real thing with real dollars! Students adapt to higher
levels of performance when the hurdles require higher ordered
performance.
Much of
the focus in metacognitive learning is how to examine/discover what
students have learned on their own and how to control cheating when
assessing discovery and concept learning ---
http://www.trinity.edu/rjensen/assess.htm
We
studied whether instructional material that connects accounting concept
discussions with sample case applications through hypertext links would
enable students to better understand how concepts are to be applied to
practical case situations.
Results
from a laboratory experiment indicated that students who learned from
such hypertext-enriched instructional material were better able to apply
concepts to new accounting cases than those who learned from
instructional material that contained identical content but lacked the
concept-case application hyperlinks.
Results
also indicated that the learning benefits of concept-case application
hyperlinks in instructional material were greater when the hyperlinks
were self-generated by the students rather than inherited from
instructors, but only when students had generated appropriate links.
I look forward to your
writings on this subject when you get things sorted out. You’re a good
writer. Scientists aren’t supposed to be such good writers.
It goes without saying that Wikipedia modules are always
suspect, but it is easy to make corrections for the world. I
think this particular module requires registration to discourage
anonymous edits.
What is often better about Wikipedia is to read the discussion
and criticisms of any module. For example, some facts in dispute
in this particular module are mentioned in the “Discussion” or
“talk” section about the module ---
http://en.wikipedia.org/wiki/Talk:Mahmoud_Ahmadinejad
Perhaps some of the disputed facts have already been pointed out
in the “Discussion” section. Of course pointing out differences
of opinion about “facts” does not, in and of itself, resolve
these differences. I did read the “Discussion” section on this
module before suggesting the module as a supplementary link. I
assumed others would also check the “Talk” section before
assuming what is in dispute.
Since Wikipedia is so widely used by so many students and others
like me it’s important to try to correct the record whenever
possible. This can be done quite simply from your Web browser
and does not require any special software. It requires
registration for politically sensitive modules.
Wikipedia modules are often “corrected” by the FBI, CIA,
corporations, foreign governments, professors of all
persuasions, butchers, bakers, and candlestick makers. This
makes them fun and suspect at the same time. It’s like having a
paper refereed by the world instead of a few, often biased or
casual, journal referees. What I like best is that “referee
comments” are made public in Wikipedia’s “Discussion” sections.
You don’t often find this in scholarly research journals where
referee comments are supposed to remain confidential.
Reasons for flawed journal peer reviews were recently brought to
light at
http://www.trinity.edu/rjensen/HigherEdControversies.htm#PeerReviewFlaws
The biggest danger in Wikipedia is generally for modules that
are rarely sought out. For example, Bill Smith might write a
deceitful module about John Doe. If nobody’s interested in John
Doe, it may take forever and a day for corrections to appear.
Generally modules that are of great interest to many people,
however, generate a lot of “talk” in the “Discussion” sections.
For example, the Discussion section for George W. Bush is at
http://en.wikipedia.org/wiki/Talk:George_W._Bush
You already know about Wikipedia -- or
think you do. It's the online encyclopedia that anyone can edit, the
one that by dint of its 1.9 million English-language entries has
become the Internet's main information source and the 17th busiest
U.S. Web site.
But that's just the half of it.
Most people are familiar with Wikipedia's
collection of articles. Less well-known, unfortunately, are the
discussions about these articles. You can find these at the top of a
Wikipedia page under a separate tab for "Discussion."
Reading these discussion pages is a vastly
rewarding, slightly addictive, experience -- so much so that it has
become my habit to first check out the discussion before going to
the article proper.
At Wikipedia, anyone can be an editor and
all but 600 or so articles can be freely altered. The discussion
pages exist so the people working on an article can talk about what
they're doing to it. Part of the discussion pages, the least
interesting part, involves simple housekeeping: editors noting
how they moved around the sections of an article or eliminated
duplications. And sometimes readers seek answers to homework-style
questions, though that practice is discouraged.
But discussion pages are also where
Wikipedians discuss and debate what an article should or shouldn't
say.
This is where the fun begins. You'd be
astonished at the sorts of things editors argue about, and the
prolix vehemence they bring to stating their cases. The 9,500-word
article "Ireland," for example, spawned a 10,000-word discussion
about whether "Republic of Ireland" would be a better name for the
piece. "I know full well that many Unionist editors would object
completely to my stance on this subject," wrote one person.
A ferocious back and forth ensued over
whether Antonio Meucci or Alexander Graham Bell invented the
telephone. One person from the Meucci camp taunted the Bell side by
saying, "'Nationalistic pride' stop you and people like you to
accept the truth. Bell was a liar and thief. He invented nothing."
As for the age-old philosophical question,
"What is truth," it's an issue Wikipedia editors have spent 242,000
words trying to settle, an impressive feat considering how Plato
needed only 118,000 words to write "The Republic."
These debates extend to topics most people
wouldn't consider remotely controversial. The article on calculus,
for instance, was host to some sparring over whether the concept of
"limit," central to calculus, should be better explained as an
"average."
Wikipedia editors are always on the prowl
for passages in articles that violate Wikipedia policy, such as its
ban on bias. Editors use the discussion pages to report these
sightings, and reading the back and forth makes it clear that
editors take this task very seriously.
On one discussion page is the comment: "I
am not sure that it does not present an entirely Eurocentric view,
nor can I see that it is sourced sufficiently well so as to be
reliable."
Does it address a polarizing topic from
politics or religion? Hardly. The article was about kittens. The
editor was objecting to the statement that most people think kittens
are cute.
These debates are not the only treasures in
the discussion pages. You can learn a lot of stray facts, facts that
an editor didn't think were important enough for the main article.
For example, in the discussion accompanying the article about diets,
it's noted that potatoes, eaten raw, can be poisonous. The National
Potato Council didn't believe this when asked about it last week,
but later called back to say that it was true, on account of the
solanine in potatoes. Of course, you'd have to eat many sackfuls of
raw potatoes to be done in by them.
The discussion about "biography" included
random facts from sundry biographies, including that Marshall
McLuhan believed his ideas about mass media and the rest to have
been inspired by the Virgin Mary. This is true, said McLuhan
biographer Philip Marchand. (Mr. Marchand also said McLuhan believed
that a global conspiracy of Freemasons was seeking to hinder his
career.)
Remember, though, this is Wikipedia, and
while it tends to get things right in the long run, it can goof up
along the way. A "tomato" article contained a lyrical description of
the Carolina breed, said to be "first noted by Italian monk Giacomo
Tiramisunelli" and "considered a rare delicacy amongst
tomato-connoisseurs."
That's all a complete fabrication, said
Roger Chetelat, tomato expert at the University of California,
Davis. While now gone from Wikipedia, the passage was there long
enough for "Giacomo Tiramisunelli" to turn up now in search engines
as a key figure in tomato history.
Wikipedia is very self-aware. It has a
Wikipedia article about Wikipedia. But this meta-analysis doesn't
extend to "Wikipedia discussions." No article on the topic exists.
Search for "discussion," and you are sent to "debate."
But, naturally, that's controversial. The
discussion page about debate includes a debate over whether
"discussion" and "debate" are synonymous. Emotions run high; the
inability to distinguish the two, said one participant, is "one of
the problems with Western Society."
Maybe I have been reading too many
Wikipedia discussion pages, but I can see the point.
Jensen Comment
This may be more educational than what we teach in class. Try it by
clicking on the Discussion tab for the following:
"CIA, FBI Computers Used for Wikipedia Edits," by Randall
Mikkelsen, The Washington Post, August 16, 2007 ---
Click Here
"CIA and Vatican Edit Wikipedia Entries," TheAge.com, August 18, 2007
---
Click Here
Jensen Comment
Wikipedia installed software to trace the source of edits and new modules.
Wow! Over 20 years of Usenet discussion groups to search, browse, and
post messages --- http://groups.google.com/
Example: A popular search engine
(Google) has posted 20 years' worth of Usenet
discussion group postings: more than 700 million entries in all. Included:
American Taliban John Walker, screen name, "doodoo." --- http://www.wired.com/news/culture/0,1284,49016,00.html
For most of us, Web searching continues to take the
form of searching Google. A simple click to Google.com produces relevant
results. When Google was first released by two Stanford Ph.D. candidates,
Larry Page and Sergey Brin, in 1998, the search tool transformed Internet
searching almost immediately. Google incorporated linking structures into its
algorithm—the code that determines the ranking of pages within a retrieval
list—and, thereby, retrieved better and more accurate information. Since
that time Google has added features to an index that now includes more than
two billion pages.
With all this and more, Google has become the search
tool of choice for most information specialists and novices alike. So why use
any other search tool?
Google searching became synonymous with Web searching
because it works. Brin and Page believed there were no "bad"
searches when using Google. Whatever the search, Google could retrieve better
information based on the relationship of link structures—in other words,
good sites are linked to more often by other sites. Google quickly became the
best place to look for top-level Web sites such as business, institutional,
and personal homepages.
But there is more. Google organizes a query within
subject categories when applicable, provides links to language translations,
maintains cached links to original pages, and lists similar pages. Other
features include access to street maps when typing an address with a city or
state, dictionary definitions of search terms, and specialized databases, such
as Google Uncle Sam for government information and University Search for
information from specific institutions. With all these features and added
accuracy, it becomes difficult to use anything else but Google.
What You Don't See is What You Don't Get
Ah, but there's the rub. Neither Google nor any other search tool can index
all the information on the Internet. Conventional search tools such as Google,
Yahoo, AltaVista, All the Web, or meta-searchers like Ixquick, Vivísimo, and
SurfWax often access more than a couple billion pages in their databases.
However, a large portion of available information has been difficult or
impossible to search. Material that is not accessible using conventional
search tools has become known as the "Invisible Web." Other names
for the Invisible Web include the Deep Web, Opaque Web, and searchable
databases.
Such information is not accessible to conventional
search tools because it is inside databases such as the U.S. Census,
Amazon.com, or a library's online catalog. The locations of these pages can be
found through resources such as Gary Price's Direct Search, Complete Planet:
The Deep Web, http://www.invisibleweb.com/
, and Invisible-web.net. Or the information can be located via subject
directory tools like Infomine, Librarians' Index to the Internet (LII), Best
Information on the Net, and AlphaSearch.
The reality is that many information specialists, as
well as the general public, use the Invisible Web already. Most Web surfers
have accessed an Invisible Web site at one time or another. However, they
access only a portion of the Invisible Web, typically the portion found in
three general forms:
First, the Fee Group, or paid databases, such as
EBSCO, OVID, ProQuest, and Medline. These databases have a cost associated
with use.
Second, the Free Group: government databases such as
the Census, AskERIC, PubMed, the Currency Converter, FindArticles, and
library online catalogs. These databases are free for anyone to access.
Third, the Hybrid Group: UnCoverWeb and online newspapers
like the New York Times and Wall Street Journal, which currently take this
form. These databases have both free portions and fee portions.
The Not-So-Invisible Web
For the past two years, the Invisible Web has been the "next big
thing" in Internet searching. The truth is, it is still a big thing. When
it comes to more than 500 billion Web pages located in searchable databases,
how can it be anything but big? But the Invisible Web is still unwieldy for
most. Resources such as Infomine, Librarians' Index to the Internet, and
Direct Search have been underused by both information specialists and novice
searchers in part because they are difficult to use. However, the increasing
exposure of the Invisible Web is helping to bring these resources to the
surface. As more searchers use them, access will become better, driven by the
demand to find and use relevant information.
Continued in the article.
The InvisibleWeb can access over 10,000
databases that are generally missed by common search engines --- http://www.invisibleweb.com
What is the InvisibleWeb.com?
The InvisibleWeb.com is a directory of over 10,000
databases, archives, and search engines that contain information that
traditional search engines have been unable to access. InvisibleWeb.com takes
you to these invisible sources.
Why would I want this?
Classic search engines such as Yahoo or AltaVista are
just too large. They work just like the index in the back of a book; you give
the engine a word to look for and it returns every page it has ever seen that
word on. You don’t want to wade through futile, repetitive information; you
want targeted, precise information and that's exactly what InvisibleWeb.com
delivers!
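The back-of-the-book analogy maps directly onto the inverted index data structure that keyword engines are built on. A minimal sketch in Python, with made-up documents (this is a generic illustration, not InvisibleWeb.com's or any engine's actual implementation):

```python
from collections import defaultdict

def build_index(docs):
    """Back-of-the-book index: maps each word to the set of
    document ids in which that word appears."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

docs = {
    1: "invisible web databases",
    2: "search engine index",
    3: "deep web search",
}
index = build_index(docs)
print(sorted(index["web"]))     # [1, 3]: every page the word was seen on
print(sorted(index["search"]))  # [2, 3]
```

The limitation the article describes follows immediately: a crawler can only index text it can fetch, so content generated on the fly from a database never makes it into `docs` in the first place.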
Traditional search engines have access to only a
fraction of 1% of what exists on the Web, according to BrightPlanet, an
Internet search company, noting that as many as 550 billion pieces of content
are hidden from most search engine scrutiny. These documents make up what is
known as "The Deep Web."
Undercover and undercovered, the vast reservoir of
the Deep Web is estimated to be 500 times larger than the "surface"
World Wide Web. And, according to BrightPlanet, the Deep Web is the largest
growing category of new information on the Net.
"There's a huge amount of information you can't
find entirely or easily via a search engine," says Net search guru Gary
Price, a librarian at George Washington University, and co-author of the
upcoming book "The Invisible Web" (CyberAge Books, $29.95).
"The material on the Web is unorganized, very ephemeral. There's no rhyme
or reason, no language control. The Web is a huge directory that's very hard
to get at."
What's hidden? What makes up the depths of The Deep
Web? The biggest part of this invisible Web is information stored in databases
- massive libraries of Web content unsearchable through such tools as Yahoo!
and Google. You have to know they exist before you can search them.
Other aspects of the Net remain hidden in deep
waters, too.
"There are tons of things out there," says
Tara Calishain of Researchbuzz.com, an online Internet guide. "Pay
content sources, lots of genealogy sources. The Library of Congress ( www.loc.gov
) has fabulous collections you can't find on Alta Vista."
Several types of information are most elusive for
search engines - bibliographies, multimedia files, and information that comes
in .pdf files (Adobe's portable document format). "News is dreadful," says
Calishain. "Search engines don't cover it. It's tough to find breaking
news."
Some sites, such as Amazon.com, have sections so far
from the surface of their home pages that they, too, can be classified as Deep
Web, says David Crane, a spokesman for search engine Google ( www.google.com
). An example, says Crane, is "the section that specifically offers a
'portable compact disc player by Sony.'"
But the deepest Deep Web drop-off is in the category
of government, and it's getting deeper.
"More and more city and county governments are
putting their offerings on the Web. The State of Pennsylvania has a new crime
reporting database ( www.ucr.psp.state.pa.us/UCR/ComMain.asp
), and more and more of that kind of thing is coming up now," says
Calishain.
. . .
Two groups of Web experts are also making it their
business to provide searchers with information on Deep Web sources.
Calishain's Researchbuzz.com (www.researchbuzz.com)
chronicles search engines, new data collections ("Online Legal
Information in Denmark, Norway and Sweden"), browser software and other
Deep Web mining tools that "a research librarian, journalist, educator
and others would find helpful, from the perspective of someone who's really
going to use it."
And in the early '90s at the University of
Wisconsin-Madison, the Internet Scout Project ( www.scout.cs.wisc.edu
) was started with funding from the National Science Foundation to
"inform the higher education and research communities about resources on
the Internet," says Scout Director Rachael Bower. The project posts
detailed reports each Friday to keep searchers, including the general public,
"up to speed" on Deep Web sources.
The Scout Project is driven by five editors who have
spent years creating bookmarks and automatically checking changes in existing
sources; there's a searchable archive of 11,500 sites available.
"We do supply Deep Web information. A lot of the
things you get from Scout an Alta Vista search wouldn't get, or it's buried.
Think of it as being a card catalog with information about information,"
says Bower. "It's one of the first attempts to get librarians to catalog
Web resources. All of the editors doing the cataloging are graduate students
in the subject or in library science."
So, then, why
do so many feel the need to ignore the vast resources available to them,
publicly and repeatedly offer up disinformation, and generally offend the basic
tenets of the liberal arts education? What can be done to help these people, so
obviously confused by their encounter with a badly constructed tutorial, or
ruined by unmonitored self-study? I mulled the problem over a strong cup of
Kenya AA and suddenly struck my fist into my palm, shouting, "Eureka! We
must introduce them to the primary sources!"
One of the
great things about being a Web designer or developer is that you have access to
an enormous collection of tutorials, documentation, specifications, and related
materials, no matter what part of the Web you work with.
. . .
The IETF produces several different kinds of
"standards":
Internet Drafts (I-D)
Like the Bill from everyone's favorite Schoolhouse Rock episode,
even an Internet standard has to go from one stage to another on its way
to glorious worldwide acceptance. The first step on this path is an
Internet draft, which represents an RFC's larval stage. Anyone can write
an Internet draft, as long as they follow the IETF guidelines for doing
so.
For Your Information (FYI)
FYI documents are handy primers for newbies of all ages, ranging from
discussions of Netiquette to "why it's bad to spam." They are
great resources to force on people who just don't get it. There are about
35 of these handy
documents at present.
Requests for Comments (RFC)
These are informal
documents that discuss, at varying levels of detail, anything from
proposed to existing protocols, processes, infrastructure, and the like.
There's even an RFC that documents
the IETF standards process, though technically it is also a BCP (see
below). At any rate, RFCs have different classifications, including
informational, experimental, and historic, and not all RFCs become
Internet standards. In fact, some are elaborate practical jokes.
Best Current Practice (BCP)
These document best current practices, as you might expect. There are only
36 of these at the
time of this writing.
Internet Standards
These are generally recognized as documenting those protocols and
practices that have stood the test of time. By way of illustration, there
are, at the time of this writing, more than 2,500 RFCs but only 58
Internet standards (mostly related to low-level networking, like
TCP/IP or PPP or the Post Office Protocol).
. . .
Armed with
this knowledge, go forth and uphold the social contract of the Internet:
"Be conservative in what you do, and liberal in what you accept from
others." The conservatism is the natural result of having a good reference
library at your fingertips. The liberalism extends only so far and doesn't
include accepting an uninformed line of bull from somebody on a mailing list.
direct search
is a growing compilation of links to the search interfaces of resources that
contain data not easily or entirely searchable/accessible from general
search tools like Alta Vista, Google, or Hotbot. Although these
"general" tools are essential for the retrieval of Internet based
data, searchers often fail to realize that a massive amount of information
is not easily or entirely searchable/accessible via these search tools.
Material "hidden" from the general search tools is said to reside
on the Invisible Web.
Search
Engine Submission Tips This area covers search engine registration and submission
tips, such as using meta tags, improving placement and how to submit URLs. Formerly called "A Webmaster's
Guide To Search Engines"
Web Searching Tips Learn how to search better and how the major search engines
work from a searcher's perspective. Also see how people search and other fun
stuff.
Search Engine Listings Find all the major search engines; popular meta search
engines; MP3 search engines; kid-safe services and much more.
Reviews, Ratings & Tests Read comparison reviews, see which search engines are most
popular, and check out various tests and statistics.
Search Engine Resources If it is related to search engines in
some way, you'll find it here.
Search Engine Newsletters Over 165,000 readers depend on our free newsletters to keep
up with search engines. Sign-up
or learn more.
AllTheWeb has had a face lift and added several new
features. Now when users perform a search using any of the five major search
options (Web pages, Pictures, Videos, MP3 files, or FTP files), a sidebar shows
helpful results in other categories. So, for example, a search for
"Internet Scout" in the pictures index brought up our staff picture,
but also offered a listing of Websites such as our front page and our Weblog.
Users can also now limit their searches in a variety of ways directly from the
search box, which offers a handy pull-down menu to limit by language. The help
page explains the various search options, including searching for words in the
title or domain name, searching pages on a specific site, searching for pages
with a link to a specific site, and more. AllTheWeb has always been this scout's
favorite workhorse search engine because it's big and fast. It's nice to see
more functionality as well.
How
would you like a snapshot of the past year's trends and a glimpse into the
search engine future? To gather the information needed to take such a look,
I've interviewed some of the major engines: AllTheWeb,
AltaVista, Google, LookSmart, Lycos, and Yahoo! I'll be sharing what I've
learned, beginning with AllTheWeb, a site owned by Fast
Search & Transfer (FAST) --- http://www.fastsearch.com/
.
By
offering AllTheWeb as a public search engine, FAST aims to provide users with
comprehensive results while using the engine as a demonstration piece for its
original equipment manufacturer (OEM) partners. The public search engine
enables the company's corporate partners to evaluate the technology before
rolling it out on their own sites.
"FAST
attacks the four pillars of search: relevancy, freshness, size, and
speed," said Rob Rubin, FAST general manager, Internet business unit.
"This year, FAST introduced its linguistics tool kit to automatically
detect phrases and provide accurate search results." (Check this out by
entering "Who is Little Richard?" or "What is the weather in
New York?")
The Internet has provided a great way to search for news articles
relating to a particular subject ... if you can figure out how to
find such articles.
It's been said that Internet search engines can be the most
useful--or useless--tools on the Internet. As one writer put it:
"the Internet is an enormous library in which someone has turned
out the lights and tipped the index cards all over the floor."
There is so much information on the Internet that it can be
overwhelming for most users to try to find what they are looking
for. Finding information is easy; finding the article you're
looking for is more challenging.
What's Out There?
The first instinct may be to use one of the many search engines
online. However, search engines are generally poor sources for current
news. Some have developed separate engines and directories for news
postings and these tend to work the same way as search
engines. Examples include: Yahoo
News, Excite, CNN.COM,
ABCNews.COM, and Lycos
Top News. But if you're looking for topics that are not in the
newspaper headlines, these may not necessarily be the best search
options for you.
Where Should I Start?
There are many good places to search for the latest news stories from
hundreds of sources on the web. The following sites provide good
results for current event searching. And because they search only news
sites, the results are usually focused and timely (see Online
Research Advice for how they work).
American
Journalism Review News Search Engine
This online tool allows you to search various news sources,
including Excite News Tracker, Associated Press (via the
Washington Post), Infoseek Wires, NewsBot, News Index and Total
News.
The Wired
Cybrarian
One of the most extensive collections of search engines can be found at
the Wired Cybrarian. In addition to having links to some of the
best reference sites, the Wired Cybrarian acts as a multi-engine
search for news articles broken out into eight topic areas:
Technology, Current Affairs, Business, Recreation, Investing and
Finance, Media, Culture and Health and Science.
NewsBot
Developed by HotBot, NewsBot's
greatest strength is that it allows you to filter your search by
date, so you can receive only the most current information.
Newsbot then takes all of the documents that fit your criteria,
sorts them according to their relevance, and returns a list of
documents in the form of abstracts and links.
Ixquick Metasearch
Ixquick Metasearch finds the news you're looking for by
intelligently searching top newspapers simultaneously.
Once you have a feel for the keywords being used in the
articles, some other good resources include:
FindArticles.com
A partnership between LookSmart and the Gale Group, a publisher of
research and reference materials for libraries, businesses, and
information technologists. Offers free access to the full-text of
articles published in more than 350 magazines and journals dating
from 1998. Search returns include article title, periodical, and
short description, with a link to the full-text, which is
conveniently and quickly displayed at the FindArticles site,
though with numerous advertising banners. Visitors can also view a
list of the publications indexed, alphabetically or by subject.
Periodical listings include a one-sentence description and a link
to their Website. A very useful reference source, indexing many
leading journals and magazines.
NewsLinx
To many this is the best site to keep up-to-date on any technology
beat. It's a page of deep links, updated in real-time, to a host
of newspapers, trade magazines and websites covering e-commerce
(and other topics.) You can also search the database to get
real-time news on any technology topic you like. The ads follow
the copy, and the stories aren't "framed," so they're
not stealing from copyright holders.
NewsMaps
"Explore thousands of news articles and discussions,
organized by topic onto an information landscape." You can
click on data "points" on the maps to link to actual
news articles. Works best at 56 Kbps or faster, and your browser
must support Java.
Periodicals.net
A search engine by the Library Technology Alliance.
What's Best?
Finding what is best depends on what you're looking for. Sometimes
searches on a general search engine may prove to hold the best
information for your search, sometimes you may have to use multiple
sources. As with any search, the more sources that you use, the more
information you will be able to retrieve.
Citing Internet Sources
When you are given an assignment to gather news articles, provide the
assigning organization the name of the article, the date of the
article, and the direct URL of the article, so that they can then
access the article themselves. This is also important so the article
can be properly cited if it is used.
Another great resource for learning to find news articles is the Information
Research FAQ at http://cn.net.au/faq.txt
This Law Office Computing column
covers electronic resources and strategies for accomplishing various
types of legal and business research. Request
print copies.
The Directory is intended to help you identify and
contact organizations that provide information and assistance on a broad range
of education-related topics.
This is mostly a site designed to help
you search for help and other information on most any education-related topic,
including education technology.
For students who are starting their college searches, several Web
sites offer one-stop shops that include searchable databases of
college information, test preparation aids, virtual tours and online
applications. Here are some of the leading sites and some of the
additional features they contain:
THE COLLEGE BOARD: www.collegeboard.com Features
online registration for the SAT and help with essay preparation. It plans
soon to offer a search feature called LikeFinder, which will enable
students to find colleges similar to the ones they are reading about,
and a feature that will generate side-by-side comparisons of selected
colleges.
COLLEGELINK: www.collegelink.com Offers
a month-by-month planner and articles about financial aid.
COLLEGENET: www.collegenet.com The
CollegeBot search engine looks at college-related Web sites.
PETERSON'S COLLEGEQUEST: www.collegequest.com Includes
a personal organizer, practice tests for the SAT and ACT and
discussion groups.
EMBARK: www.embark.com Offers
online "lockers" where students can store applications in
progress and results of searches.
XAP: www.xap.com Gives students
a head start on the admissions process, starting in eighth grade, by
leading them through questions about high school courses and the types
of colleges they would like to attend.
Some Web sites, like the ones below, focus on specific aspects of the
college search:
FAFSA ON THE WEB: www.fafsa.ed.gov An
online version of the federal financial aid form.
FASTWEB: www.fastweb.com A
database of scholarships and grants.
FINAID: www.finaid.org Calculators
and resources to help demystify the financial-aid process.
NATIONAL CENTER FOR EDUCATION STATISTICS/IPEDS COLLEGE OPPORTUNITIES
ON-LINE: www.nces.ed.gov/ipeds/cool A
database of 9,000 colleges. Students can search for colleges based on
a profile of the types of schools they are interested in.
USNEWS.COM: www.usnews.com/usnews/edu Annual
rankings of colleges according to U.S. News & World Report, and a
database that can be searched for specific criteria.
According to one of its own brochures, "Business
2.0 is the essential tool for navigating today's relentlessly changing
marketplace, particularly as it's driven by the Internet and other
technologies." In both print and electronic versions, Business 2.0 does
cover an incredible amount of ground, including day-to-day and month-to-month
information and offering extensive subject lists of its material, broken down
by general subjects -- from management and marketing to Enron and the
Internet. Not only clearly in touch with today's business world, Business 2.0
promises to put its readers in touch with it through company links, as well as
through straightforward contact lists. While Business 2.0 is open for anyone's
consultation, registered readers are granted greater access privileges to
archived and premium content.
With a team of
researchers headed by Prof. Kathy McKeown, Columbia Newsblaster is an online
project at Columbia University's Department of Computer Science in the School
of Engineering and Applied Science. Newsblaster currently looks at news
reports from thirteen sources, including Yahoo, ABCNews, CNN, Reuters, Los
Angeles Times, CBS News, Canadian Broadcasting Corporation, Virtual New York,
Washington Post, Wired, and USA Today. The product uses artificial
intelligence techniques to cull through news reports published online and then
sorts and summarizes these reports in five different news categories -- US,
world, finance, entertainment, and sports. The summaries reflect factors
such as where a fact is mentioned in the published reports
and how often it is repeated across reports dealing with the same event or
subject. They are also based on the news value of individual facts, such as
how many were killed or injured, or how much damage to property occurred. On
the whole, in an age of information overload, this newly developed tool may
provide assistance to journalists, executives, and average news consumers.
Before I landed my cushy job as a magazine editor, I
spent three years under the hood at Hotbot as an engineer and manager. Between
days reading our log files and nights shmoozing with other search engineers, I
learned more than I'd ever wanted to know about where search traffic comes
from, and where it goes to. I even wrote an article
about it for Webmonkey.
But I had put all that behind me ... until my lovely
wife, Christina,
asked me about search engine optimization for Artloop,
her fine art research and location service.
Dozens of companies had pitched their optimization
services to her, but Christina, a former MSN manager as smart about database
schema as she is about business plans, balked. Why pay someone to set up bogus
domains, build huge farms of gateway pages, and cram hundreds of keywords like
"britney spears" into Artloop's HTML? The very idea ran contrary to
the information
architecture and site
layout her staff had worked so hard to make as clean and clear as possible
for their visitors. Moreover, as a Web user herself, she'd learned to
recognize these traffic-grabbing methods and had become wary of sites using
tricks to get her to click. Why should she assume her own customers would
behave differently?
And she was right: Trying to fool search engine users
with keywords and trick tags makes sense only if your goal is to flash a lot
of ad banners, return traffic be damned. That used to be the business model
for an entire industry. But most sites in business today hope to convert
first-time visitors into loyal customers by building long-term relationships.
Sure, searchers need to find your site, but the results on Hotbot's Top Ten
lists show that the only results people stick with are the ones that don't try
to scam them. Trap doors, redirects, keyword spam, and multiple domains that
host the same pages are more likely to make people reach for the back button
(a move the Direct Hit technology behind Top Ten results can detect), not
their credit cards.
So, rather than waste money on consultants, Christina
and I decided to create our own search optimization spec. Using data gleaned
from representatives of leading search engines, insider data, and
old-fashioned trial and error, we came up with our own strategy for getting
traffic from search engines and portals without having to fake people out. In
the process, we encountered so many dubious "experts" with something
for sale — software, books, services — that we decided to raise the bar on
them and publish our notes for free.
Imagine our surprise when Google's engineers read
this article (when it first published in early June, 2001) and invited us to
visit their offices to dig even deeper into the workings of their gigapage Web
index. Of course we took them up on the offer, and we've updated this article
with our notes from those meetings. We've also included answers to the best
questions from the hundreds of emails we've received over the past couple
months.
. . .
The Biggest Fish to Fry: Yahoo, Google, and
Inktomi
The Yahoo directory accounts for half the traffic
referred to most sites. So get your site listed on Yahoo, and your traffic can
literally double overnight. Beyond that, most search engine traffic comes from
two places: Google and Inktomi.
Traffic from Google has increased at an astonishing
rate over the past year: Jakob Nielsen's search engine referrals to his Useit
site confirm this, as do the unpublished reports from retail sites like Stylata.
Google, once considered a niche site for nerds, is the Wall Street Journal's
pick for best search engine on the Net, and the traffic numbers seem to agree.
Inktomi, the number two traffic generator, doesn't
run its own search site. Instead, the company provides the technology behind
MSN Search and AOL Search, two top referrers, as well as Hotbot and over a dozen
more.
Portal sites like Excite, Lycos, and AltaVista still
draw lots of traffic, but together Google and Inktomi outweigh the entire rest
of the field. Add it up and it's pretty clear how to maximize your traffic for
the least effort:
Get yourself into Yahoo's directory.
Make sure your site is thoroughly crawled by
Google and Inktomi.
Get lots of links to your site from domains that a
lot of other sites link to — that's how Google and Inktomi determine
relevance when ranking search results.
For all other search engines, implement a blanket
strategy that gets you reasonable results. By not chasing each one of them
separately, you can put your company's time and money to more important
uses.
All of this can be accomplished with a single three-step
process. And it really is as easy as 1-2-3.
There are quite a few things you can do to grab the
attention of search engines and directories:
Clean Up Your URLs
Frames used to be the biggest roadblock to getting
crawled, but no more: Both Google and Inktomi now crawl them (the section of
Inktomi's support FAQ that claims this isn't so is out of date, according to
the company). Instead, the problem with most e-commerce sites today is that
their product pages are dynamically generated. While Google will crawl any URL
that a browser can read, most of the other search engines balk at links with
"?" and "&" characters that separate CGI variables
(such as "artloop.com/store?sku=123&uid=456"). As a result, many
individual product pages don't show up outside of Google.
One way to circumvent this difficulty is to create
static versions of your site's dynamic pages for search engines to crawl.
Unfortunately, duplicating your pages is a huge amount of extra work and a
constant maintenance chore, plus the resulting pages are never quite
up-to-date — all the headaches dynamic pages were designed to eliminate.
A far better strategy is to follow the lead of Amazon
and rewrite your dynamic URLs in a syntax that search engines will gladly
crawl. So URLs that look like this ...
Amazon's application server knows the fields in the
URL are actually CGI parameters in a certain order, and processes them
accordingly.
J.K. Bowman's Spider
Food site explains how to fix URLs for most popular e-commerce servers.
One of Artloop's Web programmers learned Apache
rewrite rules that tell Apache how to translate slash-separated URLs into
a format used by their Netzyme
application server. On the back end, Netzyme is passed something like this:
But users and search engines see the tidier,
Apache-served URLs, which look something like this:
artloop.com/artists/profiles/3918.html
Not only are the rewritten URLs crawlable by all
search engines, they're also more human-friendly, making them easier to pass
around the Net.
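A minimal sketch of the mapping an application server might perform, assuming a fixed parameter order. The segment names, parameter names, and the "/app" endpoint below are illustrative inventions, not Artloop's or Amazon's actual scheme:

```python
# Translate a crawler-friendly, slash-separated URL back into the CGI
# query string the application server actually understands.
# Parameter names and the /app endpoint are invented for illustration.
from urllib.parse import urlencode

PARAM_ORDER = ["section", "view", "id"]  # assumed fixed parameter order

def rewrite_to_cgi(path):
    """Map /artists/profiles/3918.html to /app?section=...&view=...&id=..."""
    segments = path.strip("/").removesuffix(".html").split("/")
    return "/app?" + urlencode(dict(zip(PARAM_ORDER, segments)))

print(rewrite_to_cgi("/artists/profiles/3918.html"))
# -> /app?section=artists&view=profiles&id=3918
```

In practice this mapping usually lives in the web server (for example, Apache rewrite rules) rather than in application code, so spiders and users only ever see the tidy path form.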
Many readers have written in to ask if the search
engines will begin crawling and indexing Flash content soon. The answer, as
you might guess, is no. Unlike PDF files, Flash files rarely contain
information in text format. Search developers don't want to clutter up their
indexes with a million "Skip Intro" pages.
Submit your Site
There are a lot of automated search engine submission
services that you can use to submit your site to as many search engines as
possible. The one most recommended by people I talked to is Submit
It, an early player that did so well that Microsoft bought it — Submit It
is now part of MSN bCentral, and it charges a minimum fee of US$59 to keep a
few URLs submitted for a year.
You can avoid the fees by simply submitting to
individual search engines on your own. Start with UseIt's list
of top referrers — that's where most of the traffic you can get will
come from. And while you'd think submitting your site to one Inktomi-powered
site would work for all
of them, optimization experts have told us it works better if you hit them
all.
Don't Forget the Directories
Submit It does submit your site to the busiest
directory sites, except for the biggies: Yahoo, LookSmart (which MSN
serves under its logo), and the Open Directory Project (which powers Lycos,
Hotbot, and Netcenter categories). Some of these directories charge for
submission, but $400-500 total will get your most important pages into the
most trafficked places.
Yahoo still offers free submissions, except for
business categories, which cost $199. But even the fee doesn't guarantee
they'll accept your site, just that they'll decide on it within a week —
with free submissions, you don't even get the promise that they'll ever get
around to evaluating it, given the incredible volume of submissions.
Once you've submitted your pages, be ready to wait a
month, two, or three before they're crawled and indexed. It's frustrating, but
processing a billion Web pages takes time — at a nonstop rate of one hundred
per second, it would still take almost four months.
Make a Crawler Page
It isn't necessary to submit every page on your site
to the search engines. Just make sure they can find all the pages that matter
by hopping links from your front door. To do that, make a "crawler
page" that contains nothing but a link to every page you want search
engines to crawl. Use the page's TITLE info as the link text — this helps
improve your site score. For an example, check out Artloop's
crawler page.
Basically, the crawler page is a site map that lists
all the pages on your site — it may be a bit too big for humans to read
through, but it will be no problem for a search engine. Add an obscure link to
the crawler page on one of your site's top-level pages, using a small amount
of text. MSN used to use 1x1 images for this trick, but the Google geeks
warned us to avoid such obviously invisible tags. "Why not just label it
'site map?'" one asked. Search engine spiders will find it as soon as
they get to your site, and suck down all the pages they find on it.
Don't worry, the crawler page won't show up in search
results. It does get pulled into the search engine's index, but because it has
no text or tags to match a query, it isn't listed as a result. The pages it
links to, however, will appear because the search engine's spider found them
right after it visited the crawler page. Wired News, for example, uses
hierarchical sets of crawler
pages to make sure every story ever published is crawlable from the top of
the site.
For Artloop, we decided to break the crawler pages
down into 100KB pages or smaller, just to be careful — we wanted to prevent
search spiders from timing out or deciding the pages were too big to crawl.
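The crawler-page recipe above (a bare list of links using each page's TITLE as link text, split into chunks of roughly 100 KB or smaller) can be sketched as follows; the helper name and HTML shape are my own assumptions:

```python
# Build crawler pages: bare lists of links, chunked so no single page
# exceeds a size cap that might make a spider time out or give up.
def make_crawler_pages(entries, max_bytes=100_000):
    """entries: (title, url) pairs; returns a list of HTML page bodies."""
    pages, current, size = [], [], 0
    for title, url in entries:
        link = f'<a href="{url}">{title}</a><br>\n'  # TITLE as link text
        if current and size + len(link) > max_bytes:
            pages.append("".join(current))  # start a new chunk
            current, size = [], 0
        current.append(link)
        size += len(link)
    if current:
        pages.append("".join(current))
    return pages

pages = make_crawler_pages([("Artist Profile 3918", "/artists/profiles/3918.html"),
                            ("Artist Profile 3919", "/artists/profiles/3919.html")])
```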
Now there's a whole new generation of
search engines trying to find new ways to top Google's accuracy or optimize
the way that results are organized to make them easier to go through.
Some of them want to beat Google at
its own game of being the universal search engine; others simply want to be
specific research tools, increasing the depth while reducing the scope of a
search.
WiseNut,
an up-and-coming search engine launched in May, improves upon the relevance of
Google's search results through a context-sensitive ranking algorithm, which,
according to WiseNut, Google lacks.
Google ranks pages based on links
from other pages: the more links, the higher the page ranks in the results.
WiseNut's context-sensitive ranking algorithm examines a page's links and the
text on the page, compares the two, and puts the most relevant results first.
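The article only outlines the idea; a toy illustration (not WiseNut's actual algorithm, and with an arbitrary 50/50 weighting) of blending link popularity with on-page text relevance might look like:

```python
# Toy scoring sketch: blend a normalized link-popularity score with a
# simple on-page term-frequency score. The weighting is an assumption.
def text_relevance(query_terms, page_text):
    words = page_text.lower().split()
    if not words:
        return 0.0
    hits = sum(words.count(t.lower()) for t in query_terms)
    return hits / len(words)

def combined_score(query_terms, page_text, inbound_links, max_links):
    link_score = inbound_links / max_links if max_links else 0.0
    return 0.5 * link_score + 0.5 * text_relevance(query_terms, page_text)

# A page with few links but highly relevant text can outrank a more
# popular page whose text never mentions the query.
a = combined_score(["art"], "art art gallery", inbound_links=10, max_links=100)
b = combined_score(["art"], "sports scores today", inbound_links=60, max_links=100)
```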
WiseNut also groups Web pages from
the same site under one result listing, allowing more results to be shown on
the same page.
WiseNut claims to have the fastest
and most cost-effective search technology: the company says it can index 50
million pages a day using only 100 off-the-shelf servers, faster than any of
its competitors.
So far, its database has collected
800 million pages (compared to Google's 1 billion).
"Google's relevance created a
lot of popularity among users," said Yeogirl Yun, WiseNut's founder.
"But that doesn't mean there's no room for the next better search engine.
If you look at search engine history, the leadership has changed every two or
three years. First was Yahoo, then AltaVista, then Google."
Teoma
also has its eye on succeeding Google. It claims to return more relevant
search results based on the judgment of "peer sites."
Google's system is based on the
structure of the Web: Sites are ranked by popularity. The more hits and links
a page enjoys, the higher it is returned in a search.
Teoma pushes this principle a little
bit further by ranking pages according to how many links they have from other
sites relating to the query subject. More than a general popularity contest,
it measures a website's standing among its peers.
Teoma works by searching its database
for pages that match the search terms. The resulting pool is then organized
according to topics, and the engine determines the most popular sites that
deal with the same topic.
Teoma tackles Google's organization
problem by presenting search results in three different ways: Normal ranking,
which portrays the most "authoritative" sites; by topic; and by
experts' links, which are created by topic experts.
But what Teoma makes up for in
organization, it lacks in reach: Teoma's URL database still contains fewer
than 100 million pages.
Another way of increasing the
relevance of a search is by limiting its scope.
Toronto-based Lasoo
limits the scope of searches geographically, by asking users to select an area
on an electronic map using a circular "lasso."
Unlike Google, Lasoo
looks for information -- such as businesses and jobs -- only in the selected
geographic area.
Lasoo selects results
according to their physical proximity to the "epicenter" chosen by
the user. It doesn't follow administrative or political divisions. Results are
displayed on a map.
"We have a
high-speed map server technology that allows us to create detailed maps for
the entire world," says Peter Forth, Lasoo's chief technology officer.
"And we have a geographic search engine technology that sifts through a
database of over 30 million geo-coded businesses to find entries that match a
particular keyword and are in a specific geographic area."
According to Forth,
most other geographical search engines such as Yahoo are focused on the United
States or on major cities only, whereas Lasoo is worldwide.
Vivisimo,
a spin-off company from Carnegie Mellon University, is a meta search engine
that uses other search engines and classifies the results.
Vivisimo
categorizes summaries that are produced by other search engines and then
groups the pages according to terms that the algorithm deems descriptive.
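The article doesn't disclose Vivisimo's algorithm; a crude stand-in for the general idea, grouping result summaries under a shared descriptive term, could look like:

```python
# Crude clustering sketch (assumed, not Vivisimo's method): label each
# result summary with the globally most frequent non-trivial word it
# contains, so summaries sharing a dominant term fall into one group.
from collections import Counter, defaultdict

STOPWORDS = {"the", "a", "an", "of", "in", "and", "for", "to"}

def cluster_summaries(summaries):
    counts = Counter()
    for s in summaries:
        counts.update(w for w in s.lower().split() if w not in STOPWORDS)
    clusters = defaultdict(list)
    for s in summaries:
        candidates = [w for w in s.lower().split() if w not in STOPWORDS]
        if not candidates:
            clusters["misc"].append(s)
            continue
        label = max(candidates, key=lambda w: counts[w])
        clusters[label].append(s)
    return dict(clusters)

groups = cluster_summaries([
    "jaguar cars for sale",
    "jaguar dealership reviews",
    "jaguar habitat in south america",
])
```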
Users can pick among
several search engines, including Google, AltaVista and Hotbot.
"Information
clustering is a very old problem in computing," says Raúl Pérez Valdés,
president of Vivisimo. "We have invented an algorithm that optimizes
group formation in such a way that makes groups easier to describe."
The company plans to
sell its cataloging technology to other search engines and corporate websites
but it is already getting a lot of attention from end users: Traffic has
increased by 43 percent per month.
This shows which search engines have been used
recently to access the Trinity server. It also shows which words people used
in the searches that sent them to a Trinity page.
Larry
Reply announcing another search engine
Hello Dr. Jensen,
My name is Kris Burke from Slider search engine.
I was wondering if you would please consider adding
our search engine to your search engine links?
We have a directory of over 2.5 million websites, ftp
search, whole web search and a free encyclopedia. We will soon have a
shareware section and many more features and content :)
What has the "Jennifer Lopez" search phrase got that the phrase
"Bob Jensen" is lacking?
Don't answer that!
"Getting an Answer Is One Thing,
Learning Is Another," by Peter Coffee, eWeek, July 29, 2002
In the process of
attempting to inform people via IT, it's ironic that we may be misinforming or
disinforming them more than ever before. We're helping people find the most
popular sources of what's often inaccurate or misleading data; we're answering
people's questions, instead of questioning their implied assumptions. We're
applying the ever-more-impressive technologies of Internet search and
context-sensitive help toward counterproductive ends.
What got me thinking
along these lines was an incident last week, when someone asked me how a
computer actually stores pictures and sounds. I handed him a book on PCs (the
one that I wrote in 1998, as it happens) and told him that the answers were in
Chapter 8. In fact, that was the entire subject of that chapter. But he came
back to me a few minutes later, frustrated, saying: "I don't understand
why you said I should read this whole chapter. I just had one little
question." (Honestly, it was less than 30 pages, with plenty of white
space.)
I felt as if I were
seeing, in that one brief exchange, the combined success and failure of our
efforts over the years to devise interactive tools that answer the question
the user is asking—and nothing more than that. Microsoft, with its Office
Assistant and "Semantic Web" research efforts, is arguably the
leader in giving people what they seem to think they want in this regard, but
many others have also pursued these goals. Until now, I have thought that this
was a completely good thing, but I'm starting to have my doubts.
The problem, I'm
starting to suspect, is that people may have learned to resist the idea of
absorbing a foundation of information before they start accumulating details.
We've thrown so much complexity at people, during the past 20 years or so,
that users have had to develop a defense mechanism: "Just tell me what I
need to know!" But when we do this, we wind up with people who are merely
following recipes that might as well be magic spells.
People used to have a
chance to learn fundamentals, and maybe even see opportunities to do things in
fundamentally different ways, before they were forced to buy in to the
existing way of doing things, before they felt in danger of being hopelessly
overtaken by minutiae. But look, for example, at the way we've changed our
approach to the task of teaching people to write. We used to start children
off with simple tools that did no more than they needed: When they were first
learning to form letters, we gave them pencils. When they were ready for words
and sentences, we gave them typewriters. When they were ready to start
rereading and rewriting their own work, we gave them text editors.
Now, we're giving
grade-school children desktop publishing tools, whose use exposes choices that
they don't understand—and involves answers to questions that the kids have
no idea of how, or why, to ask.
The problem also
strikes in the opposite direction: Sometimes, it's not a question of knowing
too little, but rather of "knowing" too much. If you only answer the
question that a person chooses to ask, you give up any opportunity to
influence the assumptions and beliefs behind that question.
For example, if
someone asked you how to stop excessive bleeding from skin punctures, you
could answer that question—and that person could happily go back to treating
patients by bleeding them with leeches, now that you had "solved"
his "problem." Would his patients appreciate your help?
We get angry when
someone seems to be condescending to us by asking, "Are you sure that's
really the question?" But we're done no favor when a tightly focused
answer helps us keep doing the same irrelevant things, and lets us continue
making "progress" in the wrong direction—instead of getting us out
of our rut.
It's ironic that the
vast worldwide knowledge base of the Web is actually helping us stay stupid
and uninformed, merely because we can now find the answer that we don't know
better than to want—instead of finding the easiest portal to knowledge
through a door marked, "Let's begin at the beginning." There's the
challenge: to build distance learning systems, knowledge-base search tools,
and interactive help technologies that can help us find the trunk, and even
the roots, as well as the leaves of the tree of knowledge.
The Web is made up of hundreds of billions of Web
documents -- far more than the 8 billion to 20 billion claimed by Google or
Yahoo. But most of these Web pages are largely unreachable by most search
engines because they are stored in databases that cannot be accessed by Web
crawlers.
Now a San Mateo start-up called Glenbrook Networks
says it has devised a way to tunnel far into the ``deep web'' and extract
this previously inaccessible information.
Glenbrook, run by a father-daughter team,
demonstrated its technology by building a search engine that scoops up job
listings from the databases of various Web sites, something the company
claims most search engines cannot do. But there are myriad other
applications as well, the founders say.
``Most of the information out there, people want
you to see,'' said Julia Komissarchik, Glenbrook Networks' vice president of
products. ``But it's not designed to be accessed by a machine like a search
engine. It requires human intervention.''
This is particularly true of Web pages that are
stored in databases. Many ordinary Web pages are static files that exist
permanently on a server somewhere. But an untold number of pages do not
exist until the very moment an individual fills out a form on a Web site and
asks for the information. Online dictionaries, travel sites, library
catalogs and medical databases are a few such examples.
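The "human intervention" being described is a form submission: the page does not exist until the form's fields are POSTed. A minimal sketch of doing that programmatically, using only the Python standard library (the URL and field names below are placeholders, not a real service):

```python
# Fetch a dynamically generated "deep web" page by submitting a form
# the way a browser would. The endpoint and field names are invented.
from urllib.parse import urlencode
from urllib.request import Request, urlopen

def fetch_form_result(url, fields):
    """POST the form fields and return the generated page as text."""
    data = urlencode(fields).encode("ascii")
    req = Request(url, data=data)  # supplying data makes this a POST
    with urlopen(req) as resp:
        return resp.read().decode("utf-8", errors="replace")

# Hypothetical job-listing search form:
# page = fetch_form_result("https://example.com/jobs/search",
#                          {"keyword": "engineer", "location": "San Mateo"})
```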
Search the deep (password-protected) Web
Yahoo said it had begun testing a service that lets
users search information on password-protected subscription sites such as
LexisNexis, known as the "deep Web." The move comes as Yahoo (YHOO), Google (GOOG)
and Ask Jeeves (ASKJ) rush to give web searchers access to ever more information
-- from books, blogs and scholarly journals to news, products, images and video.
The service, called Yahoo Search
Subscriptions, allows users to search multiple online subscription content
sources and the web from a single search box. Users can see content from the
sites they subscribe to, while nonsubscribers have the option of paying to see
it. Content providers, for their part, get access to the vast audience of web
search users.
"Surfing the Deep Web," Wired News, June 16, 2005 ---
http://www.wired.com/news/business/0,1367,67883,00.html?tw=wn_tophead_7
The next generation of Web search engines will do
more than give you a longer list of search results. They will disrupt the
information economy.
When Yahoo announced its Content Acquisition Program
on March 2, press coverage zeroed in on its controversial paid inclusion
program, whereby customers can pony up in exchange for enhanced search
coverage and a vaunted "trusted feed" status. But lost amid the
inevitable search-wars storyline was another, more intriguing development: the
unlocking of the deep Web.
Those of us who place our faith in the Googlebot may
be surprised to learn that the big search engines crawl less than 1 percent of
the known Web. Beneath the surface layer of company sites, blogs and porn lies
another, hidden Web. The "deep Web" is the great lode of databases,
flight schedules, library catalogs, classified ads, patent filings, genetic
research data and another 90-odd terabytes of data that never find their way
onto a typical search results page.
Today, the deep Web remains invisible except when we
engage in a focused transaction: searching a catalog, booking a flight,
looking for a job. That's about to change. In addition to Yahoo, outfits like
Google and IBM, along with a raft of startups, are developing new approaches
for trawling the deep Web. And while their solutions differ, they are all
pursuing the same goal: to expand the reach of search engines into our
cultural, economic and civic lives.