eBlue, Sacra Blue Online Magazine
Jun 2003 — Issue 252
HardCopy

Edited by
Sacra Blue Staff




Google Hacks

Review by Brian Smither

If you don’t know the word "Google," you haven’t been using computers very long. "Google" has been misappropriated and verbified as badly as "Xerox" – much to the company’s dismay. They can’t complain too much, though, as such is perhaps the greatest form of respect.

Google Hacks is currently the most comprehensive source of "how to" information on the world’s best tool for searching the web. The book presents 100 things to do with Google, Google’s programming hooks, and third-party Web sites that will glean and compact info returned from search results.

Chapter 1 goes through, in intimate detail, the page of results you get when you submit search terms. There’s a lot more than is apparent to the eye. This section also goes into deep detail on how extensively you can formulate your search terms: not only how you specify the words to search for, their grouping, and their order, but also how you restrict the search to words that appear only in page titles, web addresses, the links on those pages, etc. (Here’s one I didn’t know: an "*" is a wildcard – "Sacramento * Users Group" will get PC, Amiga, Macintosh, Linux, and any other name in that form.)
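
For readers who like to see such things spelled out, here is a rough Python sketch of how restricted search terms of this kind might be assembled. It is my own illustration, not code from the book. The function names and the sample word "newsletter" are made up, and it assumes the familiar intitle:, inurl:, and inanchor: prefixes for titles, web addresses, and link text. The resulting address is meant to be pasted into a browser by hand, not fetched by a program.

```python
from urllib.parse import urlencode


def build_query(phrase=None, title_words=(), url_words=(), anchor_words=()):
    """Assemble a Google-style query string (hypothetical helper)."""
    parts = []
    if phrase:
        # Quotes ask for an exact phrase; a "*" inside stands in for any word.
        parts.append(f'"{phrase}"')
    # Each prefix restricts a word to one part of the page.
    parts += [f"intitle:{word}" for word in title_words]
    parts += [f"inurl:{word}" for word in url_words]
    parts += [f"inanchor:{word}" for word in anchor_words]
    return " ".join(parts)


def search_url(query):
    """Turn the query into an address you could paste into a browser."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_query(phrase="Sacramento * Users Group",
                    title_words=["newsletter"])
print(query)
print(search_url(query))
```

The first line printed is the text you would type into the search box; the second is the same search encoded as a web address.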

Discussion also goes into the power the Google people have put into your hands through what is called their "application programming interface" (API). The Google API allows programmers to write utilities that directly access Google’s databases. (And that is the only way Google allows the automated retrieval of search results. Getting caught doing it any other way will cause your IP address, and every other address that shares its first three numbers, to be locked out for an uncomfortably long time.)
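
To make the "first three numbers" remark concrete, here is a small Python sketch using the standard ipaddress module. It assumes the lockout covers what network people call a /24 block, meaning every IPv4 address whose first three numbers match. That reading is my interpretation, not something Google documents here. The function name is hypothetical and the addresses come from ranges reserved for examples.

```python
import ipaddress


def shares_first_three_numbers(a, b):
    """True when two IPv4 addresses fall in the same /24 block."""
    # strict=False lets us pass a host address and get its enclosing block.
    block = ipaddress.ip_network(f"{a}/24", strict=False)
    return ipaddress.ip_address(b) in block


# Same first three numbers: both would be caught by the same lockout.
print(shares_first_three_numbers("192.0.2.17", "192.0.2.200"))   # True
# Different third number: a different block.
print(shares_first_three_numbers("192.0.2.17", "198.51.100.17"))  # False
```

A block of that size holds 256 addresses, so one careless script could inconvenience a fair number of neighbors.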

The Google people have also created a toolbar that computer users can attach to the Internet Explorer browser. The chapter goes into how cool the Google toolbar is, and some of the things you can do with it that can’t be done easily using Google’s Advanced Search.

Chapter 2 takes what was explained about search results and applies it to the other databases Google maintains: newsgroups, images, catalogs, news, categorical directories, the Froogle shopping service, the reverse phonebook, and a peek at what’s cooking in the Google Labs.

Chapter 3 explores what other people have done to enhance what you do when searching. It’s kind of a mish-mash of special-results searches that intend to amuse and educate.

Chapter 4 introduces us to the concept of "scraping." While the Google API is set up to allow us direct access into the database, it currently does not supply everything that regular HTML-based search result pages provide. Hence, there are programs that will scan through said pages, extract the relevant information, and dump it all into a spreadsheet. Why? Why create spreadsheets, databases, and lists in the first place? To organize the data. Here’s the thing... Even lists of sources and peripheral data can be as valuable as the actual data itself, or perhaps even more so.
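
The general idea of scraping can be sketched in a few lines of Python. This is only an illustration of the technique, not one of the book’s programs. The sample page is invented markup that I saved into a string, not Google’s real result format, and the class and function names are hypothetical. It reads the page, pulls out each link and its text, and writes them as comma-separated rows that a spreadsheet can open.

```python
import csv
import io
from html.parser import HTMLParser

# Invented stand-in for a saved results page (not Google's actual markup).
SAMPLE_PAGE = """
<div class="result"><a href="http://example.com/a">First result</a></div>
<div class="result"><a href="http://example.org/b">Second result</a></div>
"""


class LinkCollector(HTMLParser):
    """Collect (address, link text) pairs from every <a href=...> tag."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link we are tracking.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.rows.append((self._href, "".join(self._text).strip()))
            self._href = None


def page_to_csv(html):
    """Return the page's links as CSV text with a header row."""
    collector = LinkCollector()
    collector.feed(html)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["url", "title"])
    writer.writerows(collector.rows)
    return out.getvalue()


print(page_to_csv(SAMPLE_PAGE))
```

Keep in mind the earlier warning: this sketch works on a page already in hand, and automated retrieval of Google’s result pages is exactly what gets an address locked out.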

Chapters 5 and 6 delve even deeper into the Google API and give dozens of examples of how to write your own programs, in any of several programming languages, for a dozen or so purposes.

Chapter 7 is included to provide a balance to the seriousness and power that is Google. Games can be played, and results can be sought for purposes other than those Google was created to serve. One popular pastime is "Google Whacking": try to run a two-word search that returns exactly one result. There are rules, and the effort is harder than it sounds.

Chapter 8 returns to the serious side of things and discusses what Webmasters can do to get their site ranked higher, what they should do to make their site easier for Google to index, and what they must not do if they want to keep Google from punishing their site by dropping its ranking into the abyss.

There have been, and continue to be, schemes hatched that attempt to exploit Google’s rank determination algorithm unfairly. The programmers at Google are some of the most ingenious people in data indexing and retrieval. They continue to tweak the system, manage more disparate kinds of data, and develop means of delivering a sorted list of results that has true meaning and relevance.

Google Hacks
Tara Calishain & Rael Dornfest
2003 O’Reilly & Associates
$24.95, 330 pages
ISBN: 0-596-00447-8

This page prepared by:

Brian Smither

Copyright © 2003 Sacramento PC Users Group, Inc. All rights reserved.
Read our disclaimer and copyright page for more information.