20 January 2009

How Do You Find Me?

When you hit Google's cache of a page, it conveniently highlights all of your search terms. Some web sites have this neat trick of highlighting your search terms when you've come from a search, even though you're not in Google any more. The "referer" (sic) HTTP header gives them the necessary information. Thanks to the amazing technology presented in this article, you, too, can do the same thing and dazzle your visitors.
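Once you have the search term in hand, the highlighting itself can be sketched roughly as follows. This is only an illustration, not part of the helper class in this post: the class name SearchTermHighlighter and the choice of <strong> tags are made up for the example, and it assumes the text you pass in is plain text that has already been HTML-escaped.

```java
import java.util.regex.Pattern;

class SearchTermHighlighter {
  // Wraps every case-insensitive occurrence of each search word in <strong> tags.
  // Assumption: text is already HTML-escaped plain text, not markup.
  static String highlight(String text, String searchTerm) {
    if (text == null || searchTerm == null || searchTerm.trim().isEmpty()) {
      return text;
    }
    // Build one alternation out of the individual words, quoting each word
    // so that characters like '+' or '.' are matched literally.
    StringBuilder alternation = new StringBuilder();
    for (String word : searchTerm.trim().split("\\s+")) {
      if (alternation.length() > 0) {
        alternation.append('|');
      }
      alternation.append(Pattern.quote(word));
    }
    Pattern p = Pattern.compile("(" + alternation + ")", Pattern.CASE_INSENSITIVE);
    // $1 re-inserts the matched text, preserving its original capitalisation.
    return p.matcher(text).replaceAll("<strong>$1</strong>");
  }
}
```

Note that this matches substrings, so a search for "icon" would also light up the first four letters of "iconic". Add word boundaries to the pattern if that bothers you.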

On top of that, if you're not using Google Analytics, you will surely want to know what search terms people are using to get to your site. Why not Google Analytics? Because They Know Too Much Already!

Here's a lump of Java code you can stick into your web app for extracting Google search term information. There are three important methods:

public String getSearchTerm(HttpServletRequest req) - return the URL-decoded search term. Given a referrer http://google.com/search?q=awesome+icon+editor, return "awesome icon editor". This String is what you would use for highlighting content on your page to give visitors the creepy feeling that you know what they're thinking.

public boolean isSearching(HttpServletRequest req) - true if getSearchTerm returns a non-empty String

public static Object[] google(String referrer) - returns an array of length 2; google()[0] is the same as getSearchTerm; google()[1] is the zero-based search results page number. This tells you how many times your user clicked "next" on Google's search results before they got to your site. This is useful information - it tells you how desperately your user wants your app, and how irrelevant Google considers your site. Yes, the truth hurts, but you need to know before you can do anything about it. I'm sorry. The page number comes from the "start" parameter divided by 10, which assumes Google's default of 10 results per page. So, with this referrer string http://google.com/search?q=awesome+icon+editor&start=80, the page number would be 8.
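To make the "q" and "start" arithmetic concrete, here is a minimal standalone sketch that pulls both parameters out of a referrer using only the JDK. It is not the class presented in this post: the name ReferrerQuery and its two methods are invented for illustration, it does not check that the referrer actually belongs to Google, and it assumes 10 results per page.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLDecoder;

class ReferrerQuery {
  // Returns the URL-decoded value of the named query parameter,
  // or null if the referrer cannot be parsed or has no such parameter.
  static String param(String referrer, String name) {
    String query;
    try {
      query = new URI(referrer).getRawQuery();
    } catch (URISyntaxException e) {
      return null;
    }
    if (query == null) {
      return null;
    }
    for (String pair : query.split("&")) {
      int eq = pair.indexOf('=');
      if (eq > 0 && pair.substring(0, eq).equals(name)) {
        try {
          return URLDecoder.decode(pair.substring(eq + 1), "UTF-8");
        } catch (UnsupportedEncodingException e) {
          throw new IllegalStateException(e); // UTF-8 is always available
        }
      }
    }
    return null;
  }

  // Zero-based results page: "start" divided by an assumed 10 results per page.
  static int pageNumber(String referrer) {
    String start = param(referrer, "start");
    if (start == null) {
      return 0;
    }
    try {
      return Integer.parseInt(start) / 10;
    } catch (NumberFormatException e) {
      return 0; // a non-numeric "start" is treated as the first page
    }
  }
}
```

Because each parameter is looked up by name, this does not care whether "start" comes before or after "q" in the query string.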

If you're using FreeMarker, you can bind an instance of this class to a global variable, so unless you abhor globalisation you will know directly within your template whether you're being googled.

So, the code. Free for private and commercial use. Don't be afraid to link back here. Enjoy.

ExternalSearchHelper.java

/*
  copyright conan dalton 2009, license http://creativecommons.org/licenses/by-sa/3.0/ 
*/
import org.apache.commons.lang.StringUtils;

import javax.servlet.http.HttpServletRequest;
import java.net.URLDecoder;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExternalSearchHelper {
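  // "https?" accepts both plain and TLS referrers
  static final Pattern itsGoogle = Pattern.compile("https?://[^/]*google[^/]+/.*[&\\?]q=([^&]+).*");
  // note: only matches when "start" appears after "q" in the query string
  static final Pattern itsGoogle2 = Pattern.compile("https?://[^/]*google[^/]+/.*[&\\?]q=([^&]+).*&start=([^&]+).*");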

  public String getSearchTerm(HttpServletRequest req) {
    String referrer = req.getHeader("Referer");
    if (referrer == null || referrer.length() == 0) {
      return "";
    }

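    // google() leaves slot 0 null when the referrer is not a Google search
    Object term = google(referrer)[0];
    return term == null ? "" : (String) term;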
  }

  public boolean isSearching(HttpServletRequest req) {
    // re-implement this if you don't want to depend on apache-commons
    return StringUtils.isNotBlank(getSearchTerm(req)); 
  }

  public static Object[] google(String referrer) {
    Object[] result = new Object[2];
    if (referrer == null || referrer.length() == 0) {
      return result;
    }

    Matcher m2 = itsGoogle2.matcher(referrer);
    if (m2.matches()) {
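      result[0] = decode(m2.group(1));
      try {
        // assumes Google's default of 10 results per page
        result[1] = Integer.valueOf(Integer.parseInt(m2.group(2)) / 10);
      } catch (NumberFormatException e) {
        result[1] = Integer.valueOf(0); // non-numeric "start": treat as first page
      }
      return result;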
    }

    Matcher m = itsGoogle.matcher(referrer);
    if (m.matches()) {
      result[0] = decode(m.group(1));
      result[1] = 0;
    }

    return result;
  }

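  private static String decode(String s) {
    try {
      // the one-argument URLDecoder.decode(String) is deprecated; name the charset
      return URLDecoder.decode(s, "UTF-8");
    } catch (java.io.UnsupportedEncodingException e) {
      throw new IllegalStateException(e); // UTF-8 is always available
    }
  }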
}
