This post was updated in February 2007.
Some of you are probably aware of spiders. They are these little programs that surf the internet looking for data. Some spiders assist search engines in helping you find the web page you are looking for. Those are the good spiders. There also exists evil spiders. They jump from web page to web page looking for email addresses. Once they find one, they send it to a database so someone can send you junk email. Not cool.
Hiding In Plain Sight
What we need is a way to display an email address so the reader of a web page can communicate with the web site, yet we also need to hide the address from the spider. The reader and the spider are looking at the same web page but at differently levels. The reader is looking at the browser’s rendering of HTML. The spider is looking at raw HTML. Three ideas come to mind: ASCII codes, server-side mail forms and images. ASCII codes and images will look like email addresses on the screen, but nothing like an email address in the source code of the HTML document.
Method 1: ASCII
In HTML when you place “&#” in front of the ASCII code of a character the browser will write the character not the ASCII code to the screen. And because this article is being viewed by a browser, the code shots are images. The download will have the source in a text format.
The function below accepts an email address as a parameter and returns a masked email address that is made up of ASCII codes. When the browser writes the codes to the screen it will get converted back to text. Although it’s possible for a spider to read and convert ASCII codes inside the HTML source, it’s probably not that prevalent. The function goes character by character converting the email address. The last step is to merge the masked email address with the HTML mailto: tag. In order to minimize the chances a clever spider might look for the mailto:, this example masks that word as well.
Function maskEmail(email) For j= 1 to Len(email) maskEmail = maskEmail & "&#" & asc(Mid(email,j,1)) & ";" Next maskEmail = "<a href=""m&#" & asc("a") & ";&#" & asc("i") & ";&#" &_ asc("l") & ";&#" & asc("t") & ";o:" & maskEmail & """>" & maskEmail & "</a>" End Function
Then you can call that VBScript function from inside an ASP page.
<p>For more information email me at <%= maskEmail("someEmail@someDomain.net") %>.</p>
Method 2: Server-side Mail Forms
These are the contact forms you see everywhere these days. The user fills out a form, clicks submit, and hopes it gets to somebody. This is great for the recipient, because their email address never appears on the site. However, some users don’t trust filling out a form and will withhold feedback. Recently I was trying to open a new account with eFax.com. Their order entry page was down so I filled out a form alerting them to the problem. After detailing the problem the form rejected because I didn’t already have an account with them. They didn’t have a direct email address listed anywhere else, so I became a customer of one of their competitors.
Method 3: Images
I believe the ASCII method will work for a little while. Eventually as that trick becomes more popular, the spiders will get smarter. Who knows maybe the developer that writes the code for those evil little email spiders is reading this article right now. If we created an image with text of our email, the spiders would be truly be defeated. However creating an image for every email address in PhotoShop would be a hassle. And then what happens when we change fonts when the site gets redesigned? Creating the email images by hand isn’t an option. We need to automate.
For this version I extended the ImageToText code sample at which used to reside on yyyZ.net.