Using AspHTTP to Scrape Data in Classic ASP

The article was written in 2001.

I recently discovered a cool ASP component from Server Objects called AspHTTP. This component lets an ASP page GET documents over HTTP. It can also POST data to a remote web page. Why would you need this in an ASP page? One common need is to parse data off a web page and present it in your own format. The AspHTTP component lets you pull the remote web page into your code as a string. From there you can use string functions to locate and extract the data you want.

Copyright Issues

Copyright law is outside the scope of this article. Just be aware that snagging someone else’s data may make their legal department unhappy. If you choose to use someone else’s data, getting their permission first is a wise decision. Use this tool for good, not for evil.

Sample Code

<%
Dim HttpObj, strResult, intLength

Set HttpObj = Server.CreateObject("AspHTTP.Conn")
HttpObj.Url = "http://www.example.com/" ' placeholder: enter the URL to fetch here

' fetch the HTML page into the strResult variable
strResult = HttpObj.GetURL

' check for component error
If Len(HttpObj.Error) > 0 Then
  Response.Write "ERROR: " & HttpObj.Error
Else
  ' did we retrieve a document?
  intLength = Len(strResult)
  If intLength > 0 Then
    ' proceed with scrape
  End If
End If

Set HttpObj = Nothing ' release the component
%>
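
What goes in the "proceed with scrape" step depends entirely on the page you are reading, but it usually comes down to finding a start marker and an end marker and taking what sits between them. Here is a minimal sketch of that idea using only the built-in VBScript functions InStr and Mid. ExtractBetween is a hypothetical helper of my own, not part of AspHTTP, and the sample string and the <title> markers are assumptions for illustration. In real use you would pass in strResult and whatever markers surround your data.

<%
' Hypothetical helper (not part of AspHTTP): returns the text between the
' first occurrence of strStart and the next occurrence of strEnd.
' Returns an empty string if either marker is not found.
Function ExtractBetween(strSource, strStart, strEnd)
  Dim intStart, intEnd
  ExtractBetween = ""

  ' find the start marker (case-insensitive)
  intStart = InStr(1, strSource, strStart, vbTextCompare)
  If intStart = 0 Then Exit Function

  ' move past the start marker to the first character of the data
  intStart = intStart + Len(strStart)

  ' find the end marker, searching only after the start marker
  intEnd = InStr(intStart, strSource, strEnd, vbTextCompare)
  If intEnd = 0 Then Exit Function

  ExtractBetween = Mid(strSource, intStart, intEnd - intStart)
End Function

' Assumed sample data standing in for a fetched page
Dim strSample, strTitle
strSample = "<html><head><title>Sample Page</title></head></html>"
strTitle = ExtractBetween(strSample, "<title>", "</title>")

' encode before writing, since scraped text may contain markup
Response.Write Server.HTMLEncode(strTitle)
%>

With the sample string above, strTitle ends up holding "Sample Page". Returning an empty string when a marker is missing lets the calling code notice that the remote markup has changed, rather than fail with a runtime error.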

Last Words

Data scraping can be an inexact science. If the web site changes its markup, you can find yourself recoding your scrape. Look for an API before resorting to a data scrape.
