Digital Colony!

Classic ASP Search Engine Friendly Redirects

A while back I posted how to code a search engine friendly redirect in ASP.NET. Recently I've had to go old school and perform this task in Classic ASP. Here is the VBScript version of a search engine friendly redirect.
<% 
   Response.Status="301 Moved Permanently" 
   Response.AddHeader "Location","http://example.com/newpage/"
   Response.End
%>


 

Comment Out ASP.NET Control

Just like HTML has a way to comment out markup from being displayed by the browser, so does ASP.NET. Below is an example of a commented out ASP.NET tag.
<%-- <asp:Label ID="lblGreeting" runat="server" Text="Hello!"/>--%>


 

Fix: RSS Feed is Encoded as ASCII not UTF-8

If you are using an ASP.NET page to generate your RSS feed, you may run into this validation error.

Your feed appears to be encoded as "utf-8", but your server is reporting "US-ASCII".

Set the ResponseEncoding property to utf-8. Here is an example of its use in the Page directive.
<%@ Page Language="C#" ResponseEncoding="utf-8" %>
Use FeedValidator.org to test your RSS and Atom feeds.


 

Capitalize the First Letter of a Word in C#

I noticed that a few of my labels in the right column were in lower-case. Since I wrote my own label control, I decided to add a line of code to make sure the first letter in each tag was capitalized.
string labelName = "coffee"; // sample value for illustration; the control supplies the real tag text
labelName = char.ToUpper(labelName[0]) + labelName.Substring(1);
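The one-liner above assumes the label is not empty, since labelName[0] throws on an empty string. Here is a minimal sketch of the same idea wrapped in a helper that leaves null and empty strings alone. The class and method names (LabelText, CapitalizeFirst) are made up for this example and are not part of my label control.

```csharp
using System;

public static class LabelText
{
    // Hypothetical helper: upper-cases the first character and leaves the rest untouched.
    public static string CapitalizeFirst(string text)
    {
        // Nothing to capitalize in a null or empty string, so hand it back unchanged.
        if (string.IsNullOrEmpty(text))
        {
            return text;
        }
        return char.ToUpper(text[0]) + text.Substring(1);
    }
}
```

Calling LabelText.CapitalizeFirst("coffee") gives "Coffee".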


 

Using Recursion To Return a List of all Files on a Web Site

When I decided to build a sitemap for INeedCoffee, I was able to pull a list of URLs from the database. One simple query provided me with all the URLs on the web site. But what if you don't have a database table holding all the URLs for your site? Maybe the only way to get a full list of URLs is to go through each folder on the site and make a list. Sounds like a job for code.

Recursion To the Rescue

The job we want the code to perform is to start in the root folder and build a list of files with the .ASPX extension. In this example .ASPX files are the content files. You might add .HTML or .ASP files depending on how you set up your web site. Once it has a list of files for the root folder, the code goes inside each subfolder, adds to the list of files and repeats the process until it has exhausted the entire tree structure of your web site.

The nist.gov site defines recursion as:
An algorithmic technique where a function, in order to accomplish a task, calls itself with some part of the task.

The Code

The following code when executed will build a list of every .ASPX file on the web server. Note that I add an underscore to the beginning of files that I don't wish to include on the list of indexed URLs.
// Requires: using System.Collections; using System.IO;
public ArrayList urlList = new ArrayList();

public void ScanWebsiteForFiles(DirectoryInfo directory)
{           
   // look for .ASPX files in current folder
   foreach (FileInfo file in directory.GetFiles("*.aspx"))
   {
       if (!file.Name.StartsWith("_"))
       {
           // FullName is the physical path on disk, not yet a web address
           string thisURL = file.FullName;
           urlList.Add(thisURL.ToLower());
       }
   }

   DirectoryInfo[] subDirectories = directory.GetDirectories();
   foreach (DirectoryInfo subDirectory in subDirectories)
   {               
       ScanWebsiteForFiles(subDirectory);
   }
}
In order to run this code, pass it the root directory of your web site.
string dirName = HttpContext.Current.Server.MapPath("~/");
DirectoryInfo rootDirectory = new DirectoryInfo(dirName);
ScanWebsiteForFiles(rootDirectory);
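One thing to keep in mind: FileInfo.FullName returns the physical path on disk (something like c:\inetpub\site\articles\brew.aspx), so each entry still has to be turned into a web address before it goes into a sitemap. Here is a rough sketch of that conversion. The helper name ToSiteUrl, its parameters and the sample paths are assumptions for illustration. It also assumes rootPath ends with a backslash, which is what MapPath("~/") normally returns.

```csharp
using System;

public static class UrlHelper
{
    // Hypothetical helper: maps a physical file path under the site root to an absolute URL.
    public static string ToSiteUrl(string rootPath, string filePath, string baseUrl)
    {
        // Only files inside the site root can be mapped to a URL.
        if (!filePath.StartsWith(rootPath, StringComparison.OrdinalIgnoreCase))
        {
            throw new ArgumentException("File is not under the site root.", "filePath");
        }

        // Strip the root folder and switch to forward slashes.
        string relative = filePath.Substring(rootPath.Length).Replace('\\', '/');
        return baseUrl.TrimEnd('/') + "/" + relative.TrimStart('/');
    }
}
```

For example, ToSiteUrl(@"c:\site\", @"c:\site\articles\brew.aspx", "http://example.com") gives http://example.com/articles/brew.aspx.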


 

Creating an HttpHandler to Build a Search Engine Site Map

My previous post Build a Search Engine SiteMap in C# covered how to create a sitemap.xml file using the File System. It also provided guidelines on how to go about validating the sitemap as well as submitting it to the major search engines. If you came here for background on search engine sitemaps, go read that post first. If all you care about is the HttpHandler, you may proceed.

An Overview of the HttpHandler

Scenario: There is a page you wish to generate dynamically with code, but you want it to use a file extension that isn't normally dynamic, such as .xml. RSS feeds and sitemaps are two examples of XML files that are ideal for an HttpHandler. There are other uses for HttpHandlers, but in this post we are only interested in creating a dynamic sitemap.

Here is a brief overview of how the HttpHandler will work. A request will be made for sitemap.xml, probably by a search engine like Google. Instead of looking for it on the file system, your web application will intercept that request and pass it off to a class which will generate and deliver an XML document.

Step 1: Create SitemapHandler.cs

Inside the App_Code folder create SitemapHandler.cs. This class will implement the IHttpHandler interface. Before we deliver an XML document, let's create a simple HTML test to make sure the HttpHandler is working.
using System.Web;

namespace HttpExtensions
{
    public class SitemapHandler : IHttpHandler
    {
        public SitemapHandler()
        { }

        #region IHttpHandler Members

        public bool IsReusable
        {
            get { return true; }
        }

        public void ProcessRequest(HttpContext context)
        {
            HttpResponse response = context.Response;
            response.Write("<html><body><h1>HTTP Handler is Working!</h1></body></html>");
        }
        
        #endregion
    }
}

Update the Web.Config

Add the following section inside system.web. The sitemap.aspx line is for debugging purposes only and will be removed once everything is working.
<httpHandlers>
<add verb="*" path="sitemap.xml" type="HttpExtensions.SitemapHandler"/>
<add verb="*" path="sitemap.aspx" type="HttpExtensions.SitemapHandler"/>
</httpHandlers>
From Visual Studio 2005, test the site. Now type sitemap.xml into the path of the URL. You should see your HTTP Handler is Working! message. And if you type in sitemap.aspx, you should also see the message. As long as you view your site through Visual Studio 2005, your handler will work fine in both cases. Once you hand that job back to IIS, you'll need to do one more step.

Map *.XML to the aspnet_isapi.dll

If you run your code without doing this step, you will see your HTTP Handler is Working! message only on the sitemap.aspx request. Any request to sitemap.xml will return a 404 Page Not Found error. Instead of the HTTP Handler intercepting the request, IIS sees that the file extension is not a .NET file extension so it takes command. It looks on the file server and doesn't see a sitemap.xml and returns the 404.

The solution is to map the *.xml file extension to the ASP.NET DLL (aspnet_isapi.dll). Once this is done and the server is restarted, the handler should work. For more information on how to do the IIS mapping go to Protecting Files with ASP.NET and scroll down to Protecting .mdb Files. Substitute .xml for .mdb when following those directions.

Back to the SitemapHandler

Now that we have a working HttpHandler, we can go back and replace the HTTP Handler is Working! message with a real sitemap XML file. I've commented out the loop to add pages. Here is where you would add the code to pull the urls from some data store, be it a component or database.
public void ProcessRequest(HttpContext context)
{
   HttpResponse response = context.Response;
   response.ContentType = "text/xml";       
   using (TextWriter textWriter = new StreamWriter(response.OutputStream, System.Text.Encoding.UTF8))
   {
       XmlTextWriter writer = new XmlTextWriter(textWriter);
       writer.Formatting = Formatting.Indented;
       writer.WriteStartDocument();
       writer.WriteStartElement("urlset");
       writer.WriteAttributeString("xmlns:xsi", "http://www.w3.org/2001/XMLSchema-instance");
       writer.WriteAttributeString("xsi:schemaLocation", "http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd");
       writer.WriteAttributeString("xmlns", "http://www.sitemaps.org/schemas/sitemap/0.9");

       // Add Home Page
       writer.WriteStartElement("url");
       writer.WriteElementString("loc", "http://example.com");
       writer.WriteElementString("changefreq", "daily");
       writer.WriteEndElement(); // url

       // Add code Loop here for page nodes
       /*
       {
           writer.WriteStartElement("url");
           writer.WriteElementString("loc", url);
           writer.WriteElementString("changefreq", "monthly");
           writer.WriteEndElement(); // url
       }
       */
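       writer.WriteEndElement(); // urlset
       writer.WriteEndDocument();
       writer.Flush(); // push everything to the response before the stream is disposed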
   }                      
}
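To give an idea of what the commented-out loop could look like once it is filled in, here is a minimal sketch that writes one url node per entry in a list. The SitemapSketch class, the WriteUrlNodes helper and the sample URLs are assumptions for this example. In the handler you would pass in the XmlTextWriter from ProcessRequest and a list pulled from your own data store.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class SitemapSketch
{
    // Hypothetical helper: writes one <url> node for every entry in pageUrls.
    public static void WriteUrlNodes(XmlWriter writer, IEnumerable<string> pageUrls, string changefreq)
    {
        foreach (string url in pageUrls)
        {
            writer.WriteStartElement("url");
            writer.WriteElementString("loc", url);
            writer.WriteElementString("changefreq", changefreq);
            writer.WriteEndElement(); // url
        }
    }

    // Builds a small fragment in memory so the loop can be tried without a web server.
    public static string BuildSample()
    {
        StringWriter buffer = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(buffer);
        writer.WriteStartElement("urlset");
        string[] pages = new string[] { "http://example.com/a.aspx", "http://example.com/b.aspx" };
        WriteUrlNodes(writer, pages, "monthly");
        writer.WriteEndElement(); // urlset
        writer.Flush();
        return buffer.ToString();
    }
}
```

BuildSample leaves out the XML declaration and the namespace attributes to keep the sketch short. The handler above already writes those.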

Validate and Submit

Details on how to validate and submit your sitemap can be found on Build a Search Engine SiteMap in C#. Don't forget to remove the sitemap.aspx directive inside the web.config file. That was just for debugging.

A Word of Warning

After writing this and patting myself on the back for being so clever, I discovered that other XML files on my site were not displaying. They were throwing errors. Whereas IIS can natively display XML files, the ASP.NET DLL can't.

This leaves you with two possibilities. ONE: Use a different file extension such as .MAP instead of .XML. TWO: Write a second HTTP Handler to catch all other XML file requests. That handler would open the XML file and stream it back to the browser with a ContentType of "text/xml".
<add verb="*" path="sitemap.xml" type="HttpExtensions.SitemapHandler"/>
<add verb="*" path="*.xml" type="HttpExtensions.XMLHandler"/>
public void ProcessRequest(HttpContext context)
{
  HttpResponse response = context.Response;
  // Request.Path leaves out the query string, so MapPath gets a clean virtual path
  string thisXMLFile = context.Server.MapPath(context.Request.Path);

  response.ContentType = "text/xml";
  // using closes the file even if the read fails
  using (StreamReader xmlStream = File.OpenText(thisXMLFile))
  {
    response.Write(xmlStream.ReadToEnd());
  }
}


 

Build a Search Engine SiteMap in C#

Sitemaps are XML files that web masters can create to let search engines know what pages to index and how frequently to check for changes on each page. The XML format of the sitemap file is detailed on sitemaps.org. Here is a sample of a sitemap with a single url.
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 
http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd" 
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.com</loc>
    <changefreq>daily</changefreq>
  </url>
</urlset>
My sample differs from the one on sitemaps.org. The more defined namespace in my example will validate on Google, Yahoo! and with Ask.com. At the time of this writing, theirs doesn't. According to the sitemaps.org protocol, only loc is required; changefreq, priority and lastmod are optional.

My advice with the search engines is treat them like a passport agent. Say only what is required and nothing else or you could find yourself sent to the end of the line. Once you've determined what URLs will be on the sitemap, the only decision is defining the changefreq of each page. One strategy might be to set the home page to daily, section pages to weekly and content pages to monthly. If your home page has stock tickers or sports scores, you could set the changefreq to always or hourly.

Google prefers the sitemap file to be in the root folder and every example on their site names the file sitemap.xml. What Google wants, Google gets.

Additional Namespaces

using System.Data;
using System.Data.SqlClient;
using System.IO;
using System.Xml;

Sample Code

You will need write access to the sitemap.xml file. Since you don't want to give your entire root folder write access, my advice is to create a dummy sitemap.xml, place it into the root folder and then set write access to that file. Adding a try...catch to the code below will alert you if that write access is not there.
string SiteMapFile = @"~/sitemap.xml";
string xmlFile = Server.MapPath(SiteMapFile);

XmlTextWriter writer = new XmlTextWriter(xmlFile, System.Text.Encoding.UTF8);
writer.Formatting = Formatting.Indented;
writer.WriteStartDocument();
writer.WriteStartElement("urlset");
writer.WriteAttributeString("xmlns:xsi", "http://www.w3.org/2001/XMLSchema-instance");
writer.WriteAttributeString("xsi:schemaLocation", "http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd");
writer.WriteAttributeString("xmlns", "http://www.sitemaps.org/schemas/sitemap/0.9");

// Add Home Page
writer.WriteStartElement("url");
writer.WriteElementString("loc", "http://example.com");
writer.WriteElementString("changefreq", "daily");
writer.WriteEndElement(); // url

// Add Sections and Articles
SqlConnection con = new SqlConnection(connectionString);
string sql = @"SELECT url, 'weekly' as changefreq FROM Section 
UNION SELECT url, 'monthly' as changefreq FROM Articles ";
SqlCommand cmd = new SqlCommand(sql, con);
cmd.CommandType = CommandType.Text;

try
{
    con.Open();
    SqlDataReader reader = cmd.ExecuteReader();
    while (reader.Read())
    {
        string loc = "http://example.com" + reader["URL"].ToString();
        string changefreq = reader["changefreq"].ToString();
        writer.WriteStartElement("url");
        writer.WriteElementString("loc", loc);
        writer.WriteElementString("changefreq", changefreq);
        writer.WriteEndElement(); // url
    }
    reader.Close();
}
catch (SqlException err)
{
    throw new ApplicationException("Data Error (Sections):" + err.Message);
}
finally
{
    con.Close();
}

writer.WriteEndElement();// urlset        
writer.Close();
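As for the write access mentioned before the sample, one way to find out early is to try opening the file for writing before building the sitemap. This is only a sketch under my own assumptions: the class and method names (SitemapFileCheck, CanWrite) are made up, and treating every IOException as "not writable" is a simplification.

```csharp
using System;
using System.IO;

public static class SitemapFileCheck
{
    // Hypothetical helper: returns true if the existing file can be opened for writing.
    public static bool CanWrite(string path)
    {
        try
        {
            // FileMode.Open does not truncate, so the current sitemap is left as it is.
            using (FileStream stream = new FileStream(path, FileMode.Open, FileAccess.Write))
            {
                return true;
            }
        }
        catch (UnauthorizedAccessException)
        {
            return false; // the account lacks write permission
        }
        catch (IOException)
        {
            return false; // file is missing, locked, or otherwise unavailable
        }
    }
}
```

In the page you might call CanWrite(Server.MapPath("~/sitemap.xml")) first and show a friendly message if it returns false.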

Validate Your Sitemap

Once you've confirmed you have a good looking sitemap.xml file in the root folder of your web site and it contains all the pages you want indexed by the search engines, it is time to validate it. XML-Sitemaps.com has a sitemap validator that you can use to test your new sitemap. Once it passes, move to the next step.

Update Your robots.txt File

Add the location of your sitemap in your robots.txt file.

Sitemap: http://example.com/sitemap.xml

Google Webmaster

Google has a suite of tools for managing the relationship between your web sites and Google. They call this suite Google Webmaster Central. It is here that you will register your web sites with validation files. Once they've established you are the webmaster of your site, they will present you with a screen to submit your sitemap. You will need a Google Account for this process. If you don't have one, follow the link to Create a Google Account.

Yahoo! Site Explorer

Yahoo! has a similar setup which is called Yahoo! Site Explorer. Using your Yahoo! ID, you will go through the same process of registering your web sites. And once that process has been completed, you can then submit your sitemap.xml file. Don't have a Yahoo ID? Get one.

Ask.com

Ask.com doesn't require any accounts or site validation. Just ping their server with the location of your sitemap.xml file modeled after the URL example below. Of course replace example.com with your domain name.

http://submissions.ask.com/ping?sitemap=http%3A//example.com/sitemap.xml
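If you would rather build that ping address in code than by hand, here is a small sketch. The AskPing class and BuildPingUrl method are names I made up for this example. Note that Uri.EscapeDataString encodes the slashes as well as the colon, so the result looks a little different from the URL above but decodes to the same sitemap location.

```csharp
using System;

public static class AskPing
{
    // Hypothetical helper: appends the URL-encoded sitemap location to the ping address.
    public static string BuildPingUrl(string sitemapUrl)
    {
        return "http://submissions.ask.com/ping?sitemap=" + Uri.EscapeDataString(sitemapUrl);
    }
}
```

BuildPingUrl("http://example.com/sitemap.xml") returns http://submissions.ask.com/ping?sitemap=http%3A%2F%2Fexample.com%2Fsitemap.xml, which you can paste into a browser or request from code.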

Monitoring the Sitemap Crawl

Both the Yahoo! Site Explorer and the Google Webmaster Tools have reports that provide updated status on the success and failure of the sitemap crawl. Ask.com to my knowledge doesn't have any such tools. And I couldn't locate a sitemap submission tool at all for MSN.com.

Using an HTTP Handler

This example uses the File System. Another option is to use an HTTP Handler to deliver the sitemap.xml.

Final Word

A friend of mine with a low traffic site saw his page views double after adding a sitemap. If one page of code can potentially double your page views, then it is worth pursuing.


 

EnableTheming Property Ignored - The Bug and the Fix

ASP.NET 2.0 gave developers a server-based styling technology called Themes. From the web.config file a single line can dictate how an entire web site will look. The theme of this site at this writing is called Joshua.
<system.web>
 <pages theme="Joshua"/>
</system.web>
Microsoft tells us that if we want to turn off theming for a given page and not the entire site, we can set the EnableTheming property to false. But this doesn't work. The web.config setting will override the Page directive.
<%@ Page Language="C#" EnableTheming="false" %>
To get around this bug for a single page, specify a Theme with no value in the Page directive.
<%@ Page Language="C#" Theme="" %>


 

Detect ASP.NET Version Running on Server

Below is a handy script that you can drop onto any server running any version of ASP.NET. When you load the page into the browser, it will report back to you which version of ASP.NET the current web site is configured to run.
<%@ Page Language="C#" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<script runat="server">
    protected void Page_Load(object sender, EventArgs e)
    {
        lblVersion.Text = System.Environment.Version.ToString();
    }
</script>
<html xmlns="http://www.w3.org/1999/xhtml" >
<head runat="server">
    <title>ASP.NET Version Checker</title>
</head>
<body>
    <form id="form1" runat="server">
    <div>     
        <p>.NET CLR Version: 
        <strong><asp:Label ID="lblVersion" runat="server" /></strong></p>
    </div>
    </form>
</body>
</html>


 

Convert Hexadecimal to Integer in SQL

Below is a handy function for converting a hexadecimal value into an integer using a SQL Server user defined function.
IF OBJECT_ID('dbo.udfHex2Int') IS NOT NULL
        DROP FUNCTION dbo.udfHex2Int
GO
CREATE FUNCTION dbo.udfHex2Int
(
  @hexstr AS varchar(1000)
)
-- Function converts VARCHAR representation of HEX to INT
-- 'FF'  --> 255

RETURNS INT
AS
BEGIN

  IF @hexstr IS NULL RETURN NULL

  DECLARE
    @curbyte AS int,
    @varbin  AS varbinary(500)

  IF @hexstr LIKE '0x%' SET @hexstr = SUBSTRING(@hexstr, 3, 8000)

  SET @hexstr =
    CASE LEN(@hexstr) % 2 WHEN 1 THEN '0' ELSE '' END + @hexstr

  SET @varbin = 0x
  SET @curbyte = LEN(@hexstr) / 2

  WHILE @curbyte > 0
  BEGIN
    SET @varbin =
      CAST(
        CASE SUBSTRING(@hexstr, @curbyte * 2, 1)
          WHEN '0' THEN 0x00
          WHEN '1' THEN 0x01
          WHEN '2' THEN 0x02
          WHEN '3' THEN 0x03
          WHEN '4' THEN 0x04
          WHEN '5' THEN 0x05
          WHEN '6' THEN 0x06
          WHEN '7' THEN 0x07
          WHEN '8' THEN 0x08
          WHEN '9' THEN 0x09
          WHEN 'A' THEN 0x0A
          WHEN 'B' THEN 0x0B
          WHEN 'C' THEN 0x0C
          WHEN 'D' THEN 0x0D
          WHEN 'E' THEN 0x0E
          WHEN 'F' THEN 0x0F
        END |
        CAST(
          CASE SUBSTRING(@hexstr, @curbyte * 2 - 1, 1)
            WHEN '0' THEN 0x00
            WHEN '1' THEN 0x10
            WHEN '2' THEN 0x20
            WHEN '3' THEN 0x30
            WHEN '4' THEN 0x40
            WHEN '5' THEN 0x50
            WHEN '6' THEN 0x60
            WHEN '7' THEN 0x70
            WHEN '8' THEN 0x80
            WHEN '9' THEN 0x90
            WHEN 'A' THEN 0xA0
            WHEN 'B' THEN 0xB0
            WHEN 'C' THEN 0xC0
            WHEN 'D' THEN 0xD0
            WHEN 'E' THEN 0xE0
            WHEN 'F' THEN 0xF0
          END AS tinyint) AS binary(1))
      + @varbin
    SET @curbyte = @curbyte - 1
  END

  RETURN CAST(@varbin AS INT)

END 


 

Digital Colony Copyright © 1999-2008
This site uses Blogger, which is not 100% XHTML compliant.
Try...Catch Disclaimer: For brevity many examples do not include error handling. That is your responsibility.