March 31, 2011

Redirecting Your Visitors to Your Preferred Domain

A typical website is configured to serve the same content via two addresses: example.com and www.example.com. This redundancy increases the chances of someone reaching the website directly, but it poses a problem for search engines and for you.

Search engines treat a domain and its subdomains as different websites altogether. This means that content available on example.com and www.example.com will be indexed and ranked separately. Thus, a page from your website could appear multiple times in search results, with no single copy ranking as high as it otherwise could. Moreover, search engines weigh the importance of a page in terms of the number of backlinks to that page. If websites link to YourAwesomeArticle.html in multiple ways (specifically by omitting or including the www prefix), this weight will be split between those URLs.
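To make the split concrete, here is a rough sketch in Python. The backlink list is made up, and counting links is only a stand-in for whatever a real search engine does; the `canonicalize` helper and the `preferred_host` parameter are names I introduced for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit, urlunsplit


def canonicalize(url, preferred_host="www.example.com"):
    """Map both host variants of a URL onto the preferred host."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Work out the bare domain regardless of which variant is preferred
    bare = preferred_host[4:] if preferred_host.startswith("www.") else preferred_host
    if host in (bare, "www." + bare):
        parts = parts._replace(netloc=preferred_host)
    return urlunsplit(parts)


# Hypothetical backlinks found on other websites
backlinks = [
    "http://example.com/YourAwesomeArticle.html",
    "http://www.example.com/YourAwesomeArticle.html",
    "http://example.com/YourAwesomeArticle.html",
    "http://www.example.com/YourAwesomeArticle.html",
    "http://www.example.com/YourAwesomeArticle.html",
]

# Counted as-is, the five links are divided between two URLs
split_counts = Counter(backlinks)

# Counted after mapping to one preferred host, they all land on one URL
merged_counts = Counter(canonicalize(url) for url in backlinks)
```

Here `split_counts` holds two entries (2 and 3 links), while `merged_counts` holds a single entry with all 5.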

The first step to counter this problem is to choose which domain you prefer, start referencing it within your content, and encourage others to use the same one when they link to your website. The question, to www or not to www, has no definitive answer. I would say it is purely a matter of preference. My observations are as follows:

  • Facebook uses www, Twitter does not
  • Word processors, web applications, instant messaging applications and cell phones will convert text beginning with www into a link; less often, they will also convert text ending with .com into a link, but may not do the same for fancy TLDs such as come.to
  • Some websites choose to omit www if their domain name ends with .com, .org or .net
  • Some websites choose to omit www if their domain name is very short (3 characters or fewer)
  • Typing www is a pain in the bottom, creating a logo that contains www is worse, spelling out www using the NATO phonetic alphabet is worst
  • Some people habitually type www in the browser's address bar; some enter a word and hit CTRL+Enter, and the browser adds www and .com automatically
  • People are likely to interpret www.aweso.me as a website name and aweso.me as a typo

Once you've chosen the preferred domain, the second step is to redirect the traffic of the non-preferred domain to the preferred one using a 301 redirect. The HTTP status code 301 Moved Permanently, accompanied by a Location header, informs the browser (or search engine) that the requested resource has been permanently moved to a new location. Search engines will honor this information when they index and rank your content.
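If you want to see what such a response looks like on the wire, the following Python sketch starts a throwaway server on the local machine that answers every request with a 301, then fetches a page without following the redirect. It is only an illustration: `PREFERRED_HOST`, `RedirectHandler`, `fetch_without_following` and `demo` are names I made up, and a real website would use one of the server configurations described in this article instead.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PREFERRED_HOST = "www.example.com"  # assumption: the domain you chose


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Status line first, then the Location header naming the new address
        self.send_response(301)
        self.send_header("Location", "http://" + PREFERRED_HOST + self.path)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_without_following(port, path):
    """Return (status, reason, Location) for a path; http.client never follows redirects."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()
        return response.status, response.reason, response.getheader("Location")
    finally:
        conn.close()


def demo():
    server = HTTPServer(("127.0.0.1", 0), RedirectHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return fetch_without_following(port, "/YourAwesomeArticle.html?q=1")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The tuple returned by `demo()` carries the status code, the reason phrase and the value of the Location header, which is everything a browser or crawler needs in order to retry the request at the new address.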

The rest of this article shows you how to set up (or code) a 301 redirect on your system. The following examples are generic and do not contain any hard-coded domain or subdomain. It is assumed that your website content is present in one physical directory, and that it is equally accessible via the domain name and the www subdomain.

301 redirect in Apache using mod_rewrite

Most Apache server setups have the mod_rewrite module enabled, which can be used to redirect incoming requests. The most common way to configure this module is through the .htaccess file present inside the root directory of your website. Create this file if necessary, and add the lines for ONE of the two redirects as described below.

CAUTION: .htaccess is a hidden file; if your FTP/Shell client or web-based file manager does not show you the file, you SHOULD NOT assume that it is not there. If present, make a backup of the original file before making any changes. Errors in this file can render your website non-functional.

#########################
# redirect no-www to www
#########################

RewriteEngine On
RewriteCond %{HTTP_HOST} ^(?!www\.)(.+) [NC]
RewriteRule ^(.*) http://www.%1/$1 [R=301,NE,L]

#########################
# redirect www to no-www
#########################

RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.(.+) [NC]
RewriteRule ^(.*) http://%1/$1 [R=301,NE,L]

Directives and flags used in the above examples are explained as follows:

  • RewriteEngine — enable or disable the rewriting engine
    Set to On so that the rewriting engine does its magic. mod_rewrite does not perform URL rewriting by default.
  • NC — no case flag
    Used to make the pattern matching case-insensitive.
    • Without this flag, WWW.EXAMPLE.COM will not match the pattern ^www\.(.+)
    • With this flag, WWW.EXAMPLE.COM will match the pattern ^www\.(.+)
  • %n — nth RewriteCond backreference
    Backreferences are created when you use parentheses inside the pattern
  • $n — nth RewriteRule backreference
    Backreferences are created when you use parentheses inside the pattern
  • R — force redirect flag
    Set to 301 so that the rewriting engine sends a "301 Moved Permanently" header to the client.
  • NE — no escape flag
    Used to prevent the rewriting engine from applying URI escaping rules to the result of a rewrite. For the "no-www to www" example:
    • Without this flag, http://example.com/?q=foo%20bar will be redirected to http://www.example.com/?q=foo%2520bar
    • With this flag, http://example.com/?q=foo%20bar will be redirected to http://www.example.com/?q=foo%20bar
  • L — last rule flag
    Used to stop the rewriting process so that no further rewrite rules are applied.
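To check how the two conditions and the backreferences behave, the matching can be approximated in Python. This is a sketch of the logic only, not of mod_rewrite itself: `redirect_target` and `escape_again` are hypothetical helpers, and `escape_again` merely imitates the extra round of escaping that the NE flag is there to prevent.

```python
import re
from urllib.parse import quote

# The two RewriteCond patterns; re.IGNORECASE plays the role of the NC flag
NO_WWW = re.compile(r"^(?!www\.)(.+)", re.IGNORECASE)
HAS_WWW = re.compile(r"^www\.(.+)", re.IGNORECASE)


def redirect_target(host, path, to_www=True):
    """Return the redirect URL the rule pair would build, or None if the condition fails.

    `path` is the request path without its leading slash, which is how
    RewriteRule sees it inside .htaccess.
    """
    cond = (NO_WWW if to_www else HAS_WWW).match(host)
    if cond is None:
        return None  # RewriteCond failed, so the rule is skipped
    rule = re.match(r"^(.*)", path)  # RewriteRule pattern, matches any path
    prefix = "http://www." if to_www else "http://"
    # cond.group(1) corresponds to %1, rule.group(1) corresponds to $1
    return prefix + cond.group(1) + "/" + rule.group(1)


def escape_again(query):
    """Imitate escaping an already-escaped query string a second time."""
    return quote(query, safe="=&")


print(redirect_target("example.com", "page.html"))
print(redirect_target("www.example.com", "page.html"))
print(redirect_target("WWW.EXAMPLE.COM", "page.html", to_www=False))
print(escape_again("q=foo%20bar"))
```

The second call prints `None` because the host already starts with www, and the last call shows how `%20` turns into `%2520` once the percent sign itself gets escaped.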

301 redirect in IIS using IIRF

Ionic's Isapi Rewrite Filter, aka IIRF, is a free, open-source ISAPI filter capable of URL rewriting. It enables mod_rewrite-style URL rewriting on IIS. This filter can be configured on a per-website basis through the IIRF.ini file present in the root directory of that website. To use this filter for redirects, create this file if necessary and add the lines for ONE of the following redirects:

#########################
# redirect no-www to www
#########################

RewriteEngine On
RewriteCond %{HTTP_HOST} ^(?!www\.)(.+) [NC]
RedirectRule ^/(.*) http://www.*1/$1 [R=301]

#########################
# redirect www to no-www
#########################

RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.(.+) [NC]
RedirectRule ^/(.*) http://*1/$1 [R=301]

IIRF rules are compatible with mod_rewrite to a certain extent, so the previous explanation largely applies. The differences are explained below:

  • RewriteEngine — enable or disable the rewriting filter
    Set to On so that the rewriting filter does its magic. Unlike mod_rewrite, IIRF performs URL rewriting by default.
  • *n — nth RewriteCond backreference
    IIRF seems to use the * character for RewriteCond backreferences.
  • $n — nth matched sub-pattern inside RewriteRule
  • RedirectRule — redirect directive
    This directive is roughly equivalent to mod_rewrite's RewriteRule directive with [R,L] flags.
  • R=nnn — HTTP status code for the redirect
    Set to 301 so that the rewriting filter sends a 301 redirect header to the client.

301 redirect in IIS using URL Rewrite Module

The IIS URL Rewrite Module is a Microsoft-supported module that enables URL rewriting on IIS 7. The rules can be configured on a per-website basis using the web.config file. Add the following lines to this file under the system.webServer/rewrite/rules section; note that the second rule is disabled (enabled="false"), because only one of the two should be active at a time:

<rule name="no-www to www" stopProcessing="true">
  <match url="(.*)" />
  <conditions>
    <add input="{HTTP_HOST}" pattern="^(?!www\.)(.+)" />
  </conditions>
  <action type="Redirect" url="http://www.{C:1}/{R:1}" redirectType="Permanent" />
</rule>
<rule name="www to no-www" enabled="false" stopProcessing="true">
  <match url="(.*)" />
  <conditions>
    <add input="{HTTP_HOST}" pattern="^www\.(.+)" />
  </conditions>
  <action type="Redirect" url="http://{C:1}/{R:1}" redirectType="Permanent" />
</rule>

301 redirect with PHP

If configuring your server is not a possibility, you can implement 301 redirects via server-side scripting. This approach will not cover 100% of the requests: requests for images, documents or static HTML files, for example, never reach your script and so cannot be redirected this way. Yet it is a viable option if your website is dynamic and uses a server-side scripting language such as PHP to deliver its content.

The following PHP snippets show how you can send a 301 redirect header via PHP. This code should be added to the beginning of all PHP pages on your website. Use include files if possible.

<?php
#########################
# redirect no-www to www
#########################

if(preg_match('@^(?!www\.)(.+)@i', $_SERVER['HTTP_HOST'], $match)){
  header('Location: http://www.' . $match[1] . $_SERVER['REQUEST_URI'], true, 301);
  die;
}
?>
<?php
#########################
# redirect www to no-www
#########################

if(preg_match('@^www\.(.+)@i', $_SERVER['HTTP_HOST'], $match)){
  header('Location: http://' . $match[1] . $_SERVER['REQUEST_URI'], true, 301);
  die;
}
?>

301 redirect in an ASP.NET application

The aforementioned PHP code can easily be translated into virtually any server-side scripting/programming language, including ASP.NET with VB. However, the ASP.NET framework makes things easier than in the previous example. You can place the redirection logic inside the Application_BeginRequest() event handler, which fires for every request handled by the framework. The handler for this event should be declared inside Global.asax; thus, only one file needs to be changed to cover an entire ASP.NET application. Use only one of the two handlers shown below.

'#######################
' redirect no-www to www
'#######################

Protected Sub Application_BeginRequest(ByVal sender As Object, ByVal e As System.EventArgs)
  If Request.Url.Host.ToLower().StartsWith("www.") = False Then
    Response.Clear()
    Response.Status = "301 Moved Permanently"
    Response.AddHeader("Location", "http://www." + Request.Url.Host + Request.Url.PathAndQuery)
    Response.End()
  End If
End Sub

'#######################
' redirect www to no-www
'#######################

Protected Sub Application_BeginRequest(ByVal sender As Object, ByVal e As System.EventArgs)
  If Request.Url.Host.ToLower().StartsWith("www.") Then
    Response.Clear()
    Response.Status = "301 Moved Permanently"
    Response.AddHeader("Location", "http://" + Request.Url.Host.Substring(4) + Request.Url.PathAndQuery)
    Response.End()
  End If
End Sub
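
As an example of how the same logic carries over to other platforms, here is a rough translation of the "no-www to www" redirect into a Python WSGI middleware. Treat it as a sketch under a few assumptions: the site is served over plain http as in the examples above, and `force_www` is a name I made up rather than part of any framework.

```python
import re
from urllib.parse import quote


def force_www(app):
    """Wrap a WSGI application so that bare-domain requests get a 301 to www."""

    def middleware(environ, start_response):
        host = environ.get("HTTP_HOST", "")
        match = re.match(r"^(?!www\.)(.+)", host, re.IGNORECASE)
        if match:
            # WSGI hands over the path already decoded, so re-encode it
            path = environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", "")
            location = "http://www." + match.group(1)
            location += quote(path, safe="/;=,", encoding="latin1")
            if environ.get("QUERY_STRING"):
                location += "?" + environ["QUERY_STRING"]
            start_response(
                "301 Moved Permanently",
                [("Location", location), ("Content-Length", "0")],
            )
            return [b""]
        # Host already starts with www (or is missing): serve the page normally
        return app(environ, start_response)

    return middleware
```

As with Global.asax, the wrapper is applied once to the application object, for example `application = force_www(application)`, so every request passes through it before any page code runs.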