HTML redirection, mod_rewrite, and .htaccess

Written by Raymond Santos Estrella on Wednesday, 04 September 2013. Posted in 2013

Or how to turn a complicated thing simple

Last night, after hitting my reading quota and while trying to fall asleep, I took a deeper look into the access logs of this site and Corpus Juris and found some really strange entries. It seems that because the two websites share a hosting environment, some erroneous cross hits were occurring.

The strange thing about these page hits was that they were returning a 200 OK status code rather than a 404 “Not Found” error. What this means is that a browser or crawler was requesting a URL that shouldn’t have existed, and the site was returning a proper page anyway. I tested a random sample of these URLs and my initial assessment was correct. To put it simply, valid URLs on Corpus Juris were also being parsed as valid URLs on this domain. Thus, URLs such as www.santosestrella.net/laws/statutes/item/ra-no-5680.html and www.santosestrella.net/jurisprudence.html would return actual pages rendered by Corpus Juris’ CMS. This shouldn’t happen, since this domain should be treated as a different site altogether and shouldn’t serve anything related to Corpus Juris.

It occurred to me that the only way to solve this cross-domain page hit behavior, short of getting a separate hosting plan for one of the domains, was to make use of URL redirection via Apache’s mod_alias/mod_rewrite. Mod_rewrite is an Apache module that manipulates URLs based on programmed rules whose patterns are written as PCRE regular expressions. On my host, these rules are contained in the .htaccess file commonly found at the site’s root folder. Unfortunately, this is one of the worst things in the programming world for me. I’ve never really taken the time to fully learn the intricacies of writing an .htaccess file. Even worse, I’ve never had any formal reading or training in using regular expressions¹. Sure, the CMS already uses its own specialized .htaccess file, but I absolutely loathed tweaking it. I dread playing around with this particular niche part of site management.
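
For anyone who has never seen one, here is a minimal sketch of what a single mod_rewrite rule in an .htaccess file looks like. The old/ and new/ paths are made up for illustration and have nothing to do with this site’s actual setup.

```apache
# Hypothetical example: the old/ and new/ paths are placeholders.
RewriteEngine On

# General shape: RewriteRule <pattern> <substitution> [flags]
# In .htaccess, the pattern is matched against the path without its leading slash.
# (.*) captures the rest of the path; $1 inserts that capture into the substitution.
# R=301 answers with a permanent redirect; L makes this the last rule applied in this pass.
RewriteRule ^old/(.*)$ /new/$1 [R=301,L]
```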

In any case I had no choice but to bite the bullet. I eventually spent a whole 1½ hours on this problem late last night. At first I tried to just hack my way through the whole thing and mash out my code as fast as I could, test and debug, and then iterate again. An hour later, I was going nowhere except falling asleep at my desk.

To anyone else who is going to attempt something like this, I give you this advice: draw your redirects on paper. I cannot stress enough how important this step is. Maybe it’s my ADHD at work, but all of this became much, much easier once I was typing the code while doodling the redirect arrows on a sheet of paper. To be clear, this is what I had to do:

From | To | Note
www?.santosestrella.net | main.santosestrella.net | Here, I’m only redirecting a hit on the homepage of the santosestrella.net domain. It redirects from the domain root to the main subdomain.
raymond.thecorpusjuris.com | raymond.santosestrella.net | This redirects all pages from one domain to another.
thecorpusjuris.com/raymond | raymond.santosestrella.net | Redirection from a folder to a subdomain on a different domain.
santosestrella.net/xxx | thecorpusjuris.com/xxx | This redirects all pages in one domain to another, excepting the homepage.

I’ve reproduced the code below:

Options +FollowSymLinks
RewriteEngine On

# Redirect www?.santosestrella.net root TO main.santosestrella.net
RewriteCond %{HTTP_HOST} ^(www\.)?santosestrella\.net$ [NC]
RewriteRule ^$ http://main.santosestrella.net [R=301,L]

# Redirect www?.santosestrella.net/xxx TO www.thecorpusjuris.com/xxx
RewriteCond %{HTTP_HOST} ^(www\.)?santosestrella\.net$ [NC]
RewriteRule ^(.+)$ http://www.thecorpusjuris.com/$1 [R=301,L]
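
The listing above covers only the first and last redirects in the table. The two middle ones (raymond.thecorpusjuris.com and thecorpusjuris.com/raymond) are not shown, so what follows is only a sketch of how they might be written, not the rules actually deployed. It assumes both rules live in the .htaccess file at the root that serves thecorpusjuris.com and its raymond subdomain, and that they sit ahead of the CMS’s own rewrite rules.

```apache
# Sketch only; not taken from the site's real .htaccess.

# Redirect raymond.thecorpusjuris.com/xxx TO raymond.santosestrella.net/xxx
RewriteCond %{HTTP_HOST} ^raymond\.thecorpusjuris\.com$ [NC]
RewriteRule ^(.*)$ http://raymond.santosestrella.net/$1 [R=301,L]

# Redirect www?.thecorpusjuris.com/raymond(/xxx) TO raymond.santosestrella.net(/xxx)
# The optional non-capturing group lets both "raymond" and "raymond/xxx" match;
# $1 is empty when nothing follows the folder name.
RewriteCond %{HTTP_HOST} ^(www\.)?thecorpusjuris\.com$ [NC]
RewriteRule ^raymond(?:/(.*))?$ http://raymond.santosestrella.net/$1 [R=301,L]
```

Each RewriteCond applies only to the RewriteRule immediately after it, which is why the host check is repeated for every redirect.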

Tada! Done and done. :)

Endnote

1 Regular expressions are pattern strings that are typically used in search-and-replace operations. A pattern describes which strings should match, and the matched text can then be rewritten by following a second, replacement pattern.
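
To make that concrete, here is the host pattern from the rules above read piece by piece. The annotations are written as comments around the same directive and add nothing to its behavior.

```apache
# ^                     anchors the match at the start of the host name
# (www\.)?              optionally matches a literal "www." (the backslash escapes the dot)
# santosestrella\.net   matches the domain name literally
# $                     anchors the match at the end, so longer host names do not match
# [NC]                  makes the comparison case-insensitive
RewriteCond %{HTTP_HOST} ^(www\.)?santosestrella\.net$ [NC]
```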

Comments (3)

  • reagal, 04 September 2013 at 02:02

    Nice!

  • TheDarkTower, 04 September 2013 at 05:14

    Same as you, I didn’t put any effort into understanding .htaccess files. They remain a mystery to me.

  • Shinji, 24 September 2013 at 12:57

    Bored much?
