
Wednesday, September 21, 2011

Introduction to Module Rewrite (mod rewrite) :: What/Why & How

Hey guys,

It's been a long time since I posted anything. I was really short of time over the past few weeks, as school started and I also had to finish off my blog application project. However, the time spent on that project was not wasted, as I have some new stuff to share :).

One of the biggest things I came across while working on the project was mod rewrite. Honestly, I never knew about it before, but I always wondered how sites managed to have virtual directories and slug-like URLs. So, on to the tutorial: this article has no big coding examples, it is a mere guide to give a basic understanding of mod rewrite. Let's START!

What is mod rewrite?

Most of you may have seen what Blogger's URLs look like,

ex: http://manzzup.blogspot.com/2010/08/art-of-hacking-exposed-web-based.html

At first glance, one might assume there's a directory structure like this [according to the URL]:

blogspot.com --
    manzzup --
        2010 --
            08 --
                art-of-hacking-exposed-web-based.html

Is it? NO

This is the magic of the popular Apache module mod rewrite. Some have even referred to this as making virtual directories, but the story goes far beyond that.

Plainly, mod rewrite is a set of rules that tell the server to parse a requested URL (ex: http://mysite.com/my-new-story.html) and rewrite it to another URL (ex: http://mysite.com/view.php?id=10) on the fly. The user won't see anything happen other than the document loading.
mod rewrite uses a powerful rule-based rewrite engine for this to work.

There's more in-depth information about mod rewrite available at the Apache site.
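
To make the example above concrete, here is a minimal sketch of what such a rule could look like in a .htaccess file placed in the site's root directory. The names my-new-story.html and view.php?id=10 are just the ones from the example, and the sketch assumes mod rewrite is enabled and .htaccess files are allowed on your host.

```apache
# Turn the rewrite engine on for this directory
RewriteEngine On

# Internally serve view.php?id=10 whenever /my-new-story.html is requested.
# In a .htaccess file the pattern is matched without the leading slash.
# [L] stops processing further rules once this one matches.
RewriteRule ^my-new-story\.html$ view.php?id=10 [L]
```

Because this is an internal rewrite and not a redirect, the address bar keeps showing my-new-story.html.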
http://httpd.apache.org/docs/current/mod/mod_rewrite.html

Why use mod rewrite?

Once you know what mod rewrite is, this question always comes up. To answer it, I'll bring up several uses with examples.

1) Having SEO/User friendly URLs

Think for a moment of Blogger having post URLs like this: manzzup.blogger.com/view.php?id=2
Once this URL is crawled by a search engine and a user sees it in his search results, what he sees as the URL is not very encouraging.

A user searches for "How to install Ubuntu"
and gets 2 search results (just imagine that there are only two web sites in the whole universe :D)

How to install Ubuntu
......blab abla....
http://mysite1.com/view.php?id=7987787&fd=8898769

How to install Ubuntu
.....blaww blae
http://mysite2.com/complete-installation-guide-for-ubuntu.html

Now see, which result is more likely to be selected? Most likely the second one: the user believes from the URL that he has found the right page.

The magic is done by mod rewrite: it takes the user's pretty little URL and in turn gets the correct document from the server.
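
As a rough sketch, a single pattern-based rule can cover every pretty URL of this kind instead of one rule per page. Everything named here is an assumption for illustration: view.php is a hypothetical script and slug is a hypothetical parameter it would use to look up the post.

```apache
RewriteEngine On

# Leave requests for files that really exist (images, CSS, ...) alone
RewriteCond %{REQUEST_FILENAME} !-f

# complete-installation-guide-for-ubuntu.html
#   -> view.php?slug=complete-installation-guide-for-ubuntu
# ([a-z0-9-]+) captures the slug, $1 inserts it into the new URL.
# [QSA] keeps any query string the visitor sent, [L] ends rule processing.
RewriteRule ^([a-z0-9-]+)\.html$ view.php?slug=$1 [QSA,L]
```

The script still has to map the slug to the right post in its database; mod rewrite only hands the value over.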

2) Specific security features

Think of a situation where your host does not prevent directory listings on your website, meaning any directory without a proper index file is exposed to outside view. So you decide to put an index.php into every directory. But unfortunately there are 1000+ directories [auto-generated by scripts]. Each index.php would consume a little of your precious space, and if you ever wanted to change the way the "Access restriction" message is shown to the user, you would need to modify 1000+ pages :D

Again mod rewrite comes to the rescue: you can put up a rule to prevent indexing of every directory [that is one line], or even better, indexing of only some directories [about 2~4 lines], and better still, give out custom error pages for indexing attempts [that would be some lines but not 1000+ :)]

3) When changing the host/site
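
Here is a hedged sketch of how this could look in a .htaccess file in the site root. Note that the usual one-liner, Options -Indexes, is a core Apache option rather than part of mod rewrite, so both variants are shown. The index file names and the /errors/no-listing.html page are assumptions for illustration.

```apache
# Variant A: the one-liner (core Apache, works only if the host
# allows the Options override in .htaccess)
Options -Indexes

# Variant B: a similar effect with mod rewrite
RewriteEngine On
# The request points at a directory ...
RewriteCond %{REQUEST_FILENAME} -d
# ... that has no index file of its own (file names are assumptions)
RewriteCond %{REQUEST_FILENAME}/index.php !-f
RewriteCond %{REQUEST_FILENAME}/index.html !-f
# "-" leaves the URL untouched, [F] answers with 403 Forbidden
RewriteRule ^ - [F]

# Custom page shown for the 403 (hypothetical path)
ErrorDocument 403 /errors/no-listing.html
```

One variant is enough. To cover only some directories with variant B, the ^ pattern could be replaced with something like ^(uploads|cache)(/.*)?$ (hypothetical directory names).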

This is a somewhat rare yet practical situation; it mostly occurs with free web hosts but can happen anywhere. You use a certain site for some time, and all your web site links get crawled and indexed by search engines under your company name. Then one day you need to change the host/domain/site, but you face the BIG problem of losing your visitors and the links to the current site if you move to a new host. Some hosts offer a forwarding feature, but it is just a URL redirection, which means all your crawled links would point to the new homepage, not to the page they should land on. To make this clearer I'll describe a situation I had to face.

I am using Zymic's free hosting for our organization's site
http://zontek.zzl.org
and for some time I had our official forum hosted on this site at the URL
http://zontek.zzl.org/forum
But then there was a case of spammer activity and I had to move ONLY the forum to another site so I could add some more anti-spam techniques. At the time I had reasonable traffic to the previous URL, plus many links were spread across the internet.
Now the forum is here,
http://forum.zonet.freehostingcloud.com
mod rewrite was the only option: I put up a rule to forward all the traffic from my previous host to the new one, and it ensures that visitors land on the correct page

ex: http://zontek.zzl.org/forum/showthread.php?id=12
would redirect to
http://forum.zonet.freehostingcloud.com/showthread.php?id=12
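
For illustration, a redirect of this kind could be written roughly as follows in the .htaccess file at the root of the old site. This is a sketch and not necessarily the exact rule that was used; it assumes the forum lived under /forum on the old host and keeps the same paths on the new one.

```apache
RewriteEngine On

# /forum/showthread.php?id=12 on the old host
#   -> http://forum.zonet.freehostingcloud.com/showthread.php?id=12
# (.*) captures everything after "forum/" and $1 re-inserts it;
# the optional group also lets a bare /forum match.
# The query string (?id=12) is carried over automatically because the
# target has no query string of its own.
# R=301 sends a permanent redirect, L stops further rule processing.
RewriteRule ^forum(?:/(.*))?$ http://forum.zonet.freehostingcloud.com/$1 [R=301,L]
```

Unlike the earlier examples this is an external redirect, so the visitor's browser (and the search engine) sees the new address.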

saved me a lot :D

4) Customizing standard error pages

Some hosts won't allow you to customize the standard error pages like 404, 403 ...
but if mod rewrite is enabled, you can put up your own error page and do it in a more flexible way
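
A short sketch of both routes, with hypothetical file names. The plain ErrorDocument directive belongs to Apache's core and is the simplest way when the host permits it; the mod rewrite variant sends any request that matches no real file or directory to a script of your own.

```apache
# Route 1: core Apache directive, no mod rewrite needed
ErrorDocument 404 /errors/not-found.html

# Route 2: mod rewrite catch-all
RewriteEngine On
# Only act when the request is neither an existing file ...
RewriteCond %{REQUEST_FILENAME} !-f
# ... nor an existing directory
RewriteCond %{REQUEST_FILENAME} !-d
# Hand the request to a custom script (hypothetical name)
RewriteRule ^ /errors/not-found.php [L]
```

Pick one route. With route 2 the target script must really exist, otherwise the rule keeps rewriting to itself, and the script has to send the 404 status on its own, since a plain rewrite answers with 200.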


Those aren't the only uses; frankly there are many practical situations known only to those who have gone through them. [I will update the list if I become unfortunate enough to find a new one :D]


How to use mod rewrite

First of all you need to enable mod rewrite (the rewrite module) in the Apache configuration. If you own your web server, you can edit the httpd.conf file and enable it [by removing the # in front of the LoadModule ..... line]; if not, contact your administrator or use phpinfo() to check its availability.
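
For reference, on a typical Apache installation the line in question looks like the one below. The module path is an assumption and differs between installations (some distributions use a helper such as a2enmod instead of editing httpd.conf by hand).

```apache
# Before: the module is disabled because the line is commented out
#LoadModule rewrite_module modules/mod_rewrite.so

# After: remove the leading "#" and restart Apache
LoadModule rewrite_module modules/mod_rewrite.so
```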

Now it is there, what next?

Next we need to know that there are two ways of using the rewrite engine.

1) One way is to add the rules to the Apache configuration file, so that they become part of the server-wide configuration. Keep in mind that rules in the main configuration are not inherited by virtual hosts by default, so each virtual host needs its own RewriteEngine On and rules.

2) But the more common way is to use a .htaccess file. A .htaccess file is a per-directory configuration file that holds the set of rules the rewrite engine will obey.
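
To show the difference between the two places, here is a sketch of the earlier my-new-story.html example written for the server configuration instead of a .htaccess file. The server name and document root are made-up values.

```apache
<VirtualHost *:80>
    ServerName mysite.com
    DocumentRoot "/var/www/mysite"

    RewriteEngine On
    # In server or virtual host context the pattern is matched against
    # the full URL path, so it starts with a slash ...
    RewriteRule ^/my-new-story\.html$ /view.php?id=10 [L]
</VirtualHost>

# ... while the same rule in /var/www/mysite/.htaccess would be written
# without the leading slash:
#   RewriteEngine On
#   RewriteRule ^my-new-story\.html$ view.php?id=10 [L]
```

The leading slash is the detail that most often trips people up when they copy a rule from one place to the other.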

Common stuff regarding the .htaccess file:

(i) The file name is exactly ".htaccess": DO NOT PUT A NAME before the dot, it is just the "extension"
(ii) You can place a .htaccess file anywhere you like in your directory hierarchy
(iii) The file affects the directory in which it lies and, by default, all of its subdirectories [a .htaccess file in a subdirectory can override it]
(iv) There's no limit on the number of .htaccess files you can place
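
One more thing worth knowing: .htaccess files are only read when the main server configuration allows it. A minimal sketch of the relevant block, assuming a hypothetical document root of /var/www/mysite:

```apache
<Directory "/var/www/mysite">
    # Let .htaccess files use mod rewrite directives
    # (they belong to the FileInfo override group)
    AllowOverride FileInfo

    # Per-directory rewriting needs symlink following enabled
    Options +FollowSymLinks
</Directory>
```

On shared hosting this is set by the provider, so if your rules seem to be ignored, this is the first thing to ask about.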

Now we reach the conclusion of this article. Yes, I know you have been waiting for some action with scripts, but this is only an introduction to the rewrite module. But don't worry, the Complete Guide to Mod Rewrite is waiting for you.

http://manzzup.blogspot.com/2011/09/complete-guide-to-mod-rewrite-rules.html

So, that's all for now, hope you'll hang out with mod rewrite for a long time :D

cya-guys
