I have been exploring different ways to do URL rewriting. Basically there are two options:
* doing the rewrite at the web server level
* doing the rewrite using ASP.NET
The first option is the most flexible one and the one that causes fewer problems and side effects, but it is not an option if you don’t have full permissions on the server, and for the project I’m working on I don’t. IIS 7 will allow modules written in managed code, but I’m not sure whether an application will be able to configure which modules to use without full permissions.
So the only option is to do it at the ASP.NET level. ASP.NET 2.0 has some built-in functionality for URL rewriting, but it’s nearly useless because it does not support regular expressions. There are some HTTP modules and HTTP handlers out there that provide more functionality, but most of them have problems because it’s tricky to handle themes, postbacks, output caching, etc. properly when using rewritten URLs. It seems that the most used HTTP module for URL rewriting is:
It’s easy to use and seems to work properly so far (I have been using it for only a few days). However, there are times when regular expressions are not enough for doing rewrites. I wrote to the authors and explained a scenario where a more generic approach is desirable. Basically, what I had in mind was for the module to call a custom method, either always or when a certain condition is met (the condition would be evaluated using regular expressions). That method could then handle the rewrite or not, and its return value would indicate whether the rule was applied. Inside the method you could do anything you want to rewrite the URL. They told me they were thinking of adding this kind of functionality in the near future, so good news, but we’ll have to wait ;)
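To make the idea concrete, here is a minimal sketch of such a callback-based rule. It is written in Python purely as a language-neutral illustration, not as code for any existing ASP.NET module. `CallbackRule`, `rewrite`, `product_handler`, the `/products/...` URLs and the `PRODUCT_IDS` table are all hypothetical names made up for the example.

```python
import re
from typing import Callable, List, Optional

# A handler receives the regex match and returns the rewritten URL,
# or None to signal that the rule was NOT applied.
Handler = Callable[[re.Match], Optional[str]]


class CallbackRule:
    def __init__(self, pattern: Optional[str], handler: Handler):
        # pattern=None means "always call the handler".
        self.regex = re.compile(pattern if pattern is not None else r".*")
        self.handler = handler

    def try_rewrite(self, url: str) -> Optional[str]:
        match = self.regex.fullmatch(url)
        if match is None:
            return None  # condition not met, handler is never called
        return self.handler(match)


def rewrite(url: str, rules: List[CallbackRule]) -> str:
    # The first rule whose handler returns a URL wins;
    # if no rule applies, the URL is left untouched.
    for rule in rules:
        result = rule.try_rewrite(url)
        if result is not None:
            return result
    return url


# Hypothetical lookup table standing in for a database query.
PRODUCT_IDS = {"blue-widget": 42}


def product_handler(match: re.Match) -> Optional[str]:
    # Arbitrary logic that a plain regex replacement could not express.
    product_id = PRODUCT_IDS.get(match.group("slug"))
    if product_id is None:
        return None  # decline, so later rules (or no rewrite) take over
    return f"/product.aspx?id={product_id}"


RULES = [CallbackRule(r"/products/(?P<slug>[a-z0-9-]+)", product_handler)]
```

With these rules, `rewrite("/products/blue-widget", RULES)` yields `/product.aspx?id=42`, while a slug missing from the table is passed through unchanged because the handler declines it.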
I’ll show an example in case the problem isn’t obvious. If you want to optimize your site for search engine spiders, a good thing to do is to avoid passing parameters in the query string: some spiders will not index pages with dynamic parameters, others limit how many such pages they index, and others skip pages with more than a predefined number of parameters. Rewriting a URL like:
helps the search engine spiders to index it. If your pages take different numbers of parameters, you can end up with a lot of rules like:
Note that for this to work properly you have to specify the rule with the maximum number of parameters first. So every time the module tries to rewrite a URL it evaluates that rule first, even though it probably is not the most used one, incurring an unnecessary performance cost. If you want to do a similar rewrite and you have pages in multiple directories, things get more complicated, because a directory could be treated as a parameter if the rules are not written carefully.
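The ordering problem is easy to reproduce with a small sketch. Again this is Python used as a language-neutral illustration, and the rules and URLs are hypothetical ones invented for the example, not the ones from my site. Rules are tried in order and the first match wins, which is how I assume a typical rule-based rewriter behaves.

```python
import re

# Hypothetical rules with loose "(.+)" groups, ordered from the
# most parameters to the fewest.
LOOSE_RULES = [
    (r"^/(.+)/(.+)/(.+)\.aspx$", r"/\1.aspx?a=\2&b=\3"),
    (r"^/(.+)/(.+)\.aspx$", r"/\1.aspx?a=\2"),
]

# The same idea with "[^/]+" groups, which cannot swallow a slash.
# These are deliberately listed with the shorter rule first.
STRICT_RULES = [
    (r"^/([^/]+)/([^/]+)\.aspx$", r"/\1.aspx?a=\2"),
    (r"^/([^/]+)/([^/]+)/([^/]+)\.aspx$", r"/\1.aspx?a=\2&b=\3"),
]


def rewrite(url, rules):
    # Try each rule in order; the first one that matches wins.
    for pattern, replacement in rules:
        new_url, count = re.subn(pattern, replacement, url)
        if count:
            return new_url
    return url


url = "/list/shoes/red.aspx"

# Longest rule first: both path segments become parameters.
in_order = rewrite(url, LOOSE_RULES)

# Shortest rule first: the greedy "(.+)" eats "list/shoes" as if it
# were a directory, so only "red" is treated as a parameter.
reversed_order = rewrite(url, list(reversed(LOOSE_RULES)))

# Strict groups give the intended result regardless of order.
strict = rewrite(url, STRICT_RULES)
```

Here `in_order` and `strict` are both `/list.aspx?a=shoes&b=red`, while `reversed_order` is `/list/shoes.aspx?a=red`. In this sketch, tightening the groups removes the ordering constraint, so the most frequently hit rule could go first, but it also means the rules no longer tolerate pages living in subdirectories without extra patterns.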
All the code about URL rewriting that I’ve found out there was for an HTTP module or an HTTP handler. However, in ASP.NET 2.0 another option is available: a Virtual Path Provider (VPP from now on).
Essentially, access to files and folders in ASP.NET has been virtualized. The default VPP just reads files from the file system, checking IIS permissions. You can code your own VPP to read files from a database, generate aspx pages on the fly, or anything else you can imagine.
There’s a good article about VPP written by Victor García Aprea where he explains how to run a website from a zip file:
Most of the stuff about VPP is straightforward except the GetFileHash and GetCacheDependency methods. ASP.NET caches aspx pages after they’re compiled for the first time and monitors file changes, forcing a recompile when a page changes. A bad implementation of either method could make your application compile each page on every request, so if you are going to code your own VPP be sure to triple-check your GetCacheDependency and GetFileHash methods.
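Why a bad hash hurts so much is easier to see with a toy model. The sketch below is Python and is not how ASP.NET is implemented internally; `CompileCache`, `stable_hash` and `unstable_hash` are hypothetical names, and "compiling" is just a placeholder string operation. The only assumption is that a compiled page is reused while the hash reported for its virtual path stays the same.

```python
import hashlib
import itertools


class CompileCache:
    """Toy stand-in for a compile cache keyed by virtual path and file hash."""

    def __init__(self, hash_func):
        self.hash_func = hash_func
        self.entries = {}       # virtual path -> (hash, compiled page)
        self.compilations = 0   # how many times we had to "compile"

    def get(self, virtual_path, content):
        file_hash = self.hash_func(virtual_path, content)
        cached = self.entries.get(virtual_path)
        if cached is not None and cached[0] == file_hash:
            return cached[1]    # hash unchanged: reuse the compiled page
        self.compilations += 1
        compiled = content.upper()  # placeholder for real compilation
        self.entries[virtual_path] = (file_hash, compiled)
        return compiled


def stable_hash(virtual_path, content):
    # Depends only on the content, so it changes only when the page does.
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


_calls = itertools.count()


def unstable_hash(virtual_path, content):
    # Bug: mixes in a value that differs on every call, so the cache
    # always believes the page has changed.
    return f"{next(_calls)}:{stable_hash(virtual_path, content)}"


good = CompileCache(stable_hash)
bad = CompileCache(unstable_hash)
for _ in range(3):
    good.get("/default.aspx", "<html>v1</html>")
    bad.get("/default.aspx", "<html>v1</html>")
```

After the three identical requests, `good.compilations` is 1 and `bad.compilations` is 3: the unstable hash forces a recompile on every request, which is exactly the trap to avoid in a real provider.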
I started playing with a custom VPP that did URL rewriting based on regular expressions in all the methods that take a virtual path (nothing serious, just some proof-of-concept code). Basically, I was calling the default provider implementation with the rewritten virtual path, but the framework checks that the path stays unchanged on instances of the VirtualFile and VirtualDirectory classes!
I had to change my strategy and return the same virtual path I was passed, even though I was actually accessing a different path. I had a quick test version working on my local machine, but when I tried it on the server there was a big problem: to use a VPP you need full trust permissions. That was a showstopper for me, so I ended my VPP adventure there. Thank god I tried it on the server before turning the quick and primitive proof-of-concept code into something usable (I also hadn’t spent any time thinking about other implications of my approach, such as output caching). Even though VPP wasn’t useful for me, the knowledge acquired is always welcome.