This article and the accompanying code are the result of this experiment. It is not a complete solution, but it illustrates the idea. I have also created some basic analytics graphs based on the URLs visited; see the end of this article.
1) Shorten the lengthy URL
For example, http://en.wikipedia.org/wiki/URL_shortening is shortened to http://sitename/u4rwki
In the above example, the original URL is mapped to the key u4rwki (some random key). We need a key generation mechanism with enough characters to represent every original URL uniquely. Also, the shorter the generated key, the better.
A key can be generated in many ways. We can use Base 36 conversion, which uses 26 letters and 10 digits.
For example, the decimal number 1,000,000,000 (yes, a billion) is GJDGXS in Base 36.
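The conversion itself is short. Here is a minimal Python sketch of it; the function names `base36_encode` and `base36_decode` are my own for illustration and are not taken from the sample code.

```python
import string

# 10 digits followed by 26 letters gives the 36 symbols of Base 36.
ALPHABET = string.digits + string.ascii_uppercase


def base36_encode(n: int) -> str:
    """Convert a non-negative integer to its Base 36 representation."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, remainder = divmod(n, 36)
        digits.append(ALPHABET[remainder])
    # Remainders come out least significant first, so reverse them.
    return "".join(reversed(digits))


def base36_decode(key: str) -> int:
    """Convert a Base 36 key back to an integer (case-insensitive)."""
    return int(key, 36)


print(base36_encode(1_000_000_000))  # GJDGXS
print(base36_decode("gjdgxs"))       # 1000000000
```

Because decoding is case-insensitive, a lowercase key in the URL resolves to the same integer.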
So, in order to use Base 36 encoding, we need an integer key for every URL. We are going to save the URL in some store so that we can look it up and redirect users. Given that, there are two options that I can think of.
a) An auto-incremented numeric value for every row inserted into a database
b) Generate a hash code for each URL and save it to the database along with the URL.
Either way, we have an integer value associated with the long URL, which we can convert to its Base 36 representation to get our key.
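The two options can be sketched roughly as follows. This is only an illustration under my own assumptions: `CounterKeyStore` simulates an auto-incremented table in memory, and `hash_key` uses CRC-32 from Python's `zlib` as a stand-in hash code. Neither name nor choice of hash comes from the sample code.

```python
import itertools
import string
import zlib

ALPHABET = string.digits + string.ascii_lowercase


def base36_encode(n: int) -> str:
    """Convert a non-negative integer to Base 36."""
    digits = []
    while True:
        n, remainder = divmod(n, 36)
        digits.append(ALPHABET[remainder])
        if n == 0:
            return "".join(reversed(digits))


class CounterKeyStore:
    """Option (a): in-memory stand-in for a table with an auto-incremented id."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self.rows = {}  # row id -> long URL

    def shorten(self, url: str) -> str:
        row_id = next(self._ids)
        self.rows[row_id] = url
        return base36_encode(row_id)


def hash_key(url: str) -> str:
    """Option (b): derive the integer from a hash of the URL (CRC-32 here)."""
    return base36_encode(zlib.crc32(url.encode("utf-8")))


store = CounterKeyStore()
print(store.shorten("http://en.wikipedia.org/wiki/URL_shortening"))  # 1
print(hash_key("http://en.wikipedia.org/wiki/URL_shortening"))
```

The counter gives the shortest keys, but they are sequential and therefore guessable. A 32-bit hash always fits in 7 Base 36 characters, but two different URLs can collide, so the store would need to check for an existing key before saving.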
2) Faster redirects
Whatever technology we select, when we receive a request we should be able to look up the long URL based on the key and issue a 301 redirect.
We can achieve faster key lookups by using an in-memory cache.
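As a rough sketch of that lookup path, assuming plain dictionaries as stand-ins for both the database and the cache (`DATABASE`, `CACHE`, `lookup`, and `redirect` are hypothetical names, not the ones used in the sample):

```python
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for the table that maps keys to long URLs.
DATABASE: Dict[str, str] = {
    "u4rwki": "http://en.wikipedia.org/wiki/URL_shortening",
}
# Hypothetical in-memory cache sitting in front of the database.
CACHE: Dict[str, str] = {}


def lookup(key: str) -> Optional[str]:
    """Return the long URL for a key, trying the cache before the database."""
    url = CACHE.get(key)
    if url is None:
        url = DATABASE.get(key)  # slow path: only taken on a cache miss
        if url is not None:
            CACHE[key] = url     # remember it for the next request
    return url


def redirect(key: str) -> Tuple[int, Dict[str, str]]:
    """Build the status code and headers for a request to /<key>."""
    url = lookup(key)
    if url is None:
        return 404, {}
    return 301, {"Location": url}


print(redirect("u4rwki"))
print(redirect("missing"))
```

One thing to keep in mind: browsers may cache a 301 response, so repeat clicks from the same browser might never reach the service.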
3) Data, Analytics, Report
At the end of the day, you might want to know how many people clicked your URL and generate some reports. Once you have the data, you can do some nice things with it. So let's try to save as much information as possible.
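To make this concrete, here is a minimal sketch of recording a click and counting clicks per key. The `Click` fields and the in-memory `CLICKS` list are my assumptions for illustration; the sample may store different fields in a different way.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class Click:
    """One visit to a shortened URL (fields are illustrative)."""
    key: str
    timestamp: datetime
    referrer: str
    user_agent: str


# Hypothetical in-memory stand-in for the click history store.
CLICKS: List[Click] = []


def record_click(key: str, headers: Dict[str, str]) -> Click:
    """Save what we know about the request behind a click."""
    click = Click(
        key=key,
        timestamp=datetime.now(timezone.utc),
        referrer=headers.get("Referer", ""),
        user_agent=headers.get("User-Agent", ""),
    )
    CLICKS.append(click)
    return click


def clicks_per_key() -> Counter:
    """Aggregate the history into a click count for each key."""
    return Counter(click.key for click in CLICKS)


record_click("u4rwki", {"Referer": "http://example.com/", "User-Agent": "test"})
record_click("u4rwki", {})
print(clicks_per_key()["u4rwki"])  # 2
```

Keeping the raw request headers with each click leaves room for reports that were not planned up front, such as clicks by referrer or by browser.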
4) Admin interface
We need a user interface where users can create an account, shorten URLs, view click history, share links, etc.
The sample is developed using the Windows Azure SDK and the Google charting API. It contains the following roles:
1 Web role for the core service; let's call it Core Web
1 Worker role for saving clicks, request headers, etc.; let's call it just Worker
1 Web role to do the UI heavy lifting; let's call it AdminWeb
The Worker role is lightweight, so it also serves as a cache instance.
Here are a few screenshots:
The main page,
The Shortened URL grid
And some analytics,
The code is here, https://github.com/anilnakkala/skewrl
I hope it is useful for someone.