Project: Roll Your Own URL Shortener

Like many other folks, I started using Twitter and, soon thereafter, found the value in URL shorteners. You need to keep each message to 140 characters or less, and if you are passing along a Web link, sometimes that link can take up most of the allotted space.

A URL shortener basically assigns a Web address with as few characters as possible to a longer address, so that when you enter the shorter address into the browser, a server automatically redirects the page requester to the original, longer address.

While there are plenty of free Web sites that offer this service, there are reasons for deploying your own, if you have the gumption and a Web server at your disposal.

For one, how long will these services last? They don't seem to have business plans. Keeping your own file of URLs assures that the shortened links won't be recycled (unless you want them to be), and that they'll stay up as long as you want them to.

Another reason for rolling your own is pure vanity (media companies should take note). Like personalized license plates, a shortened URL can say what you want. It can also advertise the domain name of the owner.

This is how I built my own URL shortener.

Please note that the code I'll present has some serious limitations. I would strongly advise against offering it as a public service, at least not without more security measures in place (filtering what is entered into the Web form, for instance, to prevent cross-site scripting or the injection of unwanted directives into the .htaccess file).

I set it up to use as a private service. In other words, place it in some hard-to-find cranny of your Web server, inaccessible to the spidering probes of the search engines. Even as a private service, you should consider putting some authentication code in place.

This is bare-bones code, showing how URL redirection works on a Unix box, using a Web front-end. It's so crude, you even have to provide your own shortened URL.

Here is what you need to do:

To set up a URL shortening service, at its most basic, you need to set up three files. I'll explain each in detail.

One is a Web page that users can use to submit a long link and its short link (to keep things simple for me as a non-programmer, I'll ask the user to create the short link names, rather than have them automatically generated).

The second file is a text file on the Unix server listing all the short link names alongside the original long URL addresses that they redirect to. It is called the .htaccess file, and it is read by the Web server, so that when a request for the short link comes in, the server redirects the browser to the original (long) link.

The third file, a PHP page, connects the two files named above. When a user hits "submit" on the HTML page, the information is sent to the PHP page, which has code to append the new link information to the .htaccess file.

And that's about it.

Now the details:

STEP 1: Set up a directory on your Web server for the shortening service: For the sake of keeping things tidy, I set up a specific directory on my Web server, just to hold the URL shortening service. It is immediately under the root directory, and is called "A" ("http://www.joabj.com/A"). This directory has the same permissions as other publicly-accessible directories on the server (drwxr-xr-x). It will contain only the three files needed for this service, plus an obscuring index page so others can't snoop on the directory's contents.

STEP 2: Set up the .htaccess file: Most of the decent URL shorteners use an HTTP 301 (permanent) redirect. To use 301 redirects, you set up a file called .htaccess in one of the Web server's directories. It should be in a directory that the Web server can read and that can be written to by the PHP software.

Setting up an .htaccess file takes a number of steps. I've covered them here. Follow these steps and come back when you're finished.

The permissions of the .htaccess file should be set so that anyone can read and write to it (-rw-rw-rw-). Security-wise, this sucks. Hiding the file in a publicly accessible but unlinked part of the Web server will keep it from casual snoopers, though security through obscurity is not a good long-term solution. But for the purposes of this instruction, it is the easiest path. You've been warned though.

For more information on changing permissions on a Unix box, go here you fool!

STEP 3: Set up the landing HTML page: So the idea is to set up a basic html page that you can bookmark and go to when you want to shorten a URL. It should have a short and easy-to-remember title so you can get to it when you are on the road. But you should NOT link to it from any other page on your site that is crawled by human or search engine spider. Warning: Security through obscurity again.

Anyway, at its most basic, this page should include three things. It should have two fields for the user to enter the original URL and a short URL that they make up. The page should also have a button that can be pushed to kick off the whole operation, once the values are filled in.

This would be the active code for such a page:

<FORM ACTION="Shorten.php" METHOD="get">
The link to be shortened: <INPUT TYPE="Text" NAME="Link" />
Shorty nickname: <INPUT TYPE="Text" NAME="Shorty" />
<INPUT TYPE=SUBMIT VALUE="GO" />
</FORM> 
To see this code in an actual working HTML page, go here. To make it operational, change the suffix of the file name from .txt to .html.

O.k., some explanation of what is going on here. Using the W3C standards for creating Web forms, we've given the user two fields to fill in. The content filled into the "Link" field will be assigned to the variable "Link" and the content filled into the short field will be assigned to the variable "Shorty."

Note that we are asking the user to fill the original link address into the "Link" field and the shortened link name in the "Shorty" field. Also on the page is the code:

FORM ACTION="Shorten.php" METHOD="get"
and
INPUT TYPE=SUBMIT VALUE="GO"
This basically instructs the browser to fetch the Shorten.php page and feed it the contents of the "Link" and "Shorty" variables, when the SUBMIT button is pushed.
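
To make that hand-off a little more concrete, here is a rough sketch of what such a GET request looks like. The long address and the nickname are made-up examples, and http_build_query() only approximates the encoding a browser applies when it submits the form.

```php
<?php
// Values a user might type into the two form fields (made-up examples).
$fields = [
    'Link'   => 'http://www.example.com/a/very/long/path?id=42',
    'Shorty' => '0',
];

// METHOD="get" means the field names and values are URL-encoded and
// appended to the ACTION address after a "?". Characters such as ":" and
// "/" inside the long link are percent-encoded along the way.
$requestUrl = 'Shorten.php?' . http_build_query($fields);
echo $requestUrl, "\n";

// On the server, PHP decodes the query string into $_GET.
// parse_str() performs the same decoding, so we can preview the result here.
parse_str(parse_url($requestUrl, PHP_URL_QUERY), $decoded);
print_r($decoded);
```

The decoded array has the same two keys, "Link" and "Shorty", that the PHP page will read from $_GET.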

Next we create the Shorten.php page.

STEP 4: Create the PHP page: Next you have to create the page that the HTML page sends its information to. This page will format the data and insert it into the .htaccess file in such a way that it can be read by the Web server, capiche?

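<?php
// Values submitted by the form on the HTML page
$Link = $_GET["Link"];
$Shorty = $_GET["Shorty"];

// Pieces of one .htaccess line: redirect 301 /A/[short name] [long address]
$Preamble = "redirect 301  ";
$Space = "     ";
$Directory = "/A/";
$NewLine = "\n";
$All = $Preamble.$Directory.$Shorty.$Space.$Link.$NewLine;

// Open .htaccess in append mode, add the new line, and close the file
$Name = ".htaccess";
$Handle = fopen($Name, 'a');
fwrite($Handle, $All);
fclose($Handle);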
To see this code in an actual working PHP page, go here. To make it operational, change the suffix of the file name from .txt to .php.

So what is going on here? What we need to do is take the information given to this page from the HTML page ("$Link = $_GET["Link"];" and "$Shorty = $_GET["Shorty"];") and format it in the appropriate way for an .htaccess file ("redirect 301 [new short address] [original address]").

To do this, PHP has to make a one-line string. You can concatenate multiple PHP variables with the "." symbol. So we create variables for the additional formatting we have to do. "$Preamble" is the first statement needed on the .htaccess line ("redirect 301"). "$Directory" is the directory (in this case "A") the new address will appear to be in (so the user doesn't have to type it in as a prefix). "$Space" adds the space needed between the two addresses, and "$NewLine" tells Unix to start a new line after this string is entered.

Finally, $All assembles all these variables together in the order of a proper 301 redirect request.

The page then opens the .htaccess file, appends the new request, and closes the file. After the file is appended, when the short link is typed into the browser as part of the full address (e.g. "http://www.joabj.com/A/0"), your Web server should automatically send the viewer to the page you indicated.
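
The same formatting and appending can be packed into a single function. What follows is only a sketch, not the code behind the working page: the function name appendRedirect() is made up for illustration, and the demo writes to a scratch file in the system temp directory rather than to a live .htaccess. It uses file_put_contents() with the FILE_APPEND and LOCK_EX flags, so that two submissions arriving at the same moment shouldn't interleave their lines.

```php
<?php
// Hypothetical helper: format one redirect rule and append it to $file.
function appendRedirect(string $file, string $shorty, string $link, string $directory = '/A/'): string
{
    // Same layout the walkthrough builds by hand: redirect 301 /A/<short> <long>
    $line = sprintf("redirect 301 %s%s %s\n", $directory, $shorty, $link);

    // FILE_APPEND adds to the end of the file instead of overwriting it;
    // LOCK_EX holds an exclusive lock on the file while writing.
    if (file_put_contents($file, $line, FILE_APPEND | LOCK_EX) === false) {
        throw new RuntimeException("Could not write to $file");
    }
    return $line;
}

// Try it on a scratch file (an assumption for this demo), not the real .htaccess.
$scratch = tempnam(sys_get_temp_dir(), 'htaccess');
appendRedirect($scratch, '0', 'http://www.example.com/a/very/long/path');
appendRedirect($scratch, '1', 'http://www.example.com/another/long/path');
echo file_get_contents($scratch);
unlink($scratch);
```

Note that this sketch does no checking of what the user typed, so the earlier warnings about filtering the input still apply.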

That's it. Ezy pezy, yes?

Again, this is just the bare bones code, to show you how it works.

There are some easy things you can do to pretty up the service, by adding to the HTML portion of these pages: You need to set up error messages to tell the user when they fill in the boxes incorrectly. You may want to place a Twitter submission box on the results page, so you can submit your newly-christened short link directly to the microblogging service. Or you could post a link to try the new short URL. Or show the last short URL used, so you know where you left off. You could even insert a generator of short addresses, taking the manual naming of the address out of the process.
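
As a starting point for those error messages, here is a minimal sketch of a check that could run before anything is written to the .htaccess file. The function name checkSubmission() and the rule that short names may only contain 0-9 and a-z are my assumptions for illustration; adjust them to taste.

```php
<?php
// Hypothetical validator: returns a list of error messages,
// which is empty when the input looks usable.
function checkSubmission(string $link, string $shorty): array
{
    $errors = [];

    // Short names are limited to 0-9 and a-z here (an assumption of this sketch).
    if (preg_match('/\A[0-9a-z]+\z/', $shorty) !== 1) {
        $errors[] = 'The short name may only contain the characters 0-9 and a-z.';
    }

    // Whitespace or a line break inside the link would corrupt the .htaccess
    // line, so reject it along with anything that is not an http(s) address.
    $scheme = parse_url($link, PHP_URL_SCHEME);
    if (preg_match('/\s/', $link) === 1
        || filter_var($link, FILTER_VALIDATE_URL) === false
        || !in_array(strtolower((string) $scheme), ['http', 'https'], true)) {
        $errors[] = 'The link must be a complete http:// or https:// address.';
    }

    return $errors;
}

// A deliberately bad submission: a line break in the link and a space in the name.
foreach (checkSubmission("http://example.com/\nDeny from all", 'My Link!') as $message) {
    echo $message, "\n";
}
```

If you echo anything the user typed back onto the results page, pass it through htmlspecialchars() first, which helps with the cross-site scripting worry mentioned earlier.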

Heck, this code doesn't even offer the ability to tell the user that the short link submitted has already been used!
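
One way to plug that hole, sketched under the assumption that every line in the file follows the "redirect 301 /A/[short] [long]" layout written by the PHP page: scan the existing lines before appending a new one. The function name shortNameTaken() is hypothetical, and the sample contents are made up.

```php
<?php
// Hypothetical helper: is /A/<shorty> already the source of a redirect rule?
function shortNameTaken(string $htaccessContents, string $shorty, string $directory = '/A/'): bool
{
    foreach (preg_split('/\R/', $htaccessContents) as $line) {
        // Split on runs of whitespace: redirect, 301, short path, long address.
        $parts = preg_split('/\s+/', trim($line));
        if (count($parts) >= 4
            && strcasecmp($parts[0], 'redirect') === 0
            && $parts[2] === $directory . $shorty) {
            return true;
        }
    }
    return false;
}

// Sample contents in the format the PHP page writes (the addresses are made up).
$sample = "redirect 301  /A/0     http://www.example.com/first\n"
        . "redirect 301  /A/1     http://www.example.com/second\n";

var_dump(shortNameTaken($sample, '1'));  // bool(true)
var_dump(shortNameTaken($sample, '2'));  // bool(false)
```

In the real page you would feed it file_get_contents('.htaccess') and show an error instead of appending when it returns true.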

A word on naming the short links: As you can tell, the user has to supply their own short links, which become live as soon as they are entered. While this offers a way to customize Web addresses ("http://www.yourname.com/ThisStorySucks.html"), if you want to make them as short as possible, you should use as few characters as possible. And long-term use requires a few heuristics, as they say.

Myself, I am starting by running through all the 1-character options (0-9 a-z, for a total of 36 links) in order ("0" then "1" then "2" and so on). When they are exhausted, I'll go through all the 2-character options ("01" then "02" and so on). This will provide a total of 1296 links (36*36), and then, all the three-character options (36*36*36 = 46,656 links), and so on.
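
That ordering is easy enough to automate. Here is a sketch of a generator that returns the n-th name in the sequence, counting from zero; the function name nthShortName() is made up. One difference from my manual scheme: it treats "00" as the first two-character name, so nothing in the 36*36 block is skipped.

```php
<?php
// Hypothetical generator: the $n-th short name (counting from 0) in the order
// 0-9, a-z, then every two-character name, then every three-character name, ...
function nthShortName(int $n): string
{
    if ($n < 0) {
        throw new InvalidArgumentException('n must not be negative');
    }
    $length = 1;
    $block = 36;            // how many names exist at the current length
    while ($n >= $block) {
        $n -= $block;       // skip past all names of this length
        $length++;
        $block *= 36;
    }
    // Write the remainder in base 36 and left-pad with "0" to the required length.
    return str_pad(base_convert((string) $n, 10, 36), $length, '0', STR_PAD_LEFT);
}

echo nthShortName(0), ' ', nthShortName(35), ' ', nthShortName(36), ' ', nthShortName(1331), "\n";
// prints: 0 z 00 zz
```

To use it, you would keep a count of how many links have been issued so far and pass that count in as $n.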

In my lifetime, I probably won't use up all the three-character URL combinations. So my URLs will, at most, run only 20 characters in length (e.g. "http://joabj.com/z99"), thanks to my relatively short domain name. This is the same length as the shortened links that Bit.Ly currently offers. Yay! Brevity!


