They’re out there: services such as bit.ly, tinyurl.com, and now goo.gl, and there are probably more that I am missing. You’ve probably used them, whether in IRC, on Twitter, or somewhere else. No one really thinks about them, because all you need to do is click on them, and you get to the website you want. Nice, easy, and it saves you characters in your Twitter post, along with keeping the people in your IRC channel happy that you didn’t post a three-line link. However, this is going to be a very large problem down the road, unless some major modification to how the internet works is made.
One of the reasons I write these long posts on my blog rather than on some social networking site is archive-ability. At some level, I consider this to be a journal: sure, not a private journal, not even a log of my life, but rather a log of the world as I see it. As such, I would like to be able to return here forty years from now and read about all of the little issues we are having that, at the time, we thought were so great. To a certain extent, I can do this. Sure, some, maybe all, of the links will be dead. But at least they are direct links. So, if worst comes to worst, I could go to the Wayback Machine (assuming that it, or something like it, still exists) and find my links there. Even if no central archive exists forty years down the road, at least I can look at the link and get the general gist of where it points.
None of this works with short links. If there is a central archive, then it may be possible to go to it forty years from now and see where tweets are pointing. However, unless some organisation puts forth this massive amount of effort, all of these short links are going to go to the dogs. Back when short links were only used in IRC, this was less of a problem (except for those IRC channels that were being logged). However, with Twitter, the occasional logged IRC channel, and other forms of communication I am not aware of, the historical value of these tweets and logs is going to be lost. Many people who find that they can’t say everything they wish to on Twitter use the service as a way of getting attention. They post a little tease and a link to an article (either their own post, or someone else’s). Because Twitter requires your submission to fit within 140 characters, and people in IRC get cranky when you post links more than a line or so long, people use a service that takes the long URL and shortens it. This seems all good and fine now. Sure, there may be some people who hide malicious links behind them, but most people just click, read the article, and give little to no thought to how they would find it again.
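To show why a short link is so fragile, here is a minimal sketch of a toy shortener in Python. It is only an illustration under my own assumptions: the `ToyShortener` class, the `sho.rt` domain, and the hash-based six-character code are all made up, and no real service necessarily works this way. The point is simply that the short code carries no information about the destination; the only record of where it goes is the service's own table.

```python
import hashlib


class ToyShortener:
    """A toy in-memory URL shortener (hypothetical, for illustration only)."""

    def __init__(self, domain="sho.rt"):
        self.domain = domain  # made-up domain, not a real service
        self._table = {}      # short code -> long URL; the only record of the mapping

    def shorten(self, long_url):
        # Derive a 6-character code from a hash of the URL.
        # A real service would also have to deal with collisions; this sketch does not.
        code = hashlib.sha256(long_url.encode("utf-8")).hexdigest()[:6]
        self._table[code] = long_url
        return f"http://{self.domain}/{code}"

    def resolve(self, short_url):
        # Look the code up in the table; returns None if the code is unknown.
        code = short_url.rsplit("/", 1)[-1]
        return self._table.get(code)


service = ToyShortener()
short = service.shorten("http://example.com/2010/01/why-short-links-are-a-problem")
print(short)                   # a short URL on the made-up domain
print(service.resolve(short))  # the original long URL

# Simulate the service shutting down or losing its database.
service._table.clear()
print(service.resolve(short))  # prints None: the short link now tells us nothing
```

With a direct link, the path alone (`/2010/01/why-short-links-are-a-problem`) still hints at what was there. With the short one, once the table is gone, so is the destination.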
So what should we do about this? Well, we could just assume that the companies that run these URL shortening services will continue to last. However, if those companies are the only thing archiving short URLs, we are all doomed. Even if the firms are still in business, they will likely charge some sort of fee for access to these links. It will likely be a large fee that only universities can pay. Or, if the paradigm of academic research changes, it could be a small monthly access fee.
Another solution could be to get the government involved. The NSA is probably already building a database of where many of these short URLs go. So, we could go the extra step: sanitise it, and release it to the general public. This, however, also has a problem, as it treats the symptoms but not the cause. We would be able, with some effort, to determine where links are pointing. But what of people in other countries? We could make it a worldwide project, but who pays for it at that point? Speaking of which, why should anyone have to pay for it, when there are solutions that can be free, just by a paradigm shift?
Quite possibly, the best solution is to rethink the web. Several people have considered adding meta-data to tweets, thinking that it will solve the problem. Sure, it may solve it for Twitter, but what about IRC, and what if another service crops up that uses short URLs? The problem crops up again. Besides, is it really all that good of an idea to set up a system that requires users to submit content in a very brittle fashion? Rather, it would be better to set up a system that, like this blog, and HTML in general, lets a link point to one site while its visible text says something else. In other words, I want ubiquitous <a> and </a> tags on the internet. Yes, that does mean that someone can post a link whose text suggests one place while it actually leads to another. However, for archival purposes, this will be much more powerful, because the full destination travels with the message. Also, a large portion of the web is already doing this. That way, we could still have our short tweets and IRC posts, and still be able to link to websites without the need for third-party solutions. Yes, some people will end up not implementing it, but if we do have some sort of standard, even if it’s an ad hoc standard, we will at least have something.
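To make the idea a little more concrete, here is a rough sketch in Python of what a link-aware message might look like. Everything in it is my own assumption rather than an existing standard: the `Link` type and the `visible_length`, `to_html`, and `to_archive_text` helpers are hypothetical names. The message is a list of plain strings and labelled links, only the visible text counts toward the length limit, and the full URL is kept for rendering and for the archive.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Link:
    text: str  # short visible label
    href: str  # full destination URL, stored alongside the message


def visible_length(parts):
    """Count only what the reader sees, so long URLs cost nothing."""
    return sum(len(p.text) if isinstance(p, Link) else len(p) for p in parts)


def to_html(parts):
    """Render the message with ordinary <a> tags, escaping text and URLs."""
    out = []
    for p in parts:
        if isinstance(p, Link):
            out.append(f'<a href="{escape(p.href, quote=True)}">{escape(p.text)}</a>')
        else:
            out.append(escape(p))
    return "".join(out)


def to_archive_text(parts):
    """Plain-text form for logs and archives: label followed by the full URL."""
    return "".join(
        f"{p.text} <{p.href}>" if isinstance(p, Link) else p for p in parts
    )


post = [
    "Why short links worry me: ",
    Link("my post", "http://example.com/2010/01/short-links-and-the-archive"),
]

print(visible_length(post) <= 140)  # True: the long URL does not count
print(to_html(post))
print(to_archive_text(post))
```

A client that does not understand the format could fall back to the archive text, which is longer but still a direct link, and no third party has to stay in business for it to keep working.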