Modern software development relies a great deal on the web. Our source code is hosted on GitHub, and we download necessary libraries from Maven Central, RubyGems or npm. We communicate and discuss using issue trackers and forums, and our software is built on hosted Continuous Integration servers. We rely a lot on SaaS infrastructure. But the Internet is constantly in motion. Pages appear and vanish. New services get created, moved around and shut down again. When Tim Berners-Lee created the web, he wished for URIs that would not change, but unfortunately the reality is different.
I hate dead links. Usually dead links are not a big deal, as an Internet search finds their new homes, but I still find them annoying. It is impossible to update all dead links on my blog and in my notes, but at least I want the cross references of my own stuff to be correct. This means that I have to migrate and update something at least once a year. The recent shutdown of Google Code affected me and caused a lot of extra work. This made me think.
Learning: Host as much of your content under your own control, e.g. your personal web space.
To deny the usage of today's powerful and often free services like GitHub would be foolish, but still I believe we (developers) should consider them dependencies and as dependencies they are also liabilities. We should try to reduce coupling and be aware of potential migration costs.
Learning: When choosing a SaaS service, think about its benefits versus the impact of it not being available any more.
Learning: When you pay for it, it is more stable.
Personal Development Process
The impact can be internal, which means it affects only you and your work, or it can be external, which means it affects others, your users or people using your code or documentation. For example let us consider CloudBees. I started using it in the early beta and it was great. Over time I moved all my private, kata and open source projects there and had them built on each commit. It was awesome. But last year they removed their free plan and I did not want to pay, so I stopped using it. (This is no criticism. CloudBees is a company and needs to make money and reduce cost.) The external impact was zero as the Jenkins instance was private. The internal impact seemed huge. I had lost my CI. I looked for alternatives like Travis, but was too lazy to configure it for all my projects. Then I used the Jenkins Job Import Plugin to copy my jobs to a local instance and got it sort of running. (I had to patch the plugin, wasting hours...) Still I needed to touch every project configuration and in the end I abandoned CI. In reality the internal impact was also low as I am not actively developing any Open Source right now and I am not working on real commercial software where CI is a must. Now I just run my local builds more often. It was cool to use CloudBees, but I can live without it.
Learning: Feel free to use third party SaaS for convenience, i.e. for anything that you can live without easily.
Another example is about written information, the Hackergarten wiki. When I started Hackergarten Vienna in 2011, I collected material on how to run it and put it into the stub wiki someone had created for Hackergarten. I did not think about it and just used the existing wiki at Wikispaces. It seemed the right place. There were a few changes by other users, but not many changes at all. Two years later Wikispaces removed their free plan. Do you see a pattern? The internal impact was zero, but the external impact was high, as I wanted to keep the information about running a Hackergarten available to other hackers. Still I did not want to spend $50 to keep my three pages alive. Fortunately Wikispaces offered a raw download of your wiki pages. I used this accessible copy of my work and converted the wiki pages into blog pages in no time. As I changed the pages rarely, the extra overhead of working in HTML versus Creole was acceptable. Of course I had to update several links, a few blog posts and two slide-decks, sigh. (And it increased my dependency on Google Blogger.)
Learning: When choosing a SaaS service, check its ways of migrating. Avoid lock-in.
Learning: Use static pages for data that rarely changes.
Moving code repositories is always a pain. My JavaClass Ruby Gem started out on Rubyforge, later I moved it to Google Code. With Google Code shutting down I had to migrate it again, together with seven other projects. The raw code and history were no problem, hg convert dealt with that. But there were a lot of small things to take care of. For example, different version control systems use different ignore syntax. The Google Code project description was proprietary and needed to be copied manually. The same was true for wiki pages, issues and downloads.
I had to change many incoming links. The first URL to change was the source repository location in all migrated projects' descriptors, e.g. Maven's POM, package.json and so on. Next were the links to and from project wiki pages and finally I updated many blog posts and several slide-decks. And all the project documentation, e.g. Maven sites or RDoc API pages, needed to be re-generated to reflect the new locations. While this would be no big deal for a single project, it was a lot of work for all of them. I full-text-searched my hard disk for obsolete URLs and kept finding them again and again.
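Such a full-text search can be scripted. The following is a minimal Python sketch, not the tool actually used for this migration. The function name find_obsolete_urls is made up for illustration, and it simply assumes that an obsolete URL can be recognised by its host name appearing somewhere in a line of text.

```python
import os


def find_obsolete_urls(root, obsolete_hosts):
    """Yield (path, line_number, line) for each line that mentions an obsolete host.

    root           -- directory to search recursively
    obsolete_hosts -- host names to look for, e.g. ["code.google.com"]
    """
    for directory, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(directory, name)
            try:
                # Treat every file as text; undecodable bytes are ignored.
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    for number, line in enumerate(handle, start=1):
                        if any(host in line for host in obsolete_hosts):
                            yield path, number, line.rstrip("\n")
            except OSError:
                # Unreadable file (permissions, broken link): skip it.
                continue
```

Calling list(find_obsolete_urls(".", ["code.google.com"])) would list every remaining mention below the current directory. A plain substring match is crude, but it errs on the side of reporting too much, which is what you want when hunting for leftovers.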
Maybe I should not cross link my stuff that much, and I am not even sure I do link that much at all. But instead of putting the GitHub URL of the code kata we will be working on in a Coding Dojo directly into the slides, I could just write down the URL on a flip-chart at the beginning of the dojo. The information about the kata seems to be more stable than the location of the source code. Also I might use the same slides working on code in different languages, which might be stored in different repositories. But on the other hand, if I bother to write documentation and I reference something related, I expect it to be linked for fast navigation. That is the essence of hyper-text, isn't it?
Learning: (maybe) Do not cross link too much.
Learning: (maybe) Do not link from stable resources to less stable ones.
Next to source code I had to migrate generated artefacts like my Maven repository. I had used a Google Code feature that made a repository accessible in raw mode. I would just push to my Maven repository repository (recursion yeah ;-) and the newly released artefacts would show up. That was very convenient. Unfortunately Bitbucket could not do that. I had a look into Bitbucket pages, but really did not feel like changing the layout of the repository. I was getting tired of all this. In the end I just uploaded the whole thing to my public web space. Static web pages and binary files, e.g. compressed archives, can be hosted on any web server and I should have put them there in the very beginning. Again I had to update site locations, repository URLs and incoming links in several projects and blog posts. As I updated my parent POM I had to release new versions of several projects. I started to hate hyper-links.
Learning: Host static data on regular (personal) web spaces.
You might argue that Maven Central would be a better place for Maven artefacts and I totally agree. I consider Maven Central much more stable than my personal web space, but I did not bother to go through the process of getting access to a service that would mirror my releases to Central. Anyway, this mirroring service, like Sonatype's, feels less stable than Central itself.
Learning: Host your stuff on the most stable option available.
Now all my repositories are hosted on Bitbucket. If its services stop working some day, and they surely will stop sometime in the future, I will stop using hosted repositories for my projects. I will not migrate everything again. I am done.
Learning: (maybe) Do not bother with dead links or losing your stuff. Who cares?
For some time Bitbucket offered a CNAME feature that allowed you to associate a domain or sub-domain with an account. That was really nice: my repositories were reachable under my own domain, e.g. hg.code-cop.org/project-x, instead of under a Bitbucket URL. I liked it, or so I thought. Of course Bitbucket decided to disable this feature this July and I ended up, again, updating URLs everywhere. While the change was minimal, as path and parameters of the URLs stayed the same, I had to touch almost every source repository to change its public repository location, fix 20+ blog posts and update several Coding Dojo slide-decks, which in turn needed to be uploaded to SlideShare again.
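When only the host part of a URL changes, the rewrite itself is mechanical. Here is a small Python sketch of that step, under stated assumptions: replace_host is a hypothetical helper, and the account prefix /example-account is invented for the example, it is not my real Bitbucket path.

```python
from urllib.parse import urlsplit, urlunsplit


def replace_host(url, old_host, new_host, new_path_prefix=""):
    """Return url with old_host swapped for new_host.

    Path, query parameters and fragment are kept as they are.
    new_path_prefix is prepended to the path, e.g. an account name
    that was previously implied by the custom domain.
    URLs on any other host are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.netloc != old_host:
        return url
    return urlunsplit((
        parts.scheme,
        new_host,
        new_path_prefix + parts.path,
        parts.query,
        parts.fragment,
    ))


# Hypothetical example values, only the old host comes from this post.
print(replace_host(
    "http://hg.code-cop.org/project-x?at=default",
    "hg.code-cop.org",
    "bitbucket.org",
    "/example-account",
))
# prints http://bitbucket.org/example-account/project-x?at=default
```

The rewrite is the easy part. Finding every place a URL lives (repository settings, blog posts, slide-decks) and republishing it is what takes the time.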
Learning: Only use the minimal, strictly necessary features of a SaaS.
Learning: Even when offered, turn off all features you do not need, e.g. wiki, issues, etc.
Learning: Do not use anything just because it is handy or "cool". It increases your coupling.
URLs Change All The Time
Maybe I should not link too much in this post as eventually I will have to change all the links again, grrrr. I consider using a URL service like bitly. Maybe not exactly like bitly because its purpose is the shortening and marketing aspect of links and I do not see a way to change the actual link once it has been created. And I would depend on another service, which eventually would go away. So I need to host the service myself, like J. B. Rainsberger does. I like the idea of updating all my links with a single update statement in the link database. I wish I had used such a thing. It would increase work when creating links, but would reduce work when changing them. Like with code, it seems that my links are far more often changed than created, at least some of them.
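To make the idea concrete, here is a rough Python sketch of such a link database using SQLite. It is only an illustration under assumptions: the table layout (slug, target), the function names and the example URLs are all made up, and it leaves out the web server that would answer each slug with an HTTP redirect.

```python
import sqlite3


def create_link_db():
    """Create an in-memory link database (a real one would live in a file)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE links (slug TEXT PRIMARY KEY, target TEXT NOT NULL)")
    return db


def add_link(db, slug, target):
    """Register a stable slug that redirects to the current target URL."""
    db.execute("INSERT INTO links (slug, target) VALUES (?, ?)", (slug, target))


def resolve(db, slug):
    """Return the current target for a slug, or None if the slug is unknown."""
    row = db.execute("SELECT target FROM links WHERE slug = ?", (slug,)).fetchone()
    return row[0] if row else None


def move_links(db, old_prefix, new_prefix):
    """Rewrite every target starting with old_prefix using one UPDATE statement.

    Returns the number of links that were changed.
    """
    cursor = db.execute(
        "UPDATE links SET target = ? || substr(target, length(?) + 1) "
        "WHERE substr(target, 1, length(?)) = ?",
        (new_prefix, old_prefix, old_prefix, old_prefix),
    )
    return cursor.rowcount


db = create_link_db()
add_link(db, "project-x", "http://old-host.example/project-x")
add_link(db, "slides", "http://slides.example/dojo")
move_links(db, "http://old-host.example/", "http://new-host.example/")
print(resolve(db, "project-x"))  # prints http://new-host.example/project-x
```

Published pages would only ever contain the slug URLs, so a hosting change touches the database and nothing else. The prefix comparison uses substr instead of LIKE so that characters such as % or _ in a URL are not treated as wildcards.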
I could not find any service, free or otherwise, providing this functionality. So I would have to create my own. Also I do not know the impact of the extra redirect on search engines and other automatic consumers of my content. And still some manual changes are inevitable. If the repository location moves, the SCM settings need to be changed. So I will just wait until the next feature or whole service is discontinued and I have to start over.
Thanks to Thomas Sundberg for proof-reading this article.