22 March 2013

Twitter Command Line Backup

I used to be a low volume Twitter user. I would not connect with people having more than 1000 tweets because they seemed to "high-volume" for me. But given time the number of my tweets rose as well and I managed to create a decent tweet now and then.

U.S.-Bundeswehr medical trainingSaving My Tweets
As soon as I noticed that my tweets were "piling up", I thought about backing them up. This would have to work incrementally as older tweets might vanish from my time line. I searched for an online tool, but did not find anything useful. While searching for offline tools, I found TwitterBackup by Johann Burkard. It is a great tool and does exactly what I need, incremental backup of tweets. It has a small user interface and works out of the box. I recommend it if you like using UIs. (Note that the password input field in the main window is not used, you do not have to provide your password there.)

GUIs are for wimps ;-)
I prefer to start my tools from the command line, especially if I plan to run them periodically. Fortunately Johann Burkard provided his TwitterBackup under a MIT license, so I forked the source and "mavenized" the project. Johann had kept his logic separated from the UI which made it easy to remove the user interface and add command line parameters instead. A few days later I added support to backup favorites and retweets as well. The current version supports the following commands:
E:\>java -jar twitter-backup-cli-3.1.8.2-jar-with-dependencies.jar -h
usage: twitter-backup-cli [-u <twitter handle>] [-f <backup file>]
Backup Twitter Tweets with TwitterBackup (command line).
 -f,--file <arg>         File to save tweets to (saved to system)
 -fv,--favorites         Load favorites instead of tweets (default=tweets)
 -h,--help               Print this usage information
 -o,--port <arg>         HTTP proxy port for web access
 -p,--proxy <arg>        HTTP proxy URL for web access (saved to system)
 -r,--reset-preferences  Do not load preferences from system
 -rt,--retweets          Also load retweets in timeline (default=false)
 -si,--sinceId <arg>     Load tweets/favorites since the given id (default=all)
 -t,--timeout <arg>      Timeout in ms between calls to Twitter (default=10500)
 -u,--username <arg>     Twitter username to load tweets from (saved to system)
For general instructions, see Johann's website at: http://is.gd/4ete
Note that both the original and the new version save some of their parameters to the java.util.prefs.Preferences, so you need to provide your credentials only once.

Usage Patterns
Download the binary Jar and provide your twitter handle, e.g. -u codecopkofler. Use -f tweets.xml -rt to backup all tweets and retweets and use -f favorites.xml -fv to save all your favorites. Finally decide where the backup should start, e.g. -si 286077755567779841. The value 286077755567779841 is Twitter's id of my first tweet in the year 2013, whereas 152504278093795328 was my first in 2012 and so on. With -si I separate my tweets into yearly backup chunks.
Loading Tweets
read 27 tweets from tweets.xml
loading http://api.twitter.com/1/statuses/user_timeline.xml?...
9 new tweets downloaded, 36 total
waiting 10500ms
loading http://api.twitter.com/1/statuses/user_timeline.xml?...
no new tweets found
saving backup to tweets.xml
Loading Favorites
read 0 tweets from favorites.xml
loading http://api.twitter.com/1/favorites.xml?...
1 new tweets downloaded, 1 total
waiting 10500ms
loading http://api.twitter.com/1/favorites.xml?...
no new tweets found
saving backup to favorites.xml

2 comments:

Peter Kofler said...

I noticed that twitter-backup-cli-3.1.8.2 does not work any more due a change in the Twitter API. A new version 3.1.8.3 is available which fixes this.

Peter Kofler said...

Again twitter-backup-cli-3.1.8.3 stopped working because Twitter retired the old XML API. I updated the code to use JSON download (API 1.1) and a new version 3.1.8.4 is available now.