Friday, December 24, 2010

Downloading files from the Apple Developer website using wget (for poor connections or scheduling)

Recently we had some issues downloading the latest iPhone SDK. Firstly, thanks to a crappy 3G broadband connection, we never seemed to be able to download the entire 3.5 gigabyte file without a dropout. Unfortunately for us, Apple seems to have overlooked this possibility with their developer website, and they do not offer the download through a more robust facility. Note to Apple: FTP would be nice.

Secondly, with several team members located in the same city but not on the same LAN, we wanted to distribute the update to all members via our shared Linux server. Unfortunately, this server is getting a little long in the tooth now, and being a 32-bit Linux distribution it does not support files as large as the Xcode and iOS dmg. We were going to have to split the file into more manageable chunks. What a pain.

The command line tool wget often yields answers to these kinds of problems, so it was our first port of call. Firstly, it can provide an extra layer of robustness for downloading files; secondly, it's very easy to schedule downloads via cron. For those of you wondering why this is a consideration - welcome to the reality of living in the internet 3rd world - Australia. With the typical internet plans in Australia, heavy internet users such as ourselves find it important to spread our download usage between peak periods (any time you're likely to be awake) and off-peak periods (any time you're likely to be sleeping) to maximise our bandwidth allocation.

Frustratingly, downloading the iOS SDK via wget is complicated by the need for any web client connecting to the Apple Developer website to be authenticated. The Apple website uses cookies to authenticate web clients, and several recipes for extracting authentication credentials from browser cookies into a file and using them via the wget command line interface are well known - at least for Firefox.

The basic procedure for using wget to access content on a site requiring authentication starts with logging into the site using a standard web browser. Once authenticated via the login page, one can set about extracting the authentication cookie from the browser. The extracted cookie is then fed to wget, which presents it to the site as its permission to download the desired content.

Being on a Macintosh system we are provided with Safari by default. Rather than bother installing Firefox on every system we use, we figured it was easier just to stick with Safari. Luckily the same technique can be performed with Safari as with Firefox. The technique for Safari is not as well known as the Firefox technique, so we'll cover it here.

Safari stores its cookies on a per user basis within a user's home directory. Specifically, cookies are stored in a simple XML file. Have a look in ~/Library/Cookies/Cookies.plist. You can see all of Safari's cookies in there.

To get the required cookie into the Cookies.plist before we proceed, use Safari to log in to the Apple Developer website using your Apple ID credentials. Safari should now have the requisite cookie. Open the Cookies.plist with a text editor to view the cookie; we're looking for the one called ADCDownloadAuth.

With wget expecting its cookie information in the Netscape cookies.txt file format, we'd like a quick and simple way to convert from one format to the other. Luckily this is relatively easy to do on a Macintosh system. As the language Ruby is preinstalled on Tiger, Leopard and Snow Leopard systems we may as well leverage the language to do this job.

Install the plist Ruby library and run the short Ruby session listed below to convert the file to Firefox's cookies.txt format.

$ sudo gem install plist
$ irb
>> require 'plist'
>> result = Plist::parse_xml(File.expand_path("~/Library/Cookies/Cookies.plist"))
>> File.open("cookies.txt", "w") { |f| result.each { |r| f.write("#{r["Domain"]}\tTRUE\t#{r["Path"]}\tFALSE\t#{r["Expires"].strftime("%s")}\t#{r["Name"]}\t#{r["Value"]}\n") } }
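The one-liner above packs seven tab-separated fields into each line of cookies.txt. Spelled out as a function it might look like the sketch below. It assumes each cookie record is a hash with the same keys the irb session reads; the function name, the sample domain and the expiry time are made up for illustration, and the value is a placeholder.

```ruby
# Build one line of a Netscape-format cookies.txt file from a cookie record.
# The record is assumed to be a Hash with the keys used in the irb session:
# "Domain", "Path", "Expires", "Name" and "Value".
def netscape_cookie_line(cookie)
  [
    cookie["Domain"],                  # host or domain the cookie belongs to
    "TRUE",                            # include-subdomains flag (hard-coded, as in the one-liner)
    cookie["Path"],                    # path the cookie applies to
    "FALSE",                           # secure flag; FALSE means plain HTTP is allowed
    cookie["Expires"].strftime("%s"),  # expiry as seconds since the Unix epoch
    cookie["Name"],
    cookie["Value"]
  ].join("\t")
end

# Hypothetical sample record; "XXX" is a placeholder, not a real credential.
sample = {
  "Domain"  => ".apple.com",
  "Path"    => "/",
  "Expires" => Time.at(1293840000),
  "Name"    => "ADCDownloadAuth",
  "Value"   => "XXX"
}
puts netscape_cookie_line(sample)
```

Hard-coding the two flags mirrors the original one-liner; a stricter converter would derive them from the cookie record if it carries that information.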

Now that we have our cookies.txt file we can download the file we want. Note that the URL for the SDK was found by looking at Apple's download website to see where the download link led.

Below is the wget command line used to download the Xcode and iPhone SDK. Note that the options tell wget to write the downloaded file to standard output, which is piped to split; split breaks the file up to keep our venerable Linux file and web server happy (2 GB file limit). We're splitting the download into 512 MB chunks here.

To make sure the authentication works I needed to use the --header option and insert the cookie value on the command line. Look in the cookies.txt file for the ADCDownloadAuth key again, and put its value in place of the "XXX" in the command line below for this recipe to work.
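Rather than reading the value off by eye, it can be pulled out of cookies.txt with a few lines of Ruby. This is only a sketch: the helper name cookie_value is made up, and the sample text stands in for the real file contents.

```ruby
# Return the value of the named cookie from Netscape-format cookies.txt text,
# or nil when no such cookie is present.
def cookie_value(cookies_txt, name)
  cookies_txt.each_line do |line|
    next if line.start_with?("#") || line.strip.empty?  # skip comments and blank lines
    fields = line.chomp.split("\t", 7)                   # seven tab-separated fields
    next unless fields.length == 7
    return fields[6] if fields[5] == name                # field 6 is the name, field 7 the value
  end
  nil
end

# Placeholder text for illustration; in practice pass File.read("cookies.txt").
sample = ".apple.com\tTRUE\t/\tFALSE\t1293840000\tADCDownloadAuth\tXXX\n"
puts cookie_value(sample, "ADCDownloadAuth")
```

The printed value is what replaces the XXX in the Cookie header.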

wget -qO- -U firefox -ct 0 --timeout=60 --waitretry=60 --load-cookies cookies.txt -c http://adcdownload.apple.com/ios/ios_sdk_4.2__final/xcode_3.2.5_and_ios_sdk_4.2_final.dmg --header="Cookie: ADCDownloadAuth=XXX" | split --bytes=512m - xcode_3.2.5_and_ios_sdk_4.2_final.dmg
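To see what the split stage of the pipeline should leave on disk, the arithmetic can be sketched in Ruby. The assumptions here are that the download is exactly 3.5 GiB (the real dmg will differ somewhat), that 512m means 512 MiB as GNU split reads it, and that split uses its default two-letter suffixes; the function name is made up.

```ruby
# Predict the file names split will produce for a download of total_bytes
# cut into chunk_bytes pieces, using split's default two-letter suffixes.
def split_piece_names(prefix, total_bytes, chunk_bytes)
  count = (total_bytes + chunk_bytes - 1) / chunk_bytes  # ceiling division
  suffix = "aa"
  Array.new(count) do
    name = prefix + suffix
    suffix = suffix.succ   # "aa" -> "ab" ... "az" -> "ba"
    name
  end
end

MIB = 1024 * 1024
# Assumed size: exactly 3.5 GiB (3584 MiB), for illustration only.
names = split_piece_names("xcode_3.2.5_and_ios_sdk_4.2_final.dmg", 3584 * MIB, 512 * MIB)
puts names
```

Under those assumptions the download arrives as seven pieces, each comfortably below the server's 2 GB limit.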


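You should now see your download commence, and with everything going to plan you'll soon have all the pieces of your dmg on the server. You can then ftp them to your Macintosh. Once they are aboard you can run
cat part1 part2 part3 ... > combined.dmg

to restore the split components, where part1, part2 and so on stand for the files split produced (it names them by appending aa, ab, ac and so on to the prefix given on its command line). Happy developing with the latest SDK!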
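A postscript for anyone who would rather not type the piece names by hand: since Ruby is already on the Mac, the join can be scripted. This is a sketch that assumes the pieces carry split's default two-letter suffixes and sit in the current directory; join_pieces is a made-up name.

```ruby
# Concatenate the split pieces back into one file, in suffix order.
# Returns the list of pieces that were joined.
def join_pieces(prefix, output_path)
  pieces = Dir.glob("#{prefix}??").sort   # prefixaa, prefixab, ...
  raise "no pieces found for #{prefix}" if pieces.empty?
  File.open(output_path, "wb") do |out|
    pieces.each do |piece|
      # Stream each piece so the multi-gigabyte image never sits in memory.
      File.open(piece, "rb") { |part| IO.copy_stream(part, out) }
    end
  end
  pieces
end

# Usage, assuming the pieces are in the current directory:
#   join_pieces("xcode_3.2.5_and_ios_sdk_4.2_final.dmg", "combined.dmg")
```

Sorting the names matters because the pieces must go back together in the order split wrote them.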