The code behind OpenAustralia


How to Install

This document explains how to install the OpenAustralia parser. The parser generates XML files that are loaded into the database of the web application. If all you need are the XML files and you are not interested in installing and configuring the parser, you can simply download them from data.openaustralia.org.

Requirements

Install the dependencies

Mac OS X Leopard

Install DarwinPorts (now called MacPorts), then use it to install ImageMagick and Ghostscript:

$ sudo port install ImageMagick
$ sudo port install ghostscript

Note: the previous step takes a long while to complete, so make yourself a coffee (or two).

Install the required rubygems:

$ sudo gem install -y mechanize -v 0.9.2
$ sudo gem install -y builder rmagick rcov htmlentities rspec activesupport log4r hpricot

Note: Currently OpenAustralia requires an older version of mechanize (0.9.2), but this might change in the future.

Ubuntu 8.04

Use apt-get to install the requirements:

$ sudo apt-get install imagemagick libmagick9-dev ghostscript ruby rubygems ruby1.8-dev libxslt1-dev

Install the required rubygems:

$ sudo gem install -y mechanize -v 0.9.2
$ sudo gem install -y builder rmagick rcov htmlentities rspec activesupport log4r hpricot

Note: Currently OpenAustralia requires an older version of mechanize (0.9.2), but this might change in the future.

For Windows

On Windows, download Ruby from the Ruby Downloads page and choose the one-click installer option.

In addition to the Ruby gems required above you’ll need to install Ruby-MySQL, which can be downloaded from http://www.tmtm.org/en/ruby/mysql/.
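Once the gems are installed, a quick check like the following can confirm that none were missed. This is only a sketch: the gem names come from the install commands above, the helper name missing_gems is made up for illustration, and it assumes a RubyGems version that provides Gem::Specification.find_all_by_name, which may be newer than the one shipped with the systems described here. It does not check that mechanize is at version 0.9.2.

```ruby
require 'rubygems'

# Gem names taken from the "gem install" commands in this document.
REQUIRED_GEMS = %w[mechanize builder rmagick rcov htmlentities rspec
                   activesupport log4r hpricot].freeze

# Default check: asks RubyGems whether any version of the gem is installed.
RUBYGEMS_CHECK = lambda { |name| !Gem::Specification.find_all_by_name(name).empty? }

# Hypothetical helper: returns the names for which `installed` reports false.
# `installed` is any callable that takes a gem name and returns true or false.
def missing_gems(names, installed = RUBYGEMS_CHECK)
  names.reject { |name| installed.call(name) }
end

missing = missing_gems(REQUIRED_GEMS)
if missing.empty?
  puts 'All required gems are installed.'
else
  puts "Missing gems: #{missing.join(', ')}"
end
```

Any gem reported as missing can be installed with the same gem install commands shown above.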

Configure the Parser

The only configuration necessary is the web root: if you have installed the web application in another location, change the web_root value in openaustralia/openaustralia-parser/configuration.yml.
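To confirm what the parser will pick up, the setting can be read back with a few lines of Ruby. This is a minimal sketch that assumes configuration.yml is a YAML mapping with web_root as a top-level key, which this document implies but does not show. The helper name web_root_from and the sample path are hypothetical.

```ruby
require 'yaml'

# Hypothetical helper: returns the web_root value from a parsed configuration hash.
# Assumes web_root is a top-level key; raises if it is missing or blank.
def web_root_from(config)
  value = config.is_a?(Hash) ? config['web_root'] : nil
  raise ArgumentError, 'web_root is not set' if value.nil? || value.to_s.strip.empty?
  value.to_s
end

# Illustrative content only; the real file may contain other keys.
sample = YAML.load("web_root: /var/www/openaustralia\n")
puts web_root_from(sample)
```

To inspect the real file, replace the sample with YAML.load_file('configuration.yml'), run from the openaustralia-parser directory.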

Run the Parser

Before you can run the parser, you will need to create the directories that will hold the images of the MPs.

$ mkdir -p pwdata/images/mps pwdata/images/mpsL
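If you would rather create the directories from a Ruby setup script, the same step could be sketched as follows. The directory names come from the mkdir command above; the helper name ensure_image_dirs and its base parameter are hypothetical, and the demonstration runs in a temporary directory so that nothing is written to a real checkout.

```ruby
require 'fileutils'
require 'tmpdir'

# Directory names taken from the mkdir command in this document.
IMAGE_DIRS = %w[pwdata/images/mps pwdata/images/mpsL].freeze

# Creates the image directories under `base` and returns their full paths.
# Like "mkdir -p", this leaves existing directories untouched.
def ensure_image_dirs(base)
  IMAGE_DIRS.map do |relative|
    path = File.join(base, relative)
    FileUtils.mkdir_p(path)
    path
  end
end

# Demonstration in a throwaway directory that is removed afterwards.
Dir.mktmpdir do |tmp|
  ensure_image_dirs(tmp).each do |path|
    puts "#{path}: #{File.directory?(path) ? 'ok' : 'missing'}"
  end
end
```

Calling ensure_image_dirs('.') from the parser directory would have the same effect as the mkdir command.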

You are now ready to generate the members' information. To do so, run:

$ ./parse-members.rb
# you should see messages on the console similar to the following
Reading members data...
Running consistency checks...
Writing XML...
Replacing existing member with new data for 5
This is for your information only, just check it looks OK.

$VAR1 = [
          '5',
          '10006',
          1,
          '',
          'Albert',
          'Adermann',
          'Fisher',
          'National Party',
          '1972-12-02',
          '1984-12-01',
          'general_election',
          'elected_elsewhere'
        ];
[...]

To download the members' images (this will take a while):

$ ./member-images.rb 

Optionally (it is not particularly important initially), you can also download the links information that appears on the Representatives’ and Senators’ pages by running:

$ ./parse-member-links.rb

To download the Hansard data (the speeches) for a single day, for example 20 September 2007, and load it into the database:

$ ./parse-speeches.rb 2007.09.20
parse-speeche: 100% |oooooooooooooooooooooooooooooooooooooooooo| Time: 00:01:27
db loading  2007-09-20
db loading  2007-09-20
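The date argument in the example above is written as YYYY.MM.DD. To load several days, one option is to generate one command per day, as in the sketch below. The helper names speech_date_arg and speech_commands are hypothetical, and the sketch assumes the script is invoked once per day as shown above; whether parse-speeches.rb accepts other argument forms is not covered here. Parliament does not sit every day, so some dates may have no Hansard data.

```ruby
require 'date'

# Formats a Date the way the example above passes it to parse-speeches.rb.
def speech_date_arg(date)
  date.strftime('%Y.%m.%d')
end

# Builds one command line per day in the inclusive range first..last.
# Assumption: one invocation per day, as in the single-day example.
def speech_commands(first, last)
  (first..last).map { |day| "./parse-speeches.rb #{speech_date_arg(day)}" }
end

# Prints the commands for 18 to 20 September 2007, one per line.
puts speech_commands(Date.new(2007, 9, 18), Date.new(2007, 9, 20))
```

The printed lines can be pasted into a shell or saved to a script and run from the parser directory.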

You should now be able to view the results at your web server URL, dev.openaustralia.org, where you should see a version of openaustralia.org populated with data.

Congratulations, you’ve got a mostly complete running version of OpenAustralia! Give yourself a big pat on the back.
