Copacetic

Ace King, check it out!

Python Program to scrape the CRO Database


I’ve spent a lot of time looking at the CRO database over the past few weeks, so I wrote a Python script called croscraper.py to do quick lookups on companies.

Usage is:

backus:~/Documents/workspace/CloudSplit/src/croscrape jpd$ python croscraper.py -h
Usage: croscraper.py <list of files>

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -c COMPANY, --company=COMPANY
                        return information on company named <company>
  -d, --debug           turn on debugging
backus:~/Documents/workspace/CloudSplit/src/croscrape jpd$
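The help text above has the layout that Python's standard optparse module produces. The source of croscraper.py isn't shown here, so the following is only a minimal sketch of a parser that would print similar help. The function name build_parser and the version string "0.1" are assumptions for illustration, not taken from the real script.

```python
from optparse import OptionParser


def build_parser():
    """Build a parser with the options listed in the help output.

    The version string is a placeholder; the real one is not known.
    """
    parser = OptionParser(usage="%prog <list of files>", version="%prog 0.1")
    parser.add_option(
        "-c", "--company",
        dest="company",
        help="return information on company named <company>",
    )
    parser.add_option(
        "-d", "--debug",
        action="store_true",
        dest="debug",
        default=False,
        help="turn on debugging",
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch
# can be run on its own.
options, args = build_parser().parse_args(["-c", "cloud"])
```

With these arguments, options.company holds the search term and options.debug stays at its default.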

To look up a company by its name, or by part of its name, try:

backus:~/Documents/workspace/CloudSplit/src/croscrape jpd$ python croscraper.py -c cloud
{'Number': '44950', 'Type': 'Company', 'Name': 'BLUE CLOUD LIMITED', 'Address': '46, LOWER LEESON STREET, DUBLIN 2.  '}
{'Number': '332318', 'Type': 'Company', 'Name': 'CLOUD 9 DESIGN LIMITED', 'Address': '21 CLOISTER AVENUE BLACKROCK CO DUBLIN '}
{'Number': '102707', 'Type': 'Company', 'Name': 'CLOUD DANCER LIMITED', 'Address': '18, MERRION SQUARE, DUBLIN 2.  '}
{'Number': '&', 'Type': '361040', 'Name': 'CLOUD ELECTRICAL ', 'Address': ' COMMUNICATION SERVICES LIMITED'}
{'Number': '472475', 'Type': 'Company', 'Name': 'CLOUD IGNITE LIMITED', 'Address': 'BAYVIEW HOUSE 49 NORTH STRAND ROAD DUBLIN 3 '}
{'Number': '466027', 'Type': 'Company', 'Name': 'THE CLOUD NETWORKS (IRELAND) LIMITED', 'Address': "C/O O'MAHONY DONNELLY 10 MCCURTAIN HILL CLONAKILTY CO CORK"}
{'Number': '328355', 'Type': 'Company', 'Name': 'CLOUD NINE BEDS LIMITED', 'Address': 'Unit T1 Coolmine Industrial Estate Clonsilla Road Dublin 15'}
{'Number': '124102', 'Type': 'Company', 'Name': 'CLOUD NINE HAIRLINE (SALES) LIMITED', 'Address': 'BALLYMOUNT ROAD, DUBLIN 12.  '}
{'Number': '55964', 'Type': 'Company', 'Name': 'CLOUD NINE PROMOTIONS LIMITED', 'Address': '********NO ADDRESS DETAILS******* ********NO ADDRESS DETAILS******* ********NO ADDRESS DETAILS******* ********NO ADDRESS DETAILS*******'}
{'Number': '311145', 'Type': 'Company', 'Name': 'DILL CLOUD LIMITED', 'Address': '6 RICHMOND ROAD DRUMCONDRA DUBLIN 3 '}
{'Number': '316954', 'Type': 'Company', 'Name': 'MUSHROOM CLOUD PRODUCTIONS LIMITED', 'Address': '3RD FLOOR WESTLAND SQUARE DUBLIN 2 '}
{'Number': '44954', 'Type': 'Company', 'Name': 'RED CLOUD LIMITED', 'Address': '6, CAVENDISH ROW, DUBLIN.  '}
{'Number': '433681', 'Type': 'Company', 'Name': 'ROLLIN CLOUD LIMITED', 'Address': '44 BELGRAVE SQUARE WEST RATHMINES DUBLIN 6 '}
{'Number': '175728', 'Type': 'Company', 'Name': 'SOLAR CLOUD LIMITED', 'Address': '38, ENNAFORT ROAD, RAHENY, DUBLIN 5. '}
backus:~/Documents/workspace/CloudSplit/src/croscrape jpd$

Gee, look at all those cloud companies 🙂
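Each result line above is printed as a Python dict literal, so the output can be read back in for further filtering. Here is a rough sketch, assuming one record per line as shown. The helper names parse_records and is_well_formed are made up for this example. It also flags rows like the 'CLOUD ELECTRICAL' one, where the name appears to have been split at the '&' so the fields have shifted.

```python
import ast


def parse_records(lines):
    """Turn croscraper.py output lines (one dict literal each) into dicts."""
    records = []
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip shell prompts and blank lines
        records.append(ast.literal_eval(line))
    return records


def is_well_formed(record):
    """Treat a record as sane if its Number field is all digits."""
    return record.get("Number", "").isdigit()


# Two lines copied from the output above.
sample = [
    "{'Number': '44950', 'Type': 'Company', 'Name': 'BLUE CLOUD LIMITED', "
    "'Address': '46, LOWER LEESON STREET, DUBLIN 2.  '}",
    "{'Number': '&', 'Type': '361040', 'Name': 'CLOUD ELECTRICAL ', "
    "'Address': ' COMMUNICATION SERVICES LIMITED'}",
]

records = parse_records(sample)
good = [r for r in records if is_well_formed(r)]
bad = [r for r in records if not is_well_formed(r)]

for r in good:
    print(r["Number"], r["Name"])
for r in bad:
    print("suspect row:", r)
```

ast.literal_eval only evaluates literals, so it is a safer choice than eval for text that came from a scraped web page.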

It is left as an exercise to the reader to port the program to Google App Engine or get more data out of the CRO. I may do some more work on this in the future.

You will need Python installed to run this program. It has only been tested on a Mac.


Written by Joe

September 1, 2009 at 11:51 pm

Posted in Uncategorized

5 Responses


  1. Works like a charm on XP with Python 2.6

    Paul Power

    September 2, 2009 at 12:29 am

  2. Interesting market research technique 😉

    Jarek

    September 2, 2009 at 8:31 am

  3. tested on FreeBSD 🙂 no problems at all 😉

    Jarek

    September 2, 2009 at 8:35 am

  4. […] Scraping the CRO Database by Joe Drumgoole […]

  5. […] Copacetic » Blog Archive » Python Program to scrape the CRO Database […]

