Daniel Rossi • about 5 years ago
Hi, I had a look at the datasets and there does seem to be a massive problem with how the government publishes its data. They really need to open up data APIs.
Nearly all of them are XLS or ODS files (yes, some are OpenOffice spreadsheets). There seem to be Python packages available to extract these to JSON, either from the command line or perhaps via Django.
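As a rough sketch of the spreadsheet-to-JSON idea, something like the following might work. It assumes pandas is installed (plus xlrd for .xls files or odfpy for .ods files) and that the first row of the first sheet holds the column names. The function names are made up for illustration and are not from any existing project.

```python
import json


def rows_to_records(rows):
    """Turn a header row plus data rows into a list of dicts.

    Empty cells arrive from pandas as float NaN, which is not valid JSON,
    so they are replaced with None.
    """
    header, *data = rows
    keys = [str(column) for column in header]
    return [
        {
            key: (None if isinstance(value, float) and value != value else value)
            for key, value in zip(keys, row)
        }
        for row in data
    ]


def read_sheet(path):
    """Read the first sheet of an XLS/ODS file as a list of rows.

    Assumption: pandas is available. It is imported here so the rest of
    the module still works without it.
    """
    import pandas as pd

    # pandas needs the "odf" engine (odfpy) for OpenDocument spreadsheets.
    engine = "odf" if path.lower().endswith(".ods") else None
    frame = pd.read_excel(path, engine=engine)
    return [list(frame.columns)] + frame.values.tolist()


def sheet_to_json(path):
    """Convert one spreadsheet file to a JSON string of row objects."""
    # default=str covers dates and other values json cannot encode itself.
    return json.dumps(rows_to_records(read_sheet(path)), default=str, indent=2)
```

From the command line this could be wrapped in a small script that calls sheet_to_json on each downloaded file and writes the result next to it.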
Or process the datasets into a local DB, which might help with caching and can be scheduled to update.
A command-line Python script could suck the whole thing up into a DB and be scheduled via cron.
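A minimal sketch of that local cache, assuming SQLite is good enough as the DB and that each dataset has already been parsed into a list of dicts. The table name, function names, and the cron path in the comment are all hypothetical.

```python
import json
import sqlite3

# Hypothetical crontab entry to refresh every night at 03:00:
#   0 3 * * * /usr/bin/python3 /path/to/refresh_datasets.py

SCHEMA = """
CREATE TABLE IF NOT EXISTS dataset_rows (
    dataset   TEXT NOT NULL,
    row_index INTEGER NOT NULL,
    data      TEXT NOT NULL,
    PRIMARY KEY (dataset, row_index)
)
"""


def refresh_dataset(conn, dataset, records):
    """Replace the cached rows for one dataset in a single transaction."""
    with conn:  # commits on success, rolls back if anything raises
        conn.execute(SCHEMA)
        conn.execute("DELETE FROM dataset_rows WHERE dataset = ?", (dataset,))
        conn.executemany(
            "INSERT INTO dataset_rows (dataset, row_index, data) VALUES (?, ?, ?)",
            [
                (dataset, index, json.dumps(record, default=str))
                for index, record in enumerate(records)
            ],
        )


def cached_rows(conn, dataset):
    """Return the cached rows for a dataset in their original order."""
    cursor = conn.execute(
        "SELECT data FROM dataset_rows WHERE dataset = ? ORDER BY row_index",
        (dataset,),
    )
    return [json.loads(data) for (data,) in cursor]
```

Each cron run would open the database file with sqlite3.connect, call refresh_dataset once per spreadsheet, and the site would then read from cached_rows instead of parsing the files on every request.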