Why We Built This
The Isle of Man Government publishes data. Quite a lot of it, in fact — more than many jurisdictions its size. There's an official open data page, departments release annual reports, and various registers are queryable online.
So why did we build Smart Island?
Because publishing data and making it usable are different things.
The Format Problem
Government data on the Isle of Man arrives in every format imaginable:
- CSV files with inconsistent column names across years
- PDF reports with tables embedded in flowing text
- Scanned PDFs — yes, actual images of printed pages, where the data isn't even text
- Web portals with no bulk export option
- Excel spreadsheets emailed on request
Each format requires different extraction tooling. CSV is straightforward to read, but schemas change between publications. Text-based PDFs need table detection algorithms. And scanned PDFs, the final boss of open data, need OCR with AI verification.
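To make the CSV half of that concrete, here is a minimal sketch of mapping inconsistent column names onto one schema. The column names and the alias table are invented for illustration; they are not taken from any real Isle of Man publication, and this is not necessarily how Smart Island does it.

```python
import csv
import io

# Hypothetical aliases: variants seen in different years -> one canonical name.
COLUMN_ALIASES = {
    "school": "school_name",
    "school name": "school_name",
    "schoolname": "school_name",
    "yr": "year",
    "year": "year",
    "roll": "pupils",
    "total pupils": "pupils",
}


def canonical(name: str) -> str:
    """Lower-case, collapse whitespace/underscores, then look up the alias."""
    key = " ".join(name.strip().lower().replace("_", " ").split())
    # Unknown columns are kept, just converted to snake_case.
    return COLUMN_ALIASES.get(key, key.replace(" ", "_"))


def read_normalised(csv_text: str) -> list[dict[str, str]]:
    """Parse CSV text and rename every column to its canonical form."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{canonical(k): v.strip() for k, v in row.items()} for row in reader]


# Two "years" with different headers end up with identical keys.
rows_2019 = read_normalised("School Name,Yr,Roll\nExample Primary,2019,120\n")
rows_2020 = read_normalised("school,year,Total_Pupils\nExample Primary,2020,118\n")
```

Keeping the aliases in a plain dictionary means a new header variant in next year's file is a one-line change rather than a new parser.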
A Real Example: School Rolls
The Isle of Man Department for Education publishes school roll data annually as PDF reports. Six years of data, 37 schools, detailed year group breakdowns by gender.
Of those six PDFs:
- Four were text-based — tables could be extracted programmatically using PyMuPDF
- Two were scanned images of printed documents
For the scanned reports, we used AI-assisted OCR: extract text from the images, then use Claude to verify and correct the extracted numbers against row and column totals. Every number in the final dataset has been checked against the published totals.
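The totals check is the part of that process that can be shown without any OCR or AI at all. As a rough sketch, assuming each table is a grid of counts (one row per year group, one column per gender) with published row and column totals, the function below flags every row or column whose sum disagrees. The function name and the sample numbers are made up for illustration.

```python
def find_total_mismatches(
    cells: list[list[int]],
    row_totals: list[int],
    col_totals: list[int],
) -> list[tuple[str, int]]:
    """Return ("row", i) / ("col", j) for every sum that disagrees with its published total."""
    mismatches = []
    for i, (row, expected) in enumerate(zip(cells, row_totals)):
        if sum(row) != expected:
            mismatches.append(("row", i))
    for j, expected in enumerate(col_totals):
        if sum(row[j] for row in cells) != expected:
            mismatches.append(("col", j))
    return mismatches


# Invented example: the OCR read 13 as 18 in the last row, first column.
ocr_cells = [[12, 15], [14, 11], [18, 13]]
published_row_totals = [27, 25, 26]
published_col_totals = [39, 39]

suspects = find_total_mismatches(ocr_cells, published_row_totals, published_col_totals)
# suspects == [("row", 2), ("col", 0)]
```

A single misread digit breaks exactly one row total and one column total, so the bad cell sits at their intersection. That is what makes totals such a useful check for scanned tables.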
The result is the Isle of Man's first structured, machine-readable schools dataset. Six years across 37 schools gives 222 school-year records (6 × 37), with every year group broken down by gender. It is available as JSON via API, downloadable as a bulk file, and queryable through AI tools.
None of this existed before. The data was public, but it wasn't accessible.
Not a Criticism
This is important to say: Smart Island is not a criticism of the Isle of Man Government's data practices. The government has a genuine commitment to transparency and publishes more open data than many comparable jurisdictions.
But government operates within constraints that the private sector doesn't face. Procurement cycles, committee approvals, legacy systems, competing priorities, and limited specialist resource all slow things down. This isn't unique to the Isle of Man — it's universal.
The private sector can move faster. We can iterate weekly, adopt new tools immediately, and build the kind of infrastructure that makes raw data genuinely useful. APIs with filtering and pagination. AI enrichment that adds context automatically. MCP tools that let AI assistants query Isle of Man data directly.
Government publishes the data. We make it sing.
What We Built
Smart Island now covers 30+ datasets:
From government sources: jobs, vehicles, property transactions, aircraft, ships, crime statistics, companies, schools, population, FOI requests, gambling licences, financial services, parliament, elections, planning applications, and more.
From community sources: OpenStreetMap points of interest, infrastructure, heritage sites, geography, Manx place names.
From international bodies: World Bank economic indicators, OECD education data, GBIF biodiversity records, ERA5 climate reanalysis.
Every dataset follows the same access pattern:
- Interactive dashboard — charts, tables, AI commentary
- REST API — JSON with filtering, pagination, full-text search
- MCP tools — AI-callable via Claude, ChatGPT, Cursor, or any agent
- Bulk downloads — JSON, ZIP, and GZ for offline analysis
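For the REST API route, a client usually ends up looping over pages. The sketch below shows one way to do that. The base URL, the `page` and `per_page` parameter names, and the "short page means last page" rule are all assumptions for illustration; the API Reference is the authority on the real endpoints and parameters.

```python
from typing import Callable, Iterator
from urllib.parse import urlencode


def build_url(base: str, filters: dict[str, str], page: int, per_page: int) -> str:
    """Combine filters and paging into a query string (parameter names are hypothetical)."""
    params = {**filters, "page": page, "per_page": per_page}
    return f"{base}?{urlencode(sorted(params.items()))}"


def iter_records(
    fetch_page: Callable[[str], list[dict]],
    base: str,
    filters: dict[str, str],
    per_page: int = 50,
) -> Iterator[dict]:
    """Yield records page by page until a page comes back shorter than per_page."""
    page = 1
    while True:
        batch = fetch_page(build_url(base, filters, page, per_page))
        yield from batch
        if len(batch) < per_page:
            return
        page += 1
```

`fetch_page` is passed in rather than hard-coded so the loop can be exercised without a network connection. In real use it would be a small function that performs the HTTP GET and returns the decoded JSON list.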
The Pipeline
65+ automated pipelines run on daily, nightly, and weekly schedules. They share the same stages:
- Scrape — pull data from source websites, APIs, and file servers
- Parse — extract structured data from PDFs, CSVs, and HTML
- Normalise — consistent naming, date formats, and schema alignment
- Enrich — AI adds analysis, risk scores, skills extraction, sentiment
- Publish — API endpoints, MCP tools, page data, download exports
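The stages above can be sketched as a chain of functions, each taking the previous stage's output. Everything in this example is a stand-in: the URL is fake, the "scrape" returns a hard-coded string, and the "enrich" step is a simple rule where the real platform uses AI.

```python
from functools import reduce
from typing import Any, Callable

Stage = Callable[[Any], Any]


def run_pipeline(source: Any, stages: list[Stage]) -> Any:
    """Feed the output of each stage into the next, in order."""
    return reduce(lambda data, stage: stage(data), stages, source)


def scrape(url: str) -> str:
    # Stand-in for a download; returns raw CSV text with messy spacing.
    return "school,pupils\n Example Primary ,120\n"


def parse(raw: str) -> list[dict[str, str]]:
    header, *lines = raw.strip().splitlines()
    keys = header.split(",")
    return [dict(zip(keys, line.split(","))) for line in lines]


def normalise(rows: list[dict[str, str]]) -> list[dict[str, Any]]:
    return [{"school": r["school"].strip(), "pupils": int(r["pupils"])} for r in rows]


def enrich(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    # Placeholder rule; the threshold of 150 is arbitrary.
    return [{**r, "size_band": "small" if r["pupils"] < 150 else "large"} for r in rows]


PUBLISHED: list[dict[str, Any]] = []  # stand-in for an API or export target


def publish(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    PUBLISHED.extend(rows)
    return rows


result = run_pipeline(
    "https://example.invalid/schools.csv",
    [scrape, parse, normalise, enrich, publish],
)
```

Because each stage is just a function, a stage can be tested in isolation or swapped out (for example, a PDF parser in place of the CSV one) without touching the rest.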
When a new PDF is published or a government website updates, the pipeline picks it up automatically. No manual intervention, no waiting for someone to re-export a spreadsheet.
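This post does not go into how updates are detected. One common approach, shown here purely as an assumption, is to fingerprint each downloaded file and compare it with the fingerprint from the previous run. The function names and the in-memory `seen` store are hypothetical.

```python
import hashlib


def content_fingerprint(content: bytes) -> str:
    """SHA-256 hex digest of the raw bytes."""
    return hashlib.sha256(content).hexdigest()


def has_changed(source_id: str, content: bytes, seen: dict[str, str]) -> bool:
    """Record the new fingerprint and report whether it differs from the stored one."""
    digest = content_fingerprint(content)
    changed = seen.get(source_id) != digest
    seen[source_id] = digest
    return changed
```

A scheduled job would call `has_changed` after each download and only run the later stages when it returns `True`. In practice `seen` would need to live in a file or database so it survives between runs.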
Why It Matters for the Isle of Man
Small islands face unique challenges. With 84,000 people, the Isle of Man doesn't have the scale for large research institutions or dedicated data agencies. But it has the advantage of being comprehensible — one island, one government, a manageable number of datasets.
That makes it possible to build something genuinely complete. Not a portal linking to scattered PDFs, but a unified platform where a researcher, policy maker, or AI agent can query across education, employment, property, demographics, environment, and infrastructure in a single session.
Open data isn't just about transparency (though that matters). It's about enabling the kind of cross-domain analysis that leads to better decisions. When you can see school enrolment falling alongside a declining birth rate, property market trends, and employment patterns, the picture that emerges is richer than any single dataset could provide.
Try It
- Browse: Data Catalogue — all 30+ datasets with dashboards
- Download: Open Data Downloads — JSON, ZIP, GZ exports
- Query: API Reference — interactive REST API docs
- AI: MCP Tools — 89+ tools for AI assistants
- Source: Data Sources — provenance and licensing for every dataset
All data is free. No API key required. No registration. Just structured, machine-readable Isle of Man data.
This post was written by Claude as part of the Smart Island open data platform.
