Cycle to School Scheme Report

An exploration of a hypothetical government scheme, using open data to create rich visualisations and reach interesting conclusions.

1 Introduction

1.1 What is this document

This document is a study of a prospective government scheme, the Cycle to School Scheme, including sections on feasibility and on potential future outcomes of the scheme.

All visualisations, studies, text, and data used in this report are published under open licenses, enabling re-use of any part of this document. The methodology and source code used to create this document are also published under an open license.

The source code for this document can be found at https://gitlab.com/trick16/cycle-to-school-scheme

The source code is released under the BSD 2-Clause open license, except for the CSS for this website, which is released under the GPL 3.0.

The visualisations and text within this document are released under Creative Commons Attribution (4.0), copyright Carl Lange.

Data used in the making of this document remains the copyright of its original copyright holders, listed in the Appendix.

1.2 Statement of the scheme

The Cycle To School Scheme is a hypothetical proposed scheme to supply bicycles - one bicycle per student - to all secondary schools in the Republic of Ireland. The intent of this scheme is to increase mobility and exercise among all socio-economic groups of secondary school students, as well as decreasing Ireland's carbon footprint by requiring fewer car journeys to and from schools.

1.3 The goal of this report

This report is intended as a research study, determining how feasible the prospective Cycle To School Scheme would be to implement, as well as charting some of the potential outcomes of the scheme. The main question this report is intended to answer is: Should the Cycle To School Scheme be implemented, and if so, how?

We aim to explore many aspects of the scheme in a readable and understandable manner using data analytics and static and dynamic visualisations. The report is not exhaustive or particularly rigorous, but we hope that it inspires interest in a scheme like the Cycle to School Scheme. We also hope that this report, and the workflow we used to create it, inspires future open civic research and interest in open data.

Open Data has opened the door for a significant amount of citizen participation in research, by enabling anybody to access, re-use, and re-distribute source data. This document is intended as an example of the kind of research that can now be done by any citizen. The publication of government data as open data has removed a really sizeable "data origination" step from the process of research, particularly where the data is published in machine-readable formats. We hope to demonstrate some of the power of this data with this report.

1.4 Acknowledgements

Data for this report comes primarily from data.gov.ie, Ireland's Open Data Portal. A list of used datasets is included in the appendix, which also includes relevant datasets that we did not end up using directly.

Map data used is from Open Street Maps and used under the Open Database License.

This report was funded by the 2020 Open Data Engagement Fund.

The producers of this report are Carl Lange, Jen Carey, and Shane McDonagh, for Trick16. Trick16 is a creative technical consultancy in Ireland aiming to bring bespoke technical solutions to the Arts, Culture, and Heritage sector.

Chief author: Carl Lange

Assistant author: Shane McDonagh

Producer: Jen Carey

Finally, a big thank you to everyone involved in open data in Ireland!

2 Methodology

2.1 How was the document created?

This document is written primarily in org-mode, an open source, plaintext format for writing. The use of this format over proprietary formats such as Microsoft Word ensures that the text of the report is accessible to as many people as possible, and allows the report to be "source-controlled" - different revisions of the document are stored and retrieved, and the entire history of the document is available.

This org-mode document is converted into HTML, the format used for web pages, to publish the document on the internet.

Research was primarily done within Wolfram Mathematica, a proprietary programming language, document format, and toolkit. Mathematica is commonly used for scientific research, and has an extremely large feature-set, enabling speedy research workflows. The use of a proprietary format for the research was a trade-off based on the ease of research. However, the Wolfram Notebooks are published alongside all other source material. Everything done within Mathematica can be recreated using an open-source programming language such as Python or R, and the Wolfram Notebooks are published to facilitate the verification of research results in arbitrary programming languages. Diagrams and visualisations have been created by Wolfram Mathematica, except where otherwise stated.

We structured the process of research around building blocks called "ARC"s - "Ask, Research, Conclude" blocks, which then built up larger sections of research over the course of a "sprint", typically 3-4 weeks in length.

An ARC section is typically a standalone data exploration. We "ask" a question (for example, "How many bicycles would we need?"), "research" the answer based on several data sources, exploring the data and creating visualisations and prose, and then "conclude" the result. We did monthly reviews of the questions we had lined up to explore using an ARC block, and usually did one ARC per week, although the sections often progressed asynchronously. Typically at the end of each month, we had a writing week, where the results of each ARC were drafted into the report itself. Then, a new set of questions for ARC blocks was drawn up, and another "sprint" was begun.

2.2 Structure of this document

We structured this document to contain several "Research and Outcomes" sections, based around our ARC blocks (described in the previous section). We intend the Research and Outcomes sections to be standalone data explorations, which contribute to the overall goal of determining the feasibility of the Cycle to School Scheme. They can be read in any order, more or less.

Typically, a Research and Outcomes section is structured by:

  • stating the problem, hypothesis, or question
  • an approximate methodology and the datasets we expect to use
  • the body of research, often including specific code snippets and visualisations
  • a conclusion, ideally solving, proving, or answering the problem stated in the beginning

3 Research and Outcomes

3.1 How many bikes do we need?

3.1.1 Introduction

This is one of the most fundamental questions laid out in this report: how many bicycles are actually involved in the Cycle to School Scheme? Luckily, it is a fairly easy thing to work out. We will focus on secondary schools only, although there might also be some interesting data to look at regarding primary schools.

3.1.2 Method

We can simply take the total student body from all secondary schools in Ireland.

EntityRegister[ResourceData["Irish Schools"]]

Total[EntityValue[EntityClass["IrishSchools", "Secondary School"], "BodySize"]]
362704

So there we have it - 362,704 students in secondary schools in Ireland.

We can obviously also look at primary schools:

Total[EntityValue[EntityClass["IrishSchools", "Primary School"], "BodySize"]]
528562

So there are in total just under a million people in primary and post-primary education in Ireland. That's a lot of bikes.

3.1.2.1 How many schools are there in the first place?

Length[EntityList[EntityClass["IrishSchools", "Secondary School"]]]
723

723 secondary schools in Ireland. We can see how they are distributed in the following visualisations.

First, here are all the schools on a map.

school-geo-list.png
Figure 1: Irish schools geo-list

And here's the number of schools per county:

school-grv.png
Figure 2: Irish schools by county

This is one of those things that ends up practically just being a population map. Naturally, there are more schools where there are more people.

Next, let's take a look at the distribution of the number of students in schools by county.

school-body-histogram.png
Figure 3: Student body distribution by county

Looking at the above graph, it's evident that Cork and Dublin have more schools than the rest of the country, but in general most schools appear to have somewhere between 200 and 600 students. Curiously, it seems that there are more schools that take smaller numbers of students in Dublin and Cork, relative to other counties. I also certainly didn't expect to see quite so many schools with more than a thousand students.

Now, let's think about how many bicycles would be irreparably damaged, lost, or stolen over the course of a year. I would expect somewhere in the region of 10%.

As we found above, the number of students in secondary schools in Ireland is 362,704. If we assume 370,000 bicycles in our initial bicycle investment and 37,000 bicycles per year that need replacing, over ten years we would need (37,000*9) + 370,000 bicycles - that works out to 703,000, or roughly 700,000 bicycles. That seems like a lot of bicycles to me - probably something like 50% of all bicycles currently in use in Ireland.
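
The ten-year total can be checked with a few lines of Python. This is only a sketch of the arithmetic in this section: the 10% loss rate is the guess made above, and the function name is our own.

```python
# Ten-year bicycle requirement under the assumptions stated in the text.
INITIAL_BIKES = 370_000    # initial investment, rounded up from 362,704 students
ANNUAL_LOSS_RATE = 0.10    # guessed share lost, stolen, or damaged each year
YEARS = 10


def bikes_needed(initial: int, loss_rate: float, years: int) -> int:
    """Initial fleet plus one round of replacements for each year after the first."""
    replacements_per_year = round(initial * loss_rate)
    return initial + replacements_per_year * (years - 1)


total = bikes_needed(INITIAL_BIKES, ANNUAL_LOSS_RATE, YEARS)
print(f"{total:,} bicycles over {YEARS} years")
```

Changing ANNUAL_LOSS_RATE is the quickest way to see how sensitive the total is to that guess.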

3.1.3 Research Outcomes

It seems that we will need somewhere around 370,000 bikes, if we focus entirely on secondary schools. We could lower this number slightly if we don't distribute bikes to students who live very close to or very far from their schools, or by making other assumptions about which students would or would not cycle regardless of owning a bicycle. However, it's probably safe to assume that some number of bicycles would be lost, damaged, or in some way unmaintainable, and this number will probably meet or exceed the number we would filter out.

  • Approximately 370,000 bicycles would be needed to cover all secondary schools in Ireland at any point.
  • Assuming 10% loss per year, roughly 700,000 bicycles would need to be supplied over the first 10 years of the scheme.

3.2 How expensive would the scheme be?

3.2.1 Introduction

This is among the most important factors for the implementation of the scheme. We must determine the approximate cost of the scheme, and ideally also for alternative implementations.

3.2.2 Method

This one's pretty much just arithmetic. We can do a quick estimate by simply thinking of some costs and adding them all together. That should put us somewhere in the correct order of magnitude, and then we can try to narrow it down from there.

For this, we need to know the number of schools, and the size of the student bodies in those schools. This is examined in the "How many bikes" section, and those totals come out to 362,704 bikes for secondary schools, and 528,562 bikes for primary schools. However, we'll focus on secondary schools only for this - my hypothesis is that supplying secondary schools would be a more effective and successful version of the scheme.

Anyway, let's get started by simply noting down some example costs that we can think of off the top of our heads.

Item                                                   Cost per student (EUR)
Bike                                                   100
Equipment - helmets                                    25
Equipment - high vis                                   2
Equipment - bike locks                                 10
Equipment - lights                                     5
Training for students - 1h/100 students @ 75 EUR/hr    7.5
Bike storage for schools                               6
Total per student                                      155.5

Then we simply add all that together and multiply it per number of bikes! Now we're playing with the true power of arithmetic.
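
As a rough sketch of that sum in Python: the per-student figures are copied from the table above, the student count comes from the "How many bikes" section, and the dictionary keys and function name are our own labels. The result is a material cost only, before any margin for administration.

```python
# Per-student costs in Euro, copied from the table above.
COSTS_PER_STUDENT = {
    "bike": 100.0,
    "helmet": 25.0,
    "high vis": 2.0,
    "bike lock": 10.0,
    "lights": 5.0,
    "training": 7.5,
    "bike storage": 6.0,
}
SECONDARY_STUDENTS = 362_704  # from the "How many bikes" section


def scheme_cost(costs: dict, students: int, share: float = 1.0) -> float:
    """Material cost of supplying `share` of the students (no admin margin)."""
    per_student = sum(costs.values())
    return per_student * students * share


full_cost = scheme_cost(COSTS_PER_STUDENT, SECONDARY_STUDENTS)
pilot_cost = scheme_cost(COSTS_PER_STUDENT, SECONDARY_STUDENTS, share=0.05)
print(f"Per student: {sum(COSTS_PER_STUDENT.values()):.2f} EUR")
print(f"Full scheme: {full_cost:,.0f} EUR")
print(f"5% pilot:    {pilot_cost:,.0f} EUR")
```

Swapping in your own guesses for any of the values shows how quickly the total moves.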

I've created an interactive "cost estimator" to allow you to specify your own guesses for these values.

Given the numbers I've laid out above, we get a total cost of 56,400,472 Euro (362,704 students at 155.50 Euro each). However, that's just a material cost estimate - let's add a pretty big margin for administrative and other costs and call it 80 million Euro. I'm comfortable with that as an estimated cost. A pilot scheme supplying only 5% of secondary school students might cost somewhere in the region of 5 million Euro.

3.2.3 Research Outcomes

  • The initial cost of the full scheme for secondary school students might be around 80,000,000 Euro.
  • A pilot scheme supplying 5% of secondary school students might cost around 5,000,000 Euro.

3.3 How many schools are ready for the scheme?

3.3.1 Introduction

We need to determine whether a phased rollout of the scheme is worthwhile (versus a single-stage rollout). To do that, it would be valuable to find out how many schools might be ready for the scheme today.

We have determined a number of factors that schools would require to be eligible for the scheme. These are:

  • A bike shop or car mechanic should be near the school. This would be important to facilitate repair of the bikes, as well as potentially education of the students in bicycle maintenance, road safety, and so on.
  • Storage space for bicycles. One could estimate that at any point, some large percentage of the bicycles given to the schools would need to be stored at the school - during school days, for example.
  • Some large percentage of the student body must live within 8km (or similar arbitrary distance) from the school. One can assume that students living outside of that distance are extremely unlikely to cycle to school on a regular basis.
  • Good cycle lane and road quality, and low road sinuosity, on the routes to the school.

3.3.2 Method

In this section, we will determine approximately how many schools pass the predicates listed above. Some of the data used here is not definitive (Open Street Maps and so on) and so we must add a pinch of salt to these numbers.

3.3.2.1 Bike shop within 10km of the schools

This is a fairly straightforward query to Open Street Maps. We can use Overpass, a query language for Open Street Maps - however, we might change this to use SPARQL via Sophox instead, since the Linked Data nature of SPARQL is appealing for this project, and SPARQL is a slightly more readable and expressive language.

Here is a simple Overpass query to get all bike shops:

[out:json][bbox:{{bbox}}][timeout:25];
(
  node["shop"="bicycle"];
  way["shop"="bicycle"];
);
out body;
>;
out skel qt;

This turns out to only be 213 bike shops listed in Open Street Maps. That sounds low to me, but it doesn't seem like it's a million miles away from what I would expect. This is partially to do with using community-driven data sources like Open Street Maps - the data is unlikely to be completely comprehensive.

Anyway, we want something slightly more useful - bike shops within a certain distance to school buildings.

[out:json][bbox:{{bbox}}][timeout:30];

(
  node[building=school];
  way[building=school];
  rel[building=school];
)->.schools;

(
  node[shop=bicycle];
  way[shop=bicycle];
  rel[shop=bicycle];
)->.bikeshops;

(
  node.bikeshops(around.schools:10000);
  way.bikeshops(around.schools:10000);
  rel.bikeshops(around.schools:10000);
)->.bikeShopsNearSchools;

.bikeShopsNearSchools;

out geom meta;

The SPARQL version of this query is the following:

SELECT ?bikeshops ?schools ?distance WHERE {
  ?schools osmt:building "school" ;
           osmm:loc ?schoolsLoc.
  ?bikeshops osmt:shop "bicycle" ;
             osmm:loc ?bikeshopsLoc.

  BIND(geof:distance(?bikeshopsLoc, ?schoolsLoc) as ?distance)
  FILTER(?distance < 10)
}

This returns to us the list of bike shops within ten kilometres of a school. In fact, this is 213 bike shops, meaning none of the bike shops on Open Street Maps in Ireland is more than 10 kilometres from a school.

The bike shop closest to a school is The Cycle Shop in Dundalk, 29 metres away from Coláiste Rís, right next door.

Let's flip it and find the total number of schools with a bike shop no more than 5km away. If we change the SELECT statement above to only ask for DISTINCT ?schools, and change the FILTER to 5 rather than 10, we get 1,556 schools (of a total of 4,118 schools in the Open Street Maps data for the island of Ireland), or about 38%.
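
For readers without a SPARQL endpoint to hand, the same "is there a bike shop within N kilometres" test can be sketched in Python with a great-circle (haversine) distance. This is an illustration rather than the method used for the figures above: the coordinates and names are hypothetical, and the haversine formula on a spherical Earth is our own stand-in for the endpoint's distance function.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def schools_near_shop(schools, shops, max_km):
    """Names of schools with at least one shop closer than max_km."""
    return [
        name
        for name, (lat, lon) in schools.items()
        if any(haversine_km(lat, lon, s_lat, s_lon) < max_km for s_lat, s_lon in shops)
    ]


# Hypothetical example data (not taken from Open Street Maps).
schools = {"School A": (53.35, -6.26), "School B": (54.00, -8.00)}
shops = [(53.34, -6.27)]

print(schools_near_shop(schools, shops, max_km=5))
```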

3.3.3 Research Outcomes

  • A little over one third of schools (1,556 of 4,118) are within 5km of a bicycle shop
  • About one quarter of schools might be ready for the scheme

3.4 How many kilometres are driven to and from Irish schools today?

3.4.1 Introduction

This is a surprisingly deep question. There are several ways to measure this, and data also already exists from the CSO.

This particular question has a large impact on the environmental outcomes of the scheme, since driving to and from school probably accounts for a large amount of Ireland's carbon emissions.

3.4.2 Method

3.4.2.1 CSO data
3.4.2.1.1 CSO data for travel times to school

The CSO has a very useful table, E6016, which describes travel times to schools by mode of transport. We can use this to guess at how many kilometres are driven.

We can also see the death of cycling as a method for going to schools, using the "mode of transport" table, E6010.

volumes-of-transport.png
Figure 4: Travel methods to schools for secondary-age students. Source: CSO

From the above figure, we can see that cycling has really lost its place in Irish homes for getting to school. In 1986, the first year this data was collected, more students cycled to school than were driven. However, by 2005 the number of students cycling to school had dropped dramatically.

3.4.2.1.2 Travel methods to schools (E6010)

This dataset shows that, in 2016, 146755 students were passengers in motor cars to get to school. We can simply guess at the average number of kilometres driven per student and multiply it by the number of students to get an estimate of our result.

We can use a Voronoi mesh method to establish an upper bound for this estimate, if we want.

Wikipedia describes a Voronoi mesh as "a partition of a plane into regions close to each of a given set of objects." For me, I imagine growing each point outwards at the same rate and stopping when they collide with another point.

voronoi-explanation.gif
Figure 5: A visual explanation of how Voronoi meshes work

I've created a Voronoi mesh centred on all of the secondary schools in Ireland. I've also visualised population data for each cell. For our purposes here, these cells will approximate the "catchment area" of each of our schools, since any point within a cell will be closer to the school at its centre than any other school.
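
The defining property of a Voronoi cell (every point in it is closer to its own centre than to any other) means that cell membership can be sketched without building the mesh at all: just pick the nearest school. The snippet below assumes flat (planar) coordinates in kilometres and uses hypothetical school positions.

```python
from math import hypot


def nearest_site(point, sites):
    """Name of the site whose Voronoi cell contains `point` (planar distance)."""
    return min(
        sites,
        key=lambda name: hypot(point[0] - sites[name][0], point[1] - sites[name][1]),
    )


# Hypothetical school positions on a flat grid, in kilometres.
schools = {"School A": (0.0, 0.0), "School B": (10.0, 0.0), "School C": (0.0, 10.0)}

for home in [(2.0, 1.0), (8.0, 3.0), (1.0, 9.0)]:
    print(home, "->", nearest_site(home, schools))
```

Real catchment areas are not decided by straight-line distance, which is why these cells are only an approximation.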

schools-voronoi-population.png
Figure 6: Voronoi mesh centred on schools by population

What's the distribution of population across these cells?

cell-population-histogram.png
Figure 7: Histogram of population of school catchment cells.
cell-area-histogram.png
Figure 8: Histogram of area of school catchment cells.

The average area of these cells is 96.6 square kilometres. If we imagined these cells as simple disks with that area, their radius would be about 5.5 kilometres. This would indicate that the furthest anyone would need to travel to their school - in a straight line - would be around that distance, although it is fudged quite substantially by roads not in fact being straight lines to schools (that's for another scheme…). So I will estimate an upper bound of most people living no more than 20 kilometres' driving distance from their nearest secondary school, and an average driving distance of about 10 kilometres.
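
The disk radius comes from rearranging the area of a circle, A = πr², into r = √(A/π). A minimal check in Python (the function name is our own):

```python
from math import pi, sqrt

MEAN_CELL_AREA_KM2 = 96.6  # average Voronoi cell area quoted in the text


def equivalent_radius_km(area_km2: float) -> float:
    """Radius of a disk with the given area, from A = pi * r**2."""
    return sqrt(area_km2 / pi)


radius = equivalent_radius_km(MEAN_CELL_AREA_KM2)
print(f"Equivalent radius: {radius:.1f} km")
```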

If we say the average distance driven is 10 kilometres, and each student is driven to school individually, we get 1,467,550 kilometres. We can then multiply this by 2 (since they have to get to school and back again), to get a total distance driven of 2,935,100 kilometres, every day.
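
Spelled out in Python, with the 10-kilometre average and the one-student-per-car assumption from the paragraph above (the variable names are our own):

```python
STUDENTS_DRIVEN = 146_755   # car passengers in 2016, from CSO table E6010
AVERAGE_ONE_WAY_KM = 10     # guessed average driving distance to school

one_way_km = STUDENTS_DRIVEN * AVERAGE_ONE_WAY_KM
daily_km = one_way_km * 2   # there and back again

print(f"One way: {one_way_km:,} km")
print(f"Per day: {daily_km:,} km")
```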

Seems like a lot of kilometres driven to school. Then again, nearly 150,000 students are being driven to school - that's a lot of car journeys, every day.

3.4.2.1.3 Travel times to schools (E6016)

We can use the travel times table from the CSO, E6016, to estimate more closely the distance students need to travel to school. This dataset describes the times that students needed to leave home in order to get to school. If we assume that most schools start around 08:50, we will be able to calculate how long students are sitting in cars on their way to school. Then, we can guess an average velocity, and we can estimate the distance driven that way.

Below, we can see a representation of the data in this table.

travel-times-table.png
Figure 9: Travel times to school. Source: CSO

From this, we can see that the vast majority of students driving to school leave home between 07:30 and 09:00.

Let's make a few guesses here. I am guessing that anyone leaving for school after 08:30 gets there in ten minutes, that anyone leaving between 08:00 and 08:30 gets there in fifteen minutes, and that anyone leaving before that gets to school in thirty minutes. I also guess an average car travelling speed of 60km/h.

Let's collect our buckets of times, and then estimate the kilometres travelled from there. People travelling by car after 08:30 number 56,692 (1793+51197+3066+169+87+380). Between 08:00 and 08:30: 69,669. Before 08:00: 23,868.
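
The bucket arithmetic that follows is easy to get wrong by hand, so here is a small Python sketch of it. The student counts are the ones listed above. The journey times and the 60km/h speed are the guesses from this section, and the names are our own.

```python
# (label, students travelling by car, guessed minutes in the car)
BUCKETS = [
    ("after 08:30", sum([1793, 51197, 3066, 169, 87, 380]), 10),
    ("08:00 to 08:30", 69_669, 15),
    ("before 08:00", 23_868, 30),
]
AVERAGE_SPEED_KMH = 60  # guessed average speed; one kilometre per minute


def morning_minutes(buckets):
    """Total student-minutes spent in cars on the way to school."""
    return sum(students * minutes for _, students, minutes in buckets)


def minutes_to_km(minutes, speed_kmh):
    """Convert minutes of driving at a constant speed into kilometres."""
    return minutes * speed_kmh / 60


morning_km = minutes_to_km(morning_minutes(BUCKETS), AVERAGE_SPEED_KMH)
round_trip_km = morning_km * 2  # assume the return journey takes the same time

print(f"Morning:    {morning_km:,.0f} km")
print(f"Round trip: {round_trip_km:,.0f} km")
```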

Now, let's calculate minutes in the car. If our first bucket spends ten minutes in the car: 566,920 minutes. Our second bucket, fifteen minutes in the car: 1,045,035 minutes. Our third bucket, thirty minutes in the car: 716,040 minutes. That's a total of 2,327,995 minutes spent in the car.

Now, helpfully I've guessed an average travelling speed of 60 kilometres per hour. That means our "minutes spent in the car" cleanly converts into "kilometres driven", since 60 kilometres per hour is one kilometre per minute. This means that we've guessed 2,327,995 kilometres driven to school in the morning. If we assume the same journey time for the return, it's 4,655,990 kilometres in total per day (a journey to school and a journey returning).

There are, of course, several issues with this approach. Our buckets are very broad, and our guesses about average speed and time in the car are potentially very suspect. Even so, our guess here is in the same order of magnitude as our guess from the previous CSO table (2,935,100 kilometres). If we split the difference and go for the average of the two guesses, we get 3,795,545 kilometres driven to and from school every day.

3.4.3 Research Outcomes

  • We estimate that about 3.8 million kilometres are driven to and from school every day in Ireland.
  • The vast majority of students driving to school leave home between 07:30 and 09:00. This overlaps quite significantly with the peak times for accidents we discovered in the Accidents section.

3.5 How much traffic is due to the school run?

3.5.1 Introduction

It might be interesting to find out how much traffic is due to parents bringing their children to and from schools. This is relevant for us because we'd like to know how much of an impact on daily traffic the cycle to school scheme would have. This is quite closely related to how many kilometres are driven to school and some of the work might be a little duplicated.

3.5.2 Method

3.5.2.1 Fermi estimate

We can start out doing a Fermi calculation, which we've described elsewhere (see "How many bikes do we need?"). Basically, we're just going to sit and think for a minute and come up with a number that sounds kind of plausible. We're doing this because it can be helpful to have a kind of "sanity check" on the numbers we come out with at the end. It's always helpful to come up with, and then write down, some expectations and hypotheses. That way you can be happy at how smart you are (if you got a reasonable estimate) or intrigued at how surprising the real data actually is (if your estimate was shockingly far off).

We know that just under a million students are in primary and secondary schools in Ireland. We can assume that practically all of those actually need to attend the school every day (remote learning notwithstanding).

If we assume that there are a million students, let's estimate that 40% of those get to school via methods that aren't motor vehicles - walking, cycling, teleportation, and so on. So then, 600,000 students travel to school via the roads. Let's say half of these go via buses in groups of about 30. That's 300,000/30: 10,000 journeys for those students taking the bus. Let's imagine the rest all carpool in groups of three: 300,000/3: 100,000 journeys for those. That's 110,000 vehicles on the roads to bring students to school every morning, and 110,000 vehicles on the roads to bring them home in the evening. That's our baseline estimate. Of course, there are plenty of holes in this strategy - for example, people who drop their kids off on their way to work, so their car would still be on the road - but it's a rough idea.
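
Writing a Fermi estimate down as code makes the divisions hard to fumble and the guesses easy to change. A minimal sketch, using only the guesses from the paragraph above (the names are our own):

```python
STUDENTS = 1_000_000          # rounded total of primary and secondary students
NON_MOTORISED_PERCENT = 40    # guessed share walking, cycling, and so on
BUS_SHARE_PERCENT = 50        # guessed share of road travellers who take a bus
STUDENTS_PER_BUS = 30
STUDENTS_PER_CAR = 3


def fermi_vehicles(students, non_motorised_percent, bus_share_percent,
                   students_per_bus, students_per_car):
    """Return (buses, cars) needed for one school run, rounding up."""
    by_road = students * (100 - non_motorised_percent) // 100
    by_bus = by_road * bus_share_percent // 100
    by_car = by_road - by_bus
    buses = -(-by_bus // students_per_bus)   # ceiling division
    cars = -(-by_car // students_per_car)
    return buses, cars


buses, cars = fermi_vehicles(STUDENTS, NON_MOTORISED_PERCENT, BUS_SHARE_PERCENT,
                             STUDENTS_PER_BUS, STUDENTS_PER_CAR)
print(f"{buses:,} buses + {cars:,} cars = {buses + cars:,} vehicles per school run")
```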

3.5.2.2 Traffic during the summer holidays

Now that we have a baseline idea, let's take a look at some data. Luckily for us (and for students), we happen to close most of our schools in the summer. We can simply compare traffic levels during the summer holidays against traffic in the surrounding months. There are some pretty clear confounding factors here - for instance, families going abroad and being more mobile generally during the summer months. Still, it'll give us some interesting numbers.

Since we have a baseline guess, what we'd like to see is a drop in traffic volumes somewhere in the same order of magnitude.

Transport Infrastructure Ireland (TII) publishes a lot of really high quality open data, including the raw data for every traffic counter it has run since 2013 - that's over 300 traffic counters, every five minutes, for years. It's a really incredible amount of detail. In theory, we could look at traffic during the "school run" times specifically, which might lead to a better estimate. However, the datasets involved are absolutely massive - in the region of one gigabyte per day of data. Sadly, I simply don't have the bandwidth to download over a terabyte of data!

So, let's look at the daily aggregates instead. These files are a lot smaller, and they'll still have what we're really looking for. I'm going to be looking at the data for 2016, 2017, and 2018 - in 2019, their counters appear to have had a month-long outage, and for obvious reasons, 2020 and 2021 aren't suitable for a representative sample.

The data we're looking at contains the daily totals for each traffic counter across the country, and it includes the vehicle types. We're going to look only at cars.

Here's the data per year. In order to make it a little easier to see trends, I've used a 14-day moving average, meaning that every "day" is an average of the 14 surrounding days.
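
For anyone recreating this outside Mathematica, a moving average is short enough to write by hand. The sketch below slides a fixed window over the daily totals and averages each run of values. How each window is lined up against a calendar day is left open here, and the sample counts are made up.

```python
def moving_average(values, window):
    """Mean of every run of `window` consecutive values.

    Returns len(values) - window + 1 averages.
    """
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        # Slide the window forward by one day.
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


# Hypothetical daily car counts for 30 days (not TII data).
daily_counts = [1000 + 10 * day for day in range(30)]
smoothed = moving_average(daily_counts, window=14)
print(len(smoothed), "smoothed values; first:", smoothed[0])
```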

daily-14-day-moving-avg.png
Figure 10: Traffic counter 14 day moving average. Source: TII

It's interesting to see how much less traffic there was in 2016 - in some cases about a million fewer counts.

Let's split the data per month and remove the 14-day moving average. We'll be able to see differences and similarities per month more easily this way.

daily-by-month.png
Figure 11: Traffic counts by month for 2016, 2017, 2018. Source: TII

Well, Storm Ophelia and the Christmas-New Year's break are certainly quite clear in these two plots, but the plots don't really make it clear whether or not car traffic counts go down during the Summer holidays. Let's look at the average per month, against the average over the three years, and see if there is a marked decrease in the Summer months.
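
The comparison can be sketched in Python as follows: group the daily totals by calendar month, average each group, and divide by the overall mean so that a value below 1.0 means a quieter-than-average month. The data here is a made-up four-day sample and the function names are our own.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def monthly_means(daily_counts):
    """Average daily count for each calendar month (1-12)."""
    by_month = defaultdict(list)
    for day, count in daily_counts.items():
        by_month[day.month].append(count)
    return {month: mean(counts) for month, counts in sorted(by_month.items())}


def relative_to_overall(daily_counts):
    """Each month's mean divided by the mean over all days."""
    overall = mean(list(daily_counts.values()))
    return {month: value / overall
            for month, value in monthly_means(daily_counts).items()}


# Hypothetical daily totals (not TII data).
sample = {
    date(2017, 6, 1): 100,
    date(2017, 6, 2): 110,
    date(2017, 7, 1): 90,
    date(2017, 7, 2): 100,
}
print(relative_to_overall(sample))
```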

traffic-counter-mean-per-month.png
Figure 12: Traffic counter mean per month. Source: TII

Well, I think that's a resounding "there is not a marked decrease in the Summer months." This means we're not able to verify our estimate using this data. That's life! Still, it's incredible data and I look forward to looking at it again - it'll be really interesting to see the impact of COVID-19 on these counters, for instance.

3.5.3 Research Outcomes

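  • We estimated that roughly 110,000 vehicles (300,000/30 = 10,000 buses and 300,000/3 = 100,000 cars) are on the roads each morning to bring students to school, and again each evening to bring them home. However, the data we researched could not verify this estimate.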

3.6 How big is the cycling industry in Ireland?

3.6.1 Introduction

It might be valuable to determine the approximate size of the present cycling industry in Ireland. This area of business would be likely to see an increase in value if the scheme were implemented.

This industry includes bicycle shops and mechanics and internal cycle tourism, including greenways, the Wild Atlantic Way and so on.

It's valuable for us to consider this industry, since it would be the most impacted by a scheme like the Cycle to School Scheme.

3.6.2 Method

3.6.2.1 How many bicycle shops are there in Ireland?

The easiest way to get an estimate of the number of bike shops is to use Open Street Maps data, as we have done in several other research sections.

We can use a Sophox instance to query Open Street Maps data using SPARQL. A SPARQL query that simply counts all bike shops is the following:

SELECT (COUNT(?place) as ?count) WHERE {
  ?place osmt:shop "bicycle".
}

Our Sophox instance, described in the appendix, only covers the Island of Ireland, and so we don't need to add a bounding box or similar check.

This returns a count of 213, which indicates that there are 213 bike shops listed in Open Street Maps on the island of Ireland. That seems like quite a low number! Open Street Maps is unfortunately not an exhaustive database, as, like Wikipedia, it depends on member contributions.

A useful method when presented with a result like this is to ask: is that really a low number? If we think about it, I suppose I would expect there to be a bike shop in more or less every "big" town in Ireland, and an extra few dozen for Dublin, Galway, Limerick, and Cork. If we say 50 bike shops in those larger cities, then we're left with 163 bike shops for largish towns.

If we take a look at the CSO's population data for towns, taken from the 2016 Census (table E2052), we can see how many towns might be described as "big" towns. There are 873 towns in the dataset. Let's say that we think any town of 5,000 people or more will have a bike shop in it.

ds = SemanticImport["./Downloads/E2052.csv"][[;; -2]] (* the last row is the total, so we skip it*)

ds[Select[#"VALUE" > 5000 &] /* Length] (*86*)
ds[Select[#"VALUE" > 2500 &] /* Length] (*139*)

Apparently, there are 86 towns with more than 5,000 people in them. If we lower our threshold to 2,500 people, we get 139 towns. I live in a town of 2,700 people, and it's got a bike shop, so I'm going to assume that's the average (this method is called #datascience).
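
The same threshold count can be recreated in Python with only the standard library. This sketch runs on a tiny made-up sample rather than the real E2052 file: the "VALUE" column name is taken from the Mathematica snippet above, while the "Town" column and the sample rows are assumptions for illustration. With the real file, the final "total" row would need to be skipped, as it is above.

```python
import csv
import io

# Hypothetical sample in the assumed shape of the CSO export.
SAMPLE_CSV = """Town,VALUE
Town A,12000
Town B,5400
Town C,2700
Town D,900
"""


def towns_above(rows, threshold):
    """Number of rows whose VALUE (population) exceeds the threshold."""
    return sum(1 for row in rows if int(row["VALUE"]) > threshold)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print("Over 5,000:", towns_above(rows, 5000))
print("Over 2,500:", towns_above(rows, 2500))
```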

Here's a histogram of town populations in Ireland. Here I've restricted the range to populations of less than ten thousand people, because there are only 46 places with populations greater than that (as opposed to 827 with smaller populations), and they make it hard to see detail in the chart.

town-population-histogram.png
Figure 13: Histogram of town populations in Ireland up to 10,000 people. Source: CSO

(If I wanted to include Dublin in that chart, it would be so far over to the right that you couldn't see any of the rest of the data. The difference in order of magnitude between Dublin's population and practically anywhere else in Ireland is hard to comprehend.)

Having looked at that town population data, I suppose I no longer think that 213 bike shops is a really low number. It's possible that it's still way off, but I'm fairly confident that the maximum number of bike shops in Ireland is somewhere less than 500.

3.6.2.2 Planning Permission data

One of the most interesting and most wide-ranging datasets on data.gov.ie is the planning permission dataset. Nearly all planning permission applications for the last several years are included in this dataset, including short descriptions of the work, when the application was put in, where the work was set to be done, and so on. It's a really valuable dataset. We can even use it here!

We can start by simply looking for "bike" or "bicycle" in the dataset and see how often these words are mentioned, and when. There are nearly 3,000 mentions of those words in the dataset!

We can see that mentions of our keywords increase a fair bit from 2010, particularly in 2013 and 2014. The counts may have been impacted a little by the late 2000s financial crisis. It's also useful to remember that the Cycle to Work scheme was started in 2009, so some more focus will have been put into bike parking at workplaces and so on from then.

pp-mention-timeline.png
Figure 14: Counts of mentions of 'bicycle' or 'bike' in Irish planning applications per six months
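A timeline like the one above can be built by bucketing matching applications into half-years. The following is a rough sketch rather than the code behind the figure: the (date, description) row layout, the function names and the sample rows are all assumptions, and it counts applications that mention a keyword at least once rather than every individual mention.

```python
import re
from collections import Counter
from datetime import date

# Matches "bike", "bikes", "bicycle" or "bicycles" as whole words, in any case.
KEYWORDS = re.compile(r"\b(bike|bicycle)s?\b", re.IGNORECASE)


def half_year(d):
    """Map a date to a (year, half) bucket, where half is 1 or 2."""
    return (d.year, 1 if d.month <= 6 else 2)


def mentions_per_half_year(applications):
    """Count applications mentioning a keyword, grouped by half-year.

    `applications` is an iterable of (received_date, description) pairs;
    this layout is hypothetical, not the real dataset schema.
    """
    counts = Counter()
    for received, description in applications:
        if KEYWORDS.search(description):
            counts[half_year(received)] += 1
    return counts


# Made-up rows for illustration only.
sample_applications = [
    (date(2013, 3, 1), "Construction of office block with 20 bicycle parking spaces"),
    (date(2013, 5, 10), "Extension to rear of dwelling"),
    (date(2013, 9, 2), "Retail unit including bike store"),
    (date(2014, 1, 15), "Apartments with covered Bike parking"),
]

for bucket, count in sorted(mentions_per_half_year(sample_applications).items()):
    print(bucket, count)
```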

We can also look at this data geographically, showing where the most mentions of these keywords are:

pp-mention-map.png
Figure 15: Map of mentions of 'bicycle' or 'bike' in Irish planning applications

Since that's a bit crowded, here it is in a geo-histogram. Each hexagon is lighter-coloured if it has fewer mentions, and darker coloured if it has more.

pp-mention-histogram.png
Figure 16: Geo-histogram of mentions of 'bicycle' or 'bike' in Irish planning applications

Now onto the really cool stuff - mining the descriptions of work to be carried out for more detailed knowledge. Since there are almost 3,000 references to "bike" or "bicycle", it isn't really feasible to read through every single one of those to determine whether a bike shop is being built.

One useful method here is to use n-grams (sometimes written engrams). An n-gram is a sequence of n consecutive words in a corpus. For example, the most common 2-gram (or bigram) in the Wikipedia page summary for "N-grams" is {an, ngram}. The most common 3-gram, or trigram, in the whole Wikipedia page for "recursion" is {in, terms, of} with 7 instances.

Now, let's look at the most common n-grams in the planning permission dataset, where one of the grams is "bicycle" or "bike". I've changed all mentions of "bicycle" to "bike" in the dataset in order to count duplicates together ({no, bike, parking} and {no, bicycle, parking} mean the same thing, after all).

pp-trigrams-count.png
Figure 17: The most common trigrams in the planning permission dataset mentioning bicycles
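The trigram counting described above can be sketched in a few lines of Python. This is an illustration of the technique, not the code used for the figure: the function names and sample descriptions are made up, and folding the plurals "bikes" and "bicycles" into "bike" is an extra assumption on top of the bicycle-to-bike substitution mentioned in the text.

```python
import re
from collections import Counter

# Words that are all counted as "bike" (plural handling is an assumption).
BIKE_WORDS = {"bicycle", "bicycles", "bikes"}


def tokenize(text):
    """Lower-case the text, split it into words and normalise bike words."""
    words = re.findall(r"[a-z]+", text.lower())
    return ["bike" if w in BIKE_WORDS else w for w in words]


def ngrams(words, n):
    """Return every run of n consecutive words as a tuple."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def bike_trigram_counts(descriptions):
    """Count trigrams that contain the word "bike" across all descriptions."""
    counts = Counter()
    for description in descriptions:
        for gram in ngrams(tokenize(description), 3):
            if "bike" in gram:
                counts[gram] += 1
    return counts


# Made-up descriptions for illustration only.
sample_descriptions = [
    "Office block with no bicycle parking",
    "Apartments with no bike parking provided",
]

print(bike_trigram_counts(sample_descriptions).most_common(3))
```

Trigrams are built per description, so a trigram never spans the end of one application and the start of the next.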

Well, that's fairly clear, but just for the sake of being able to use a word cloud: here's a word cloud that really shows the common theme in these n-grams.

pp-wordcloud.png
Figure 18: Word cloud of words in trigrams containing "bicycle" or "bike"

Parking. It seems like the vast, vast majority of bicycle related infrastructure being built is bike parking. Sadly, it's not clear that we found any references to bike shops at all - mostly, we saw reference to bike parking. It seems that a large number of new developments include bike parking - a good sign, although it's not what we're looking for.

3.6.3 Research Outcomes

  • there are somewhere between 200 and 500 bike shops in Ireland, indicating a reasonably healthy industry
  • the creation of bicycle-related infrastructure is growing significantly

3.7 How many accidents involving cyclists are there?

3.7.1 Introduction

For the scheme to be feasible, we must show that the accident rate involving cyclists is low in Ireland. If the accident rate involving cyclists is quite high, the scheme is unlikely to succeed as cycling will be too dangerous for students - especially considering that many bicycles would be added to roads at peak times. However, it is important to note that "peak times" are at least partially so because of journeys to school, and so the scheme might ameliorate that to some degree. It's difficult to estimate with any certainty how much of the traffic is due to journeys to school, although we give it a try in the section titled "How much traffic is due to the school run?".

3.7.2 Method

We explored various open datasets listed on data.gov.ie, particularly data from the CSO, who list accident data per year.

One of our datasets included accident totals by hour of day and by year, which meant that we could see how the rate of accidents has changed over the years and what time is most and least dangerous on the roads.

hour-of-day-by-year.png
Figure 19: Road traffic accidents by hour of day and year. Source: CSO

It is really interesting to see this trend based on the hour. There's a peak in the morning, between 0700 and around 1000, and then a larger peak in the evening around 1700. There are a lot of easy-to-make and difficult-to-prove hypotheses about this trend - for example, perhaps there are more accidents at 1700 because people are more impatient to get home than they are to get to work in the mornings, or perhaps people are more tired in the evenings than in the mornings. In any case, it's a really interesting trend. We also have a yearly component to this data: the charts are drawn so that older years are more faded. Interestingly, even though we haven't done any normalisation (for example, accidents per capita), the trends all match each other very well.
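Picking out the morning and evening peaks from hourly totals like these is a small calculation. The sketch below assumes the data has been reduced to one count per hour of the day; the function name and the numbers are invented for illustration and are not the CSO values.

```python
def peak_hour(counts_by_hour, start, end):
    """Return the hour in the range [start, end) with the most accidents."""
    return max(range(start, end), key=lambda hour: counts_by_hour[hour])


# Made-up accident counts for hours 0 to 23; not taken from the CSO data.
sample_counts = [
    20, 15, 12, 10, 10, 18, 40, 90, 120, 95, 80, 85,
    90, 95, 100, 110, 130, 150, 125, 90, 70, 55, 40, 30,
]

print("morning peak hour:", peak_hour(sample_counts, 6, 12))
print("evening peak hour:", peak_hour(sample_counts, 12, 24))
```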

hour-of-day-by-year-fatal.png
Figure 20: Fatal road traffic accidents by hour of day and year. Source: CSO

This is quite interesting - despite the fact that we have a very clear trend across all accidents, when we zoom in to just fatal accidents, the trend is quite a lot harder to pick out. This is mostly to do with the sheer drop in the number of fatal accidents. In 2018, we had fewer than half the number of fatal accidents that we had in 2005. We can see this easily if we plot the total fatal accidents over the years, ignoring the hourly component:

by-year-fatal.png
Figure 21: Fatal road traffic accidents by year. Source: CSO

The above figure is really encouraging, as it shows a major drop in the number of fatal accidents since 2005. You can imagine all kinds of reasons for this change, and especially since it involves a driving population of millions, no one single reason will apply. Regardless, it's a very positive change. However, the total number of accidents on Irish roads doesn't quite show the same magnitude of change:

by-year-total.png
Figure 22: Total road traffic accidents by year. Source: CSO

I can't come up with a good explanation for the noisiness in the data here. There doesn't appear to be a clear trend. I suppose the safest thing to take from this is that the total is around 5800 ± 1000 accidents per year.
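One simple way to arrive at a "centre ± spread" summary like this is to take the midpoint of the smallest and largest yearly totals and half the gap between them. This is only a sketch of that idea; the yearly totals below are made up (chosen to span the same range) and the function name is hypothetical.

```python
def centre_and_spread(yearly_totals):
    """Return (midpoint, half-range) of a list of yearly totals."""
    lowest, highest = min(yearly_totals), max(yearly_totals)
    return (lowest + highest) / 2, (highest - lowest) / 2


# Made-up yearly totals for illustration only; not the CSO figures.
sample_totals = [4800, 6100, 5500, 6800, 5900]

centre, spread = centre_and_spread(sample_totals)
print(f"{centre:.0f} ± {spread:.0f}")
```

A midpoint and half-range are sensitive to a single unusual year, so a mean and standard deviation would be the more conventional choice for a more rigorous report.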

Since we're talking about the cycle to school scheme, let's also look at the number of accidents involving cyclists.

cyclist-fatal.png
Figure 23: Fatal traffic accidents involving cyclists by hour of day and year. Source: CSO

Also not a hugely encouraging trend, although the numbers are fairly low.

Training is critical in preventing accidents. We can see an established training platform in the form of Cycle Right, with 3 different stages in training for bicycle safety. They have a number of approved training instructors in multiple counties (seen here). The courses begin off road, and teach the basics of cycling, stopping and safety, progressing to road junctions, roundabouts and hazard protection over the 3 stages. These stages could easily be implemented during summer or midterm holidays.

We can see a pilot "Cycle Bus" scheme in action in Strandhill National School. The scheme was introduced in September 2020, and has parents cycle with children to school in the morning. An extract from their news section on their website is as follows:

We are the first school in Sligo to have a daily Cycle Bus whereby parents and adult volunteers supervise a group of children to cycle safely to school. The success of this healthy and environmentally-friendly idea was recognised by Minister of State for Public Health & Well Being, Frank Feighan, who took part in the Cycle Bus on Monday 20th April. Well done to the parents and children involved.

According to Mr. Fogarty from Strandhill National School, the parents' association works together with Sligo Cycling Campaign to organise the Cycle Bus, with an average ratio of 5:1 students to adults. An average of 25 students cycle to school each day, out of a maximum of 50 students who could cycle or have access to bikes, and they have had no accidents.

Finally, as part of the Safe Routes to School Programme created by Green School Ireland, funding of €15 million is available in 2021 for safe routes to school. A mixture of primary, post-primary, and other schools are investing in cycle lanes, footpaths, and bicycle storage to encourage students to walk/cycle to school. This will build upon existing infrastructure and increase safety for cyclists.

3.7.3 Research Outcomes

  • We can see peak accident times as 0800 to 0900 and 1600 to 1800, which is to be expected with peak traffic times. Unfortunately, this coincides with school opening times.
  • It appears that training providers are already established in the field. Working with training providers would likely lower the safety risk.
  • Government funding already exists to improve cycling and walking infrastructure.
  • A case study of Strandhill National School's Cycle Bus suggests that accidents are unlikely in rural or semi-rural locations, especially with adult supervision.

4 Conclusion

4.1 Research Outcomes

  • Approximately 370,000 bicycles would be needed to cover all secondary schools in Ireland at any point.
  • Assuming 10% loss per year, over the first 10 years of the scheme 700,000 bicycles would need to be supplied.
  • A little over one quarter of schools are within 5km of a bicycle shop, enabling easier maintenance of stock
  • About one quarter of schools might be ready for the scheme
  • There are somewhere between 200 and 500 bike shops in Ireland, indicating a reasonably healthy industry
  • The creation of bicycle-related infrastructure is growing significantly
  • Peak accident times involving cyclists are 0800 to 0900 and 1600 to 1800
  • The vast majority of students driving to school leave home between 07:30 and 09:00.
  • Government funding already exists to improve cycling and walking infrastructure.
  • A case study of Strandhill National School's Cycle Bus suggests that accidents are unlikely in rural or semi-rural locations, especially with adult supervision.
  • Training providers are already established in the field. Working with training providers would likely lower the safety risk.
  • We estimate that about 3.6 million kilometres are driven to and from school every day in Ireland.
  • The initial cost of the full scheme for secondary school students might be around 80,000,000 euros.
  • A pilot scheme supplying 5% of secondary school students might cost around 5,000,000 euros.
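The ten-year supply figure in the list above can be roughly reproduced with some simple arithmetic. The sketch below is one possible reading, not necessarily how the figure was derived: it assumes an initial fleet of 370,000 bicycles, a loss of 10% of that fleet each year, and replacements bought at the end of every year except the last. The function name is hypothetical.

```python
def bicycles_supplied(fleet_size, annual_loss_rate, years):
    """Initial fleet plus replacements bought after each year but the last.

    Assumes the fleet is topped back up to full size every year, so the
    same number of bicycles is lost (and replaced) each time.
    """
    replacements_per_year = round(fleet_size * annual_loss_rate)
    return fleet_size + replacements_per_year * (years - 1)


total = bicycles_supplied(fleet_size=370_000, annual_loss_rate=0.10, years=10)
print(f"{total:,} bicycles over 10 years")
```

Under these assumptions the total comes out a little above 700,000, which is in line with the rounded figure in the list.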

The list of questions we did not get to answer is extremely lengthy. There are a lot of long-term effects a scheme like the Cycle to School Scheme might have, but unfortunately we didn't get to discuss any of those in depth. Finally: we are not scientists, and our results should not be seen as comprehensive, exhaustive, or scientifically rigorous. If you happen to base a multi-million euro government scheme on this document alone, that's on you.

5 Appendix

5.1 Code attributions

Org-mode CSS theme by Abhinav Tushar, licensed under GPL 3.0. (Original source; edited source)

Org-mode anchor links helper by alphapapa, licensed under GPL 3.0. (Original source; edited source)

Javascript image zoom library by Desmond Ding, licensed under MIT. (Source)

htmlize by Hrvoje Nikšić, licensed under GPL 3.0. (Source)

dash.el by Magnar Sveen, licensed under GPL 3.0. (Source)

sparql-mode by Bjarte Johansen, licensed under GPL 3.0. (Source)

wolfram-mode by kawabata, licensed under GPL 3.0. (Source)

5.2 Master list of data sources

The following is a list of all of the datasets used in the making of this report. Some datasets may not have been directly used in visualisations or report text, but are relevant and provided background information to us. A big thank you to all of the data providers.

| Name | License | Copyright | Link |
|------+---------+-----------+------|
| 2016 Population 1km2 Grid | CC-BY | Central Statistics Office | Source |
| E2052: Population and Birthplace 2016 | CC-BY | Central Statistics Office | Source |
| E6010: Travel Methods to schools | CC-BY | Central Statistics Office | Source |
| E6016: Travel times to schools | CC-BY | Central Statistics Office | Source |
| National Planning Applications | CC-BY | Department of Housing, Local Government, and Heritage | Source |
| Open Street Maps | ODbL | OpenStreetMap contributors | Source |
| Post Primary Schools | CC-BY | All-Island Research Observatory | Source |
| Primary Schools | CC-BY | All-Island Research Observatory | Source |
| ROA06 | CC-BY | Central Statistics Office | Source |
| ROA09 | CC-BY | Central Statistics Office | Source |
| ROA17: Traffic Collisions and Casualties | CC-BY | Central Statistics Office | Source |
| ROA19: Traffic Collisions and Casualties | CC-BY | Central Statistics Office | Source |
| Small Areas (Ungeneralised) | CC-BY | Ordnance Survey Ireland | Source |
| THA17: Traffic Volumes | CC-BY | Central Statistics Office | Source |
| Traffic Counter Volumes | CC-BY | Transport Infrastructure Ireland | Source |

5.3 Sophox, a linked data store for Open Street Maps data

In the process of writing this report, we frequently made use of linked data, particularly linked data related to Open Street Maps. We set this up using Sophox (https://github.com/sophox/sophox), an open source project bringing together several different projects to enable SPARQL querying of Open Street Maps data.

Normally, the easiest way to query Open Street Maps data is using Overpass, a domain-specific language and API. Those queries can look like:

[out:json][timeout:25];
(
  node["shop"="bicycle"];
  way["shop"="bicycle"];
  relation["shop"="bicycle"];
);
out body;
>;
out skel qt;

Personally, having used SPARQL quite a lot with WikiData and Data.gov.ie's catalog.ttl file and so on, I find it a bit more comfortable to query OSM using it.

A similar query in SPARQL, this time counting the matching places rather than listing them, looks like the following:

SELECT (COUNT(?place) as ?count) WHERE {
  ?place osmt:shop "bicycle".
}
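For anyone who wants to send a query like this from a script, the standard SPARQL protocol accepts the query text as a `query` parameter on a GET request. The sketch below only builds the request URL and does not contact any server. The endpoint address is a placeholder, not the address of our Sophox instance, and the query relies on the endpoint predefining the `osmt:` prefix, which is an assumption; other endpoints would need an explicit PREFIX line.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint for illustration; substitute a real SPARQL endpoint.
ENDPOINT = "https://example.org/sparql"

# Assumes the endpoint already knows the osmt: prefix.
QUERY = """
SELECT (COUNT(?place) AS ?count) WHERE {
  ?place osmt:shop "bicycle".
}
"""


def build_sparql_url(endpoint, query):
    """Build a GET URL carrying the query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_sparql_url(ENDPOINT, QUERY)
print(url)

# Decoding the URL again should give back exactly the query we put in.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY.strip())
```

When actually sending the request, an `Accept: application/sparql-results+json` header asks the endpoint for results in the standard JSON format.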

As you can see, these queries are about as difficult in one language as the other. However, the really useful thing about SPARQL is that it's specifically useful for linked data. If we link together, for instance, the Open Street Maps object for "County Roscommon" and the CSO's object for "County Roscommon", we could construct a query asking for all CSO statistics relevant to a certain area ("County Roscommon"). Then, we could also connect this object to WikiData's object for "County Roscommon", and do some sort of crazy query asking for the locations of important musicians born in County Roscommon (from WikiData), the population of the county at the time they were born (from the CSO), and the geometry of the area they were born in (from Open Street Maps). Obviously, this is quite a contrived example, but it shows that linked data can be quite a powerful tool.