Aviation Incidents in Canada

Visualizing 80 years of data

Mounir Kara Zaitri
Jun 3, 2021
Courtesy of Global News Canada

As stated by the International Civil Aviation Organization (ICAO), safety is the highest priority of all involved in aviation. The shared goal is for every flight to take off and land safely, as happens more than 126,000 times every day. In 2018, the fatal accident rate was 0.28 per 1 million flights, the equivalent of one fatal accident for every 4.2 million flights.

The aviation industry is a complex collaboration of multiple fields, from manufacturers to commercial airlines. Air traffic is regulated and managed by different agencies and service providers.

In Canada, the aviation industry is regulated by Transport Canada. Air traffic services are provided by Nav Canada.

Data collection and analysis are key factors in safety management systems. They support investigations and help define efficient regulations and procedures.

In this article, we will analyze aviation occurrence data provided by Transport Canada. Data preparation and other details are available on my GitHub.

We will investigate the following points:

  • Where did the incidents happen?
  • What are the principal causes of aviation events?
  • When did the incidents happen?
  • What are the categories and consequences of accidents and incidents?
  • What is the impact of environmental factors on aviation safety?
  • Which flight phases are more dangerous?
  • Which aircraft are more likely to be involved in events?
  • How do flight plans, search and rescue operations impact aviation safety?

Before diving into the data, let’s define some terminology:

  • Occurrence: Any event which is irregular, unplanned, or non-routine, including any aircraft accident, incident, or other occurrence.
  • Accident: An occurrence associated with the operation of an aircraft which takes place between the time any person boards the aircraft with the intention of flight until all such persons have disembarked, in which:
    1- a person is fatally or seriously injured
    2- the aircraft sustains damage or structural failure
    3- the aircraft is missing or is completely inaccessible.
  • Incident: An occurrence, other than an accident, associated with the operation of an aircraft that affects or could affect the safety of operation.
  • Serious incident: An incident involving circumstances indicating that an accident nearly occurred.

Datasets used in this study

The data is divided into five (5) data frames:

  • Occurrence table: This table contains data on the occurrence summary, including the date, time, and location of the occurrence, the occurrence type and category, the occurrence classification, the aircraft involved, the number of injuries/or fatalities, the weather conditions, and data relating to the landing and takeoff aerodrome or operating surface.
  • Aircraft table: This table contains data about the involved aircraft, including its type, make, model, registration, and country of registration, aircraft’s engine(s), propellers, and rotors, data relating to an explosion, fire, fumes, and/or smoke, operator information including the type of operator, type of flight plan, flight number, departure and destination, and air traffic service involvement.
  • Injuries table: This table contains data on the number and severity of injuries resulting from the occurrence.
  • Events and phases table: This table contains data about the phases of the occurrence flight and the events during the flight.
  • Survivability table: This table contains data relating to the evacuation of the occurrence aircraft, the effectiveness of survival devices, and the systems for locating the occurrence aircraft.

All these tables are available as CSV files. The data_dictionary file contains the definitions for each column in the data frames.

Here are the URLs that we used to download the data frames:

The occurrences are described using more than 500 parameters. After looking into data_dict_df, we decided to keep the following columns for this study:

Using the Pandas library, we load the selected columns of each data frame. The files are read with the ISO-8859-1 encoding.
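A minimal sketch of this loading step, assuming a small in-memory sample in place of the real CSV files. Only OccNo and OccDate are column names taken from this article; the Location and Unused columns and the sample rows are hypothetical and serve only to show how column selection and the encoding work together.

```python
import io

import pandas as pd

# Hypothetical sample standing in for one of the Transport Canada CSV files.
# It is encoded as ISO-8859-1 so that accented characters need that encoding.
raw_bytes = (
    "OccNo,OccDate,Location,Unused\n"
    "A01,2001-07-14,Montréal,x\n"
    "A02,2003-01-02,Québec,y\n"
).encode("ISO-8859-1")

# Keep only the columns selected for the study (illustrative selection).
selected_columns = ["OccNo", "OccDate", "Location"]

occurrence_df = pd.read_csv(
    io.BytesIO(raw_bytes),
    usecols=selected_columns,
    encoding="ISO-8859-1",
)
print(occurrence_df)
```

With the real data, the `io.BytesIO(...)` argument would be replaced by the path or URL of the CSV file. Reading only the needed columns keeps memory usage low when a file has hundreds of parameters.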

Each occurrence has a unique number, stored in the column OccNo. This column is shared by all files and will be used as a reference. Some occurrences appear multiple times, because each row reports a different piece of information. We start by creating two copies of each table: a full version that contains duplicates, and a clean version without duplicates.
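The two versions could be built as in the sketch below. The sample rows and the AircraftMake column are hypothetical, and keeping the first row of each occurrence is an assumption, since the choice of which duplicate to keep is not stated here.

```python
import pandas as pd

# Hypothetical rows: occurrence A01 appears twice with different details.
aircraft_df = pd.DataFrame(
    {
        "OccNo": ["A01", "A01", "A02"],
        "AircraftMake": ["CESSNA", "PIPER", "BOEING"],  # illustrative column
    }
)

# Full version: every row is kept, including repeated occurrence numbers.
aircraft_full_df = aircraft_df.copy()

# Clean version: one row per occurrence (assumption: keep the first one).
aircraft_clean_df = aircraft_df.drop_duplicates(subset="OccNo", keep="first").copy()

print(len(aircraft_full_df), len(aircraft_clean_df))
```

The full version is the right one for questions about individual aircraft or events, while the clean version avoids counting the same occurrence more than once.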

The first thing to know is how many unique occurrences are reported in these data frames.

Where did the incidents happen?

Aviation occurrences are classified by geographic area:

The first plot shows the breakdown of events by region:
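A breakdown like this one can be computed with a simple count per category. The sketch below uses toy data, and Region is a hypothetical column name.

```python
import pandas as pd

# Toy data for illustration only; "Region" is a hypothetical column name.
occurrence_df = pd.DataFrame(
    {
        "OccNo": ["A01", "A02", "A03", "A04", "A05"],
        "Region": ["ONTARIO", "QUEBEC", "ONTARIO", "FOREIGN", "ONTARIO"],
    }
)

# Count occurrences per region, largest first.
events_by_region = occurrence_df["Region"].value_counts()

# Share of all occurrences per region, as a percentage.
share_by_region = (events_by_region / events_by_region.sum() * 100).round(1)

print(events_by_region)
print(share_by_region)
```

If matplotlib is installed, `events_by_region.plot.barh()` would draw the counts as a horizontal bar chart.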

More events were reported in ONTARIO than in any other region of Canada. This is most likely due to the higher traffic density in Ontario, which is the most populous province in Canada. The FOREIGN region refers to occurrences that happened outside Canada and involved Canadian aircraft (manufactured, operated, etc.). Let’s look at the countries where occurrences were reported:

What are the principal causes of aviation events?

Each event is categorized by ICAO, according to its principal cause. The following plot shows the breakdown of events:

System failures (engine-related or not) are the most common causes of occurrences, accounting for about 50% of events.

When did the incidents happen?

Date information requires some transformations to extract interesting insights. Using OccDate data, we create 4 new columns:
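A sketch of this transformation is shown below. OccDate is the column named above, but the four derived columns (Year, Month, Day, Weekday) are an assumption about which features are useful, and the sample values are made up.

```python
import pandas as pd

# Toy dates for illustration; OccDate is the column named in the article.
occurrence_df = pd.DataFrame(
    {"OccNo": ["A01", "A02"], "OccDate": ["2001-07-14", "not a date"]}
)

# Parse the text dates; unparseable values become NaT instead of raising.
occ_date = pd.to_datetime(occurrence_df["OccDate"], errors="coerce")

# Four derived columns (hypothetical names and choice of features).
occurrence_df["Year"] = occ_date.dt.year
occurrence_df["Month"] = occ_date.dt.month
occurrence_df["Day"] = occ_date.dt.day
occurrence_df["Weekday"] = occ_date.dt.day_name()

print(occurrence_df)
```

Using `errors="coerce"` keeps rows with malformed dates in the table with missing values, so they can still contribute to analyses that do not depend on the date.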

Now, we can display the number of events reported every year.
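The yearly count could be computed as in this sketch, which assumes a Year column has already been derived from OccDate. The data is a toy sample, and the 1965 cutoff is the one applied to the plot in this article.

```python
import pandas as pd

# Toy data for illustration only; "Year" is a hypothetical derived column.
occurrence_df = pd.DataFrame(
    {
        "OccNo": ["A01", "A02", "A03", "A04", "A05"],
        "Year": [1950, 1990, 1990, 2005, 2005],
    }
)

# Drop the sparse early years, then count distinct occurrences per year.
recent_df = occurrence_df[occurrence_df["Year"] >= 1965]
events_per_year = recent_df.groupby("Year")["OccNo"].nunique()

print(events_per_year)
```

Counting distinct OccNo values rather than rows gives the same result whether the full or the deduplicated table is used.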

We have excluded events before 1965 due to the low number of reports. The number of occurrences seems to have been steady since the 1990s, with a slight downward tendency. A meaningful comparison should include the total number of aircraft movements. Using data provided by Statistics Canada, we can observe that the number of movements is not increasing (at least since 1997, if we exclude 2020 and 2021 due to the COVID-19 crisis).

Note that this graph only shows the number of movements at airports and not the number of passengers. Taking the two plots together, we could infer that the number of events is roughly constant relative to traffic density.

In the following plot we answer the question: are there any periods of the year where more events take place?

And the answer is yes! It turns out that more events happen during the summer months. This is likely due to a higher density of traffic during summer.

Can we have similar deductions if we consider weekdays?

The result shows that there is no obvious correlation between the number of events and weekdays. Later on, we will show that some categories of incidents happen more frequently during weekends.

In aviation, time is normally given in Coordinated Universal Time (UTC, often equated with GMT). In our dataset, some of the events are not reported in UTC but in local time. Local time is useful for getting an intuition about the time of day (day or night). Let’s take a look at the breakdown of these occurrences.

Again, Pandas is used to extract hour information from the dataset:
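A sketch of the hour extraction is shown below. The OccTime column name and its HH:MM text format are assumptions, and the values are made up.

```python
import pandas as pd

# Toy data for illustration; "OccTime" and its HH:MM format are assumptions.
occurrence_df = pd.DataFrame(
    {"OccNo": ["A01", "A02", "A03"], "OccTime": ["14:35", "02:10", None]}
)

# Parse the time strings; missing or malformed values become NaT.
occ_time = pd.to_datetime(occurrence_df["OccTime"], format="%H:%M", errors="coerce")

# Hour of the day as a nullable integer (0 to 23, <NA> when unknown).
occurrence_df["Hour"] = occ_time.dt.hour.astype("Int64")

# Number of occurrences per hour, in chronological order.
events_per_hour = occurrence_df["Hour"].value_counts().sort_index()
print(events_per_hour)
```

The nullable `Int64` type keeps the hours as whole numbers even when some times are missing, which makes the resulting axis labels easier to read.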

It turns out that most occurrences happen during the daytime. Traffic density is higher during the day, which increases the exposure to incidents.

Accidents and incidents

In this part, we look at the split between incidents and accidents.

Statistically, there are more incidents (not involving fatal injuries, or aircraft damage) than accidents. With that being said, about 40% of events are categorized as accidents. Let’s see how this rate is evolving with time.
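The share of accidents per year can be sketched as follows. Year and OccClass are hypothetical column names, and the rows are toy data chosen only to make the arithmetic easy to follow.

```python
import pandas as pd

# Toy data for illustration; "Year" and "OccClass" are hypothetical columns.
occurrence_df = pd.DataFrame(
    {
        "Year": [1990, 1990, 1990, 1990, 2010, 2010, 2010, 2010],
        "OccClass": [
            "ACCIDENT", "ACCIDENT", "INCIDENT", "INCIDENT",
            "ACCIDENT", "INCIDENT", "INCIDENT", "INCIDENT",
        ],
    }
)

# A boolean column averages to the fraction of True values in each group.
is_accident = occurrence_df["OccClass"].eq("ACCIDENT")
accident_rate = is_accident.groupby(occurrence_df["Year"]).mean() * 100

print(accident_rate)
```

In this toy sample, two of four events in 1990 are accidents (50%) against one of four in 2010 (25%).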

The result is encouraging! The rate of accidents among the reported events is decreasing. This means that a smaller share of events involves fatal injuries or substantial damage.

Let’s focus now on the accidents, and plot the number of deaths reported every year:

The general trend is downward (which is good news). There are still some peaks in this graph that require more investigation. Those values are due to crashes that caused a high number of fatal injuries. Let’s look for more details by searching for the worst accidents, by death count:
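A search of this kind can be written with `nlargest`. In the sketch below, TotalFatalCount is a hypothetical column name and the rows are invented.

```python
import pandas as pd

# Toy data for illustration; "TotalFatalCount" is a hypothetical column name.
occurrence_df = pd.DataFrame(
    {
        "OccNo": ["A01", "A02", "A03", "A04"],
        "Year": [1970, 1980, 1990, 2000],
        "TotalFatalCount": [0, 12, 3, 40],
    }
)

# The three occurrences with the highest death counts, worst first.
worst_accidents = occurrence_df.nlargest(3, "TotalFatalCount")

print(worst_accidents[["OccNo", "Year", "TotalFatalCount"]])
```

This should be run on the deduplicated table, otherwise an occurrence reported on several rows could appear more than once in the ranking.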

We can see that the top 3 accidents happened in the years 1985, 1991 and 1998. Let’s get more information about these accidents.

Statistically, those events significantly affect the properties of the dataset. For instance, here is a plot of the average fatal injury count for each month:
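The sketch below illustrates why a single catastrophic event can dominate a monthly average. It compares the mean with the median on toy data; the column names and values are hypothetical.

```python
import pandas as pd

# Toy data for illustration; column names and values are hypothetical.
occurrence_df = pd.DataFrame(
    {
        "Month": [6, 6, 6, 7, 7, 7],
        "TotalFatalCount": [1, 0, 2, 1, 0, 200],
    }
)

# Mean and median fatal injuries per month, side by side.
monthly_stats = occurrence_df.groupby("Month")["TotalFatalCount"].agg(["mean", "median"])

print(monthly_stats)
```

Here one event with 200 fatalities lifts the July mean to 67 while the median stays at 1, the same as in June. Reporting the median next to the mean is one way to check whether a peak reflects a general pattern or a single outlier.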

We can see peaks in July, September, and December, the months in which the three worst accidents happened.

We perform the same analysis on the serious injuries column. The results are shown in the following plot:

New technologies and new regulations are improving the safety of air travel, along with more sophisticated search and rescue services. Interestingly, the three worst accidents shown earlier didn’t affect the number of serious injuries. Sadly, these catastrophic events left very few survivors.

We observe a curious relationship between the average number of serious injuries and the weekdays:

It appears that more serious injuries happen on weekends. More investigation is required to determine the real reasons for such a trend.

Impact of environmental factors on aviation safety

Multiple environmental variables could affect the progress of a flight. The first parameter we consider is the light condition.

Let’s consider sky conditions impact:

Based on this data, it appears that occurrences happen more often in a clear sky, in daylight.

Another weather-related parameter is the distinction between visual (VMC) and instrument (IMC) meteorological conditions. Under Canadian regulations, this refers mainly to whether visibility is at least 3 statute miles. The definition can vary in other countries.

Events happen more often in IMC, where visibility is low. In aviation, visibility is described using two parameters: ceiling (vertical visibility) and visibility (horizontal visibility).

Aviation accidents and incidents are more likely to happen in low visibility conditions. These conditions are usually referred to as marginal and can change quickly. They affect mainly VFR (visual flight rules) aircraft that are not certified and equipped to fly using IFR (instrument flight rules).

Many other weather phenomena can affect safety as shown in the following plot.

The top 3 weather conditions affecting aviation are icing, obscuration, and precipitation. Turbulence is a very serious concern, especially when it is categorized as moderate or severe. Among the less frequent but most dangerous phenomena are wind shear, microbursts, and lightning. These three conditions are difficult to forecast and can cause substantial damage in a short time.

Which flight phases are more dangerous?

The flight phases have a key influence on the outcome of events. Let’s take a look at the breakdown of incidents by phase.

It turns out that more events happened during the cruise phase. To get a better intuition about the impact of the flight phase, we need to determine which phases are more dangerous (by dangerous, we mean associated with more fatal injuries).
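One way to approach this question is to join the events and phases table with the occurrence table on OccNo and average the fatal injury count per phase. In the sketch below, every column other than OccNo is a hypothetical name and the rows are toy data.

```python
import pandas as pd

# Toy tables for illustration; every column except OccNo is hypothetical.
occurrence_df = pd.DataFrame(
    {"OccNo": ["A01", "A02", "A03"], "TotalFatalCount": [0, 4, 2]}
)
phases_df = pd.DataFrame(
    {
        "OccNo": ["A01", "A02", "A03", "A03"],
        "Phase": ["CRUISE", "INITIAL CLIMB", "INITIAL CLIMB", "LANDING ROLL"],
    }
)

# Attach the fatality count of each occurrence to its flight phases.
merged_df = phases_df.merge(occurrence_df, on="OccNo", how="left")

# Average fatal injuries per phase, most dangerous phase first.
fatal_by_phase = (
    merged_df.groupby("Phase")["TotalFatalCount"].mean().sort_values(ascending=False)
)

print(fatal_by_phase)
```

Note that an occurrence spanning several phases, like A03 here, contributes its fatalities to each of them, so the averages describe association with a phase rather than a confirmed cause.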

The accidents causing more fatal injuries happen in general during the beginning of the flight (Initial climb) and the last minutes of the flight (Final approach, Landing roll).

When we consider takeoff and landing operations, it is clear that the runway plays a key role. Runway surface can be described using two parameters: surface condition and contamination.

Even though contamination affects runway operations, we can observe that a high number of events occur on bare and dry runways. In these cases, human or equipment factors must be considered.

Which aircraft are more likely to be involved in events?

Let’s see if a specific aircraft manufacturer is more represented in the incident/accident database:

Cessna and Piper appear among the top 5 manufacturers. Most of the aircraft produced by these two manufacturers are light aircraft used for personal, tourism, and training flights. Pilots flying these aircraft are less experienced on average. There are also many occurrences involving aircraft made by Boeing, Airbus, and De Havilland. These aircraft are the ones most commonly operated by Canadian commercial operators.

Aviation events are not reported only for airplanes, but for all types of aircraft. Here is a breakdown of events by aircraft types:

Events related to airplanes are the top category. There is still a non-negligible number of events involving helicopters, gliders, and even balloons!

Events can be classified as well, according to their operator type:

Commercial flights report the highest number of events, due to more intense activity. Private flights include tourism and training activities.

Using the YearOfManuf column, we will compute the aircraft age at the time of the event.
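A sketch of the age computation is given below. OccNo and YearOfManuf are the column names used in this article, while Year is a hypothetical column holding the year of the occurrence; the rows are toy data.

```python
import pandas as pd

# Toy tables for illustration; "Year" is a hypothetical derived column,
# while OccNo and YearOfManuf are the column names used in the article.
occurrence_df = pd.DataFrame({"OccNo": ["A01", "A02"], "Year": [2005, 2010]})
aircraft_df = pd.DataFrame(
    {"OccNo": ["A01", "A02", "A02"], "YearOfManuf": [1985, 2008, None]}
)

# Bring the occurrence year next to each aircraft record.
aircraft_age_df = aircraft_df.merge(occurrence_df, on="OccNo", how="left")

# Age at the time of the event; stays NaN when the build year is unknown.
aircraft_age_df["AircraftAge"] = aircraft_age_df["Year"] - aircraft_age_df["YearOfManuf"]

print(aircraft_age_df)
```

The merge is done from the aircraft table because one occurrence can involve several aircraft, each with its own year of manufacture.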

This histogram shows that, in general, incidents and accidents are not more frequent with old aircraft. This can be associated with two factors:

  • There are fewer old aircraft, so fewer events involving this category.
  • Airworthiness and strict maintenance programs allow keeping aircraft in a good and safe condition.

Not all occurrences lead to aircraft damage. The next plot shows a breakdown by damage level.

Most of the events caused no damage. It turns out that, if an aircraft is damaged during an event, it is more likely that the damage is substantial.

Flight plans, search and rescue operations, and aviation safety

Flight plans are documents filed by a pilot or flight dispatcher with the local Air Navigation Service Provider before departure, indicating the aircraft’s planned route or flight path. Flight plans are mandatory in some cases (IFR, cross-border VFR, etc.). They contain important data about the flight: originating airport, route, destination airport, altitude, and aircraft type. They also include contact information and the search and rescue time.

The top two categories of flights involved in events are:

  • flights without a flight plan,
  • flight notes, where the pilot is responsible for search and rescue

Furthermore, the data shows that the average fatal injury count is lower when flight plans are available.

During search and rescue operations, evacuation time plays a significant role. In the following plot, we show a breakdown of reported evacuation times.

Let’s look at the impact of evacuation time on fatal injuries count.

As we might expect, more fatal injuries happen in situations where it takes longer to proceed with evacuation.

The availability of survival equipment has a great impact on the incident outcome. Here is a breakdown of reported survival equipment in the dataset.

Now let’s see if the survival equipment was used properly during the events.

It appears that, in a majority of cases, the survival equipment was used properly. The next question would be: how effective is the equipment?

At first glance, we can observe that survival equipment didn’t affect survivability in more than 14,000 events. The important statistic here is that equipment was reported as effective in more than 6,000 events, which suggests that it helped save lives in thousands of cases.

Summary

This analysis presented some of the key aspects to consider when it comes to aviation safety. In general, the PETE factors (Person, Equipment, Task, Environment) are widely used to investigate safety concerns. Some trends can be observed, such as the decrease in the number of fatal injuries. On the other hand, the number of incidents is not decreasing. We have shown that some periods are more prone to aviation incidents, so more precautions should be taken during them. We also showed the impact of environmental variables and examined the link between the flight phase and the likelihood of incidents and accidents.
The last part of the analysis was devoted to showing the importance of flight plans in search and rescue operations. Time and information are the most important factors during these operations.

Future work

This study was done using fewer than 40 columns out of 500. We still have to go deeper to get the most out of the available data. The next step will be more specialized EDAs, with more emphasis on what we can do to improve safety. Each factor of the PETE model can be considered, and an extensive search is required to develop efficient solutions.

Acknowledgments

I would like to thank the Jovian team, especially Aakash N S for all the extensive assistance through my Data Science training.

References

Transport Canada [https://tc.canada.ca/en]

Nav Canada [https://www.navcanada.ca/en/]

Statistics Canada [https://www.statcan.gc.ca/eng/start]

My Github [https://github.com/zaitrik]

My LinkedIn [https://www.linkedin.com/in/mounir-kara-zaitri-a01a00208/]

Mounir Kara Zaitri

I'm a Canadian air traffic controller, fascinated by data analytics and machine learning.