On 15 January 2014 Greg Hunt, the Commonwealth Environment Minister, approved a request by the Western Australian Government to be exempted from the *Environment Protection and Biodiversity Conservation Act 1999* (Cth) (EPBC Act) so that it could proceed with plans to set up 72 baited drum lines in order to cull white sharks, tiger sharks and bull sharks greater than 3 metres long. The exemption is required because those species are listed as threatened species.

The predominant reasons for granting the exemption were:

1. A statistically significant increase in shark strikes in Western Australia in the years 2010 to 2013, even after adjusting for population increase;

2. As a result of point 1, people are scared of entering the water and there is anecdotal evidence of tourist businesses suffering financially;

3. The proposed cull will reduce the incidence of shark attacks and will provide useful information to other states in dealing with the same problem;

4. The Western Australian Government has implemented measures to reduce the risk of death to other sea life from the exercise; and

5. It is in the national interest (due to points 1 to 3 above) for the cull to be allowed, and the cull therefore falls within the class of reasons the EPBC Act gives as a basis for an exemption.

**1. Statistically significant rise in shark attacks**

At the outset, I am not formally trained in statistics and have only learned small sections of statistical analysis in passing. I would be grateful for any comments from those with greater knowledge about my analysis and conclusions.

The assertion made is that, when adjusted for population increase over time, the number of shark attacks has risen by a statistically significant degree. The paper this is based on (here) doesn’t provide the population or shark attack numbers used. However, a comparison between the graphs on pages 10 and 11 of that paper (showing frequency of attacks per year and frequency of attacks per year per 100,000 people respectively) doesn’t show any major distortion. On the basis of this lack of distortion, my analysis will use the unadjusted figure of shark attacks per year, which I have obtained from the Shark Attack File.

However, before doing so, one variable not taken into account, and which arguably must be accounted for in order to properly analyse the data, is any increase in the average time each person spends in the ocean. Such an increase could result in more attack incidents even with the same number of people entering the water and the same number of sharks in it, simply because the chance of the two interacting goes up.

Below is my tabling of shark attack data over the past 30 years, extracted from the Shark Attack File. The data counts the number of people injured rather than the number of attacks, so a single incident that injures two people may appear as two entries. Further, I’ve attempted to remove all reports listed as a hoax, and reports where a shark has been caught in Australian waters with human parts found in its stomach contents (and which are therefore not necessarily incidents that occurred in Western Australia), but I can’t guarantee that all such data points have been removed.

One other point to note is the greater number of reports of minor incidents in the later years of the data. Data in the earlier years appears to be limited to incidents with either very serious or fatal outcomes, while in the last decade or so there are more reports of minor incidents or incidents where no injury is reported. This may be a common finding in databases that rely on the reporting of events which are gaining increased importance or scrutiny for some particular reason. This skewing of the data can lead to false trends being extracted from it.

| Year | Australia: incidents | Australia: fatalities | Western Australia: incidents | Western Australia: fatalities |
|------|------|------|------|------|
| 2013 | 18 | 3 | 7 | 2 |
| 2012 | 22 | 3 | 7 | 2 |
| 2011 | 18 | 5 | 6 | 3 |
| 2010 | 16 | 2 | 5 | 2 |
| 2009 | 30 | 0 | 5 | 0 |
| 2008 | 18 | 2 | 3 | 1 |
| 2007 | 18 | 2 | 2 | 0 |
| 2006 | 12 | 1 | 2 | 0 |
| 2005 | 17 | 2 | 2 | 1 |
| 2004 | 16 | 4 | 6 | 1 |
| 2003 | 9 | 1 | 3 | 0 |
| 2002 | 12 | 3 | 1 | 0 |
| 2001 | 13 | 1 | 3 | 0 |
| 2000 | 18 | 5 | 2 | 1 |
| 1999 | 2 | 1 | 0 | 0 |
| 1998 | 4 | 2 | 0 | 0 |
| 1997 | 10 | 2 | 3 | 1 |
| 1996 | 15 | 1 | 2 | 0 |
| 1995 | 7 | 1 | 2 | 1 |
| 1994 | 3 | 1 | 1 | 0 |
| 1993 | 7 | 3 | 1 | 0 |
| 1992 | 6 | 1 | 0 | 0 |
| 1991 | 7 | 1 | 2 | 0 |
| 1990 | 8 | 1 | 0 | 0 |
| 1989 | 12 | 2 | 1 | 0 |
| 1988 | 7 | 3 | 1 | 0 |
| 1987 | 4 | 2 | 1 | 0 |
| 1986 | 5 | 0 | 2 | 0 |
| 1985 | 2 | 1 | 0 | 0 |
| 1984 | 3 | 1 | 1 | 0 |
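
As a quick arithmetic check on the table, the Western Australian incident counts for 1984 to 2009 can be averaged directly. The sketch below simply transcribes those 26 values; the variable names are illustrative only and not taken from any source.

```python
# Western Australian incidents per year, 1984-2009,
# transcribed from the table above (names are illustrative).
wa_incidents = {
    1984: 1, 1985: 0, 1986: 2, 1987: 1, 1988: 1, 1989: 1,
    1990: 0, 1991: 2, 1992: 0, 1993: 1, 1994: 1, 1995: 2,
    1996: 2, 1997: 3, 1998: 0, 1999: 0, 2000: 2, 2001: 3,
    2002: 1, 2003: 3, 2004: 6, 2005: 2, 2006: 2, 2007: 2,
    2008: 3, 2009: 5,
}

total_incidents = sum(wa_incidents.values())
baseline_mean = total_incidents / len(wa_incidents)

print(f"{total_incidents} incidents over {len(wa_incidents)} years")
print(f"baseline mean = {baseline_mean:.2f} incidents per year")
```

With 46 incidents over 26 years, the mean comes to roughly 1.77 incidents per year.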

Given the assertion about shark attacks being significantly above the average in the years 2010 to 2013, I’ve decided to use the chi-squared test to assess the statistical significance of the number of attacks over those 4 years compared to the average of the preceding 26 years (1.77 incidents per year):

| Year | Expected average (E) | Observed number (O) | O − E = D | D² | D²/E |
|------|------|------|------|------|------|
| 2010 | 1.77 | 5 | 3.23 | 10.4329 | 5.8943 |
| 2011 | 1.77 | 6 | 4.23 | 17.8929 | 10.1090 |
| 2012 | 1.77 | 7 | 5.23 | 27.3529 | 15.4536 |
| 2013 | 1.77 | 7 | 5.23 | 27.3529 | 15.4536 |
| Sum of D²/E | | | | | 46.9105 |

Degrees of freedom = 3
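
The calculation above can be reproduced in a few lines. This is only a sketch, assuming the rounded baseline of 1.77 and the three degrees of freedom used in this post. The helper `chi2_sf_3df` is a hypothetical name; it uses the closed-form upper-tail probability that holds specifically for three degrees of freedom, so it is not a general chi-squared routine.

```python
import math

# WA incidents for 2010-2013, taken from the table above.
observed = {2010: 5, 2011: 6, 2012: 7, 2013: 7}
expected = 1.77  # rounded 1984-2009 average used in this post

# Sum of (O - E)^2 / E over the four years.
chi_squared = sum((o - expected) ** 2 / expected for o in observed.values())


def chi2_sf_3df(x):
    """Upper-tail probability P(X >= x) for a chi-squared variable
    with exactly 3 degrees of freedom (closed form for this case only)."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


p_value = chi2_sf_3df(chi_squared)

print(f"chi-squared = {chi_squared:.2f}")
print(f"p value     = {p_value:.3g}")
```

A general-purpose statistics library would do the same job for any number of degrees of freedom; the closed form is used here only to keep the sketch dependency-free.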

A chi-squared value of 46.91 with three degrees of freedom is absurdly high and gives a statistically significant p value of much less than 0.05. This means that, if the null hypothesis (such as “shark attack incidents per year remain static”) were true, the probability of seeing an increase this large by chance alone would be less than 5%, which is the level conventionally considered to give high enough confidence for the null hypothesis to be rejected. This does not mean that shark incidents are definitely increasing, or that there is some underlying cause for an increase in shark incidents (such as a greater number of sharks in the ocean).

There may well be a better way of analysing this data than the chi-squared test, given that we are using a yearly data set and that the chi-squared value I got is extremely high (and I’m open to suggestions in the comments).

The exemption statement also asserts that average attacks have increased from 1995 onwards. Whether this is correct or not is not my main concern.
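
One candidate alternative for small yearly counts, offered here only as a sketch and not as anything the paper or the exemption statement uses, is to treat the four-year total as a Poisson count. It asks how likely 25 or more incidents would be over 2010 to 2013 if the long-run rate of 1.77 per year still applied. The function name `poisson_upper_tail` is hypothetical, and the approach assumes incidents occur independently at a constant rate, which may not hold.

```python
import math

baseline_rate = 1.77             # incidents per year, 1984-2009 average from this post
years = 4                        # 2010 to 2013
observed_total = 5 + 6 + 7 + 7   # WA incidents over those four years

# Expected four-year total if the baseline rate were unchanged.
expected_total = baseline_rate * years


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed as 1 - P(X <= k - 1)."""
    lower = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1.0 - lower


p_poisson = poisson_upper_tail(observed_total, expected_total)

print(f"expected total = {expected_total:.2f}, observed total = {observed_total}")
print(f"P(X >= {observed_total}) = {p_poisson:.3g}")
```

This also produces a very small p value. It remains subject to the same caveats about reporting changes and small counts, and it treats a baseline estimated from the same data set as if it were known exactly.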

My concern with basing public policy on statistical analysis of this data is that the low number of attacks per year means any analysis will be of low statistical power. With such a low average number of attacks per year, and a sample of only a few years to compare to the long-term average, small deviations above or below the average can result in a statistically significant, but nonetheless erroneous, conclusion. Similar problems arise in medical trials, where small sample sizes can lead to significant improvements in the effectiveness of, say, a specific treatment not being found to be statistically significant. Or, as in this case, statistical significance is found in a sample set which, after a few more years’ worth of data is collected, could yet be seen to be no more than a ‘blip’ in the data based on chance alone (see here, which I found useful in explaining the difficulties in making positive assertions about statistical significance tests in large and small sample sizes alike, and about drawing conclusions from the analysis of single data sets by themselves).

Further, debate continues about the ability to reliably draw conclusions from a statistically significant finding in a single experiment or trial, or a small number of them: how reproducible the result is is a greater determinant of whether the statistics are describing something that is really occurring. Debate also continues about whether some tests are actually useful in drawing meaningful conclusions from otherwise good data. See here, here, here for example.

Statistical analysis is an extremely useful tool for studying a given hypothesis and drawing conclusions as to the probability that an effect shown in the data gathered is due to chance alone. But these tools are subject to limitations. The presentation of statements about statistical significance gives undue legitimacy to policy decisions where the limitations of the analysis are not provided or explained.

**2. Effect on people entering the water and on tourism businesses**

Statistics are given of the number of holiday makers to Western Australia, how many intend to enter the water and what percentage of the State’s economy comes from tourism. A further generalised assertion is made that “There is substantial public concern about the safety of water based activities in Western Australia, and anecdotal evidence that the frequency of shark strikes is impacting on businesses in Western Australia”, and this is followed by a report of a dive business saying that it had had a 90% drop in people wanting to learn to dive.

If the improper use of data and statistics is the failing in attempting to give legitimacy to the assertion that there is a problem with shark attack frequency, then the lack of legitimate, grounded, provable evidence of these asserted problems is the failing with this trumping up of the effects of the ‘problem’.

One would be excused for thinking that the application for exemption has been put forward by people who have done no more than read the newspapers and searched holiday stats from their own tourism department website to create a narrative they could use to promote the plan.

The real difficulty with realistic concern about increasing shark safety is the ‘zero infinity problem’: the chance of an attack happening to any particular person is so low that it barely warrants concern, but the effect on the victim if it does occur is infinite (in a non-mathematical sense of that term). To base policy decisions with likely deleterious effects on a population of any living thing, at least in part, on playing up heightened concern about something so unlikely to happen, and then in turn to superimpose that on financial reasons, must be considered poor leadership.

**Conclusion**

Reliance on these two factors to support such a move as actively killing threatened species is significantly flawed. Statements about the statistical significance of a problem based on minimal data points, followed by generalised statements of the effect of the problem with no proper basis in evidence, cannot pass as reasonable premises from which to infer that action must be taken, let alone the mode of action to be taken.
