Whose data is it anyway?

24 July 2014

Posted by: Liz Harley

Category: Staff blog

Who owns scientific data? The scientist who collects it or the person that pays for it? With the UK making serious commitments to open access for the outcomes of research, Dr Liz Harley asks whether the data underpinning those outcomes should be publicly available as well.

Ownership of scientific data is a contentious issue. Historically the scientist who developed the hypothesis, designed the experiments and analysed the outcomes has been considered the owner of the data. No one would dispute Charles Darwin’s ownership of the data he meticulously collected through his pigeon breeding experiments, or barnacle dissections.

Darwin paid for his research using family money, and conducted much of his work in laboratories on his own premises. But today the vast majority of UK scientific research is funded not by personal fortune, like Darwin’s, but by the Research Councils. Their money is allocated by the Government and comes ultimately from the taxpayer. Today Darwin would be working out of a laboratory on UCL’s Gower Street (the site of his family’s London house) and funded by a grant from the Natural Environment Research Council.

One of the key arguments of the Open Access movement is that scientists should be accountable to the public that funds their research. Making the outcomes of publicly funded work free at the point of publication allows the public to assess for themselves the research conducted with their money and ostensibly on their behalf. Research Councils UK, the umbrella body for UK research councils, has set in motion an ambitious open access policy that would see all peer-reviewed research articles that acknowledge research council funding published in an open access format. The policy was implemented on April 1st 2013, with the anticipation of a transition period of around five years.

That takes care of the outcomes of research, the published papers, but what about the underlying data? Does anyone have a right to see, or demand to see, that data?

There are many benefits to having openly available scientific data, chief of which is aiding the development of new research. Hypothetically, if Darwin had made his data on the basic physiology of the barnacle freely available, it could have formed the basis for research that Darwin could never have imagined, perhaps in the fields of conservation or ecotoxicology.

Having access to existing data could prevent duplication of research efforts by different groups, which would be particularly beneficial when that research involves the use of animals. And from a broader perspective making all data subject to wider scrutiny – essentially peer review – has the potential to identify mistakes, deliberate or otherwise, more rapidly. While open data is unlikely to eliminate scientific fraud, it could make fraud a lot more difficult to get away with.

However, many would argue that while it is acceptable to make the outcomes of research publicly available, the data itself is the intellectual property of the scientist. The adage ‘publish or perish’ is often used to describe the pressure that scientists are under to generate publications, and as one dataset can form the basis of multiple publications there is often little incentive to make that resource available to competing interests. And there is nothing to stop the unscrupulous from passing another scientist’s data off as their own.

But theoretically there is a system that could manage these competing interests. The premise is simple: research grants typically run for a fixed length of time, approximately three years. So a condition of receiving the research funding could be that within three years of the end of the grant the researcher has to make all data collected under that grant publicly available. This would give the scientist time to work with the data and produce the publications that are so critical to research success, while still ensuring that the public who paid for the data will get to see it.

The question of who owns what makes implementing any kind of concrete open data policy difficult. At present Research Councils UK has a set of common principles, which acknowledge the importance of open data, along with the “legal, ethical and commercial constraints” surrounding data release.

But in a climate of greater openness and transparency, not just in science but across all sectors, perhaps what we should be asking ourselves is not who owns the data, but how we can use it to do the greatest good.

Dr Liz Harley