The Research Works Act H.R.3699

Following my post about SOPA & PIPA I’ve now come across The Research Works Act (H.R.3699) being introduced to US Congress. This bill, if I’m correct, is designed to prevent federal agencies in the US from stipulating that research funded with taxpayers’ money via grants should be published online for open access. This is contrary to, say, what the National Institutes of Health set out in their Public Access Policy.

This is interesting to me following yesterday’s presentation.



Managing & Sharing Research Data Part 1

I attended a presentation this morning given by Martin Donnelly of the Digital Curation Centre (DCC), University of Edinburgh, covering ‘Managing & Sharing Research Data: Good practice in an ideal world … in the real world’, held at The University of Sheffield and promoted by the Research Ethics Committee there. It was a two-hour session: the first part was a talk, and the second a demonstration of an online resource produced by the DCC called the Data Management Planning (DMP) Tool, which enables easy production of DMPs to meet research funding council requirements.

I attempted to make notes during the presentation in the form of this blogpost, so the following is just that: my notes. You might still find some use in them.


DCC was founded in 2004 for UK HE & FE sectors. Its major funder is the JISC. It provides support for JISC projects as well as producing tools, providing guidance, case studies, consultancy, etc.

Body of Presentation

When considering data management there are a number of areas to focus on:

  • Ensure the physical integrity of the files
  • Ensure the safety of the content (read and understood by your target audience but not accessible by other people / Data Protection / file format / etc.)
  • Describing the data (metadata), and what’s been done to the data
  • Access at the right time – make data available only after publication (embargo)
  • Transferring custody of data from the field to storage, archiving and possibly on to destroying (this process needs managing and is not necessarily done by the data collector)
  • Research Ethics & Integrity.

However, there is also the concept of Openness, Open Science and Open Data that needs to be considered. Martin touched on the Panton Principles with respect to Open Science. The Principles were drafted in Cambridge in July 2009 and officially launched in February 2010. They originated in the discipline of chemistry, and the concept, as taken from their website, is:

Science is based on building on, reusing and openly criticising the published body of scientific knowledge.

For science to effectively function, and for society to reap the full benefits from scientific endeavours, it is crucial that science data be made open.

By open data in science we mean that it is freely available on the public internet permitting any user to download, copy, analyse, re-process, pass them to software or use them for any other purpose without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. To this end data related to published science should be explicitly placed in the public domain.

[Aside: I shall be returning to this, not least for the ioe12 course.]

Martin also pointed to an article in The Guardian, ‘Give us back our crown jewels’, Arthur & Cross, 9 March 2006.

Our taxes fund the collection of public data – yet we pay again to access it. Make the data freely available to stimulate innovation, argue Charles Arthur and Michael Cross

The Research Councils UK (RCUK) is the strategic partnership of the UK’s seven Research Councils. It has produced a Common Principles on Data Policy, which Martin summarised as having Key Messages:

  1. Data is a public resource
  2. Adhere to standards & best practice
  3. Metadata for ease of discovery and access
  4. Constraints on what data to release
  5. Embargo periods delaying data release
  6. Acknowledgement of / compliance with Terms & Conditions
  7. Data management & sharing activities should be explicitly funded

There is an increasing number of things influencing the management of research data, some of which I managed to jot down:

  • Research outputs are often based on the collection, analysis, etc of data
  • Some data is unique (e.g. date & time specific weather conditions data) and can’t be reproduced
  • Data must be accessible and comprehensible
  • There’s a greater demand for open access to publicly funded data
  • Research today is technology enabled and data intensive
  • Data is a long-term asset
  • Data is fragile and there is a cost to digital data; curate to reuse and preserve
  • Data sharing and research pooling might be more cost-effective: cross-disciplinary and increased global partnership
  • Costs of technology and human infrastructures
  • Increasing pressure to make a return on public investment

Most (but not all) Research Councils are broadly the same in their approach to data management. They generally require a Data Management Plan before funding is granted. NERC has a Data Policy & Guidance (pdf), and also provides data centres for managing funded research data.

EPSRC is the odd one out; it requires all institutions to provide a roadmap for data management by 1st May 2012 and to have it implemented by 1st May 2015.

RCUK has a Policy and Code of Conduct on the Governance of Good Research Conduct (available as a pdf).

Martin highlighted how some universities have got into difficulty with regard to Freedom of Information (FOI) requests. He mentioned Queen’s University Belfast and a request about Irish tree rings that was made under FOI. He also described how Stirling University had received a request from a tobacco company about the take-up of smoking amongst teenagers: useful data for a tobacco company.

The University of Edinburgh has developed a Research Data Management Policy.

The question Martin then put was Why? Why do this? And he outlined the incentives in the form of carrots and sticks.

It’s a good thing

  • Data as a public good (the RCUK common principles)
  • Others can build on your work (Isaac Newton: “If I have seen farther it is by standing on the shoulders of giants.”)
  • Passing on custody so making effective use of resources.

Direct incentives to researchers are:

  • Increased impact of your work
  • Making publications available online increases citations

More incentives:

  • Increased citations help with the REF
  • Research councils are increasingly rejecting applications on the grounds of poor data management plans
  • You receive more funding if you do this right

And the ‘Sticks’:

Academic researchers often raise a concern about how their data will be used or misconstrued if it is out in the open. Martin emphasised the importance of appropriate metadata to try to prevent this. However, he did say that if data is going to be misconstrued, it will be, whatever you do. Files need to be labelled in an understandable, meaningful, standard and appropriate fashion, including the project title and date. It would also be useful to maintain a separate log describing the data, to include:

  • research context
  • data history
  • where & how to access the data
  • access rights
  • etc.
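
As a rough illustration of the labelling and logging advice above, here is a minimal Python sketch. The function names, the file-name pattern and the log field names are my own assumptions for the example; none of them come from the presentation or from the DCC.

```python
import json
import re
from datetime import date


def slug(text):
    """Lowercase the text and turn every run of other characters into one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def data_filename(project_title, collected_on, description, extension):
    """Build a file name that carries the project title and the date.

    The pattern <project>_<YYYY-MM-DD>_<description>.<ext> is an assumed
    convention, not a standard.
    """
    return f"{slug(project_title)}_{collected_on.isoformat()}_{slug(description)}.{extension}"


def log_entry(filename, research_context, data_history, access_location, access_rights):
    """Describe one data file using the log headings from the list above."""
    return {
        "file": filename,
        "research_context": research_context,
        "data_history": data_history,
        "access_location": access_location,
        "access_rights": access_rights,
    }


def append_to_log(log_path, entry):
    """Append the entry to a plain-text log, one JSON object per line."""
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
```

A call such as `data_filename("Example Project", date(2012, 1, 19), "Site A cores", "csv")` gives a name that sorts sensibly and can be read without opening the file, and `append_to_log` keeps the description next to the data without needing any special software.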

Backup is also a consideration. It is different from archiving. Backup is about loss, damage and recovery of data during the research process. (Archiving is about retaining and providing access at the end of the research process.) There should be some means of off-site backup. There should be an implemented, automatic backup process at the University, Faculty or School level. If not, then a manual backup process is required with set repeat reminders.
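
One way to picture the manual backup step is a small script that copies a file to a backup location and then checks the copy against the original. This is only a sketch: the function names and the choice of SHA-256 checksums are my assumptions, and it does not cover the off-site or reminder parts.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source, backup_dir):
    """Copy source into backup_dir and confirm the copy matches the original."""
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also tries to keep timestamps
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Backup of {source} does not match the original")
    return target
```

Comparing checksums after the copy is a simple guard against a backup that looks fine but is damaged, which ties back to the point about the physical integrity of the files.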

Archiving is a case of depositing data for the long term. However, it does require things like checking copyright, consent and data protection. You should use the appropriate archive for your subject discipline. It’s also important to publicise your archived data for increased citations. The point was made that there isn’t yet a standard for data referencing, and that some work needs to be done in this area. The other concern, that your data might be used without your knowledge, is just the same as the risk of your published work being plagiarised.

Rachel Kane from RIS in Sheffield highlighted that specific Sheffield resources will be made available soon. She also provided some useful examples of what people were doing at the University, including:

  • Prof. Steve Banwart’s approach to open data in Civil and Structural Engineering
  • Dr Bethan Thomas in Geography SASI
  • HRI Digital – data management services – from application to archiving stages – consultancy

ioe12 & the day of the internet blackout

It seems rather poignant to be considering licensing and copyright issues today as parts of the internet take action against two bills being debated by Congress in the US.

The two bills are the Stop Online Piracy Act (SOPA) and the PROTECT IP Act (PIPA).

There is much media coverage (e.g. BBC), particularly of the protest by Wikipedia,

Wikipedia anti-censorship splash page

with Jimmy Wales appearing in numerous news articles, both online and in the ‘standard’ media.

But other sites are joining in the protest, Boing Boing for example.

Boing Boing SOPA & PIPA protest page

And the Digital Storytelling MOOC, ds106 run by Jim Groom has this:

Digital Storytelling MOOC DS106 Censorship Splash Page

DS106 is reliant upon ‘fair use’ of media to enable participants to engage with the media and create their own material to interact fully. Under the Acts, such a course covering digital expression could face closure.

This led me to this rather informative video outlining the PIPA:

From this video I take it that PIPA hands powers for censoring the internet to the entertainment industry. This enables the shutdown of sites where people download unauthorized media content. Most of these sites reside outside US law. The Bill gives the US Government powers to make internet providers in the US block access to infringing domain names, similar to the powers used by China, Iran & Syria, which democratic peoples find so objectionable. It will allow US-based search engines, directories, even blogs and fora to be sued to have infringing links taken down. Additionally, it can cause funds to be cut off to such sites by having advertisers & payment sites close their accounts. Blacklisting will mean that foreign sites won’t be displayed in major search engine results.

Concerns arising from the PIPA are that it will reduce the number of successful new start-ups, because they can be accused of not actively filtering strongly enough to prevent copyright infringement: this could particularly impact new search engines and social media start-ups. The early days of YouTube would probably have fallen into this category. Small sites or those in their infancy won’t have enough funds to defend themselves. The Bill will mean that it is easier to take down a site than for courts to decide upon the nuances of copyright law compared to free expression.

In the history of the internet, wherever people have come to express themselves, be creative, share ideas and knowledge, or even develop protest movements, there is a tendency for copyrighted media material to be uploaded as well. This Bill would seek to prevent that, and could lead to other countries developing legislation along similar lines. This would inevitably mean a very different internet being visible to differing parts of the global population. Potentially powerful localized laws would cause censorship of content, enabling abuse of people and limiting the freedom of:

  • expression
  • choice
  • communication
  • education
  • discussion
  • etc.

It can be argued that there is already adequate (or, in the views of some, already extreme) legal provision in place via the Digital Millennium Copyright Act (DMCA), where for example links to infringing material can be removed. This power has also been said by some to be abused, with:

  • journalists being sued
  • YouTube videos being removed
  • families and children being sued for infringement
  • and seemingly excessive royalties being demanded for use of content, thus inhibiting creative cultural documenting or expression.

In fact, this final point touches on the essence of ‘Bound by Law?’ about the use of media in documentary filmmaking.

The powerful rights owners will be protected by this legislation, but innovation, creativity and cultural expression might well be the biggest sufferers.

I’ll have to see how this story plays out in the coming weeks.

Open Licensing #ioe12 Post1

Until I watched the Larry Lessig TEDxNYED video (outlined in this post) I didn’t really understand the reason for copyright. I thought it was primarily about income, which wasn’t the driving motivation for my work.

I’ve gone down the Creative Commons route for licensing my own works, be it this blog, images on Flickr, videos, whatever. My own personal approach is that if someone wants to use my work, please go ahead:

  • re-use, re-mix,
  • make it better,
  • make it more relevant,
  • make it more understandable.

For me that’s what creation and culture are all about.

But copyright is about this level of control, how others want their own work to be licensed and used. The argument for the combined system is that there is then a place for commercial success as well as for this ‘other’ culture. To enable this to happen there needs to be a respect for the creators of both aspects, with an option of fair use or fair dealing in the Commonwealth.

I have found this MIT World video of the debate ‘Copyright, Fair Use, and the Cultural Commons’ useful for expanding my own understanding, and others might also find it of interest at this point.

Welcome to Firefox Openness

The video on the ‘Welcome to Firefox’ page really encompasses my reasoning for staying with the product.

  • Principle over profit
  • Secrecy is trumped by honesty & corporate interest by community
  • We believe that the web is more cared for than owned
  • More of a resource to be tended to than a commodity to be sold
  • Strongly believe in innovation that puts users strongly & squarely in the driver’s seat
  • We believe that together, with this cause in mind, we can continue to innovate for the benefit of the individual & the betterment of the web so that it always & forever serves the greater good.

Notes from Lawrence Lessig TEDxNYED video

Open Licensing

As part of the ioe12 course work I took notes of what I thought to be the significant bits from this video:

All the following content is therefore attributed to Larry Lessig.

Copyright is about what level of control.

Copyright policy isn’t just about how to incentivize production of an artistic commodity, it’s about what level of control we are going to permit to be exercised over our social realities – social realities that are now inevitably permeated by pop culture.

… it’s important that we keep these two different kinds of public goods in mind. If we are only focussed on how to maximise the supply of one, … we risk suppressing this different and richer, and in some ways maybe even more important one.

Freedom needs this opportunity to have both the commercial success of the great commercial works and the opportunity to build this different kind of culture.

And for that to happen you need ideas like ‘fair use’ to be central & protected, to enable this kind of innovation between these two creative cultures: a commercial culture & a sharing culture.

  • A need for ownership
  • A respect of ownership
  • A respect we should give to
      • the Creator
      • the Remixer
      • the Owner
      • the Property Owner
      • the Copyright Owner

of this extraordinarily powerful stuff not a generation of ‘share croppers’.

There are lessons here about ‘openness’.

  1. Our lives are sharing activities, at least in part. For this to happen we need to have well protected spaces of fair use.
  2. This ecology of sharing needs freedom in which to create. Freedom which means without permission from anyone the ability to create.
  3. We need to respect the creator. The creator of these remixes through rights that are directly tied to them.

Creative Commons is offering authors this simple way to mark their content with the freedoms they intend it to carry.

So we go from an

“All Rights Reserved” world

to a

“Some Rights Reserved” world.

And people can know the freedoms they have attached to content, building & creating on the basis of this creative copyrighted work.

These Creative Commons tools enable sharing, in part, through licenses that make it clear; a freedom to create without requiring permission first, because the permission has already been granted; & a respect for the creator, because it builds upon a copyright the creator has licensed freely.

Hundreds of millions of digital artifacts are already licensed in this way.

Do we have this ecology right now?

Openness is a commitment to a certain set of values. We need to speak of those values.

The value of

  • Freedom
  • Community
  • limits in regulation
  • respecting the creator.

3 of the best

I’m just pumped up so far this year – participating in 3 big MOOCs – wallowing in sheer MOOCy goodness:

* The sweeping majesty of Change11 with Stephen Downes, George Siemens, & Dave Cormier
* The openness philosophy of ioe12 with David Wiley
* The creativity of ds106 with Jim Groom & Tim Owens

I’ll be a changed man by the end of all this.