Open Policy #ioe12

At the heart of the movement to open educational resources is the simple and powerful idea that the world’s knowledge is a public good and that technology in general, and the World Wide Web in particular, provide an opportunity for everyone to share, use and re-use it.

Cathy Casserly & Mike Smith, Hewlett Foundation

The course topic ‘readings’ consider the push for legislation within the US to increase public access to data generated by publicly funded grants, an example being the expansion of the National Institutes of Health Public Access Policy. However, I have previously written about the Research Works Act (H.R. 3699) which, if my understanding is correct, would undo this approach.

In the UK, the Research Councils are requiring research data to be made openly available as it is a ‘public resource’. Increasingly, research institutions are required to have a Data Management Plan in place before funding is granted, as I’ve previously mentioned.

Brazil is taking a very interesting approach to openness, as outlined in the article on OER in federal legislation.

The bill deals with three main issues: It

1) requires government funded educational resources to be made widely available to the public under an open license,

2) clarifies that resources produced by public servants under his/her official capacities should be open educational resources (or otherwise released under an open access framework), and

3) urges the government to support open federated systems for the distribution and archiving of OER.

https://creativecommons.org/weblog/entry/27698

When reading this in the context of education, I was reminded of the robust approach Brazil has taken towards pharmaceutical patents for the good of national public health, which I have previously encountered. There does seem to be a will in that country to work for the educational benefit of its people.

But more generally, across the globe there are problems with policy at different levels:

  • within institutions
  • in government
  • etc.

where policymakers don’t understand the technologies and so make decisions within existing parameters.

Most of the resources in higher education are digital and non-rivalrous; we just need to license them properly, using Creative Commons licenses for example. Cable Green argues that by licensing and opening work there can be greater leveraging of a global workforce that will take one’s work and maybe translate it, make it more accessible, or improve it in other ways.

Cable suggests that there are instances where the policies of institutions have been circumvented.

Where the faculty have come together and said “we are the Academy. Our job in the Academy is to advance knowledge. Our job in the Academy is to share knowledge to the extent that it’s what we are about. We will not only publish in the journals, but we will provide a free, accessible version of our research as well to anybody who would like to access it.”

Cable Green, Creative Commons, (video) http://youtu.be/bPTzFbpKIFA#t=12m08s

By following an openness policy there is increased potential for sharing and learning from each other:

  • across an institution (intra-openness)
  • and crossing institutional boundaries (inter-openness).

This also has the potential for financial savings. But Cable suggests that we need to move from a ‘not invented here’ stance to a ‘proudly borrowed from there’ one so that resources can be shared. Additionally, there are general advantages for society if people have increased access to education: good-quality curricula and affordable, up-to-date ‘textbooks’, constantly maintained and making use of the latest technologies.

There is a movement where some universities are providing resources and instruction openly. “The OER university (OERu) is a virtual collaboration of like-minded institutions committed to creating flexible pathways for OER learners to gain formal academic credit.

The OER university aims to provide free learning to all students worldwide using OER learning materials with pathways to gain credible qualifications from recognised education institutions.”

http://wikieducator.org/OER_university/Home

Assessment and credit from the institutions are available for a much reduced fee. Obviously there is an outreach and community mission to this approach, but it also has potentially widespread implications, meaning a shift away from the status quo in higher education provision.

Currently the ‘anchor’ partner institutions of OERu are:

Other interesting things happening in this area are the University of the People, and Wikiwijs in the Netherlands. Athabasca University in Canada has a policy that, prior to building a new course, the academic must go out globally and look at what OER materials are already available.

But there is a challenge with all of this: existing structures are difficult to change. The current ‘preferred’ institutional model of higher education is one of gatekeeping and rivalrous resources.

Managing & Sharing Research Data Part 1

I attended a presentation this morning given by Martin Donnelly of the Digital Curation Centre (DCC), University of Edinburgh, covering ‘Managing & Sharing Research Data: Good practice in an ideal world … in the real world’, held at The University of Sheffield and promoted by the Research Ethics Committee there. It was a two-hour session: the first part was a presentation, and the second a demonstration of an online resource produced by the DCC, the Data Management Planning (DMP) Tool, which enables easy production of DMPs to meet research funding council requirements.

I attempted to make notes during the presentation in the form of this blogpost, so the following is just that: my notes. But you might find some use in them.

Background

The DCC was founded in 2004 to serve the UK HE & FE sectors. Its major funder is the JISC. It provides support for JISC projects as well as producing tools, providing guidance, case studies, consultancy, etc.

Body of Presentation

When considering data management there are a number of areas to focus on:

  • Ensuring the physical integrity of the files
  • Ensuring the safety of the content (readable and understood by your target audience but not accessible by other people / Data Protection / file format / etc.)
  • Describing the data (metadata), and what’s been done to the data
  • Access at the right time – make data available only after publication (embargo)
  • Transferring custody of data from the field to storage, archiving and possibly on to destroying (this process needs managing and is not necessarily done by the data collector)
  • Research Ethics & Integrity.

However, there is also the concept of openness (Open Science, Open Data) that needs to be considered. Martin touched on the Panton Principles with respect to Open Science. These were drafted in Cambridge in July 2009 and officially launched in February 2010. Originally based in the discipline of chemistry, the concept, as taken from their website, is:

Science is based on building on, reusing and openly criticising the published body of scientific knowledge.

For science to effectively function, and for society to reap the full benefits from scientific endeavours, it is crucial that science data be made open.

By open data in science we mean that it is freely available on the public internet permitting any user to download, copy, analyse, re-process, pass them to software or use them for any other purpose without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. To this end data related to published science should be explicitly placed in the public domain.

[Aside: I shall be returning to this, not least for the ioe12 course.]

Martin also pointed to an article in The Guardian, ‘Give us back our crown jewels’, by Arthur & Cross, 9 March 2006.

Our taxes fund the collection of public data – yet we pay again to access it. Make the data freely available to stimulate innovation, argue Charles Arthur and Michael Cross

The Research Councils UK (RCUK) is the strategic partnership of the UK’s seven Research Councils. It has produced the Common Principles on Data Policy, which Martin summarised as these key messages:

  1. Data is a public resource
  2. Adhere to standards & best practice
  3. Metadata for ease of discovery and access
  4. Constraints on what data to release
  5. Embargo periods delaying data release
  6. Acknowledgement of / compliance with terms & conditions
  7. Data management & sharing activities should be explicitly funded

There are an increasing number of factors influencing the management of research data, some of which I managed to jot down:

  • Research outputs are often based on the collection, analysis, etc. of data
  • Some data is unique (e.g. date & time specific weather conditions data) and can’t be reproduced
  • Data must be accessible and comprehensible
  • There’s a greater demand for open access to publicly funded data
  • Research today is technology enabled and data intensive
  • Data is a long-term asset
  • Data is fragile and there is a cost to digital data; curate to reuse and preserve
  • Data sharing and research pooling might be more cost-effective: cross-disciplinary and increased global partnership
  • Costs of technology and human infrastructures
  • Increasing pressure to make a return on public investment

Most (but not all) of the Research Councils are broadly the same in their approach to data management: they generally require a Data Management Plan before funding is granted. NERC has a Data Policy & Guidance (pdf), and also provides data centres for managing funded research data.

EPSRC is the odd one out; it requires all institutions to provide a roadmap for data management by 1st May 2012, to be implemented by 1st May 2015.

RCUK has a Policy and Code of Conduct on the Governance of Good Research Conduct (available as a pdf).

Martin highlighted how some universities have got into difficulty with regard to Freedom of Information (FOI) requests. He mentioned Queen’s University Belfast and a request about Irish tree rings made under FOI. He also mentioned how Stirling University had received a request from a tobacco company for data about the take-up of smoking amongst teenagers; useful data for a tobacco company.

The University of Edinburgh has developed a Research Data Management Policy.

The question Martin then put was: why? Why do this? He outlined the incentives in the form of carrots and sticks.

It’s a good thing

  • Data as a public good (the RCUK common principles)
  • Others can build on your work (Isaac Newton: “If I have seen farther it is by standing on the shoulders of giants.”)
  • Passing on custody, so making effective use of resources.

Direct incentives to researchers are:

  • Increased impact of your work
  • Making publications available online increases citations

These are covered more fully in:

More incentives:

  • Increased citations help in the REF
  • Research councils are increasingly rejecting proposals on the grounds of poor data management plans
  • You receive more funding if you do this right

And the ‘Sticks’:

There is a concern often raised by academic researchers about how their data will be used or misconstrued once it is out in the open. Martin emphasised the importance of appropriate metadata in trying to prevent this, though he did say that if the data is going to be misconstrued, it will be anyway. Files need to be labelled in an understandable, meaningful, standard and appropriate fashion, including the project title and date. It would also be useful to maintain a separate log describing the data (a minimal sketch of one follows the list below), to include

  • research context
  • data history
  • where & how to access the data
  • access rights
  • etc.
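
To make this concrete, here is a minimal sketch of such a log kept as a machine-readable record alongside the data. The field names and values are my own illustration, not a DCC or funder-mandated schema:

```python
import json
from datetime import date

# Hypothetical log entry for one dataset; the fields mirror the list
# above (context, history, access) but the schema is illustrative only.
log_entry = {
    "project_title": "Example Weather Observations Study",  # assumed title
    "date_recorded": date.today().isoformat(),
    "research_context": "Hourly readings from a rooftop weather station",
    "data_history": "Raw CSV exported from the logger; cleaned in v2",
    "access_location": "https://archive.example.org/dataset/1234",  # placeholder
    "access_rights": "CC BY, embargoed until publication",
}

# Keep the log next to the data files so the context travels with them.
with open("data_log.json", "w") as f:
    json.dump(log_entry, f, indent=2)
```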

Backup is also a consideration, and it is different from archiving: backup is about the loss, damage and recovery of data during the research process, whereas archiving is about retaining and providing access to data at the end of it. There should be some means of off-site backup, ideally an automatic backup process implemented at University, Faculty or School level. If there isn’t one, then a manual backup process is required, with set repeat reminders.
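
For a researcher left to do this manually, something as small as the following sketch, run on a set reminder, would cover the basics. The paths are hypothetical, and this is no substitute for a managed off-site service:

```python
import shutil
from datetime import datetime
from pathlib import Path

# Hypothetical locations: the project's data folder and a mounted
# external or off-site drive to receive the snapshots.
SOURCE = Path("project_data")
BACKUP_ROOT = Path("/mnt/offsite_drive/backups")

def backup():
    """Copy the data folder into a fresh, timestamped snapshot directory."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    destination = BACKUP_ROOT / f"project_data-{stamp}"
    shutil.copytree(SOURCE, destination)
    print(f"Backed up {SOURCE} to {destination}")

if __name__ == "__main__":
    backup()  # run weekly, prompted by a repeat reminder
```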

Archiving is a case of depositing data for the long term. However, it does require things like checking copyright, consent and data protection. You should use the appropriate archive for your subject discipline. It is also important to publicise your archived data for increased citations. The point was made that there isn’t yet a standard for data referencing, and that some work needs to be done in this area. The other concerns, about the use of your data without your knowledge, are much the same as those around your published work being plagiarised.

Rachel Kane from RIS in Sheffield highlighted that specific Sheffield resources will be made available soon. She also provided some useful examples of what people were doing at the University, including:

  • Prof. Steve Banwart’s approach to open data in Civil and Structural Engineering
  • Dr Bethan Thomas in Geography (SASI)
  • HRI Digital – data management services and consultancy, from the application stage through to archiving

Web 2.0 to Web Squared – the next phase of web evolution

The term Web 2.0 is about five years old now. It was coined by Tim O’Reilly at a conference and was intended to signal the second coming of the web: that it wasn’t dead following the bursting of the “dotcom bubble”. But it has taken on a kind of folklore meaning, with many seeing it as an incremental version roll-out, as with a software update. And Tim says he is continually asked what the next big thing will be.

Is it:

  • the semantic web,
  • virtual reality,
  • the social web,
  • the mobile web?

And what’s it called, Web 3.0?

The short answer from Tim seems to be that it’s all those things listed, and more; and it’s not Web 3.0.

The next phase of web development is Web meets World, and achieving this needs not an incremental step but an exponential one.

Hence, the term we can expect to see moving into folklore following next week’s Web 2.0 Summit 2009 Conference is Web Squared or Web².

Back on June 25, Tim O’Reilly and John Battelle presented a webinar setting out their view of Web².

I’ve Tubechopped the initial 17min section of the video and a 1min15sec answer to questions about the impact on Higher Education. Please note: this was recorded from a webinar over a phone line so the audio isn’t great quality.

They also produced a special report on the topic; available as a pdf.

The remainder of this post is concerned with what I think is most pertinent from this report, along with my comments.

The fundamental premise of Web 2.0 is that the Web is becoming an application platform reliant on data subsystems that get better the more people use them, rather than just an information platform.

The question that then arises is, “Is the web getting smarter?”

It is in the current generation of apps that we see the web getting smarter. An example Tim gives is the Google Mobile Application for the iPhone. The speech recognition in the cloud is aligned with the search in the cloud, so Google knows what you’re likely to say – ‘pizza’ rather than ‘Pisa’ – and then the location information from the phone indicates that you want to know where the nearest three pizza places are, rather than a Wikipedia entry on the history and origins of pizza. That seems much smarter: speech recognition, search and location information all working seamlessly together.

Boiled down, the essence of good web apps is that they harness collective intelligence: a collective working that acts more intelligently, and yields greater value, than could be achieved by the individual components alone, be they people, groups or computers.

Key takeaway: A key competency of the Web 2.0 era is discovering implied metadata, and then building a database to capture that metadata and/or foster an ecosystem around it.

Web² Special Report, p.4

Examples of what appears at first to be unstructured data that has subsequently been identified and utilised include: Facebook, where online relationships with friends are used to form a social graph; Bit.ly, where a URL-shortening service realised the potential of realtime analytics; and the fact that every web link is a vote, with every link from a person deemed to have greater standing in a group (as measured by their contributions to that group) carrying a greater weighting, as sketched below.
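
As a rough sketch of that last idea, the snippet below scores URLs by standing-weighted votes. The data and the scoring scheme are my own invention for illustration; the report describes the principle, not this algorithm:

```python
from collections import defaultdict

# Hypothetical group activity: who linked to what, and each member's
# contribution count as a crude measure of standing in the group.
links = [
    ("alice", "https://example.org/a"),
    ("bob", "https://example.org/a"),
    ("bob", "https://example.org/b"),
    ("carol", "https://example.org/a"),
]
contributions = {"alice": 50, "bob": 5, "carol": 20}

def weighted_votes(links, contributions):
    """Treat each link as a vote, weighted by the linker's standing."""
    total = sum(contributions.values())
    scores = defaultdict(float)
    for person, url in links:
        scores[url] += contributions[person] / total
    return dict(scores)

print(weighted_votes(links, contributions))
# example.org/a outranks /b: more, and higher-standing, members linked to it
```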

The report considers the influence that moving sensory and input devices away from the fixed keyboard and into our hands will have. These devices (e.g. smartphones) have eyes (cameras), ears (microphones), and position and direction locators. All of this will enable increasing amounts of metadata and tags to be assigned automatically, and more accurately, to the vast amounts of data stored in cloud databases. And, interestingly, when the amount of data reaches a critical point, the addition of extra data actually reduces the size of the database, because the linkages become stronger and the need for explicit metadata reduces.
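
As an everyday illustration of sensors assigning metadata automatically, the sketch below reads the tags a phone camera embeds in a photo. It assumes a reasonably recent version of the Pillow library, and the file name is hypothetical:

```python
from PIL import ExifTags, Image  # assumes the Pillow library is installed

def photo_metadata(path):
    """Pull the camera-assigned tags (timestamp, GPS) out of a photo file."""
    exif = Image.open(path).getexif()
    # Translate numeric EXIF tag IDs into readable names.
    named = {ExifTags.TAGS.get(tag_id, tag_id): value
             for tag_id, value in exif.items()}
    gps = exif.get_ifd(0x8825)  # 0x8825 is the standard GPSInfo pointer
    return named.get("DateTime"), dict(gps)

# Hypothetical file; a smartphone writes these tags without user effort.
when, where = photo_metadata("holiday_photo.jpg")
print(when, where)
```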

This will give rise to a number of new applications, leveraging these affordances. Already we are seeing interesting augmented reality applications, including Layar on Android phones;

and the potential use of location-specific images with Adobe’s Infinite Images to create 3D experiences of real and imaginary worlds (video filmed at the conference).

An article appeared a couple of days ago on computing.co.uk entitled ‘Moving beyond Web 2.0’. It too was looking at Tim O’Reilly’s Web Squared concept. I’d like to highlight some points from this article, because it not only talks about the advances in technology and the concepts that it encapsulates, but also focusses on the (for me) important philosophies underpinning Web 2.0.

There is, however, more to Web Squared than new types of application that will process the immense data shadows soon to be cast by the emerging internet of things. More broadly, Web Squared is also about recognising that Web 2.0 has been as concerned with embracing new philosophies as new technologies. And in championing Web Squared, O’Reilly is signalling that the Web 2.0 ideologies of openness, transparency and rapid, collaborative value creation may have significant value well beyond the internet.

A big idea of Web Squared is that this may be achieved by applying the philosophies of Web 2.0 to mainstream politics and business thinking.

The CIOs who are embracing the cloud and not trying to build barricades around their datacentres are the ones who understand the philosophies as well as the technologies of Web 2.0, and who will also very much grasp Web Squared.

Some time ago, in a presentation I gave (full text available), I expressed my take on the importance of the philosophy of Web 2.0 rather than just the software, services and mechanics.

The relevant specific audio section about the philosophy is reproduced here:

My Diigo links for Web Squared.

Process or product – assessment and HE institutions

Following a private (DM) discussion on Twitter with @evestirling, the following occurred to me.

If the product of students’ work is to be assessed, then it’s appropriate for the HE institution to set the assessment environment and medium.

If the process is the important factor, then students should be allowed to work in whatever environment and media they want, and institutions need to adapt their assessment processes to accommodate this. The control should lie with the student, in the form of a PLE (personal learning environment), not with the HE institution.

Comments are very welcome on this.