Monday 6 December 2010

Records management: the plasterer's hammer?

“For a field largely reliant on the active participation of the individual users responsible for creating, using and managing records to achieve its aims, much of records management appears sorely lacking in the depth and sophistication of its knowledge about those same users, their needs and objectives”.

So begins the conclusion of the paper Jay Vidyarthi and I wrote, published in the latest volume of the Records Management Journal (Vol 20, No 3).

The paper discusses the way in which records management has focused almost exclusively on the needs of ‘the organisation’, often to the detriment of the users we are so reliant upon. Records management is a discipline which strives for standardisation, consistency and uniformity: for example in the form of functional classification schemes attempting to map activities across the entire organisation with a view to constructing a ‘corporate file plan’, or shared metadata schemas. This drive to standardisation isn’t just evident within organisations, but across them – be it in the guise of ISO 15489 or any of the specification standards for an EDRMS – all of which have at their heart the desired goal of uniformity of approach.

Read any section of 15489 and it’s abundantly clear who the main beneficiary of records management is intended to be – and it’s not the user. Virtually every section defines its objectives in terms of the benefits it will provide to ‘the organization’, with the user(s) getting barely a mention. Now none of this may strike the reader as particularly surprising, nor in any way negative. After all, records management has long strived to be acknowledged as a ‘corporate function’ alongside HR, finance etc., and clearly many of the drivers for it (accountability, governance, regulation etc.) tend to apply at the organisational, rather than the individual, level.

None of this is intended as criticism, but rather to shed some light on why records management often struggles to satisfy the requirements of the individual users it relies on for success, and why it could be argued that it has given up even trying. At its most extreme, the disparity between the design of many records management systems and the needs of the individual user was summed up by one EDRMS user who once told me that ‘making me use an EDRMS is like asking a plasterer to use a hammer’!

This clearly puts records management and the technology we rely on to implement it (whatever that technology may be) in something of a quandary. Is it really possible for it to successfully serve two equally demanding masters? Can we really hope to find ways of meeting the myriad, highly specific, highly personal demands of our user community in a way which not only pleases each individual user but also continues to meet the obligations and interests of the organisation as a whole?

Carry on as we are and I fear the answer will continue to be ‘no’; but open our eyes and ears to some radical new perspectives and it could yet be a ‘yes’. Human-Computer Interaction (HCI) – a combination of computer science, cognitive psychology, sociology, information science and design – might just represent the ‘missing piece of the puzzle’. A blog post doesn’t provide the space to explore the detail – that’s what Jay and I start to do in the RMJ paper. Here it suffices to describe it as a structured approach which puts users first, to ensure that they can interact with the system in ways which meet their needs whilst also continuing to meet the needs of the organisation. By shining a light on the behaviour, needs, opinions, tendencies and motivations of end-users, it is the first step towards achieving truly effective records management systems. After all, give somebody a tool that patently saves them time, energy and frustration and they would be foolish not to embrace it; but the reverse is equally true: ask somebody to use a tool that promises only to help someone (or something) else, at their own personal expense, and surely we must concede that they would be a fool to use it.

The implications of such a shift in emphasis are profound, for records management as traditionally conceived is a house built from the top down, determined by the needs of the organisation, not one built from the bottom up based on the needs of its users. But it also offers some tantalising prospects: not just RM systems that users actively want to engage with, but also the possibility that we could start to use this new-found knowledge of user behaviour to design and create records management systems that can actually manage records ‘automatically’ (at least in part) based on this behaviour – in a way similar to that used by Amazon et al to organise their content to aid the user experience. Desirable? Definitely. Possible? Who knows, but watch this space…
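To make the idea a little more concrete, here is a minimal sketch (in Python, with entirely hypothetical data and names) of the kind of behaviour-driven suggestion I have in mind, borrowing the ‘users who did X also chose Y’ co-occurrence logic popularised by Amazon’s recommendations. It illustrates the principle only; it is not a description of any real system.

```python
from collections import Counter, defaultdict

# Hypothetical filing events harvested from system logs:
# (user, document type, folder the user chose).
filing_events = [
    ("alice", "invoice",  "Finance/Payables"),
    ("bob",   "invoice",  "Finance/Payables"),
    ("carol", "invoice",  "Projects/Misc"),
    ("alice", "contract", "Legal/Contracts"),
    ("bob",   "contract", "Legal/Contracts"),
]

# Aggregate the crowd's filing choices per document type.
folder_votes = defaultdict(Counter)
for user, doc_type, folder in filing_events:
    folder_votes[doc_type][folder] += 1

def suggest_folder(doc_type, min_votes=2):
    """Suggest the folder most users chose for this type of document,
    falling back to manual filing where the behavioural signal is weak."""
    if not folder_votes[doc_type]:
        return None
    folder, votes = folder_votes[doc_type].most_common(1)[0]
    return folder if votes >= min_votes else None

print(suggest_folder("invoice"))  # -> Finance/Payables
print(suggest_folder("memo"))     # -> None (no signal yet)
```

The interest lies not in the handful of lines of code, of course, but in the principle: the more users interact with their records, the better the system’s suggestions become, without a records manager writing a single rule.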

Wednesday 25 August 2010

Is the Cloud aware that it has 'the future of digital archiving in its hands'?

As anyone in the audience at the ECA conference in Geneva earlier this year will be aware, one of the issues I’ve been mulling over in recent months relates to roles and responsibilities in ‘the cloud’. The question I was asked to address in Switzerland was ‘in whose hands does the future of digital preservation lie?’ and my succinct response was: ‘Google’s’. This was (for reasons evident in the paper I gave) meant both literally – given their increasing dominance of the cloud space – and metaphorically, as an encapsulation of all cloud service providers.

And certainly when my colleague, Doug Belshaw, pointed me in the direction of this post regarding Facebook’s archiving policy, it became clear that I’m not the only one thinking about the (unintended?) consequences for all parties of where this might lead us.

It’s tempting to see things only from our (by which I mean the archival community’s) side of the fence – to lament the inevitable decline in our future professional role that handing over content to commercial external service providers for its long-term preservation will entail, and to worry about what it may mean for the archives (and their users) of the future.

But maybe we should also pause to reflect on what it may mean for these service providers themselves, and whether they actually have as much concern about the implications of this new-found responsibility on their side as we do on ours.

For as I concluded my paper in Geneva:

"Perhaps we should actually stop to ask Google and their peers whether they are indeed aware of the fact that the future of digital preservation lies in their hands and the responsibilities which comes with it and whether this is a role they are happy to fulfil. For perhaps just as we are in danger of sleepwalking our way into a situation where we have let this responsibility slip through our fingers, so they might be equally guilty of unwittingly finding it has landed in theirs.

If so, might this provide the opportunity for dialogue between the archival profession and cloud-based service providers and, in doing so, the opportunity for us to influence (and perhaps even still directly manage) the preservation of digital archives long into the future".


To again quote from the conclusion of my paper:

"Maybe the interconnection of content creation and use and its long term preservation need not be as indivisible within the cloud as it might first appear. Yes Google’s appetite for content might appear insatiable, but that does not necessarily mean that they wish to hold it all themselves – after all, their core business of search does not require them to hold themselves every web page they index, merely to have the means to crawl it and to return the results to the user. Might we be able to persuade them that the same logic should also apply to the contents of Google Apps, Blogger, YouTube and the like? If so, might the door be open for us, the archival community through the publicly funded purse to create and maintain our own meta-repository within which online content can be transferred, or just copied, for controlled, managed long term storage whilst continuing to provide access to it to the services and companies from which it originated?

That way they get to continue to accrue the benefit of allowing their users to access and manipulate digital content in ways which benefit their bottom line; the user continues to enjoy the services they have grown accustomed to; and the archival community can sleep soundly, safe in the knowledge that whilst service providers are free to do what they want with live content, its long-term preservation and safety continues to lie in our own experienced and trusted hands".


I wonder if such dialogue is already occurring between Google, Facebook et al and the likes of NARA, NAA and TNA. Let’s hope so…

Thursday 1 July 2010

Wisdom of the citizen?

Very interesting to see the new government initiative launched today by Nick Clegg which calls on the public to help decide which laws they want scrapped.

Now I should point out from the beginning that I'm not passing any political judgement here on its relative merit. What interested me was how the proposal reflects the whole 'wisdom of the crowd' ethos which, as many of you will know, I have long been advocating as a model for records management.

The idea is explained more fully in the Daily Telegraph and elsewhere, but in summary (to quote Nick Clegg):

"Today we are taking an unprecedented step. Based on the belief that it is people, not policy makers, who know best, we are asking the people of Britain to tell us how you want to see your freedom restored"

There are certainly echoes there of my belief that the creators and users of records are often far better placed than records managers to understand their records, and that we should be looking for innovative ways of extracting this knowledge and using it for management purposes.

And how is this to be done? To quote from Nick Clegg again:

"We are calling on you for your ideas on how to protect our hard-won liberties and repeal unnecessary laws... we're hoping for virtual mailbags full of suggestions. Every single one will be read, with the best put to Parliament"

Again, interesting to see an example of how technology can now be employed to gather and quantify information from a large cross-section of interested people and then used to inform the deliberations of those whose formal role it is to make such decisions.

Now whether this is just a political gimmick or a genuine attempt at change is not for me to comment on. But as a high-profile example of how technology now has the potential to empower individuals (be they 'citizens' or 'users'), and of how decision makers can and should now make use of their input, it does seem worthy of comment.

Sunday 2 May 2010

Is digital preservation now routine?

It’s been a while since I attended a conference specifically themed around digital preservation / electronic archiving, and having spent a few days last week in Geneva at the excellent European Conference on Archiving I was struck by the change. Not many years ago such conferences were dominated by debate about the technical complexities it posed, about the relative merits of competing theoretical approaches such as emulation and migration, and about the risks we faced if and when we got it wrong. The fragility of digital media was stressed and compared unfavourably to the durability of their traditional counterparts (encapsulated by seemingly endless comparisons between the original Domesday Book and its 1980s electronic equivalent).

I heard none of this at ECA, at least not in the sessions I attended or the conversations I was party to. Instead there were plenty of case studies from around Europe of organisations who are quietly and successfully getting on with it. On the evidence of the past few days we seem to have found ourselves in a situation where our ability to actually preserve this stuff indefinitely, and to continue to provide access to it, seems, without much triumph or fanfare, now to be taken as read. This is not, of course, the same as saying that no more problems or challenges exist, but they seem to be of a more prosaic, ‘routine’ nature, revolving around the need to secure budgets, improve the user experience and so on.

More interestingly still, if a single concern dominating the conference can be identified, it seemed to be one related to the volume of information being created and stored today and estimated to be created tomorrow. I lost count of the number of presentations which contained jaw-dropping predictions of the amount of data soon to be at our fingertips and the challenges this will pose in terms of resource discovery, legal discovery and overall management – but, interestingly, virtually never in terms of its preservation. So, based on the evidence of this conference alone, it seems as though within a few short years we have jumped from a situation where we used to worry obsessively that we were in danger of losing everything to one where we now stress about how to manage a world where we will lose nothing, with barely a pause for reflection on this change.

My other observation was that (my own meagre contribution to one side) there was little or no discussion of the growing impact of the web as a storage ‘repository’, as heralded by the rise of ‘the cloud’. The unspoken assumption behind most of the debate, and the projects and initiatives it represented, seemed to be that these organisations will always have physical control of the electronic information they wish to preserve, both prior to and after its ingest into their electronic archive; but, as I tried to stress in my paper, I wonder how safe an assumption that will prove to be?

Friday 5 March 2010

'Big data': big potential, big challenges

This week’s Economist has an excellent special report on managing information entitled ‘Data, data everywhere’. It looks at the changes, opportunities and challenges posed by our new-found ability to create and manipulate vast quantities of data – ‘big data’. There are lots of impressive/daunting (depending on your point of view) statistics about just how much data we are now talking about (40 billion photos on Facebook, for example) during this “industrial revolution of data”. It also explores the concept of ‘data exhaust’ – the trail of clicks which users leave behind them and which Google and others have been able to put to such incredible use: from search to speech recognition and from spell checking to language translation. All made possible not by attempting to teach computers the rules which determine how these concepts work, but by tracking the activities of billions of user transactions which do the work of refining, correcting and adding relative value to words. Those who have heard my ‘Meet the future of Records Management: Amazon.com’ conference paper will know that I have long suspected that we could and should be making use of this exact same ‘exhaust’ to help us manage information, as well as profit from it – what I describe as ‘Automated Records Management’ (see also Records Management Journal Vol 19, No 2, 2009 for a paper I wrote on this, entitled ‘Forget electronic records management, it’s automated records management that we desperately need’).
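By way of illustration, here is a deliberately naive Python sketch of how such exhaust might drive spell checking: when enough users retype a query the same way, the reformulation itself becomes the suggested correction. The data, names and threshold are invented for the purpose; real systems operate at vastly greater scale and sophistication.

```python
from collections import Counter, defaultdict

# Hypothetical pairs harvested from query logs: (what the user typed,
# what they immediately retyped it as). At scale, these reformulations
# are the 'exhaust' that teaches the system, with no rules written.
reformulations = [
    ("recieve", "receive"),
    ("recieve", "receive"),
    ("recieve", "retrieve"),
    ("accomodation", "accommodation"),
]

corrections = defaultdict(Counter)
for typed, retyped in reformulations:
    corrections[typed][retyped] += 1

def did_you_mean(query, min_support=2):
    """Return the most common reformulation of this query, provided
    enough users agree; otherwise offer no suggestion."""
    if not corrections[query]:
        return None
    suggestion, count = corrections[query].most_common(1)[0]
    return suggestion if count >= min_support else None

print(did_you_mean("recieve"))       # -> receive
print(did_you_mean("accomodation"))  # -> None (only one vote so far)
```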

There’s also interesting stuff in the Economist supplement on the problem of how to make sense of all this data, including new ways of visualising it and the prediction that ‘statistician’ will soon be one of the coolest jobs around(!). It also makes some interesting points about the need for management to be trained in how to make sense of all this data. This chimes with a conversation I had with a Chief Exec a few weeks ago, who also made the case for ensuring that senior management are aware of good old-fashioned archival concepts such as provenance and context, to give them a better appreciation of what the data they are looking at is actually telling them and how much it can be relied upon (rather than what they wish it was telling them and how much faith they may wish to place in it).

To give the Economist its due, it does also look beyond the potential and address some of the challenges (and not just in relation to security – see previous post). Admittedly it does appear a little confused about the subject of data retention, stating that ‘current rules on digital records state that data should never be stored for longer than necessary because they might be misused or inadvertently released’. It then goes on to state that ‘in future it is more likely that companies will be required to retain all digital files, and ensure their accuracy, rather than to delete them’ – a vision of the future likely to strike fear into every records manager’s heart. There are some obvious flaws in this logic, in the EU at least, where current data protection laws prevent this in relation to personal data; and elsewhere the Economist itself draws attention to the problems that storing such massive amounts of data is causing to the existing technical and resource infrastructure that Google et al rely on, which would seem to favour a more selective approach to data retention on pragmatic grounds if nothing else. But whether such concerns are considered enough to stop the ‘let’s keep and exploit everything’ bandwagon which lies behind much of this report is at best debatable and at worst, I suspect, distinctly unlikely.

Wednesday 24 February 2010

Information Management: the forgotten issue of the cloud

There was an interesting supplement on Cloud Computing from MediaPlanet within this Saturday’s Daily Telegraph (OK, I know it’s now Wednesday, but it takes me most of the week to wade through the weekend papers!).
The supplement – of which I can sadly find no English-language online version – appears to be aimed at a senior management audience and is deliberately light on technical detail, choosing to focus more on the benefits to the organisation which moving towards cloud-based computing can bring (institutional agility, flexibility and cost saving seem to be the main arguments in favour). It also includes ‘5 steps to making the most of cloud’, which are:

1. See the possibilities
2. Consider security
3. Use it to your advantage
4. Push the boundaries
5. Consider logistics

It would be hard to disagree with any of these, but it’s steps 2 and 5 which interested me most. For whilst steps 1, 3 and 4 (and, indeed, the rest of the content in the supplement) are designed to articulate the advantages and to push the potential, these two steps are designed to sound a note of caution and to instil the need for a careful, managed approach to the risks involved.

But if you were to rely solely on this supplement for guidance you’d be forgiven for assuming that data security should be your only concern when adopting a cloud-based computing environment (especially as the ‘logistics’ which Step 5 encourages you to consider relate to issues of security and mobile devices, making it, in effect, just an extension of Step 2: Consider security).

Aside from a passing mention of data protection and the potential need for some organisations to keep certain data within ‘certain geographic boundaries’ (which I’m assuming is again essentially related to the requirements of the Data Protection Act), what is entirely missing is any appreciation of the information management implications of moving data to the cloud. There is no acknowledgement of the need to ensure that current levels of records and information management control, say in relation to resource discovery or retention, are continued into the cloud; nor any recognition of the potential problems of ensuring that this is so.

Interestingly, some of the issues which may come to the surface if these concerns are ignored are obliquely and inadvertently acknowledged. For example, the point is made that in the cloud you pay as you consume, but it is not taken to its logical conclusion: that it therefore pays to know exactly what information you still need to store (and pay for) and what can safely be destroyed (see the sketch below). Likewise, the point is made that one of the biggest advantages foreseen for the ‘G Cloud’ (the UK Government Application Store which is currently being trialled) “could be allowing departments to share non-sensitive data so paper work is reduced and processes sped up”, but no consideration is given as to how ‘sensitive’ and ‘non-sensitive’ data might be appropriately identified and controlled within the cloud.
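To put some (entirely invented) numbers against the pay-as-you-consume point, here is a back-of-envelope sketch in Python of what routine retention-based disposal might save; the price and volumes are hypothetical placeholders, not any provider’s actual tariff.

```python
# Hypothetical pay-as-you-consume storage pricing (per GB per month).
PRICE_PER_GB_MONTH = 0.10

total_stored_gb = 50_000    # everything the organisation holds
past_retention_gb = 18_000  # portion already past its retention period

keep_everything = total_stored_gb * PRICE_PER_GB_MONTH * 12
with_disposal = (total_stored_gb - past_retention_gb) * PRICE_PER_GB_MONTH * 12

print(f"Annual cost, keep everything: £{keep_everything:,.0f}")
print(f"Annual cost, after disposal:  £{with_disposal:,.0f}")
print(f"Saving from knowing what to destroy: £{keep_everything - with_disposal:,.0f}")
```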

On a more positive note, Mark Taylor from Microsoft draws attention to the need for increasing standardisation, so that the cloud ‘runs along the same principles and business models no matter who is managing the hosting’. Might the development of such standards and interoperability offer a potential means by which a single management layer could be placed on top of the cloud, allowing organisations to manage their information consistently wherever it happens to reside? And in doing so, might it help address some of the information management issues which this supplement failed to acknowledge?
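As a thought experiment only, here is a minimal Python sketch of what such a management layer might look like: one retention policy applied through a common interface, with each provider sitting behind its own adapter. Every class and method name here is hypothetical – no such standard API yet exists, which is precisely why the standardisation Taylor calls for matters.

```python
from abc import ABC, abstractmethod
from datetime import date

class CloudStore(ABC):
    """Hypothetical provider-agnostic interface: each cloud service
    exposes the same minimal operations to the management layer."""

    @abstractmethod
    def list_records(self):
        """Yield (record_id, created) pairs."""

    @abstractmethod
    def delete(self, record_id):
        """Dispose of a record held by this provider."""

class ExampleProvider(CloudStore):
    """Stand-in adapter; a real one would wrap a provider's own API."""

    def __init__(self, records):
        self._records = dict(records)

    def list_records(self):
        yield from self._records.items()

    def delete(self, record_id):
        del self._records[record_id]

def apply_retention(stores, keep_years, today=None):
    """One policy, applied uniformly wherever the information resides."""
    today = today or date.today()
    for store in stores:
        for record_id, created in list(store.list_records()):
            if (today - created).days > keep_years * 365:
                store.delete(record_id)

store = ExampleProvider({"r1": date(2003, 1, 1), "r2": date(2009, 6, 1)})
apply_retention([store], keep_years=6, today=date(2010, 2, 24))
print(list(store.list_records()))  # r1 disposed of, r2 retained
```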