UKOLN Informatics Research Group » From REDm-MED
http://irg.ukoln.ac.uk
Expertise in digital information management
Mon, 09 Dec 2013 15:09:09 +0000

End of REDm-MED Project
http://blogs.bath.ac.uk/redm-med/2013/04/18/end-of-redm-med-project/
Thu, 18 Apr 2013 16:37:07 +0000, Alex Ball

As you may have gathered from an earlier post, the REDm-MED project successfully concluded in June 2012. As a result, the time has come to close the blog; it will still be around for reference but there won’t be any new posts or comments. You can still see a full list of project outputs on the REDm-MED Web page, while RAIDmap lives on at SourceForge. I should finally point out that we handed over the baton to our longer-running sister project Research360, so if you want to know what happened next, that’s the place to look.

Goodbye!

RAIDmap documentation on SourceForge
http://blogs.bath.ac.uk/redm-med/2013/04/18/raidmap-documentation-on-sourceforge/
Thu, 18 Apr 2013 16:15:01 +0000, Alex Ball

The RAIDmap Application User Guide and RAIDmap Application Developer Guide have been converted and can now be read on the RAIDmap wiki on SourceForge. In the long term, this will make it easier to update the documentation when changes are made to the software. If anyone is interested in developing RAIDmap further, please just put in a request in the usual way through SourceForge.

RAIDmap: coming to a computer near you
http://blogs.bath.ac.uk/redm-med/2012/06/26/raidmap-coming-to-a-computer-near-you/
Tue, 26 Jun 2012 09:36:45 +0000, Alex Ball

The wait is finally over! RAIDmap, the piece of software we have been developing in REDm-MED, is now available for early adoption from SourceForge. RAIDmap is an adaptation of the Compendium information mapping software, tailored for use as a data documentation tool. It uses the National Library of New Zealand’s Metadata Extractor to collect information about data records, provides tools for mapping out the associations between them, and can export this information in a handful of different formats.

Truth be told, RAIDmap has been up on SourceForge for a little while now, as we’ve been tidying it up for release and making sure everything works as advertised. Not perfectly, you understand, but as advertised: there are a few bugs still lurking in the system, and plenty of scope for improvement. We see the delivery of the tool as the beginning rather than an end to it, and on that note we would like to put out a special plea to any developers out there to have a look at the code and see if this is a tool you might like to contribute to. If so, you’ll be interested in the fifth and final part of REDm-MED Deliverable 5, the RAIDmap Application Developer Guide. This explains how to get started compiling and developing RAIDmap, how the installers are generated, and gives an overview of which bits of the code do what. We think RAIDmap has the potential to be a useful data management tool, but it needs just a bit more TLC than we have time or funds to give it, so any help would be greatly appreciated.

With the delivery of the software and the Developer Guide, that pretty much brings us to the end of the REDm-MED project, but in the manner of an album bonus track, there’s one more report to reveal. (Don’t get too excited.) The Minimum Mandatory Metadata Set for RAIDmap specifies the metadata collected by the RAIDmap tool, and explains the reasons why those particular elements were chosen. For those in a hurry, the short version is that we took from PREMIS and the DataCite Metadata Schema those elements which were easiest to supply at the point of record creation (or thereabouts) rather than at the point of ingest into a repository. We hope the report and its approach may be of use to institutions who are also considering what metadata to collect about research datasets.

Deliverables behaving just like Buses; arriving all at once!
http://blogs.bath.ac.uk/redm-med/2012/06/08/deliverables-behaving-just-like-buses-arriving-all-at-once/
Fri, 08 Jun 2012 12:26:26 +0000, Mansur Darlington

As promised by my esteemed colleague Alex Ball, here are some more REDm-MED deliverables, arriving not quite in numerical order.

As Alex said, Deliverable 3, A Research Data Management Plan for Engineering Research, is a generic departmental data management plan based on Deliverable 2.

And then we have the first four parts (yes, really) of Deliverable 5. Parts 1, 2 & 3 are all tools which are associated with, and help project-level implementation of, the ME-RDMP (Deliverable 2) and which, incidentally, should also be considered adjuncts to Deliverable 3. These are:

Deliverable 5, part 4 is the RAIDmap Application User Guide, which should whet your appetite for Deliverable 6, the RAIDmap application itself. Sorry, you’ll have to be patient just a little longer for this, as you will for the last part of Deliverable 5.

Happy Reading, folks!

Deploy along with REDm-MED
http://blogs.bath.ac.uk/redm-med/2012/06/01/deploy-along-with-redm-med/
Fri, 01 Jun 2012 15:35:17 +0000, Alex Ball

Anyone who has done some professional writing will know that the order in which one writes things is not necessarily the order in which those things should eventually be read. I often find it easier to leave writing an introduction until I have made some headway with the body of a report. So it is that the third deliverable from the REDm-MED Project will not be, as one might expect, Deliverable 3 (a generic departmental data management plan based on Deliverable 2). Instead, I give you Deliverable 4: Infrastructure Supporting a Research Data Management Plan for the Department of Mechanical Engineering, University of Bath. This document sets out in a concise and hopefully clear way the components of the infrastructure proposed by the Project. These include guidance documents, templates, tools, storage areas and some of the rôles and responsibilities of the people involved.

While I am on the subject, I should point out that the remaining deliverables will also arrive in a somewhat eccentric order. You can expect seven deliverables in total, of which the last to arrive will be Deliverable 6 and Deliverable 5 Part 5 (yes, you read that right). But that’s enough spoilers for one post…

Demonstrating RAIDmap
http://blogs.bath.ac.uk/redm-med/2012/04/20/demonstrating-raidmap/
Fri, 20 Apr 2012 17:22:17 +0000, Alex Ball

On 23 March 2012, Uday Thangarajah and I gave a demonstration of the RAIDmap tool at a JISC workshop entitled ‘Meeting (Disciplinary) Challenges in Research Data Management Planning’. The slides and script I prepared for this demonstration are now available.

RAIDmap is a software tool we have been developing that provides a simple and intuitive way of recording information about research records and how they all fit together. The diagrams it produces map out the course of a research activity in terms of the records that it has left behind (or not); underlying the map are metadata about each of the records, some of them collected automatically. The genesis of the mapping technique is explained in an output of the ERIM Project.

The demonstration itself was done live using an unfinished version of the software, so there isn’t a copy I can show you. We will, however, be making the software available at the end of the project, so you should be able to give it a try yourself before too long.

A Departmental Data Management Plan
http://blogs.bath.ac.uk/redm-med/2012/03/16/a-departmental-data-management-plan/
Fri, 16 Mar 2012 16:56:08 +0000, Alex Ball

The REDm-MED Project has produced its second deliverable: A Research Data Management Plan for the Department of Mechanical Engineering, University of Bath. This document begins the work of satisfying the Requirements Specification that formed the Project’s first deliverable.

The plan has two main sections. The first explains how to use data management plans at the project level: where they should be kept, how ‘public’ they should be, what is expected in terms of review and revision, and so on. The second provides a template to assist principal investigators and researchers in writing a project data management plan. Both sections include recommendations of tools for planning and performing data management, and give pointers to more detailed advice.

The recommendations were probably the hardest part of the plan to write. Despite reviewing the landscape, we did not find suitable tools for all the tasks that required them. Part of the problem was that some infrastructure and guidance needs to be provided at the institutional level, rather than the departmental or national level, so one consequence of writing this plan was a set of recommendations for our sister project, Research360.

Requirements for Data Management Planning
http://blogs.bath.ac.uk/redm-med/2012/03/05/requirements-for-dmp/
Mon, 05 Mar 2012 11:17:50 +0000, Alex Ball

The REDm-MED Project has produced its first deliverable: the Research Data Management Plan Requirements Specification for the Department of Mechanical Engineering, University of Bath. The meat of the document is a table that lists a series of requirements for research data management, and for each one provides

  • the rationale for the requirement,
  • the rôle (principal investigator, researcher, data manager) supported by the requirement,
  • the level (institutional, departmental, project) at which the requirement should be met,
  • information or resources that could help meet the requirement, and
  • validation of the requirement.

As the document explains, the requirements were derived from work conducted by the ERIM Project and the IDMB Project, in consultation with a panel of researchers and academics from the department. The validation for the requirements came from several sources, most notably the checklist underlying the DCC’s DMP Online tool, and Crowston and Qin’s Capability Maturity Model for Scientific Data Management. The latter came to our attention during a review of data management lifecycle models that we conducted.

Sharing is so Nice
http://blogs.bath.ac.uk/redm-med/2012/02/28/sharing-is-so-nice/
Tue, 28 Feb 2012 13:00:04 +0000, Mansur Darlington

Some quite nice stories are beginning to emerge from the Department of Mechanical Engineering related to the benefits (and to a lesser extent difficulties) of sharing data. This is of interest because non-data-sharing seems to be the current default. Here are a couple from the horse’s mouth.

Two data sets, the core of which were in the first instance made available commercially, are used as part of a software tool being used as a modeller in environmental energy research. As the research develops the local data set is updated and amended. The data sets are being used currently to support research in three disparate disciplines within the same university and have been used to support the research of 14 individual researchers. In addition four or so MEng student projects every year and some MSc student projects have used the data. At the same time, these data are being used by six major multi-institution research projects. That’s a lot of data sharing!

Notwithstanding this, the software doesn’t support easy data sharing amongst team members, and manual extraction and addition of data elements into the data sets being used by others has been resorted to. It has also been reported that data made available to one other colleague within the research community was subsequently published without appropriate acknowledgement of the source. Clearly, then, sharing data is neither effort-free nor risk-free.

Another project here relies on large high-speed video files for kinematic analysis, being sent (usually by memory stick or hard-disk) to one project partner university. The videos require signal processing and the data is sent back – as smaller files – for further analysis. Also processed are large numbers of image files for flow visualisation, these being twinned high-resolution images. They are stored and analysed locally, with output vector field data shared and sent to partners online via a wiki page egroupware. When too large for this, they are uploaded to an ftp server at the project co-ordinator’s lab in Estonia; the data are then available to all five project partners. The partners also provide returned analysis, usually via email, or via hard drives (sent or handed over personally at meetings). The data generated here are re-used by three other project partners, and one of them returns data for further analysis.

The locally held data sets for this project (that is, those at the University of Bath) include 1.5 Terabytes of video, 1.5-plus Terabytes of digital particle image velocimetry images with an additional 1 Terabyte on a computer hard drive. There is also a 1.5 Terabyte external hard disk drive for further cropped or edited data. The size of these data sets has posed problems for current infrastructure, since the space demands are far higher than the space on offer, especially where local sharing between researchers has been desirable. External hard drives are cumbersome, and difficulties arise when the external hard drives and/or the main computer are being used by others to secure immediate access to data sets. The large size of the files has caused sharing difficulties, as well as difficulties in file naming and notation for tracking data sets sent between partners.

It is clear from these two stories that data sharing is already part of common practice in some engineering research activities and that it can greatly facilitate research. It is also clear that, in the fullness of time, much more robust and easily usable methods of local and more remote data sharing are required, and ones that promote exchange, manipulation and security. Additionally, the development and adoption of methods of contextualizing data and data sets through the use of metadata is clearly needed.

Lots of work to be done, then!

Scoping Data Management: inflation, inflation!
http://blogs.bath.ac.uk/redm-med/2012/02/02/scoping-data-management-inflation-inflation/
Thu, 02 Feb 2012 12:33:43 +0000, Mansur Darlington

I recently asked a PI how much data space he thought he and his team of researchers might need over, say, the next two years. Based on his current research, he thought perhaps 1-2 GB per project would be about right on average, although he did have one anomalous project, with video records as its raw data, for which he thought about 50 GB would be ample. He then went on to say that predicting with any confidence how much research data would be generated by his research team over a longer time frame would be difficult since that would mean knowing what research they were doing, and that in turn would depend on what money was being put on the table, choices about bid writing, bid success, studentships on offer and so on.

I decided to pursue the ‘anomalous’ project by talking to the project researcher in question. This is what he said:

‘I originally recorded three main types of data: computer logging information, screen capture and webcam video, and mobile camera footage. Of these I generated (per week) about 1MB of logging data, about 26GB of screen capture and webcam footage and about 24GB of mobile footage.

‘I recorded data for three weeks (~150GB) and stored all of the originals for this period in addition to the converted files necessary for coding. I did not, however, store all the original files for the mobile video camera due to their size, approximately reducing them by half.

‘In addition to this I deleted a further nine weeks of data (~ 600-700GB) gathered during the participants’ acclimatisation period. Had the storage been available I would have kept and converted this data also, because there are a number of extremely relevant research questions that could be investigated based on it. This would have given an uncompressed total somewhere in the region of 1TB.’

Quite a difference, then, from the original 50 gigabytes guessed at by the PI.
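
For anyone who wants to check the sums, here is a rough sketch of the arithmetic behind the researcher's figures. It is my own illustration, not something the project produced, and it rests on assumptions: decimal units (1 MB = 0.001 GB), a constant recording rate over all twelve weeks, and original recordings only, with no converted copies counted. The names `WEEKLY_GB` and `raw_total_gb` are made up for the example.

```python
# Weekly volumes quoted by the researcher, in GB.
# Assumption: decimal units, so 1 MB is taken as 0.001 GB.
WEEKLY_GB = {
    "logging": 0.001,
    "screen capture and webcam": 26.0,
    "mobile footage": 24.0,
}

# The PI's original estimate for the whole project, in GB.
PI_GUESS_GB = 50.0


def raw_total_gb(weeks, weekly_gb=WEEKLY_GB):
    """Estimate the raw (original-file) volume for a number of recording weeks.

    Assumes the weekly rate stays constant, which the post does not confirm.
    """
    return weeks * sum(weekly_gb.values())


if __name__ == "__main__":
    kept_weeks = 3      # weeks of data that were stored
    deleted_weeks = 9   # acclimatisation weeks that were deleted

    kept = raw_total_gb(kept_weeks)
    everything = raw_total_gb(kept_weeks + deleted_weeks)

    print(f"Three recorded weeks: about {kept:.0f} GB")
    print(f"All twelve weeks: about {everything:.0f} GB")
    print(f"Ratio to the PI's guess: about {everything / PI_GUESS_GB:.0f}x")
```

On these assumptions the originals alone come to roughly 600 GB over twelve weeks, some twelve times the PI's guess. The researcher's own figure of around 1TB is higher still, presumably because converted files were kept alongside the originals, though he does not break that down.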

This raises a number of questions. The first is about the confidence with which one can ask questions at ‘one remove’ about data use and have faith in the answers, even when there is no real prediction involved. When prediction is involved, based on insufficient information, the quality of the answer is likely to be poor. If a question is motivated by the need to plan ahead, a ‘good’ answer is needed if good plans are to be made. Good data management planning requires good information: the inelegant dictum ‘garbage in, garbage out’ applies. We are currently being asked to answer many questions, or having ourselves to ask them, about future data management needs. How, then, do we get good information upon which to base our data management?

Equally important, though, is the question of information proliferation. Currently there are about 130 GB of data on file for the project in question. With good house-keeping (aka, delete-and-be-damned) there might, say, be 50 gigabytes. How much of the 1TB that might be collected by project end would have been kept if the principal driver were to maximize the amount of data available for re-use? Presumably all of it, together with additional contextualizing data to maximize its amenability to re-use. But of course, that’s not the end of it: this data will need to be ‘managed’ during the project in accordance with the data management policies in force, and managed thereafter during the period appropriate to its continued usefulness and research funder policy.

So my question is: ‘to what extent will our data storage needs and concomitant management effort be inflated by the act of formalizing research data management?’.

And: ‘Can the research budget afford it?’.

Your views, as ever, most welcome.
