
Research Data Management Support

University resources

Live projects: The University of Huddersfield has an institutional subscription to Box, which provides secure, GDPR-compliant storage for live projects. Files are guaranteed to be stored on servers located within the UK. Box also allows secure collaboration with both internal and external collaborators. Individual files are limited to 100GB, but there is no limit on the overall size of a dataset. Box does not have a dedicated backup function, so once you are set up in Box it is good practice to create your own subfolder entitled “Backup” and periodically back up data to this folder. Box keeps a version history from which you can recover the last 100 versions of a file. Deleted items go to a recycle bin, and items in the recycle bin are deleted automatically after 30 days. The institutional Box account is managed by Research, Innovation and Knowledge Exchange (RIKE). For questions and access, please contact oa@hud.ac.uk.
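The periodic backup into a “Backup” subfolder could be scripted rather than done by hand. The following is a minimal sketch only, assuming the Box project folder is available as an ordinary local folder (for example through a desktop sync client); the function name `backup_project` and the timestamped snapshot layout are illustrative choices, not part of Box or of University guidance.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_project(project_dir: Path, backup_name: str = "Backup") -> Path:
    """Copy the contents of project_dir into Backup/<timestamp>/ and return that path.

    Assumption: project_dir is a locally synced copy of the Box project folder.
    """
    backup_root = project_dir / backup_name
    snapshot = backup_root / datetime.now().strftime("%Y-%m-%d_%H%M%S")
    snapshot.mkdir(parents=True)

    # List the folder once, then copy everything except the Backup folder itself.
    for item in list(project_dir.iterdir()):
        if item == backup_root:
            continue  # never copy the Backup folder into itself
        if item.is_dir():
            shutil.copytree(item, snapshot / item.name)
        else:
            shutil.copy2(item, snapshot / item.name)  # copy2 keeps timestamps
    return snapshot


# Example call (hypothetical path):
# backup_project(Path.home() / "Box" / "my-live-project")
```

Each run creates a new dated snapshot, so older backups are kept until you remove them. Snapshots count towards the size of the folder, so it may be worth clearing out old ones from time to time.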

Sharing data publicly

If your research data management plan stated that your research data would be shared publicly using Box and Pure, this can be arranged. However, going forward, the preferred option for most research projects will be to use a publicly available third-party data repository, provided the repository meets the FAIR Principles: Findable, Accessible, Interoperable, Reusable. UKRI-funded research projects are subject to specific requirements. There are hundreds of data repositories, including discipline-specific, funder-recommended, journal/publisher-recommended and general-purpose repositories.

Examples include:

Whether you choose the institutional option or a third-party option, a record for the dataset should be made in Pure.

For help choosing the correct data repository to share your research data, including meeting funder requirements if applicable, please contact oa@hud.ac.uk. You can also find further guidance on the Digital Curation Centre website.

What to consider before sharing research data publicly

Increasingly, scholarly journals insist that underlying research data be shared publicly and that a link to the dataset be provided in the manuscript, so that it can be checked during the peer-review process. Some journals insist that the data be deposited in their own data repository if you wish to publish with them. Please check this requirement, and whether the repository meets the FAIR Principles, before submitting a manuscript.

The first section below lists items to consider before sharing research data publicly. The next section lists fields required when creating a metadata record in Pure for your dataset:

  • Is the research connected to this dataset being funded by a grant?
    • If so, what funding body?
    • Funding bodies often have their own dataset repositories and their own requirements. If you have answered ‘yes’ to this question, please contact oa@hud.ac.uk for more information.
  • Do you have the rights to make the data open? Or have you received permission from rights-holders? This may be important if you are using data from third parties.
    • If you are going to make the dataset open, what license will you use? Many datasets use Creative Commons (CC) licenses, but there are others to choose from that may be more appropriate to software, code etc. For further detail, read this guidance provided by OpenAIRE.
  • Has the dataset already been published via another repository (does your dataset already have a DOI)?
    • If so, please ensure you create a metadata record for the dataset in Pure and link to the other repository.

 

  • Have you considered what data should be shared and what shouldn’t?
    • Not all data collected should or needs to be shared. Keeping in mind all of the considerations listed here, what the original ethics application specified could be made publicly available, and what consent participants gave (if applicable) and how that consent is to be managed, consider what data should or should not be shared and why. This can then form part of your README.txt file describing the data (see below for more information on README.txt files).
  • Is your data sufficiently anonymised?
    • Data can include text, images, videos and many other formats, which may need consideration in terms of anonymising.
    • Care should be taken when dealing with personal or sensitive data. If the data cannot be sufficiently anonymised, or if the details are so sensitive (for example interview transcripts with personal or identifiable details) that they are not able to be shared, it is good practice to include an explanation as to why the dataset is closed or should be closed. This explanation can be included in research data management plans, grant bids and publications. Although some data cannot be shared, this helps with transparency.
  • View the ICO definition of personal data and the ICO anonymisation code of practice. 

 

    • Have you created a .txt file entitled ‘README’ as part of your dataset?
      • If not, please create this file and include as much information as possible: what data is in each file, the methodology, definitions of spreadsheet columns if they are not already defined, and any other useful information about how to use and interpret the information in each file.
    • Have you structured your data and labelled it in a consistent manner?
      • Make sure files are labelled as explicitly as possible.
    • Have you considered the best file formats?
      • Your file(s) should be future-proof and, if possible, not dependent on proprietary software formats. Commonly recommended file types include .csv, .txt, .xml, .tiff, .mp4, .jp2 and .pdf; for a full list contact oa@hud.ac.uk.
      • .sav files (SPSS) should be converted to .csv or should have a .csv version deposited alongside the original.
      • .ltx and .latex (LaTeX) files should have .pdf versions deposited alongside the original.
      • .rtf (Rich Text) files should have .pdf or plain text versions deposited alongside the original.
      • .gif, .psd, .pdd, .bmp and other image files should be converted to .tiff files.
      • .mp4 is the recommended audio file type.
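The README and file-format points in the list above could be started with a short script. This is a sketch under stated assumptions: the function name `draft_readme`, the “TODO” placeholders and the layout of the generated text are illustrative, and the set of extensions is simply the commonly recommended types named above rather than the full list. It produces a draft for you to complete by hand.

```python
import csv
from pathlib import Path

# Commonly recommended extensions named in the guidance above (not the full list).
RECOMMENDED = {".csv", ".txt", ".xml", ".tiff", ".mp4", ".jp2", ".pdf"}


def draft_readme(dataset_dir: Path) -> str:
    """Return draft README text listing every file and the columns of each .csv."""
    lines = ["README", "", "Methodology: TODO", "", "Files:"]
    files = sorted(p for p in dataset_dir.rglob("*") if p.is_file())
    for path in files:
        if path.name == "README.txt":
            continue  # do not describe the README inside itself
        rel = path.relative_to(dataset_dir).as_posix()
        suffix = path.suffix.lower()
        # Flag anything outside the recommended set for a manual check.
        note = "" if suffix in RECOMMENDED else "  [check file format]"
        lines.append(f"- {rel}: TODO describe contents{note}")
        if suffix == ".csv":
            # Assumes the first row of each .csv holds the column names.
            with path.open(newline="", encoding="utf-8") as handle:
                header = next(csv.reader(handle), [])
            for column in header:
                lines.append(f"    - column '{column}': TODO define")
    return "\n".join(lines) + "\n"


# Example call (hypothetical folder name):
# Path("my-dataset/README.txt").write_text(draft_readme(Path("my-dataset")), encoding="utf-8")
```

The draft only lists what is there. The descriptions, the methodology and the column definitions still need to be written by someone who knows the data.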

    For datasets comprising more than 200 files:

    • .zip is the recommended compressed format, although if possible it is preferable to submit individual files to allow for easier navigation by users.
    • Note to Mac users: .zip files larger than 4 GB created with the built-in Mac zip functionality sometimes cannot be opened on other platforms. Mac users may therefore wish to use an alternative such as GNU zip (gzip) when compressing archives of more than 200 files.
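One further option for bundling a large dataset is Python's standard `zipfile` module, which can write the ZIP64 extensions used for large archives. The sketch below is illustrative only: the function name `zip_dataset` is hypothetical, and whether an archive made this way avoids the problem described above is an assumption, so test it by opening the archive on another platform before depositing.

```python
import zipfile
from pathlib import Path


def zip_dataset(dataset_dir: Path, archive_path: Path) -> int:
    """Write every file under dataset_dir into archive_path; return the file count."""
    count = 0
    with zipfile.ZipFile(
        archive_path, "w", compression=zipfile.ZIP_DEFLATED, allowZip64=True
    ) as archive:
        for path in sorted(dataset_dir.rglob("*")):
            # Skip folders, and skip the archive itself if it sits inside the dataset.
            if path.is_file() and path.resolve() != archive_path.resolve():
                # Store paths relative to the dataset folder so the structure is kept.
                archive.write(path, arcname=path.relative_to(dataset_dir).as_posix())
                count += 1
    return count


# Example call (hypothetical names):
# zip_dataset(Path("my-dataset"), Path("my-dataset.zip"))
```

The folder structure is kept inside the archive, which helps users navigate it once unpacked.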

Metadata fields for Pure dataset records

  • Title
  • Description
  • Date of data production (either a specific date or a period of time with a start and end date)
  • Author/contributor names
  • Name of your department
  • Ethics committee name and document approval number/reference (where it exists)
  • Has the dataset been published via another repository (does your dataset already have a DOI?) or will you be depositing this dataset with the University of Huddersfield for the first time?
  • What dates does the dataset cover? For example, a dataset might contain information about livestock purchases between 1850 and 1900.
  • Do you want your dataset to be made open?
    • If so, what license would you like to assign to your dataset? Many datasets use Creative Commons (CC) licenses, but there are other types of licenses that may be more appropriate for software, code etc. Read this guidance by OpenAIRE for further information. If you have questions about licenses, please contact oa@hud.ac.uk.
  • Would you like your dataset to be embargoed for a period of time, then be made open once the embargo period has ended?
    • If so, what license would you like to assign to your dataset once the embargo period is over?
    • How long would you like the embargo period to be?
  • Optional: Does your data correspond to a particular geographical point or area?
    • If so, you can specify coordinates or a range of coordinates.
  • Are there any research outputs, activities, funded projects, press/media etc that this dataset is related to?
    • If so, please ensure these records are in Pure. We can then ‘relate’ the records so that if a user clicks on one, they can see that other items are related to it. This makes it easier to see all the items connected to the research.
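It may help to gather the fields listed above in one place before contacting the team. The following sketch models them as a simple record with a few consistency checks. It is illustrative only: the class `DatasetRecord`, its field names and the checks are hypothetical and do not reflect how Pure stores metadata.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical container for the metadata fields listed above."""
    title: str
    description: str
    production_start: date
    contributors: List[str]
    department: str
    production_end: Optional[date] = None   # None if produced on a single date
    ethics_reference: Optional[str] = None  # where it exists
    existing_doi: Optional[str] = None      # set if already published elsewhere
    make_open: bool = False
    license: Optional[str] = None           # e.g. a Creative Commons license name
    embargo_until: Optional[date] = None    # date on which the dataset becomes open

    def problems(self) -> List[str]:
        """Return a list of gaps to resolve before the record is created."""
        issues = []
        if not self.title.strip():
            issues.append("title is empty")
        if not self.description.strip():
            issues.append("description is empty")
        if not self.contributors:
            issues.append("no author/contributor names given")
        if self.production_end and self.production_end < self.production_start:
            issues.append("production end date is before the start date")
        # Open and embargoed datasets both need a license choice.
        if (self.make_open or self.embargo_until) and not self.license:
            issues.append("a license is needed for open or embargoed datasets")
        return issues
```

An empty list from `problems()` means only that these basic checks passed. It does not mean that the record is complete.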

Advice and support regarding research data management

Email oa@hud.ac.uk with any questions or requests regarding research data management.
