Report on ‘Bring Your Own Data’ workshops on 8th & 9th March 2018

Following the well-attended seminar on ‘Managing, sharing and curating your research data in a digital environment’ on 6th March 2018, Sonia Barbosa (Manager of Data Curation, Institute of Quantitative Social Science Dataverse, Harvard University) and Danny Brooke (Dataverse Development Project Manager, Harvard University) conducted two ‘Bring Your Own Data’ workshops: one for soft sciences researchers and another for hard sciences researchers.

The ‘Bring Your Own Data’ workshops gave researchers hands-on experience in sharing their datasets on DR-NTU (Data).

Participants began by creating a sub-dataverse for themselves under their respective school/research centre and learnt that they could customise their sub-dataverses in the following ways to make them more user-friendly:

Customising the browse/search facets in their sub-dataverses to facilitate better browsing and discovery of their datasets
Customising a dataset template with the relevant fields to describe their research data

Sonia then explained what constitutes a high-quality dataset record and shared the following tips to ensure the visibility and reusability of datasets:

Description of the dataset record should be as comprehensive as possible, providing ample context for the research data files and stating whether they should be accessed in any particular order or with any particular software
Related publication’s citation (if any) should be included in the dataset record to allow other researchers to refer to the publication for more information
Code files, in addition to final research data files, should be included to allow other researchers to reproduce the data where possible
File names should be short but meaningful so that users will know what to expect when they access the files
Data tags should be used to label the data files for better organisation
Data files should be saved in open (i.e. software-agnostic) file formats where possible to ensure long-term accessibility
Sensitive data should be sufficiently de-identified before it is shared publicly
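
As an illustration of the last two tips, here is a minimal Python sketch of how a tabular file might be de-identified and saved as plain CSV (an open format) before upload. It was not shown at the workshops. The column names (participant_id, name, email), the salt value and the helper functions are hypothetical and would need to be adapted to your own data. Replacing an ID with a salted hash is pseudonymisation only; indirect identifiers such as age or location may still need to be generalised or removed, so treat this as a starting point and not a complete de-identification procedure.

```python
import csv
import hashlib
import io

# Hypothetical column names; adjust these to match your own dataset.
DIRECT_IDENTIFIERS = {"name", "email"}
ID_COLUMN = "participant_id"


def pseudonymise(value: str, salt: str) -> str:
    """Return a short salted SHA-256 digest to stand in for an identifier."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:12]


def deidentify_csv(src, dst, salt: str) -> None:
    """Copy CSV rows from src to dst, dropping direct identifiers
    and replacing the ID column with a pseudonym."""
    reader = csv.DictReader(src)
    kept = [c for c in reader.fieldnames if c not in DIRECT_IDENTIFIERS]
    writer = csv.DictWriter(dst, fieldnames=kept, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        out = {c: row[c] for c in kept}
        if ID_COLUMN in out:
            out[ID_COLUMN] = pseudonymise(out[ID_COLUMN], salt)
        writer.writerow(out)


# Small in-memory example; with real files, pass open file handles instead.
raw = io.StringIO(
    "participant_id,name,email,age,score\n"
    "P001,Alice Tan,alice@example.org,34,7\n"
)
clean = io.StringIO()
deidentify_csv(raw, clean, salt="keep-this-secret")
print(clean.getvalue())
```

The salt should be kept private and stored separately from the shared dataset, since anyone who knows it could recompute the pseudonyms from a list of candidate IDs.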

To wrap up the workshops, Sonia and Danny went through the questions that participants had posted on Slido (see below). We hope to incorporate as many of these questions as possible into our FAQ soon.

We hope that the workshop participants will continue to apply the best practices for sharing their datasets on DR-NTU (Data) demonstrated by Sonia and Danny!

Questions by participants:

How do you ensure data security? Is Dropbox a secure platform for intermediary data sharing (e.g. lab members enter data and update datasheets in Dropbox)?
Some types of data are not stable over time (e.g. reproducibility issues when R.
How do we handle hardcopy data (e.g. paper questionnaires, consent and demographic forms with sensitive information)?
Can the school repository be linked to OSF?
Is it possible to create connections to other published studies or projects by international collaborators? Or is the school repository limited to sharing between NTU researchers? How easy is it for international collaborators to use the system?
Possible ways to make external storage devices safer? Currently my lab has a Synology hard drive where all data backup is done (i.e. we transfer data from thumbdrives or portable hard disks to the lab hard drive). It has a few layers of password protection, but still I worry about security.
Much of the information required in the DMP was detailed in the IRB. Are they considered equivalent?
How to de-identify data? Any guidelines? Especially if we have demographic info and various info about the experimental settings?
10-year retention for data: what about sensitive data? (fear of theft etc.)
Possible to make amendments to DMP after submission? (like how IRB allows amendments)
Can we still use the repository after graduating?
Is the repository safe from hackers? Are librarians/curators able to access all our research data?
Can a DOI ever be eliminated, e.g. if you accidentally put up the wrong data and don't want it to be up there?
If I have 200+ data files, is there a way to batch upload them? Or is uploading them as .tar / .zip the only option?
Dataset within a dataset?
Is it possible for the data provider to delete the data after I have used it and cited it in my paper?
Can files be previewed inside a dataverse or a dataset for integrity checks, without downloading them?

Workshop for data producers (hard sciences) photos

Workshop for data producers (soft sciences) photos

About The Author

Chew Shu Wen
