Thing 10: Sharing sensitive data
Sharing sensitive data requires careful consideration, but it can be done. Find out how.
- Getting started: If it’s so sensitive - how can it possibly be shared and published?!
- Learn more: Who are the “data gatekeepers”?
- Challenge me: Make me anonymous
Getting started
Sensitive data can be shared!
Major, familiar categories of sensitive data are human data (e.g. health and personal data, or data about secret or sacred practices) and ecological data (which may place vulnerable species at risk).
Given the nature of this type of data, you might expect that it can’t be shared and reused. But in many cases, it can be.
1. Explore one of these examples of published sensitive data:
- The Pregnancy and Lifestyle dataset shows how sensitive, de-identified data can be safely and openly shared.
- This one-page story tells how sensitive data from the Australian Longitudinal Study on Women’s Health has been successfully published for almost 20 years.
2. How do you share and publish sensitive data?
- Scan the ANDS sensitive data webpage.
- If you have time: follow a couple of the links on the sensitive data page which are of particular interest to you.
Consider:
Imagine you are either a researcher or a participant in a health data survey:
- Participant: What questions might you first ask the researcher about intended sharing and reuse of the survey data?
- Researcher: What responses would you need to prepare to anticipate participants’ questions about publishing “their data for all the world to see”?
Learn more
The ethics of sensitive data
Two issues are critical to making sure sensitive data can be shared: how we manage the data through its lifecycle, and who has a role in ensuring it is appropriately managed and shared.
- Explore the roles that different people in an institution can play when considering the ethical framework for sharing research data.
- Check out the landing page for the Open Data Institute’s Data Ethics Canvas and read the questions on the data ethics canvas.
If you have time: Read the user guide, and find out how the data ethics canvas can be used as a framework for a discussion or workshop.
Consider: Try answering two of the squares in the data ethics canvas based on a research project you are familiar with.
Challenge me
Anonymising sensitive data
De-identification (also called anonymisation or confidentialisation in some cases) is most commonly undertaken to protect the privacy of individuals. It aims to allow data to be used by others without the possibility of individuals being identified. De-identification may also be used to protect organisations, such as businesses, or other information, such as the spatial location of mineral or archaeological findings or of endangered species.
- Consider the different techniques required to de-identify quantitative and qualitative data. The UK Data Service has information on anonymisation of both.
- Explore some tools and resources for practical advice on how to de-identify data.
Consider: What types of data are there for which de-identification might not be needed or appropriate?
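If you would like to try this hands-on: the short Python sketch below illustrates a few common techniques for quantitative data. It removes direct identifiers, replaces the participant ID with a keyed hash, groups ages into bands, and coarsens postcodes, then counts how many records share the rarest combination of the remaining quasi-identifiers. The field names, the sample records, the secret key and the four-digit postcode format are all assumptions made for illustration, not part of any real dataset. Treat it as a starting point for discussion rather than a complete method: real de-identification also needs a disclosure risk assessment.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical field names and key, chosen only for this illustration.
DIRECT_IDENTIFIERS = {"name", "email"}
SECRET_KEY = b"replace-with-a-secret-kept-separate-from-the-data"


def pseudonymise(participant_id: str) -> str:
    """Replace an ID with a keyed hash: records stay linkable, IDs are hidden."""
    digest = hmac.new(SECRET_KEY, participant_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


def age_band(age: int, width: int = 10) -> str:
    """Generalise an exact age into a band such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def deidentify(record: dict) -> dict:
    """Drop direct identifiers and coarsen quasi-identifiers in one record."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["participant_id"] = pseudonymise(str(record["participant_id"]))
    cleaned["age"] = age_band(record["age"])
    # Keep only the first digit of the postcode (a coarse region).
    cleaned["postcode"] = str(record["postcode"])[:1] + "xxx"
    return cleaned


def smallest_group(records: list, quasi_identifiers: tuple) -> int:
    """Size of the rarest quasi-identifier combination (the 'k' in k-anonymity)."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(counts.values())


# Made-up survey records, for illustration only.
survey = [
    {"participant_id": 1, "name": "A. Example", "email": "a@example.org",
     "age": 34, "postcode": "2601", "smoker": False},
    {"participant_id": 2, "name": "B. Example", "email": "b@example.org",
     "age": 37, "postcode": "2612", "smoker": True},
    {"participant_id": 3, "name": "C. Example", "email": "c@example.org",
     "age": 52, "postcode": "3000", "smoker": False},
    {"participant_id": 4, "name": "D. Example", "email": "d@example.org",
     "age": 58, "postcode": "3141", "smoker": False},
]

released = [deidentify(r) for r in survey]
k = smallest_group(released, ("age", "postcode"))
print(released[0])
print("Smallest group size:", k)
```

A small group size means some participants are still easy to single out, so you might widen the age bands or coarsen the region further before sharing. Qualitative data such as interview transcripts usually needs a different approach, for example careful manual editing of names and places.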
Do you have a question? Want to share a resource?
- Post to the Data Librarians Slack group to connect with the community.
- Tweet to @ardc_au using hashtag #23things
Keep on going to the next thing: What's my metadata schema? or return to all the things