RESEARCH DATA MANAGEMENT PRACTICES OF RESEARCHERS IN HIGHER EDUCATION INSTITUTIONS IN MALAWI A study submitted in partial fulfillment of the requirements for the degree of Master of Science in Digital Library Management at THE UNIVERSITY OF SHEFFIELD by THOMAS MPHATSO BELLO September 2014


RESEARCH DATA MANAGEMENT PRACTICES OF RESEARCHERS IN

HIGHER EDUCATION INSTITUTIONS IN MALAWI

A study submitted in partial fulfillment

of the requirements for the degree of

Master of Science in Digital Library Management

at

THE UNIVERSITY OF SHEFFIELD

by

THOMAS MPHATSO BELLO

September 2014


Abstract

Background. Research funders’ policies on open access to data are influencing how research data are being managed and shared in HEIs.

Aims. This study aimed to assess the current data management practices of researchers in Malawi’s institutions of higher learning and their perceptions towards sharing.

Methods. A web-based survey instrument was used to collect data on the attributes of the data being produced and on data management practices. It was based on the DAF methodology developed by DCC. The link to the questionnaire was emailed to contact people in Malawi’s universities, who forwarded it to researchers in their institutions. Reminders were sent two weeks and one week before the close of the survey period. The survey attracted a total of 34 respondents.

Results. Researchers in Malawi are collecting various types of data in different formats, some of which are more discipline-specific than others. The volumes being produced range from a few megabytes to the 50GB - 100GB region. The overwhelming majority of researchers manage their research data on their own. Storage and backup of the data are mostly done using laptop hard disc drives, external drives and memory sticks. Backups are made as and when the researchers see fit. In general, the researchers support the idea of data sharing but do not appear to be enthusiastic about sharing their own data. Most of them do not have data management plans.

Conclusions. The findings of this study suggest that although research in Malawi’s academia has been going on for some time and generating data of varying types, formats and volumes, the data are managed in a risky manner. However, the extent of the risk is not clear, calling for physical data audits and more in-depth face-to-face interviews. Detailed analyses of particular disciplines would be useful to establish discipline-specific approaches and attitudes to research data management, so that effective support and infrastructure can be provided to researchers.

Word count of Abstract: 322


Table of contents

Abstract
List of figures
List of tables
Acknowledgements

CHAPTER ONE: INTRODUCTION
1.0 Introduction and context
1.1 Research aims and objectives
1.1.1 Aim of study
1.1.2 Specific objectives
1.1.3 Research questions
1.1.4 Significance

CHAPTER TWO: LITERATURE REVIEW
2.0 Introduction
2.1 Funder’s requirements
2.2 Academic librarians and research data support
2.3 Researchers’ data management practices
2.3.1 Storage
2.3.2 Data policies:
2.3.3 Long-term preservation
2.3.4 Training/advice
2.4 State of RDM in Africa
2.5 Conclusion

CHAPTER III: RESEARCH METHODOLOGY
3.0 Introduction
3.1 Research design
3.1.1 The Data Asset Framework
3.1.1.1 Planning the audit
3.1.1.2 Identifying and classifying assets
3.1.1.3 Assessing management of data assets
3.1.1.4 Reporting and recommendations
3.2 Survey Population
3.3 Data analysis

CHAPTER IV: RESULTS
4.0 Introduction


4.1 Demographics of respondents
4.2 Details of research data
4.2.1 Data Categories
4.2.2 Data storage media
4.2.3 Formats
4.2.4 Research data volumes
4.2.5 Use of data management plans
4.3 Responsibility for data management
4.4 Experiences with loss of research data
4.5 Issues with storage
4.6 Research Data Backup
4.6.1 Backup media
4.7 Researchers’ perceptions on sharing research data
4.8 Research data ownership
4.9 Research data sharing
4.10 Experience with research council mandating data sharing
4.11 Free text responses given for question 24 are in Appendix C8.

CHAPTER V: DISCUSSION
5.0 Introduction
5.1 Attributes of the research data
5.1.1 Data categories
5.1.2 Databases
5.1.3 Image data
5.1.4 Audio data
5.1.5 Video data
5.1.6 Data volumes
5.2 Management of research data
5.2.1 Storage media used
5.2.2 Research Data Backup
5.2.3 Research data sharing
5.2.4 Sharing by discipline
5.2.5 Hindrances to sharing
5.2.6 Long-term preservation of research data
5.2.7 Issues with day to day management of data and support needs
5.2.8 Experience with Data Management Plans
5.2.9 Researchers’ specific concerns


CHAPTER VI: CONCLUSION
6.0 Introduction
6.1 Summary of findings of the study
6.2 Contribution
6.3 Limitations of the study
6.4 Recommendations for further research

CHAPTER VII: REFERENCES

Appendices
Appendix A1: Ethics Proposal
Appendix A – Ethics documentation
Research Ethics Review Declaration
Appendix A2: Ethics Information Consent Form
Appendix A3: Ethics Approval
Appendix B: Copy of questionnaire
Research Data Management Practices of Researchers in Malawi
About You
Details of your research data
Research data storage
Research Data Backup
Research data sharing
Conclusion
End of questionnaire
Appendix C – Additional Survey Results
Appendix D – Letter of introduction
Access to Dissertation
CONFIRMATION OF ADDRESS
Alumni Information
First Employment Destination Details for School Records


List of figures

Figure 1: Distribution of respondents by research role
Figure 2: Do you currently hold or have you ever held any research data?
Figure 3: Categories of the electronic data created in respondents’ research fields
Figure 4: A graph showing responses to question 9: estimate of how much electronic research data is currently held/maintained by respondents.
Figure 5: Summary of responses to Question 10: “Do you currently have a data management plan for your research data?”
Figure 6: Chart showing responses to question 8b on reasons for not developing DMPs
Figure 7: Responses to question 11 “Who, if anyone, is responsible for managing your electronic research data?”
Figure 8: Question 12: Have you ever lost research data which was not backed up?
Figure 9: Ways in which data loss occurred
Figure 10: Question 13: Have you ever experienced any problems storing your research data due to the size of the files?
Figure 11: Summary of answers to question 14 “On average, how frequently is your data backed up?”
Figure 12: Question 4b. Where are they backed up?
Figure 13: Responses to question 15 “If the service was offered, would you want your university's repository to store any of your research data, either for your exclusive use or for wider access?”
Figure 14: Question 18. Do you share ownership of any of your research data with others?
Figure 15: Question 19. How do you currently share research data with colleagues?


Figure 16: Question 20. What problems have you encountered when sharing data with colleagues?
Figure 17: Question 22. What factors would prevent your research data from being made open access to the general public?
Figure 18: Question 23. Have you ever applied for funding from a body that required some degree of open access to be provided for your research data?
Figure C1: Research data categories by discipline
Figure C2: Responses to question 8 “What formats/software do you use for your electronic research data?”
Figure C3: Responses to Question 8a “If you store data in databases, please select the primary program you use:”
Figure C4: Primary format of images
Figure C5: Primary format of audio
Figure C6: Primary format of video
Figure C7: 14a. What data tends to be backed up?


List of tables

Table 1: Distribution of respondents by institutional affiliation
Table 2: Responses to Question 7 "What are the principal media on which your research data are stored?"
Table 3: Use of DMPs by discipline
Table 4: Summary of responses to question 10a on main drivers for developing data management strategies
Table 5: Question 16: If yes, how long would you want the repository to retain any of your research data, including data only accessible by you?
Table 6: Question 17. Who owns the research data you hold?
Table 7: Question 21. Apart from yourself, who would you want to be allowed access to your research data?
Table 8: Re-tabulation of Table 7 data
Table 9: Question 23b. Have you ever experienced difficulties in meeting these requirements?
Table C1: Other file formats/software being used by respondents and their areas of application


Acknowledgements

This work was made possible by sponsorship from Kamuzu College of Nursing of the University of Malawi.

My supervisor, Dr. Andrew Cox, deserves special recognition for introducing me to the Research Data Management module, among several others that I enjoyed, and for guiding me throughout the dissertation process, which is based on that module. The mistakes are my own.

My wife Alice and little angels Mulinde, Becky and Dzalo: Thank you for putting up with dad’s absenteeism. Your strength kept me going.


CHAPTER ONE: INTRODUCTION

1.0 Introduction and context

Over the past decade, there has been a huge interest in the area of research data management (RDM). Many universities in developed countries such as Australia, the United Kingdom and the United States are now increasingly and actively engaged in RDM (Groenewegen & Treloar, 2013). New job posts have been created to support researchers in managing their research data throughout the research lifecycle (Pryor & Donnelly, 2009). Academic libraries, for example, have recently repositioned themselves strategically by extending their service offering to include supporting RDM on their campuses (Corrall, Kennan, & Afzal, 2013). Data management training materials for different subject areas have been developed by organisations such as DCC and ANDS (ANDS, 2014a; DCC, 2014a) and universities such as The Australian National University¹ to support researchers. Some of the training materials, such as RDMRose² and MANTRA³, have focussed on providing continuing professional development to librarians and research support staff to equip them to support researchers effectively (Jones, Pryor, & Whyte, 2013). Furthermore, a range of data repositories to store and preserve research data assets over the long term have also been deployed (Jones, 2014: p. 103).

To signify the importance of RDM and data curation issues, the DCC runs a special bi-annual electronic journal, the International Journal of Digital Curation⁴, specifically to report on such and related issues (DCC, 2014b).

¹ http://anulib.anu.edu.au/_resources/training-and-resources/guides/DataManagement.pdf
² http://rdmrose.group.shef.ac.uk/
³ http://datalib.edina.ac.uk/mantra/
⁴ http://www.ijdc.net/


One of the major forces driving this interest in RDM in institutions of higher learning is the requirement by funding bodies for research grant applicants to outline data management plans in their applications for funding (ANDS, 2014b; EPSRC, 2014; NSF, 2010; RCUK, 2011). Adhering to these funder policy regulations has been a challenge for researchers, and they need support (Pryor, 2012). Higher education institutions such as those in the UK have responded by putting in place RDM policy frameworks⁵ in order to comply with the mandates by funders and so continue to attract research funds (Jubb, 2007).

Among other support service providers, academic librarians are considered important stakeholders in the area of managing research outputs because of the various skills and competencies inherent in their profession (Jones, Pryor, & Whyte, 2013; Michener et al., 2012). The librarians are taking advantage of this emphasis on RDM to demonstrate their value and are working in collaboration with IT and Research Offices in providing these data support services (Corrall, Kennan, & Afzal, 2013; Pryor, 2014).

In a bid to provide effective research data support services, several studies have been conducted in various universities to understand how researchers are managing their research data and what their perceptions towards sharing of such data are (Jones, Ball & Ekmekcioglu, 2008; Martinez-Uribe, 2009; Alexogiannopoulos, McKenney & Pickton, 2010; Rice & Haywood, 2011). Other studies have investigated academic libraries to identify the support services they are offering or planning to offer to researchers in research data stewardship (Corrall, Kennan, & Afzal, 2013). All these activities have been taking place in the western world.

⁵ http://www.dcc.ac.uk/resources/policy-and-legal/institutional-data-policies


In Malawi, various academic institutions have been involved in different types of research for some time. For example, the University of Malawi and Lilongwe University of Agriculture and Natural Resources have several research centres which have been conducting different types of research for many years. Research Director positions have been established in almost all of Malawi’s HEIs in the last few years to lead the management of institutional research efforts and to help the institutions attract more research funding.

The government body mandated to promote and coordinate science and technology activities through funding local research in Malawi is the National Commission for Science and Technology (NCST). Regarding data management, NCST’s published guidelines for social sciences and humanities research (NCST, 2011) stipulate that “researchers shall ... allow others to have access to their research data” and “institutions shall establish research data banks and repositories ... to facilitate availability and access by other users.” They also state that the general format of research proposals should, among other things, include “data management ... methods”. Despite this history of and special interest in research in academic institutions, the stipulations by NCST, and the heightened interest in RDM in the more highly resourced countries, it is not clear how the data arising from these research efforts are being managed, both during the active stages and beyond the life of research projects, because the practices are not documented. Where this is not known, wasteful duplication of research effort through collecting data that were already collected in past projects is inevitable. RDM has been shown to be advantageous because, among other things, it prevents such wasteful duplication and promotes sharing of research data, especially in this age where research is increasingly becoming global and taking multi-disciplinary approaches (Michener et al., 2012).


1.1 Research aims and objectives

1.1.1 Aim of study

The aim of this study is to assess the status of current Research Data Management practices of researchers in Malawi’s higher education institutions.

1.1.2 Specific objectives

The objectives of this study are:

• To understand the characteristics, types, and volumes of the research data being generated by researchers in academic institutions in Malawi.
• To assess the methods that the researchers are using to store and back up their research data.
• To understand how the researchers share their data and what their perceptions towards data sharing are.
• To understand the support needs they have to enable them to effectively manage their data throughout the research life-cycle.
• To identify the issues they face in the day-to-day management of these data.

1.1.3 Research questions

This study seeks to answer the following questions:

1. What are the attributes and volumes of research data that researchers in higher education institutions in Malawi are generating?

2. How are they managing their research data?


3. How do researchers in Malawi share their data and how do they perceive the notion of research data sharing?

4. What challenges do they experience when managing their research data on a day-to-day basis?

1.1.4 Significance

This study provides a picture of the current practices of managing research data by researchers in Malawi’s universities. It also helps to identify the research data support needs that researchers have. This knowledge is important because it could be used in reshaping some of the academic support services, such as the library and IT, to support researchers effectively, thereby making a significant contribution to the whole institutional research enterprise in the higher education sector. Knowing the types and volumes of data being generated during research projects, for example, is useful in determining present and future research data storage needs. Likewise, issues raised in this dissertation could influence managers in how they budget and plan for training, recruitment and infrastructure in their institutions to ensure that researchers are effectively supported and research data are safeguarded.

In terms of data management and sharing, the study will raise awareness of the gaps that exist in the local policies that deal with research practices, such as the guidelines for Social Science and Humanities research by NCST⁶ and the College of Medicine Research and Ethics Committee (COMREC) research proposal format⁷. Therefore, this study has the potential to influence policy formulation for good RDM at the institutional and national level.

6 http://www.ncst.mw/wp-content/uploads/2014/03/NATIONAL-FRAMEWORK-OF-GUIDELINES-IN-SSH.pdf

7 http://www.medcol.mw/comrec/COMREC_format.doc

It is good scientific practice that research is validated (European Science Foundation, 2000)

as this helps prevent fraud by academic researchers who are under pressure to ‘publish’ so

that they do not ‘perish’. Validation is easier when good RDM is practised in the institutions

because the data and all contextual information regarding them, such as metadata for each

research project, are available. It is hoped that this will be the long-term contribution of this

study. Similarly, with funding going to researchers who comply with funders’ data

management and sharing requirements, good RDM is one area that can ensure the flow of

research income into academic institutions. This is another area to which this study has the

potential to contribute ultimately.

It is hoped also that the act of responding to the questionnaire itself will raise awareness of

the importance of having well laid down procedures for managing research outputs (Jerrome

& Breeze, 2009; Parsons, 2013). Stakeholders such as principal investigators and research

directors could use this awareness to initiate promotional activities that are aimed at the

effective stewardship of digital research data in their institutions.

The global nature and multidisciplinarity of 21st Century research mean that researchers and

all research stakeholders everywhere including Malawi need to be knowledgeable in good

data management practices. Owing to its novel nature, this study will form a baseline of

current RDM practices of researchers in Malawi’s higher education sector. Therefore, it has

much to contribute to future research and training development for LIS, IT and research staff

in Malawi in addition to informing policy.

CHAPTER TWO: LITERATURE REVIEW

2.0 Introduction

Funders, academic librarians and researchers are some of the major players in the realm of

research data management. This literature review focuses on some of the pertinent

stipulations in funders’ data policies and the debates around them, librarians’ data support

services and the data management practices of researchers. This last theme is the longest

because it is the main focus of this project. It is important to understand the interplay between

the policies, support services and practices of researchers to have a holistic view of how this

area that has become an essential part of the modern research enterprise is unfolding. Lastly,

present RDM efforts in Africa are discussed to establish context for this research.

Funders’ policies have been the main driving force behind the uptake of RDM. This has seen

HEIs formulating their own data policies and strategies which have influenced some changes

in how research stakeholders are working. These changes include the creation of new research

support roles in the library and IT professions, investments in storage infrastructure and

development of RDM training and guidance for researchers. Following the growing interest

in RDM, a number of studies have been conducted to understand researchers’ practices,

perceptions and requirements so that effective data support services could be tailored for

them.

2.1 Funder’s requirements

In countries such as the United Kingdom, Australia and the United States, most funders have

put in place policies mandating research grant applicants to include data management plans in

their applications addressing a number of issues including sharing and preservation of

datasets (Pryor, 2014). Thus the applicant’s ability to clearly state the type of data he or she

will create, how they will be maintained and shared and explain reasons why the data might

not be shared where sharing is not appropriate or expected determines whether or not he or

she is able to secure the grant.

Research data which is generated using taxpayers’ money has been described as a “public

good, produced in the public interest” and should therefore “be made openly available with as

few restrictions as possible …” (RCUK, 2011). Access to such research outputs maximises

returns from government investment (OECD, 2007). In the United States, the major funding

agencies that have influenced the data management landscape in academic institutions

include the National Science Foundation (NSF), the National Institutes of Health (NIH) and

the National Endowment for the Humanities (NEH). NSF, for example, declared that for

proposals submitted after January 18, 2011,

“investigators are expected to share with other researchers, at no more than

incremental cost and within a reasonable time, the primary data … created or gathered

in the course of work under National Science Foundation grants” (NSF, 2010a).

Similarly, the Australian National Data Service (ANDS) has outlined requirements for data

management planning for grant applicants in Australia. On data reuse, ANDS aims at

“transforming Australian research data from being a single use research output to a

continually reusable resource” (ANDS, n.d.). In all these policies, data sharing is among the

recurring themes as one of the advantages of managing research data. This seems to address

one of The Royal Society’s propositions against hoarding of research data which states that

there must be “a shift away from a research culture where data is viewed as a private

preserve” (The Royal Society, 2012). The International Human Genome project has been

hailed as one of the good demonstrations of large-scale international research efforts in which

different users globally successfully use openly accessible data for various purposes (OECD,

2007). Regarding this project, the Human Genome Organisation (HUGO) Ethics Committee

(2002) described the human genomic databases as “global public goods”, a description which

bears strong similarity to RCUK’s reference to publicly funded research data (2011).

All the major research funders in the UK are proponents of sharing data sets, with

domain-specific funders tending to expect varying degrees of sharing to accommodate their

contextual settings and regulatory requirements. MRC, for example, requires that ethical,

legal and institutional considerations should be addressed before sharing of research data

takes place (Medical Research Council, 2011). This is in keeping with RCUK’s advocacy for

balance between return on public investment and threats to infringement of confidentiality

rights of research subjects (RCUK, 2012). Likewise, funders such as EPSRC and ESRC

recognise that there may be circumstances where withholding of research data is justified and

they require that where such issues arise, reasons for restricting access should be given and

the associated metadata should also state those reasons in addition to stating the requirements

that should be fulfilled in order to permit access to such data (EPSRC, 2011; ESRC, 2010).

While AHRC recognises that there may be special circumstances for prohibiting access to

data, it further envisages cases where charging for access could be justified (AHRC, 2014).

ESRC also clearly emphasises the data citation responsibilities of those who publish by re-

using such data (ESRC, 2013).

Borgman (2012) cautions, however, that research data sharing is a complex issue because of

different perceptions by different researchers, types of data and a variety of contexts,

referring to it as a “conundrum”. She notes, for example, that:

“Some of those data may be in sharable forms, others not. Some data are of

recognized value to the community, others not. Some researchers wish to share all of

their data all of the time, some wish never to share any of their data, and most are

willing to share some of their data some of the time”.

It seems that funders have anticipated such complexities and attempted to address these

uncertainties by being flexible in their policies, one example being the requirement for

researchers to provide justification where sharing is not possible. NSF (2010b), for example,

concedes that “what constitutes reasonable data management and access will be determined

by the community of interest through the process of peer review ...”, a flexibility which takes

into account disciplinary approaches that may exist or evolve with time. In addition, studies

continue to be carried out with the aim of understanding the discipline-specific approaches to

and perceptions towards data management (Akers & Doty, 2013). This understanding is

useful for planning storage infrastructure requirements and tailored support services for

research data management.

OECD (2007) has identified validating or verifying research as one of the rationales for

sharing data. As career progression in academia is based largely on continued publishing

among other things, it is of utmost importance to guard against scientific malpractices by

academics who may be tempted to fabricate or falsify their data because they are anxious to

publish for purposes of promotion in their job. Cases of data fraud that occurred in The

Netherlands between 2011 and 2012 have been reported, the most outstanding of them being

in the field of social psychology (Doorn, Dillo, & van Horik, 2013). While one of the

committees that were instituted to investigate the fraud admitted that there might be

justifiable reasons for the “selective omission of exceptional scores” that was observed, it

lamented that these were not documented (Levelt Committee, Noort Committee, & Drenth

Committee, 2014). This underlines the importance of depositing research data together with all

their related metadata and descriptions to aid future researchers to make sense of and reuse

the data effectively. In this reported case of data fraud, the available primary data were used

to verify the published results, and re-calculation of statistics such as means and standard

deviations indeed proved that the data had been “massaged”. The exercise also revealed the

research culture that prevailed behind the publications.

As a result of these fraudulent activities, research validation and data management have been

strengthened in the three institutions where these malpractices took place and in the whole

country as a way of clearing the bad image that has been associated with the field of

psychology research and also not to jeopardize employment opportunities for young

psychology graduates in the job market among other reasons. This has also raised awareness

of the importance of research data management in disciplines which are known to lag behind

in this area such as psychology (Doorn, Dillo, & van Horik, 2013).

2.2 Academic librarians and research data support

Studies on academic libraries’ engagement in RDM have been conducted with the aim of

understanding support service, curriculum and training requirements for information students

and professionals (Halbert, 2013). Corrall, Kennan & Afzal (2013) surveyed bibliometric and

data support activities of 140 libraries in Australia, New Zealand, Ireland, and the United

Kingdom. Tenopir, Birch & Allard (2012) assessed the prevailing state of and future plans for

research data services in a random stratified sample of academic libraries with membership in

the Association of College and Research Libraries (ACRL) in the United States and Canada.

Both studies found that a relatively small number of libraries were offering these services and

reported that more were planning to offer these in the future, attributing this to the relatively

new nature of this service offering in both cases. However, these future plans seemed lower

in Ireland by comparison, with “slower development of data management policies by

national research funding bodies” cited as the reason. Similarly, in the North American study,

libraries in institutions that received NSF grants were found to offer remarkably more data

support services than those that did not, reinforcing the notion that funders’ policies have

significantly contributed to institutional involvement in RDM. The Australia-Europe study

also reported that skills in bibliometrics and RDM, along with librarians’ understanding of the

research environment, were identified as areas that needed to be addressed in order to

provide effective research data services. The North American study reported that in some

libraries, most staff have been or plan to be reassigned to new data roles while other libraries

are hiring or plan to hire new staff members. This is a clear depiction of how academic

librarians are demonstrating their worth in their institutions.

2.3 Researchers’ data management practices

The science enterprise has become increasingly “data intensive” and more collaborative

(NSF, 2010c) as a result of the unprecedented deluge of data being collected, analysed, re-

used and preserved due to advancements in computational and communications technologies

(Borgman, 2012; Institute of Medicine and National Academy of Sciences, 2009). This

means that data sharing among researchers has now become more important than ever

(Tenopir et al., 2011).

Several studies aimed at understanding researchers’ practices of and perceptions towards

RDM have been conducted in a number of universities in Western countries. The Data

Asset Framework (DAF) methodology, described in detail in the methodology chapter of this

study, is one of the tools that is being used in HEIs to identify, locate, and assess current

practices of researchers in the management of research data and to understand their

perceptions towards RDM (Jones, Ball & Ekmekcioglu, 2008). The DAF is a tool which was

designed by DCC to audit data assets and identify prevailing data management practices of

researchers.

The DAF was piloted from May to July 2008 at the Universities of Edinburgh (School of

GeoSciences), Glasgow (Department of Archaeology) and Bath (Innovative Design and

Manufacturing Research Centre (IdMRC), a research group within the Department of

Mechanical Engineering) (Jones, Ball & Ekmekcioglu, 2008) where it largely involved a

series of interviews with researchers. A fourth pilot audit in the series followed at King’s

College London (KCL) during October-December 2008 focusing on researchers from the

Centre for Computing in the Humanities (CCH) (Jones & Ross, 2009). Despite their differing

disciplines and contexts, the data audit of the first three institutions and a similar audit a year

later at the University of Oxford (Martinez-Uribe, 2009) revealed similar issues centring on

storage, data policy and preservation.

2.3.1 Storage

Most of the pilot audits reported that researchers complained of insufficient storage with

several observed cases of researchers storing their data on local hard drives of their laptops or

PCs and on memory sticks. Only very few were reported to have a well-established data

backup plan although most of them knew the consequences of not backing up their data

frequently and some of them had reported having experienced data loss or irretrievability due

to corruption of CDs. Studies have shown that when researchers store data themselves, it

tends to get lost, more especially over the long term (Vines et al., 2014; Wicherts, Borsboom,

Kats, & Molenaar, 2006). This has implications on efficiency in terms of the utilization of

funds, time and data reuse.

2.3.2 Data policies

Data management policies were found to be non-existent in most cases. Where pockets of

good practice for file naming and version control were observed, these were marred by ad

hoc approaches due to a lack of standardisation. The result of this is that collaboration among

researchers is hard to achieve because it is difficult to know the location and correct version

of the data sets.

2.3.3 Long-term preservation

The data audits reported that infrastructure for long-term preservation was not provided in the

piloted institutions and no person was assigned responsibility for data management,

meaning that there was no way of knowing what data assets existed, an issue which is

aggravated when researchers leave these institutions. Researchers at the University of Oxford

were reported to have indicated that one of their top requirements was infrastructure that

would allow publication and long-term preservation of research data.

2.3.4 Training/advice

Another recurring item reported in all these audits was the call by researchers for advice on

practical issues related to managing research data across the research life cycle because they

recognised the importance of managing their research data properly and the risks of not doing

so. This strongly agrees with a study by Tenopir, Birch and Allard (2012) which reported that

researchers faced challenges in managing their data assets properly and responsibly because of

lack of time and funding and they wanted other units to lead in championing research data

services. Most researchers have a desire to preserve their data beyond the life of their projects

but fail to do so because of lack of archiving mechanisms (Alexogiannopoulos, McKenney

& Pickton, 2010). This is an area where the services of IT professionals could be useful and

in which university management needs to invest heavily.

Unlike the other four, the audit at KCL discovered good data management practice in which

its online list of projects was described by the auditors as ‘overwhelmingly of curated digital

assets’. Project directories were found to be well organised by project from the outset,

making it easy for data to be located and its context understood. Researchers in the centre

were said to know what was expected of them in terms of data management and were

committed to ensuring that the results of their data were far reaching. One could be left with

the impression that this was the case because researchers in the centre were from a computing

background and therefore, knowledgeable in managing their digital data assets. The centre

could serve as an exemplar of good practice of research data stewardship for other disciplines

in the institution and beyond.

Other early adopters of DAF include Imperial College London (Jerrome & Breeze, 2009),

Southampton (Gibbs, 2009), University of Northampton (Alexogiannopoulos, McKenney &

Pickton, 2010) and University of Oregon (Westra, 2010) in the USA, with the University of

Nottingham in the UK as one of the recent adopters of DAF (Parsons, 2013).

The DAF methodology of auditing data assets has gained wide acceptance as evidenced by

the number of studies that have followed the initial 2008 pilots which have taken place at the

Universities of Bath (Jones, 2011), Glasgow and Cambridge (Ward, Freiman, Jones, Molloy

& Snow, 2011), Edinburgh (Rice & Haywood, 2011) and Oxford (Wilson & Jeffreys, 2013).

2.4 State of RDM in Africa

South Africa, a pacesetter in many areas on the African continent, seems to be preparing to

institutionalise RDM. This can be seen by the increase of RDM activities on its university

campuses. For instance, the University of Cape Town had a 2-day intensive RDM workshop

facilitated by DCC staff in late March 2014 (DCC, 2014). Similarly, in July 2014, the

University of South Africa conducted a 2-day LIS Research Symposium8 in which the first

day was dedicated to presentations and discussions on RDM. Themes that were discussed at

this gathering included “libraries as part of the research infrastructure of the academic

institution”, “scientific data curation, citation and scholarly publication”, “developing an

institutional research data management plan” and “research data management and

institutional repositories”. On a deeper level, the Division of Epidemiology and Biostatistics

at the University of the Witwatersrand has introduced a new masters programme in Research

Data Management9 that aims to produce graduates who are able to lead data management

teams and integrate RDM activities at all stages of the research life cycle.

Unlike the developed countries where RDM is becoming fully entrenched, there are currently

no requirements for data management and sharing plans by research funding agencies in

South Africa as part of the grant application process (DCC, 2014; Pienaar, 2010). Without

doubt, this is the case in many African countries such as Malawi.

2.5 Conclusion

As research is becoming increasingly data-intensive, cross-disciplinary and global (Michener

et al., 2012), the workshops and MSc programme in South Africa are an indication of how

universities there are preparing for the global phenomenon that RDM is becoming. There is

need, therefore, for academic institutions in Malawi to start preparing institutional RDM

policies and services too. This study is one step in that direction as it attempts to uncover

researchers’ current data management and sharing practices in selected institutions of higher

learning in Malawi. The research, which acts as a baseline study, attempts to identify the

attributes and volumes of the data that researchers are generating and what they do with the

8 http://www.unisa.ac.za/Default.asp?Cmd=ViewContent&ContentID=96779

9 http://www.wits.ac.za/10568/#MSCinfectiousepi

data at the end of their research projects. It also highlights the challenges being faced by

researchers in the day-to-day management of their research data and captures some of their

thoughts on data management.

CHAPTER III: RESEARCH METHODOLOGY

3.0 Introduction

This study aims to identify the ways in which researchers in Malawi’s higher education

institutions are currently managing their research data. To achieve this, the study has

attempted to answer the following questions:

What are the attributes and volumes of research data that researchers in higher

education institutions in Malawi are generating?

How are they managing their active data and what do they do with the data at the end

of the research projects?

How do researchers in Malawi share their data and how do they perceive the notion of

research data sharing?

What challenges do they experience when managing their research data on a day-to-

day basis?

This chapter outlines the research design that has been used in this study, detailing how the

DAF has been employed and mapped to the phases of the present study. It also discusses the

study population and the data analysis approaches that have been adopted.

3.1 Research design

To answer the research questions, a cross-sectional survey methodology was used. Cross-

sectional research designs are aimed at studying phenomena of interest at a single point in

time (Babbie, 1979; Bryman, 2012; Wagenaar & Babbie, 2004). This was based on a

modification of DCC’s Data Asset Framework (DAF) methodology (DCC, 2009) described

in detail in the subsections below.

Studies that are based on the DAF can be termed descriptive because they describe

researchers’ practices of managing their research data. Descriptive studies describe

populations of interest with respect to some phenomenon (Babbie, 1979).

DCC encourages implementers of the DAF to modify the methodology to account for the

specific contexts in which they are being applied (DCC, 2009). This is what previous data

audits in other institutions have done (Alexogiannopoulos, McKenney & Pickton, 2010)

and this was also done in this study.

3.1.1 The Data Asset Framework

According to DCC (2009), the DAF is a collection of methods whose purposes are to:

discover the data assets being created and held within institutions

assess how the data are managed, shared and preserved

identify any threats to the data

discover researchers’ perceptions towards data creation and sharing and

provide suggestions on improvement of prevailing practice

The framework takes a four-stage approach but encourages flexibility in its application in

order to accommodate the specific needs of the institution being studied. The stages are:

planning; identifying and classifying assets; assessing management of data assets; and

reporting and recommendations.

3.1.1.1 Planning the audit

This stage involves “planning, defining the purpose and scope of the survey and conducting

preliminary research”. The purpose of the survey has been defined in Chapter I. In terms of

scope, this study has been set to focus only on researchers working in academic institutions in

Malawi. Planning for this study started at the proposal stage. It is during this time that key

contact people were sought in advance to help with the dissemination of the email containing

the link to the questionnaire in their institutions. This was done two months before the study

to ensure timely responses, which were crucial to the completion of the study, and also for

wider dissemination of the instrument with the understanding that respondents are more

likely to respond to a questionnaire sent by someone they know than a stranger (Barnes,

2001).

The research was conducted by collecting data using a web-based survey instrument. The

questionnaire, which was set not to collect respondents’ personal details, was designed using

Google Drive Forms. This method of collecting data is cheaper and quicker to administer

than other methods such as interviews or semi-structured interviews (Bryman, 2012). This

was desirable in this case because the respondents were “geographically widely dispersed”

(Bryman, 2012, p.233), the researcher was far away and time was of the essence. The absence of

an interviewer in the process of responding to the questionnaire also helps to eliminate any

influence on the responses caused by his or her presence, in addition to convenience on the

part of the respondents (Bryman, 2012, p.233) in that they are able to respond at a time and

place of their choice. The limitations of this method, though, are low response rates and

the difficulty of asking many questions without causing respondent fatigue (Bryman, 2012, p.235).

The survey instrument was disseminated via email on 25 June 2014 to the contact persons

who had been identified earlier. These in turn forwarded the email to researchers in their

institutions. After two weeks, the contacts were requested to send follow-up emails. In

addition, some personalised emails were also sent by this researcher to researchers known to

him in order to increase the response rate. The ethics approval letter from the Information

School’s Ethics Committee and a letter of introduction written by the supervisor of this study

were attached to these emails.

The survey comprised a total of 24 questions divided into four sections: demographics,

details of research data held, data storage and backup, and research data sharing (see Appendix B).

The first page of the questionnaire contained ethics and consent information which

participants had to read before participating in the study. By clicking ‘Continue’ to proceed to

the next pages, participants agreed to take part in the survey. As is the case with many

questionnaires, the first question asked for demographic details of the respondent to enable

the research to obtain a profile of the participating group.

All the questions except the final one were closed-ended, multiple choice or tick-box based.

This is helpful for study participants because it helps to save the time they take to type in

their responses. This is also helpful to the researcher during data analysis because uniform

responses are more easily grouped and analysed. In addition to the answer options that came

with the questions, many of the questions provided an ‘other’ option where the survey

participants specified their own responses if they were not included on the lists of provided

responses. The final question was open-ended to afford the participating researchers an

opportunity to discuss in more depth any related issues they had in mind. The drawback with

this, though, is that sometimes respondents provide irrelevant answers.

The survey instrument was not pre-tested because of time constraints, even though pre-testing questionnaires is important because it helps to keep errors to a minimum (Wagenaar & Babbie, 2004, p. 156). However, since DAF has been extensively and successfully used elsewhere, it could be regarded as a proven survey instrument, albeit one being applied in a different setting.

In many data asset surveys that use DAF, a sample of the researchers who participate in the

survey is further interviewed face-to-face for an in-depth understanding of the data assets

being held and more qualitative information on practices, needs and perceptions. This has not

been done in this study because time was limited and it was not practical to do so.

3.1.1.2 Identifying and classifying assets

The data collected in the survey was used to identify and classify the data assets being

generated and held by researchers in Malawi. However, this step is more suited to physical

identification than to a survey. The questionnaire focused not only on researchers who held active data but also on those who had once held data. This was done to gather as much data as

possible, knowing that research projects come and go, so researchers will hold data at some

point in their careers.

3.1.1.3 Assessing management of data assets

The data analysis phase of this study has assessed the management and sharing practices of

the research data assets and documented these in the results chapter.

3.1.1.4 Reporting and recommendations

Results of this study have been analysed and reported in the discussion chapter.

Recommendations have been made and also included in the conclusion chapter.

3.2 Survey Population

The population under study comprised researchers from the University of Malawi, Lilongwe

University of Agriculture and Natural Resources (LUANAR) and Mzuzu University. These

institutions were chosen because they have been involved in research for a long time and

represent a diversity of disciplines which served to identify any domain specific practices and

perceptions. These disciplines include agricultural, social and mathematical sciences.

3.3 Data analysis

Analysis of data has taken both quantitative and qualitative approaches. Quantitative analysis

has comprised calculation of percentages, proportions and means. Some data have also been

presented in tables and different types of charts. The ‘other’ and ‘any comments’ fields of the

questionnaire have been used to obtain any relevant qualitative data which has helped to

gauge perceptions of the responding researchers. The results have been compared with those reported in the literature, especially studies based on the DAF methodology, as discussed in the literature review.

The free text answers which respondents gave in the final question have been thematically

grouped and reported.
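The percentage calculations described in this section can be illustrated with a short Python sketch. It uses the institutional counts reported in Table 1 (29, 4 and 1 respondents out of 34). The function name `percentages` is hypothetical; the tooling actually used for the analysis is not specified in this study.

```python
def percentages(counts):
    """Return each count as a whole-number percentage of the total."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


# Respondent counts by institution, as reported in Table 1.
affiliation = {
    "University of Malawi": 29,
    "LUANAR": 4,
    "Mzuzu University": 1,
}

shares = percentages(affiliation)
for institution, share in shares.items():
    print(f"{institution}: {share}%")
```

The same function could be applied to any single-response question in the survey; questions that allow several answers per respondent need a choice of denominator (responses or respondents).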

CHAPTER IV: RESULTS

4.0 Introduction

This chapter presents the findings of the study.

4.1 Demographics of respondents

A total of 34 participants responded to the questionnaire. Of these, 17 were Principal Investigators or Project Managers, representing 35% of the respondents. Independent researchers and members of research teams accounted for 21% each. Two respondents were research assistants and only one was a research support/non-academic staff member, while eight were research students working towards doctoral degrees, representing 17% of all the respondents. Figure 1 shows the distribution of the respondents by research role.

These researchers come from a wide array of academic disciplines such as social sciences, sciences and engineering.

Figure 1: Distribution of respondents by research role (Principal Investigator 35%; Member of Research Team 21%; Independent Researcher 21%; Research Assistant 4%; Non-academic Staff 2%; Research Student 17%)

In terms of institutional affiliation, the majority of the respondents (29, or 85%) came from the University of Malawi, 4 were from the Lilongwe University of Agriculture and Natural Resources (LUANAR) and 1 was from Mzuzu University. Table 1 provides a summary.

Table 1: Distribution of respondents by institutional affiliation

Name of institution Number of respondents Percentage

University of Malawi 29 85%

LUANAR 4 12%

Mzuzu University 1 3%

4.2 Details of research data

In terms of research data holdings, 25 respondents (74%) indicated that they currently held data, while 9 (26%) said that they had held data in the past, as shown in Figure 2.

Figure 2: Do you currently hold or have you ever held any research data? (Yes, I currently hold research data 74%; Yes, I have held research data in the past 26%)

4.2.1 Data Categories

Categories of the electronic data created in respondents’ research fields are summarised in the chart in Figure 3. It shows that a wide array of research data is collected by researchers. Close to half of the reported data falls in the Survey/Interview/Focus Group category. Observational data make up 19% of the reported categories and experimental data 15%. Simulated data make up 8%, while the derived and reference data categories make up 5% each, with ‘other’ categories at 2% of the reported data classes.

Figure 3: Categories of the electronic data created in respondents’ research fields (Observational 19%; Survey/Interview/Focus Group 46%; Experimental 15%; Simulated 8%; Derived 5%; Reference 5%; Other 2%)

4.2.2 Data storage media

A wide spectrum of data storage media is in use. As Table 2 indicates, hard disk drives of laptops/netbooks are used more than any other medium (21% of responses) by researchers to store their research data, followed by USB/flash drives (14%), hard disk drives of campus computers (13%) and external hard drives (11%). Interestingly, paper is also a notable data storage medium (8%), used more than email clients/servers, CDs/DVDs, hard disk drives of off-campus computers and web-based services such as Google Docs and Dropbox. The survey has revealed that shared drives/servers (e.g. university servers) form only 2% of the research data storage media.

However, this question was flawed in that it allowed respondents to choose multiple responses, yet the aim was to find the principal medium. It needed to be a multiple choice question with only one response option.

Table 2: Responses to Question 7 "What are the principal media on which your research data are stored?"

Principal data storage medium                                               Responses  Percentage
Hard disk drive of laptop/netbook                                           32         21%
USB/Flash drive                                                             21         14%
Hard disk drive of computer on campus                                       19         13%
External hard drive                                                         17         11%
On paper                                                                    12         8%
Email client/server                                                         10         7%
CD/DVD                                                                      9          6%
Hard disk drive of computer off campus                                      8          5%
Web-based service (e.g. Google Docs, Flickr, Box.net, Dropbox, Pando etc.)  8          5%
Shared drive/server (e.g. University server)                                3          2%
Third party (including commercial data storage)                             2          1%
Cassette Tape (Audio)                                                       2          1%
Photograph                                                                  2          1%
Slides                                                                      2          1%
Other                                                                       2          1%
Hard disk drive of instrument/sensor which generates data                   1          1%
VHS/Video Cassette                                                          0          0%
Microfiche                                                                  0          0%
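Because Question 7 allowed several answers per respondent, the percentages in Table 2 are shares of all 150 responses (the sum of the Responses column) rather than shares of the 34 respondents. The sketch below contrasts the two bases for the four most-cited media. It assumes that no respondent selected the same medium more than once, and the variable names are illustrative only.

```python
# Counts for the four most-cited media in Table 2.
responses = {
    "Hard disk drive of laptop/netbook": 32,
    "USB/Flash drive": 21,
    "Hard disk drive of computer on campus": 19,
    "External hard drive": 17,
}
TOTAL_RESPONSES = 150   # sum of the Responses column in Table 2
TOTAL_RESPONDENTS = 34  # survey respondents (Section 4.1)

share_of_responses = {
    medium: round(100 * n / TOTAL_RESPONSES) for medium, n in responses.items()
}
# Assumption: each respondent ticked a given medium at most once,
# so a count can also be read as a number of respondents.
share_of_respondents = {
    medium: round(100 * n / TOTAL_RESPONDENTS) for medium, n in responses.items()
}

for medium in responses:
    print(f"{medium}: {share_of_responses[medium]}% of responses, "
          f"{share_of_respondents[medium]}% of respondents")
```

Under that assumption, a medium's share of respondents can be much larger than its share of responses, which is why the choice of denominator matters when a question permits multiple answers.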

4.2.3 Formats

The results for the formats/software that researchers are using for their data are in Appendix

C1.

4.2.4 Research data volumes

Respondents were asked to estimate the volume of electronic data they held. As Figure 4 shows, the 1-50 GB range had the most responses, at just below half of the respondents, followed by the 100-500 GB range, which was reported by about a fifth of the respondents. Identical proportions of respondents (12% each) estimated their research data to be less than a gigabyte and 50-100 GB. Those who held 500 GB - 1 TB of data accounted for about a tenth of the respondents, while 1 respondent (3%) had the highest estimate at 50 – 100 TB and another was not sure how much he or she held.

Figure 4: Responses to Question 9: estimates of the volume of electronic research data currently held/maintained by respondents.

4.2.5 Use of data management plans

The survey sought to find out whether researchers had data management plans (DMPs) for their research data, such as a data preservation policy, a record management policy and a data disposal strategy. As Figure 5 shows, the majority of the participants (just over two-thirds) acknowledged that they did not have DMPs, while just under one-third indicated that they had one. One respondent (3%), a research student, did not know whether or not he or she had a DMP.

Figure 4 chart values: < 1 GB 12%; 1 - 50 GB 44%; 50 - 100 GB 12%; 100 - 500 GB 18%; 500 GB - 1 TB 9%; 1 - 100 TB 3%; Don't know 3%

Figure 5: Summary of responses to Question 10: “Do you currently have a data

management plan for your research data?”

When the responses are grouped by discipline, as shown in Table 3, the highest proportions of respondents who have DMPs are in Agricultural Sciences and Medicine & Health, each at 50%, followed by Social Sciences (40%), Humanities (33%) and Sciences (28.6%), while none of the respondents in the remaining disciplines has a DMP.

Table 3: Use of DMPs by discipline

Discipline                     Yes  %       No  %        Don't know  %     Total
Agricultural Sciences          2    50.0%   2   50.0%    -           -     4
Engineering & Architecture     -    -       2   66.7%    1           33%   3
Humanities                     1    33.3%   2   66.7%    -           -     3
IT                             -    -       2   100.0%   -           -     2
Law                            -    -       1   100.0%   -           -     1
Business & Management Science  -    -       3   100.0%   -           -     3
Medicine & Health              3    50.0%   3   50.0%    -           -     6
Sciences                       2    28.6%   5   71.4%    -           -     7
Social Sciences                2    40.0%   3   60.0%    -           -     5
TOTAL                          10   29.4%   23  67.6%    1           3%    34

Figure 5 chart values: Yes 29%; No 68%; Don't know 3%

Respondents indicated various motivations for developing their data management strategy. The most frequently cited was “Research requirement to access/analyse/annotate others' data”, which made up 50% of the drivers, followed by “Volume of data associated with project” at 14%. Unlike the trend elsewhere, funder mandates play only a minor role in influencing the development of DMPs in Malawi, at only 7%. This is summarised in Table 4.

A range of reasons for not having DMPs was given by the respondents and is presented in Figure 6. Absence of a university data management policy and the time and effort required were the major ones, together accounting for nearly half of the reasons given. Lack of training or expertise within the research group and lack of local support or guidance together made up approximately a fifth of the reasons. Nearly a tenth of the reasons given were that DMPs are not a requirement of project funders, followed by the reason that they are not required or appropriate to the field of research or research group, at 6%. The latter reasons were given by respondents from humanities, social science and science backgrounds.

Table 4: Summary of responses to question 10a on main drivers for developing data

management strategies

Driver for developing DMP Response Percentage

Research requirement to access/analyse/annotate others' data 7 50%

Requirement of project funder 2 14%

Size of project team (i.e. multiple data creators) 1 7%

Volume of data associated with project 3 21%

Complexity of data associated with project (e.g. multiple formats) 1 7%

Absence of university data management policy 0 0%

Other 0 0%

Figure 6: Chart showing responses to question 8b on reasons for not developing

DMPs.

4.3 Responsibility for data management

As shown in Figure 7, close to two-thirds of the responses (63%) indicate that researchers manage the data themselves. Departmental IT Officers and Central ICT account for only 16% of the responses, while Research Project Managers, Research Assistants, Research Technicians, other designated persons in the research group, local data centres and international data centres/data archives together account for just under 15%.

Figure 6 chart values: Other 3%; Not required / appropriate to field of research or research group 6%; Not required by project funder 9%; Lack of training / expertise within research group 11%; Lack of local support / guidance (e.g. Central Library, ICT) 11%; Don't know 14%; Time and effort required 23%; Absence of university data management policy 23%

Figure 7: Responses to question 11 “Who, if anyone, is responsible for managing your

electronic research data?”

4.4 Experiences with loss of research data

The study found that more than half of the respondents have lost research data which was not

backed up as shown in Figure 8.

Figure 7 chart values: Research Project Manager 2%; Research Assistant 2%; Research Technician 2%; Local Data Centre 2%; International data centre / data archive 2%; Other designated person in Research Group 4%; PhD Student 7%; Central ICT 7%; Departmental IT Officer 9%; Myself 63%

Figure 8: Question 12: Have you ever lost research data which was not backed up?

The participants who acknowledged having lost data indicated several ways in which this happened, and several cited more than one. As presented in Figure 9, hardware failure accounted for half of the ways in which data that was not backed up was lost, followed by software failure at 36%. Human error or loss and ‘other’ ways of data loss accounted for 7% each.

Figure 8 chart values: No 44%; Yes 56%

Figure 9: Ways in which data loss occurred

4.5 Issues with storage

The vast majority of participants (78%) indicated that they had not experienced problems storing their research data due to the size of files, while 22% had. This is shown in Figure 10. Asked to give details, those who had experienced problems provided the following statements: “Inadequate hardware storage facilities”, “difficulties in opening big files”, “some file systems don't allow big files” and “external hard drives being full”.

Figure 9 chart values: Through hardware failure 50%; Through software failure 36%; Through human error or loss 7%; Other 7%

Figure 10: Question 13: Have you ever experienced any problems storing your research

data due to the size of the files?

4.6 Research Data Backup

The data, as displayed in the chart in Figure 11, shows a worrying trend in research data

backup with 35% of respondents indicating that they back up their data on an ad hoc basis,

while only 12% do it on a daily basis.

Figure 10 chart values: Yes 22%; No 78%

Figure 11: Summary of answers to question 14 “On average, how frequently is your

data backed up?”

4.6.1 Backup media

The media on which the data are backed up are summarised in Figure 12. The three most popular are hard disk drives of laptops (20%), external hard drives (19%) and hard disk drives of computers on campus (14%). It is worth noting that there were only 3 responses on the use of shared drives such as university servers.

Figure 11 chart values: Daily 12%; Weekly 15%; Monthly 26%; Annually 3%; Ad hoc 35%; Never 6%; Don't know 3%

Figure 12: Question 4b. Where are they backed up?

4.7 Researchers’ perceptions on sharing research data

As shown in Figure 13, almost 80% of the respondents indicated that they would support the

idea of their university's repository storing any of their research data, either for their

exclusive use or for wider access, if such a service was offered.

Figure 12 chart values: Hard disk drive of instrument/sensor which generates… 1%; Third party (including commercial data storage) 1%; Slides 1%; On paper 1%; Photograph 2%; Shared drive/server (e.g. University server) 3%; Don't know 3%; Hard disk drive of computer off campus 4%; CD/DVD 4%; Email client/server 7%; Web-based service 8%; USB/Flash drive 11%; Hard disk drive of computer on campus 14%; External hard drive 19%; Hard disk drive of laptop/netbook 20%

Figure 13: Responses to question 15 “If the service was offered, would you want your

university's repository to store any of your research data, either for your exclusive

use or for wider access?”

However, when asked how long they would want the repository to retain any of their data, less than half said they would want it retained in perpetuity. Most of them (54%) are in favour of their data being kept only until the end of the project. Table 5 summarises the responses.

Table 5: Question 16: If yes, how long would you want the repository to retain any of

your research data, including data only accessible by you?

None of my data    Some of my data    Much of my data    All of my data

Not at all 33% 8% 25% 33%

Until the end of the project 8% 38% 54%

For a finite period after end of project 35% 24% 41%

Until I leave the University 13% 20% 13% 53%

In perpetuity 11% 22% 22% 44%

Figure 13 chart values: Yes 79%; No 21%

4.8 Research data ownership

Table 6 summarises respondents’ answers to the question of research data ownership for

their projects. The percentage of those who own all the data is the same as that of those who

own some of the data (44%), while just under 10% said they own none of the data. One

respondent did not know who owned the data.

In most of the projects, respondents share data ownership with other academics or

researchers, followed by funding bodies and then journal publishers. Ten percent of the

projects do not share ownership with anyone. This is presented in Figure 14.

Table 6: Question 17. Who owns the research data you hold?

Response No. of respondents Percentage

I own all of the data I hold 15 44%

I own some of the data I hold 15 44%

I own none of the data I hold 3 9%

Don't know 1 3%

Figure 14: Question 18. Do you share ownership of any of your research data with

others?

4.9 Research data sharing

As shown in Figure 15, e-mail is the method of sharing research data used most often (35%)

followed by portable storage media (23%) and web-based service (20%). It is interesting to

note that paper is used more often (9%) than shared university drives (5%). Less than 5% say

that they never share research data with anyone.

Figure 14 chart values: No 10%; Yes, with journals/publishers 21%; Yes, with funding bodies 25%; Yes, with other academics/researchers 44%

Figure 15: Question 19. How do you currently share research data with colleagues?

Different problems are encountered by the researchers when sharing data. Only 29% say they

have not faced problems when sharing data with their colleagues.

Finding suitable shared storage space has been a problem to some. Perhaps this confirms that

there are storage issues as also indicated by the finding above that shared drive/server is used

by very few people. Lack of file naming conventions made it difficult to identify files. Other

issues include legal issues surrounding international transfer of data and problems

establishing ownership of data as well as lack of time to keep all colleagues constantly up to

date. One respondent also cited ‘network problems’ in addition to the given list of answer

options. These are presented in Figure 16.

Figure 15 chart values: Shared computer 2%; I never share data with colleagues 3%; Other 3%; Shared drive/server (e.g. University server) 5%; On paper 9%; Web-based service (e.g. Google Docs, Flickr, Box.net, Dropbox, Pando etc.) 20%; Using portable storage (e.g. CDs, DVDs, external hard drive, memory sticks etc.) 23%; E-mail 35%

Figure 16: Question 20. What problems have you encountered when sharing data

with colleagues?

Question 21 sought to find out whom researchers would want to be allowed access to their research data. By mistake, an erroneous entry, “My colleagues”, was added to the list of response options, which resulted in an erroneous column (the last column of Table 7).

Figure 16 chart values: Other 2%; Lack of version control caused confusion 8%; Problems establishing ownership of data 8%; Legal issues arising from international transfer of data 10%; Time consuming to keep all colleagues constantly up to date 12%; Lack of file naming conventions made it difficult to identify files 14%; Finding suitable shared storage space 16%; I have not encountered problems 29%

Table 7: Question 21. Apart from yourself, who would you want to be allowed access to your research data?

                                                      None of my data  Some of my data  Much of my data  All of my data  My colleagues
My colleagues                                         7%               33%              22%              26%             11%
My school                                             5%               35%              35%              25%             0%
The whole university                                  4%               43%              30%              22%             0%
Specified academic communities beyond the university  4%               54%              12%              19%             12%
Anyone (including general public)                     8%               54%              17%              17%             4%

The wrong response was removed and the data re-tabulated resulting in Table 8, which

shows that most of the respondents are in favour of sharing some of their data with all the

stakeholders listed.

Table 8: Re-tabulation of Table 7 data

                                                      None of my data  Some of my data  Much of my data  All of my data
My colleagues                                         8%               38%              25%              29%
My school                                             5%               35%              35%              25%
The whole university                                  4%               43%              30%              22%
Specified academic communities beyond the university  4%               61%              13%              22%
Anyone (including general public)                     9%               57%              17%              17%
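The re-tabulation can be approximated by dropping the erroneous fifth column and rescaling the remaining percentages so that each row again sums to 100%. The sketch below works from the rounded percentages in Table 7 because the underlying response counts are not reported here, so its results may differ from Table 8 by up to about a percentage point. The function name `drop_and_rescale` is hypothetical and is not taken from the study.

```python
def drop_and_rescale(row, drop_index):
    """Remove one column from a row of percentages and rescale to 100%."""
    kept = [value for i, value in enumerate(row) if i != drop_index]
    total = sum(kept)
    return [100 * value / total for value in kept]


# Two rows of Table 7: None, Some, Much, All, and the erroneous
# "My colleagues" column (index 4).
table7 = {
    "My colleagues": [7, 33, 22, 26, 11],
    "Specified academic communities beyond the university": [4, 54, 12, 19, 12],
}

retabulated = {
    group: drop_and_rescale(row, drop_index=4) for group, row in table7.items()
}
for group, row in retabulated.items():
    print(group, [f"{value:.1f}%" for value in row])
```

Rows whose erroneous column is 0% (for example "My school") are unchanged by this step, which is consistent with their identical values in Tables 7 and 8.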

Figure 17: Question 22. What factors would prevent your research data from being

made open access to the general public?

4.10 Experience with research council mandating data sharing

The majority of the respondents, over three-quarters, have never applied for funding from a

body that required some degree of open access to be provided for research data. Only 18%

have. Figure 18 summarises the responses.

Figure 17 chart values: None 6%; I do not believe the public would have any use for some of my data 6%; Data have commercial value 6%; Data contain personal information/have not been anonymised 7%; Funder restrictions 10%; I do not have the ownership rights to share all of my data 13%; Protect own ideas or intellectual property 16%; Ethics requirements of university/funder 17%; Data are not ready to be released/concern unpublished work 20%

Figure 18: Question 23. Have you ever applied for funding from a body that required

some degree of open access to be provided for your research data?

The funders that were mentioned by each of the respondents who had applied for funding

from bodies mandating sharing are NUFU, Wellcome Trust, IDRC, NIH, Water Research

Commission and DFID.

For Question 23b, although only 6 participants indicated that they had at some point applied for research funding from a body that mandated sharing of research data, it is surprising that a total of 17 respondents answered this follow-up question, which was intended only for those who had made applications to such a funder. This should have been a multiple choice question demanding only one response.

Figure 18 chart values: Yes 18%; No 76%; Don't know 6%

Table 9 indicates that the majority of the respondents did not have problems meeting the

funders’ requirements. One-third of those who said they had applied for funding from such

funding bodies said that they had experienced problems in meeting the requirements but have

always been able to meet the requirements.

Table 9: Question 23b. Have you ever experienced difficulties in meeting

these requirements?

No 12 71%

Yes, but I have always been able to meet the requirements 3 18%

Yes, as a result I was unable to obtain funding through this body 0 0%

Yes, and I need training and guidance 2 12%

4.11 Free text responses given for question 24 are in Appendix C8.

CHAPTER V: DISCUSSION

5.0 Introduction

The data for this study suggest that many researchers in Malawi, ranging from principal investigators to research students and from various disciplines, either currently hold research data or have done so in the past, and it can be expected that they will hold such data again in the future. It is therefore important to understand the characteristics of the data and how researchers handle and manage them in order to support them effectively.

5.1 Attributes of the research data

5.1.1 Data categories

A wide array of research data is collected by researchers in academic institutions in Malawi, the bulk of which is made up of surveys and interviews. Observational and experimental data are also collected often. These findings are similar to those of several DAF studies, such as the University of Northampton study, which found that observational data tend to complement results from surveys or experiments, thereby playing a supporting and complementary role.

All the disciplines represented in the current study collect survey data more than the other

research data types. However, only Science & Technology and Engineering & Architecture

disciplines collect simulated, derived and reference data. This supports the notion that the

type of data collected is determined by the discipline to which the researchers belong.

A look at the data indicates that Microsoft Office suite of applications such as Word and

Excel is widely used. The proliferation of Microsoft products is beneficial for researchers

because it guarantees ease of use, analysis and sharing of data among researchers owing to

the similarities in formats. If many researchers are using similar software products for their

data, it is easier to support them than when they employ a wide array of applications.

Digital audio and video files were each mentioned by 4 respondents. It is

typical that interview data would be collected using voice or video recorders, stored as digital

audio or video files, transcribed and stored as MS Word document files. Excel or SPSS files

are typically used to keep data collected through questionnaires and Word is used to write

research reports.

Use of audio tapes has been mentioned by a few participants, and it could be that more researchers are using them. It is of concern that some are still using such outdated technology because there is a risk that the data could be lost due to degradation of or damage to the tapes. In addition, the quality of sound on tapes deteriorates with time. There is a need for such data to be converted from their present analogue format to a digital format for easy storage and to preserve their sonic quality.

5.1.2 Databases

The overwhelming majority of researchers in Malawi store data in databases, and the data show that SPSS is used more than other database software. One challenge that researchers are likely to face, as observed in the Northampton study, is keeping up to date with SPSS version upgrades, as this program is updated annually. Their old databases may not open using updated versions of the software. A solution would be institutional software licences on campus servers so that, as databases are stored on shared drives on the servers, researchers would also benefit from the annual upgrades through their institutional subscriptions.

5.1.3 Image data

Use of image data is common among researchers in Malawi’s HEIs and the image format of

choice is ‘.jpg/.jpeg’. Many digital cameras use this as the default standard and a number of

applications including web browsers are able to read this format. This implies that researchers

are able to share their image data easily and that they do not have to be restricted by highly

specialised and expensive programs to use them. The other advantage to researchers is that

‘.jpeg’ files do not take up huge amounts of hard disk space because of their compressed

format.

5.1.4 Audio data

Audio research data is also commonly used in Malawi, the majority of which is stored using

‘.mp3’ as the primary format. Files in this format are small in size, implying that they do not

take up much space on the computer and they are also more portable than the other formats.

The drawback though, is that they have compatibility issues with some CD players.

5.1.5 Video data

By comparison, there are fewer users of video research data than of audio. Video takes

up more space than any of the other file types. Some research is more suited to video than other

file formats. One reason researchers are not using video as much as other formats could be

unavailability of adequate storage space both on their campuses as well as personal

computers or laptops. Nearly half of the video data is stored using the ‘.mpeg’ format which

is a compressed file format with the advantage of smaller file sizes than other video formats.

This helps mitigate the noticeable inadequacies in storage space or

infrastructure.

5.1.6 Data volumes

The majority of researchers are keeping data volumes that laptop hard disks are capable of

storing. Where institutions do not provide storage infrastructure, as is the case in most

campuses presently, researchers would tend to store their research data on their laptops or

external hard disks. This is risky to the data because laptops and external hard disks can be

stolen or can crash. In addition, if a researcher leaves the institution, they would go away with

the data that they obtained in the name of their previous institution.

5.2 Management of research data

5.2.1 Storage media used

Researchers in Malawi are using various types of storage media for their research data. The

ones that abound are those that are managed by the researchers themselves such as laptop

hard disk drives, USB/Flash drives and external hard drives. Laptops are only convenient for

storing data temporarily but they should not be used to store master copies of data. Use of

shared drives such as those on institutional servers is almost non-existent. This is a worrying

state of affairs because the data is at great risk. Laptops, external hard disk drives or

USB/Flash disks can be lost or stolen easily, and they often stop working

unexpectedly leading to loss of data.

Interestingly, some researchers depend on e-mail to store their data. Perhaps they email the

data as attachments to themselves. Email services limit the size of files that can be transmitted

per email; Gmail, a common email service platform in many HEIs these days, allows

attachments of up to 25MB. With the finding that data for researchers in the current study

far exceeds this limit, it is clear that email servers are not suitable for storing

research data; moreover, they are not designed for that purpose.

In most cases, the researchers manage their research data on their own using their own

facilities such as PCs, laptops and portable storage media. On the whole, there is no clear

dedicated responsibility for management of research data. It is of utmost importance that

research data management be given the priority that it deserves by recruiting specialised

personnel to handle it and investing in the necessary storage infrastructure. The recruitment

could be done either by re-skilling the available staff in departments such as library or IT or

employing new people altogether. This is what universities wanting to strengthen research

support services in developed countries are doing. Managing research data well guarantees its

integrity over the long term.

5.2.2 Research Data Backup

Backing up research data is an extremely important component of research data management

because it ensures that the data are available long after the projects end. It is also a way of

safeguarding the financial and time investment that was made in obtaining them. It is of great

concern that in Malawi’s academic institutions, there is no comprehensive and systematic

approach to research data backup. Most of the data is backed up on an ad hoc basis. This is

risky to the data because coupled with the finding that most of the data is stored and backed

up using personal devices, the data can inevitably and easily be lost.
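
A more systematic approach to backup can be partly automated. The sketch below is illustrative only and is not drawn from the survey: it assumes Python 3.9 or later, and the function names `sha256_of` and `backup_directory` are hypothetical. It copies a folder of research data to a second location, such as an external drive or an institutional shared drive where one exists, and confirms each copy against a SHA-256 checksum of the original.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB blocks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(block)
    return digest.hexdigest()


def backup_directory(source: Path, destination: Path) -> list[str]:
    """Copy every file under `source` into `destination` and verify each copy.

    Returns the relative paths of the verified files (hypothetical helper).
    Raises IOError if a copy does not match its original.
    """
    verified = []
    # Materialise the file list first so the walk is not affected by copying.
    for original in sorted(p for p in source.rglob("*") if p.is_file()):
        relative = original.relative_to(source)
        copy = destination / relative
        copy.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(original, copy)  # copy2 also preserves timestamps
        if sha256_of(original) != sha256_of(copy):
            raise IOError(f"Backup copy of {relative} does not match the original")
        verified.append(relative.as_posix())
    return verified
```

Run on a regular schedule, a routine of this kind would replace ad hoc copying with a repeatable and verifiable step. It does not by itself protect against loss or theft of the backup device.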

The custom of research data backup parallels that of storage where data is mostly backed up

using the researchers’ personal devices such as hard disk drives of laptops, external hard

drives and hard disk drives of computers in their offices on campus. This perhaps explains

why the majority of them back up their data themselves. Similarly, as with storage, backing

up data to institutional server hard disk drives is almost non-existent. The reasons could be

that either such servers are not available or just as other DAF studies found, there is no

awareness of the availability of such infrastructure to be used for storage and back up of

research data.

The data backup practices seem to be lacking in rigour and frequency. Coupled with the

finding that data is backed up as and when the researchers see fit, this is a risk to the research

enterprise: it is potentially wasteful of the funds that financed the research and puts data reuse

at risk, among a host of other problems.

5.2.3 Research data sharing

As research in modern times has become increasingly global and characterised by enormous

volumes of data and collaboration among researchers from a multiplicity of disciplines,

sharing has become crucial. The overwhelming majority of researchers in the current study

support the notion of their university's repository storing their research data, either for their

exclusive use or for wider access, if such a service were offered. Most researchers in Malawi

are largely in favour of sharing their research data. However, when asked how long they

would want a hypothetical repository to hold their research data, only about half of them, on

average, gave their responses and these differed per option.

Surprisingly, contrary to the overwhelming support for data sharing, less than half of those

who responded want all of their data to be retained in perpetuity. Just over half want all of

their data to be held in the hypothetical repository until the end of the project and until they

leave their university. Those who would like some or much of their data to be preserved for

any length of time amount, on average, to just over one fifth of those who responded, suggesting

that most researchers are not really in support of sharing their own research data. This finding

seems to agree with what the literature reports, for example Tenopir et al. (2011), who

reported that sharing practices of researchers were minimal.

Most of the participants do not own all of the data that they hold. This could explain the

nonresponse in that they may not feel free to share that which they do not entirely own

themselves. In most cases, ownership is shared between the researchers and other academics

or researchers, journal publishers and funding bodies.

Researchers in Malawi share their data in various ways. A number of them use e-mail. These

are presumably datasets small enough to be transmitted by email. Portable storage media such

as CDs, DVDs, external hard drives and memory sticks and web-based services such as

Google Docs, Dropbox and Flickr are also being used to share data. A small proportion is

sharing using paper. In keeping with what has been reported earlier in this document on

storage and backup practices, use of institutional shared drives to share research data is not

common in Malawi’s academic institutions. It is pleasing to note that those who never share

data with colleagues are in a tiny minority.

5.2.4 Sharing by discipline

When the data are disaggregated by discipline, the picture of sharing perceptions is unclear

because the numbers in each group become too small to support any conclusive judgement. This has

been compounded by the lack of responses for some of the options. However, one

observation that raises curiosity is that all participants from the medical sciences selected

“All my data” for all the timeframes given, implying that those who responded have no

problems sharing any of their data. How this compares with the literature on

researchers from this discipline remains to be examined. By contrast, no clear pattern emerges from the data

regarding social sciences, sciences and the other disciplines regarding their perceptions

towards sharing. Therefore, the extent to which researchers from the various disciplines want

to share their data remains largely unclear.

However, given the absence of respondents from other institutions and the low response rate, this

interpretation should be treated with caution.

5.2.5 Hindrances to sharing

There are various challenges that prevent researchers in Malawi from effectively sharing their

research data. Researchers do not find suitable shared storage space, a now recurring issue, to

enable them to share. The data also suggest that the lack of file naming conventions makes it

difficult to identify files. This is a problem that emanates from the absence of research data

management policy. As found by Jones & Ross (2009), the prevalence of “idiosyncratic

working practices” leads to differences in naming conventions of data files and is one of the

hindrances to data sharing.
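
One way to reduce such idiosyncrasy is to agree a convention and generate file names from it. The sketch below is a hypothetical illustration in Python: the pattern of project, description, collection date and version is an assumed convention, not one reported by respondents or prescribed by Jones & Ross (2009), and the function name `standard_file_name` is invented for the example.

```python
import re
from datetime import date


def standard_file_name(project: str, description: str, collected: date,
                       version: int, extension: str) -> str:
    """Build a file name of the form project_description_YYYYMMDD_vNN.ext.

    The pattern itself is an assumption made for illustration.
    """
    def slug(text: str) -> str:
        # Lower-case the text and collapse anything that is not a letter
        # or digit into single hyphens, so names stay portable across systems.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    clean_extension = extension.lower().lstrip(".")
    return (f"{slug(project)}_{slug(description)}_"
            f"{collected:%Y%m%d}_v{version:02d}.{clean_extension}")


name = standard_file_name("Maize Yield", "Household survey (Zomba)",
                          date(2014, 5, 6), 3, ".SAV")
print(name)  # maize-yield_household-survey-zomba_20140506_v03.sav
```

Placing the date in year-month-day order means that files sort chronologically in an ordinary directory listing, which is one reason such a pattern might be chosen.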

Legal issues arising from international transfer of data and problems establishing ownership

of data, as well as the time-consuming demands of keeping all colleagues constantly up to date,

have also been named as some of the challenges being faced by these researchers. One

respondent also cited ‘network problems’ in addition to those that were on the list of options.

Most of the respondents are in favour of sharing only some of their data with all the

stakeholders listed such as their colleagues, school, whole university and other external

academic communities. They are less willing to allow access to all of their data and they are

not in favour of their data being made open access to the general public for a number of

reasons. Some feel that the data are not ready to be released because the works have not been

published, others are concerned with ethics requirements of their university or funder and

some would like to protect their ideas or intellectual property. As noted earlier, there are

researchers who cannot share data because they do not have the ownership rights to share it.

Small numbers of researchers mention funder restrictions, data not being anonymised,

commercial interests as well as a belief that the public would not have any use for some of

their data as the reasons for not wanting to share.

5.2.6 Long-term preservation of research data

Preservation of research data ensures its availability over the long-term. The absence of

research data policies, high capacity storage infrastructure and dedicated data management

staff in Malawi’s HEIs makes it almost impossible to preserve electronic research data. This

goes along with the practices and culture of storage and backup that are prevailing at the

moment.

5.2.7 Issues with day to day management of data and support needs

Many of the sub-sections above have alluded to a number of issues that researchers in

Malawi often face regarding the management of their research data. These hinge on storage,

backup and sharing of data which can be traced to the unavailability of proper storage and

preservation infrastructure.

Linked to the same cause, many researchers lose their research data which has not been

backed up. Data collection, entry, analysis and reporting are expensive activities, both in monetary

terms and in the time and energy they take. Some data is collected over long

periods, often involving many enumerators and therefore, a lot of financial resources. Some

data can be collected only once because it captures a particular snapshot in time, such as

climate change data. Losing this data therefore means a waste of time and money. It is also a

blow to reuse and repurposing of the data.

Technical problems such as hardware and software failure together form a huge part of the

reasons for data loss. This is a serious issue because it happens in an environment where the

majority of researchers are managing the data on their own. This means that many

researchers are helpless with the management and safeguarding of their data.

Researchers also lose data through human factors such as theft or loss of their devices. The

picture is grim when one considers that these are the primary devices that most of the

researchers use to store and back up their data.

Asked if they have ever experienced any problems storing research data due to file size, the

majority indicated that they have not. This appears contradictory, especially when one considers

the lack of storage infrastructure that has been identified. It could be that researchers feel self-

sufficient in terms of storage of their data because their laptops and external hard discs, which

are their primary storage devices, have high storage capacities although they are prone to

loss, theft or even wear and tear.

The few researchers who report having encountered problems with storing huge files of

their research data give various reasons. These are “inadequate hardware storage facilities”

which is a recurring issue, “difficulties in opening big files”, “some file systems don't allow

big files” and “external hard drives being full”.

5.2.8 Experience with Data Management Plans

Research funders’ policies that require grant applicants to include data management plans in

their application for funding have contributed to increased uptake of research data

management activities in many research and academic institutions in developed countries.

The current study wanted to know if researchers in Malawi have had any experiences with

such funding bodies.

The majority of researchers in Malawi have never applied for research funding from any

funding institution that requires some degree of open access to be provided for research data.

Only a small proportion has.

Most of the researchers who have dealt with these funders report that they have never

experienced any problems in meeting the data open access requirements but a few have

struggled. Although no one has failed to obtain the funding applied for despite experiencing

difficulties, some have expressed the need for support in preparing DMPs.

Funders to which researchers in Malawi have applied for funding include NUFU,

Wellcome Trust, IDRC, NIH, Water Research Commission and DFID. Some of these such as

Wellcome Trust, NIH and DFID do require DMPs from grant applicants.

5.2.9 Researchers’ specific concerns

Researchers in Malawi express various concerns over the current management of their

research data or services they would like to see offered by their universities to guarantee

future access to the data. Their responses have been categorised into different themes.

Theme 1: Policy and Storage issues

“Need system of data management and secure server in the department”

“At the moment my storage of research data at my UNI is on a personal basis. I don't know if

there's a data management policy, I will have to check but I think it will nice to have one”

Theme 2: Concerns of data theft

“In most cases there is element of data theft, mainly between IT personnel and the data

'hunters'. Other don’t mind other people's effort and energy engaged in data collection

especially in its raw form. Once published then it can be made public”.

Theme 3: Investment / infrastructure / sharing /access / storage

“My university needs to invest more in ICT access to make it possible to start comfortably

sharing data”

“It would be useful if research data mainly Theses were posted online through institutionally

controlled access for easy access by those interested both nationally and internationally”.

“I would love to have a university central server where I ca deposit my data and be able to

retrieve my data when I am within or outside campus including outside the country”.

Theme 4: Connectivity issues

“The most serious problem is that internet services are poor thereby affecting public access to

some data that we would want to share”.

“Yes, there is a serious challenge with internet connectivity at Polytechnic. Secondly our

publications do not appear in full on our website”.

Theme 5: Training / awareness issues

“Lack of knowledge about data management and there seems to be no-one who minds to

offer some enlightenment on the same”.

Theme 6: Perceived administrative issues

“Management taking too long to put things in place”.

CHAPTER VI: CONCLUSION

6.0 Introduction

This study set out to understand the present research data management practices on the

Malawian higher education scene. This chapter summarises the findings of the study, outlines

its contribution to the body of knowledge in this area and finally makes practical

recommendations to improve research data management in Malawi.

6.1 Summary of findings of the study

Many researchers from various academic disciplines in Malawi collect and hold a wide array

of research data from time to time. These range from surveys and interviews to observational

and experimental data. Some of the types of data collected are more discipline-specific than

others, for example simulated data is more prevalent in the Technology and Engineering

disciplines than others.

Digital audio and video data are also generated.

These need high capacity storage because they take up more storage space than other data

types.

Microsoft Office applications are in wide use, which makes it easier for researchers to

analyse and share data because the file formats are common to all users. It is also easier to support them

than when there is a proliferation of various types of applications.

As in other studies, SPSS is the most common software for storing and manipulating

databases in Malawi.

Researchers also collect image data, which are stored mostly in the compressed ‘.jpg/.jpeg’

format, and audio data, which are stored mostly as ‘.mp3’ files. Some video data is also

collected, for which ‘.mpeg’ is the format of choice.

Most of the data generated falls within the range from less than 1GB to 1TB, and laptop hard

disk drives, USB/Flash drives and external hard drives form the bulk of the media used to

store them. These are the devices that are also mostly used to back up the data. This practice

of data storage and backup poses a great risk to the data because these storage devices are

prone to loss, theft and abrupt irretrievability problems. Worryingly, the most effective way

of storage and backup of research data which is using shared drives on institutional servers is

almost non-existent.

Stewardship of research data is primarily done by the researchers themselves where the

critical function of data backup is largely done on an ad hoc basis as opposed to taking a

more rigorous approach. It may be that researchers are too busy to back up their data more

systematically and frequently.

Although most of the researchers say that they have not experienced problems storing huge

files due to size, perhaps because they have high capacity laptops and external hard drives,

those who have experienced such challenges mention that there is insufficient hardware

storage infrastructure.

Researchers in Malawi overwhelmingly support the idea of research data sharing. However,

most of them are noncommittal when asked about how long they would want their data to be

held in an open access repository and very few would like their data to be held in perpetuity

in such a repository. This could be the case because most of them do not own all of the data

they hold.

A majority of researchers are not willing to share all of their data with research stakeholders.

Some of them want the data to be published before it can be released and others are

concerned with ethical requirements and yet others are protecting their intellectual property.

There is also a feeling that the general public would not have any use for the data and

therefore, there is no need to share them.

Researchers share data by a variety of methods. These include email, optical

discs, external hard drives and memory sticks. Web-based services are also being used.

Sharing is hampered by the unavailability of suitable storage space, a lack of file naming

conventions and internet connectivity limitations.

Most researchers have never applied for research funding from bodies that mandate sharing

of research data. Of those who have, most report having no problems meeting the funder’s

requirements although a few have had challenges and none has lost funding as a result. In

agreement with other studies, some researchers in this study are also asking for training and

support with data management plans.

Lack of experience with the funders in question could explain why the overwhelming

majority of researchers in HEIs in Malawi do not have data management plans for their

research data. Examples of these are data preservation policy, record management policy and

data disposal strategy. However, even most of those who have DMPs do not agree that funder

mandates influenced them to prepare one.

Funders well known for mandating data sharing with which researchers in Malawi have

dealt include The Wellcome Trust, NIH and DFID.

A majority of researchers have lost their research data that was not backed up mostly through

technical problems and loss or theft of their laptops and storage devices.

Their expressed concerns regarding management of their research data hinge on policy,

storage, data theft, connectivity, training and administrative issues.

6.2 Contribution

This study has much to contribute to the body of knowledge, research support services in

higher education and to formulation of policy on management of research data.

Studies on how researchers manage their research data have been conducted mostly in

Europe, Australia and North America. One known study in Africa was done in South Africa

in 2010. The present study adds to the existing body of knowledge as it provides a picture of

Malawi’s research data management landscape. Future researchers on the topic of RDM

in Malawi may find this work a useful point of reference.

The study identifies areas in which researchers need support and guidance. University

management and support services could use this information to design programs and recruit

personnel to provide this support.

Policy makers in the public and academic sectors are also among those who could benefit

from the findings of this study. In combination with other studies and policies, they

could use them to formulate RDM policies that address some of the issues raised in this study, as

it has been found that the absence of policy on management of research data seems to

contribute to the idiosyncratic nature that is obvious in the way that the data is being

managed.

6.3 Limitations of the study

Many researchers from the College of Medicine did not respond to the questionnaire because

they wanted to see ethical clearance from Malawi in addition to the one granted by the

University of Sheffield. A lot of research involving funders that require open access to

research data has been going on at this college; their responses could therefore have given

insight on the extent to which these funders’ policies are shaping management of research

data.

The online survey approach did not help to obtain a deep understanding of how researchers

are caring for their data. It was also difficult to gain a true picture of the data volumes

and types of data these researchers are producing. It was apparent that some respondents were

either tired of responding to certain questions or were indifferent to them.

In hindsight, the questionnaire focussed on too many areas such as types of research data

collected, their different formats, the types of software used in manipulating or storing them,

storage and data loss issues, the data management aspects, experience with DMPs, funders and

sharing.

6.4 Recommendations for further research

The following recommendations are offered for future studies in the area of RDM in Malawi.

To obtain more qualitative data, face to face interviews with researchers and those supporting

them would be useful.

It is recommended that physical audits of computers used for storing research data be carried

out as part of studies to understand how the data is being managed. This would give a better

picture of the types, formats and volumes of data being generated and how these data are

managed.
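
As a hedged illustration of what part of such an audit might involve, the Python sketch below tallies the number of files and the total bytes held under a folder, grouped by file extension. The function name `audit_by_format` is hypothetical, Python 3.9 or later is assumed, and a real audit would also need to record ownership, sensitivity and backup status, which a script cannot infer.

```python
from collections import defaultdict
from pathlib import Path


def audit_by_format(root: Path) -> dict[str, tuple[int, int]]:
    """Map each file extension under `root` to (file count, total bytes)."""
    counts = defaultdict(int)
    sizes = defaultdict(int)
    for path in root.rglob("*"):
        if path.is_file():
            # Treat '.JPG' and '.jpg' as the same format; label files
            # without an extension explicitly.
            extension = path.suffix.lower() or "(none)"
            counts[extension] += 1
            sizes[extension] += path.stat().st_size
    return {ext: (counts[ext], sizes[ext]) for ext in sorted(counts)}
```

Calling `audit_by_format(Path("research_data"))` on a researcher's data folder (the folder name here is only an example) would return a summary by format that could be compared with the formats and volumes respondents reported in a questionnaire.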

Good information could be obtained by focussing on fewer themes at a time to ensure that

respondents are engaged with the survey throughout the whole process.

To ensure maximum participation from all potential participants, resolving issues of local

ethical requirements well ahead of time should be considered.

Word count:13,416

CHAPTER VII: REFERENCES

Akers, K. G., & Doty, J. (2013). Disciplinary differences in faculty research data

management practices and perspectives. International Journal of Digital Curation, 8(2), 5–26.

doi:10.2218/ijdc.v8i2.263

Alexogiannopoulos, E., McKenney, S. and Pickton, M. (2010) Research Data Management

Project: a DAF investigation of Research Data Management practices at The University of

Northampton. Northampton: University of Northampton. Available from:

http://nectar.northampton.ac.uk/2736

ANDS. (n.d.). Data Reuse. Retrieved May 26, 2014, from

http://ands.org.au/discovery/reuse.html

ANDS. (n.d.). ANDS Guides and Other Resources. Retrieved June 25, 2014, from

http://ands.org.au/guides/index.html

Arts and Humanities Research Council. (2014). Research funding guide (Version 2.6).

Swindon: Arts and Humanities Research Council. Retrieved 03 July 2014 from

http://www.ahrc.ac.uk/SiteCollectionDocuments/Research-Funding-Guide.pdf

Babbie, E. R. (1979). The practice of social research (2nd Ed.). Belmont, California:

Wadsworth Publishing Company.

Barnes, S. (2001). Bristol Online Surveys (BOS) knowledgebase » Survey design. Retrieved

May 6, 2014, from http://www.survey.bris.ac.uk/support/survey-design

Borgman, C. (2012). The conundrum of sharing research data. Journal of the American

Society for Information Science and Technology, 63(6), 1059–1078. doi:10.1002/asi.22634

Bryman, A. (2012). Social Research Methods (4th ed.). Oxford University Press.

Corrall, S., Kennan, M. A., & Afzal, W. (2013). Bibliometrics and Research Data

Management services: emerging trends in library support for research. Library Trends, 61(3),

636–674. doi:10.1353/lib.2013.0005

Digital Curation Centre. (2014). RDM in South Africa - UCT Research Data Management

Policy and Strategy Workshop. Retrieved June 4, 2014, from

http://www.dcc.ac.uk/news/uct-strategy-workshop

Digital Curation Centre (2009) Data Asset Framework: Implementation guide. Retrieved 01

May 2014 from: http://www.data-audit.eu/docs/DAF_Implementation_Guide.pdf

DCC. (n.d.). Digital curation training for all. Retrieved June 25, 2014, from

http://www.dcc.ac.uk/training

DCC. (n.d.). International Journal of Digital Curation. Retrieved June 25, 2014, from

http://www.dcc.ac.uk/resources/curation-journals/ijdc

Doorn, P., Dillo, I., & van Horik, R. (2013). Lies, damned lies and research data: can data

sharing prevent data fraud? International Journal of Digital Curation, 8(1), 229–243.

doi:10.2218/ijdc.v8i1.256

Economic and Social Research Council. (2013). ESRC Research Data Policy September

2010 (Revised March 2013). Swindon: Economic and Social Research Council. Retrieved 03

July 2014 from http://www.esrc.ac.uk/_images/Research_Data_Policy_2010_tcm8-4595.pdf

Engineering and Physical Science Research Council (EPSRC). (2014). Principles. Retrieved

June 4, 2014, from

http://www.epsrc.ac.uk/about/standards/researchdata/Pages/principles.aspx

Engineering and Physical Sciences Research Council. (2011). Expectations - EPSRC policy

framework on research data. Expectations - Engineering and Physical Sciences Research

Council. Retrieved 03 July 2014 from

http://www.epsrc.ac.uk/about/standards/researchdata/expectations/

European Science Foundation (2000). Good scientific practice in research and scholarship

(No. 10). European Science Foundation. Retrieved 27 June 2014 from

http://www.esf.org/fileadmin/Public_documents/Publications/ESPB10.pdf

Gibbs, H. (2009). Southampton data survey: our experience and lessons learned. Edinburgh

University. Retrieved from http://www.disc-uk.org/docs/SouthamptonDAF.pdf

Groenewegen, D., & Treloar, A. (2013). Adding value by taking a national and institutional

approach to research data: the ANDS experience. International Journal of Digital Curation,

8(2), 89–98. doi:10.2218/ijdc.v8i2.274

Halbert, M. (2013). The problematic future of research data management: challenges,

opportunities and emerging patterns identified by the DataRes Project. International Journal

of Digital Curation, 8(2), 111–122. doi:10.2218/ijdc.v8i2.276

Human Genome Organisation (HUGO), Ethics Committee. (2002). Statement on human

genomic databases, December 2002. Human Genome Organisation (HUGO). Retrieved 09

July 2014 from http://www.hugo-international.org/img/genomic_2002.pdf

Jerrome, N., & Breeze, J. (2009). Imperial College Data Audit Framework Implementation:

Final Report. Programme/Project deposit. Retrieved June 9, 2014, from

http://repository.jisc.ac.uk/307/

Jones, K. (2011). Assessing institutional data storage and management using the Data Asset

Framework (DAF) methodology at the University of Bath. Reports/Papers. Retrieved June 2,

2014, from http://opus.bath.ac.uk/24960/

Jones, S. (2014). The range and components of RDM infrastructure and services. In Pryor, G.,

Jones, S., & Whyte, A. (Eds.), Delivering Research Data Management services: fundamentals

of good practice. (p. 98). London: Facet Publishing.

Jones, S., Ball, A., & Ekmekcioglu, Ç. (2008). The Data Audit Framework: a first step in the

data management challenge. International Journal of Digital Curation, 3(2), 112–120.

doi:10.2218/ijdc.v3i2.62

Jones, S., & Ross, S. (2009). Data Audit Framework Development (DAFD) Project final

report. Glasgow. Retrieved from http://www.data-audit.eu/docs/DAFDfinalreport.pdf

Jones, S., Pryor, G., & Whyte, A. (2013). ‘How to Develop Research Data Management

Services - a guide for HEIs’. DCC How-to Guides. Edinburgh: Digital Curation Centre.

Available online: http://www.dcc.ac.uk/resources/how-guides

Jones, S., Ross, S., & Ruusalepp, R. (2008). The Data Audit Framework: a toolkit to identify

research assets and improve data management in research led institutions (pp. 213–219).

Presented at the 5th International iPRES Conference: Joined Up and Working: Tools and

Methods for Digital Preservation, London, England. Retrieved from

http://www.bl.uk/ipres2008/ipres2008-proceedings.pdf

Jubb, M. (2007). UK Research Funders’ Policies for the Management of Information

Outputs. International Journal of Digital Curation, 2(1), 29–48. doi:10.2218/ijdc.v2i1.12

Levelt Committee, Noort Committee, & Drenth Committee. (2014). Flawed science: the

fraudulent research practices of social psychologist Diederik Stapel (Stapel Investigation).

Tilburg University/University of Groningen/University of Amsterdam. Retrieved 07 July

2014 from https://www.commissielevelt.nl/wp-

content/uploads_per_blog/commissielevelt/2013/01/finalreportLevelt1.pdf

Lyon, L., Rusbridge, C., Neilson, C., & Whyte, A. (2010). Disciplinary Approaches to

Sharing, Curation, Reuse and Preservation: DCC SCARP Final Report to JISC. Edinburgh:

Digital Curation Centre. Retrieved from

http://www.dcc.ac.uk/sites/default/files/documents/scarp/SCARP-FinalReport-Final-

SENT.pdf

Martinez-Uribe, L. (2009). Using the Data Audit Framework: an Oxford case study. Oxford:

University of Oxford. Retrieved from http://www.disc-uk.org/docs/DAF-Oxford.pdf

Medical Research Council. (2011). MRC policy and guidance on sharing of research data

from population and patient studies (No. v01-00). Medical Research Council. Retrieved 03

July 2014 from http://www.mrc.ac.uk/news-events/publications/mrc-policy-and-guidance-on-

sharing-of-research-data-from-population-and-patient-studies/

Michener, W. K., Allard, S., Budden, A., Cook, R. B., Douglass, K., Frame, M., … Vieglais,

D. A. (2012). Participatory design of DataONE—Enabling cyberinfrastructure for the

biological and environmental sciences. Ecological Informatics, 11, 5–15.

doi:10.1016/j.ecoinf.2011.08.007

National Science Foundation. (2010). Data Management & Sharing Frequently Asked

Questions (FAQs). Retrieved July 9, 2014, from

http://www.nsf.gov/bfa/dias/policy/dmpfaqs.jsp

National Science Foundation. (2010). Scientists seeking NSF funding will soon be required to

submit data management plans: government-wide emphasis on community access to data

supports substantive push toward more open sharing of research data (Press Release 10-

077). Arlington, Virginia: National Science Foundation. Retrieved 04 July 2014 from

http://www.nsf.gov/news/news_summ.jsp?cntn_id=116928

National Commission for Science and Technology. (2011). The framework of guidelines for

research in the social sciences and humanities in Malawi; Issued with legislative anchorage to

the Science and Technology Act No.16 of 2003. NCST. Retrieved 25 June, 2014 from

http://www.ncst.mw/wp-content/uploads/2014/03/NATIONAL-FRAMEWORK-OF-

GUIDELINES-IN-SSH.pdf

National Science Foundation. (2010). Award and Administration Guide. Retrieved May 26,

2014, from http://www.nsf.gov/pubs/policydocs/pappguide/nsf11001/aag_6.jsp#VID4

Organisation for Economic Co-operation and Development, The. (2007). OECD principles

and guidelines for access to Research Data from Public Funding. Retrieved from

http://www.oecd.org/sti/sci-tech/38500813.pdf

Parsons, T. (2013). Creating a research data management service. International Journal of

Digital Curation, 8(2), 146–156. doi:10.2218/ijdc.v8i2.279

Pienaar, H. (2010). Survey of research data management practices at the University of

Pretoria, South Africa: October 2009 – March 2010. Retrieved from

http://repository.up.ac.za/handle/2263/15154

Pryor, G., Jones, S., & Whyte, A. (Eds.). (2014). Delivering Research Data Management

services: fundamentals of good practice. London: Facet Publishing.

Pryor, G. (Ed.). (2012). Managing research data. London: Facet Publishing.

Pryor, G., & Donnelly, M. (2009). Skilling up to do data: whose role, whose responsibility,

whose career? International Journal of Digital Curation, 4(2), 158–170.

doi:10.2218/ijdc.v4i2.105

RCUK (2011). RCUK Common Principles on Data Policy - Research Councils UK.

Retrieved May 26, 2014, from http://www.rcuk.ac.uk/research/datapolicy/

Rice, R., & Haywood, J. (2011). Research Data Management initiatives at University of

Edinburgh. International Journal of Digital Curation, 6(2), 232–244.

doi:10.2218/ijdc.v6i2.199

Royal Society, The. (2012). Science as an open enterprise. London: The Royal Society.

Retrieved 20 May 2014 from https://royalsociety.org/~/media/policy/projects/sape/2012-06-

20-saoe-summary.pdf

Tenopir, C., Birch, B., & Allard, S. (2012). Academic libraries and research data services:

Current practices and plans for the future; an ACRL white paper. Chicago: Association of

College and Research Libraries, a division of the American Library Association.

Tenopir, C., Allard, S., Douglass, K., Aydinoglu, A., Wu, L., Read, E., … Frame, M. (2011).

Data sharing by scientists: practices and perceptions. PLoS ONE, 6(6).

doi:10.1371/journal.pone.0021101

Vines, T. H., Albert, A. Y. K., Andrew, R. L., Débarre, F., Bock, D. G., Franklin, M. T., …

Rennison, D. J. (2014). The Availability of Research Data Declines Rapidly with Article

Age. Current Biology, 24(1), 94–97. doi:10.1016/j.cub.2013.11.014

Wagenaar, T. C., & Babbie, E. R. (2004). Guided activities for the practice of social

research. (10th ed.). Belmont, CA: Thomson/Wadsworth.

Ward, C., Freiman, L., Jones, S., Molloy, L., & Snow, K. (2011). Making sense: talking data

management with researchers. International Journal of Digital Curation, 6(2), 265–273.

doi:10.2218/ijdc.v6i2.202

Westra, B. (2013). Data Services for the Sciences: A Needs Assessment. Ariadne, (64).

Retrieved from http://www.ariadne.ac.uk/print/issue64/westra

Wicherts, J. M., Borsboom, D., Kats, J., & Molenaar, D. (2006). The poor availability of

psychological research data for reanalysis. The American Psychologist, 61(7), 726–728.

doi:10.1037/0003-066X.61.7.726

Wilson, J. A. J., & Jeffreys, P. (2013). Towards a unified university infrastructure: the data

management roll-out at the University of Oxford. International Journal of Digital Curation,

8(2), 235–246. doi:10.2218/ijdc.v8i2.287

Appendices

This proposal submitted by: (Students / Staff)

Undergraduate

X Postgraduate (Taught) – PGT

Postgraduate (Research) – PGR

This proposal is for:

Specific research project

Generic research project

This project is funded by:

Project Title: Research Data Management Practices of Researchers in Malawi:

The Case of Selected Academic Institutions

Start Date: June, 2014 End Date: 01 September, 2014

Principal Investigator (PI):

(student for supervised UG/PGT/PGR research)

Thomas Bello

Email: [email protected]

Supervisor:

(if PI is a student)

Dr Andrew Cox

Email: [email protected]

Indicate if the research: (put an X in front of all that apply)

Involves adults with mental incapacity or mental illness, or those unable to make a personal decision

Involves prisoners or others in custodial care (e.g. young offenders)

Involves children or young people aged under 18 years of age

Involves highly sensitive topics such as ‘race’ or ethnicity; political opinion; religious, spiritual or other beliefs; physical or mental health conditions; sexuality; abuse (child, adult); nudity and the body; criminal activities; political asylum; conflict situations; and personal violence.

Please indicate by inserting an “X” in the left hand box that you are conversant with the University’s policy on the

handling of human participants and their data.

X

We confirm that we have read the current version of the University of Sheffield Ethics Policy Governing

Research Involving Human Participants, Personal Data and Human Tissue, as shown on the University’s

research ethics website at: www.sheffield.ac.uk/ris/other/gov-ethics/ethicspolicy

Part B. Summary of the Research

B1. Briefly summarise the project’s aims and objectives: (This must be in language comprehensible to a layperson and should take no more than one-half page. Provide enough information so that the reviewer can understand the intent of the research)

Summary:

Aim of study

The aim of this study is to assess the status of current Research Data Management (RDM) practices of

researchers in Malawi.

Specific objectives

The objectives of this study are:

To understand the characteristics, types, and volumes of the research data being generated by

researchers in Malawi

To assess the methods that the researchers use to store and backup their research data

To understand the researchers’ perceptions on sharing their data assets

To assess the issues they face in the day-to-day management of this data

To understand the support needs they have in order to effectively manage their data

throughout the research life-cycle

To identify the current practices in data preservation beyond the life of the project

B2. Methodology: Provide a broad overview of the methodology in no more than one-half page.

Overview of Methods:

For this study, a web-based self-completion questionnaire will be used to collect data. This will be

based on a modification of Digital Curation Centre’s Data Asset Framework (DAF) methodology10. It is a

questionnaire that asks respondents about the types, formats and maintenance of their research data

throughout the life of the project and the methods for the data’s preservation beyond the life of the

project.

Analysis of the data will be both quantitative and qualitative. Quantitative analysis will involve

calculating percentages, proportions and means. Some information will also be presented in tables and

different types of charts. The ‘other’ and ‘any comments’ fields of the questionnaire will be used to

obtain qualitative data which will help to gauge the perceptions of the respondents.
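
The quantitative step described above can be sketched roughly as follows. This is only a minimal illustration, assuming the answers to a "select all that apply" question have been exported as one list of selected options per respondent. The option names and the responses shown are invented for the example and are not data from this study, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical responses to a "select all that apply" question;
# each inner list is one respondent's selections (invented for illustration).
responses = [
    ["Spreadsheets", "Documents"],
    ["Documents"],
    ["Spreadsheets", "Images", "Documents"],
    ["Databases"],
]


def option_percentages(responses):
    """Return the percentage of respondents who selected each option."""
    n = len(responses)
    if n == 0:
        return {}
    # set() guards against an option being counted twice for one respondent.
    counts = Counter(opt for r in responses for opt in set(r))
    return {opt: round(100 * c / n, 1) for opt, c in counts.items()}


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


percentages = option_percentages(responses)
# Average number of distinct options selected per respondent.
selections_per_respondent = mean([len(set(r)) for r in responses])
```

Because respondents may tick several options, the percentages are calculated against the number of respondents rather than the number of ticks, so they need not sum to 100.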

10 http://www.data-audit.eu/docs/DAF_Implementation_Guide.pdf

If more than one method, e.g., survey, interview, etc. is used, please respond to the questions in Section C for each method. That is, if you are using both a survey and interviews, duplicate the page and answer the questions for each method; you need not duplicate the information, and may simply indicate, “see previous section.”

C1. Briefly describe how each method will be applied

Method (e.g., survey, interview, observation, experiment):

Description – how will you apply the method? The questionnaire will be designed using Google Drive Forms

and disseminated via email.

About your Participants

C2. Who will be potential participants?

The study population will comprise researchers from College of Medicine and Centre for Social Research both

under the University of Malawi and Lilongwe University of Agriculture and Natural Resources (LUANAR).

C3. How will the potential participants be identified and recruited?

A link to the online questionnaire will be e-mailed to key contacts in those institutions who have agreed to respond

to and disseminate the questionnaire to the intended respondents (researchers) once it is ready.

C4. What is the potential for physical and/or psychological harm / distress to participants?

There is no perceived harm to the participants of this study.

C5. Will informed consent be obtained from the participants?

X Yes

No

If Yes, please explain how informed consent will be obtained?

The first page of the questionnaire will contain brief details of the study and what data will be collected and

how confidentially the data will be treated once collected. It will assure them of their freedom to stop

responding at any point. They will be asked to give their consent to responding to the questionnaire, which they will

do by clicking a button to proceed to the pages with the questions.

If No, please explain why you need to do this, and how the participants will be de-briefed?

C6. Will financial / in kind payments (other than reasonable expenses and compensation for time) be offered

to participants? (Indicate how much and on what basis this has been decided)

No

About the Data

C7. What data will be collected? (Tick all that apply)

Print Digital

Participant observation

Audio recording

Video recording

Computer logs

Questionnaires/Surveys X

Other:

Other:

C8. What measures will be put in place to ensure confidentiality of personal data, where appropriate?

No personal information will be obtained.

C9. How/Where will the data be stored?

The data will be stored on the iSchool’s Research Data Server where a 10 GB share has been allocated.

C10. Will the data be stored for future re-use? If so, please explain

No.

About the Procedure

C11. Does your research raise any issues of personal safety for you or other researchers involved in the project

(especially if taking place outside working hours or off University premises)? If so, please explain how it will

be managed.

No.

The University of Sheffield. Information School Research Ethics Review Declaration

Title of Research Project: [Research Data Management Practices of Researchers in Malawi: The Case of

Selected Academic Institutions]

We confirm our responsibility to deliver the research project in accordance with the University of

Sheffield’s policies and procedures, which include the University’s ‘Financial Regulations’, ‘Good

Research Practice Standards’ and the ‘Ethics Policy Governing Research Involving Human Participants,

Personal Data and Human Tissue’ (Ethics Policy) and, where externally funded, with the terms and

conditions of the research funder.

In submitting this research ethics application form I am also confirming that:

The form is accurate to the best of our knowledge and belief.

The project will abide by the University’s Ethics Policy.

There is no potential material interest that may, or may appear to, impair the independence

and objectivity of researchers conducting this project.

Subject to the research being approved, we undertake to adhere to the project protocol

without unagreed deviation and to comply with any conditions set out in the letter from the

University ethics reviewers notifying me of this.

We undertake to inform the ethics reviewers of significant changes to the protocol (by

contacting our academic department’s Ethics Coordinator in the first instance).

We are aware of our responsibility to be up to date and comply with the requirements of the

law and relevant guidelines relating to security and confidentiality of personal data, including

the need to register when necessary with the appropriate Data Protection Officer (within the

University the Data Protection Officer is based in CiCS).

We understand that the project, including research records and data, may be subject to

inspection for audit purposes, if required in future.

We understand that personal data about us as researchers in this form will be held by those

involved in the ethics review procedure (e.g. the Ethics Administrator and/or ethics

reviewers) and that this will be managed according to Data Protection Act principles.

If this is an application for a ‘generic’ project all the individual projects that fit under the

generic project are compatible with this application.

We understand that this project cannot be submitted for ethics approval in more than one

department, and that if I wish to appeal against the decision made, this must be done through

the original department.

Name of the Student (if applicable):

Thomas Bello

Name of Principal Investigator (or the Supervisor):

Dr. Andrew Cox

Date: 10 June, 2014

Appendix A2: Ethics Information Consent Form

The University of Sheffield. Information School

Research Data Management Practices of Researchers in Malawi: The Case of Selected Academic Institutions

Researchers

Thomas Bello [email protected]

Purpose of the research

Clearly state the objective of the research in two to three sentences

This study seeks to understand the characteristics, types, and volumes of research data being

generated by researchers in Malawi and assess the methods that the researchers use to maintain

their active data and preserve their legacy data. It also aims to understand the researchers’ perceptions

towards data sharing and the issues they face in the day-to-day management of research data.

Who will be participating?

Indicate who will be participating.

For example “We are inviting adults over 18 who have used Facebook in the past two days.”

We are inviting adults over 18 who are researchers in academic institutions in Malawi.

What will you be asked to do?

Indicate what you will ask them to do.

For example, “we will ask you to complete a brief demographics questionnaire so that we have a

profile of our participant group. Then we will conduct a 15 minute interview about when and how you

use Facebook.”

We will ask you to complete a brief demographics section so that we have a profile of our participant

group. The questions that follow after that are about how you manage your research data. The

questionnaire will take no more than 15 minutes to complete.

What are the potential risks of participating?

This will often be “The risks of participating are the same as those experienced in everyday life.” For

some research, you may need to indicate the risk of anonymity being violated, etc.

The risks of participating are the same as those experienced in everyday life.

What data will we collect?

Be very explicit. Indicate if interviews are audio recorded, if visual observation is used, if participants

are being monitored. In short, state very clearly what is being collected. For example, “We are audio

recording the interviews, and recording all of your actions when you use the computer in a computer

file.”

We will collect data on your research data management practices by asking you to complete an

online Google Forms questionnaire. Once you submit the form, your answers will be anonymously

collected. No personal data will be collected.

What will we do with the data?

Be very explicit. Only state that it is in a locked cabinet if that is indeed correct. If you propose to re-

use the data in future, again be very explicit. If the data is to be destroyed then say so.

For example, “We will be analyzing the data for inclusion in my masters dissertation. After that point,

the data will be destroyed.”

We will be analyzing the data for inclusion in my master’s dissertation. After that point, the data will

be destroyed.

Will my participation be confidential?

Explain how confidentiality will be handled. In some cases, e.g., focus groups or any form of group

activity, anonymity cannot be guaranteed. For example, “We are anonymising the data and coding

the computer files with a random number. No identifying information will be retained.” Or

“Participation is in a focus group with six other people. Our data will be anonymised, but we cannot

guarantee that members of the group will not discuss their participation, although we have requested

that they not do so.”

Your participation in this study will be confidential. There will be no way of knowing who has

responded to the questionnaire because the Google Forms questionnaire confidentiality features will

be enabled so that it does not collect any names or email addresses of respondents.

What will happen to the results of the research project?

State what the plans are and how the participant can receive results. For example, “The results of this

study will be included in my master’s dissertation which will be publicly available. Please contact the

School in six months.” Or “The results of this research will be reported in journal papers; a summary of

the results will be posted to [name a website] or by contacting the primary investigator.”

The results of this study will be included in my master’s dissertation which will be publicly available.

Please contact the School in six months.

I confirm that I have read and understand the description of the research project, and that I have had

an opportunity to ask questions about the project.

I understand that my participation is voluntary and that I am free to withdraw at any time without

any negative consequences.

I understand that I may decline to answer any particular question or questions, or to do any of the

activities. If I stop participating at any time, all of my data will be purged.

I understand that my responses will be kept strictly confidential, that my name or identity will not be

linked to any research materials, and that I will not be identified or identifiable in any report or

reports that result from the research.

I give permission for the research team members to have access to my anonymised responses.

I give permission for the research team to re-use my data for future research as specified above.

I agree to take part in the research project as described above.

By clicking "Continue", you agree to participate in the study.

Note: If you have any difficulties with, or wish to voice concern about, any aspect of your participation in this study, please contact Dr. Angela Lin, Research Ethics Coordinator, Information School, The University of Sheffield ([email protected]), or the University Registrar and Secretary.

Appendix A3: Ethics Approval

Appendix B: Copy of questionnaire

Research Data Management Practices of Researchers in

Malawi

About You

Please tell us a little about yourself:

1. What best describes your main research role?

o Principal Investigator/Project Manager

o Member of Research Team/Group

o Independent Researcher

o Research Assistant

o Research Support/Non-academic Staff

o Research Student (PhD or MPhil)

o Other:

2. What is your research group or research active area?

3. Which institution do you work at?

Details of your research data

For the purpose of this section you should consider the term 'electronic research data' to include all data associated with your projects - this may include numerical data produced by computational experiments, output from experimental equipment, images or audio created from experimental data or data gathered as part of the project or even data collected from surveys relating to the project. 'Research data' do NOT include publications, articles, lectures or presentations. Data that you 'hold' describes any research data that you store anywhere. For example: on a computer, on CDs or on paper.

4. Do you currently hold or have you ever held any research data?

o Yes, I currently hold research data

o Yes, I have held research data in the past

o No

5. Which of the following categories best describe the electronic data created in your field of research? (Please choose all that apply)

o Observational (e.g. video or audio recordings of performances or other primary sources;

photographs of artistic works, historical documents etc. (researcher has a passive role))

o Survey/Interview/Focus Group (e.g. quantitative or qualitative responses to survey or

interview questions; oral history accounts (researcher has an active role))

o Experimental (e.g. spectrometry results)

o Simulated (e.g. data from an engineering model)

o Derived (e.g. data from interrelating survey data)

o Reference (e.g. data cataloguing/describing other datasets)

o Other:

6. What types of research data do you hold (e.g. laboratory notes, image collections, transcripts etc.)? (Please select all that apply)

o Data automatically generated from or by computer programs

o Data collected from sensors or instruments (including questionnaires)

o Laboratory notes

o Scans or x-rays

o Slides

o Patient records

o Physical specimens

o Image/photo collections

o Websites

o MS Word files

o Spreadsheets (e.g. Excel)

o SPSS files

o Digital audio files

o Digital video files

o Video tapes

o Audio tapes

o Fieldwork data

o Text corpus

o Documents or reports

o Transcripts

o Other:

7. What are the principal media on which your research data are stored (not including backups)? (Please select all that apply)

o Hard disk drive of computer on campus

o Hard disk drive of computer off campus

o Hard disk drive of laptop/netbook

o Hard disk drive of instrument/sensor which generates data

o External hard drive

o Shared drive/server (e.g. University server)

o Third party (including commercial data storage)

o Web-based service (e.g. Google Docs, Flickr, Box.net, Dropbox, Pando etc.; please

specify under 'Other')

o CD/DVD

o USB/Flash drive

o Email client/server

o VHS/Video Cassette

o Cassette Tape (Audio)

o Photograph

o Slides

o Microfiche

o On paper

o Other:

8. What formats/software do you use for your electronic research data? (Please select all that apply)

o Documents

o Spreadsheets

o Databases

o Images

o Audio

o Video

o Websites

o Emails (not including other formats attached to emails)

o Unique program/simulation written specifically for project

o Other:

8a. If you store data in databases, please select the primary program you use:

o MS Access .mdb

o OpenOffice .odb

o SPSS

o Oracle

o MySQL

o NVivo

o Other

8b. If you store data as images, please select the primary format you use:

o .jpg/.jpeg

o .gif

o .tiff

o .bmp

o Adobe .pdf

o Adobe .ai

o .svg

o Other

8c. If you store data as audio, please select the primary format you use:

o .mp3

o .wav

o .wma

o Olympus dictaphones .dss

o Other

8d. If you store data as video, please select the primary format you use:

o .avi

o .mpeg

o .wmv

o Flash .swf

o Quicktime .mov

o Other

8e. If you have selected 'Other' for any of the questions 8a-8f, please give details of the software or formats you use:

9. Please estimate how much electronic research data you currently hold/maintain.

o < 1 GB

o 1 - 50 GB

o 50 - 100 GB

o 100 - 500 GB

o 500 GB - 1 TB

o 1 - 50 TB

o 50 - 100 TB

o > 100 TB

o Don't know

Research data storage

10. Do you currently have a data management plan for your research data (for example, data preservation policy, record management policy, data disposal strategy)?

Yes No Don't know

10a. If yes, what was the main driver for developing your strategy?

o Research requirement to access/analyse/annotate others' data

o Requirement of project funder

o Size of project team (i.e. multiple data creators)

o Volume of data associated with project

o Complexity of data associated with project (e.g. multiple formats)

o Absence of university data management policy

o Other:

10b. If no, please confirm why.

o Not required / appropriate to field of research or research group

o Not required by project funder

o Time and effort required

o Lack of training / expertise within research group

o Lack of local support / guidance (e.g. Central Library, ICT)

o Absence of university data management policy

o Don't know

o Other:

11. Who, if anyone, is responsible for managing your electronic research data? (Please select all that apply)

o Myself (select other options only if they are not you)

o Research Project Manager

o Research Assistant

o Research Technician

o PhD Student

o Other designated person in Research Group

o Departmental IT Officer

o Central ICT

o Local Data Centre

o National data centre / data archive

o International data centre / data archive

o Don't know

o No one

o Other:

11a. If you use any external data centre or archive, please give details:

12. Have you ever lost research data which was not backed up? (Please select all that apply)

o No

o Yes, through hardware failure

o Yes, through software failure

o Yes, through human error or loss

o Other:

13. Have you ever experienced any problems storing your research data due to the size of the files?

Yes No

13a. If yes, please give details:

Research Data Backup

14. On average, how frequently is your data backed up?

o Daily

o Weekly

o Monthly

o Annually

o Ad hoc

o Never

o Don't know

14a. What data tends to be backed up?

o Everything

o Data critical to project

o Data required for publication

o Don't know

14b. Where are they backed up? (Please select all that apply)

o Hard disk drive of computer on campus

o Hard disk drive of computer off campus

o Hard disk drive of laptop/netbook

o Hard disk drive of instrument/sensor which generates data

o External hard drive

o Shared drive/server (e.g. University server)

o Third party (including commercial data storage)

o Web-based service (e.g. Google Docs, Flickr, Box.net, Dropbox, Pando etc.) (please specify under 'Other')

o CD/DVD

o USB/Flash drive

o Email client/server

o Floppy Disk

o VHS/Video Cassette

o Cassette Tape (Audio)

o Photograph

o Slides

o Microfiche

o On paper

o Don't know

o Other:

15. If the service was offered, would you want your university's repository to store any of your research data, either for your exclusive use or for wider access? The hypothetical repository would offer to store whatever research data researchers volunteer (and possess the appropriate rights to volunteer) with a retention period of their choosing. The files would be stored securely with accessibility limited by default to only the researcher in question. The researcher would have the option of widening access anywhere from specific other users to full public open access. The repository would, therefore, provide separate, voluntary facilities for: long-term storage, backups, sharing of data for collaboration purposes with colleagues, and open access. The repository would offer facilities aimed at meeting stricter requirements now made by many funding bodies.

o Yes

o No

16. If yes, how long would you want the repository to retain any of your research data, including data only accessible by you?

Response options for each retention period: None of my data; Some of my data; Much of my data; All of my data

Retention periods:

o Not at all

o Until the end of the project

o For a finite period after end of project

o Until I leave the University

o In perpetuity

Research data sharing

17. Who owns the research data you hold?

o I own all of the data I hold

o I own some of the data I hold

o I own none of the data I hold

o Don't know

18. Do you share ownership of any of your research data with others? (Please select all that apply)

o No

o Yes, with other academics/researchers

o Yes, with journals/publishers

o Yes, with funding bodies

o Other:

19. How do you currently share research data with colleagues? (Please select all that apply)

o I never share data with colleagues

o E-mail

o Shared computer

o Shared drive/server (e.g. University server)

o Using portable storage (e.g. CDs, DVDs, external hard drive, memory sticks etc.)

o Web-based service (e.g. Google Docs, Flickr, Box.net, Dropbox, Pando etc.) (Please specify under 'Other')

o On paper

o Other:

20. What problems have you encountered when sharing data with colleagues? (Please select all that apply)

o Finding suitable shared storage space

o Lack of file naming conventions made it difficult to identify files

o Lack of version control caused confusion

o Legal issues arising from international transfer of data

o Problems establishing ownership of data

o Time consuming to keep all colleagues constantly up to date

o I have not encountered problems

o Other:

21. Apart from yourself, who would you want to be allowed access to your research data?

Response options for each group: None of my data; Some of my data; Much of my data; All of my data

Groups:

o My colleagues

o My colleagues

o My school

o The whole university

o Specified academic communities beyond the university

o Anyone (including general public)

22. What factors would prevent your research data from being made open access to the general public? (Please select all that apply)

o None

o I do not believe the public would have any use for some of my data

o I do not have the ownership rights to share all of my data

o Data have commercial value

o Funder restrictions

o Data are not ready to be released/concern unpublished work

o Protect own ideas or intellectual property

o Data contain personal information/have not been anonymised

o Ethics requirements of university/funder

o Other:

23. Have you ever applied for funding from a body that required some degree of open access to be provided for your research data?

Yes No Don't know

23a. If yes, please state funder and give details:

23b. Have you ever experienced difficulties in meeting these requirements?

o No

o Yes, but I have always been able to meet the requirements

o Yes, as a result I was unable to obtain funding through this body

o Yes, and I need training and guidance

Conclusion

24. Do you have any specific concerns over the current management of your research data or services you would like to see offered by your university to guarantee access to this data in the future?

End of questionnaire Thank you for taking the time to complete this survey. Your contribution is very much appreciated.

Appendix C – Additional Survey Results

Figure C1: Research data categories by discipline

Appendix C1: Formats/software for research data

In terms of the formats/software researchers use for their data, the results show that spreadsheets and documents are used equally, together accounting for 50% of all the formats used, followed by databases at 15%. Images, emails, audio, websites and video are used in low proportions. Figure C2 provides a summary of these formats/software.

[Figure C1: bar chart, horizontal axis scaled 0 to 9. Disciplines: Agricultural Sciences; Engineering & Architecture; Humanities; Science & Technology; Law; Business & Management Science; Medicine & Health; Social Sciences. Data categories: Reference; Derived; Simulated; Experimental; Survey/Interview/Focus Group; Observational.]

Figure C2: Responses to question 8 “What formats/software do you use for your

electronic research data?”

Appendix C2: Common database software

29 out of the 34 respondents indicated that they store data in databases. Figure C3 shows that

more than half of these use SPSS and about one quarter use MS Access. NVivo, MySQL and

OpenOffice are used by less than 10% of the respondents each while 10% indicated that they

use ‘other’ database software such as STATA and Microsoft Excel.

Figure C3: Responses to Question 8a “If you store data in databases, please select

the primary program you use:”

[Figure C2 values: Spreadsheets 25%; Documents 25%; Databases 15%; Images 9%; Emails (not including other formats attached to emails) 8%; Audio 5%; Unique program/simulation written specifically for project 4%; Websites 4%; Other 2%; Video 2%.]

Appendix C3: Image formats

Use of image data seems to be popular, with 22 participants (65%) indicating that they store data as images. Figure C4 shows that nearly three quarters of these use ‘.jpg/.jpeg’, followed by 14% who use ‘Adobe .pdf’. Only 5% use ‘.tiff’, and 9% say that they use ‘other’ formats such as ‘post script (ps); and encapsulated post script (eps)’ and ‘*.shp; geotiff; *.ai depending on image types’.

[Figure C3 values: SPSS 52%; MS Access .mdb 24%; Other 10%; NVivo 7%; OpenOffice .odb 4%; MySQL 3%.]

Figure C4: Primary format of images

[Figure C4 values: .jpg/.jpeg 73%; Adobe .pdf 14%; Other 9%; .tiff 4%.]

Appendix C4: Audio formats

Of those who indicated that they store audio data, Figure C5 shows that the majority (67%) use ‘.mp3’ as the primary format and close to a fifth use ‘.wma’, while less than 10% use ‘.wav’. Close to a tenth of them indicated that they use other audio formats.

Appendix C5: Video formats

Of all the respondents, 13 (38%) reported that they store video data. Figure C6 shows that ‘.mpeg’ is used by approximately half of them. The ‘.avi’, ‘.wmv’ and ‘Flash .swf’ formats are each used by about 15% of these respondents, while less than a tenth primarily use other video formats such as ‘.MP4’.
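The respondent counts quoted in Appendices C2, C3 and C5 (29, 22 and 13 of the 34 respondents) can be converted to percentages with simple arithmetic. The sketch below is illustrative only: the function and variable names are assumptions introduced here, not part of the study. It reproduces the 65% and 38% figures stated in the text; the share for databases is not stated in the text and is derived here from the reported count.

```python
# Illustrative sketch: counts are taken from Appendices C2, C3 and C5.
# Names such as share_of_respondents are hypothetical, not from the study.
TOTAL_RESPONDENTS = 34  # total survey respondents reported in the abstract


def share_of_respondents(count, total=TOTAL_RESPONDENTS):
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)


counts = {"databases": 29, "images": 22, "video": 13}
shares = {name: share_of_respondents(n) for name, n in counts.items()}
print(shares)
```

Rounding to whole numbers matches the precision used in the appendix text.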

Figure C5: Primary format of audio

[Figure C5 values: .mp3 67%; .wma 17%; Other 11%; .wav 5%.]

Figure C6: Primary format of video

[Figure C6 values: .mpeg 46%; .avi 16%; .wmv 15%; Flash .swf 15%; Other 8%.]

Appendix C6: Other applications

Some of the software that participants are using in their different fields is summarised in

Table C1 below.

Table C1: Other file formats/software being used by respondents and their areas of application

Format/Software | Area of application

post script (ps); and encapsulated post script (eps) | Mathematical Sciences

SSH | Genomics

Computer aided design software | Architecture

*.shp; geotiff; *.ai | Geosciences

STATA | Population studies

Appendix C7: Data that is backed up

Data critical to research projects tends to be backed up most (43%), followed by every type of data (38%) and then data required for publication (14%). Figure C7 summarises these findings.

Figure C7: Responses to Question 14a “What data tends to be backed up?”

[Figure C7 values: Data critical to project 43%; Everything 38%; Data required for publication 14%; Don't know 5%.]

Appendix C8: Researchers’ specific concerns

Researchers in Malawi express various concerns over the current management of their

research data or services they would like to see offered by their universities to guarantee

future access to the data. Their responses have been categorised into different themes.

Theme 1: Policy and Storage issues

“Need system of data management and secure server in the department”

“At the moment my storage of research data at my UNI is on a personal basis. I don't know if

there's a data management policy, I will have to check but I think it will nice to have one”

Theme 2: Concerns of data theft

“In most cases there is element of data theft, mainly between IT personnel and the data

'hunters'. Other don’t mind other people's effort and energy engaged in data collection

especially in its raw form. Once published then it can be made public”.

Theme 3: Investment / infrastructure / sharing /access / storage

“My university needs to invest more in ICT access to make it possible to start comfortably

sharing data”

“It would be useful if research data mainly Theses were posted online through institutionally

controlled access for easy access by those interested both nationally and internationally”.

“I would love to have a university central server where I ca deposit my data and be able to

retrieve my data when I am within or outside campus including outside the country”.

Theme 4: Connectivity issues

“The most serious problem is that internet services are poor thereby affecting public access to

some data that we would want to share”.

“Yes, there is a serious challenge with internet connectivity at Polytechnic. Secondly our

publications do not appear in full on our website”.

Theme 5: Training / awareness issues

“Lack of knowledge about data management and there seems to be no-one who minds to

offer some enlightenment on the same”.

Theme 6: Perceived administrative issues

“Management taking too long to put things in place”.

Appendix D – Letter of introduction

Access to Dissertation

A Dissertation submitted to the University may be held by the Department (or School) within which

the Dissertation was undertaken and made available for borrowing or consultation in accordance

with University Regulations.

Requests for the loan of dissertations may be received from libraries in the UK and overseas. The

Department may also receive requests from other organisations, as well as individuals. The

conservation of the original dissertation is better assured if the Department and/or Library can fulfill

such requests by sending a copy. The Department may also make your dissertation available via its

web pages.

In certain cases where confidentiality of information is concerned, if either the author or the

supervisor so requests, the Department will withhold the dissertation from loan or consultation for

the period specified below. Where no such restriction is in force, the Department may also deposit

the Dissertation in the University of Sheffield Library.

To be completed by the Author – Select (a) or (b) by placing a tick in the appropriate box

If you are willing to give permission for the Information School to make your dissertation available in

these ways, please complete the following:

X (a) Subject to the General Regulation on Intellectual Property, I, the author, agree to this dissertation being

made immediately available through the Department and/or University Library for consultation, and for

the Department and/or Library to reproduce this dissertation in whole or part in order to supply single

copies for the purpose of research or private study

(b) Subject to the General Regulation on Intellectual Property, I, the author, request that this dissertation be

withheld from loan, consultation or reproduction for a period of [ ] years from the date of its

submission. Subsequent to this period, I agree to this dissertation being made available through the

Department and/or University Library for consultation, and for the Department and/or Library to

reproduce this dissertation in whole or part in order to supply single copies for the purpose of research

or private study

Name: Thomas Mphatso Bello

Department: Information School

Signed: Thomas Mphatso Bello Date 27 August 2014

To be completed by the Supervisor – Select (a) or (b) by placing a tick in the appropriate box

(a) I, the supervisor, agree to this dissertation being made immediately available through the Department

and/or University Library for loan or consultation, subject to any special restrictions (*) agreed with

external organisations as part of a collaborative project.

*Special

restrictions

(b) I, the supervisor, request that this dissertation be withheld from loan, consultation or reproduction for a

period of [ ] years from the date of its submission. Subsequent to this period, I, agree to this

dissertation being made available through the Department and/or University Library for loan or

consultation, subject to any special restrictions (*) agreed with external organisations as part of a

collaborative project

Name

Department

Signed Date

THIS SHEET MUST BE SUBMITTED WITH DISSERTATIONS BY DEPARTMENTAL REQUIREMENTS.