TROVE APPLICATION PROGRAMMING INTERFACE (API) SURVEY 2017

USING THE TROVE API

A summary of survey responses

Cathie Oats

Director, Trove

14 March 2017

Contents

Executive Summary

Methodology

Survey response rates

General questions about Trove

Satisfaction with Trove

Improvements to Trove

Sectors

Current use of the Trove API

API Performance

Completeness of metadata

Options for improvements to the Trove API

Follow-up actions from Trove API survey

Appendix A: Use of the Trove API – some examples of websites and apps

Appendix B: Trove API Survey questions 2017

Executive Summary

The Trove API was first made available in April 2012. Since that time it has been used by a variety of users to display results from Trove on other websites, harvest Trove records for inclusion in other databases, retrieve tags or comments added to records contributed by organisations, conduct offline analysis, and create new tools and visualisations.

The National Library of Australia conducted a survey on the use of the Trove API from 14 to 24 February 2017. The purpose of the survey was to gather information from individuals and representatives of organisations on why they use the API, their experiences using the API, and suggestions for improvements. A total of 198 complete responses were received, with 79% of respondents identifying as male, 17% as female, and 4% as other.

The survey results show there is a dedicated group of technically proficient Trove users who are using the API regularly for the purposes for which it was intended. A second set of infrequent users is experimenting with the API as part of their studies. Examples of how individuals and organisations are using the API are included in this report.

In relation to improvements to the API, there were concerns about its stability and also requests to expand the range of metadata included. There is significant interest from the technically proficient users in being involved with the open development of the API.

‘There was one thing in particular I wanted to comment on which wasn't in the survey; it was about the process of development and maintenance of the API. What I would particularly like to see is this process being made open.’

‘I also think the source code for the system implementing the API should be open source so that external developers can review it and help to identify and correct errors.’

There are a number of specific Trove and Trove API development projects which will benefit from the information gathered in this survey. These projects will be rolled out in phases to suit the differing needs of the technically proficient users, students and organisations using the API to populate datasets and websites.

Specific actions in relation to improvements to the Trove API will be explored and prioritised following interviews with interested API users and a period of internal consultation.

Methodology

A total of 1,412 emails were sent to individuals who had registered for an API key for either commercial or non-commercial purposes.

The key focus areas of the survey were:

General questions about Trove, including standard questions developed for use in all Library surveys;

Current use of the Trove APIs; and

Options for improvements to the Trove API.

Survey questions were developed in consultation with internal and external stakeholders, including members of the Trove Community of Practice (a NSLA CoP).

Potential survey participants were sent the survey link via email. A survey link was also placed on the API overview page of the Trove Help Centre. Two reminder emails were sent throughout the survey period.

The Trove team created and administered the survey using Qualtrics software.

A list of respondents who volunteered to take part in follow-up interviews about Trove and/or their API use was compiled.

Survey response rates

198 valid responses were received

62 emails bounced

The response rate was 14% based on the total number of invitations sent (n=1,412), or 15% based on the known valid sample size (n=1,340).
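The two response rates can be reproduced from the figures reported above. The short Python sketch below is illustrative only; the function and variable names are assumptions introduced here and are not part of the survey tooling.

```python
def response_rate(responses: int, sample_size: int) -> float:
    """Return the response rate as a percentage of the sample."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return 100.0 * responses / sample_size


# Figures taken from the report.
valid_responses = 198
invitations_sent = 1412
known_valid_sample = 1340

rate_all = response_rate(valid_responses, invitations_sent)
rate_valid = response_rate(valid_responses, known_valid_sample)

print(f"{rate_all:.1f}% of invitations sent")      # 14.0%
print(f"{rate_valid:.1f}% of known valid sample")  # 14.8%
```

Rounded to whole numbers, these values give the 14% and 15% quoted in the report.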

The survey was structured to allow respondents to progress through the survey without answering all questions. Most questions were not mandatory, and negative feedback was received in response to mandatory questions.

A high number of free text comments provided examples of how individuals and organisations have engaged with, built on and shared digital content based on their use of the Trove API.

General questions about Trove

Overall, survey respondents were very positive about Trove, its services and its importance to Australian research. How this sentiment aligns with satisfaction with the Trove API is examined in more detail later in this report.

Satisfaction with Trove

Satisfaction with Trove was very high, with the majority of responses indicating an above average or excellent satisfaction rating. Only 3.5% of respondents rated their satisfaction with Trove as poor or below average.

Figure 1: Satisfaction with Trove

The following quotes illustrate some of the reasons for this high level of satisfaction with Trove:

‘Directly relevant to the history curriculum (we are a K-12 School) and facilitating discovery and access to these resources. Authoritative source of truth for call numbers and other catalogue details, source of inspiration for subject headings and source of missing details and contextual information that helps us provide better catalogue records. Lets us know whether a title we hold should be retained or discarded, we try to retain items that Trove demonstrates may be rare or unique (i.e. no longer in circulation in Australia) which helps to protect national history.’

‘Trove has enabled my workplace to significantly improve the service we deliver to Aboriginal and Torres Strait Islander clients.’

Respondents were also satisfied with the information they receive from Trove, with a majority providing an above average rating. Comments by respondents indicated they would like to receive more content from the API, which currently does not include all the fields in Trove, and this may have affected the rating.

Figure 2: Satisfaction with information

‘PhD research has been made SO much more convenient, either through digitised newspaper collection, or in locating where various books are held, rather than having to search each library catalogue separately.’

‘TROVE displays our past corporate magazine and student newspaper thus making it possible for anyone, alumni included, from anywhere in the world access to the University's history...’

‘My interest is in early trials in NSW. Because official police court trial reports no longer survive for this time, Trove is the only location for any information about what happened there. There is no other source. (See my website http://nis.wikidot.com) The ability to follow someone around Australia or investigate non-city happenings so easily is very useful for me as I can work from home and don't need a trip to Sydney or Canberra to search newspapers.’

Improvements to Trove

Several themes emerged from the 74 free text comments about Trove. These included adding more digitised content, particularly digitised newspapers; changes to the Trove interface, ranging from small functionality changes to overhauling the entire concept of zones; adding more content from cultural institutions; improving OCR quality; more engagement with users before making functionality improvements; and improved identification of open access and available online resources.

‘remove the concept of zones and make it as a normal facet. include more facets’

‘Keep digitising newspapers and more recent material. post 1950.’

‘Newspaper advertising 'articles' are composites of many advertisements. It would be good if each advertisement could be a separate sub-document. More newspapers scanned and made available.’

‘Better identification of open access resources -- search by licence etc. Better integration of people zone and other resources. Ability for users to define relationships between resources, and between people and resources.’

‘It would be good to see a simpler, more streamlined interface. I'd love more datasets from cultural institutions added. It would be excellent to get Trove and DigitalNZ talking to each other at the API level too.’

‘Improve the quality of the OCR so that more content can be found! And add more content - Australian cultural organisations should spend more of their budgets on digitisation and making content fully searchable.’

‘Better quality control on Trove version releases. Improved engagement with select public users for functionality of future releases.’

Sectors

Over one-third (37.01%) of respondents identified themselves as family historians, professional historians/researchers, and people conducting personal research. Respondents from this group used the API to populate websites with content, download the full text of digitised newspapers, and transfer metadata to a system they were comfortable working in for further analyses.

‘I refer clients to Trove all the time for research purposes, especially family history research. The newspaper zone is particularly rich pickings for researchers. I've used excerpts from digitised newspapers in Trove to make a video about a historical event for my workplace. Other staff have used it in similar ways.’

Figure 3: Screenshot from the VicFix campaign (https://www.slv.vic.gov.au/contribute-create/completed-vicfix-campaigns), which focuses on text correction of specific events in Victorian newspapers in Trove.

26% of respondents were from universities. The Trove API had been used in several university courses to teach students about APIs and introduce them to working with large data sets. Extracting large data sets from Trove, especially full-text digitised newspaper articles, was a common theme. The data was then used to build experimental interfaces, identify and analyse trends in digitised newspaper content, examine existing collections, and enrich other datasets, as illustrated by the following quote:

‘I am a historian working with criminal justice data in a longitudinal database covering all states and over more than 150 years. Extraction of data from digitised newspapers is essential to the enrichment of the data we use. And we have integrated a Trove API into our database management system to enable semi-automated discovery of newspaper reports related to our data (now at more than 250,000 cases). The scope of the project in time, breadth and depth would be impossible without Trove.’

24.41% of respondents identified as an employee at a library or cultural institution, or a vendor building software for this sector, with nine reporting they were Trove content partners. This group of users tended to use the Trove API to enhance their collections, including by cataloguing items, identifying gaps in their collections and holdings, and promoting digitised newspapers from their local area.

‘I have been able to catalogue our library - and grab geographic distribution information about where our books are available elsewhere. This has been integral to understanding which items in our collection are more or less important.’

Another theme to emerge with the use of the API is play. There were several examples of users creating Twitter bots that tweet recipes, knitting patterns, or historical newspaper articles relating to current affairs. One participant in a GovHack event created an application that colourised black and white images.

Figure 4: Screenshot from the Colourful Past website (http://colourfulpast.org/), which colourises black and white images found through the Trove API.

A further selection of products which use the Trove API is shown in Appendix A. The breadth of these products displays the diverse use of the API, as well as the diversity of audiences using the service.

Current use of the Trove API

Searching across the records in Trove is the most common use of the Trove API. This raises questions about whether current functionality for whole-of-Trove searching meets the needs of technically proficient users.

Participants were asked to list any issues they encountered while using the Trove API. The two common themes which emerged were:

The performance of the API; and

The completeness of metadata.

API Performance

Performance issues included the service being unavailable, requests timing out, a high number of errors when searching digitised newspapers, and rate limiting. The following quotes provide an insight into respondent concerns:

‘Trove's current APIs are a good start, but could be improved in terms of reliability, performance, and rate limiting. We ended up taking the Trove API out of the public version of one project because it was too slow to return results compared to the other APIs we used (like the NZ equivalent, DigitalNZ). That said I also realise that API improvements take a lot of effort!’

‘Towards the end of the semester when all students were making calls from the finished apps, Trove servers just stopped answering for hours. At times the same call would have a very different answer, even when they were done just minutes apart.’

‘The rate limit that is currently in place would make my app non-responsive if a small number of users were using the app concurrently. That is a non-starter for me.’
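Respondents described time-outs, intermittent errors and rate limiting. One common client-side mitigation, which neither the survey nor the Trove documentation prescribes, is to retry failed calls with exponential backoff. The sketch below is a minimal illustration in Python: `TransientError`, `call_with_backoff` and the default delays are hypothetical names and values chosen for the example, and `fetch` stands in for whatever function actually issues the API request.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for a timeout, server error or rate-limit rejection."""


def call_with_backoff(
    fetch: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), retrying on TransientError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # The delay doubles on each attempt and is capped at max_delay;
            # random jitter stops many clients retrying at the same moment.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * random.uniform(0.5, 1.0))
    raise ValueError("max_attempts must be at least 1")


# Usage with a simulated request that fails twice before succeeding.
attempts = {"count": 0}


def flaky_fetch() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientError("simulated timeout")
    return "ok"


delays: list[float] = []
result = call_with_backoff(flaky_fetch, sleep=delays.append)
print(result, attempts["count"], len(delays))
```

Passing `sleep` as a parameter lets the example run without real delays. A production client would also need to stay within whatever rate limit applies to its API key.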

Completeness of metadata

The second theme that emerged from responses is the completeness and quality of metadata returned via the API. Respondents expressed a desire for all metadata that is available via the Trove interface to be made available via the API. The following quotes illustrate users’ needs for the availability of all metadata:

‘Inconsistent/missing metadata (like extra notes and rights info) that was present on the trove front end for some records, but not available in the metadata from the Trove API for that same record.’

‘Not all data is exposed - more is exposed thru the human interface, so we used that instead. Functionality beats stability, as without that info, we can’t use the data.’

Twenty-seven respondents indicated that they had screen scraped data from Trove because they could not access it in any convenient way.

We asked participants to indicate how they would like to see the Trove API help centre improved; the following suggestions were received:

‘I recall it being one very long page...should be broken up into easier to digest chunks.’

‘It's OK in my personal opinion. Basic formatting and sectioning could be cleaner and more distinct.’

‘maybe some short video/screen casts - suitable for beginners.’

‘Some functions of the API are not well documented in the Help centre, and there is no documentation of the People API in the help centre.’

Respondents were very positive about the option of introducing an API knowledge base and discussion forum where users could discuss solutions to problems.

Options for improvements to the Trove API

Participants were asked to comment on how they would like to see the Trove API improved. 46 responses were recorded, reflecting themes of needing to improve the performance of the API, data quality and format, and the availability of all metadata. Other useful suggestions included:

Sharing the API code base and allowing developers to contribute to API service development;

Increasing the data filtering options available via the API;

Investigating the possibility of allowing users to write data back to Trove (e.g. annotations);

Ensuring the API is available via https;

Providing an XML namespace and schema;

Providing CSV output;

Ability to get all results from all zones in a single set of data; and

Better pagination of search results.
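Where a service does not offer CSV output, a client can flatten the returned records itself. The following Python sketch shows one way this might be done with the standard library. The record fields (`id`, `title`, `year`) and the `records_to_csv` helper are invented for illustration and do not reflect the actual structure of Trove API responses.

```python
import csv
import io
from typing import Iterable, Mapping


def records_to_csv(records: Iterable[Mapping[str, object]], fields: list[str]) -> str:
    """Serialise records to CSV text, keeping only the named fields."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Fields missing from a record become empty cells; extra fields are dropped.
        writer.writerow({name: record.get(name, "") for name in fields})
    return buffer.getvalue()


# Hypothetical records standing in for parsed API results.
sample = [
    {"id": "1", "title": "Example article", "year": 1901, "notes": "not exported"},
    {"id": "2", "title": "Title, with comma"},
]

csv_text = records_to_csv(sample, ["id", "title", "year"])
print(csv_text)
```

The `csv` module quotes values that contain commas, so titles survive a round trip through a spreadsheet.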

Follow-up actions from Trove API survey

It is intended that further interviews will be conducted with Trove API users, from the list of volunteers compiled during the survey. These interviews will focus on gaining greater understanding of issues around performance and stability of the Trove API, metadata quality, and any potential solutions. Further opening up API development will also be considered, including using an open-source API framework and creating a forum and knowledge base for users. The perceived benefits of any proposed enhancements, including improved experience for select API users and potentially increasing user numbers overall, will be considered against the resources required to implement them.

Appendix A: Use of the Trove API – some examples of websites and apps

1. Culture Collage website - http://zenlan.com/collage/ - a digital heritage image search that creates a collage of images, using images from libraries and organisations worldwide including Trove through the Trove API.

2. Recipe Trove app - https://hughrun.github.io/recipe_trove/ - a Twitter bot using the Trove API. Tweet an ingredient or dish at @recipe_trove and it will reply with an appropriate recipe in the form of a JPEG image, and a link to the original article in Trove.

3. Trove Knitting Patterns app - http://shrouded-ocean-2009.herokuapp.com/patterns - uses the Trove API to search Trove and list knitting patterns, displays the metadata and provides an option to tweet the pattern.

4. AustLit widget - http://www.austlit.edu.au/austlit/page/C779027 - developed to query Trove API digitised newspapers for additional information on AustLit records.

5. Postcard Tree website - http://www.postcardtree.com – uses the Trove API to assist in family history research. It searches millions of handwritten postcards and helps find messages originally sent to people you may be researching.

6. Trove NLA Books app - https://play.google.com/store/apps/details?id=com.etapps.trovenla&hl=en – searches for books through Trove and checks where they're available to borrow in every library listed across Australia.

Appendix B: Trove API Survey questions 2017

Q1. Welcome to the Trove API survey 2017. Trove is seeking your feedback on our APIs, as well as general feedback on our services. The survey is divided into 3 parts:

Part 1 – General questions about Trove

Part 2 – Current use of the Trove APIs

Part 3 – Options for improvements to the API

All responses to the survey are confidential, and individual responses will not be identified in the reporting of findings. It is expected the survey will take approximately 3-5 minutes to complete. If you have any questions about the survey, please contact us.

Q2. Part 1 - General questions about Trove

Q3 How would you rate your satisfaction with Trove? Poor (1) Below average (2) Average (3) Above average (4) Excellent (5)

Q4 How would you rate Trove on its ability to find the information you required? Poor (1) Below average (2) Average (3) Above average (4) Excellent (5)

Q5 To what extent do you agree or disagree with the following statements about Trove? (1 to 9 rating scale where 1 is least important and 9 is most important)

1 (1) 2 (2) 3 (3) 4 (4) 5 (5) 6 (6) 7 (7) 8 (8) 9 (9)

Trove is of critical importance to Australia (1)

Trove adds significant value to my organisation (2)

Trove adds significant value to my research (3)

The richness and diversity of Trove is excellent (4)

I have a high level of trust in Trove (5)

I value being able to use the Trove API to augment my website/service (6)

Q6 What improvements would you like to see Trove make to the service it provides?

Q7 Please provide details below of any examples where Trove has significantly added value to you/your organisation?

Q8 Part 2 - Current use of the Trove APIs

Q9 There are two Trove API services, please indicate what data you have used. (If you have used both the Trove API and the Trove Party (People Australia) API you will be asked to give separate responses to questions which are repeated in the survey.) Trove API (1) Trove Party (People Australia) API (2)

Q10 Please select your current use: commercial user (1) non-commercial user (2)

Display This Question: If There are two Trove API services, please indicate what data you have used. Trove API Is Selected

Q11 Have you used the Trove API as: A tertiary student undertaking a research project or class module (1) An academic or research assistant undertaking a research project (2) An employee at a library or cultural institution, or a vendor building software for this sector (3) An academic (4) An IT developer (5) A participant in a HackFest (6) Other (Free text response) (7) ____________________

Display This Question: If Have you used the Trove API as An employee at a library or cultural institution, or a vendor building software for this sector Is Selected

Q12 Are you a content partner? Trove contains links to collections in other organisations. These organisations are our content partners. Yes (1) No (2)

Display This Question: If Are you a content partner Yes Is Selected

Q13 Please provide examples of how being in Trove increases the exposure of your collections to others.

Display This Question: If Are you a content partner Yes Is Selected

Q14 Do you think that Trove helps you see your collection in context with other Australian collections?

Display This Question: If There are two Trove API services, please indicate what data you have used. Trove Party (People Australia) API Is Selected

Q15 Have you used the Trove Party (People Australia) API as: A tertiary student undertaking a research project or class module (1) An academic or research assistant undertaking a research project (2) An employee at a library or cultural institution, or a vendor building software for this sector (3) An academic (4) An IT developer (5) A participant in a HackFest (6) Other (Free text response) (7) ____________________

Display This Question: If Have you used the Trove Party (People Australia) API as An employee at a library or cultural institution, or a vendor building software for this sector Is Selected

Q16 Are you a content partner? Trove contains links to collections in other organisations. These organisations are our content partners. Yes (1) No (2)

Condition: No Is Selected. Skip To: How often have you used the Trove Pa....

Display This Question: If Have you used the Trove Party (People Australia) API as An employee at a library or cultural institution, or a vendor building software for this sector Is Selected

Q17 Please provide examples of how being in Trove increases the exposure of your collections to others.

Display This Question: If Have you used the Trove Party (People Australia) API as An employee at a library or cultural institution, or a vendor building software for this sector Is Selected

Q18 Do you think that Trove helps you see your collection in context with other Australian collections?

Display This Question: If There are two Trove API services, please indicate what data you have used. Trove API Is Selected

Q19 How often have you used the Trove API? Once (1) A couple of times, or more (2) Monthly (3) Multiple calls every day (4)

Display This Question: If There are two Trove API services, please indicate what data you have used. Trove Party (People Australia) API Is Selected

Q20 How often have you used the Trove Party (People Australia) API? Once (1) A couple of times, or more (2) Monthly (3) Multiple calls every day (4)

Display This Question: If There are two Trove API services, please indicate what data you have used. Trove API Is Selected

Q21 Please indicate if you have used the Trove API to:

All the time (1) Often (2) Sometimes (3) Rarely (4) Never (5)

Search across the records in Trove. (1)

Get information about a single item in Trove. (2)

Look up other associated data in Trove. (3)

Display This Question: If There are two Trove API services, please indicate what data you have used. Trove Party (People Australia) API Is Selected

Q22 Please indicate if you have used the Trove Party (People Australia) API to:

All the time (1) Often (2) Sometimes (3) Rarely (4) Never (5)

Search across the records in Trove. (1)

Get information about a single item in Trove. (2)

Look up other associated data in Trove. (3)

Display This Question: If There are two Trove API services, please indicate what data you have used. Trove API Is Selected

Q23 Please share your reason for using the Trove API? For example, have you built an app or used it to contribute to a website? Please provide a name or link.

Display This Question: If There are two Trove API services, please indicate what data you have used. Trove Party (People Australia) API Is Selected

Q24 Please share your reason for using the Trove Party (People Australia) API? For example, have you built an app or used it to contribute to a website? Please provide a name or link.

Q25 How often does your application(s):

All the time (1) Often (2) Sometimes (3) Rarely (4) Never (5)

Call and display Trove API results on the fly (1)

Harvest / download Trove data to enhance records in a local system or improve linking (2)

Harvest / download Trove data to a local system, to undertake deeper analysis (3)

Display This Question: If There are two Trove API services, please indicate what data you have used. Trove API Is Selected

Q26 Have you experienced any issues, for example, reliability or performance issues, with the Trove API?

Display This Question: If There are two Trove API services, please indicate what data you have used. Trove Party (People Australia) API Is Selected

Q27 Have you experienced any issues, for example, reliability or performance issues, with the Trove Party (People Australia) API?

Q28 Have you used the Trove API Help centre? Yes (1) No (2)

Condition: No Is Selected. Skip To: Have you screen scraped Trove data, b....

Display This Question: If Have you used the Trove API Help centre? Yes Is Selected

Q29 Which sections did you use? Please rate their usefulness.

Not used (1) Not at all useful (2) Moderately useful (3) Very useful (4)

API overview (1)

API technical guide (2)

API terms of use (3)

Application gallery (4)

Examples (5)

Experiments (6)

GovHack tips & tricks (7)

World War 1 API examples (8)

Display This Question: If Have you used the Trove API Help centre? Yes Is Selected

Q30 Would you like any changes made to the Trove API Help centre?

Q31 Have you screen scraped Trove data, because you couldn't access it in any convenient way? Yes (1) No (2)

Condition: No Is Selected. Skip To: End of Block.

Display This Question: If Have you screen scraped Trove data, because you couldn't access it in any convenient way? Yes Is Selected

Q32 If you have screen scraped data from Trove, please share what you did with it?

Q33 If you have published a mobile app using the API how long would it take you to update that app once an updated API is released? 1-3 months (1) 4-6 months (2) Longer: please specify (3) ____________________

Q34 Part 3 - Options for improvements to the API

While we will do our best to implement your suggestions on how we can improve the API, it is unlikely that we will be able to respond to all your suggestions. For example it is unlikely we will be able to increase call rates to support all commercial uses.

Q35 What alternative access to the Trove corpus would you use other than an API? Bulk data download (1) Downloads of specific data sets (eg. People) (2) Other (3) ____________________

Display This Question: If There are two Trove API services, please indicate what data you have used. Trove API Is Selected

Q36 How would you like to see the Trove API improved?

Display This Question: If There are two Trove API services, please indicate what data you have used. Trove Party (People Australia) API Is Selected

Q37 How would you like to see the Trove Party (People Australia) API improved?

Q38 Would you like a discussion area for technical help from Trove or from other developers? Yes (1) No (2)

Q39 What sort of technical support do you expect from Trove for problems and queries? a knowledge base (1) discussion forum (2) help line (3)

Q40 Can you identify any outstanding examples of APIs (your opinion) within the Galleries, Libraries, Archives and Museum sector? Why do you think they are outstanding?

Q41 Can you identify any APIs that are not from the Galleries, Libraries, Archives and Museum sector that you think are exceptional? What makes them exceptional?

Q42 What is your postcode?

Q43 Type in the suburb if you are unsure of your postcode.

Q44 If you live overseas, what country do you reside in?

Q45 What is your gender? Male (1) Female (2) Other (3) ____________________

Q46 How old are you? Under 18 years (1) 18 - 24 years (2) 25 - 34 years (3) 35 - 44 years (4) 45 - 54 years (5) 55 - 64 years (6) 65 - 74 years (7) 75 years and over (8)

Q47 Are you of Aboriginal or Torres Strait Islander origin? Yes (1) No (2)

Q48 Do you speak a language other than English at home? Yes (1) No (2)

Display This Question: If Do you speak a language other than English at home? Yes Is Selected

Q49 Which language do you speak at home?

Q50 Follow up

Trove and the National Library may wish to follow up on some comments made during the survey. If you would like to be involved in any follow up or further research, please provide your details below.

Q51 Name:

Q52 Email:

Q53 Telephone:
