Innovation in Assembly: National Library of Australia: Trove

Innovation in Assembly is one of the eight core design patterns (O’Reilly, 2005). It refers to online services that let users remix and build on a platform provided by a business. The business exposes the existing data and capabilities of an established system and gives users the tools to use that platform however they see fit.

The National Library of Australia’s (NLA) Trove search engine is a platform developed as a collaboration between over one thousand major libraries (Holley, 2010). As of the 20th of March, 2014, Trove has more than 390 million searchable items.

NLA: Trove and Innovation in Assembly

The NLA has developed an API for use with Trove, which allows developers to search, pull entire items (including metadata), copy, view history and create visualisations for any item in the vast database indexed by the Trove platform. Developers can freely manipulate the data returned by API calls, including embedding it on another website. The granularity of the returned data allows capable users to do almost anything they wish with the information once they have received it. The API key that developers need in order to send requests to NLA servers is freely available, which lowers the barrier to entry to almost zero, and documentation is available for all possible API calls.
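As a rough illustration of what a search call could look like, the sketch below only builds a request URL and never contacts the NLA. The host name, the parameter names (`key`, `q`, `zone`, `encoding`) and the helper `build_search_url` are placeholders assumed for this example, so the real values should be taken from the Trove API documentation.

```python
from urllib.parse import urlencode

# Placeholder endpoint, assumed for illustration; not the real Trove address.
ASSUMED_BASE_URL = "https://api.example.org/result"


def build_search_url(api_key, query, zone="newspaper", encoding="json",
                     base_url=ASSUMED_BASE_URL):
    """Return a search URL that carries the API key and the query string.

    The parameter names used here are assumptions, not the documented ones.
    """
    params = {"key": api_key, "q": query, "zone": zone, "encoding": encoding}
    # urlencode escapes spaces and other unsafe characters in the values.
    return base_url + "?" + urlencode(params)


print(build_search_url("YOUR_API_KEY", "gold rush"))
```

The point of the sketch is that the key travels with every request, which is what lets the service identify each developer.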

NLA: Trove & Best Practices of Innovation in Assembly

Design for Remixability & Granular Addressability of Content

Designing for Remixability means building the platform so that users can work with the smallest possible units of content, rather than only with entire datasets or bulk metadata. This in turn requires each unit of data to be individually addressable.

Trove gives users granular control over the data they receive from a search request. Every article, in a pool of results of any size, is individually addressed and can be accessed and manipulated in any way the developer sees fit.
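To make the idea of individually addressable items concrete, here is a minimal sketch that indexes a result set by identifier and then works with a single article. The records and the field names (`id`, `heading`, `date`) are made up for the example and are not the real Trove response schema.

```python
# Made-up sample records; the field names are assumptions for illustration.
results = [
    {"id": "101", "heading": "Sample article A", "date": "1900-01-01"},
    {"id": "102", "heading": "Sample article B", "date": "1900-01-02"},
    {"id": "103", "heading": "Sample article C", "date": "1900-01-03"},
]


def index_by_id(articles):
    """Map each article's identifier to its record for direct lookup."""
    return {article["id"]: article for article in articles}


def headline(article):
    """Remix one record into a short display string."""
    return f'{article["date"]}: {article["heading"]}'


by_id = index_by_id(results)
print(headline(by_id["102"]))
```

Because each record has its own identifier, a developer can pick out one article and reuse it without touching the rest of the result set.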

Apply API Best Practices & Use Existing Standards

When a business develops and maintains an API, there are a number of practices that help to achieve success and ensure developers can get the most out of the robust system they’ve been provided with.

Writing, maintaining and releasing documentation alongside programming interfaces is standard industry practice. If an API is released without supporting documentation or sample code, developers will struggle to understand the interface and it is likely to go unused. Trove provides full documentation and sample code for its API.

If developers are allowed to send unlimited requests to the platform, the system can come under heavy load, especially once the service becomes popular. It is considered good practice to limit the number of requests each client can send over a given period of time. Trove limits requests through its API key system, in which the number of requests from the same API key is capped.
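Per-key limiting of this kind can be sketched with a sliding window. The class below is a generic illustration and not a description of how Trove enforces its limits; the name `KeyRateLimiter` and the numbers used are assumptions.

```python
import time
from collections import defaultdict, deque


class KeyRateLimiter:
    """Allow at most `limit` requests per `window` seconds for each API key."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested
        self.history = defaultdict(deque)  # api_key -> recent request times

    def allow(self, api_key):
        """Return True and record the request if the key is under its limit."""
        now = self.clock()
        recent = self.history[api_key]
        # Forget requests that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True


limiter = KeyRateLimiter(limit=2, window=60)
print([limiter.allow("demo-key") for _ in range(3)])
```

Each key has its own history, so one heavy user exhausts only their own allowance and other key holders are unaffected.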

Web platforms should use standardised output formats for the data they send to the user. Output in a format that the rest of the industry does not use is hard to consume and is unlikely to be popular with the developer community. Trove outputs the standard formats for web services (JSON and XML).
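The value of standard formats is that any language's standard tooling can read them. As a small sketch, the same made-up record is serialised below to both JSON and XML using only the Python standard library; the record and its field names are assumptions and do not reflect Trove's actual output.

```python
import json
import xml.etree.ElementTree as ET

# Made-up record; field names are assumptions for illustration.
record = {"id": "101", "heading": "Sample article A"}


def to_json(rec):
    """Serialise the record as a JSON object."""
    return json.dumps(rec, sort_keys=True)


def to_xml(rec, root_tag="article"):
    """Serialise the record as XML, one child element per field."""
    root = ET.Element(root_tag)
    for name, value in rec.items():
        child = ET.SubElement(root, name)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


print(to_json(record))
print(to_xml(record))
```

A client can parse either form back into the same fields, so the choice of format can follow whatever the consuming application already handles.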

Summary

The NLA’s Trove service is an example of Innovation in Assembly: it leverages its large existing service platform and gives the public an API through which they can interact dynamically with the data however they wish. Trove’s output is highly remixable, as each item is uniquely addressed and individual articles can be accessed at a granular level. Trove also implements several API best practices, including thorough documentation, sample code and per-key request limits that keep server load in check.

9 thoughts on “Innovation in Assembly: National Library of Australia: Trove”

  1. Michael,

    I think ‘Innovation in assembly’ is a great thing. Being able to re-use and recycle pre-existing data and services to provide another product and conduct more business obviously appeals to companies. The example O’Reilly himself gives illustrates this well.

    Do you think that Facebook has made use of innovation in assembly by allowing businesses to use their platform for things like registering and logging in users/commenting on articles and blogs?

    Cheers

  2. Hi Michael,

    Have you considered the copyright issues of reusing such information? I wonder, especially with author rights and copyright restrictions in mind, to what extent data from books may be replicated to the public through a third party.

    I realise that Trove would already have had to consider such things, but what if they provided full-text editions online and allowed developers’ API clients to access them? What rights would developers have to remix the content they are granted access to?

    It opens the gate to possible copyright violation and reproduction of content without permission (Australian Copyright Council, 2013) – something that would have to be addressed extensively to achieve a uniting decision to the access of content that has potential to reach large audiences through online channels.

    On the other hand, there are many possibilities with this vast amount of data readily available to developers. There are almost boundless settings in which this data (both textual and visual) could be used to create innovative applications. For instance, a search for a physical copy of an item could be layered upon Google Maps to pinpoint the locations where the item is available, not just locally but nationally and (possibly) internationally. That is a simple example; other sites, especially book-based social platforms such as Goodreads.com, could benefit greatly from the collective data by drawing on a credible source to keep the information their users rely on valid.

    The Trove API would also have great educational application to enhance student researching ability through a mobile or web based search engine that is easier to use and remixed to suit their educational and resource needs.

    References:
    Australian Copyright Council. (2013). Infringement: What Can I Do? Retrieved March 21, 2014 from http://www.copyright.org.au/find-an-answer/

  3. Hi Max,

    Wow, now Trove truly does live up to its name! It’s a very well refined and organised database search, with all the features of regular search engines plus all the flexibility of a database and none of the drawbacks, in an engine that is easy to use and navigate.

    Are there any new databases or libraries etc. that Trove is going to be pooling more information from any time soon? As it stands they have quite a good showing, but there’s always more information out there.

    The addition of their own API only makes the engine even more powerful and, more importantly, makes it a service that will actually get used and won’t just die off because of oversaturation.
    Good research, and well written, very informative. I hope you will visit my blog too once it has a few posts up; it will be located at http://kahlpiotrowski.wordpress.com/

    Thanks,

    Kahl

  4. I’ve always really loved what Trove is doing with data. They’re very similar to archive.org, but I think they focus more on older content like newspapers and photographs.
    Nevertheless, I think that any work that’s done with archiving old data is important and highly useful.
    I saw a project not too long ago where people had built a web-app that allowed you to pull an old newspaper from Trove and run a text detection algorithm. Once completed, you could correct any incorrect words, and as such improve the detection algorithm. I think they hope to reach the point where they can digitise old newspapers without any human input. Very cool!
    It suggests, at least in my opinion, that there won’t really be a need for Trove to have the kind of API it has now in the future. I think it will evolve to suit different needs over time, depending on what direction web 2.0 takes.

  5. Thanks for a great post on Trove.

    As I’m sure you know, Web 2.0 is largely about reaching a wider audience and normally when we think of Web 2.0 it’s hard not to gravitate towards the likes of Facebook and Google. The amount of social data they capture is vast but sometimes I wonder how all this data is adding value or making the ‘world a better place’ so to speak. However, Web 2.0 goes much further than this and Trove is a great example.

    Here is very useful information from valuable and formally trusted sources such as libraries.

    Reading your article on Trove made me think of global advancements towards education. New technologies and data sharing have eliminated distance for the international finance industry that trades not only around the world, but around the clock; for employees who collaborate on projects across time zones; for “footloose” businesses that operate from rural communities; and for students and researchers who can search libraries and databases beyond their borders. (Hudson HE. 1997, 453)

    Another example of a similar Web 2.0 innovation in assembly and access to data through APIs is the OCLC (Online Computer Library Center). As more of these educational online databases such as Trove and the OCLC collaborate in delivering quality resources through these Web 2.0 mechanisms, the closer we get to giving poorer areas and developing nations a better chance at a good education, and therefore a critical win for everyone.

    The OCLC, through ‘WorldCat’ (the online Union Catalog), is also continuously expanding its operations and initiating Web 2.0 projects including Curiouser, Data Mining and WikiD, a project which allows readers to add commentary and structured-field information associated with any ‘WorldCat’ record. A good example of this is WebJunction. This was awarded a grant by the Bill and Melinda Gates Foundation to build an online community for libraries and other organizations to promote access to technical advancements and information for creative professionals in the digital library arena.

    ‘Innovation in Assembly’ across these institutes is prevalent, and as access to the data through various APIs becomes seamlessly interchangeable and open standards keep evolving, my hope is that Web 2.0 will be a cornerstone of significant developments in industries such as Health, Education and Security.

    References:
    Hudson HE. Global Connections: International Telecommunications Infrastructure and Policy. New York: Van Nostrand Reinhold; 1997.

    I remember using Trove for a project in another subject, where we developed a web application using the Trove API. It is interesting that they have a very rich database of old newspapers and articles, which has attracted interest from historians. That is probably why the project idea was chosen for the subject I took. Since APIs have become quite popular, many organisations, such as government and private companies, are providing very useful information through them, and more information can be generated from it. As another example, Translink provides a public transport API and lets developers build on the information it publishes.

    Thanks for the article! :)

  7. I actually had no idea that Translink had released their own API until you mentioned it here. This is a really good idea and a great way to facilitate the usage of their network, since when someone develops an easier or better way to find information about travel times, commuters are more likely to actually utilise the service. It adds value for consumers and benefits Translink directly.

    Thanks for the info

  8. @kahl

    The NLA and Trove are, as far as I know, constantly scanning in more pieces to the database. While researching for this post I actually came across a list of contributors to the database. Scrolling through this list you can see that QUT has contributed almost 440,000 pieces to the project.

    Also, on the front page of the Trove project, under Contribute, you can see the thousands of contributions and corrections occurring in just a single day.

  9. @alassus

    This is a really good question, and I won’t pretend to know anything about Copyright Law because I definitely don’t. However Trove does have a (very) small page addressing Copyright, which directs the user to the NLA’s extensive writeup on Copyright in their library.

    The government is responsible for the National Library of Australia and, by extension, Trove. I can only assume that this was addressed well before the huge undertaking of the Trove project began.

    This was an interesting question and I wish I could’ve given you an actual answer, sorry.
