How to Use and Read ONIX Book Files: Getting the most out of your metadata

A no-nonsense handbook to navigating and getting the most out of your metadata distribution system


What is ONIX for Books?

First things first: what exactly is ONIX? Is it the British spelling of a black stone? Is it a Pokémon? No, and yes, but not here. Developed by the trade standards body EDItEUR in 2001, ONIX for Books is the global standard format for creating, transmitting, and communicating book product and bibliographic information electronically.

“ONIX is an XML-based standard for rich book metadata, providing a consistent way for publishers, retailers and their supply chain partners to communicate rich information about their products.”

If you’ve ever seen HTML code or seen a movie with hackers, you’ve seen what an ONIX file looks like. The kind folks at EDItEUR created a standard set of fields using codes so that no matter what language you speak, you can accurately communicate the information about your book that retailers need to sell it. It’s like digital Esperanto.

The files, which ONIX calls “messages,” are sent from publishers to distributors and retailers through a variety of systems. Since the format is a guideline and not a product, delivery can be as simple as an email attachment or as sophisticated as third-party tools and File Transfer Protocol (FTP) providers.

The individual files can be viewed either in an internet browser (we recommend Chrome) or in a simple text editor (Notepad on Windows, TextEdit on Mac). Since ONIX messages are built for computers to talk to each other, they can be difficult for humans to read. Third-party software that breaks down each code into easy question-and-answer fields is available. Some of the most popular options include ONIXEdit, Book Connect, OnixSuite, Title Manager, BookSonix and BiblioLive.

How does ONIX work?

Once completed, the ONIX messages are transmitted to the specific retailers, who use the information to populate fields for the product display as well as within their own cataloging and search functions.

Why does complete ONIX metadata matter?

In one word: sales. Books with complete metadata sell more copies, across both digital and nondigital platforms.

In 2016, the Nielsen Company, Bowker, and Baker & Taylor published the findings of their US survey of book sales and metadata, entitled “Nielsen Book US Study: The importance of Metadata for Discoverability and Sales.” It reinforced the findings of the 2012 Nielsen Book UK study, which found a strong link between complete, relevant metadata and increased sales, including for offline retailers.

Their results support the well-founded idea that discoverability (“the ease with which a particular product can be found”) hinges on complete metadata. They also make the important distinction that discoverability matters not just to consumers directly but to gatekeepers in the book industry supply chain too, most notably librarians and booksellers:

“Providing accurate data on properties such as publication date, price, supplier and physical attributes aids booksellers in planning their stock management, from scheduling future orders, to planning shelf space or storage allocations, to ensuring shipments are made on the most economical terms (through referencing physical attribute data).

“Maintaining an efficient supply chain ensures that booksellers can focus on selling books – and maximizing sales for publishers and themselves. Where this valuable supply chain data isn’t available to the bookseller, at best they will need to carry out additional work (leading to decreased efficiency) and at worst they may not order the product due to an inability to plan for it effectively.”

While that all may sound daunting, the 2012 study examined just ten attributes out of the 3,000+ ONIX code entries available for completion. In the 2016 study, Nielsen examined the metadata of the top 100,000 bestselling titles from July 2015 to July 2016. They began with the basic fields that they identified as the baseline level of completeness (ISBN, Title, Format/Binding, Publication Date, BISAC Subject Code, Retail Price, Sales Rights, Cover Image, and Contributor) and found that conforming titles saw average sales that were 75% higher than titles that did not conform.

They also looked at two critical groupings of data that revealed more positive sales correlations:

Books with complete descriptive data (title description, author biography and review) saw 72% higher sales than those without.
Books with keywords saw 34% higher sales than those without.

Quick guide: Reading the XML

Get into the nitty-gritty of the ONIX messages!

I. Navigating ONIX’s Most Important Fields

With the exception of reviews and high-quality keywords, all of this vital information is most likely at hand. You have it, you just need to implement it.

If you have access to a third-party system this may be easy enough, but since each is built to speak the ONIX language, it is important to know:

What the fields are
Where they live in the message
Details you can further include to strengthen your book sales.

The creators of the ONIX standard maintain exhaustive code lists that they update routinely. Most publishers still use the ONIX 2.0 series as their system, but EDItEUR has unleashed the power of ONIX 3.0: a much more thorough version that takes digital formats into account.

An ONIX message’s details are divided among 230 sections, including sections as specific as “price constraints,” “Chinese school grade code,” and “supply date info,” but the most vital are at the beginning.

While the messages can look confusing, they do follow a solid logic, building on information as it is presented. Just like written language, there are phrases that are opened and closed to derive meaning and relationships from each statement.

We’ll pull apart each section and explain it and then bring it back together below.

Each detail sits on its own line, indented to show its level in the hierarchy. It opens with an information tag in angle brackets and closes with the same tag with a “/” in front, like this. (Note: an ellipsis between tags, as seen below between the Product Identifier tags, means that you can expand the section to see the information it contains.)

<Product>
     <ProductIdentifier>...</ProductIdentifier>

     <Title>
          <TitleType>#</TitleType>
          <TitlePrefix>The</TitlePrefix>
          <TitleWithoutPrefix>TEXT</TitleWithoutPrefix>
     </Title>

     <Contributor>...</Contributor>
</Product>
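To see that open-and-close logic in action, here is a minimal Python sketch that reads a tiny record with the standard library's xml.etree.ElementTree and prints how the fields nest. The sample record and its values are made up for illustration, and a real ONIX message carries a header and many more fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed record for illustration only.
SAMPLE = """
<Product>
  <ProductIdentifier>
    <ProductIDType>15</ProductIDType>
    <IDValue>9780306406157</IDValue>
  </ProductIdentifier>
  <Title>
    <TitleType>01</TitleType>
    <TitlePrefix>The</TitlePrefix>
    <TitleWithoutPrefix>Book Title Example</TitleWithoutPrefix>
  </Title>
</Product>
"""

def outline(element, depth=0):
    """Return one indented line per tag, showing how fields nest."""
    text = (element.text or "").strip()
    lines = ["  " * depth + element.tag + (": " + text if text else "")]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

product = ET.fromstring(SAMPLE)
lines = outline(product)
print("\n".join(lines))
```

Each extra level of indentation in the printout corresponds to one more tag that has been opened but not yet closed.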

II. Formatting

In ONIX messages you can use standard HTML syntax to begin and end statements and to denote formatting differences. Here are the basics:

<d104 textformat="02"> --> Add this after the major section heading to note that the markup that follows obeys HTML rules. End it by closing the statement with </d104>.

<Heading> Field Entry </Heading>

*Surrounding less-than and greater-than brackets set off organizational headings
*Data pertaining to the heading goes between the opening and closing <Heading> tags
*A bracket followed by a forward slash denotes the end of a phrase.

Within the Field Entry, you can note formatting for e-commerce sites (otherwise it will populate as standard text and in a large block if there is a lot of content):

<b> Bold text </b>
<strong> Important text </strong>
<i> Italic text </i> 
<em> Emphasized text </em> 
<sub> Subscript text </sub> 
<sup> Superscript text</sup> 
<u> Underline </u>
<p> - paragraph break
<br> - line break
<li> - list item
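One practical wrinkle: an ONIX message is itself XML, so markup placed inside a text field has to survive XML parsing. Here is a minimal Python sketch, assuming the <d104> tag and textformat="02" attribute shown above. The standard library's ElementTree escapes the angle brackets when it writes the field and restores them when it reads the field back. Whether a given retailer expects escaped markup or some other packaging is something to confirm with that recipient; the description text here is invented.

```python
import xml.etree.ElementTree as ET

# Invented description text containing simple formatting tags.
description_html = "<p><b>Bold</b> opening line.</p><p>Second paragraph.</p>"

# Build the text field; ElementTree escapes <, > and & on output.
field = ET.Element("d104", {"textformat": "02"})
field.text = description_html
serialized = ET.tostring(field, encoding="unicode")
print(serialized)

# Reading the field back returns the original markup unchanged.
restored = ET.fromstring(serialized).text
print(restored == description_html)
```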

III. Descriptive Data Identifiers

1. Keywords

Entries to this part of the message are specifically used for indexing and search purposes. While keywords are not normally intended for display, best practice is to work those that make sense and seem to perform well into your other descriptive fields.

Where?

Under the <subject> heading.

How?

Code Number: 20

<subject>
     <b067>20</b067>
     <b070>keywords; separated by; semi-colons; can be long tail; or short; tail; terms; BUT; avoid; title, subject, and series; terms; that; are; duplicative; because most retailers; only allow; a certain number of characters;</b070>*
</subject>

*StoryFit Metadata Keywords are optimized to meet retailer requirements.
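Those rules (semicolon separators, no duplicates, a character budget) are easy to automate. Here is a minimal Python sketch; the function name, the already_used parameter, and the 500-character default are all assumptions for illustration, since each retailer sets its own limit.

```python
def build_keywords(terms, already_used=(), max_chars=500):
    """Join keyword phrases with "; ", skipping duplicates and phrases
    listed in already_used (for example title or series terms), and stop
    before max_chars is exceeded. max_chars=500 is an assumed limit."""
    seen = {phrase.lower() for phrase in already_used}
    kept = []
    length = 0
    for term in terms:
        cleaned = " ".join(term.split())  # collapse stray whitespace
        key = cleaned.lower()
        if not cleaned or key in seen:
            continue
        added = len(cleaned) + (2 if kept else 0)  # 2 for the "; " separator
        if length + added > max_chars:
            break
        kept.append(cleaned)
        seen.add(key)
        length += added
    return "; ".join(kept)

keywords = build_keywords(
    ["haunted house", "gothic fiction", "Haunted  House", "ghost story"],
    already_used=["ghost story"],
)
print(keywords)
```

The resulting string is what would go between the <b070> tags.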

2. BISAC Subject Code

Nested under <Product>, the <subject> block offers 112 options for subject descriptions, everything from Dewey Decimal and Library of Congress categories to various European country standards, location by postal code, and key character names. The scheme in use is identified in the <b067> field. The code for the BISAC category scheme is “10.”

<Product>
     <subject>
          <b067>10</b067>
          <b069>BISAC Subject Code (FIC031000, for example)</b069>
     </subject>
</Product>

For a full list of BISAC categories, visit the Book Industry Study Group’s listing of complete BISAC headings for fiction and nonfiction subjects. Note: ONIX also supplies the <b067>22</b067> field for BISAC Merchandising Theme, which follows the same syntax as the subject code, with the <b069> field following the <b067>22</b067> entry.

3. ISBN

Where?

Under <ProductIdentifier>

What?

Noted as <ProductIDType> code number </ProductIDType>:

ISBN-13

 <ProductIDType>15</ProductIDType>

ISBN-10

 <ProductIDType>02</ProductIDType>

ISBN-A

 <ProductIDType>26</ProductIDType>

How?

Following the Product Identifier code as <IDValue> 9781000000000 </IDValue>

So the full ISBN entry would look like this:

<Product>
     <ProductIdentifier>
          <ProductIDType>15</ProductIDType>
          <IDValue>9781000000000</IDValue>
     </ProductIdentifier>
</Product>
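Since a mistyped ISBN can derail everything downstream, it is worth checking the value before sending. Here is a minimal Python sketch that pulls the ISBN-13 out of a record and verifies its check digit. The function names are hypothetical, and the sample uses a commonly cited valid ISBN rather than the placeholder above.

```python
import xml.etree.ElementTree as ET

def isbn13_is_valid(value):
    """Check length, digits, and the ISBN-13 check digit."""
    digits = value.replace("-", "").strip()
    if len(digits) != 13 or not digits.isdigit():
        return False
    # Weights alternate 1, 3, 1, 3, ... across the first twelve digits.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == int(digits[12])

def find_isbn13(product):
    """Return the IDValue of the first identifier with ProductIDType 15."""
    for identifier in product.findall("ProductIdentifier"):
        if identifier.findtext("ProductIDType") == "15":
            return (identifier.findtext("IDValue") or "").strip()
    return None

record = ET.fromstring(
    "<Product><ProductIdentifier>"
    "<ProductIDType>15</ProductIDType><IDValue>9780306406157</IDValue>"
    "</ProductIdentifier></Product>"
)
isbn = find_isbn13(record)
print(isbn, isbn13_is_valid(isbn))
```

Note that the placeholder 9781000000000 is only a stand-in: its final digit does not match the check digit, so it fails this test.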

4. Title

Where?

Under <Title>

What?

Lots of options for the Title:

<TitleType> code number </TitleType>

The most used code number is “01,” which signals the distinctive title of a book or the cover title of a serial. Other options include:

00 - undefined
02 - ISSN key title of serial
03 - Title in original language
04 - Title acronym or initialism
05 - Abbreviated Title
06 - Title in other language
07 - Thematic title of journal issue
08 - Former title
10 - Distributor’s Title
11 - Alternative title on cover
12 - Alternative title on back
13 - Expanded title
14 - Alternative title

The title text itself goes in these fields:

<TitleText>Full Title</TitleText>
<TitlePrefix>A, An, The, etc.</TitlePrefix>
<TitleWithoutPrefix>Title, but without the prefix</TitleWithoutPrefix>

How?

The full title entry looks like this:

<Title>
     <TitleType>01</TitleType>
     <TitlePrefix>The</TitlePrefix>
     <TitleWithoutPrefix>Book Title Example</TitleWithoutPrefix>
</Title>
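Here is a minimal Python sketch of how a receiving system might turn those fields back into a readable title. The function names are made up for this example, and the idea that the prefix is split off to help alphabetical sorting is our assumption about typical use, not something stated in the standard.

```python
import xml.etree.ElementTree as ET

def display_title(title):
    """Prefer TitleText; otherwise join TitlePrefix and TitleWithoutPrefix."""
    full = (title.findtext("TitleText") or "").strip()
    if full:
        return full
    prefix = (title.findtext("TitlePrefix") or "").strip()
    rest = (title.findtext("TitleWithoutPrefix") or "").strip()
    return " ".join(part for part in (prefix, rest) if part)

def sort_title(title):
    """Title for alphabetical lists, with the prefix moved to the end."""
    prefix = (title.findtext("TitlePrefix") or "").strip()
    rest = (title.findtext("TitleWithoutPrefix") or "").strip()
    if prefix and rest:
        return rest + ", " + prefix
    return rest or display_title(title)

title = ET.fromstring(
    "<Title><TitleType>01</TitleType><TitlePrefix>The</TitlePrefix>"
    "<TitleWithoutPrefix>Book Title Example</TitleWithoutPrefix></Title>"
)
print(display_title(title))
print(sort_title(title))
```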

5. Format/Binding

Where?

Under <Product>

What?

Noted as <ProductForm>code</ProductForm>:

There are 135 format options, with details from binding and paper type to operating system and file type. The most used codes are:

BA - Book
BB - Hardback
BC - Paperback / softback
BH - Board Book
AA - Audio
AJ - Downloadable Audio file
EA - Digital (delivery method unspecified)
EB - Digital Download

Can I add specific format details about the Product?

Yes, using <ProductFormDetail>. There are 256 options. The most commonly used codes are:

B101 - Mass market (rack) paperback
B102 - Trade paperback (US)
B103 - Digest format paperback
B104 - A-format paperback
B105 - B-format paperback
B106 - Trade paperback (UK)
B107 - Tall rack paperback (US)
B315 - Trade binding
A103 - MP3 format
A104 - WAV format
B401 - Cloth over boards
B221 - Picture book
E101 - EPUB
E116 - Amazon Kindle
E121 - eReader
E126 - Microsoft Reader
E133 - Google Edition
E134 - Book ‘app’ for iOS
E135 - Book ‘app’ for Android
E136 - Book ‘app’ for other operating system
E141 - iBook
B501 - With dust jacket
B502 - With printed dust jacket

How?

The full format entry looks like this:

<Product>
     <ProductForm>BB</ProductForm>
     <ProductFormDetail>B501</ProductFormDetail>
</Product>

6. Publication Date

The date itself is straightforward and found only under Product as:

<Product>
     <PublicationDate>YearMonthDay</PublicationDate>
</Product>

*Note: make sure to denote in the ONIX message what your standard date format is. This is found under <DateFormat> (Code List 55).
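On the receiving end, the date string has to be turned back into a real date. Here is a minimal Python sketch covering three date format codes from Code List 55 (00 for YYYYMMDD, 01 for YYYYMM, 05 for YYYY). The function name is hypothetical, other format codes are deliberately left out, and filling a missing month or day with 1 is a simplification of this sketch.

```python
from datetime import datetime

# Date format codes handled by this sketch:
# 00 = YYYYMMDD, 01 = YYYYMM, 05 = YYYY. Other codes are not covered.
PATTERNS = {"00": "%Y%m%d", "01": "%Y%m", "05": "%Y"}

def parse_publication_date(value, date_format="00"):
    """Return a date object; a missing month or day defaults to 1."""
    pattern = PATTERNS.get(date_format)
    if pattern is None:
        raise ValueError("unsupported date format code: " + date_format)
    return datetime.strptime(value.strip(), pattern).date()

print(parse_publication_date("20160715").isoformat())
print(parse_publication_date("201607", "01").isoformat())
```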

How do I add specific details about the Publication Date?

There are plenty of juicy details you can add to an ONIX message about a forthcoming publication and its release.

Publishing Status

<Product>
     <PublishingStatus>code</PublishingStatus>
</Product>

The options are:

00 - Unspecified
01 - Cancelled
02 - Forthcoming
03 - Postponed indefinitely
04 - Active
05 - No longer our product
06 - Out of stock indefinitely
07 - Out of print
08 - Inactive
09 - Unknown
10 - Remaindered
11 - Withdrawn from sale
12 - Not available in this market
13 - Active, but not sold separately
14 - Active, with market restrictions
15 - Recalled
16 - Temporarily withdrawn from sale
17 - Permanently withdrawn from sale

Availability

<Product>
     <SupplyDetail>
          <ProductAvailability>Code</ProductAvailability>
     </SupplyDetail>
</Product>

The options are:

01 - Cancelled
09 - Not yet available, postponed indefinitely
10 - Not yet available
11 - Awaiting stock
12 - Not yet available, will be POD
20 - Available
21 - In stock
22 - To order
23 - POD
30 - Temporarily unavailable
31 - Out of stock
32 - Reprinting
33 - Awaiting reissue
34 - Temporarily withdrawn from sale
40 - Not available (reason unspecified)
41 - Not available, replaced by new product
42 - Not available, other format available
43 - No longer supplied by us
44 - Apply direct
45 - Not sold separately
46 - Withdrawn from sale
47 - Remaindered
48 - Not available, replaced by POD
49 - Recalled
50 - Not sold as set
51 - Not available, publisher indicates OP
52 - Not available, publisher no longer sells product in this market
97 - No recent update received
98 - No longer receiving updates
99 - Contact supplier
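Because these are just two-digit codes, software on the receiving end typically turns them into labels with a lookup table. Here is a minimal Python sketch using a handful of codes from the list above. The function names are hypothetical, and the can_order_now rule (treating only the 20-series as orderable) is an assumption for illustration; each retailer applies its own logic.

```python
# A handful of availability codes copied from the list above.
AVAILABILITY = {
    "10": "Not yet available",
    "20": "Available",
    "21": "In stock",
    "31": "Out of stock",
    "40": "Not available (reason unspecified)",
    "99": "Contact supplier",
}

def describe_availability(code):
    """Translate a code into its label, flagging codes not in the table."""
    return AVAILABILITY.get(code, "Unknown code " + code)

def can_order_now(code):
    """Assumption for this sketch: only the 20-series counts as orderable."""
    return code.isdigit() and 20 <= int(code) <= 29

for code in ("21", "31", "77"):
    print(code, describe_availability(code), can_order_now(code))
```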

7. Retail Price

With the price it is important to note several things:

a. Currency

It’s important to note the currency for each ISBN. You can do so in the header of your message as

<DefaultCurrencyCode>CODE</DefaultCurrencyCode> 

or under the <Price> field as

<CurrencyCode>CODE</CurrencyCode>.
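With two places to state a currency, a reader of the message needs a rule for which one applies. Here is a minimal Python sketch, assuming the currency stated on the price itself takes precedence and the header default is the fallback. The function name and the sample prices are invented, and <PriceAmount> is used here only as an illustrative amount field.

```python
import xml.etree.ElementTree as ET

def price_currency(price, default_currency=None):
    """Currency on the price itself wins; otherwise use the header default."""
    code = (price.findtext("CurrencyCode") or "").strip()
    return code or default_currency

header_default = "USD"  # stands in for <DefaultCurrencyCode> in the header

price_with = ET.fromstring(
    "<Price><PriceAmount>9.99</PriceAmount><CurrencyCode>GBP</CurrencyCode></Price>"
)
price_without = ET.fromstring("<Price><PriceAmount>14.99</PriceAmount></Price>")

print(price_currency(price_with, header_default))
print(price_currency(price_without, header_default))
```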

b. Price type

There are 26 possibilities here. These are the most common:

03 - Fixed retail price excluding tax
04 - Fixed retail price including tax
05 - Supplier’s net price excluding tax
07 - Supplier’s net price including tax
41 - Publishers retail price excluding tax
42 - Publishers retail price including tax

c. Price type qualifiers

There are 16 possibilities here. These are the most common:

01 - Member/subscriber price
02 - Export Price
03 - Reduced price applicable when the item is purchased as part of a set (or series, or collection)
05 - Consumer Price
06 - Corporate / Library / Education price
07 - Reservation order price
08 - Promotional offer price
10 - Library Price
11 - Education Price
12 - Corporate price
13 - Subscription service price
14 - School library price
15 - Academic library price
16 - Public library price

d. Sales Rights

Nested under <Product>. Nine options:

00 - Sales rights unknown or unstated for any reason
01 - For sale with exclusive rights in the specified countries or territories
02 - For sale with non-exclusive rights in the specified countries or territories
03 - Not for sale in the specified countries or territories (reason unspecified)
04 - Not for sale in the specified countries (but publisher holds exclusive rights in those countries or territories)
05 - Not for sale in the specified countries (publisher holds non-exclusive rights in those countries or territories)
06 - Not for sale in the specified countries (because publisher does not hold rights in those countries or territories)
07 - For sale with exclusive rights in the specified countries or territories (sales restriction applies)
08 - For sale with non-exclusive rights in the specified countries or territories (sales restriction applies)

e. Territory:

<RightsTerritory>TERRITORIES</RightsTerritory>

8. Contributor

Nested under <Product>. Using the following codes you can note everything from the main author to co-authors, editors, pseudonyms, and more (there are a total of 108 designated authorial tags).

<Product>
     <contributor>
          <b035>#</b035>
          <b039>First Name</b039>
          <b040>Last Name</b040>
     </contributor>
</Product>

The value between <b035> and </b035> will be one of:

A01 - By (author) 
A02 - With 
A08 - By (photographer) 
A09 - Created by 
A12 - Illustrated by 
A13 - Photographs by 
A14 - Text by 
A15 - Preface by 
A16 - Prologue by 
A19 - Afterword by 
A22 - Epilogue by 
A23 - Foreword by 
A24 - Introduction by 
A26 - Memoir by 
A29 - Introduction and notes by 
A32 - Contributions by 
A36 - Cover design or artwork by 
A38 - Original author 
A39 - Maps by 
A43 - Interviewer 
B01 - Edited by 
B02 - Revised by 
B03 - Retold by 
B04 - Abridged by 
B05 - Adapted by 
B06 - Translated by 
B07 - As told by 
B10 - Edited and translated by 
C01 - Compiled by 
E07 - Read by
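Here is a minimal Python sketch of how a retailer's system might turn a contributor block into a display credit. The function name, the three-entry lookup table, and the sample name are illustrative only; a real system would carry the full role list.

```python
import xml.etree.ElementTree as ET

# Three role codes copied from the list above.
ROLES = {"A01": "By (author)", "A12": "Illustrated by", "B06": "Translated by"}

def contributor_line(contributor):
    """Combine the role label with the first and last name fields."""
    role = ROLES.get(contributor.findtext("b035"), "Contributor:")
    name = " ".join(
        part.strip()
        for part in (contributor.findtext("b039"), contributor.findtext("b040"))
        if part and part.strip()
    )
    return role + " " + name

sample = ET.fromstring(
    "<contributor><b035>A12</b035>"
    "<b039>Jane</b039><b040>Doe</b040></contributor>"
)
credit = contributor_line(sample)
print(credit)
```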

9. Title Description

Found in <TextType> or <OtherText> (List Number 153), there are 24 fields that can be populated, and they should be if you have the information.

<othertext>
   <d102>Code Number</d102>
    <d104 textformat="02">Description</d104>
</othertext>

Here are the most common/most important for increased discoverability and sales:

01 - Sender-defined text                                   Text which (a) is not for general distribution and (b) cannot be coded elsewhere.
02 - Short description/annotation                          Limited to a maximum of 350 characters
03 - Description                                           Length unrestricted
04 - Table of contents
05 - Flap / cover copy
06 - Review quote
07 - Review quote: previous edition
08 - Review quote: previous work
09 - Endorsement
10 - Promotional headline
11 - Feature                                               Describing an attention-grabbing feature of a product for promotional purposes.
12 - Biographical note                                     A note referring to all contributors to a product – NOT linked to a single contributor
13 - Publisher’s notice                                    Publisher statement of contractual obligations (disclaimer, sponsor statement, or legal notice, etc.)
16 - Short description/annotation for collection           (of which the product is a part.) Limited to a maximum of 350 characters
17 - Description for collection                            (of which the product is a part.) Length unrestricted
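The 350-character caps on types 02 and 16 are easy to overshoot, so a quick pre-flight check can help. Here is a minimal Python sketch; the function name and the warning wording are invented for this example, and the only limits it knows are the two stated in the list above.

```python
# Text types that the list above caps at 350 characters.
SHORT_LIMITS = {"02": 350, "16": 350}

def check_text_length(text_type, text):
    """Return None if the text fits, else a short warning string."""
    limit = SHORT_LIMITS.get(text_type)
    if limit is not None and len(text) > limit:
        return "type %s is %d characters, limit is %d" % (text_type, len(text), limit)
    return None

warning = check_text_length("02", "x" * 351)
print(warning)
```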

10. Review

In addition to the review information relayed in the title description data, you can include full reviews and award citations:

<Product>
 <CitedContentType> OR <PrizeorAwardorAchievement>
 <CitationType>Code Name or Value</CitationType>
information
</CitedContentType> OR </PrizeorAwardorAchievement>
</Product>

Cited Content Type
01    Review
02    Bestseller list
03    Media mention
04   ‘One locality, one book’ program
05    Curated list

Prize/Award/Achievement
01     Winner
02     Runner-up
03     Commended
04     Short-listed
05     Long-listed
06     Joint winner
07     Nominated


More Resources:

“A non-technical, beginners’ guide to ONIX for Books” by BookMachine.

“Three Ways To Do More With ONIX” by Consonance
