VitalSource has two options for engagement data delivery: near real-time and periodic delivery. The option you choose doesn’t limit the scope of data in your data feed, only how often you receive it.

Scheduling Options

Periodic feeds post incremental data files to a Google Cloud Storage bucket on a set schedule. Only one feed is delivered to each bucket, so data for a daily data feed and a weekly data feed is never mixed.

Each file is uniquely named and contains gzipped, incremental engagement data or dimensional data.
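As a rough illustration of consuming a gzipped file, the sketch below decompresses a binary stream and yields its lines. It assumes the files are UTF-8 encoded and line-oriented, neither of which is stated above; `iter_feed_lines` is a name invented for this example.

```python
import gzip
import io
from typing import BinaryIO, Iterator


def iter_feed_lines(stream: BinaryIO) -> Iterator[str]:
    """Yield decoded lines from a gzipped feed file opened in binary mode.

    Assumptions (not confirmed by the feed documentation): the payload is
    UTF-8 text with one record per line.
    """
    with gzip.GzipFile(fileobj=stream) as unzipped:
        for line in io.TextIOWrapper(unzipped, encoding="utf-8"):
            yield line.rstrip("\r\n")
```

Streaming line by line keeps memory use flat, which matters more for weekly files than for daily ones.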

Daily

Your engagement data is extracted once every day.

Daily delivery is sufficient for most efforts that don't require insights throughout the day. For more frequent delivery of engagement data, consider near real-time delivery.
File size is more manageable since activities are extracted every day.

Weekly

Your engagement data is extracted once every week.

Supports research efforts or periodic trend analysis
File size may be larger, possibly up to 10M events for large organizations

File Retention

Files will be available for download for 30 days following their date of posting.
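A consumer may want to flag files that are close to the end of the retention window. The following is a minimal sketch of that arithmetic; the helper names are hypothetical, and it assumes the 30 days are counted from the exact posting time, which the policy above does not spell out.

```python
from datetime import datetime, timedelta, timezone

RETENTION_PERIOD = timedelta(days=30)  # retention window stated above


def download_deadline(posted_at: datetime) -> datetime:
    """Assumed last moment a file posted at `posted_at` can be downloaded."""
    return posted_at + RETENTION_PERIOD


def expires_within(posted_at: datetime, now: datetime, margin: timedelta) -> bool:
    """True when the file is within `margin` of its assumed deadline, or past it."""
    return now >= download_deadline(posted_at) - margin
```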

File Naming Convention

Files are named using the convention detailed below. The timestamp in each file name reflects the time, in UTC, of the last activity or update within the file.

{source}_{MMDDYYYY_HHMMSS}_{split}.{format}.gz

For example, an activity extract through November 30th, 2017 at 4:30:52 PM UTC in the Caliper JSON format would be: activities_11302017_163052_000.caliper.json.gz.

Large files are automatically split into multiple files, denoted by different values in the split component of the file name.
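The naming convention can be taken apart with a regular expression. The sketch below is one possible parser, not an official one: the `FeedFile` type, the function name, and the pattern itself are assumptions made for illustration, and the split component is assumed to be numeric.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple


class FeedFile(NamedTuple):
    """Components of a feed file name (hypothetical helper type)."""
    source: str
    timestamp: datetime
    split: str
    format: str


# {source}_{MMDDYYYY_HHMMSS}_{split}.{format}.gz
# Assumption: source is lowercase letters and split is all digits.
_NAME_RE = re.compile(
    r"^(?P<source>[a-z]+)_(?P<stamp>\d{8}_\d{6})_(?P<split>\d+)"
    r"\.(?P<format>.+)\.gz$"
)


def parse_feed_file_name(name: str) -> FeedFile:
    """Split a feed file name into source, UTC timestamp, split, and format."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized feed file name: {name!r}")
    stamp = datetime.strptime(match["stamp"], "%m%d%Y_%H%M%S")
    return FeedFile(
        source=match["source"],
        timestamp=stamp.replace(tzinfo=timezone.utc),
        split=match["split"],
        format=match["format"],
    )
```

Because the date is written month first, sorting file names lexically does not order them chronologically. Sorting on the parsed `timestamp` and then on `split` avoids that pitfall.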

Incremental Delivery and Handling

Periodic data feeds are incremental. This means that you will only receive new data every day or week, based on your feed's frequency. For the activities file, this means that only events that were received since the last extract will be sent. Offline activities are synchronized with the system when users come back online, so activity feeds may contain newly received historical data. Events always include the timestamp from when the learning activity occurs, not when they are received.
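Because late-arriving offline activity carries its original timestamp, reports are usually better keyed on event_time than on the date a file was delivered. The sketch below groups events by the UTC date on which they occurred. It assumes each event is a dict with the event_id and event_time fields from the activities file, and that event_time is a string `datetime.fromisoformat` can read once a trailing "Z" is normalized; the function names are invented for this example.

```python
from collections import defaultdict
from datetime import datetime


def parse_event_time(value: str) -> datetime:
    """Parse an ISO-8601 timestamp string into a datetime."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def group_by_event_date(events):
    """Bucket event ids by the date on which the activity occurred.

    Assumes event_time is expressed in UTC, as the activities file
    field description states.
    """
    buckets = defaultdict(list)
    for event in events:
        day = parse_event_time(event["event_time"]).date()
        buckets[day].append(event["event_id"])
    return dict(buckets)
```

With this grouping, an event synchronized days after it happened is added to the bucket for the day it occurred, so earlier daily totals may need to be recomputed.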

Dimensional files include records that have either been updated since the last feed or records that are referenced in the activities file currently being delivered. For example, the users feed will include users with updated metadata and users that have engaged with content since the last feed run.

You may wish to take advantage of your database's merge operation. This operation will perform two functions. First, if the record is new, it will be inserted. Second, if the record exists, it will be updated with the new data. All columns are included for each record, so in the case of an update, it is safe to update the contents of all columns in your database. Take note of the fields in the table definitions below that are listed as keys. These fields will be required to identify unique records.
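As an illustration of the merge approach, the sketch below upserts book records into SQLite using `INSERT ... ON CONFLICT DO UPDATE`, which requires SQLite 3.24 or later. The table layout is a deliberately reduced, hypothetical subset of the books file (asset_id as the key, plus title and updated_at); other databases offer an equivalent `MERGE` or upsert statement with different syntax.

```python
import sqlite3
from typing import Iterable, Mapping


def upsert_books(conn: sqlite3.Connection, rows: Iterable[Mapping[str, str]]) -> None:
    """Insert new books and overwrite every non-key column of existing ones."""
    # Hypothetical, reduced schema: only three of the books file columns.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS books ("
        "asset_id TEXT PRIMARY KEY, title TEXT, updated_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO books (asset_id, title, updated_at) "
        "VALUES (:asset_id, :title, :updated_at) "
        # On a key collision, replace all non-key columns with the new values.
        "ON CONFLICT(asset_id) DO UPDATE SET "
        "title = excluded.title, updated_at = excluded.updated_at",
        rows,
    )
    conn.commit()
```

Loading the same file twice leaves the table unchanged, which makes reprocessing after a failure safe.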

Delivery Times

The time of file delivery can be customized on a per-feed basis, although variation in delivery times is expected due to data availability. Variation between delivery times should be relatively small from day to day.

Extract Error Handling

While we strive to deliver daily and weekly feeds exactly once a day or once a week, it is possible to receive more than one extract in a given period due to processing issues. If an issue occurs, the extract process will restart, producing additional files for a given period. The data in these new files will not overlap with the data in any previously delivered files.
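Since a period can produce more than one extract, a consumer should not assume one file set per day or week. One simple approach, sketched below under the assumption that file names are never reused, is to keep a ledger of processed file names and load anything not yet in it. The function names and the in-memory set are illustrative; a real ledger would be persisted.

```python
from typing import Iterable, List, Set


def select_unprocessed(listed: Iterable[str], processed: Set[str]) -> List[str]:
    """Return listed file names not yet in the ledger, preserving listing order."""
    pending: List[str] = []
    for name in listed:
        if name not in processed and name not in pending:
            pending.append(name)
    return pending


def mark_processed(name: str, processed: Set[str]) -> None:
    """Record a file name once its contents have been loaded successfully."""
    processed.add(name)
```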

Empty Results

Empty files are not posted to the shared location. When an extract contains no data, no file is produced for it.

Available Formats

Periodic data feeds are available in Caliper, VitalSource JSON, and VitalSource CSV formats.

VitalSource CSV Format

The VitalSource CSV format is a custom format for transmitting engagement data from the VitalSource platform. It includes the same essential elements as Caliper in a streamlined payload.

VitalSource JSON Format

The VitalSource JSON format is a custom format for transmitting engagement data from the VitalSource platform. It includes the same essential elements as Caliper in a streamlined payload.

IMS Caliper 1.1 Format

As a contributor to the IMS Caliper standard, VitalSource is pleased to support the formatting of engagement data using Caliper 1.1. Caliper differentiates itself as a standard by prescribing both a data model and semantic model for communicating learning activities.

Industry Standard: Created by IMS, Caliper is the industry standard for transmitting learning interactions.
Powerful Foundation: Based on JSON-LD, Caliper is both human readable and easily understood by machines.
Semantic Interoperability: Caliper prescribes a vocabulary for communicating learning interactions, ensuring consistent interpretation across systems.
System Interoperability: Caliper defines the structure of each data point, ensuring consistent integration and consumption across systems.

Caliper Context Extension

Caliper JSON-LD documents define a context, denoted by the @context keyword, a property employed to map document terms to IRIs in one or more published vocabularies. Inclusion of a JSON-LD context provides an economical way for Caliper to communicate document semantics to services interested in consuming Caliper event data.

On certain events, VitalSource has extended the IMS Caliper specification by providing an additional context definition. This extension of the context allows for the inclusion of additional event types that are not currently described in the Caliper standard. The Caliper standard allows for this extension and may ultimately adopt the VitalSource-described events in a future release of the Caliper standard. Extended events are clearly identifiable by inspecting the context attribute. These events will include both the Caliper context and the VitalSource context.
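Inspecting the context can be sketched as follows. The Caliper 1.1 context IRI is the one published by IMS; the IRI of the VitalSource context is not given here, so the check only looks for any context entry in addition to the Caliper one. Treating @context as either a single value or an array follows JSON-LD, and the function names are made up for this example.

```python
from typing import Any, Dict, List

CALIPER_V1P1_CONTEXT = "http://purl.imsglobal.org/ctx/caliper/v1p1"


def context_entries(event: Dict[str, Any]) -> List[Any]:
    """Return the event's @context as a list, whether it was a single value or an array."""
    context = event.get("@context", [])
    return context if isinstance(context, list) else [context]


def is_extended_event(event: Dict[str, Any]) -> bool:
    """True when the event carries a context in addition to the Caliper 1.1 one."""
    entries = context_entries(event)
    return CALIPER_V1P1_CONTEXT in entries and any(
        entry != CALIPER_V1P1_CONTEXT for entry in entries
    )
```

A stricter check would compare against the actual VitalSource context IRI once it is known from a delivered file.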

Engagement Events

Each of the learning events listed below is captured as users interact with content on the VitalSource platform. Each event will produce a single record in the activities feed.

View - View events are recorded when an end user views a page within an asset. For PDFs, view events are recorded for each page. View events are triggered on epub content when the user interacts with a section of content identified by page labels. By default, view events are only triggered when the user remains on a given page for at least 3 seconds; this threshold is configurable on custom instances of Bookshelf.
Print - Print events are recorded when the user prints a page of content.
Search - Search events are recorded when users search for a term within the content.
Note - Note events are recorded when users enter a note on a highlight annotation.
Highlight - Highlight events are recorded when users highlight passages within the content. Epub highlights are communicated as a CFI range. PDF highlights are represented by a series of coordinates.
Bookmark - Bookmark events are recorded when users create a bookmark within the content.
Download - Download events are recorded when the user downloads a copy of the asset for offline access.
Launch - Launch events are recorded when the user launches into the VitalSource platform via LTI.

Feed Details

Files Included in Each Feed

File	File Stem	Description
Activities	activities	A file containing activity that was received since the last feed interval.
Books	books	A file containing metadata around each book that has activity or was updated since the last feed interval.
Users	users	A file containing metadata around each user that has activity or was updated since the last feed interval.
Companies	companies	A file containing metadata around each company that has activity or was updated since the last feed interval. In most cases, you will only have one company record.

Books File

Field	Format	Description
asset_id	String (Primary key)	A unique identifier for the asset.
vbid	String	VitalBook Identifier, a unique identifier for the asset.
format	String	Format of the book; dashML, pbk, pdf, ePub are valid returns.
title	String	Asset title
e_isbn	String	eISBN of the asset
print_isbn	String	Print ISBN of asset
description	String	The description of the asset
author	String	Name(s) of the author(s)
edition	Integer	Edition of the book
publisher	String	Name of the publisher
publisher_id	Integer	VitalSource identifier for the publisher
imprint	String	Name of the imprint
imprint_id	Integer	VitalSource identifier for the imprint
created_at	ISO-8601 date/time	Date and time the asset was created
updated_at	ISO-8601 date/time	Date and time the asset was last updated
kind	String	The type of asset; e.g. file, book, chapter
parent_asset_id	String	The identifier for the parent asset. This applies to certain asset kinds like chapters.
parent_vbid	String	The vbid for the parent asset. This applies to certain asset kinds like chapters.

Companies File

Field	Format	Description
company_id	String (Primary key)	A unique identifier for the company
name	String	Name of the company
parent_id	String	Id of the company's parent organization
parent_name	String	Name of the company's parent organization
created_at	ISO-8601 date/time	Date and time the company was created
updated_at	ISO-8601 date/time	Date and time the company was updated

Users File

Field	Format	Description
user_id	String (Composite key)	A unique identifier for the user
previous_user_id	String	The user's previous identifier
first_name	String	User's first name
last_name	String	User's last name
company_id	String (Composite key)	The company who owns the reference ID for the user. This element is only returned when a reference ID is available. Foreign key: companies.company_id
reference	String	Reference ID for the user; this element is only returned when a reference ID is available
created_at	ISO-8601 date/time	Date and time the user's account was created
updated_at	ISO-8601 date/time	Date and time the user's account was last updated
access_token	String	The user's access token; this can be used to make additional calls to dereference annotations

Activities File

Field	Format	Description
course_id	String	The VitalSource course identifier. This is only available for LTI launch events.
event_id	String (Primary key)	A unique identifier for the event.
event_time	ISO-8601 date/time	The time at which the event occurred in UTC.
event_type	String	The type of event.
user_id	String	A dereferenceable, unique identifier for the user. Foreign key: users.user_id
asset_id	String	A dereferenceable, unique identifier for the asset. Foreign key: books.asset_id
session_id	String	A unique identifier for the user's session. A new identifier is created each time the user opens Bookshelf.
user_agent	String	The user agent string from the user's device. This can be used to discern the device and platform associated with the activity.
page	String	The page associated with the event. This is populated for non-print events.
pages	String	Comma delimited list of pages that were printed. Only populated for print events.
search_term	String	The term that was searched.
launch_parameters	String	LTI parameters that were passed to VitalSource as part of the launch.
distributor_id	String	A dereferenceable, unique identifier for the distributor responsible for provisioning content to the user. Foreign key: companies.company_id
course_id	Integer	The LMS ID
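
The pages field packs several values into one column, so it usually needs to be split before analysis. The sketch below reads activity rows from CSV text and splits that field. It assumes the CSV files carry a header row with the field names listed above, which is not confirmed here, and it keeps page values as strings because page labels are not necessarily numeric. Both function names are hypothetical.

```python
import csv
import io
from typing import Dict, Iterator, List


def read_activity_rows(text: str) -> Iterator[Dict[str, str]]:
    """Parse CSV text into dicts keyed by column name (assumes a header row)."""
    return csv.DictReader(io.StringIO(text))


def printed_pages(row: Dict[str, str]) -> List[str]:
    """Split the comma-delimited pages field; returns [] when it is empty."""
    raw = row.get("pages") or ""
    return [label.strip() for label in raw.split(",") if label.strip()]
```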

Periodic Delivery