VitalSource has two options for engagement data delivery: near real-time and periodic delivery. Delivery frequency doesn’t limit the scope of data you’ll receive in your data feed, only the frequency at which you receive your data.
Periodic feeds post incremental data files to a Google Cloud Storage bucket on a set schedule. Only one feed is delivered to each bucket, so data from a daily feed and a weekly feed is never mixed.
Each file is uniquely named and contains gzipped, incremental engagement data or dimensional data.
Your engagement data is extracted once every day.
- Daily delivery is sufficient for most efforts that don't require insights throughout the day. For more frequent delivery of engagement data, consider near real-time delivery.
- File size is more manageable since activities are extracted every day.
Your engagement data is extracted once every week.
- Supports research efforts or periodic trend analysis
- File size may be larger, possibly up to 10M events for large organizations
Files will be available for download for 30 days following their date of posting.
File Naming Convention
Files are named using the convention detailed below. The timestamps included in files reflect the time of the last activity or update within the file in UTC.
For example, an activity extract through November 30th, 2017 at 4:30:52 PM UTC in the Caliper JSON format would be named activities_11302017_163052_000.caliper.json.gz.
Large files are automatically split into multiple files, denoted by different values in the split component of the file name.
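As an illustration, the Python sketch below splits a file name into its parts. The pattern (feed name, MMDDYYYY date, HHMMSS time, split number, format suffix, .gz) is an assumption inferred from the single example file name above, not a published specification, and the function name parse_feed_filename is hypothetical.

```python
import re
from datetime import datetime, timezone

# Assumed layout, inferred from the one documented example:
#   <feed>_<MMDDYYYY>_<HHMMSS>_<split>.<format>.gz
FILENAME_RE = re.compile(
    r"^(?P<feed>[a-z]+)_(?P<date>\d{8})_(?P<time>\d{6})_(?P<split>\d+)"
    r"\.(?P<format>.+)\.gz$"
)


def parse_feed_filename(name):
    """Split a feed file name into feed, UTC timestamp, split index, and format."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized feed file name: {name!r}")
    stamp = datetime.strptime(match["date"] + match["time"], "%m%d%Y%H%M%S")
    return {
        "feed": match["feed"],
        # File name timestamps are documented as UTC.
        "timestamp": stamp.replace(tzinfo=timezone.utc),
        "split": int(match["split"]),
        "format": match["format"],
    }


info = parse_feed_filename("activities_11302017_163052_000.caliper.json.gz")
print(info["feed"], info["timestamp"].isoformat(), info["split"], info["format"])
```

Files that share the same feed name and timestamp but differ in the split value would then belong to the same extract.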
Incremental Delivery and Handling
Periodic data feeds are incremental. This means that you will only receive new data every day or week, based on your feed's frequency. For the activities file, this means that only events that were received since the last extract will be sent. Offline activities are synchronized with the system when users come back online, so activity feeds may contain newly received historical data. Events always include the timestamp from when the learning activity occurs, not when they are received.
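Because late-arriving offline activity carries its original timestamp, downstream reports are usually better keyed on the event's own time than on the delivery date of the file. The Python sketch below groups events by the UTC date on which they occurred. The field name event_time and the ISO 8601 timestamp format are assumptions for illustration; substitute the timestamp field of your chosen feed format.

```python
from collections import defaultdict
from datetime import datetime


def bucket_by_event_date(events, time_field="event_time"):
    """Group event records by the calendar date on which they occurred.

    `time_field` is a hypothetical field name holding an ISO 8601 UTC timestamp.
    """
    buckets = defaultdict(list)
    for event in events:
        # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11.
        raw = event[time_field].replace("Z", "+00:00")
        occurred = datetime.fromisoformat(raw)
        buckets[occurred.date().isoformat()].append(event)
    return dict(buckets)


# One file can hold both fresh events and older, newly synchronized ones.
delivered = [
    {"id": "e1", "event_time": "2018-11-30T16:30:52Z"},
    {"id": "e2", "event_time": "2018-11-27T09:15:00Z"},  # synced after offline use
    {"id": "e3", "event_time": "2018-11-30T08:00:00Z"},
]
for day, day_events in sorted(bucket_by_event_date(delivered).items()):
    print(day, [e["id"] for e in day_events])
```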
Dimensional files include records that have either been updated since the last feed or records that are referenced in the activities file currently being delivered. For example, the users feed will include users with updated metadata and users that have engaged with content since the last feed run.
You may wish to take advantage of your database's merge operation. This operation will perform two functions. First, if the record is new, it will be inserted. Second, if the record exists, it will be updated with the new data. All columns are included for each record, so in the case of an update, it is safe to update the contents of all columns in your database. Take note of the fields in the table definitions below that are listed as keys. These fields will be required to identify unique records.
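A minimal sketch of such a merge, using Python's built-in sqlite3 module and SQLite's upsert clause (available in SQLite 3.24 and later). The table layout is an assumption for illustration: user_id is taken as the key, and the other column names are hypothetical stand-ins for the fields in your chosen feed format. Other databases provide equivalent MERGE or upsert statements.

```python
import sqlite3


def upsert_users(conn, rows):
    """Insert new user records and overwrite existing ones, keyed on user_id."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS users (
               user_id    TEXT PRIMARY KEY,  -- key field identifying the record
               first_name TEXT,              -- hypothetical column names
               last_name  TEXT,
               updated    TEXT)"""
    )
    # Every column is delivered for each record, so all non-key columns
    # can be overwritten when the record already exists.
    conn.executemany(
        """INSERT INTO users (user_id, first_name, last_name, updated)
           VALUES (:user_id, :first_name, :last_name, :updated)
           ON CONFLICT(user_id) DO UPDATE SET
               first_name = excluded.first_name,
               last_name  = excluded.last_name,
               updated    = excluded.updated""",
        rows,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
upsert_users(conn, [{"user_id": "u1", "first_name": "Ada", "last_name": "Lovelace",
                     "updated": "2018-11-29T10:00:00Z"}])
# A later feed delivers the same user with changed metadata.
upsert_users(conn, [{"user_id": "u1", "first_name": "Ada", "last_name": "King",
                     "updated": "2018-11-30T10:00:00Z"}])
print(conn.execute("SELECT user_id, last_name FROM users").fetchall())
```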
The time of file delivery can be customized on a per-feed basis, although variation in delivery times is expected due to data availability. Variation between delivery times should be relatively small from day to day.
Extract Error Handling
While we strive to deliver daily and weekly feeds exactly once a day or once a week, it is possible to receive more than one extract in a given period due to processing issues. If an issue occurs, the extract process will restart, producing additional files for a given period. The data in these new files will not overlap with the data in any previously delivered files.
Empty files are not posted to the shared location. When no data is present for a given extract, no file is produced.
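Since file names are unique and a restarted extract only adds files, one way to keep ingestion safe to re-run is to track which file names have already been processed. The Python sketch below assumes you can list the bucket's file names and supply your own load function; process_new_files and its parameters are hypothetical names, and in practice the processed set would be persisted between runs.

```python
from typing import Callable, Iterable, List, Set


def process_new_files(
    available: Iterable[str],
    processed: Set[str],
    handler: Callable[[str], None],
) -> List[str]:
    """Run `handler` on each file not seen before and record it as processed."""
    handled = []
    for name in available:
        if name in processed:
            continue  # already ingested on an earlier run
        handler(name)
        processed.add(name)  # recorded only after the handler succeeds
        handled.append(name)
    return handled


seen: Set[str] = set()
first_listing = ["activities_a.csv.gz", "users_a.csv.gz"]
print(process_new_files(first_listing, seen, lambda name: None))

# A restarted extract adds one more file for the same period.
second_listing = first_listing + ["activities_b.csv.gz"]
print(process_new_files(second_listing, seen, lambda name: None))
```

A period with no new files simply yields an empty list, which matches the behavior of empty extracts not being posted.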
Periodic data feeds are available in Caliper, JSON and CSV format.
VitalSource CSV Format
The VitalSource CSV format is a custom format for transmitting engagement data from the VitalSource platform. It includes the same essential elements as Caliper in a streamlined payload.
VitalSource JSON Format
The VitalSource JSON format is a custom format for transmitting engagement data from the VitalSource platform. It includes the same essential elements as Caliper in a streamlined payload.
IMS Caliper 1.1 Format
As a contributor to the IMS Caliper standard, VitalSource is pleased to support the formatting of engagement data using Caliper 1.1. Caliper differentiates itself as a standard by prescribing both a data model and semantic model for communicating learning activities.
- Industry Standard: Created by IMS, Caliper is the industry standard for transmitting learning interactions.
- Powerful Foundation: Based on JSON-LD, Caliper is both human readable and easily understood by machines.
- Semantic Interoperability: Caliper prescribes a vocabulary for communicating learning interactions, ensuring consistent interpretation across systems.
- System Interoperability: Caliper defines the structure of each data point, ensuring consistent integration and consumption across systems.
Caliper Context Extension
Caliper JSON-LD documents define a context, denoted by the @context keyword, a property that maps document terms to IRIs in one or more published vocabularies. Inclusion of a JSON-LD context provides an economical way for Caliper to communicate document semantics to services interested in consuming Caliper event data.
On certain events, VitalSource has extended the IMS Caliper specification by providing an additional context definition. This extension of the context allows for the inclusion of additional event types that are not currently described in the Caliper standard. The Caliper standard allows for this extension and may ultimately adopt the VitalSource-described events in a future release of the Caliper standard. Extended events are clearly identifiable by inspecting the context attribute. These events will include both the Caliper context and the VitalSource context.
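A minimal Python sketch of that inspection follows. It assumes the standard Caliper 1.1 context IRI and treats any additional context entry as a sign of an extended event. The second context URL in the usage example is a placeholder, not the actual VitalSource context.

```python
CALIPER_CONTEXT = "http://purl.imsglobal.org/ctx/caliper/v1p1"


def contexts_of(event):
    """Return the event's @context entries as a list (JSON-LD allows one value or an array)."""
    ctx = event.get("@context", [])
    return [ctx] if isinstance(ctx, (str, dict)) else list(ctx)


def is_extended_event(event):
    """True when the event declares any context besides the Caliper 1.1 context."""
    return any(entry != CALIPER_CONTEXT for entry in contexts_of(event))


standard = {"@context": CALIPER_CONTEXT, "type": "ViewEvent"}
extended = {
    # Placeholder URL standing in for the VitalSource context definition.
    "@context": [CALIPER_CONTEXT, "https://example.com/vitalsource-context"],
    "type": "Event",
}
print(is_extended_event(standard), is_extended_event(extended))
```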
Each of the learning events listed below is captured as users interact with content on the VitalSource platform. Each event will produce a single record in the activities feed.
- View - View events are recorded when an end user views a page within an asset. For PDFs, view events are recorded for each page. For epub content, view events are triggered when the user interacts with a section of content identified by page labels. By default, view events are only triggered when the user remains on a given page for at least 3 seconds; this threshold is configurable on custom instances of Bookshelf.
- Print - Print events are recorded when the user prints a page of content.
- Search - Search events are recorded when users search for a term within the content.
- Note - Note events are recorded when users enter a note on a highlight annotation.
- Highlight - Highlight events are recorded when users highlight passages within the content. Epub highlights are communicated as a CFI range. PDF highlights are represented by a series of coordinates.
- Bookmark - Bookmark events are recorded when users create a bookmark within the content.
- Download - Download events are recorded when the user downloads a copy of the asset for offline access.
- Launch - Launch events are recorded when the user launches into the VitalSource platform via LTI.
Files Included in Each Feed
|A file containing activity that was received since the last feed interval.
|A file containing metadata around each book that has activity or was updated since the last feed interval.
|A file containing metadata around each user that has activity or was updated since the last feed interval.
|A file containing metadata around each company that has activity or was updated since the last feed interval. In most cases, you will only have one company record.
|A unique identifier for the asset.
|VitalBook Identifier, a unique identifier for the asset.
|Format of the book. Valid values are dashML, pbk, pdf, and ePub.
|eISBN of the asset
|Print ISBN of asset
|The description of the asset
|Name(s) of the author(s)
|Edition of the book
|Name of the publisher
|VitalSource identifier for the publisher
|Name of the imprint
|VitalSource identifier for the imprint
|Date and time the asset was created
|Date and time the asset was last updated
|The type of asset, e.g. file, book, chapter
|The identifier for the parent asset. This applies to certain asset kinds like chapters.
|The vbid for the parent asset. This applies to certain asset kinds like chapters.
|A unique identifier for the company
|Name of the company
|Id of the company's parent organization
|Name of the company's parent organization
|Date and time the company was created
|Date and time the company was updated
|A unique identifier for the user
|The user's previous identifier
|User's first name
|User's last name
The company that owns the reference ID for the user. This element is only returned when a reference ID is available.
Foreign key: companies.company_id
|Reference ID for the user; this element is only returned when a reference ID is available
|Date and time the user's account was created
|Date and time the user's account was last updated
|The user's access token; this can be used to make additional calls to dereference annotations
The VitalSource course identifier. This is only available for LTI launch events.
|A unique identifier for the event.
|The time at which the event occurred in UTC.
|The type of event.
A dereferenceable, unique identifier for the user.
Foreign key: users.user_id
A dereferenceable, unique identifier for the asset.
Foreign key: assets.asset_id
|A unique identifier for the user's session. A new identifier is created each time the user opens Bookshelf.
|The user agent string from the user's device. This can be used to discern the device and platform associated with the activity.
|The page associated with the event. This is populated for non-print events.
|Comma delimited list of pages that were printed. Only populated for print events.
|The term that was searched.
|LTI parameters that were passed to VitalSource as part of the launch.
A dereferenceable, unique identifier for the distributor responsible for provisioning content to the user.
Foreign key: companies.company_id
The LMS ID