PowerMetrics: About JSON and XML data

Many services return data in JSON (JavaScript Object Notation) and XML (eXtensible Markup Language) formats. This article describes the structure of JSON and XML data and how to handle it in PowerMetrics.

In this article, you will find information and examples on the following topics associated with working with JSON and XML data.

About the structure of JSON and XML data

JSON and XML are hierarchical data formats.

Where tabular data is referenced by column and row, JSON and XML data is referenced by its path, which indicates its position in the hierarchy. This path, also called the XPath, describes the levels traversed to reach the elements. See the following example, where:

  • player/first_name is the XPath of the first_name element
  • player/team/wha is the XPath of the wha element

JSON data is made up of these elements:

  • objects: one or more field name and value pairs, indicated by curly braces.
  • arrays: lists of elements of the same type, indicated by square brackets.

Selecting JSON or XML data

In the modeller, entering @ into the formula bar selects the data source. (See below.)

Note: The ; only displays when the XPath is being edited.

Clicking an element in the data source selects that element to include in the model. In the following image, "/player/first_name" is your datasource reference and editable XPath.

Note: Some data sources use '@' as the first character in their field name or as an attribute. It is important not to get them confused with the datasource reference.

Using the Selection Options in JSON or XML data

When you select data from a JSON or XML data source, the selected field’s path, or XPath, is displayed in the formula editor. You can type the path or you can click on the Selection Options link.

In the following image, the XPath of the first_name element is /player/first_name.

By default, the values of all peer elements are selected (equivalent to selecting a column of data in a table). If you need to change this selection, click Selection Options, and select the required option:

  • Select only this <field name> element. Only one specific <field_name> element is selected (equivalent to selecting a specific cell in Excel). The formula editor displays the specific path: @ /player[1]/first_name[1] .
  • Select all peer <field name> elements. All <field_name> elements at this level of the hierarchy are selected (equivalent to selecting a column of data in a table). The formula editor displays the path: @ /player/first_name .
  • Select all <field name> elements in all <parent> elements, adding blanks for missing <field name> elements. All <field_name> elements in all parents are selected, and blanks are inserted for missing <field_name> elements. The formula editor displays the path: @ kf:fill_elements(player,”first_name”) .
  • Select all field names in <parent> element. All field names in this branch of the hierarchy are selected. The formula editor displays the path: @ kf:names(player) . Learn more here.
  • Select all <field_name> elements. All <field_name> elements at any level of the hierarchy are selected. The formula editor displays the path: @ //first_name .
  • Select all <field_name> elements in this <parent> element. All <field_name> elements in this branch of the hierarchy are selected. The formula editor displays the path: @ /player[1]/first_name .
  • Select everything in <parent> elements. Everything (all values) in all <parent> elements is selected. The formula editor displays the path: @ /player/* .

XPath Expressions

An XPath expression is used to select elements based on specified criteria.

Some common functions used in XPath expressions:

  • name
  • last
  • position
  • contains
  • substring
  • count

Example 1: select elements by matching a specific value

To select the last names of players on team Winnipeg:

  1. Select the last_name of a player:  @ /player/last_name  This will return all player/last_name elements.
  2. Type the expression, [team/name='Winnipeg'], after /player to select only the last names of players on team Winnipeg.

The resulting XPath:  @ /player[team/name='Winnipeg']/last_name

This says: for all player elements with descendant team/name equal to Winnipeg, select the last_name element.

Example 2: select elements based on a field name

This example uses the contains function and the name function to select all elements based on a field name.

Type:  @ /player/*[contains(name(),'name')]

This says: for all /player elements, select any values where the field name contains: "name"

This expression returns Anders, Hedberg, Maurice, Richard, Rocket, Wayne, Gretzky, The Great One.

Example 3: select all descendant values

@ /player/*  selects all immediate descendant values in the player element.

Anders
Hedberg
Maurice
Richard
Rocket
Wayne
Gretzky
The Great One

@ /player[2]/*  selects all immediate descendant values (denoted by /*) in the second player element.

Maurice
Richard
Rocket

@ /player[3]/*/*  selects all second level descendant values (denoted by /*/*) in the third player element

Edmonton
1978-1988
1978-1979

XPath Axes

XPath axes (child, descendant, following-sibling, preceding-sibling) indicate the position of an element relative to another element in the hierarchy. For example,

  • first_name, last_name, nickname, teams/name, teams/wha and teams/nhl are descendants of player.
  • first_name, last_name and nickname are children of player.
  • first_name, last_name, nickname and teams are siblings of each other, preceding-sibling or following-sibling depending on the relative positions (XML) or alphabetical order (JSON} of their names:
    • first_name is a preceding-sibling of last_name.
    • last_name is a following-sibling of first_name.

 

To select first_names for all players with a nickname child:

@ /player[nickname]/first_name

 

To select all first_names with a nickname following-sibling (this will return the same values as @ /player[child::nickname]/first_name ) :

@ /player/first_name[following-sibling::nickname]

 

To select all last_names of players who played in the WHA:

@ /player/last_name[following-sibling::team/wha]

Klipfolio Functions for JSON data

Sometimes more complex XPath manipulation is required to access and align JSON data.

kf:element_at

Syntax: kf:element_at( object, index )

Select an element at the position specified by index. This function is used when a field name is unnamed and is instead referenced by number (position).

For example, using the following data source:

Example 1:

@ kf:element_at(rows,1)

Returns:

  • /nl/
  • /studyScreen?sourceId=11263
  • /studyScreen?sourceId=270814
  • /studyScreen?sourceId=604

Example 2:

@ kf:element_at(rows,2)

Returns:

  • 1
  • 1
  • 1
  • 1

kf:fill_elements

Syntax: kf:fill_elements(path,'fieldname')

Note: You can enter the formula manually or generate it automatically by choosing the Selection Option Select all <field_name> elements in all <parent> elements, adding blanks for missing <field_name> elements.

This function is used to “fill in” blanks for an element that is not included in every record. This is useful for data alignment.

For example, using the following data source:

Example 1:

@ /player/first-name Returns: Anders, Maurice, Wayne

@ /player/nickname Returns: Rocket, The Great One

As is, the data incorrectly aligns Anders with the nickname, Rocket, and Maurice with the nickname, The Great One.

 

Use kf:fill_elements to align this data correctly:

@ kf:fill_elements(/player,'nickname')

Returns: <blank>, Rocket, The Great One

 

Example 2:

@ kf:fill_elements(/player/team,'wha')

Returns: 1974-1978, <blank>, 1978-1979

Note: This function does not work for nested arrays.

kf:names

Syntax: kf:names(object)

Note: You can enter the formula manually or generate it automatically by choosing the Selection Option Select all field names in <parent> element.

The kf:names function returns the names of all fields contained in an element. This is useful when the field names identify categories such as demographics or countries.

Example 1:

For example, Facebook can return page fans data by gender and age. To create a list of these categories for a table column for a bar/line chart x-axis, use:

@ kf:names(data/values/value)

Which returns data shown in the following image:

Compare with @ kf:names(/data/values/value/M.25-34) which returns only value M.25-34

More examples:

The following show more examples using the data source featured in the previous function (kf:fill_elements).

@ kf:names(/player[1]/team)  Returns: name, wha

@ kf:names(/player/team)  Returns: names for all items in player: name, wha, name, nhl, name, wha, nhl

@ kf:names(/player[1])  Returns: first_name, last_name, team

Note: Unnamed fields cannot be selected directly; see kf:element_at.

Have more questions? Submit a request