
Constructing Networks

Some steps of the visualization process require understanding a dataset as a network.

These include:

  • calculating metrics such as node-centrality
  • performing topology-based network layouts
  • performing operations like expanding a selection from a node to its neighbours
  • transformations that require knowing that edges connect nodes:
    • aggregating edges (the data is aggregated, but the node and link references are not)
    • filtering edges or nodes (for consistency, this requires filtering out any edges connected to filtered-out nodes)
    • projecting edges
    • aggregating nodes into a supernode
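The consistency requirement for filtering can be sketched as follows. This is a minimal illustration, not NetPanorama's implementation: the `GraphNode`, `GraphLink`, and `Graph` types and the `filterNodes` function are hypothetical names introduced here.

```typescript
// Hypothetical minimal types for illustration; not NetPanorama's internal representation.
interface GraphNode { id: string; }
interface GraphLink { source: string; target: string; }
interface Graph { nodes: GraphNode[]; links: GraphLink[]; }

// Keep only the nodes that pass the predicate, then drop every link that
// has lost its source or target, so the result stays consistent.
function filterNodes(graph: Graph, keep: (n: GraphNode) => boolean): Graph {
  const nodes = graph.nodes.filter(keep);
  const keptIds = new Set(nodes.map((n) => n.id));
  const links = graph.links.filter(
    (l) => keptIds.has(l.source) && keptIds.has(l.target),
  );
  return { nodes, links };
}

const exampleGraph: Graph = {
  nodes: [{ id: "a" }, { id: "b" }, { id: "c" }],
  links: [
    { source: "a", target: "b" },
    { source: "b", target: "c" },
  ],
};

// Removing node "c" also removes the link b -> c.
const filteredGraph = filterNodes(exampleGraph, (n) => n.id !== "c");
console.log(filteredGraph.nodes.length, filteredGraph.links.length); // 2 1
```

A purely tabular filter on the nodes array would leave the link b -> c pointing at a node that no longer exists, which is why this step needs the network interpretation.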

We distinguish between separate stages:

  • loading tabular data
  • processing data in ways that treat it as tabular data, without needing to know how it should be interpreted as a network
  • construction of a network
  • processing the network in a way that requires understanding that links connect nodes

Creating networks

  • simple construction
  • complex construction
  • importing existing networks

A network can be created from tabular data defined by a data block, or it can be imported from a network data format defined at this step.

Simple Networks

| Attribute | Type | Description |
| --- | --- | --- |
| name | string | The name to be given to the network when it is created. |
| nodes | string | The name of the data array containing the list of nodes (or a selector). |
| links | string | The name of the data array containing the list of links (or a selector). |
| directed | boolean | Whether the edges in this network are directed. |
| source_node | [string, string] | The first element is the field on the nodes containing the node id; the second element is the field on the links containing the id of the source node. |
| target_node | [string, string] | The first element is the field on the nodes containing the node id; the second element is the field on the links containing the id of the target node. |
| metrics? | SeparateMetricSpec[] | List of metrics to calculate for each node. |
| addReverseLinks? | boolean | If true, replace any undirected links by a pair of directed links. |
| transform? | NetworkTransformSpec[] | List of network transformations to apply. |
| ignoreDanglingLinks | boolean | If true, links for which the source or target node cannot be found are ignored. If false, these links are created, but their source or target attribute is undefined. This is likely to cause errors when computing layouts, but may be useful when looking for missing node definitions in a dataset. Default: true |

A network can be constructed from two tabular datasets: an array of nodes and an array of links.

This is the same format as accepted by d3-force, or the Force Transform in Vega.


```json
{
  "name": "le-mis-network",
  "nodes": "nodes",
  "links": "links",
  "directed": true,
  "source_node": [ "id", "source" ],
  "target_node": [ "id", "target" ]
}
```

This specifies that we want to create a new network called le-mis-network, using data about nodes that was saved using the name nodes and data about the links that was saved using the name links. The source node for each link is identified by the field called source; the value in this field is matched against the id field of the nodes. The target node for each link is identified by the field called target; the value in this field is matched against the id field of the nodes.
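The matching described here, together with the documented ignoreDanglingLinks behaviour, can be sketched roughly as below. This is an illustration under stated assumptions rather than the library's code: `Row`, `BuiltLink`, and `buildSimpleNetwork` are hypothetical names, and the sample rows are made up.

```typescript
type Row = Record<string, unknown>;

// Hypothetical in-memory shape for a constructed link; the real library's shape may differ.
interface BuiltLink {
  source: Row | undefined;
  target: Row | undefined;
  data: Row;
}

function buildSimpleNetwork(
  nodes: Row[],
  links: Row[],
  sourceNode: [string, string], // [id field on nodes, source field on links]
  targetNode: [string, string], // [id field on nodes, target field on links]
  ignoreDanglingLinks = true,
): { nodes: Row[]; links: BuiltLink[] } {
  // Index the nodes by the id field named in each pair.
  const bySourceKey = new Map<unknown, Row>();
  const byTargetKey = new Map<unknown, Row>();
  for (const n of nodes) {
    bySourceKey.set(n[sourceNode[0]], n);
    byTargetKey.set(n[targetNode[0]], n);
  }
  const built: BuiltLink[] = [];
  for (const row of links) {
    const source = bySourceKey.get(row[sourceNode[1]]);
    const target = byTargetKey.get(row[targetNode[1]]);
    const dangling = source === undefined || target === undefined;
    // Dangling links are skipped by default, or kept with an undefined endpoint.
    if (dangling && ignoreDanglingLinks) continue;
    built.push({ source, target, data: row });
  }
  return { nodes, links: built };
}

// Made-up sample data: the second link refers to a node that is not defined.
const sampleNodes: Row[] = [{ id: "n1" }, { id: "n2" }];
const sampleLinks: Row[] = [
  { source: "n1", target: "n2" },
  { source: "n1", target: "n3" },
];

const strictNet = buildSimpleNetwork(sampleNodes, sampleLinks, ["id", "source"], ["id", "target"]);
const lenientNet = buildSimpleNetwork(sampleNodes, sampleLinks, ["id", "source"], ["id", "target"], false);
console.log(strictNet.links.length, lenientNet.links.length); // 1 2
```

In the lenient case the second link is kept with an undefined target, which is the situation the attribute table warns may break layouts but helps when hunting for missing node definitions.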

If addReverseLinks is true, then two edges in the network are created for each entry in the links array (one directed from the source to the target, and one from the target to the source); this can be useful when constructing an adjacency matrix for an undirected network.
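To see why reverse links help with adjacency matrices, consider the following sketch. The `DirectedLink` type and both functions are hypothetical helpers written for this illustration; they are not part of NetPanorama.

```typescript
// Hypothetical link shape used only for this illustration.
interface DirectedLink { source: string; target: string; }

// For each input link, emit the link itself plus its reverse.
function withReverseLinks(links: DirectedLink[]): DirectedLink[] {
  const out: DirectedLink[] = [];
  for (const l of links) {
    out.push(l, { source: l.target, target: l.source });
  }
  return out;
}

// Build a 0/1 adjacency matrix with rows as sources and columns as targets.
function adjacencyMatrix(ids: string[], links: DirectedLink[]): number[][] {
  const index = new Map<string, number>();
  ids.forEach((id, i) => index.set(id, i));
  const matrix: number[][] = ids.map(() => ids.map(() => 0));
  for (const l of links) {
    const r = index.get(l.source);
    const c = index.get(l.target);
    const row = r === undefined ? undefined : matrix[r];
    if (row !== undefined && c !== undefined) row[c] = 1;
  }
  return matrix;
}

const singleLink: DirectedLink[] = [{ source: "a", target: "b" }];
const oneWay = adjacencyMatrix(["a", "b"], singleLink);
const bothWays = adjacencyMatrix(["a", "b"], withReverseLinks(singleLink));
console.log(JSON.stringify(oneWay));   // [[0,1],[0,0]]
console.log(JSON.stringify(bothWays)); // [[0,1],[1,0]]
```

Without the reverse links only one triangle of the matrix is filled; with them the matrix is symmetric, as expected for an undirected network.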

The nodes table can be omitted, if you don't have any data associated with the nodes other than their ids.


```json
{
  "name": "le-mis-network",
  "links": "links",
  "directed": true,
  "source_node": [ "id", "source" ],
  "target_node": [ "id", "target" ]
}
```
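One plausible reading of omitting the nodes table is that the node set is inferred from the ids that appear in the links. The source does not spell out the mechanism, so the sketch below is an assumption; `LinkRow` and `nodesFromLinks` are hypothetical names.

```typescript
type LinkRow = Record<string, unknown>;

// Assumption: every distinct id seen in the source or target field becomes a node
// whose only attribute is its id.
function nodesFromLinks(
  links: LinkRow[],
  sourceField: string,
  targetField: string,
): { id: unknown }[] {
  const ids = new Set<unknown>();
  for (const row of links) {
    ids.add(row[sourceField]);
    ids.add(row[targetField]);
  }
  return Array.from(ids, (id) => ({ id }));
}

const inferredNodes = nodesFromLinks(
  [
    { source: "a", target: "b" },
    { source: "b", target: "c" },
  ],
  "source",
  "target",
);
console.log(inferredNodes.map((n) => n.id).join(",")); // a,b,c
```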

Complex networks

NetPanorama allows complex networks to be constructed by combining multiple tables.

The rows of each table may be mapped onto new nodes, new edges, or both.

Nodes can be created implicitly by defining an edge. This avoids the need for operations such as "promoting attributes to edges" that are used in some other systems: we can instead state that each entry of a data array is mapped onto an edge, whose target node has an id defined by that attribute.
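The idea of implicit node creation can be illustrated with a small sketch. It is not NetPanorama's implementation: `TableRow`, `TypedNode`, `TypedLink`, and `extractLinks` are hypothetical names, keying nodes by type and id is an assumption, and the sample rows are made up.

```typescript
type TableRow = Record<string, string>;
interface TypedNode { id: string; type: string; }
interface TypedLink { source: string; target: string; }

// Map each row onto one link; endpoints are created on first use, so an
// attribute value (here a city) becomes a node without a separate nodes table.
function extractLinks(
  rows: TableRow[],
  sourceField: string,
  sourceType: string,
  targetField: string,
  targetType: string,
): { nodes: TypedNode[]; links: TypedLink[] } {
  const nodes = new Map<string, TypedNode>();
  const links: TypedLink[] = [];
  // Assumption: a node is identified by its type together with its id.
  const ensureNode = (id: string, type: string): string => {
    const key = `${type}:${id}`;
    if (!nodes.has(key)) nodes.set(key, { id, type });
    return key;
  };
  for (const row of rows) {
    const s = row[sourceField];
    const t = row[targetField];
    if (s === undefined || t === undefined) continue;
    links.push({ source: ensureNode(s, sourceType), target: ensureNode(t, targetType) });
  }
  return { nodes: Array.from(nodes.values()), links };
}

const bipartite = extractLinks(
  [
    { person: "Ada", city: "London" },
    { person: "Alan", city: "London" },
  ],
  "person", "person", "city", "city",
);
console.log(bipartite.nodes.length, bipartite.links.length); // 3 2
```

Both rows share the city "London", so only one city node is created and both links point at it.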

| Attribute | Type | Description |
| --- | --- | --- |
| name | string | The name to be given to the network when it is created. |
| parts | ComplexNetworkSpecComponent[] | Each entry in this list describes the data that should be extracted from a data array. |
| transform? | NetworkTransformSpec[] | List of network transformations to apply. |

A ComplexNetworkSpecComponent:

| Attribute | Type | Description |
| --- | --- | --- |
| data | string | The name of the array being processed. |
| yieldsNodes? | ExtractedNodeSpec[] | Definitions for how nodes should be extracted from this data array. |
| yieldsLinks? | ExtractedLinkSpec[] | Definitions for how links should be extracted from this data array. |
| mapYieldsLinks? | MapLinkSpec[] | Definitions for how nested links should be extracted from this data array. |

An ExtractedNodeSpec:

| Attribute | Type | Description |
| --- | --- | --- |
| type? | string | The node type of nodes created by this rule. |
| id_field | string | The name of the field which contains the node id. |
| data | Value[] | List of fields from the data array to be copied to the nodes, and their new names. |

An ExtractedLinkSpec:

| Attribute | Type | Description |
| --- | --- | --- |
| source_id | ValFromField | The field in the data array that contains the id of the source node. |
| source_id_field? | string | The field on the source node which contains the value that will be used as the id (default is id). |
| source_node_type | string | The type of the source node. |
| target_id | ValFromField | The field in the data array that contains the id of the target node. |
| target_id_field? | string | The field on the target node which contains the value that will be used as the id (default is id). |
| target_node_type? | string | The type of the target node. |
| data? | Value[] | List of fields from the data array to be copied to the links, and their new names. |
| addReverseLinks? | boolean | If true, replace any undirected links by a pair of directed links. |

Sometimes a data array will yield multiple links per row, because one field contains a list of elements to which connections are made. This can be handled by a MapLinkSpec, which has the same attributes as an ExtractedLinkSpec, plus an additional attribute, field, which gives the name of the field containing this list.
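The effect of mapping over a list-valued field can be sketched as follows. This is only an illustration of the idea: `PairLink` and `expandListField` are hypothetical names, the sample rows are made up, and treating the list entries as link targets is an assumption.

```typescript
interface PairLink { source: string; target: string; }

// Emit one link per element of the list stored in `listField`.
function expandListField(
  rows: Array<Record<string, unknown>>,
  sourceField: string,
  listField: string,
): PairLink[] {
  const links: PairLink[] = [];
  for (const row of rows) {
    const source = row[sourceField];
    const targets = row[listField];
    // Rows without a usable id or list yield no links in this sketch.
    if (typeof source !== "string" || !Array.isArray(targets)) continue;
    for (const target of targets) {
      links.push({ source, target: String(target) });
    }
  }
  return links;
}

const expandedLinks = expandListField(
  [
    { paper: "P1", authors: ["Ann", "Bo"] },
    { paper: "P2", authors: ["Bo"] },
  ],
  "paper",
  "authors",
);
console.log(expandedLinks.length); // 3
```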

A Value is either:

  • a string
  • an object with required attribute field (containing a field name) and an optional attribute as
  • an object with required attribute value and an optional attribute as
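These three forms can be written as a TypeScript union, as in the sketch below. The type follows the list above, but `applyValues` and its interpretation are assumptions made for illustration: a bare string is taken to be a field copied under its own name, `as` is taken to be the new name, and a constant without `as` is skipped because the source does not say what name it would receive.

```typescript
type Value =
  | string
  | { field: string; as?: string }
  | { value: unknown; as?: string };

// Hypothetical helper: copy the listed values from a row onto a new object.
function applyValues(
  row: Record<string, unknown>,
  values: Value[],
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const v of values) {
    if (typeof v === "string") {
      out[v] = row[v]; // assumed: copy the field under the same name
    } else if ("field" in v) {
      out[v.as ?? v.field] = row[v.field]; // assumed: rename when `as` is given
    } else if (v.as !== undefined) {
      out[v.as] = v.value; // constant stored under the name given by `as`
    }
  }
  return out;
}

const copied = applyValues(
  { name: "Ada", yr: 1815 },
  ["name", { field: "yr", as: "born" }, { value: "person", as: "kind" }],
);
console.log(JSON.stringify(copied)); // {"name":"Ada","born":1815,"kind":"person"}
```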

Importing existing networks

Note: Currently only Pajek and GraphML formats are supported. In future, support may be added for other popular formats such as GEXF, GF, GML, GraphViz DOT, UCINET DL, and TULIP TLP.

There are several formats which store information about a network explicitly, so that they can be loaded without needing to specify the relationship between data elements and nodes or edges.

Such data can be imported directly by providing the URL of the data file.

Pajek

For documentation of the Pajek NET file format, see the manual and these introductory slides.

The Pajek format has many variations; currently we support only arcs/edges or arcslist/edgeslist form (not matrix format).

```json
{
  "name": "net",
  "format": "pajek",
  "url": "./data/example.pajek"
}
```
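For readers unfamiliar with the arcs/edges form, the sketch below parses a tiny Pajek NET string. It is a simplified illustration and not the importer used by NetPanorama: `PajekGraph` and `parsePajek` are hypothetical names, the arcslist/edgeslist forms and vertex coordinates are ignored, and the default weight of 1 is an assumption.

```typescript
interface PajekGraph {
  vertices: { id: number; label: string }[];
  links: { source: number; target: number; weight: number; directed: boolean }[];
}

function parsePajek(text: string): PajekGraph {
  const graph: PajekGraph = { vertices: [], links: [] };
  let section = "";
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === "") continue;
    if (line.charAt(0) === "*") {
      // Section headers such as "*Vertices 3", "*Arcs", "*Edges".
      section = (line.slice(1).split(/\s+/)[0] ?? "").toLowerCase();
      continue;
    }
    if (section === "vertices") {
      // A vertex line is a number followed by an optional (quoted) label.
      const m = /^(\d+)(?:\s+(?:"([^"]*)"|(\S+)))?/.exec(line);
      if (m) {
        const id = Number(m[1]);
        graph.vertices.push({ id, label: m[2] ?? m[3] ?? String(id) });
      }
    } else if (section === "arcs" || section === "edges") {
      // A link line is "source target [weight]"; arcs are directed, edges are not.
      const parts = line.split(/\s+/).map(Number);
      const source = parts[0];
      const target = parts[1];
      const w = parts[2];
      if (source === undefined || target === undefined) continue;
      if (isNaN(source) || isNaN(target)) continue;
      const weight = w === undefined || isNaN(w) ? 1 : w; // assumed default weight
      graph.links.push({ source, target, weight, directed: section === "arcs" });
    }
  }
  return graph;
}

const pajekText = [
  "*Vertices 3",
  '1 "Alice"',
  '2 "Bob"',
  '3 "Carol"',
  "*Arcs",
  "1 2 2.5",
  "*Edges",
  "2 3",
].join("\n");

const parsedPajek = parsePajek(pajekText);
console.log(parsedPajek.vertices.length, parsedPajek.links.length); // 3 2
```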

GraphML

NetPanorama supports the import of GraphML (not to be confused with GML) files. You may find the GraphML Primer a useful resource.

Not all GraphML features are supported. Specifically, we do not support:

  • nested graphs (i.e., graphs in which the nodes are hierarchically ordered)
  • hypergraphs (rather than edges joining a pair of nodes, these include hyperedges joining a set of nodes)
  • ports (these specify where on nodes connecting edges should be drawn)

Computing network metrics