Software Open Access

scienceguyrob/SKANIP_Data_Models: SKA NIP Data Models v1.1

Dr Rob Lyon

CSP to SDP NIP Data Rates & Data Models (version 1.1)

this release contains an ipython notebook and collection of python scripts. These make it easier to model SKA Science Data Processor (SDP) data rates. Diagrams are included that define the conceptual and logical structure of the Non-Imaging Processing (NIP) data models. Also included are activity diagrams for all NIP pipelines. Finally, formulas are presented that provide
accurate estimates of NIP pipeline data rates.

Author: Rob Lyon

Email :

web :


This  release consists of,

  • an ipython notebook that explores and describes SDP data models for NIP pipelines.
  • diagrams containing Entity-Relationship Models (ERMs), describing the data models/pipeline activity in NIP.
  • formulas useful for deriving NIP pipeline data rates.
  • standalone unit tested source code, which validates the correctness of the interactive ipython notebook code.


The notebook describes the conceptual and logical data models for the Non-Imaging Processing (NIP) components of the SKA Science Data Processor (SDP), according to Software Engineering Institute (SEI) standards. The notebook was created for JIRA TSK-1294, which requested SEI compliant versions of the NIP data models. This document represents the 4th iteration (at least) of the data models work, however only the last two iterations, TSK-12 and TSK-73 are recorded in JIRA.

All data models presented in this notebook are described via Entity-Relationship Models (ERMs). These are sometimes referred to as an entity-relationship diagrams (ERDs). There is no standard notation for ERMs, however we have attempted to adhere to defacto standards. Unified Modelling Language (UML) activity diagrams are also used to characterise the processing activity within NIP, which ingests and outputs the data models.

This ipython notebook also models the input data rates and data volumes for NIP. It includes models for,

  • pulsar/transient search mode.
  • pulsar timing mode.
  • dynamic spectra mode. The SKA science working group are discussing changing the name of this mode, to better reflect its function. The current name proposed is “Pulsar search filterbank format Data Stream”. Please be aware of this, as this change may come into effect after this notebook is finished. The name may even be changed to something entirely different.

Data rate estimates are provided via interactive Python 2.7 code. This can be run either within the notebook, or externally via the supplied source code.

Given the use of ERM & UML, basic familiarity with SEI standards is recommended to fully understand the diagrams and terminology used in this notebook.


The code and the contents of this notebook are released under the GNU GENERAL PUBLIC LICENSE, Version 3, 29 June 2007. We kindly request that if you make use of the notebook, you cite the work appropriately.


The notebook builds upon earlier work by Dr. Lina Levin-Preston & Prof. Ben Stappers. Lina wrote the original data models document, did all the data volume/rate calculations, and provided the original notation! Ben worked on this too. This notebook relies greatly on their earlier work. Thank you Andrea Possenti, for providing valuable feedback which has helped produce the latest version.

Change log

Version 1.1 - Incorporates feedback from Andrea Possenti. Altered the description of a program and schedule block in Section 2.

Files (5.2 MB)
Name Size
5.2 MB Download
All versions This version
Views 168168
Downloads 1818
Data volume 94.1 MB94.1 MB
Unique views 163163
Unique downloads 1818


Cite as