How to Create a Data Transformation

This tutorial describes how to create a data transformation capable of changing the data type of a column or replacing null values in a column.

Overview

Data transformations are applied to data at import, before it is loaded into Spotfire. This tutorial describes a transformation that changes a single column, either by altering the data type of the column or by replacing its null values.

Background Information
  • Creating a Transformation
    Transformations are applied to data when loading it into Spotfire, shaping the data to the desired form before analyzing it.
Prerequisites
  • Spotfire SDK\Examples\Extensions\SpotfireDeveloper.CleanupTransformationExample

Cleanup Transformation Example

The core functionality of the custom transformation is implemented by three classes:

  • CleanupTransformation: Defines the logic of the transformation.
  • CleanupTransformationDialog: The dialog retrieving settings for the transformation.
  • CleanupTransformationReader: Performs the actual transformation. It is returned by the CleanupTransformation when the Connect method is called.

CleanupTransformation.cs

The CleanupTransformation class is implemented by deriving from CustomDataTransformation. It handles the prompting for settings and creates a DataRowReader for the transformation.

  1. Override ConnectCore.
    This method checks whether prompting is required and creates a DataTransformationConnection, which displays the registered transformation dialog and creates a CleanupTransformationReader.
  2. Override GenerateDataHistoryCore.
    This method must be implemented if the transformation requires settings to execute. In the method all applied settings are added as details to a DataHistoryBuilder.
  3. Add public properties.
    The CleanupTransformation class is passed to the dialog and the specified settings are then stored in it.
    The properties of the CleanupTransformation class must be public to enable the dialog to store the settings.
  4. Implement serialization.
    The settings for the transformation must be serialized.
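Steps 2 to 4 above can be illustrated with a minimal, language-neutral sketch. It is written in Python rather than against the Spotfire SDK, and every name in it (CleanupSettings, history_details, to_json, from_json) is a hypothetical stand-in, not part of the Spotfire API. It only shows the idea: public settings that a dialog can fill in, a serialized form that round-trips, and a list of applied settings of the kind that would be added as data history details.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class CleanupSettings:
    """Hypothetical stand-in for the public properties of the transformation."""

    column_name: str
    change_type: bool = False                 # True: change the column's data type
    new_type: Optional[str] = None            # target type name, used with change_type
    replace_nulls: bool = False               # True: replace null values
    replacement_value: Optional[str] = None   # used with replace_nulls

    def to_json(self) -> str:
        """Serialize the settings so they can be stored and restored later."""
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, text: str) -> "CleanupSettings":
        """Rebuild the settings from their serialized form."""
        return cls(**json.loads(text))

    def history_details(self) -> List[str]:
        """List the applied settings, one detail line per setting."""
        details = [f"Column: {self.column_name}"]
        if self.change_type:
            details.append(f"New type: {self.new_type}")
        if self.replace_nulls:
            details.append(f"Null replacement: {self.replacement_value}")
        return details
```

In the actual example, the same roles are played by the public properties of CleanupTransformation, its serialization code, and the DataHistoryBuilder passed to GenerateDataHistoryCore.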

CleanupTransformationDialog.cs

The user must be able to specify parameters for the transformation, so a dialog must be created. For the CleanupTransformation, the user must be able to select a column and specify whether the column should change type or whether its null values should be replaced.

Cleanup data transformation dialog
If the column shall change type, the replacement type must be specified.
If null values shall be replaced, the replacement value must be specified.
  1. Update the model with user information.
    When the user presses the OK button, the specified parameters are stored in the model. The stored values are used as parameters when creating the DataRowReader.

CleanupTransformationReader.cs

  1. Implement a class deriving from CustomDataRowReader.
    DataRowReader is the class that performs the transformation of the data.
    The reader will step through the rows of the dataset. For each row the data will be calculated in the MoveNextCore method.
  2. Implement the constructor.
    Store the settings for the transformation and create a list for the cursors used to transform the data.
    Each column is added to a list of columns. The column to be transformed is assigned a custom cursor. If the data type is to be changed, the column also gets a new data type. The transformation cursor and the input cursor are stored; they are used in the MoveNextCore method.
  3. Implement GetColumnsCore.
    This method returns the columns that the DataRowReader can return.
  4. Implement GetResultPropertiesCore.
    This method returns the result properties for the DataRowReader.
    Properties to describe the performed transformation are added to the result property collection.
  5. Implement MoveNextCore.
    This method moves the cursors to the next row of the dataset. First, a check is performed that the input reader can move to the next row. Then all cursors in the list of transformation cursors are processed.
    For each cursor the output value will be computed:
    • If the column type is to be changed, the input value is converted to the new type.
    • If the null values are to be replaced, each invalid input value is replaced by the defined value.
  6. Implement ResetCore.
    This method resets the DataRowReader.
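The row-by-row logic of the reader can be modeled with a short sketch. It is a simplified illustration in Python, not the Spotfire SDK: CleanupReader, move_next, and reset are hypothetical names standing in for the reader class, MoveNextCore, and ResetCore, and rows are represented as plain dictionaries instead of cursors. It also assumes that a null value stays null when the column type is changed.

```python
from typing import Any, Callable, Dict, List, Optional


class CleanupReader:
    """Hypothetical model of a row reader that transforms one column."""

    def __init__(
        self,
        rows: List[Dict[str, Any]],
        column: str,
        convert: Optional[Callable[[Any], Any]] = None,
        null_replacement: Any = None,
    ) -> None:
        # Store the input and the settings for the transformation.
        self._rows = rows
        self._column = column
        self._convert = convert                    # set when the type is to be changed
        self._null_replacement = null_replacement  # set when nulls are to be replaced
        self._index = -1
        self.current: Optional[Dict[str, Any]] = None

    def move_next(self) -> bool:
        """Advance to the next row and compute the transformed value."""
        # First check that the input can move to the next row.
        if self._index + 1 >= len(self._rows):
            return False
        self._index += 1
        row = dict(self._rows[self._index])  # copy, so the input stays unchanged
        value = row[self._column]
        if value is None:
            # Replace the null (invalid) value if a replacement is defined.
            if self._null_replacement is not None:
                value = self._null_replacement
        elif self._convert is not None:
            # Convert the valid input value to the new type.
            value = self._convert(value)
        row[self._column] = value
        self.current = row
        return True

    def reset(self) -> None:
        """Return the reader to its position before the first row."""
        self._index = -1
        self.current = None
```

For example, constructing CleanupReader(rows, "amount", convert=int) and calling move_next() in a loop yields each row with the amount column converted to an integer, while passing null_replacement="0" instead fills in the missing values and leaves the other values untouched.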