
PHP-ETL - Cook Books
Using Sub Chains

Using subchains

In some cases the chain description can become quite repetitive. Take the following example from Chapter 1 - Splitting/Forking.

In that example we split our customer.csv file into two files: one with the customers subscribed to the newsletter and one with those not subscribed. We did not apply any additional processing to change the structure of the data.

Now suppose we want to keep only the FirstName and LastName columns from the CSV file for the subscribed customers. The resulting chain would look like this:

  branch-out:
    operation: split
    options:
      branches:
        -
          filter-unsubscribed:
            operation: filter
            options:
              rule: [{get : {field: 'IsSubscribed'}}]
              negate: false
          transform:
            operation: rule-engine-transformer
            options:
              add: false # We want to replace all existing columns with our new columns.
              columns:
                FirstName:
                  rules:
                    - get : {field: 'FirstName'}
                LastName:
                  rules:
                    - get : {field: "LastName"}
          write-new-file:
            operation: csv-write
            options:
              file: "subscribed.csv"
        -
          filter-subscribed:
            operation: filter
            options:
              rule: [{get : {field: 'IsSubscribed'}}]
              negate: true

          write-new-file:
            operation: csv-write
            options:
              file: "unsubscribed.csv"

To do the same for both subscribed and unsubscribed customers, we would need to duplicate the whole transform operation. This is a very simple case; if we wanted to add grouping and more transforms, the amount of duplication would grow even further.

Subchains are meant for such cases. We can define a subchain that performs the necessary transformations:

subChains:
  customTransform:
    chain:
      generic-subchain-transformation:
        operation: rule-engine-transformer
        options:
          add: false # We want to replace all existing columns with our new columns.
          columns:
            FirstName:
              rules:
                - get : {field: 'FirstName'}
            LastName:
              rules:
                - get : {field: "LastName"}

We can then reference this subchain anywhere within our chain by using the subchain operation:

          transform:
            operation: subchain
            options:
              name: customTransform

The following rules apply to subchains:

  • A subchain can contain multiple operations, just like a normal chain.
  • Operations in a subchain are cloned, so a grouping operation will not share memory across uses unless the option shared is set to true.
  • Subchains can use other subchains, so multiple levels of subchains are possible.
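
The first and last rules can be illustrated with a small sketch. It reuses only the operations and options already shown on this page; the subchain name subscribedNames and the operation names keep-subscribed and reduce-columns are hypothetical and chosen for illustration. It should be read as one possible layout, not as a reference configuration.

```yaml
subChains:
  # Same transformation as defined earlier: keeps only FirstName and LastName.
  customTransform:
    chain:
      generic-subchain-transformation:
        operation: rule-engine-transformer
        options:
          add: false # Replace all existing columns with the new columns.
          columns:
            FirstName:
              rules:
                - get : {field: 'FirstName'}
            LastName:
              rules:
                - get : {field: "LastName"}

  # Hypothetical subchain with two operations; the second one calls another subchain.
  subscribedNames:
    chain:
      keep-subscribed:
        operation: filter
        options:
          rule: [{get : {field: 'IsSubscribed'}}]
          negate: false
      reduce-columns:
        operation: subchain
        options:
          name: customTransform
```

A branch of the main chain could then call subscribedNames through a single subchain operation instead of repeating the filter and the transform.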

Complete Code

subChains:
  customTransform:
    chain:
      generic-subchain-transformation:
        operation: rule-engine-transformer
        options:
          add: false # We want to replace all existing columns with our new columns.
          columns:
            FirstName:
              rules:
                - get : {field: 'FirstName'}
            LastName:
              rules:
                - get : {field: "LastName"}

chain:
  read-file:
    operation: csv-read
    options: [] # Use the default options (including the default delimiter).
  branch-out:
    operation: split
    options:
      branches:
        -
          filter-unsubscribed:
            operation: filter
            options:
              rule: [{get : {field: 'IsSubscribed'}}]
              negate: false
          transform:
            operation: subchain
            options:
              name: customTransform
          write-new-file:
            operation: csv-write
            options:
              file: "subscribed.csv"
        -
          filter-subscribed:
            operation: filter
            options:
              rule: [{get : {field: 'IsSubscribed'}}]
              negate: true
          transform:
            operation: subchain
            options:
              name: customTransform
          write-new-file:
            operation: csv-write
            options:
              file: "unsubscribed.csv"

Free & Open Source (MIT)