PHP-ETL - Cook Books
Grouping / Aggregation

As a second example, we will write a JSON file in which customers are grouped by their subscription state. We write the output as JSON because it is better suited to showing what we are doing.

Let’s start by reading our CSV file.

read-file:
  operation: csv-read
  options: [] # Use the default delimiter

We will use the simple-grouping operation for this. This operation has to hold all the data in memory, so it should be used with caution.

Here we have a single grouping-key. More complex groupings are possible, for example grouping by both subscription status and gender.

The group-identifier option allows us to remove duplicates. If we had customer emails, for example, we could have used them as the identifier.

group-per-subscription:
  operation: simple-grouping
  options:
    grouping-key: ['IsSubscribed']
    group-identifier: []
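
The grouping key and group identifier described above can be used together. The fragment below is only a sketch: it assumes the CSV file has `Gender` and `Email` columns (hypothetical names, not taken from the sample file) and that both options accept field names in the same list form as `['IsSubscribed']`. How the resulting groups are laid out in the output is not shown here.

group-per-subscription-and-gender:
  operation: simple-grouping
  options:
    # Group by subscription status and gender.
    # 'Gender' is an assumed column name.
    grouping-key: ['IsSubscribed', 'Gender']
    # Use the email to identify duplicate customers.
    # 'Email' is an assumed column name.
    group-identifier: ['Email']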

We will also use the json-write operation.

This works like writing a CSV file, but is better suited to complex, multi-level data such as what we have after the grouping.

write-new-file:
  operation: json-write
  options:
    file: "output.json"

Complete YAML

chain:
  read-file:
    operation: csv-read
    options: [] # Use the default delimiter

  group-per-subscription:
    operation: simple-grouping
    options:
      grouping-key: ['IsSubscribed']
      group-identifier: []

  write-new-file:
    operation: json-write
    options:
      file: "output.json"

Free & Open Source (MIT)