Execution Context - Why to have an execution context & what it does


In most of our examples the chain had access to the whole file system. This makes it impossible to run multiple chains together, or to get a list of the files that each execution has generated.

Both the 🎡 Symfony Bundle (and therefore the 🦒 Sylius integration) and the Magento2 Module use contextual chains. This means the "main" operations only have access to a particular directory created for the execution of the chain.

This directory might be locally available on the server, or it might be a remote file system. A remote file system can be useful when php-etl runs on a multi-server setup, for example to share files between the servers.

Additional operations such as the ExternalFileFinderOperation and ExternalFileProcessor are used to process files that are either in a remote directory (sftp, bucket s3…) or on the local file system. They are needed because operations such as the CsvLoader do not have access to those files unless they are copied into the contextual directory of the current execution.

So both the external file system and the context can be remote; they can be the same remote or two different remotes.
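To illustrate why an external file has to be copied first, here is a minimal plain-PHP sketch that does not use any php-etl class. The function `copyIntoContextDir` and the temporary paths are hypothetical and exist only for this illustration; in php-etl this role is played by the ExternalFileFinderOperation and ExternalFileProcessor operations, whose internals may differ.

```php
<?php
/**
 * Hypothetical helper (not part of php-etl): copy a file that lives outside
 * the execution's directory into it, so that operations restricted to that
 * directory can read it.
 */
function copyIntoContextDir(string $externalPath, string $contextDir): string
{
    if (!is_file($externalPath)) {
        throw new InvalidArgumentException("External file not found: $externalPath");
    }
    if (!is_dir($contextDir) && !mkdir($contextDir, 0777, true) && !is_dir($contextDir)) {
        throw new RuntimeException("Cannot create context directory: $contextDir");
    }

    $target = rtrim($contextDir, '/') . '/' . basename($externalPath);
    if (!copy($externalPath, $target)) {
        throw new RuntimeException("Cannot copy $externalPath to $target");
    }

    return $target;
}

// Usage with throwaway paths in the system temp directory.
$external = sys_get_temp_dir() . '/external-' . uniqid() . '.csv';
file_put_contents($external, "sku,name\n1,foo\n");

$contextDir = sys_get_temp_dir() . '/context-' . uniqid();
$local = copyIntoContextDir($external, $contextDir);
```

After the copy, `$local` points to a file inside the execution's own directory, which is the only place the "main" operations are allowed to read from.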

Let's start with a simple example: writing the result of an API to a CSV file.

For this, we first create a context factory using PerExecutionContextFactory. This factory creates a unique context for each execution, which means a unique directory to run the ETL in and a unique logger.

This is only needed if you are running the ETL 🐘 standalone. With any of the integrations, this should be handled for you automatically.

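<?php
// Root directory under which a directory is created for each execution.
$workdir = __DIR__ . "/var/";
$dirManager = new ChainWorkDirManager($workdir);
// No logging in this example.
$loggerFactory = new NullLoggerFactory();
// Gives the chain access to the execution directory on the local disk.
$fileFactory = new LocalFileSystemFactory($dirManager);

return new PerExecutionContextFactory(
    $dirManager,
    $fileFactory,
    $loggerFactory
);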

The execution is identified by an object of type ExecutionInterface, which is passed to the processor in the options:

$options = [
    'etl' => [
        'execution' => new PockExecution(new DateTime())
    ]
];

$chainProcessor->process(
    new ArrayIterator([[]]),
    $options
);

Executing this will create a directory in var/ containing the output. Every time you execute the chain, a new directory will be created.
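The idea of one directory per execution can be sketched in plain PHP as follows. This is only an illustration under stated assumptions: `createExecutionDir` is a hypothetical function, and the naming scheme (start time plus a random suffix) is invented for this sketch; it is not how php-etl names its directories.

```php
<?php
/**
 * Hypothetical helper (not part of php-etl): create a fresh working
 * directory for one execution. The naming scheme is an assumption made
 * for this sketch only.
 */
function createExecutionDir(string $workdir, DateTimeInterface $startedAt): string
{
    $name = $startedAt->format('Ymd-His') . '-' . bin2hex(random_bytes(4));
    $dir = rtrim($workdir, '/') . '/' . $name;

    if (!mkdir($dir, 0777, true) && !is_dir($dir)) {
        throw new RuntimeException("Cannot create execution directory: $dir");
    }

    return $dir;
}

// Throwaway work directory in the system temp directory.
$workdir = sys_get_temp_dir() . '/etl-var-' . uniqid();

$first = createExecutionDir($workdir, new DateTimeImmutable());
$second = createExecutionDir($workdir, new DateTimeImmutable());

// Each execution writes its output into its own directory.
file_put_contents($first . '/output.csv', "id\n1\n");
file_put_contents($second . '/output.csv', "id\n2\n");
```

Because the two executions never share a directory, their output files cannot overwrite each other, and listing one directory gives exactly the files that one execution produced.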

Free & Open Source (MIT)