PHP-ETL - Understand the ETL
Execution Context - Why to have an execution context & what it does
Execution Context - Why to have an execution context & what it does
In most of our examples our chain had access to the whole file system. This means having multiple chains running together, or having a list of files each execution has generated is impossible.
Both the π΅ Symfony Bundle(and therefore the π¦’ Sylius integration) and the Magento2 Module will use contextual chains. This means the βmainβ operations have only access to a particular directory created for the execution of the chain.
Additional operations such as the ExternalFileFinderOperation and ExternalFileProcessor will be use to process files that are either on a remote directory (sftp, bucket s3β¦) or files that are on the local file system. Because operations such as the CsvLoader will not have access to those files unless they are copied into the contextual directory of the current execution.
Let start by a simple example.
Write the result of an API to a CSV File.
For this we will first create a new ContextFactory using PerExecutionContextFactory. This context factory will create unique contexts for each execution. This means a unique directory to run the etl in; and a unique logger.
This is only needed if you are running the etl in π standalone. With any integration this should be automatically handled for you.
<?php
$workdir = __DIR__ . "/var/";
$dirManager = new ChainWorkDirManager($workdir);
$loggerFactory = new NullLoggerFactory();
$fileFactory = new LocalFileSystemFactory($dirManager);
return new PerExecutionContextFactory(
$dirManager,
$fileFactory,
$loggerFactory
);
The execution is identified with objects of type ExecutionInterface set on the processor:
$options = [
'etl' => [
'execution' => new PockExecution(new DateTime())
]
];
$chainProcessor->process(
new ArrayIterator([[]]),
$options
);
Executing this will create a directory in var/
with the output result. Everytime you execute the chain a new
directory wil be created.