PHP-ETL - Cookbooks
API to CSV

Simple API to CSV

PHP-ETL also provides a basic HTTP client operation, which allows us to fetch or push data using REST APIs.

Let’s call a mock api returning a list of users.

Setting response_is_json to true automatically decodes the JSON returned by the API. option_key lets us pass additional options to the request; this can be used to add dynamic headers or data that needs to be posted. If response_key is set, the response data is added to the original data object; if not, the response replaces the input data.

  get-from-api:
    operation: http
    options:
      url: https://63b687951907f863aaf90ab1.mockapi.io/test
      method: GET
      response_is_json: true
      option_key: ~
      response_key: ~
      options:
        headers: {'Accept': 'application/json'}
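For instance, option_key and response_key could be combined to post data while keeping the original item. The chain step below is only a sketch: the endpoint, the request_options key name, and the shape of the options stored under it are assumptions, not part of the example above.

```yaml
  push-to-api:
    operation: http
    options:
      # Hypothetical endpoint, for illustration only.
      url: https://example.com/api/users
      method: POST
      response_is_json: true
      # Each item is expected to hold its HttpClient options (for example
      # a {json: {...}} body) under this key; the name is an assumption,
      # adjust it to match your data.
      option_key: 'request_options'
      # The decoded response is stored under this key instead of
      # replacing the item, so the original fields are preserved.
      response_key: 'api_response'
      options:
        headers: {'Accept': 'application/json'}
```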

This will return a single DataItem containing all the users from the API. We will need to split this item in order to process each user individually.

  split-item:
    operation: split-item
    options:
      keys: ['content']
      singleElement: true
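To make the split concrete, here is a sketch of the data shapes before and after the operation. The values are invented for illustration (the field names match the mock API), and it assumes split-item unwraps each element of the content array into its own item, which is what lets csv-write consume one user per row.

```yaml
# Before split-item: a single item holding the whole decoded response.
content:
  - {id: "1", name: "Alice", avatar: "...", createdAt: "..."}
  - {id: "2", name: "Bob", avatar: "...", createdAt: "..."}

# After split-item on the content key with singleElement: true,
# each user becomes its own item:
{id: "1", name: "Alice", avatar: "...", createdAt: "..."}
{id: "2", name: "Bob", avatar: "...", createdAt: "..."}
```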

Now we can write the users into the CSV file, as we did in our previous examples.

Complete Code

chain:
  get-from-api:
    operation: http
    options:
      url: https://63b687951907f863aaf90ab1.mockapi.io/test
      method: GET
      response_is_json: true
      option_key: ~
      response_key: ~
      options:
        headers: {'Accept': 'application/json'}

  split-item:
    operation: split-item
    options:
      keys: ['content']
      singleElement: true

  write-new-file:
    operation: csv-write
    options:
      file: "output.csv"

Transform the data before writing it.

Previously we fetched all the users from a mock API. What if we need to call an individual API for each user id? To achieve this, the URL of our API needs to be "dynamic": each execution must use a different user id.

We can achieve this by using Symfony expressions in the url key. To tell the operation that a Symfony expression is being used, simply prefix it with @. (! is for using values from the input, @ is for using data from the current item. Not all fields support @, as it is handled individually by each operation, but all fields support !, as it is resolved before the ETL starts processing.)
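As a sketch, the two prefixes look like this. The @ line is taken from the example below; the baseUrl parameter in the ! line is hypothetical, standing in for whatever values your chain's input provides.

```yaml
# "@": evaluated for every item; "data" refers to the current item.
url: '@"https://63b687951907f863aaf90ab1.mockapi.io/test/"~data["id"]'

# "!": evaluated once, before the ETL starts processing.
url: '!"https://" ~ baseUrl ~ "/test"'
```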

We will also change the option_key; otherwise our data (id = 1) would be sent into the options of the HttpClient, which would cause an error. Pointing it at a key that does not exist in the data means no extra options are passed.

Note that this operation runs its queries concurrently: a single Symfony HttpClient is created for the operation, and thanks to the AsyncItems functionality of the ETL all the HTTP requests can run in parallel.

  get-from-api:
    operation: http
    options:
      url: '@"https://63b687951907f863aaf90ab1.mockapi.io/test/"~data["id"]'
      method: GET
      response_is_json: true
      option_key: "-placeholder-"
      response_key: ~
      options:
        headers: {'Accept': 'application/json'}

This time each response is nested under a content key, so we first flatten the user fields with a rule-engine-transformer and then write the users into the CSV file, as we did in our previous examples.

Complete Code

chain:
  get-from-api:
    operation: http
    options:
      url: '@"https://63b687951907f863aaf90ab1.mockapi.io/test/"~data["id"]'
      method: GET
      response_is_json: true
      option_key: "-placeholder-"
      response_key: ~
      options:
        headers: {'Accept': 'application/json'}

  content-only:
    operation: rule-engine-transformer
    options:
      add: false # We want to replace all existing columns with our new columns.
      columns:
        createdAt:
          rules:
            - get: {field: ['content','createdAt']}
        name:
          rules:
            - get: {field: ['content','name']}
        avatar:
          rules:
            - get: {field: ['content','avatar']}
        id:
          rules:
            - get: {field: ['content','id']}

  write-new-file:
    operation: csv-write
    options:
      file: "output.csv"