Quick Start Guide: Talend and Docker

Enterprise deployment work is notorious for being hidebound and slow to react to change. As more organizations adopt Docker and container services, it becomes easier to fold the Talend deployment life cycle into the container platform they already run, creating a more unified deployment platform that can be shared across applications within the organization.

This article is intended as a quick start guide on how to generate Talend Jobs as Docker images using a Docker service that is on a remote host.

Also, to provide a better understanding of how to handle Docker images, a few of the topics below draw comparisons between sh/bat scripts and Docker images.

Setting Up Docker for a Remote Build

Talend Studio needs to connect to a Docker service to be able to generate a Docker image.

The Docker service can run on the machine where Talend Studio is installed, or on a remote host. The setup in this section is needed only when Talend Studio and Docker run on different hosts; skip it if both are on the same machine.
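As a rough illustration, the sketch below shows one way to expose a Docker daemon over TCP and to check it from the Talend Studio machine. It assumes a Linux Docker host where the daemon can be started by hand; `docker_host_url` and `check_remote_docker` are hypothetical helper names, and the `DOCKER` variable exists only so the commands can be dry-run. Port 2375 is unencrypted and unauthenticated, so this is only suitable on a trusted network.

```shell
# On the Docker host: listen on the local socket and on TCP port 2375.
# Shown as a comment because it needs root and replaces the running daemon:
#   sudo dockerd -H unix:///var/run/docker.sock -H tcp://0.0.0.0:2375

# On the Talend Studio machine: check that the remote daemon answers.
DOCKER="${DOCKER:-docker}"   # assumption: override for a dry run, e.g. DOCKER=echo

# Hypothetical helper: build the URL that Studio expects.
# $1 = Docker host name or IP, $2 = port (defaults to 2375)
docker_host_url() {
  printf 'tcp://%s:%s\n' "$1" "${2:-2375}"
}

# Hypothetical helper: ask the remote daemon for its version.
check_remote_docker() {
  "$DOCKER" -H "$(docker_host_url "$@")" version
}
```

Running `check_remote_docker dockerhostIP` from the Studio machine should print both a client and a server section if the remote daemon is reachable.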

Docker Remote Build

Building a Docker Image from Talend Studio v7.1 or Greater

In v7.1, Talend introduced the fabric8 Maven plugin to generate a Docker image directly from Talend Studio.

Using Talend Studio, we can either build a Docker image that is stored in the local Docker repository, or build a Docker image and publish it to a registry of our choice.

Let us look at both options:

Build the Docker Image from Talend Studio

1. Right-click on the Job and navigate to the Build Job option.
2. Under build type, select Docker Image.
3. Choose the appropriate context and log4j level.
4. Under Docker Options, select Local if Docker and Studio are installed on the same host, or select Remote if your Docker service is running on a different host from the one where Talend Studio is installed. In our example, we enabled Docker for a remote build via TCP on port 2375:

tcp://dockerhostIP:2375

5. Once the build completes, the Docker image is stored in the Docker repository on the Docker host (host 2 in our example).

6. Log in to the Docker host (host 2 in our example) and execute the command docker images. The image we just built should appear in the list.
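If logging in to the Docker host is inconvenient, the same docker images check can probably be made from the Studio machine by pointing the Docker client at the remote daemon. The sketch below assumes a Docker client is installed locally; `remote_images` is a hypothetical helper, and the repository name is borrowed from the example used later in this article.

```shell
DOCKER="${DOCKER:-docker}"   # assumption: override for a dry run, e.g. DOCKER=echo

# Hypothetical helper: list the images of one repository on a remote daemon.
# $1 = Docker host URL, $2 = repository name
remote_images() {
  "$DOCKER" -H "$1" images "$2"
}
```

For example: `remote_images tcp://dockerhostIP:2375 madhav_tmc/tlogrow`.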

Build and Publish the Docker Image to the Registry from Talend Studio

Talend Studio can also build a Docker image and publish it to a registry, where it can be picked up by Kubernetes or any other container service. In our example, I have set up an AWS ECR registry.

1. Right-click on the Job name and navigate to the Publish option.
2. Select the Export Type Docker Image.
3. Under Docker Options, provide the Docker host and port details as discussed in the previous topics. Give the necessary details of the registry and the Docker image name:

Image Name = Repository Name
Image Tag = Jobname_Version
Username = AccessKeyId (AWS)
Password = Secret (AWS)

4. Once this is done, navigate to AWS ECR. You should be able to search for and find the image.
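One way to double-check the published image outside the AWS console is to pull it with the Docker CLI. The sketch below assumes AWS CLI v2 is installed and configured with credentials that can read the repository. The account ID, region, repository, and tag are placeholders, and the helper names are hypothetical. Note that the Docker CLI logs in to ECR with the user name AWS and a temporary token, rather than the access key and secret entered in Studio.

```shell
DOCKER="${DOCKER:-docker}"   # assumption: override for a dry run
AWS="${AWS:-aws}"

# Hypothetical helper: ECR registry host name.
# $1 = AWS account ID, $2 = region
ecr_registry() {
  printf '%s.dkr.ecr.%s.amazonaws.com\n' "$1" "$2"
}

# Hypothetical helper: full image reference.
# $1 = account ID, $2 = region, $3 = repository (Image Name), $4 = tag (Jobname_Version)
ecr_image_ref() {
  printf '%s/%s:%s\n' "$(ecr_registry "$1" "$2")" "$3" "$4"
}

# Hypothetical helper: log in to ECR, then pull the published image.
# Takes the same four arguments as ecr_image_ref.
ecr_pull() {
  "$AWS" ecr get-login-password --region "$2" |
    "$DOCKER" login --username AWS --password-stdin "$(ecr_registry "$1" "$2")" &&
  "$DOCKER" pull "$(ecr_image_ref "$@")"
}
```

For example, `ecr_pull 123456789012 us-east-1 myrepo myjob_0.1` would pull the image tagged myjob_0.1 from the repository myrepo, where all four values are made up for illustration.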

Running Docker Images vs Shell or Bat scripts

With Talend, we are all accustomed to running Jobs through .sh or .bat scripts. To make running Docker images easier to understand, the topics below cover the equivalent operations, such as passing runtime parameters and mounting volumes.

Passing Run Time Parameters to a Docker Image

To run a Docker image from your Docker repository (a Talend Job built as a Docker image):

1. List all the Docker images by running the command docker images.

2. In this example, we run the image madhav_tmc/tlogrow with the tag latest. The Job uses a tWarn component to print a message, part of which comes from the context variable param.

3. Run the Docker image by passing a value to the context variable param at runtime:

docker run madhav_tmc/tlogrow:latest --context_param param="Hello TalendDocker"

The Job log then shows the value that was passed to the Docker image at runtime.
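The run command above extends naturally to several context variables, in the same way that a Job's .sh or .bat script accepts one --context_param option per variable. The sketch below is a minimal wrapper under that assumption. `run_job_image` is a hypothetical helper, and the --rm flag is added only so that the stopped container is removed after the Job finishes.

```shell
DOCKER="${DOCKER:-docker}"   # assumption: override for a dry run, e.g. DOCKER=echo

# Hypothetical helper: run a Job image, turning every name=value argument
# into its own --context_param option.
# $1 = image reference, remaining arguments = name=value pairs
run_job_image() {
  image=$1
  shift
  for pair in "$@"; do
    # Rotate the positional parameters: drop the pair from the front and
    # append it at the back, preceded by --context_param.
    shift
    set -- "$@" --context_param "$pair"
  done
  "$DOCKER" run --rm "$image" "$@"
}
```

For example, `run_job_image madhav_tmc/tlogrow:latest "param=Hello TalendDocker"` runs the same command as in step 3, with --rm added.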