Talend Open Studio Cookbook

Introduction

This chapter deals mainly with the tMap component, which is usually the main processing component at the heart of a Talend transformation job.

The tMap component

The tMap component has extensive transformation capabilities and has thus become the data integration developer's tool of choice. Its capabilities include the ability to:

  • Add and remove columns
  • Apply transformation rules to one or more columns
  • Filter input and output data
  • Join data from multiple sources into one or many outputs
  • Split source data into multiple outputs

Flexibility

The tMap component is multipurpose and very flexible, so there is often a temptation to do as much as possible in a single tMap component. This isn't recommended, because it can raise the complexity to a level where the code becomes difficult to understand and maintain. For complex transformations, use several tMap components in sequence, each handling one part of the logic, so that the code is more easily understood.

Single line of code

One of the main limitations of tMap is that each output expression is limited to a single line of code. This can be overcome in three ways: by calling code routines that perform the complex logic, by using tMap variables to hold intermediate results, and by using the Java ternary operator to perform conditional logic.
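As a rough illustration of two of these techniques, the sketch below shows a conditional written as a single-line ternary expression and the same kind of decision delegated to a routine-style static method. It is plain Java rather than code taken from a Talend job: the class name CustomerRoutines, the method ageBand, and the age value standing in for an input column are all hypothetical.

```java
// Hypothetical stand-in for a Talend code routine; all names are illustrative only.
class CustomerRoutines {

    // Multi-step logic that would be awkward to squeeze into one expression.
    static String ageBand(Integer age) {
        if (age == null || age < 0) {
            return "UNKNOWN";
        }
        if (age < 18) {
            return "MINOR";
        }
        if (age < 65) {
            return "ADULT";
        }
        return "SENIOR";
    }

    public static void main(String[] args) {
        Integer age = 42; // assumed value, standing in for an input column

        // Conditional logic on one line, using the ternary operator.
        String simple = (age != null && age >= 18) ? "ADULT" : "MINOR";

        // One-line call that hands the complex logic over to the routine.
        String banded = CustomerRoutines.ageBand(age);

        System.out.println(simple + " / " + banded);
    }
}
```

In a tMap output expression, only the right-hand side of each assignment would be typed in; the routine keeps the expression short while the branching lives in ordinary, testable Java.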

All these techniques will be demonstrated in this chapter.

Batch versus real time

The way lookups (used for joining) are loaded can be configured in tMap to enable efficient joining in both batch and real-time modes. The "reload at each row" option for real-time lookups is detailed later in the chapter.
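The difference between the two modes can be pictured with a small conceptual model. The sketch below is plain Java and is not the code Talend generates: the class LookupModes, its methods, and the in-memory map standing in for a lookup table are assumptions made for illustration. A batch-style lookup takes one snapshot before processing begins, while a reload-at-each-row lookup goes back to the source for every main row and therefore sees changes made while the job is running.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Conceptual model only; not Talend-generated code. All names are hypothetical.
class LookupModes {

    // Batch style: copy the lookup data once, before any main row is processed.
    static Map<String, String> loadOnce(Supplier<Map<String, String>> source) {
        return new HashMap<>(source.get());
    }

    // Real-time style: query the source again for each main row.
    static String reloadAtEachRow(Supplier<Map<String, String>> source, String key) {
        return source.get().get(key);
    }

    public static void main(String[] args) {
        // In-memory map standing in for a lookup table in a database.
        Map<String, String> table = new HashMap<>();
        table.put("C1", "Alice");
        Supplier<Map<String, String>> source = () -> table;

        Map<String, String> snapshot = loadOnce(source);

        // The lookup source changes while the job is still running.
        table.put("C2", "Bob");

        // The snapshot misses the new row; the per-row reload finds it.
        System.out.println("load once:          " + snapshot.get("C2"));
        System.out.println("reload at each row: " + reloadAtEachRow(source, "C2"));
    }
}
```

The trade-off is cost: a single load is cheap per row and suits large batch runs, whereas reloading for every row costs one query per row and is usually worth it only when the lookup data must be current.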