InfoSphere DataStage is a data integration (ETL) tool. IBM acquired it in 2005 as part of its purchase of Ascential Software, and it is now part of the IBM Information Server platform. It uses a client/server design: jobs are created and administered from a Windows client against a central repository on a server. InfoSphere DataStage can integrate data on demand across many high-volume data sources and target applications using a high-performance parallel framework. It also supports extended metadata management and enterprise connectivity.
It has three levels of parallelism:
- Pipeline Parallelism
- Data Parallelism
- Component Parallelism
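
To make the first two ideas concrete, here is a rough sketch in plain Python. It is not DataStage code and does not use any DataStage API. The function names (`extract`, `transform`, `load`, `partition`, `run_partitioned`) are made up for illustration. Generators only imitate the pipeline hand-off, where each row moves downstream as soon as it is ready, whereas a real parallel engine runs the stages as separate concurrent processes. Data parallelism is imitated by splitting the rows into partitions and running the same pipeline on each partition in a thread pool. Component parallelism is not shown.

```python
from concurrent.futures import ThreadPoolExecutor


def extract(rows):
    # Stage 1: hand rows downstream one at a time instead of all at once.
    for row in rows:
        yield row


def transform(rows):
    # Stage 2: starts working on a row as soon as stage 1 yields it.
    for row in rows:
        yield row * 2


def load(rows):
    # Stage 3: collect the transformed rows (stand-in for a target write).
    return list(rows)


def run_pipeline(rows):
    # Pipeline idea: rows stream through extract -> transform -> load.
    return load(transform(extract(rows)))


def partition(rows, n):
    # Round-robin split of the input into n partitions (illustrative only).
    return [rows[i::n] for i in range(n)]


def run_partitioned(rows, n_partitions=2):
    # Data parallelism idea: the same pipeline runs on every partition.
    parts = partition(rows, n_partitions)
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        results = list(pool.map(run_pipeline, parts))
    # Merge the partition outputs back into one sorted result.
    return sorted(x for part in results for x in part)


if __name__ == "__main__":
    print(run_partitioned([1, 2, 3, 4, 5], n_partitions=2))
```

In this sketch the number of partitions is just a parameter, so the same job logic runs unchanged whether the data is split two ways or twenty ways.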
I have found a good tutorial on IBM DataStage. Read it at: http://etl-tools.info/en/datastage/datastage_tutorial.htm





