DataStage continued to enhance it’s capabilities to manage data quality and data integration solutions. DataStage 8.0 introduced many new features to make development and maintenance of project comfortable. These enhancements include data quality management, connectivity methods, implementation of slowly changing dimension.
IBM Information Server, consist of the following components, WebSphere DataStage and Quality Stage, WebSphere Information Analyzer, Federation Server, and Business Glossary, common administration, logging and reporting. These components are designed to provide much more efficient ways to manage metadata and develop ETL solutions. Components can be deployed based on client need.
With the Hawk release, DataStage has created common administration, logging and reporting and this will improve metadata reporting available, compared to prior releases.
Data Quality is highly critical for data integration projects. With earlier releases such as MetaStage, Quality Stages used to add lot of additional overhead in installation, training and implementation. With new release of QualityStage, integration projects using standardization, matching and survivorship to improve quality will be more accessible and easier to use. Also, developer will be able to design jobs with data transformation stages and data quality stages in the same session. Designer is called DataStage and QualityStage Designer in current release, based on it’s usage.
Managing connectivity information and propagating connectivity information between different environments, has added additional development and maintenance overhead. These new objects help in connecting to remote database connectivity easier. Earlier releases, development team may need to spend considerable time in resolving connectivity issues with the database. DataStage 8 will help the team by providing frictionless connectivity and connectivity objects, ensure reusability and reduces risk of data issues due to wrong connectivity information.
It’s always important to get different options to access data for lookup and accessing over a range is always better option when data range is available for improving performance. Range lookup has been merged into the existing lookup form and are easy to use.
Data Warehouse developers need to develop complex jobs to implement Slowly Changing Dimension. With this stage introduced in DataStage 8, following enhancements can be done easily, surrogate key generation, there is the slowly changing dimension stage and updates passed to in memory lookups. That’s it for me with DBMS generated keys, I’m only doing the keys in the ETL job from now on! DataStage server jobs have the hash file lookup where you can read and write to it at the same time, parallel jobs will have the updateable lookup.
This new feature allows developers to open any job, which is already opened by other developers. This copy of developer will be READ ONLY. This helps the developers in reducing wait time, when job is currently LOCKED by other user. New enhancements also allows you to unlock the job associated with a disconnected session from the web console in an easier way than prior releases.
With this feature an administrator can disconnect sessions and unlock jobs.
This feature reduces the effort spent in synchronizing SQL Select list to the DataStage column list. This will ensure that column mismatches. Adding to this in ODBC Connector, you will be able to complex queries with GUI, which includes adding columns and where clause to the statement.
With this new enhancement, when lot of small parallel jobs gets invocated, this will have less impact on DataStage long running jobs. Connectivity and resource allocation for parallel jobs has improved and load is balanced based on job requirement.
With this new feature, Data Stage has introduced common logging of Data Stage job logs. This helps in searching from Data Stage log. Data Stage has also introduced time based and record based job monitoring.