New Features in Informatica 9.5
1. Data Integration Analyst Option
2. PowerCenter Data Virtualization Edition
3. Data Validation Option
4. Proactive Monitoring for PowerCenter
5. Metadata Manager updates (Advanced)
6. PowerCenter for Hadoop
Informatica 9 empowers line-of-business managers and business analysts to identify bad data and fix it faster. Architecture-wise there are no differences between Informatica 8 and 9, but several new features have been added in PowerCenter 9.
Informatica 9 includes the Informatica Developer and Informatica Analyst client tools.
1. The Informatica Analyst tool is a browser-based tool for business analysts, data stewards, and line-of-business managers. It supports data profiling, specifying and validating rules (scorecards), and monitoring data quality.
Data Integration Analyst Option key features and benefits:
•Deliver projects faster. The Data Integration Analyst Option empowers business analysts to immediately get the information they need to perform self-service data integration tasks on their own.
•Increase productivity of business analysts and IT teams. The Data Integration Analyst Option enables metadata-driven business-IT collaboration and agile data integration that eliminate manual steps and countless iterations.
•Reduce the chance of human error. The Data Integration Analyst Option automatically generates data integration mappings from analyst-defined specifications.
•Improve data governance. The Data Integration Analyst Option enables the business to own the data while IT controls the process and has complete visibility of the data lifecycle.
2. Data virtualization is the process of integrating data from many disparate sources in real time or near real time to support various business requirements. It involves integrating, transforming, and delivering data as a service to support applications and processes, and it builds on federated database technology. Data virtualization can support agile BI development through a) virtual data profiling, b) data quality, c) advanced data caching, d) an ETL integrator tool, and e) BI interfaces. The business analyst or line manager can query the DW/mart directly, but in an integrated way. The IT and business teams can work on the same data integration logic, changing the traditional change-request process, which has a long turnaround time (from the ETL, staging, DW, and metadata teams through to the reporting teams). Once the business and IT teams are satisfied with the logic, the data acquisition logic can be implemented at the ETL layer, thereby enriching the data warehouse.
Hence, DV can bring agility to the organization’s data integration and business intelligence processes by a) basing operational reports on centralized logic, and b) decoupling the BI tools from the data acquisition logic, i.e., allowing all tools to connect through the metadata layer.
c) speeding up the information delivery process by engaging the business analyst and decreasing the number of steps required.
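As a rough illustration of the federation idea (not Informatica's implementation), the sketch below joins two independent "source systems" through a single virtual view using Python's sqlite3; every name here (crm, billing, v_customer_revenue) is made up for the example. Consumers query one view, and no data is copied until the logic is promoted to the ETL layer.

```python
import sqlite3

# Two "source systems" modeled as separate in-memory SQLite databases,
# federated through one connection. All schema names are hypothetical.
con = sqlite3.connect(":memory:")
con.execute("ATTACH DATABASE ':memory:' AS crm")
con.execute("ATTACH DATABASE ':memory:' AS billing")

con.execute("CREATE TABLE crm.customers (id INTEGER, name TEXT)")
con.execute("CREATE TABLE billing.invoices (customer_id INTEGER, amount REAL)")
con.executemany("INSERT INTO crm.customers VALUES (?, ?)",
                [(1, "Acme"), (2, "Globex")])
con.executemany("INSERT INTO billing.invoices VALUES (?, ?)",
                [(1, 250.0), (1, 120.0), (2, 90.0)])

# The "virtual" layer: a temp view spanning both sources. Consumers see
# one integrated object; the underlying data stays where it lives.
con.execute("""
    CREATE TEMP VIEW v_customer_revenue AS
    SELECT c.name, SUM(i.amount) AS revenue
    FROM crm.customers c
    JOIN billing.invoices i ON i.customer_id = c.id
    GROUP BY c.name
""")

rows = list(con.execute("SELECT * FROM v_customer_revenue ORDER BY name"))
print(rows)  # [('Acme', 370.0), ('Globex', 90.0)]
```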
3. Data Validation Option: DVO is a “black box” testing solution that provides automation, repeatability, and auditability for virtually any data testing or reconciliation process.
Data Validation Option Benefits:
•Increased likelihood of project success, lower project risk
•Significant cost savings, faster time to market
•Up to 50% reduction in source-to-target testing effort
•80–90% reduction in regression/upgrade testing effort
•Ability to test all data, not just a small sample
•Ability to test in heterogeneous environments
•No need to write SQL
•Complete Audit Trail of all your testing activities
•No need to acquire additional server technology
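A minimal sketch of what "black box" source-to-target reconciliation means in practice, assuming the rows have already been extracted as tuples; DVO itself generates comparisons like this without hand coding, and the fingerprint helper below is purely illustrative:

```python
import hashlib

def table_fingerprint(rows):
    """Order-independent fingerprint of a row set: hash each row,
    sort the digests, then hash the concatenation."""
    digests = sorted(hashlib.sha256(repr(r).encode()).hexdigest() for r in rows)
    return len(rows), hashlib.sha256("".join(digests).encode()).hexdigest()

# Hypothetical extracts: same data, different physical order.
source = [(1, "Acme", 250.0), (2, "Globex", 90.0)]
target = [(2, "Globex", 90.0), (1, "Acme", 250.0)]

src_count, src_hash = table_fingerprint(source)
tgt_count, tgt_hash = table_fingerprint(target)
assert src_count == tgt_count, "row counts differ"
assert src_hash == tgt_hash, "content differs"
print("source and target reconcile")
```

Comparing counts plus content hashes is a common low-tech reconciliation pattern; a real test would also check per-column rules and heterogeneous sources.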
4. Proactive Monitoring:
- Operations – Improve Uptime:
- Identify issues before workflows fail or lead to erroneous reports
- 20+ prebuilt alerting rules and templates
- New environment metrics (CPU, memory, tablespace) to track and correlate against schedules and jobs
- Governance – Improve Quality:
- Identify development best-practice violations before they hit production
- Hundreds of pre-configured attributes simplify rule customization
- Regulate SQL overrides, enforce commit and cache sizes, and dozens of other documented practices
5. PowerCenter Metadata Manager:
- Metadata reports
- Consolidated metadata catalog
- Personalized data lineage
- Business glossary
- 3rd-party BI integration
- Metadata bookmarks
New in 9.5
•New XConnect for SAP BW 7.0, 7.3
•New XConnect for custom metadata
•Easier to develop and manage: Validation, user-friendly logging
•Revamped all XConnects for BI and data modeling tools
•Focused functionality on Operational (Production) validation scenarios
•Enhanced reports and dashboards with drill down (Jasper)
•Testing across heterogeneous table joins
•Extending data support
- Big Transaction Data: PowerExchange for relational, data warehousing, and application sources
New with 9.1:
•PWX for Greenplum
•PWX for MS Dynamics CRM
•PWX for Oracle CDC – Direct to Log
•Support for Exadata: source/target and repository
New with 9.5:
•PWX for SAP: SAP HANA extract/load via ODBC, SAP BW 7.3
•PWX for SQL Server: Datetime2 support, bulk-load performance
At the Hadoop World conference in New York, Informatica Corp. announced a new big data-ready version of its PowerCenter data integration (DI) offering, the aptly named PowerCenter Big Data Edition. Just as PowerCenter itself has evolved into more-than-just-an-ETL tool — it bundles data profiling, data cleansing, and other features — PowerCenter Big Data Edition might be called more-than-just-a-Hadoop-DI tool: it even includes vanilla PowerCenter. According to John Haddad, director of product marketing with Informatica, the new big data-ready version of PowerCenter bundles features such as data profiling, data cleansing, data parsing, and “sessionization” capabilities.
PowerCenter Big Data Edition also includes a license for conventional PowerCenter, says Haddad; this permits customers to run ETL or DI jobs in the context — i.e., in Hadoop or on one or more large SMP boxes — that’s most appropriate to their requirements or workload characteristics. “It includes the license and [the] capability to run traditional PowerCenter and scale it up on multiple CPUs like an SMP box or on a traditional grid infrastructure,” he confirms. “You’re not going to use Hadoop for all of your workloads; if you’re doing a few gigabytes of structured data on a daily basis and you want it to be processed in near-real time, you would deploy that on a traditional grid infrastructure,” Haddad continues. “If the next day, you have 10 terabytes of data and you need extra processing capacity, you can run that in Hadoop.”
Accommodating Hadoop
Vendors are accommodating Hadoop in different ways. DI vendors, for example, tend to take either of two approaches. Some vendors have gone “all-in” on Hadoop and MapReduce — this approach leverages the Hadoop implementation of MapReduce to perform the processing associated with ETL workloads.
Open source software (OSS) DI specialist Talend is an example of this approach. Other vendors have employed an embrace-and-extend approach. DI offerings from vendors such as Pervasive Software Inc. and Syncsort Inc., for example, run at the node level across a Hadoop cluster; they use their own libraries in place of MapReduce, such that a Pervasive or a Syncsort engine actually does the ETL processing in place of MapReduce on an individual Hadoop node.
Informatica’s approach is closer to Talend’s — with a key difference. In the context of Hadoop, PowerCenter Big Data Edition — like Talend Open Studio for Big Data — uses MapReduce to do its ETL heavy lifting. However, customers also can run non-Hadoop workloads in conventional PowerCenter. (The Big Data version of Talend Open Studio does not include a license for conventional — i.e., non-Hadoop-powered — Talend ETL. If you buy Open Studio for Big Data, you’re using MapReduce to do your ETL processing.) “Hadoop is not for all types of workloads, and we recognize that. In some ways, the Big Data Edition is elastic. Even if you’re doing a big data project, you’re clearly going to want [to involve] some of your more traditional [data] sources, too,” says Haddad, who adds: “Don’t you want one package that can do it all?” Haddad and Informatica aren’t necessarily insisting on an arbitrary distinction.
Some critics allege that although MapReduce-powered ETL is a good fit for certain kinds of workloads, it makes for a comparatively poor general-purpose ETL tool. “[MapReduce] is brute force parallelism. If you can easily segregate data to each node and not have to re-sync it for another operation [by, for example,] broadcasting all the data again — then it’s fast,” said industry veteran Mark Madsen, a principal with information management consultancy Third Nature Inc., in an interview earlier this year. The problem, Madsen drily noted, is that this isn’t always doable. Haddad acknowledges that most of Informatica’s competitors market Hadoop- or big data-ready versions of their DI platforms. On the other hand, he insists, PowerCenter Big Data Edition supports both Hadoop MapReduce and conventional ETL.
For this reason, and in view of the shortcomings of MapReduce-powered ETL for certain kinds of workloads, Informatica’s is the more “flexible” approach, Haddad claims. “As companies move more of their workloads to Hadoop, you don’t want them to go back to the stones and knives of hand coding,” he points out, “so we provide the ability to remove hand coding within Hadoop for ETL and things like that. We also make it possible for [customers] to design and build [DI jobs] once and deploy [them] anywhere: on a traditional grid or on Hadoop.”
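The "brute force parallelism" Madsen describes can be pictured with a toy, single-process map/shuffle/reduce aggregation in Python; in Hadoop, each phase is distributed across nodes and the shuffle crosses the network, which is exactly the re-sync cost he warns about. The data and names are illustrative.

```python
from collections import defaultdict

# Hypothetical input: (customer, invoice amount) records to aggregate.
records = [("Acme", 250.0), ("Globex", 90.0), ("Acme", 120.0)]

def map_phase(record):
    """Map: emit (key, value) pairs; here, pass each record through."""
    customer, amount = record
    yield customer, amount

def reduce_phase(key, values):
    """Reduce: combine all values that share a key."""
    return key, sum(values)

# Shuffle: group mapped pairs by key (done across the network in Hadoop).
shuffled = defaultdict(list)
for record in records:
    for key, value in map_phase(record):
        shuffled[key].append(value)

result = dict(reduce_phase(k, vs) for k, vs in shuffled.items())
print(result)  # {'Acme': 370.0, 'Globex': 90.0}
```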
The PowerCenter Administration Console has been renamed the Informatica Administrator. The Informatica Administrator is now a core service in the Informatica domain, used to configure and manage all Informatica services, security, and other domain objects (such as connections) used by the new services. The Informatica Administrator has a new interface; some of the properties and configuration tasks from the PowerCenter Administration Console have been moved to different locations, and the tool has been expanded to include new services and objects.
Cache Update in Lookup Transformation
You can update the dynamic lookup cache based on the results of an expression: when the expression evaluates to true, the Integration Service adds the row to the cache or updates the cached row.
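A conditional cache update of this kind can be sketched as follows; the field names and the insert/update split are illustrative stand-ins, not Informatica's internals:

```python
# Dynamic lookup cache: lookup key -> cached row. Hypothetical fields.
cache = {}

def apply_row(row, update_expr):
    """Insert unseen keys; update an existing row only when
    update_expr(incoming, cached) is true; otherwise leave it alone."""
    key = row["customer_id"]
    if key not in cache:
        cache[key] = row                      # insert into the cache
    elif update_expr(row, cache[key]):
        cache[key] = row                      # conditional update
    # condition false: cached row unchanged

# Example expression: update only when the incoming row is newer.
newer = lambda incoming, cached: incoming["updated"] > cached["updated"]

apply_row({"customer_id": 1, "name": "Acme", "updated": 1}, newer)       # insert
apply_row({"customer_id": 1, "name": "Acme Corp", "updated": 2}, newer)  # update
apply_row({"customer_id": 1, "name": "Stale", "updated": 0}, newer)      # ignored
print(cache[1]["name"])  # Acme Corp
```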
Database Deadlock Resilience
In previous releases, when the Integration Service encountered a database deadlock during a lookup, the session failed. Effective in 9.0, the session no longer fails: when a deadlock occurs, the Integration Service retries the last statement in the lookup. You can configure the number of retry attempts and the time period between attempts.
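The retry behavior can be pictured with a small Python sketch, where DeadlockError and the flaky lookup are hypothetical stand-ins for the database driver; the retry count and pause mirror the configurable attempts and period described above:

```python
import time

class DeadlockError(Exception):
    """Stand-in for a driver-specific deadlock exception."""

def with_deadlock_retry(run_statement, retries=3, pause=0.01):
    """Re-run the statement up to `retries` extra times, pausing
    between attempts; re-raise if every attempt deadlocks."""
    for attempt in range(retries + 1):
        try:
            return run_statement()
        except DeadlockError:
            if attempt == retries:
                raise
            time.sleep(pause)

# Simulated lookup that deadlocks twice, then succeeds.
attempts = {"n": 0}
def flaky_lookup():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise DeadlockError("deadlock detected")
    return "row"

result = with_deadlock_retry(flaky_lookup)
print(result)  # row (succeeds on the third attempt)
```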
Multiple rows return
Lookups can now be configured as an active transformation to return multiple rows: the Lookup transformation can return all rows that match the lookup condition. A Lookup transformation is active when it can return more than one row for any given input row.
Limit the Session Log
You can limit the size of session logs for real-time sessions, either by time or by file size. You can also limit the number of log files for a session.
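The multiple-match lookup described above (one input row producing several output rows) can be sketched like this, with illustrative data:

```python
from collections import defaultdict

# Hypothetical lookup source: several orders per customer.
orders = [
    {"customer": "Acme",   "order_id": 101},
    {"customer": "Acme",   "order_id": 102},
    {"customer": "Globex", "order_id": 103},
]

# Build the lookup cache keyed on the condition column.
cache = defaultdict(list)
for row in orders:
    cache[row["customer"]].append(row)

def lookup_all(customer):
    """Active lookup: return every matching row, not just the first."""
    return cache.get(customer, [])

matches = lookup_all("Acme")
print([m["order_id"] for m in matches])  # [101, 102]
```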
Auto-commit: We can enable auto-commit for each database connection. Each SQL statement in a query defines a transaction. A commit occurs when the SQL statement completes or the next statement is executed, whichever comes first.
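These statement-level commit semantics can be demonstrated with Python's sqlite3 (used here only as a convenient database, not as Informatica's connector): isolation_level=None puts the connection in auto-commit mode, so a second connection sees each insert immediately, without an explicit commit().

```python
import sqlite3

# A shared in-memory database visible to two connections; the URI name
# "autocommit_demo" is arbitrary.
db = "file:autocommit_demo?mode=memory&cache=shared"
writer = sqlite3.connect(db, uri=True, isolation_level=None)  # auto-commit
reader = sqlite3.connect(db, uri=True)

writer.execute("CREATE TABLE t (x INTEGER)")
writer.execute("INSERT INTO t VALUES (42)")   # committed as it completes

# The reader sees the row at once; with a deferred transaction on the
# writer, it would see nothing until writer.commit() ran.
visible = reader.execute("SELECT x FROM t").fetchall()
print(visible)  # [(42,)]
```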
Passive transformation: We can configure the SQL transformation to run in passive mode instead of active mode. In passive mode, the SQL transformation returns one output row for each input row.
Connection management: Database connections are centralized in the domain. We can create and view database connections in Informatica Administrator, Informatica Developer, or Informatica Analyst, and create, view, edit, and grant permissions on database connections in Informatica Administrator.
Monitoring: We can monitor profile jobs, scorecard jobs, preview jobs, mapping jobs, and SQL Data Services for each Data Integration Service. View the status of each monitored object on the Monitoring tab of Informatica Administrator.
Deployment: We can deploy, enable, and configure deployment units in the Informatica Administrator. Deployment units are created in Informatica Developer and deployed to one or more Data Integration Services.
Model Repository Service: Application service that manages the Model repository. The Model repository is a relational database that stores the metadata for projects created in Informatica Analyst and Informatica Developer. It also stores run-time and configuration information for applications deployed to a Data Integration Service.
Data Integration Service: Application service that processes requests from Informatica Analyst and Informatica Developer to preview or run data profiles and mappings. It also generates data previews for SQL data services and runs SQL queries against the virtual views in an SQL data service. Create and enable a Data Integration Service on the Domain tab of Informatica Administrator.
XML Parser: The XML Parser transformation can validate an XML document against a schema and route invalid XML to an error port. When the XML is not valid, the transformation routes the XML and the error messages to a separate output group that we can connect to a target.
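The routing behavior can be sketched with the standard library's xml.etree; note this only checks well-formedness, standing in for the schema validation the transformation actually performs (full XSD validation would need an external library such as lxml):

```python
import xml.etree.ElementTree as ET

def parse_and_route(documents):
    """Split documents into a normal output group and an 'error port'
    carrying the offending XML plus its error message."""
    valid, errors = [], []
    for doc in documents:
        try:
            valid.append(ET.fromstring(doc))
        except ET.ParseError as exc:
            errors.append((doc, str(exc)))  # route XML and error message
    return valid, errors

docs = [
    "<order><id>1</id></order>",   # well-formed
    "<order><id>2</order>",        # mismatched tag: routed to errors
]
valid, errors = parse_and_route(docs)
print(len(valid), len(errors))  # 1 1
```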
Enforcement of licensing restrictions: PowerCenter enforces licensing restrictions based on the number of CPUs and repositories. Informatica 9 also supports data integration for the cloud as well as on premise: you can integrate data in cloud applications, and you can run Informatica 9 on cloud infrastructure.