Hadoop in the Enterprise: Maximizing Big Data Benefits with MapR and Informatica
Table of Contents
Introduction
Hadoop: A Strategic Data Analytics Platform
Informatica with MapR
A Better Hadoop: Additional Enhancements in MapR's Distribution
Summary
Introduction
The volume, velocity and variety of data are all growing relentlessly. This growth leaves organizations struggling to find the tools, talent and time to get value from data cost-effectively. The need to integrate Big Transaction Data with Big Interaction Data, while leveraging Big Data Processing technologies like Hadoop, is particularly challenging. Informatica offers the industry's leading independent data integration platform that uniquely enables organizations to maximize the return on Big Data and drive top business imperatives. Informatica is also integrated with Hadoop, which is purpose-built for processing Big Data effectively and affordably, and with MapR Technologies' distribution for Hadoop, which improves performance, scalability, reliability and ease of use. This white paper outlines how the combination of Informatica's Data Integration platform and MapR's distribution for Hadoop offers powerful new capabilities for integrating and processing Big Data more efficiently and cost-effectively than ever before.
Informatica with MapR
Informatica's HParser Community Edition (included in the MapR distribution) helps create an easy-to-use integrated data environment (IDE) that enables customers to visually design data parsing transformations for industry-standard formats (e.g. FIX, SWIFT, ACORD, HL7 and EDI), popular document formats (e.g. MS Office and PDF) and complex files (e.g. logs, Omniture, XML and JSON). These transformations can then be executed in parallel in the Hadoop cluster. The performance advantages of MapR, combined with the efficiency of HParser, allow users to perform data parsing and transformations with higher performance and lower hardware costs compared to other options.
PowerExchange for Hadoop makes it easier for non-programmers to move transaction and interaction data between a MapR cluster and other databases and data warehouses, without hand-coding. MapR's Direct Access NFS interface also enables users to leverage Informatica's full range of data sources and transformations with the Hadoop environment.
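To make the parsing step more concrete, the following is a minimal Python sketch of what parsing one industry-standard format (FIX tag=value messages) looks like when the data is reachable through an NFS mount and can be read with ordinary file I/O. It is an illustration only and is not HParser or any Informatica or MapR API. The function names, the one-message-per-line file layout and the example mount path in the comment are all assumptions made for this sketch.

```python
SOH = "\x01"  # standard FIX field delimiter (ASCII 0x01)


def parse_fix_message(raw, delimiter=SOH):
    """Split one FIX tag=value message into a {tag: value} dict.

    Simplification for this sketch: repeated tags (FIX repeating groups)
    overwrite earlier values instead of being grouped.
    """
    fields = {}
    for part in raw.strip().split(delimiter):
        if not part:
            continue  # skip the empty piece after a trailing delimiter
        tag, sep, value = part.partition("=")
        if not sep:
            raise ValueError("malformed field: %r" % part)
        fields[int(tag)] = value
    return fields


def parse_fix_file(path, delimiter=SOH):
    """Parse a file holding one FIX message per line (assumed layout).

    Only standard file I/O is used, so the same call works on a local
    file or on a path exposed through an NFS mount.
    """
    with open(path, "r", encoding="ascii") as handle:
        return [parse_fix_message(line, delimiter)
                for line in handle if line.strip()]


# Hypothetical usage, assuming the cluster is NFS-mounted at this path:
# orders = parse_fix_file("/mapr/my.cluster.com/data/orders.fix")
```

In a cluster setting, a per-message function like `parse_fix_message` is the piece that would run in parallel across input splits. The visual design and execution described above are handled by HParser itself and are not shown here.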
Summary
By using Informatica with MapR's distribution for Hadoop, organizations are now able to achieve high-performance data integration, replication and messaging. Together the two companies are pushing the limits of high-performance networks to move many terabytes per hour of transaction, interaction and streaming data into the MapR cluster, as well as to parse and process a broad range of structured and unstructured data natively in Hadoop, all without coding. The combination also gives organizations a more affordable way to archive data from applications, data warehouses and/or legacy systems to Hadoop's lower-cost storage.
Together, Informatica and MapR provide cost-effective, analytics-ready data storage and processing with enterprise-class high availability and business continuity. To learn more, please visit either company on the Web at www.mapr.com or www.informatica.com, or call 855-NOW-MAPR (855-669-6277).
MapR Technologies is the creator of the industry's fastest, most dependable and easiest-to-use distribution for Apache Hadoop. MapR Technologies is dedicated to advancing the Hadoop platform and ecosystem to enable more businesses to harness the power of big data analytics for competitive advantage. For more information, please visit www.mapr.com.
2012. MapR. Confidential. 05.12