Progress DataDirect Hadoop Apache Hive ODBC Driver
Organizations are realizing they can mine and analyze vast amounts of data and transform new insights into action. But how can they connect to their data warehouses and collect large volumes of data effectively and efficiently?
In many environments today, the Apache Hadoop® file system is used in the data warehouse and holds some of the richest information available. Processing Hadoop data into meaningful forms can yield intelligence that complements traditional analytics.
The challenging step is connecting existing SQL-based business intelligence and data analytics tools to Hadoop data. Without such connectivity, companies' analysts and decision makers are locked out of the insights contained in Hadoop.
The DataDirect driver for Apache Hive is the only fully compliant ODBC driver that supports multiple Hadoop distributions out of the box. With the latest release of DataDirect Connect, the ODBC driver also supports the latest iteration of Apache Hive (“Hive2”), which improves support for concurrency and authentication. Those enhancements are also available in key Apache Hadoop distributions such as Cloudera CDH 4.1. DataDirect Connect XE for ODBC 7.1 includes full Hive2 support and Cloudera CDH 4.1 certification. The improved concurrency allows the ODBC driver to manage concurrent connections to the Hive2 server, enabling better scalability and high availability for applications that require highly efficient access to Hive databases. In addition, improved authentication to the Hive2 server allows for plain-text passwords and better overall control of authorized access and data security.
At a glance, the Progress DataDirect Hadoop Apache Hive ODBC driver delivers:
- High performance and throughput with support for Hive2 and concurrent connections
- Improved authentication for increased data security
- Cloudera CDH 4.1 certification plus Cloudera Hive2 support
- In addition to Cloudera, support for Apache, MapR, and Amazon EMR Hadoop distributions
- Windows, Red Hat, Solaris, SUSE, AIX, and HP-UX platform support
- SELECT, INSERT [OVERWRITE] SELECT, LOAD, and CREATE/DROP Hive grammar support
- Full driver metadata
- Support for parameter arrays, processing the arrays as a series of executions, one execution for each row in the array
- Support for standard SQL functionality, including Create Index, Create Table, Create View, Drop Index, Drop Table, Drop View
- Support for a wide range of data types: Int, TinyInt, SmallInt, BigInt, String, Double, Binary, Boolean, Float, and Timestamp
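The parameter-array behavior listed above (one execution for each row in the array) can be sketched conceptually as follows. This is not the driver's implementation: `execute_parameter_array` is a hypothetical helper, and an in-memory SQLite database stands in for Hive so the example runs without a cluster. The `page_views` table and its contents are invented for illustration.

```python
import sqlite3
from typing import Any, Callable, Iterable, List, Sequence


def execute_parameter_array(
    execute: Callable[[str, Sequence[Any]], Any],
    sql: str,
    parameter_rows: Iterable[Sequence[Any]],
) -> List[Any]:
    """Run one statement once per parameter row and collect each result."""
    results = []
    for row in parameter_rows:
        # Each row of the array becomes its own execution of the statement.
        results.append(execute(sql, row))
    return results


# Stand-in database (assumption): SQLite replaces Hive for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (region TEXT, views INTEGER)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?)",
    [("east", 10), ("west", 7), ("east", 5)],
)


def run(sql: str, params: Sequence[Any]) -> list:
    """Execute a single parameterized statement and fetch all rows."""
    return conn.execute(sql, params).fetchall()


totals = execute_parameter_array(
    run,
    "SELECT SUM(views) FROM page_views WHERE region = ?",
    [("east",), ("west",)],
)
print(totals)
```

Because each row triggers a separate execution, the number of round trips grows with the size of the array, which is worth keeping in mind when sizing batches.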
Supported Apache Hive Versions and Distribution Versions
| Distribution | Apache Hive Version / Distribution Version |
| Amazon Elastic MapReduce (Amazon EMR) | Hive 0.9.x (pending) |
| Apache Hadoop Hive | |
| Cloudera’s Distribution Including Apache Hadoop (CDH) | CDH3 update 4 |
| MapR Distribution for Apache Hadoop | |