skip to main content
Quick Start: Progress DataDirect for ODBC for Apache Hive Wire Protocol Driver for Windows : Tuning for Performance
  

Try DataDirect Drivers Now

Tuning for Performance

The connection options described in this section directly affect the performance of your driver. To tune for performance, configure your driver according to the recommended settings and your environment.

Array Fetch Size

Purpose:
Determines the number of cells the driver retrieves from a server for a fetch. When executing a fetch, the driver divides the value specified by the number columns in a particular table to determine the number of rows to retrieve.
Performance Impact:
To improve throughput, increase the value of Array Fetch Size. By increasing the value of Array Fetch Size, you increase the number of rows the driver will retrieve from the server for a fetch. In turn, increasing the number of rows that the driver can retrieve reduces the number, and expense, of network round trips. Note that improved throughput does come at the expense of increased demands on memory and slower response time. Furthermore, if the fetch size exceeds the available buffer memory of the server, an out of memory error is returned when attempting to execute a fetch. If you receive this error, decrease the value specified until fetches are successfully executed.
Recommended Settings:
Tune this setting to reflect the typical fetch size of your application. Smaller fetch sizes can improve the initial response time of the query. Larger fetch sizes improve overall fetch times at the cost of additional memory.
GUI Tab:
Advanced

Batch Mechanism

Purpose:
Determines the mechanism that is used to execute batch operations.
Performance Impact:
Unlike the native batch mechanism, the multi-row insert mechanism only returns the total number of update counts for batch inserts. Therefore, setting BatchMechanism to MultiRowInsert offers substantial performance gains when performing batch inserts.
Recommended Settings:
If your application does not require individual update counts for each statement or parameter set in the batch, set to 2 (MultiRowInsert) for improved performance when executing batch inserts.
GUI Tab:
Advanced

Catalog Mode

Purpose:
Determines whether the driver uses native catalog functions to retrieve information returned by the SQLTables, SQLColumns, and SQLStatistics catalog functions.
Performance Impact:
Apache Hive’s native catalog functions return incorrect information in certain scenarios. To address this issue, by default, the driver uses a combination of driver-discovered information and native functions to retrieve more accurate catalog information than native functions alone. While using driver-discovered information improves accuracy, it does so at an expense to performance. If accurate catalog information is not required, you can improve performance by setting Catalog Mode connection option to 1 (Native).
Recommended Settings:
If accurate catalog information is not required, set to 1 (Native) for the best performance.
If your application requires accurate catalog information, set to 0 (Mixed) for the optimal balance of performance and accuracy.
GUI Tab:
Advanced

Default Buffer Size for Long/LOB Columns (in Kb)

Purpose:
The maximum length of data (in KB) the driver can fetch from long columns in a single round trip and the maximum length of data that the driver can send using the SQL_DATA_AT_EXEC parameter.
Performance Impact:
To improve performance when your application fetches images, pictures, or long text or binary data, a buffer size can be set to accommodate the maximum size of the data. The buffer size should only be large enough to accommodate the maximum amount of data retrieved; otherwise, performance is reduced by transferring large amounts of data into an oversized buffer. If your application retrieves more than 1 MB of data, the buffer size should be increased accordingly.
Recommended Settings:
Set DefaultLongDataBuffLen to a size no larger than the maximum size of data retrieved.
GUI Tab:
Advanced

Enable Cookie Authentication

Purpose:
Determines whether the driver attempts to use session cookie based authentication for requests to an HTTP endpoint after the initial authentication to the server.
Performance Impact:
Cookie based authentication improves response time by eliminating the need to re-authenticate with the server for each request.
Recommended Settings:
If your environment is configured to use session cookies, set to 1 (Enabled).
GUI Tab:
General

Encryption Method

Purpose:
Determines the method the driver uses to encrypt data sent between the driver and the database server.
Performance Impact:
Data encryption may adversely affect performance because of the additional overhead (mainly CPU usage) that is required to encrypt and decrypt data.
Recommended Settings:
If data encryption is not required, set to 0 (No Encryption) for improved performance.
GUI Tab:
Security

String Describe Type

Purpose:
Specifies whether all string columns are described as SQL_WVARCHAR. This connection option affects SQL_Columns, SQLDescribeCol, SQLColAttributes, etc. It does not affect SQLGetTypeInfo.
Performance Impact:
To obtain data from String columns with the getClob() method, the StringDescribeType connection property must be set to -10 (SQL_WLONGVARCHAR). (Otherwise, calling getClob() results in an "unsupported data conversion" exception.) When StringDescribeType is set to -10 (SQL_WVARCHAR), the driver not only maps String to Longvarchar but also allocates more space to cache the long data. Because more space is allocated for the long data, your application will incur a performance penalty.
Recommended Settings:
If your application does not need to fetch data as SQL_WLONGVARCHAR, set to -9 (SQL_WVARCHAR) for improved performance.
GUI Tab:
Advanced