
Apache Hadoop Hive connection parameters (on-premise)

The following tables describe parameters available on the tabs of an Apache Hadoop Hive On-Premise Data Source dialog:
* General tab
* Security tab
* OData tab
* Advanced tab

General tab

Required fields are marked with an asterisk in the dialog.
[Screenshot: General tab of the Apache Hadoop Hive data source setup dialog (on-premise)]
Table 7. General tab connection parameters for Apache Hadoop Hive
Field
Description
Data Source Name
A unique name for this Data Source definition.
Note: Names can contain only alphanumeric characters and underscores.
Description
A description of this set of connection parameters.
User Id
The User Id for the Apache Hive account used to establish the connection to the Apache Hive server.
Password
A password for the Apache Hive account that is used to establish the connection to your Apache Hive server. A password is required if user ID/password authentication is enabled on your database. Contact your system administrator to obtain your password.
Note: By default, the password is encrypted.
By default, the characters you type in the Password field are not shown. To display the password in clear text, click the eye button. Click the button again to conceal the password.
Server Name
Specifies either the server name (if your network supports named servers) or the IP address of the primary Apache Hive server machine, for example, MyHiveServer or 122.23.15.12.
Port Number
The port number of the Apache Hive server to connect to.
Database
The name of the database that is running on the database server.
Connector ID
The unique identifier of the DataDirect Cloud On-Premise Connector that is to be used to access the on-premise data source. Click the down arrow for the field and select the Connector that you want to use. The identifier can be a descriptive name, the name of the machine where the Connector is installed, or the Connector ID for the Connector.
If you have not installed an On-Premise Connector, and no Connectors have been shared with you, this field and drop-down list are empty.
If you own multiple Connectors that have the same name, for example, Production, an identifier is appended to each Connector, for example, Production_dup0 and Production_dup1. If the Connectors in the drop-down list were shared with you, the owner's name is appended, for example, Production(owner1) and Production(owner2).

Security tab

[Screenshot: Security tab of the Apache Hadoop Hive data source setup dialog (on-premise)]
Table 8. Security tab connection parameters for Apache Hadoop Hive On-Premise
Field
Description
Crypto Protocol Version
Specifies a comma-separated list of the protocol versions that can be used in creating an SSL connection to the Data Source. If the specified protocol is not supported by the database server, the connection fails and the connectivity service returns an error.
Valid Values:
cryptographic_protocol [[, cryptographic_protocol ]...]
where:
cryptographic_protocol
is one of the following cryptographic protocols:
TLSv1 | TLSv1.1 | TLSv1.2
The client must send the highest version that it supports in the client hello.
Note: Good security practices recommend using TLSv1.2 if your data source supports that protocol version, due to known vulnerabilities in the earlier protocols.
Example
Your security environment specifies that you can use TLSv1.1 and TLSv1.2. When you enter the following values, the connectivity service sends TLSv1.2 to the server first:
TLSv1.1,TLSv1.2
Default: TLSv1, TLSv1.1, TLSv1.2
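The value format for Crypto Protocol Version can be illustrated with a short sketch. The Python below is not part of DataDirect Cloud; the function names `parse_crypto_protocols` and `highest_protocol` are hypothetical, and the code only mirrors the documented rules: the value is a comma-separated list drawn from TLSv1, TLSv1.1, and TLSv1.2, and the highest listed version is the one offered to the server first.

```python
# Hypothetical helpers; they illustrate the documented value format only.
SUPPORTED = ["TLSv1", "TLSv1.1", "TLSv1.2"]  # ordered lowest to highest


def parse_crypto_protocols(value):
    """Split a comma-separated protocol list and validate each entry."""
    protocols = [p.strip() for p in value.split(",") if p.strip()]
    if not protocols:
        raise ValueError("at least one protocol is required")
    for protocol in protocols:
        if protocol not in SUPPORTED:
            raise ValueError(f"unsupported protocol: {protocol}")
    return protocols


def highest_protocol(protocols):
    """Return the highest protocol version present in the list."""
    return max(protocols, key=SUPPORTED.index)
```

With the example value from the text, `highest_protocol(parse_crypto_protocols("TLSv1.1,TLSv1.2"))` yields `TLSv1.2`.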
Encryption Method
Determines whether data is encrypted and decrypted when transmitted over the network between the DataDirect Cloud connectivity service and the on-premise database server.
Valid Values:
noEncryption | SSL
If set to noEncryption, data is not encrypted or decrypted.
If set to SSL, data is encrypted using SSL. If the database server does not support SSL, the connection fails and the DataDirect Cloud connectivity service throws an exception.
* Connection hangs can occur when the DataDirect Cloud connectivity service is configured for SSL and the database server does not support SSL. You might want to set a login timeout using the Login Timeout property to avoid problems when connecting to a server that does not support SSL.
* When SSL is enabled, the following properties also apply: Host Name In Certificate, Validate Server Certificate, and Crypto Protocol Version.
Default: noEncryption
Host Name In Certificate
A host name that is validated against the information stored in an SSL certificate when validation is enabled (Validate Server Certificate=true). This option provides additional security against man-in-the-middle (MITM) attacks by ensuring that the server the connectivity service is connecting to is the server that was requested. This option is valid only when SSL encryption is enabled.
Valid Values:
host_name | #SERVERNAME#
where host_name is a valid host name.
If host_name is specified, the DataDirect Cloud connectivity service compares the specified host name to the DNSName value of the SubjectAlternativeName in the certificate. If a DNSName value does not exist in the SubjectAlternativeName or if the certificate does not have a SubjectAlternativeName, the DataDirect Cloud connectivity service compares the host name with the Common Name (CN) part of the certificate’s Subject name. If the values do not match, the connection fails and the DataDirect Cloud connectivity service throws an exception.
If #SERVERNAME# is specified, the DataDirect Cloud connectivity service compares the server name that is specified in the connection URL or data source of the connection to the DNSName value of the SubjectAlternativeName in the certificate. If a DNSName value does not exist in the SubjectAlternativeName or if the certificate does not have a SubjectAlternativeName, the DataDirect Cloud connectivity service compares the host name to the CN part of the certificate’s Subject name. If the values do not match, the connection fails and the DataDirect Cloud connectivity service throws an exception. If multiple CN parts are present, the DataDirect Cloud connectivity service validates the host name against each CN part. If any one validation succeeds, a connection is established.
Default: Empty string
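The matching order described for Host Name In Certificate can be sketched as follows. This is an illustration of the documented order only, not the connectivity service's implementation: the function name and its arguments are hypothetical, the comparison is assumed to be case-insensitive, and wildcard certificates are not handled.

```python
# Illustrative sketch only; names and case-insensitive matching are assumptions.
def host_name_matches(expected_host, san_dns_names, subject_cns):
    """Check DNSName entries first, then fall back to the CN parts.

    expected_host  -- the Host Name In Certificate value, or the server
                      name when #SERVERNAME# is specified
    san_dns_names  -- DNSName values from SubjectAlternativeName (may be empty)
    subject_cns    -- Common Name (CN) parts of the certificate's Subject
    """
    expected = expected_host.lower()
    if san_dns_names:
        # A DNSName value exists, so the CN parts are not consulted.
        return any(expected == name.lower() for name in san_dns_names)
    # No DNSName available: any one matching CN part is sufficient.
    return any(expected == cn.lower() for cn in subject_cns)
```

A result of `False` corresponds to the case in which the connection fails and an exception is thrown.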
Validate Server Certificate
Determines whether the DataDirect Cloud connectivity service validates the certificate that is sent by the database server when SSL encryption is enabled (Encryption Method=SSL). When using SSL server authentication, any certificate that is sent by the server must be issued by a trusted Certificate Authority (CA). Allowing the DataDirect Cloud connectivity service to trust any certificate that is returned from the server even if the issuer is not a trusted CA is useful in test environments because it eliminates the need to specify truststore information on each client in the test environment.
Valid Values:
true | false
If the check box is selected (true), the DataDirect Cloud connectivity service validates the certificate that is sent by the database server. Any certificate from the server must be issued by a trusted CA in the truststore file. If the Host Name In Certificate parameter is specified, the DataDirect Cloud connectivity service also validates the certificate using a host name. The Host Name In Certificate parameter is optional and provides additional security against man-in-the-middle (MITM) attacks by ensuring that the server the DataDirect Cloud connectivity service is connecting to is the server that was requested.
If the check box is not selected (false), the DataDirect Cloud connectivity service does not validate the certificate that is sent by the database server. The DataDirect Cloud connectivity service ignores any Java system properties.
Default: false

OData tab

The following table describes the controls on the OData tab. For information on using the Configure Schema editor, see Enabling OData and working with Data Source groups. For information on formulating OData requests, see "Formulating queries" under Querying with OData.
Required fields are marked with an asterisk in the dialog.
[Screenshot: OData tab]
Table 9. OData tab connection parameters for Apache Hadoop Hive On-Premise
Field
Description
Access URI
Specifies the base URI for the OData feed to access your DataDirect Cloud data source, for example, https://service.datadirectcloud.com/api/odata. You can copy the URI and paste it into your application's OData configuration.
The URI contains the case-insensitive name of the data source to connect to, and the query that you want to execute. This URI is the OData Service Root URI for the OData feed. The Service Document for the data source is returned by issuing a GET request to the data source's service root.
The OData Service Document returns the names of the entities exposed by the Data Source OData service. To get details such as the properties of the entities exposed, the data types for those properties and the relationships between entities, the Service Metadata Document can be fetched by adding /$metadata to the service root URI.
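As a small illustration of the last point, the Service Metadata Document URI is formed by appending /$metadata to the service root. The Python below is a sketch; the function name and the data source name MyHiveSource in the usage note are hypothetical and stand in for your own service root URI.

```python
# Hypothetical helper showing how the metadata URI is derived.
def metadata_url(service_root):
    """Append /$metadata to an OData service root URI."""
    # Strip any trailing slash so the result has exactly one separator.
    return service_root.rstrip("/") + "/$metadata"
```

For example, `metadata_url("https://service.datadirectcloud.com/api/odata/MyHiveSource")` returns the same URI followed by `/$metadata`. Issuing a GET request to the unmodified service root returns the Service Document instead.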
Schema Map
Enables OData support. If a schema map is not defined, the OData API cannot be used to access the data store using this Data Source definition. Use the Configure Schema editor to select the tables to expose through OData.
See Using the Configure Schema editor for more information.
Data Source Caching
Specifies whether the connection to the backend data source is cached in a session associated with the data source. Caching the backend connection improves performance when multiple OData queries are submitted to the same data source because the connection does not need to be created for every query.
Caching the backend connection can interfere with configuring a data source for OData. If you change any of the DataDirect Cloud data source connection parameters, the changes do not take effect while the cached connection is in use, because that connection was established using the old data source definition. The session that caches the backend connection is discarded if there is no activity to the data source for approximately 5 minutes.
When you configure a data source for OData, it is recommended that the OData session caching be disabled. Once you are satisfied with the OData configuration for the data source, enable the parameter to get the performance improvement provided by caching the connection to the backend data source.
Valid Values:
When set to 1, session caching is enabled. This provides better performance for production.
When set to 0, session caching is disabled. Use this value when you are configuring the data source.
Default: 1
Page Size
Determines the number of entities returned on each page for paging controlled on the server side. On the client side, requests can use the $top and $skip parameters to control paging. In most cases, server-side paging works well for large data sets. Client-side paging works best with smaller data sets, where it is not as expensive to fetch subsequent pages.
Valid Values: 0 | n
where n is an integer from 1 to 10000.
When set to 0, the server default of 2000 is used.
Default: 0
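The two paging styles can be sketched in a few lines of Python. This is a minimal illustration, not DataDirect Cloud code: the function names are hypothetical, the first function restates the Page Size rules above, and the second builds the $top and $skip query string that a client would append to its own request URI.

```python
from urllib.parse import urlencode

SERVER_DEFAULT_PAGE_SIZE = 2000  # used when Page Size is 0, per the text above


def effective_page_size(page_size):
    """Return the server-side page size implied by a Page Size setting."""
    if page_size == 0:
        return SERVER_DEFAULT_PAGE_SIZE
    if 1 <= page_size <= 10000:
        return page_size
    raise ValueError("Page Size must be 0 or an integer from 1 to 10000")


def client_page_query(page_index, rows_per_page):
    """Build the $top/$skip query string for a zero-based page index."""
    params = {"$top": rows_per_page, "$skip": page_index * rows_per_page}
    # Keep the "$" literal so the OData system query options stay readable.
    return urlencode(params, safe="$")
```

For the third page of 50 rows, `client_page_query(2, 50)` produces `$top=50&$skip=100`.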
Refresh Result
Controls what happens when you fetch the first page of a cached result when using client-side paging. The $skip parameter must be omitted or set to 0. You can use the cached copy of that first page, or you can re-execute the query to get a new result, discarding the previously cached result. Re-executing the query is useful when the data being fetched may change between two requests for the first page. Using the cached result is useful if you are paging back and forth through results that are not expected to change.
Valid Values:
When set to 0, the OData service caches the first page of results.
When set to 1, the OData service re-executes the query.
Default: 1
Inline Count Mode
Specifies how the connectivity service satisfies requests that include the $inlinecount parameter when it is set to allpages. These requests require the connectivity service to include the total number of entities that are defined by the OData query request. The count must be included in the first page in server-driven paging and must be included in every page when using client-driven paging.
The optimal setting depends on the data store and the size of results. The OData service can run a separate query using the count(*) aggregate to get the count, before running the query used to generate the entities. In very large results, this approach can often lead to the first page being returned faster. Alternatively, the OData service can fetch the entire result before returning the first page. This approach works well for small results and for data stores that cannot optimize the count(*) aggregate; however, it may have a longer initial response time for the first page if the result is large.
Valid Values:
When set to 1, the connectivity service runs a separate count(*) aggregate query to get the count of entities before executing the query to return results. In very large results, this approach can often lead to the first page being returned faster.
When set to 2, the connectivity service fetches all entities before returning the first page. For small results, this approach is always faster. However, the initial response time for the first page may be longer if the result is large.
Default: 1
Top Mode
Indicates how requests typically use $top and $skip for client side pagination, allowing the service to better anticipate how to process queries.
Valid Values:
Set to 0 when the application generally uses $top to limit the size of the result and rarely attempts to get additional entities by combining $top and $skip.
Set to 1 when the application uses $top as part of client-driven paging and generally combines $top and $skip to page through the result.
Default: 0
OData Read Only
Controls whether write operations can be performed on the OData service. Write operations generate a 405 Method Not Allowed response if this option is enabled.
Existing OData-enabled data sources are read only (write operations are disabled). To enable write operations for an existing OData-enabled data source, clear the OData Read Only option on the OData tab. Then, on the Data Sources tab, regenerate the OData model for the data source by clicking the OData model icon.
Valid Values:
true | false
When the check box is selected (set to true), OData access is restricted to read-only mode.
When the check box is not selected (set to false), write operations can be performed on the OData service.
Default: false
String Max Length
Controls the maximum length reported by the connectivity service for Apache Hive String columns in OData metadata documents. By default, the Apache Hive data store reports a maximum length of 2147483647 for the String data type. The connectivity service excludes columns from the OData model that exceed a maximum length of 32768, so this option sets the reported maximum length to 32768 by default, which allows String columns to be included in the model.
If the value specified is larger than 32768, String columns are excluded from the model.
Note: If the value specified is 32768 or smaller, there may be issues with some OData applications as the connectivity service may still return values from String columns that exceed the reported maximum length.
Valid Values:
An integer greater than 0
Default: 32768

Advanced tab

[Screenshot: Advanced tab of the Apache Hadoop Hive data source setup dialog (on-premise)]
Table 10. Advanced tab connection parameters for Apache Hadoop Hive
Field
Description
Initialization String
A semicolon-delimited set of commands to be executed after the DataDirect Cloud connectivity service has established the connection with Apache Hive and performed all initialization for it. If the execution of a SQL command fails, the connection attempt also fails and DataDirect Cloud returns an error indicating which SQL commands failed.
Default: empty string
Login Timeout
The amount of time, in seconds, that the DataDirect Cloud connectivity service waits for a connection to be established before timing out the connection request.
Valid Values:
0 | x
where x is a positive integer that represents a number of seconds.
If set to 0, the connectivity service does not time out a connection request.
If set to x, the connectivity service waits for the specified number of seconds before returning control to the application and throwing a timeout exception.
Default: 30
Max Pooled Statements
The maximum number of prepared statements to cache for this connection. If the value of this property is set to 20, the DataDirect Cloud connectivity service caches the last 20 prepared statements that are created by the application.
Valid Values:
0 | x
When set to 0, no prepared statements are cached.
When set to x, the connectivity service caches the specified number of prepared statements.
Default: 0
Query Timeout
The number of seconds for the default query timeout for all statements that are created by a connection.
Valid Values:
-1 | 0 | x
If set to -1, the query timeout functionality is disabled. The DataDirect Cloud connectivity service silently ignores calls to the Statement.setQueryTimeout() method.
If set to 0, the default query timeout is infinite (the query does not time out).
If set to x, the DataDirect Cloud connectivity service uses the value as the default timeout for any statement that is created by the connection. To override the default timeout value set by this connection option, call the Statement.setQueryTimeout() method to set a timeout value for a particular statement.
* If x is specified, this property is ignored for HiveServer1 connections.
Default: 0
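The interaction between the Query Timeout setting and a per-statement override can be sketched as below. This Python is illustrative only: the function name is hypothetical, `None` is used here to mean "no timeout", and the HiveServer1 exception noted above is not modeled. It also assumes that a statement-level value of 0 means no limit, as it does for the JDBC Statement.setQueryTimeout() method.

```python
# Illustrative sketch; the function name and the use of None are assumptions.
def effective_timeout(connection_default, statement_timeout=None):
    """Return the timeout in seconds that applies, or None for no timeout."""
    if connection_default < -1:
        raise ValueError("Query Timeout must be -1, 0, or a positive integer")
    if connection_default == -1:
        # Timeout functionality is disabled; statement-level calls are ignored.
        return None
    chosen = connection_default if statement_timeout is None else statement_timeout
    # A value of 0 means the query does not time out.
    return None if chosen == 0 else chosen
```

For instance, with a connection default of 60, a statement that sets its own timeout of 5 times out after 5 seconds, while a statement that sets nothing uses 60.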
Extended Options
Specifies a semicolon-separated list of connection options and their values. Use this configuration option to set the value of undocumented connection options that are provided by Progress DataDirect technical support. You can include any valid connection option in the Extended Options string, for example:
Database=Server1;UndocumentedOption1=value[;UndocumentedOption2=value;]
If the Extended Options string contains option values that are also set in the setup dialog, the values of the options specified in the Extended Options string take precedence.
Valid Values: string
Default: empty string
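The precedence rule for Extended Options can be illustrated with a short sketch. The Python below is not DataDirect Cloud code: the function names and the `LoginTimeout` key in the usage note are hypothetical, option names are assumed to match exactly (including case), and the parsing simply follows the name=value;name=value format shown in the example above.

```python
# Hypothetical helpers; exact-match option names are an assumption.
def parse_extended_options(text):
    """Parse a semicolon-separated list of name=value pairs into a dict."""
    options = {}
    for part in text.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        if "=" not in part:
            raise ValueError(f"expected name=value, got: {part}")
        name, value = part.split("=", 1)
        options[name.strip()] = value.strip()
    return options


def merge_options(dialog_settings, extended_text):
    """Combine dialog settings with Extended Options; the latter win."""
    merged = dict(dialog_settings)
    merged.update(parse_extended_options(extended_text))
    return merged
```

Calling `merge_options({"Database": "default", "LoginTimeout": "30"}, "Database=Server1;UndocumentedOption1=value;")` keeps `LoginTimeout` from the dialog but takes `Database=Server1` from the Extended Options string.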
See the steps for:
Creating a Data Source definition