Quick Start: Progress DataDirect® for ODBC for
Apache Spark SQLTM Wire Protocol Driver on
UNIX and Linux (32-bit and 64-bit)
The following basic information enables you to connect with and test your driver immediately after installation. For installation instructions, see the Progress DataDirect for ODBC Drivers Installation Guide. This Quick Start covers the following topics:
ODBCHOME in the following sections refers to your installation directory path determined at installation.
1 Check your permissions: Log in as a user with full r/w/x permissions recursively on the entire Progress DataDirect for ODBC Drivers installation directory.
2 Determine which shell you are running: From the login shell, execute the echo $SHELL command.
3 Run the DataDirect setup script to set variables: Two scripts, odbc.csh and odbc.sh, are installed in the installation directory. For Korn, Bourne, and equivalent shells, execute odbc.sh. For a C shell, execute odbc.csh. After running the setup script, execute the env command to verify that the ODBCHOME/lib directory has been added to your shared library path.
Set ODBCINI variable: Progress DataDirect for ODBC
Drivers install a default odbc.ini
file, where your data sources reside, in the installation directory. You must
set the ODBCINI environment variable to point to the path of the odbc.ini file. For example:
$ ODBCINI= ODBCHOME/odbc.ini; export ODBCINI
The ivtestlib (32-bit drivers) and ddtestlib (64-bit drivers) test loading tools are provided to verify that the driver can be loaded into memory; they are located in the ODBCHOME/bin directory. For example, to load the 32-bit driver, you would enter:
$ ivtestlib ODBCHOME/lib/ivsparknn.zz
where nn represents the driver level number and zz represents the extension.
If the load is successful, the test loading tool returns a success message along with the version string of the driver. If the driver cannot be loaded, test loading tool returns an error message explaining why.
The following procedure is applicable to all UNIX/Linux systems.
The default odbc.ini file installed in the installation directory is a template in which you create data source definitions. You enter your site-specific database connection information using a text editor. Each data source definition must include the keyword Driver=, which is the full path to the driver.
The following examples show the minimum connection string options that must be set to complete a test connection, where xx represents iv for 32-bit or dd for 64-bit drivers, nn represents the driver level number, and zz represents the extension. The values for the options are samples and are not necessarily the ones you would use.
To configure a connection, you might enter:
Spark SQL Wire Protocol]
Database: The name of the database to which you want to connect by default.
HostName: Either the URL name or the IP address of the server to which you want to connect.
PortNumber: The port number of the server listener. The default is 10000.
MaxStringSize: (This option is required only for Microsoft Access and Tableau users.) The maximum size of columns of the String data type that the driver describes through result set descriptions and catalog functions. Specify the following value that corresponds to your environment:
• For Microsoft Access users, specify a value of 255.
• For Tableau users, specify a value from 255 to 4000 that suits your environment.
The Progress DataDirect for ODBC for Apache Spark SQL Wire Protocol driver installation includes a program named example that can be used to connect to a data source and execute SQL. The application is located in the ODBCHOME/example directory. To run the program, type example and follow the prompts to enter your data source name, user name, and password. If successful, a SQL> prompt appears and you can type in SQL statements such as
SELECT * FROM accounts
If example is unable to connect, the appropriate error message appears.
The drivers have connection options that directly affect performance. To tune these drivers for performance, set the following connection options:
If you know the typical fetch size for your application
Controlled by ArraySize. This option specifies the number of cells the driver retrieves. By increasing the value of Array Size, you increase the number of rows the driver will retrieve from the server for a fetch. Smaller values can improve the initial response time of the query. Larger values improve overall fetch times at the cost of additional memory and slower response time.
Note: If the fetch size exceeds the available buffer memory of the server, an out of memory is returned when attempting to execute the fetch. If you receive this error, decrease the value specified until fetches are successfully executed.
If you know the mazimum size of data when fetching images, pictures, long text, or binary data
Controlled by DefaultLongDataBuffLen. This option specifies the maximum length of data (in KB) the driver can fetch from long columns in a single round trip and the maximum length of data that the driver can send using the SQL_DATA_AT_EXEC parameter. The buffer size should only be large enough to accommodate the maximum amount of data retrieved; otherwise, performance is reduced by transferring large amounts of data into an oversized buffer.
If your application requires encryption of data
Controlled by EncryptionMethod. Data encryption may adversely affect performance because of the additional overhead (mainly CPU usage) required to encrypt and decrypt data.
If your application needs to access a database objects owned only by the current user
Set UseCurrentSchema=1 to improve performance. When this option is enabled, the driver returns only database objects owned by the current user when executing catalog functions. Calls to catalog functions are optimized by grouping queries.
© 2016. Progress Software Corporation. All rights reserved.