Load balancer configuration
The Hybrid Data Pipeline product package does not include a load balancer. However, Hybrid Data Pipeline can be deployed on one or more nodes behind a load balancer to provide high availability and scalability. Hybrid Data Pipeline supports two types of load balancers: network load balancers that support the TCP tunneling protocol and cloud load balancers that support the WebSocket protocol. In either case, the load balancer must be configured to support the Hybrid Data Pipeline environment according to the following criteria. A sample configuration illustrating these requirements appears after the list.
*The load balancer must be configured for SSL termination to support encrypted communications between clients and the load balancer. The configuration depends in part on the type of SSL certificate supplied. Use the following guidelines when configuring the load balancer for SSL termination.
Note: When intermediate certificates are required for the trust chain, the SSL certificate must be supplied as a PEM Base64-encoded X.509 file. When there are no intermediate certificates, the SSL certificate can instead be supplied as a DER-encoded binary X.509 file.
*When using a self-signed certificate, the certificate must be supplied to the load balancer and specified as the SSL certificate during installation of the Hybrid Data Pipeline server. The installation program uses the specified certificate file to generate the trust stores needed in the installation of the ODBC driver, JDBC driver, and On-Premises Connector. These files are written to the redist directory of the key location upon installation and are needed for the installation of each component.
*When using a certificate issued by a well-known certificate authority (such as a certificate authority trusted by Java), the load balancer must be configured with the certificate and any intermediate certificates necessary to establish the chain of trust to the root certificate. The root certificate must be specified as the SSL certificate during installation of the Hybrid Data Pipeline server. The installation program uses the specified certificate file to generate the trust stores needed in the installation of the ODBC driver, JDBC driver, and On-Premises Connector. These files are written to the redist directory of the key location upon installation. In this case, trust stores are not needed for the installation of the JDBC driver or the On-Premises Connector; however, a trust store is needed for the installation of the ODBC driver.
*When using a certificate issued by a certificate authority that is not well-known, the load balancer must be configured with the certificate and any intermediate certificates necessary to establish the chain of trust to the root certificate. The root certificate must be specified as the SSL certificate during installation of the Hybrid Data Pipeline server. The installation program uses the specified certificate file to generate the trust stores needed in the installation of the ODBC driver, JDBC driver, and On-Premises Connector. These files are written to the redist directory of the key location upon installation and are needed for the installation of each component.
*The load balancer must support session affinity. The load balancer must be configured either to supply its own cookies or to pass the cookies generated by the Hybrid Data Pipeline service back to the client. The Hybrid Data Pipeline service provides a cookie named C2S-SESSION that can be used by the load balancer. The ODBC and JDBC drivers automatically use cookies for session affinity. OData applications should be configured to echo cookies for optimal performance.
*The load balancer must pass the load balancer hostname in the Host header when it forwards a request to an individual Hybrid Data Pipeline node. For example, if the hostname used to access the cluster is hdp.mycorp.com and the individual nodes behind the load balancer have the hostnames hdpsvr1.mycorp.com, hdpsvr2.mycorp.com, and hdpsvr3.mycorp.com, then the Host header in the request forwarded to the Hybrid Data Pipeline node must be the load balancer hostname hdp.mycorp.com.
*The load balancer must supply the X-Forwarded-Proto header to indicate to the Hybrid Data Pipeline node whether the request was received by the load balancer as an HTTP or HTTPS request.
*The load balancer must supply the X-Forwarded-For header if the client IP address is needed for Hybrid Data Pipeline access logs. If the X-Forwarded-For header is not supplied, the IP address in the access logs will always be the load balancer's IP address.
*The load balancer may be configured to run HTTP health checks against the nodes using the Health Check API. Note that the On-Premises backend and default notification servers support only simple health checks, not the Health Check API.
*Additional configuration is required for the following scenarios.
*If you are using the On-Premises Connector to access backend data behind a firewall with a network load balancer such as HAProxy, see Configuring a network load balancer with the On-Premises Connector for additional configuration requirements.
*If you are deploying Hybrid Data Pipeline on a cloud load balancer such as the AWS application load balancer or the Azure application gateway, see Configuring a cloud load balancer for additional configuration details.
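As a point of reference, the following minimal sketch shows how these requirements might map to an HAProxy network load balancer (one of the load balancers referenced above). The certificate path, node hostnames, port numbers, and health-check URL are illustrative placeholders, not values supplied with the product; consult your load balancer documentation and the Health Check API documentation for the exact settings in your environment.

    # Minimal HAProxy sketch; all paths, hostnames, ports, and the
    # health-check URL below are placeholders.
    defaults
        mode http
        timeout connect 5s
        timeout client  30s
        timeout server  30s

    frontend hdp_front
        # SSL termination: hdp.pem holds the server certificate, any
        # intermediate certificates, and the private key in PEM format.
        bind *:443 ssl crt /etc/haproxy/certs/hdp.pem
        # Tell the Hybrid Data Pipeline nodes that the client request was
        # received over HTTPS.
        http-request set-header X-Forwarded-Proto https if { ssl_fc }
        # HAProxy forwards the client's Host header (hdp.mycorp.com) to the
        # nodes unchanged, satisfying the Host header requirement.
        default_backend hdp_nodes

    backend hdp_nodes
        balance roundrobin
        # Add the X-Forwarded-For header with the client IP address for the
        # Hybrid Data Pipeline access logs.
        option forwardfor
        # Session affinity: reuse the C2S-SESSION cookie issued by the
        # Hybrid Data Pipeline service by prefixing it with a server ID.
        cookie C2S-SESSION prefix
        # Optional HTTP health check; replace the path with the Health Check
        # API endpoint for your deployment.
        option httpchk GET /api/healthcheck
        server hdpsvr1 hdpsvr1.mycorp.com:8080 check cookie s1
        server hdpsvr2 hdpsvr2.mycorp.com:8080 check cookie s2
        server hdpsvr3 hdpsvr3.mycorp.com:8080 check cookie s3

In this sketch the backend reuses the C2S-SESSION cookie generated by the Hybrid Data Pipeline service; configuring the load balancer to insert its own session cookie is an equally valid approach.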
* Configuring a network load balancer with the On-Premises Connector
* Configuring a cloud load balancer