Configuring a network load balancer with the On-Premises Connector
When running Hybrid Data Pipeline behind a network load balancer with an On-Premises Connector, the load balancer must be configured to route requests for on-premises data sources to the correct server nodes.
There are two general steps involved in configuring your load balancer to support on-premises data access. First, a custom Access Control List must be created to direct requests for the On-Premises Connector to cluster nodes. Second, a backend notification pool that specifies the on-premises port for each cluster node must be created. The following instructions explain how an HAProxy load balancer can be configured to support Hybrid Data Pipeline access to backend data sources using the On-Premises Connector. These instructions may be adapted for other load balancers, such as NGINX and F5.
The Hybrid Data Pipeline installation program automatically generates an HAProxy configuration file for each installation of the server. These HAProxy configuration files are written to the HAProxy subdirectory in the key location directory specified during installation. These files must be merged to create a single HAProxy configuration file for a load balancer deployment of Hybrid Data Pipeline.
Take the following steps to create an HAProxy configuration file for a load balancer deployment using the On-Premises Connector.
1. Create an Access Control List (ACL) to direct requests for the On-Premises Connector to each Hybrid Data Pipeline server.
Note: Options 1 and 2 below may be used in combination.
Option 1. Use a custom header to direct requests. Each entry should be prefaced with acl.
In this example, the custom header X-DataDirect-OPC-Host is used to direct requests to the server service2.myserver.com through the default On-Premises Port 40501.
acl is_opa_hdr_service2_myserver_com_40501 hdr(X-DataDirect-OPC-Host) -i opa_service2_myserver_com_40501
use_backend opa_service2_myserver_com_40501 if is_opa_hdr_service2_myserver_com_40501
Option 2. Use URL routing to direct requests. Each entry should be prefaced with acl.
In this example, URL routing is used to direct requests to the server service2.myserver.com through the default On-Premises Port 40501.
acl is_opa_url_service2_myserver_com_40501 path_end -i /connect/opa_service2_myserver_com_40501
use_backend opa_service2_myserver_com_40501 if is_opa_url_service2_myserver_com_40501
2. Add each Hybrid Data Pipeline server to the backend notification pool section using the server keyword.
In the following example, the server server2.myserver.com has been added to the backend hdp_notification_pool section, and health checks have been enabled at the root with the option httpchk property.
backend hdp_notification_pool
mode http
option http-tunnel
balance roundrobin
option httpchk HEAD /
http-check expect status 200
#HDP Notification Server Definitions
server server1.myserver.com 11.22.111.105:11280 check
server server2.myserver.com 11.22.111.106:11280 check
3. Create a backend pool that specifies the On-Premises Port for each Hybrid Data Pipeline server that supports the On-Premises Connector by adding a backend section to the configuration file.
For example, the following backend section is for a node on the service2.myserver.com server using the default On-Premises Port 40501. Health checks have been enabled at the root with the option httpchk property.
backend opa_service2_myserver_com_40501
mode http
option http-tunnel
option httpchk HEAD /
http-check expect status 200
server service2.myserver.com 11.22.111.106:40501 check
4. Add each Hybrid Data Pipeline server to the default backend pool using the server keyword.
In the following example, service2.myserver.com has been added to the backend hdp_default_backend pool, and health checks have been enabled by specifying the /api/healthcheck endpoint with the option httpchk property.
backend hdp_default_backend
mode http
balance roundrobin
option httpchk HEAD /api/healthcheck
http-check expect status 200
cookie HDP_SESSION insert nocache
#HDP Server Definitions
server service1.myserver.com 11.22.11.105:8080 check cookie service1.myserver.com
server service2.myserver.com 11.22.111.106:8080 check cookie service2.myserver.com
Example
The following example demonstrates an HAProxy configuration file for using the load balancer with two server nodes that have the On-Premises Connector enabled, service1.myserver.com and service2.myserver.com. To create this file, the required sections were copied from the generated configuration file for service2.myserver.com into the generated file for service1.myserver.com. Copied sections are indicated with comments.
global
log 127.0.0.1 local0
chroot /var/lib/haproxy
daemon
defaults
log global
mode http
option httplog
option dontlognull
timeout connect 5s
timeout client 15m
timeout server 15m
##############################################################################
# Configuration for OPC with load balancer.
##############################################################################
frontend lb_opc_nodes
bind *:80
#Replace /common/hdpsmoke/shared/redist/ddcloud.pem with the location of the
#load balancer's SSL certificate
bind *:443 ssl crt /common/hdpsmoke/shared/redist/ddcloud.pem
#In production, port 80 should be permanently redirected to 443 by uncommenting the
#following line
#redirect scheme https code 301 if !{ ssl_fc }
mode http
default_backend hdp_default_backend
#Define rules for HDP Notification Servers
acl is_hdp_notification2 path_end -i /connect/X_DataDirect_Notification_Server
use_backend hdp_notification_pool if is_hdp_notification2
acl is_hdp_notification hdr(X-DataDirect-OPC-Host) -i X_DataDirect_Notification_Server
use_backend hdp_notification_pool if is_hdp_notification
#Rules for on-premises connection to service1.myserver.com
acl is_url_opa_service1_myserver_com_40501 path_end -i /connect/opa_service1_myserver_com_40501
use_backend opa_service1_myserver_com_40501 if is_url_opa_service1_myserver_com_40501
acl is_hdr_opa_service1_myserver_com_40501 hdr(X-DataDirect-OPC-Host) -i opa_service1_myserver_com_40501
use_backend opa_service1_myserver_com_40501 if is_hdr_opa_service1_myserver_com_40501
#Rules for on-premises connection to service2.myserver.com. These rules were copied
#from the service2.myserver.com configuration file.
acl is_url_opa_service2_myserver_com_40501 path_end -i /connect/opa_service2_myserver_com_40501
use_backend opa_service2_myserver_com_40501 if is_url_opa_service2_myserver_com_40501
acl is_hdr_opa_service2_myserver_com_40501 hdr(X-DataDirect-OPC-Host) -i opa_service2_myserver_com_40501
use_backend opa_service2_myserver_com_40501 if is_hdr_opa_service2_myserver_com_40501
backend hdp_notification_pool
mode http
option http-tunnel
balance roundrobin
option httpchk HEAD /
http-check expect status 200
#HDP Notification Server Definitions
server service1.myserver.com 11.22.111.105:11280 check
#The following server argument was copied from the service2.myserver.com
#configuration file
server service2.myserver.com 11.22.111.106:11280 check
backend opa_service1_myserver_com_40501
mode http
option http-tunnel
option httpchk HEAD /
http-check expect status 200
server service1.myserver.com 11.22.111.105:40501 check
#The following section was copied from the service2.myserver.com configuration file.
backend opa_service2_myserver_com_40501
mode http
option http-tunnel
option httpchk HEAD /
http-check expect status 200
server service2.myserver.com 11.22.111.106:40501 check
backend hdp_default_backend
mode http
balance roundrobin
option httpchk HEAD /api/healthcheck
http-check expect status 200
cookie HDP_SESSION insert nocache
#HDP Server Definitions
server service1.myserver.com 11.22.11.105:8080 check cookie service1.myserver.com
#The following server argument was copied from the service2.myserver.com
#configuration file
server service2.myserver.com 11.22.111.106:8080 check cookie service2.myserver.com
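The merge described above is easy to get partly wrong: an acl wrapped onto two lines, a use_backend rule copied without its backend section, or the port 80 redirect left commented out. The following check is an added example, not part of the product or its generated files. The helper lint_haproxy, the sample configuration and the certificate path /path/to/lb.pem are constructed for illustration, and the parser is simplified (ACL names are treated as global rather than per section).

```python
import re

# Constructed sample: service1's generated file after an incomplete merge of the
# service2 sections (wrapped acl, backend not copied, redirect still commented).
SAMPLE = """\
frontend lb_opc_nodes
    bind *:80
    bind *:443 ssl crt /path/to/lb.pem
    #redirect scheme https code 301 if !{ ssl_fc }
    default_backend hdp_default_backend
    acl is_hdr_opa_service2_myserver_com_40501 hdr(X-DataDirect-OPC-Host)
    -i opa_service2_myserver_com_40501
    use_backend opa_service2_myserver_com_40501 if is_hdr_opa_service2_myserver_com_40501
backend hdp_default_backend
    server service1.myserver.com 11.22.11.105:8080 check
"""

def lint_haproxy(text):
    """Return findings for a merged config (hypothetical helper, simplified parser)."""
    acls, backends, refs, findings = set(), set(), [], []
    binds_80 = redirects_https = False
    for n, raw in enumerate(text.splitlines(), 1):
        tok = raw.split("#", 1)[0].split()          # drop comments, tokenize
        if not tok:
            continue
        if tok[0].startswith("-"):                  # e.g. a stray "-i <pattern>" line
            findings.append(f"line {n}: orphaned '{tok[0]}', acl wrapped across lines?")
        elif tok[0] in ("backend", "acl") and len(tok) > 1:
            (backends if tok[0] == "backend" else acls).add(tok[1])
        elif tok[0] in ("use_backend", "default_backend") and len(tok) > 1:
            refs.append((n, tok[1], tok[3:]))       # tok[2] is "if" or "unless"
        elif tok[0] == "bind" and len(tok) > 1 and tok[1].endswith(":80"):
            binds_80 = True
        elif "redirect" in tok and "scheme" in tok and "https" in tok:
            redirects_https = True
    for n, backend, cond in refs:
        if backend not in backends:
            findings.append(f"line {n}: backend '{backend}' is not defined")
        names = re.sub(r"!?\{[^}]*\}", " ", " ".join(cond)).split()  # skip anonymous ACLs
        for name in (t.lstrip("!") for t in names if t not in ("or", "||")):
            if name not in acls and not name.isupper():   # uppercase: predefined ACL
                findings.append(f"line {n}: acl '{name}' is not defined")
    if binds_80 and not redirects_https:
        findings.append("port 80 is bound but no 'redirect scheme https' rule is active")
    return findings

print(*lint_haproxy(SAMPLE), sep="\n")
```

HAProxy's own check mode, haproxy -c -f followed by the merged file, remains the authoritative syntax validation; this check only targets the merge mistakes named above.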