HAProxy is an open-source load balancer that can balance any TCP-based service. It is most often used to balance HTTP, and can help solve traffic problems on your web server. Here's how to set it up.
What is HAProxy?
Load balancers like HAProxy let you spread traffic across multiple servers, making it easier to handle. Instead of pointing your domain at your web server, you point it at a HAProxy server, which decides where each request is sent from there. HAProxy is very lightweight and needs few resources to operate, so a single load balancer can serve many backend servers. Ideally, you want both your HAProxy server and your web servers hosted in the same data center, with the same cloud provider, to cut down on latency.
HAProxy also makes your network more resilient. If a web server crashes, HAProxy can redirect traffic to the remaining servers while you diagnose the problem. For real resilience, though, you'll also want a backup HAProxy server in case the load balancer itself goes down.
Even with HAProxy, you may still want a full-site CDN in front of it, both to handle extra load and to have points of presence closer to your end users.
How to set up HAProxy Load Balancing
First, install HAProxy from your distribution's package manager. On Debian-based systems like Ubuntu, that would be:
apt-get install haproxy
Then you need to turn it on by editing the init defaults file at /etc/default/haproxy and setting ENABLED to 1.
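On these systems, /etc/default/haproxy is a small shell-style defaults file; the relevant line would look something like this:

```
# /etc/default/haproxy
# Set ENABLED to 1 so the init script will start HAProxy.
ENABLED=1
```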
If you now run service haproxy, you should see that it is enabled and ready to be configured. We start by archiving the default configuration file:
mv /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.old
Instead, create a new configuration file and start by adding some global settings:
global
    log 127.0.0.1 local0 notice
    maxconn 2000
    user haproxy
    group haproxy

listen stats
    bind *:8080
    stats enable
    stats uri /haproxy?stats
    stats realm HAProxy\ Statistics
    stats auth admin:password
The log setting specifies the syslog server HAProxy sends its logs to; you need a syslog daemon such as rsyslog listening at that address for it to work. The maxconn setting caps the number of simultaneous connections, and user and group specify which Unix user and group HAProxy runs as.
The stats directives are proxy-level settings, so they go in their own listen section, here bound to port 8080. They enable HAProxy's built-in statistics page, which you can view by navigating to http://your_ip:8080/haproxy?stats in your browser and logging in with the credentials from stats auth.
Next we set the default configuration, which applies to all of the following listen, frontend, and backend blocks unless they override it:
defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    option redispatch
    timeout connect 5000
    timeout client 10000
    timeout server 10000
These defaults use the global log settings, operate in HTTP mode, and set a few connection-related timeouts, given in milliseconds.
Next we create a frontend block, which does the heavy lifting and forwards connections to the backend:
frontend proxy
    bind *:80

    # ACL declarations
    acl is_abuse src_http_req_rate(Abuse) ge 10
    acl inc_abuse_cnt src_inc_gpc0(Abuse) gt 0
    acl abuse_cnt src_get_gpc0(Abuse) gt 0

    # Rules
    tcp-request connection track-sc0 src table Abuse
    tcp-request connection reject if abuse_cnt
    http-request deny if abuse_cnt
    http-request deny if is_abuse inc_abuse_cnt

    option httpclose
    option forwardfor
    use_backend appname
The first line binds this frontend to port 80, where HAProxy will listen for connections.
The next two sections handle rate limiting. First, ACL (Access Control List) expressions are declared that determine whether an IP address is being abusive. Then a set of rules rejects connections from clients that make too many requests.
The forwardfor option passes the client's IP address along to the backend. Because HAProxy acts as a reverse proxy, your nginx server would otherwise only see your HAProxy server's IP address; this option sets the X-Forwarded-For HTTP header to the client's real IP address.
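On the web-server side, you generally have to tell the server to trust this header. For nginx, a minimal sketch using the real_ip module might look like the following (10.0.0.1 is a placeholder for your HAProxy server's address; substitute your own):

```
# In the http or server block; requires ngx_http_realip_module.
# 10.0.0.1 is a placeholder for your HAProxy server's IP.
set_real_ip_from 10.0.0.1;
real_ip_header X-Forwarded-For;
```

With this in place, nginx logs and access controls see the original client IP rather than the load balancer's.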
Finally, we set this frontend block to use the backend “appname”, which we now need to create. The backend block simply defines the servers to forward to, along with some options:
backend appname
    balance roundrobin
    cookie SERVERNAME insert
    server web1 web1_ip:80 check cookie web1
    server web2 web2_ip:80 check cookie web2
The balance directive defines how HAProxy distributes requests across servers. The most common option is roundrobin, which rotates connections through each server in turn. If that gives you uneven load, try leastconn, which picks the server with the fewest active connections. If you need users to reach the same server across multiple connections, you can use source, which chooses a server based on a hash of the client's IP address.
The last two lines add servers to this backend block. You give each one a name (web1 and web2), its address, and then some options. Here we use check so that HAProxy health-checks each server and only forwards connections while it is up, and the cookie parameter to set the SERVERNAME cookie (declared on the line above) to the server's name. That cookie provides session stickiness, so users aren't bounced between servers while using your site; balance source achieves a similar effect.
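If you'd rather not rely on cookies, for example for non-browser clients, a sketch of the same backend using source-hash stickiness instead might look like:

```
backend appname
    balance source
    server web1 web1_ip:80 check
    server web2 web2_ip:80 check
```

Note that source hashing pins clients by IP, so users behind a large NAT will all land on the same server.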
And because we use rate limiting, we need a second backend to store the tracked IP addresses:
backend Abuse
    stick-table type ip size 100k expire 30m store gpc0,http_req_rate(10s)
This backend doesn't actually forward any connections; it just acts as a table to store addresses in. Entries expire after 30 minutes, so an address flagged as abusive is blocked for 30 minutes before it is flushed.
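The tracking window in this table and the ge 10 threshold in the frontend ACL can be tuned together. For example, a sketch that flags clients exceeding roughly 100 requests per minute instead (the frontend ACL would then use ge 100):

```
backend Abuse
    stick-table type ip size 100k expire 30m store gpc0,http_req_rate(60s)
```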
Finally, you can start the HAProxy service by running:
service haproxy start