System Administrator | HAProxy setup and configuration

When an application becomes popular, it receives an increasing number of requests, and a
single application server may not be able to handle the entire load alone. We can always
scale up the underlying hardware, that is, add more memory and more powerful CPUs to
increase the server capacity, but these improvements do not always scale linearly. To
solve this problem, multiple replicas of the application server are created and the load
is distributed among them. Load balancing can be implemented at OSI Layer 4, that is, at
the TCP or UDP protocol level, or at Layer 7, that is, at the application level with
protocols such as HTTP, SMTP, and DNS.

In this recipe, we will install a popular load balancing service, HAProxy. HAProxy
receives all the requests from clients and directs them to the actual application
servers for processing. Because HAProxy is a proxy, each application server's response
travels back to the client through HAProxy. We will be setting up HAProxy to load
balance TCP connections.

HAProxy, which stands for High Availability Proxy, is a popular open source software TCP/HTTP Load Balancer and proxying solution which can be run on Linux, Solaris, and FreeBSD. Its most common use is to improve the performance and reliability of a server environment by distributing the workload across multiple servers (e.g. web, application, database). It is used in many high-profile environments, including: GitHub, Imgur, Instagram, and Twitter.


Installing HAProxy

Because HAProxy is a fast-developing open source application, the version available in the default Ubuntu repositories might not be the latest release. To find out which version is offered through the official channels, enter the following command.

sudo apt show haproxy

HAProxy always has three active stable releases: the two latest versions, which are still being developed, plus an older third version that still receives critical updates. You can check the newest stable version listed on the HAProxy website and then decide which version you wish to go with.

https://haproxy.debian.net/

1. Install HAProxy:

$ sudo apt-get update
$ sudo apt-get install haproxy
$ haproxy -v

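Enable the HAProxy init script so that HAProxy starts automatically on system boot.
Open /etc/default/haproxy and set ENABLED to 1 (on systemd-based releases the service is typically enabled with sudo systemctl enable haproxy instead):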

ENABLED=1

HAProxy is now installed. Let us now create a setup with two Apache web server instances and one HAProxy instance. The setup is as follows.

We will be using three virtual machines created in VirtualBox:

Instance 1 – Load Balancer
Hostname: haproxy
OS: Ubuntu
Private IP: 192.168.43.187

Instance 2 – Web Server 1
Hostname: webserver01
OS: Ubuntu with LAMP
Private IP: 192.168.43.172

Instance 3 – Web Server 2
Hostname: webserver02
OS: Ubuntu with LAMP
Private IP: 192.168.43.159


Configuring the load balancer

Setting up HAProxy for load balancing is quite a straightforward process. Basically, all you need to do is tell HAProxy what kind of connections it should listen for and where those connections should be relayed to.

This is done in the configuration file /etc/haproxy/haproxy.cfg, which holds these settings. You can read about the configuration options on the HAProxy documentation page if you wish to find out more.


Different load balancing algorithms

Listing the servers in the backend section allows HAProxy to balance the load across them, here according to the roundrobin algorithm, whenever they are available.

The balancing algorithms are used to decide which server at the backend each connection is transferred to. Some of the useful options include the following:

  • Roundrobin: Each server is used in turns according to their weights. This is the smoothest and fairest algorithm when the servers’ processing time remains equally distributed. This algorithm is dynamic, which allows server weights to be adjusted on the fly.
  • Leastconn: The server with the lowest number of connections is chosen. Round-robin is performed between servers with the same load. Using this algorithm is recommended with long sessions, such as LDAP, SQL, TSE, etc, but it is not very well suited for short sessions such as HTTP.
  • First: The first server with available connection slots receives the connection. The servers are chosen from the lowest numeric identifier to the highest, which defaults to the server’s position on the farm. Once a server reaches its maxconn value, the next server is used.
  • Source: The source IP address is hashed and divided by the total weight of the running servers to designate which server will receive the request. This way the same client IP address will always reach the same server while the servers stay the same.
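To make the differences between these algorithms concrete, here is a minimal Python sketch of the selection logic. It is a simplified model and not HAProxy's actual implementation: the SERVERS list, the function names, and the use of CRC32 as the hash are all assumptions made for illustration, and the weighted round-robin simply repeats a server rather than interleaving smoothly as HAProxy does.

```python
import itertools
import zlib

# Hypothetical backend pool as (name, weight); names reuse the hostnames from this post.
SERVERS = [("webserver01", 1), ("webserver02", 1)]


def roundrobin(servers):
    """Return an endless iterator over server names, repeated by weight."""
    expanded = [name for name, weight in servers for _ in range(weight)]
    return itertools.cycle(expanded)


def leastconn(active):
    """Pick the server with the fewest active connections.

    `active` maps server name -> current connection count. Ties go to the
    first server in insertion order (HAProxy round-robins among ties).
    """
    return min(active, key=active.get)


def first(active, maxconn):
    """Pick the first server, in order, that still has a free connection slot."""
    for name, count in active.items():
        if count < maxconn:
            return name
    return None  # every server has reached maxconn


def source(client_ip, servers):
    """Map a client IP to a server: hash it, then take it modulo the total weight.

    CRC32 is an assumption here; HAProxy uses its own hash function.
    """
    expanded = [name for name, weight in servers for _ in range(weight)]
    digest = zlib.crc32(client_ip.encode("ascii"))
    return expanded[digest % len(expanded)]


if __name__ == "__main__":
    rr = roundrobin(SERVERS)
    print([next(rr) for _ in range(4)])
    # ['webserver01', 'webserver02', 'webserver01', 'webserver02']
    print(leastconn({"webserver01": 3, "webserver02": 1}))  # webserver02
```

Note how source returns the same server for the same client IP as long as the server list does not change, which is the stickiness property described in the last bullet.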

Now, edit the HAProxy /etc/haproxy/haproxy.cfg configuration file. You may
want to create a copy of this file before editing:

$ cd /etc/haproxy
$ sudo cp haproxy.cfg haproxy.cfg.copy
$ sudo nano haproxy.cfg

Load balancing on layer 4

Once installed, HAProxy should already have a template configuration file for the load balancer.

If you have not already made a copy, back up the original file by copying it:

sudo cp /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.original

Open the configuration file in a text editor, for example vim, and add the following sections to the end of the file. Replace <server1 name> and <server2 name> with whatever you want to call your servers on the statistics page, and <private IP 1> and <private IP 2> with the private IPs of the servers you wish to direct the web traffic to.

frontend http_front
 bind *:80
 stats uri /haproxy?stats
 default_backend http_back

backend http_back
 balance roundrobin
 server <server1 name> <private IP 1>:80 check
 server <server2 name> <private IP 2>:80 check

We can verify that the configuration file is valid with the following command. If there are any errors, they will be displayed so that we can go in and fix them:

haproxy -f /etc/haproxy/haproxy.cfg -c

Once you are done configuring, start the HAProxy service (use restart instead of start if it is already running):

sudo service haproxy start

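This defines a load balancer with a frontend named http_front listening on port 80, which directs the traffic to the default backend named http_back. The additional stats URI /haproxy?stats enables the statistics page at that address. Note that the stats page only works in HTTP mode; for pure layer 4 balancing of TCP connections, the frontend and backend would be set to mode tcp and the stats line left out.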
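The check keyword on each server line tells HAProxy to run health checks and to stop sending traffic to a server that fails them. The following Python sketch models that behaviour for a two-server backend. It is a toy model under stated assumptions, not how HAProxy is implemented: the Backend class and its method names are hypothetical.

```python
class Backend:
    """Toy model of a backend: round-robin over servers currently marked up."""

    def __init__(self, servers):
        self.servers = list(servers)  # fixed order, as listed in the config
        self.up = {name: True for name in self.servers}
        self._next = 0  # index of the next server to try

    def set_health(self, name, is_up):
        """Record the result of a health check for one server."""
        self.up[name] = is_up

    def pick(self):
        """Return the next healthy server, or None if all are down."""
        for offset in range(len(self.servers)):
            index = (self._next + offset) % len(self.servers)
            name = self.servers[index]
            if self.up[name]:
                self._next = (index + 1) % len(self.servers)
                return name
        return None


if __name__ == "__main__":
    backend = Backend(["webserver01", "webserver02"])
    print(backend.pick(), backend.pick())  # webserver01 webserver02
    backend.set_health("webserver01", False)  # simulate a failed health check
    print(backend.pick(), backend.pick())  # webserver02 webserver02
```

Once the failed server is marked up again, it rejoins the rotation without any change to the configuration.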


Testing Load-Balancing and Fail-over

We will append the server name to the default index.html file on each web server, located by default at /var/www/index.html:

ServerName webserver01
 DocumentRoot /var/www/
ServerName webserver02
 DocumentRoot /var/www/

On Instance 2 – Web Server 1 (webserver01, IP 192.168.43.172), append the line as follows:

sudo sh -c "echo \<h1\>Hostname: webserver01 \(192.168.43.172\)\<\/h1\> >> /var/www/index.html"

On Instance 3 – Web Server 2 (webserver02, IP 192.168.43.159), append the line as follows:

sudo sh -c "echo \<h1\>Hostname: webserver02 \(192.168.43.159\)\<\/h1\> >> /var/www/index.html"

Restart Apache on both web servers (192.168.43.172 and 192.168.43.159).

Now open a web browser on the local machine and browse to the HAProxy IP, i.e. http://192.168.43.187.
Each time you refresh the tab, you will see the load being distributed to each web server in turn. Below is a screenshot of my browser:

[Screenshot: webserver01.png]

And the second time, i.e. when I refresh the page, I get:

[Screenshot: webserver02]
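Refreshing the browser by hand works, but the same check can be scripted. Below is a small Python sketch that requests the page several times and counts which hostname answered, relying on the Hostname line we appended to index.html. The function names and the attempts parameter are hypothetical, and the fetch argument exists only so the counting logic can be tried without a live load balancer.

```python
import re
from collections import Counter
from urllib.request import urlopen

# Matches the "Hostname: <name>" text appended to index.html on each web server.
HOSTNAME_RE = re.compile(r"Hostname:\s*(\S+)")


def fetch_page(url):
    """Download a page and return its body as text."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8", errors="replace")


def count_backends(url, attempts=10, fetch=fetch_page):
    """Request `url` repeatedly and count which hostname answered each time."""
    seen = Counter()
    for _ in range(attempts):
        match = HOSTNAME_RE.search(fetch(url))
        seen[match.group(1) if match else "unknown"] += 1
    return seen
```

Calling count_backends("http://192.168.43.187/") against the setup above should show the requests split between webserver01 and webserver02. If you stop Apache on one web server and run it again, all requests should land on the remaining one, which is a simple way to test fail-over.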

You can also check the HAProxy stats by visiting http://192.168.43.187/haproxy?stats

[Screenshot: haproxystat.png]


If you would like to monitor live traffic that passes through HAProxy, enable debugging with the -d flag:

haproxy -f /etc/haproxy/haproxy.cfg -d


Load balancing on layer 7

Another possibility is to configure the load balancer to work on layer 7, which is useful when parts of your web application are located on different hosts. This can be accomplished by routing each connection on a condition, for example the URL path.

Open the HAProxy configuration file with a text editor.

sudo vim /etc/haproxy/haproxy.cfg

Then set the frontend and backend sections according to the example below.

frontend http_front
 bind *:80
 stats uri /haproxy?stats
 acl url_blog path_beg /blog
 use_backend blog_back if url_blog
 default_backend http_back

backend http_back
 balance roundrobin
 server <server name> <private IP>:80 check

backend blog_back
 server <server name> <private IP>:80 check

The frontend declares an ACL rule named url_blog that applies to all connections with paths that begin with /blog. The use_backend directive defines that connections matching the url_blog condition should be served by the backend named blog_back, while all other requests are handled by the default backend.
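The routing decision made by this frontend can be sketched in a few lines of Python. This is only an illustration of the rule logic, with hypothetical names (RULES, DEFAULT_BACKEND, choose_backend); the backend names come from the configuration above.

```python
# Each rule pairs a path prefix with the backend that serves it,
# mirroring "acl url_blog path_beg /blog" + "use_backend blog_back if url_blog".
RULES = [("/blog", "blog_back")]
DEFAULT_BACKEND = "http_back"  # mirrors "default_backend http_back"


def choose_backend(path, rules=RULES, default=DEFAULT_BACKEND):
    """Return the backend for a request path; the first matching prefix wins."""
    for prefix, backend in rules:
        if path.startswith(prefix):  # path_beg is a plain prefix match
            return backend
    return default


if __name__ == "__main__":
    print(choose_backend("/blog/first-post"))  # blog_back
    print(choose_backend("/index.html"))       # http_back
    print(choose_backend("/blogroll"))         # blog_back
```

The last call shows a detail worth knowing: because path_beg matches on a plain prefix, a path such as /blogroll also goes to blog_back. Matching on /blog/ instead would narrow the rule if that is not what you want.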


 

 
