How to : Scalability - Load-Balancing - Fault tolerance

with Apache JServ 1.1

document version: 1.2.1 (03 Feb 2000)

Summary

Introduction
Changes from previous versions
Features

Scalability

Apache Scalability
JServ Scalability

Openness
Security
Load-balancing
Session handling

How sessions work

Fault tolerance

No Single point of failure
Apache fault tolerance
JServ fault tolerance

Manageability

Configuration
Internals

Load-balancing algorithm
Watchdog process
Shared memory

Internal state - Administration tasks
Large sites
Known problems
FAQ
Tips
Authors

1. Introduction

Even as Java Virtual Machines become faster over the years, people ask for ever more personalized and dynamic content, and this type of application requires more and more CPU power. Apache JServ addresses this requirement by letting you distribute your application load over as many hosts as needed.

At the same time, applications and user transactions are based on HTTP sessions, and Apache JServ makes sure that the requests of a session keep reaching the JServ that owns it. Of course, the whole system has to be fault-tolerant.

And of course all for free ... ;-)

This document describes how to use Apache JServ load-balancing and fault tolerance features.

Thank you for using Apache JServ.

2. Changes from previous versions

ApJServShmFile

configuration

3. Features

3.1. Scalability

Apache JServ

As an example, it scales from:

1 PC, running 1 Apache + 1 JServ (from 1 to z zones) (my home PC)

To (OK, let's take one realistic example) :

10 hosts running Apache. (load-balancing with round-robin DNS / or dedicated hardware)
33+ JServs running on top of :

15 PC running Linux JDK1.2
5 Sun E450
10 PC NT JDK1.1.6
3 AS400
+ ... (What fits your needs)

3.1.1. Scaling Apache HTTP/HTTPS servers :

3.1.2. Scaling JServ servers :

No JVM today is able to run thousands of threads (each doing something really useful) at the same time.

Some Java classes have synchronized methods. Increasing the number of JServs increases the parallelism.
Redundancy increases system safety. If one JVM crashes, if one servlet calls System.exit(), or if anything else bad happens, the other Apache JServs will still be alive.

Servlet zones can be distributed over different JServs. This increases security, as these JServs can be started with different userids.

Every JServ can be started with its own CLASSPATH.
Every JServ can be started with its own JDK version or JRE options (-nojit, -verbosegc, ...).

3.2. Openness

3.3. Security

You can run Apache & JServ on same/different hosts with different userids and rights.
You can separate different zones on different JServs.
JServ can be run behind a firewall (AJP protocol uses only one TCP port)
JServ uses an ACL and allows AJP requests only from ACL'ed hosts.
The communication can be authenticated with a secret key between Apache & JServ.
Apache+SSLeay allows 128-bit SSL encryption with https URLs (between the browser and the HTTP server).
Server redundancy (duplicating both Apache & JServ) increases the service availability.
Apache + JServ does not rely on any external component. There is no single point of failure (SPF).

3.4 Load-balancing

Level 0 :

automatic

Level 1 :

Level 2 :

balance

Configuration

Level 3 :

routing

3.5. Session handling

3.5.1 How sessions work

3.6 Fault tolerance

3.6.1 No single point of failure

If you have more than one JServ host, one of them can stop and the system will still work (apart from the sessions it owned, which are broken).

Any Apache is able to route a request to any JServ. (Sessions are maintained, and this does not rely on any single load-balancer device or Apache server.)
As long as you have one Apache and one JServ running, the system can work.

Fault tolerance is implicit if load-balancing is enabled for this zone.
All httpd processes on a host share a memory zone, which contains the last known status of every JServ.
As soon as a JServ is marked "unavailable" by one httpd process, the information is accessible to the other processes on the same host. This prevents repeated connect failures or even TCP connect timeouts.
The request and the following requests will be redirected to other JServs in the same "set" of JServs.
A watchdog program runs silently and tries to connect to this JServ until it succeeds. Once it succeeds, the JServ is used again by the httpd processes.
The session is broken if a request is redirected to another JServ. The existing session is considered invalid, as sessions cannot travel across the network (this is not in the Servlet API specs).

3.6.2. Scenario 1: Apache fault-tolerance

We have 2 Apache servers on 2 different hosts.
1 - first step
a. The web client requests a page from www.jserv.com, port 80.
b. www.jserv.com is a load-balancing system which resolves the request to the Apache httpd server at 111.222.333.10, port 3000.
c. The Apache httpd server listening on 111.222.333.10, port 3000 chooses (at random) a JServ machine, 192.168.0.51, port 8885, to handle the request.
d. The JServ machine at 192.168.0.51, port 8885 responds to the request with the content of the page along with a cookie named JServSessionID with value "xxxx-JS1".

2 - second step
a. The web client requests another page from www.jserv.com, port 80.
b. www.jserv.com resolves to 111.222.333.20, port 3000 (a different machine from the previous request).
c. The Apache server recognizes the JServSessionID cookie and reads the JServ identifier, "JS1", at the end of the cookie value.
d. The Apache server sees that JS1 is the identifier for 192.168.0.51, port 8885, and passes the request to that same JServ.
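
The lookup in steps 2c and 2d can be sketched in a few lines of Python. This is only an illustration, not the mod_jserv code: the function names and the ROUTES table are hypothetical, and because this document shows the cookie both as "xxxx-JS1" and as "xxxx.JS1", the sketch accepts either separator.

```python
# Hypothetical routing table: JServ identifier -> (host, port),
# playing the role of the ApJServRoute / ApJServHost directives.
ROUTES = {"JS1": ("192.168.0.51", 8885)}


def route_id_from_cookie(value):
    """Return the JServ identifier at the end of a session cookie, or None."""
    # The identifier follows the last separator; "-" and "." are both
    # accepted because the document uses both forms (an assumption).
    pos = max(value.rfind("-"), value.rfind("."))
    if pos < 0 or pos == len(value) - 1:
        return None
    return value[pos + 1:]


def target_for_cookie(value, routes=ROUTES):
    """Return the (host, port) that owns the session, or None if unknown."""
    return routes.get(route_id_from_cookie(value))


print(target_for_cookie("xxxx-JS1"))
```

Any Apache holding the same routing table resolves the same cookie to the same JServ, which is why the second request can arrive on a different Apache host without breaking the session.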

3.6.3. Scenario 2 : JServ load-balancing & fault tolerance

1 - first step : JServ

load-balancing

2 - second step : session handling

3 - third step : JServ fault tolerance
a. Client A requests a servlet and sends the previously set cookie JS1 (A2).
b. The httpd server recognizes the cookie JS1.
c. The request is passed to JServ1, resulting in a failure (A2')
(the httpd process marks JServ1 "dead" in shared memory).
d. The httpd server finds a backup and sends the request to JServ3 (A2'').
e. Result of A2'': if a session is needed, a new one is created and a cookie JS3 is set (the JS1 cookie is erased).

3.7. Manageability

4. Configuration

There is nothing special to do on the Java side: NO new parameter in the properties file.
The administrator installs JServ and then chooses, from the Apache side, whether or not to use load-balancing.

1 - defining hosts and routing parameters (ApJServHost and ApJServRoute)

2 - defining host weights (the optional last argument of ApJServBalance)

ApJServBalance set1 SPARK 4

The default weight is 1. This one (SPARK) is a 4-CPU machine.

3 - defining a set (here called set1) of equivalent JServs (ApJServBalance)

4 - defining the load-balanced servlet mount point (ApJServMount with a balance:// URL)

If one of the JServs fails, requests will be redirected to the other members of set "set1".

5 - defining the shared memory file (ApJServShmFile)

ApJServShmFile log/jserv_shm

<IfModule mod_jserv.c>
####################################################
# Apache JServ Configuration File #
####################################################
# Note: this file should be appended to httpd.conf
ApJServManual on

# old style config
ApJServMount /oldservlet ajpv12://192.168.0.1:7777/zone2

ApJServMount /servlet balance://set1/zone1

ApJServBalance set1 PC1
ApJServBalance set1 PC2
ApJServBalance set1 PC3
ApJServBalance set1 SPARK 4

ApJServHost PC1 ajpv12://192.168.0.51:7777
ApJServHost PC2 ajpv11://192.168.0.52:8888
ApJServHost PC3 ajpv11://192.168.0.53:9999
ApJServHost SPARK ajpv12://192.168.0.54:7777

ApJServRoute JS1 PC1
ApJServRoute JS2 PC2
ApJServRoute JS3 PC3
ApJServRoute sp1 SPARK

ApJServShmFile log/jserv_shm

ApJServSecretKey DISABLED
</IfModule>

5. Internals

5.1 Load-balancing algorithm

When Apache (re)starts, the configuration file (httpd.conf) is parsed. For each set (of servlet engines), a circular list of engines is created. Every engine is inserted exactly n times, where n is the weight as described above.
In our example, the circular list contains :
PC1 - PC2 - PC3 - SPARK - SPARK - SPARK - SPARK
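
As an illustration, the weighted circular list can be built like this in Python. It is a minimal sketch of the idea and not the actual mod_jserv data structure; build_ring and SET1 are names made up for the example.

```python
from itertools import cycle, islice


def build_ring(members):
    """Insert every engine exactly `weight` times, in configuration order."""
    ring = []
    for name, weight in members:
        ring.extend([name] * weight)
    return ring


# Mirrors the ApJServBalance lines of the example (default weight is 1).
SET1 = [("PC1", 1), ("PC2", 1), ("PC3", 1), ("SPARK", 4)]

ring = build_ring(SET1)
print(" - ".join(ring))

# Walking past the end wraps around to the first engine again.
first_nine = list(islice(cycle(ring), 9))
```

With these weights SPARK occupies 4 of the 7 slots, so a walk around the list reaches it four times as often as each PC.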

for each HTTP request coming for a balanced Servlet mount point:
do

    if this is the first HTTP request for a balanced Servlet mount point
    then
        process_mount_default_target = randomly chosen in set
    endif

    if a session cookie is found
    then
        the process finds the JServ which owns the session (using ApJServRoute parameters)
        if the JServ is not dead/stopped (in shared memory)
        then
            the process sends the request to this JServ.
            if the JServ replies
            then
                return
            else
                mark the JServ dead (in shared memory)
            endif
        endif
    endif

    # here we have either a request without a session cookie
    # or a broken session. In the latter case, we send the request
    # to another target in the list without telling the client.
    # (the application will have to notice the broken session anyway)

    if process_target != process_mount_default_target AND 
       our process_mount_default_target is alive (in shared memory)
    then
        process_target = process_mount_default_target
    endif
    while (process_target exists)
    do
        if process_target is alive (in shared memory)
        then
            the process sends the request to this JServ.
            if the JServ replies
            then
                return
            else
                mark the JServ dead (in shared memory)
            endif
        endif
        process_target = next in set
    done
done
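
The pseudocode above can be approximated by the following Python sketch. It makes several simplifying assumptions that are not in mod_jserv: a plain dict stands in for the shared memory, send is a hypothetical callback that returns True when the JServ replies, the default target is chosen when the object is created instead of on the first request, and the list is walked at most one full turn.

```python
import random


class Balancer:
    """One httpd process's view of a balanced mount point (illustrative)."""

    def __init__(self, ring, routes, alive, send, rng=random):
        self.ring = ring        # weighted circular list of JServ names
        self.routes = routes    # route id -> JServ name (ApJServRoute)
        self.alive = alive      # stand-in for shared memory: name -> bool
        self.send = send        # send(name) -> True if the JServ replied
        self.default = rng.randrange(len(ring))  # process default target
        self.current = self.default

    def _try(self, name):
        """Send to `name` if it is believed alive; mark it dead on failure."""
        if not self.alive.get(name, False):
            return False
        if self.send(name):
            return True
        self.alive[name] = False
        return False

    def handle(self, route_id=None):
        """Return the JServ that served the request, or None if none could."""
        # 1. A session cookie sticks to the JServ that owns the session.
        owner = self.routes.get(route_id)
        if owner is not None and self._try(owner):
            return owner
        # 2. No session, or a broken one: go back to the default target
        #    if it is alive.
        if self.alive.get(self.ring[self.default], False):
            self.current = self.default
        # 3. Otherwise walk the circular list until a JServ replies.
        for _ in range(len(self.ring)):
            name = self.ring[self.current]
            if self._try(name):
                return name
            self.current = (self.current + 1) % len(self.ring)
        return None
```

A request carrying the cookie of a stopped JServ is therefore served by another member of the set, and the stopped JServ is skipped by later requests until its entry in the shared state is set alive again.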

5.2. Watchdog process

while true
do
    sleep(10)

    if this process is not the default watchdog (in shared memory )
        break
    endif

    for each JServ present in shared memory list
    do
        if JServ.state = down
            connect JServ
            if failed
                continue
            else
                mark it alive in shared memory
            endif
        endif
    done
done
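
One pass of the watchdog loop can be sketched in Python as follows. This is an assumption-laden illustration: the states dict stands in for the shared memory file, the state characters are the ones listed in section 6, and probe and watchdog_pass are hypothetical names. Only socket.create_connection is a real library call.

```python
import socket


def probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def watchdog_pass(states, hosts, probe=probe):
    """Retry every JServ marked down ('-') and mark it up ('+') on success.

    Entries in any other state are left untouched, so a JServ stopped
    by the administrator is never revived by the watchdog.
    """
    for name, state in list(states.items()):
        if state == "-" and probe(*hosts[name]):
            states[name] = "+"
    return states
```

A real watchdog would call watchdog_pass in a loop with a sleep of about 10 seconds between passes, as in the pseudocode.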

5.3. Shared memory

6. Internal state - Administration tasks

UP (internally '+') : means this JServ is running.
DOWN (internally '-') : means JServ doesn't answer. (Watchdog is allowed to change the state back to UP)
SHUTDOWN_IMMEDIATE (internally 'X') : JServ is stopped by administrator, no traffic allowed.
SHUTDOWN_GRACEFUL (internally '/') : JServ is stopped by administrator, no new traffic allowed, but existing sessions still allowed. (example: will be stopped in 10 minutes).

Before a non-urgent administration task, the administrator changes the state from UP to SHUTDOWN_GRACEFUL, waits some minutes, then changes the state to SHUTDOWN_IMMEDIATE and performs the task. How long to wait depends on your application.

Once a JServ is set to SHUTDOWN*, the watchdog doesn't attempt to reconnect and set it UP again.

Before an urgent administration task, the administrator changes the state from UP to SHUTDOWN_IMMEDIATE, and then performs the task. This ensures that no request will be routed to this JServ.

If the web site has 10 Apache servers, the administrator has to modify the shared memory file on each of them.

After performing an administration task, the administrator changes the state from SHUTDOWN_IMMEDIATE back to UP.
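
The four states and the traffic each one allows can be summarized in a small Python model. It is a sketch of the rules described in this section, not the module's code; the State enum and both function names are invented for the example.

```python
from enum import Enum


class State(Enum):
    UP = "+"
    DOWN = "-"
    SHUTDOWN_IMMEDIATE = "X"
    SHUTDOWN_GRACEFUL = "/"


def accepts(state, has_session):
    """Tell whether a request may be routed to a JServ in `state`."""
    if state is State.UP:
        return True
    if state is State.SHUTDOWN_GRACEFUL:
        # No new traffic, but established sessions are still served.
        return has_session
    return False  # DOWN and SHUTDOWN_IMMEDIATE receive nothing


def watchdog_may_revive(state):
    """Only a JServ that stopped answering is put back UP automatically."""
    return state is State.DOWN
```

For a graceful stop, the state goes from UP to SHUTDOWN_GRACEFUL, then to SHUTDOWN_IMMEDIATE, and finally back to UP once the task is done.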

How to read/modify the shared memory file ?

New !

http://localhost/jserv/

7. Large sites

JServ1, JServ2, JServ3, JServ4, JServ5

JServ11, JServ12, JServ13, JServ14, JServ15

A client connects and obtains IP1 from the DNS for www.jserv.com.
The request is sent to Apache1.
Apache1 uses its load-balancing algorithm and chooses a JServ from its set (let's say JServ1).
JServ1 sets a (JServSessionId=xxxx.JS1) cookie in the client's browser.
Following requests from this client (session established) coming to Apache1 are routed to JServ1.
Apache1 dies.
Hopefully (thanks to our load-balancing equipment) the next request arrives on Apache2.
The cookie is recognized (JServSessionId=xxxx.JS1) and Apache2 sends the request to JServ1.
An Apache never sends requests outside its own set, except for established sessions.

<IfModule mod_jserv.c>
####################################################
# Apache JServ Configuration File for Apache A #
####################################################
# Note: this file should be appended to httpd.conf
ApJServManual on

ApJServMount /servlet balance://set1/zone1

# my own set of JServs for servlet zone "zone1"
ApJServBalance set1 JServ1
ApJServBalance set1 JServ2
ApJServBalance set1 JServ3
ApJServBalance set1 JServ4
ApJServBalance set1 JServ5

ApJServHost JServ1 ajpv12://192.168.0.51:7777
ApJServHost JServ2 ajpv12://192.168.0.52:8888
ApJServHost JServ3 ajpv12://192.168.0.53:9999
ApJServHost JServ4 ajpv12://192.168.0.54:7777
ApJServHost JServ5 ajpv12://192.168.0.55:7777
ApJServHost JServ11 ajpv12://192.168.0.61:7777
ApJServHost JServ12 ajpv12://192.168.0.62:7777
ApJServHost JServ13 ajpv12://192.168.0.63:7777
ApJServHost JServ14 ajpv12://192.168.0.64:7777
ApJServHost JServ15 ajpv12://192.168.0.65:7777

ApJServRoute JS1 JServ1
ApJServRoute JS2 JServ2
ApJServRoute JS3 JServ3
ApJServRoute JS4 JServ4
ApJServRoute JS5 JServ5
ApJServRoute JS11 JServ11
ApJServRoute JS12 JServ12
ApJServRoute JS13 JServ13
ApJServRoute JS14 JServ14
ApJServRoute JS15 JServ15

ApJServShmFile log/jserv_shm

ApJServSecretKey DISABLED
</IfModule>

<IfModule mod_jserv.c>
####################################################
# Apache JServ Configuration File for Apache B #
####################################################
# Note: this file should be appended to httpd.conf
ApJServManual on

ApJServMount /servlet balance://set1/zone1

# my own set of JServs for servlet zone "zone1"
ApJServBalance set1 JServ11
ApJServBalance set1 JServ12
ApJServBalance set1 JServ13
ApJServBalance set1 JServ14
ApJServBalance set1 JServ15

ApJServShmFile log/jserv_shm

ApJServSecretKey DISABLED
</IfModule>
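
With this many JServs, the directive lines are tedious to write by hand. The Python sketch below generates them. It assumes that every Apache carries the full ApJServHost and ApJServRoute tables, which the scenario above implies because Apache2 must be able to resolve the JS1 cookie to JServ1. The helper name jserv_config and the rule deriving the route identifier from the host name are made up for the example.

```python
# Addresses and ports copied from the Apache A example above.
HOSTS = {
    "JServ1": ("192.168.0.51", 7777),
    "JServ2": ("192.168.0.52", 8888),
    "JServ3": ("192.168.0.53", 9999),
    "JServ4": ("192.168.0.54", 7777),
    "JServ5": ("192.168.0.55", 7777),
    "JServ11": ("192.168.0.61", 7777),
    "JServ12": ("192.168.0.62", 7777),
    "JServ13": ("192.168.0.63", 7777),
    "JServ14": ("192.168.0.64", 7777),
    "JServ15": ("192.168.0.65", 7777),
}
SET_A = ["JServ1", "JServ2", "JServ3", "JServ4", "JServ5"]
SET_B = ["JServ11", "JServ12", "JServ13", "JServ14", "JServ15"]


def jserv_config(own_set, all_hosts, set_name="set1", zone="zone1"):
    """Return the mount, balance, host and route lines for one Apache."""
    lines = [f"ApJServMount /servlet balance://{set_name}/{zone}"]
    # Only the Apache's own JServs are load-balanced ...
    lines += [f"ApJServBalance {set_name} {name}" for name in own_set]
    # ... but every JServ is declared, so any session can be routed.
    lines += [
        f"ApJServHost {name} ajpv12://{addr}:{port}"
        for name, (addr, port) in all_hosts.items()
    ]
    # Hypothetical naming rule: JServ11 -> route identifier JS11.
    lines += [
        f"ApJServRoute JS{name[len('JServ'):]} {name}" for name in all_hosts
    ]
    return "\n".join(lines)


print(jserv_config(SET_B, HOSTS))
```

Generating the route lines also guarantees that every route identifier is unique, which hand-edited files can easily get wrong.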

8. Known problems

The maximum number of active JServs is limited to 25. To increase it, just increase the NB_MAX_JSERVS in jserv.h and recompile.

The full functionality works on Unix systems (tested on Solaris and Linux), but the NT port is only partial: there is no watchdog for the moment. Volunteers to work on this?
Today, on NT, we have load-balancing but NO shared memory.

The watchdog process is created (forked) at server startup by the main program (Unix only), so it inherits (too much, IMHO) its characteristics: it runs with the userid of Apache's main process
(which must be root for TCP ports < 1024). This is potentially dangerous, and even if exploits would be difficult to create (as this process doesn't listen for HTTP requests), it has to be fixed ASAP.

9. FAQ

Q: One of the JServs gets twice the load of the others
A: Yes. Because each httpd process chooses its default target at random, some JServs can end up as the default more often than others.
To get an even distribution, you need to run lots of httpd processes. This is designed for big sites, so ...
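
The effect can be explored with a short Python simulation. It is only a sketch under stated assumptions: each simulated httpd process draws its default target uniformly from the weighted circular list of the configuration example, and default_targets and shares are names invented here.

```python
import random
from collections import Counter

RING = ["PC1", "PC2", "PC3", "SPARK", "SPARK", "SPARK", "SPARK"]


def default_targets(n_processes, ring=RING, seed=None):
    """Count how many processes picked each JServ as default target."""
    rng = random.Random(seed)
    return Counter(rng.choice(ring) for _ in range(n_processes))


def shares(counts):
    """Convert the counts to fractions of the total."""
    total = sum(counts.values())
    return {name: counts[name] / total for name in sorted(counts)}


# Few processes: the split can be far from the 1:1:1:4 weights.
print(shares(default_targets(5)))
# Many processes: the split approaches the configured weights.
print(shares(default_targets(5000)))
```

With only a handful of processes, one JServ can easily be the default of two processes while another is the default of none. The imbalance shrinks as the number of processes grows.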

Q: My ApJServShmFile is corrupted
A: Remove it. Start Apache.

10. Tips

Deactivating load-balancing

configuration

# this one is NOT load-balanced
ApJServMount /svlet1 ajpv12://192.168.0.1:7777/sv
# this one IS
ApJServMount /servlet balance://set1/zone1

Using load-balancing without shared memory or watchdog

ApJServShmFile

######ApJServShmFile log/jserv_shm

Still using the load-balancing syntax :

ApJServMount /servlet balance://set1/zone1
.../...

and the shared memory will NOT be used between httpd processes, and there will be NO watchdog to detect JServs that come back to life.
