Saturday, 3 December 2011

JMeter DB benchmarks: Embedded Apache Derby vs. Network Apache Derby

As you probably know from my previous posts, the Apache Derby database has two modes of operation:

  • embedded - the database is started and owned by the process that starts it (the only way to connect to it is via the owning process; no other process can open the database from the outside world)
  • network server - the database is started as a standalone process to which one can connect via TCP/IP (JDBC, ij or any other Derby client). It allows multiple connections and multiple users. This mode is very similar to what other databases offer.
This article describes how to start and benchmark an existing Derby database in network server mode.

Network server Derby

It is not that difficult actually. What you need is the binary distribution, which you can download from the Apache Derby web site. There is also a great tutorial on how to start Derby in network mode: Apache Derby tutorial. One remark: in my case (Derby 10.8.2.2) it was important to set the DERBY_HOME variable (not DERBY_INSTALL), mainly because the setNetworkServerCP script expects DERBY_HOME rather than DERBY_INSTALL:

#!/bin/sh

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at

#   http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.

if [ -z "$DERBY_HOME" ]; then
  echo "Error: DERBY_HOME is not set. Please set the DERBY_HOME environment variable"
  echo "to the location of your Derby installation."
  return 1
fi

export CLASSPATH="${DERBY_HOME}/lib/derbynet.jar:${DERBY_HOME}/lib/derbytools.jar:${CLASSPATH}"

In short you need to invoke the following commands:


$ export DERBY_HOME=/home/krychu/mediate/experiments/derby/binaries/db-derby-10.8.2.2-bin
$ cd $DERBY_HOME/bin
$ . ./setNetworkServerCP
$ ./startNetworkServer &
[1] 12629
$ Sat Dec 03 13:21:59 CET 2011 : Security manager installed using the Basic Server security policy.
Sat Dec 03 13:22:00 CET 2011 : Apache Derby Network Server - 10.8.2.2 - (1181258) started and ready to accept connections on port 1527
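
Because the server is started in the background, a benchmark launched right away could reach it before it accepts connections. The helper below is a minimal sketch of a generic retry loop; `wait_until` and the `WAIT_SLEEP` variable are hypothetical names introduced here. The commented usage line assumes that `NetworkServerControl ping` from `$DERBY_HOME/bin` exits with a non-zero status while the server is unreachable.

```shell
# Retry a command until it succeeds or the attempt budget is used up.
# Usage: wait_until ATTEMPTS COMMAND [ARGS...]
# WAIT_SLEEP (seconds between attempts, default 1) is a hypothetical knob.
wait_until() {
  attempts=$1
  shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Succeed as soon as the probed command returns exit status 0.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "${WAIT_SLEEP:-1}"
  done
  return 1
}

# Example (assumes DERBY_HOME is set and the server was started as above):
# wait_until 10 "$DERBY_HOME/bin/NetworkServerControl" ping && echo "Derby is up"
```

The same helper can wrap any other probe, so it does not depend on the ping assumption being correct.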

Migrating existing DB

For a totally new installation you would usually create a new database, create the corresponding data model and do the provisioning. With an existing database (when you only change the mode of operation or upgrade to a newer version) you might instead want to migrate it to the network server mode.
In my exercise I wanted to migrate the DB from Apache Derby version 10.3.3.0 (so far used only in embedded mode) to 10.8.2.2. To do that you only need to copy the database directory to $DERBY_HOME/bin (the directory the network server was started from, which it uses as its system directory by default):

# cp -Rf /home/krychu/mediate/experiments/derby/mzdb $DERBY_HOME/bin
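
Copying a database that is still booted by another JVM can produce an inconsistent copy. The sketch below guards the copy step under two assumptions: a Derby database directory contains a `service.properties` file, and a `db.lck` file is present while the database is booted (it may also be left over after an unclean shutdown). The function name `copy_derby_db` is hypothetical.

```shell
# Copy a Derby database directory, refusing when it looks invalid or in use.
# Usage: copy_derby_db SRC_DB_DIR DEST_PARENT_DIR
copy_derby_db() {
  src=$1
  dest=$2
  # A Derby database directory is expected to contain service.properties.
  if [ ! -f "$src/service.properties" ]; then
    echo "Error: $src does not look like a Derby database" >&2
    return 1
  fi
  # db.lck is expected to exist while some JVM has the database booted.
  if [ -f "$src/db.lck" ]; then
    echo "Error: $src appears to be in use (db.lck found)" >&2
    return 1
  fi
  cp -R "$src" "$dest"/
}

# Example with the paths used in this article:
# copy_derby_db /home/krychu/mediate/experiments/derby/mzdb "$DERBY_HOME/bin"
```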

JMeter enhancements

Now JMeter needs the JDBC driver for connecting to the Derby network server. Again, you only need to copy a couple of jars to the JMeter lib directory:

# cp $DERBY_HOME/lib/* $JMETER_HOME/lib/
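
Copying the whole lib directory works, but for the network mode the client driver jar alone should be enough. The sketch below assumes that `derbyclient.jar` in the binary distribution carries the network client driver (`org.apache.derby.jdbc.ClientDriver`); the function name `install_derby_client` is hypothetical.

```shell
# Copy only the Derby network client driver jar into JMeter's lib directory.
# Usage: install_derby_client DERBY_HOME_DIR JMETER_HOME_DIR
install_derby_client() {
  jar="$1/lib/derbyclient.jar"
  target="$2/lib"
  if [ ! -f "$jar" ]; then
    echo "Error: $jar not found" >&2
    return 1
  fi
  if [ ! -d "$target" ]; then
    echo "Error: $target is not a directory" >&2
    return 1
  fi
  cp "$jar" "$target"/
}

# Example (assumes both variables are set):
# install_derby_client "$DERBY_HOME" "$JMETER_HOME"
```

If the embedded test plan from the previous article should keep working in the same JMeter installation, the embedded engine jar has to stay on the classpath as well.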

Configure the JDBC connection

Make sure that the Derby network server is running. By default it listens on port 1527 for incoming connections. A good approach is to make a simple connection test using ij, as in the example below:

$ $DERBY_HOME/bin/ij
ij version 10.8
ij> connect 'jdbc:derby://localhost:1527/mzdb;create=false;user=mzadmin;password=mz';
ij> 

If everything is OK, it is time to start JMeter and define the proper JDBC URL. I took the test case defined in the previous article and switched it from embedded mode to network mode by disabling the database configuration for embedded mode and adding a new one with the parameters below:

JDBC connection configuration
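
To make the difference between the two configurations concrete, the sketch below prints the driver class and URL for each mode. The network URL follows the form used in the ij test above. The embedded URL form and both driver class names are the usual ones for Derby 10.x rather than values taken from the test plan, and `derby_jdbc_settings` is a hypothetical helper name.

```shell
# Print the JDBC driver class and URL for a given Derby mode.
# Usage: derby_jdbc_settings MODE DB [HOST] [PORT]
derby_jdbc_settings() {
  mode=$1
  db=$2
  host=${3:-localhost}
  port=${4:-1527}
  case "$mode" in
    embedded)
      # Embedded mode: the database is opened inside the calling JVM.
      echo "driver=org.apache.derby.jdbc.EmbeddedDriver"
      echo "url=jdbc:derby:$db"
      ;;
    network)
      # Network mode: the client talks to the server over TCP/IP.
      echo "driver=org.apache.derby.jdbc.ClientDriver"
      echo "url=jdbc:derby://$host:$port/$db"
      ;;
    *)
      echo "Error: unknown mode '$mode'" >&2
      return 1
      ;;
  esac
}

derby_jdbc_settings embedded mzdb
derby_jdbc_settings network mzdb
```

The user name and password go into their own fields in the JMeter configuration element, or can be appended to the URL as in the ij example.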

As I mentioned, the test is organized as in the previous article:
  • read access - single, simple search by indexed field
  • write access - single write operation into one table
  • backup - in the background the backup procedure is invoked (total size of db: 285 MB)
All three groups run in parallel.

Apache Derby benchmark - JMeter testcase

Now everything is ready for the first benchmark. Do not forget to start the agents if you want to also monitor the CPU, memory, network and disk activity.

Benchmark results


The table below summarizes the results obtained on an HP EliteBook 8540W machine running Ubuntu 11.04 Natty. As you can see, the results are similar: slightly better for the network server, but I would still say within the margin of error. As expected, the network server consumes much more network bandwidth (which also proves that the communication goes over the loopback interface), and this will increase with the number of transactions.
As described previously, the test was run with all types of access triggered by separate thread groups. The backup procedure affects only the write rate, which in my opinion is expected since both are I/O intensive. While the backup was running, the write rate dropped by a factor of ~3, but the writes were still processed (not a single one was suspended or rejected). As far as reads are concerned, no difference was measured.

| Category | Apache Derby embedded | Apache Derby - network server |
|---|---|---|
| Read - single row during backup (Tps) | 27 000 | 27 000 |
| Read - single row (Tps) | 27 000 | 27 000 |
| Write - single row during backup (Tps) | 22 | 23 |
| Write - single row (Tps) | 65 | 70 |
| Network bandwidth - lo during backup (KB/s) | 2 | 15 |
| Network bandwidth - lo (KB/s) | 2 | 42 |
| Backup time (s) | 35 | 32 |
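
The factor of ~3 quoted above can be checked against the write rates in the results table. The following is a small sketch using awk for the division; `slowdown` is a hypothetical helper name.

```shell
# Ratio of two throughput figures, rounded to two decimals.
# Usage: slowdown NORMAL_TPS BACKUP_TPS
slowdown() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Write Tps without backup divided by write Tps during backup.
echo "embedded: $(slowdown 65 22)"
echo "network:  $(slowdown 70 23)"
```

This gives 2.95 for embedded mode and 3.04 for the network server, so both modes slow down by practically the same factor while the backup runs.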


Summary

As far as performance is concerned, it does not make a big difference whether Apache Derby runs in embedded mode or standalone (network server running on the same host, communicating via localhost), at least on my small machine (HP EliteBook 8540W). The differences might become more visible when larger amounts of data are transferred over the network, for example with queries returning multiple rows. However, taking into account other aspects such as simultaneous access by multiple clients, better serviceability, easier administration and configuration, and flexibility in deployment (it can run on the same server or on a remote one), I would choose the network server mode.