Oracle 12c Transaction Guard, Application Continuity … choose your degrees of freedom!

Last Saturday at Trivadis Tech Event, I presented on “Application Continuity, Transaction Guard … choose your degrees of freedom”.
What a strange title, you will be asking! What do Application Continuity (AC) and Transaction Guard (TG) have to do with freedom? 😉
And why talk about Transaction Guard, isn’t that just a technology used by AC? Surely it’s not something you would care to implement – if the language you’re working with is Java?

Well, there are – in addition to the extensive, but not really easy to digest Oracle documentation – some very helpful resources (blog articles, presentations) on the net, but these mostly focus on infrastructural aspects: How does AC work with RAC, with Data Guard, with RAC One Node? What does the DBA have to do to enable AC (or TG)?

The demos are mostly designed to demonstrate that AC “just works”, in different environments. As the client code is not in focus, often the easiest way of implementing AC on the client is chosen: using Oracle Universal Connection Pool (UCP). (I’ve encountered one notable exception, which is Laurent Leturgez’ very interesting post on AC resource usage.)

However, in real life, much will depend on the developer teams: Are they comfortable with making the necessary modifications? Do they trust the technology? What if they, for whatever reasons, use their own connection pool, and so can’t use UCP?

In this presentation, the focus is on the developers’ part. How the code looks / might look, and what pitfalls there are – what errors you might see if you don’t do it right, and what they mean. This is for both AC and TG.

Let’s assume, however, that you’re VERY impatient and just want to know what the “main thing” is here 😉 … I’d say it’s about TG.

As of today, I’m not aware of any Java code on the web implementing TG that is NOT from Oracle documentation / whitepapers. Of course, as the topic is not easy and probably a bit “scary”, we are thankful for the example code Oracle provides. In the whitepaper “Transaction Guard with Oracle Database 12c”, Oracle provides the following code example, which shows how it works:

Connection jdbcConnection = getConnection();
boolean isJobDone = false;
while (!isJobDone) {
    try {
        // the transactional work itself (DML plus commit) goes here
        isJobDone = true;
    } catch (SQLRecoverableException recoverableException) {
        // close the dead connection, ignoring any error
        try {
            jdbcConnection.close();
        } catch (Exception ex) {}
        Connection newJDBCConnection = getConnection();
        LogicalTransactionId ltxid = ((OracleConnection) jdbcConnection).getLogicalTransactionId();
        isJobDone = getTransactionOutcome(newJDBCConnection, ltxid);
        jdbcConnection = newJDBCConnection;
    }
}

Basically we have a loop around our transaction. Normally that loop is left immediately. But in case we receive a recoverable exception, we get a new connection, obtain the Logical Transaction ID from the dead connection, and ask the database server for the transaction outcome for that LTXID. If the commit went through successfully, we’re done, otherwise we resubmit our transaction.

Now while this demonstrates how to do it, we do not want to clutter our code like this everywhere, do we? And fortunately, with Java 8, we don’t have to!

In Java 8, we have Functional Interfaces. Formally, a functional interface is an interface with exactly one explicitly declared abstract method. This abstract method can be implemented directly inline, using a lambda expression. That is, the lambda expression IS an implementation of the functional interface.
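As a minimal, self-contained illustration (the Transformation interface and the salary-raise lambda below are made up for this post, they are not part of the presentation code):

```java
// A functional interface: exactly one explicitly declared abstract method.
interface Transformation {
    int apply(int input);
}

public class FunctionalInterfaceDemo {
    public static void main(String[] args) {
        // The lambda expression IS an implementation of Transformation.
        Transformation raiseByTenPercent = salary -> salary + salary / 10;
        System.out.println(raiseByTenPercent.apply(1000)); // prints 1100
    }
}
```

Any method that accepts a Transformation can now be handed a lambda, which is exactly the mechanism used below to hand business methods to a transaction processor.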

This allows us to separate transaction handling from the business methods, and get rid of the above loops. How?

On the one hand, this is how one business method could look:

private void updateSalaries(Connection conn) throws SQLException {
    String query = "select empno, sal from tg.emp";
    PreparedStatement stmt = conn.prepareStatement(query, ResultSet.TYPE_SCROLL_SENSITIVE, ResultSet.CONCUR_UPDATABLE);
    ResultSet rs = stmt.executeQuery();
    while (rs.next()) {
        int oldsal = rs.getInt("sal");
        int newsal = calculateNewValue(oldsal);
        rs.updateInt("sal", newsal);
        rs.updateRow();
    }
    stmt.close();
}

On the other, here we have a functional interface:

public interface Transaction {
    public void execute(Connection connection) throws SQLException;
}

One implementation of a transaction can be a lambda expression that wraps the updateSalaries business method, like this: (conn -> updateSalaries(conn)):

TGTransactionProcessor tp = new TGTransactionProcessor(url, appUser, appPasswd);
if (tp.process(conn -> updateSalaries(conn))) {
    logger.fine("Salaries updated.");
}

public boolean process(Transaction transaction) throws SQLException {

    boolean done = false;
    int tries = 0;
    Connection conn = getConnection();

    while (!done && tries <= MAXRETRIES) {

        try {
            transaction.execute(conn);
            done = true;

        } catch (SQLRecoverableException e) {

            // close the dead connection, ignoring any error
            try {
                conn.close();
            } catch (Exception ex) {}
            LogicalTransactionId ltxid = ((OracleConnection) conn).getLogicalTransactionId();
            Connection newconn = getConnection();
            setModule(newconn, moduleName);
            done = isLTxIdCommitted(ltxid, newconn);
            if (done) {
                logger.info("Failed transaction had already been committed.");
            } else {
                logger.info("Replay of transaction necessary.");
                tries++;
                conn = newconn;
            }
        }
    }
    return done;
}

So with Java 8 Functional Interfaces, we have an elegant way to separate business logic and transaction handling in general, and implement TG in Java, specifically.
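To see the pattern in isolation, here is a toy, Oracle-free simulation of such a processor. The names (SimTransaction, process, MAXRETRIES) merely mirror the code above; the recoverable error is simulated with a plain SQLException and there is no commit-outcome check, so this sketch says nothing about the actual TG API:

```java
import java.sql.SQLException;

// Toy stand-in for the Transaction functional interface above.
interface SimTransaction {
    void execute() throws SQLException;
}

public class RetryDemo {
    static final int MAXRETRIES = 3;

    // Retries the transaction on a (simulated) recoverable error,
    // the same loop shape as TGTransactionProcessor.process.
    static boolean process(SimTransaction tx) {
        boolean done = false;
        int tries = 0;
        while (!done && tries <= MAXRETRIES) {
            try {
                tx.execute();
                done = true;   // success: leave the loop
            } catch (SQLException e) {
                tries++;       // in real TG: first ask the server for the LTXID outcome
            }
        }
        return done;
    }

    public static void main(String[] args) {
        int[] failuresLeft = {2};   // fail twice, then succeed
        boolean ok = process(() -> {
            if (failuresLeft[0]-- > 0) throw new SQLException("simulated recoverable error");
        });
        System.out.println(ok);     // prints true
    }
}
```

The business logic stays a lambda; only the processor knows about retries, which is the separation the real TG code above achieves.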
So that’s the end of the “highlights of” section, for more information just have a look at the slides :-).

Deadlock parser – parsing lmd0 trace files

Does “your” application produce deadlocks often? Or, to phrase it more diplomatically, a little bit … well, too often, from time to time at least?
If so, you might want to provide developers with useful debug information – what SQL was executed, where from, etc.
However, in RAC, the lmd0 trace files, which log the deadlocks, don’t give away information lightly – in 11g at least (things are different in 12c; to be precise, I am talking about 11g here only, as I do not currently have access to a 12c instance).

Basically, what you need is the lmd0 trace files from all instances, and then to combine information from three sections of data:

(1) the deadlock graph (“Wait-For-Graph”), which is written in one instance only (per deadlock) and looks like this …

Global Wait-For-Graph(WFG) at ddTS[0.b650] :
BLOCKED 0x74a6d83f0 5 wq 2 cvtops x1 TX 0x3e0003.0x7b225(ext 0x6,0x0)[9A000-0004-0000001C] inst 4 
BLOCKER 0x74ede07b8 5 wq 1 cvtops x28 TX 0x3e0003.0x7b225(ext 0x6,0x0)[AC000-0003-000001B4] inst 3 
BLOCKED 0x74edebdf8 5 wq 2 cvtops x1 TX 0xd00020.0x41ea(ext 0x39,0x0)[AC000-0003-000001B4] inst 3 
BLOCKER 0x6fbc1f388 5 wq 1 cvtops x28 TX 0xd00020.0x41ea(ext 0x39,0x0)[9A000-0004-0000001C] inst 4 

*** 2014-03-12 10:36:49.760
* Cancel deadlock victim lockp 0x74a6d83f0 

… and allows linking – via the resource name (here, e.g., “TX 0x3e0003.0x7b225”) – to (2) a section containing resource information (including the granted queue and the convert queue, at the bottom) …

Global blockers dump start:---------------------------------
DUMP LOCAL BLOCKER/HOLDER: block level 5 res [0x3e0003][0x7b225],[TX][ext 0x6,0x0]
----------resource 0x75dddead8----------------------
resname       : [0x3e0003][0x7b225],[TX][ext 0x6,0x0]
hash mask     : x3
Local inst    : 4
dir_inst      : 3
master_inst   : 3
hv idx        : 124
hv last : 65
current inc   : 65
hv status     : 0
hv master     : 3
open options  : dd 
Held mode     : KJUSERNL
Cvt mode      : KJUSEREX
Next Cvt mode : KJUSERNL
msg_seq       : 0x1
res_seq       : 23
grant_bits    : KJUSERNL 
count         : 1         0         0         0         0         0
val_state     : KJUSERVS_NOVALUE
valblk        : 0xf0655697ff7f00000000000000000000 .eV
access_inst   : 3
vbreq_state   : 0
state         : x8
resp          : 0x75dddead8
On Scan_q?    : N
Total accesses: 509
Imm.  accesses: 440
Granted_locks : 0 
Cvting_locks  : 1 
value_block:  f0 65 56 97 ff 7f 00 00 00 00 00 00 00 00 00 00
lp 0x74a6d83f0 gl KJUSERNL rl KJUSEREX rp 0x75dddead8 [0x3e0003][0x7b225],[TX][ext 0x6,0x0]
  master 3 gl owner 0x75d451a00 possible pid 11039 xid 9A000-0004-0000001C bast 0 rseq 23 mseq 0 history 0x495149da
  convert opt KJUSERGETVALUE  

… and, via the lock address (here, e.g., “0x74a6d83f0”), to (3) a section containing lock details, such as grant and request level, the sid, the OS user and oracle username, as well as the client machine and, most interestingly for the application developer, the SQL:

----------enqueue 0x74a6d83f0------------------------
lock version     : 14509
Owner inst       : 4
grant_level      : KJUSERNL
req_level        : KJUSEREX
bast_level       : KJUSERNL
notify_func      : (nil)
resp             : 0x75dddead8
procp            : 0x75d9c7eb8
pid              : 11039
proc version     : 36
oprocp           : (nil)
opid             : 11039
group lock owner : 0x75d451a00
possible pid     : 11039
xid              : 9A000-0004-0000001C
dd_time          : 10.0 secs
dd_count         : 1
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : Y
lock_state       : OPENING CONVERTING 
ast_flag         : 0x0
Open Options     : KJUSERDEADLOCK 
Convert options  : KJUSERGETVALUE 
History          : 0x495149da
Msg_Seq          : 0x0
res_seq          : 23
valblk           : 0x566a560a000000000000646100000000 VjVda
user session for deadlock lock 0x74a6d83f0
  sid: 1686 ser: 50887 audsid: 5210592 user: 110/APPUSER
    flags: (0x41) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
    flags2: (0x40009) -/-/INC
  pid: 154 O/S info: user: oracle, term: UNKNOWN, ospid: 11039
    image: oracle@inst3
  client details:
    O/S info: user: someosuser, term: unknown, ospid: 1234
    machine: program: JDBC Thin Client
  current SQL:
  update tab1 set the_user='somename' where the_type='sometype' and the_date=TO_TIMESTAMP ('2014-02-27 00:00:00.0', 'YYYY-MM-DD HH24:MI:SS.F
DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK
  possible owner[154.11039] on resource TX-003E0003-0007B225

Now surely you can’t just hand the developers these lmd0 traces, so it seemed to make sense to write a parser for them, which outputs the type of information a developer would like to see (I guess ;-)). (I’m stressing the word developer here, because a DBA might be interested in additional information, like the grant and convert queues, information that is currently parsed, but not printed by the parser.)
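The actual parser is written in Haskell; just to illustrate the first parsing step – pulling the lock address, the resource name and the instance out of the BLOCKED/BLOCKER lines of the Wait-For-Graph – here is a sketch using a regular expression (shown in Java; the class and the group layout are mine, not the parser’s):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WfgLineParser {
    // Matches WFG lines such as:
    // BLOCKED 0x74a6d83f0 5 wq 2 cvtops x1 TX 0x3e0003.0x7b225(ext 0x6,0x0)[9A000-0004-0000001C] inst 4
    public static final Pattern WFG_LINE = Pattern.compile(
        "(BLOCKED|BLOCKER)\\s+(0x[0-9a-f]+)\\s+.*?(TX\\s+0x[0-9a-f]+\\.0x[0-9a-f]+).*?inst\\s+(\\d+)");

    public static void main(String[] args) {
        String line = "BLOCKED 0x74a6d83f0 5 wq 2 cvtops x1 "
                    + "TX 0x3e0003.0x7b225(ext 0x6,0x0)[9A000-0004-0000001C] inst 4";
        Matcher m = WFG_LINE.matcher(line);
        if (m.find()) {
            System.out.println(m.group(1)); // role:          BLOCKED
            System.out.println(m.group(2)); // lock address:  0x74a6d83f0
            System.out.println(m.group(3)); // resource name: TX 0x3e0003.0x7b225
            System.out.println(m.group(4)); // instance:      4
        }
    }
}
```

The lock address then keys the enqueue-detail section, and the resource name the resource section, exactly as described above.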

The output from the parser looks like this:

***              Deadlocks             ***
First in tracefiles:   2014-03-10 12:16:17
Last in tracefiles:    2014-04-01 03:18:34
Deadlocks encountered: 22

Deadlock at: 2014-03-12 10:36:49
  Address: 74ede07b8    [Resource: 0x3e0003-0x7b225 TX]
  Session id: 677
  User: appuser
  Current SQL: update tab1 set the_user='somename' where the_type='sometype' and the_date=TO_TIMESTAMP ('2014-02-27 00:00:00.0', 'YYYY-MM-DD HH24:MI:SS.F

  Address: 6fbc1f388    [Resource: 0xd00020-0x41ea TX]
  Session id: 1686
  User: appuser
  Current SQL: update tab2 set the_info='someinfo' where the_id=5763836

  Address: 74a6d83f0    [Resource: 0x3e0003-0x7b225 TX]
  Session id: 1686
  User: appuser
  Current SQL: update tab2 set the_info='someinfo' where the_id=5763836

  Address: 74edebdf8    [Resource: 0xd00020-0x41ea TX]
  Session id: 677
  User: appuser
  Current SQL: update tab1 set the_user='somename' where the_type='sometype' and the_date=TO_TIMESTAMP ('2014-02-27 00:00:00.0', 'YYYY-MM-DD HH24:MI:SS.F

To be honest, this is an uncommonly neat example, at least judging from the “real life” tracefiles I’ve been using for testing purposes. For other deadlocks, the output can be much more difficult to understand.
For one thing, detail information (like SQL, machine …) might be missing for one or several enqueues. Second, especially when TM locks are involved, it might be difficult to understand what really happened.
This is where knowledge about the application should help, and the current output generated by the parser is, in fact, intended to be used by developers who would match it to the application server logs.

On the other hand, the parser could be extended to display information that is relevant to the DBA, or in fact produce different output dependent on command line flags. (Currently I am, for example, parsing the grant and convert queues, but not displaying that to the user.)

If anyone would like to try the parser, which is written in Haskell, the source is on GitHub. I can also provide a compiled version – so you don’t need to be running Haskell, which most probably you aren’t 😉 – if you’re using 64-bit Linux.

For more information on deadlocks, I think the must-reads (as always, of course) are the “deadlock series” posts by Jonathan Lewis, especially the classic one that points out the importance of how the application handles the situation (so as a DBA, when you see all those deadlocks occurring, you might well ask the developers how the application handles the ORA-00060, instead of just leaning back knowing this does not cause a database problem … 😉 )

RAC on a laptop, using KVM and libvirt

There are already some guides on the net on how to run RAC on a laptop – but AFAIK, they involve either VirtualBox or VMware for virtualization – at least there’s none using KVM that I know of. So I thought I’d add my experiences using KVM and the libvirt library. I don’t intend this to be a howto – there are still open questions, quirks and workarounds in my setup. Notwithstanding, perhaps, by just basically giving an account of what I’ve done, I can spare others a little time and frustration … So, let’s start.

Preparing the host

The host OS, in my case, is Fedora 17, running on an Intel I7 quadcore with hyperthreading and hardware virtualization, and 16G of memory.
In addition to the basic libvirt and KVM libraries (plus the graphical interface, virt-manager), it makes sense to install the libguestfs tools. Libguestfs provides commands like virt-filesystems and virt-df to display information on the guests’ physical and logical volumes, and the extremely useful virt-resize that you can use to enlarge the guest disk on the host, at the same time enlarging the physical volume in the guest.

yum install qemu-kvm libvirt python-virtinst virt-manager
yum install '*guestf*'

Guests overview

In my setup, there are 3 guests: 2 cluster nodes and a third machine combining the functions of DNS server and iSCSI target, providing the shared storage.
The cluster nodes are running Scientific Linux (SL) 5.8, the storage/DNS server SL 6.3. (I plan to set up a 6.3 cluster next, but for starters it seemed to make sense to be able to use asmlib…).
The easiest way to create a guest is using virt-manager – which is what I did. However, to check on and later modify the configurations, I’ve used virsh dumpxml and virsh edit, and I’ll quote extracts of the XML dumps in the following.

Defining the networks

In virt-manager, prior to installing the guests, you need to define the networks. By default, every guest will be part of the default network, which uses NAT and DHCP to provide the guest IPs. Through this network, the guests will be able to access the internet and the host, but not each other.
So, 2 additional bridged networks will be created (called isolated in virt-manager). Through bridging, guests on these networks will be able to communicate with each other and the host, but not the outside world.

This is how the cluster’s public network looks in the resulting configuration:

[root@host ~]# virsh net-dumpxml priv0
<network>
  <name>priv0</name>
  <bridge name='virbr1' stp='on' delay='0' />
  <mac address='52:54:00:A2:FA:79'/>
  <ip address='' netmask=''>
  </ip>
</network>

The network to be used for the private interconnect looks basically the same (of course, the mac address and ip differ) – from the libvirt point of view, both are private, that is bridged/”non-natted” networks.

Installing the guests

When creating the cluster nodes (yes, unfortunately, the plural is no typo – I didn’t manage to successfully clone the first node and had to install the second in the same way; see the Open questions section below …), in virt-manager you need to add two additional interfaces, one for each of the configured bridged networks.
This is how one of them, priv0, will look in the guest configuration file:

[root@host ~]# virsh dumpxml sl58.1 | xpath /domain/devices/interface[3]
Found 1 nodes:
-- NODE --
<interface type="network">
<mac address="52:54:00:44:02:a3" />
<source network="priv0" />
<target dev="vnet4" />
<model type="virtio" />
<alias name="net2" />
<address type="pci" domain="0x0000" bus="0x00" slot="0x05" function="0x0" />
</interface>

The interfaces will be recognized by SL during the installation process. Installation completed, one should disable Network Manager in the guest so it cannot overwrite the network configuration. For reference, this is how my ifcfg-eth* scripts look for the three interfaces on the first cluster node:

# Virtio Network Device

# Virtio Network Device

# Virtio Network Device

For the iSCSI target & DNS server, just one interface has to be added, the one for the public network. So there will be just eth0 and eth1 on that vm.

Setting up the DNS

For DNS, I’ve installed bind on the combined DNS & storage node.
In /var/named/, four zone files need to be added, one each for the cluster domain, localhost, cluster domain reverse lookup, and localhost reverse lookup.
For this, I basically followed the instructions in Steve Shaw’s and Martin Bach’s excellent “Pro Oracle RAC on Linux”. For a quick reference, this is how my cluster domain zone file and /var/named/192.168.100.rev look – included in these two are the nodes’ public IPs, the VIPs, and the SCAN IPs:

$TTL    86400
@               IN SOA root.localhost. (
42              ; serial
3H              ; refresh
15M             ; retry
1W              ; expiry
1D )            ; minimum
@               IN NS
localhost       IN A
stor1           IN A
node1           IN A
node2           IN A
node1-vip       IN A
node2-vip       IN A
cluster1-scan   IN A

$TTL    86400
@               IN SOA root.localhost. (
42              ; serial
3H              ; refresh
15M             ; retry
1W              ; expiry
1D )            ; minimum
@               IN NS
2               IN PTR
10              IN PTR
11              IN PTR
20              IN PTR
21              IN PTR

It’s important to check both hostname lookup and reverse lookup (using nslookup or dig). Errors in the reverse lookup zone file can lead to OUI complaining with “INS-40904 ORACLE_HOSTNAME does not resolve to a valid host name”, which is in fact a misleading error message as you don’t have to set this environment variable when reverse lookup works.
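The two directions can also be checked programmatically. The sketch below deliberately uses only the loopback address (the 192.168.100 addresses exist only inside my setup), so it runs anywhere; against the cluster DNS you would substitute the node names and IPs:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class LookupCheck {
    public static void main(String[] args) throws UnknownHostException {
        // Forward lookup: name -> address.
        InetAddress fwd = InetAddress.getByName("localhost");
        System.out.println(fwd.getHostAddress());

        // Reverse lookup: address -> name (falls back to the literal
        // address if no PTR record or hosts entry is found).
        InetAddress rev = InetAddress.getByAddress(new byte[] {127, 0, 0, 1});
        System.out.println(rev.getCanonicalHostName());
    }
}
```

If the reverse direction only ever returns the literal address for a node IP, the PTR records in the .rev zone file are the first thing to double-check.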
One thing to pay attention to, for those new to bind like me: don’t forget the dot after the hostname. It signifies to bind that a FQDN has been defined.

In /etc/named.conf, I’ve added zone blocks for these 4 zones, and defined this server to be the master DNS server for the 192.168.100 domain. For records not in this domain, the server forwards to the host on the default network.

options {
        directory               "/var/named";
        dump-file               "data/cache_dump.db";
        statistics-file         "data/named_stats.txt";
        memstatistics-file      "data/named_mem_stats.txt";
        listen-on port 53       {;; };
        allow-query { localhost;; };
        recursion yes;
        forwarders {; };
};

zone "" {
        type master;
        file "";
};
zone "localhost" {
        type master;
        file "master.localhost";
};
zone "" {
        type master;
        file "192.168.100.rev";
};
zone "" {
        type master;
        file "localhost.rev";
};

Finally, to make it work, the DNS & storage server is configured as the DNS server to be contacted on the cluster nodes, in /etc/resolv.conf:


Preparing the storage

On the storage server, iscsi-target-utils is installed and the service is started:

[root@stor1 ~]# yum install iscsi-target-utils
[root@stor1 ~]# service tgtd start
[root@stor1 ~]# chkconfig tgtd on

In order to define the LUNs, I need to provide logical devices on the storage server. For convenience, what I did was add a second disk in virt-manager, resulting in a second physical volume in the vm, and from this create a second volume group. From that volume group, I then created one logical volume for every LUN I want to export.

Prior to the LUNs themselves, I create a target:

[root@stor1 ~]# tgtadm --lld iscsi --mode target --op new --tid 1 --targetname

Then, the LUNs are bound to the target:

[root@stor1 ~]# tgtadm --lld iscsi --mode logicalunit --op new --tid 1 --lun 1 -b <path of logical volume>

All LUNs defined, the configuration is written to /etc/tgt/targets.conf:

[root@stor1 ~]# tgt-admin --dump > /etc/tgt/targets.conf

This is how it looks in my case:

[root@stor1 ~]# less /etc/tgt/targets.conf
default-driver iscsi
backing-store /dev/vg_stor/vol1
backing-store /dev/vg_stor/vol2
backing-store /dev/vg_stor/vol3
backing-store /dev/vg_stor/vol4
backing-store /dev/vg_stor/vol5
backing-store /dev/vg_stor/vol6
backing-store /dev/vg_stor/vol7
backing-store /dev/vg_stor/vol8

On the clients, having installed iscsi-initiator-utils, I can check whether discovery of the target works:

iscsiadm --mode discovery --type sendtargets --portal,1

Finally, I’ve added the new storage pool in virt-manager, and then added the LUNs as additional storage to the client vms. Actually, the latter step proved tedious and didn’t quite work in virt-manager, so the easier way was to add one LUN in virt-manager and then, using virsh edit, define the others based on that example. This is, for example, how the first 2 LUNs look on one client:

virsh dumpxml sl58.1 | xpath /domain/devices/disk
<disk type="block" device="disk">
<driver name="qemu" type="raw" />
<source dev="/dev/disk/by-path/" />
<target dev="vdb" bus="virtio" />
<alias name="virtio-disk1" />
<address type="pci" domain="0x0000" bus="0x00" slot="0x0b" function="0x0" />
</disk>-- NODE --
<disk type="block" device="disk">
<driver name="qemu" type="raw" />
<source dev="/dev/disk/by-path/" />
<target dev="vdc" bus="virtio" />
<alias name="virtio-disk2" />
<address type="pci" domain="0x0000" bus="0x00" slot="0x0c" function="0x0" />
</disk>-- NODE --

Now on the clients, the new devices will appear as /dev/sd* and can be partitioned as usual.

And the rest is – Oracle

On the partitioned devices, I’ve used asmlib to create the ASM disks, and proceeded to install GI. Here, I ran into a problem: the OUI “flashing-and-blinking bug”. Actually, this is related to screen resolution, which, according to the GI installation guide, has to be at least 1024x768. With the default video driver chosen for the vm during install, qxl, it was not possible to select any resolution higher than 800x600, but switching to vga and editing the xorg.conf on the vm did the trick – there are very helpful instructions for this on the net.
With the screen resolution problem out of the way, the GI install went as usual, and the database home install as well … but with clusterware running, performance was horrible. With iowaits of 50-60% on the OS disks (not the ASM disks!), installing iotop

yum install iotop

revealed the culprit: ologgerd. Reducing the log level

oclumon debug log ologgerd allcomp:1

leaves iowait at around 10%, and performance is very acceptable. And that’s it so far – I have a working cluster, and first experience with KVM, DNS, and iSCSI …
Of course, there are also lots of things left to do more economically, in a more automated way, another time.

Open questions

The most annoying problem, and certainly one to be investigated further, was the inability to successfully clone a vm.
After cloning (done from virt-manager), starting in single mode, editing the network configuration (change IPs and MAC addresses), the bridged interfaces did not come up, saying “RTNetlink answers: No such device“. This looks like a udev problem – but I did not find any MAC addresses hard coded in the udev configuration files. Of course, installing every RAC node from scratch is a no-go, so there’s certainly some problem to be solved…