Tune the App, Not the SQL – DBA Sherlock’s Adventures in Hibernate/jOOQ Land

Last weekend at Trivadis Tech Event, in addition to my session on Oracle 12c Client High Availability for Java (“Application Continuity, Transaction Guard … choose your degrees of freedom”), I gave a quite different talk called “Tune the App, Not the SQL – DBA Sherlock’s Adventures in Hibernate/jOOQ Land”.

In a nutshell, what is it all about? Well, if you’re a DBA, you might sometimes (or often, or seldom – depends on your job ;-)) do what we call “SQL Tuning”. It might look like this: You get a call that the application “is slow”. Fortunately, you’re able to narrow that down quickly, and it’s just “the click on that button” that is slow. Fortunately, too, the application is well instrumented, and getting traces for specific tasks is not a problem. So you get a SQL trace file, run tkprof on it, and obtain the problematic SQLs in order of elapsed time. Then you analyse the “Top SQL”. You may find it should be rewritten, or that it would benefit from an index. Perhaps adding additional data structures might help, such as creating a materialized view. Perhaps you’re unlucky and just an ugly hint will fix it.
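In command form, that workflow might look roughly like this (just a sketch – the trace file name is made up, and the tkprof sort options are only one sensible choice):

-- in the session to be traced (or via dbms_monitor for another session)
alter session set tracefile_identifier = 'slow_button';
exec dbms_monitor.session_trace_enable(waits => true, binds => false);
-- ... reproduce the slow "click on that button" ...
exec dbms_monitor.session_trace_disable;

$ tkprof ORCL_ora_12345_slow_button.trc slow_button.txt sort=prsela,exeela,fchela sys=no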

So you tell the developers “please rewrite that, like this”, or you create that index. And afterwards, it’s a matter of waiting and hoping that performance will improve – noticeably, that is. Because tuning that single statement might – if you’re unlucky – not make that much of a difference.

There is, however, another approach to tuning application performance (always talking about the DB-related part of it here, of course), and that has to do with the application logic, not single statements. There is an excellent demonstration of this in Stéphane Faroult’s highly recommendable book “Refactoring SQL Applications”; in fact, it is right at the beginning of the book and something that immediately “draws you in” (it was like that for me :-)).

Application logic affects what, when, and how much data is fetched. Of course, there are many aspects to this – for example, it just may not be possible to replace two DB calls by one simply because another service has to be queried in between. Also, there will have to be a tradeoff between performance and readability/maintainability of the code. But often there will be a choice. And as you will see in the presentation, combining two queries into one does not always result in better performance.
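Just to make the idea concrete (a generic illustration, not an example from the talk – table and column names are made up): whether the application issues two round trips,

select o.customer_id, o.amount from orders o where o.order_id = :id;
-- ... application code looks at customer_id, then issues a second call ...
select c.name from customers c where c.customer_id = :customer_id;

or a single one,

select o.amount, c.name
from orders o
join customers c on (c.customer_id = o.customer_id)
where o.order_id = :id;

is decided in the application code, not in any individual statement – and which variant actually wins is exactly the kind of thing to measure rather than assume.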

In fact, it all depends. So the first conclusion is the ubiquitous “don’t just assume, but test”.

There is another aspect to this, though. While Stéphane Faroult, in his test case, uses plain JDBC, I am using – and comparing, in a way – two commonly used frameworks: Hibernate and jOOQ. (For an “interdisciplinary introduction” to Hibernate, see my talk from the previous Tech Event, “Object Relational Mapping Tools – let’s talk to each other!”. Quite in contrast, jOOQ is a lightweight, elegant wrapper framework providing type safety and near-100% control over the generated SQL.)

Now while for a DBA looking at a raw trace file, it will always be a challenge to “reverse engineer” application logic from the SQL sent to the database (even if plain JDBC or jOOQ are being used), the difficulties rise to a new dimension with Hibernate :-). In the talk, I am showing a DBA – who doesn’t need to be convinced any more about the importance of application logic – trying to make sense of the flow of statements in a trace file: the only information he’s got. And it turns out to be very difficult …

But as he’s not just “some DBA”, but Sherlock, he of course has his informants and doesn’t remain stuck with his DB-only view of the world – which brings me to one of my ceterum censeo’s, which is “DBAs and developers, let’s talk to each other” :-).

The slides are here.


Peeking into Oracle using Linux tracing tools – five tools in five minutes

Five tools in five minutes?

What’s that supposed to mean, you will be asking! Well, one week ago at Trivadis Tech Event, I had the honor of being invited to participate in the Oracle Lightning Talks session organised by my colleague Ludovico Caldara, where every topic has to be presented in five minutes (which of course makes for quite an interesting “listening experience” for the audience …).
To sort of “maximize the mismatch”, I chose to present on Linux tracing tools – not one (normally you’d need hours to explain just one) – but five, one for every minute 😉

So here’s my personal list of appetizers, you’ll find some links at the end to really get you started:

strace

So, strace, you will be thinking, that sounds a bit boring, doesn’t it? Well, perhaps, but it’s a good tool to start an investigation. Oracle has its “wait event lingo” – but what’s going on underneath?
Here, for example, we’re stracing the CKPT process, which says it’s doing a control file parallel write. With strace we can see the asynchronous IO request it issued (io_submit), the two file descriptors it was issued against, and how the process waits for completion in a blocking IO call (io_getevents):

$ strace -tt -f -e write=all -v -s 100 -p $(pgrep ora_ckpt)

So strace can be useful, but beware, it will slow down the process being traced – check out Brendan Gregg’s post “strace Wow Much Syscall” (nice title, isn’t it ;-)) listed in the appendix. Speaking of whom …

perf-tools (https://github.com/brendangregg/perf-tools)

… Brendan Gregg has a very nice collection of scripts on github, following the unix/linux philosophy, “one tool one purpose”.
Here, for example, I’m using iolatency to get a meaningful representation of IO latencies to a voting disk on one of my VMs:

$ lsblk /dev/vdb
$ ./iolatency -d 251,16

Or check out execsnoop, which shows newly created processes on the system – just watch how clusterware is constantly monitoring itself and others …

$ ./execsnoop

Useful little scripts, you could write them yourself!
Just have a look at what’s to be discovered in …

The /proc filesystem

Its content doesn’t always come as nicely formatted as the /proc/meminfo us DBAs are used to ;-), but it may be possible to look into the kernel source to figure out what it means… For example, you might look at fs/proc/stat.c to decipher this:

$ cat /proc/stat

And perhaps even more helpful, we have lots of per-process information under /proc/<pid>/.
For example, process memory mappings, like this one here of the Oracle log writer:

$ cat /proc/$(pgrep ora_lgwr)/maps

Ok. Two left, time for the big shots… Enter –

perf

Perf has many uses, but one of the most frequent probably is sampling on-CPU functions. Say I want to determine how much overhead setting statistics_level to all incurs.
I’m executing a simple two-table join, first with statistics_level set to all, and then with statistics_level set to typical, and for both, I sample what runs on CPU using perf. The result: not much of a difference!

$ perf record -p <pid> -o all_hj.data sleep 30
$ perf record -p <pid> -o typ_hj.data sleep 30
$ perf diff typ_hj.data all_hj.data

Well. Oracle was doing a hash join. Now I do the same, forcing a nested loop join, what happens?

$ perf record -p <pid> -o all_nl.data sleep 30
$ perf record -p <pid> -o typ_nl.data sleep 30
$ perf diff typ_nl.data all_nl.data

Oracle spends 26% of the time in a vdso call, getting the current time from the operating system!

That’s just one example of what you can do with perf, and there are numerous others – and possibly an even more amazing tool is …

systemtap

Using systemtap, I can profile user-space (here: Oracle) functions.
This all builds upon the awesome investigations of two guys, Luca Canali and Frits Hoogland, whose work you will find linked in the appendix.
So say I’d like to compare two versions of a Java program, one hard parsing all the time (using plain java.sql.Statement) and another that uses bind variables (by means of java.sql.PreparedStatement). What are they doing “under the hood”, and how long do these operations take?

This is one thing I could do: get a latency histogram of the Oracle function kksParseCursor. First, the case of doing a hard parse every time: Most parses take between 128 and 512 ms.

Then we have the code using bind variables – most parses take less than 8 ms here…
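To give an idea of what such a probe can look like, here is a minimal sketch in the spirit of the scripts by Luca Canali and Frits Hoogland (not the exact one used in the talk – the path to the oracle binary is an assumption and has to match your environment):

global start_time, lat

/* fires on entry to the Oracle parse function */
probe process("/u01/app/oracle/product/12.1.0.2/db_1/bin/oracle").function("kksParseCursor")
{
  start_time[tid()] = gettimeofday_us()
}

/* fires on return; records the elapsed time in an aggregate */
probe process("/u01/app/oracle/product/12.1.0.2/db_1/bin/oracle").function("kksParseCursor").return
{
  if (start_time[tid()]) {
    lat <<< gettimeofday_us() - start_time[tid()]
    delete start_time[tid()]
  }
}

/* on ctrl-c, print a logarithmic latency histogram (microseconds) */
probe end
{
  print(@hist_log(lat))
}

Run it with stap (e.g. stap -v parse_hist.stp) while the Java test program is executing, and the histogram shows how the parse latencies are distributed.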

Again, this is just a starting point … possibilities are endless. So to get you started, have a look at the appendix, and: enjoy!

Appendix: What to Read

System Tap into Oracle for Fun and Profit
Life of an Oracle I/O: Tracing Logical and Physical I/O with SystemTap
Linux strace doesn’t lie after all.
How long does a logical IO take?
strace Wow Much Syscall
perf Examples

Bind variables vs. literals, a quiz and a surprise

The quiz

Recently, something I had intended to be a simple demo promoting the use of bind variables among Java developers turned out to be more interesting than expected.
It all started as a quiz. So I was going to ask how long the following would run:

  • a fixed query, repeated 100,000 times, always with the same literal
  • the same query, again repeated 100,000 times, but this time using a bind variable, the same value every time
  • the same query, repeated 100,000 times, using 100,000 different literals
  • the same query, repeated 100,000 times, using 100,000 different values for the bind variable

Just so you know what I did exactly, this is how the Java code looked (displaying the relevant parts here only):

One same literal, 100,000 times

String query = "select text1, text2 from mytab where rn=";
int id = 1;
long startTime = System.currentTimeMillis();
Statement stmt = conn.createStatement();
for (int i = 0; i <= 100000; i++) {
  ResultSet rset = stmt.executeQuery (query + id);
  String s;
  while (rset.next ()) {
    s = rset.getString (1) + " " + rset.getString (2);
  }
  rset.close();
}
stmt.close();
System.out.println(System.currentTimeMillis() - startTime);
conn.close();

One same bind value, 100,000 times

String query = "select text1, text2 from mytab where rn=?";
int id = 1;
long startTime = System.currentTimeMillis();
PreparedStatement stmt = conn.prepareStatement(query);
for (int i=0; i<= 100000; i++) {
  stmt.setInt(1, id);
  ResultSet rset = stmt.executeQuery();
  String s;
  while (rset.next ()) {
    s = rset.getString (1) + " " + rset.getString (2);
  }
  rset.close();
}
stmt.close();
System.out.println(System.currentTimeMillis() - startTime);
conn.close();

100,000 different literals

String query = "select text1, text2 from mytab where rn=";
int id = 1;
long startTime = System.currentTimeMillis();
Statement stmt = conn.createStatement();
for (int i = 0; i<= 100000; i++) {
  ResultSet rset = stmt.executeQuery (query + id);
  String s;
  while (rset.next ()) {
    s = rset.getString (1) + " " + rset.getString (2);
  }
  rset.close();
  id = id + 1;
}
stmt.close();
System.out.println(System.currentTimeMillis() - startTime);
conn.close();

100,000 different bind values

String query = "select text1, text2 from mytab where rn=?";
int id = 1;
long startTime = System.currentTimeMillis();
PreparedStatement stmt = conn.prepareStatement(query);
for (int i = 0; i<= 100000; i++) {
  stmt.setInt(1, id);
  ResultSet rset = stmt.executeQuery();
  String s;
  while (rset.next ()) {
    s = rset.getString (1) + " " + rset.getString (2);
  }
  rset.close();
  id = id + 1;
}
stmt.close();
System.out.println(System.currentTimeMillis() - startTime);
conn.close();

So, of course what I intended to show was the devastating effect of using different literals all the time. Which I succeeded in. But let’s look at the results, first, the execution times measured on the client side:

Client-side execution times (seconds)

case                    elapsed (s)
One same literal               36.1
One same bind value            24.7
100,000 literals              147.4
100,000 bind values            25.3

The surprise

Nice. The 100,000 different literals case does have a horrible execution time. But let’s look at the one-same-value cases, same literal vs. same bind value. Why is the bind variable case so much faster?
(And by the way, although I’m listing the results of one trial only, they were totally reproducible; the pattern was the same however I varied the circumstances – e.g., with/without tracing …)

In the one-same-literal case, we perform one hard parse and two soft parses; from there on, it’s “softer soft parses” only.
This can nicely be seen in the trace file. In the third CLOSE line, we have type=1, and PARSE elapsed times are <30 microseconds from there on:

PARSING IN CURSOR ...
PARSE #140263995577416:c=2999,e=20813,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=2832086359,tim=1420822842233661
EXEC #140263995577416:c=0,e=69,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=2832086359,tim=1420822842238446
WAIT #140263995577416: nam='SQL*Net message to client' ela= 4 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1420822842238488
FETCH #140263995577416:c=0,e=110,p=0,cr=4,cu=0,mis=0,r=1,dep=0,og=1,plh=2832086359,tim=1420822842238639
..
WAIT #140263995577416: nam='SQL*Net message from client' ela= 101732 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1420822842340650
CLOSE #140263995577416:c=0,e=15,dep=0,type=0,tim=1420822842340788
=====================
PARSING IN CURSOR ... 
PARSE #140263995577416:c=0,e=199,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=2832086359,tim=1420822842341057
EXEC #140263995577416:c=0,e=46,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=2832086359,tim=1420822842341161
WAIT #140263995577416: nam='SQL*Net message to client' ela= 3 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1420822842341197
FETCH #140263995577416:c=0,e=44,p=0,cr=4,cu=0,mis=0,r=1,dep=0,og=1,plh=2832086359,tim=1420822842341312
...
WAIT #140263995577416: nam='SQL*Net message from client' ela= 1708 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1420822842343128
CLOSE #140263995577416:c=0,e=8,dep=0,type=0,tim=1420822842343193
=====================
PARSING IN CURSOR ...
PARSE #140263995577416:c=0,e=52,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=2832086359,tim=1420822842343301
EXEC #140263995577416:c=0,e=30,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=2832086359,tim=1420822842343386
WAIT #140263995577416: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1420822842343414
FETCH #140263995577416:c=0,e=29,p=0,cr=4,cu=0,mis=0,r=1,dep=0,og=1,plh=2832086359,tim=1420822842343467
...
WAIT #140263995577416: nam='SQL*Net message from client' ela= 1618 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1420822842345279
CLOSE #140263995577416:c=0,e=14,dep=0,type=1,tim=1420822842345357
PARSE #140263995577416:c=0,e=27,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=2832086359,tim=1420822842345420
EXEC #140263995577416:c=0,e=31,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=2832086359,tim=1420822842345473
WAIT #140263995577416: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1420822842345502
FETCH #140263995577416:c=0,e=27,p=0,cr=4,cu=0,mis=0,r=1,dep=0,og=1,plh=2832086359,tim=1420822842345568
WAIT #140263995577416: nam='SQL*Net message from client' ela= 1618 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1420822842347230
CLOSE #140263995577416:c=0,e=11,dep=0,type=3,tim=1420822842347318

From a visual inspection of the trace file for the one-same-bind-value case, you might think both cases should take about the same time in the end:

PARSING IN CURSOR ...
PARSE #140450201769032:c=0,e=873,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=0,tim=1420823096225812
EXEC #140450201769032:c=2000,e=2337,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=2832086359,tim=1420823096228602
WAIT #140450201769032: nam='SQL*Net message to client' ela= 4 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1420823096228684
FETCH #140450201769032:c=0,e=112,p=0,cr=4,cu=0,mis=0,r=1,dep=0,og=1,plh=2832086359,tim=1420823096228869
...
WAIT #140450201769032: nam='SQL*Net message from client' ela= 83168 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1420823096312251
EXEC #140450201769032:c=0,e=71,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=2832086359,tim=1420823096312442
WAIT #140450201769032: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1420823096312507
FETCH #140450201769032:c=0,e=60,p=0,cr=4,cu=0,mis=0,r=1,dep=0,og=1,plh=2832086359,tim=1420823096312527
WAIT #140450201769032: nam='SQL*Net message from client' ela= 1081 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1420823096313684
EXEC #140450201769032:c=0,e=36,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=2832086359,tim=1420823096313794
WAIT #140450201769032: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1420823096313832
FETCH #140450201769032:c=0,e=35,p=0,cr=4,cu=0,mis=0,r=1,dep=0,og=1,plh=2832086359,tim=1420823096313848

How does this go together then – on the one hand, the work done by the database does not seem to be tremendously different, on the other, the bind values case is noticeably faster on the client!

Well, let’s look at a statistical summary of the database calls for both cases.
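Such a per-call-type summary can be computed along the following lines – a sketch, assuming the trace file has been loaded into the CALL external table from the trace-mining posts further down (elapsed times are in microseconds):

select call_type,
       count(elapsed)          cnt,
       sum(elapsed)            sum_elapsed,
       round(avg(elapsed))     avg_elapsed,
       round(stddev(elapsed))  stddev_elapsed,
       round(median(elapsed))  median_elapsed
from call
group by call_type
order by call_type;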
First, the one-same-literal case:

1 same literal, 100,000 times

call type    count     sum elapsed   avg elapsed   stddev elapsed   median elapsed
PARSE        100001        1850716            19               20               16
EXEC         100001        1207545            12               12               11
FETCH        100001        1384287            14               17               12
CLOSE        100000         335618             3                4                3
overall db   400003        4778166

1 same bind value, 100,000 times

call type    count     sum elapsed   avg elapsed   stddev elapsed   median elapsed
PARSE             1            202           202                0              202
EXEC         100001        1388479            14               25               12
FETCH        100001        3080674            31               30               28
CLOSE             0
overall db   200003        4469355

(all elapsed times in microseconds)

So it’s true, seen from the database side, the elapsed times for both cases do not differ much at all!
So really the database is doing about the same amount of work in both cases.

Wait a sec …

What about when we add the wait time, especially, SQL*NET message from client (overall db calls are the db call summary numbers from above)?

1 same literal, 100,000 times

type                       count     sum elapsed   avg elapsed   stddev elapsed   median elapsed
SQL*Net msg from client   100001       15308493           153              949              106
overall db calls          400003        4778166
overall total             500004       20086659

1 same bind value, 100,000 times

type                       count     sum elapsed   avg elapsed   stddev elapsed   median elapsed
SQL*Net msg from client   100001       11425310           114              416               74
overall db calls          200003        4469355
overall total             300004       15894665

So we see it is really the time subsumed under SQL*Net message from client that makes the difference.
Why? A SQL*Net trace shows that the amount of data transferred over the network is about twice as much for the literals case as for the binds case (156 vs. 70 bytes per execution). Of course, that’s way below TCP packet size in both cases, but the data has to pass various layers where the size of the read/write operations might make a difference. Probably the biggest factor, though, is the processing on the Java client side, the client having to marshal either a simple number or a not-too-short string. (Of course, that’s just a guess which would need to be supported by more sophisticated measurements on the client side.)
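(For reference, a SQL*Net trace like the one mentioned – here sketched for the client side – can be produced with sqlnet.ora settings along these lines; the directory is arbitrary:)

# client-side sqlnet.ora - level 16 ("support") includes packet dumps and sizes
TRACE_LEVEL_CLIENT     = 16
TRACE_DIRECTORY_CLIENT = /tmp/sqlnet_trace
TRACE_TIMESTAMP_CLIENT = ON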

So what’s the conclusion? We can be even more confident about advising (Java) developers to use bind variables, because response times will go down even in cases where one (me, at least) wouldn’t have expected to see a noticeable difference 🙂

SQL trace files, think time, and outliers

Idle wait events and think time

When creating a resource profile for a session or a group of sessions, one usually has to decide how to handle the so-called idle wait events, first and foremost, SQL*Net message from client.

One possibility would be to simply remove (physically or logically) those lines from the trace, thereby avoiding the question of whether some wait should be classified as client (= application) response time, or as think time.
Another would be to define a threshold and exclude only those lines with wait times above the threshold. In that case, of course, I have to know what makes for a sensible threshold. I could base that threshold on network latencies, but then I might miss information about how the application behaves – what, from the database server’s point of view, is the response time of the application?
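As a sketch of that threshold variant (using the WAIT external table introduced below, and an arbitrary 5 ms cut-off):

select *
from wait
where event <> 'SQL*Net message from client'
   or elapsed <= 5000;  -- microseconds; longer client waits are treated as think time and excluded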

So here I want to try another approach, which is applying common outlier detection methods and finding out how far that gets me ;-). My test material, this time, will be “real life”, non-artificial trace files; my tools will be the mining infrastructure based on external tables described in the last three blog posts (starting with Mining SQL Trace Files Using External Tables Part 1: Getting the Timeline) and, for ease of use, uploaded on github; and the outlier detection methods used will be the four common ones described in http://d-scholarship.pitt.edu/7948/1/Seo.pdf.

Outlier detection methods

The four methods chosen were selected for being both commonly used as well as easily applicable. None of them are based on hypothesis testing as this would require knowledge of the underlying distribution.

Standard deviation method

One commonly used method is classifying as outliers those measurements whose distance from the mean is more than 2 or 3 standard deviations. We may already suspect that a method based on the normality assumption might not be the most effective here, but we’ll have a look anyway, be it only for establishing a baseline for the other methods.

Boxplot

As per Tukey’s boxplot method, values are seen as outliers when either below the first quartile minus 1.5* (3*, respectively) the interquartile range or above the third quartile plus 1.5 * (3*, respectively) the interquartile range (i.e., x < Q1 – 1.5[3]*IQR or x > Q3 + 1.5[3]*IQR).

MADe

Using the MADe method, values are classified as outliers when their distance from the median is more than 2 times (3 times, respectively) the median of absolute differences from the median, scaled by 1.483 (this scaled median of absolute differences is the so-called MADe).
I.e., x < median – 2[3]*MADe or x > median + 2[3]*MADe.

Median method

Using the median method, values are classified as outliers when they are below/above the median by more than n*IQR, where n is adjustable and was chosen as 2.3 in the above-mentioned paper.

Outlier detection methods applied

Inspecting the data

First, for all four trace files, let me indicate some summary statistics for the SQL*Net message from client event and present the data graphically.
All trace files have about 50,000 lines for this wait event, with minimum values of ~0.35–0.55 milliseconds, a median of ~0.55–0.75 ms, an average of ~1.5–1.8 ms, and a maximum ranging from approximately 2.2 to 12 seconds.

SQL> select min(elapsed) min, median(elapsed) median, trunc(avg(elapsed)) avg, trunc(stddev(elapsed)) stddev, max(elapsed) max, count(elapsed) cnt from wait where event='SQL*Net message from client';

-- file 1
       MIN     MEDIAN	     AVG     STDDEV	   MAX	      CNT
---------- ---------- ---------- ---------- ---------- ----------
       368	  541	    1606      33961    3005346	    53124

-- file 2
       MIN     MEDIAN	     AVG     STDDEV	   MAX	      CNT
---------- ---------- ---------- ---------- ---------- ----------
       523	  683	    1843      33805    2269024	    51485

-- file 3
       MIN     MEDIAN	     AVG     STDDEV	   MAX	      CNT
---------- ---------- ---------- ---------- ---------- ----------
       339	  530	    1530      68874   11794066	    47152

-- file 4
       MIN     MEDIAN	     AVG     STDDEV	   MAX	      CNT
---------- ---------- ---------- ---------- ---------- ----------
       547	  770	    1482      27795    3306050	    52608

The following diagrams should help visualize how the data is distributed. I’ve chosen a bucket size of 250 microseconds/0.25 milliseconds, which after some experimentation seemed to work best.
(This means that the numbers on the x axis have to be multiplied by 0.25 to give the elapsed time in milliseconds.)

(Diagrams: SQL*Net message from client wait elapsed time distribution, one per trace file.)

For all methods, I’m interested in the upper threshold only. Now let’s see what the different methods yield.

Standard deviation method
with sd_data as
  (select avg(elapsed) avg, stddev(elapsed) stddev from wait where event='SQL*Net message from client'),
  bounds as 
    (select avg + 2*stddev upper_2sd, avg + 3*stddev upper_3sd from sd_data),
    candidates_2sd as
      (select elapsed from wait w join bounds b on (w.event='SQL*Net message from client' and w.elapsed >  b.upper_2sd)),
      candidates_3sd as
        (select elapsed from wait w join bounds b on (w.event='SQL*Net message from client' and w.elapsed >  b.upper_3sd))
        select '   2sd' "method", count(elapsed) "num_outliers", round(min(elapsed)/1000, 1) "smallest_outlier_ms", round(max(elapsed)/1000, 1) "max_value_ms" from candidates_2sd
        union all 
        select '   3sd', count(elapsed), round(min(elapsed)/1000, 1), round(max(elapsed)/1000, 1) from candidates_3sd;


-- file 1
method num_outliers smallest_outlier_ms  max_value_ms
------ ------------ ------------------- -------------
   2sd		 64		   75.2        3005.3
   3sd		 62		  106.6        3005.3

-- file 2
method num_outliers smallest_outlier_ms max_value_ms
------ ------------ ------------------- ------------
   2sd		 88		   70.5 	2269
   3sd		 84		  116.3 	2269

-- file 3
method num_outliers smallest_outlier_ms max_value_ms
------ ------------ ------------------- ------------
   2sd		 31		  145.9      11794.1
   3sd		 26		  221.7      11794.1

-- file 4
method num_outliers smallest_outlier_ms max_value_ms
------ ------------ ------------------- ------------
   2sd		 64		   62.4       3306.1
   3sd		 55		     93       3306.1

Not surprisingly (given the non-normality of the data), the standard deviation method does not yield useful results. We’ll certainly want to classify as think time values well below 100 ms.

Boxplot
with iqr_data as 
  (select percentile_cont(0.25) within group (order by elapsed) q1, percentile_cont(0.75) within group (order by elapsed) q3, abs(percentile_cont(0.25) within group (order by elapsed) - percentile_cont(0.75) within group (order by elapsed)) iqr from wait where event='SQL*Net message from client'),
  bounds as 
    (select q3 + 1.5 * iqr lower_bound, q3 + 3 * iqr upper_bound from iqr_data),
      candidates_inner as
          (select elapsed from wait w join bounds b on (w.event='SQL*Net message from client' and w.elapsed >  b.lower_bound)),
          candidates_outer as
            (select elapsed from wait w join bounds b on (w.event='SQL*Net message from client' and w.elapsed >  b.upper_bound))
            select 'outside inner bound' "criterion", count(elapsed) "num_outliers", round(min(elapsed)/1000, 1) "smallest_outlier_ms", round(max(elapsed)/1000, 1) "max_value_ms" from candidates_inner
            union all
            select 'outside outer bound', count(elapsed), round(min(elapsed)/1000, 1), round(max(elapsed)/1000, 1) from candidates_outer
        ;
   

-- file 1
criterion	    num_outliers smallest_outlier_ms max_value_ms
------------------- ------------ ------------------- ------------
outside inner bound	    5370		  .9	   3005.3
outside outer bound	    3983		 1.1	   3005.3

-- file 2
criterion	    num_outliers smallest_outlier_ms max_value_ms
------------------- ------------ ------------------- ------------
outside inner bound	    3869		   1	     2269
outside outer bound	    2264		 1.2	     2269

-- file 3
criterion	    num_outliers smallest_outlier_ms max_value_ms
------------------- ------------ ------------------- ------------
outside inner bound	    2026		  .8	  11794.1
outside outer bound	     809		  .9	  11794.1

-- file 4
criterion	    num_outliers smallest_outlier_ms max_value_ms
------------------- ------------ ------------------- ------------
outside inner bound	    3836		 1.1	   3306.1
outside outer bound	    1418		 1.3	   3306.1

This looks better! If we choose the outer bound, waits have to exceed 1 ms to be classified as think time. However, looking at the diagrams, I’m not sure if this is not too low. Let’s compare with median-based methods.

MADe
with median as
  (select median(elapsed) median from wait where event='SQL*Net message from client'),
  abs_differences as
    (select abs(elapsed - median) absdiff from wait, median),
    mad_e as 
      (select median(absdiff) mad_e from abs_differences),
        bounds as 
          (select median + 2*mad_e upper_2mad_e, median + 3*mad_e upper_3mad_e from median, mad_e),
          candidates_2mad_e as
            (select elapsed from wait w join bounds b on (w.event='SQL*Net message from client' and w.elapsed >  b.upper_2mad_e)),
            candidates_3mad_e as
              (select elapsed from wait w join bounds b on (w.event='SQL*Net message from client' and w.elapsed >  b.upper_3mad_e))
                select '2mad_e' "method", count(elapsed) "num_outliers", round(min(elapsed)/1000, 1) "smallest_outlier_ms", round(max(elapsed)/1000, 1) "max_value_ms" from candidates_2mad_e
                union all 
                select '3mad_e', count(elapsed), round(min(elapsed)/1000, 1), round(max(elapsed)/1000, 1) from candidates_3mad_e;


-- file 1
method	num_outliers smallest_outlier_ms max_value_ms
------- ------------ ------------------- ------------
2mad_e   	2740		     1.6       3005.3
3mad_e  	1523		     2.2       3005.3

-- file 2
method num_outliers smallest_outlier_ms max_value_ms
------ ------------ ------------------- ------------
2mad_e		287		      2 	2269
3mad_e		208		    2.7 	2269

-- file 3
method num_outliers smallest_outlier_ms max_value_ms
------ ------------ ------------------- ------------
2mad_e		236		    1.6      11794.1
3mad_e		142		    2.1      11794.1

-- file 4
method num_outliers smallest_outlier_ms max_value_ms
------ ------------ ------------------- ------------
2mad_e		487		    2.3       3306.1
3mad_e		380		    3.1       3306.1

Now this – again choosing the outer bound, so we end up at a threshold of ~2–2.5 ms – looks much better to me. Perhaps it is still a little low, judging from visual inspection?

Median method
with iqr_data as 
  (select median(elapsed) median, percentile_cont(0.25) within group (order by elapsed) q1, percentile_cont(0.75) within group (order by elapsed) q3, abs(percentile_cont(0.25) within group (order by elapsed) - percentile_cont(0.75) within group (order by elapsed)) iqr from wait where event='SQL*Net message from client'),
  bounds as 
    (select median + 2.3 * iqr bound from iqr_data),
      candidates as
          (select elapsed from wait w join bounds b on (w.event='SQL*Net message from client' and w.elapsed >  b.bound))
            select 'outside bound' "criterion", count(elapsed) "num_outliers", round(min(elapsed)/1000, 1) "smallest_outlier_ms", round(max(elapsed)/1000, 1) "max_value_ms" from candidates;


-- file 1   
criterion     num_outliers smallest_outlier_ms max_value_ms
------------- ------------ ------------------- ------------
outside bound	      5065		    .9	     3005.3

-- file 2
criterion     num_outliers smallest_outlier_ms max_value_ms
------------- ------------ ------------------- ------------
outside bound	      3575		     1	       2269

-- file 3
criterion     num_outliers smallest_outlier_ms max_value_ms
------------- ------------ ------------------- ------------
outside bound	      1729		    .8	    11794.1

-- file 4
criterion     num_outliers smallest_outlier_ms max_value_ms
------------- ------------ ------------------- ------------
outside bound	      3316		   1.1	     3306.1

The median method, at least for the trace files given, seems to yield results very similar to boxplot. Same as with boxplot, it might be too strict though.

Conclusion

Of course, four trace files, taken from the same application and under similar conditions, do not make for a representative sample. However, the idea here was to demonstrate the approach, i.e., looking at think time from an outlier detection perspective. For the given sample, the MADe method seems to yield a plausible classification. However, what classification makes sense is, in the end, a question of (at least) two things – the application we’re looking at (a real application? or end users, sitting at their terminals?) and what we want to do with the data (diagnose the database only or the client, too).

I guess it’s in the latter case that this approach will be most useful.


Mining SQL Trace Files Using External Tables Part 3: Recursive Overhead

Recursive statements overhead

After first introducing the idea of profiling SQL trace files using the external table preprocessor (see Mining SQL Trace Files Using External Tables Part 1: Getting the Timeline), and then showing how this approach can yield the usual output of resource profiles etc. (in Mining SQL Trace Files Using External Tables Part 2: Resource Profiling), I’d finally like to come to one of the main features I created this for – an expressive and convenient way to view the SQL statements internally executed by Oracle, and to analyze the performance overhead caused by using different optional features.

I won’t try to draw any statistical conclusions here, as I’m restricting myself to a specific test setup and just one trace file (plus one Oracle version, OS version, etc.) – the intent is just to demonstrate the usefulness of the method and the kinds of insight it yields.

The question

In my test setup, I’ll be using three features: standard auditing, fine-grained auditing (FGA), and virtual private database (VPD). Although I’m on 12.1.0.2, I’m using traditional auditing here. As it’s an artificial setup, designed to demonstrate what’s possible with the approach, I’ll be using a PL/SQL procedure to iterate through statements against one and the same object; of course, if you want to answer questions about the real impact of implementing auditing, FGA or VPD in a real application, real trace files from that application would have to be used.

So, this is the test setup we’ll be using to have a look at auditing, FGA and VPD effects:

/* drop all objects, for repeated execution */
begin
  dbms_fga.drop_policy('USR', 'MYOBJECTS', 'MYOBJECTS_FGA_POLICY');
end;
/
begin
  dbms_rls.drop_policy ( 
    object_schema   	 => 'USR', 
    object_name     	 => 'MYOBJECTS', 
    policy_name     	 => 'MYOBJECTS_POLICY');
end;
/
drop table myobjects;

/* sample table */
create table myobjects as select * from dba_objects;

/* test procedure issues select and DML for every 100th row, every second of which will be prohibited by VPD */
create or replace procedure generate_stuff_to_trace
is
  type t_numtab is table of number;
  l_numtab t_numtab;
  l_name varchar2(30);
begin
  select object_id bulk collect into l_numtab from myobjects;
  for i in 1..l_numtab.count loop
    if mod(i, 100) = 0 then
      dbms_session.set_identifier('client_' || to_char(i)); 
      begin
        select object_name into l_name from myobjects where object_id = l_numtab(i);
      exception
        when no_data_found then null;
      end;
      update myobjects set owner = null where object_id = l_numtab(i);
      delete from myobjects where object_id = l_numtab(i);
    end if;
  end loop;
  dbms_session.set_identifier(null);
end;
/

/* just audit every statement type on this table */
audit all on myobjects;

/* unconditional FGA, again for all types of select/DML */
begin
  dbms_fga.add_policy (
    object_schema      => 'USR', 
    object_name        => 'MYOBJECTS', 
    policy_name        => 'MYOBJECTS_FGA_POLICY',
    audit_condition    =>  null, 
    statement_types    => 'SELECT, INSERT, UPDATE, DELETE',
    audit_trail        =>  DBMS_FGA.DB_EXTENDED,
    audit_column_opts  =>  DBMS_FGA.ANY_COLUMNS);
end;
/

/* VPD policy, to be evaluated on context change */
begin
  dbms_rls.add_policy ( 
    object_schema   	 => 'USR', 
    object_name     	 => 'MYOBJECTS', 
    policy_name     	 => 'MYOBJECTS_POLICY',
    function_schema  	 => 'USR',
    policy_function  	 => 'MYOBJECTS_POLICY',
    statement_types  	 => 'SELECT, INSERT, UPDATE, DELETE',
    update_check     	 => TRUE,
    static_policy    	 => FALSE,
    policy_type      	 => DBMS_RLS.CONTEXT_SENSITIVE,
    sec_relevant_cols	 => 'OBJECT_ID, OWNER, OBJECT_TYPE, OBJECT_NAME'); 
end;
/

/* every second client will be blocked by this */
CREATE OR REPLACE FUNCTION MYOBJECTS_POLICY( 
  schema_var IN VARCHAR2,
  table_var  IN VARCHAR2
 )
 RETURN VARCHAR2
 IS
  retval VARCHAR2 (400);
 BEGIN
  retval := 'mod(to_number(substr(nvl(sys_context(''userenv'', ''client_identifier''), ''client_7777777777''), 8)), 200) <> 0' ;
  RETURN retval;
 END MYOBJECTS_POLICY;
/

/* first execution just to have blocks in cache */
exec generate_stuff_to_trace;
rollback;

/* performed with and without this */
alter system flush shared_pool;

/* execute test */
alter session set tracefile_identifier='audit_fga_vpd'; 
exec dbms_monitor.session_trace_enable(binds=>true);
exec generate_stuff_to_trace;
exec dbms_monitor.session_trace_disable; 

So what’s going on?

If I just execute the above without flushing the shared pool – surprisingly little!
This is the flow of statements, showing what is happening in chronological order and how long every call takes:

col tim for 999999999999
create or replace view timeline_diffs as
select
  t.call_type,
  t.sqlid,
  substr(cs.sql_text,1,30) sql_text,
  t.close_type,
  t.tim,
  t.tim - lag(t.tim) over (order by t.tim) timediff
from timeline_extended t
left join call c on (t.lineno=c.lineno)
join (select distinct sqlid, sql_text from cursor) cs on (t.sqlid=cs.sqlid)
where T.call_type in ('PARSE', 'EXEC', 'FETCH', 'CLOSE')
order by t.tim;

select * from timeline_diffs fetch next 50 rows only;

CALL_ SQLID	    SUBSTR(CS.SQL_TEXT,1,30)	   CLOSE_TYPE		TIM   TIMEDIFF

EXEC  31wf1v9f4ags6 BEGIN dbms_monitor.session_tra		 5061658516
CLOSE 31wf1v9f4ags6 BEGIN dbms_monitor.session_tra	    0	 5061659991	  1475
PARSE 6agkhk8bx8ra0 BEGIN generate_stuff_to_trace;		 5061660241	   250
PARSE fk584xkqxdatw insert into sys.fga_log$ (sess		 5061660810	   569
EXEC  fk584xkqxdatw insert into sys.fga_log$ (sess		 5061662883	  2073
CLOSE fk584xkqxdatw insert into sys.fga_log$ (sess	    0	 5061718106	 55223
EXEC  9vr8phz4kn5hg SELECT OBJECT_ID FROM MYOBJECT		 5061718456	   350
EXEC  4vs91dcv7u1p6 insert into sys.aud$( sessioni		 5061719824	  1368
FETCH 9vr8phz4kn5hg SELECT OBJECT_ID FROM MYOBJECT		 5061752683	 32859
CLOSE 9vr8phz4kn5hg SELECT OBJECT_ID FROM MYOBJECT	    3	 5061752859	   176
PARSE fk584xkqxdatw insert into sys.fga_log$ (sess		 5061754105	  1246
EXEC  fk584xkqxdatw insert into sys.fga_log$ (sess		 5061754899	   794
CLOSE fk584xkqxdatw insert into sys.fga_log$ (sess	    0	 5061754968	    69
EXEC  6r8xukyd3acaf SELECT OBJECT_NAME FROM MYOBJE		 5061755097	   129
EXEC  4vs91dcv7u1p6 insert into sys.aud$( sessioni		 5061756412	  1315
FETCH 6r8xukyd3acaf SELECT OBJECT_NAME FROM MYOBJE		 5061761298	  4886
CLOSE 6r8xukyd3acaf SELECT OBJECT_NAME FROM MYOBJE	    3	 5061761423	   125
PARSE fk584xkqxdatw insert into sys.fga_log$ (sess		 5061762180	   757
EXEC  fk584xkqxdatw insert into sys.fga_log$ (sess		 5061762914	   734
CLOSE fk584xkqxdatw insert into sys.fga_log$ (sess	    1	 5061762976	    62
EXEC  4tv1husqy68fm UPDATE MYOBJECTS SET OWNER = N		 5061768013	  5037
EXEC  4vs91dcv7u1p6 insert into sys.aud$( sessioni		 5061770303	  2290
CLOSE 4tv1husqy68fm UPDATE MYOBJECTS SET OWNER = N	    3	 5061770499	   196
PARSE fk584xkqxdatw insert into sys.fga_log$ (sess		 5061771597	  1098
EXEC  fk584xkqxdatw insert into sys.fga_log$ (sess		 5061772367	   770
CLOSE fk584xkqxdatw insert into sys.fga_log$ (sess	    3	 5061772412	    45
EXEC  dj5wz2dxcta77 DELETE FROM MYOBJECTS WHERE OB		 5061776760	  4348
EXEC  4vs91dcv7u1p6 insert into sys.aud$( sessioni		 5061778510	  1750
CLOSE dj5wz2dxcta77 DELETE FROM MYOBJECTS WHERE OB	    3	 5061778721	   211
PARSE fk584xkqxdatw insert into sys.fga_log$ (sess		 5061779091	   370
EXEC  fk584xkqxdatw insert into sys.fga_log$ (sess		 5061779790	   699
CLOSE fk584xkqxdatw insert into sys.fga_log$ (sess	    3	 5061779839	    49
EXEC  6r8xukyd3acaf SELECT OBJECT_NAME FROM MYOBJE		 5061779937	    98
EXEC  4vs91dcv7u1p6 insert into sys.aud$( sessioni		 5061781042	  1105
FETCH 6r8xukyd3acaf SELECT OBJECT_NAME FROM MYOBJE		 5061781168	   126
CLOSE 6r8xukyd3acaf SELECT OBJECT_NAME FROM MYOBJE	    3	 5061781230	    62
PARSE fk584xkqxdatw insert into sys.fga_log$ (sess		 5061781713	   483
EXEC  fk584xkqxdatw insert into sys.fga_log$ (sess		 5061782423	   710
CLOSE fk584xkqxdatw insert into sys.fga_log$ (sess	    3	 5061782464	    41
EXEC  4tv1husqy68fm UPDATE MYOBJECTS SET OWNER = N		 5061782541	    77
EXEC  4vs91dcv7u1p6 insert into sys.aud$( sessioni		 5061783692	  1151
CLOSE 4tv1husqy68fm UPDATE MYOBJECTS SET OWNER = N	    3	 5061783814	   122
PARSE fk584xkqxdatw insert into sys.fga_log$ (sess		 5061784107	   293
EXEC  fk584xkqxdatw insert into sys.fga_log$ (sess		 5061784725	   618
...
...

There are nearly no recursive statements at all in the trace file, apart from the inserts into sys.aud$ and sys.fga_log$. And no sign of VPD at all.

What’s interesting to see is how cursor management works differently with standard auditing as compared to FGA.
The standard auditing’s insert into sys.aud$ cursor is never closed, not once during the tracing session; the FGA’s insert into sys.fga_log$ is re-parsed on every use, this being a “softer soft parse” from the third close onwards.

select
  'standard auditing', round(avg(timediff)) avg_microsecs, round(stddev(timediff)) stddev_microsecs, round(stddev(timediff)/avg(timediff), 2) COV
  from timeline_diffs
  where sql_text like 'insert into sys.aud$%'
union all
select 
  'fga - parse', round(avg(timediff)) avg_microsecs, round(stddev(timediff)) stddev_microsecs, round(stddev(timediff)/avg(timediff), 2) COV
  from timeline_diffs
  where sql_text like 'insert into sys.fga_log$%' and call_type='PARSE'
union all
select 
  'fga - exec', round(avg(timediff)) avg_microsecs, round(stddev(timediff)) stddev_microsecs, round(stddev(timediff)/avg(timediff), 2) COV
  from timeline_diffs
  where sql_text like 'insert into sys.fga_log$%' and call_type='EXEC'
union all
select 
  'fga - close', round(avg(timediff)) avg_microsecs, round(stddev(timediff)) stddev_microsecs, round(stddev(timediff)/avg(timediff), 2) COV
  from timeline_diffs
  where sql_text like 'insert into sys.fga_log$%' and call_type='CLOSE'
;


'STANDARDAUDITING AVG_MICROSECS STDDEV_MICROSECS	COV
----------------- ------------- ---------------- ----------
standard auditing	   1472 	     672	.46
fga - parse		   1531 	    2089       1.36
fga - exec		    880 	     465	.53
fga - close		     83 	    1062      12.75

So – again, not extrapolating here, just taking the sample data for what they are – the overhead of standard auditing, in my case, was 1.5 milliseconds per insert on average, and very predictable, with a – comparatively – low coefficient of variation.
FGA, on average, took ~2.5 milliseconds per insert, where again the time taken by operations subsumed under ‘EXEC’ was rather predictable, while parse (and close) time tended to vary much more.

So what about VPD? Evidently, here, to investigate performance overhead we need a different approach – probably the usual two-factor setup based on representative experiments.

The rules

Personally, when looking at the recursive statements executed, I was impressed that there were no visible occurrences of Oracle having to remind itself of the rules of the game – what to audit and under which conditions 😉 . Perhaps that’s not too astonishing either, as my procedure made use of just one object to audit and a small set of statements, so all policy stuff could be kept in memory.

Just to have a look at what happens when Oracle does not have the rules in memory, I ran the test again flushing the shared pool immediately before the traced run. In this case we see, at the beginning of the tracefile, the expected queries against rls$ (e.g., select pname,pname,stmt_type,check_opt,enable_flag,pfschma,ppname,pfname from rls$ where obj#=:1), fga$ (select pname, enable_flag, ptxt, pcol, pfschma, ppname, pfname, stmt_type, powner# from fga$ where obj# = :1) and aud_object_opt$ (select policy#, action# from aud_object_opt$ where object# = :1 and type = 2).

But as the timediff column shows, overhead for these calls is minimal.

CALL_ SQLID	    SQL_TEXT			   CLOSE_TYPE	     TIM   TIMEDIFF
----- ------------- ------------------------------ ---------- ---------- ----------
EXEC  31wf1v9f4ags6 BEGIN dbms_monitor.session_tra	      7759969665
CLOSE 31wf1v9f4ags6 BEGIN dbms_monitor.session_tra	    0 7759971054       1389
PARSE gngtvs38t0060 SELECT /*+ CONNECT_BY_FILTERIN	      7759973184       2130
EXEC  87gaftwrm2h68 select o.owner#,o.name,o.names	      7759978350       5166
FETCH 87gaftwrm2h68 select o.owner#,o.name,o.names	      7759978385	 35
CLOSE 87gaftwrm2h68 select o.owner#,o.name,o.names	    3 7759978418	 33
EXEC  87gaftwrm2h68 select o.owner#,o.name,o.names	      7759978509	 91
FETCH 87gaftwrm2h68 select o.owner#,o.name,o.names	      7759978526	 17
CLOSE 87gaftwrm2h68 select o.owner#,o.name,o.names	    3 7759978546	 20
EXEC  87gaftwrm2h68 select o.owner#,o.name,o.names	      7759978626	 80
FETCH 87gaftwrm2h68 select o.owner#,o.name,o.names	      7759978642	 16
CLOSE 87gaftwrm2h68 select o.owner#,o.name,o.names	    3 7759978673	 31
EXEC  gngtvs38t0060 SELECT /*+ CONNECT_BY_FILTERIN	      7759978915	242
FETCH gngtvs38t0060 SELECT /*+ CONNECT_BY_FILTERIN	      7759978961	 46
PARSE b1wc53ddd6h3p select audit$,options from pro	      7759979646	685
EXEC  7fwum1yuknrsh select t.ts#,t.file#,t.block#,	      7759979922	276
FETCH 7fwum1yuknrsh select t.ts#,t.file#,t.block#,	      7759980044	122
CLOSE 7fwum1yuknrsh select t.ts#,t.file#,t.block#,	    3 7759980073	 29
EXEC  gd28w82ct6rva select audit$ from tab$ where	      7759980223	150
FETCH gd28w82ct6rva select audit$ from tab$ where	      7759980254	 31
CLOSE gd28w82ct6rva select audit$ from tab$ where	    3 7759980272	 18
PARSE 04kug40zbu4dm select policy#, action# from a	      7759980341	 69
EXEC  04kug40zbu4dm select policy#, action# from a	      7759980513	172
FETCH 04kug40zbu4dm select policy#, action# from a	      7759980577	 64
CLOSE 04kug40zbu4dm select policy#, action# from a	    1 7759980634	 57
PARSE 04kug40zbu4dm select policy#, action# from a	      7759980966	332
EXEC  04kug40zbu4dm select policy#, action# from a	      7759981848	882
FETCH 04kug40zbu4dm select policy#, action# from a	      7759981930	 82
CLOSE 04kug40zbu4dm select policy#, action# from a	    1 7759981991	 61
EXEC  616m6uhpa2usu select i.obj#,i.ts#,i.file#,i.	      7759982186	195
FETCH 616m6uhpa2usu select i.obj#,i.ts#,i.file#,i.	      7759982565	379
EXEC  5n1fs4m2n2y0r select pos#,intcol#,col#,spare	      7759982676	111
FETCH 5n1fs4m2n2y0r select pos#,intcol#,col#,spare	      7759982754	 78
...

So, leaving this example and getting back to the beginning of the post, – what’s the point? It all depends on the questions you have – the method allows for all kinds of straightforward and less straightforward investigations 🙂

Mining SQL Trace Files Using External Tables Part 2: Resource Profiling

Extending the timeline

Last time in Mining SQL Trace Files Using External Tables Part 1: Getting the Timeline, we saw how to extract essential information from a 10046 trace file and establish the order in which SQL statements are issued. Today, we will turn our attention to the most prominent application of trace file parsing, namely, the derivation of resource profiles (leaving more specialized applications for the future).

To that purpose, we have to extend the timeline information to EXEC and FETCH calls as well as waits, so we can map the respective trace file lines to the correct sql_id. This is done analogously to the last post’s simple timeline:

create or replace view timeline_extended_raw as
 select "LINENO","CALL_TYPE","SQLID","CRSR","TIM","CALL_DEPTH","CLOSE_TYPE" from
  (
   select lineno, 'OPEN' call_type, sqlid, crsr, tim, call_depth, null close_type from cursor
   union all
   select lineno, 'PARSE', null, crsr, tim, call_depth, null from call where call_type='PARSE'
   union all
   select lineno, 'EXEC', null, crsr, tim, call_depth, null from call where call_type='EXEC'
   union all
   select lineno, 'FETCH', null, crsr, tim, call_depth, null from call where call_type='FETCH'
   union all
   select lineno, 'CLOSE', null, crsr, tim, call_depth, type from cursor_close
   union all
   select lineno, 'WAIT', null, crsr, tim, null, null from wait
  )
 order by lineno;

create or replace view timeline_extended as
 select
  lineno,
  call_type,
  case when sqlid is not null then sqlid else lag(sqlid) ignore nulls over (partition by crsr order by lineno) end sqlid,
  crsr,
  tim,
  call_depth,
  close_type
 from timeline_extended_raw order by lineno;

With this missing part of infrastructure in place, let’s start with the session’s overall resource profile.

Overall resource profile

The example trace file will be the same one as used in the first post. In the following resource profile, I am excluding idle events, splitting up CPU usage by call type, and out of the vast range of statistical measures, displaying just average, median and maximum value – but as the query is operating on the raw data, you’re free of course to display exactly the information you’re interested in (e.g., not excluding idle wait events or making the decision dependent on the actual values).

col percent_wait for 99.99
select
 cast(event as varchar2(32)) event,
 cast(wait_class as varchar2(20)) wait_class,
 time_waited_total,
 num_waits,
 round(100 * ratio_to_report(time_waited_total) over (), 2) percent_wait,
 round(avg_wait) avg_wait,
 round(median_wait) median_wait, round(max_wait) max_wait from
  (
   select
    event, wait_class, sum(elapsed) time_waited_total, count(elapsed) num_waits,
    avg(elapsed) avg_wait, median(elapsed) median_wait, max(elapsed) max_wait
    from wait w join v$event_name e on (w.event=e.name)
    where e.wait_class <> 'Idle'
    group by event, wait_class
    union all
    select 
    call_type, 'CPU', sum(cpu), count(cpu), avg(cpu), median(cpu), max(cpu) from call
    group by call_type
  )
order by time_waited_total desc;

EVENT				 WAIT_CLASS	      TIME_WAITED_TOTAL  NUM_WAITS PERCENT_WAIT       AVG_WAIT	  MEDIAN_WAIT	MAX_WAIT
-------------------------------- -------------------- ----------------- ---------- ------------ 
db file sequential read 	 User I/O			3270429        156	  65.95 	 20964		 9395	  169189
EXEC				 CPU				1039000        680	  20.95 	  1528		    0	  509000
PARSE				 CPU				 307000        143	   6.19 	  2147		    0	  139000
FETCH				 CPU				 143000       1609	   2.88 	    89		    0	   18000
row cache lock			 Concurrency			  66660        197	   1.34 	   338		  300	    2230
gc cr grant 2-way		 Cluster			  64633 	84	   1.30 	   769		  584	    5995
library cache lock		 Concurrency			  29250 	44	    .59 	   665		  561	    4255
library cache pin		 Concurrency			  17864 	43	    .36 	   415		  349	    1625
db file scattered read		 User I/O			  11566 	 2	    .23 	  5783		 5783	   10537
enq: RC - Result Cache: Contenti Application			   2140 	 4	    .04 	   535		  580	     638
enq: TM - contention		 Application			   1606 	 2	    .03 	   803		  803	    1062
PX Deq: Slave Session Stats	 Other				   1319 	 2	    .03 	   660		  660	    1204
gc cr multi block request	 Cluster			   1168 	 1	    .02 	  1168		 1168	    1168
reliable message		 Other				    844 	 1	    .02 	   844		  844	     844
enq: PS - contention		 Other				    627 	 1	    .01 	   627		  627	     627
KJC: Wait for msg sends to compl Other				    622 	46	    .01 	    14		   10	      40
IPC send completion sync	 Other				    499 	 2	    .01 	   250		  250	     332
PX Deq: Signal ACK EXT		 Other				    473 	 2	    .01 	   237		  237	     405
PX Deq: reap credit		 Other				    327 	23	    .01 	    14		   10	      62
SQL*Net message to client	 Network			    214 	31	    .00 	     7		    7	      15
Disk file operations I/O	 User I/O			      7 	 1	    .00 	     7		    7	       7

Noticing that for response time’s most important contributor, db file sequential read, mean and median are more than moderately different (20964 vs. 9395 microseconds), we can easily get an impression of the distribution, not even leaving sqlplus:

col plot for a100
select range, min(elapsed) low_value, max(elapsed) high_value, max(elapsed) - min(elapsed) width, round(156/(max(elapsed) - min(elapsed)), 3) height, rpad('*', 100 * 156/(max(elapsed) - min(elapsed)), '*') plot
from
(select elapsed, ntile(10) over (order by elapsed) range from wait where event='db file sequential read')
group by range
order by range;


 RANGE  LOW_VALUE HIGH_VALUE      WIDTH    HEIGHT PLOT
------ ---------- ---------- ---------- --------- ----------------------------------------------------------
 1	  408	     672	264	  .591 ********************************************************
 2	  701	     979	278	  .561 ********************************************************
 3	 1025	    1673	648	  .241 ************************
 4	 1851	    6587       4736	  .033 ***
 5	 6619	    9563       2944	  .053 *****
 6	 9678	   15456       5778	  .027 **
 7	15491	   20224       4733	  .033 ***
 8	20279	   31984      11705	  .013 *
 9	33681	   59401      25720	  .006
10	63073	  169189     106116	  .001

Of course, deciding on the number of bins for the histogram may need a little bit of experimentation, as well as choosing the multiplier (here, I chose 100) to scale the plot. Nicer graphs can always be generated using other tools, but here I’m focusing on what can be done quickly in sqlplus. And if we’re interested in quick statistical information on the distribution of a wait event, there’s a nice convenience function (or rather, procedure) in dbms_stat_funcs: dbms_stat_funcs.summary.
Reckoning that I will use it a lot, I create a wrapper around it:

create or replace procedure wrap_stats_summary (p_owner in varchar2, p_tabname in varchar2, p_colname in varchar2)
as
  l_summary dbms_stat_funcs.SummaryType;
begin
   dbms_stat_funcs.summary(p_owner, p_tabname, p_colname, 3, l_summary);
   dbms_output.put_line ('Statistics summary for ' || p_owner || '.' || p_tabname || '.' || p_colname);
   dbms_output.put_line (rpad('Count: ', 20, ' ')           || l_summary.count);
   dbms_output.put_line (rpad('Min: ', 20, ' ')             || l_summary.min);
   dbms_output.put_line (rpad('Max: ', 20, ' ')             || l_summary.max);
   dbms_output.put_line (rpad('Range: ', 20, ' ')           || l_summary.range);
   dbms_output.put_line (rpad('Mean: ', 20, ' ')            || round(l_summary.mean));
   dbms_output.put_line (rpad('Mode Count: ', 20, ' ')      || l_summary.cmode.count);
   dbms_output.put_line (rpad('Mode: ', 20, ' ')            || l_summary.cmode(1));
   dbms_output.put_line (rpad('Variance: ', 20, ' ')        || round(l_summary.variance));
   dbms_output.put_line (rpad('Stddev: ', 20, ' ')          || round(l_summary.stddev));
   dbms_output.put_line (rpad('5th percentile: ', 20, ' ')  || l_summary.quantile_5);
   dbms_output.put_line (rpad('25th percentile: ', 20, ' ') || l_summary.quantile_25);
   dbms_output.put_line (rpad('Median: ', 20, ' ')          || l_summary.median);
   dbms_output.put_line (rpad('75th percentile: ', 20, ' ') || l_summary.quantile_75);
   dbms_output.put_line (rpad('95th percentile: ', 20, ' ') || l_summary.quantile_95);
   dbms_output.put_line (rpad('Extreme Count: ', 20, ' ')   || l_summary.extreme_values.count);
   dbms_output.put_line (rpad('Top 5: ', 20, ' ')           || l_summary.top_5_values(1) || ','
                                                            || l_summary.top_5_values(2) || ','
                                                            || l_summary.top_5_values(3) || ',' 
                                                            || l_summary.top_5_values(4) || ',' 
                                                            || l_summary.top_5_values(5));
   dbms_output.put_line (rpad('Bottom 5: ', 20, ' ')        || l_summary.bottom_5_values(5) || ','
                                                            || l_summary.bottom_5_values(4) || ',' 
                                                            || l_summary.bottom_5_values(3) || ','
                                                            || l_summary.bottom_5_values(2) || ',' 
                                                            || l_summary.bottom_5_values(1));
end;
/

Calling the wrapper with a view created for db file sequential read events (again, creating the views could be automated) and the column we’re interested in yields

set serveroutput on
create view v_db_file_sequential_read as select * from wait where event='db file sequential read';
exec wrap_stats_summary('myuser','v_db_file_sequential_read','elapsed');

Statistics summary for myuser.v_db_file_sequential_read.elapsed:
Count:		    156
Min:		    408
Max:		    169189
Range:		    168781
Mean:		    20964
Mode Count:	    1
Mode:		    1370
Variance:	    921834830
Stddev: 	    30362
5th percentile:     546.5
25th percentile:    1181.25
Median: 	    9395
75th percentile:    23279.25
95th percentile:    84211.75
Extreme Count:	    4
Top 5:		    169189,163650,125610,124584,99328
Bottom 5:	    408,413,438,466,506

Should the plot tempt us to suspect an exponential distribution here, why not do a quick check using dbms_stat_funcs.exponential_dist_fit? Unfortunately, the package documentation does not say whether it is acceptable to pass a lambda calculated from the sample mean – as far as I know, different significance thresholds should in principle apply depending on whether lambda was given a priori or derived from the sample mean. (As there is no parameter to specify how the lambda passed in came about, we have no choice anyway.)

set serveroutput on
declare
  lambda number := 1/20964;
  mu     number := 0;
  sig    number;
begin
 dbms_stat_funcs.exponential_dist_fit('myuser', 'v_db_file_sequential_read', 'elapsed', 'KOLMOGOROV_SMIRNOV', lambda, mu, sig);
 dbms_output.put_line('Lambda  : ' || lambda);
 dbms_output.put_line('Sig     : ' || to_char(sig,'9.999'));
end;
/
D value : .2309900895441184871772937767930610144644
Lambda	: .00004770082045411181072314443808433505056287
Sig	:   .000

So, evidently we do not have an exponential distribution here (sorry for the spoiler – but the point here is to show what’s possible with the external table setup, not to draw conclusions from a minimal data set ;-)).
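Other candidate distributions could be checked the same way using the remaining dbms_stat_funcs fitting procedures. A minimal sketch for a normality check (assuming the documented normal_dist_fit signature, and simply feeding in the sample mean and stddev from the summary above – given the skew, this one is clearly doomed as well):

set serveroutput on
declare
  mean  number := 20964;   -- sample mean from the summary above
  stdev number := 30362;   -- sample stddev from the summary above
  sig   number;
begin
  dbms_stat_funcs.normal_dist_fit('myuser', 'v_db_file_sequential_read', 'elapsed', 'KOLMOGOROV_SMIRNOV', mean, stdev, sig);
  dbms_output.put_line('Sig     : ' || to_char(sig,'9.999'));
end;
/
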
So let’s go on with the main topic of this post – resource profiling.

Overall session profile – drilldowns

With the data extracted from the trace file, it is easy to pick a wait event and drill down into its parameters. E.g., for db file sequential read, we can find the files and blocks with the most reads, or else – as the object information itself is included in the trace file – go for the objects directly:

select cast(owner as varchar2(30)) owner,
       cast(object_name as varchar2(30)) name,
       sum(elapsed) elapsed
from wait w join dba_objects o on (w.obj=o.object_id)
where event='db file sequential read'
group by owner, object_name
order by 3 desc
fetch next 5 rows only;

OWNER			       NAME				 ELAPSED
------------------------------ ------------------------------ ----------
SYS			       CDEF$				  323772
SYS			       HIST_HEAD$			  322557
SYS			       I_HH_OBJ#_INTCOL#		  235055
SYS			       TAB$				  206864
SYS			       I_DEFERRED_STG1			  193338

(Don’t be surprised that these are internal objects only – in the example trace session I was querying dba_objects directly instead of a table created via “create table … as select”.)
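
For the per-file (and per-block) drilldown mentioned above, the wait parameters can be used directly: for db file sequential read, p1 is the file# and p2 the block#. A quick sketch (add p2value to the group by for the per-block breakdown):

select p1value file#, count(*) reads, sum(elapsed) total_elapsed
from wait
where event='db file sequential read'
group by p1value
order by 3 desc
fetch next 5 rows only;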

Profiling by SQL statement

Whereas in some cases the overall resource profile might be what we’re most interested in, in others it will be single SQL statements (most probably it will be both). Again, the range of possible queries is limited only by the information provided in the trace file. As an example, here I’m sorting by elapsed time, grouping not by sql_id but by rollup(sql_id, call_type), so that for the most time-consuming statements I can directly see where the time went, call-wise – e.g. here, for the only level 0 statement in the trace file, I see that time was mostly spent in the EXEC phase:

with elapsed as
 (select t.sqlid, c.call_type, sum(c.elapsed) elapsed, count(c.elapsed) call_count
  from call c join timeline_extended t using (lineno)
  group by rollup(t.sqlid, c.call_type)
 ),
 sqlids as 
  (select sqlid, sql_text, count(1) occurrence_count, min(call_depth) min_call_depth
   from cursor group by sqlid, sql_text
  )
  select elapsed.*, cast(substr(sqlids.sql_text,1,30) as varchar2(30)) sql_text, sqlids.occurrence_count, sqlids.min_call_depth 
  from elapsed join sqlids on (elapsed.sqlid=sqlids.sqlid)
  order by elapsed desc
  fetch next 10 rows only;

SQLID	      CALL_    ELAPSED CALL_COUNT SQL_TEXT			 OCCURRENCE_COUNT MIN_CALL_DEPTH
------------- ----- ---------- ---------- ------------------------------ ---------------- --------------
5dqx2k1st1uxa	       4602081	       30 select owner, object_name from		3	       0
5dqx2k1st1uxa EXEC     4566966		3 select owner, object_name from		3	       0
4b4wp0a8dvkf0	       2783682		3 SELECT executions, end_of_fetc		1	       1
4b4wp0a8dvkf0 EXEC     2747671		1 SELECT executions, end_of_fetc		1	       1
64hu4m7arrfdh	       2036951		1 SELECT * FROM NO_OBJECT_LINK("		1	       2
64hu4m7arrfdh PARSE    2036951		1 SELECT * FROM NO_OBJECT_LINK("		1	       2
616m6uhpa2usu		785493	       55 select i.obj#,i.ts#,i.file#,i.		1	       2
96g93hntrzjtr		641889	      539 select /*+ rule */ bucket_cnt,		1	       3
616m6uhpa2usu FETCH	636347	       37 select i.obj#,i.ts#,i.file#,i.		1	       2
96g93hntrzjtr FETCH	581688	      269 select /*+ rule */ bucket_cnt,		1	       3

I’ve also displayed the number of times a statement was executed (occurrence_count), to put the elapsed time in context, and the (minimal) call depth, so I can immediately spot the level 0 statements.
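By the way, listing just the top-level statements the application itself issued requires no more than the cursor table (a quick sketch):

select sqlid, min(substr(sql_text, 1, 60)) sql_text, count(*) occurrence_count
from cursor
where call_depth = 0
group by sqlid;
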
Now of course, we still need the tkprof-like output:

select c.call_type "call", count(c.elapsed) "count", sum(c.cpu) "cpu", sum(c.elapsed) "elapsed",
 sum(c.phys_reads) "disk", sum(c.cons_reads) "query", sum(c.cur_reads) "current", sum(c.num_rows) "rows",
 sum(c.misses) "misses"
from call c join timeline_extended t using (lineno)
where t.sqlid='5dqx2k1st1uxa'
group by t.sqlid, c.call_type
order by decode(c.call_type, 'PARSE', 1, 'EXEC', 2, 'FETCH', 3);

call	   count	cpu    elapsed	     disk      query	current       rows     misses
----- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
PARSE	       3       2000	  4721		0	   0	      0 	 0	    1
EXEC	       3     511000    4566966	      157	2532	      0 	 0	    1
FETCH	      24      12000	 30394		6	 123	      0        298	    0

select event "event", count(1) "times waited", max(elapsed) "max wait", sum(elapsed) "total waited"
from wait w join timeline_extended t using (lineno)
where t.sqlid='5dqx2k1st1uxa'
group by event
order by 3 desc;

event				times waited	max wait   total waited
---------------------------------------------------------------- ------------ ---------- ------------
SQL*Net message from client		24	11510302     27972579
db file sequential read 		 4	   29541	35812
gc cr multi block request		 1	    1168	 1168
row cache lock				 6	    1094	 3865
db file scattered read			 1	    1029	 1029
library cache lock			12	     927	 6261
gc cr grant 2-way			 4	     733	 2670
library cache pin			12	     555	 3432
KJC: Wait for msg sends to complete      7	      27	   96
SQL*Net message to client	        24	      15	  154

That’s it for classical resource profiling with the external table trace profiler method :-). Next time, we’ll look more into the options provided by this specific method, and then more interesting trace files will be used, I promise 😉

Mining SQL Trace Files Using External Tables Part 1: Getting the Timeline

Why external tables?

SQL trace files and external tables – at first glance, you wouldn’t think they’re a good match (or so I guess).
Given their structure, SQL trace files seem best suited for being processed serially, line by line, by a procedural or functional programming language. However, once you choose a profiler – be it tkprof or any of the much more sophisticated ones – you’re stuck with the output it produces; you cannot ask it just any question.
This, together with the fascination (I admit) of the preprocessor option, made me want to try out how far I’d get with this “set approach” ;-).

Of course, even though the idea itself might be intriguing, the whole thing must be practical, which means a one-time setup of the infrastructure (preprocessor scripts, table definitions) and no further work when analysing a different trace file apart from changing the tables’ location. So no data loads; anything that works in sqlldr but not in external tables cannot be used. Secondly, keep it simple; sophistication can always be added later, in SQL, when querying the data.

Base tables

For starters, I’ll restrict myself to extracting 4 kinds of information contained in a trace file:
– the cursor information itself, starting with PARSING IN CURSOR #
– cursor close information, starting with CLOSE #
– the database call (PARSE/EXECUTE/FETCH) and
– waits, identified by WAIT #

Each of these entities will end up in its own table; each of these tables will have the trace file’s line number (see below why), the 15-digit cursor number and the timestamp (tim) from the trace file.

As we can’t use sqlldr’s amazing transformation capabilities, data extraction will have to be done by short shell scripts (perl scripts, actually, in order to avoid long pipes of unix commands). For simplicity, here I’m putting scripts and trace files in the same directory, trace_dir, which I have set to /home/oracle/trace.
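The trace_dir directory object itself is a one-time setup. A minimal sketch (assuming sufficient privileges; the grantee is just an example – use whichever schema will own the external tables):

create or replace directory trace_dir as '/home/oracle/trace';
grant read, write on directory trace_dir to myuser;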

Cursors

Let’s start, then! First, the table CURSOR will store sql_id, sql_text etc. information:

drop table cursor;
create table cursor (
  lineno number,
  crsr number,
  crsr_length number,
  call_depth number,
  parsing_user_id number,
  command_type number,
  parsing_schema_id number,
  tim number,
  hash_value number,
  address varchar2(8),
  sqlid varchar2(13),
  sql_text varchar2(4000)
  )
  organization external
    (
    type oracle_loader
    default directory trace_dir
    access parameters
      (
        records delimited by newline
        preprocessor trace_dir:'cursor.pl'
        logfile trace_dir:'cursor.log'
        badfile trace_dir:'cursor.bad'
        fields terminated by '!' optionally enclosed by "'"
        (
          lineno char,          
          crsr char,
          crsr_length char,
          call_depth char,
          parsing_user_id char,
          command_type char,
          parsing_schema_id char,
          tim char,
          hash_value char,
          address char,
          sqlid char,
          sql_text char(4000)
        )
      )
    location (trace_dir:'')
  );  

The perl script that extracts and prepares the required information looks like this:

[oracle@node1 trace]$ cat cursor.pl
#!/usr/bin/perl
use strict;
use warnings;
while (<>) {
  if (/^PARSING IN CURSOR/../END OF STMT/) {
    if (/^PARSING IN CURSOR/) {
      s/^PARSING IN CURSOR \#//;
      s/ [a-z]+=/!/g;
      s/\n$/!/;
      $_ = "$.!$_";
    }
    unless (/^END OF STMT/) {
      print;
    }
  }
}

Database calls

In the same way, information about database calls (restricted to parses, executes and fetches, for now) is extracted:

drop table call;
create table call
  (
    lineno number,
    call_type varchar2(5),
    crsr number,
    cpu number,
    elapsed number,
    phys_reads number,
    cons_reads number,
    cur_reads number,
    misses number,
    num_rows number,
    call_depth number,
    opt_goal number,
    ph_value number,
    tim number
  )
  organization external
  (
    type oracle_loader
    default directory trace_dir
    access parameters
      (
        records delimited by newline
        preprocessor trace_dir:'call.pl'
        logfile trace_dir:'call.log'
        badfile trace_dir:'call.bad'
        fields terminated by ','
        (
          lineno char,
          call_type char,
          crsr char,
          cpu char,
          elapsed char,
          phys_reads char,
          cons_reads char,
          cur_reads char,
          misses char,
          num_rows char,
          call_depth char,
          opt_goal char,
          ph_value char,
          tim char 
        )
    )
    location (trace_dir:'')
  );
[oracle@node1 trace]$ cat call.pl
#!/usr/bin/perl
use strict;
use warnings;
while (<>) {
  if (/^(PARSE|EXEC|FETCH) \#/) {
      s/^(PARSE|EXEC|FETCH) \#(.+)/$1,$2/;
      s/[a-z]+=//g;
      s/:/,/g;
      print "$.,$_";
    }
}

Waits

Next, for the waits. Parameters end up nicely in their own columns:

drop table wait;
create table wait (
  lineno number,
  crsr number,
  event varchar2(64),
  elapsed number,
  p1name varchar2(64),
  p1value number,
  p2name varchar2(64),
  p2value number,
  p3name varchar2(64),
  p3value number,
  obj number,
  tim number
  )
  organization external
  (
    type oracle_loader
    default directory trace_dir
    access parameters
      (
        records delimited by newline
        preprocessor  trace_dir:'wait.pl'
        logfile trace_dir:'wait.log'
        badfile trace_dir:'wait.bad'
        fields terminated by ',' optionally enclosed by "'"
         (
          lineno char,
          crsr char,
          event char,
          elapsed char,
          p1name char,
          p1value char,
          p2name char,
          p2value char,
          p3name char,
          p3value char,
          obj_ char,
          obj char,
          tim char 
         )
       )
    location ('')
  );  
[oracle@node1 trace]$ cat wait.pl 
#!/usr/bin/perl
use strict;
use warnings;
while (<>) {
  if (/^WAIT \#/) {
      s/^WAIT \#//;
      s/(\d+): nam=(.+) ela= (\d+) ([^=]+)=(\d+) ([^=]+)=(\d+) ([^=]+)=(\d+) ([^=]+)=(-?\d+) tim=(\d+)/$1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12/;
      print "$.,$_";
    }
}

Cursor closes

Finally (for today), we have the table storing cursor close information:

drop table cursor_close;
create table cursor_close (
  lineno number,
  crsr number,
  cpu number,
  elapsed number,
  call_depth number,
  type number,
  tim number )
  organization external
  (
    type oracle_loader
    default directory trace_dir
    access parameters
      (
        records delimited by newline
        preprocessor  trace_dir:'cursor_close.pl'
        logfile trace_dir:'cursor_close.log'
        badfile trace_dir:'cursor_close.bad'
        fields terminated by ',' 
         (
           lineno char,
           crsr char,
           cpu char,
           elapsed char,
           call_depth char,
           type char,
           tim char
         )  
       )
    LOCATION ('')
  ); 
[oracle@node1 trace]$ cat cursor_close.pl 
#!/usr/bin/perl
use strict;
use warnings;
while (<>) {
  if (/^CLOSE \#/) {
      s/^CLOSE \#//;
      s/[a-z]+=//g;
      s/:/,/g;
      print "$.,$_";
    }
}

To actually select a trace file for analysis, we just alter the external tables’ locations, which can easily be accomplished with the following procedure:

create or replace procedure changeloc (newloc in varchar2)
is
  tablist sys.odcivarchar2list := sys.odcivarchar2list('cursor','cursor_close','call','wait');
begin
  for i in tablist.first .. tablist.last loop
    execute immediate 'alter table ' || tablist(i) || ' location (''' || newloc || ''')';
  end loop;
end;
/
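
Switching the analysis to another trace file then is a one-liner; a quick count over all four tables makes for a smoke test that the preprocessor scripts parse it cleanly. (A sketch – the trace file name below is made up:)

exec changeloc('orcl_ora_12345.trc')

select (select count(*) from cursor)       cursors,
       (select count(*) from cursor_close) cursor_closes,
       (select count(*) from call)         calls,
       (select count(*) from wait)         waits
from dual;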

The Timeline – Step 1

With the raw data extracted, the essential step is organizing it so it can be queried without too much thinking!
First, let’s get an overview of how cursors are being opened and closed:

create or replace view timeline_raw as
  select * from (
    select lineno, 'OPEN' call_type, sqlid, crsr, tim, call_depth, null close_type from cursor
    union all 
    select lineno, 'PARSE', null, crsr, tim, call_depth, null from call where call_type='PARSE'
    union all
    select lineno, 'CLOSE', null, crsr, tim, call_depth, type from cursor_close)
  order by lineno;

For example:

SQL> col crsr for 999999999999999
SQL> col tim for 99999999999
SQL> select * from timeline_raw fetch next 30 rows only;

    LINENO CALL_ SQLID			   CRSR        TIM CALL_DEPTH CLOSE_TYPE
---------- ----- ------------- ---------------- ---------- ---------- ----------
	27 OPEN  31wf1v9f4ags6	140411772197800 4624635824	    0
	35 CLOSE		140411772197800 4678444836	    0	       0
	37 OPEN  g9p1d48c5dsuq	140411772197800 4678446274	    0
	40 PARSE		140411772197800 4678446272	    0
	51 CLOSE		140411772197800 4696606742	    0	       0
	53 OPEN  2hy64qgq10kb8	140411772197800 4696607776	    0
	56 PARSE		140411772197800 4696607775	    0
	67 CLOSE		140411772197800 4750298461	    0	       0
	69 OPEN  5dqx2k1st1uxa	140411772197800 4750298879	    0
	72 PARSE		140411772197800 4750298878	    0
	76 OPEN  1p5grz1gs7fjq	140411771983304 4750303279	    1
	99 CLOSE		140411771983304 4750343695	    1	       3
       121 CLOSE		140411771983304 4750369440	    1	       3
       142 CLOSE		140411771983304 4750386565	    1	       3
       144 OPEN  1mjd9xp80vuqa	140411772838712 4750386665	    1
       158 CLOSE		140411772838712 4750427684	    1	       3
       160 OPEN  3un99a0zwp4vd	140411772481520 4750428600	    1
       163 PARSE		140411772481520 4750428599	    1
       187 CLOSE		140411772481520 4750472154	    1	       0
       189 OPEN  8swypbbr0m372	140411772481520 4750472732	    1
       192 PARSE		140411772481520 4750472732	    1
       205 CLOSE		140411772481520 4750556542	    1	       0
       215 OPEN  616m6uhpa2usu	140411772181736 4750564487	    2
       218 PARSE		140411772181736 4750564485	    2
       223 OPEN  96g93hntrzjtr	140411772237656 4750570678	    3
       226 PARSE		140411772237656 4750570676	    3
       243 CLOSE		140411772237656 4750598219	    3	       3
       245 OPEN  7u49y06aqxg1s	140411766762152 4750599062	    3
       248 PARSE		140411766762152 4750599061	    3
       273 CLOSE		140411766762152 4750615749	    3	       3

30 rows selected.

The important thing here is having the line numbers, which were added by the preprocessor scripts – having the tim values alone is not enough! (When you write a trace file parser in a procedural or functional language and process the file line by line anyway, you probably don’t think much about this – but using SQL makes you reflect on it.)
Compare, for example, these lines from the output:

160 OPEN  3un99a0zwp4vd	140411772481520 4750428600	    1
163 PARSE		140411772481520 4750428599	    1

If we had just the tim values (next to last column), the PARSE line would appear before the OPEN, which would make it impossible to infer the correct sql_id for it (see below).

The Timeline – Step 2

Now, while this reflects the parse flow, it doesn’t show for which sqlid a cursor is being closed – there simply is no sqlid in the cursor close lines. To fill in the missing information, we have to match cursor open and cursor close lines – which is not completely straightforward at first glance, because the 15-digit cursor numbers may be reused as soon as the corresponding cursors are closed.
Analytic functions (and the line number) to the rescue ;-):

create or replace view timeline (lineno, call_type, sqlid, crsr,tim, call_depth, close_type) 
 as select
  lineno,
  call_type,
  case when sqlid is not null then sqlid else lag(sqlid) ignore nulls over (partition by crsr order by lineno) end sqlid,
  crsr,
  tim,
  call_depth,
  close_type
 from timeline_raw order by lineno;

Which gives us the complete timeline of when cursors were opened and closed:

SQL> select * from timeline fetch next 30 rows only;


    LINENO CALL_ SQLID			   CRSR        TIM CALL_DEPTH CLOSE_TYPE
---------- ----- ------------- ---------------- ---------- ---------- ----------
	27 OPEN  31wf1v9f4ags6	140411772197800 4624635824	    0
	35 CLOSE 31wf1v9f4ags6	140411772197800 4678444836	    0	       0
	37 OPEN  g9p1d48c5dsuq	140411772197800 4678446274	    0
	40 PARSE g9p1d48c5dsuq	140411772197800 4678446272	    0
	51 CLOSE g9p1d48c5dsuq	140411772197800 4696606742	    0	       0
	53 OPEN  2hy64qgq10kb8	140411772197800 4696607776	    0
	56 PARSE 2hy64qgq10kb8	140411772197800 4696607775	    0
	67 CLOSE 2hy64qgq10kb8	140411772197800 4750298461	    0	       0
	69 OPEN  5dqx2k1st1uxa	140411772197800 4750298879	    0
	72 PARSE 5dqx2k1st1uxa	140411772197800 4750298878	    0
	76 OPEN  1p5grz1gs7fjq	140411771983304 4750303279	    1
	99 CLOSE 1p5grz1gs7fjq	140411771983304 4750343695	    1	       3
       121 CLOSE 1p5grz1gs7fjq	140411771983304 4750369440	    1	       3
       142 CLOSE 1p5grz1gs7fjq	140411771983304 4750386565	    1	       3
       144 OPEN  1mjd9xp80vuqa	140411772838712 4750386665	    1
       158 CLOSE 1mjd9xp80vuqa	140411772838712 4750427684	    1	       3
       160 OPEN  3un99a0zwp4vd	140411772481520 4750428600	    1
       163 PARSE 3un99a0zwp4vd	140411772481520 4750428599	    1
       187 CLOSE 3un99a0zwp4vd	140411772481520 4750472154	    1	       0
       189 OPEN  8swypbbr0m372	140411772481520 4750472732	    1
       192 PARSE 8swypbbr0m372	140411772481520 4750472732	    1
       205 CLOSE 8swypbbr0m372	140411772481520 4750556542	    1	       0
       215 OPEN  616m6uhpa2usu	140411772181736 4750564487	    2
       218 PARSE 616m6uhpa2usu	140411772181736 4750564485	    2
       223 OPEN  96g93hntrzjtr	140411772237656 4750570678	    3
       226 PARSE 96g93hntrzjtr	140411772237656 4750570676	    3
       243 CLOSE 96g93hntrzjtr	140411772237656 4750598219	    3	       3
       245 OPEN  7u49y06aqxg1s	140411766762152 4750599062	    3
       248 PARSE 7u49y06aqxg1s	140411766762152 4750599061	    3
       273 CLOSE 7u49y06aqxg1s	140411766762152 4750615749	    3	       3

Sanity Checks

Let’s do some sanity checks. The example trace file was recorded for issuing the statement
select owner, object_name from dba_objects where object_id between :low and :high
three times in a row, with different binds each time.
This is how that statement appears in the view:

SQL> select * from timeline where sqlid='5dqx2k1st1uxa';

    LINENO CALL_ SQLID			   CRSR        TIM CALL_DEPTH CLOSE_TYPE
---------- ----- ------------- ---------------- ---------- ---------- ----------
	69 OPEN  5dqx2k1st1uxa	140411772197800 4750298879	    0
	72 PARSE 5dqx2k1st1uxa	140411772197800 4750298878	    0
     10593 CLOSE 5dqx2k1st1uxa	140411772197800 4765329851	    0	       0
     10635 OPEN  5dqx2k1st1uxa	140411772197800 4771858794	    0
     10638 PARSE 5dqx2k1st1uxa	140411772197800 4771858792	    0
     10698 CLOSE 5dqx2k1st1uxa	140411772197800 4777876691	    0	       0
     10734 OPEN  5dqx2k1st1uxa	140411772197800 4784680278	    0
     10737 PARSE 5dqx2k1st1uxa	140411772197800 4784680276	    0
     10797 CLOSE 5dqx2k1st1uxa	140411772197800 4796212667	    0	       1

So we see the cursor was parsed 3 times; twice it was hard closed and the third time, it was put into the cursor cache (which fits the official documentation).

How about the (according to tkprof, most frequently parsed) statement in the trace file?
What tkprof tells us is that it was parsed, executed, and fetched 47 times each:

SQL ID: 04kug40zbu4dm Plan Hash: 4006480256

select policy#, action# 
from
 aud_object_opt$ where object# = :1 and type = 2


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse       47      0.00       0.00          0          0          0           0
Execute     47      0.00       0.01          0          0          0           0
Fetch       47      0.00       0.00          0        423          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      141      0.01       0.02          0        423          0           0

Misses in library cache during parse: 2
Misses in library cache during execute: 2
SQL> select call_type, close_type, count(1) from timeline where sqlid='04kug40zbu4dm' group by call_type, close_type order by decode(call_type, 'OPEN', 1, 'PARSE', 2, 'CLOSE', 3), 2;

CALL_ CLOSE_TYPE   COUNT(1)
----- ---------- ----------
OPEN			  6
PARSE			 47
CLOSE	       0	  4
CLOSE	       1	  2
CLOSE	       3	 41

So we see the 47 parses confirmed; in addition, the cursor was hard closed 4 times (close_type 0), put into the cursor cache afresh 2 times (close_type 1), and put back there all the remaining times (close_type 3) 😉
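
The same kind of overview can be produced for all statements at once, e.g. to spot further candidates where parse and close counts don’t line up as expected. A sketch, using the close type interpretation from above:

select sqlid,
       sum(case when call_type = 'PARSE' then 1 else 0 end) parses,
       sum(case when call_type = 'CLOSE' and close_type = 0 then 1 else 0 end) hard_closes,
       sum(case when call_type = 'CLOSE' and close_type in (1, 3) then 1 else 0 end) cached_closes
from timeline
group by sqlid
order by parses desc
fetch next 10 rows only;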

As a last check, let’s look at a statement for which tkprof reports 0 parses:

SQL ID: 1p5grz1gs7fjq Plan Hash: 813480514

select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, 
  spare2, spare3, signature, spare7, spare8, spare9 
from
 obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null 
  and linkname is null and subname is null


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        0      0.00       0.00          0          0          0           0
Execute      9      0.00       0.00          0          0          0           0
Fetch        9      0.00       0.11          6         35          0           8
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       18      0.00       0.12          6         35          0           8

Misses in library cache during parse: 0
Misses in library cache during execute: 2

Looking at the timeline:

SQL> select * from timeline where sqlid='1p5grz1gs7fjq';

    LINENO CALL_ SQLID			   CRSR        TIM CALL_DEPTH CLOSE_TYPE
---------- ----- ------------- ---------------- ---------- ---------- ----------
	76 OPEN  1p5grz1gs7fjq	140411771983304 4750303279	    1
	99 CLOSE 1p5grz1gs7fjq	140411771983304 4750343695	    1	       3
       121 CLOSE 1p5grz1gs7fjq	140411771983304 4750369440	    1	       3
       142 CLOSE 1p5grz1gs7fjq	140411771983304 4750386565	    1	       3
      5975 CLOSE 1p5grz1gs7fjq	140411771983304 4752345230	    2	       3
      6167 CLOSE 1p5grz1gs7fjq	140411771983304 4752374262	    2	       3
      6353 CLOSE 1p5grz1gs7fjq	140411771983304 4752400269	    3	       3
      7134 CLOSE 1p5grz1gs7fjq	140411771983304 4753627638	    3	       3
      7505 CLOSE 1p5grz1gs7fjq	140411771983304 4753804698	    3	       3
      7969 CLOSE 1p5grz1gs7fjq	140411771983304 4754168121	    3	       3

… we see that in this case, the cursor was put back into the cursor cache already the first time it was closed.
Still, we have seen above that even a “softer soft parse” – one that already finds the cursor in the cache – should make a PARSE line appear in the trace file. I suspect that with SQL trace information alone it might not be possible to give a satisfying explanation here.

Coming next

Anyway, we’ve seen how to get the flow of the sql statements contained in the trace file, which opens up an enormous space for further explorations :-).
In the next installments, we’ll see how to make use of the various types of information contained in a trace file.