Wednesday, November 26, 2008

Metrics to gather when doing performance exercises for Oracle SOA Suite

In a lot of my performance gigs, the one question that keeps coming up is: what are you trying to achieve, i.e. what is the end goal:

  • Throughput (# of messages or # completed transactions, TPS)
  • Response Time
  • Reliability
  • Data Resiliency (also known as no loss of data)
  • Any combination of the above
  • All of the above?

Once the objective has been highlighted, the next question is diagnostics: what logs and data need to be gathered during a performance run to drive your tuning? This topic was also the subject of my presentation at OOW 2008, which to my surprise fostered a lot of post-presentation conversations, so I thought I should put it up on my blog.

Metrics can be broken down into these main sections:

  • Box/OS
  • JVM
  • DB
  • SOA Suite (BPEL and ESB)


Machine where SOA Suite is running

  • CPU
  • vmstat
  • prstat
  • iostat
  • VisualGC
  • top
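As a minimal sketch, the OS-level tools above can be captured by a script during each run so that snapshots line up with your test windows. The interval, sample count, and output directory here are illustrative values of my own, not anything SOA Suite prescribes:

```shell
#!/bin/sh
# Capture OS metric snapshots while a performance run is in flight.
# Interval/sample values are illustrative; size them to your run length.
INTERVAL=${INTERVAL:-1}
SAMPLES=${SAMPLES:-3}
OUT_DIR=${OUT_DIR:-./perf_metrics}

# Compose a per-tool, per-run log file name so successive runs can be compared.
metric_file() {
  echo "${OUT_DIR}/${1}_${2}.log"
}

mkdir -p "$OUT_DIR"
RUN_TS=$(date +%Y%m%d_%H%M%S)

# Only invoke the collectors that exist on this box
# (prstat is Solaris-only, iostat needs sysstat on Linux, etc.).
for tool in vmstat iostat; do
  if command -v "$tool" >/dev/null 2>&1; then
    # e.g. "vmstat 1 3" prints one sample per second, three times
    "$tool" "$INTERVAL" "$SAMPLES" > "$(metric_file "$tool" "$RUN_TS")" 2>&1 &
  fi
done
wait
echo "metrics written to $OUT_DIR"
```

Running the collectors in the background and waiting on them keeps all the logs aligned to the same run timestamp.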


Database

  • CPU
  • AWR Reports
  • Redo Logs
  • iostat


BPEL

  • Collect performance metrics from the Metrics page --> they can show you bottlenecks and help you troubleshoot.
  • You can rely on the response time on the BPEL console for synchronous processes only!
  • For asynchronous processes use the CUBE_INSTANCE table in the orabpel schema.
    • Use the start time of the first asynchronous BPEL instance and the end time of the last instance in a given window (completed processes only) --> this gives you the total time taken to complete
    • Count the number of completed BPEL instances
    • Divide the number of completed instances by that time difference to calculate TPS
    • To smooth out variance, also calculate the median per-instance time difference
  • Thread dumps if required
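As a sketch, the TPS arithmetic described above looks like this on CUBE_INSTANCE-style timestamps. The sample data is invented; against a real system you would feed in the creation_date/modify_date pairs of your completed instances:

```python
# Sketch of the TPS/median calculation for completed asynchronous instances.
# The timestamps below are made-up sample data, not real CUBE_INSTANCE rows.
from datetime import datetime
from statistics import median

# (creation_date, modify_date) pairs for completed instances of one process
instances = [
    (datetime(2008, 11, 26, 10, 0, 0), datetime(2008, 11, 26, 10, 0, 4)),
    (datetime(2008, 11, 26, 10, 0, 1), datetime(2008, 11, 26, 10, 0, 3)),
    (datetime(2008, 11, 26, 10, 0, 2), datetime(2008, 11, 26, 10, 0, 8)),
]

begin_time = min(c for c, _ in instances)           # first instance started
end_time = max(m for _, m in instances)             # last instance finished
duration = (end_time - begin_time).total_seconds()  # wall-clock window
tps = len(instances) / duration                     # completed instances per second

# Per-instance latency, and its median for a variance-resistant view
latencies = [(m - c).total_seconds() for c, m in instances]
median_latency = median(latencies)

print(duration, tps, median_latency)  # 8.0 0.375 4.0
```

Note the two different numbers: TPS is count over the whole window, while the median is per-instance latency, which is why they tell you different things about the run.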

Example of BPEL SQL Script for asynchronous processes:


select count(*) COUNT,
       process_id PROCESS_ID,
       max(modify_date) END_TIME,
       min(creation_date) BEGIN_TIME,
       (extract(day    from max(modify_date) - min(creation_date)) * 86400
      + extract(hour   from max(modify_date) - min(creation_date)) * 3600
      + extract(minute from max(modify_date) - min(creation_date)) * 60
      + extract(second from max(modify_date) - min(creation_date))) DURATION_IN_SECOND,
       median(extract(day    from modify_date - creation_date) * 86400
            + extract(hour   from modify_date - creation_date) * 3600
            + extract(minute from modify_date - creation_date) * 60
            + extract(second from modify_date - creation_date)) MEDIAN
  from cube_instance
 where state = 5
   and process_id like <process_name>
 group by process_id


Results of the SQL Script:


Process Name | Count | Begin Time | End Time | Duration in seconds | TPS | Median

I have not shown the data here, but you can run the above script after every performance run to get the TPS for your asynchronous BPEL processes and chart the results as graphs (TPS and Median on the Y axis, Process on the X axis). The graphs will show how your system is behaving after each tuning exercise.


ESB

  • Metrics from the ESB Console.
  • Metrics gathered from log.xml - not intuitive, but you can get a lot of information from this log, e.g. the time taken for the ESB to complete a transaction.
  • Monitor iostat for the ESB process; depending on the OS you may get different results (AIX being the most IO intensive).

VisualGC


VisualGC helps in monitoring how your JVM is behaving and in capturing any thread deadlocks. If you see a flat line in your Eden space while the test is running, it's usually a sign of a deadlock, and it's a good time to gather thread dumps at regular intervals to see what is going on in the JVM.
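A minimal sketch of taking those dumps at regular intervals is below. jstack ships with the JDK (kill -QUIT <pid> is the classic alternative); the file naming, interval, and count are my own conventions, not anything standard:

```shell
#!/bin/sh
# Take thread dumps of a JVM at regular intervals, e.g. while Eden is flat.
# The interval/count below are illustrative; pass the JVM pid as $1.
JVM_PID=${1:-}
INTERVAL=${INTERVAL:-30}
COUNT=${COUNT:-5}

# Name each dump with its sequence number so they can be compared in order.
dump_file() {
  echo "threaddump_${1}_${2}.txt"
}

if [ -z "$JVM_PID" ]; then
  echo "usage: $0 <jvm-pid>" >&2
else
  i=1
  while [ "$i" -le "$COUNT" ]; do
    # jstack prints the full stack of every thread in the target JVM
    jstack "$JVM_PID" > "$(dump_file "$JVM_PID" "$i")" 2>&1
    i=$((i + 1))
    [ "$i" -le "$COUNT" ] && sleep "$INTERVAL"
  done
fi
```

A series of dumps spaced like this is what tools such as Samurai (below) expect, since a deadlock only shows up as threads that stay stuck across consecutive dumps.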

Analysis of Thread Dump

Once you have gathered the thread dumps you can analyse them using a free tool called Samurai.


The red colours are signs of potential deadlocks; by clicking on them you can see where the deadlock is:


Another great tool that has been added in SOA Suite for BPEL is the new statistics page, which provides information about the number of threads that BPEL is using, adapter threads, and various other statistics that can help paint a better picture.


All of the above metrics should be gathered regardless of the performance objectives. The OS- and DB-level metrics can be gathered by automated scripts. The thread dumps depend entirely on whether VisualGC is showing any flat lines, while the BPEL/ESB stats pages and the BPEL SQL script can provide a first-hand view of what is happening in your system.

The next question is how you tune your system - well, that ties back into the first question of what you are trying to achieve, throughput or response time. Since the two are inversely related there is always a price to pay, so there is always a choice to be made.

Happy gathering!


Wednesday, October 15, 2008

Dynamic BPEL PartnerLinks and Dynamic Routing of BPEL processes - a powerhouse combination

This is a subject that I have been working on for quite some time, and the more I have investigated it, the more I have realized that this model is very powerful in today's ever-changing business world. It allows your BPEL processes to be very dynamic, both from a development point of view and from a runtime point of view, by adding an additional intangible runtime value. Let's review each of these individually first. Please note that this blog post will not contain any code examples.

Dynamic BPEL PartnerLinks

In a lot of development environments, we sometimes do not know what service we are going to call or, for that matter, where that service resides. So it's very hard at development time to provide an endpoint to our BPEL process, and the questions to answer are: what service do we call, and where does it live?

This is where dynamic partnerlinks come in handy: they resolve both questions by providing the answers at runtime (magically ;o) ). Well, not really magically - these values are passed to the BPEL process at runtime, and using the WS-Addressing feature of BPEL PM we can invoke and receive callbacks from these dynamic endpoints without having to provide the values at design time. The only caveat is that you need to know the data model of the service you will be calling, i.e. its XSD or a variant of it. For an example please visit the BPEL Cookbook on OTN. (I have a working example of dynamic partnerlinks for 11G, which I will post at a later date, once 11G is released.)

Dynamic Routing of BPEL processes

This aspect of the model relates to using some sort of a Business Rules Engine, which routes your workflow based on business logic/events, all of it dynamic. The only two things given are the start and the end; what path is taken to get there will not be known at design time - it's only at runtime that all will unfold. There are no caveats here, but you need to design and develop your business rules and make them available to BPEL to allow for dynamic routing of your workflows.

Combined Model

Now if we take the dynamic partnerlinks and combine them with dynamic routing, you end up with a double whammy: a fully dynamic BPEL process AND workflow!! How many times have you wished for this? Well, wait no longer, cuz here she is!!

It's a little hard to imagine, but the benefits of this model are enormous, e.g.:

- minimal impact on your parent BPEL process: even if you change your endpoints, or even the work that the service does, there is no impact on the calling BPEL process

- the only contract between the BPEL process and its partnerlinks is the XSD; as long as that stays constant, nothing else matters to the BPEL process

- with all of your partnerlinks dynamic (or most, or some of them - you may still have to keep some static), changing the workflow does not matter, since whatever services are called all conform to the same XSD contract

- this model allows you to quickly change the partnerlinks and workflow (rules) with no impact on the BPEL process; since the rules and partnerlinks are external to the BPEL process, there is no redeployment!

- imagine being able to point to another version of your service and only have it called when a certain rule is satisfied, while still having the older service/rule up and running - and doing all this with NO DOWN TIME!!

Of course, with all the benefits come some costs :) and the one cost (not bad odds, one cost vs so many benefits) is the contract between services, i.e. the XSD --> for this whole model to work you need to come up with a common XSD for all the dynamic services that will be called, also known as a canonical XSD. Now, this canonical XSD can be a simpler version of a "true" canonical XSD - not as many elements in it, just enough to satisfy the endpoint. But this still means work on your part to come up with this legal contract, and it may mean "modifying" services to meet the requirement. Some of the services that you are calling may not be owned by you, so it may get tricky unless you can get confirmation that the XSDs will not change in the near future (and that the endpoint/implementation will not change that frequently either).

The thing to note here is that this model works best for use cases where you are expecting a lot of change - for example in the legal industry, or in policy-driven industries (insurance, banking), where the very dynamic nature of the business may require processes to change very frequently. If, however, your business needs are static and do not expect much change, this model may prove too "hectic", though it will not hurt - if you are looking to implement this approach, it would be prudent to do a feasibility exercise first.

Another benefit of this approach, and in my view one of the most important, is for all your long-running BPEL processes - my definition of a long running BPEL process (aka "active durable processes") is a process that cannot survive a rolling upgrade (usually 2 days), i.e. it has to finish where it started.

Processes can run for months or years, and the problem with active durable processes is that if you have to change their implementation, or if you have to migrate, you need to wait for them to finish. Well, with this model you can turn all your active durable processes into multiple short-running dynamic BPEL processes (a small design change), so any change to their implementation will have no impact at all on the BPEL processes.

So here ya go, a powerhouse at your disposal. As always comments and feedback welcome!


Thursday, July 24, 2008

To UDDI or not to UDDI

So here is my first official post to my blog, and unfortunately it's not a happy one....

In June and most of July I was involved in designing and building a prototype, at HQ, for Oracle's next generation of Fusion Applications. I was tasked with designing the new Security Framework that Fusion Applications will use, and a big chunk of the architecture involved using the 11G SOA Suite (apologies, but I can't divulge more architectural details).

Being a big proponent of standards, we chose Systinet's UDDI as the WebService registry for publishing and looking up our WSDL endpoints, to make our BPEL processes and partnerlinks dynamic. The use case was simple: publish a WebService (BPEL process or otherwise) to the UDDI registry and then look it up using the service name (and not the registryServiceKey as shown here by Clemens). We could not use the serviceKey, since all we had at design time was the service name and not the key - think of this lookup as a JNDI lookup of a resource based on a JNDI name. UDDI experts will argue that it's a bad idea to use a service name as the lookup parameter, since a service name is not unique - and what happens when you have multiple versions of that same service? Well, for this prototype we made the assumption that there will always be one service with that name, and hence it is unique. We would tackle the version issue down the road, one step at a time.

So instead of using the registryServiceKey we had to use the UDDI/Systinet API to look up a service, I mean how hard can that be, right?? So we followed the Systinet examples that ship with the product, made sure that we could use them and then extend/change them to our needs.

So the first thing that hit me was the poor (read: non-existent) JavaDocs and the lack of examples on the Web. The irony of this exercise was that the Systinet UI had support for what we wanted to do (give me an endpoint based on a name), but there was no way to figure out what the underlying code looked like. I tried some reverse engineering, but that led to some more frustration and hair pulling and eventually got us nowhere. Anyway, after 12 days of looking, decompiling, and posting questions on the Systinet forum (for which I still have not received a response), we had something that we could use, even though it still did not give us the functionality we needed.

And here comes the killer....let me ask all you nice Java developers: how many jar files do you need to run a single Java class? 1, 2, 3, 4, 5 jars..?? Hmm, that's what I thought...ummm, but not in this case...we needed a total of 28 jar files!! (That's after trimming from an initial count of 42.) So to run a single class file I need 28 jar files?? Hmmm, talk about modularization!! Well, keeping faith, we kept moving forward, and I looked at how we could add versioning, which brought along another aspect - publishing - which is just another can of worms...

So after 15 days of this, we realized that the amount of complexity involved for such a simple use case was just not worth it, and we relented. UDDI gave way to the invincible and simple "Database 2 column Table". Yes, we are going to store our endpoints in the DB and look them up using service names; all we want is the endpoint to be called dynamically (dynamic endpoints I will cover in my next blog posting). I wish it wasn't this way, but I guess sometimes you just can't beat the good ol' DB. I know that I won't be going close to a UDDI registry for a while....
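A minimal sketch of that two-column lookup is below, using SQLite for brevity (the prototype used an Oracle schema; the table, column, service name, and endpoint URL here are all invented for illustration):

```python
# Sketch of the "Database 2 column Table" endpoint lookup that replaced UDDI.
# Table/column names and the sample service are made up for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table service_endpoint (service_name text primary key, endpoint_url text)"
)
conn.execute(
    "insert into service_endpoint values (?, ?)",
    ("CreditCheckService", "http://soahost:7777/orabpel/default/CreditCheck"),
)

def lookup_endpoint(name):
    # JNDI-style lookup: resolve a service name to its current endpoint at runtime
    row = conn.execute(
        "select endpoint_url from service_endpoint where service_name = ?", (name,)
    ).fetchone()
    return row[0] if row else None

print(lookup_endpoint("CreditCheckService"))
```

The primary key on service_name is what enforces the "one service per name" assumption the prototype relied on, and updating the row repoints callers with no redeployment.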

Take care!


Thursday, July 3, 2008

Welcome to DASOAWorld

A lot of people at Oracle and at customer sites have asked me whether I have my own blog, and the answer has always regrettably been "No"...until now!! I have finally found the time and the energy to create my own space on the web, where you can read about my SOA escapades with BPEL, ESB, UDDI and the rest of the SOA Suite: issues that I have encountered at numerous client sites, architectural discussions that I have had, and best practices that have been implemented.

Here I will try to capture the morbid details of issues and how we resolved them (both in 10G and 11G), and bring a world perspective on how SOA in general, and the Oracle FMW suite of products, are impacting the normal everyday individual in their Web 2.0 lives.

Stay tuned...lots to share!! Enjoy!!

Deepak Arora