Monday, December 6, 2010

Oracle 10g BPEL dehydration store purge strategies

After many months of deliberation, here is the white paper that I have been working on to define 10g BPEL dehydration store purge strategies. You can find it here on OTN. The main goal of the paper is to give SOA administrators a way to identify their SOA data footprint and choose the purging strategy that best meets their requirements.

Please give it a read and I would welcome any feedback for further editions. I am currently working on an 11g purging white paper due for release in early 2011.

Thanks,

Deepak

Friday, September 10, 2010

11g Dynamic partnerlink example

In one of my blog postings I had mentioned using dynamic partnerlinks in 11g but didn’t post an example since 11g was not out then. Well, here is the example. I checked it against Sean Carey’s example from the BPEL Cookbook and there is only one big change: since there is no bpel.xml anymore, all references to static WSDLs are now located in the composite.xml. So in my very simple example:

<reference name="HelloWorld" ui:wsdlLocation="HelloWorld.wsdl">
  <interface.wsdl interface="http://xmlns.oracle.com/HelloWorld/HelloWorlf/HelloWorld#wsdl.interface(HelloWorld)"
                  callbackInterface="http://xmlns.oracle.com/HelloWorld/HelloWorlf/HelloWorld#wsdl.interface(HelloWorldCallback)"/>
  <binding.ws port="http://xmlns.oracle.com/HelloWorld/HelloWorlf/HelloWorld#wsdl.endpoint(client/HelloWorld_pt)"
              location="HelloWorld.wsdl"/>
</reference>

There is still a reference to the local WSDL (not remote), which is used as a static interface to the actual WSDL that is passed at runtime. None of the other artifacts change in 11g, i.e. the EndpointReference variable, the ServiceName, the Address and the assign to the partnerlink all stay the same.

In my example I am including a SequentialProcess.wsdl which is not used in the project, but it can be used as a template for defining a static WSDL in future projects. At the moment my GoDynamicBPEL process has the values for ServiceName and Address set at design time, but these can be changed to pick up the values at runtime instead to make the process truly robust.

The project can be downloaded from here:

GoDynamic.zip

As always comments and feedback welcome.

DA

Tuesday, August 31, 2010

How to automate deletion of XML_DOCUMENT partitions in 10g SOA

I just recently returned from the Middle East where I was visiting a client to conduct a performance analysis. As part of this exercise I spent a lot of time going over their purging strategies for their BPEL dehydration store. Since I just finished writing the 10g BPEL purging strategy white paper, this was a perfect ground to test some of the theories. After some careful analysis we concluded that we would have to use the hybrid approach – the multi-looped purge script and partitioning for XML_DOCUMENT as our purging strategy. While conducting this exercise I came up with an interesting way to automate the deletion (drop partitions) for the XML_DOCUMENT.

For this exercise we need to use the partitioning scheme and verify scripts as described in Michael Bousamra’s partitioning white paper. I am not going into the semantics of the hybrid approach (which is covered in my strategy white paper) but will focus on how to automate the deletion of XML_DOCUMENT partitions. Currently the verify script takes an array with the names of the partitions to check for deletion. This is a manual task for the DBA, i.e. the partition names have to be provided by hand. The problem with this approach is that someone not only has to remember all the partition names that are ready for deletion but also has to keep track of partitions that were not dropped previously.

For example, there are 6 monthly partitions: partitionA, partitionB, partitionC, partitionD, partitionE, partitionF, and they are dropped on a monthly basis. Let's assume that in the 1st month partitionA was not dropped since there were still running BPEL instances. In the 2nd month the DBA would now have to pass two names to the verify script, partitionA and partitionB. Let's assume that partitionB was dropped but partitionA was not (long running BPEL processes). Then in the 3rd month the DBA has to pass partitionA and partitionC to the verify script, and so on. So in this case the DBA not only has to remember which names to pass to the verify script but also has to keep track of partitions that were not dropped in past purge cycles. As you can imagine, this can quickly get complicated with a lot of partitions in the mix. Here is how this can be automated:

1. Create a new table called XML_DOC_PARTITION_STATE with 4 columns in the ORABPEL schema. The column names are: PartitionName, StartDate, ExpiryDate, isDropped

XML_DOC_PARTITION_STATE

PartitionName   StartDate    ExpiryDate   isDropped
partitionA      01-01-2010   31-01-2010   N
partitionB      01-02-2010   28-02-2010   Y
partitionC      01-03-2010   31-03-2010   N

2. Create a DB trigger to automatically populate the above table whenever a new XML_DOCUMENT partition is created.

3. Create a SQL query which will read from the XML_DOC_PARTITION_STATE based on the expiry date and state and pass the names of the partitions to the verify script. (If you would like you can embed this query into the verify script directly – DC_EXEC_VERIFY): The SQL would look like this:

SELECT PARTITIONNAME FROM XML_DOC_PARTITION_STATE WHERE EXPIRYDATE < SYSDATE-1 AND ISDROPPED='N';

The goal is to select all the partition names that have not been dropped and meet the purging date criteria. So based on the above example partitionA and partitionC would be selected and passed to the DC_EXEC_VERIFY.sql.
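To see the selection rule in action without touching the ORABPEL schema, here is a minimal sketch in Python using an in-memory SQLite table. It is only an illustration: SQLite stands in for Oracle, the dates are stored as ISO strings, the `today` argument replaces SYSDATE, and the function names are my own, not part of the verify scripts.

```python
import sqlite3
from datetime import date, timedelta


def build_demo_state_table():
    """Create an in-memory copy of the example XML_DOC_PARTITION_STATE table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE XML_DOC_PARTITION_STATE ("
        "PartitionName TEXT PRIMARY KEY, StartDate TEXT, "
        "ExpiryDate TEXT, isDropped TEXT)"
    )
    # ISO date strings sort in the same order as the dates they represent.
    conn.executemany(
        "INSERT INTO XML_DOC_PARTITION_STATE VALUES (?, ?, ?, ?)",
        [
            ("partitionA", "2010-01-01", "2010-01-31", "N"),
            ("partitionB", "2010-02-01", "2010-02-28", "Y"),
            ("partitionC", "2010-03-01", "2010-03-31", "N"),
        ],
    )
    conn.commit()
    return conn


def partitions_to_verify(conn, today):
    """Names of partitions that expired before (today - 1 day) and are not dropped."""
    cutoff = (today - timedelta(days=1)).isoformat()  # stands in for SYSDATE-1
    rows = conn.execute(
        "SELECT PartitionName FROM XML_DOC_PARTITION_STATE "
        "WHERE ExpiryDate < ? AND isDropped = 'N' ORDER BY StartDate",
        (cutoff,),
    ).fetchall()
    return [name for (name,) in rows]
```

Calling `partitions_to_verify(build_demo_state_table(), date(2010, 4, 15))` returns partitionA and partitionC, which matches the example above; partitionB is skipped because it is already flagged as dropped.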

4. At the moment the verify script creates a report stating whether a partition can be dropped or not. The actual deletion happens outside of the verify scripts. This can be automated by changing DC_VERIFY.sql to issue the ALTER TABLE command directly in the verify script, so that both happen in one shot:

THEN
      IF (Ci_Ref_Part_Ok(UPPER(doc_drv_list(i))))
      THEN
        IF (Ad_Part_Ok(UPPER(doc_drv_list(i))))
        THEN
          UTL_FILE.Put_Line (PART_HANDLE,'PASS: ALL DOCUMENTS ARE UNREFERENCED THUS THE');
          UTL_FILE.Put_line (PART_HANDLE,'        XML_DOCUMENT PARTITION CAN BE DROPPED');

          Delete_Partition(UPPER(doc_drv_list(i)));

        ELSE
          UTL_FILE.Put_Line (PART_HANDLE,'FAIL: AUDIT_DETAILS TABLE HAS ACTIVE DOCUMENTS');
          UTL_FILE.Put_Line (PART_HANDLE,'         THUS THE XML_DOCUMENT PARTITON CANNOT BE DROPPED');

The Delete_Partition procedure call will just do the following (pseudo code), where the partition name is doc_drv_list(i), the current partition that the script is looping over:

ALTER TABLE XML_DOCUMENT DROP PARTITION <partition name>;

The above mechanism will generate the report and also drop the partition at the same time instead of doing this at separate times.

5. Once a partition has been dropped update the XML_DOC_PARTITION_STATE table to update the isDropped column to ‘Y’ for that partition. So the SQL in the DC_VERIFY.sql would look like:

THEN
      IF (Ci_Ref_Part_Ok(UPPER(doc_drv_list(i))))
      THEN
        IF (Ad_Part_Ok(UPPER(doc_drv_list(i))))
        THEN
          UTL_FILE.Put_Line (PART_HANDLE,'PASS: ALL DOCUMENTS ARE UNREFERENCED THUS THE');
          UTL_FILE.Put_line (PART_HANDLE,'        XML_DOCUMENT PARTITION CAN BE DROPPED');
          Delete_Partition(UPPER(doc_drv_list(i)));
          Update_State_Table(UPPER(doc_drv_list(i)));

where the Update_State_Table procedure just updates the state for that partition (pseudo code):

UPDATE XML_DOC_PARTITION_STATE SET ISDROPPED='Y' WHERE PARTITIONNAME=doc_drv_list(i);

COMMIT;

So using our above example if both partitionA and partitionC were dropped the XML_DOC_PARTITION_STATE would look like this:

XML_DOC_PARTITION_STATE

PartitionName   StartDate    ExpiryDate   isDropped
partitionA      01-01-2010   31-01-2010   Y
partitionB      01-02-2010   28-02-2010   Y
partitionC      01-03-2010   31-03-2010   Y
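The whole cycle from steps 3 to 5 (select, verify, drop, update state) can be sketched in a few lines of Python. This is a simulation under stated assumptions, not the actual DC_VERIFY.sql logic: SQLite stands in for Oracle, and `can_drop` and `drop_partition` are hypothetical callbacks that represent the Ci_Ref_Part_Ok/Ad_Part_Ok checks and the ALTER TABLE ... DROP PARTITION call.

```python
import sqlite3


def run_purge_cycle(conn, cutoff_iso, can_drop, drop_partition):
    """Verify, drop and mark every expired partition that is still flagged 'N'."""
    rows = conn.execute(
        "SELECT PartitionName FROM XML_DOC_PARTITION_STATE "
        "WHERE ExpiryDate < ? AND isDropped = 'N' ORDER BY StartDate",
        (cutoff_iso,),
    ).fetchall()
    dropped = []
    for (name,) in rows:
        if not can_drop(name):
            # Still referenced by running instances: leave it flagged 'N'
            # so that the next cycle picks it up again automatically.
            continue
        drop_partition(name)
        conn.execute(
            "UPDATE XML_DOC_PARTITION_STATE SET isDropped = 'Y' "
            "WHERE PartitionName = ?",
            (name,),
        )
        conn.commit()
        dropped.append(name)
    return dropped
```

The point of the design is that a partition which fails verification is simply left alone. Because its flag stays 'N', it reappears in the next cycle's candidate list, so nobody has to remember it.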

Summary:

By using the STATE tables approach you can automate the purging of XML_DOCUMENT partitions. There is no need to track or remember the partition names and the partitions can be dropped directly in the verify scripts. This same methodology can be applied for other partitioned tables to help with automated purging.

As always, I would love to hear your comments and/or questions.

DA!

Wednesday, July 28, 2010

The German F1 – some thoughts…

Amid the fuss about Ferrari 'fixing' the result, it's been easy to overlook a few significant developments that occurred in Germany...

Ferrari Have Found Their Pace
Ferrari have been quick ever since they introduced a major upgrade package - including a rear blown diffuser - at Valencia, but circumstances have hitherto denied its full point-scoring impact. A certain controversy has distracted attention and submerged proper appreciation of their on-track resurgence, but both Red Bull and McLaren, beaten by almost half a minute, will have left Hockenheim with deep concerns.

With the exception of Montreal, and even there it is arguable whether McLaren's apparent superiority was genuine, this was the first time since the opening weekend of the season when the Red Bulls have been reduced to second best. With eight races remaining, it was an ominous step forward by the boys in red.

The World Championship Is A Five-Way Fight
With victory, Fernando Alonso has clambered to within 30 points of championship leader Lewis Hamilton. Due to the alteration of the points system, the deficit sounds substantial, but it is actually only the equivalent of a single race win and a bit. In previous years, the difference between Hamilton and Alonso would be measured at approximately 13 points. The Spaniard is very much back in the reckoning.

Moreover, Germany provided two other critical reasons to regard Alonso as a leading challenger, if not the leading contender to usurp Hamilton. The first, as stated above, was the demonstration of Ferrari's superior pace. The second was the realisation that, following the team's tacit withdrawal of Massa from contention, Alonso need not concern himself with his team-mate and has assumed the position of first among equals at Ferrari. Alonso's number-one billing is a luxury that none of his other World Championship rivals possess and could yet prove a critical advantage.

Alonso's Brilliance Has A Flaw – he is a tantrum throwing rich brat!
Describing Alonso as a deserved victor in Germany is to drift into dangerous territory, but he was indisputably the outstanding performer of the weekend. He qualified half a second ahead of his team-mate and was clearly the faster of the two Ferraris in the race. But for Vettel's chop off the line, Massa's defeat would almost certainly have been emphatic.

The limit to the adulation, however, is reached with the reminder of Alonso's petulance and his reaction to adversity. "This is ridiculous," he shrieked on the car-to-pit radio after his first - and only - attempt to pass Massa at full speed was rebuffed. It was a low moment, revealing, once again, the petulance that makes Alonso so difficult to admire. Whereas others, such as Mark Webber, would respond with renewed determination, Alonso's typical response to adversity consists of toy-throwing. It cost him points in Valencia and this weekend's real ridiculousness was his failure to change his ways.

Massa Still Cannot Work The Hard Tyres
As the man himself put it, "what happened today is something that has happened in many races this year: when I put on the hard tyres I struggle". The Brazilian just isn't suited to the tyres' characteristics and nearly slithered off the track three times in the first two laps after his pit-stop. "So I know why sometimes I'm a little bit penalised, it's just because of the very hard tyres that we have this year," he added. But these tyres aren't exclusive to Massa and his failure to adapt is a black mark on his CV. He is making hard work of a difficult job.

McLaren Are Playing Catch-Up Again
It's becoming the team's default position. Still in the process of experimenting with the blown diffuser they failed to successfully introduce at Silverstone, there was further headscratching inside McLaren's garage this weekend when pictures were published apparently showing the front wings of the Ferraris and Red Bulls flexing.

Though the team has repeatedly demonstrated in the past 12 months that it is adept at closing gaps and developing new parts, the constant need to play catch-up in a three-way title fight is not a position of preference. Nor can it be sustainable. Already working on a substantial upgrade, the last thing McLaren need ahead of a three-week factory closedown and another race this weekend is a second new concept to be rushed off the production line.

Vettel Just Can't Get Off The Line
How different - and bereft of talking points - the last two races would have been if the pole-sitter had led into the first corner. Vettel was so bogged down on the line in Germany that, in his own words, he was "lucky not to stall". It is food for thought, and presumably an internal investigation at Red Bull, that Vettel has so far won just one of the six grands prix he has started from pole this season.

Mercedes' Struggles Are Worsening
But for Vettel's poor start setting in train a sequence of events that Ferrari spectacularly mishandled, Sunday's post-race focus would have been trained on the humiliation endured by Mercedes. In front of their home support as well as a troop of company executives, both of their cars were lapped by the top three. Sometimes there is too much of a good thing - as their race unraveled, the team could have been forgiven for wishing for a mechanical gremlin to excuse their lack of competitiveness.

If it is true that attention - and resource - has already switched to 2011, then Michael Schumacher's renewal of vows for next season makes crystal-clear sense. He doesn't want to spend the next four months touring around the midfield in an abandoned car just for someone else to enjoy the fruits of Mercedes' refocus next March.

Button's Decision To Leave Mercedes Has Been Vindicated
As an enhancement of his World Championship prospects, Button's decision to join McLaren was vindicated long ago. Despite pitching himself in direct competition against Lewis Hamilton, the fact of the matter is that Button possesses a greater chance of retaining his title at McLaren than he would if he had remained at Mercedes.

A statistic worth highlighting at this juncture is that Button currently holds more points than the two Mercedes drivers, Schumacher and Nico Rosberg, combined.

Renault Are Also Falling Back
The team's cash-flow shortage is such that they have reportedly requested an advance on money due at the end of the season.

The disclosure that the team have had to tighten their belt - and possibly even mothballed development work - tallies with the impression they have, along with Mercedes, fallen back towards the also-runners. As a result, the top three appear to have broken away into a league of their own.

It's a significant development in the World Championship because it means that, but for mechanical failure or unfortunate circumstances, McLaren - the current leaders of the Constructors' Championship - should be able to finish no lower than fifth and sixth in the eight races still to be run. Red Bull, trailing by a mere 28 points, will not be unduly bothered. But the likelihood of the McLarens being able to collect a decent haul of points in all the races to come is a large obstacle in the way of Ferrari closing a gap that currently stands at 92 points. This weekend, their 1-2 pulled back just 20 against McLaren's 4-5. One more mistake would cause Ferrari to surrender their hopes.

Monday, May 17, 2010

UAT Testing for SOA 10g clusters

Test cases for each component - Oracle Application Server 10G

General Application Server test cases

This section covers very general test cases to make sure that the Application Server cluster has been set up correctly and that you can start and stop all the components in the server via opmnctl and the AS Console.

Test Case 1

Check if you can see AS instances in the console

Implementation

1. Log on to the AS Console --> check to see if you can see all the nodes in your AS cluster. You should be able to see all the Oracle AS instances that are part of the cluster. This means that the OPMN clustering worked and the AS instances successfully joined the AS cluster.

Result

You should be able to see if all the instances in the AS cluster are listed in the EM console. If the instances are not listed here are the files to check to see if OPMN joined the cluster properly:

  • $ORACLE_HOME\opmn\logs\opmn.log
  • $ORACLE_HOME\opmn\logs\opmn.dbg

If OPMN did not join the cluster properly, please check the opmn.xml file to make sure the discovery multicast address and port are correct (see this link for opmn documentation). Restart the whole instance using opmnctl stopall followed by opmnctl startall. Then log on to the AS console to see if the instance is listed as part of the cluster.

Test Case 2

Check to see if you can start/stop each component

Implementation

  1. Check each OC4J component on each AS instance
  2. Start and stop each and every component through the AS console to see if it will start and stop.
  3. Do that for each and every instance.

Result

Each component should start and stop through the AS console. You can also verify that a component started by logging on to each box associated with the cluster and checking opmnctl status.

Test Case 3

Add/modify a datasource entry through AS console on a remote AS instance (not on the instance where EM is physically running)

Implementation

  1. Pick an OC4J instance
  2. Create a new data-source through the AS console
  3. Modify an existing data-source or connection pool (optional)

Result

Open $ORACLE_HOME\j2ee\<oc4j_name>\config\data-sources.xml to see if the new (and/or modified) connection details and data-source exist. If they do, then the AS console has successfully updated a remote file and the MBeans are communicating correctly.
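If you have many instances to check, the lookup can be scripted. Below is a minimal Python sketch that searches the XML for any element whose name attribute matches the data-source you created. I am deliberately not assuming specific element names here; the function simply reports the tag of every match, and the data-source name you pass in is whatever you used in the AS console.

```python
import xml.etree.ElementTree as ET


def entries_named(xml_text, wanted_name):
    """Return the tag of every element whose 'name' attribute equals wanted_name."""
    root = ET.fromstring(xml_text)
    return [element.tag for element in root.iter() if element.get("name") == wanted_name]
```

To use it, read the contents of data-sources.xml into a string and pass it in together with the name of the data-source; an empty list means the entry did not reach that instance.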

Test Case 4

Start and stop AS instances using opmnctl @cluster command

Implementation

1. Go to $ORACLE_HOME\opmn\bin and use the opmnctl @cluster to start and stop the AS instances

Result

Use opmnctl @cluster status to check for start and stop statuses.

HTTP server test cases

This section will deal with use cases to test HTTP server failover scenarios. In these examples the HTTP server will be talking to the BPEL console (or any other web application that the client wants), so the URL will be http://hostname:port/BPELConsole

Test Case 1

Shut down one of the HTTP servers while accessing the BPEL console and see the request routed to the second HTTP server in the cluster

Implementation

  1. Access the BPELConsole
  2. Check $ORACLE_HOME\Apache\Apache\logs\access_log --> check for the timestamp and the URL that was accessed by the user. Timestamp and URL would look like this:
     1xx.2x.2xx.xxx [24/Mar/2009:16:04:38 -0500] "GET /BPELConsole=System HTTP/1.1" 200 15
  3. After you have figured out which HTTP server this is running on, shut down this HTTP server by using opmnctl stopproc --> this is a graceful shutdown.
  4. Access the BPELConsole again (please note that you should have a LoadBalancer in front of the HTTP server and configured the Apache Virtual Host, see EDG for steps)
  5. Check $ORACLE_HOME\Apache\Apache\logs\access_log --> check for the timestamp and the URL that was accessed by the user. Timestamp and URL would look like above



Result



Even though you are shutting down the HTTP server the request is routed to the surviving HTTP server, which is then able to route the request to the BPEL Console and you are able to access the console. By checking the access log file you can confirm that the request is being picked up by the surviving node.
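Checking the access log by eye gets tedious when you repeat the test on several nodes. Here is a minimal Python sketch that pulls the timestamp and URL out of lines shaped like the sample entry in the steps above. The regular expression only assumes a bracketed timestamp followed by a quoted request, and the function name and the /BPELConsole prefix default are my own choices.

```python
import re
from datetime import datetime

# Matches: [24/Mar/2009:16:04:38 -0500] "GET /some/url ..."
LINE_RE = re.compile(r'\[(?P<ts>[^\]]+)\]\s+"(?P<method>[A-Z]+)\s+(?P<url>\S+)')


def console_hits(lines, url_prefix="/BPELConsole"):
    """Return (timestamp, url) pairs for requests whose URL starts with url_prefix."""
    hits = []
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("url").startswith(url_prefix):
            timestamp = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
            hits.append((timestamp, match.group("url")))
    return hits
```

Run it against the access_log of each HTTP server before and after the shutdown; the node whose list gains a new entry with a timestamp after the shutdown is the one that served the request.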



Test Case 2



Repeat the same test as above but instead of calling opmnctl stopproc, pull the network cord of one of the HTTP servers, so that the LBR routes the request to the surviving HTTP node --> this is simulating a network failure.



Test Case 3



In test case 1 we simulated a graceful shutdown. In this case we will simulate an Apache crash.



Implementation




  1. Use opmnctl status -l to get the PID of the HTTP server that you would like to forcefully bring down


  2. On Linux use kill -9 <PID> to kill the HTTP server


  3. Access the BPEL console



Result



When you kill the HTTP server, OPMN will restart it. The restart may be so quick that the LBR still routes the request to the same server. One way to check that the HTTP server restarted is to check the new PID and the timestamp in the access log for the BPEL console.



BPEL test cases





This section is going to cover scenarios dealing with BPEL clustering using jGroups, BPEL deployment and testing related to BPEL failover.



Test Case 1



Verify that jGroups has initialized correctly. There is no real testing in this use case just a visual verification by looking at log files that jGroups has initialized correctly.




  • Check the opmn log for the BPEL container for all nodes at $ORACLE_HOME/opmn/logs/<group name><container name><group name>~1.log. This logfile will contain jGroups related information during startup and steady-state operation. Soon after startup you should find log entries for UDP or TCP.

  • Example jGroups Log Entries for UDP

    Apr 3, 2008 6:30:37 PM org.collaxa.thirdparty.jgroups.protocols.UDP createSockets
    INFO: sockets will use interface 144.25.142.172

    Apr 3, 2008 6:30:37 PM org.collaxa.thirdparty.jgroups.protocols.UDP createSockets
    INFO: socket information:
    local_addr=144.25.142.172:1127, mcast_addr=228.8.15.75:45788, bind_addr=/144.25.142.172, ttl=32
    sock: bound to 144.25.142.172:1127, receive buffer size=64000, send buffer size=32000
    mcast_recv_sock: bound to 144.25.142.172:45788, send buffer size=32000, receive buffer size=64000
    mcast_send_sock: bound to 144.25.142.172:1128, send buffer size=32000, receive buffer size=64000

    Apr 3, 2008 6:30:37 PM org.collaxa.thirdparty.jgroups.protocols.TP$DiagnosticsHandler bindToInterfaces

    -------------------------------------------------------
    GMS: address is 144.25.142.172:1127
    -------------------------------------------------------

  • Example jGroups Log Entries for TCP

    Apr 3, 2008 6:23:39 PM org.collaxa.thirdparty.jgroups.blocks.ConnectionTable start
    INFO: server socket created on 144.25.142.172:7900

    Apr 3, 2008 6:23:39 PM org.collaxa.thirdparty.jgroups.protocols.TP$DiagnosticsHandler bindToInterfaces

    -------------------------------------------------------
    GMS: address is 144.25.142.172:7900
    -------------------------------------------------------

  • In the log below, "socket created on" indicates that the TCP socket is established on the node itself at that IP address and port, and "created socket to" shows that the second node has connected to the first node, matching the IP address and port in the logfile above.

    Apr 3, 2008 6:25:40 PM org.collaxa.thirdparty.jgroups.blocks.ConnectionTable start
    INFO: server socket created on 144.25.142.173:7901

    Apr 3, 2008 6:25:40 PM org.collaxa.thirdparty.jgroups.protocols.TP$DiagnosticsHandler bindToInterfaces

    -------------------------------------------------------
    GMS: address is 144.25.142.173:7901
    -------------------------------------------------------

    Apr 3, 2008 6:25:41 PM org.collaxa.thirdparty.jgroups.blocks.ConnectionTable getConnection
    INFO: created socket to 144.25.142.172:7900


Result



By reviewing the log files, you can confirm that BPEL clustering at the jGroups level is working and that the jGroups channel is communicating.
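The visual check can also be scripted if you have many nodes. The following Python sketch scans the text of a container log for the two markers discussed above, "GMS: address is" and "created socket to". It assumes the log lines look like the examples in this post; the function name and return shape are purely illustrative.

```python
import re

GMS_RE = re.compile(r"GMS: address is (\S+)")
PEER_RE = re.compile(r"created socket to (\S+)")


def summarize_jgroups_log(text):
    """Collect the node's own jGroups addresses and the peers it connected to."""
    return {
        "local": GMS_RE.findall(text),
        "peers": PEER_RE.findall(text),
    }
```

A node that reports a local address but an empty peers list in a TCP setup is worth a closer look, since it may not have connected to the other cluster members.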



Test Case 2



Test connectivity between BPEL Nodes



Implementation




  1. Test connections between different cluster nodes using ping, telnet, and traceroute. The presence of firewalls and number of hops between cluster nodes can affect performance as they have a tendency to take down connections after some time or simply block them.


  2. Also reference Metalink Note 413783.1: "How to Test Whether Multicast is Enabled on the Network."



Result



Using the above tools you can confirm whether Multicast is working and whether the BPEL nodes are communicating.
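For the TCP part of this check, a telnet-style probe is easy to script. Here is a minimal Python sketch; note that it only tests whether a TCP connection to a given host and port can be opened, so it says nothing about multicast (use the Metalink note for that). The function name is my own, and the host and port you pass in would be those of the other cluster node.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Run it from each node against the jGroups TCP port of every other node; a False result usually points at a firewall or a node that is not listening.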



Test Case 3



Test deployment of BPEL suitcase to one BPEL node.



Implementation




  1. Deploy a HelloWorld BPEL suitcase (or any other client specific BPEL suitcase) to only one BPEL instance using ant, JDeveloper or the BPEL console


  2. Log on to the second BPEL console to check if the BPEL suitcase has been deployed



Result



If jGroups has been configured and is communicating correctly, BPEL clustering will allow you to deploy a suitcase to a single node, and jGroups will notify the second instance of the deployment. The second BPEL instance will go to the DB and pick up the new deployment after receiving the notification. The result is that the new deployment will be "deployed" to each node, even though you only deployed to a single BPEL instance in the BPEL cluster.

Test Case 4



Test to see if the BPEL server fails over and if all asynch processes are picked up by the secondary BPEL instance



Implementation




  1. Deploy two Asynch processes:

    1. A ParentAsynch Process which calls a ChildAsynchProcess with a variable telling it how many times to loop or how many seconds to sleep


    2. A ChildAsynchProcess that loops or sleeps or has an onAlarm




  2. Make sure that the processes are deployed to both servers


  3. Shut down one BPEL server


  4. On the active BPEL server call ParentAsynch a few times (use the load generation page)


  5. When you have enough ParentAsynch instances shut down this BPEL instance and start the other one. Please wait till this BPEL instance shuts down fully before starting up the second one.


  6. Log on to the BPEL console and see that the instances were picked up by the second BPEL node and completed



Result



The BPEL instances will fail over to the secondary node and complete the flow.



ESB test cases





This section covers the use cases involved with testing an ESB cluster. For this section please follow Metalink Note 470267.1 which covers the basic tests to verify your ESB cluster.

Thursday, April 29, 2010

Back after a very long break

So, I have been away, for tooooo long. Sorry, I don’t think anybody missed me, but I should apologise nonetheless. Anyway, for those that missed me, I just recently finished a gruelling 15-month Executive MBA at Queen’s University which basically required a lot of my attention and time. It was a lot of late nights, constantly working to keep up with the course work and also fulfilling my Oracle duties. It is something that I will cherish for a very long time and something that will help me move on in life and in my career.

The MBA helped me learn the finer aspects of running a business, the financial acumen and other details that are required to be successful. Moreover it has taught me to think better and broader, and to analyse more information in less time. But most importantly it has taught me how important a team is and how a team can be your best asset to be successful. The Queen’s MBA has taught me how best to function within a team, how to leverage your team to be successful and how to use cleverly devised processes to get the best out of your team.

The Queen’s Executive MBA was an amazing experience, one never to be forgotten. I can only look at the past 15 months in awe and respect, always wondering how I ever finished this course. I have made some amazing friends, shared some great memories and had lots of laughs along the way. But now it’s time to get back to the real world and get down to business.

I have been busy since the beginning of the year and have some great material ready to be shared from my SOA escapades with various clients. I will try to update this space often with more blog postings and information. Stay tuned, it’s good to be back!!

Thanks,

Deepak