Tuesday, October 28, 2014

IBM Integration Bus sizing and performance

It's not the first time that I have been asked to evaluate how many cores would be necessary to run a given load on IBM Integration Bus (IIB).

In this post I will show you a way to make this evaluation based on the public IIB performance reports.

Required information  

Before starting your evaluation, you first need to identify a set of flows that correspond to the ESB patterns you would like to implement: transformation, aggregation, protocols, logging, ...

You also need to define the operating system on which the runtime will run.

For these flows you will need to define the following information:
  • Kind of messages: XML or non-XML
  • Average size of messages
  • Peak load expressed in transactions/sec. I usually use the peak since this is the throughput that you need to sustain.
  • Operating system on which IIB will run
Note that the CPU type also has its importance. Normally you would apply a correction factor to account for a processor that is more (or less) efficient than the one used in the reports.

Performance reports

Once this information is available, you can start your performance evaluation computation.
For this we will use the performance reports that are publicly available. For IBM Integration Bus, go to the message throughput link (a good link to bookmark is the performance topic). For older versions, you can find performance reports at the support pack link.
On the message throughput page, select your operating system.

These performance reports have been measured on an IBM xSeries x3690 X5 with 2 x deca-core Intel(R) Xeon(R) E7-2860 processors.
For those who would like to know the weight associated with this processor, please have a look at the PVU information page. You will find that this processor is rated at 70 PVU per core (2 sockets with 10 cores each).

You will find different types of patterns that have been tested: aggregation, coordinated request/reply, message routing, ...
For each of these tests there is a table providing the message rate, the % CPU busy and the CPU ms/msg.

The information that we will keep is:
  • the message rate
  • the CPU ms/msg

Performance evaluation

How can we use this information?

Let's take an example!

Imagine that you would like to deploy a flow doing aggregation with an average non-persistent message size of 2 kB.
This flow will process 460000 messages a day, but there is a peak: during one hour of the day the flow has to sustain 720000 messages.

We would like to size the bus in such a way that it sustains the peak. This corresponds to a throughput of 720000 / 3600 = 200 transactions/sec.
Let's have a look at the tables available on the performance report page. There is one for aggregation:

            ----------- Non Persistent -----------    ----------- Full Persistent ----------
Msg Size    Message Rate   % CPU Busy   CPU ms/msg    Message Rate   % CPU Busy   CPU ms/msg
256b        4020.1         99.3         4.941         476.9          12.4         5.215
2kB         3430.6         98.3         5.732         707.8          19.7         5.559

For a 2 kB non-persistent message, the processing of one message takes 5.732 ms of CPU.

The processing of 200 transactions per second would take 200 (msg/sec) * 5.732 (CPU ms/msg) = 1146.4 CPU ms per second, which is 1.1464 CPU seconds per second, i.e. about 1.15 cores.

So you would need at least 2 cores to process this load. The total load of a machine with 2 cores would be 57.32%.

It is not recommended to load a processor at more than 70%, but here the 2 cores will do the job (about 57% busy).
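As a small sketch of this calculation (assuming the CPU ms/msg figure is read from the report table for your message size, and using the 70% utilisation guideline as headroom), the sizing can be scripted in Python and replayed for other rates:

    import math

    def required_cores(msg_per_sec, cpu_ms_per_msg, max_busy=0.70):
        # CPU seconds consumed per second of elapsed time
        cpu_load = msg_per_sec * cpu_ms_per_msg / 1000.0
        # Smallest number of cores that keeps utilisation under the target
        cores = math.ceil(cpu_load / max_busy)
        busy_percent = cpu_load / cores * 100
        return cores, busy_percent

    peak_rate = 720000 / 3600                        # 200 msg/sec during the peak hour
    cores, busy = required_cores(peak_rate, 5.732)   # 2 kB non-persistent aggregation
    print(cores, round(busy, 2))                     # -> 2 cores, 57.32 % busy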

Of course you could have more complex situations involving MQ, JMS, databases, routing, ...

It is possible to approximate more complex situations by adding the CPU cost of each scenario.
If my integration needs to be exposed as a web service using SOAP messages, performs routing and transforms messages, I would approximate my CPU cost per message by adding the corresponding figures: 2.633 (SOAP) + 0.488 (routing) + 0.896 (transformation).
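As a rough illustration (the per-scenario CPU ms/msg figures are the ones quoted above; in practice you would read them from the report tables for your own message size), the combination can be computed the same way:

    # CPU ms/msg figures quoted above for each scenario
    scenario_costs = {"SOAP": 2.633, "routing": 0.488, "transformation": 0.896}
    cpu_ms_per_msg = sum(scenario_costs.values())    # 4.017 CPU ms/msg

    peak_rate = 200                                  # msg/sec
    cpu_load = peak_rate * cpu_ms_per_msg / 1000.0   # ~0.80 CPU seconds per second
    print(round(cpu_ms_per_msg, 3), round(cpu_load, 3))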

This is not ideal since each scenario includes its own protocol processing (for example MQ for the routing scenario), which therefore gets counted several times.
In the older reports available on the support pack page, more tests have been done and it is possible to build a spreadsheet to isolate the cost of each mediation.
For example there is a test for MQInput-MQOutput (x ms/msg), one for MQInput-Transformation-MQOutput (y ms/msg) and one for JMSInput-JMSOutput (z ms/msg). For a JMSInput-Transformation-JMSOutput integration flow, you could therefore approximate the CPU cost as y - x + z ms/msg: the transformation cost without the MQ protocol cost, plus the JMS protocol cost.
The performance of IIB V9 is higher than that of V8, so using the old performance reports gives you a good estimate with a built-in safety margin.
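A minimal sketch of that substitution, with x, y and z as hypothetical placeholder values (the real figures have to be read from the support pack reports):

    # Hypothetical CPU ms/msg figures -- replace with the values measured
    # in the support pack performance reports.
    x = 1.0   # MQInput -> MQOutput                  (MQ protocol cost only)
    y = 2.5   # MQInput -> Transformation -> MQOutput
    z = 1.8   # JMSInput -> JMSOutput                (JMS protocol cost only)

    # Transformation cost without the MQ protocol, plus the JMS protocol cost
    jms_transformation_cost = (y - x) + z
    print(jms_transformation_cost, "CPU ms/msg (estimate)")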

References

message throughput: IIB throughput information as a function of the operating system.
performance: performance information, tips and hints.
support pack: older performance reports.
PVU: information page to find the PVU rating for a given CPU.

Tuesday, October 7, 2014

MQ Cluster demystified

In this post I would like to shed some light on MQ clustering: what it is, what it brings and how to set it up.

An MQ cluster is not about data replication or making data highly available; it is about making object definitions known and available within a group of queue managers and providing a workload management capability.

 The "what is it"


Put simply, an MQ cluster is a collection of queue managers that have been defined as part of a group called a cluster.
Within this group, the queue managers know all the objects that are shared in the cluster, such as queues and topics.
Within a cluster, all queue managers know how to send messages to the target clustered destination. The message transfer is handled automatically by the queue managers in the cluster.

The "what does it brings"

The cluster provides the following main advantages:
  • Simplified administration for distributed queuing: when a message has to be sent to a destination hosted on another queue manager, the queue simply has to be shared in the cluster on that remote queue manager, and that's it. The transfer of the message from the queue manager where the application put it to the remote queue manager is carried out by the cluster: no remote queue definitions, no transmission queues, no manually defined channels.
  • Workload balancing: when multiple instances of a queue are defined, the cluster can balance the message distribution across all instances shared in the cluster. Priorities and weights can be defined for specific queue managers.
  • Improved availability and simplified maintenance: when multiple instances of a queue are defined and one of the hosting queue managers is not available, the messages are automatically routed to the remaining instances. This increases availability since messages can still be processed, and it also simplifies maintenance because the cluster knows that the queue manager is stopped.

Some disadvantages to keep in mind:
  • Queue managers in a cluster are directly connected to each other. This increases the number of channels, which consumes more resources.
  • Less control over how messages are sent from one queue manager to another.
  • Less flexibility in queue name resolution compared to distributed queuing (aliases, remote queues, ...).

The "Principle" 

Let's first discuss queue destinations.
When an application tries to put a message to a queue and this queue is not known by the local queue manager (the queue manager that the application is connected to), the queue manager queries the cluster (if it is a member of one) to find out whether the queue is defined there.
If the queue is shared within the cluster, the information about its location is provided to the requesting queue manager. The local queue manager handles the message by putting it on the cluster transmission queue with a transmission header containing all the information needed to route it to the correct destination. The message is then sent by the local queue manager directly to the target queue manager.
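From the application's point of view nothing changes: it simply opens the clustered queue by name and puts its message, and the cluster does the resolution and routing. As an illustrative sketch only, using the pymqi client library with made-up queue manager, channel and queue names:

    import pymqi

    # Hypothetical connection details -- adapt to your environment.
    queue_manager = "QM.LOCAL"
    channel = "APP.SVRCONN"
    conn_info = "localhost(1414)"

    qmgr = pymqi.connect(queue_manager, channel, conn_info)

    # APP.CLUSTER.QUEUE is hosted on another queue manager but shared in the
    # cluster: no remote queue, transmission queue or channel definition is
    # needed here, the cluster resolves the name and routes the message.
    queue = pymqi.Queue(qmgr, "APP.CLUSTER.QUEUE")
    queue.put(b"hello from the cluster")
    queue.close()
    qmgr.disconnect()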

Be aware that an application can only get messages from local queues, clustered or not. Local queues are queues defined on the queue manager that the application is connected to.
The story is a little different for topics. A topic is used to define a destination where an application can publish or subscribe.
When a topic is shared within a cluster, it is known by all queue managers in this cluster.
Therefore if an application subscribes to a cluster topic, it will receive all messages published to this topic, even if the topic has been defined on another queue manager.

The "how to make it" 

You will be surprised how simple it is to create a cluster!

Before going further, there is one thing to know about: repositories. A repository is a store used by a queue manager to hold cluster object definitions.

There are two types of repository: full and partial.
A full repository contains the cluster object definitions of the whole cluster. When a queue manager shares an object in the cluster, it sends the object definition to the queue managers that hold a full repository.
A partial repository holds only the cluster object definitions that its queue manager has had to resolve. When an application puts a message to a remote queue shared in the cluster, the local queue manager requests the destination object information from a queue manager holding a full repository and then stores it in its local repository. Hence the name "partial repository": it does not hold all the object definitions of the whole cluster.

In order to exchange these object definitions, queue managers use cluster channels:
  • a cluster receiver channel to receive object definitions 
  • a cluster sender channel to send object definitions to the queue manager(s) holding the full repository. 
Two rules about the repositories:
  • A queue manager can hold either a full repository or a partial repository (not both) 
  • Two full repositories are enough (one may do outside production). All the other queue managers hold a partial repository only.
With this in mind, let's have a look at how to make it!

The steps provided hereafter have to be performed in this order.

For a full repository
  1. Set the queue manager property "REPOS" to the name of the cluster. This tells the queue manager that it will host a full repository. The runmqsc command would be
    ALTER QMGR REPOS('myCluster')

  2. Create a cluster receiver channel to define how to connect to this queue manager:
    DEFINE CHANNEL('TO.MyQMgr') CHLTYPE(CLUSRCVR) TRPTYPE(TCP) +
           CONNAME('ipaddressOfThisQMgr(portNumberOfThisQMgr)') +
           CLUSTER('myCluster')
  3. Optionally, cluster sender channels can be created, but they must point to another full repository ONLY.
For a partial repository
  1. Create a cluster receiver channel to define how to connect to this queue manager:
    DEFINE CHANNEL('TO.MyQMgr') CHLTYPE(CLUSRCVR) TRPTYPE(TCP) +
           CONNAME('ipaddressOfThisQMgr(portNumberOfThisQMgr)') +
           CLUSTER('myCluster')
  2. Create a cluster sender channel to define how to reach a queue manager holding a full repository (not a partial one). The channel name must match the name of the cluster receiver channel defined on that full repository queue manager:
    DEFINE CHANNEL('TO.FullRepoQMgr') CHLTYPE(CLUSSDR) TRPTYPE(TCP) +
           CONNAME('ipaddressOfFullRepoQMgr(port)') +
           CLUSTER('myCluster')
That's it!
Now you can create a queue and share it in the cluster "myCluster" that you just created.

References 

Information about the number of repositories in a cluster:
https://www.ibm.com/developerworks/community/blogs/messaging/entry/wmq_clusters_why_only_two_full_repositories?lang=en
Best practices (an old presentation, but the principles are still valid):
http://www.academia.edu/5513555/WMQCluster_Best_Practices
Ten quick tips for a healthy cluster:
https://www.ibm.com/developerworks/community/blogs/aimsupport/entry/ten_quick_tips_for_healthy_mq_cluster?lang=en
Very useful information about clustering questions:
https://www.ibm.com/developerworks/community/blogs/messaging/tags/clustering?lang=en
A very good article about MQ high availability:
http://www.ibm.com/developerworks/websphere/library/techarticles/0505_hiscock/0505_hiscock.html