Showing posts with label statistics. Show all posts

Friday, January 5, 2018

Day 24 The tide is high

Day 24

The tide is high

Suppose you do not have Sterling Control Center to run the High Water Mark report for maximum concurrent sessions.
You might want this figure to determine whether you are making optimum use of the number of concurrent sessions that you are licensed for.
You may feel that you do not have enough Connect:Direct nodes to justify Sterling Control Center.

For Windows

The following script water-mark.vbs is an example of how to do this using VBScript and the Connect:Direct Windows SDK.
In fact this is a fairly simple example of how to use the Connect:Direct Windows SDK.
'
' Script to report on the high water mark for the number of concurrent Connect:Direct sessions 
'

Dim node        ' Represents the Connect Direct Node we are using. 
Dim stats       ' A collection of statistic records. 
Dim stat        ' An individual statistic record. 

' Make the OLE/COM connection to Connect:Direct 
Set node = CreateObject("CD.Node")

' Sign on using defaults from the Client Connection Utility
node.Connect "CD.NICKE","",""

' Get the session start (SSTR) and session end (SEND) statistics records for today
set stats = node.SelectStats("select statistics startt=(today) recids=(SSTR,SEND) ccode=(eq,0)")

currentNumberOfSessions = 0
highWaterMark  = 0

For Each stat in stats
       Select Case stat.RecId 
          Case "SSTR"
             currentNumberOfSessions = currentNumberOfSessions + 1
          Case "SEND"
             currentNumberOfSessions = currentNumberOfSessions - 1
       End Select
       If currentNumberOfSessions > highWaterMark Then
          highWaterMark = currentNumberOfSessions
       End If
Next

Wscript.echo "Concurrent sessions high water mark = " & highWaterMark

Set node  = nothing
Set stats = nothing
Set stat  = nothing
If you have the Connect:Direct Windows SDK installed and default sign on credentials registered with the Client Connection Utility you can run the script as follows:
C:\Users\nicke> cscript /nologo water-mark.vbs

Concurrent sessions high water mark = 5
The above script is minimal: it is only meant to be run interactively and does not check for every error.
You run the script on a Windows machine with the SDK installed, but the node in question can be a remote Windows or UNIX Connect:Direct node (I have not tested other platforms), as long as the node is registered with the Client Connection Utility.

For UNIX

If all you have are UNIX machines then you can use the following shell function for convenience:
function watermark
{
        # Usage: cd work/UNIX.NODE ; cat S201712* | watermark
        egrep "RECI=(SSTR|SEND)" | grep CCOD=0 | awk '
BEGIN{
        currentNumberOfSessions=0
        highWaterMark=0
}
/RECI=SSTR/ { currentNumberOfSessions++ }
/RECI=SEND/ { currentNumberOfSessions-- }
{
        if(currentNumberOfSessions > highWaterMark){
                highWaterMark=currentNumberOfSessions
        }
}
END{
        print "Concurrent sessions high water mark = " highWaterMark
}'
}
Put the above shell function definition either in your .profile or just paste it into a terminal session. You will need to be in the work directory for your UNIX node where the statistics files are. Then you can run it as follows by piping whatever period of statistics files you want through the watermark shell function:
[work/CD.UNIX] $ cat S201712* | watermark
Concurrent sessions high water mark = 8
Now you will know whether the number of sessions you use is appropriate for your licensing of these nodes.
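
If you also want to know when the peak was reached, the same idea can be extended. The following is only a sketch under assumptions: the function name watermark_when is made up for illustration, and it assumes each statistics record starts with a STAR=YYYYMMDD HH:MM:SS field, which you should check against your own files.

```shell
# Hypothetical companion to watermark: also report when the peak was reached.
# Assumption: every record begins with STAR=YYYYMMDD HH:MM:SS as its first field.
watermark_when() {
        # Usage: cd work/UNIX.NODE ; cat S201712* | watermark_when
        grep -E "RECI=(SSTR|SEND)" | grep "CCOD=0" | awk -F'|' '
/RECI=SSTR/ { current++ }
/RECI=SEND/ { current-- }
{
        if (current > high) {
                high = current
                # Strip the leading "STAR=" (5 characters) to keep the timestamp
                when = substr($1, 6)
        }
}
END {
        if (high > 0)
                print "Concurrent sessions high water mark = " high " at " when
        else
                print "Concurrent sessions high water mark = 0"
}'
}
```

It is used in the same way as watermark, for example cat S201712* | watermark_when, and reports the timestamp of the first record at which the peak was reached.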



Saturday, November 27, 2010

Least busiest time on the Connect:Direct node?

Day 15

Often when you need to make a change to a production Connect:Direct server,  you want to know when would be the best time to do it.

Sometimes you want to add additional transfers but want to schedule them at a time that does not put all the load at one time of the day.

Either way, you can use the following short script to produce a histogram of the activity on the local node, so that you can identify the best times for scheduling transfers or making configuration changes with as little impact as possible on existing transfers.

I first did this for UNIX Connect:Direct nodes and was then asked if something similar could be done for Windows Connect:Direct nodes.  The Windows script for this used VBScript and an HTML application showing the histogram in HTML and the command line technique you have seen previously.  I’ll save that description for a later post.

This post will just address the UNIX solution to this problem.

You need to feed the “cdhours” shell function with the Connect:Direct UNIX statistics files for the period you are interested in.  For convenience, you probably want to be in the “work” directory for the local node, which is where the statistics files are generated.

An example of how to use the script is given in the comment at the top of the script.

This script was the beginning of a collection of scripts to handle querying the Connect:Direct UNIX statistics/configuration files and the Secure+ configuration.

Next we will look at showing a histogram of the volume of data going through a Connect:Direct node ordered by remote node.  Useful for capacity planning and also billing.



# Connect:Direct Activity by hour Histogram
# =========================================
#
# Usage: cdhours
#
# $ cat S20100903.??? | cdhours
#
# To get a better picture for the month you could say
#
# $ cat S201009??.??? | cdhours
#
# Hours   Transfers
# =====   =========
#
# 00      54 #############
# 01      48 ############
# 02      54 #############
# 03    120 ##############################
# 04    244 ###############################################################
# 05      86 ######################
# 06      66 #################
# 07      76 ###################
# 08      36 #########
# 09       0
# 10      44 ###########
# 11    190 #################################################
# 12      48 ############
# 13      62 ################
# 14      30 #######
# 15      28 #######
# 16      85 #####################
# 17    221 #########################################################
# 18       0
# 19       0
# 20       0
# 21       0
# 22       0
# 23       0
#

function cdhours
{
       _FILES=$*
       # How many columns does the terminal have
       _COLS=`tput cols`
       # We are interested in the Copy Termination ReCords (CTRC)
       cat $_FILES | grep RECI=CTRC | \
       nawk -v cols=$_COLS '
BEGIN{
       # Initialise the array that represents the histogram
       hours_per_day=24
       for(h=0;h < hours_per_day;h++)
       {
               # The keys of the array are packed with a leading zero
               hour=sprintf("%02d",h)
               tally[hour]=0
       }
}
       # This section gets executed for all records passed to nawk
       {
               split($2,fields,":")
               # Update the tally of these records that falls within this particular hour
               tally[fields[1]]++
       }
END{
       for(h=0;h < hours_per_day;h++)
       {
               # The keys of the array are packed with a leading zero
               hour=sprintf("%02d",h)
               # Keep track of the maximum number of tally marks so we can scale the histogram
               if(tally[hour] > max)
               {
                       max=tally[hour]
               }
       }
       node_name_length=16
       # Scale factor to make the histogram fit in the terminal window
       # Guard against division by zero when no CTRC records were found
       if(max == 0) max=1
       scale=(cols-(node_name_length+1))/max;

       printf("Hours\tTransfers\n")
       printf("=====\t=========\n\n")
       for(h=0;h < hours_per_day;h++)
       {
               # The keys of the array are packed with a leading zero
               hour=sprintf("%02d",h)
               # Scale the histogram bar to fit in terminal window
               bar=tally[hour]*scale
               printf "%3s\t%5d ",hour,tally[hour]
               # Generate the histogram bar
               for(b=1;b <= bar;b++)printf "#";
               printf "\n"
       }
}
'
}
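
If you would rather have a plain list than a histogram, the same tally can be sorted so that the quietest hours come first. This is a minimal sketch, not part of the original script collection: the name cdquiet is hypothetical, it uses awk rather than nawk, and it relies on the same assumption as cdhours that each record starts with STAR=YYYYMMDD HH:MM:SS.

```shell
# Hypothetical helper: list the 24 hours ordered from least to most busy.
# Assumption: records start with "STAR=YYYYMMDD HH:MM:SS|", so the second
# whitespace-separated field begins with the hour.
cdquiet() {
        # Usage: cat S201009??.??? | cdquiet
        grep RECI=CTRC | awk '
BEGIN {
        # Start every hour at zero so that idle hours are listed too
        for (h = 0; h < 24; h++) tally[sprintf("%02d", h)] = 0
}
{
        split($2, fields, ":")
        tally[fields[1]]++
}
END {
        for (h = 0; h < 24; h++) {
                hour = sprintf("%02d", h)
                printf "%s %d\n", hour, tally[hour]
        }
}' | sort -k2,2n -k1,1
}
```

Each output line is an hour followed by its number of copy steps, with the least busy hours at the top and ties listed in hour order.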

Friday, October 15, 2010

Difficult C:D questions?

Day 14


Connect:Direct records almost everything in the Connect:Direct statistics.  The statistics can be queried by using the Connect:Direct command line program “direct”, or more friendly programs such as the Connect:Direct Requester, or the browser interface, and even Sterling Control Center (SCC).

The statistics can be queried for information regarding a particular process or in general for errors and even for evidence of compliance to standards etc.

Some queries, however, are impossible to express in the Connect:Direct command line and difficult to express in the other tools mentioned earlier.

Some of these more difficult queries are listed below:

  • What is the maximum number of concurrent sessions being used, at what time, and which nodes had the lion's share of the sessions?
  • Which transfers are not secured using Secure+?
  • Which nodes are using self signed digital certificates?
  • What was the total volume of data transferred ordered by remote node?
  • When is the least busy time on this Connect:Direct node?
  • Which remote nodes are triggering local scripts/processes?
  • What are the transfers that have had a failure, but have not been successfully transmitted later?

The reason why these and other queries are difficult to express in the Connect:Direct command line is that the Connect:Direct statistics contain information that you cannot get at with the Connect:Direct command line.

If you have ever taken a look at the Connect:Direct statistics files on UNIX you might not have liked what you saw:



STAR=20100902 17:00:03|PNAM=PULL|PNUM=98765|SSTA=20100902 17:00:03|STRT=20100902 17:00:03|STOP=20100902 17:00:03|STPT=20100902 17:00:03|SELA=00:00:00|SUBM=aaaacd@unx.aaaa|SBID=aaaacd|SBND=unx.aaaa|SNOD=CD.OTHER|CCOD=0|RECI=CTRC|RECC=CAPR|TZDI=3600|MSGI=SCPA000I|MSST=Copy step successful.
:
etc.



The above Connect:Direct statistic record is for a COPY statement within a Connect:Direct process.  It is just one long line with all the fields separated by the ‘|’ character.  Each field consists of a 4 character field name, an equals sign, and then the value of that field.
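
Because every field is a NAME=value pair separated by ‘|’, pulling one field out of each record takes only a few lines of awk. The function below is a sketch for illustration (the name cdfield is hypothetical); it prints the value of the first field whose name matches the argument.

```shell
# Hypothetical helper: print the value of one named field from each record.
cdfield() {
        # Usage: cat S20100902.047 | grep RECI=CTRC | cdfield PNAM
        awk -F'|' -v name="$1" '
{
        for (i = 1; i <= NF; i++) {
                # Split each NAME=value pair at the first equals sign only,
                # so values that themselves contain "=" are kept whole
                eq = index($i, "=")
                if (eq > 0 && substr($i, 1, eq - 1) == name) {
                        print substr($i, eq + 1)
                        break
                }
        }
}'
}
```

Records that do not contain the requested field produce no output, so the number of lines printed can be smaller than the number of records read.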

The 4 letter field names are documented in the “Connect:Direct for UNIX User Guide”.  You do not need to know them all.  Just use the ones you need when you need them.



PNAM is Process Name
PNUM is Process Number
PNOD is Primary Node Name
SNOD is Secondary Node Name
CCOD is Condition Code
SFIL is Source File Name
DFIL is Destination File Name
DBYW is Destination Bytes Written



A simple UNIX command can help make these statistics files easier to read:

$ cat S20100902.047 | grep RECI=CTRC | tr '|' '\n'

The grep for the string “RECI=CTRC” keeps just those records that are Copy Termination ReCords, i.e. produced by a COPY statement, and the tr puts each field on its own line.

This produces something like the following:



STAR=20100902 17:00:03
PNAM=PULL
PNUM=98765
:
SUBM=aaaacd@unx.aaaa
:
SNOD=CD.OTHER
CCOD=0
RECI=CTRC
:
MSGI=SCPA000I
MSST=Copy step successful.
:
PNOD=unx.aaaa
SNOD=CD.OTHER
LNOD=P
:
CSPE=Y
CSPP=TLSv1
CSPS=TLS_RSA_WITH_AES_256_CBC_SHA
CERT=(/C=GB/ST=Cheshire/L=Congleton/O=A Global Financial Institution Plc/OU=Middleware/CN=CD.OTHER/SN=69009876789765456787654567667729)
CERI=(/O=Trusted Network/OU=Trusted, Inc./OU=Trusted CA/OU=www.trusted.org/SN=78ee48de185b2071c9c9c3b51d7bddc1)
SFIL=\share123\outgoing\AAAAAA.123456.DAT
:
DFIL=/data/projectx/from_agfi/AAAAAA.123456.DAT
:
DBYW=1161
:
etc.



Now that we know the format of the stats records and have an easier way to view them, we can write some shell functions to help us with some of our tasks with Connect:Direct on UNIX and even with these more difficult questions.
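
As a rough illustration of what such a shell function can look like, here is a sketch that totals the bytes written per secondary node. It is not the author's script: the name cdvolume is hypothetical, it only counts successful copy steps (CCOD=0), and it groups by SNOD on the assumption that the secondary node is the remote one, which will not hold for transfers where the local node is the SNOD.

```shell
# Hypothetical sketch: total DBYW (Destination Bytes Written) per SNOD
# for successful Copy Termination ReCords, largest total first.
cdvolume() {
        # Usage: cat S201009??.??? | cdvolume
        grep RECI=CTRC | grep "CCOD=0" | awk -F'|' '
{
        snod = ""; bytes = 0
        for (i = 1; i <= NF; i++) {
                # Both field names are 4 characters, so the value starts at 6
                if ($i ~ /^SNOD=/) snod = substr($i, 6)
                if ($i ~ /^DBYW=/) bytes = substr($i, 6)
        }
        if (snod != "") total[snod] += bytes
}
END {
        # %.0f avoids integer overflow in awks with a 32-bit %d
        for (n in total) printf "%s %.0f\n", n, total[n]
}' | sort -k2,2nr
}
```

Each output line is a node name followed by its byte total.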

Next we will look at one of those questions and how it can be answered in more detail.