Monday, December 24, 2012

Self Sign


Day 19

If you're using SSL/TLS digital certificates with Connect:Direct Secure+ when connecting to your 3rd parties, you may well have a security policy stating that you will not connect to 3rd party machines that use self-signed certificates.

You would probably prefer to connect with a 3rd party node that presents a digital certificate signed by someone you trust, such as VeriSign.

It might be that the Connect:Direct connections were configured before the security policy came along, so the two have never been reconciled.

You may need to find out which of your existing connections use self-signed certificates.

As usual the information is within the Connect:Direct statistics, but it is not viewable with the "select statistics" command from within the Connect:Direct command line interface.
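
Each statistics record is a single line of pipe-delimited key=value fields. A CTRC record holds the copy termination details, including the fields used below; this sketch is abridged and the values are made up:

RECI=CTRC|PNOD=unx.node|SNOD=OTHER.NODE|CSPE=Y|CERI=/C=GB/.../CN=OTHER.NODE|CERT=/C=GB/.../CN=OTHER.NODE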

The following function checks the statistics records piped through it for Secure+ connections using certificates that were signed by themselves.  In other words it looks for statistics records where Secure+ was used and where the certificate issuer is the same as the subject of the server certificate used by the connection.

function selfsigncerts
{
 # Pick out the copy termination records, then split each record's
 # pipe-delimited key=value fields into an array.
 grep RECI=CTRC | awk -F\| '
 {
  # Reset the fields we test, in case a record omits them.
  record["CSPE"]=""
  record["CERI"]=""
  record["CERT"]=""

  for(i=1;i<=NF;i++)
  {
   key=substr($i,1,index($i,"=")-1)
   value=substr($i,index($i,"=")+1)
   record[key]=value
  }

  name = record["PNOD"] ":" record["SNOD"]

  # Secure+ in use and issuer identical to subject: a self-signed certificate.
  if((record["CSPE"] == "Y") && (record["CERI"] == record["CERT"]))
  {
   connections[name]=record["CERI"]
  }
 }
 END{
  # One line per node pair, so repeated transfers are reported once.
  for (name in connections)
  {
   print name ":" connections[name]
  }
 }'
}

It is used like this, assuming you are in the work directory where the statistics files are:

$ cat S20121224.001 | selfsigncerts
unx.node:OTHER.NODE:(/C=GB/L=Lincoln/O=Bank/OU=IT/CN=OTHER.NODE/emailAddress=joe.blogs@bank.co.uk/SN=12345678)
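
The function will just as happily sweep every statistics file in the directory in one go; the wildcard below assumes the S<date>.<sequence> file naming shown above:

$ cat S*.??? | selfsigncerts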

Now that you know which connections use self-signed certificates, you can set about getting them replaced with certificates signed by a trusted 3rd party such as VeriSign.

Other things you could check for are the encryption algorithms used by a connection. Over time encryption algorithms fall out of favour as they come to be considered weaker than the alternatives.

Certificate signing algorithms also need checking for compliance with security policies.  For example, the MD5 hash algorithm was used in the past for signing certificates, but it is now considered weak and has been shown to be exploitable.
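
As a quick spot check, if you have a PEM copy of a certificate and the openssl command available, you can see what the certificate was signed with; the file name here is made up, and the output shown is just one possibility:

$ openssl x509 -in other_node.pem -noout -text | grep 'Signature Algorithm'
        Signature Algorithm: md5WithRSAEncryption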

You may have a security policy stating that certain algorithms are not to be used; you may then have to demonstrate that you don't use them and, where you do, identify the connections concerned for remediation.

The next few blog entries will cover these issues.

Sunday, December 16, 2012

Mistaken Identity

Day 18 

Most problems with translation tables amount to a case of mistaken identity. Sometimes it is the source of the file that is assumed to be something, but turns out to be something else.

For example, someone says they are having problems transferring a file from a VMS system to UNIX, and the Excel spreadsheet is not arriving in the correct format.

Well, in this case you cannot just look at the problem from a VMS/UNIX perspective. The Excel spreadsheet probably originated from a Windows machine. So how was it transferred to the VMS machine? Was it transferred in binary mode? Is the spreadsheet file really an Excel file, or just a .csv file?
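
Questions like these can often be settled quickly on the UNIX side with the standard file and od utilities; the file name here is hypothetical:

$ file spreadsheet.xls
$ od -c spreadsheet.xls | head

A genuine Excel file will typically be reported as binary/application data and the dump will be full of non-printable bytes, whereas a .csv will be reported as text, and \r \n pairs at the ends of its lines point to a Windows origin.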

The answers to those questions have an impact on the problem and its solution. If the file was truly an Excel spreadsheet then you would want to transfer it in binary mode, so that the file ends up at its destination literally the same as at the source of the transfer, no matter how many hops there are in the transfer.

It all depends on what is being used to produce the file to be transferred and what will end up consuming/processing the file at the destination. In the case of an Excel spreadsheet it will be a piece of software expecting a file the same as would be on a Windows machine, hence the binary mode.

If the source of the transfer was a .csv file (comma separated values), i.e. a text file, and it was to be consumed/processed on a UNIX machine by an application, then we would want the file to arrive on the UNIX machine as a UNIX text file with each line terminated by a newline character, as opposed to the carriage-return and newline characters used on a Windows platform.

For this to happen we want Connect:Direct to treat the file as a text file, and not binary as in the previous example. So we would not specify DATATYPE binary as before, but use the default DATATYPE, which is TEXT.
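
As a rough sketch, the binary case might look like the copy step below, written in Connect:Direct for UNIX process syntax. The node name and paths are made up, and the exact SYSOPTS keywords should be checked against the documentation for your platform; for the text case you would simply leave DATATYPE at its default.

copyxls process snode=other.node
 step01 copy
  from (file=/outbound/report.xls pnode
        sysopts=":datatype=binary:xlate=no:")
  to   (file=/inbound/report.xls snode disp=rpl)
pend;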

Sometimes you are told, with absolute certainty, which codepages are being used on both ends of a proposed Connect:Direct transfer.

For example, you may be told that a text file transferred from a mainframe was produced using codepage IBM-1140, and that the application on a Windows machine receiving the file uses UTF-8, an encoding of Unicode.

It really does depend on the application that will consume/process the file. It might be assumed that the application can handle UTF-8, or that, as ASCII is a subset of UTF-8, there should be no problem.

In this case an international character that is available within the IBM-1140 codepage might be used in the file on the mainframe, and this will be translated to the corresponding UTF-8 encoding of that character in Unicode.

For characters that map directly to a single ASCII/UTF-8 byte there will not be a problem, but international characters can be encoded as 2, 3 or even 4 byte UTF-8 sequences.

This is because UTF-8 is a variable byte character encoding. If the application is written to use the Windows codepage CP-1252, then it will only be expecting single byte characters and not the multi-byte Unicode characters that UTF-8 can encode. It will then probably choke on the multi-byte encoding, or just not recognise what it is supposed to represent, and not process the file properly.
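
You can see the difference for yourself on a UNIX machine, assuming a UTF-8 locale and an iconv built with Windows codepage support. The character é is two bytes in UTF-8 but a single byte in CP-1252:

$ printf 'é' | od -A n -t x1
 c3 a9
$ printf 'é' | iconv -f UTF-8 -t CP1252 | od -A n -t x1
 e9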

Imagine that data entered into an application on the mainframe uses one codepage, but the application was programmed to use a field delimiter character from another codepage. The file the application produces on the mainframe will contain data from one codepage and delimiters from another, and is then transferred to another machine with codepage translation specified for the destination.

You may not be surprised to find that the field delimiter characters were not translated correctly for the destination.

In this particular case I suggested that the application programmer on the mainframe use a particular hex value character for the field delimiter that was available within the IBM-1140 codepage, and it turned out that the application on the Windows machine was using CP-1252 and not UTF-8.

In the end it turned out that, for this particular file and these applications, there were no special codepage requirements, as the default translation was sufficient.

So next time someone is emphatically, absolutely certain about codepage requirements, it might just be a good idea to check the facts for yourself, as it is easy for people to get this wrong.

Sunday, December 9, 2012

Transformers



Day 17

Connect:Direct is available on both ASCII and EBCDIC character set machines. There are also many different code pages available to cater to different regions and languages. So it naturally comes about that translations of one character set to another will be needed from time to time.

In Connect:Direct this is achieved by default for translations between ASCII and EBCDIC, in either direction, for DATATYPE=TEXT files. For custom requirements, traditionally this was achieved using translation tables and referring to them within the SYSOPTS clause of a Connect:Direct process.
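
On Connect:Direct for UNIX the table is named within the SYSOPTS string of the copy step. The fragment below is only a sketch: the path is made up and the xlate=yes and xlate.tbl keywords should be verified against the documentation for your platform and release:

 from (file=/outbound/data.txt pnode
       sysopts=":datatype=text:xlate=yes:xlate.tbl=custom_translation.xlt:")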

Translation tables come in two flavours: Single Byte Character Set (SBCS) and Double Byte Character Set (DBCS).

For SBCS translation tables, it has in the past been necessary for me to decode a "custom_translation.xlt" file because the source used to build it was not available. To do this on UNIX I wrote a short shell function that takes a binary .xlt file and produces the source needed to rebuild the SBCS translation table.

function dxlt
{
 if [[ $# -ne 1 ]]
 then
  echo
  echo "Usage: dxlt file.xlt"
  echo
  echo "Dumps the C:D translation table file.xlt."
  echo
  return
 fi
 echo
 printf " 0 1 2 3 4 5 6 7 8 9 a b c d e f\n\n"
 # 256-byte table: the cell at row (high nibble) and column (low nibble)
 # is the output byte for that input byte. cut trims the od offset down
 # to two digits (offset width varies between od implementations) and
 # sed blanks the trailing offset-only line.
 od -A x -t x1 "$1" | cut -c6- | sed 's/^00$//'
}

Below is an example of using the above shell function. The row label gives the high nibble of the input byte and the column the low nibble, and each cell holds the translated output byte: input 0x41, ASCII "A", maps to 0xc1, EBCDIC "A", so this is an ASCII-to-EBCDIC table.

$ dxlt custom_translation.xlt

 0 1 2 3 4 5 6 7 8 9 a b c d e f

00 00 01 02 03 37 2d 2e 2f 16 05 25 0b 0c 0d 0e 0f
10 10 11 12 13 3c 3d 32 26 18 19 3f 27 1c 1d 1e 1f
20 40 5a 7f 7b 5b 6c 50 7d 4d 5d 5c 4e 6b 60 4b 61
30 f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 7a 5e 4c 7e 6e 6f
40 7c c1 c2 c3 c4 c5 c6 c7 c8 c9 d1 d2 d3 d4 d5 d6
50 d7 d8 d9 e2 e3 e4 e5 e6 e7 e8 e9 ad e0 bd 5f 6d
60 79 81 82 83 84 85 86 87 88 89 91 92 93 94 95 96
70 97 98 99 a2 a3 a4 a5 a6 a7 a8 a9 c0 4f d0 bc 07
80 20 21 22 23 24 15 06 17 28 29 2a 2b 2c 09 0a 1b
90 30 31 1a 33 34 35 36 08 38 39 3a 3b 04 14 3e e1
a0 41 aa 43 44 45 46 47 48 49 51 52 53 54 55 56 57
b0 58 59 62 63 64 65 66 67 68 69 70 71 72 73 74 ab
c0 76 77 78 80 8a 8b 8c 8d 8e 8f 90 9a 9b 9c 9d 9e
d0 9f a0 aa ab ac ad ae af b0 b1 b2 b3 b4 b5 b6 b7
e0 b8 b9 ba bb bc bd be bf ca cb cc cd ce cf da db
f0 dc dd de df ea eb ec ed ee ef fa fb fc fd fe ff

Another useful function is called "chars", which just shows you the printable characters within the current locale.

function chars
{
 echo
 echo "     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f"
 # Emit all 256 byte values (awk or gawk will do if nawk is absent),
 # dump them as characters, and squeeze the octal escapes of the
 # non-printable bytes down to blanks.
 nawk 'BEGIN{for(i=0;i<=255;i++){printf "%c",i}}' | od -A x -t c | cut -c6- | \
 sed 's/^00$//;s/[0-9][0-9][0-9]/   /g;s/   /  /g'
}

An example of using the above function is:

$ chars

   0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f

00 \0                   \a \b \t \n \v \f \r
10
20    !  "  #  $  %  &  '  (  )  *  +  ,  -  .  /
30 0  1  2  3  4  5  6  7  8  9  :  ;  <  =  >  ?
40 @  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O
50 P  Q  R  S  T  U  V  W  X  Y  Z  [  \  ]  ^  _
60 `  a  b  c  d  e  f  g  h  i  j  k  l  m  n  o
70 p  q  r  s  t  u  v  w  x  y  z  {  |  }  ~
80
90
a0    ¡  ¢  £  ¤  ¥  ¦  §  ¨  ©  ª  «  ¬  ®  ¯
b0 °  ±  ²  ³  ´  µ  ¶  ·  ¸  ¹  º  »  ¼  ½  ¾  ¿
c0 À  Á  Â  Ã  Ä  Å  Æ  Ç  È  É  Ê  Ë  Ì  Í  Î  Ï
d0 Ð  Ñ  Ò  Ó  Ô  Õ  Ö  ×  Ø  Ù  Ú  Û  Ü  Ý  Þ  ß
e0 à  á  â  ã  ä  å  æ  ç  è  é  ê  ë  ì  í  î  ï
f0 ð  ñ  ò  ó  ô  õ  ö  ÷  ø  ù  ú  û  ü  ý  þ  ÿ

Together these two functions are very useful for sorting out translation table problems where a UNIX machine is involved.
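
For instance, when faced with a table of unknown provenance you can dump it alongside a known good one and diff the two; both file names here are hypothetical:

$ dxlt custom_translation.xlt > custom.dump
$ dxlt standard_translation.xlt > standard.dump
$ diff custom.dump standard.dump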

As with many problems, it is important to understand the context surrounding the issue at hand.

In terms of codepage translation tables this means looking at what type of file is being translated, which codepage was used to produce the file in question, which translation table was used to transform it, and which codepage is being used to view/process it at the destination. If these are not taken into account, translation table issues can be very difficult to solve.

In the next post I will walk through a particular codepage translation problem using the above functions.