Tuesday, March 1, 2016

Day 23 Translation what's the difference

Day 23

Translation what’s the difference

First let’s create some Connect:Direct translation tables using PowerShell:
PS C:\Users\nicke> conv ibm285 iso-8859-15 @(0..255) | Set-Content -Encoding byte ibm285-iso-8859-15.cdx

PS C:\Users\nicke> conv ibm01146 iso-8859-15 @(0..255) | Set-Content -Encoding byte ibm01146-iso-8859-15.cdx
Here we are converting from codepage ibm285 IBM EBCDIC (UK) to iso-8859-15 which has the Euro currency symbol, and converting all the byte values from 0 through to 255 (that is what the @(0..255) means), saving the result with the Connect:Direct Windows translation table file extension .cdx.
We then do the same thing for producing a translation table that will convert from IBM EBCDIC (UK-Euro) to iso-8859-15 .
Using the Powershell function below we can see the difference between these two translation tables.
# List difference between translation tables
function xlt_diff ([byte[]]$tbla,[byte[]]$tblb) {
    0..255 | %{
        if($tbla[$_] -ne $tblb[$_]) {
            "{0:x} : {1:x} | {2:x}" -f $_,$tbla[$_],$tblb[$_]
        }
    }
}
The above functions can be used as follows:
PS C:\Users\nicke> xlt_diff (cat -Encoding byte .\ibm285-iso-8859-15.cdx) (cat -Encoding byte .\ibm01146-iso-8859-15.cdx)
9f : 3f | a4
The output above shows that two translation tables differ when they map hex byte value 0x9f. In the first table it maps to hex value 0x3f, and in the other to 0xa4.
Now if we create the translation tables for translating back to either ibm285/ibm01146 from iso-8859-15, and then compare like so:
PS C:\Users\nicke> conv iso-8859-15 ibm01146 @(0..255) | Set-Content -Encoding byte iso-8859-15-ibm01146.cdx

PS C:\Users\nicke> conv iso-8859-15 ibm285 @(0..255) | Set-Content -Encoding byte iso-8859-15-ibm285.cdx

PS C:\Users\nicke> xlt_diff (cat -Encoding byte .\iso-8859-15-ibm01146.cdx) (cat -Encoding byte .\iso-8859-15-ibm285.cdx)
a4 : 9f | 6f
Here the translation tables differ in how they convert the Euro (€) symbol in iso-8859-15 (0xa4) to the two mainframe codepages.
This is not that surprising as ibm01146 has the Euro (€ 0x9f) and codepage ibm285 does not. In fact if you look up codepage 1146 on wikipedia you will see that ibm01146 was created to be ibm285 with the addition of the Euro (€) symbol.
I chose these two codepages as a simple example to showcase the finding the difference between translation tables.
These last two posts were about creating custom codepage translation tables for Connect:Direct, and spotting the differences between tables.
Next time we will look at displaying what maps to what more easily with these translation tables, and show a generally better way of translating from one codepage to another.

No comments: