Analyzing Bit Errors with J and Tcl
I let this year's International Obfuscated C Code Contest (IOCCC) deadline pass because I've been having too much fun learning J, an array language where every line reads like an IOCCC best one-liner entry.1
J is a descendent of APL, both created by Ken Iverson. For the last year I've been listening to the ArrayCast and reading Henry Rich's book "J for C Programmers" cover to cover.2 The ArrayCast is often more technical than my other favorite nerd podcasts, The Amp Hour and The Embedded FM Podcast, and it frequently gets into the detailed design and implementation of array languages.
As a quick introduction, J, like other array langauges, uses a set of primitives to manipulate data stored in arrays of any dimension. The interpreter will automatically loop primatives over the data based on the "rank" (or shape) of the primative and data, and primitives can be composed to modify their behavior. It's a unique style of functional programming. In J, all primatives are written with one or more ASCII characters. APL and other array languages use special symbols for primitives.
I previously wrote a Tcl script to analyze bit errors in packet payloads. Simple things like counting the number of errors (the Hamming distance), the positions of each error and the distances between them. Writing that in Tcl was quick and easy since I already know Tcl. And Tcl is naturally a good fit for processing strings of ones and zeros (or hex), and it also supports arbitrary precision integers (big ints) for performing bitwise operators.
I figured J's support for operations on arbitrary length vectors would also be a good fit for this task, so I wrote a similar program in J. This was a much slower process, as expected, since I'm still learning the language. I'm a long way from knowing idiomatic J or really grokking the array way of thinking, but I think experience with bit-twiddling and masking in C and some functional programming background was a helpful start.
The array langauges support a programming style called "tacit programming". Tacit code that does not explicitly reference its input variables; it consists only of primitives that operate on the implied arguments. It's similar to how stack based (or concatenative) langauges like Forth or Factor work, except that the syntax is built around applying unary and binary operations and a syntax called "forks" and "hooks". In my opinion, that really complicates things, and a stack based model would be easier for me to understand.3 That said, I tried to write tacit verbs where I could.
Two annoyances with J: First, when an integer overflows in J the interpreter
promotes it to a double floating point value, not an extended
precision integer like in Tcl. So you must first convert any number
that might overflow to an extended precision type in advance.4
The xfh
("extended from hex") verb modifies dfh
("decimal from
hex") from the stdlib.ijs
file to parse extended precision integers.
Otherwise long hex strings would be converted to a double.
Second, the J installation process on Linux is a bit messy, so I just hardcoded a local directory in the shebang line for now. And newer versions of glibc complain about executing code on the stack, so I had to run J with this wrapper script:
#!/bin/bash GLIBC_TUNABLES=glibc.rtld.execstack=2 $(dirname "$0")/bin/jconsole "$@"
Load library /path/to/j9.6/bin/libj.so failed: /path/to/j9.6/bin/libj.so: cannot enable executable stack as shared object requires: Invalid argument
The code
#!/path/to/j9.6/jconsole.sh NB. Copyright © 2025 Remington Furman NB. NB. SPDX-License-Identifier: MIT NB. NB. Permission is hereby granted, free of charge, to any person NB. obtaining a copy of the software demonstrated in code cells (the NB. "Software"), to deal in the Software without restriction, including NB. without limitation the rights to use, copy, modify, merge, publish, NB. distribute, sublicense, and/or sell copies of the Software, and to NB. permit persons to whom the Software is furnished to do so, subject NB. to the following conditions: NB. NB. The above copyright notice and this permission notice shall be NB. included in all copies or substantial portions of the Software. NB. NB. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, NB. EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF NB. MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NB. NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS NB. BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN NB. ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN NB. CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE NB. SOFTWARE. NB. From https://www.jsoftware.com/pipermail/programming/2014-May/037347.html NB. See also: https://www.jsoftware.com/pipermail/general/2021-October/038699.html 9!:29]1[9!:27'2!:55]1' NB. Exit on error. NB. Use ASCII boxes. boxdraw_j_ 1 NB. Create an extended precision integer from a hex string. H=. '0123456789ABCDEF' h=. '0123456789abcdef' xfh=: (16 #. 16x | (H,h) i. ]) :.hfd NB. Create a binary list from a hex string, padding to a multiple of NB. four bits. bfh=: (_4*#@]) {. #:@xfh process=: monad define "1 header=. ;:'len exp act' (header)=. cut y NB. Multiple assignment. len=. 0".len NB. Parse number. exp_bin=. len {. bfh exp NB. Right justify if len is negative. act_bin=. len {. bfh act diff=. exp_bin~:act_bin NB. XOR is not-equal. hamm=. +/ diff NB. Add the one bits in diff. pos=. I. diff NB. Get the indicies of one bits. fdiff=. (2-~/\])^:a: pos NB. Repeated fwd. diff. between error positions. pos_gcd=. +./ pos NB. GCD of error positions. NB. Pretty print. Remove spaces and make diff easier to read. exp_bin_p=. 1":exp_bin act_bin_p=. 1":act_bin diff_p=. diff{'.#' NB. Format tables. header=. header, ;:'exp_bin act_bin diff hamm pos fdiff pos_gcd' out=. len; exp; act; exp_bin_p; act_bin_p; diff_p; hamm; pos; fdiff; pos_gcd header ,. out ) NB. Split stdin into lines. input=. ];._2 stdin '' echo@process input exit''
#!/usr/bin/tclsh # Copyright © 2025 Remington Furman # # SPDX-License-Identifier: MIT # # Permission is hereby granted, free of charge, to any person # obtaining a copy of the software demonstrated in code cells (the # "Software"), to deal in the Software without restriction, including # without limitation the rights to use, copy, modify, merge, publish, # distribute, sublicense, and/or sell copies of the Software, and to # permit persons to whom the Software is furnished to do so, subject # to the following conditions: # # The above copyright notice and this permission notice shall be # included in all copies or substantial portions of the Software. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND # NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS # BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN # ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN # CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE # SOFTWARE. proc pop_count {x} { # Format as binary and count all the 1 bits. return [regexp -all 1 [format %b [expr {$x}]]] } proc gcd {a b} { while {$b != 0} { set t $b set b [expr {$a % $b}] set a $t } return $a } # Fold implementation, from https://wiki.tcl-lang.org/page/fold . proc invoke {func args} { uplevel #0 $func $args } proc foldl {func init list} { foreach item $list { set init [invoke $func $init $item] } return $init } proc foldl1 {func list} { foldl $func [lindex $list 0] [lrange $list 1 end] } # Forward and backward differences. proc fwd_diff {xs} { return [lmap a [lrange $xs 0 end-1] b [lrange $xs 1 end] {expr {$b-$a}}] } proc back_diff {xs} { return [lmap a [lrange $xs 0 end-1] b [lrange $xs 1 end] {expr {$a-$b}}] } proc bit_pos_be {diff_bin n} { # Count from big end. set bin_diff [format "%0*llb" $n $diff_bin] return [lsearch -all [split $bin_diff {}] 1] } proc bit_pos_le {diff_bin n} { # Count from little end. set bin_diff [format "%0*llb" $n $diff_bin] return [lsearch -all [lreverse [split $bin_diff {}]] 1] } proc hex_diff {expected actual} { set nybbles [string length [regsub "0x" $expected ""]] set bytes [expr $nybbles / 2] set bits [expr {4 * $nybbles}] set diff [expr {$expected ^ $actual}] set diff_hex [format "0x%0*llx" $nybbles $diff] set diff_bin [format "0b%0*llb" $bits $diff] set diff_pos_be [bit_pos_be $diff_bin $bits] set diff_pos_le [bit_pos_le $diff_bin $bits] set hamming [pop_count $diff_hex] set percent_diff [expr {(1.0 * $hamming / $bits) * 100}] puts "bytes: $bytes" puts "nybbles: $nybbles" puts "bits: $bits" puts "expected: $expected" puts "actual: $actual" puts "diff_hex: 0x[regsub -all 0 [string range $diff_hex 2 end] .]" puts "diff_bin: 0b[regsub -all 0 [string range $diff_bin 2 end] .]" puts "diff_pos_be: $diff_pos_be" puts "diff_pos_le: $diff_pos_le" puts "diff_pos_be_gcd: [foldl1 gcd $diff_pos_be]" puts "diff_pos_le_gcd: [foldl1 gcd $diff_pos_le]" puts "diff_pos_be_fwd_diff: [fwd_diff $diff_pos_be]" puts "diff_pos_le_fwd_diff: [fwd_diff $diff_pos_le]" puts "hamming distance: $hamming" puts "percent_diff: [format %.02f $percent_diff]" } set input [read stdin] foreach {len expected actual} $input { hex_diff "0x$expected" "0x$actual" puts "" }
Example output
J
$ cat input.txt 32 deadbeef deedbeaf 32 deadbeef deafbead 32 deadbeef bead0000 16 dead deed 4 0 0 4 f 0 4 0 f 4 f f 2 a 5 -2 a 5 2 6 9 -2 6 9 $ ./hex_diff.ijs < input.txt +-------+--------------------------------+ |len |32 | +-------+--------------------------------+ |exp |deadbeef | +-------+--------------------------------+ |act |deedbeaf | +-------+--------------------------------+ |exp_bin|11011110101011011011111011101111| +-------+--------------------------------+ |act_bin|11011110111011011011111010101111| +-------+--------------------------------+ |diff |.........#...............#......| +-------+--------------------------------+ |hamm |2 | +-------+--------------------------------+ |pos |9 25 | +-------+--------------------------------+ |fdiff | 9 25 | | |16 0 | | | 0 0 | +-------+--------------------------------+ |pos_gcd|1 | +-------+--------------------------------+ +-------+--------------------------------+ |len |32 | +-------+--------------------------------+ |exp |deadbeef | +-------+--------------------------------+ |act |deafbead | +-------+--------------------------------+ |exp_bin|11011110101011011011111011101111| +-------+--------------------------------+ |act_bin|11011110101011111011111010101101| +-------+--------------------------------+ |diff |..............#..........#....#.| +-------+--------------------------------+ |hamm |3 | +-------+--------------------------------+ |pos |14 25 30 | +-------+--------------------------------+ |fdiff |14 25 30 | | |11 5 0 | | |_6 0 0 | | | 0 0 0 | +-------+--------------------------------+ |pos_gcd|1 | +-------+--------------------------------+ +-------+--------------------------------------------------------+ |len |32 | +-------+--------------------------------------------------------+ |exp |deadbeef | +-------+--------------------------------------------------------+ |act |bead0000 | +-------+--------------------------------------------------------+ |exp_bin|11011110101011011011111011101111 | +-------+--------------------------------------------------------+ |act_bin|10111110101011010000000000000000 | +-------+--------------------------------------------------------+ |diff |.##.............#.#####.###.#### | +-------+--------------------------------------------------------+ |hamm |15 | +-------+--------------------------------------------------------+ |pos |1 2 16 18 19 20 21 22 24 25 26 28 29 30 31 | +-------+--------------------------------------------------------+ |fdiff | 1 2 16 18 19 20 21 22 24 25 26 28 29 30 31| | | 1 14 2 1 1 1 1 2 1 1 2 1 1 1 0| | | 13 _12 _1 0 0 0 1 _1 0 1 _1 0 0 0 0| | | _25 11 1 0 0 1 _2 1 1 _2 1 0 0 0 0| | | 36 _10 _1 0 1 _3 3 0 _3 3 _1 0 0 0 0| | | _46 9 1 1 _4 6 _3 _3 6 _4 0 0 0 0 0| | | 55 _8 0 _5 10 _9 0 9 _10 0 0 0 0 0 0| | | _63 8 _5 15 _19 9 9 _19 0 0 0 0 0 0 0| | | 71 _13 20 _34 28 0 _28 0 0 0 0 0 0 0 0| | | _84 33 _54 62 _28 _28 0 0 0 0 0 0 0 0 0| | | 117 _87 116 _90 0 0 0 0 0 0 0 0 0 0 0| | |_204 203 _206 90 0 0 0 0 0 0 0 0 0 0 0| | | 407 _409 296 0 0 0 0 0 0 0 0 0 0 0 0| | |_816 705 0 0 0 0 0 0 0 0 0 0 0 0 0| | |1521 0 0 0 0 0 0 0 0 0 0 0 0 0 0| | | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0| +-------+--------------------------------------------------------+ |pos_gcd|1 | +-------+--------------------------------------------------------+ +-------+--------------------------------+ |len |32 | +-------+--------------------------------+ |exp |deadbeef | +-------+--------------------------------+ |act |56253667 | +-------+--------------------------------+ |exp_bin|11011110101011011011111011101111| +-------+--------------------------------+ |act_bin|01010110001001010011011001100111| +-------+--------------------------------+ |diff |#...#...#...#...#...#...#...#...| +-------+--------------------------------+ |hamm |8 | +-------+--------------------------------+ |pos |0 4 8 12 16 20 24 28 | +-------+--------------------------------+ |fdiff |0 4 8 12 16 20 24 28 | | |4 4 4 4 4 4 4 0 | | |0 0 0 0 0 0 0 0 | | |0 0 0 0 0 0 0 0 | | |0 0 0 0 0 0 0 0 | | |0 0 0 0 0 0 0 0 | | |0 0 0 0 0 0 0 0 | | |0 0 0 0 0 0 0 0 | | |0 0 0 0 0 0 0 0 | +-------+--------------------------------+ |pos_gcd|4 | +-------+--------------------------------+ +-------+----------------+ |len |16 | +-------+----------------+ |exp |dead | +-------+----------------+ |act |deed | +-------+----------------+ |exp_bin|1101111010101101| +-------+----------------+ |act_bin|1101111011101101| +-------+----------------+ |diff |.........#......| +-------+----------------+ |hamm |1 | +-------+----------------+ |pos |9 | +-------+----------------+ |fdiff |9 | | |0 | +-------+----------------+ |pos_gcd|9 | +-------+----------------+ +-------+----+ |len |4 | +-------+----+ |exp |0 | +-------+----+ |act |0 | +-------+----+ |exp_bin|0000| +-------+----+ |act_bin|0000| +-------+----+ |diff |....| +-------+----+ |hamm |0 | +-------+----+ |pos | | +-------+----+ |fdiff | | +-------+----+ |pos_gcd|0 | +-------+----+ +-------+-------+ |len |4 | +-------+-------+ |exp |f | +-------+-------+ |act |0 | +-------+-------+ |exp_bin|1111 | +-------+-------+ |act_bin|0000 | +-------+-------+ |diff |#### | +-------+-------+ |hamm |4 | +-------+-------+ |pos |0 1 2 3| +-------+-------+ |fdiff |0 1 2 3| | |1 1 1 0| | |0 0 0 0| | |0 0 0 0| | |0 0 0 0| +-------+-------+ |pos_gcd|1 | +-------+-------+ +-------+-------+ |len |4 | +-------+-------+ |exp |0 | +-------+-------+ |act |f | +-------+-------+ |exp_bin|0000 | +-------+-------+ |act_bin|1111 | +-------+-------+ |diff |#### | +-------+-------+ |hamm |4 | +-------+-------+ |pos |0 1 2 3| +-------+-------+ |fdiff |0 1 2 3| | |1 1 1 0| | |0 0 0 0| | |0 0 0 0| | |0 0 0 0| +-------+-------+ |pos_gcd|1 | +-------+-------+ +-------+----+ |len |4 | +-------+----+ |exp |f | +-------+----+ |act |f | +-------+----+ |exp_bin|1111| +-------+----+ |act_bin|1111| +-------+----+ |diff |....| +-------+----+ |hamm |0 | +-------+----+ |pos | | +-------+----+ |fdiff | | +-------+----+ |pos_gcd|0 | +-------+----+ +-------+---+ |len |2 | +-------+---+ |exp |a | +-------+---+ |act |5 | +-------+---+ |exp_bin|10 | +-------+---+ |act_bin|01 | +-------+---+ |diff |## | +-------+---+ |hamm |2 | +-------+---+ |pos |0 1| +-------+---+ |fdiff |0 1| | |1 0| | |0 0| +-------+---+ |pos_gcd|1 | +-------+---+ +-------+---+ |len |_2 | +-------+---+ |exp |a | +-------+---+ |act |5 | +-------+---+ |exp_bin|10 | +-------+---+ |act_bin|01 | +-------+---+ |diff |## | +-------+---+ |hamm |2 | +-------+---+ |pos |0 1| +-------+---+ |fdiff |0 1| | |1 0| | |0 0| +-------+---+ |pos_gcd|1 | +-------+---+ +-------+---+ |len |2 | +-------+---+ |exp |6 | +-------+---+ |act |9 | +-------+---+ |exp_bin|01 | +-------+---+ |act_bin|10 | +-------+---+ |diff |## | +-------+---+ |hamm |2 | +-------+---+ |pos |0 1| +-------+---+ |fdiff |0 1| | |1 0| | |0 0| +-------+---+ |pos_gcd|1 | +-------+---+ +-------+---+ |len |_2 | +-------+---+ |exp |6 | +-------+---+ |act |9 | +-------+---+ |exp_bin|10 | +-------+---+ |act_bin|01 | +-------+---+ |diff |## | +-------+---+ |hamm |2 | +-------+---+ |pos |0 1| +-------+---+ |fdiff |0 1| | |1 0| | |0 0| +-------+---+ |pos_gcd|1 | +-------+---+
Tcl
./hex_diff.tcl < input.txt bytes: 4 nybbles: 8 bits: 32 expected: 0xdeadbeef actual: 0xdeedbeaf diff_hex: 0x..4...4. diff_bin: 0b.........1...............1...... diff_pos_be: 9 25 diff_pos_le: 6 22 diff_pos_be_gcd: 1 diff_pos_le_gcd: 2 diff_pos_be_fwd_diff: 16 diff_pos_le_fwd_diff: 16 hamming distance: 2 percent_diff: 6.25 bytes: 4 nybbles: 8 bits: 32 expected: 0xdeadbeef actual: 0xdeafbead diff_hex: 0x...2..42 diff_bin: 0b..............1..........1....1. diff_pos_be: 14 25 30 diff_pos_le: 1 6 17 diff_pos_be_gcd: 1 diff_pos_le_gcd: 1 diff_pos_be_fwd_diff: 11 5 diff_pos_le_fwd_diff: 5 11 hamming distance: 3 percent_diff: 9.38 bytes: 4 nybbles: 8 bits: 32 expected: 0xdeadbeef actual: 0xbead0000 diff_hex: 0x6...beef diff_bin: 0b.11.............1.11111.111.1111 diff_pos_be: 1 2 16 18 19 20 21 22 24 25 26 28 29 30 31 diff_pos_le: 0 1 2 3 5 6 7 9 10 11 12 13 15 29 30 diff_pos_be_gcd: 1 diff_pos_le_gcd: 1 diff_pos_be_fwd_diff: 1 14 2 1 1 1 1 2 1 1 2 1 1 1 diff_pos_le_fwd_diff: 1 1 1 2 1 1 2 1 1 1 1 2 14 1 hamming distance: 15 percent_diff: 46.88 bytes: 4 nybbles: 8 bits: 32 expected: 0xdeadbeef actual: 0x56253667 diff_hex: 0x88888888 diff_bin: 0b1...1...1...1...1...1...1...1... diff_pos_be: 0 4 8 12 16 20 24 28 diff_pos_le: 3 7 11 15 19 23 27 31 diff_pos_be_gcd: 4 diff_pos_le_gcd: 1 diff_pos_be_fwd_diff: 4 4 4 4 4 4 4 diff_pos_le_fwd_diff: 4 4 4 4 4 4 4 hamming distance: 8 percent_diff: 25.00 bytes: 2 nybbles: 4 bits: 16 expected: 0xdead actual: 0xdeed diff_hex: 0x..4. diff_bin: 0b.........1...... diff_pos_be: 9 diff_pos_le: 6 diff_pos_be_gcd: 9 diff_pos_le_gcd: 6 diff_pos_be_fwd_diff: diff_pos_le_fwd_diff: hamming distance: 1 percent_diff: 6.25 bytes: 0 nybbles: 1 bits: 4 expected: 0xf actual: 0x0 diff_hex: 0xf diff_bin: 0b1111 diff_pos_be: 0 1 2 3 diff_pos_le: 0 1 2 3 diff_pos_be_gcd: 1 diff_pos_le_gcd: 1 diff_pos_be_fwd_diff: 1 1 1 diff_pos_le_fwd_diff: 1 1 1 hamming distance: 4 percent_diff: 100.00 bytes: 0 nybbles: 1 bits: 4 expected: 0x0 actual: 0xf diff_hex: 0xf diff_bin: 0b1111 diff_pos_be: 0 1 2 3 diff_pos_le: 0 1 2 3 diff_pos_be_gcd: 1 diff_pos_le_gcd: 1 diff_pos_be_fwd_diff: 1 1 1 diff_pos_le_fwd_diff: 1 1 1 hamming distance: 4 percent_diff: 100.00 bytes: 0 nybbles: 1 bits: 4 expected: 0xa actual: 0x5 diff_hex: 0xf diff_bin: 0b1111 diff_pos_be: 0 1 2 3 diff_pos_le: 0 1 2 3 diff_pos_be_gcd: 1 diff_pos_le_gcd: 1 diff_pos_be_fwd_diff: 1 1 1 diff_pos_le_fwd_diff: 1 1 1 hamming distance: 4 percent_diff: 100.00 bytes: 0 nybbles: 1 bits: 4 expected: 0xa actual: 0x5 diff_hex: 0xf diff_bin: 0b1111 diff_pos_be: 0 1 2 3 diff_pos_le: 0 1 2 3 diff_pos_be_gcd: 1 diff_pos_le_gcd: 1 diff_pos_be_fwd_diff: 1 1 1 diff_pos_le_fwd_diff: 1 1 1 hamming distance: 4 percent_diff: 100.00 bytes: 0 nybbles: 1 bits: 4 expected: 0x6 actual: 0x9 diff_hex: 0xf diff_bin: 0b1111 diff_pos_be: 0 1 2 3 diff_pos_le: 0 1 2 3 diff_pos_be_gcd: 1 diff_pos_le_gcd: 1 diff_pos_be_fwd_diff: 1 1 1 diff_pos_le_fwd_diff: 1 1 1 hamming distance: 4 percent_diff: 100.00 bytes: 0 nybbles: 1 bits: 4 expected: 0x6 actual: 0x9 diff_hex: 0xf diff_bin: 0b1111 diff_pos_be: 0 1 2 3 diff_pos_le: 0 1 2 3 diff_pos_be_gcd: 1 diff_pos_le_gcd: 1 diff_pos_be_fwd_diff: 1 1 1 diff_pos_le_fwd_diff: 1 1 1 hamming distance: 4 percent_diff: 100.00
Footnotes:
It's a lot of fun to work out an IOCCC one-liner entry with nothing more than a pencil, paper, and a copy K&R…
Uiua is a newer and exciting array langauge with stack semantics. I'd like to try it too, but I wanted to learn one of Ken Iverson's langauges first. And J has a strong background in numerical programming which I am interested in exploring.
At least both J and Tcl detect integer overflow and do something sane. Prior to C23, signed integer overflow in C is undefined behavior even though the results are entirely predictable for two's complement arithmetic. You also can't detect overflow after an operation, even though most (but not all) CPUs have an overflow status bit. It's a mess.