On the Observable Ciphertext Properties of 40-Character HFGCS Broadcasts
A cohort analysis of the 40-character HFGCS broadcast band; two groups (G2, G3), first-broadcast itemisation, forward-fill group resolution.
2026·04·17 | G2 N=39, G3 N=20, years 2024–2026
Abstract. We apply the probe battery of the 30-character format analysis — aggregate and per-position base32 symbol marginals, consecutive-character doublet rates, inter-position single-character equalities, internal-substring repeats, and within-prefix compressibility — to the two group cohorts of 40-character HFGCS broadcasts. After first-broadcast itemisation and forward-fill group resolution, we retain NG2 = 39 and NG3 = 20 unique messages. Both cohorts carry distinct format-level structural signatures; neither is a uniform-random generator.
Group 3 (N = 20): fixed-offset verbatim repeat. 19 of 20 messages carry a 3-to-7 character substring that appears twice within the same message; in 17 of them the longest repeat sits at the fixed offset pair (first occurrence starting at position 22, second starting at position 31). The inter-position equality probe confirms the same finding at the character level: all 20 messages have identical characters at positions (22, 31) and at positions (23, 32), and 17 of 20 at positions (24, 33), each at z > 19. Compressibility corroborates: the body compresses at ratio 0.779 against a uniform null of 0.814 ± 0.005 (z = -7.10).
Group 2 (N = 39): 2-character fixed-offset repeat. The inter-position equality probe reveals that 24 of 39 Group-2 messages carry identical characters at positions (16, 21), so that the character at position 16 is reproduced five positions later at position 21, and 24 of 39 carry identical characters at (17, 22). At N = 39 this is z = +20.97 at each pair, far above the per-family Bonferroni threshold for 780 position pairs. The two pairs co-occur in the majority of G2 messages: the 2-character bigram at positions 16–17 is reproduced at positions 21–22 with a fixed 5-character interval. Additional independent IPE signatures appear at (19, 24) at 13 matches (z = +10.84) and (26, 37) at 11 matches (z = +9.00). A narrower probe set using only consecutive doublets and length-≥3 internal repeats would not detect this structure, since the signal sits at a 2-character (length-below-threshold) repeat at a non-zero offset; the IPE probe of §6 is what surfaces it.
Neither 40-character cohort reproduces the 30-character Group-1 signatures (no persistent M/5 deficit, no mid-body transition, no trailing no-run constraint on the last six characters). The G2 and G3 40-character formats are distinct from each other and from the 30-character Group-1 format.
In plain language. 40-character HFGCS broadcasts are rarer than 30-character ones. After deduplication we have 39 Group-2 messages and 20 Group-3 messages. Both groups show format-level structure — different structures, but neither matches random.
Group 3: in 19 of 20 messages, a short 3-to-7 character chunk appears twice inside the message. In 17 of those, the first copy starts at position 22 and the second at position 31, exactly nine positions later. You can see this by eye in most Group 3 messages.
Group 2: in 24 of 39 messages, the character at position 16 is reproduced at position 21, and in 24 of 39 the character at position 17 is reproduced at position 22. Put differently, the two-character snippet at positions 16–17 tends to be repeated at positions 21–22: a shorter version of the Group 3 repeat, at a different interval. Smaller repeat, same idea.
Neither 40-character cohort shows the Group 1 signatures from the 30-character paper (no persistent shortage of M and 5, no mid-body transition, no rule against repeated letters at the tail). The 30-character Group-1, 40-character Group-2, and 40-character Group-3 formats are each distinct from each other.
1. Introduction
The 40-character HFGCS broadcast band is substantially less populated than the 30-character channel reported in the 30-character base paper, which analyses Group 1 and Group 2 payloads at N = 5,377 unique first-broadcast messages. The 40-character cohort holds roughly one-hundredth that number of observations (59 unique messages) and accordingly does not support the full probe battery at the same statistical power. We nevertheless run the same probes here, as a directional test of whether any of the 30-character findings (the M/5 marginal deficit, the mid-body transition, the trailing no-run constraint) are shared across bands, and as a survey of the Group 3 payload, a group which does not appear at 30 characters.
The 40-character band is populated in calendar years 2024–2026 only, which narrows the question to a specific operational era. The cohort splits into Group 2 (N = 39 after dedup) and Group 3 (N = 20 after dedup). No prefix cell has more than ~15 messages, so within-prefix probes cannot be run with meaningful power; we note this and skip those probes.
2. Observation corpus
Applying the standard filter pipeline — exclude rows whose quality flag Q contains "\" or "*", exclude messages containing placeholder characters ("?", "_", "."), retain only messages of length exactly 40, resolve group via forward-fill on PR_GROUPS — yields 76 raw broadcast-level observations. All 76 resolve under forward-fill. Splitting by group gives 52 G2 and 24 G3 broadcasts; first-broadcast itemisation on the message string retains 39 unique G2 messages and 20 unique G3 messages.
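The filter steps above can be sketched as a single pass over the broadcast log. This is a minimal illustration under stated assumptions, not the pipeline actually used: the field names "Q", "PR_GROUPS" and "message" and the function `clean_corpus` are hypothetical, rows are assumed to arrive in broadcast order, and forward-fill is assumed to run over every row before any row is dropped.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

PLACEHOLDERS = set("?_.")  # placeholder characters named in the text


def clean_corpus(rows: Iterable[dict], length: int = 40) -> Dict[str, List[str]]:
    """Return {group: [unique messages in first-broadcast order]}.

    `rows` must already be in broadcast order. The keys "Q", "PR_GROUPS"
    and "message" are hypothetical field names.
    """
    cohorts: Dict[str, List[str]] = {}
    seen: Set[Tuple[str, str]] = set()
    last_group: Optional[str] = None
    for row in rows:
        # Forward-fill: a row without an explicit group inherits the last one.
        group = row.get("PR_GROUPS") or last_group
        last_group = group
        quality = row.get("Q") or ""
        message = row.get("message") or ""
        if "\\" in quality or "*" in quality:
            continue  # quality flag excludes the row
        if PLACEHOLDERS & set(message):
            continue  # message contains a placeholder character
        if len(message) != length or group is None:
            continue  # wrong length, or no group could be resolved
        if (group, message) in seen:
            continue  # repeat broadcast of an already itemised message
        seen.add((group, message))
        cohorts.setdefault(group, []).append(message)
    return cohorts
```

Deduplicating on the (group, message) pair keeps the first broadcast of each message within its group, which is one reading of "first-broadcast itemisation on the message string".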
The G2 cohort spans 3 calendar years (2024–2026). The reported per-prefix summary lists 0 prefix identities for it, since every prefix has < 30 messages and none is analysed individually. The G3 cohort spans 2024–2026 across 5 prefix identities (FJ, DX, DQ, DP, L6 at minimum). All statistical tests in this paper operate on first-broadcast itemised data.
3. First-order symbol distribution
We test the null that each base32 symbol appears with uniform probability 1/32 at every body position (positions 3–40, carrying 38·log₂(32) = 190 bits). Aggregate body-position counts are plotted below for each cohort and reported in Tables 1 and 2.
Figure 1. Symbol frequency in the body (positions 3–40) of 40-character Group 2 first-broadcast messages (N = 39). Dashed line: uniform expectation of 3.125%. The largest deviations are an M deficit at z = -3.03 and a G excess at z = +2.64 (Table 1); no other symbol exceeds |z| = 2.5. At N = 39 the per-symbol Bonferroni threshold across 32 symbols (α = 0.05) is |z| > 2.87, which M crosses and G does not.
Figure 2. Symbol frequency in the body (positions 3–40) of 40-character Group 3 first-broadcast messages (N = 20). The largest deviations are an X excess at z = +2.97, a Z excess at z = +2.76, and an S deficit at z = -2.45 (Table 2). At N = 20 the per-symbol Bonferroni threshold is |z| > 2.87, which only X crosses, and only narrowly. The observed deviations are close to the range expected from stratum mixing plus Poisson noise at this sample size; they are reported for completeness but do not constitute a robust first-order non-uniformity finding.
Compared with the 30-character Group-1 cohort where M and 5 each fall at z ≈ −21, each 40-character cohort shows a single symbol just past the Bonferroni threshold (M in G2, X in G3). The symbol 5 sits within noise in both cohorts (z = -1.24 in G2, z = -0.78 in G3), and M reaches only z = -3.03 in G2 and z = -1.20 in G3. The M/5 deficit that dominated the 30-character Group-1 paper is not reproduced at comparable magnitude in either 40-character cohort.
Tables 1 and 2 report the full 32-symbol tabulations.
3.1 Group 2 symbol tabulation
symbol  count  pct    z
M       26     1.75%  -3.03
V       30     2.02%  -2.44
U       36     2.43%  -1.54
E       38     2.56%  -1.24
5       38     2.56%  -1.24
4       39     2.63%  -1.09
B       41     2.77%  -0.79
C       42     2.83%  -0.64
D       42     2.83%  -0.64
I       42     2.83%  -0.64
Q       42     2.83%  -0.64
S       42     2.83%  -0.64
R       43     2.90%  -0.49
K       44     2.97%  -0.35
L       44     2.97%  -0.35
O       45     3.04%  -0.20
T       46     3.10%  -0.05
J       47     3.17%  +0.10
7       48     3.24%  +0.25
2       49     3.31%  +0.40
P       50     3.37%  +0.55
Y       50     3.37%  +0.55
6       52     3.51%  +0.85
N       53     3.58%  +1.00
3       53     3.58%  +1.00
F       54     3.64%  +1.15
W       54     3.64%  +1.15
Z       55     3.71%  +1.30
X       56     3.78%  +1.45
H       57     3.85%  +1.60
A       60     4.05%  +2.04
G       64     4.32%  +2.64
Table 1. G2 body (positions 3–40) symbol counts, percentages, and Poisson z-scores against the uniform null. Rows sorted by z-score.
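The tabulation can be reproduced with a short sketch. The caption calls these Poisson z-scores; the tabulated values appear to correspond to a binomial standard deviation, sqrt(n·p·(1−p)) with p = 1/32, rather than sqrt(n·p), so that form is used here. The function names are illustrative, and the alphabet is assumed to be the RFC 4648 base32 set (A–Z, 2–7), which matches the 32 symbols listed in the tables.

```python
from collections import Counter
from math import sqrt

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"  # assumed RFC 4648 base32 alphabet


def uniform_z(count: int, n_trials: int, p: float = 1 / 32) -> float:
    """z-score of an observed count against the uniform null (binomial sd)."""
    expected = n_trials * p
    return (count - expected) / sqrt(n_trials * p * (1 - p))


def body_symbol_table(messages, body_start: int = 3):
    """Rows of (symbol, count, percent, z) over body positions, sorted by z."""
    body = "".join(m[body_start - 1:] for m in messages)
    counts = Counter(body)
    n = len(body)
    rows = []
    for symbol in ALPHABET:
        c = counts.get(symbol, 0)
        rows.append((symbol, c, 100.0 * c / n, uniform_z(c, n)))
    return sorted(rows, key=lambda row: row[3])
```

For example, `uniform_z(26, 39 * 38)` evaluates the M row of Table 1 from its count and the 1,482 body characters of the G2 cohort.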
3.2 Group 3 symbol tabulation
symbol  count  pct    z
S       12     1.58%  -2.45
F       14     1.84%  -2.03
C       16     2.11%  -1.62
K       17     2.24%  -1.41
A       18     2.37%  -1.20
M       18     2.37%  -1.20
U       18     2.37%  -1.20
3       19     2.50%  -0.99
6       19     2.50%  -0.99
5       20     2.63%  -0.78
D       22     2.89%  -0.36
O       22     2.89%  -0.36
W       22     2.89%  -0.36
Y       22     2.89%  -0.36
L       23     3.03%  -0.16
N       23     3.03%  -0.16
P       23     3.03%  -0.16
H       25     3.29%  +0.26
Q       25     3.29%  +0.26
V       25     3.29%  +0.26
J       26     3.42%  +0.47
R       26     3.42%  +0.47
I       27     3.55%  +0.68
7       27     3.55%  +0.68
T       28     3.68%  +0.89
2       28     3.68%  +0.89
B       29     3.82%  +1.09
E       29     3.82%  +1.09
G       31     4.08%  +1.51
4       31     4.08%  +1.51
Z       37     4.87%  +2.76
X       38     5.00%  +2.97
Table 2. G3 body (positions 3–40) symbol counts, percentages, and Poisson z-scores. No symbol crosses the per-family Bonferroni threshold.
4. Position-wise bias profile
The 30-character paper tracked per-position z-scores for the symbols M and 5 specifically, motivated by their aggregate-level deficits in Group 1. The same per-position M/5 probe at 40 characters returns a flat profile for both cohorts (Figures 3 and 4): no position attains |z| > 3.86 in either cohort for M or 5. Since the aggregate M/5 deficit of the 30-character Group-1 cohort is not reproduced at 40 characters (§3), there is no attenuation profile to find: there is no baseline deficit to break from.
This is not the whole picture. The M/5 projection is two channels of a 32-channel probe, and per-position structure in other symbols would be missed by tracking M/5 only. Figures 3 and 4 reproduce the M/5-only profile for direct comparison with the 30-character paper; Figures 5 (G2) and 6 (G3) report the full per-position probe, showing for each position the largest |z|-score attained by any of the 32 symbols.
Figure 3. Per-position M and 5 z-scores for the 40-character Group 2 cohort. Both symbols sit within the noise band at every position. This is the flat profile one would expect either under a uniform-output generator or under a weakly non-uniform generator whose bias is not concentrated in the M/5 channels; the probe alone cannot distinguish these.
Figure 4. Per-position M and 5 z-scores for the 40-character Group 3 cohort. At N = 20 the per-position discretisation is coarse (counts of 0–3 correspond to z-scores of −0.80, +0.48, +1.77, +3.05). One isolated excess at z = +3.05 (count 3) appears at position 37 for symbol 5; with 38 body positions tested this is within chance rate.
4.1 Per-position all-symbol probe
For each body position we compute the Poisson z-score of every one of the 32 base32 symbols and report the largest |z| at that position. This catches per-position excesses that the M/5 projection does not.
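A minimal sketch of this probe follows, assuming the same binomial-sd z-score used for the symbol tables and the RFC 4648 base32 alphabet; `z_for_count` and `max_abs_z_profile` are illustrative names. All 32 symbols are scanned at each position so that absent symbols (count 0) compete with over-represented ones.

```python
from collections import Counter
from math import sqrt

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"  # assumed base32 alphabet


def z_for_count(count: int, n_messages: int, p: float = 1 / 32) -> float:
    """z-score of a per-position symbol count against the uniform null."""
    return (count - n_messages * p) / sqrt(n_messages * p * (1 - p))


def max_abs_z_profile(messages):
    """Map each 1-indexed position to (symbol, count, z) with the largest |z|."""
    n = len(messages)
    profile = {}
    for pos in range(1, len(messages[0]) + 1):
        counts = Counter(m[pos - 1] for m in messages)
        # Ties are broken by alphabet order because max() keeps the first hit.
        symbol = max(ALPHABET, key=lambda s: abs(z_for_count(counts.get(s, 0), n)))
        count = counts.get(symbol, 0)
        profile[pos] = (symbol, count, z_for_count(count, n))
    return profile
```

Because counts are small integers, the z-scores at a given N take only a handful of discrete values, which is why the per-position profiles look stepped.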
Figure 5. For each position, the largest |z| across all 32 symbols in Group 2. Dashed red line: per-family Bonferroni threshold for 32×38 = 1,216 tests at α = 0.05 (|z| > 3.86). Dashed grey line: per-position uncorrected α = 0.05 for 32 symbols (|z| > 2.87). Positions 1 and 2 reflect the PR distribution and are not body-generator properties. Among body positions (3–40), a single cell — pos 7, G (count 6, z = +4.77) — exceeds the Bonferroni threshold. Positions 3 and 4 sit just below the threshold at z = +3.79 each (A appears 5 of 35 at position 3; T appears 5 of 35 at position 4). The broader profile: 13 of 38 body positions carry at least one symbol above the uncorrected z > 2.87 — roughly twice the expectation under the uniform null (which would give about 4.8 such positions by chance, or closer to 4.9 by a Poisson approximation). The excess is distributed across the body rather than concentrated at any particular region.
On positions 3 and 4 specifically. The M/5-only probe (Figure 3) is flat at positions 3 and 4, as it is everywhere else for G2. The all-symbol probe (Figure 5 for G2, Figure 6 for G3) shows that both positions sit at |z| = +3.79 — A appears at position 3 in 5 of 35 messages (14% against a uniform rate of 3.125%), and T appears at position 4 in 5 of 35 messages. Neither crosses the full-family Bonferroni threshold individually, and neither position is statistically distinguishable from other body positions — the same magnitude of single-symbol excess appears at positions 5 (A=5, J=5), 7 (G=6, the highest), 18 (X=5), 20 (G=5), 27 (F=5), 31 (R=5), 37 (S=5), and 38 (7=5). The evidence does not support treating positions 3–4 as a distinguished region in the G2 40-character format. What it supports is a weaker and more general claim: the whole body of G2 40-character messages shows per-position single-symbol preferences at roughly 2× the chance rate, none individually significant under multiple-testing correction, consistent with either a mild template structure or with the tail of a chance distribution at N = 35. We cannot distinguish those alternatives at this sample size, and we flag positions 3–4 here to make the full picture visible rather than letting the M/5 projection understate it.
Figure 6. Per-position max-|z| for the Group 3 cohort. At N = 20, count=3 corresponds to z = +3.05 and count=4 to z = +4.34; positions with count=4 therefore cross the Bonferroni threshold. 5 body positions do so (14, 19, 24, 25, and 34, all at count=4). Three of these (24, 25, 34) sit inside the repeat-copy regions (22–28 and 31–37) analysed in §6; their per-position excesses are not independent of the internal-repeat finding and are partially attributable to it (if a specific symbol is over-represented in the repeat region's first copy, it appears at the same offset in the second copy by construction). Positions 3 and 4 in G3 carry max-|z| = +1.77 (no symbol exceeds count=2 at either position), well below any threshold of interest. We do not find G3-specific positions-3–4 structure.
5. Doublet profile
We test the rate of consecutive-character doublets at each position pair (i, i+1). Under the uniform null the probability is 1/32 at each pair regardless of marginal distribution. At N = 39 (G2) and N = 20 (G3), the expected doublet count per pair is 1.22 and 0.62 respectively.
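The doublet probe can be sketched in a few lines. This is an illustration under the same assumptions as the symbol tables (uniform null with p = 1/32 and a binomial standard deviation); `doublet_profile` is a hypothetical name.

```python
from math import sqrt


def doublet_profile(messages, p: float = 1 / 32):
    """Rows of ((i, i+1), count, expected, z) for each consecutive pair.

    Positions are 1-indexed. The expected count under the uniform null
    is N * p for N messages.
    """
    n = len(messages)
    expected = n * p
    sd = sqrt(n * p * (1 - p))
    profile = []
    for i in range(1, len(messages[0])):
        # A doublet at pair (i, i+1) is the same character twice in a row.
        count = sum(m[i - 1] == m[i] for m in messages)
        profile.append(((i, i + 1), count, expected, (count - expected) / sd))
    return profile
```

A 40-character message has 39 consecutive pairs, so the profile has 39 rows per cohort.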
Figure 7. G2 doublet counts per consecutive position pair. All pairs fall within the uniform-null envelope (|z| < 2 across all 39 pairs). No tail-region doublet suppression is detected; the last five pairs (35–36 through 39–40) carry doublet counts between 0 and 1, within 1.1 standard deviations of the uniform expectation.
Figure 8. G3 doublet counts per consecutive position pair. Two pairs stand out: pair (23–24) with 7 doublets against 0.62 expected (z = +8.19) and pair (32–33) with 8 doublets against 0.62 expected (z = +9.48), as tabulated in Table 3. These pairs sit at the same relative position inside the two copies of the internal repeat (first@22, second@31) discussed in §6 and are the pair-level manifestation of the same structural feature: the two occurrences of the repeated substring share a doublet at position pair (23, 24) on the first copy and at position pair (32, 33) on the second copy. The excesses are not independent evidence; they are projections of the §6 repeat onto the pairwise doublet probe.
5.1 Doublet table (G3 pairs, for reference)
pair   count  exp   z
1-2    0      0.62  -0.80
2-3    0      0.62  -0.80
3-4    1      0.62  +0.48
4-5    0      0.62  -0.80
5-6    0      0.62  -0.80
6-7    2      0.62  +1.77
7-8    0      0.62  -0.80
8-9    1      0.62  +0.48
9-10   1      0.62  +0.48
10-11  0      0.62  -0.80
11-12  0      0.62  -0.80
12-13  0      0.62  -0.80
13-14  0      0.62  -0.80
14-15  0      0.62  -0.80
15-16  1      0.62  +0.48
16-17  2      0.62  +1.77
17-18  1      0.62  +0.48
18-19  0      0.62  -0.80
19-20  1      0.62  +0.48
20-21  0      0.62  -0.80
21-22  0      0.62  -0.80
22-23  0      0.62  -0.80
23-24  7      0.62  +8.19
24-25  0      0.62  -0.80
25-26  0      0.62  -0.80
26-27  0      0.62  -0.80
27-28  0      0.62  -0.80
28-29  0      0.62  -0.80
29-30  2      0.62  +1.77
30-31  0      0.62  -0.80
31-32  0      0.62  -0.80
32-33  8      0.62  +9.48
33-34  0      0.62  -0.80
34-35  0      0.62  -0.80
35-36  0      0.62  -0.80
36-37  2      0.62  +1.77
37-38  2      0.62  +1.77
38-39  1      0.62  +0.48
39-40  2      0.62  +1.77
Table 3. G3 doublet counts per consecutive position pair. Pairs (23–24) and (32–33) are highlighted; all other pairs are within the uniform envelope.
6. Inter-position equality probe
The consecutive-doublet probe of §5 tests each adjacent position pair (i, i+1) for identical-character rates. A more general version of that probe tests every position pair (i, j) for j > i — including non-adjacent pairs — and asks the same question: how often does the character at position i equal the character at position j? Under the uniform null the expected rate at any pair is 1/32, so the expected count across N messages is N/32 regardless of the gap j−i. A per-family Bonferroni threshold for 40-character messages considers 40·39/2 = 780 tests at α=0.05, giving |z| > 4.00.
The probe is sensitive to structural signatures that are invisible to the narrower consecutive-doublet probe of §5 and to the length-≥3 internal-repeat probe of §7. A single-character equality at a non-adjacent position pair falls into the blind spot between those two: adjacent-only probes miss the non-zero gap, and multi-character-substring probes miss the length-1 repeat. The IPE probe closes the gap.
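A minimal sketch of the IPE probe, assuming the uniform null with p = 1/32 and the binomial standard deviation that reproduces the tabulated z-scores; `equality_z` and `ipe_scan` are illustrative names rather than the paper's own code.

```python
from itertools import combinations
from math import sqrt


def equality_z(matches: int, n_messages: int, p: float = 1 / 32) -> float:
    """z-score of a match count at one position pair against the uniform null."""
    return (matches - n_messages * p) / sqrt(n_messages * p * (1 - p))


def ipe_scan(messages):
    """Rows of (pos_a, pos_b, gap, matches, z) for every pair, largest z first.

    Positions are 1-indexed; a 40-character message yields 780 pairs.
    """
    n = len(messages)
    length = len(messages[0])
    rows = []
    for i, j in combinations(range(length), 2):
        matches = sum(m[i] == m[j] for m in messages)
        rows.append((i + 1, j + 1, j - i, matches, equality_z(matches, n)))
    rows.sort(key=lambda row: row[4], reverse=True)
    return rows
```

Filtering the returned rows by |z| against the chosen family-wise threshold gives the list of significant pairs; the consecutive-doublet probe of §5 is the special case gap = 1.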
6.1 Group 2: two-character repeat at positions 16–17, 21–22
Applied to the 40-character Group 2 cohort (N = 39), the IPE probe returns 5 position pairs above the Bonferroni threshold. The top five:
pos a  pos b  gap  matches  exp   z
16     21     5    24       1.22  +20.97
17     22     5    24       1.22  +20.97
19     24     5    13       1.22  +10.84
26     37     11   11       1.22  +9.00
7      40     33   6        1.22  +4.40
Table 4. Group 2 inter-position equality pairs above Bonferroni (|z| > 4.00). Expected count per pair: 1.22. The two largest pairs are (16, 21) and (17, 22), each at 24 matches out of 39 messages (62% match rate, z = +20.97).
Group 2 has a 2-character fixed-offset repeat. In 24 of 39 messages (62%), the character at position 16 equals the character at position 21, and in 24 of 39 the character at position 17 equals the character at position 22. Equivalently, the 2-character bigram at positions 16–17 tends to be reproduced at positions 21–22, with a fixed 5-character interval between the start of the first copy and the start of the second. Three additional Bonferroni-significant pairs appear at (19, 24) with 13 matches (z = +10.84), (26, 37) with 11 matches (z = +9.00), and (7, 40) with 6 matches (z = +4.40), suggesting additional but weaker structure elsewhere in the body. The Group 2 40-character format is not a uniform-random generator; it is a structured format in which a 2-character region is duplicated.
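Whether a given message carries the bigram repeat can be checked directly. The sketch below is illustrative only: `has_offset_repeat` and `offset_repeat_rate` are hypothetical helpers, with the defaults set to the Group 2 geometry described above (start 16, length 2, gap 5).

```python
def has_offset_repeat(message: str, start: int, length: int, gap: int) -> bool:
    """True if message[start : start+length] recurs `gap` positions later.

    `start` is 1-indexed, matching the position numbering in the text.
    """
    first = message[start - 1:start - 1 + length]
    second = message[start - 1 + gap:start - 1 + gap + length]
    return len(second) == length and first == second


def offset_repeat_rate(messages, start: int = 16, length: int = 2, gap: int = 5):
    """Return (hits, total) for the fixed-offset repeat across a cohort."""
    hits = sum(has_offset_repeat(m, start, length, gap) for m in messages)
    return hits, len(messages)
```

Calling `has_offset_repeat(msg, 22, 3, 9)` applies the same check to the Group 3 geometry, which makes the two repeat structures directly comparable.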
Figure 9. Group 2 inter-position equality heatmap. Each cell (i, j) shows the z-score of the match count at that pair against the uniform null. Red = excess matches (structured equality), blue = deficit, grey = within noise. The brightest red cells at (16, 21) and (17, 22) are the 2-character repeat signature. Secondary red cells at (19, 24) and (26, 37) mark additional fixed-offset equalities. All other pairs fall within the noise envelope.
6.2 Group 3: character-level confirmation of the §7 repeat
Applied to Group 3 (N = 20), the IPE probe returns 17 position pairs above the Bonferroni threshold. The top pairs correspond exactly to the repeat-copy structure identified by the internal-repeat probe in §7:
pos a  pos b  gap  matches  exp   z
22     31     9    20       0.62  +24.90
23     32     9    20       0.62  +24.90
27     36     9    18       0.62  +22.33
24     33     9    17       0.62  +21.04
25     34     9    16       0.62  +19.76
26     35     9    12       0.62  +14.62
23     25     2    8        0.62  +9.48
23     33     10   8        0.62  +9.48
Table 5. Group 3 inter-position equality pairs above Bonferroni (top 8 of 17). The six top-ranked pairs all have gap 9, consistent with the fixed-offset-9 verbatim repeat identified in §7. The IPE probe and the internal-repeat probe converge on the same structural feature.
Figure 10. Group 3 inter-position equality heatmap. The diagonal band at gap 9 (offset 9 off the diagonal) between positions 22–28 and 31–37 is the signature of the verbatim repeat covered in §7.
7. Internal-repeat structure (Group 3)
The inter-position equality probe of §6 tests for single-character equalities at every pair of positions. Its multi-character generalisation tests for length-L substrings (L ≥ 3) appearing at two disjoint positions. For Group 3 the finding of §6 is that equalities at offset 9 span the entire positions-22-through-33 region; this longer-substring probe confirms that the equalities are not per-position coincidences but a full verbatim substring repeat.
Group 3: 3-to-7 character verbatim repeat at offset 9. 19 of 20 messages contain a length-3-to-7 substring that appears verbatim at two positions. In 17 of them the first occurrence starts at character 22 and the second at character 31; the two exceptions have their longest repeat at (26, 35) and (21, 36). The substrings differ across messages but the offset pair is near-invariant.
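One way to compute a per-message longest-internal-repeat record is a brute-force scan from the longest candidate length downwards. This is a sketch under assumptions: the source does not specify its algorithm or tie-breaking, so `longest_internal_repeat` below requires the two copies to be non-overlapping and returns the first qualifying repeat at the longest length.

```python
def longest_internal_repeat(message: str, min_len: int = 3):
    """Longest substring occurring at two disjoint places in `message`.

    Returns (length, first_start, second_start, substring) with 1-indexed
    starts, or None when no repeat of at least `min_len` characters exists.
    """
    n = len(message)
    for length in range(n // 2, min_len - 1, -1):
        first_seen = {}
        for start in range(n - length + 1):
            chunk = message[start:start + length]
            # Keep the earliest occurrence; it is the most likely to be
            # disjoint from any later copy.
            prev = first_seen.setdefault(chunk, start)
            if start - prev >= length:
                return length, prev + 1, start + 1, chunk
    return None
```

At 40 characters the quadratic scan is negligible; a suffix-based method would only matter for much longer messages.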
7.1 Per-message repeat table (Group 3)
message                                   len  first@  second@  substring
FJTO7B2V2ITMXMWYYSY465NZVRECHW5NZVLCCCBI  4    22      31       5NZV
FJE7XHB6LXCJXPIZISJGDH5OJUIGNGH5OJ7IRMRN  4    22      31       H5OJ
FJAUIZBARL27THBVJIAQLZGGTQXNPNZGGTQX5DXB  6    22      31       ZGGTQX
FJ7KAC3RXNF3OMGMJX7XQR7B7VYWOIR7B7IYJEBB  4    22      31       R7B7
DXYKNHY7DMOB5P4KB372GV22R2TNFSV22R2TZQXW  6    22      31       V22R2T
DXJMB3XYHMLQHR6GM2Y5LZ77GW2NCHZ77GW2JDOZ  6    22      31       Z77GW2
DX2T377LNVJAQYZSR2KTZYQIQPX7IEYQIQEXBFCN  4    22      31       YQIQ
DX3D7M6TOENVE5ATJJFPWYEQEYQ4XUYEQEGQDPWS  4    22      31       YEQE
DXKDCGKS74QV2WDHWULWTPZZOEA6HJPZZOEAZXPB  6    22      31       PZZOEA
DXVGURR3MON6BZADYUAEHWGZGBUZ5SWGGIBUKFRU  (no repeat of length ≥ 3)
L6QO42WYAATGOEWNLXLGF345X5IBGG345XWBBBZX  4    22      31       345X
DQMJLEOPKUXZGFPK4WIXAET4T3UHYNET4TZUFVTG  4    22      31       ET4T
DQ4SHPDMW2RLT3USPKQE4344TUBO45344TUBNJVZ  6    22      31       344TUB
DQJ6JRVZHIA5DLEDDQXGE6X6XYBLG46X6X5BHEK3  4    22      31       6X6X
DQ33Q4O33BACIVXDVR7JICOOBIV5FDCO7OIV5WWI  3    26      35       IV5
DPXNVZ3PY4R4AP2JHLN76LNQ2FJIM2LNQ2FJIKSU  7    22      31       LNQ2FJI
DPNUQO46LZEPFH22E7QVCDCZTZ4DJGDCZTZ4D2HT  7    22      31       DCZTZ4D
DPLZ6PRXDIXDKPKPOW7EJ5V6V2JPML5VVH2J5VTT  3    21      36       J5V
AKNHLJPITLXGXHRQAZPPA4RRX6ZYBI4RRX6ZMFCD  6    22      31       4RRX6Z
D5E4SRHXF4RL6S6E2KZ7WH4K4MGXEEH4KYMGZXUT  3    22      31       H4K
Table 6. Per-message longest-internal-repeat record for Group 3 first-broadcast messages. Highlighted rows: repeats beginning at the canonical offset pair (22, 31).
7.2 Per-message repeat table (Group 2)
The 2-character repeat identified in §6.1 at positions (16, 17) → (21, 22) is shorter than the length-3 threshold of the substring-repeat probe, so it does not appear in the longest-internal-repeat table below. The table records length-3-or-longer repeats separately; Group 2 at 40 characters has 9 such messages out of 39. Four of them start at the offset pair (16, 21) and three at (25, 36), so they extend the fixed-offset equalities of §6.1 by one character; the remaining two sit at (15, 20) and (15, 27):
message                                   len  first@  second@  substring
ARALJD5A2DQJUH6IC6BDICGB556IC5BEZ6LQS4YT  3    15      27       6IC
BPXCLQJ5DAQ62IKHZXAGHZX37P7A6MEH26G37KJ5  3    16      21       HZX
3ZWGDXGLDFVJ576PYC66PYHTGF3APNS5G2Q5X7TX  3    15      20       6PY
KJZ6PEJLTLSZBXGAPXLXRJV34SGZ2UZC6NC4SGD7  3    25      36       4SG
NZKT77W3X2PQBFMLGSAGRE6H7YFN4FBIPE67YFGI  3    25      36       7YF
NZW6FZMKFB3AWDS6ZYVX6ZYJDLP2KHRKN7GPYCH7  3    16      21       6ZY
MPXW3YY7NVIH6X5MURGGMURQOWEAXURHABKPUE35  3    16      21       MUR
XBDIASPC367ARLCDT4ZADT4EJ4DYQBSQ2TWE63U6  3    16      21       DT4
I4FFJUUA4DYJCOFGP4Z22XCWHSW2H7WP2B6HSW7Z  3    25      36       HSW
Table 7. Group-2 messages with a length-3-or-longer internal repeat (30 messages without any such repeat are omitted). The 2-character fixed-offset repeat identified in §6.1 is below the length threshold of this probe and does not appear here. Absence of a length-3 repeat does not imply absence of structure; §6 is the relevant probe.
8. Within-prefix structural probes
The 30-character paper reported three within-prefix probes — pairwise positional mutual information, LZMA compressibility, and modular-difference structure across date-ordered consecutive messages — each requiring N ≥ 50 per prefix cell for credible z-scores. In the 40-character cohort, no prefix cell has more than 15 messages in either group. The per-prefix probes cannot be run at the same statistical power and are skipped.
What can be run is an aggregate compressibility probe on the body bytes of each whole-cohort corpus:
cohort  body bytes  obs ratio  null mean  null sd  z
G2      1482        0.7368     0.7345     0.0022   +1.07
G3      760         0.7789     0.8142     0.0050   -7.10
Table 8. Aggregate LZMA compressibility of the body (positions 3–40) of each cohort. The null distribution is 50 synthetic uniform-random base32 draws of matched length. G2 matches the synthetic null; G3 compresses at a ratio 7.1σ below the null, consistent with the repeat finding of §6 producing locally redundant bytes.
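The aggregate probe can be sketched with the standard-library `lzma` module. The source does not state its LZMA settings or container format, so the raw LZMA2 stream at preset 9 used here is an assumption, as are the names `lzma_ratio` and `compressibility_z` and the fixed seed; absolute ratios from this sketch need not match Table 8.

```python
import lzma
import random
from statistics import mean, stdev

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"  # assumed base32 alphabet
# Assumed settings: raw LZMA2 stream, preset 9, no container overhead.
_FILTERS = [{"id": lzma.FILTER_LZMA2, "preset": 9}]


def lzma_ratio(text: str) -> float:
    """Compressed size divided by raw size for an ASCII string."""
    raw = text.encode("ascii")
    packed = lzma.compress(raw, format=lzma.FORMAT_RAW, filters=_FILTERS)
    return len(packed) / len(raw)


def compressibility_z(body: str, draws: int = 50, seed: int = 0):
    """Return (observed ratio, null mean, null sd, z) for a body string.

    The null is `draws` uniform-random base32 strings of matched length.
    """
    rng = random.Random(seed)
    null = [
        lzma_ratio("".join(rng.choices(ALPHABET, k=len(body))))
        for _ in range(draws)
    ]
    mu, sd = mean(null), stdev(null)
    observed = lzma_ratio(body)
    z = (observed - mu) / sd if sd > 0 else float("nan")
    return observed, mu, sd, z
```

Here `body` would be the concatenated body characters (positions 3–40) of one cohort. With only 50 draws the null sd is itself noisy, so the z-score is best read as an order-of-magnitude indicator.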
Pairwise mutual-information scans require N ≥ 50 to populate 32×32 joint tables with non-zero count per bin. Both cohorts fail that requirement; the MI probe is flagged as skipped in the reproduction output.
Differential structure between date-ordered consecutive broadcasts likewise requires per-prefix sample sizes we do not have; no prefix in either cohort has ≥ 50 messages.
9. Synthesis
Both 40-character cohorts carry format-level structure. The two structures are distinct; neither matches the 30-character Group 1 format.
9.1 Group 2: 2-character fixed-offset repeat
The 40-character Group 2 payload is not a uniform-random generator. In 24 of 39 messages (62%), the character at position 16 equals the character at position 21, and in 24 of 39 the character at position 17 equals the character at position 22; equivalently, the 2-character bigram at positions 16–17 tends to be reproduced at positions 21–22 with a fixed 5-character interval between the starts. Further Bonferroni-significant fixed-offset equalities appear at (19, 24) with 13 matches and (26, 37) with 11 matches, indicating further (weaker) repeat-like structure elsewhere in the body. The structure is invisible to the consecutive-doublet probe of §5 and the length-≥3 internal-repeat probe of §7; the inter-position equality probe of §6 is what surfaces it, at overwhelming significance (z = +20.97 at each of the two leading pairs).
Figure 11. G2 40-character observational region schematic. After the 10-bit prefix (P, positions 1–2) and a header region (H, positions 3–15, approximately 65 bits), the body carries a 2-character region (R₁, positions 16–17) whose content is reproduced at positions 21–22 (R₂) with a short gap (s) at positions 18–20 between them. Positions 23–40 are a trailing body region (T). Additional weaker equalities appear at the position pairs (19, 24) and (26, 37) but do not organise into a single dominant structure. The schematic is observational and the exact boundaries of R₁/R₂ are approximate. At each of the two character pairs, 24 of 39 messages carry the equality and 15 do not (Table 4); we do not infer a cryptographic role for the repeat.
9.2 Group 3: 3-to-7 character fixed-offset repeat
The Group 3 cohort carries a related but stronger structure: a 3-to-7 character substring starting at position 22 is duplicated verbatim starting at position 31 in 17 of 20 messages (Table 6), and 19 of 20 carry some verbatim repeat of length ≥ 3. The inter-position equality probe confirms at the character level (all 20 messages match at (22, 31) and at (23, 32), and 17 of 20 at (24, 33)) and the longest-substring probe confirms at the substring level. Compressibility (z = -7.10 against a uniform null) corroborates the redundancy.
Figure 12. G3 40-character observational region schematic. After the 10-bit prefix (P), the body carries a header region (H, positions 3–21), a first copy of a repeat region (R₁, positions 22–28), a short gap (s, positions 29–30), a second verbatim copy (R₂, positions 31–37), and a tail (T, positions 38–40). The exact boundaries of R₁ and R₂ vary across messages between length 3 and length 7 (the longest repeats observed). We do not assign cryptographic or semantic roles to any region; the figure summarises observed message structure.
The Group 2 and Group 3 40-character repeat structures are different in scale (2 characters vs 3–7), offset (5 positions vs 9), and anchor position (16 vs 22). They are not variants of a single underlying rule. Both reject the null of a uniform-random generator, and neither reproduces the 30-character Group 1 signatures.
10. Discussion and limits
10.1 Scope of the probe battery
The probes reported here span four classes of statistical test: first-order symbol marginals (§3), per-position marginal z-scores by symbol (§4), consecutive-doublet and inter-position equality rates (§5 and §6), and multi-character internal repeats (§7). The IPE probe of §6 is the probe that surfaces the Group 2 finding: a 2-character repeat at a non-zero gap falls into the blind spot between the consecutive-doublet probe (which tests only pairs i, i+1) and the length-≥3 internal-repeat probe (which tests only multi-character substrings). Readers applying this probe battery to other cohorts should include the IPE probe from the start.
10.2 Statistical power
At NG2 = 39 and NG3 = 20 the probe battery operates at roughly 1% and 0.3% of the 30-character Group 1 cohort's sample size. The probes that dominated the 30-character paper's positive findings — aggregate doublet z-scores of order 20σ, within-prefix MI/compressibility — require much larger per-cell samples than available here, and their absence at 40 characters does not rule out effects of the same relative magnitude. What is robustly detectable at this sample is any fixed-offset repeat effect whose per-message probability is meaningfully above the uniform null. Both groups carry such an effect.
10.3 What the G2 repeat does not say
The observation is format-level: in most G2 messages, two specific 2-character positions are equal. We do not identify what those characters encode, why the repeat exists, or whether the repeat is cryptographically significant. A 2-character repeat at a fixed offset is consistent with many sub-hypotheses — a short redundancy field for error detection, a counter that encodes its value at two positions, a format that uses identical separators at two locations. The observation does not distinguish them.
10.4 Future probes
(i) Does the G2 repeat substring correlate with prefix, date, or any other observable? (ii) Do the G2 messages without the repeat (15 of 39 at each leading pair) lack it for systematic reasons (different sub-format) or by chance at this sample? (iii) Are positions (19, 24) and (26, 37), the secondary G2 IPE pairs, part of the same structure, part of a different one, or chance projections? (iv) Do the G2 and G3 repeat geometries at 40 characters share an underlying rule, or are they independent designs? These questions are answerable only with larger cohorts or with cross-length comparison against Group 3 cohorts at other lengths.
11. Conclusion
Applied to 40-character HFGCS broadcasts, the expanded probe battery (with the inter-position equality probe added to the 30-character paper's original battery) returns two independent positive findings:
Group 2 (N = 39): 2-character fixed-offset repeat. In 24 of 39 messages, the character at position 16 equals the character at position 21, and in 24 of 39 the character at position 17 equals the character at position 22 (z = +20.97 each). The 2-character bigram at positions 16–17 is reproduced at positions 21–22.
Group 3 (N = 20): 3-to-7 character fixed-offset repeat. In 17 of 20 messages, a length-3-to-7 substring starting at position 22 is reproduced verbatim starting at position 31.
Both findings are surfaced by the inter-position equality probe of §6, which tests every position pair for character equality — a probe the consecutive-doublet and length-≥3 internal-repeat probes do not cover. Neither the 30-character Group 1 signatures (M/5 deficit, mid-body transition, trailing no-run constraint) nor a shared G2/G3 rule is reproduced here. G2 and G3 40-character are distinct formats.
A. Methodology
The probe battery extends the 30-character paper's battery with the inter-position equality probe (§6) and the internal-repeat probe (§7). Forward-fill group resolution and first-broadcast itemisation are applied identically. See the 30-character paper for the full methodology, threat model, and references.