Three Sheets, One Architecture: Inter-Sheet Joint Pauli Measurements for $K=4$ Quantum Error Correction
Raghu Kulkarni
SSMTheory Group, IDrive Inc., Calabasas, CA 91302, USA
raghu@idrive.com
Abstract
We present a CSS quantum error correcting code on the Face-Centered Cubic (FCC) lattice that combines surface-code-compatible $K=4$ connectivity with native inter-sheet joint Pauli measurements between co-located CSS code blocks. The construction has two parts. First, restricting the $[[3L^3, 2L^3+2, 3]]$ FCC lattice code to a single triad sheet yields the sheet code with parameters $[[L^3, 2L, L]]$ at even $L$ (or $[[L(L-1)^2, L, L-1]]$ as a planar variant), uniform weight-4 stabilizers, and $K=4$ active per-qubit connectivity. Three triad sheets share the FCC lattice geometry, encoding $6L$ logical qubits at distance $L$, deployable as a monolithic 2D chip (small $L$), a three-layer stacked architecture, or native 3D hardware. Second, we introduce a fault-tolerant surgery protocol using local FCC triangle measurements to implement joint Pauli measurements between logical qubits in different sheets.
Our quantitatively characterized claims are:
Static memory baseline. Single-sheet logical memory simulation at $L \in \{4, 6\}$ on a custom Stim circuit gives an FSS crossing near 1.0% (Section 7.4), consistent with surface-code-like behavior under circuit-level depolarizing noise with MWPM decoding. This is reported as a baseline, not the central characterized contribution.
Joint-$ZZ$ surgery (ZZ-merge): FSS-crossing threshold estimate $1.07\% \pm 0.05\%$ for the toric variant and $0.76\% \pm 0.05\%$ for the boundary-aware planar variant, from pairwise crossings at $L \in \{4, 6, 8\}$ (Sections 7, 9.2). This is the central characterized FT primitive.
Joint-$XX$ surgery (XX-merge): FSS-crossing threshold estimate $1.0\% \pm 0.1\%$ on the toric variant (Section 9.3). The planar boundary-aware variant has a structural blocker for the dual primitive under the standard boundary choice.
As a synthesis claim, we verify the three-sheet Horsman [5] CNOT logical truth table at $L \in \{4, 6, 8\}$ on the toric variant ($p = 0$, deterministic on all four computational-basis inputs, DEM builds cleanly), and observe a $4.2\sigma$ distance improvement at $p = 10^{-3}$ from $L = 4$ to $L = 6$ under MWPM. However, a direct diagnostic shows the protocol as constructed is not yet fault-tolerant in the standard sense: increasing the per-merge depth $d_{\text{each}}$ does not reduce the logical error rate (LER). At $L = 4$, $d_{\text{each}} = 8$ gives a higher LER than $d_{\text{each}} = 4$, because the Pauli-frame correction enters the observable through specific gauge measurements that are not themselves repeated $d$ times. Reaching FT-thresholded status for the full CNOT requires either FT-repeated gauge measurements or a gauge-aware decoder; both are concrete follow-up directions.
The architecture’s deployment niche is therefore: high-density FT memory plus FT joint-Pauli measurements at $K=4$, with the full universal Clifford layer identified as a near-term protocol refinement rather than a structural blocker. The density advantage is concrete and consistent: 16 physical qubits per logical at $L = 4$ in a three-sheet deployment vs 31 for distance-4 rotated surface code (a $1.9\times$ improvement, essentially flat in $L$). Application classes that consume only memory and joint-Pauli primitives (quantum networking nodes, large-scale logical benchmarking, NISQ-to-FT bridge demonstrations) are immediately addressable on existing or near-term $K=4$ hardware (Google Willow, IQM, OQC). Compared to surface code lattice surgery (confined within a single patch), the sheet code provides $2L$ logicals per sheet at the same distance and connectivity; compared to bivariate bicycle codes (inter-block gates between physically separated $K=6$ modules), the sheet code achieves inter-sheet operations at $K=4$ with only short-range couplings between co-located sheets.
1 Introduction
Two families of quantum error correcting codes currently dominate the discussion of near-term fault-tolerant quantum computing. The surface code [2, 4] pairs $K=4$ planar hardware compatibility with mature fault-tolerant protocols, including lattice surgery [5, 6] for logical gates within a single 2D substrate. The bivariate bicycle (BB) code family [10] achieves an order-of-magnitude rate advantage over the surface code at the cost of $K=6$ connectivity and a small number of long-range couplers. The BB family’s inter-block logical gates (gates that move logical qubits between distinct code blocks) remain an active research problem [11].
This paper introduces a third option that occupies a previously unfilled niche: a CSS code at $K=4$ connectivity (matching the surface code) with native inter-sheet joint Pauli measurements between co-located CSS code blocks, a fault-tolerant primitive that the surface code provides only within a single patch’s lattice surgery zone. This is a different design point from surface code lattice surgery (which operates within a single substrate) and from bivariate bicycle codes (which target inter-block gates between physically separated $K=6$ modules and where modular logical operation protocols are an area of active architectural development). Full FT logical CNOT composition between sheets is a synthesis of two such joint-Pauli primitives that we verify at the truth-table level but identify as not yet fault-tolerantly characterized in this work (Section 10); the primary characterized contribution is the joint-Pauli primitive itself. The construction has two interlocking ingredients.
The FCC sheet code. Restricting the $[[3L^3, 2L^3+2, 3]]$ FCC lattice code [1] to a single triad sheet (one of three orthogonal $K=4$ sublattices in FCC) eliminates the FCC code’s weight-3 vulnerability and yields a CSS code with parameters $[[L^3, 2L, L]]$ at even $L$. Each of the three sheets decomposes into $L$ parallel 2D toric codes; three sheets on a shared FCC substrate encode $6L$ logical qubits at distance $L$ on $3L^3$ data qubits.
Cross-sheet triangle surgery. Every FCC triangle has one edge in each of the three triad sheets. Weight-3 Pauli measurements on FCC triangles couple data qubits across sheets, providing a natural primitive for inter-sheet logical operations. We show that triangle products implement joint Pauli measurements between logical qubits in different sheets, with the merged code preserving distance $d = L$.
The two ingredients work in tandem: the sheet code provides efficient storage at $K=4$; the triangle surgery primitive provides fault-tolerant inter-sheet joint Pauli measurements at the same envelope. The result fills a gap that neither surface code nor bivariate bicycle codes address: $K=4$ planar connectivity with inter-sheet joint Pauli measurements as a native primitive. Composing these primitives into full FT logical gates (Clifford circuits, magic-state distillation, etc.) requires the further protocol or hardware refinements identified in Section 10.
[Figure 1 plot: 3D axes $x$, $y$, $z$; triangle vertices $v_a$, $v_b$, $v_c$; legend: sheets $S_{xy}$, $S_{xz}$, $S_{yz}$.]
Figure 1: Every FCC triangle has exactly one edge in each triad sheet. The three edge vectors of a triangle $\{v_a, v_b, v_c\}$ partition naturally across $S_{xy}$, $S_{xz}$, $S_{yz}$. A single weight-3 Pauli measurement on a triangle simultaneously couples data qubits across all three sheets.
Summary of results. (i) Sheet code with parameters $[[L^3, 2L, L]]$ on a torus, $[[L(L-1)^2, L, L-1]]$ on a plane, uniform weight-4 stabilizers, $K=4$ data-qubit connectivity (Section 2); (ii) three-sheet hardware architecture via time-multiplexed syndrome extraction (Section 3); (iii) triangle algebra spanning $6L-3$ of the $6L$ cross-sheet Z-logicals (Section 4); (iv) fault-tolerant surgery protocol with $K=4$ verification and $O(L)$ gate overhead (Section 5); (v) distance preservation under merge for both Z- and X-sides (Section 6); (vi) threshold simulation via custom Stim at $L = 4, 6, 8$ giving $p^{ZZ}_{\text{th}} = 1.07\% \pm 0.05\%$ (toric) and $p^{\text{planar}}_{\text{th}} = 0.76\% \pm 0.05\%$ (boundary-aware planar) (Section 7); (vii) comparison with state of the art (Section 8); (viii) CSS-dual XX-merge primitive demonstrated with comparable threshold 1.0% at $L = 4, 6, 8$ on the toric variant (Section 9.3); (ix) three-sheet Horsman CNOT logical truth table verified at $L \in \{4, 6, 8\}$ but not yet fault-tolerantly characterized (Section 10).
Claim status matrix.
To make the epistemic status of each claim transparent up front:
| Claim | Status | Evidence / Section |
|---|---|---|
| $[[L^3, 2L, L]]$ code parameters (toric); $[[L(L-1)^2, L, L-1]]$ (planar) | Proven | Theorem 2, §2 |
| $K=4$ active per-qubit connectivity | Verified by construction | §5, gate schedule |
| Z- and X-distance preservation under merge | Proven (general $L$) + computational checks ($L = 4, 6, 8$) | Theorems 5, 6 |
| Static memory FSS-crossing baseline 1.0% (toric, single sheet) | Simulation, $L \in \{4, 6\}$, MWPM | §7.4 |
| ZZ-merge FSS-crossing threshold $1.07\% \pm 0.05\%$ (toric) / $0.76\% \pm 0.05\%$ (planar boundary-aware) | Simulation, $L \in \{4, 6, 8\}$, MWPM, FSS data-collapse fit available | §7, §9.2 |
| XX-merge FSS-crossing threshold $1.0\% \pm 0.1\%$ (toric only; planar XX has structural blocker) | Simulation, $L \in \{4, 6, 8\}$, MWPM | §9.3 |
| Three-sheet Horsman CNOT logical truth table at $p = 0$ | Verified deterministic at $L \in \{4, 6, 8\}$, $d_{\text{each}} \in \{2, 3, 4\}$ at $L = 4$ | §10 |
| CNOT distance suppression at $p = 10^{-3}$ | Observed: $4.2\sigma$ improvement $L = 4 \to L = 6$ | §10, Fig. 6 |
| CNOT fault-tolerant threshold | Not extracted; $d_{\text{each}}$-scaling fails (non-FT) | §10, follow-up direction |
| Three-layer stacked architecture with vertical inter-sheet couplers | Specified structurally; per-coupler fidelity not modeled | §3 |
| Magic-state distillation, non-Clifford layer | Suggested by octahedral symmetry; no protocol constructed | §10.8 |
2 The FCC Sheet Code
2.1 The Triad Decomposition
The FCC lattice has $K = 12$ nearest-neighbor vectors, partitioning naturally into three orthogonal sheets of 4:
$$S_{xy} : (\pm 1, \pm 1, 0), \qquad S_{xz} : (\pm 1, 0, \pm 1), \qquad S_{yz} : (0, \pm 1, \pm 1). \tag{1}$$
Each FCC edge belongs to exactly one sheet. At lattice size $L$ (even), each sheet contains $L^3$ edges. Restricted to a single sheet, each FCC vertex has $K=4$ incident edges.
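The partition in Eq. (1) and the per-sheet edge count can be checked by direct enumeration. The sketch below is illustrative only: it assumes FCC vertices are the integer points of an $L \times L \times L$ periodic cube with even coordinate sum, and the function names `triad_sheets` and `edges_per_sheet` are hypothetical, not taken from the paper's code.

```python
from itertools import product

def triad_sheets():
    """Group the 12 FCC nearest-neighbor vectors by the axis with zero displacement."""
    sheets = {"xy": [], "xz": [], "yz": []}
    for d in product((-1, 0, 1), repeat=3):
        if sorted(abs(c) for c in d) == [0, 1, 1]:      # exactly one zero component
            zero_axis = d.index(0)                      # 0 -> yz, 1 -> xz, 2 -> xy
            sheets[("yz", "xz", "xy")[zero_axis]].append(d)
    return sheets

def edges_per_sheet(L):
    """Count distinct edges of each sheet on a periodic lattice of even size L."""
    # Assumption: FCC vertices are the points with even coordinate sum.
    verts = [v for v in product(range(L), repeat=3) if sum(v) % 2 == 0]
    counts = {}
    for name, dirs in triad_sheets().items():
        edges = {frozenset((v, tuple((a + b) % L for a, b in zip(v, d))))
                 for v in verts for d in dirs}
        counts[name] = len(edges)
    return counts

print({name: len(dirs) for name, dirs in triad_sheets().items()})
print(edges_per_sheet(4))
```

Under these assumptions each sheet should receive 4 of the 12 vectors and $L^3$ edges. The count relies on $L \ge 4$, since at $L = 2$ opposite neighbor vectors coincide under the periodic wrap.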
2.2 The Sheet Code Stabilizers
Definition 1 (FCC sheet code). Fix one triad sheet $S$ (say $S_{xy}$). Place one physical qubit on each edge in $S$ ($n = L^3$ qubits). The stabilizers are:
$Z$-stabilizers: for each vertex $v$, apply $Z$ to the 4 edges of $S$ incident to $v$.
$X$-stabilizers: for each octahedral void $o$, apply $X$ to the 4 edges of $S$ connecting the 6 vertices surrounding $o$.
Both stabilizer types have uniform weight 4. The CSS condition $H_X H_Z^T = 0$ over GF(2) is satisfied because each edge in sheet $S$ connects two vertices and participates in exactly two octahedral voids restricted to $S$; the overlap between any X-stabilizer and any Z-stabilizer is even.
2.3 Code Parameters
Theorem 1 (Sheet code parameters). At even $L$, the FCC sheet code has parameters $[[L^3, 2L, L]]$: $n = L^3$ physical qubits, $k = 2L$ logical qubits, code distance $d = L$.
The parameters follow from the layer decomposition (Section 2.4) together with standard 2D toric code counting. Computational verification:
| $L$ | $n$ | $\operatorname{rank}(H_Z)$ | $\operatorname{rank}(H_X)$ | $k$ |
|---|---|---|---|---|
| 4 | 64 | 28 | 28 | 8 |
| 6 | 216 | 102 | 102 | 12 |
| 8 | 512 | 248 | 248 | 16 |
In each case $\operatorname{rank}(H_Z) = \operatorname{rank}(H_X) = (L^3 - 2L)/2$, giving $k = L^3 - 2(L^3 - 2L)/2 = 2L$. The general result follows from Theorem 2 below.
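A rank check of this kind can be reproduced with a few lines of GF(2) linear algebra. The following is a minimal sketch, not the paper's verification code: it assumes the sheet $S_{xy}$ on a periodic lattice, takes FCC vertices to be the even-coordinate-sum points and octahedral void centers the odd-sum points, and reads each void's X-stabilizer as the 4 in-plane edges joining its 4 in-plane neighbors. The helper names `build_sheet_code`, `gf2_rank`, and `sheet_parameters` are hypothetical.

```python
from itertools import product

def build_sheet_code(L):
    """Parity checks of the S_xy sheet code as lists of edge-index sets (periodic, even L)."""
    def shift(v, d):
        return tuple((a + b) % L for a, b in zip(v, d))

    edge_id = {}
    def edge(a, b):
        # Assign a stable integer index to each undirected edge on first sight.
        return edge_id.setdefault(frozenset((a, b)), len(edge_id))

    points = list(product(range(L), repeat=3))
    verts = [p for p in points if sum(p) % 2 == 0]   # assumed FCC vertices
    voids = [p for p in points if sum(p) % 2 == 1]   # assumed octahedral void centers

    dirs = [(1, 1, 0), (1, -1, 0), (-1, 1, 0), (-1, -1, 0)]
    h_z = [{edge(v, shift(v, d)) for d in dirs} for v in verts]

    h_x = []
    for c in voids:
        w, e = shift(c, (-1, 0, 0)), shift(c, (1, 0, 0))
        s, n = shift(c, (0, -1, 0)), shift(c, (0, 1, 0))
        h_x.append({edge(w, n), edge(n, e), edge(e, s), edge(s, w)})
    return len(edge_id), h_z, h_x

def gf2_rank(rows):
    """Rank over GF(2) of rows given as sets of column indices (bitmask elimination)."""
    pivots = {}                      # highest set bit -> reduced row
    for r in rows:
        m = sum(1 << i for i in r)
        while m:
            h = m.bit_length() - 1
            if h not in pivots:
                pivots[h] = m
                break
            m ^= pivots[h]
    return len(pivots)

def sheet_parameters(L):
    n, h_z, h_x = build_sheet_code(L)
    # CSS condition: every X-check overlaps every Z-check on an even number of edges.
    assert all(len(x & z) % 2 == 0 for x in h_x for z in h_z)
    r_z, r_x = gf2_rank(h_z), gf2_rank(h_x)
    return n, r_z, r_x, n - r_z - r_x

for L in (4, 6):
    print(L, sheet_parameters(L))
```

Each tuple is $(n, \operatorname{rank}(H_Z), \operatorname{rank}(H_X), k)$ and can be compared against the rows of the table above. The sketch only counts ranks; it does not compute the distance.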
2.4 Layer Decomposition and Proof of Parameters
Why the distance increases. The full FCC code has $d = 3$ because weight-3 logical operators exist at tetrahedral voids: one edge from each of the three triad sheets forms a triangle commuting with all weight-12 stabilizers. Within a single triad sheet, only one edge of any such triangle survives, giving a single-edge Pauli that anticommutes with the appropriate opposite-type sheet stabilizers (a Z-edge with the X-stabilizers of the sheet; an X-edge with the Z-stabilizers) and is therefore detected. No weight-3 logical survives the sheet restriction; the minimum-weight logical operators of the sheet code are non-contractible cycles within the sheet, of length $L$.
Layer structure. Each triad sheet decomposes further into $L$ independent layers indexed by the zero-displacement coordinate. For sheet $S_{xy}$, edges have $dz = 0$, so each $S_{xy}$ edge has a well-defined $z$-coordinate equal to the shared $z$ of its two endpoints. Edges in layer $z = z_0$ form a 2D toric code on a rotated $L \times L$ square lattice. Analogous decompositions hold for $S_{xz}$ (layered by $y$) and $S_{yz}$ (layered by $x$).
Theorem 2 (Layer decomposition). The FCC sheet code on sheet $S_{xy}$ at even $L$ is isomorphic, as a stabilizer code, to $L$ disjoint 2D toric codes, each on an $L \times L$ rotated square lattice with $L^2$ data qubits, $k = 2$ logical qubits, and distance $L$. The Z-stabilizers (resp. X-stabilizers) of the sheet code partition into $L$ disjoint sets, one per layer; within each layer, the rank deficiency equals 1 (one product redundancy among $L^2/2$ vertex stabilizers).
Proof. Edge partition. Each $S_{xy}$ edge $(v_1, v_2)$ has $z(v_1) = z(v_2)$ since the displacement vector $v_2 - v_1 \in \{(\pm 1, \pm 1, 0)\}$ has $dz = 0$. Define $\mathrm{layer}(e) = z(v_1)$. The map $\mathrm{layer} : S_{xy} \to \{0, 1, \ldots, L-1\}$ partitions the $L^3$ edges of $S_{xy}$ into $L$ disjoint sets of $L^2$ edges each.
Stabilizer partition. A vertex Z-stabilizer at vertex $v$ acts on the 4 sheet-$S_{xy}$ edges incident to $v$, all of which have the same $z$-coordinate as $v$. Hence each vertex Z-stabilizer is supported entirely within one layer. An analogous argument applies to octahedral void X-stabilizers within $S_{xy}$, since the 4 edges of an oct void restricted to $S_{xy}$ all share the same $z$-coordinate as the void center.
Layer is a 2D toric code. Within layer $z = z_0$, the $L^2$ edges connect vertices $\{(x, y, z_0) : x + y \equiv z_0 \pmod 2\}$ via the four neighbor vectors $(\pm 1, \pm 1, 0)$. This is precisely the rotated $L \times L$ square lattice, and the vertex Z-stabilizers and oct-void X-stabilizers on this layer are exactly the standard 2D toric code stabilizers. The toric code on $L \times L$ has parameters $[[L^2, 2, L]]$.
Rank count. The 2D toric code on $L^2$ data qubits has $L^2/2$ vertex Z-stabilizers, satisfying one redundancy (the product over all vertices is the identity). Hence per layer, $\operatorname{rank}(H_Z^{\text{layer}}) = L^2/2 - 1$. Across $L$ layers, $\operatorname{rank}(H_Z^{\text{sheet}}) = L \cdot (L^2/2 - 1) = (L^3 - 2L)/2$. The same argument applies to $H_X^{\text{sheet}}$.
Code parameters. $k = n - \operatorname{rank}(H_Z) - \operatorname{rank}(H_X) = L^3 - 2 \cdot (L^3 - 2L)/2 = 2L$. The minimum-weight logical operators are the non-contractible cycles of the per-layer 2D toric codes, each of length $L$. Hence $d = L$.
Consequence for the rank formula. Theorem 2 eliminates the need for a per-$L$ verification of the rank: the formula $\operatorname{rank}(H_Z) = \operatorname{rank}(H_X) = (L^3 - 2L)/2$ holds for every even $L \ge 2$.
2.5 Planar Variant
For deployment on planar quantum chips that do not support periodic boundary conditions, each layer becomes a rotated surface code $[[(L-1)^2, 1, L-1]]$ via standard boundary engineering [4]. The resulting planar sheet code has parameters
$$[[L(L-1)^2,\ L,\ L-1]] \quad \text{(planar boundaries)}. \tag{2}$$
The distance drops by one due to the standard rotated-surface-code boundary truncation, and the number of encoded logical qubits halves from $2L$ to $L$.
3 Hardware Embedding: Three Sheets, Three Layers, or One Chip
The three triad sheets are edge-disjoint: $3L^3$ data qubits in total, with $L^3$ per sheet. We now address the question of how to physically realize these qubits on hardware. This question is non-trivial because the sheet code uses $\Theta(L^3)$ data qubits to encode $\Theta(L)$ logical qubits at distance $L$, and a monolithic 2D embedding of an $L^3$-vertex 3D graph cannot maintain unit-length nearest-neighbor couplings as $L$ grows.
We discuss three deployment options, in increasing order of scalability.
3.1 Option A: Monolithic Planar Chip (Small to Moderate $L$)
For $L \le 8$ (up to about 1,500 data qubits total), a monolithic planar processor hosts all three sheets via time-multiplexed syndrome extraction: data qubits occupy fixed positions; per-sheet ancillas are physically distinct but co-located at FCC vertex and oct-void positions; couplers reconfigure between rounds to activate one sheet at a time ($S_{xy}$, $S_{xz}$, $S_{yz}$ in successive rounds).
Wire-length scaling: on a monolithic 2D embedding of the $\Theta(L^3)$-qubit 3D lattice, average nearest-neighbor distance scales as $\Theta(L^{1/2})$. This is straightforward at $L = 4$ (192 qubits), requires active calibration at $L = 8$ (1,536 qubits), and is impractical at $L \ge 12$ without coupler-reach upgrades.
Idle penalty: while one sheet is measured, the other two idle; with round time $t$, the full cycle is $3t$ and each data qubit idles $2t$ per cycle, captured in the Section 7 noise model.
3.2 Option B: Three-Layer Stacked Architecture (Recommended Deployment)
For $L \ge 8$, we recommend three planar $K=4$ processors, one per triad sheet, bonded with through-silicon vias (TSVs) or inter-layer capacitive couplers for triangle-mediated cross-sheet operations.
Each chip is a standard planar $K=4$ device. The layer hosting $S_{xy}$ carries the $L^3$ data qubits on $L$ stacked rotated $L \times L$ lattices (Theorem 2). Triangle ancillas sit between chip layers, coupled via short-range vertical couplers (TSVs) to the three data qubits of their triangle, one from each chip. Inter-layer couplers activate only during surgery operations and remain inactive otherwise. The chip-internal $K=4$ connectivity is unaffected; the inter-layer couplers add $K = 1$ per data qubit per active triangle, preserving the $K=4$ effective constraint during surgery (Section 5). Three-layer stacked QEC architectures appear in recent hardware roadmaps [7, 10]; IBM’s stacked-die approach for bivariate bicycle uses analogous couplers. The sheet code’s three-chip architecture has lower per-chip connectivity ($K=4$ vs. $K=6$) but simpler inter-chip coupling (only triangle ancillas need vertical bonds).
Vertical crosstalk. Only triangle ancillas (a minority, $O(L)$ per surgery primitive vs. $\Theta(L^3)$ data qubits per layer) carry inter-layer couplers; data qubits and per-sheet ancillas have no vertical wiring, so static memory is unaffected by inter-layer phenomena. Couplers activate via control electronics; off-state isolation via detuning is platform-dependent (superconducting TSVs reach 40–60 dB [7]; neutral-atom and ion-trap platforms can achieve higher isolation via mechanical separation). Platform-specific crosstalk budgeting is identified as future work (Section 10.8).
3.3 Option C: Native FCC Hardware
For maximally efficient embedding, a quantum hardware platform with native 3D connectivity (such as 3D-printed superconducting circuits, neutral atom arrays with 3D-addressable laser systems, or trapped ion architectures with multi-segment traps) hosts the full FCC lattice without the embedding penalty of options A or B. The sheet code runs natively on such hardware with all $K=4$ couplings at unit physical distance. This option is forward-looking; no current commercial platform offers it.
3.4 Hardware Footprint
Across all options: $3L^3$ data qubits (one per FCC edge, partitioned by sheet), $3 \times L^3/2$ vertex Z-ancillas, $3 \times L^3/2$ octahedral void X-ancillas, and $L$ transient triangle ancillas per active surgery primitive (reusable). Total: $6L^3$ recurring qubits with a $K=4$ active syndrome-extraction schedule per data qubit, plus $O(L)$ transient ancillas during surgery.
3.5 Comment on the “Cross-Block” Terminology
The three triad sheets occupy the same FCC lattice geometry (time-multiplexed in option A, stacked in option B, co-located in option C), not spatially separated modules in the bivariate bicycle sense. Throughout this paper we use the phrase inter-sheet operations for any operation between logical qubits in distinct sheets implemented by triangle surgery. These decompose into two strictly distinguished categories. (1) Inter-sheet joint Pauli measurements (the ZZ-merge and XX-merge primitives of Sections 7 and 9.3) are fault-tolerant primitives with FSS-crossing threshold estimates. (2) Composite logical Clifford gates (e.g., the three-sheet Horsman CNOT obtained by sequencing two joint Pauli measurements) are constructed in Section 10 as correct logical gates at $p = 0$ but are not yet fault-tolerantly characterized; calling these “logical gates” is correct but does not imply they are FT-thresholded in this work. The sheets are logically distinct CSS code blocks (each independently encoding $2L$ logical qubits with independent stabilizer groups and decoders) but not physically separated. This intermediate regime, between surface code lattice surgery within one substrate and bivariate bicycle inter-block gates across separated modules, is the niche our construction occupies.
4 Triangle Algebra and Cross-Sheet Logicals
4.1 FCC Triangles
Lemma 1 (Triangle structure). Every triangle (3-cycle) in the FCC graph has one edge in each of the three triad sheets. At lattice size $L$, the FCC graph contains $4L^3$ triangles, and each FCC edge participates in exactly 4 triangles.
Proof. For three mutually adjacent FCC vertices $v_a, v_b, v_c$, the three edge-vectors $v_b - v_a$, $v_c - v_a$, $v_c - v_b$ must each be FCC neighbor vectors. Direct case analysis on the 12 neighbor vectors shows that any three NN vectors that close a triangle, i.e. satisfy $(v_b - v_a) + (v_c - v_b) = v_c - v_a$, necessarily lie in distinct sheets. Counting: each FCC vertex is in 24 triangles, there are $L^3/2$ vertices, and each triangle contains 3 vertices, so the total is $24 \cdot (L^3/2)/3 = 4L^3$. Each edge appears in $4L^3 \cdot 3/(3L^3) = 4$ triangles. See Figure 1.
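The counts in Lemma 1 lend themselves to a brute-force check. The sketch below is illustrative and uses assumptions not stated in the lemma: periodic boundaries at $L = 4$, FCC vertices taken as the even-coordinate-sum points of the cube, and an edge's sheet read off from the two axes along which its endpoints differ.

```python
from itertools import product, combinations
from collections import Counter

L = 4  # even lattice size, periodic boundaries (assumed for this check)
NN = [d for d in product((-1, 0, 1), repeat=3)
      if sorted(abs(c) for c in d) == [0, 1, 1]]           # 12 FCC neighbor vectors

def shift(v, d):
    return tuple((a + b) % L for a, b in zip(v, d))

def sheet_of(a, b):
    """Sheet label of edge (a, b): the two axes along which the endpoints differ."""
    return "".join(ax for ax, p, q in zip("xyz", a, b) if p != q)

verts = [v for v in product(range(L), repeat=3) if sum(v) % 2 == 0]
neighbors = {v: {shift(v, d) for d in NN} for v in verts}

# A triangle is a set of three mutually adjacent vertices.
triangles = {frozenset((v, a, b))
             for v in verts
             for a, b in combinations(neighbors[v], 2)
             if b in neighbors[a]}

edge_use = Counter(frozenset(e) for t in triangles
                   for e in combinations(sorted(t), 2))
one_edge_per_sheet = all(
    sorted(sheet_of(a, b) for a, b in combinations(sorted(t), 2)) == ["xy", "xz", "yz"]
    for t in triangles)

print(len(triangles), one_edge_per_sheet, set(edge_use.values()), len(edge_use))
```

The four printed values correspond to the triangle count, the one-edge-per-sheet property, the set of per-edge triangle multiplicities, and the total edge count $3L^3$.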
4.2 Triangle Operators
Definition 2 (Triangle operator). For an FCC triangle $T$ with edges $e_{xy} \in S_{xy}$, $e_{xz} \in S_{xz}$, $e_{yz} \in S_{yz}$, define the Z-triangle operator $\mathcal{Z}_T = Z_{e_{xy}} Z_{e_{xz}} Z_{e_{yz}}$ and similarly $\mathcal{X}_T = X_{e_{xy}} X_{e_{xz}} X_{e_{yz}}$.
A single triangle Z-operator commutes with all per-sheet Z-stabilizers but anticommutes with exactly 6 per-sheet X-stabilizers (two per sheet). Products of triangles can be chosen to commute with all stabilizers.
4.3 Reachable Cross-Sheet Logicals
Theorem 3 (Cross-sheet reachability). Let $\partial \in \mathrm{GF}(2)^{|T| \times n_{\text{edges}}}$ be the triangle-edge incidence matrix and $H_X$ the cross-sheet X-stabilizer matrix. Define the space of valid triangle products
$$\mathcal{V} = \{\, m \cdot \partial \;:\; m \in \mathrm{GF}(2)^{|T|},\ H_X (m \cdot \partial)^T = 0 \,\}.$$
Then $\dim(\mathcal{V} \bmod \text{row span}(H_Z)) = 6L - 3$, and every operator in this space has support on exactly two sheets.
Verification: At $L = 4$: 21 logicals (out of $6L = 24$), all 2-sheet, distributed as 7 per sheet pair. At $L = 6$: 33 logicals (out of 36), all 2-sheet, distributed as 11 per sheet pair. The missing 3 logicals are global homological cycles that no triangle product can form.
Theorem 4 (Per-sheet coverage). For each sheet $S_i$, the projection of triangle-reachable cross-sheet logicals onto the Z-logical space of $S_i$ has dimension $L$ for each partner sheet $S_j$ ($j \ne i$). The union of projections via both partners covers the full $2L$-dimensional Z-logical space of $S_i$.
Verification at $L = 4$: Each sheet pair reaches a 4-dimensional subspace ($= L$) of each sheet’s 8-dimensional ($= 2L$) Z-logical space. The two partner-pair subspaces are distinct; their union is the full 8-dimensional space.
4.4 Operational Consequence
Every Z-logical (and by CSS symmetry, every X-logical) of every sheet can participate
in a triangle-mediated joint measurement with at least one partner sheet. Combined
with fresh ancilla logical qubits and standard surgery protocols [5, 6], these reachability
results suggest that arbitrary logical-pair interactions can in principle be mediated using
ancilla logicals and logical-basis routing; a complete routing-depth construction (and the
associated FT-thresholded characterization, since the composed CNOT itself is not yet
FT-thresholded in this work, Section 10) is left to future work.
5 Fault-Tolerant Surgery Protocol
5.1 Ancilla Placement
Each surgery primitive uses $L$ triangles forming a localized cluster on the FCC lattice. For each triangle $T$, a measurement ancilla is placed at the centroid of $T$’s three vertex positions, coupled to its 3 data qubits via short-range couplers ($K = 3$ at the ancilla). Figure 2 illustrates the canonical $L = 4$ four-triangle primitive.
Flag-qubit protocol at small $L$. A weight-3 measurement with single-fault propagation produces data errors of weight $\le 2$. For correctability, we require $w \le (d + 1)/2$, equivalently $d = L \ge 5$ for weight-3 measurements. For $L \ge 6$, no flag qubits are needed: the per-sheet code distance suffices for fault-tolerant triangle measurements. For $L = 4$, a flag-qubit protocol [9] catches the worst-case weight-2 propagation.
5.2 $K=4$ Connectivity Verification
Two senses of $K=4$. The connectivity claim made by this paper, here and throughout, is about active connectivity per syndrome round: each data qubit participates in at most 4 two-qubit gates per round of syndrome extraction, including during surgery. This is the constraint that matters for hardware compatibility (gate scheduling, crosstalk, parallel CNOT capacity, calibration overhead) and is what makes the architecture compatible with platforms such as Google Willow and IQM Star/Garnet whose native processors are designed around $K=4$ active connectivity. The physical coupler layout of a real chip may include additional couplers (e.g., diagonal ones used only during specific phases or never used at all in this protocol); this is a separate hardware-implementation question, and we do not claim the physical layout must literally have exactly 4 couplers per data qubit. Throughout this paper, $K=4$ should be read as active per-qubit connectivity during any single syndrome round.
Each data qubit therefore participates in at most 4 two-qubit gates per syndrome round, with no exceptions during surgery. During surgery, some gate slots reconfigure from per-sheet ancillas to triangle ancillas. At $L = 4$ with the canonical 4-triangle primitive: the two $xy$-sheet edges shared between pairs of surgery triangles see 2 triangle-coupler gates per round (retaining 2 per-sheet coupler gates each); the other 8 data qubits use 1 triangle-coupler gate with 3 per-sheet coupler gates retained. The $K = 4$ active envelope is preserved throughout the surgery operation.
[Figure 2 plot: 3D axes $x$, $y$, $z$; vertices $v_0$, $v_1$, $v_2$, $v_3$, $v_8$, $v_9$; triangles T0, T4, T24, T28; panel title “Surgery primitive at $L = 4$: 4 triangles spanning two sheets (weight-8 joint $Z_A Z_B$ measurement)”; legend: $S_{xy}$, $S_{xz}$, $S_{yz}$, triangle ancilla.]
Figure 2: Surgery primitive at $L = 4$. Four FCC triangles $T_0, T_4, T_{24}, T_{28}$ share two $xy$-sheet edges, $(v_2, v_8)$ and $(v_3, v_9)$, which cancel in the product. The net operator has weight 8 with support on 4 edges in $S_{xz}$ and 4 edges in $S_{yz}$, implementing $Z_A Z_B$ where $A$ is a logical of sheet $xz$ and $B$ is a logical of sheet $yz$. Red stars indicate triangle ancilla positions at each triangle’s centroid.
5.3 Gate Schedule
The triangle measurements’ CNOT gates schedule via graph coloring on the conflict graph $G_{\text{conflict}}$ (nodes: surgery triangles; edges: shared data qubits). At $L = 4$, the 4-triangle primitive’s conflict graph requires 2 colors; with 3 CNOTs per triangle, the surgery operation completes in $3 \times 2 = 6$ time slots. For comparison, a standard per-sheet syndrome extraction round takes 6–8 time slots.
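The scheduling step can be sketched as a greedy coloring of the conflict graph, together with the per-qubit gate budget of Section 5.2. The example below is hypothetical in its details: the edge labels are invented, and the assumption that T0/T4 share one $xy$-edge while T24/T28 share the other is made only for illustration. Greedy coloring is also a stand-in here, since the text does not specify which coloring algorithm is used.

```python
from itertools import combinations
from collections import Counter

# Hypothetical data-qubit labels for the canonical L = 4 primitive. Which pair of
# triangles shares which xy-edge is an assumption made for illustration.
triangles = {
    "T0":  {"xy_a", "xz_0", "yz_0"},
    "T4":  {"xy_a", "xz_1", "yz_1"},
    "T24": {"xy_b", "xz_2", "yz_2"},
    "T28": {"xy_b", "xz_3", "yz_3"},
}

def conflict_graph(tris):
    """Nodes are triangles; two triangles conflict if they share a data qubit."""
    graph = {t: set() for t in tris}
    for s, t in combinations(tris, 2):
        if tris[s] & tris[t]:
            graph[s].add(t)
            graph[t].add(s)
    return graph

def greedy_coloring(graph):
    """Assign each node the smallest color not used by an already-colored neighbor."""
    color = {}
    for node in sorted(graph, key=lambda t: -len(graph[t])):
        used = {color[nb] for nb in graph[node] if nb in color}
        color[node] = next(c for c in range(len(graph)) if c not in used)
    return color

graph = conflict_graph(triangles)
coloring = greedy_coloring(graph)
n_colors = len(set(coloring.values()))
time_slots = 3 * n_colors                     # 3 CNOTs per triangle, one color at a time

# Gate budget: triangle gates replace per-sheet gates so the total stays at 4.
triangle_gates = Counter(q for edges in triangles.values() for q in edges)
per_sheet_gates = {q: 4 - n for q, n in triangle_gates.items()}

print(coloring, n_colors, time_slots)
print(sorted(triangle_gates.items()))
```

Triangles of the same color act on disjoint data qubits, so their CNOTs can run in the same time slots.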
5.4 Overhead Analysis
At lattice size $L$:
| Quantity | Per syndrome round (per-sheet) | Per surgery operation |
|---|---|---|
| Ancillas (3 sheets) | $3L^3$ | $L$ measurement ancillas |
| Two-qubit gates | $12L^3$ | $3L$ to $5L$ |
| Time slots | 6–8 | 6 |
| Fraction of per-round cost (at $L = 10$) | | $< 0.5\%$ |
Surgery has subleading gate cost: $O(L)$ two-qubit gates for the triangle measurements compared with $O(L^3)$ per syndrome round for the per-sheet stabilizer extraction.
Clock-cycle impact. Triangle measurements during the merge execute in parallel with per-sheet syndrome extraction; in a time-multiplexed schedule, triangle CNOTs occupy the same time slots as per-sheet CNOTs without extending wall-clock time. The merge phase costs $L$ syndrome rounds at the same clock cycle as memory, for total surgery overhead of $3L$ rounds vs. $L$ for memory. On superconducting transmons (100–400 ns cycles [8]), $3L$ rounds at $L = 8$ is 2.4–10 µs additional wall-clock, well below typical $T_1, T_2$ ($\sim 100$ µs); on neutral-atom platforms ($\sim 1$ ms cycles) the $\sim 24$ ms total requires sustained coherence achievable in recent demonstrations. Idle errors during merge are captured by the noise model of Section 7.
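The overhead figures in this subsection follow from simple arithmetic, reproduced in the sketch below. The function names are hypothetical; the gate counts ($3L$ to $5L$ against $12L^3$) and cycle times (100 ns, 400 ns, 1 ms) are the ones quoted in the text.

```python
def surgery_gate_fraction(L):
    """Triangle-measurement gates as a fraction of one full syndrome round (12 L^3 gates)."""
    per_round = 12 * L**3
    return 3 * L / per_round, 5 * L / per_round

def surgery_wall_clock(L, cycle_seconds):
    """Wall-clock time of the 3L-round surgery at a given syndrome-cycle time."""
    return 3 * L * cycle_seconds

low, high = surgery_gate_fraction(10)
print(f"gate fraction at L=10: {low:.2%} to {high:.2%}")

for label, cycle in (("transmon, 100 ns cycle", 100e-9),
                     ("transmon, 400 ns cycle", 400e-9),
                     ("neutral atom, 1 ms cycle", 1e-3)):
    print(f"{label}: {surgery_wall_clock(8, cycle):.3g} s for 3L = 24 rounds")
```

In closed form the gate fraction lies between $1/(4L^2)$ and $5/(12L^2)$, so it falls quadratically with $L$.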
5.5 Decoder Graph
The decoder for surgery operations operates on a combined detector graph: per-sheet syndrome detectors (vertex Z, oct void X) plus triangle measurement detectors. Triangle-triangle correlations arise from shared data qubits: a $Z$ error on a shared edge flips both triangles’ outcomes simultaneously. Standard MWPM [13] applies directly to this graph; the matcher extends the per-sheet syndrome graph with cross-sheet edges induced by triangle measurements [5, 6].
5.6 Boundary Deformation: Broken Stabilizers and Their Reconstruction
A standard concern in lattice surgery is that the merge operation temporarily disrupts
the per-block stabilizer structure: some stabilizers become gauge operators during the
merge and must be reconstructed afterward. We characterize this disruption precisely for
the FCC triangle primitive.
Lemma 2 (Broken X-stabilizers per triangle). Each individual triangle $T$ with edges $(e_{xy}, e_{xz}, e_{yz})$, one per sheet, anticommutes with exactly six per-sheet X-stabilizers: the two octahedral voids in each sheet that contain one of the triangle’s edges. The breakdown is two X-stabilizers in $S_{xy}$, two in $S_{xz}$, and two in $S_{yz}$.
Proof. A triangle Z-operator $\mathcal{Z}_T$ acts on three edges. Each edge $e \in S_i$ is contained in exactly two octahedral voids of $S_i$, since each FCC edge connects two oct-void neighbors. The X-stabilizer of each such oct void contains $e$ as one of its four support edges. Therefore $\mathcal{Z}_T$ overlaps each such X-stabilizer in exactly one edge (odd), and anticommutes with it. The six X-stabilizers (two per sheet) are distinct since they correspond to distinct oct voids.
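The six-stabilizer count can be checked by enumeration over every triangle of a small lattice. The sketch below rests on assumptions that go beyond the lemma: periodic boundaries at $L = 4$, void centers at the odd-coordinate-sum points, and each void's X-stabilizer in a sheet read as the 4 in-plane edges joining its 4 in-plane neighbors. All names are hypothetical.

```python
from itertools import product

L = 4  # even lattice size, periodic boundaries (assumed for this check)
AXES = {"xy": (0, 1), "xz": (0, 2), "yz": (1, 2)}
NN = [d for d in product((-1, 0, 1), repeat=3)
      if sorted(abs(c) for c in d) == [0, 1, 1]]           # 12 FCC neighbor vectors

def shift(v, d):
    return tuple((a + b) % L for a, b in zip(v, d))

def unit(axis, sign):
    return tuple(sign if i == axis else 0 for i in range(3))

def void_support(c, sheet):
    """The 4 edges of `sheet` joining the in-plane neighbors of void center c."""
    a, b = AXES[sheet]
    w, e = shift(c, unit(a, -1)), shift(c, unit(a, 1))
    s, n = shift(c, unit(b, -1)), shift(c, unit(b, 1))
    return {frozenset(p) for p in ((w, n), (n, e), (e, s), (s, w))}

points = list(product(range(L), repeat=3))
verts = [p for p in points if sum(p) % 2 == 0]
voids = [p for p in points if sum(p) % 2 == 1]
x_stabs = {(sheet, c): void_support(c, sheet) for sheet in AXES for c in voids}

def is_triangle(d1, d2):
    """v, v+d1, v+d2 are mutually adjacent iff d1 - d2 is itself a neighbor vector."""
    return tuple(p - q for p, q in zip(d1, d2)) in NN

def broken_x_stabilizers(v, d1, d2):
    """X-stabilizers with odd overlap with the Z-triangle on {v, v+d1, v+d2}."""
    a, b = shift(v, d1), shift(v, d2)
    tri_edges = {frozenset((v, a)), frozenset((v, b)), frozenset((a, b))}
    return [key for key, supp in x_stabs.items() if len(supp & tri_edges) % 2 == 1]

results = [broken_x_stabilizers(v, d1, d2)
           for v in verts for d1 in NN for d2 in NN if is_triangle(d1, d2)]
print(len(results), {len(r) for r in results})
```

Each entry of `results` lists the (sheet, void) pairs whose X-stabilizer anticommutes with one triangle; every triangle is visited several times because ordered pairs of neighbor vectors are enumerated.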
Gauge structure during surgery. During the multi-round merge, the $L$ triangles of the surgery primitive collectively anticommute with $3L$ distinct per-sheet X-stabilizers (each broken by exactly two triangles, accounting for the $6L$ total triangle-stabilizer anticommutations of Lemma 2). These $3L$ X-stabilizer measurements become gauge bits during the merge: their outcomes are correlated with the triangle outcomes but do not constrain the merged code’s logical subspace. Verified numerically:
| $L$ | Triangles in primitive | Distinct X-stabs broken | Per-triangle broken | Net broken |
|---|---|---|---|---|
| 4 | 4 | 12 | 6 | 0 |
| 6 | 6 | 18 | 6 | 0 |
The “net broken” column counts X-stabilizers with odd total flip count across the $L$ triangles. This is zero by construction: the triangle product $\prod_T \mathcal{Z}_T = Z_A Z_B$ commutes with all X-stabilizers (Theorem 3), so each X-stabilizer is broken by an even number of triangles in the primitive.
Post-merge reconstruction. After the $d$-round merge phase, the $3L$ initially-broken X-stabilizer outcomes are reconstructed from the $L$ triangle measurement outcomes plus surviving stabilizer constraints: each broken X-stabilizer’s eigenvalue is the modulo-2 sum of (i) its pre-merge eigenvalue, (ii) the triangle measurements whose Z-operators overlap it in odd parity, and (iii) propagated Pauli corrections from the surgery protocol. This is the FCC-triangle analog of the standard rough/smooth boundary deformation in surface code lattice surgery [5]: per-sheet boundary stabilizers are temporarily opened as gauges and closed upon surgery completion. No per-sheet stabilizer is permanently modified; the affected weight-4 stabilizers are still physically measured throughout the merge, but their round-to-round detector constraints are gauge-dependent during the merge window and are therefore omitted from the DEM during that window (see Section 7.1); they are restored post-merge using the triangle-outcome records described above.
Single-fault error propagation across sheets. A single fault on a triangle ancilla mid-circuit can propagate to at most two data qubits in different sheets: for the canonical CNOT schedule $e_{xy} \to e_{xz} \to e_{yz}$, a $Z$ error on the ancilla after the $S_{xy}$ CNOT and before the $S_{xz}$ CNOT propagates through the two remaining CNOTs to one data qubit each in $S_{xz}$ and $S_{yz}$. This weight-2 cross-sheet error triggers detectors in two distinct per-sheet syndrome graphs; the decoder graph must include edges spanning these per-sheet graphs (Section 7).
6 Distance Preservation
Theorem 5 (Merged code distance). For any even $L \ge 2$ and any cross-sheet measurement operator $\mathrm{Op} = Z_A Z_B$ implemented by a weight-$2L$ triangle product, with $Z_A$ supported in sheet $S_i$ and $Z_B$ in sheet $S_j$ ($i \ne j$), the merged code formed by adding $\mathrm{Op}$ as a Z-stabilizer has stabilizer rank increased by exactly 1 (consuming one logical qubit), and the minimum-weight logical operator of the merged code has weight $L$. Distance is preserved.
Proof. Stabilizer rank. $\mathrm{Op}$ commutes with all original X-stabilizers by Theorem 3 (it is the product of triangle Z-operators chosen to be in the joint kernel of $H_X$). Furthermore, $\mathrm{Op}$ is not in the row span of $H_Z$, since it is a non-trivial element of the Z-logical group. Therefore appending $\mathrm{Op}$ to $H_Z$ increases the rank by 1, and the merged code has $k_{\mathrm{merged}} = k_{\mathrm{pre}} - 1$ logical qubits.
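The rank argument can be checked mechanically on any CSS code. The sketch below uses the small $[[4, 2, 2]]$ code as a stand-in (an assumption made for brevity; it is not the sheet code) and counts logical qubits as $k = n - \mathrm{rank}(H_X) - \mathrm{rank}(H_Z)$ over GF(2). The helper name `gf2_rank` is illustrative.

```python
def gf2_rank(rows):
    """Rank over GF(2); each row is an int whose bits are the matrix entries."""
    pivots = {}  # leading-bit position -> reduced row with that leading bit
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = row
                break
            row ^= pivots[lead]  # clear the leading bit and keep reducing
    return len(pivots)


# Toy stand-in: the [[4, 2, 2]] code with stabilizers XXXX and ZZZZ.
n = 4
h_x = [0b1111]
h_z = [0b1111]
op = 0b0011  # Z on qubits 0 and 1, a Z-logical of the toy code

# Op commutes with every X-stabilizer: even overlap with each row of h_x.
op_commutes = all(bin(row & op).count("1") % 2 == 0 for row in h_x)

k_pre = n - gf2_rank(h_x) - gf2_rank(h_z)
k_merged = n - gf2_rank(h_x) - gf2_rank(h_z + [op])
```

Appending a row that already lies in the row span of `h_z` would leave the rank, and therefore $k$, unchanged, which is why the proof needs $\mathrm{Op}$ to be a non-trivial logical.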
Minimum logical weight. The merged Z-logical group is the original Z-logical group quotiented by the subgroup $\langle \mathrm{Op} \rangle$. Each non-trivial equivalence class has the form $\{g,\ g\,\mathrm{Op}\}$ for a representative $g$ in the original Z-logical group, with $g$ not equivalent to $\mathrm{Op}$ modulo the original stabilizers.

Decompose any $g$ as $g = g_{xy}\, g_{xz}\, g_{yz}$, where $g_s$ denotes the restriction of $g$ to sheet $S_s$. Similarly, $\mathrm{Op}$ decomposes as $\mathrm{Op} = (\mathrm{Op})_i\, (\mathrm{Op})_j$ with $(\mathrm{Op})_i = Z_A$ of weight $L$ in $S_i$ and $(\mathrm{Op})_j = Z_B$ of weight $L$ in $S_j$. Then

$$\mathrm{wt}(g) = \mathrm{wt}(g_{xy}) + \mathrm{wt}(g_{xz}) + \mathrm{wt}(g_{yz}), \tag{3}$$

$$\mathrm{wt}(g\,\mathrm{Op}) = \mathrm{wt}(g_k) + \mathrm{wt}(g_i Z_A) + \mathrm{wt}(g_j Z_B), \tag{4}$$

where $k$ is the third sheet ($k \notin \{i, j\}$).

By the Layer Decomposition Theorem (Theorem 2), each non-trivial logical $g_s$ on sheet $S_s$ has weight $\ge L$ (per-layer 2D toric code distance).
Since $g$ is non-trivial in the merged code, at least one of the following holds:

(a) $g_k \not\equiv 0 \pmod{\mathrm{stab}\ S_k}$, i.e., $g_k$ is a non-trivial logical of $S_k$. Then $\mathrm{wt}(g_k) \ge L$. Both $\mathrm{wt}(g)$ and $\mathrm{wt}(g\,\mathrm{Op})$ contain $\mathrm{wt}(g_k) \ge L$ as a summand, so the class minimum is $\ge L$.

(b) $g_k$ is trivial in $S_k$ (i.e., $g_k = 0$ or a stabilizer), but $g_i$ is a non-trivial logical of $S_i$. Then $\mathrm{wt}(g_i) \ge L$, so $\mathrm{wt}(g) \ge L$. For $\mathrm{wt}(g\,\mathrm{Op})$, the contribution $\mathrm{wt}(g_j Z_B)$ is either $\ge L$ (if $g_j$ and $Z_B$ are in distinct logical classes, or if $g_j$ is a stabilizer leaving the $Z_B$ contribution of weight $L$) or zero (if $g_j \equiv Z_B \pmod{\mathrm{stab}\ S_j}$). In the zero case, $g\,\mathrm{Op}$ has contributions only from $g_i Z_A$ (in $S_i$, weight $\ge L$) and $g_k$ (in $S_k$, possibly weight 0). Therefore $\mathrm{wt}(g\,\mathrm{Op}) \ge L$.

(c) By symmetry with (b), interchanging $i$ and $j$.

(d) Multiple sheets contribute non-trivial logicals. Then both $\mathrm{wt}(g)$ and $\mathrm{wt}(g\,\mathrm{Op})$ inherit contributions from at least two non-trivial per-sheet logicals, each $\ge L$, so the class minimum is $\ge L$.
Achievability of $L$. The class containing $Z_A$ has representatives $\{Z_A,\ Z_A\,\mathrm{Op}\}$. Since $Z_A\,\mathrm{Op} = Z_A Z_A Z_B = Z_B$ modulo per-sheet stabilizers, this class equals $\{Z_A, Z_B\}$ with representatives of weight $L$ each. Therefore the minimum is achieved.
Computational verification. The proof above is independent of $L$. To rule out edge cases, we additionally verified Theorem 5 by exhaustive enumeration at small $L$. At $L = 4$, all $6L = 24$ per-sheet logical generators give class minimum exactly 4, and the same minimum holds for all $\binom{24}{2} = 276$ pair products. At $L = 6$, all 36 generators and $\binom{36}{2} = 630$ pair products give class minimum exactly 6. No combination produced a class minimum below $L$.
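The pair-product counts quoted above are binomial coefficients of the generator count $6L$, which a short check confirms. The variable names are illustrative.

```python
from math import comb

# Number of unordered generator pairs among the 6L per-sheet logical generators.
pair_counts = {L: comb(6 * L, 2) for L in (4, 6)}
generator_counts = {L: 6 * L for L in (4, 6)}
```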
6.1 X-Distance Preservation
Theorem 5 addresses the Z-logical group of the merged code, which is the side directly modified by the cross-sheet Z-measurement $\mathrm{Op} = Z_A Z_B$. The X-side requires a separate argument: the merged X-logical group is the subgroup of the original X-logicals that commute with $\mathrm{Op}$, modulo the original X-stabilizers (which are unchanged by the surgery).
Theorem 6 (X-distance preservation). Under the same conditions as Theorem 5, the merged code's X-distance equals $L$ (the per-sheet code distance). The merged code's X-logical group consists of all original X-logicals that have even overlap with the support of $\mathrm{Op}$, modulo the (unchanged) X-stabilizer group.
Proof. Setup. The original code has X-stabilizer matrix $H_X$, Z-logical group $\mathcal{L}_Z$, and X-logical group $\mathcal{L}_X$. After surgery, the merged code has stabilizer matrices $H_X^{\mathrm{merged}} = H_X$ (unchanged) and $H_Z^{\mathrm{merged}} = H_Z \cup \{\mathrm{Op}\}$. The merged X-logical group is the normalizer of the merged stabilizer group restricted to X-type operators, modulo the merged X-stabilizers:

$$\mathcal{L}_X^{\mathrm{merged}} = \{\, g \in \mathcal{L}_X : [g, \mathrm{Op}] = 0 \,\} \,/\, \langle H_X \rangle.$$

The commutativity condition $[g, \mathrm{Op}] = 0$ for $g$ an X-operator and $\mathrm{Op}$ a Z-operator reduces to: $g$ has even-parity overlap with the support of $\mathrm{Op}$.
Counting. The original X-logical group has $6L$ generators ($2L$ per sheet for the toric code; $L$ per sheet for the planar variant). $\mathrm{Op} = Z_A Z_B$, where $A$ is a Z-logical of sheet $S_i$ and $B$ is a Z-logical of sheet $S_j$. The X-logical $X_A$ (the conjugate of $A$ in $S_i$) anticommutes with $Z_A$ (per-sheet anticommutation) and commutes with $Z_B$ (disjoint sheet supports), so $X_A$ anticommutes with $\mathrm{Op}$. Symmetrically, $X_B$ anticommutes with $\mathrm{Op}$. All other generators of $\mathcal{L}_X$ commute with $\mathrm{Op}$: per-sheet $X$-logicals in sheets $S_k$, $k \ne i, j$, have disjoint support from $\mathrm{Op}$ and commute trivially; per-sheet $X$-logicals in $S_i$ or $S_j$ other than $X_A$ or $X_B$ are independent of $X_A$ (resp. $X_B$), and the per-sheet anticommutation structure ensures they commute with the relevant component of $\mathrm{Op}$.
The merged X-logical group has $6L - 2$ commuting generators from the original $6L$, but the product $X_A \cdot X_B$ (sum of two anticommuting generators) commutes with $\mathrm{Op}$ (it anticommutes with each component, so the total overlap is even). This product is a non-trivial X-logical of the merged code, representing the consumed Z-logical's X-conjugate. Thus the merged X-logical group has $6L - 1$ generators, consistent with $k_{\mathrm{merged}} = 6L - 1$ (one Z-logical consumed by adding $\mathrm{Op}$).
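The even-overlap criterion and the $X_A \cdot X_B$ observation can be illustrated with plain set arithmetic. The qubit labels and supports below are made up for a toy case with $L = 4$; they are assumptions chosen so that each X-logical meets its conjugate Z-logical on exactly one qubit, not supports taken from the FCC lattice.

```python
def commutes(x_support, z_support):
    """An X-type and a Z-type Pauli commute iff their supports overlap evenly."""
    return len(x_support & z_support) % 2 == 0


# Hypothetical supports, labelled (sheet, index).
z_a = {("i", q) for q in range(4)}                 # weight-L Z-logical in sheet i
z_b = {("j", q) for q in range(4)}                 # weight-L Z-logical in sheet j
x_a = {("i", 0), ("i", 10), ("i", 11), ("i", 12)}  # meets z_a on one qubit
x_b = {("j", 0), ("j", 10), ("j", 11), ("j", 12)}  # meets z_b on one qubit

op = z_a | z_b                 # Op = Z_A Z_B has disjoint per-sheet parts
x_product = x_a ^ x_b          # support of the product X_A X_B

single_generators_commute = (commutes(x_a, op), commutes(x_b, op))
product_commutes = commutes(x_product, op)
```

Each of `x_a` and `x_b` has odd overlap with `op`, while their product has overlap two, matching the counting argument.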
Minimum weight. The merged X-logical generators are of two types:

(a) Original per-sheet X-logicals that commute with $\mathrm{Op}$: weight $L$ each (per-sheet 2D toric distance, Theorem 2).

(b) The new generator $X_A \cdot X_B$: weight $2L$ (disjoint supports across sheets $S_i$ and $S_j$).

The minimum weight over all generators is $L$. Linear combinations of generators yield at least weight $L$ by the same layer-decomposition argument as Theorem 5: each non-trivial component on a single sheet contributes weight $\ge L$. Therefore the merged X-distance equals $L$.
Computational verification. At $L = 4, 6$, all $6L$ X-logical generators have weight $L$; exactly 2 anticommute with $\mathrm{Op}$ (confirming the $6L - 2$ count of single-generator commuters), and the minimum weight among merged X-logical generators is exactly $L$.

Combined with Theorem 5, the merged code distance is $L$ for both Z- and X-side errors.
7 Threshold Simulation
We characterize the surgery operation using a custom Stim circuit [12] with explicit triangle measurements, decoded with MWPM via PyMatching [13], and report a finite-size-scaling (FSS) crossing threshold estimate from pairwise crossings at $L \in \{4, 6, 8\}$. The static memory FSS-crossing estimate is reported as a complementary baseline.
7.1 Custom Stim Circuit Construction
We construct a Stim circuit implementing the full surgery operation: two triad sheets ($S_{xz}$, $S_{yz}$ for the $L = 4$ primitive) with per-sheet vertex Z- and oct-void X-stabilizer measurements, $L$ triangle ancilla measurements during the merge phase, and auxiliary $S_{xy}$ data qubits used by the triangle measurements. The circuit runs three phases of $d$ syndrome rounds each (pre-merge, merge, post-merge); the $3L$ X-stabilizers broken during merge (Lemma 2) have detectors skipped per the gauge structure of Section 5.6. Key statistics:

| $L$ | Total qubits | Triangles | Broken X-stabs | DEM error mechanisms |
|---|---|---|---|---|
| 4 | 324 | 4 | 12 (8 in $S_{xz} + S_{yz}$, 4 in $S_{xy}$) | 23,946 |
| 6 | 750 | 6 | 18 (12 in $S_{xz} + S_{yz}$, 6 in $S_{xy}$) | 70,434 |
| 8 | 2,568 | 8 | 24 (16 in $S_{xz} + S_{yz}$, 8 in $S_{xy}$) | 539,215 |
The DEM decomposes via Stim's decompose_errors=True (hyperedges into graph-like edges where possible) at all three sizes; the resulting graphs are compatible with PyMatching's MWPM decoder.
7.2 Surgery Operation Threshold
Terminology. The numerical threshold estimates throughout this section and the rest of the paper are finite-size crossing estimates extracted from pairwise crossings of logical error rates at $L \in \{4, 6, 8\}$ with MWPM decoding. We use the abbreviation "FSS-crossing estimate" or simply "threshold estimate" for compactness, but the reader should not read these as asymptotic ($L \to \infty$) thresholds; they are tight empirical crossings at three accessible code distances, with quoted uncertainty drawn from the three pairwise crossings and a formal FSS data-collapse fit where available (Section 9.2 for the planar variant; the toric FSS fit gives $1.134\% \pm 0.033\%$, $\nu = 1.50 \pm 0.18$, statistically consistent with the pairwise-crossing estimate). Extending to $L = 10$ or $L = 12$ would tighten the band but is computationally expensive (Section 10.8).
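We perform Z-basis logical memory experiments with the joint observable $Z_A Z_B$ (the cross-sheet logical that the surgery primitive measures), running $3d$ syndrome rounds (pre-merge $d$, merge $d$, post-merge $d$) followed by destructive Z-measurement. Three code distances were measured: $L = 4, 6, 8$.

| $p$ (%) | $L = 4$ | $L = 6$ | $L = 8$ |
|---|---|---|---|
| 0.10 | $5.0 \times 10^{-3}$ | $< 2 \times 10^{-3}$ (0/500) | $< 10^{-3}$ (0/1000) |
| 0.20 | $1.7 \times 10^{-2}$ | $1.5 \times 10^{-3}$ | $< 10^{-3}$ (0/1000) |
| 0.30 | $3.9 \times 10^{-2}$ | $8.0 \times 10^{-3}$ | $< 10^{-3}$ (0/1000) |
| 0.40 | | | $5.5 \times 10^{-3}$ |
| 0.50 | $9.7 \times 10^{-2}$ | $3.8 \times 10^{-2}$ | $2.5 \times 10^{-2}$ |
| 0.60 | | | $4.3 \times 10^{-2}$ |
| 0.70 | $1.8 \times 10^{-1}$ | $1.1 \times 10^{-1}$ | $8.7 \times 10^{-2}$ |
| 0.80 | $2.3 \times 10^{-1}$ | $1.6 \times 10^{-1}$ | $1.2 \times 10^{-1}$ |
| 0.90 | $2.8 \times 10^{-1}$ | $2.2 \times 10^{-1}$ | $2.0 \times 10^{-1}$ |
| 1.00 | $3.08 \times 10^{-1}$ | $2.77 \times 10^{-1}$ | $2.71 \times 10^{-1}$ |
| 1.10 | $3.3 \times 10^{-1}$ | $3.2 \times 10^{-1}$ | $3.4 \times 10^{-1}$ |
| 1.20 | $3.6 \times 10^{-1}$ | $4.0 \times 10^{-1}$ | $4.2 \times 10^{-1}$ |
| 1.50 | $4.4 \times 10^{-1}$ | $4.6 \times 10^{-1}$ | $4.9 \times 10^{-1}$ |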
Pairwise crossings and threshold estimate. The pairwise crossings of $L = 4$ vs $L = 6$, $L = 4$ vs $L = 8$, and $L = 6$ vs $L = 8$ give three independent estimates:

| Crossing | $p_{\mathrm{cross}}$ |
|---|---|
| $L = 4$ vs $L = 6$ | 1.109% |
| $L = 4$ vs $L = 8$ | 1.069% |
| $L = 6$ vs $L = 8$ | 1.024% |
The three crossings span $1.02\%$ to $1.11\%$. We report a finite-size threshold estimate of $p_{\mathrm{th}}^{\mathrm{surgery}} = 1.07\% \pm 0.05\%$, where the band reflects the spread of pairwise crossings. Below threshold, the $d$-scaling is monotonic with $L$: at $p = 0.5\%$, the logical error rate is $9.7\%$ at $L = 4$, $3.8\%$ at $L = 6$, and $2.5\%$ at $L = 8$, decreasing as expected for a fault-tolerant operation in the below-threshold regime.
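As a rough cross-check of the crossing values, one can interpolate the tabulated logical error rates linearly between the two grid points that bracket each crossing. The interpolation scheme is an assumption of this sketch (the text does not state how the crossings were extracted), so the results land near, not exactly on, the quoted $1.109\%$, $1.069\%$, and $1.024\%$. The last lines recompute the center and half-spread of the quoted crossings.

```python
def linear_crossing(p_lo, p_hi, small_lo, small_hi, large_lo, large_hi):
    """Crossing point of two curves, each interpolated linearly on [p_lo, p_hi].

    small_* are the smaller-L rates at p_lo and p_hi, large_* the larger-L rates.
    """
    d_lo = large_lo - small_lo
    d_hi = large_hi - small_hi
    if d_lo * d_hi > 0 or d_lo == d_hi:
        raise ValueError("curves do not cross inside this bracket")
    return p_lo + (p_hi - p_lo) * (-d_lo) / (d_hi - d_lo)


# Rates copied from the surgery table (p in percent).
c46 = linear_crossing(1.10, 1.20, 0.33, 0.36, 0.32, 0.40)    # L=4 vs L=6
c48 = linear_crossing(1.00, 1.10, 0.308, 0.33, 0.271, 0.34)  # L=4 vs L=8
c68 = linear_crossing(1.00, 1.10, 0.277, 0.32, 0.271, 0.34)  # L=6 vs L=8

# Center and half-spread of the three crossings quoted in the text.
quoted = [1.109, 1.069, 1.024]
center = sum(quoted) / len(quoted)
half_spread = (max(quoted) - min(quoted)) / 2
```

The center rounds to $1.07\%$ and the half-spread is about $0.04\%$, consistent with the quoted $\pm 0.05\%$ band.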
Formal FSS fit. As a second, parameterized threshold estimate, we fit the standard finite-size scaling ansatz $p_L(p) = A + B \cdot x + C \cdot x^2$ with $x = (p - p_{\mathrm{th}}) \cdot L^{1/\nu}$ to the near-threshold data ($p \in [0.7\%, 1.5\%]$, 21 data points across $L = 4, 6, 8$) by least squares with weighted residuals (shot-noise statistical errors). The fit yields

$$p_{\mathrm{th}}^{\mathrm{toric}} = 1.134\% \pm 0.033\%, \quad \nu = 1.50 \pm 0.18, \tag{5}$$

with $\chi^2/\mathrm{DOF} = 2.75$. This is consistent with the pairwise estimate within combined uncertainty (fit centered $\approx 0.07\%$ higher). The exponent $\nu \approx 1.5$ is in the percolation universality range expected for surface-code-like threshold transitions ($\nu_{\text{2D perc}} = 4/3$ to $3/2$ [15]). The $\chi^2/\mathrm{DOF} > 1$ indicates systematic deviations from the simple quadratic ansatz typical of surface-code FSS at moderate $L$. The pairwise-crossing estimate is the more conservative (model-free) report; both are stated for completeness.
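A minimal sketch of the ansatz shows why it encodes a crossing: at $p = p_{\mathrm{th}}$ the scaling variable $x$ vanishes for every $L$, so all curves pass through the same value $A$. The values of $p_{\mathrm{th}}$ and $\nu$ are the fitted ones from Eq. (5); the coefficients $A$, $B$, $C$ are not reported in the text and are placeholders chosen only for illustration.

```python
def fss_logical_rate(p, L, p_th, nu, A, B, C):
    """Quadratic finite-size-scaling ansatz in x = (p - p_th) * L**(1/nu)."""
    x = (p - p_th) * L ** (1.0 / nu)
    return A + B * x + C * x * x


P_TH, NU = 1.134, 1.50       # fitted values from Eq. (5), p in percent
A, B, C = 0.33, 0.05, 0.0    # hypothetical coefficients (not from the paper)

at_threshold = [fss_logical_rate(P_TH, L, P_TH, NU, A, B, C) for L in (4, 6, 8)]
below = [fss_logical_rate(0.9, L, P_TH, NU, A, B, C) for L in (4, 6, 8)]
above = [fss_logical_rate(1.3, L, P_TH, NU, A, B, C) for L in (4, 6, 8)]
```

With $B > 0$ the ordering in $L$ flips across $p_{\mathrm{th}}$: larger $L$ gives a lower rate below threshold and a higher rate above it.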
Shot counts and methodology. Each row uses at least 1,000 shots; near-threshold rows ($p = 0.9\%$–$1.2\%$) use up to 8,000 shots for tighter confidence intervals. Standard error on each rate is the binomial $\sqrt{p(1-p)/N}$ and is plotted as error bars in Figure 3. This is a finite-size threshold estimate from a custom Stim circuit with explicit triangle measurements, not a proxy: the DEM is constructed via Stim's decompose_errors=True and decoded by PyMatching's MWPM. Each circuit is reproducibly built from the cached surgery primitive at the corresponding $L$. Below threshold, $d$-scaling is strong: at $p = 0.1\%$, $L = 4$ surgery fails at $5 \times 10^{-3}$ while $L = 6, 8$ have zero observed errors in 500–1000 shots (rate $< 10^{-3}$).
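The binomial standard error quoted above is straightforward to evaluate. The sketch below applies it to the $L = 4$ rate at $p = 1.00\%$; the shot count of 8,000 for that row is an assumption (the text only says near-threshold rows use up to 8,000 shots).

```python
from math import sqrt


def binomial_standard_error(rate, shots):
    """Standard error sqrt(rate * (1 - rate) / shots) of an observed failure rate."""
    return sqrt(rate * (1.0 - rate) / shots)


# L = 4 at p = 1.00%: rate 0.308, with an assumed 8,000 shots.
se_near_threshold = binomial_standard_error(0.308, 8000)
# The same rate at the 1,000-shot minimum, for comparison.
se_min_shots = binomial_standard_error(0.308, 1000)
```

Going from 1,000 to 8,000 shots shrinks the error bar by a factor of $\sqrt{8} \approx 2.8$.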
7.3 Comparison with Proxy Estimate
A natural proxy is a single rotated surface code at distance $L$ run for $3d$ rounds (the same total syndrome budget), giving $p_{\mathrm{th}}^{\mathrm{proxy}} \approx 0.5\%$. The custom Stim circuit with finite-size scaling gives $1.07\% \pm 0.05\%$, substantially higher. The proxy is conservative because it (a) assumes the joint $Z_A Z_B$ observable suffers as much logical-error rate as $3d$ rounds of single-cycle memory, ignoring the $2L$-qubit cross-sheet support's robustness; (b) treats a single 2D code as a stand-in for the sheet code, ignoring the layer decomposition (Theorem 2) where per-layer X-errors do not propagate; and (c) discards the triangle measurements' role as auxiliary syndrome bits during merge.
[Figure 3 plot. Title: Surgery threshold: custom Stim circuit with explicit triangle measurements; $L = 4, 6, 8$ finite-size scaling; $3 d_{\mathrm{each}}$ rounds total; MWPM decoding. x-axis: physical error rate $p$ (%). y-axis: logical error rate (per surgery operation), log scale. Curves: L=4 (d_each=4), L=6 (d_each=6), L=8 (d_each=8). Annotation: pairwise crossings, $p_{\mathrm{th}} = 1.07\% \pm 0.05\%$.]

Figure 3: Surgery threshold from the custom Stim circuit with explicit triangle measurements at $L = 4, 6, 8$. Each data point uses the full surgery protocol ($d_{\mathrm{each}} = L$ rounds of pre-merge, merge with triangle measurements, and post-merge), MWPM decoded on the decomposed DEM. The three pairwise crossings $L = 4$ vs $L = 6$, $L = 4$ vs $L = 8$, $L = 6$ vs $L = 8$ occur at $p = 1.109\%, 1.069\%, 1.024\%$ (gray band), giving $p_{\mathrm{th}}^{\mathrm{surgery}} = 1.07\% \pm 0.05\%$.
7.4 Static Memory Threshold (Single-Sheet Baseline)
For comparison with the surgery threshold, we also measure the single-sheet static memory threshold using a custom circuit (no triangle measurements). Z-basis memory experiment with $d$ syndrome rounds on the sheet $S_{xy}$ at $L = 4$ and $L = 6$:

| $p$ (%) | $L = 4$, $d = 4$ | $L = 6$, $d = 6$ |
|---|---|---|
| 0.10 | $8.0 \times 10^{-3}$ | $6.0 \times 10^{-4}$ |
| 0.30 | $2.6 \times 10^{-2}$ | $4.4 \times 10^{-3}$ |
| 0.50 | $5.3 \times 10^{-2}$ | $1.9 \times 10^{-2}$ |
| 0.80 | $9.0 \times 10^{-2}$ | $5.4 \times 10^{-2}$ |
| 1.20 | $1.6 \times 10^{-1}$ | $1.5 \times 10^{-1}$ |

The single-sheet static memory FSS-crossing estimate from this custom circuit at $L \in \{4, 6\}$ is $p_{\mathrm{th}}^{\mathrm{static}} \approx 1.0\%$, statistically consistent with the surgery FSS-crossing estimate $1.07\% \pm 0.05\%$ and with the surface-code-literature threshold ($\approx 1\%$ at MWPM under circuit-level depolarizing noise).
7.5 Decoder: What's Verified and What Remains Open
The custom Stim circuit of Section 7.1 produces a decomposable detector error model (23,946 error mechanisms at $L = 4$ surgery).

Verified: (a) the DEM can be decomposed into an MWPM-compatible graphlike model via decompose_errors=True (a 2-edge decomposition of the cross-sheet hyperedges; this addresses the principal Section 5.6 concern that surgery introduces hyperedges, but the decomposition itself can be lossy, and optimal hypergraph or BP+OSD decoding remains open); (b) MWPM via PyMatching gives the threshold estimate $1.07\% \pm 0.05\%$ at $L = 4, 6, 8$; (c) the broken-X-stabilizer detector handling produces a deterministic detector pattern combined with triangle outcomes.

Open: (i) whether BP+OSD [14] extracts a higher FSS-crossing estimate from the same DEM (the hyperedge decomposition is lossy); (ii) larger-$L$ behavior ($L \ge 10$, computationally expensive); (iii) full three-sheet rather than two-sheet circuit (the $S_{xy}$ sheet supplies only auxiliary triangle data qubits in our circuit; we expect the estimate unchanged); (iv) decoders specific to the FCC sheet code's octahedral symmetry. The custom Stim circuit confirms fault-tolerant operation under MWPM at FSS-crossing estimates in the same hardware-compatible regime as the surface code.
8 Comparison with State of the Art
8.1 Practical Position
What the sheet code is not. A constant-rate qLDPC code. Its rate $k/n = 2/L^2$ vanishes as $L \to \infty$, the same asymptotic scaling as the surface code. Bivariate bicycle codes and recent "good" qLDPC codes achieve constant or growing rates at higher connectivity ($K = 6$ or higher with long-range couplers). The sheet code does not solve the asymptotic rate problem.
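The vanishing rate follows directly from the toric sheet-code parameters $[[L^3, 2L, L]]$ stated in the abstract. A short sketch with exact fractions makes the scaling explicit; the function name is illustrative.

```python
from fractions import Fraction


def toric_sheet_params(L, sheets=1):
    """(n, k, d, rate) for `sheets` toric sheets, each with parameters [[L^3, 2L, L]]."""
    n = sheets * L ** 3
    k = sheets * 2 * L
    return n, k, L, Fraction(k, n)


one_sheet = toric_sheet_params(8)
three_sheets = toric_sheet_params(8, sheets=3)
rates = {L: toric_sheet_params(L)[3] for L in (4, 8, 16)}
```

Stacking three sheets triples both $n$ and $k$, so the rate stays at $2/L^2$; doubling $L$ cuts it by a factor of four.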
What the sheet code does solve. Native cross-block fault-tolerant joint Pauli measurements at $K = 4$ active connectivity, with FSS-crossing threshold estimates comparable to single-block surface-code memory. Surface code lattice surgery provides these primitives only within a single substrate; bivariate bicycle codes target inter-block operations at $K = 6$ with active research on the inter-block protocols. The sheet code offers a third option: inter-sheet joint Pauli measurements between logically distinct but co-located CSS code blocks, native at $K = 4$ via local triangle measurements. The trade is explicit: surface-code-level rate in exchange for surface-code-level connectivity with native inter-sheet joint Pauli measurements. Composing these primitives into full FT logical Clifford gates (e.g., a fault-tolerantly thresholded CNOT) requires the additional protocol refinements identified in Section 10 and is not established by this work.
8.2 Quantitative Comparison Table
Figure 4 visualizes the code-family landscape, and Table 1 gives the underlying parameters.

[Figure 4 bar chart. Title: Logical density and inter-block primitive availability across CSS code families at $d \approx 8$. y-axis: logical qubits $k$ at code distance 8. Bars and annotations: Rotated surface ($K = 4$, $n = 64$, lattice surgery within patch); 2D toric ($K = 4$ (wraps), $n = 128$, lattice surgery); Gross BB, $d = 12$ ($K = 6$, $n = 144$, active development); Sheet, planar, 1 sheet ($K = 4$, $n = 392$, FT joint-Pauli via triangle surgery); Sheet, toric, 1 sheet ($K = 4$ (wraps), $n = 512$, FT joint-Pauli via triangle surgery); Sheet, toric, 3 sheets, MUX ($K = 4$ (wraps), $n = 1536$, FT joint-Pauli via triangle surgery). Bottom annotations: inter-block primitive (operation between distinct logical code blocks).]

Figure 4: Logical qubit count at code distance $\approx 8$ across CSS code families, with inter-block primitive availability annotated as the bottom label of each bar. "Inter-block primitive" denotes the operation available between distinct logical code blocks: lattice surgery (within patch) for surface code (within a single patch's lattice surgery zone); FT joint-Pauli measurements via triangle surgery for the sheet code (FSS-crossing threshold estimate $\approx 1\%$); active development for bivariate bicycle codes. The sheet-code three-sheet deployment (rightmost) gives $6L = 48$ logicals at $d = 8$ on $3L^3 = 1536$ data qubits; the single-sheet variants give $L$ (planar) or $2L$ (toric) logicals.
8.3 Trade-os and Practical Niche
Where the sheet code wins.
(i)
Logical density at
𝐾=4
with inter-sheet joint Pauli
measurements
: the toric variant encodes
2𝐿
logicals per sheet (twice the rate of an equiv-
alent count of
𝐾=4
rotated surface code patches), and triangle surgery makes all
6𝐿
logicals across three sheets mutually addressable for joint Pauli measurements a fea-
ture surface code only provides within a single patch’s lattice surgery zone. (ii)
Single-chip
𝐾=4
deployment with three sheets
: the three sheets sit on the same chip with recong-
urable couplers handling time-multiplexing; the bivariate bicycle architecture explicitly
requires
𝐾=6
with some long-range couplers [11]. (iii)
Inter-sheet joint Pauli measure-
ments as a native primitive
: the FCC lattice’s intrinsic 3D triangle structure gives FT-
thresholded inter-sheet joint Pauli measurements where surface code patches would need
expensive routing or transversal protocols. (iv)
Code-distance scaling
: unlike the full
[[3𝐿
3
, 2𝐿
3
+2, 3]]
FCC code’s xed
𝑑 = 3
, the sheet code achieves growing
𝑑 = 𝐿
, and
compared to the 3D toric code’s
𝑑 = 𝐿
at xed
𝑘 = 3
, the sheet code reaches
𝑘 = 2𝐿
per
sheet at the same distance.
Where the sheet code loses. (i) Asymptotic rate vs. qLDPC: rate $\Theta(1/d^2)$ vanishing as $d \to \infty$, the same as surface code; bivariate bicycle achieves $\Theta(1)$ rate at $d = \Theta(\sqrt{n})$, recent "good" qLDPC codes at $d = \Theta(n)$. At $d = 12$ the Gross BB code's rate ($1/12$) is approximately $12\times$ higher than the sheet code's at the same distance. (ii) Toric variant needs wrap couplers; the planar variant has half the rate ($L$ logicals per sheet). (iii) Partial logical reachability via triangles: products span $6L - 3$ of the $6L$ cross-sheet logicals (three global-wrap correlations are unreachable); the reachability results suggest that joint-Pauli measurements between any logical pair can be mediated via ancilla logicals, but a complete routing protocol with depth and FT analysis is left to future work. (iv) Threshold simulation uses MWPM on a decomposed DEM; BP+OSD may extract further improvement.

Table 1: Comparison of CSS codes at code distance $d$ (or $L$). Rate is $k/n_{\mathrm{data}}$. "Wraps" indicates periodic boundary couplers required. "Cross-block primitive" denotes the available primitive for operations between independent code blocks: lattice surgery within a patch for surface code; FT joint-Pauli measurements via triangle surgery for the sheet code; active development for bivariate bicycle. The Gross BB code uses fixed $d = 12$, $k = 12$, $n = 144$.

| Code | $n_{\mathrm{data}}$ | $k$ | $d$ | Rate | $K$ | Cross-block primitive | Threshold |
|---|---|---|---|---|---|---|---|
| Rotated surface [4] | $d^2$ | 1 | $d$ | $1/d^2$ | 4 | Lattice surgery | 0.7–1.0% |
| 2D toric [3] | $2d^2$ | 2 | $d$ | $1/d^2$ | 4 (wraps) | Lattice surgery | 0.7% |
| 3D toric [2] | $3L^3$ | 3 | $L$ | $1/L^3$ | 6 | Limited | |
| Gross BB [10] | 144 | 12 | 12 | $1/12$ | 6 | Active development [11] | 0.7% |
| Two-gross BB [10] | 288 | 12 | 18 | $1/24$ | 6 | Active development | |
| Full FCC [1] | $3L^3$ | $2L^3 + 2$ | 3 | $2/3$ | 12 | | |
| Sheet code (planar, 1 sheet) | $L(L-1)^2$ | $L$ | $L - 1$ | $1/L^2$ | 4 | Triangle surgery | 0.76% |
| Sheet code (toric, 1 sheet) | $L^3$ | $2L$ | $L$ | $2/L^2$ | 4 (wraps) | Triangle surgery | 1.0% |
| Sheet code (toric, 3 sheets, MUX) | $3L^3$ | $6L$ | $L$ | $2/L^2$ | 4 (wraps) | Triangle surgery | $1.07\% \pm 0.05\%$ |
Practical niche. Hardware with $K = 4$ active connectivity (Google Willow [8] and future surface-code-compatible deployments), code distances $d \in \{6, 8, 10\}$ where bivariate bicycle's $d \ge 12$ overhead is not yet justified, and workloads requiring frequent inter-sheet joint Pauli measurements where surface code patches would force routing or transversal protocols across separate substrates. For longer-term larger-scale fault tolerance ($d \ge 12$) on $K = 6$ hardware, bivariate bicycle codes remain preferable.
9 Discussion
9.1 Universal Gate Set
Joint Pauli measurements are sufficient in principle to synthesize arbitrary Clifford operations through standard lattice-surgery protocols with ancilla logical qubits [5, 6]. In this work, we verify the three-sheet Horsman CNOT logical truth table (Section 10) at $L \in \{4, 6, 8\}$, but find that the current composed protocol is not yet fault-tolerantly thresholded (its $d_{\mathrm{each}}$-scaling fails because the Pauli-frame correction enters the observable through single-record gauge measurements). Establishing FT-thresholded Clifford composition (and by extension a fault-tolerantly characterized universal Clifford set) therefore requires the gauge-measurement refinement or transversal route discussed in Section 10.5, not provided in this paper. Non-Clifford gates (such as $T$) further require magic state injection or distillation; we expect the FCC lattice's high symmetry group (octahedral, order 48) to support efficient magic state protocols, but a detailed analysis is left to future work.
9.2 Planar-Boundary Surgery
For real hardware deployments without periodic boundary conditions, each layer becomes a rotated surface code with rough/smooth boundaries: rough on $\pm x$, smooth on $\pm y$ for $S_{xy}$; rough on $\pm z$, smooth on $\pm x$ for $S_{xz}$; rough on $\pm z$, smooth on $\pm y$ for $S_{yz}$. The rough-on-$z$ choice for $S_{xz}, S_{yz}$ aligns the rough boundary with the direction in which the triangle primitive's chain extends (the primitive cancels $xy$-sheet edges and extends through $L$ layers along $z$), enabling a non-trivial cross-sheet logical to terminate on rough boundaries.
Verified parameters. The planar code satisfies $H_X H_Z^T = 0$; the minimum-weight cross-sheet surgery primitive at $L = 4, 6, 8$:

| $L$ | Data qubits $n$ | Logicals $k$ | Distance | Triangles in primitive (Op weight) |
|---|---|---|---|---|
| 4 | 108 | 12 | 3 | 9 (21) |
| 6 | 450 | 18 | 5 | 25 (55) |
| 8 | 1,176 | 24 | 7 | 49 (105) |
The planar per-sheet logical count is $L$ (one per layer), so the three-sheet code encodes $k = 3L$ vs. $6L$ in the toric construction. The planar distance scales as $L - 1$ (standard rotated surface code finite-size effect). The minimum-weight naive surgery primitive uses $(L - 1)^2$ triangles producing an Op of weight $\sim L^2$ (a thick block rather than the toric ribbon), forced by the additional boundary-stabilizer commutation constraints.
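The tabulated planar parameters follow the closed forms quoted in the text: three sheets of $L(L-1)^2$ data qubits each, $k = 3L$, distance $L - 1$, and $(L-1)^2$ triangles in the naive primitive. A short consistency check against the table is sketched below; the function name is illustrative, and the Op weights are left out because the text gives no exact closed form for them.

```python
def planar_three_sheet_params(L):
    """(n, k, d, naive_triangles) for the three-sheet planar construction."""
    n = 3 * L * (L - 1) ** 2      # three sheets of L(L-1)^2 data qubits
    k = 3 * L                     # one logical per layer, L layers per sheet
    d = L - 1
    naive_triangles = (L - 1) ** 2
    return n, k, d, naive_triangles


# Rows copied from the "Verified parameters" table.
tabulated = {4: (108, 12, 3, 9), 6: (450, 18, 5, 25), 8: (1176, 24, 7, 49)}
matches = {L: planar_three_sheet_params(L) == row for L, row in tabulated.items()}
```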
Measured planar threshold. Following the same custom Stim methodology as the toric case, we run planar surgery logical-memory experiments with $3d$ rounds of syndrome extraction. The naive $(L - 1)^2$-triangle primitive (commuting with all boundary stabs) yields no $d$-scaling crossing across $p \in [5 \times 10^{-5}, 5 \times 10^{-3}]$ with shot counts up to 30,000 (logical error rate strictly increasing in $L$ at every measured $p$, implying threshold below $5 \times 10^{-5}$, dominated by the $L^2$-weight Op being too heavy to protect).
Boundary-aware planar primitive. Allowing the primitive's Z-operator to anticommute with one weight-2 boundary X-stab (treating it as an additional broken stabilizer during the merge phase, in direct analogy to the toric protocol's bulk-broken X-stabs, Lemma 2) dramatically reduces the primitive's Op weight:

| $L$ | Triangles | Op weight | Boundary stabs broken | Total qubits | Total broken X | DEM mech. |
|---|---|---|---|---|---|---|
| 4 | 3 | 7 | 1 | 175 | 6 | 11,688 |
| 6 | 5 | 11 | 1 | 757 | 10 | 41,183 |
| 8 | 7 | 15 | 1 | 1,987 | 14 | 137,904 |
The boundary-aware primitive is a ribbon structure with Op weight $2L - 1$ and triangle count $L - 1$, closely analogous to the toric primitive (Op weight $2L$, triangle count $L$). It breaks exactly one additional weight-2 boundary X-stabilizer per surgery operation, handled by the same detector-suppression mechanism used for bulk-broken X-stabs.
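The ribbon scalings can be placed side by side in a few lines. The closed forms are the ones quoted in this section (triangle count, Op weight) and in the comparison summary (distinct broken X-stabilizers); the function and key names are illustrative.

```python
def ribbon_primitive(L, variant):
    """Closed-form size of the ribbon-shaped ZZ surgery primitive."""
    if variant == "toric":
        return {"triangles": L, "op_weight": 2 * L, "broken_x": 3 * L}
    if variant == "planar_boundary_aware":
        return {"triangles": L - 1, "op_weight": 2 * L - 1, "broken_x": 2 * (L - 1)}
    raise ValueError(f"unknown variant: {variant}")


# (triangles, Op weight, total broken X) rows from the boundary-aware table.
tabulated = {4: (3, 7, 6), 6: (5, 11, 10), 8: (7, 15, 14)}
planar_ok = all(
    tuple(ribbon_primitive(L, "planar_boundary_aware").values()) == row
    for L, row in tabulated.items()
)
```

Both variants grow linearly in $L$, in contrast to the quadratic growth of the naive planar primitive.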
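Logical error rates per surgery operation under MWPM decoding with the boundary-aware primitive:

| $p$ (%) | $L = 4$ | $L = 6$ | $L = 8$ |
|---|---|---|---|
| 0.10 | $9.0 \times 10^{-3}$ | $7.5 \times 10^{-3}$ | $7.5 \times 10^{-3}$ |
| 0.20 | $1.9 \times 10^{-2}$ | $1.3 \times 10^{-2}$ | $1.2 \times 10^{-2}$ |
| 0.30 | $3.7 \times 10^{-2}$ | $2.9 \times 10^{-2}$ | $2.0 \times 10^{-2}$ |
| 0.50 | $8.3 \times 10^{-2}$ | $7.3 \times 10^{-2}$ | $5.8 \times 10^{-2}$ |
| 0.70 | $1.5 \times 10^{-1}$ | $1.4 \times 10^{-1}$ | $1.3 \times 10^{-1}$ |
| 0.80 | $1.7 \times 10^{-1}$ | $1.9 \times 10^{-1}$ | $1.9 \times 10^{-1}$ |
| 0.90 | $2.0 \times 10^{-1}$ | $2.3 \times 10^{-1}$ | $2.4 \times 10^{-1}$ |
| 1.00 | $2.4 \times 10^{-1}$ | $2.9 \times 10^{-1}$ | $3.1 \times 10^{-1}$ |
| 1.10 | $2.6 \times 10^{-1}$ | $3.3 \times 10^{-1}$ | $3.7 \times 10^{-1}$ |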
Pairwise crossings and threshold estimate.

| Crossing | $p_{\mathrm{cross}}$ |
|---|---|
| $L = 4$ vs $L = 6$ | 0.725% |
| $L = 4$ vs $L = 8$ | 0.756% |
| $L = 6$ vs $L = 8$ | 0.812% |
The three crossings span $0.72\%$ to $0.82\%$, giving $p_{\mathrm{th}}^{\mathrm{planar}} = 0.76\% \pm 0.05\%$ for the boundary-aware planar variant. This is $\approx 30\%$ lower than toric ($1.07\%$), reflecting the cost of boundary-stabilizer breaking, but in the same percentage-level regime and practical for near-term 2D superconducting hardware. For reference, Google's distance-7 Willow demonstration reports a two-qubit CZ error rate of $0.36\%$ [8], so a planar sheet-code memory at $L \ge 4$ would operate at $\approx 2\times$ below this FSS-crossing estimate; the toric variant at $\approx 3\times$ below.
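The margins quoted above are simple ratios of the FSS-crossing estimates to the cited CZ error rate, and the relative gap between the two variants is a one-line computation. The sketch only reproduces this arithmetic from the numbers in the text; it makes no claim about how a CZ error rate maps onto the circuit-level depolarizing parameter $p$.

```python
cz_error = 0.36                                # percent, Willow CZ error rate cited in the text
thresholds = {"planar": 0.76, "toric": 1.07}   # percent, FSS-crossing estimates

# How far below each estimate the cited CZ error rate sits.
margin = {name: th / cz_error for name, th in thresholds.items()}

# Relative reduction of the planar estimate with respect to the toric one.
relative_drop = (thresholds["toric"] - thresholds["planar"]) / thresholds["toric"]
```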
Formal FSS fit (planar). Applying the same FSS ansatz to the boundary-aware planar data ($p \in [0.4\%, 1.2\%]$, 21 data points across $L = 4, 6, 8$) gives

$$p_{\mathrm{th}}^{\mathrm{planar}} = 0.725\% \pm 0.018\%, \quad \nu = 1.38 \pm 0.12, \tag{6}$$

with $\chi^2/\mathrm{DOF} = 2.12$. The fit-based threshold coincides with the lowest pairwise crossing ($L = 4$ vs $L = 6$ at $0.725\%$), reflecting the additional weight that low-$L$ data carries in a quadratic fit. The exponent $\nu \approx 1.4$ matches the toric value within uncertainty, as expected for a CSS-symmetric construction.
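Comparison summary.

| Variant | Triangles | Op weight | Distinct broken X | DEM mech. ($L = 8$) | $p_{\mathrm{th}}^{\mathrm{surgery}}$ |
|---|---|---|---|---|---|
| Toric | $L$ | $2L$ | $3L$ | 539,215 | $1.07\% \pm 0.05\%$ |
| Planar (naive) | $(L-1)^2$ | $\sim L^2$ | $\sim 2L^2$ | 325,423 | $< 5 \times 10^{-5}$ |
| Planar (boundary-aware) | $L - 1$ | $2L - 1$ | $2(L - 1)$ | 137,904 | $0.76\% \pm 0.05\%$ |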
[Figure 5 plot. Title: Boundary-aware planar-boundary surgery threshold: $L = 4, 6, 8$; ribbon primitive (Op weight $2L - 1$) recovers toric-like behavior. x-axis: physical error rate $p$ (%). y-axis: logical error rate (per surgery operation), log scale. Curves: L=4 (d_each=4), L=6 (d_each=6), L=8 (d_each=8). Annotation: pairwise crossings, $p_{\mathrm{th}}^{\mathrm{planar}} = 0.76\% \pm 0.05\%$.]

Figure 5: Planar-boundary surgery threshold with boundary-aware ribbon primitive at $L = 4, 6, 8$. The boundary-aware variant allows exactly one weight-2 boundary X-stabilizer to be broken during merge, in direct analogy to how the toric protocol breaks bulk X-stabilizers (Lemma 2). The three pairwise crossings $L = 4$ vs $L = 6$, $L = 4$ vs $L = 8$, $L = 6$ vs $L = 8$ occur at $p = 0.725\%, 0.756\%, 0.812\%$ respectively (gray band), giving $p_{\mathrm{th}}^{\mathrm{planar}} = 0.76\% \pm 0.05\%$.
The boundary-aware planar variant achieves toric-comparable thresholds with strictly planar (no-wraparound) hardware, deployable on Google Willow-class platforms without toric topology. The $\approx 30\%$ threshold reduction relative to toric is the cost of handling boundary stabilizers; further optimization (e.g., distributing broken stabs across both blocks of a multi-block surgery) is future work.
9.3 XX-Merge: the CSS-Symmetric Surgery Primitive
The triangle primitive that implements the ZZ-merge (Section 7) has a CSS dual: replace the triangle ancilla protocol with the X-basis version (ancilla reset to $|+\rangle$, CNOT direction reversed with ancilla as control and data as target, ancilla measured in X) to obtain an XX-merge primitive measuring the joint $X$-product of triangle data qubits. The constraint becomes: triangle combinations whose X-product commutes with all $Z$-stabilizers and is not in the row span of $H_X$.
Primitive identification (toric). Searching the kernel of $H_Z \cdot B^T \pmod 2$ at minimum weight, after skipping trivial X-stabilizer combinations:

| $L$ | ZZ-primitive (tri, wt) | XX-primitive (tri, wt) | Broken stabs | Notes |
|---|---|---|---|---|
| 4 | 4, 8 | 4, 8 | 12 Z-stabs | exact CSS mirror |
| 6 | 6, 12 | 6, 12 | 18 Z-stabs | exact CSS mirror |
| 8 | 8, 16 | 10, 20 | 28 Z-stabs | 3-sheet, weight 25% heavier than ZZ |
At $L = 4, 6$ the XX-primitive structurally mirrors the ZZ-primitive exactly: same triangle count, same operator weight, and the same $3L$ distinct broken Z-stabilizers per surgery (the CSS dual of the ZZ-primitive's broken X-stabs, Lemma 2). At $L = 8$, minimum-weight search returns a 2-sheet artifact (weight 16, sheets $\{xy, xz\}$ only); the first 3-sheet primitive appears at higher weight (10 triangles, weight 20), breaking 28 Z-stabs.
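The search described above can be pictured with a brute-force stand-in: enumerate small sets of triangles, take the mod-2 product of their supports, and keep the products that overlap every opposite-type stabilizer evenly. The triangles and stabilizers below are invented toy data, the enumeration replaces the kernel computation on $H_Z \cdot B^T$ (whose matrix $B$ is defined elsewhere in the paper), and the additional filter that discards products lying in the stabilizer row span is omitted for brevity.

```python
from itertools import combinations


def support_of_product(triangles, chosen):
    """Qubits on which the product of the chosen triangle operators acts (mod 2)."""
    support = set()
    for index in chosen:
        support ^= triangles[index]
    return support


def find_commuting_products(triangles, stabilizers, max_size):
    """Return (weight, triangle indices) for products commuting with all stabilizers."""
    hits = []
    for size in range(1, max_size + 1):
        for chosen in combinations(range(len(triangles)), size):
            support = support_of_product(triangles, chosen)
            if support and all(len(support & s) % 2 == 0 for s in stabilizers):
                hits.append((len(support), chosen))
    return sorted(hits)  # lowest weight first


# Hypothetical toy data: three triangles sharing corners, two stabilizers.
toy_triangles = [{0, 1, 2}, {2, 3, 4}, {4, 5, 0}]
toy_stabilizers = [{1, 3}, {3, 5}]
candidates = find_commuting_products(toy_triangles, toy_stabilizers, max_size=3)
```

In this toy instance no single triangle or pair passes the parity test; only the product of all three does, which is the same qualitative behavior as a primitive that needs a full ribbon of triangles.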
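Threshold measurement. Custom Stim circuit with $3d$ syndrome rounds, $|+\rangle^{\otimes n}$ initialization, final X-basis readout. Z-stab detectors are paired from round 1 onward, except the $3L$ broken ones during the merge phase (28 at $L = 8$); X-stab detectors are deterministic in round 0 and paired throughout. Logical error rates per surgery operation:

| $p$ (%) | $L = 4$ | $L = 6$ | $L = 8$ |
|---|---|---|---|
| 0.4 | $5.60 \times 10^{-2}$ | $1.90 \times 10^{-2}$ | $1.60 \times 10^{-2}$ |
| 0.6 | $1.25 \times 10^{-1}$ | $6.98 \times 10^{-2}$ | $9.27 \times 10^{-2}$ |
| 0.8 | $2.12 \times 10^{-1}$ | $1.71 \times 10^{-1}$ | $2.23 \times 10^{-1}$ |
| 0.9 | $2.53 \times 10^{-1}$ | $2.33 \times 10^{-1}$ | $3.08 \times 10^{-1}$ |
| 1.0 | $2.87 \times 10^{-1}$ | $2.97 \times 10^{-1}$ | $3.92 \times 10^{-1}$ |
| 1.1 | $3.15 \times 10^{-1}$ | $3.49 \times 10^{-1}$ | $4.30 \times 10^{-1}$ |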
Pairwise crossings: $L = 4$ vs $L = 6$ at $p \approx 0.97\%$ (the cleanest CSS-symmetric comparison; both primitives are exact mirrors of the ZZ-side), $L = 4$ vs $L = 8$ at $p \approx 0.77\%$, $L = 6$ vs $L = 8$ at $p \approx 0.46\%$. The lower $L = 6$ vs $L = 8$ crossing is attributable to the $L = 8$ XX-primitive's 25% heavier operator weight (20 vs. 12 at $L = 6$); a tighter $L = 8$ analog primitive matching the Z-side's weight 16 is open computational work. Taking the $L = 4$ vs $L = 6$ crossing as the cleanest estimate: the FSS-crossing threshold estimate is $p_{\mathrm{th}}^{XX} \approx 1.0\% \pm 0.1\%$, statistically consistent with the ZZ-merge estimate $1.07\% \pm 0.05\%$ within combined uncertainty. The CSS-symmetric surgery primitive result therefore holds on the toric variant. On the planar boundary-aware variant the dual XX-primitive is structurally blocked (see below), so the "CSS-symmetric" framing should be read as toric-only.
Multi-input ZZ truth table. As an additional test of the ZZ-primitive, we run surgery on all four computational-basis inputs $|c\rangle_C |a\rangle_A$ ($C, A$ are the merged per-sheet Z-logicals, weight-$L$ strings in $S_{xz}, S_{yz}$). Logical inputs are prepared by applying matched X-logicals $X_C, X_A$ to $|0\rangle^{\otimes n}$. At $p = 0$, all four inputs yield deterministic outcomes matching $ZZ = c \oplus a$ (100 shots each); at $p = 0.005$, the failure rate is input-independent to four significant figures (5,000 shots each, all giving 34.4% raw flip rate). The primitive acts correctly as $Z_C Z_A$ on all classical inputs with input-symmetric noise.
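For reference, the noiseless truth table that the $p = 0$ runs are compared against is the parity of the two input bits. The sketch below tabulates it; the function name and the bit encoding of the measurement outcome (0 for eigenvalue $+1$, 1 for $-1$) are illustrative conventions.

```python
def ideal_zz_outcome(c, a):
    """Ideal joint-ZZ outcome bit on |c>|a>: the parity c XOR a."""
    return c ^ a


truth_table = {(c, a): ideal_zz_outcome(c, a) for c in (0, 1) for a in (0, 1)}
eigenvalues = {inputs: (-1) ** bit for inputs, bit in truth_table.items()}
```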
Planar XX-merge: structural blocker. The planar boundary-aware variant does not admit an analogous XX-primitive under the standard rough-$z$/smooth-$xy$ boundary choice. Weight-2 Z-boundary stabilizers live only on the rough $z$-boundaries; X-strings run perpendicular to $z$, so an XX-primitive that breaks exactly one boundary Z-stab cannot extend as a cross-sheet ribbon. Searching for $L = 4, 6, 8$ planar XX-primitives breaking one boundary Z-stab yields zero candidates. The CSS-symmetric XX-merge result therefore applies only to the toric variant. CSS-dual planar surgery requires either a symmetric-boundary planar variant (rough boundaries distributed across all three axes) or implementing the protocol on the toric variant.
The three-sheet Horsman CNOT, constructed by sequencing the ZZ- and XX-merge primitives, is discussed in detail in Section 10 (split into its own section to make its epistemic status explicit: verified logical truth table, not yet fault-tolerantly characterized).
10 Three-Sheet Horsman CNOT: Verified Truth Table, Not Yet Fault-Tolerantly Characterized
This section deliberately stands apart from the joint-Pauli-measurement results of Sections 7 and 9.3 because its claim hierarchy is different. The ZZ- and XX-merges are characterized as fault-tolerant primitives with finite-size crossing estimates of their thresholds. The three-sheet Horsman CNOT, constructed below by composing the two merges with a CSS-paired ancilla sheet, is a correct logical gate but not, as constructed here, a fault-tolerantly characterized one. We make both claims explicitly and separately to avoid the natural conflation of “CNOT verified” with “CNOT FT-thresholded”; only the former holds in this work.
10.1 Construction: Three Sheets, Two Merges, One Logical CNOT
The standard three-qubit Horsman [5] CNOT places the control $C$, ancilla $A$ (initialized in $|+\rangle_L$), and target $T$ in three patches connected by sequential ZZ- and XX-merges. We assign $C$, $A$, $T$ to three distinct sheets $S_C, S_A, S_T$ of one FCC sheet code: the ZZ-primitive (Section 7) spans $(S_C, S_A)$ and the XX-primitive (Section 9.3) spans $(S_A, S_T)$, sharing exactly sheet $S_A$. A search across valid primitives confirms that such a three-sheet decomposition exists at every $L \in \{4, 6, 8\}$ tested, with the ZZ- and XX-primitives sharing zero triangles and exactly one data qubit (the CSS-pairing anchor of $A$’s Z- and X-logicals), as required by the symplectic structure. The default lowest-weight primitive per sheet pair gives higher overlap as an artifact of greedy primitive selection; choosing CSS-compatible primitives (Z-prim’s $Z_A$ component anticommuting at exactly one qubit with X-prim’s $X_A$ component) gives the clean three-sheet decomposition.
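The CSS-compatibility condition above reduces to a support-overlap test: a Z-type and an X-type string anticommute locally on every qubit where both act, so the ancilla-sheet components must overlap on exactly one data qubit. The sketch below illustrates that filter on toy data. The function names and the integer qubit labels are hypothetical, and this is not the enumerator shipped as `find_3sheet_primitives.py`.

```python
def anticommuting_sites(z_support, x_support):
    """Qubits on which a Z-type and an X-type Pauli string both act."""
    return set(z_support) & set(x_support)


def is_css_compatible(z_prim_on_a, x_prim_on_a):
    """True if the ancilla-sheet components overlap on exactly one data qubit."""
    return len(anticommuting_sites(z_prim_on_a, x_prim_on_a)) == 1


def pick_compatible_pair(z_candidates, x_candidates):
    """Lowest-total-weight (z, x) pair passing the overlap test, or None."""
    pairs = [
        (z, x)
        for z in z_candidates
        for x in x_candidates
        if is_css_compatible(z, x)
    ]
    return min(pairs, key=lambda zx: len(zx[0]) + len(zx[1]), default=None)


# Toy supports (hypothetical): integers stand in for data qubits on sheet S_A.
z_candidates = [{0, 1, 2, 3}, {2, 5, 7, 9}]
x_candidates = [{1, 2, 4, 6}, {5, 10, 11, 12}]

chosen = pick_compatible_pair(z_candidates, x_candidates)
print(chosen, anticommuting_sites(*chosen))
```

The first Z candidate overlaps the first X candidate on two qubits and the second on none, so a purely greedy per-pair choice can miss the compatible combination; filtering on the overlap count selects it directly.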
10.2 Pauli-Frame Correction via GF(2) Solve
We implement the full merge-split-merge protocol with synchronous syndrome extraction throughout, intermediate Pauli-frame tracking via Stim’s OBSERVABLE_INCLUDE bookkeeping, and the destructive readout in the Z-basis. Two subtleties arise that the bare Heisenberg picture (which treats the XX-merge as a single non-local measurement of $X_A X_T$) misses. First, the protocol with computational-basis inputs and X-basis ancilla measurement leaves $T$ in an X-eigenstate, not a Z-eigenstate; measuring $A$ in the Z-basis instead and tracking the joint stabilizer $Z_C Z_A Z_T = (-1)^{m_{ZZ} + t}$ recovers the Z-basis CNOT output, with the predicted parity $(Z_T^{\mathrm{out}})_{\mathrm{par}} \oplus (Z_A^{\mathrm{out}})_{\mathrm{par}} \oplus m_{ZZ}$ equal to $c \oplus t$. Second, the XX-merge is realized as multiple individual X-triangle measurements; the cumulative anticommutation count of $Z_C Z_A Z_T$ with these triangles is even but not necessarily zero, so a gauge bit equal to the XOR of an even subset of triangle outcomes leaks into the observable. The leaking bit is captured by the first post-XX-merge measurement of any non-A-touching broken Z-stab whose anticommutation pattern with the X-triangles matches the relevant subset; we identify these structurally by solving a linear system over GF(2). Including the corresponding Z-stab record(s) in the observable cancels the gauge bit.
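One way to picture the GF(2) solve is as follows. Suppose each row of a binary matrix is an X-triangle, each column a candidate broken Z-stab, and an entry is 1 when the two anticommute; the right-hand side marks the triangles that anticommute with $Z_C Z_A Z_T$. A solution then selects Z-stab records whose combined pattern matches the leaking subset. This matrix layout is an assumption made for illustration, and the solver below is a generic Gauss-Jordan sketch rather than the routine in `sheet_code_gf2.py`.

```python
def solve_gf2(matrix, rhs):
    """Solve matrix @ x = rhs over GF(2); entries must be 0 or 1.

    Returns one solution (free variables set to 0), or None if inconsistent.
    """
    rows = [list(r) + [b] for r, b in zip(matrix, rhs)]  # augmented matrix
    n_rows, n_cols = len(rows), len(matrix[0])
    pivots = []
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][c]), None)
        if pivot is None:
            continue  # no pivot in this column: it is a free variable
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(n_rows):
            if i != r and rows[i][c]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == n_rows:
            break
    # A zero row with right-hand side 1 means no solution exists.
    if any(rows[i][-1] for i in range(r, n_rows)):
        return None
    x = [0] * n_cols
    for row_index, c in enumerate(pivots):
        x[c] = rows[row_index][-1]
    return x


def apply_gf2(matrix, x):
    """Matrix-vector product over GF(2)."""
    return [sum(a * b for a, b in zip(row, x)) % 2 for row in matrix]


# Toy system (hypothetical): 4 X-triangles (rows), 3 candidate Z-stabs (columns).
anticommutation = [[1, 0, 1], [1, 1, 0], [0, 1, 1], [0, 0, 0]]
target_pattern = [1, 1, 0, 0]
selection = solve_gf2(anticommutation, target_pattern)
print(selection)
```

A nonzero entry in `selection` marks a Z-stab record to include in the observable; an all-zero solution corresponds to the case where no Pauli-frame correction is needed.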
10.3 Logical Truth Table Verified at $L \in \{4, 6, 8\}$
Stim’s deterministic-observable check passes for all four computational-basis inputs at $L = 4$, $d_{\mathrm{each}} = 2$: the detector error model builds cleanly at $p = 0$, and raw sampling at $p = 0$ gives $(c, c \oplus t)$ deterministically for every input, the canonical CNOT truth table. The protocol generalizes: the same algorithm verifies the CNOT truth table at $L = 6$ (where the decomposition picked by the batch primitive-finder happens to require no Pauli-frame correction: the target anticommutation pattern is identically zero) and $L = 8$ (with one correction Z-stab), and at $L = 4$ with $d_{\mathrm{each}} \in \{2, 3, 4\}$. The GF(2) linear system for the gauge-bit correction has a solution at every distance and depth we tested.
The verified circuit at $L = 4$, $d_{\mathrm{each}} = 2$ uses 392 physical qubits (192 data + 4 Z-triangle merge ancillas + 4 X-triangle merge ancillas + bulk stabilizer ancillas); $L = 6$ uses 1308 qubits (648 data + 6 Z-tri + 6 X-tri merge ancillas + bulk stabilizer ancillas); $L = 8$ uses 3098 qubits (1536 data + 14 Z-tri + 12 X-tri merge ancillas + bulk stabilizer ancillas).
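The qubit counts above can be cross-checked with a few lines of arithmetic. The sketch below takes the quoted totals and merge-ancilla counts as inputs, assumes the data-qubit count is $3L^3$ (consistent with 192, 648, and 1536), and infers the bulk stabilizer ancilla count as the remainder. That the remainder equals the data count at each $L$ is an observation from these numbers, not a statement made in the text.

```python
# Quoted in the text: (Z-triangle, X-triangle) merge ancillas and circuit totals.
MERGE_ANCILLAS = {4: (4, 4), 6: (6, 6), 8: (14, 12)}
QUOTED_TOTALS = {4: 392, 6: 1308, 8: 3098}


def data_qubits(L):
    """Data qubits of the three-sheet code, assuming 3 * L**3."""
    return 3 * L**3


def bulk_ancillas(L):
    """Bulk stabilizer ancillas implied by the quoted total at distance L."""
    z_tri, x_tri = MERGE_ANCILLAS[L]
    return QUOTED_TOTALS[L] - data_qubits(L) - z_tri - x_tri


for L in (4, 6, 8):
    print(L, data_qubits(L), bulk_ancillas(L))
```

Under the same pattern, data plus bulk ancillas alone would come to $6L^3 = 6000$ at $L = 10$, before merge ancillas.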
10.4 Behavior Under Depolarizing Noise: Distance Suppression in a Narrow Regime
We sweep the CNOT logical error rate (per CNOT, defined as either $\mathrm{Obs}_0$ or $\mathrm{Obs}_1$ deviating from its noise-free expected value) using MWPM decoding via PyMatching on the combined detector graph, at $d_{\mathrm{each}} = L$. The $L = 4$ vs $L = 6$ comparison shows clear distance suppression in the operationally relevant regime $p \in [10^{-3}, 3 \times 10^{-3}]$: at $p = 10^{-3}$, $\mathrm{LER}_{L=4} = 19.8\% \pm 1.4\%$ versus $\mathrm{LER}_{L=6} = 11.0\% \pm 1.6\%$, a $4.2\sigma$ improvement attributable to distance. At $p = 3 \times 10^{-3}$ the order reverses ($40.0\% \pm 1.7\%$ vs $44.0\% \pm 2.5\%$), placing the pairwise crossing near $p \approx 2$–$3 \times 10^{-3}$.
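The error bars and significance quoted above can be sanity-checked from the reported rates. The sketch below assumes binomial sampling with 800 shots at $L = 4$ and 400 shots at $L = 6$ (the shot counts given in the Figure 6 caption; assigning them in that order is an assumption), and interpolates the crossing linearly in $\ln p$. All function names are illustrative.

```python
import math


def binomial_stderr(rate, shots):
    """Standard error of an estimated failure rate from `shots` Bernoulli trials."""
    return math.sqrt(rate * (1.0 - rate) / shots)


def z_score(rate_a, err_a, rate_b, err_b):
    """Difference of two rates in units of their combined standard error."""
    return (rate_a - rate_b) / math.hypot(err_a, err_b)


def log_crossing(p0, p1, ratio0, ratio1):
    """Zero of ln(LER_4 / LER_6), interpolated linearly in ln(p) on [p0, p1]."""
    r0, r1 = math.log(ratio0), math.log(ratio1)
    frac = r0 / (r0 - r1)
    return p0 * (p1 / p0) ** frac


se_l4 = binomial_stderr(0.198, 800)   # quoted as 1.4%
se_l6 = binomial_stderr(0.110, 400)   # quoted as 1.6%
significance = z_score(0.198, 0.014, 0.110, 0.016)
p_cross = log_crossing(1e-3, 3e-3, 0.198 / 0.110, 0.400 / 0.440)

print(round(se_l4, 3), round(se_l6, 3), round(significance, 1), p_cross)
```

With the rounded inputs the significance comes out slightly above $4.1\sigma$, in line with the quoted $4.2\sigma$ (which was presumably computed from unrounded values), and the interpolated crossing falls inside the quoted $2$–$3 \times 10^{-3}$ window.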
10.5 Non-Fault-Tolerant Scaling: Direct Diagnostic and Cause
The protocol as constructed is not fault-tolerant in the standard sense. A direct diagnostic shows that increasing $d_{\mathrm{each}}$ does not reduce the LER at fixed $L$: at $L = 4$, $d_{\mathrm{each}} = 8$ gives higher LER than $d_{\mathrm{each}} = 4$ across the swept range (e.g., 26.0% vs 19.8% at $p = 10^{-3}$, 43.8% vs 29.4% at $p = 2 \times 10^{-3}$). Below $p \approx 5 \times 10^{-4}$ the LER curves are no longer cleanly ordered by code distance $L$ either. The structural cause is that the Pauli-frame correction includes one or two specific gauge measurements (the broken Z-stabs identified by the GF(2) solve above) that enter the observable as single records: they are not repeated $d_{\mathrm{each}}$ times in a way the decoder can majority-vote, so their noise contribution scales with measurement count rather than benefiting from the underlying code distance.
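A toy calculation illustrates why single-record gauge bits set an error floor. It assumes each measurement record flips independently with probability $q$, which is far simpler than the circuit-level depolarizing model used in the simulations, and it treats repetition as a plain majority vote rather than a decoded syndrome history. It illustrates the scaling argument only and is not a model of the actual protocol.

```python
from math import comb


def xor_flip_probability(q, m):
    """Probability that the XOR of m records, each flipped w.p. q, is wrong."""
    return 0.5 * (1.0 - (1.0 - 2.0 * q) ** m)


def majority_flip_probability(q, d):
    """Probability that a majority vote over d (odd) independent repeats is wrong."""
    return sum(
        comb(d, k) * q**k * (1.0 - q) ** (d - k)
        for k in range(d // 2 + 1, d + 1)
    )


q = 0.01  # hypothetical per-record flip probability
single_record = xor_flip_probability(q, 1)
two_records = xor_flip_probability(q, 2)
voted = majority_flip_probability(q, 3)

print(single_record, two_records, voted)
```

In this toy model, one unrepeated record contributes an error of order $q$ regardless of code distance, two such records contribute roughly $2q$, and a three-fold vote already suppresses the contribution to order $q^2$. This is the qualitative gap that the FT-repeated gauge measurements discussed next are meant to close.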
Path to FT-thresholded status. Repeating these gauge measurements $d_{\mathrm{each}}$ times in a fashion that is itself fault-tolerant (so the decoder can detect and correct measurement errors on the gauge bits via standard syndrome-difference detectors) is the next protocol-level step. An alternative is a decoder that incorporates the Pauli-frame structure explicitly (e.g., a hypergraph or BP+OSD decoder aware of which detectors share the gauge-bit error mechanism). Both are concrete follow-up directions but are outside the scope of this work; we therefore characterize the CNOT only as a verified logical gate, not as a fault-tolerantly thresholded gate. A further possible route, applicable when the three-sheet stacked architecture (Section 3) is realized with vertical inter-sheet couplers, is a transversal inter-sheet CNOT, available if the rotational sheet isomorphism that relates two sheets at the lattice level can be realized as a physical qubit correspondence that is also a stabilizer-preserving CSS isomorphism between the two sheet codes. This is plausible (the lattice automorphism induces a permutation of edges that maps each sheet’s vertex-Z and oct-void-X stabilizers to the corresponding stabilizers in the partner sheet), but we do not give the explicit stabilizer-level construction or threshold characterization here; both the stabilizer-mapping lemma and the resulting LER characterization are left to future work. The transversal route, if substantiated, would inherit the memory threshold and avoid gauge bookkeeping entirely.
10.6 Summary
The three-sheet Horsman CNOT on the FCC sheet code is a verified correct logical gate ($p = 0$ truth table at $L \in \{4, 6, 8\}$, $d_{\mathrm{each}} \in \{2, 3, 4\}$) but not, as constructed here, a fault-tolerantly characterized one. Distance suppression is observed in the narrow regime $p \approx 10^{-3}$; under standard FT scaling diagnostics ($d_{\mathrm{each}}$-scaling and low-$p$ distance ordering), the protocol fails to qualify as fault-tolerant. The cause is structural (single-record gauge measurements in the observable), and the remediation paths (FT-repeated gauge measurements, gauge-aware decoder, or transversal in a stacked architecture) are identified.
10.7 Hardware Practicality and Deployment Niche
Scope. What this paper characterizes quantitatively falls into three tiers. Tier 1 (FT primitives with FSS-crossing threshold estimates, hardware-relevant): the ZZ-merge ($1.07\% \pm 0.05\%$ toric, $0.76\% \pm 0.05\%$ planar boundary-aware) and the XX-merge ($1.0\% \pm 0.1\%$, toric only); the static single-sheet memory baseline near 1.0% (Section 7.4) is reported for context. Tier 2 (correct logical gate, not yet FT-characterized): the three-sheet Horsman CNOT, with its truth table verified at $L \in \{4, 6, 8\}$ and $4.2\sigma$ distance suppression at $p = 10^{-3}$, but increasing $d_{\mathrm{each}}$ does not reduce LER (Section 10), so there is no clean asymptotic threshold; reaching Tier 1 requires FT-repeated gauge measurements or a gauge-aware decoder. Tier 3 (architectural): stacked three-layer deployment with vertical inter-sheet couplers, and magic-state protocols suggested by FCC’s octahedral symmetry, both specified structurally only.

[Figure 6 plot: “Three-sheet Horsman CNOT under depolarizing noise (MWPM decoding, $d_{\mathrm{each}} = L$)”. Horizontal axis: physical error rate $p$ (%). Vertical axis: logical error rate per CNOT (any of $\mathrm{Obs}_0$, $\mathrm{Obs}_1$ flipped). Curves: $L=4, d=4$; $L=6, d=6$; $L=8, d=8$; the $L = 4$ / $L = 6$ crossing is annotated.]

Figure 6: Three-sheet Horsman CNOT logical error rate (either of the two observables flipped from its noise-free expected value) under circuit-level depolarizing noise at $d_{\mathrm{each}} = L$, decoded with MWPM (PyMatching) on the combined detector graph. $L = 4$ and $L = 6$ data use 800 and 400 shots per point, respectively, across the four computational-basis inputs; $L = 8$ uses 100 shots per point. At the operating regime $p \approx 10^{-3}$ the $L = 4 \to L = 6$ improvement is $4.2\sigma$; the pairwise crossing falls near $p \approx 2$–$3 \times 10^{-3}$, roughly 3–5× below the individual-merge thresholds ($p^{ZZ}_{\mathrm{th}} \approx 1.07\%$, $p^{XX}_{\mathrm{th}} \approx 1.0\%$). Below $p \approx 5 \times 10^{-4}$ the curves are no longer monotonically ordered by distance, attributable to the Pauli-frame correction Z-stab(s) being measured once (not repeated $d$ times) so their noise contribution does not benefit from code distance. This is the diagnostic that the CNOT is not yet fault-tolerant in the standard sense, and the figure should not be read as a CNOT-threshold extraction.
Hardware compatibility. The single-sheet variant requires $K = 4$ active nearest-neighbor planar connectivity (Section 5), identical to surface code. This matches: Google Willow (105-qubit $K=4$ grid [8]), IQM Star and Garnet ($K=4$ square grids), and OQC Toshiko ($K=4$ coaxmon lattice). It does not match IBM heavy-hex ($K=3$) or Rigetti octagonal ($K=3$) topologies without SWAP padding. Trapped-ion (Quantinuum H2, 56 qubits) and neutral-atom (QuEra Aquila, up to 256 logically reconfigurable qubits) platforms emulate any topology by reconfiguration, subject to current qubit-count limits; for these the $K = 4$ envelope constraint is not binding but the qubit count is.
Concrete near-term experiments. A planar $L = 4$ single-sheet memory experiment (70 physical qubits) is within the scale of contemporary $K=4$ superconducting processors. Google’s 105-qubit Willow processor reports a CZ error rate of 0.36% [8], about 2× below the 0.76% planar memory FSS-crossing estimate. Two-sheet toric surgery at $L = 4$ requires a few-hundred-qubit $K=4$ device (390 physical qubits) and is therefore a next-generation demonstration target. The full three-sheet Horsman CNOT at $L = 4$ (392 qubits) likewise requires a larger $K=4$ device; even when the qubit count becomes available, the gate remains Tier 2 until the FT-gauge refinement (Section 10.5). Platform-specific layout, calibration, and crosstalk analysis remain future work.
Density value proposition. The Tier-1 capabilities define a specific deployment niche: higher logical density than surface code without leaving $K = 4$ planar connectivity. A three-sheet toric deployment at $L = 4$ uses 384 physical qubits (192 data + 96 vertex-Z + 96 oct-void-X ancillas) for $6L = 24$ logicals, i.e. 16 physical qubits per logical, versus $2d^2 - 1 = 31$ for distance-4 rotated surface code, a $1.9\times$ improvement. At $L = 6$: 36/logical vs 71/logical; at $L = 8$: 64/logical vs 127/logical. The ratio is essentially flat in $L$ at 1.9–2.0×. Single-sheet deployments (when only $2L$ logicals are needed) give a more modest 1.3× advantage. Bivariate bicycle codes achieve higher density at $K = 6$ (gross code: 12 logicals in 144 qubits at $d = 12$), so the sheet code’s specific niche is the $K = 4$ slot: higher density than surface code without the connectivity upgrade BB requires.
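The three-sheet density figures follow from simple counting, sketched below. The ancilla count is an assumption extrapolated from the $L = 4$ breakdown (96 vertex-Z plus 96 oct-void-X ancillas for 192 data qubits, i.e. one ancilla per data qubit overall); it reproduces the quoted 36 and 64 qubits per logical at $L = 6$ and $L = 8$. Function names are illustrative.

```python
def sheet_code_qubits_per_logical(L):
    """Physical qubits per logical for the three-sheet toric deployment."""
    data = 3 * L**3
    ancillas = 2 * (data // 2)  # vertex-Z + oct-void-X (assumed data/2 each)
    logicals = 6 * L
    return (data + ancillas) / logicals


def rotated_surface_qubits_per_logical(d):
    """Physical qubits per logical for a distance-d rotated surface code patch."""
    return 2 * d * d - 1


for L in (4, 6, 8):
    sheet = sheet_code_qubits_per_logical(L)
    surface = rotated_surface_qubits_per_logical(L)
    print(L, sheet, surface, round(surface / sheet, 2))
```

Under this counting the per-logical cost is $L^2$ for the sheet code against $2L^2 - 1$ for the rotated surface code, which is why the ratio stays just below 2 as $L$ grows.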
What density buys without FT inter-logical CNOTs. The Tier-1 capabilities (memory + joint Pauli measurements) suffice for several memory-heavy deployment classes that don’t require Clifford composition between logicals: quantum networking nodes (a sheet-code repeater holds 2× more simultaneous entangled-pair memories per cryostat than a surface-code equivalent, and joint $ZZ + XX$ measurements implement Bell measurements for entanglement swapping), benchmarking and characterization throughput (proportional speedup of logical RB, process tomography, memory-lifetime sweeps), NISQ-to-FT bridge experiments (a 1,000-qubit $K = 4$ chip supports 60 sheet-code logicals vs 32 surface-code at $L = d = 4$), and control-electronics amortization (2× lower fixed overhead per logical qubit-hour). Density does not, on its own, enable algorithms requiring composable inter-logical FT Cliffords (variational chemistry on entangled multi-logicals, Shor/Grover at the logical layer, magic-state distillation between distillation patches); those wait on the Tier-2 CNOT reaching Tier-1 status via the gauge-fix refinement or transversal-in-stacked-variant route.
Deployment summary. As of today this work establishes high-density FT memory plus FT joint-Pauli primitives, immediately demonstrable at small scale on existing $K = 4$ hardware and scaling to mid-size testbeds and quantum networking nodes on next-generation $K = 4$ chips. This is a coherent and quantifiable hardware-efficiency win for memory-heavy applications, not the universal-FT-computer endpoint, with a defined path to full Clifford composition through identified protocol or hardware refinements.
10.8 Limitations and Open Problems
Higher distances ($L \geq 10$). The current threshold is from finite-size scaling at $L = 4, 6, 8$. Extension to $L = 10$ would tighten the band but requires substantially more compute time per point (6,000 qubits at $L = 10$, extrapolating from $L = 8$’s 3,100). Cached-primitive infrastructure (Section 7.1) extends to $L = 10$.
CNOT fault-tolerance refinement. The three-sheet Horsman CNOT truth table is verified at $L \in \{4, 6, 8\}$, with $d_{\mathrm{each}} \in \{2, 3, 4\}$ tested at $L = 4$ (Section 10). Distance suppression at $p = 10^{-3}$ is $4.2\sigma$ from $L = 4$ to $L = 6$, with the pairwise crossing near $p \approx 2$–$3 \times 10^{-3}$. However, a direct diagnostic test shows the protocol is not yet fault-tolerant in the standard sense: at $L = 4$, $d_{\mathrm{each}} = 8$ gives higher LER than $d_{\mathrm{each}} = 4$, and below $p \approx 5 \times 10^{-4}$ the LER curves are not cleanly ordered by code distance. The root cause is structural: the Pauli-frame correction includes specific gauge measurements (one or two broken Z-stabs per merge) that enter the observable as single records, not repeated $d$ times. Lifting these gauge measurements to FT-repeated form (or building a gauge-aware decoder) is a concrete protocol-level refinement that would restore standard FT scaling and enable a clean threshold extraction. Until that refinement is in place, the CNOT should be regarded as a verified correct logical gate but not as a fault-tolerantly characterized one.
Planar threshold further optimization. The boundary-aware planar variant achieves $0.76\% \pm 0.05\%$, 30% below toric. Closing this gap is plausible; candidate approaches are distributing broken boundary stabs across both blocks of a multi-block surgery, boundary-friendly primitives exploiting weight-2 stabs more aggressively, or routing through bulk ancillas to avoid boundary contact.
Decoder improvements. Whether BP+OSD [14] or a hypergraph decoder yields a higher threshold than the decomposed-edge MWPM used here is open; specialized decoders matched to the surgery’s gauge structure may improve performance.
Magic state distillation, three-layer stacked modeling, and BB comparison. The octahedral symmetry group (order 48) suggests efficient magic state protocols but a detailed construction is open. Physical characterization of inter-layer TSV couplers (fidelity, crosstalk, latency) is treated only structurally here. Side-by-side comparison with BB-code modular gates ($K=6$, active architectural development [11]) awaits further development of BB inter-block protocols.
Logical Clifford gates beyond CNOT. Transversal $S$ and Hadamard, and similar gates, require lattice surgery in conjugate bases or other protocols; not all are spelled out.
11 Common Objections and Responses
“Isn’t a single triad sheet just $L$ stacked 2D toric codes? Where is the novelty?” Yes, by Theorem 2, the per-sheet static memory is exactly $L$ parallel 2D toric codes. The novelty is not the per-sheet static memory but the cross-sheet primitive: weight-3 FCC triangle measurements that couple data qubits in three distinct sheets simultaneously, implementing a joint Pauli measurement between logical qubits in different sheets while preserving $K=4$ active connectivity. This primitive does not exist in independent 2D toric codes.
“Is the 1.07% figure a real threshold?” It is a finite-size threshold estimate from logical error rates at $L = 4, 6, 8$ on the full custom Stim circuit. The three pairwise crossings ($L = 4$ vs $L = 6$ at 1.109%, $L = 4$ vs $L = 8$ at 1.069%, $L = 6$ vs $L = 8$ at 1.024%) give $p^{\mathrm{surgery}}_{\mathrm{th}} = 1.07\% \pm 0.05\%$. Shot counts are 1,000–8,000 per point. Extending to $L = 10$ would tighten the band but is computationally expensive. The threshold is significantly higher than the proxy estimate of 0.5% in Section 7.
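The text does not state how the central value and uncertainty are derived from the three crossings. One reading consistent with the quoted numbers, offered here as an assumption rather than the author's stated procedure, is the mean of the three pairwise crossings with the half-spread rounded up as the uncertainty:

```latex
\bar{p}_{\mathrm{th}}
  = \frac{1.109\% + 1.069\% + 1.024\%}{3}
  \approx 1.067\%,
\qquad
\frac{p_{\max} - p_{\min}}{2}
  = \frac{1.109\% - 1.024\%}{2}
  = 0.0425\% .
```

Rounded to two decimals this gives $1.07\% \pm 0.05\%$.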
“Does the surgery preserve the full code distance, not just the Z-side?” Yes. Theorem 5 (Z-side) and Theorem 6 (X-side) give proofs general in $L$ via the layer decomposition. Computational sanity checks at $L = 4, 6$ confirm both Z and X distances equal $L$ after the merge.
“Does the current threshold simulation benchmark the full merge-split gate, or only the joint parity measurement?” Both, but with different epistemic status. Sections 7 and 9.3 characterize the ZZ- and XX-merge primitives individually as fault-tolerant joint Pauli measurements with FSS-crossing threshold estimates ($1.07\% \pm 0.05\%$ and $1.0\% \pm 0.1\%$ respectively, toric variant). Section 10 addresses the composed three-sheet Horsman CNOT separately: the logical truth table is verified deterministically at $L \in \{4, 6, 8\}$ at $p = 0$, and we ran a depolarizing-noise LER sweep with MWPM decoding (Figure 6) showing a $4.2\sigma$ distance improvement from $L = 4$ to $L = 6$ at $p = 10^{-3}$. However, the CNOT is not yet fault-tolerantly characterized: a direct diagnostic shows $d_{\mathrm{each}}$-scaling fails (at $L = 4$, $d_{\mathrm{each}} = 8$ gives higher LER than $d_{\mathrm{each}} = 4$) because the Pauli-frame correction includes single-record gauge measurements that do not benefit from code distance. The full CNOT is therefore a verified correct logical gate but not a fault-tolerantly thresholded one. The joint Pauli measurements are; the synthesis is not yet. Concrete remediation paths (FT-repeated gauge measurements, gauge-aware decoder, or transversal CNOT in a stacked architecture) are identified in Section 10.5.
“How does this compare to bivariate bicycle codes and 3D toric codes?” Not on rate: the sheet code has rate $\Theta(1/L^2)$, matching surface code, while BB codes achieve constant rate at $K=6$ with active research on inter-block protocols. The niche we claim is specifically $K=4$ planar connectivity (the envelope of standard transmon processors): the sheet code achieves cross-block joint-Pauli measurements natively at $K=4$, where the full 3D toric code at $K=6$ or $K=12$ is not surface-code-compatible. Different tools for different workloads.
“Is the planar-boundary case verified numerically?” Yes, at $L = 4, 6, 8$. The static planar code has $k = 3L$, distance $L - 1$, and CSS-valid structure. Two planar primitives are compared: naive (requires commutation with all boundary stabs, Op weight $L^2$, threshold $< 5 \times 10^{-5}$) and boundary-aware (one weight-2 boundary stab broken per surgery, Op weight $2L - 1$, threshold $0.76\% \pm 0.05\%$, Figure 5). The boundary-aware planar threshold is 30% lower than toric but in the same percentage regime, practical for Willow-class chips without wraparound couplers.
12 Code Availability
Reference implementation and raw data, packaged as a single archive: https://github.com/raghu91302/ssmtheory/blob/main/fcc_code_only.zip. Files are organized by section:
Lattice and analytical tests (Sections 2–6): sheet_code_fcc_lattice.py (lattice, stabilizers, triangles), sheet_code_gf2.py (GF(2) utilities), sheet_code_surgery.py (logical analysis), sheet_code_test_construction.py (unit tests).
Toric threshold (Section 7): sheet_code_custom_surgery.py (2-sheet surgery circuit), sheet_code_cached_surgery.py (accelerated $L = 8$ runner), sheet_code_surgery_threshold.py (sweep driver), surgery_primitive_L8.json + surgery_threshold_results.json (cached primitive and raw shot data).
Planar variant (Section 9.2): sheet_code_planar_stabilizers.py (planar lattice with rough $z$-boundary), sheet_code_planar_verify.py (verification at $L = 4, 6, 8$), sheet_code_planar_ba_surgery.py (boundary-aware ribbon primitive), planar_ba_threshold_results.json.
XX-merge (Section 9.3) and three-sheet CNOT (Section 10): sheet_code_xx_surgery_toric.py (XX-merge), xx_surgery_results.json, find_3sheet_primitives.py + find_3sheet_primitives_fast.py (CSS-compatible primitive enumerators, the latter with batch Gauss-Jordan for 30× speedup at $L = 8$), css_dual.py (symplectic-dual finder), sheet_code_cnot_full.py (three-sheet Horsman CNOT with structural Pauli-frame correction), cnot_truth_table_results.json (truth-table verification), cnot_ler_sweep.py + cnot_ler_dL_L{4,6,8}.json (CNOT LER sweeps with PyMatching, Figure 6).
Reproducibility and FSS. All sweep runners accept --seed and --dump-dir flags for reproducibility, and the --dump-dir option emits representative .stim / .dem artifacts on demand at any chosen $(L, p)$ point (the runners reproduce, e.g., the toric $L = 4, 6$ at $p = 1.0\%$ and planar $L = 4$ at $p = 0.8\%$ artifacts referenced in earlier sections). sheet_code_formal_fss_fit.py produces the toric ($p_{\mathrm{th}} = 1.134\% \pm 0.033\%$, $\nu = 1.50 \pm 0.18$) and planar ($p_{\mathrm{th}} = 0.725\% \pm 0.018\%$, $\nu = 1.38 \pm 0.12$) data-collapse plots. Running sheet_code_test_construction.py reproduces every analytical claim; the per-variant runners reproduce every numerical entry.
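For readers unfamiliar with data-collapse fits, the standard finite-size-scaling ansatz plots the logical error rate against the rescaled variable $x = (p - p_{\mathrm{th}})\,L^{1/\nu}$, so that curves for different $L$ fall onto one function near threshold. Whether `sheet_code_formal_fss_fit.py` uses exactly this form is an assumption; the sketch below only evaluates the rescaled abscissa with the toric fit values quoted above, and the function name is illustrative.

```python
def rescaled_variable(p, L, p_th, nu):
    """Finite-size-scaling abscissa x = (p - p_th) * L**(1/nu)."""
    return (p - p_th) * L ** (1.0 / nu)


# Fit values quoted in the text (as fractions, not percent).
TORIC_FIT = {"p_th": 1.134e-2, "nu": 1.50}
PLANAR_FIT = {"p_th": 0.725e-2, "nu": 1.38}

# Rescaled abscissa at p = 1.0% for the three simulated toric distances.
toric_x = [rescaled_variable(1.0e-2, L, **TORIC_FIT) for L in (4, 6, 8)]
print(toric_x)
```

At $p = 1.0\%$, which lies below the fitted toric threshold, $x$ is negative and grows in magnitude with $L$; plotting measured error rates against $x$ instead of $p$ is what produces the collapse.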
13 Conclusion
We have presented an integrated CSS code architecture on the Face-Centered Cubic lattice. The static sheet code $[[L^3, 2L, L]]$ provides efficient storage at $K=4$ active connectivity; the cross-sheet triangle surgery primitive provides fault-tolerant joint Pauli measurements between logical qubits in different sheets, also at $K=4$. The combination occupies a previously unfilled niche: $K=4$ hardware compatibility (matching surface code, deployable on Google Willow and similar chips) with inter-sheet joint Pauli measurements as a native primitive between co-located CSS code blocks, a different design point from surface code lattice surgery (confined within one substrate) and bivariate bicycle codes ($K=6$ modular inter-block operations between physically separated blocks, an area of active architectural development). Under circuit-level depolarizing noise on a custom Stim circuit with explicit triangle measurements, FSS-crossing threshold estimates at $L = 4, 6, 8$ are $p^{ZZ}_{\mathrm{th}} = 1.07\% \pm 0.05\%$ (toric) and $0.76\% \pm 0.05\%$ (planar boundary-aware) for the ZZ-merge, and $1.0\% \pm 0.1\%$ for the toric XX-merge (the planar XX-merge has a structural blocker under the standard boundary choice). The construction is verified analytically (with computational sanity checks at $L = 4, 6, 8$) for code parameters and Z/X distance preservation under the merge.
Synthesizing the ZZ- and XX-merges into the three-sheet Horsman CNOT, we verify the logical truth table at $L \in \{4, 6, 8\}$ and observe a $4.2\sigma$ distance-induced LER reduction at $p = 10^{-3}$, but a direct diagnostic shows the composed gate does not yet exhibit standard fault-tolerant $d_{\mathrm{each}}$-scaling: the Pauli-frame correction enters the observable through single-record gauge measurements. The full FT logical CNOT is therefore an identified follow-up direction with two concrete remediation routes (FT-repeated gauge measurements with a gauge-aware decoder, or transversal CNOT in a stacked architecture); the joint Pauli primitives stand on their own as the central characterized contribution of this work.
14 Declarations
Clinical trial registration, Consent to Publish, Ethics and Consent to Participate: not applicable. This study does not involve a clinical trial, human participants, human data, or animals.
Competing interests: The author declares no competing interests.
Funding: This work received no external funding.
Author contributions: R.K. is the sole author and conceived the construction, performed the mathematical analysis and computational verification, designed and implemented the simulation code, generated the figures, and wrote the manuscript.
Data and code availability: See Section 12 for the per-file index of the reference implementation and raw data files. Archive: https://github.com/raghu91302/ssmtheory/blob/main/fcc_code_only.zip.
References
[1] R. Kulkarni, arXiv:2603.20294 (2026).
[2] E. Dennis, A. Kitaev, A. Landahl, and J. Preskill, J. Math. Phys. 43, 4452 (2002).
[3] A. Yu. Kitaev, Ann. Phys. 303, 2 (2003).
[4] A. G. Fowler et al., Phys. Rev. A 86, 032324 (2012).
[5] C. Horsman, A. G. Fowler, S. Devitt, and R. Van Meter, New J. Phys. 14, 123011 (2012).
[6] D. Litinski, Quantum 3, 128 (2019).
[7] A. Eickbusch et al., Nature Phys. 21, 1994 (2025).
[8] R. Acharya et al. (Google Quantum AI), “Quantum error correction below the surface code threshold,” Nature 638, 920 (2025).
[9] R. Chao and B. W. Reichardt, Phys. Rev. Lett. 121, 050502 (2018).
[10] S. Bravyi, A. W. Cross, J. M. Gambetta, D. Maslov, P. Rall, and T. J. Yoder, Nature 627, 778 (2024).
[11] T. J. Yoder et al., “Tour de gross: A modular quantum computer based on bivariate bicycle codes,” arXiv:2506.03094 (2025).
[12] C. Gidney, Quantum 5, 497 (2021).
[13] O. Higgott, ACM Trans. Quantum Comput. 3, 1 (2022).
[14] P. Panteleev and G. Kalachev, “Degenerate quantum LDPC codes with good finite length performance,” Quantum 5, 585 (2021).
[15] C. Wang, J. Harrington, and J. Preskill, “Confinement-Higgs transition in a disordered gauge theory and the accuracy threshold for quantum memory,” Ann. Phys. 303, 31 (2003).