Three Sheets, One Architecture: Inter-Sheet Joint Pauli Measurements for $K=4$ Quantum Error Correction

Raghu Kulkarni

SSMTheory Group, IDrive Inc., Calabasas, CA 91302, USA

raghu@idrive.com

Abstract

We present a CSS quantum error correcting code on the Face-Centered Cubic (FCC) lattice that combines surface-code-compatible $K=4$ connectivity with native inter-sheet joint Pauli measurements between co-located CSS code blocks. The construction has two parts. First, restricting the $[[3L^3, 2L^3+2, 3]]$ FCC lattice code to a single triad sheet yields the sheet code with parameters $[[L^3, 2L, L]]$ at even $L$ (or $[[L(L-1)^2, L, L-1]]$ as a planar variant), uniform weight-4 stabilizers, and $K=4$ active per-qubit connectivity. Three triad sheets share the FCC lattice geometry, encoding $6L$ logical qubits at distance $L$, deployable as a monolithic 2D chip (small $L$), a three-layer stacked architecture, or native 3D hardware. Second, we introduce a fault-tolerant surgery protocol using local FCC triangle measurements to implement joint Pauli measurements between logical qubits in different sheets.

Our quantitatively characterized claims are:

• Static memory baseline. Single-sheet logical memory simulation at $L \in \{4, 6\}$ on a custom Stim circuit gives a finite-size-scaling (FSS) crossing near $1.0\%$ (Section 7.4), consistent with surface-code-like behavior under circuit-level depolarizing noise with MWPM decoding. This is reported as a baseline, not the central characterized contribution.

• Joint-$ZZ$ surgery (ZZ-merge): FSS-crossing threshold estimate $1.07\% \pm 0.05\%$ for the toric variant and $0.76\% \pm 0.05\%$ for the boundary-aware planar variant, from pairwise crossings at $L \in \{4, 6, 8\}$ (Sections 7, 9.2). This is the central characterized FT primitive.

• Joint-$XX$ surgery (XX-merge): FSS-crossing threshold estimate $\approx 1.0\% \pm 0.1\%$ on the toric variant (Section 9.3). The planar boundary-aware variant has a structural blocker for the dual primitive under the standard boundary choice.

As a synthesis claim, we verify the three-sheet Horsman [5] CNOT logical truth table at $L \in \{4, 6, 8\}$ on the toric variant ($p = 0$, deterministic on all four computational-basis inputs, DEM builds cleanly), and observe a $4.2\sigma$ distance improvement at $p = 10^{-3}$ from $L = 4$ to $L = 6$ under MWPM. However, a direct diagnostic shows the protocol as constructed is not yet fault-tolerant in the standard sense: increasing the per-merge depth $d_{\text{each}}$ does not reduce the logical error rate (LER). At $L = 4$, $d_{\text{each}} = 8$ gives higher LER than $d_{\text{each}} = 4$, because the Pauli-frame correction enters the observable through specific gauge measurements that are not themselves repeated $d$ times. Reaching FT-thresholded status for the full CNOT requires either FT-repeated gauge measurements or a gauge-aware decoder; both are concrete follow-up directions.

The architecture’s deployment niche is therefore: high-density FT memory plus FT joint-Pauli measurements at $K=4$, with the full universal Clifford layer identified as a near-term protocol refinement rather than a structural blocker. The density advantage is concrete and consistent: $\sim 16$ physical qubits per logical at $L = 4$ in a three-sheet deployment vs. $\sim 31$ for a distance-4 rotated surface code (a $\sim 1.9\times$ improvement, essentially flat in $L$). Application classes that consume only memory and joint-Pauli primitives (quantum networking nodes, large-scale logical benchmarking, NISQ-to-FT bridge demonstrations) are immediately addressable on existing or near-term $K=4$ hardware (Google Willow, IQM, OQC). Compared to surface code lattice surgery (confined within a single patch), the sheet code provides $2L$ logicals per sheet at the same distance and connectivity; compared to bivariate bicycle codes (inter-block gates between physically separated $K=6$ modules), the sheet code achieves inter-sheet operations at $K=4$ with only short-range couplings between co-located sheets.

1 Introduction

Two families of quantum error correcting codes currently dominate the discussion of near-term fault-tolerant quantum computing. The surface code [2, 4] pairs $K=4$ planar hardware compatibility with mature fault-tolerant protocols, including lattice surgery [5, 6] for logical gates within a single 2D substrate. The bivariate bicycle (BB) code family [10] achieves an order-of-magnitude rate advantage over the surface code at the cost of $K=6$ connectivity and a small number of long-range couplers. The BB family’s inter-block logical gates (gates that move logical qubits between distinct code blocks) remain an active research problem [11].

This paper introduces a third option that occupies a previously unfilled niche: a CSS code at $K=4$ connectivity (matching the surface code) with native inter-sheet joint Pauli measurements between co-located CSS code blocks, a fault-tolerant primitive that the surface code provides only within a single patch’s lattice surgery zone. This is a different design point from surface code lattice surgery (which operates within a single substrate) and from bivariate bicycle codes (which target inter-block gates between physically separated $K=6$ modules and where modular logical operation protocols are an area of active architectural development). Full FT logical CNOT composition between sheets is a synthesis of two such joint-Pauli primitives that we verify at the truth-table level but identify as not yet fault-tolerantly characterized in this work (Section 10); the primary characterized contribution is the joint-Pauli primitive itself. The construction has two interlocking ingredients.

The FCC sheet code. Restricting the $[[3L^3, 2L^3+2, 3]]$ FCC lattice code [1] to a single triad sheet (one of three orthogonal $K=4$ sublattices in FCC) eliminates the FCC code’s weight-3 vulnerability and yields a CSS code with parameters $[[L^3, 2L, L]]$ at even $L$. Each of the three sheets decomposes into $L$ parallel 2D toric codes; three sheets on a shared FCC substrate encode $6L$ logical qubits at distance $L$ on $3L^3$ data qubits.

Cross-sheet triangle surgery. Every FCC triangle has one edge in each of the three triad sheets. Weight-3 Pauli measurements on FCC triangles couple data qubits across sheets, providing a natural primitive for inter-sheet logical operations. We show that triangle products implement joint Pauli measurements between logical qubits in different sheets, with the merged code preserving distance $d = L$.

The two ingredients work in tandem: the sheet code provides efficient storage at $K=4$; the triangle surgery primitive provides fault-tolerant inter-sheet joint Pauli measurements at the same envelope. The result fills a gap that neither surface code nor bivariate bicycle codes address: $K=4$ planar connectivity with inter-sheet joint Pauli measurements as a native primitive. Composing these primitives into full FT logical gates (Clifford circuits, magic-state distillation, etc.) requires the further protocol or hardware refinements identified in Section 10.

Figure 1: Every FCC triangle has exactly one edge in each triad sheet. The three edge vectors of a triangle $\{v_a, v_b, v_c\}$ partition naturally across $S_{xy}$, $S_{xz}$, $S_{yz}$. A single weight-3 Pauli measurement on a triangle simultaneously couples data qubits across all three sheets.

Summary of results. (i) Sheet code with parameters $[[L^3, 2L, L]]$ on a torus, $[[L(L-1)^2, L, L-1]]$ on a plane, uniform weight-4 stabilizers, $K=4$ data-qubit connectivity (Section 2); (ii) three-sheet hardware architecture via time-multiplexed syndrome extraction (Section 3); (iii) triangle algebra spanning $6L-3$ of the $6L$ cross-sheet Z-logicals (Section 4); (iv) fault-tolerant surgery protocol with $K=4$ verification and $O(L)$ gate overhead (Section 5); (v) distance preservation under merge for both Z- and X-sides (Section 6); (vi) threshold simulation via custom Stim at $L = 4, 6, 8$ giving $p_{ZZ} = 1.07\% \pm 0.05\%$ (toric) and $p_{\text{planar}} = 0.76\% \pm 0.05\%$ (boundary-aware planar) (Section 7); (vii) comparison with state of the art (Section 8); (viii) CSS-dual XX-merge primitive demonstrated with comparable threshold $\approx 1.0\%$ at $L = 4, 6, 8$ on the toric variant (Section 9.3); (ix) three-sheet Horsman CNOT logical truth table verified at $L \in \{4, 6, 8\}$ but not yet fault-tolerantly characterized (Section 10).

Claim status matrix. To make the epistemic status of each claim transparent up front:

| Claim | Status | Evidence / Section |
|---|---|---|
| $[[L^3, 2L, L]]$ code parameters (toric); $[[L(L-1)^2, L, L-1]]$ (planar) | Proven | Theorem 2, §2 |
| $K=4$ active per-qubit connectivity | Verified by construction | §5, gate schedule |
| Z- and X-distance preservation under merge | Proven (general $L$) + computational checks ($L = 4, 6, 8$) | Theorems 5, 6 |
| Static memory FSS-crossing baseline $\approx 1.0\%$ (toric, single sheet) | Simulation, $L \in \{4, 6\}$, MWPM | §7.4 |
| ZZ-merge FSS-crossing threshold $1.07\% \pm 0.05\%$ (toric) / $0.76\% \pm 0.05\%$ (planar boundary-aware) | Simulation, $L \in \{4, 6, 8\}$, MWPM, FSS data-collapse fit available | §7, §9.2 |
| XX-merge FSS-crossing threshold $\approx 1.0\% \pm 0.1\%$ (toric only; planar XX has structural blocker) | Simulation, $L \in \{4, 6, 8\}$, MWPM | §9.3 |
| Three-sheet Horsman CNOT logical truth table at $p = 0$ | Verified deterministic at $L \in \{4, 6, 8\}$; $d_{\text{each}} \in \{2, 3, 4\}$ at $L = 4$ | §10 |
| CNOT distance suppression at $p = 10^{-3}$ | Observed: $4.2\sigma$ improvement, $L = 4 \to L = 6$ | §10, Fig. 6 |
| CNOT fault-tolerant threshold | Not extracted; $d_{\text{each}}$-scaling fails (non-FT) | §10, follow-up direction |
| Three-layer stacked architecture with vertical inter-sheet couplers | Specified structurally; per-coupler fidelity not modeled | §3 |
| Magic-state distillation, non-Clifford layer | Suggested by octahedral symmetry; no protocol constructed | §10.8 |

2 The FCC Sheet Code

2.1 The Triad Decomposition

The FCC lattice has $K = 12$ nearest-neighbor vectors, partitioning naturally into three orthogonal sheets of 4:

$$S_{xy}: (\pm 1, \pm 1, 0), \qquad S_{xz}: (\pm 1, 0, \pm 1), \qquad S_{yz}: (0, \pm 1, \pm 1). \tag{1}$$

Each FCC edge belongs to exactly one sheet. At lattice size $L$ (even), each sheet contains $L^3$ edges. Restricted to a single sheet, each FCC vertex has $K=4$ incident edges.
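The counting in this subsection can be checked directly on a small periodic cell. The following is a minimal sketch, assuming FCC sites are the integer points of an $L \times L \times L$ torus with even coordinate sum; the function names are illustrative and not taken from the paper's code.

```python
from itertools import product

# The 12 FCC nearest-neighbor vectors, grouped by the coordinate that stays fixed.
SHEETS = {
    "xy": [(sx, sy, 0) for sx, sy in product((1, -1), repeat=2)],
    "xz": [(sx, 0, sz) for sx, sz in product((1, -1), repeat=2)],
    "yz": [(0, sy, sz) for sy, sz in product((1, -1), repeat=2)],
}


def fcc_vertices(L):
    """FCC sites on a periodic L^3 cell (assumption: even coordinate sum)."""
    return [v for v in product(range(L), repeat=3) if sum(v) % 2 == 0]


def sheet_edges(L, sheet):
    """Undirected edges of one triad sheet, each a frozenset of two endpoints."""
    edges = set()
    for v in fcc_vertices(L):
        for d in SHEETS[sheet]:
            w = tuple((a + b) % L for a, b in zip(v, d))
            edges.add(frozenset((v, w)))
    return edges


def sheet_degrees(L, sheet):
    """Set of per-vertex degrees within one sheet."""
    edges = sheet_edges(L, sheet)
    return {sum(v in e for e in edges) for v in fcc_vertices(L)}


if __name__ == "__main__":
    L = 4
    for name in SHEETS:
        print(name, len(sheet_edges(L, name)), sheet_degrees(L, name))
```

At $L = 4$ each sheet should report $L^3 = 64$ edges and a uniform in-sheet degree of 4. The sketch is intended for even $L \ge 4$; at $L = 2$ the periodic wrap identifies distinct neighbor vectors and the counts degenerate.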

2.2 The Sheet Code Stabilizers

Definition 1 (FCC sheet code). Fix one triad sheet $S$ (say $S_{xy}$). Place one physical qubit on each edge in $S$ ($n = L^3$ qubits). The stabilizers are:

• $Z$-stabilizers: for each vertex $v$, apply $Z$ to the 4 edges of $S$ incident to $v$.

• $X$-stabilizers: for each octahedral void $o$, apply $X$ to the 4 edges of $S$ connecting the 6 vertices surrounding $o$.

Both stabilizer types have uniform weight 4. The CSS condition $H_X H_Z^T = 0$ over GF(2) is satisfied because each edge in sheet $S$ connects two vertices and participates in exactly two octahedral voids restricted to $S$; the overlap between any X-stabilizer and any Z-stabilizer is even.

2.3 Code Parameters

Theorem 1 (Sheet code parameters). At even $L$, the FCC sheet code has parameters $[[L^3, 2L, L]]$: $n = L^3$ physical qubits, $k = 2L$ logical qubits, code distance $d = L$.

The parameters follow from the layer decomposition (Section 2.4) together with standard 2D toric code counting. Computational verification:

| $L$ | $n$ | $\operatorname{rank}(H_Z)$ | $\operatorname{rank}(H_X)$ | $k$ |
|---|---|---|---|---|
| 4 | 64 | 28 | 28 | 8 |
| 6 | 216 | 102 | 102 | 12 |
| 8 | 512 | 248 | 248 | 16 |

In each case $\operatorname{rank}(H_Z) = \operatorname{rank}(H_X) = (L^3 - 2L)/2$, giving $k = L^3 - 2(L^3 - 2L)/2 = 2L$. The general result follows from Theorem 2 below.
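The rank table can be reproduced with a short script. The sketch below builds the two check matrices for sheet $S_{xy}$ as integer bitmasks and computes their GF(2) ranks. It rests on two assumptions that are not spelled out in the text: FCC vertices are the even-coordinate-sum sites of a periodic $L^3$ cell, and octahedral voids sit on the odd-coordinate-sum sites, so that the four $S_{xy}$ edges of a void form the in-plane square around it. All names are illustrative.

```python
from itertools import product

XY = [(1, 1, 0), (1, -1, 0), (-1, 1, 0), (-1, -1, 0)]   # S_xy neighbor vectors
RING = [(1, 0, 0), (0, 1, 0), (-1, 0, 0), (0, -1, 0)]   # in-plane neighbors of a void, cyclic order


def shift(v, d, L):
    """Translate site v by d with periodic boundary conditions."""
    return tuple((a + b) % L for a, b in zip(v, d))


def build_sheet_checks(L):
    """Return (n, H_Z rows, H_X rows) for S_xy; each row is an int bitmask over edges."""
    sites = list(product(range(L), repeat=3))
    vertices = [s for s in sites if sum(s) % 2 == 0]   # FCC sites
    voids = [s for s in sites if sum(s) % 2 == 1]      # assumed octahedral-void sites
    index = {}

    def edge_bit(a, b):
        return 1 << index.setdefault(frozenset((a, b)), len(index))

    hz = []
    for v in vertices:
        row = 0
        for d in XY:
            row ^= edge_bit(v, shift(v, d, L))
        hz.append(row)
    hx = []
    for o in voids:
        ring = [shift(o, d, L) for d in RING]
        row = 0
        for a, b in zip(ring, ring[1:] + ring[:1]):
            row ^= edge_bit(a, b)
        hx.append(row)
    return len(index), hz, hx


def gf2_rank(rows):
    """Rank over GF(2) of rows given as integer bitmasks (XOR-basis elimination)."""
    pivots = {}  # highest set bit -> basis row with that leading bit
    for r in rows:
        while r:
            h = r.bit_length() - 1
            if h not in pivots:
                pivots[h] = r
                break
            r ^= pivots[h]
    return len(pivots)


def sheet_parameters(L):
    """Return (n, rank H_Z, rank H_X, k) after checking the CSS condition."""
    n, hz, hx = build_sheet_checks(L)
    assert all(bin(x & z).count("1") % 2 == 0 for x in hx for z in hz)
    rz, rx = gf2_rank(hz), gf2_rank(hx)
    return n, rz, rx, n - rz - rx


if __name__ == "__main__":
    for L in (4, 6, 8):
        print(L, sheet_parameters(L))
```

Under these assumptions the script is a cross-check on $n = L^3$, $\operatorname{rank} = (L^3 - 2L)/2$ and $k = 2L$ for even $L \ge 4$; it does not compute the distance.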

2.4 Layer Decomposition and Proof of Parameters

Why the distance increases. The full FCC code has $d = 3$ because weight-3 logical operators exist at tetrahedral voids: one edge from each of the three triad sheets forms a triangle commuting with all weight-12 stabilizers. Within a single triad sheet, only one edge of any such triangle survives, giving a single-edge Pauli that anticommutes with the appropriate opposite-type sheet stabilizers (a Z-edge with the X-stabilizers of the sheet; an X-edge with the Z-stabilizers) and is therefore detected. No weight-3 logical survives the sheet restriction; the minimum-weight logical operators of the sheet code are non-contractible cycles within the sheet, of length $L$.

Layer structure. Each triad sheet decomposes further into $L$ independent layers indexed by the zero-displacement coordinate. For sheet $S_{xy}$, edges have $dz = 0$, so each $S_{xy}$ edge has a well-defined $z$-coordinate equal to the shared $z$ of its two endpoints. Edges in layer $z = z_0$ form a 2D toric code on a rotated $L \times L$ square lattice. Analogous decompositions hold for $S_{xz}$ (layered by $y$) and $S_{yz}$ (layered by $x$).

Theorem 2 (Layer decomposition). The FCC sheet code on sheet $S_{xy}$ at even $L$ is isomorphic, as a stabilizer code, to $L$ disjoint 2D toric codes, each on an $L \times L$ rotated square lattice with $L^2$ data qubits, $k = 2$ logical qubits, and distance $L$. The Z-stabilizers (resp. X-stabilizers) of the sheet code partition into $L$ disjoint sets, one per layer; within each layer, the rank deficiency equals 1 (one product redundancy among $L^2/2$ vertex stabilizers).

Proof. Edge partition. Each $S_{xy}$ edge $(v_1, v_2)$ has $z(v_1) = z(v_2)$ since the displacement vector $v_2 - v_1 \in \{(\pm 1, \pm 1, 0)\}$ has $dz = 0$. Define $\operatorname{layer}(e) = z(v_1)$. The map $\operatorname{layer} : S_{xy} \to \{0, 1, \ldots, L-1\}$ partitions the $L^3$ edges of $S_{xy}$ into $L$ disjoint sets of $L^2$ edges each.

Stabilizer partition. A vertex Z-stabilizer at vertex $v$ acts on the 4 sheet-$S_{xy}$ edges incident to $v$, all of which have the same $z$-coordinate as $v$. Hence each vertex Z-stabilizer is supported entirely within one layer. An analogous argument applies to octahedral void X-stabilizers within $S_{xy}$, since the 4 edges of an oct void restricted to $S_{xy}$ all share the same $z$-coordinate as the void center.

Layer is a 2D toric code. Within layer $z = z_0$, the $L^2$ edges connect vertices $\{(x, y, z_0) : x + y \equiv z_0 \pmod 2\}$ via the four neighbor vectors $(\pm 1, \pm 1, 0)$. This is precisely the rotated $L \times L$ square lattice, and the vertex Z-stabilizers and oct-void X-stabilizers on this layer are exactly the standard 2D toric code stabilizers. The toric code on $L \times L$ has parameters $[[L^2, 2, L]]$.

Rank count. The 2D toric code on $L^2$ data qubits has $L^2/2$ vertex Z-stabilizers, satisfying one redundancy (the product over all vertices is the identity). Hence per layer, $\operatorname{rank}(H_Z^{\text{layer}}) = L^2/2 - 1$. Across $L$ layers, $\operatorname{rank}(H_Z^{\text{sheet}}) = L \cdot (L^2/2 - 1) = (L^3 - 2L)/2$. The same argument applies to $H_X^{\text{sheet}}$.

Code parameters. $k = n - \operatorname{rank}(H_Z) - \operatorname{rank}(H_X) = L^3 - 2 \cdot (L^3 - 2L)/2 = 2L$. The minimum-weight logical operators are the non-contractible cycles of the per-layer 2D toric codes, each of length $L$. Hence $d = L$.

Consequence for the rank formula. Theorem 2 eliminates the need for a per-$L$ verification of the rank: the formula $\operatorname{rank}(H_Z) = \operatorname{rank}(H_X) = (L^3 - 2L)/2$ holds for every even $L \ge 2$.

2.5 Planar Variant

For deployment on planar quantum chips that do not support periodic boundary conditions, each layer becomes a rotated surface code $[[(L-1)^2, 1, L-1]]$ via standard boundary engineering [4]. The resulting planar sheet code has parameters

$$[[L(L-1)^2, L, L-1]] \quad (\text{planar boundaries}). \tag{2}$$

The distance drops by one due to the standard rotated-surface-code boundary truncation, and the number of encoded logical qubits halves from $2L$ to $L$.

3 Hardware Embedding: Three Sheets, Three Layers, or One Chip

The three triad sheets are edge-disjoint: $3L^3$ data qubits in total, with $L^3$ per sheet. We now address how to physically realize these qubits on hardware. This question is non-trivial because the sheet code uses $\Theta(L^3)$ data qubits to encode $\Theta(L)$ logical qubits at distance $L$, and a monolithic 2D embedding of an $L^3$-vertex 3D graph cannot maintain unit-length nearest-neighbor couplings as $L$ grows.

We discuss three deployment options, in increasing order of scalability.

3.1 Option A: Monolithic Planar Chip (Small to Moderate $L$)

For $L \le 8$ ($\lesssim 1{,}500$ data qubits total), a monolithic planar processor hosts all three sheets via time-multiplexed syndrome extraction: data qubits occupy fixed positions; per-sheet ancillas are physically distinct but co-located at FCC vertex and oct-void positions; couplers reconfigure between rounds to activate one sheet at a time ($S_{xy}$, $S_{xz}$, $S_{yz}$ in successive rounds).

Wire-length scaling: on a monolithic 2D embedding of the $\Theta(L^3)$-qubit 3D lattice, average nearest-neighbor distance scales as $\Theta(L^{1/2})$. This is straightforward at $L = 4$ (192 qubits), requires active calibration at $L = 8$ (1,536 qubits), and is impractical at $L \ge 12$ without coupler-reach upgrades.

Idle penalty: while one sheet is measured, the other two idle; with round time $t$, the full cycle is $3t$ and each data qubit idles $2t$ per cycle, which is captured in the Section 7 noise model.

3.2 Option B: Three-Layer Stacked Architecture (Recommended Deployment)

For $L \ge 8$, we recommend three planar $K=4$ processors, one per triad sheet, bonded with through-silicon vias (TSVs) or inter-layer capacitive couplers for triangle-mediated cross-sheet operations.

Each chip is a standard planar $K=4$ device. The layer hosting $S_{xy}$ carries the $L^3$ data qubits as $L$ stacked rotated $L \times L$ lattices (Theorem 2). Triangle ancillas sit between chip layers, coupled via short-range vertical couplers (TSVs) to the three data qubits of their triangle, one from each chip. Inter-layer couplers activate only during surgery operations and remain inactive otherwise. The chip-internal $K=4$ connectivity is unaffected; the inter-layer couplers add $K = 1$ per data qubit per active triangle, preserving the $K=4$ effective constraint during surgery (Section 5). Three-layer stacked QEC architectures appear in recent hardware roadmaps [7, 10]; IBM’s stacked-die approach for bivariate bicycle uses analogous couplers. The sheet code’s three-chip architecture has lower per-chip connectivity ($K=4$ vs. $K=6$) but simpler inter-chip coupling (only triangle ancillas need vertical bonds).

Vertical crosstalk. Only triangle ancillas (a minority, $O(L)$ per surgery primitive vs. $\Theta(L^3)$ data qubits per layer) carry inter-layer couplers; data qubits and per-sheet ancillas have no vertical wiring, so static memory is unaffected by inter-layer phenomena. Couplers activate via control electronics; off-state isolation via detuning is platform-dependent (superconducting TSVs reach $\sim 40$– dB [7]; neutral-atom and ion-trap platforms can achieve higher via mechanical separation). Platform-specific crosstalk budgeting is identified as future work (Section 10.8).

3.3 Option C: Native FCC Hardware

For maximally efficient embedding, a quantum hardware platform with native 3D connectivity (such as 3D-printed superconducting circuits, neutral atom arrays with 3D-addressable laser systems, or trapped ion architectures with multi-segment traps) hosts the full FCC lattice without the embedding penalty of options A or B. The sheet code runs natively on such hardware with all $K=4$ couplings at unit physical distance. This option is forward-looking; no current commercial platform offers it.

3.4 Hardware Footprint

Across all options: $3L^3$ data qubits (one per FCC edge, partitioned by sheet), $3 \times L^3/2$ vertex Z-ancillas, $3 \times L^3/2$ octahedral void X-ancillas, and $L$ transient triangle ancillas per active surgery primitive (reusable). Total: $6L^3$ recurring qubits with a $K=4$ active syndrome-extraction schedule per data qubit, plus $O(L)$ transient ancillas during surgery.
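The footprint arithmetic, and the per-logical density quoted in the abstract, can be tabulated in a few lines. This is a sketch under the counting assumed here: $L^3$ data qubits and $L^3/2$ ancillas of each type per sheet, $2L$ logical qubits per sheet, and the standard $2d^2 - 1$ qubit count for a distance-$d$ rotated surface code patch. Transient triangle ancillas are excluded, and the function names are illustrative.

```python
def three_sheet_footprint(L):
    """Recurring qubit counts for the three-sheet toric deployment at even L."""
    data = 3 * L**3
    z_ancillas = 3 * L**3 // 2
    x_ancillas = 3 * L**3 // 2
    recurring = data + z_ancillas + x_ancillas
    logicals = 3 * 2 * L
    return {
        "data": data,
        "ancillas": z_ancillas + x_ancillas,
        "recurring": recurring,
        "logicals": logicals,
        "per_logical": recurring / logicals,
    }


def rotated_surface_code_qubits(d):
    """d^2 data qubits plus d^2 - 1 measurement ancillas for one logical qubit."""
    return 2 * d * d - 1


if __name__ == "__main__":
    for L in (4, 6, 8):
        fp = three_sheet_footprint(L)
        ratio = rotated_surface_code_qubits(L) / fp["per_logical"]
        print(L, fp["recurring"], fp["per_logical"], round(ratio, 2))
```

With these counts the per-logical cost is $6L^3 / 6L = L^2$, so the ratio to the rotated surface code is $(2L^2 - 1)/L^2$, which approaches 2 as $L$ grows. The comparison is at equal $L$, following the abstract; the toric sheet code has distance $L$ there.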

3.5 Comment on the “Cross-Block” Terminology

The three triad sheets occupy the same FCC lattice geometry (time-multiplexed in option A, stacked in option B, co-located in option C), not spatially separated modules in the bivariate bicycle sense. Throughout this paper we use the phrase inter-sheet operations for any operation between logical qubits in distinct sheets implemented by triangle surgery. These decompose into two strictly distinguished categories. (1) Inter-sheet joint Pauli measurements (the ZZ-merge and XX-merge primitives of Sections 7 and 9.3) are fault-tolerant primitives with FSS-crossing threshold estimates. (2) Composite logical Clifford gates (e.g., the three-sheet Horsman CNOT obtained by sequencing two joint Pauli measurements) are constructed in Section 10 as correct logical gates at $p = 0$ but are not yet fault-tolerantly characterized; calling these “logical gates” is correct but does not imply they are FT-thresholded in this work. The sheets are logically distinct CSS code blocks (each independently encoding $2L$ logical qubits with independent stabilizer groups and decoders) but not physically separated. This intermediate regime, between surface code lattice surgery within one substrate and bivariate bicycle inter-block gates across separated modules, is the niche our construction occupies.

4 Triangle Algebra and Cross-Sheet Logicals

4.1 FCC Triangles

Lemma 1 (Triangle structure). Every triangle (3-cycle) in the FCC graph has one edge in each of the three triad sheets. At lattice size $L$, the FCC graph contains $4L^3$ triangles, and each FCC edge participates in exactly 4 triangles.

Proof. For three mutually adjacent FCC vertices $v_a, v_b, v_c$, the three edge-vectors $v_b - v_a$, $v_c - v_a$, $v_c - v_b$ must each be FCC neighbor vectors. Direct case analysis on the 12 neighbor vectors shows that any three nearest-neighbor vectors that close into a triangle (summing to zero when traversed cyclically) necessarily lie in distinct sheets. Counting: each FCC vertex is in 24 triangles and there are $L^3/2$ vertices, so the number of triangles is $24 \cdot (L^3/2)/3 = 4L^3$. Each edge appears in $4L^3 \cdot 3/(3L^3) = 4$ triangles. See Figure 1.
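The case analysis and the counts in Lemma 1 lend themselves to brute-force enumeration. The sketch below assumes the same periodic even-coordinate-sum model of the FCC lattice used for illustration elsewhere; it lists every 3-cycle, records which sheets its edges fall in, and counts how many triangles contain each edge. Names are illustrative.

```python
from collections import Counter
from itertools import combinations, product

# The 12 nearest-neighbor vectors: exactly one zero component, the other two are +-1.
NN = [d for d in product((-1, 0, 1), repeat=3) if sorted(map(abs, d)) == [0, 1, 1]]
NN_SET = set(NN)


def sheet_of(d):
    """Sheet label of a neighbor vector, named after its two nonzero axes."""
    return "yz" if d[0] == 0 else "xz" if d[1] == 0 else "xy"


def fcc_triangles(L):
    """Map each triangle (frozenset of 3 vertices) to the set of sheets of its edges."""
    def shift(v, d):
        return tuple((a + b) % L for a, b in zip(v, d))

    tris = {}
    for v in product(range(L), repeat=3):
        if sum(v) % 2:
            continue  # not an FCC site
        for a, b in combinations(NN, 2):
            diff = tuple(q - p for p, q in zip(a, b))
            if diff in NN_SET:  # v+a and v+b are themselves neighbors
                key = frozenset((v, shift(v, a), shift(v, b)))
                tris[key] = {sheet_of(a), sheet_of(b), sheet_of(diff)}
    return tris


def edge_multiplicity(tris):
    """Count, for every edge, the number of triangles that contain it."""
    count = Counter()
    for t in tris:
        for e in combinations(sorted(t), 2):
            count[frozenset(e)] += 1
    return count


if __name__ == "__main__":
    tris = fcc_triangles(4)
    mult = edge_multiplicity(tris)
    print(len(tris), {len(s) for s in tris.values()}, set(mult.values()))
```

At $L = 4$ this should report $4L^3 = 256$ triangles, three distinct sheets per triangle, and multiplicity 4 on each of the $3L^3 = 192$ edges. The enumeration is meant for $L \ge 4$, where the periodic wrap cannot create spurious 3-cycles.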

4.2 Triangle Operators

Definition 2 (Triangle operator). For an FCC triangle $T$ with edges $e_{xy} \in S_{xy}$, $e_{xz} \in S_{xz}$, $e_{yz} \in S_{yz}$, define the Z-triangle operator

$$\mathcal{Z}_T = Z_{e_{xy}} \otimes Z_{e_{xz}} \otimes Z_{e_{yz}}$$

and similarly $\mathcal{X}_T = X_{e_{xy}} \otimes X_{e_{xz}} \otimes X_{e_{yz}}$.

A single triangle Z-operator commutes with all per-sheet Z-stabilizers but anticommutes with exactly 6 per-sheet X-stabilizers (two per sheet). Products of triangles can be chosen to commute with all stabilizers.

4.3 Reachable Cross-Sheet Logicals

Theorem 3 (Cross-sheet reachability). Let $\mathcal{B} \in \mathrm{GF}(2)^{|T| \times n_{\text{edges}}}$ be the triangle-edge incidence matrix and $H_X$ the cross-sheet X-stabilizer matrix. Define the space of valid triangle products

$$\mathcal{V} = \{\, m \cdot \mathcal{B} : m \in \mathrm{GF}(2)^{|T|},\ H_X (m \cdot \mathcal{B})^T = 0 \,\}.$$

Then $\dim(\mathcal{V} \bmod \operatorname{row\,span}(H_Z)) = 6L - 3$, and every operator in this space has support on exactly two sheets.

Verification: at $L = 4$: 21 logicals (out of $6L = 24$), all 2-sheet, distributed as 7 per sheet pair. At $L = 6$: 33 logicals (out of 36), all 2-sheet, distributed as 11 per sheet pair. The missing 3 logicals are global homological cycles that no triangle product can form.

Theorem 4 (Per-sheet coverage). For each sheet $S_i$, the projection of triangle-reachable cross-sheet logicals onto the Z-logical space of $S_i$ has dimension $L$ for each partner sheet $S_j$ ($j \ne i$). The union of projections via both partners covers the full $2L$-dimensional Z-logical space of $S_i$.

Verification at $L = 4$. Each sheet pair reaches a 4-dimensional subspace ($= L$) of each sheet’s 8-dimensional ($= 2L$) Z-logical space. The two partner-pair subspaces are distinct; their union is the full 8-dimensional space.

4.4 Operational Consequence

Every Z-logical (and by CSS symmetry, every X-logical) of every sheet can participate in a triangle-mediated joint measurement with at least one partner sheet. Combined with fresh ancilla logical qubits and standard surgery protocols [5, 6], these reachability results suggest that arbitrary logical-pair interactions can in principle be mediated using ancilla logicals and logical-basis routing; a complete routing-depth construction (and the associated FT-thresholded characterization, since the composed CNOT itself is not yet FT-thresholded in this work, Section 10) is left to future work.

5 Fault-Tolerant Surgery Protocol

5.1 Ancilla Placement

Each surgery primitive uses $L$ triangles forming a localized cluster on the FCC lattice. For each triangle $T$, a measurement ancilla is placed at the centroid of $T$’s three vertex positions, coupled to its 3 data qubits via short-range couplers ($K = 3$ at the ancilla). Figure 2 illustrates the canonical $L = 4$ four-triangle primitive.

Flag-qubit protocol at small $L$. A weight-3 measurement with single-fault propagation produces data errors of weight $w \le 2$. For correctability, we require $w \le (d - 1)/2$, equivalently $d = L \ge 5$ for weight-3 measurements. For $L \ge 6$, no flag qubits are needed: the per-sheet code distance suffices for fault-tolerant triangle measurements. For $L = 4$, a flag-qubit protocol [9] catches the worst-case weight-2 propagation.

5.2 $K=4$ Connectivity Verification

Two senses of $K=4$. The connectivity claim made by this paper, here and throughout, is about active connectivity per syndrome round: each data qubit participates in at most 4 two-qubit gates per round of syndrome extraction, including during surgery. This is the constraint that matters for hardware compatibility (gate scheduling, crosstalk, parallel CNOT capacity, calibration overhead) and is what makes the architecture compatible with platforms such as Google Willow and IQM Star/Garnet whose native processors are designed around $K=4$ active connectivity. The physical coupler layout of a real chip may include additional couplers (e.g., diagonal ones used only during specific phases or never used at all in this protocol); this is a separate hardware-implementation question, and we do not claim the physical layout must literally have exactly 4 couplers per data qubit. Throughout this paper, “$K=4$” should be read as active per-qubit connectivity during any single syndrome round.

Each data qubit therefore participates in at most 4 two-qubit gates per syndrome round, with no exceptions during surgery. During surgery, some gate slots reconfigure from per-sheet ancillas to triangle ancillas. At $L = 4$ with the canonical 4-triangle primitive: the two $xy$-sheet edges shared between pairs of surgery triangles see 2 triangle-coupler gates per round (retaining 2 per-sheet coupler gates each); the other 8 data qubits use 1 triangle-coupler gate with 3 per-sheet coupler gates retained.

The $K = 4$ active envelope is preserved throughout the surgery operation.

[Figure 2 graphic. Panel title: “Surgery primitive at $L = 4$: 4 triangles spanning two sheets (weight-8 joint measurement)”. Triangle labels shown: T24, T28. Legend: triangle ancilla.]

Figure 2: Surgery primitive at $L = 4$. Four FCC triangles share two $xy$-sheet edges, which cancel in the product. The net operator has weight 8 with support on 4 edges in $S_{xz}$ and 4 edges in $S_{yz}$, implementing $Z_A \otimes Z_B$, where $A$ is a logical of sheet $xz$ and $B$ is a logical of sheet $yz$. Red stars indicate triangle ancilla positions at each triangle’s centroid.

5.3 Gate Schedule

The CNOT gates of the triangle measurements are scheduled via graph coloring on the conflict graph $G_{\text{conflict}}$ (nodes: surgery triangles; edges: shared data qubits). At $L = 4$, the 4-triangle primitive’s conflict graph requires 2 colors; with 3 CNOTs per triangle, the surgery operation completes in $3 \times 2 = 6$ time slots. For comparison, a standard per-sheet syndrome extraction round takes 6–8 time slots.
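The scheduling step can be illustrated with a small greedy coloring. The sketch below uses hypothetical qubit labels for the four triangles of the $L = 4$ primitive, chosen only so that two pairs of triangles each share one $xy$-sheet data qubit as described for Figure 2; the labels, the triangle names and the greedy heuristic are assumptions for illustration, not the paper's scheduler.

```python
from itertools import combinations

# Hypothetical data-qubit labels: each triangle has one edge per sheet, and the
# pairs (T_a, T_b) and (T_c, T_d) each share one xy-sheet edge.
TRIANGLES = {
    "T_a": {"xy_0", "xz_0", "yz_0"},
    "T_b": {"xy_0", "xz_1", "yz_1"},
    "T_c": {"xy_1", "xz_2", "yz_2"},
    "T_d": {"xy_1", "xz_3", "yz_3"},
}
CNOTS_PER_TRIANGLE = 3


def conflict_graph(triangles):
    """Adjacency sets: two triangles conflict when they share a data qubit."""
    adj = {t: set() for t in triangles}
    for s, t in combinations(triangles, 2):
        if triangles[s] & triangles[t]:
            adj[s].add(t)
            adj[t].add(s)
    return adj


def greedy_coloring(adj):
    """Largest-degree-first greedy coloring (a heuristic, not guaranteed optimal)."""
    colors = {}
    for node in sorted(adj, key=lambda t: (-len(adj[t]), t)):
        used = {colors[n] for n in adj[node] if n in colors}
        colors[node] = next(c for c in range(len(adj)) if c not in used)
    return colors


def surgery_time_slots(triangles):
    """Number of colors times CNOTs per triangle."""
    colors = greedy_coloring(conflict_graph(triangles))
    return (max(colors.values()) + 1) * CNOTS_PER_TRIANGLE


if __name__ == "__main__":
    print(greedy_coloring(conflict_graph(TRIANGLES)), surgery_time_slots(TRIANGLES))
```

For this layout the conflict graph is two disjoint edges, so two colors suffice and the slot count is $3 \times 2 = 6$. Triangles of the same color touch disjoint data qubits and can run their CNOTs in the same slots.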

5.4 Overhead Analysis

At lattice size $L$:

| Quantity | Per syndrome round (per-sheet) | Per surgery operation |
|---|---|---|
| Ancillas (3 sheets) | $3L^3$ | $L$ measurement ancillas |
| Two-qubit gates | $12L^3$ | $3L$–$5L$ |
| Time slots | 6–8 | $\approx 6$ |
| Fraction of per-round cost (at $L = 10$) | n/a | $< 0.5\%$ |

Surgery has subleading gate cost: $O(L)$ two-qubit gates for the triangle measurements compared with $O(L^3)$ per syndrome round for the per-sheet stabilizer extraction.
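The quoted fraction can be checked with one division. The sketch assumes $12L^3$ two-qubit gates per syndrome round (four per weight-4 stabilizer, $3L^3$ stabilizers over three sheets) and between $3L$ and $5L$ gates per surgery operation; these are the values read from the overhead table, and the function name is illustrative.

```python
def surgery_gate_fraction(L, gates_per_triangle):
    """Surgery two-qubit gates as a fraction of one full syndrome round."""
    round_gates = 12 * L**3                 # assumed per-round count, three sheets
    surgery_gates = gates_per_triangle * L  # L triangles in the primitive
    return surgery_gates / round_gates


if __name__ == "__main__":
    for g in (3, 5):
        print(g, f"{100 * surgery_gate_fraction(10, g):.2f}%")
```

The fraction simplifies to $g/(12L^2)$ for $g$ gates per triangle, so it falls quadratically with $L$.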

Clock-cycle impact. Triangle measurements during the merge execute in parallel with per-sheet syndrome extraction; in a time-multiplexed schedule, triangle CNOTs occupy the same time slots as per-sheet CNOTs without extending wall-clock time. The merge phase costs $L$ syndrome rounds at the same clock cycle as memory, for total surgery overhead of $3L$ rounds vs. $L$ for memory. On superconducting transmons (100–400 ns cycles [8]), $3L$ rounds at $L = 8$ amount to 2.4–9.6 μs additional wall-clock, well below typical $T_1, T_2$ ($\sim 100$ μs); on neutral-atom platforms ($\sim 1$ ms cycles) the $\sim 24$ ms total requires sustained coherence achievable in recent demonstrations. Idle errors during merge are captured by the noise model of Section 7.
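The wall-clock estimates follow from multiplying the round count by the cycle time. A minimal sketch, assuming $3L$ rounds per surgery operation and the cycle times quoted in this section (the function name is illustrative):

```python
def surgery_wall_clock_seconds(L, cycle_seconds, rounds_per_L=3):
    """Wall-clock time for rounds_per_L * L syndrome rounds at a fixed cycle time."""
    return rounds_per_L * L * cycle_seconds


if __name__ == "__main__":
    L = 8
    for label, cycle in (("transmon, 100 ns", 100e-9),
                         ("transmon, 400 ns", 400e-9),
                         ("neutral atom, 1 ms", 1e-3)):
        print(label, surgery_wall_clock_seconds(L, cycle))
```

At $L = 8$ the 24 rounds span roughly 2.4 μs to 9.6 μs on the transmon cycle times and about 24 ms at a 1 ms cycle.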

5.5 Decoder Graph

The decoder for surgery operations operates on a combined detector graph: per-sheet syndrome detectors (vertex Z, oct void X) plus triangle measurement detectors. Triangle-triangle correlations arise from shared data qubits: a $Z$ error on a shared edge flips both triangles’ outcomes simultaneously. Standard MWPM [13] applies directly to this graph; the matcher extends the per-sheet syndrome graph with cross-sheet edges induced by triangle measurements [5, 6].

5.6 Boundary Deformation: Broken Stabilizers and Their Reconstruction

A standard concern in lattice surgery is that the merge operation temporarily disrupts the per-block stabilizer structure: some stabilizers become gauge operators during the merge and must be reconstructed afterward. We characterize this disruption precisely for the FCC triangle primitive.

Lemma 2 (Broken X-stabilizers per triangle). Each individual triangle $T$ with edges $(e_{xy}, e_{xz}, e_{yz})$, one per sheet, anticommutes with exactly six per-sheet X-stabilizers: the two octahedral voids in each sheet that contain one of the triangle’s edges. The breakdown is two X-stabilizers in $S_{xy}$, two in $S_{xz}$, and two in $S_{yz}$.

Proof. A triangle Z-operator $\mathcal{Z}_T$ acts on three edges. Each edge $e \in S_i$ is contained in exactly two octahedral voids of $S_i$, since each FCC edge connects two oct-void neighbors. The X-stabilizer of such an oct void contains $e$ as one of its four support edges. Therefore $\mathcal{Z}_T$ overlaps each such X-stabilizer in exactly one edge (odd), and anticommutes with it. The six X-stabilizers (two per sheet) are distinct since they correspond to distinct oct voids.

Gauge structure during surgery. During the multi-round merge, the $L$ triangles of the surgery primitive collectively anticommute with $3L$ distinct per-sheet X-stabilizers (each broken by exactly two triangles, accounting for the $6L$ total triangle-stabilizer anticommutations of Lemma 2). These $3L$ X-stabilizer measurements become gauge bits during the merge: their outcomes are correlated with the triangle outcomes but do not constrain the merged code’s logical subspace. Verified numerically:

| $L$ | Triangles in primitive | Distinct X-stabs broken | Per-triangle broken | Net broken |
|---|---|---|---|---|
| 4 | 4 | 12 | 6 | 0 |
| 6 | 6 | 18 | 6 | 0 |

The “net broken” column counts X-stabilizers with odd total flip count across the $L$ triangles. This is zero by construction: the triangle product $\prod_T \mathcal{Z}_T = Z_A \otimes Z_B$ commutes with all X-stabilizers (Theorem 3), so each X-stabilizer is broken by an even number of triangles in the primitive.

Post-merge reconstruction. After the $d$-round merge phase, the $3L$ initially-broken X-stabilizer outcomes are reconstructed from the $L$ triangle measurement outcomes plus surviving stabilizer constraints: each broken X-stabilizer’s eigenvalue is the modulo-2 sum of (i) its pre-merge eigenvalue, (ii) the triangle measurements whose Z-operators overlap it in odd parity, and (iii) propagated Pauli corrections from the surgery protocol. This is the FCC-triangle analog of the standard rough/smooth boundary deformation in surface code lattice surgery [5]: per-sheet boundary stabilizers are temporarily opened as gauges and closed upon surgery completion. No per-sheet stabilizer is permanently modified; the affected weight-4 stabilizers are still physically measured throughout the merge, but their round-to-round detector constraints are gauge-dependent during the merge window and are therefore omitted from the DEM during that window (see Section 7.1); they are restored post-merge using the triangle-outcome records described above.

Single-fault error propagation across sheets. A single fault on a triangle ancilla mid-circuit can propagate to at most two data qubits in different sheets: for the canonical CNOT schedule $e_{xy} \to e_{xz} \to e_{yz}$, a $Z$ error on the ancilla after the $S_{xz}$ CNOT and before the $S_{yz}$ CNOT propagates to one data qubit each in $S_{xz}$ and $S_{yz}$. This weight-2 cross-sheet error triggers detectors in two distinct per-sheet syndrome graphs; the decoder graph must include edges spanning these per-sheet graphs (Section 7).

6 Distance Preservation

Theorem 5 (Merged code distance). For any even $L \geq 2$ and any cross-sheet measurement operator $\mathrm{Op} = Z_A \otimes Z_B$ implemented by a weight-$2L$ triangle product, with $Z_A$ supported in sheet $S_i$ and $Z_B$ in sheet $S_j$ ($i \neq j$), the merged code formed by adding $\mathrm{Op}$ as a Z-stabilizer has stabilizer rank increased by exactly 1 (consuming one logical qubit), and the minimum-weight logical operator of the merged code has weight $L$. Distance is preserved.

Proof. Stabilizer rank. $\mathrm{Op}$ commutes with all original X-stabilizers by Theorem 3 (it is the product of triangle Z-operators chosen to be in the joint kernel of $H_X$). Furthermore, $\mathrm{Op}$ is not in the row span of $H_Z$ since it is a non-trivial element of the Z-logical group. Therefore appending $\mathrm{Op}$ to $H_Z$ increases the rank by 1, and the merged code has $k_{\text{merged}} = k_{\text{pre}} - 1$ logical qubits.

Minimum logical weight. The merged Z-logical group is the original Z-logical group quotiented by the subgroup $\langle \mathrm{Op} \rangle$. Each non-trivial equivalence class has the form $\{g, g \oplus \mathrm{Op}\}$ for a representative $g$ in the original Z-logical group, with $g$ not equivalent to $\mathrm{Op}$ modulo the original stabilizers.

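Decompose any $g$ as $g = g_{xy} \oplus g_{xz} \oplus g_{yz}$, where $g_s$ denotes the restriction of $g$ to sheet $S_s$. Similarly, $\mathrm{Op}$ decomposes as $\mathrm{Op} = (\mathrm{Op})_i \oplus (\mathrm{Op})_j$ with $(\mathrm{Op})_i = Z_A$ of weight $L$ in $S_i$ and $(\mathrm{Op})_j = Z_B$ of weight $L$ in $S_j$. Then

$$\mathrm{wt}(g) = \mathrm{wt}(g_{xy}) + \mathrm{wt}(g_{xz}) + \mathrm{wt}(g_{yz}) \tag{3}$$

$$\mathrm{wt}(g \oplus \mathrm{Op}) = \mathrm{wt}(g_k) + \mathrm{wt}(g_i \oplus Z_A) + \mathrm{wt}(g_j \oplus Z_B) \tag{4}$$

where $k$ is the third sheet ($k \notin \{i, j\}$).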

By the Layer Decomposition Theorem (Theorem 2), each non-trivial logical $g_s$ on sheet $S_s$ has weight $\geq L$ (per-layer 2D toric code distance).

Since $g$ is non-trivial in the merged code, at least one of the following holds:

(a) $g_k \not\equiv 0 \pmod{\mathrm{stab}\, S_k}$, i.e., $g_k$ is a non-trivial logical of $S_k$. Then $\mathrm{wt}(g_k) \geq L$. Both $\mathrm{wt}(g)$ and $\mathrm{wt}(g \oplus \mathrm{Op})$ contain $\mathrm{wt}(g_k) \geq L$ as a summand, so the class minimum is $\geq L$.

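(b) $g_k$ is trivial in $S_k$ (i.e., $g_k = 0$ or a stabilizer), but $g_i$ is a non-trivial logical of $S_i$. Then $\mathrm{wt}(g_i) \geq L$, so $\mathrm{wt}(g) \geq L$. For $\mathrm{wt}(g \oplus \mathrm{Op})$, the contribution $\mathrm{wt}(g_j \oplus Z_B)$ is either $\geq L$ (if $g_j$ and $Z_B$ are in distinct logical classes, or if $g_j$ is a stabilizer, leaving the $Z_B$ contribution of weight $L$) or zero (if $g_j \equiv Z_B \pmod{\mathrm{stab}\, S_j}$). In the zero case, $g \oplus \mathrm{Op}$ has contributions only from $g_i \oplus Z_A$ (in $S_i$, weight $\geq L$, because $g_i \equiv Z_A$ would make $g$ equivalent to $\mathrm{Op}$ and hence trivial in the merged code) and $g_k$ (in $S_k$, possibly weight 0). Therefore $\mathrm{wt}(g \oplus \mathrm{Op}) \geq L$.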

(c) Symmetric to (b) with the roles of $i$ and $j$ exchanged.

(d) Multiple sheets contribute non-trivial logicals. Then both $\mathrm{wt}(g)$ and $\mathrm{wt}(g \oplus \mathrm{Op})$ inherit contributions from at least two non-trivial per-sheet logicals, each $\geq L$, so the class minimum is $\geq L$.

Achievability of $L$. The class containing $Z_A$ has representatives $\{Z_A, Z_A \oplus \mathrm{Op}\}$. Since $Z_A \oplus \mathrm{Op} = Z_A \oplus Z_A \oplus Z_B = Z_B$ modulo per-sheet stabilizers, this class equals $\{Z_A, Z_B\}$ with representatives of weight $L$ each. Therefore the minimum is achieved.

Computational verification. The proof above is independent of $L$. To rule out edge cases, we additionally verified Theorem 5 by exhaustive enumeration at small $L$. At $L = 4$, all $6L = 24$ per-sheet logical generators give class minimum exactly 4, and the same minimum holds for all $\binom{24}{2} = 276$ pair products. At $L = 6$, all 36 generators and $\binom{36}{2} = 630$ pair products give class minimum exactly 6. No combination produced a class minimum below $L$.
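The rank step of the proof of Theorem 5 (appending a non-trivial Z-logical to $H_Z$ raises the rank by one and removes one logical qubit) can be checked on a toy code. The sketch below uses a single 2D toric code as a stand-in for the merged sheets; it does not construct the FCC sheet code. The edge indexing and the helper names `gf2_rank`, `toric_checks`, and `merged_k` are assumptions of this illustration.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a list of rows encoded as integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit of the pivot row
        rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def edge(kind, x, y, L):
    """Index of the horizontal ('h') or vertical ('v') edge at (x, y)."""
    return 2 * ((y % L) * L + (x % L)) + (0 if kind == "h" else 1)

def mask(edges):
    m = 0
    for e in edges:
        m ^= 1 << e
    return m

def toric_checks(L):
    """Vertex X-checks and plaquette Z-checks of the L x L toric code."""
    hx, hz = [], []
    for x in range(L):
        for y in range(L):
            hx.append(mask([edge("h", x, y, L), edge("h", x - 1, y, L),
                            edge("v", x, y, L), edge("v", x, y - 1, L)]))
            hz.append(mask([edge("h", x, y, L), edge("h", x, y + 1, L),
                            edge("v", x, y, L), edge("v", x + 1, y, L)]))
    return hx, hz

def merged_k(L):
    """Logical-qubit count before and after adding a Z-logical as a check."""
    hx, hz = toric_checks(L)
    n = 2 * L * L
    op = mask(edge("h", x, 0, L) for x in range(L))  # row-0 Z loop
    # Op must commute with every X-check (even overlap) to be a valid check.
    assert all(bin(op & s).count("1") % 2 == 0 for s in hx)
    k_pre = n - gf2_rank(hx) - gf2_rank(hz)
    k_merged = n - gf2_rank(hx) - gf2_rank(hz + [op])
    return k_pre, k_merged

if __name__ == "__main__":
    print(merged_k(4))
```

For the toy code the logical count drops from 2 to 1, mirroring $k_{\text{merged}} = k_{\text{pre}} - 1$; the same rank computation applies to any CSS check matrices supplied as bitmasks.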

6.1 X-Distance Preservation

Theorem 5 addresses the Z-logical group of the merged code, which is the side directly modified by the cross-sheet Z-measurement $\mathrm{Op} = Z_A \otimes Z_B$. The X-side requires a separate argument: the merged X-logical group is the subgroup of the original X-logicals that commute with $\mathrm{Op}$, modulo the original X-stabilizers (which are unchanged by the surgery).

Theorem 6 (X-distance preservation). Under the same conditions as Theorem 5, the merged code’s X-distance equals $L$ (the per-sheet code distance). The merged code’s X-logical group consists of all original X-logicals that have even overlap with the support of $\mathrm{Op}$, modulo the (unchanged) X-stabilizer group.

Proof. Setup. The original code has X-stabilizer matrix $H_X$, Z-logical group $\mathcal{L}_Z$, and X-logical group $\mathcal{L}_X$. After surgery, the merged code has stabilizer matrices $H_X^{\text{merged}} = H_X$ (unchanged) and $H_Z^{\text{merged}} = H_Z \cup \{\mathrm{Op}\}$. The merged X-logical group is the normalizer of the merged stabilizer group restricted to X-type operators, modulo the merged X-stabilizers:

$$\mathcal{L}_X^{\text{merged}} = \{ g \in \mathcal{L}_X : [g, \mathrm{Op}] = 0 \} \,/\, \langle H_X \rangle.$$

The commutativity condition $[g, \mathrm{Op}] = 0$ for $g$ an X-operator and $\mathrm{Op}$ a Z-operator reduces to: $g$ has even-parity overlap with the support of $\mathrm{Op}$.

Counting. The original X-logical group has $6L$ generators ($2L$ per sheet for the toric code; $L$ per sheet for the planar variant). $\mathrm{Op} = Z_A \otimes Z_B$, where $A$ is a Z-logical of sheet $S_i$ and $B$ is a Z-logical of sheet $S_j$. The X-logical $X_A$ (the conjugate of $A$ in $S_i$) anticommutes with $Z_A$ (per-sheet anticommutation) and commutes with $Z_B$ (disjoint sheet supports), so $X_A$ anticommutes with $\mathrm{Op}$. Symmetrically, $X_B$ anticommutes with $\mathrm{Op}$. All other generators of $\mathcal{L}_X$ commute with $\mathrm{Op}$: per-sheet X-logicals in sheets $S_k$ with $k \neq i, j$ have disjoint support from $\mathrm{Op}$ and commute trivially; per-sheet X-logicals in $S_i$ or $S_j$ other than $X_A$, $X_B$ are independent of $X_A$ (resp. $X_B$), and the per-sheet anticommutation structure ensures they commute with the relevant component of $\mathrm{Op}$.

The merged X-logical group has $6L - 2$ commuting generators from the original $6L$, but the product $X_A \cdot X_B$ (sum of two anticommuting generators) also commutes with $\mathrm{Op}$ (it anticommutes with each component, so the total overlap is even). This product is a non-trivial X-logical of the merged code, representing the consumed Z-logical’s X-conjugate. Thus the merged X-logical group has $6L - 1$ generators, consistent with $k_{\text{merged}} = 6L - 1$ (one Z-logical consumed by adding $\mathrm{Op}$).

Minimum weight. The merged X-logical generators are of two types:

(a) Original per-sheet X-logicals that commute with $\mathrm{Op}$: weight $L$ each (per-sheet 2D toric distance, Theorem 2).

(b) The new generator $X_A \cdot X_B$: weight $2L$ (disjoint supports across sheets $S_i$ and $S_j$).

The minimum weight over all generators is $L$. Linear combinations of generators yield at least weight $L$ by the same layer-decomposition argument as Theorem 5: each non-trivial component on a single sheet contributes weight $\geq L$. Therefore the merged X-distance equals $L$.

Computational verification. At $L = 4, 6$, all $6L$ X-logical generators have weight $L$; exactly 2 anticommute with $\mathrm{Op}$ (confirming the $6L - 2$ count of single-generator commuters), and the minimum weight among merged X-logical generators is exactly $L$. Combined with Theorem 5, the merged code distance is $L$ for both Z- and X-side errors.
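The counting argument of Theorem 6 rests on one rule: an X-type operator commutes with a Z-type operator exactly when their supports overlap on an even number of qubits. The following minimal sketch applies that rule to two independent toric-code "sheets" used as a toy stand-in (two logicals per sheet rather than $2L$). The loop conventions and the names `z_loop`, `x_conj`, `x_other`, and `commutes` are assumptions of this illustration.

```python
def z_loop(sheet, L):
    """Z-logical: Z on the row-0 horizontal edges of one sheet."""
    return {(sheet, "h", x, 0) for x in range(L)}

def x_conj(sheet, L):
    """X-logical conjugate to z_loop: X on the column-0 horizontal edges."""
    return {(sheet, "h", 0, y) for y in range(L)}

def x_other(sheet, L):
    """The second X-logical of the sheet: X on the row-0 vertical edges."""
    return {(sheet, "v", x, 0) for x in range(L)}

def commutes(x_support, z_support):
    """An X-type and a Z-type operator commute iff their overlap is even."""
    return len(x_support & z_support) % 2 == 0

def count_commuters(L):
    op = z_loop(0, L) | z_loop(1, L)           # Op = Z_A (x) Z_B, weight 2L
    x_a, x_b = x_conj(0, L), x_conj(1, L)
    generators = [x_a, x_b, x_other(0, L), x_other(1, L)]
    n_commuting = sum(commutes(g, op) for g in generators)
    product = x_a ^ x_b                        # X_A * X_B, disjoint supports
    return len(generators), n_commuting, commutes(product, op), len(product)

if __name__ == "__main__":
    print(count_commuters(4))
```

In the toy setting exactly two single generators anticommute with Op, while their product has weight $2L$ and commutes with it, which is the same pattern as the $6L - 2$ single-generator commuters plus the new generator $X_A \cdot X_B$ described above.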

7 Threshold Simulation

We characterize the surgery operation using a custom Stim circuit [12] with explicit triangle measurements, decoded with MWPM via PyMatching [13], and report a finite-size-scaling (FSS) crossing threshold estimate from pairwise crossings at $L \in \{4, 6, 8\}$. The static memory FSS-crossing estimate is reported as a complementary baseline.

7.1 Custom Stim Circuit Construction

We construct a Stim circuit implementing the full surgery operation: two triad sheets ($S_{xz}$, $S_{yz}$ for the $L = 4$ primitive) with per-sheet vertex Z- and oct-void X-stabilizer measurements, $L$ triangle ancilla measurements during the merge phase, and auxiliary $S_{xy}$ data qubits used by the triangle measurements. The circuit runs three phases of $d$ syndrome rounds each (pre-merge, merge, post-merge); the $3L$ X-stabilizers broken during merge (Lemma 2) have their detectors skipped per the gauge structure of Section 5.6. Key statistics:

| $L$ | Total qubits | Triangles | Broken X-stabs | DEM error mechanisms |
|---|---|---|---|---|
| 4 | 324 | 4 | 12 (8 in $S_{xz} + S_{yz}$, 4 in $S_{xy}$) | 23,946 |
| 6 | 750 | 6 | 18 (12 in $S_{xz} + S_{yz}$, 6 in $S_{xy}$) | 70,434 |
| 8 | 2,568 | 8 | 24 (16 in $S_{xz} + S_{yz}$, 8 in $S_{xy}$) | 539,215 |

The DEM decomposes via Stim’s decompose_errors=True (hyperedges into graph-like edges where possible) at all three sizes; the resulting graphs are compatible with PyMatching’s MWPM decoder.

7.2 Surgery Operation Threshold

Terminology. The numerical threshold estimates throughout this section and the rest of the paper are finite-size crossing estimates extracted from pairwise crossings of logical error rates at $L \in \{4, 6, 8\}$ with MWPM decoding. We use the abbreviation FSS-crossing estimate or simply “threshold estimate” for compactness, but the reader should not read these as asymptotic ($L \to \infty$) thresholds; they are tight empirical crossings at three accessible code distances with quoted uncertainty drawn from the three pairwise crossings and a formal FSS data-collapse fit where available (Section 9.2 for the planar variant; the toric FSS fit gives $1.134\% \pm 0.033\%$, $\nu = 1.50 \pm 0.18$, statistically consistent with the pairwise-crossing estimate). Extending to $L = 10$ or $L = 12$ would tighten the band but is computationally expensive (Section 10.8).

We perform Z-basis logical memory experiments with the joint observable $Z_A \otimes Z_B$ (the cross-sheet logical that the surgery primitive measures), running $3d$ syndrome rounds (pre-merge $d$, merge $d$, post-merge $d$) followed by destructive Z-measurement. Three code distances were measured: $L = 4, 6, 8$.

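| $p$ (%) | $L = 4$ | $L = 6$ | $L = 8$ |
|---|---|---|---|
| 0.10 | $5.0 \times 10^{-3}$ | $< 2 \times 10^{-3}$ (0/500) | $< 10^{-3}$ (0/1000) |
| 0.20 | $1.7 \times 10^{-2}$ | $1.5 \times 10^{-3}$ | $< 10^{-3}$ (0/1000) |
| 0.30 | $3.9 \times 10^{-2}$ | $8.0 \times 10^{-3}$ | $< 10^{-3}$ (0/1000) |
| 0.40 | n/a | n/a | $5.5 \times 10^{-3}$ |
| 0.50 | $9.7 \times 10^{-2}$ | $3.8 \times 10^{-2}$ | $2.5 \times 10^{-2}$ |
| 0.60 | n/a | n/a | $4.3 \times 10^{-2}$ |
| 0.70 | $1.8 \times 10^{-1}$ | $1.1 \times 10^{-1}$ | $8.7 \times 10^{-2}$ |
| 0.80 | $2.3 \times 10^{-1}$ | $1.6 \times 10^{-1}$ | $1.2 \times 10^{-1}$ |
| 0.90 | $2.8 \times 10^{-1}$ | $2.2 \times 10^{-1}$ | $2.0 \times 10^{-1}$ |
| 1.00 | $3.08 \times 10^{-1}$ | $2.77 \times 10^{-1}$ | $2.71 \times 10^{-1}$ |
| 1.10 | $3.3 \times 10^{-1}$ | $3.2 \times 10^{-1}$ | $3.4 \times 10^{-1}$ |
| 1.20 | $3.6 \times 10^{-1}$ | $4.0 \times 10^{-1}$ | $4.2 \times 10^{-1}$ |
| 1.50 | $4.4 \times 10^{-1}$ | $4.6 \times 10^{-1}$ | $4.9 \times 10^{-1}$ |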

Pairwise crossings and threshold estimate. The pairwise crossings of $L = 4$ vs. $L = 6$, $L = 4$ vs. $L = 8$, and $L = 6$ vs. $L = 8$ give three independent estimates:

| Crossing | $p_{\text{cross}}$ |
|---|---|
| $L = 4$ vs. $L = 6$ | 1.109% |
| $L = 4$ vs. $L = 8$ | 1.069% |
| $L = 6$ vs. $L = 8$ | 1.024% |

The three crossings span 1.02% to 1.11%. We report a finite-size threshold estimate of $p_{\text{surgery}} = 1.07\% \pm 0.05\%$, where the band reflects the spread of pairwise crossings. Below threshold, the $d$-scaling is monotonic with $L$: at $p = 0.5\%$, the logical error rate is 9.7% at $L = 4$, 3.8% at $L = 6$, and 2.5% at $L = 8$, decreasing as expected for a fault-tolerant operation in the below-threshold regime.
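A pairwise crossing can be estimated by linearly interpolating the difference between two logical-error curves across the bracket where it changes sign. The sketch below is one simple way to do this, assuming linear interpolation between adjacent tabulated points; the paper does not state its interpolation scheme, and the rounded table entries will not reproduce the quoted crossings to three digits. The names `crossing` and `summarize` are hypothetical.

```python
def crossing(p_lo, p_hi, small_lo, small_hi, large_lo, large_hi):
    """Linear-interpolation crossing of two curves on the bracket [p_lo, p_hi].

    small_* are the smaller-L logical error rates at p_lo and p_hi;
    large_* are the larger-L rates at the same two points.
    """
    d_lo = large_lo - small_lo
    d_hi = large_hi - small_hi
    if d_lo * d_hi > 0:
        raise ValueError("curves do not cross inside this bracket")
    return p_lo + (p_hi - p_lo) * (-d_lo) / (d_hi - d_lo)

def summarize(crossings):
    """Mean and half-range of a list of pairwise crossing estimates."""
    mean = sum(crossings) / len(crossings)
    half_range = (max(crossings) - min(crossings)) / 2
    return mean, half_range

if __name__ == "__main__":
    # L = 6 vs. L = 8 between p = 1.00% and p = 1.10% (rounded table values).
    print(crossing(1.00, 1.10, 0.277, 0.32, 0.271, 0.34))
    # Crossings as quoted in the text, in percent.
    print(summarize([1.109, 1.069, 1.024]))
```

Applied to the three quoted crossings, the mean is about 1.067% with a half-range of about 0.043%, consistent with the reported $1.07\% \pm 0.05\%$ band.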

Formal FSS fit. As a second, parameterized threshold estimate, we fit the standard finite-size scaling ansatz $p_L(p) = A + B \cdot x + C \cdot x^2$ with $x = (p - p_{\text{th}}) \cdot L^{1/\nu}$, where $p_{\text{th}}$ is the fitted threshold, to the near-threshold data ($p \in [0.7\%, 1.5\%]$, 21 data points across $L = 4, 6, 8$) by least-squares with weighted residuals (shot-noise statistical errors). The fit yields

$$p_{\text{toric}} = 1.134\% \pm 0.033\%, \quad \nu = 1.50 \pm 0.18, \tag{5}$$

with $\chi^2/\mathrm{DOF} = 2.75$. This is consistent with the pairwise estimate within combined uncertainty (fit centered $\sim 0.07\%$ higher). The exponent $\nu \approx 1.5$ is in the percolation universality range expected for surface-code-like threshold transitions ($\nu_{\text{2D perc}} = 4/3$ to $3/2$ [15]). The $\chi^2/\mathrm{DOF} > 1$ indicates systematic deviations from the simple quadratic ansatz, typical of surface-code FSS at moderate $L$. The pairwise-crossing estimate is the more conservative (model-free) report; both are stated for completeness.
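The structure of the quadratic finite-size-scaling fit can be sketched with a small grid search. The code below is an unweighted variant using the rounded near-threshold table values, so it is not expected to reproduce the quoted fit parameters; the paper's fit uses shot-noise weights and unrounded data. The grid ranges and the names `quadratic_residual` and `grid_fit` are assumptions of this sketch.

```python
import numpy as np

# Near-threshold toric surgery data (rounded table values); p in percent.
P = np.array([0.70, 0.80, 0.90, 1.00, 1.10, 1.20, 1.50])
RATES = {
    4: [0.18, 0.23, 0.28, 0.308, 0.33, 0.36, 0.44],
    6: [0.11, 0.16, 0.22, 0.277, 0.32, 0.40, 0.46],
    8: [0.087, 0.12, 0.20, 0.271, 0.34, 0.42, 0.49],
}

def quadratic_residual(p_th, nu):
    """Best-fit A, B, C for fixed (p_th, nu) and the resulting squared error."""
    xs, ys = [], []
    for L, rates in RATES.items():
        xs.append((P - p_th) * L ** (1.0 / nu))  # rescaled variable x
        ys.append(np.asarray(rates))
    x = np.concatenate(xs)
    y = np.concatenate(ys)
    design = np.column_stack([np.ones_like(x), x, x * x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid), coef

def grid_fit():
    """Scan (p_th, nu) on a coarse grid and keep the smallest squared error."""
    best = None
    for p_th in np.linspace(0.90, 1.30, 81):
        for nu in np.linspace(0.80, 2.50, 69):
            sse, coef = quadratic_residual(p_th, nu)
            if best is None or sse < best[0]:
                best = (sse, float(p_th), float(nu), coef)
    return best

if __name__ == "__main__":
    sse, p_th, nu, coef = grid_fit()
    print(f"p_th ~ {p_th:.3f}%, nu ~ {nu:.2f}, SSE = {sse:.4g}")
```

For fixed $(p_{\text{th}}, \nu)$ the ansatz is linear in $A$, $B$, $C$, so only the two nonlinear parameters need to be scanned. A weighted fit would divide each row of the design matrix and each data value by that point's standard error before solving.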

Shot counts and methodology. Each row uses at least 1,000 shots; near-threshold rows ($p = 0.9\%$–$1.2\%$) use up to 8,000 shots for tighter confidence intervals. Standard error on each rate is the binomial $\sqrt{p(1-p)/N}$ and is plotted as error bars in Figure 3. This is a finite-size threshold estimate from a custom Stim circuit with explicit triangle measurements, not a proxy: the DEM is constructed via Stim’s decompose_errors=True and decoded by PyMatching’s MWPM. Each circuit is reproducibly built from the cached surgery primitive at the corresponding $L$. Below threshold, $d$-scaling is strong: at $p = 0.1\%$, $L = 4$ surgery fails at $5 \times 10^{-3}$ while $L = 6, 8$ have zero observed errors in 500–1000 shots (upper bounds $2 \times 10^{-3}$ and $10^{-3}$, respectively).
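The error bars and the zero-event bounds quoted above follow from elementary binomial statistics. A minimal sketch, assuming the $1/N$ convention that the tabulated bounds appear to use (0/500 reported as $< 2 \times 10^{-3}$); the function names are hypothetical.

```python
import math

def binomial_se(failures, shots):
    """Point estimate and binomial standard error sqrt(p(1-p)/N)."""
    p_hat = failures / shots
    return p_hat, math.sqrt(p_hat * (1.0 - p_hat) / shots)

def zero_event_bound(shots):
    """Simple 1/N upper bound reported when no failures are observed."""
    return 1.0 / shots

if __name__ == "__main__":
    p_hat, se = binomial_se(308, 1000)
    print(f"rate = {p_hat:.3f} +/- {se:.4f}")
    print(zero_event_bound(500), zero_event_bound(1000))
```

A stricter alternative for zero observed failures is the "rule of three" bound of roughly $3/N$ at 95% confidence, which would loosen the quoted bounds by a factor of three.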

7.3 Comparison with Proxy Estimate

A natural proxy is a single rotated surface code at distance $L$ run for $3d$ rounds (the same total syndrome budget), giving $p_{\text{proxy}} \approx 0.5\%$. The custom Stim circuit with finite-size scaling gives $1.07\% \pm 0.05\%$, substantially higher. The proxy is conservative because it (a) assumes the joint $Z_A \otimes Z_B$ observable suffers as much logical-error rate as $3d$ rounds of single-cycle memory, ignoring the $2L$-qubit cross-sheet support’s robustness; (b) treats a single 2D code as a stand-in for the sheet code, ignoring the layer decomposition (Theorem 2) where per-layer X-errors do not propagate; and (c) discards the triangle measurements’ role as auxiliary syndrome bits during merge.

[Figure 3 plot. Title: Surgery threshold: custom Stim circuit with explicit triangle measurements; $L = 4, 6, 8$ finite-size scaling; $3 d_{\text{each}}$ rounds total; MWPM decoding. Axes: physical error rate $p$ (%) vs. logical error rate (per surgery operation). Series: $L = 4, 6, 8$ with $d_{\text{each}} = L$. Annotation: pairwise crossings, $p_{\text{surgery}} = 1.07\% \pm 0.05\%$.]

Figure 3: Surgery threshold from the custom Stim circuit with explicit triangle measurements at $L = 4, 6, 8$. Each data point uses the full surgery protocol ($d_{\text{each}} = L$ rounds of pre-merge, merge with triangle measurements, and post-merge), MWPM decoded on the decomposed DEM. The three pairwise crossings ($L = 4$ vs. $L = 6$, $L = 4$ vs. $L = 8$, $L = 6$ vs. $L = 8$) occur at $p = 1.109\%, 1.069\%, 1.024\%$ (gray band), giving $p_{\text{surgery}} = 1.07\% \pm 0.05\%$.

7.4 Static Memory Threshold (Single-Sheet Baseline)

For comparison with the surgery threshold, we also measure the single-sheet static memory threshold using a custom circuit (no triangle measurements). We run a Z-basis memory experiment with $d$ syndrome rounds on the sheet $S_{xy}$ at $L = 4$ and $L = 6$:

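| $p$ (%) | $L = 4$, $d = 4$ | $L = 6$, $d = 6$ |
|---|---|---|
| 0.10 | $8.0 \times 10^{-3}$ | $6.0 \times 10^{-4}$ |
| 0.30 | $2.6 \times 10^{-2}$ | $4.4 \times 10^{-3}$ |
| 0.50 | $5.3 \times 10^{-2}$ | $1.9 \times 10^{-2}$ |
| 0.80 | $9.0 \times 10^{-2}$ | $5.4 \times 10^{-2}$ |
| 1.20 | $1.6 \times 10^{-1}$ | $1.5 \times 10^{-1}$ |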

The single-sheet static memory FSS-crossing estimate from this custom circuit at $L \in \{4, 6\}$ is $p_{\text{static}} \approx 1.0\%$, statistically consistent with the surgery FSS-crossing estimate $1.07\% \pm 0.05\%$ and with the surface-code-literature threshold ($\sim 1\%$ at MWPM under circuit-level depolarizing noise).

7.5 Decoder: What’s Verified and What Remains Open

The custom Stim circuit of Section 7.1 produces a decomposable detector error model (23,946 error mechanisms at $L = 4$ surgery). Verified: (a) the DEM can be decomposed into an MWPM-compatible graphlike model via decompose_errors=True (a 2-edge decomposition of the cross-sheet hyperedges; this addresses the principal Section 5.6 concern that surgery introduces hyperedges, but the decomposition itself can be lossy, and optimal hypergraph or BP+OSD decoding remains open); (b) MWPM via PyMatching gives the threshold estimate $1.07\% \pm 0.05\%$ at $L = 4, 6, 8$; (c) the broken-X-stabilizer detector handling produces a deterministic detector pattern when combined with triangle outcomes. Open: (i) whether BP+OSD [14] extracts a higher FSS-crossing estimate from the same DEM (the hyperedge decomposition is lossy); (ii) larger-$L$ behavior ($L \geq 10$, computationally expensive); (iii) a full three-sheet rather than two-sheet circuit (the $S_{xy}$ sheet supplies only auxiliary triangle data qubits in our circuit; we expect the estimate to be unchanged); (iv) decoders specific to the FCC sheet code’s octahedral symmetry. The custom Stim circuit confirms fault-tolerant operation under MWPM at FSS-crossing estimates in the same hardware-compatible regime as the surface code.

8 Comparison with State of the Art

8.1 Practical Position

What the sheet code is not. A constant-rate qLDPC code. Its rate $k/n = 2/L^2$ vanishes as $L \to \infty$, the same asymptotic scaling as the surface code. Bivariate bicycle codes and recent “good” qLDPC codes achieve constant or growing rates at higher connectivity ($K = 6$ or higher with long-range couplers). The sheet code does not solve the asymptotic rate problem.

What the sheet code does solve. Native cross-block fault-tolerant joint Pauli measurements at $K = 4$ active connectivity, with FSS-crossing threshold estimates comparable to single-block surface-code memory. Surface code lattice surgery provides these primitives only within a single substrate; bivariate bicycle codes target inter-block operations at $K = 6$, with active research on the inter-block protocols. The sheet code offers a third option: inter-sheet joint Pauli measurements between logically distinct but co-located CSS code blocks, native at $K = 4$ via local triangle measurements. The trade is explicit: surface-code-level rate in exchange for surface-code-level connectivity with native inter-sheet joint Pauli measurements. Composing these primitives into full FT logical Clifford gates (e.g., a fault-tolerantly thresholded CNOT) requires the additional protocol refinements identified in Section 10 and is not established by this work.

8.2 Quantitative Comparison Table

Figure 4 visualizes the code-family landscape, and Table 1 gives the underlying parameters.

[Figure 4 plot. Title: Logical density and inter-block primitive availability across CSS code families. Vertical axis: logical qubits at code distance 8. Bars, each annotated with $K$, $n$, and the inter-block primitive (operation between distinct logical code blocks): rotated surface ($K = 4$, $n = 64$, lattice surgery within patch); 2D toric ($K = 4$ with wraps, $n = 128$, lattice surgery); Gross BB, $d = 12$ ($K = 6$, $n = 144$, active development); sheet, planar, 1 sheet ($K = 4$, $n = 392$, FT joint-Pauli via triangle surgery); sheet, toric, 1 sheet ($K = 4$ with wraps, $n = 512$, FT joint-Pauli via triangle surgery); sheet, toric, 3 sheets, MUX ($K = 4$ with wraps, $n = 1536$, FT joint-Pauli via triangle surgery).]

Figure 4: Logical qubit count at code distance $\approx 8$ across CSS code families, with inter-block primitive availability annotated as the bottom label of each bar. “Inter-block primitive” denotes the operation available between distinct logical code blocks: lattice surgery (within patch) for surface code (within a single patch’s lattice surgery zone); FT joint-Pauli measurements via triangle surgery for the sheet code (FSS-crossing threshold estimate $\approx 1\%$); active development for bivariate bicycle codes. The sheet-code three-sheet deployment (rightmost) gives $6L = 48$ logicals at $d = 8$ on $3L^3 = 1536$ data qubits; the single-sheet variants give $L$ (planar) or $2L$ (toric) logicals.

8.3 Trade-offs and Practical Niche

Where the sheet code wins. (i) Logical density at $K = 4$ with inter-sheet joint Pauli measurements: the toric variant encodes $2L$ logicals per sheet (twice the rate of an equivalent count of $K = 4$ rotated surface code patches), and triangle surgery makes all $6L$ logicals across three sheets mutually addressable for joint Pauli measurements, a feature the surface code only provides within a single patch’s lattice surgery zone. (ii) Single-chip $K = 4$ deployment with three sheets: the three sheets sit on the same chip with reconfigurable couplers handling time-multiplexing; the bivariate bicycle architecture explicitly requires $K = 6$ with some long-range couplers [11]. (iii) Inter-sheet joint Pauli measurements as a native primitive: the FCC lattice’s intrinsic 3D triangle structure gives FT-thresholded inter-sheet joint Pauli measurements where surface code patches would need expensive routing or transversal protocols. (iv) Code-distance scaling: unlike the full $[[3L^3, 2L^3 + 2, 3]]$ FCC code’s fixed $d = 3$, the sheet code achieves growing $d = L$, and compared to the 3D toric code’s $d = L$ at fixed $k = 3$, the sheet code reaches $k = 2L$ per sheet at the same distance.

Where the sheet code loses. (i) Asymptotic rate vs. qLDPC: rate $\Theta(1/d^2)$ vanishing as $d \to \infty$, the same as surface code; bivariate bicycle achieves $\Theta(1)$ rate at $d = \Theta(\sqrt{n})$; recent “good” qLDPC codes at $d = \Theta(n)$. At $d = 12$ the Gross BB code’s rate ($1/12$) is approximately $12\times$ higher than the sheet code’s at the same distance. (ii) Toric variant needs wrap couplers; the planar variant has half the rate ($L$ logicals per sheet). (iii) Partial logical reachability via triangles: products span $6L - 3$ of the $6L$ cross-sheet logicals (three global-wrap correlations are unreachable); the reachability results suggest that joint-Pauli measurements between any logical pair can be mediated via ancilla logicals, but a complete routing protocol with depth and FT analysis is left to future work. (iv) Threshold simulation uses MWPM on a decomposed DEM; BP+OSD may extract further improvement.

Table 1: Comparison of CSS codes at code distance $d$ (or $L$). Rate is $k/n_{\text{data}}$. “Wraps” indicates periodic boundary couplers required. “Cross-block primitive” denotes the available primitive for operations between independent code blocks: lattice surgery within a patch for surface code; FT joint-Pauli measurements via triangle surgery for the sheet code; active development for bivariate bicycle. The Gross BB code uses fixed $d = 12$, $k = 12$, $n = 144$.

| Code | $n_{\text{data}}$ | $k$ | $d$ | Rate | $K$ | Cross-block primitive | Threshold |
|---|---|---|---|---|---|---|---|
| Rotated surface [4] | $d^2$ | 1 | $d$ | $1/d^2$ | 4 | Lattice surgery | $\sim 0.7$–$1.0\%$ |
| 2D toric [3] | $2d^2$ | 2 | $d$ | $1/d^2$ | 4 (wraps) | Lattice surgery | $\sim 0.7\%$ |
| 3D toric [2] | $3L^3$ | 3 | $L$ | $\sim 1/L^3$ | 6 | Limited | n/a |
| Gross BB [10] | 144 | 12 | 12 | $1/12$ | 6 | Active development [11] | $\sim 0.7\%$ |
| Two-gross BB [10] | 288 | 12 | 18 | $1/24$ | 6 | Active development | n/a |
| Full FCC [1] | $3L^3$ | $\sim 2L^3$ | 3 | $\sim 2/3$ | 12 | n/a | n/a |
| Sheet code (planar, 1 sheet) | $L(L-1)^2$ | $L$ | $L-1$ | $\sim 1/L^2$ | 4 | Triangle surgery | $0.76\%$ |
| Sheet code (toric, 1 sheet) | $L^3$ | $2L$ | $L$ | $2/L^2$ | 4 (wraps) | Triangle surgery | $\approx 1.0\%$ |
| Sheet code (toric, 3 sheets, MUX) | $3L^3$ | $6L$ | $L$ | $2/L^2$ | 4 (wraps) | Triangle surgery | $1.07\% \pm 0.05\%$ |
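The rate comparison with the Gross code can be made concrete with exact fractions. The sketch below assumes per-sheet block sizes $n = L^3$ (toric) and $n = L(L-1)^2$ (planar); these are inferred from the data-qubit counts quoted for Figure 4 ($n = 512$ and $n = 392$ at $L = 8$) rather than taken from an explicit formula, and the function names are hypothetical.

```python
from fractions import Fraction

def toric_sheet(L):
    """Assumed single-sheet toric parameters (n, k, d) = (L^3, 2L, L)."""
    return (L ** 3, 2 * L, L)

def planar_sheet(L):
    """Assumed single-sheet planar parameters (L(L-1)^2, L, L-1)."""
    return (L * (L - 1) ** 2, L, L - 1)

GROSS = (144, 12, 12)  # Gross BB code [[144, 12, 12]]

def rate(params):
    n, k, _ = params
    return Fraction(k, n)

if __name__ == "__main__":
    # Consistency with the block sizes quoted for Figure 4 at L = 8.
    print(toric_sheet(8)[0], planar_sheet(8)[0])
    # Distance-12 comparison: toric needs L = 12, planar needs L = 13.
    print(rate(GROSS) / rate(toric_sheet(12)))
    print(rate(GROSS) / rate(planar_sheet(13)))
```

Under these assumptions the Gross code's rate at $d = 12$ is 12 times that of the planar sheet ($L = 13$) and 6 times that of the toric sheet ($L = 12$), so the quoted factor of roughly 12 corresponds to the planar variant.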

Practical niche. Hardware with $K = 4$ active connectivity (Google Willow [8] and future surface-code-compatible deployments), code distances $d \in \{6, 8, 10\}$ where bivariate bicycle’s $d \geq 12$ overhead is not yet justified, and workloads requiring frequent inter-sheet joint Pauli measurements where surface code patches would force routing or transversal protocols across separate substrates. For longer-term larger-scale fault tolerance ($d \geq 12$) on $K = 6$ hardware, bivariate bicycle codes remain preferable.

9 Discussion

9.1 Universal Gate Set

Joint Pauli measurements are sufficient in principle to synthesize arbitrary Clifford operations through standard lattice-surgery protocols with ancilla logical qubits [5, 6]. In this work, we verify the three-sheet Horsman CNOT logical truth table (Section 10) at $L \in \{4, 6, 8\}$, but find that the current composed protocol is not yet fault-tolerantly thresholded (its $d_{\text{each}}$-scaling fails because the Pauli-frame correction enters the observable through single-record gauge measurements). Establishing FT-thresholded Clifford composition (and by extension a fault-tolerantly characterized universal Clifford set) therefore requires the gauge-measurement refinement or transversal route discussed in Section 10.5, not provided in this paper. Non-Clifford gates (such as $T$) further require magic state injection or distillation; we expect the FCC lattice’s high symmetry group (octahedral, order 48) to support efficient magic state protocols, but a detailed analysis is left to future work.

9.2 Planar-Boundary Surgery

For real hardware deployments without periodic boundary conditions, each layer becomes a rotated surface code with rough/smooth boundaries: rough on $\pm x$, smooth on $\pm y$ for $S_{xy}$; rough on $\pm z$, smooth on $\pm x$ for $S_{xz}$; rough on $\pm z$, smooth on $\pm y$ for $S_{yz}$. The rough-on-$z$ choice for $S_{xz}$, $S_{yz}$ aligns the rough boundary with the direction in which the triangle primitive’s chain extends (the primitive cancels $xy$-sheet edges and extends through $L$ layers along $z$), enabling a non-trivial cross-sheet logical to terminate on rough boundaries.

Verified parameters. The planar code satisfies $H_X H_Z^T = 0$; the minimum-weight cross-sheet surgery primitive at $L = 4, 6, 8$ is:

| $L$ | Data qubits $n$ | Logicals $k$ | Distance | Triangles in primitive (Op weight) |
|---|---|---|---|---|
| 4 | 108 | 12 | 3 | 9 (21) |
| 6 | 450 | 18 | 5 | 25 (55) |
| 8 | 1,176 | 24 | 7 | 49 (105) |

The planar per-sheet logical count is $L$ (one per layer), so the three-sheet code encodes $k = 3L$ vs. $6L$ in the toric construction. The planar distance scales as $L - 1$ (standard rotated surface code finite-size effect). The minimum-weight naive surgery primitive uses $(L-1)^2$ triangles producing an Op of weight $\sim L^2$ (a thick block rather than the toric ribbon), forced by the additional boundary-stabilizer commutation constraints.

Measured planar threshold. Following the same custom Stim methodology as the toric case, we run planar surgery logical-memory experiments with $3d$ rounds of syndrome extraction. The naive $(L-1)^2$-triangle primitive (commuting with all boundary stabs) yields no $d$-scaling crossing across $p \in [5 \times 10^{-5}, 5 \times 10^{-3}]$ with shot counts up to 30,000 (logical error rate strictly increasing in $L$ at every measured $p$, implying threshold below $5 \times 10^{-5}$, dominated by the $L^2$-weight Op being too heavy to protect).

Boundary-aware planar primitive. Allowing the primitive’s Z-operator to anticommute with one weight-2 boundary X-stab, treating it as an additional broken stabilizer during the merge phase in direct analogy to the toric protocol’s bulk-broken X-stabs (Lemma 2), dramatically reduces the primitive’s Op weight:

| $L$ | Triangles | Op weight | Boundary stabs broken | Total qubits | Total broken X | DEM mech. |
|---|---|---|---|---|---|---|
| 4 | 3 | 7 | 1 | 175 | 6 | 11,688 |
| 6 | 5 | 11 | 1 | 757 | 10 | 41,183 |
| 8 | 7 | 15 | 1 | 1,987 | 14 | 137,904 |

The boundary-aware primitive is a ribbon structure with Op weight $2L - 1$ and triangle count $L - 1$, closely analogous to the toric primitive (Op weight $2L$, triangle count $L$). It breaks exactly one additional weight-2 boundary X-stabilizer per surgery operation, handled by the same detector-suppression mechanism used for bulk-broken X-stabs.
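The closed forms quoted for the planar primitives can be checked directly against the tabulated values. In the sketch below, the boundary-aware forms (triangle count $L - 1$, Op weight $2L - 1$, total broken X-stabilizers $2(L-1)$) are those stated in the text and comparison table, while the naive Op-weight pattern $(L-1)(2L-1)$ is only an observation that happens to match the three tabulated points, not a formula given in the paper. The dictionary and function names are hypothetical.

```python
# L -> (triangles, Op weight) for the naive planar primitive.
TABLE_NAIVE = {4: (9, 21), 6: (25, 55), 8: (49, 105)}

# L -> (triangles, Op weight, total broken X) for the boundary-aware primitive.
TABLE_AWARE = {4: (3, 7, 6), 6: (5, 11, 10), 8: (7, 15, 14)}

def check_counts():
    """Compare tabulated primitive sizes with simple closed forms."""
    for L, (tri, wt) in TABLE_NAIVE.items():
        assert tri == (L - 1) ** 2
        # Observed pattern only; the text gives just the ~L^2 scaling.
        assert wt == (L - 1) * (2 * L - 1)
    for L, (tri, wt, broken) in TABLE_AWARE.items():
        assert tri == L - 1
        assert wt == 2 * L - 1
        assert broken == 2 * (L - 1)
    return True

if __name__ == "__main__":
    print(check_counts())
```

The comparison makes the size gap explicit: at $L = 8$ the naive primitive has Op weight 105 against 15 for the boundary-aware ribbon.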

Logical error rates per surgery operation under MWPM decoding with the boundary-aware primitive:

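| $p$ (%) | $L = 4$ | $L = 6$ | $L = 8$ |
|---|---|---|---|
| 0.10 | $9.0 \times 10^{-3}$ | $7.5 \times 10^{-3}$ | $7.5 \times 10^{-3}$ |
| 0.20 | $1.9 \times 10^{-2}$ | $1.3 \times 10^{-2}$ | $1.2 \times 10^{-2}$ |
| 0.30 | $3.7 \times 10^{-2}$ | $2.9 \times 10^{-2}$ | $2.0 \times 10^{-2}$ |
| 0.50 | $8.3 \times 10^{-2}$ | $7.3 \times 10^{-2}$ | $5.8 \times 10^{-2}$ |
| 0.70 | $1.5 \times 10^{-1}$ | $1.4 \times 10^{-1}$ | $1.3 \times 10^{-1}$ |
| 0.80 | $1.7 \times 10^{-1}$ | $1.9 \times 10^{-1}$ | $1.9 \times 10^{-1}$ |
| 0.90 | $2.0 \times 10^{-1}$ | $2.3 \times 10^{-1}$ | $2.4 \times 10^{-1}$ |
| 1.00 | $2.4 \times 10^{-1}$ | $2.9 \times 10^{-1}$ | $3.1 \times 10^{-1}$ |
| 1.10 | $2.6 \times 10^{-1}$ | $3.3 \times 10^{-1}$ | $3.7 \times 10^{-1}$ |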

Pairwise crossings and threshold estimate.

| Crossing | $p_{\text{cross}}$ |
|---|---|
| $L = 4$ vs. $L = 6$ | 0.725% |
| $L = 4$ vs. $L = 8$ | 0.756% |
| $L = 6$ vs. $L = 8$ | 0.812% |

The three crossings span 0.72% to 0.82%, giving $p_{\text{planar}} = 0.76\% \pm 0.05\%$ for the boundary-aware planar variant. This is $\sim 30\%$ lower than toric ($1.07\%$), reflecting the cost of boundary-stabilizer breaking, but in the same percentage-level regime, practical for near-term 2D superconducting hardware. For reference, Google’s distance-7 Willow demonstration reports a two-qubit CZ error rate of $\approx 0.36\%$ [8], so a planar sheet-code memory at $L \geq 4$ would operate at $\sim 2\times$ below this FSS-crossing estimate; the toric variant at $\sim 3\times$ below.

Formal FSS fit (planar). Applying the same FSS ansatz to the boundary-aware planar data ($p \in [0.4\%, 1.2\%]$, 21 data points across $L = 4, 6, 8$) gives

$$p_{\text{planar}} = 0.725\% \pm 0.018\%, \quad \nu = 1.38 \pm 0.12, \tag{6}$$

with $\chi^2/\mathrm{DOF} = 2.12$. The fit-based threshold coincides with the lowest pairwise crossing ($L = 4$ vs. $L = 6$, 0.725%), reflecting the additional weight that low-$L$ data carries in a quadratic fit. The exponent $\nu \approx 1.4$ matches the toric value within uncertainty, as expected for a CSS-symmetric construction.

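Comparison summary.

| Variant | Triangles | Op weight | Distinct broken X | DEM mech. ($L = 8$) | $p_{\text{surgery}}$ |
|---|---|---|---|---|---|
| Toric | $L$ | $2L$ | $3L$ | 539,215 | $1.07\% \pm 0.05\%$ |
| Planar (naive) | $(L-1)^2$ | $\sim L^2$ | $\sim 2L^2$ | 325,423 | $< 5 \times 10^{-5}$ |
| Planar (boundary-aware) | $L-1$ | $2L-1$ | $2(L-1)$ | 137,904 | $0.76\% \pm 0.05\%$ |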

[Figure 5 plot. Title: Boundary-aware planar-boundary surgery threshold, $L = 4, 6, 8$; ribbon primitive (Op weight $2L - 1$) recovers toric-like behavior. Axes: physical error rate $p$ (%) vs. logical error rate (per surgery operation). Series: $L = 4, 6, 8$ with $d_{\text{each}} = L$. Annotation: pairwise crossings, $p_{\text{planar}} = 0.76\% \pm 0.05\%$.]

Figure 5: Planar-boundary surgery threshold with boundary-aware ribbon primitive at $L = 4, 6, 8$. The boundary-aware variant allows exactly one weight-2 boundary X-stabilizer to be broken during merge, in direct analogy to how the toric protocol breaks bulk X-stabilizers (Lemma 2). The three pairwise crossings ($L = 4$ vs. $L = 6$, $L = 4$ vs. $L = 8$, $L = 6$ vs. $L = 8$) occur at $p = 0.725\%, 0.756\%, 0.812\%$ respectively (gray band), giving $p_{\text{planar}} = 0.76\% \pm 0.05\%$.

The boundary-aware planar variant achieves toric-comparable thresholds with strictly planar (no-wraparound) hardware, deployable on Google Willow-class platforms without toric topology. The $\sim 30\%$ threshold reduction relative to toric is the cost of handling boundary stabilizers; further optimization (e.g., distributing broken stabs across both blocks of a multi-block surgery) is future work.

9.3 XX-Merge: the CSS-Symmetric Surgery Primitive

The triangle primitive that implements the ZZ-merge (Section 7) has a CSS dual: replace the triangle ancilla protocol with the X-basis version (ancilla reset to $|+\rangle$, CNOT direction reversed with ancilla as control and data as target, ancilla measured in X) to obtain an XX-merge primitive measuring the joint $X$-product of triangle data qubits. The constraint becomes: triangle combinations whose X-product commutes with all $Z$-stabilizers and is not in the row span of $H_X$.

Primitive identification (toric). Searching the kernel of $H_Z \cdot B^T \pmod 2$ at minimum weight, after skipping trivial X-stabilizer combinations:

| $L$ | ZZ-primitive (tri, wt) | XX-primitive (tri, wt) | Broken stabs | Notes |
|---|---|---|---|---|
| 4 | 4, 8 | 4, 8 | 12 Z-stabs | exact CSS mirror |
| 6 | 6, 12 | 6, 12 | 18 Z-stabs | exact CSS mirror |
| 8 | 8, 16 | 10, 20 | 28 Z-stabs | 3-sheet, weight 25% heavier than ZZ |

𝐿 = 4, 6

the XX-primitive structurally mirrors the ZZ-primitive exactly: same triangle

count, same operator weight, and the same

3𝐿

distinct broken Z-stabilizers per surgery

(the CSS dual of the ZZ-primitive’s broken X-stabs, Lemma 2). At

𝐿 = 8

, minimum-

weight search returns a 2-sheet artifact (weight 16, sheets

{𝑥𝑦, 𝑥𝑧}

only); the rst 3-sheet

primitive appears at higher weight (10 triangles, weight 20), breaking

Z-stabs.
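The two conditions on a candidate XX-primitive (commute with every $Z$-stabilizer, and lie outside the row span of $H_X$) are both GF(2) linear-algebra checks. The sketch below illustrates them on a toy $[[4,2,2]]$ code rather than on the FCC matrices; the function names and the bitmask encoding are assumptions made for illustration and are not taken from the reference implementation.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a list of integer bitmasks (one bit per qubit)."""
    pivots = {}  # highest set bit -> reduced row with that leading bit
    for row in rows:
        while row:
            high = row.bit_length() - 1
            if high not in pivots:
                pivots[high] = row
                break
            row ^= pivots[high]
    return len(pivots)


def classify_x_operator(candidate, h_x_rows, h_z_rows):
    """Classify an X-type operator given as a support bitmask."""
    # An X-type and a Z-type Pauli commute iff their supports overlap evenly.
    for z_row in h_z_rows:
        if bin(candidate & z_row).count("1") % 2:
            return "anticommutes with a Z-stabilizer"
    # Adding the candidate to H_X leaves the rank unchanged iff it is in the span.
    if gf2_rank(h_x_rows + [candidate]) == gf2_rank(h_x_rows):
        return "trivial (in row span of H_X)"
    return "valid candidate"


# Toy [[4,2,2]] code: one X-stabilizer XXXX and one Z-stabilizer ZZZZ.
H_X = [0b1111]
H_Z = [0b1111]

for name, support in [("XXII", 0b0011), ("XXXX", 0b1111), ("XIII", 0b0001)]:
    print(name, "->", classify_x_operator(support, H_X, H_Z))
```

In a real search the candidates would be XORs of triangle supports, and the minimum-weight filter would be applied after the two tests above.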

Threshold measurement. We use a custom Stim circuit with $3d$ syndrome rounds, $|+\rangle^{\otimes n}$ initialization, and final X-basis readout. Z-stab detectors are paired from round 1 onward, except the $3L$ broken ones during the merge phase (28 at $L = 8$); X-stab detectors are deterministic in round 0 and paired throughout. Logical error rates per surgery operation:

$p$ (%)    $L = 4$                   $L = 6$                   $L = 8$
0.4        $5.60 \times 10^{-2}$     $1.90 \times 10^{-2}$     $1.60 \times 10^{-2}$
0.6        $1.25 \times 10^{-1}$     $6.98 \times 10^{-2}$     $9.27 \times 10^{-2}$
0.8        $2.12 \times 10^{-1}$     $1.71 \times 10^{-1}$     $2.23 \times 10^{-1}$
0.9        $2.53 \times 10^{-1}$     $2.33 \times 10^{-1}$     $3.08 \times 10^{-1}$
1.0        $2.87 \times 10^{-1}$     $2.97 \times 10^{-1}$     $3.92 \times 10^{-1}$
1.1        $3.15 \times 10^{-1}$     $3.49 \times 10^{-1}$     $4.30 \times 10^{-1}$

Pairwise crossings: $L = 4$ vs. $L = 6$ at $p \approx 0.97\%$ (the cleanest CSS-symmetric comparison; both primitives are exact mirrors of the ZZ-side), $L = 4$ vs. $L = 8$ at $p \approx 0.77\%$, and $L = 6$ vs. $L = 8$ at $p \approx 0.46\%$. The lower $L = 6$ vs. $L = 8$ crossing is attributable to the $L = 8$ XX-primitive’s 25% heavier operator weight (20, versus 16 for the ZZ-primitive and 12 at $L = 6$); a tighter $L = 8$ analog primitive matching the Z-side’s weight 16 is open computational work. Taking the $L = 4$ vs. $L = 6$ crossing as the cleanest estimate, the FSS-crossing threshold estimate is $p_{XX} \approx 1.0\% \pm 0.1\%$, statistically consistent with the ZZ-merge estimate $1.07\% \pm 0.05\%$ within combined uncertainty. The CSS-symmetric surgery primitive result therefore holds on the toric variant. On the planar boundary-aware variant the dual XX-primitive is structurally blocked (see below), so the “CSS-symmetric” framing should be read as toric-only.
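As a rough cross-check of the pairwise crossings, the tabulated logical error rates can be interpolated directly. The sketch below is one possible scheme and not necessarily the one used to produce the quoted numbers: it assumes linear interpolation in $p$ of the log-ratio of two curves, and it works from the rounded table entries only. The function name `pairwise_crossing` is hypothetical.

```python
import math

# Rounded XX-merge logical error rates from the table (p in percent).
P_PERCENT = [0.4, 0.6, 0.8, 0.9, 1.0, 1.1]
LER = {
    4: [5.60e-2, 1.25e-1, 2.12e-1, 2.53e-1, 2.87e-1, 3.15e-1],
    6: [1.90e-2, 6.98e-2, 1.71e-1, 2.33e-1, 2.97e-1, 3.49e-1],
    8: [1.60e-2, 9.27e-2, 2.23e-1, 3.08e-1, 3.92e-1, 4.30e-1],
}


def pairwise_crossing(small_l, large_l):
    """Estimate the p (in %) where the two LER curves cross.

    The gap log(LER_small / LER_large) is positive while the larger code is
    better; the crossing is where it first changes sign, located by linear
    interpolation between the two bracketing grid points.
    """
    gaps = [math.log(a / b) for a, b in zip(LER[small_l], LER[large_l])]
    for i in range(len(gaps) - 1):
        if gaps[i] > 0 >= gaps[i + 1]:
            frac = gaps[i] / (gaps[i] - gaps[i + 1])
            return P_PERCENT[i] + frac * (P_PERCENT[i + 1] - P_PERCENT[i])
    return None  # no sign change inside the sampled range


crossings = {pair: pairwise_crossing(*pair) for pair in [(4, 6), (4, 8), (6, 8)]}
for (a, b), p_cross in crossings.items():
    print(f"L={a} vs L={b}: crossing near p = {p_cross:.2f}%")
```

With the rounded entries this lands close to the quoted crossings. Small differences, most visible for the $L = 6$ vs. $L = 8$ pair where the grid is coarse, are expected because the unrounded data and the original interpolation are not reproduced here.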

Multi-input ZZ truth table. As an additional test of the ZZ-primitive, we run surgery on all four computational-basis inputs $|c\rangle_C \otimes |a\rangle_A$ ($C, A$ are the merged per-sheet Z-logicals, weight-$L$ strings in $S_{xz}, S_{yz}$). Logical inputs are prepared by applying the matched X-logicals $X_C, X_A$ to $|0\rangle^{\otimes n}$. At $p = 0$, all four inputs yield deterministic outcomes matching $ZZ = c \oplus a$ (100 shots each); at $p = 0.005$, the failure rate is input-independent to four significant figures (5,000 shots each, all giving 34.4% raw flip rate). The primitive acts correctly as $Z_C \otimes Z_A$ on all classical inputs with input-symmetric noise.

Planar XX-merge: structural blocker. The planar boundary-aware variant does not admit an analogous XX-primitive under the standard rough-$z$/smooth-$xy$ boundary choice. Weight-2 Z-boundary stabilizers live only on the rough $z$-boundaries; X-strings run perpendicular to $z$, so an XX-primitive that breaks exactly one boundary Z-stab cannot extend as a cross-sheet ribbon. Searching at $L = 4, 6, 8$ for planar XX-primitives that break one boundary Z-stab yields zero candidates. The CSS-symmetric XX-merge result therefore applies only to the toric variant. CSS-dual planar surgery requires either a symmetric-boundary planar variant (rough boundaries distributed across all three axes) or implementing the protocol on the toric variant.

The three-sheet Horsman CNOT, constructed by sequencing the ZZ- and XX-merge primitives, is discussed in detail in Section 10 (split into its own section to make its epistemic status explicit: verified logical truth table, not yet fault-tolerantly characterized).

10 Three-Sheet Horsman CNOT: Verified Truth Table, Not Yet Fault-Tolerantly Characterized

This section deliberately stands apart from the joint-Pauli-measurement results of Sections 7 and 9.3 because its claim hierarchy is different. The ZZ- and XX-merges are characterized as fault-tolerant primitives with finite-size crossing estimates of their thresholds. The three-sheet Horsman CNOT, constructed below by composing the two merges with a CSS-paired ancilla sheet, is a correct logical gate but not, as constructed here, a fault-tolerantly characterized one. We make both claims explicitly and separately to avoid the natural conflation of “CNOT verified” with “CNOT FT-thresholded”; only the former holds in this work.

10.1 Construction: Three Sheets, Two Merges, One Logical CNOT

The standard three-qubit Horsman [5] CNOT places the control $C$, ancilla $A$ (initialized in $|+\rangle_L$), and target $T$ in three patches connected by sequential ZZ- and XX-merges. We assign $C$, $A$, $T$ to three distinct sheets $S_C, S_A, S_T$ of one FCC sheet code: the ZZ-primitive (Section 7) spans $(S_C, S_A)$ and the XX-primitive (Section 9.3) spans $(S_A, S_T)$, so the two share exactly sheet $S_A$. A search across valid primitives confirms that such a three-sheet decomposition exists at every $L \in \{4, 6, 8\}$ tested, with the ZZ- and XX-primitives sharing zero triangles and exactly one data qubit (the CSS-pairing anchor of $A$’s Z- and X-logicals, as required by the symplectic structure). The default lowest-weight primitive per sheet pair gives higher overlap as an artifact of greedy primitive selection; choosing CSS-compatible primitives (the Z-primitive’s $Z_A$ component anticommuting at exactly one qubit with the X-primitive’s $X_A$ component) gives the clean three-sheet decomposition.
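The CSS-compatibility filter described above amounts to two set tests per candidate pair. The sketch below illustrates the selection on made-up data; the dictionary layout, the field names (`triangles`, `sheet_a_support`), and the primitive labels are all hypothetical and are not the data structures of `find_3sheet_primitives.py`.

```python
from itertools import product


def is_css_compatible(z_prim, x_prim):
    """True if the pair shares no triangle and overlaps on exactly one
    data qubit of the ancilla sheet S_A (odd overlap, so Z_A and X_A
    anticommute there and pair up as conjugate logicals)."""
    shared_triangles = z_prim["triangles"] & x_prim["triangles"]
    shared_qubits = z_prim["sheet_a_support"] & x_prim["sheet_a_support"]
    return not shared_triangles and len(shared_qubits) == 1


def compatible_pairs(z_prims, x_prims):
    """All (ZZ-primitive, XX-primitive) name pairs that pass the filter."""
    return [
        (z["name"], x["name"])
        for z, x in product(z_prims, x_prims)
        if is_css_compatible(z, x)
    ]


# Made-up candidates: triangle ids and S_A data-qubit ids are illustrative.
z_candidates = [
    {"name": "Z0", "triangles": {0, 1}, "sheet_a_support": {10, 11}},
    {"name": "Z1", "triangles": {2, 3}, "sheet_a_support": {12, 13}},
]
x_candidates = [
    {"name": "X0", "triangles": {4, 5}, "sheet_a_support": {11, 14}},
    {"name": "X1", "triangles": {1, 6}, "sheet_a_support": {10, 15}},  # shares triangle 1 with Z0
    {"name": "X2", "triangles": {7, 8}, "sheet_a_support": {12, 13}},  # overlaps Z1 on two qubits
]

pairs = compatible_pairs(z_candidates, x_candidates)
print(pairs)
```

A greedy lowest-weight choice made independently for each sheet pair would skip this joint test, which is the source of the higher overlap mentioned in the text.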

10.2 Pauli-Frame Correction via GF(2) Solve

We implement the full merge-split-merge protocol with synchronous syndrome extraction throughout, intermediate Pauli-frame tracking via Stim’s OBSERVABLE_INCLUDE bookkeeping, and the destructive readout in the Z-basis. Two subtleties arise that the bare Heisenberg picture (which treats the XX-merge as a single non-local measurement of $X_A \otimes X_T$) misses.

First, the protocol with computational-basis inputs and X-basis ancilla measurement leaves $T$ in an X-eigenstate, not a Z-eigenstate; measuring $A$ in the Z-basis instead and tracking the joint stabilizer $Z_C Z_A Z_T = (-1)^{m_{ZZ} + t}$ recovers the Z-basis CNOT output, with the predicted parity $(Z_T^{\text{out}})_{\text{par}} \oplus (Z_A^{\text{out}})_{\text{par}} \oplus m_{ZZ}$ equal to $c \oplus t$.

Second, the XX-merge is realized as multiple individual X-triangle measurements; the cumulative anticommutation count of $Z_C Z_A Z_T$ with these triangles is even but not necessarily zero, so a gauge bit equal to the XOR of an even subset of triangle outcomes leaks into the observable. The leaking bit is captured by the first post-XX-merge measurement of any non-A-touching broken Z-stab whose anticommutation pattern with the X-triangles matches the relevant subset; we identify these structurally by solving a linear system over GF(2). Including the corresponding Z-stab record(s) in the observable cancels the gauge bit.
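The GF(2) solve mentioned above can be pictured as a subset-XOR problem: each broken Z-stab has an anticommutation pattern with the X-triangles, and the task is to find a set of stabs whose patterns XOR to the pattern of the leaking gauge bit. The sketch below is a generic XOR-basis solver on toy bitmasks. The function name `solve_gf2_subset` and the example patterns are assumptions for illustration, not the routine in `sheet_code_cnot_full.py`.

```python
def solve_gf2_subset(patterns, target):
    """Return indices of `patterns` whose XOR equals `target`, or None.

    Each pattern is an integer bitmask: bit j is set if the stabilizer
    anticommutes with X-triangle j. Gaussian elimination over GF(2) is done
    with an XOR basis that also tracks which inputs built each basis vector.
    """
    basis = {}  # leading bit -> (reduced pattern, bitmask of contributing indices)
    for idx, pat in enumerate(patterns):
        combo = 1 << idx
        while pat:
            high = pat.bit_length() - 1
            if high not in basis:
                basis[high] = (pat, combo)
                break
            b_pat, b_combo = basis[high]
            pat ^= b_pat
            combo ^= b_combo

    chosen = 0
    while target:
        high = target.bit_length() - 1
        if high not in basis:
            return None  # target is outside the span: no correction exists
        b_pat, b_combo = basis[high]
        target ^= b_pat
        chosen ^= b_combo
    return [i for i in range(len(patterns)) if (chosen >> i) & 1]


# Toy example with four X-triangles and three candidate Z-stab records.
stab_patterns = [0b0011, 0b0110, 0b1100]
correction = solve_gf2_subset(stab_patterns, 0b0101)
print("records to include in the observable:", correction)
print("zero target needs no correction:", solve_gf2_subset(stab_patterns, 0))
```

An empty solution corresponds to the case where the target anticommutation pattern is identically zero and no Z-stab record needs to be added to the observable.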

10.3 Logical Truth Table Verified at $L \in \{4, 6, 8\}$

Stim’s deterministic-observable check passes for all four computational-basis inputs at $L = 4$, $d_{\text{each}} = 2$: the detector error model builds cleanly at $p = 0$, and raw sampling at $p = 0$ gives $(c, c \oplus t)$ deterministically for every input, the canonical CNOT truth table. The protocol generalizes: the same algorithm verifies the CNOT truth table at $L = 6$ (where the decomposition picked by the batch primitive-finder happens to require no Pauli-frame correction, because the target anticommutation pattern is identically zero) and $L = 8$ (with one correction Z-stab), and at $L = 4$ with $d_{\text{each}} \in \{2, 3, 4\}$. The GF(2) linear system for the gauge-bit correction has a solution at every distance and depth we tested.

The verified circuit at $L = 4$, $d_{\text{each}} = 2$ uses 392 physical qubits (192 data qubits plus Z-triangle merge ancillas, X-triangle merge ancillas, and bulk stabilizer ancillas); $L = 6$ uses 1308 qubits (648 data qubits plus Z-tri and X-tri merge ancillas); $L = 8$ uses 3098 qubits (1536 data + 14 Z-tri + 12 X-tri merge ancillas).

10.4 Behavior Under Depolarizing Noise: Distance Suppression in a Narrow Regime

We sweep the CNOT logical error rate (per CNOT, defined as either observable deviating from its noise-free expected value) using MWPM decoding via PyMatching on the combined detector graph, at $d_{\text{each}} = L$. The $L = 4$ vs. $L = 6$ comparison shows clear distance suppression in the operationally relevant regime $p \in [10^{-3}, 3 \times 10^{-3}]$: at $p = 10^{-3}$, $\text{LER}_{L=4} = 19.8\% \pm 1.4\%$ versus $\text{LER}_{L=6} = 11.0\% \pm 1.6\%$, a $4.2\sigma$ improvement attributable to distance. At $p = 3 \times 10^{-3}$ the order reverses ($40.0\% \pm 1.7\%$ vs. $44.0\% \pm 2.5\%$), placing the pairwise crossing near $p \approx 2$–$3 \times 10^{-3}$.
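The quoted uncertainties and the $4.2\sigma$ figure are consistent with plain binomial statistics. The sketch below assumes that the error bars are binomial standard errors and that the $L = 4$ and $L = 6$ points use 800 and 400 shots respectively (the shot counts given in the Figure 6 caption). Both are assumptions about how the numbers were produced, not statements from the text.

```python
import math


def binomial_se(rate, shots):
    """Standard error of an observed failure fraction from `shots` trials."""
    return math.sqrt(rate * (1.0 - rate) / shots)


# Values at p = 1e-3; shot counts are the assumed 800 (L=4) and 400 (L=6).
ler_l4, shots_l4 = 0.198, 800
ler_l6, shots_l6 = 0.110, 400

se_l4 = binomial_se(ler_l4, shots_l4)
se_l6 = binomial_se(ler_l6, shots_l6)

# Significance of the difference, combining the two errors in quadrature.
z_score = (ler_l4 - ler_l6) / math.hypot(se_l4, se_l6)

print(f"SE(L=4) = {100 * se_l4:.1f}%  SE(L=6) = {100 * se_l6:.1f}%")
print(f"difference = {z_score:.1f} sigma")
```

The same formula applied to the $p = 3 \times 10^{-3}$ points gives error bars matching the quoted $\pm 1.7\%$ and $\pm 2.5\%$.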

10.5 Non-Fault-Tolerant Scaling: Direct Diagnostic and Cause

The protocol as constructed is not fault-tolerant in the standard sense. A direct diagnostic shows that increasing $d_{\text{each}}$ does not reduce the LER at fixed $L$: at $L = 4$, $d_{\text{each}} = 8$ gives higher LER than $d_{\text{each}} = 4$ across the swept range (e.g., 26.0% vs. 19.8% at $p = 10^{-3}$, and 43.8% vs. 29.4% at $p = 2 \times 10^{-3}$). Below $p \approx 5 \times 10^{-4}$ the LER curves are no longer cleanly ordered by code distance $L$ either. The structural cause is that the Pauli-frame correction includes one or two specific gauge measurements (the broken Z-stabs identified by the GF(2) solve above) that enter the observable as single records. They are not repeated $d_{\text{each}}$ times in a way the decoder can majority-vote, so their noise contribution scales with measurement count rather than benefiting from the underlying code distance.
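To see why a single-record gauge bit sets a floor that code distance cannot lower, it helps to compare it with a repeated readout in the simplest possible model. The sketch below is a toy calculation only: it assumes independent readout flips with probability `q` and a gauge bit that does not change between repeats. It is not the fault-tolerant repetition scheme the text calls for, which would also need syndrome-difference detectors.

```python
from math import comb


def majority_vote_error(q, repeats):
    """Probability that the majority of `repeats` independent readings,
    each wrong with probability q, gives the wrong bit (odd `repeats`)."""
    need = repeats // 2 + 1
    return sum(
        comb(repeats, k) * q**k * (1 - q) ** (repeats - k)
        for k in range(need, repeats + 1)
    )


q = 0.01  # illustrative readout error, not a value from the paper
single = majority_vote_error(q, 1)
for repeats in (1, 3, 5, 7):
    print(f"{repeats} reading(s): error = {majority_vote_error(q, repeats):.2e}")
```

In this toy model the single record fails at rate `q` regardless of $L$ or $d_{\text{each}}$, while each additional pair of repeats suppresses the error by roughly another factor of `q`.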

Path to FT-thresholded status. Repeating these gauge measurements $d_{\text{each}}$ times in a fashion that is itself fault-tolerant (so the decoder can detect and correct measurement errors on the gauge bits via standard syndrome-difference detectors) is the next protocol-level step. An alternative is a decoder that incorporates the Pauli-frame structure explicitly (e.g., a hypergraph or BP+OSD decoder aware of which detectors share the gauge-bit error mechanism). Both are concrete follow-up directions but are outside the scope of this work; we therefore characterize the CNOT only as a verified logical gate, not as a fault-tolerantly thresholded gate.

A further possible route, applicable when the three-sheet stacked architecture (Section 3) is realized with vertical inter-sheet couplers, is a transversal inter-sheet CNOT: the rotational sheet isomorphism that relates two sheets at the lattice level can be realized as a physical qubit correspondence that is also a stabilizer-preserving CSS isomorphism between the two sheet codes. This is plausible (the lattice automorphism induces a permutation of edges that maps each sheet’s vertex-Z and oct-void-X stabilizers to the corresponding stabilizers in the partner sheet), but we do not give the explicit stabilizer-level construction or threshold characterization here; both the stabilizer-mapping lemma and the resulting LER characterization are left to future work. The transversal route, if substantiated, would inherit the memory threshold and avoid gauge bookkeeping entirely.

10.6 Summary

The three-sheet Horsman CNOT on the FCC sheet code is a verified correct logical gate ($p = 0$ truth table at $L \in \{4, 6, 8\}$, $d_{\text{each}} \in \{2, 3, 4\}$) but not, as constructed here, a fault-tolerantly characterized one. Distance suppression is observed in the narrow regime $p \sim 10^{-3}$; under standard FT scaling diagnostics ($d_{\text{each}}$-scaling and low-$p$ distance ordering), the protocol fails to qualify as fault-tolerant. The cause is structural (single-record gauge measurements in the observable), and the remediation paths (FT-repeated gauge measurements, a gauge-aware decoder, or a transversal CNOT in a stacked architecture) are identified.

10.7 Hardware Practicality and Deployment Niche

Scope. What this paper characterizes quantitatively falls into three tiers. Tier 1 (FT primitives with FSS-crossing threshold estimates, hardware-relevant): the ZZ-merge ($1.07\% \pm 0.05\%$ toric, $0.76\% \pm 0.05\%$ planar boundary-aware) and the XX-merge ($\sim 1.0\% \pm 0.1\%$, toric only); the static single-sheet memory baseline near 1.0% (Section 7.4) is reported for context. Tier 2 (correct logical gate, not yet FT-characterized): the three-sheet Horsman CNOT, with the truth table verified at $L \in \{4, 6, 8\}$ and $4.2\sigma$ distance suppression at $p = 10^{-3}$, but increasing $d_{\text{each}}$ does not reduce the LER (Section 10.5), so there is no clean asymptotic threshold; reaching Tier 1 requires FT-repeated gauge measurements or a gauge-aware decoder. Tier 3 (architectural): stacked three-layer deployment with vertical inter-sheet couplers, and magic-state protocols suggested by FCC’s octahedral symmetry, both specified structurally only.

[Figure 6 plot. Title: three-sheet Horsman CNOT under depolarizing noise (MWPM decoding, $d_{\text{each}} = L$). Axes: logical error rate per CNOT (either observable flipped) versus physical error rate (%). Curves: $L = 4$, $d = 4$; $L = 6$, $d = 6$; $L = 8$, $d = 8$; the $L = 4$ vs. $L = 6$ crossing is marked.]

Figure 6: Three-sheet Horsman CNOT logical error rate (either of the two observables flipped from its noise-free expected value) under circuit-level depolarizing noise at $d_{\text{each}} = L$, decoded with MWPM (PyMatching) on the combined detector graph. $L = 4$ and $L = 6$ data use 800 / 400 shots per point across the four computational-basis inputs; $L = 8$ uses 100 shots per point. At the operating regime $p \sim 10^{-3}$ the $L = 4 \to L = 6$ improvement is $4.2\sigma$; the pairwise crossing falls near $p \approx 2$–$3 \times 10^{-3}$, roughly 3–5× below the individual-merge thresholds ($p_{ZZ} \approx 1.07\%$, $p_{XX} \approx 1.0\%$). Below $p \approx 5 \times 10^{-4}$ the curves are no longer monotonically ordered by distance, attributable to the Pauli-frame correction Z-stab(s) being measured once (not repeated $d$ times), so their noise contribution does not benefit from code distance. This is the diagnostic that the CNOT is not yet fault-tolerant in the standard sense, and the figure should not be read as a CNOT-threshold extraction.

Hardware compatibility. The single-sheet variant requires $K = 4$ active nearest-neighbor planar connectivity (Section 5), identical to the surface code. This matches Google Willow (105-qubit $K=4$ grid [8]), IQM Star and Garnet ($K=4$ square grids), and OQC Toshiko ($K=4$ coaxmon lattice). It does not match IBM heavy-hex ($K=3$) or Rigetti octagonal ($K=3$) topologies without SWAP padding. Trapped-ion (Quantinuum H2) and neutral-atom (QuEra Aquila, up to $\sim 256$ logically reconfigurable qubits) platforms emulate any topology by reconfiguration, subject to current qubit-count limits; for these the $K = 4$ envelope constraint is not binding but the qubit count is.

Concrete near-term experiments. A planar $L = 4$ single-sheet memory experiment ($\sim 70$ physical qubits) is within the scale of contemporary $K=4$ superconducting processors. Google’s 105-qubit Willow processor reports a CZ error rate $\approx 0.36\%$ [8], $\sim 2\times$ below the 0.76% planar boundary-aware surgery FSS-crossing estimate. Two-sheet toric surgery at $L = 4$ requires a few-hundred-qubit $K=4$ device ($\sim 390$ physical qubits) and is therefore a next-generation demonstration target. The full three-sheet Horsman CNOT at $L = 4$ (392 qubits) likewise requires a larger $K=4$ device; even when the qubit count becomes available, the gate remains Tier 2 until the FT-gauge refinement (Section 10.5) is in place. Platform-specific layout, calibration, and crosstalk analysis remain future work.

Density value proposition. The Tier-1 capabilities define a specific deployment niche: higher logical density than the surface code without leaving $K = 4$ planar connectivity. A three-sheet toric deployment at $L = 4$ uses 384 physical qubits (192 data qubits plus vertex-Z and oct-void-X ancillas) for $6L = 24$ logicals, i.e. $\sim 16$ physical qubits per logical, versus $2d^2 - 1 = 31$ for the distance-4 rotated surface code, a $\sim 1.9\times$ improvement. At $L = 6$ the comparison is 36/logical vs 71/logical; at $L = 8$, 64/logical vs 127/logical. The ratio is essentially flat in $L$ ($\sim 1.9$–$2.0\times$). Single-sheet deployments (when only $2L$ logicals are needed) give a more modest $\sim 1.3\times$ advantage. Bivariate bicycle codes achieve higher density at $K = 6$ (gross code: 12 logicals in 144 qubits at $d = 12$), so the sheet code’s specific niche is the $K = 4$ slot: higher density than the surface code without the connectivity upgrade BB requires.
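The per-logical qubit counts above follow from two closed forms. The sketch below assumes the three-sheet toric deployment uses $6L^3$ physical qubits in total ($3L^3$ data plus an equal number of stabilizer ancillas, which matches the 384 quoted at $L = 4$) for $6L$ logicals, and compares it with $2d^2 - 1$ qubits per logical for the rotated surface code. The function names are illustrative.

```python
def sheet_qubits_per_logical(l):
    """Three-sheet toric deployment: total physical qubits per logical qubit.

    Assumption: 3*L**3 data qubits plus 3*L**3 stabilizer ancillas,
    encoding 6*L logical qubits.
    """
    total_physical = 6 * l**3
    logicals = 6 * l
    return total_physical / logicals


def rotated_surface_qubits_per_logical(d):
    """Rotated surface code: d*d data qubits plus d*d - 1 ancillas per logical."""
    return 2 * d * d - 1


for size in (4, 6, 8):
    sheet = sheet_qubits_per_logical(size)
    surface = rotated_surface_qubits_per_logical(size)
    print(f"L = d = {size}: {sheet:.0f} vs {surface} per logical, "
          f"ratio {surface / sheet:.2f}x")
```

Under these assumptions the sheet-code cost is $L^2$ per logical, so the ratio $(2L^2 - 1)/L^2$ approaches 2 from below as $L$ grows.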

What density buys without FT inter-logical CNOTs. The Tier-1 capabilities (memory + joint Pauli measurements) suffice for several memory-heavy deployment classes that don’t require Clifford composition between logicals: quantum networking nodes (a sheet-code repeater holds $\sim 2\times$ more simultaneous entangled-pair memories per cryostat than a surface-code equivalent, and joint $ZZ + XX$ measurements implement Bell measurements for entanglement swapping), benchmarking and characterization throughput (proportional speedup of logical RB, process tomography, memory-lifetime sweeps), NISQ-to-FT bridge experiments (a 1,000-qubit $K = 4$ chip supports $\sim 60$ sheet-code logicals vs $\sim 32$ surface-code logicals at $L = d = 4$), and control-electronics amortization ($\sim 2\times$ lower fixed overhead per logical qubit-hour). Density does not, on its own, enable algorithms requiring composable inter-logical FT Cliffords (variational chemistry on entangled multi-logicals, Shor/Grover at the logical layer, magic-state distillation between distillation patches); those wait on the Tier-2 CNOT reaching Tier-1 status via the gauge-fix refinement or the transversal-in-stacked-variant route.

Deployment summary. As of today this work establishes high-density FT memory plus FT joint-Pauli primitives, immediately demonstrable at small scale on existing $K = 4$ hardware and scaling to mid-size testbeds and quantum networking nodes on next-generation $K = 4$ chips. This is a coherent and quantifiable hardware-efficiency win for memory-heavy applications rather than the universal-FT-computer endpoint, with a defined path to full Clifford composition through identified protocol or hardware refinements.

10.8 Limitations and Open Problems



Higher distances ($L \geq 10$). The current threshold is from finite-size scaling at $L = 4, 6, 8$. Extension to $L = 10$ would tighten the band but requires substantially more compute time per point ($\sim 6{,}000$ qubits at $L = 10$, extrapolating from $L = 8$’s $\sim 3{,}100$). Cached-primitive infrastructure (Section 7.1) extends to $L = 10$.



CNOT fault-tolerance refinement. The three-sheet Horsman CNOT truth table is verified at $L \in \{4, 6, 8\}$, with $d_{\text{each}} \in \{2, 3, 4\}$ tested at $L = 4$ (Section 10). Distance suppression at $p = 10^{-3}$ is $4.2\sigma$ from $L = 4$ to $L = 6$, with the pairwise crossing near $p \approx 2$–$3 \times 10^{-3}$. However, a direct diagnostic test shows the protocol is not yet fault-tolerant in the standard sense: at $L = 4$, $d_{\text{each}} = 8$ gives higher LER than $d_{\text{each}} = 4$, and below $p \approx 5 \times 10^{-4}$ the LER curves are not cleanly ordered by code distance. The root cause is structural: the Pauli-frame correction includes specific gauge measurements (one or two broken Z-stabs per merge) that enter the observable as single records, not repeated $d$ times. Lifting these gauge measurements to FT-repeated form (or building a gauge-aware decoder) is a concrete protocol-level refinement that would restore standard FT scaling and enable a clean threshold extraction. Until that refinement is in place, the CNOT should be regarded as a verified correct logical gate but not as a fault-tolerantly characterized one.



Planar threshold further optimization. The boundary-aware planar variant achieves $0.76\% \pm 0.05\%$, $\sim 30\%$ below toric. Closing this gap is plausible via distributing broken boundary stabs across both blocks of a multi-block surgery, boundary-friendly primitives that exploit weight-2 stabs more aggressively, or routing through bulk ancillas to avoid boundary contact.



Decoder improvements.

Whether BP+OSD [14] or a hypergraph decoder yields

higher threshold than the decomposed-edge MWPM used here is open; specialized

decoders matched to the surgery’s gauge structure may improve performance.



Magic state distillation, three-layer stacked modeling, and BB comparison. The octahedral symmetry group (order 48) suggests efficient magic state protocols, but a detailed construction is open. Physical characterization of inter-layer TSV couplers (fidelity, crosstalk, latency) is treated only structurally here. Side-by-side comparison with BB-code modular gates ($K=6$, active architectural development [11]) awaits further development of BB inter-block protocols.



Logical Clifford gates beyond CNOT. Transversal $S$ and Hadamard, and similar gates, require lattice surgery in conjugate bases or other protocols; not all are spelled out.

11 Common Objections and Responses

“Isn’t a single triad sheet just $L$ stacked 2D toric codes? Where is the novelty?” Yes, by Theorem 2, the per-sheet static memory is exactly $L$ parallel 2D toric codes. The novelty is not the per-sheet static memory but the cross-sheet primitive: weight-3 FCC triangle measurements that couple data qubits in three distinct sheets simultaneously, implementing a joint Pauli measurement between logical qubits in different sheets while preserving $K=4$ active connectivity. This primitive does not exist in independent 2D toric codes.

“Is the 1.07% figure a real threshold?” It is a finite-size threshold estimate from logical error rates at $L = 4, 6, 8$ on the full custom Stim circuit. The three pairwise crossings ($L = 4$ vs. $L = 6$: 1.109%; $L = 4$ vs. $L = 8$: 1.069%; $L = 6$ vs. $L = 8$: 1.024%) give $p_{\text{surgery}} = 1.07\% \pm 0.05\%$. Shot counts are 1,000–8,000 per point. Extending to $L = 10$ would tighten the band but is computationally expensive. The threshold is significantly higher than the proxy estimate of 0.5% in Section 7.
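The central values of the two ZZ-merge estimates can be recovered from the three pairwise crossings quoted for each variant. The sketch below takes the mean as the central value and the sample standard deviation as a measure of spread; how the $\pm$ in the text was actually defined is not stated, so the spread computed here is an assumption offered for comparison only.

```python
from statistics import mean, stdev

# Pairwise crossings in percent, as quoted for the toric and planar ZZ-merge.
toric_crossings = [1.109, 1.069, 1.024]
planar_crossings = [0.725, 0.756, 0.812]

toric_mean, toric_spread = mean(toric_crossings), stdev(toric_crossings)
planar_mean, planar_spread = mean(planar_crossings), stdev(planar_crossings)

print(f"toric:  {toric_mean:.2f}% (sample std {toric_spread:.3f}%)")
print(f"planar: {planar_mean:.2f}% (sample std {planar_spread:.3f}%)")
```

Both sample standard deviations come out slightly above 0.04 percentage points, so the quoted $\pm 0.05\%$ band covers the scatter of the three crossings in each case.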

“Does the surgery preserve the full code distance, not just the Z-side?” Yes. Theorem 5 (Z-side) and Theorem 6 (X-side) give proofs general in $L$ via the layer decomposition. Computational sanity checks at $L = 4, 6$ confirm both Z and X distances equal $L$ after the merge.

“Does the current threshold simulation benchmark the full merge-split gate, or only the joint parity measurement?” Both, but with different epistemic status. Sections 7 and 9.3 characterize the ZZ- and XX-merge primitives individually as fault-tolerant joint Pauli measurements with FSS-crossing threshold estimates ($1.07\% \pm 0.05\%$ and $\approx 1.0\% \pm 0.1\%$ respectively, toric variant). Section 10 addresses the composed three-sheet Horsman CNOT separately: the logical truth table is verified deterministically at $L \in \{4, 6, 8\}$, $p = 0$, and we ran a depolarizing-noise LER sweep with MWPM decoding (Figure 6) showing a $4.2\sigma$ distance improvement from $L = 4$ to $L = 6$ at $p = 10^{-3}$. However, the CNOT is not yet fault-tolerantly characterized: a direct diagnostic shows $d_{\text{each}}$-scaling fails (at $L = 4$, $d_{\text{each}} = 8$ gives higher LER than $d_{\text{each}} = 4$) because the Pauli-frame correction includes single-record gauge measurements that do not benefit from code distance. The full CNOT is therefore a verified correct logical gate but not a fault-tolerantly thresholded one. The joint Pauli measurements are; the synthesis is not yet. Concrete remediation paths (FT-repeated gauge measurements, a gauge-aware decoder, or a transversal CNOT in a stacked architecture) are identified in Section 10.5.

“How does this compare to bivariate bicycle codes and 3D toric codes?” Not on rate: the sheet code has rate $\Theta(1/L^2)$, matching the surface code, while BB codes achieve constant rate at $K=6$ with active research on inter-block protocols. The niche we claim is specifically $K=4$ planar connectivity (the envelope of standard transmon processors): the sheet code achieves cross-block joint-Pauli measurements natively at $K=4$, where the full 3D toric code at $K=6$ to $K=12$ is not surface-code-compatible. Different tools for different workloads.

“Is the planar-boundary case verified numerically?” Yes, at $L = 4, 6, 8$. The static planar code has $k = 3L$, distance $L - 1$, and a CSS-valid structure. Two planar primitives are compared: naive (requires commutation with all boundary stabs, Op weight $\sim L$, threshold $< 5 \times 10^{-5}$) and boundary-aware (one weight-2 boundary stab broken per surgery, Op weight $2L - 1$, threshold $0.76\% \pm 0.05\%$, Figure 5). The boundary-aware planar threshold is $\sim 30\%$ lower than toric but in the same percentage regime, and is practical for Willow-class chips without wraparound couplers.

12 Code Availability

Reference implementation and raw data, packaged as a single archive: https://github.com/raghu91302/ssmtheory/blob/main/fcc_code_only.zip. Files are organized by section:



Lattice and analytical tests (Sections 2–6): sheet_code_fcc_lattice.py (lattice, stabilizers, triangles), sheet_code_gf2.py (GF(2) utilities), sheet_code_surgery.py (logical analysis), sheet_code_test_construction.py (unit tests).



Toric threshold (Section 7): sheet_code_custom_surgery.py (2-sheet surgery circuit), sheet_code_cached_surgery.py (accelerated $L = 8$ runner), sheet_code_surgery_threshold.py (sweep driver), surgery_primitive_L8.json and surgery_threshold_results.json (cached primitive and raw shot data).



Planar variant (Section 9.2): sheet_code_planar_stabilizers.py (planar lattice with rough $z$-boundary), sheet_code_planar_verify.py (verification at $L = 4, 6, 8$), sheet_code_planar_ba_surgery.py (boundary-aware ribbon primitive), planar_ba_threshold_results.json.



XX-merge (Section 9.3) and three-sheet CNOT (Section 10): sheet_code_xx_surgery_toric.py (XX-merge), xx_surgery_results.json, find_3sheet_primitives.py and find_3sheet_primitives_fast.py (CSS-compatible primitive enumerators, the latter with batch Gauss-Jordan for $\sim 30\times$ speedup at $L = 8$), css_dual.py (symplectic-dual finder), sheet_code_cnot_full.py (three-sheet Horsman CNOT with structural Pauli-frame correction), cnot_truth_table_results.json (truth-table verification), cnot_ler_sweep.py and cnot_ler_dL_L{4,6,8}.json (CNOT LER sweeps with PyMatching, Figure 6).



Reproducibility and FSS. All sweep runners accept --seed and --dump-dir flags for reproducibility, and the --dump-dir option emits representative .stim and .dem artifacts on demand at any chosen $(L, p)$ point (the runners reproduce, e.g., the toric $L = 4, 6$, $p = 1.0\%$ and planar $L = 4$, $p = 0.8\%$ artifacts referenced in earlier sections). sheet_code_formal_fss_fit.py produces the toric ($p = 1.134\% \pm 0.033\%$, $\nu = 1.50 \pm 0.18$) and planar ($p = 0.725\% \pm 0.018\%$, $\nu = 1.38 \pm 0.12$) data-collapse plots. Running sheet_code_test_construction.py reproduces every analytical claim; the per-variant runners reproduce every numerical entry.

13 Conclusion

We have presented an integrated CSS code architecture on the Face-Centered Cubic lattice. The static sheet code $[[L^3, 2L, L]]$ provides efficient storage at $K=4$ active connectivity; the cross-sheet triangle surgery primitive provides fault-tolerant joint Pauli measurements between logical qubits in different sheets, also at $K=4$. The combination occupies a previously unfilled niche: $K=4$ hardware compatibility (matching the surface code, deployable on Google Willow and similar chips) with inter-sheet joint Pauli measurements as a native primitive between co-located CSS code blocks. This is a different design point from surface code lattice surgery (confined within one substrate) and bivariate bicycle codes ($K=6$ modular inter-block operations between physically separated blocks, an area of active architectural development). Under circuit-level depolarizing noise on a custom Stim circuit with explicit triangle measurements, FSS-crossing threshold estimates at $L = 4, 6, 8$ are $p_{ZZ} = 1.07\% \pm 0.05\%$ (toric) and $0.76\% \pm 0.05\%$ (planar boundary-aware) for the ZZ-merge, and $\approx 1.0\% \pm 0.1\%$ for the toric XX-merge (the planar XX-merge has a structural blocker under the standard boundary choice). The construction is verified analytically (with computational sanity checks at $L = 4, 6, 8$) for code parameters and Z/X distance preservation under the merge.

Synthesizing the ZZ- and XX-merges into the three-sheet Horsman CNOT, we verify the logical truth table at $L \in \{4, 6, 8\}$ and observe a $4.2\sigma$ distance-induced LER reduction at $p = 10^{-3}$, but a direct diagnostic shows the composed gate does not yet exhibit standard fault-tolerant $d_{\text{each}}$-scaling, because the Pauli-frame correction enters the observable through single-record gauge measurements. The full FT logical CNOT is therefore an identified follow-up direction with two concrete remediation routes (FT-repeated gauge measurements with a gauge-aware decoder, or a transversal CNOT in a stacked architecture); the joint Pauli primitives stand on their own as the central characterized contribution of this work.

14 Declarations

Clinical trial registration, Consent to Publish, Ethics and Consent to Participate: not applicable. This study does not involve a clinical trial, human participants, human data, or animals.

Competing interests: The author declares no competing interests.

Funding: This work received no external funding.

Author contributions: R.K. is the sole author and conceived the construction, performed the mathematical analysis and computational verification, designed and implemented the simulation code, generated the figures, and wrote the manuscript.

Data and code availability: See Section 12 for the per-file index of the reference implementation and raw data files. Archive: https://github.com/raghu91302/ssmtheory/blob/main/fcc_code_only.zip

References

[1] R. Kulkarni, arXiv:2603.20294 (2026).

[2] E. Dennis, A. Kitaev, A. Landahl, and J. Preskill, J. Math. Phys. 43, 4452 (2002).

[3] A. Yu. Kitaev, Ann. Phys. 303, 2 (2003).

[4] A. G. Fowler et al., Phys. Rev. A 86, 032324 (2012).

[5] C. Horsman, A. G. Fowler, S. Devitt, and R. Van Meter, New J. Phys. 14, 123011 (2012).

[6] D. Litinski, Quantum 3, 128 (2019).

[7] A. Eickbusch et al., Nature Phys. 21, 1994 (2025).

[8] R. Acharya et al. (Google Quantum AI), “Quantum error correction below the surface code threshold,” Nature 638, 920 (2025).

[9] R. Chao and B. W. Reichardt, Phys. Rev. Lett. 121, 050502 (2018).

[10] S. Bravyi, A. W. Cross, J. M. Gambetta, D. Maslov, P. Rall, and T. J. Yoder, Nature 627, 778 (2024).

[11] T. J. Yoder et al., “Tour de gross: A modular quantum computer based on bivariate bicycle codes,” arXiv:2506.03094 (2025).

[12] C. Gidney, Quantum 5, 497 (2021).

[13] O. Higgott, ACM Trans. Quantum Comput. 3, 1 (2022).

[14] P. Panteleev and G. Kalachev, “Degenerate quantum LDPC codes with good finite length performance,” Quantum 5, 585 (2021).

[15] C. Wang, J. Harrington, and J. Preskill, “Confinement-Higgs transition in a disordered gauge theory and the accuracy threshold for quantum memory,” Ann. Phys. 303, 31 (2003).