ISA Spec— Sail (RISC-V), ASL (ARM), Wasm spec (W3C)
←
ISA Spec— Same Sail model. The hub.
L2
Assembly— Machine code on the ISA
RTL— Verilog / SystemVerilog. riscv-formal.
L1
Binary / loader
Silicon— EDA / fab verification
L3 is the hub. Both chains converge at the ISA specification. Sail/Isla produces SMT formulas that serve as oracle for either chain. PL chain: MLIR → LLVM IR → assembly → ISA. HDL chain: Chisel → FIRRTL → Verilog → ISA. RVS spans L2–L5, anchoring both chains: upward (13 vector kernels, 39 compiler IR proofs) and downward (10 RISC-V cores, 151 ops UNSAT, 8 hack@DAC bugs). Zero hand-written RTL.
Section 1
Formal Work by Layer × ISA
RISC-V
Wasm
ARM
x86
LoongArch
L5
CompCert (no RVV) Alive2 (opt only) Nextcodegen TV
Empty Wasm codegen unverified
Alive2 (opt only) EmptyNo NEON/SVE codegen
Alive2 (opt only) EmptyNo AVX codegen
Empty No LLVM backend FV
L4
★ RVS 13 kernels UNSAT 3-layer pipeline
★ RVS Cross-ISA proof RVV == Wasm SIMD128
Empty ASL spec exists, no symbolic SIMD compute kernel prover
L7 (Model), L6 (Quantization), L1 (Silicon) omitted — not in our scope. Shown in overview above.
Why RISC-V and Wasm are center
Our pipeline requires both a formal L3 spec and a symbolic execution toolchain (the L3→L4 bridge). Only RISC-V and Wasm have both.
ISA
L3 Spec
L3→L4 Bridge
Status
RISC-V
Sail(complete, machine-readable)
Isla(symbolic executor → SMT)
★ Ready
Wasm
W3C spec(formal semantics)
Reference interp(+ simpler semantics)
★ Ready
ARM
ASL(complete, machine-readable)
ASLi = interpreter only(not symbolic)
No bridge
x86
Intel SAE = prose(ACL2 partial)
Nothing
No spec
LoongArch
Manual only
Nothing
Nothing
This is the technical reason RISC-V and Wasm are our primary targets. ARM has L3 but not the bridge; x86 has neither. Building an ASL→SMT symbolic executor (an "Isla for ARM") is a significant engineering effort and a potential future research direction — but not ours.
Section 1b — Landscape Matrix
Verification Landscape: Team × Layer
Every known RISC-V / processor verification tool mapped to our 7-layer model. Columns = teams, rows = layers. Key finding: nobody else spans more than 2 layers, and nobody connects to L3 (Sail) except us.
Layer
RVS (ours)
riscv- formal
chiRV- Formal
ChiselFV
DAC’24 Arith
ChiSA
Chisel- Verify
ISA- Formal
Alive2
ARM-TV
Islaris
VeriISLE
Volta
L7
Generator / Model — out of scope for all tools
L6
—
—
● Chisel ref model (hand)
● SVA in Chisel
● Stainless Scala
—
—
—
—
—
—
—
—
L5
● MLIR+LLVM IR 39 proofs
—
—
—
—
○ FIRRTL static analysis
—
—
● LLVM IR peephole
● AArch64 codegen TV
—
—
—
L4
● 13 kernels 344ms
—
—
—
—
—
—
—
—
—
—
● Cranelift inst. sel.
● GPU kernel equiv.
L3 ORACLE
★ Isla/Sail 43 traces auto oracle
✗ RVFI (hand)
✗ Chisel ref (hand from Sail)
✗ user SVA (hand)
✗ no ISA connection
✗ no spec
✗ no spec
○ ARM ASL (hand)
✗ domain spec
✗ domain spec
● Isla/Sail Coq proofs
● Isla/Sail inst. sel.
✗ GPU ISA spec
L2
● 3 cores + vec ALU + 2 bugs
● Verilog BMC scalar only
— L6 only
— L6 only
—
—
○ simulation (not formal)
● ARM RTL scalar only
—
—
—
—
—
L1
Silicon / Layout — out of scope for all tools
Span
L2–L5 (4 layers)
L2 (1 layer)
L6 (1 layer)
L6 (1 layer)
L6 (1 layer)
L5 (1 layer)
L2 (sim.)
L2 (1 layer)
L5 (1 layer)
L5 (1 layer)
L3 (1 layer)
L3–L4 (2 layers)
L4 (1 layer)
Vec?
✓
✗
✗
✗
✗
✗
✗
✗
✗
✗
✗
✗
GPU
Bugs?
✓ hack@DAC
✗
✗
✗
✗
✗
✗
✓ ARM
✓ LLVM
✓ AArch64
✗
✗
✗
● = formal verification at that layer
● = uses Isla/Sail
○ = partial/non-formal
✗ = no connection to ISA spec
★ = automated oracle hub
2. Nobody connects to L3. chiRVFormal hand-translates Sail to Chisel. riscv-formal uses bespoke RVFI. ChiselFV uses user-written SVA. All introduce unverified trust gaps at the oracle.
3. Scalar-only barrier. Every RISC-V RTL/HCL tool is scalar-only. The vector state explosion ($2^{16{,}456}$ for VLEN=256) is a structural barrier, not an engineering gap.
4. Isla is underused. Published at CAV 2021, adopted by Islaris and VeriISLE, but ignored by the Chisel ecosystem. chiRVFormal (SETTA 2024) chose to hand-translate Sail instead of using Isla — 3 years after Isla was available.
Extended ecosystem (C-based, simulation, model checking)
Beyond the Chisel formal ecosystem, these tools use C-based specifications, simulation co-verification, or hardware model checking.
Tool
Group
Layer
Method
Vec?
Oracle
Recent
EBMC
Kroening (Oxford)
L2
Formal
✗
SVA/LTL
Neural MC (NeurIPS’24), LLM lemmas (2025)
DiffTest
ICT-CAS (XiangShan)
L2
Sim
✓
NEMU (C)
Hot Chips’24, HPCA’25
BTOR2MLIR
Gurfinkel (Waterloo)
L2
Formal
✗
BTOR2
FMCAD’23, CAV’25 (Btor2-Select)
ImperasDV
Synopsys
L2
Sim
✓
OVPsim
Industry standard, 7k+ vector tests
ESBMC
Cordeiro (Manchester)
L5–L6
Formal
✗
C/C++
TACAS’24, ARM CCA (SAS’24)
SeaHorn
Gurfinkel (Waterloo)
L2–L5
Formal
✗
CHC
Via BTOR2MLIR → LLVM IR path
Spike
RISC-V International
L3
Exec
✓
C++ ISS
Official golden model, not formal
Key insight: Formal tools (EBMC, BTOR2MLIR) lack vector. Vector-capable tools (DiffTest, Imperas, Spike) are simulation-based. No tool combines formal + vector except RVS.
Simulation tools find bugs but cannot prove absence of bugs. Our spec-first approach bridges this gap.
Section 1c — Why RTL Can’t Scale
State Explosion: RTL vs ISA Specification
RTL verification via bounded model checking must model the full microarchitectural state. At L3, microarchitectural state does not exist.
src1(SEW) + src2(SEW) + dst(SEW or 2×SEW). No pipeline, no forwarding, no register file.
Z3 solve: <1 ms. Linear scaling with VL.
Scaling with VLEN (real implementations):
Implementation
VLEN
VRF bits
Est. total state
Conforming min
128
4,096
~8,000
XiangShan
128
4,096
~8,000
SiFive X280
512
16,384
~25,000
Ara (ETH Zürich)
4,096
131,072
~150,000
For context: the number of atoms in the observable universe ≈ 2266. Ara’s BMC search space is 2150,000. The state explosion is structural — it comes from the hardware’s physical register file width, not from poor encoding. No RTL abstraction can fix this without losing per-element soundness.
Section 1d — Results
Verification Data Inventory by Layer
Every result indexed to the 7-layer model. L3 is the oracle hub — all proofs flow through it.
Four automation paths: (1) Direct Yosys on Verilog/SV (CVA6, SERV), (2) sv2v for complex SystemVerilog (Ibex, CV32E40P, SweRV), (3) firtool + Yosys for Chisel/FIRRTL (Rocket, riscv-mini, BOOM, XiangShan), (4) expose -evert-dff for monolithic CPUs without separate ALU (PicoRV32). All pipelines are scripted in rebuild_bridges.sh.
▽ zoom into L4: RISC-V ecosystem ▽
Section 2
RISC-V L4 Ecosystem: Extension × Concern
RISC-V is modular — each extension family has independent instructions. Within each, compute kernel verification decomposes into four concerns by solver technique.
Integer QBV (bitvector)
Memory QBV + Array
Control Flow Induction / BMC
Float FP theory
RV64I/MBase + Multiply
Trivial
Scalar ops are simple. Pipeline supports them; no research novelty. riscv-formal covers L2.
Trivial
Standard load/store
Trivial
Branch/jump
N/A
RV64VVector SIMD
★ RVS
13 kernels
Q4_0–Q6_K, vecadd, conv1d, ReLU
Partial
Loads axiomatized as fresh symbols. vse32 store trace captured (72 write events).
Partial
Single iteration verified. vsetvli, strip-mining loops, mask (vm=0) TBD.
Skip
FP16 scale factor in quant dot products. Different solver domain.
Zvk*Vector Crypto
Next
AES / SHA / GCM
Sail spec ready (2024). Reuse pipeline.
Empty
Crypto I/O simpler
Empty
Simpler control flow
N/A
Zb*Bitmanip
Empty
clz, ctz, rotate — straightforward BV
N/A
N/A
N/A
RV64F/DScalar Float
N/A
N/A
N/A
Skip
Berkeley FPU (L2). FP theory at L4.
RV64AAtomic
N/A
Empty
Needs memory model
N/A
N/A
same BV solver←——→new technique←————→different domain
"Trivial" = our pipeline handles it, but scalar ops lack the SIMD complexity that makes L4 verification a research contribution. The tool covers them; the paper does not.
▽ zoom into L4: Wasm ecosystem ▽
Section 3
Wasm L4 Ecosystem: Proposal × Concern
Wasm is also modular (proposals). The decomposition mirrors RISC-V — each row maps to a RISC-V extension.
Integer QBV
Memory Linear mem model
Control Flow Structured (simpler)
Float IEEE 754
SIMD128128-bit fixed vectors
★ RVS
Cross-ISA proof
RVV == Wasm SIMD128 (Q8_0). Maps to RV64V integer.
Empty
Linear memory is simpler than RV (no PMA/PMP). Maps to RV64V memory.
Empty
Structured CF (block/loop/if) — no arbitrary jumps. Simpler than RV.
Empty
f32x4, f64x2. No FP16 native. Maps to RV64V float.
Relaxed SIMDPlatform-dependent
Next
relaxed_dot_i8x16 — maps to different ISA backends (NEON/AVX/RVV). Cross-ISA proof opportunity.
Same as SIMD128
Same as SIMD128
relaxed_madd etc.
WebCrypto+ SIMD crypto kernels
Next
AES/SHA via SIMD128 ops. Cross-ISA with Zvk*. Maps to Zvk* integer.
Wasm is structurally simpler: no PMA/PMP (linear memory), no arbitrary jumps (structured CF), no variable-length vectors (fixed 128-bit). This makes cross-ISA proofs tractable — the hard part is always on the RISC-V side.
▽ where to expand next ▽
Section 4
Expansion Map: Adjacency-Based Territory Growth
Next targets are adjacent to conquered territory. Two parallel fronts: Compiler (L5) expands vertically, Crypto expands horizontally.
RVV Int
Wasm Int
RVV Crypto
Wasm Crypto
L5
wave 2
codegen TV
wave 4
wave 4
future
L4
★ done
13 kernels
★ done
cross-ISA
wave 2
Zvk* AES/SHA
wave 3
cross-ISA crypto
Wave 1(done) RVV integer + Wasm cross-ISAWave 2(next, parallel) L5 compiler TV + Zvk* cryptoWave 3Wasm crypto cross-ISAWave 4+Expand codegen to all targets
Why L5 first? We already hold both endpoints: MLIR/LLVM IR specs (top) and Isla ISA traces (bottom). The compiler codegen gap in between is the last unverified link — and nobody does it for any vector ISA. Vertically adjacent to our L4 territory, maximum infrastructure reuse, highest impact.
Smart contracts, zkWasm correctness. Natural extension after crypto cross-ISA.
Appendix A — Reference Only (not in scope)
ARM L4 Ecosystem: Extension × Concern
ARM has a mature ISA spec (ASL) and rich extension set. Included for landscape completeness; not in our research scope.
Integer QBV
Memory QBV + Array
Control Flow Induction / BMC
Float FP theory
AArch64Base (v8/v9)
Trivial
Scalar ops. ASL spec complete. Arm ISA FV team covers L2.
Trivial
Standard load/store. TrustZone address space.
Trivial
Branch/compare.
N/A
NEON128-bit SIMD (v7/v8)
Empty
ASL spec exists. ASLi can extract semantics but no symbolic kernel prover built. Fixed 128-bit vectors.
Empty
NEON load/store lanes.
Empty
No predication (unlike SVE).
Empty
FP16/BF16 in NEON. Nobody at L4.
SVE / SVE2Scalable Vector (v9)
Empty
Variable-length (128–2048 bit). VLA model like RVV. ASL spec exists. Highest research value for ARM L4.
Empty
Gather/scatter, first-faulting loads.
Empty
Predicate-first model (unique to SVE). More expressive than RVV masks.
Empty
SVE FP ops.
CE / AES / SHACrypto Extensions
Empty
AES/SHA/SM3/SM4 instructions. ASL spec complete. No L4 prover.
N/A
N/A
N/A
SMEScalable Matrix (v9.2)
Empty
Matrix tile operations. Very new (2023). Outer product + accumulate.
Empty
Streaming mode memory.
Empty
Streaming SVE mode transition.
Empty
BF16/FP16 matrix ops.
ARM's L3 (ASL) is strong — machine-readable, maintained by Arm Ltd. The gap is L4: nobody has built a symbolic kernel verifier on top of ASL for NEON/SVE/CE. ASLi (interpreter) exists but is not a symbolic executor. The closest analog to our Isla would be building an ASL-to-SMT pipeline, which is a significant engineering effort.
Appendix B — Reference Only (not in scope)
x86 L4 Ecosystem: Extension × Concern
x86 has the largest deployed SIMD surface area but the weakest formal spec. Included for completeness.
Integer QBV
Memory QBV + Array
Control Flow Induction / BMC
Float FP theory
x86-64Base scalar
Trivial
ACL2 model (Centaur/AMD). Intel FV internal. K Framework partial model.
Partial
Complex memory model (TSO). Intel internal verification.
Trivial
Branch/jump.
N/A
SSE / AVX2128/256-bit SIMD
Empty
No formal spec (Intel SAE is prose). ACL2 has some SSE models. Widest deployment, worst formalization.
Empty
Aligned/unaligned loads.
Empty
No masking (pre-AVX-512).
Empty
SSE/AVX FP ops.
AVX-512512-bit SIMD + mask
Empty
No Sail model. Most complex x86 SIMD. Opmask registers (like RVV vm). ~5000 instruction variants.
Empty
Gather/scatter.
Empty
Opmask predication.
Empty
FP16 (AVX-512 FP16).
AES-NI / SHACrypto
Empty
AESENC/AESDEC/SHA instructions. No formal spec. Intel whitepaper only.
N/A
N/A
N/A
AMXAdvanced Matrix (Sapphire Rapids+)
Empty
Tile-based matrix multiply. Very new. Analogous to ARM SME.
Empty
Tile load/store.
Empty
TILECFG state.
Empty
BF16/FP16 tiles.
x86's fundamental problem is L3: there is no machine-readable formal ISA spec. Intel's Software Architecture Manual (SAE) is prose; ACL2 models (Centaur) cover a subset. This makes x86 the hardest target for our pipeline — you'd need to build the ISA model first. Alive2 covers LLVM optimizations (L5 partial) but not codegen to specific SIMD extensions.
References
Cell References
Click any cell above to jump here. Click a cell ID below to jump back.
A — Overview
A1–A7
How L7→L1 works in practice: llama.cpp as a running example
ASL (Arm Architecture Specification Language) — machine-readable spec maintained by Arm Ltd. ASLi — ASL interpreter by Alastair Reid (Arm Research). Interprets ASL specs but does NOT produce symbolic SMT traces; it is a concrete interpreter, not a symbolic executor like Isla.
B14
L3 × x86
ACL2 x86 model (Centaur/AMD) — partial formal model for x86 ISA in ACL2 theorem prover. Covers scalar subset; SIMD incomplete. Intel SAE (Software Architecture Manual) is prose only — no machine-readable formal spec.
ASL — Arm Architecture Specification Language. Machine-readable, maintained by Arm Ltd. Published with each architecture release.
C8
ARM L3→L4 Bridge (missing)
ASLi is an interpreter, not a symbolic executor. It cannot produce SMT traces from ASL specs. Building an "Isla for ARM" (ASL→SMT) is an open engineering challenge.
L5 × RVV Integer — Compiler Codegen TV — next target
CompCert — compcert.org(verified C compiler, no RVV backend) Alive2 — alive2.llvm.org(LLVM IR opt only, not codegen) Gap: nobody verifies LLVM vector codegen (IR → RVV asm) for any ISA. We hold both endpoints — MLIR/LLVM IR specs (top) + Isla ISA traces (bottom). Translation validation per compilation is the most practical path.