Factoidal — W3C test results

Pass

1662

Fail

Skipped

Pass / runnable

100.0% of 1662

Total in suites

1662

Scope. These counts measure conformance only — whether the engine produces the expected result on each W3C-supplied fixture. They say nothing about performance, memory use, scale, or real-world query shape. Skipped tests are features not yet implemented (mostly HTTP-protocol / federated-query endpoints). Raw per-suite logs and machine-readable artifacts are linked below.

SPARQL 1.1 631 pass · 0 fail · 0 skip

SPARQL 1.1 Query Language

aggregates

47/0

bind

10/0

bindings

11/0

cast

6/0

construct

7/0

csv-tsv-res

6/0

exists

6/0

functions

75/0

grouping

6/0

json-res

4/0

negation

12/0

project-expression

7/0

property-path

33/0

subquery

14/0

syntax-query

94/0

SPARQL 1.1 Update

add

8/0

basic-update

13/0

clear

4/0

copy

6/0

delete

19/0

delete-data

6/0

delete-insert

17/0

delete-where

6/0

drop

4/0

move

6/0

syntax-update-1

54/0

syntax-update-2

1/0

update-silent

13/0

SPARQL 1.1 Protocol

http-rdf-update

19/0

protocol

34/0

SPARQL 1.1 Federated Query

service

7/0

syntax-fed

3/0

SPARQL 1.1 Service Description

service-description

3/0

SPARQL 1.1 Entailment Regimes

entailment (SPARQL 1.1 regime — RDFS / D-entailment, 70 tests)

70/0

SPARQL 1.1 failures (click to expand)

(none)

SPARQL 1.1 skips (click to expand)

(none)

RDF 1.1 1031 pass · 0 fail

RDF 1.1 N-Triples

rdf-n-triples

70/0

RDF 1.1 Turtle

rdf-turtle

313/0

RDF 1.1 N-Quads

rdf-n-quads

87/0

RDF 1.1 TriG

rdf-trig

356/0

RDF 1.1 RDF/XML

rdf-xml

166/0

RDF 1.1 Semantics

rdf-mt

39/0

RDF 1.1 failures (click to expand)

(none)

RDF 1.1 skips (click to expand)

(none)

OWL 2 22+ pass via owl_runner across 7 catalogs (profile-RL/EL/QL + 4 DL) · live scoring

Scope. The OWL 2 W3C Test Cases catalog (~3378 test:TestCase entries across 9 categories) is vendored under third_party/testing/owl/. As of Phase 2.3 (2026-05-08), seven catalogs run live through owl_runner with PositiveEntailment / NegativeEntailment / Consistency / Inconsistency scoring: profile-RL.rdf, profile-EL.rdf, profile-QL.rdf, type-positive-entailment.rdf, type-negative-entailment.rdf, type-consistency.rdf, type-inconsistency.rdf, and semantics-direct.rdf. The remaining catalog (syntax-dl.rdf) needs a DL syntactic-profile checker before it can score.

Tableau on the live codepath. The F\* Tableau.tableau_materialise module (1167 LoC, 0 assume val, 0 --lax) is also on the SPARQL entailment regime codepath via w3c_runner.ml: parent4/5/6/7, simple7/8, sparqldl-01…12, etc. — the SPARQL 1.1 Entailment Regimes row above (70/70) passes because Tableau drives the membership check. The owl_runner catalogs below currently score under the RL closure path; wiring Tableau into owl_runner via --regime dl is Phase 2.3d.

Pass-rate context. Across the seven scored catalogs, the bulk of failures fall into two known categories: (1) tests that use OWL Functional-Style Syntax (not RDF/XML) trigger FAIL/no-premise — these need an FSS parser before scoring is meaningful; (2) Inconsistency-Test failures (notably the type-Inc 8% rate) are RL-closure incompleteness — RL doesn't derive every DL contradiction. Lifting requires either extending is_inconsistent or running Tableau (Phase 2.3d).

profile-RL PosEnt

22/8 of 30

profile-RL NegEnt

3/3 of 6

profile-RL Consistency

75/1 of 76

profile-RL Inconsistency

4/10 of 14

type-PosEnt PE

80/126 of 206

type-PosEnt Cons

185/21 of 206

type-NegEnt PE

0/0 of 0

type-NegEnt NE

19/4 of 23

type-NegEnt Cons

23/0 of 23

type-Cons PE

80/126 of 206

type-Cons NE

19/4 of 23

type-Cons Cons

324/30 of 354

type-Inc PE

0/0 of 0

type-Inc Inc

16/112 of 128

profile-EL PE

20/9 of 29

profile-EL NE

4/2 of 6

profile-EL Cons

71/1 of 72

profile-EL Inc

4/10 of 14

profile-QL PE

14/6 of 20

profile-QL NE

1/2 of 3

profile-QL Cons

58/0 of 58

profile-QL Inc

4/2 of 6

sem-Direct PE

80/126 of 206

sem-Direct NE

19/4 of 23

sem-Direct Cons

323/30 of 353

sem-Direct Inc

16/112 of 128

RIF Core

4/0 of 4

syntax-dl

— 646 tests · runner not wired (engine: DL syntactic-profile checker pending)

GeoSPARQL

— ? tests · vendoring + runner pending (see #239)

JSON-LD 1.1

— ? tests · vendoring + runner pending (W3C suite ~250 tests)

CSVW

— ? tests · vendored at third_party/testing/csvw; runner pending

ShEx

— ? tests · vendored at third_party/testing/shex; runner pending

DID

— ? tests · vendored at third_party/testing/did; runner pending

— ? tests · vendored at third_party/testing/vc; runner pending

RML

— ? tests · vendored at third_party/testing/rml; mapping engine pending

OWL 2 (W3C conformance): We vendor the full W3C OWL 2 Test Cases at third_party/testing/owl/ (10 catalog files, ~2500 test:TestCase entries after overlap). After Phase 2.3 (2026-05-08), 7 of 8 main catalogs run live through owl_runner with PE/NE/Cons/Inc scoring. The runner applies owl_rl_closure_with_reflexivity (fuel 100) and for entailment tests checks the conclusion’s triples against the closure (relaxed bnode match); for consistency tests it consults RDF_Graph_Executable.is_inconsistent against the same closure. syntax-dl.rdf is the remaining unwired catalog (DL syntactic-profile checker pending). The roadmap rows below point at adjacent W3C/OGC suites we want to add: GeoSPARQL (#239), JSON-LD 1.1, CSVW, ShEx, DID, VC, RML.

RDFC-1.0 62 pass · 23 fail · 1 stub · of 86 total

eval (HFDQ)

62/23 of 85

map / negative-eval

1 stub — runner does not emit map JSON / HNDQ termination guard

RDFC-1.0 (W3C RDF Dataset Canonicalization): We vendor the full W3C RDFC-1.0 Test Cases at third_party/testing/rdf-canon/. The canonical-labelling algorithm lives in F* at formal/fstar/RDF.Canonical.fst — Hash First Degree Quads (HFDQ) phase verifies cleanly with no --lax and no --admit_smt_queries. Eval tests check the canonical N-Quads form bytewise. Phase 2 work (HNDQ — Hash N-Degree Quads with bounded permutation enumeration) closes the remaining symmetric-cycle eval tests. Map tests are STUB pending a runner-side bnode → canonical-id JSON emitter; NegEval tests need an HNDQ termination guard.

Machine-readable artifacts

latest.csv — one row per suite (timestamp, commit, branch, category, suite, pass/fail/skip/unsupported)
latest.json — same data plus totals, structured
history/<timestamp>.csv / .json — timestamped copies, one pair per runner invocation
perf-parse-serialize.json — parse/serialize/canonicalize throughput (if present; produced by tools/bench-parse-serialize.sh, not this script)

The raw runner logs (including per-test FAIL lines with diffs) are committed to formal/fstar/ocaml-output/sparql_results.log and formal/fstar/ocaml-output/rdf_results.log.

Raw per-suite numbers

SPARQL 1.1

631 total: 631 pass, 0 fail, 0 skip, 0 unsupported
  add                                 pass:8 fail:0 skip:0 unsupported:0
  aggregates                          pass:47 fail:0 skip:0 unsupported:0
  basic-update                        pass:13 fail:0 skip:0 unsupported:0
  bind                                pass:10 fail:0 skip:0 unsupported:0
  bindings                            pass:11 fail:0 skip:0 unsupported:0
  cast                                pass:6 fail:0 skip:0 unsupported:0
  clear                               pass:4 fail:0 skip:0 unsupported:0
  construct                           pass:7 fail:0 skip:0 unsupported:0
  copy                                pass:6 fail:0 skip:0 unsupported:0
  csv-tsv-res                         pass:6 fail:0 skip:0 unsupported:0
  delete                              pass:19 fail:0 skip:0 unsupported:0
  delete-data                         pass:6 fail:0 skip:0 unsupported:0
  delete-insert                       pass:17 fail:0 skip:0 unsupported:0
  delete-where                        pass:6 fail:0 skip:0 unsupported:0
  drop                                pass:4 fail:0 skip:0 unsupported:0
  entailment                          pass:70 fail:0 skip:0 unsupported:0
  exists                              pass:6 fail:0 skip:0 unsupported:0
  functions                           pass:75 fail:0 skip:0 unsupported:0
  grouping                            pass:6 fail:0 skip:0 unsupported:0
  http-rdf-update                     pass:19 fail:0 skip:0 unsupported:0
  json-res                            pass:4 fail:0 skip:0 unsupported:0
  move                                pass:6 fail:0 skip:0 unsupported:0
  negation                            pass:12 fail:0 skip:0 unsupported:0
  project-expression                  pass:7 fail:0 skip:0 unsupported:0
  property-path                       pass:33 fail:0 skip:0 unsupported:0
  protocol                            pass:34 fail:0 skip:0 unsupported:0
  service                             pass:7 fail:0 skip:0 unsupported:0
  service-description                 pass:3 fail:0 skip:0 unsupported:0
  subquery                            pass:14 fail:0 skip:0 unsupported:0
  syntax-fed                          pass:3 fail:0 skip:0 unsupported:0
  syntax-query                        pass:94 fail:0 skip:0 unsupported:0
  syntax-update-1                     pass:54 fail:0 skip:0 unsupported:0
  syntax-update-2                     pass:1 fail:0 skip:0 unsupported:0
  update-silent                       pass:13 fail:0 skip:0 unsupported:0

RDF 1.1

1031 total: 1031 pass, 0 fail, 0 skip, 0 unsupported
  rdf-mt                              pass:39 fail:0 skip:0 unsupported:0
  rdf-n-quads                         pass:87 fail:0 skip:0 unsupported:0
  rdf-n-triples                       pass:70 fail:0 skip:0 unsupported:0
  rdf-trig                            pass:356 fail:0 skip:0 unsupported:0
  rdf-turtle                          pass:313 fail:0 skip:0 unsupported:0
  rdf-xml                             pass:166 fail:0 skip:0 unsupported:0

How this page is generated

Source: formal/fstar/generate-report.sh. It shells out to the w3c_runner binary (extracted from F* specs, compiled via OCaml), scrapes per-suite counts, and writes index.html, latest.csv, and latest.json. Run ./generate-report.sh --run in formal/fstar/ to regenerate. CI re-runs on every push and nightly at 06:00 UTC.