github.com/Zer0pa/Morph-BenchRESEARCH-READY

LIVE

INDUS BLOCKED

Toolkitting for ancient script research.

A shared scoring rig for ancient-script methods · PyPI gnosis-morph-bench v0.1.0 · github.com/Zer0pa/Morph-Bench

Every researcher studying ancient-script morphology rebuilds the same machinery from scratch: scoring routes, permutation nulls, stability sweeps. The result lives in one paper, then the next lab starts again from nothing.

Morph-Bench is the shared rig that ends that cycle — it scores routes, preserves null results, runs stability checks, and records replay so the next study begins from a measured floor. Today 37 tests pass and runs are byte-identical across macOS and Linux. Live Indus Phase 4 rerun is blocked on Phase 3c manifest access.

Gnosis-Morph-Bench approved scientific square mechanics diagram showing morphology benchmark-grid mechanics. — **Scope:** morphology benchmark rig. Scoring, permutation nulls, stability sweeps, and replay pass; live Indus Phase 4 is blocked.

01 · THE GAPNO MEASURE, NO BASELINE

Every ancient-script researcher rebuilds the same baseline; the measure is never kept.

02 · MARKETSADJACENT FORECASTS

Digital humanities'30 · $3.2B

Heritage digitization'30 · $8.1B

Research data management'30 · $6.7B

AI for archaeology'30 · $1.4B

Scholarly infrastructure'30 · $5.3B

Every field on this list has ancient-script work underway. None of it starts from a measured baseline.

03 · VALUE

$8.1B

Cultural-heritage digitization in 2030; research infrastructure for review and reuse, not commercial sale.

04 · INSIGHT

Measure the method once. the next researcher starts from here.

05.1 · CURRENT TECHREBUILT EVERY TIME

Ancient-script research has no standard scoring rig. Each lab writes its own metric, picks its own null, records its own result — then the work moves on. The next researcher starts again from zero.

05.2 · OUR TECHTHE BASELINE, KEPT

Morph-Bench installs from PyPI and gives a researcher a working starting point: a route scorer, a permutation null, a five-mode stability battery, and replayable records that are byte-identical across macOS and Linux Python 3.11. 37 tests pass on a fresh clone. 9 of 9 adapter clauses hold.

05.3 · BENCHMARKSPUBLIC RECEIPTS

Pytest37/37public README

Adapter9/9MUST clauses

Lint0/6forbidden hits

PyPI0.1.0wheel/sdist

Pytest37/37

Adapter9/9

Forbidden0/6

Status: working today. Live Indus Phase 4 rerun blocked on Phase 3c manifest.

06 · MEASUREMENTMETHOD SCORING + REPLAY

Every method scored. every null kept. every blocker named.

06.1 · COMPARATIVE PERFORMANCE · BENCHMARK SURFACE

Pytest surface37 passed

Adapter9/9 MUST clauses

Lint0/6 hits

IndusBLOCKED

Today's measurement is the rig itself: pytest on a fresh Python 3.11 clone, adapter contract coverage, forbidden-pattern lint, and byte-equal replay across macOS and Linux. Live Indus Phase 4 rerun and Cuneiform adapter are not in scope yet.

07 · KEY METRICSPUBLIC RECEIPTS

07.1 · PYTEST SURFACE

37/37

Fresh clone · pytest passes end to end

07.2 · ADAPTER INTERFACE

9/9

Adapter v1 · required clauses covered

07.3 · FORBIDDEN-PATTERN LINT

0/6

Repo lint · no forbidden patterns

07.4 · PYPI RELEASE

0.1.0

Public PyPI · wheel and source ready

07.5 · LIVE INDUS RERUN

null

Blocked · Phase 3c manifest unblocks it

08 · FIXTURE REPLAYBYTE-STABLE SCOPE

Fixture records replay byte-stable; rights-limited data stays outside.

08.1 · WHAT REPLAYS EXACTLYFIXTURE RECORDS ONLY

Smoke and replay outputs are byte-identical across documented macOS and Linux Python 3.11 environments, verified end-to-end through fresh-clone install. Replay records and SHA-256 reference-freeze helpers ship as part of the public package.

That does not prove deterministic live Indus replay from repo custody, undisclosed heavy data, or cultural-heritage imagery — and it does not prove any text recovery. The unit of byte-exactness is benchmark-fixture replay, not domain-result reproduction.

08.2 · HONEST BLOCKER

Honest Blocker ·

Live Indus Phase 4 rerun is blocked on Phase 3c manifest access. Heavy-data and image-bearing release policy is open. Cuneiform adapter is deferred to a separate contract once the first live Indus replay lands. PyPI: gnosis-morph-bench==0.1.0; no 0.1.1 until a receipt. Non-claims: no text recovery, no source identification, no data permission, no hosted service.

09

One baseline for the field the next study can inherit it.

09.1 · THE AMBITION

An ancient-script field that loses its scoring rig with every postdoc loses a generation of measurement. Morph-Bench is the rig the next lab installs instead of writing — a public floor for method comparison that outlives the researcher who first ran it.

09.2 · WHAT WORKS NOW

Public harness works today: 37 tests pass, adapter contract holds, fixture replay is byte-stable, public PyPI release available.

09.3 · WHAT'S STILL OPEN

Still blocked: live Indus rerun; heavy-data policy and Cuneiform adapter remain open.

09.4 · LABS · NEAR-TERM (12–24 MO)

One installable baseline, not three rebuilds

A second lab studying Indus morphology can install the same scoring rig their predecessor used, run it on their candidate method, and compare results that have a common floor. The conversation moves from "whose code" to "whose finding".

09.5 · REVIEW · NEAR-TERM (12–24 MO)

Reviewers see the null beside the result

A journal reviewer reading a morphology paper can ask for the permutation null and the stability sweep alongside the headline number, knowing both were produced by a public harness, not the author's private script. Skeptical review gets cheaper.

09.6 · MUSEUMS · MID-TERM (24–48 MO)

Heritage archives keep the method, not just the picture

When a museum funds a script-morphology study, the resulting paper can ship with a replayable run — same scoring rig, same null check, same stability battery — so the archive's investment outlives the postdoc who did the work. The method becomes part of the collection.

09.7 · CROSS-SCRIPT · MID-TERM (24–48 MO)

Indus and Cuneiform share a floor

Once a Cuneiform scorer sits beside the Indus one, two distant script communities can argue about results on the same scoring floor. Disagreement narrows to the data and the model — the scoring rig stays the same, so the argument finally becomes worth having.

09.8 · CONVENTION · PARADIGM (48 MO+)

A standard the next generation inherits

A graduate student starting in ancient-script morphology in 2030 finds the same baseline their advisor used in 2026 — installable, documented, with null results preserved. The field stops losing a generation of measurement every time the tools rot. One kept harness, many researchers.

01 // WHAT THIS IS

02 // METHOD MECHANICS

03 // KEY METRICS

04 // REPO IDENTITY

05 // READINESS

06 // WHAT WE PROVE

07 // WHAT WE DON'T CLAIM

08 // VERIFICATION STATUS

09 // PROOF ANCHORS

10 // REPO SHAPE