1) High level design
- We propose that an ITEM instance can serve as the manifest for a transaction set: it identifies the set (start/stop transactions), declares a job card that tells miners how to extract features and run OP_AI_EVAL, and declares how results must be written back (the UPDATE tx format).
- Keep the on-chain ITEM compact: use short keys, canonical JSON (no whitespace), and a fixed, small selector DSL so miners can deterministically compute features.
- For heavy models, prefer a two-tier pattern (light deterministic on-chain eval; heavy off-chain with commit/verify).
2) Compact ITEM metadata schema (short keys)
Use canonical JSON with short keys (all keys shown in parens).
Top level:
- v = schema version
- id = ITEM id (optional if implicit)
- s = set_start_txid
- e = set_end_txid (stop)
- j = job card object
- o = output / update policy
- p = provenance {creator, ts, nonce}
- ctl = control flags (deterministic, gas meter, max nodes)
Job card j:
- jid = job id
- m = engine (must be "OP_AI_EVAL" for miner execution)
- mod = model id or light op id (e.g., "m:v1" or "linreg:1")
- spec = [feature spec objects]
- seed = optional deterministic seed
- limit = max nodes / blocks to traverse
Feature spec object (each element in spec):
- n = feature name (short)
- sel = selector expression (see DSL below)
- agg = aggregation: sum|avg|count|min|max|first|last|hist
- t = type: num|cat|bool|text
- w = window param (e.g., 10blocks or 24h), optional
- d = default value if missing (optional)
Output / update o:
- ut = update tx type/name (e.g., "UPDATE")
- ref = reference field name linking back to the ITEM (e.g., item_id)
- fields = list of fields to populate in the update {name -> short key}, or "cid" for an off-chain blob
- sig = boolean (require updater signature)
Control ctl:
- det = true/false (must be deterministic)
- maxN = max nodes to crawl
- gas = gas limit hint
3) Mini DSL for selectors (sel)
Make this small but expressive. Examples:
- meta.price → value in this tx's metadata
- meta.items.w → iterate array items, pick the w field
- tx.value → transaction value (cash/payment)
- rel.parent or rel.parents → txids listed in this tx's metadata under parent links
- blk.height → block height
- time → tx timestamp
- functions: sum(…), avg(…), count(…) can be used inside a selector, or use agg in the spec
- filters: meta.items[cat=grain].q → select items whose cat = grain
Keep it unambiguous and implementable with a tiny deterministic parser.
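Such a parser can be prototyped in a few lines. A minimal sketch in Python, assuming tx metadata arrives as plain dicts; the segment grammar (dotted path, optional single `[k=v]` equality filter) and all names here are illustrative, not normative:

```python
# Minimal deterministic selector evaluator (sketch; all names illustrative).
# Supports dotted paths ("meta.price"), array fan-out ("meta.items.w"),
# and a single equality filter per segment ("meta.items[cat=grain].q").
import re

_SEG = re.compile(r"^(\w+)(?:\[(\w+)=(\w+)\])?$")

def eval_selector(sel, ctx):
    """Return the list of matched values (empty if the path is missing)."""
    nodes = [ctx]
    for seg in sel.split("."):
        m = _SEG.match(seg)
        if not m:
            raise ValueError("bad selector segment: " + seg)
        key, fk, fv = m.groups()
        nxt = []
        for node in nodes:
            if not isinstance(node, dict) or key not in node:
                continue
            val = node[key]
            # Fan out over arrays so later segments apply per element.
            for item in (val if isinstance(val, list) else [val]):
                if fk is None or (isinstance(item, dict) and str(item.get(fk)) == fv):
                    nxt.append(item)
        nodes = nxt
    return nodes

tx = {"meta": {"price": 5, "items": [{"cat": "grain", "q": 3, "w": 10},
                                     {"cat": "fuel", "q": 1, "w": 2}]}}
eval_selector("meta.items[cat=grain].q", tx)  # → [3]
```

Returning a list (rather than a scalar) keeps array fan-out and missing-path cases uniform; the agg step then reduces the list.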
4) BFS crawl algorithm (pseudocode)
Use adjacency stored in each transaction’s metadata (each tx includes rel.parents or rel.children links). ITEM gives start/stop hints and maxN.
queue = [ITEM.s]                       # start txid
seen = set()
count = 0
while queue and count < JOB.limit and count < CTL.maxN:
    txid = queue.pop(0)                # FIFO → breadth-first order
    if txid in seen:
        continue
    tx = fetch_tx(txid)
    process_tx_for_features(tx)        # extract meta fields per job.spec
    seen.add(txid)
    count += 1
    for child in tx.meta.rel.children:
        if child not in seen:
            queue.append(child)
    for parent in tx.meta.rel.parents:
        if parent not in seen:
            queue.append(parent)
    if txid == ITEM.e:                 # reached the declared stop tx
        break
Miners must be able to fetch rel.* from each tx deterministically.
5) Feature extraction semantics
For each spec entry:
- Evaluate sel on each tx's metadata in the visited set.
- Coerce values to the t type.
- Aggregate over the visited set using agg.
- Apply the w window if present (time or block window).
- Missing values use d or a defined imputation (0 or null).
Return a dense feature vector in a canonical order (order = spec array order).
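The coercion, aggregation, and default steps above can be sketched as follows; the helper names and the fallback default of 0 are assumptions, and the w window and "hist" aggregation are omitted for brevity:

```python
# Sketch of coercion + aggregation + defaulting for one spec entry.
# Assumes selector evaluation already yielded a flat list of raw values
# over the visited set.
AGGS = {
    "sum": sum,
    "avg": lambda v: sum(v) / len(v),
    "count": len,
    "min": min,
    "max": max,
    "first": lambda v: v[0],
    "last": lambda v: v[-1],
}

COERCE = {"num": float, "bool": bool, "text": str, "cat": str}

def extract_feature(spec, raw_values):
    """spec: one entry of j.spec; raw_values: selector hits over the visited set."""
    vals = [COERCE[spec["t"]](v) for v in raw_values]
    if not vals:
        return spec.get("d", 0)  # missing data: use d, else impute 0
    return AGGS[spec["agg"]](vals)

spec = {"n": "wt_sum", "sel": "meta.items.w", "agg": "sum", "t": "num"}
extract_feature(spec, [10, 2, 7])  # → 19.0
```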
6) OP_AI_EVAL execution & output
- j.mod must point to a deterministic, lightweight algorithm representation that miners can execute (e.g., a small parametric model, ruleset, linear model, or decision tree). The model should be encoded as compact JSON or bytecode with a well-specified interpreter built into miner validation.
- The evaluation result object written back should include:
  - item (ITEM id)
  - job (jid)
  - feat_hash (hash of the canonicalized feature vector)
  - model (model id and version)
  - result (score, label, action)
  - optional cid pointing to the full result blob (if too big)
- This result is packed into the UPDATE tx declared in o. If o.sig is true, the updater must sign.
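Assembling and packing the result object can be sketched as below. SHA-256 over sorted-key, whitespace-free JSON is an assumed stand-in for the canonicalization rules (see section 9); the helper names are illustrative:

```python
# Sketch: assemble the evaluation result and pack it for the UPDATE tx.
# Field names follow the UPDATE format in section 8.
import hashlib
import json

def canonical(obj):
    # Assumed canonical form: sorted keys, no whitespace, UTF-8.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")

def make_update_payload(item_id, jid, model, features, result, cid=None):
    payload = {
        "item": item_id,
        "jid": jid,
        "model": model,
        "feature_hash": "0x" + hashlib.sha256(canonical(features)).hexdigest(),
        "result": result,  # e.g. {"score": ..., "label": ..., "action": ...}
    }
    if cid is not None:
        payload["cid"] = cid  # full result blob lives off-chain
    return payload

p = make_update_payload("ITEM1", "J1", "linreg:v1",
                        {"wt_sum": 19.0, "price_avg": 5.0},
                        {"score": 0.71, "label": "ok", "action": "settle"})
```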
7) Example compact ITEM metadata (fits well within ~800 chars)
Canonical JSON, no spacing (short keys). This is ready to be placed as an ITEM metadata field:
{"v":"1","s":"txStart123","e":"txStop999","j":{"jid":"J1","m":"OP_AI_EVAL","mod":"linreg:v1","seed":"0xabc","limit":500,"spec":[{"n":"wt_sum","sel":"meta.items.w","agg":"sum","t":"num"},{"n":"price_avg","sel":"meta.price","agg":"avg","t":"num"},{"n":"tx_cnt","sel":"txid","agg":"count","t":"num"}]},"o":{"ut":"UPDATE","ref":"item_id","fields":["score","action","cid"],"sig":true},"p":{"creator":"addr1","ts":1728000000,"nonce":"42"},"ctl":{"det":true,"maxN":200,"gas":200000}}
8) UPDATE tx format (recommended)
When writing the evaluation back, use canonical JSON in the UPDATE tx metadata:
{"item":"<ITEM.id>","jid":"J1","model":"linreg:v1","feature_hash":"0x…","result":{"score":0.71,"label":"ok","action":"settle"},"cid":"bafy…","ts":1728…,"sig":"0x…"}
If cid present, it points to full payload stored elsewhere (IPFS/CID or on-chain blob).
9) Determinism, miner constraints & security
- Determinism: OP_AI_EVAL must be deterministic. No randomness unless a seed is provided and used consistently.
- Complexity limits: include limit/maxN/gas to bound the work miners must do. Reject jobs without limits.
- Model size: require compact models (well under the on-chain compute/time budget). For heavier models use commit/verify: a miner verifies a commit hash, and a zero-knowledge or fraud-proof style scheme attests to the result off-chain.
- Canonicalization: define canonical JSON ordering, encoding, float precision, rounding, and hashing rules so different miners compute the same feature_hash.
- Access control: use o.sig and an updater whitelist in o to prevent unauthorized writes.
- Audit trail: include provenance (p) and ensure every UPDATE references item and jid.
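One way to pin the canonicalization rules down so every miner derives the same feature_hash, sketched in Python; the 8-decimal rounding and SHA-256 are illustrative choices, not part of the spec:

```python
# Sorted keys, no whitespace, UTF-8, fixed float precision before hashing.
import hashlib
import json

def _normalize(obj):
    if isinstance(obj, float):
        return round(obj, 8)  # fixed precision before serialization
    if isinstance(obj, dict):
        return {k: _normalize(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [_normalize(v) for v in obj]
    return obj

def feature_hash(vec):
    blob = json.dumps(_normalize(vec), sort_keys=True,
                      separators=(",", ":")).encode("utf-8")
    return "0x" + hashlib.sha256(blob).hexdigest()

# Key order and float noise below the precision cutoff must not change the hash:
a = feature_hash({"wt_sum": 19.0, "price_avg": 5.1})
b = feature_hash({"price_avg": 5.1 + 1e-12, "wt_sum": 19.0})
assert a == b
```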
10) Best practices & patterns
- Short keys to save space. Provide a mapping doc off-chain for human readability.
- Version v so you can evolve the schema safely.
- Keep selectors simple and implement a tiny deterministic interpreter inside miner code; avoid full Turing-complete logic inside selectors.
- Use CIDs for large payloads (store full evaluation details off-chain and put the CID in the UPDATE).
- Two-tier model: on-chain for quick deterministic scoring (rules, linear models, trees), off-chain for heavy ML with a commit/verify or oracle pattern.
- Gas & timeouts: require the job to specify maximum gas/time; miners skip jobs that exceed limits; validators reject noncompliant executions.
- Testing harness: release a local test harness that canonicalizes txs, runs the BFS, extracts features, runs OP_AI_EVAL, and produces UPDATEs; this helps miners/validators implement the behavior identically.