Skip to content

Evidence Engine

Research

friday. — our in-house evidence engine for LLM systems.

Instead of just vectors, friday. stores structured claims with source, temporal scope and trust level. We build it ourselves — and publish what we learn.

The claim model

Every entry in friday. is a claim — not an embedding vector. A claim has a statement, a confidence between 0 and 1, a trust level, source references, a validity window and optionally a structured triple (subject → predicate → object).

  • statement, confidence (0.0–1.0)
  • trust_level (L0–L3), source_refs[]
  • valid_from / valid_until — claims can age
  • RDF-style triple for graph traversal

Hybrid retrieval without compromise

friday. fuses three retrieval methods into a single answer. The same query hits vector, full-text and graph indices — the results are merged into a `ScoredClaim[]` and packed together with the sources into an evidence pack.

  • HNSW vector search for semantic similarity
  • BM25 full-text search for literal matches
  • Graph BFS over entity-relation edges for multi-hop reasoning
  • Fusion: ranked ScoredClaims + sources with authority score

Why build our own database?

Classic vector stores answer similarity. But compliance reality needs evidence-grade claims. We build friday. ourselves because off-the-shelf plug-ins have three gaps we are not willing to accept.

  • Vector search finds similarity — not evidence-grade claims.
  • Classic stores have no temporal scope — an outdated guideline looks identical to a current one.
  • No representation of trust — every piece of content weighs the same.
  • Conflicting claims are not detected, let alone handled.

What friday. does.

Claim model

Structured statements with subject / predicate / object, source and confidence. Unlike an embedding, a claim is referenceable — the citation trail can point to it.

Trust levels L0–L3

L0 unvalidated, L1 sourced (web references without review), L2 reviewed by domain expertise, L3 official source (laws, guidelines). Every claim has a level — and can lose it.

Temporal scope

`valid_from` / `valid_until` make explicit when a claim was true and when it was superseded. Answers can be pinned to an as-of date.

Hybrid retrieval

Vector + BM25 + graph run in parallel; their results are fused. A query hits all three paths — what semantic search misses is caught by the graph or full-text path.

Research

What we are researching.

friday. is not "done". We develop it actively and share what we learn. Three open questions we are currently working on:

Open question 1

Evidence recall in multi-hop retrieval

How do we measure recall cleanly when an answer needs three linked claims via graph hops? Classic recall metrics assume a single retrieval step — multi-hop breaks the model.

Open question 2

Trust decay over time

A 2014 guideline must not disappear just because a 2024 update exists. How do we model trust decay so that older claims stay reachable in the right context?

Open question 3

Federated friday.

Can tenants share claims without exposing their raw data? We are evaluating federation patterns for industry consortia — claims at an abstract level, sources kept local.

We will finalize the publicly communicated research questions together with Marius before the public launch — this list is a working state, not a promise.

How friday. flows into the product.

edith. fills friday. automatically with new claims from web research and document pipelines. The reasoner consumes via the MCP tool `search_knowledge` and, when a gap is detected, sends a `request_research` call back to edith. A closed loop — the platform actively learns.

Publications.

More to come — blog posts, whitepapers and repos will follow once we have concrete results on the questions above. Vague gestures without substance are not useful.

What an evidence pack contains.

A reasoning answer draws its sources from an evidence pack — structured claims plus the sources they came from.

  1. 01 claim.statement

    "Wound size must be documented on every visit."

  2. 02 claim.trust_level

    L2 — reviewed and confirmed by domain expertise.

  3. 03 claim.source_refs

    src-001 (EPUAP guideline, authority_score 0.95, valid from 2019-01-01).

  4. 04 claim.structured

    Subject "Dekubitus" — predicate "requires" — object "wound documentation".

  5. 05 fusion.score

    0.87 — combined score across vector, BM25 and graph retrieval.

The reasoner's citation trail rides on this structure — every cited span maps back to a claim with a trust level and a source.

For engineering and research teams.

friday. is written in Rust, runs as a single binary with gRPC and REST endpoints, and uses RocksDB as its storage engine. Atomic writes, Bloom filters, 10 column families, deterministic ingestion via blake3 deduplication. No LLM call is needed to write.

friday. in the docs
  • Rust + RocksDB, single binary
  • gRPC for agents, REST `POST /api/search` for tools
  • Built-in MCP server for direct LLM tool wiring
  • .fka format for deterministic, idempotent ingestion

Want to see more?

We can show this pillar on one of your own examples — discovery call, no sales pressure.