Proof & Safety

Honest, early, working.

Synthetic multilingual demo cases, a pinned gold-set scoreboard you can reproduce from the CLI, and a deterministic-first architecture where the LLM only verbalizes pre-computed facts. No clinical validation is claimed; the non-claims list at the bottom is explicit.

Source-cited by construction

Every finding carries its document, page and source language.

The packet is an auditable record: each value links back to the document and page it came from, hashed for integrity, and stamped when a clinician attests it.

MedLineage · Verified packet

attested

Case A-001 · Nephrology

3 docs · 2022 → 2024 · IT · DE · EN

Creatinine 1.8 mg/dL
└ [lab_report · p.2 · IT]
eGFR 44 mL/min
└ [lab_report · p.2 · IT]
Hemoglobin 11.9 g/dL
└ [discharge · p.1 · DE]
Potassium 4.6 mmol/L
└ [lab_report · p.3 · EN]

sha256 a91f3c…e7d2

provenance ledger · stamps next build
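
As a minimal sketch of how a packet like the one above could pair each value with its citation and a content hash: the `Finding` class, the `packet_digest` helper and the canonical-JSON choice are illustrative assumptions, not MedLineage's actual schema or hashing scheme. The values are the ones shown in the sample card.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Finding:
    """Hypothetical record: one value plus the citation it came from."""
    name: str
    value: float
    unit: str
    doc_type: str   # e.g. "lab_report"
    page: int
    language: str   # source language of the cited page


def packet_digest(case_id: str, findings: list[Finding]) -> str:
    """SHA-256 over a canonical JSON form of the packet.

    Sorting keys and fixing separators makes the digest reproducible,
    so any change to a value or a citation changes the hash.
    """
    payload = {"case_id": case_id, "findings": [asdict(f) for f in findings]}
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


findings = [
    Finding("Creatinine", 1.8, "mg/dL", "lab_report", 2, "IT"),
    Finding("eGFR", 44, "mL/min", "lab_report", 2, "IT"),
    Finding("Hemoglobin", 11.9, "g/dL", "discharge", 1, "DE"),
    Finding("Potassium", 4.6, "mmol/L", "lab_report", 3, "EN"),
]
digest = packet_digest("A-001", findings)
```

Recomputing the digest over the same findings yields the same 64-character hex string, while editing any value or page reference yields a different one, which is what makes the record checkable after the fact.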

Not a generic summarizer

Summaries compress records. MedLineage prepares cases.

A generic summarizer can tell you what appears in the files. Complex care needs more than that. Before a specialist can review a case, someone has to reconstruct the timeline, verify the evidence, identify missing records, and decide what is ready for clinical handoff.

MedLineage turns fragmented records into a structured packet where every claim can be traced back to source document, page, and language. The output is not a paragraph. It is a reviewable clinical artifact.

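Comparison: Generic LLM summarizer vs. MedLineage record-prep layer

Output
Generic LLM summarizer: Free-text summary
MedLineage record-prep layer: Structured specialist packet

Evidence
Generic LLM summarizer: Usually cites files loosely, if at all.
MedLineage record-prep layer: Claim-level source citations: document, page, language.

Missing records
Generic LLM summarizer: Summarizes what is present.
MedLineage record-prep layer: Flags absent or stale records with specialty-aware checks.

Auditability
Generic LLM summarizer: Hard to replay or verify.
MedLineage record-prep layer: Deterministic structure, provenance trail, clinician-reviewable output.

Workflow fit
Generic LLM summarizer: Useful reading aid.
MedLineage record-prep layer: Case-preparation layer for second opinions, referrals, and specialist intake.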

Measured on synthetic complex-care demo

From fragmented records to specialist-ready intake.

MedLineage turns messy, multilingual medical records into cited second-opinion packets with timeline reconstruction, source provenance, missing-record detection and readiness scoring.

Synthetic demo data

Source-cited transformation

documents in

51 pages · 5 languages · 8 doc types

fragmented medical records: PDF · FHIR · HL7 · CSV · XML

MedLineage

clinical record intelligence

specialist-ready packet

8 pages · 9 events · 37 cited findings

schema-valid · PDF + JSON

IT · EN · FR · ES · DE

Measured impact · synthetic demo set · conservative estimates · not clinical outcome claims

Input: 51 pages · 10 documents · 5 languages
Output: 8-page specialist packet · 37 cited findings · 14 completeness checks

100 min · Estimated intake minutes saved (estimated intake time reduction, not a clinical outcome claim)
97.4% · Clinical findings cited to source (37 source-cited clinical findings; percentage is of cite-eligible findings)
51 · Input pages compressed
9 · Clinical events reconstructed
8 · Document types normalized
5 · Languages normalized
14 · Completeness checks performed (every expected record category and priority metric the engine evaluated)
2 · Missing-record checks run (subset of completeness checks where a record was flagged absent or stale)

Source-grounded · every claim cited
Deterministic-first · LLM only phrases pre-computed facts
Organizes records · does not diagnose or treat

Total intake estimate includes triage reading, document classification, language normalization, timeline reconstruction, completeness checks, and packet assembly. The demo PDF shows the triage-reading component only.

Completeness checks evaluate every expected record category and priority metric for the case specialty. Missing-record checks count only the categories the engine flagged as absent or stale in the demo.
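
The relationship between the two counts can be sketched as follows. This is an illustration under stated assumptions: the `Expectation` class, the category names and the age thresholds are hypothetical and are not taken from MedLineage's rule tables.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Expectation:
    """Hypothetical rule: a record category the specialty expects to see."""
    category: str
    max_age_days: int  # older than this counts as stale (assumed threshold)


def run_checks(
    expected: list[Expectation],
    latest_by_category: dict[str, date],
    today: date,
) -> dict[str, str]:
    """Evaluate every expected category; return its status."""
    results = {}
    for exp in expected:
        latest = latest_by_category.get(exp.category)
        if latest is None:
            results[exp.category] = "absent"
        elif (today - latest).days > exp.max_age_days:
            results[exp.category] = "stale"
        else:
            results[exp.category] = "present"
    return results


expected = [
    Expectation("pathology_report", 365),
    Expectation("colonoscopy_report", 730),
    Expectation("ct_staging", 180),
]
records = {
    "pathology_report": date(2024, 3, 1),
    "colonoscopy_report": date(2021, 1, 10),
}
results = run_checks(expected, records, today=date(2024, 6, 1))

completeness_checks = len(results)  # every category evaluated
missing_record_checks = sum(s in ("absent", "stale") for s in results.values())
```

In this toy case three checks are evaluated and two are flagged (one stale, one absent), mirroring how the missing-record count is a subset of the completeness count.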

Synthetic fixture, not clinical validation. Metrics describe packet-generation output on demo records. Time-saved values are conservative estimates, not clinical outcome claims.

Flagship case: colorectal_oncology_complex · synthetic colorectal-oncology workup across 5 languages and 10 documents · demo-impact-metrics.json

Where we are

Synthetic-set scoreboard.

Deterministic post-extraction scoring over a pinned multilingual gold set. Synthetic fixtures only, not clinical validation. Every number reproduces by running: python -m eval_harness --gold golden_set/gold_set.jsonl

1.00

Extraction F1

253+ synthetic gold facts · 58 unique TP · 0 FP

100%

Provenance coverage

every recovered fact cites its source

0 / 58

Safety redactions

no synthetic case tripped the safety net

73 slices · DE / EN / ES / FR / IT · cardiology, gastroenterology, general_internal_medicine, oncology_colorectal
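
For readers who want to see how an F1 figure follows from the counts above, here is a short sketch using the standard definition. The function name is illustrative and this is not the eval_harness code. The true-positive and false-positive counts come from the scoreboard; the false-negative count of zero is an assumption, since the page does not state it.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1: harmonic mean of precision and recall.

    Written as 2*TP / (2*TP + FP + FN), which is algebraically the same
    and avoids dividing by zero when precision or recall is undefined.
    """
    denominator = 2 * tp + fp + fn
    return 0.0 if denominator == 0 else 2 * tp / denominator


# 58 unique TP and 0 FP are from the scoreboard; fn=0 is assumed.
scoreboard_f1 = f1_score(tp=58, fp=0, fn=0)
```

Any false positive or false negative lowers the score, for example `f1_score(8, 2, 2)` gives 0.8.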

View full scoreboard JSON · gold_set.jsonl manifest

Status

Honest, early, working.

No invented testimonials, no vanity metrics — these tiles update as real-case evidence lands.

Per run · measurable outputs
Handoff packet PDF · FHIR-like Bundle · Data Room tar.gz · Work Queue tasks · MDT briefing · ledger attestations
In progress
Testing with real multilingual patient cases · clinician review programme
Shipped
Patient Graph · Provenance Ledger · FHIR-like Bundle · signed Data Room · Clinical Work Queue · safe agents

Deterministic-first architecture

What is structural — and what the LLM is allowed to do.

Document classification, the clinical timeline, source provenance, missing-record detection, priority, confidence and trends are all computed by deterministic modules. The LLM only verbalizes those facts in the chosen output language. It never invents structure, priority, confidence, provenance, or trends.

Deterministic, structural

Document classification — typed schema, regex + model_router; never LLM-only.
Clinical timeline — chronological merge of typed observations and clinical events.
Provenance — doc_id + page + source language attached to every claim after the LLM.
Completeness — specialty-aware rule tables, three buckets (critical / recommended / contextual).
Priority, confidence, trends — deterministic scoring; LLM-assigned priority is always overwritten.
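
Two of the deterministic steps listed above, the chronological merge and the priority overwrite, could look roughly like this. It is a sketch under assumptions: the `Event` fields, the `score_priority` rule and the event kinds are hypothetical stand-ins, not MedLineage's modules.

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)
class Event:
    """Hypothetical typed event carrying its own citation."""
    when: date
    kind: str
    doc_id: str
    page: int
    language: str
    priority: str = "unset"


def merge_timeline(*sources: list[Event]) -> list[Event]:
    """Merge per-document event lists into one chronological timeline.

    Ties on date are broken by doc_id and page so the order is stable
    across runs.
    """
    merged = [event for source in sources for event in source]
    return sorted(merged, key=lambda e: (e.when, e.doc_id, e.page))


def score_priority(event: Event) -> str:
    """Assumed toy rule; a real rule table would be specialty-aware."""
    return "high" if event.kind in {"pathology", "imaging"} else "routine"


def enforce_priority(events: list[Event]) -> list[Event]:
    """Overwrite whatever priority arrived with the event."""
    return [replace(e, priority=score_priority(e)) for e in events]


italian_docs = [Event(date(2023, 5, 2), "pathology", "doc-02", 1, "IT", priority="low")]
german_docs = [
    Event(date(2022, 11, 14), "visit", "doc-01", 3, "DE", priority="high"),
    Event(date(2024, 1, 9), "imaging", "doc-03", 2, "DE"),
]
timeline = enforce_priority(merge_timeline(italian_docs, german_docs))
```

The priorities supplied on the input events are ignored: the pathology event arrives as "low" and leaves as "high", and the visit arrives as "high" and leaves as "routine".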

LLM-allowed

Phrasing — turn the typed facts into clinician-readable prose in the chosen output language.
Localization — render specialty names, document labels and dates per locale.
Safety — five-language guardrail regex redacts violations after the phrasing pass.
Never invents structure, priority, confidence, provenance, or trends.
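
A post-phrasing guardrail of the kind described above might be sketched like this. The patterns are hypothetical examples of treatment-recommendation phrasing in the five languages and are not MedLineage's actual rules; sentence-level redaction is likewise an assumption.

```python
import re

# Hypothetical patterns, one per supported language.
GUARDRAIL_PATTERNS = {
    "EN": r"\bwe recommend\b",
    "IT": r"\bsi raccomanda\b",
    "DE": r"\bwir empfehlen\b",
    "FR": r"\bnous recommandons\b",
    "ES": r"\bse recomienda\b",
}
GUARDRAIL = re.compile("|".join(GUARDRAIL_PATTERNS.values()), re.IGNORECASE)


def redact(text: str) -> tuple[str, int]:
    """Replace any sentence that matches a guardrail pattern.

    Returns the cleaned text and the number of redactions, so a run can
    report a count such as the scoreboard's safety-redaction tally.
    """
    # Split after sentence-ending punctuation followed by whitespace;
    # decimals such as "1.8" are left intact.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    kept, redactions = [], 0
    for sentence in sentences:
        if GUARDRAIL.search(sentence):
            kept.append("[redacted]")
            redactions += 1
        else:
            kept.append(sentence)
    return " ".join(kept), redactions


clean, count = redact("Creatinine is 1.8 mg/dL. We recommend starting dialysis.")
```

Because the check runs after the phrasing pass and uses only regular expressions, its result does not depend on the LLM.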

Human approval is still required. MedLineage organizes records for clinician review — it does not diagnose, does not recommend treatment, and does not replace clinicians.

Explicit non-claims

What MedLineage is not.

We make this list explicit so visitors and reviewers can’t infer a compliance or clinical posture we haven’t earned.

MedLineage is not a medical device, is not certified for FHIR / GDPR / EHDS conformance, and does not provide diagnostic capability or guaranteed anonymization. It organizes clinical records to support visit preparation, specialist handoff and clinician-review-ready downstream integration.

Inspect the artifacts.

Every number above reproduces from the pinned gold set and the public benchmark JSON.
