Clinical Code Resolution — term → SNOMED concept

A task shared by two BloodGPT pipelines: turn a natural-language medical term (“headache”, “колоноскопия с биопсией” (Russian for “colonoscopy with biopsy”), “Kopfschmerzen”) into a standardized SNOMED concept ID. The solution differs by context.

Two pipelines, one problem

	`narrative-to-fhir` (deployed)	V0.5 Mastra write tools (decided, not deployed)
Trigger	Analysis uploaded → narrative extraction → FHIR builders	LLM agent in chat / survey / document-import calls `recordSymptom`/`recordCondition`/etc.
Code path	`analysis-core/services/narrative-to-fhir/snomed-coder.ts`	`bloodgpt-for-business-mastra-agents` (branch `feat/mastra-agents`)
Resolution	Two-tier: hardcoded `SNOMED_QUICK_LOOKUP` + LLM fallback (`gpt-5.2`)	LLM returns an English term → `$expand` via a terminology server
Source of truth	Port of Python `fhir-services/snomed_service.py`	llm-numeric-codes-policy decision (session `7ff79368`, 2026-03-30)
Status	Production	V0.5: `$expand` NOT implemented, dedup by name. V1+: planned.

Why two different solutions? The pipelines evolved separately: narrative-to-fhir is a pragmatic port from Python with a static lookup table, while the V0.5 Mastra tools are a greenfield design built with a terminology server in mind. The open question is convergence: should narrative-to-fhir also move to a terminology server for consistency, or keep the two-tier design as a pragmatic optimization (faster, no external dependency, validated through a regular audit)?

Pipeline A: `narrative-to-fhir` — hardcoded table + LLM fallback

Implementation: packages/analysis-core/src/services/narrative-to-fhir/snomed-coder.ts (entry point: getSnomedCode(text, category, options)). Caller: narrative-to-fhir.ts, phase 2a.

Two-tier architecture

text term
   ↓
[ Tier 1: SNOMED_QUICK_LOOKUP ]  — hardcoded in-memory, ~80 entries, O(1)
   ↓ (miss)
[ Tier 2: LLM fallback ]          — gpt-5.2, strict JSON, "never invent"
   ↓ (null/low confidence)
[ caller: FHIR resource without coding, `text` only ]
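
The last step of the diagram can be sketched as follows. This is a minimal illustration, not code from the repository: the function name `toCodeableConcept` and the shape of its `resolved` argument are hypothetical, while the `CodeableConcept` layout and the `http://snomed.info/sct` system URI follow FHIR.

```typescript
// Sketch of the final fallback: attach a SNOMED coding if one was resolved,
// otherwise keep the free text only. Names are hypothetical, not from the repo.
type Coding = { system: string; code: string; display?: string };
type CodeableConcept = { text: string; coding?: Coding[] };

const SNOMED_SYSTEM = "http://snomed.info/sct";

function toCodeableConcept(
  text: string,
  resolved: { code: string; display: string } | null,
): CodeableConcept {
  // Both tiers missed: emit a text-only concept rather than a guessed code.
  if (resolved === null) return { text };
  return {
    text,
    coding: [{ system: SNOMED_SYSTEM, code: resolved.code, display: resolved.display }],
  };
}
```

Keeping the original `text` in both branches means a consumer can always show what the patient or document actually said, whether or not a code was found.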

Tier 1 — `SNOMED_QUICK_LOOKUP`

A plain in-memory dictionary, Record<string, [code, display]>, holding the ~80 terms that occur most often in narratives:

Chronic conditions (~30): diabetes → 73211009, hypertension → 38341003, asthma → 195967001, chronic kidney disease → 709044004
GI / vascular / musculoskeletal / cardiac / urological / ophthalmic / endocrine / hematologic (~20)
Symptoms (~10): headache → 25064002, fatigue → 84229001, chest pain → 29857009, dyspnea → 267036007
Allergies (~4): peanut allergy → 91935009, penicillin allergy → 91936005
Procedures (~15): appendectomy → 80146002, colonoscopy → 73761001, biopsy → 86273004
Medications (~9): metformin → 372567009, insulin → 67866001, atorvastatin → 373444002

Normalization is simple: text.toLowerCase().trim().replace(/\s+/g, " ").

Allergy special case: if category = allergy and the direct match fails, the lookup retries with ${normalized} allergy (this covers the LLM emitting the bare substance “penicillin” instead of “penicillin allergy”). The match is returned with confidence 1.0.
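
The Tier 1 logic above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual `snomed-coder.ts`: the table holds only four codes quoted on this page, the display strings are placeholders, and the names `normalizeTerm`, `quickLookup`, and `QuickHit` are hypothetical.

```typescript
// Minimal sketch of a Tier 1 lookup; not the actual snomed-coder.ts code.
// Codes are quoted from this page; display strings are placeholders and
// every name except SNOMED_QUICK_LOOKUP is hypothetical.
type QuickHit = { code: string; display: string; confidence: number };

const SNOMED_QUICK_LOOKUP: Record<string, [string, string]> = {
  headache: ["25064002", "Headache"],
  "chest pain": ["29857009", "Chest pain"],
  "peanut allergy": ["91935009", "Peanut allergy"],
  "penicillin allergy": ["91936005", "Penicillin allergy"],
};

// Same normalization as described above: lowercase, trim, collapse whitespace.
function normalizeTerm(text: string): string {
  return text.toLowerCase().trim().replace(/\s+/g, " ");
}

function quickLookup(text: string, category: string): QuickHit | null {
  const normalized = normalizeTerm(text);
  const candidates = [normalized];
  // Allergy special case: on a direct miss, retry with the " allergy" suffix.
  if (category === "allergy") candidates.push(`${normalized} allergy`);
  for (const key of candidates) {
    // The own-property check keeps inherited keys such as "constructor" out.
    const entry = Object.prototype.hasOwnProperty.call(SNOMED_QUICK_LOOKUP, key)
      ? SNOMED_QUICK_LOOKUP[key]
      : undefined;
    if (entry) return { code: entry[0], display: entry[1], confidence: 1.0 };
  }
  return null; // miss: the caller falls through to Tier 2
}
```

In this sketch the direct key is tried before the suffixed one, so an input that is already “penicillin allergy” never gets a second suffix appended.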

Tier 2 — LLM fallback

generate() runs with the model from the Langfuse-managed prompt config (fallback: openai/gpt-5.2, maxTokens 512). The system prompt is the Langfuse prompt snomed_code_mapping, falling back to the in-code defaultSnomedSystemPrompt:

«You are a medical coding specialist. Map the supplied medical term to the most appropriate SNOMED CT concept. Only return real SNOMED CT codes. Never invent codes. Return null values if you are not confident.»

Strict JSON schema: { snomed_code: string|null, snomed_display: string|null, confidence: number }. On low confidence, a schema mismatch, or an LLM error the result is a null fallback, and the caller continues without coding (text only).
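
The null-fallback behaviour can be sketched as a defensive parser. This is an illustration only: `parseSnomedReply` and `SnomedReply` are hypothetical names, and the 0..1 confidence range is an assumption not stated on this page. The 6 to 18 digit check reflects the general format of SNOMED CT identifiers.

```typescript
// Sketch of null-fallback parsing for a Tier 2 reply; names are hypothetical.
type SnomedReply = { snomed_code: string; snomed_display: string; confidence: number };

function parseSnomedReply(raw: string): SnomedReply | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (typeof data !== "object" || data === null) return null;
  const { snomed_code, snomed_display, confidence } = data as Record<string, unknown>;
  // The model is told to return nulls when unsure; treat that as "no coding".
  if (typeof snomed_code !== "string" || typeof snomed_display !== "string") return null;
  // SNOMED CT identifiers are digit strings of 6 to 18 characters.
  if (!/^\d{6,18}$/.test(snomed_code)) return null;
  // Assumption: confidence is reported on a 0..1 scale.
  if (typeof confidence !== "number" || !(confidence >= 0 && confidence <= 1)) return null;
  return { snomed_code, snomed_display, confidence };
}
```

A format check like this rejects malformed output but cannot tell whether a well-formed code is the right concept; that gap is what the audit practice further down addresses.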

The Langfuse trace name is snomed-map-${category}, for observability. The skipLlm: true option is for tests.

Parallel execution (narrative-to-fhir.ts:193)

All SNOMED jobs run in parallel via Promise.all:

const snomedResults = await Promise.all(
  allJobs.map((j) =>
    getSnomedCode(j.term, j.category, { traceId: testId, language })
  )
);

Comment from the code:

Previously this phase ran one sequential getSnomedCode per entity: for a document with 20 narrative entities, an LLM-fallback miss on each added ~2s × 20 = ~40s of wall-clock latency. Running the lookups concurrently collapses that to one LLM round-trip’s worth of time.

Six entity categories are collected (conditions / medications / allergies / procedures / family_history / symptoms), filtered to non-empty terms, concatenated into a single allJobs array, passed through Promise.all, and sliced back per category with the original order preserved.
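
The collect, resolve, and slice-back step can be sketched like this. It is a simplified illustration rather than the code in narrative-to-fhir.ts: `resolveGroups` and `Group` are hypothetical names, and the resolver is passed in as a parameter so that the sketch does not depend on the real `getSnomedCode`.

```typescript
// Sketch of "concat, Promise.all, slice back"; all names here are hypothetical.
type Group = { category: string; terms: string[] };

async function resolveGroups<R>(
  groups: Group[],
  resolve: (term: string, category: string) => Promise<R>,
): Promise<R[][]> {
  // Drop empty terms first so the slice offsets below match the job list.
  const cleaned = groups.map((g) => ({
    category: g.category,
    terms: g.terms.filter((t) => t.trim() !== ""),
  }));
  const allJobs: Array<{ term: string; category: string }> = [];
  for (const g of cleaned) {
    for (const term of g.terms) allJobs.push({ term, category: g.category });
  }
  // Promise.all keeps results in input order, whatever the completion order.
  const results = await Promise.all(allJobs.map((j) => resolve(j.term, j.category)));
  // Slice the flat result list back into one array per input group.
  const perGroup: R[][] = [];
  let offset = 0;
  for (const g of cleaned) {
    perGroup.push(results.slice(offset, offset + g.terms.length));
    offset += g.terms.length;
  }
  return perGroup;
}
```

Because Promise.all rejects as soon as any job rejects, this shape relies on the resolver returning null on failure rather than throwing, which matches the null fallback described for Tier 2.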

Pipeline B: V0.5 Mastra write tools — English term + `$expand`

The decision was made in session 7ff79368 (2026-03-30) and is documented in llm-numeric-codes-policy, section B.

Pattern

The LLM agent is instructed in the tool description: “Always provide English medical term” (not a numeric code)
On the write side: term → $expand operation on a FHIR terminology server → SNOMED concept ID
Exact-match dedup by resolved code
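
A possible shape for the planned write-side resolution is sketched below. `$expand` is not implemented in V0.5, so everything here is an assumption: the function names are hypothetical, and picking the concept whose display equals the term is only one possible selection strategy. The `url`, `filter`, and `count` parameters and the implicit value set `http://snomed.info/sct?fhir_vs` are standard FHIR.

```typescript
// Sketch of the planned resolution; $expand is not implemented in V0.5.
// Function names and the selection strategy are hypothetical.
type Contains = { system?: string; code?: string; display?: string };
type ValueSetExpansion = { expansion?: { contains?: Contains[] } };

// Builds a ValueSet/$expand request over the implicit "all of SNOMED CT" value set.
function buildExpandUrl(baseUrl: string, englishTerm: string): string {
  const params: Array<[string, string]> = [
    ["url", "http://snomed.info/sct?fhir_vs"],
    ["filter", englishTerm],
    ["count", "10"],
  ];
  const query = params.map(([k, v]) => `${k}=${encodeURIComponent(v)}`).join("&");
  return `${baseUrl}/ValueSet/$expand?${query}`;
}

// Picks the concept whose display equals the term (case-insensitive), if any.
function pickExactMatch(vs: ValueSetExpansion, englishTerm: string): Contains | null {
  const wanted = englishTerm.trim().toLowerCase();
  const hit = (vs.expansion?.contains ?? []).find(
    (c) => c.code !== undefined && c.display?.toLowerCase() === wanted,
  );
  return hit ?? null;
}

// Exact-match dedup: a record is a duplicate if its resolved code is already stored.
function isDuplicate(existingCodes: Set<string>, resolved: Contains | null): boolean {
  const code = resolved?.code;
  return code !== undefined && existingCodes.has(code);
}
```

Deduplicating on the resolved code rather than the name means “headache” and “cephalalgia” would collapse into one record once both resolve to the same concept, which name-based dedup in V0.5 cannot do.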

Deployment progression

V0.5 (current): $expand is NOT implemented; dedup is by (English) name. Sufficient for a prototype.
V1: $expand via tx.fhir.org (an external FHIR terminology server, verified working in session 7ff79368)
V1+ (Production): Snowstorm Lite (Docker sidecar) once SLA / latency starts to matter

Where it is used

Five write tools in V0.5: recordSymptom, recordMedication, recordAllergy, recordProcedure, recordFamilyHistory. They run in the health-chat / survey / document-import contexts. 17/17 tests passed against a live HAPI FHIR server at the time of the decision.

Trial history: what did not work

In session 7ff79368 (the decision process for llm-numeric-codes-policy plus verification on gpt-4o-mini) the following were tried:

Bare LLM SNOMED coding (Variant C from the RFC): the model generates the SNOMED code itself and flags the ones it is unsure of. It failed. Verified on gpt-4o-mini: the same garbage code 431855005 (CKD stage 1) came back for different terms: "усталость" (fatigue), "метформин" (metformin), "диабет" (diabetes), "изжога" (heartburn). The migraine code is unstable: 37796000 vs 37796009 (a one-digit difference) on regenerate. Lesson: LLMs cannot reliably reproduce numeric identifiers (long, low-frequency token sequences).
- BUT: narrative-to-fhir Tier 2 does use an LLM for coding. Mitigations: strict JSON schema (rejects non-digit codes), the “Never invent” instruction, a confidence score, gpt-5.2 (more stable than gpt-4o-mini), and a null fallback on any schema mismatch. This is mitigated bare LLM coding, not plain Variant C.
JSON cache of the top 500 terms: narrow coverage, manual maintenance. Not adopted.
SQLite with SNOMED RF2: full SNOMED is ~500MB, too heavy for a lookup service. Embeddable but cumbersome.
Snowstorm Lite (Docker sidecar): an open-source SNOMED terminology server in a container, production-ready. One more container in the infrastructure. Deferred to V1+, when latency becomes critical.
tx.fhir.org $expand ✅: an external FHIR terminology server. Verified in session 7ff79368: valid codes, and multilingual normalization works. V1 target for the V0.5 tools.

Multilingual normalization: a shared pattern

Patient input can be in any language (Russian / German / Hebrew / etc.). FHIR terminology servers (tx.fhir.org, Snowstorm, Ontoserver) support only English (International Edition), so Kopfschmerzen will not be found by $expand.

Pattern: language normalization happens in the LLM layer, not in the terminology server. The LLM agent is either instructed “Always provide English medical term” (V0.5 tool description) or receives a language hint (narrative-to-fhir).

This works because the LLM knows medical terminology in several languages and translates into the canonical English medical term. Verified: the patient writes “Kopfschmerzen” → the LLM normalizes it to “headache” → the code is found. Division of responsibilities: LLM = translation + normalization, terminology server = code resolution.

Audit practice: terminology server `$lookup` against hardcoded codes

Static lookup tables in code (Tier 1 of narrative-to-fhir) drift away from the live SNOMED standard: concepts become inactive, get renamed, or are retired. Without validation this is a clinical safety risk.

Artur (May 2026) checked SNOMED_QUICK_LOOKUP against CSIRO Ontoserver $lookup (FHIR R4 terminology server, AU edition release 2026-04-30). Predicate: “the display text agrees with what the concept actually means”. $lookup returns the FSN + PT, so a simple string match suffices.

A concrete case: 30242009 was assigned “hyperplastic polyp of colon” in the hardcoded table, while in real SNOMED it is “scarlet fever”. Had it reached production, a patient with a colon-polyp biopsy would have received a scarlet fever diagnosis in the FHIR output. Several similar errors were found.
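
The display-consistency predicate can be sketched against the shape of a FHIR `$lookup` response. This is not the audit script that was actually run: the function names are hypothetical, and the types cover only the `display` parameter and the `value` part of each `designation`, which is where the preferred term and the FSN appear in a Parameters resource.

```typescript
// Sketch of the display-consistency check; not the audit script actually used.
// Types cover only the parts of a FHIR Parameters resource needed here.
type Part = { name: string; valueString?: string };
type Param = { name: string; valueString?: string; part?: Part[] };
type LookupResult = { resourceType: "Parameters"; parameter?: Param[] };

// Collects the display plus every designation value (FSN, synonyms).
function serverTerms(result: LookupResult): string[] {
  const terms: string[] = [];
  for (const p of result.parameter ?? []) {
    if (p.name === "display" && p.valueString) terms.push(p.valueString);
    if (p.name === "designation") {
      for (const part of p.part ?? []) {
        if (part.name === "value" && part.valueString) terms.push(part.valueString);
      }
    }
  }
  return terms;
}

// Strips a trailing semantic tag: "Scarlet fever (disorder)" becomes "scarlet fever".
function canonical(term: string): string {
  return term.toLowerCase().replace(/\s*\([^)]*\)\s*$/, "").trim();
}

// True when the table's display matches any term the server knows for the code.
function displayAgrees(tableDisplay: string, result: LookupResult): boolean {
  const wanted = canonical(tableDisplay);
  return serverTerms(result).some((t) => canonical(t) === wanted);
}
```

A strict string match like this will also flag harmless wording differences, so a failed check is a prompt for human review rather than proof that the code is wrong.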

Generalizable practice: pre-deploy validation against a terminology server for all hardcoded clinical codes. This is not SNOMED-specific; the same principle applies to LOINC / RxNorm / ICD lookups.

Carry-over: this practice deserves its own page, team/clinical-code-validation, as a general principle for every code system with hardcoded lookup tables. It is not tied to SNOMED.

Open questions

Convergence of narrative ↔ V0.5 write tools: keep the pipelines parallel (pragmatic optimization for batch processing vs a cleaner pattern for interactive use) or unify them? The trade-offs have not been assessed.
Confidence threshold for Tier 2: the narrative-to-fhir LLM returns a confidence, but callers do not filter on it. Should a threshold be applied?
Inactive concepts: $lookup returns inactive=true + replacedBy. The hardcoded table may contain a deprecated code. A replace policy is needed.
Refresh cycle for the hardcoded table: repeat Artur's audit after each SNOMED International release (2× per year). Triggers / automation?
Extending to RxNorm (medications): narrative-to-fhir currently codes medications through the SNOMED Pharmaceutical / biologic product hierarchy (metformin → 372567009). US Core recommends RxNorm. Entering the US market would require adding RxNorm resolution.
ICD-10-CM dual coding: US Core recommends SNOMED as primary + ICD-10-CM as secondary in a single CodeableConcept.coding[]. Not done today, but US billing would require it.

Related

snomed: the SNOMED CT standard (what it is, hierarchies, editions)
llm-numeric-codes-policy: the decision page that records the B-policy for V0.5 write tools (“do not trust the LLM for coding”); it links here for coding details
medical-context-survey: where the V0.5 Mastra tools are used (survey path)
fhir-condition: consumer of the SNOMED code in Condition.code
fhir-procedure: consumer of the SNOMED code in Procedure.code (with special-case allergy logic for AllergyIntolerance)
fhir-allergy-intolerance: consumer; bare substance “penicillin” → “penicillin allergy” suffix logic
langfuse: where the snomed_code_mapping prompt lives (with fallback to the in-code default)
biomarker-analysis-pipeline: narrative-to-fhir within the overall pipeline
agent-vs-workflow: a related pattern for V0.5 (structured LLM + deterministic resolver layer)

Sources

Sources: ¹ ² ³ ⁴ ⁵ ⁶ ⁷.

Footnotes

1. narrative-to-fhir implementation, accessed 2026-05-17, https://github.com/Realai-plus/bloodgpt-for-business/blob/main/packages/analysis-core/src/services/narrative-to-fhir/snomed-coder.ts.
2. Caller (parallel execution), accessed 2026-05-17, https://github.com/Realai-plus/bloodgpt-for-business/blob/main/packages/analysis-core/src/services/narrative-to-fhir/narrative-to-fhir.ts.
3. CSIRO Ontoserver (the instrument used for Artur's audit), accessed 2026-05-17, https://ontoserver.csiro.au/.
4. tx.fhir.org Terminology Server (V1 target), accessed 2026-05-17, https://tx.fhir.org/.
5. Snowstorm Lite (V1+ target), accessed 2026-05-17, https://github.com/IHTSDO/snowstorm-lite.
6. FHIR $expand operation, accessed 2026-05-17, https://www.hl7.org/fhir/valueset-operation-expand.html.
7. FHIR $lookup operation, accessed 2026-05-17, https://www.hl7.org/fhir/codesystem-operation-lookup.html.
