The knowledge model

The Knowledge page shows what you have at your disposal. This page explains how that is organised under the hood: which sources exist, how specialization (Vakgebied) filters them, and how the assistant decides per question where to look.

For every question, PrudAI can draw on three knowledge levels:

  1. Live public sources (dynamic): at the moment of your question the assistant calls a register or official site directly, for example wetten.overheid.nl, rechtspraak.nl, eur-lex.europa.eu, the BIG register, Kadaster BAG, KVK, or regulators such as AP, ACM and NZa. These are always up to date, but they depend on the availability of the external site (see Citations & source availability).
  2. Static knowledge bases (curated by PrudAI): indexed collections such as case-law archives, literature, or industry guidelines. They are fast to search because the content is pre-processed; they are less fresh than a live call, but much more tolerant of peak load.
  3. Organisation knowledge bases (managed by your organisation): your own documents, templates and case files uploaded via the RAG dashboard. Only your organisation sees them. See Account & RAG dashboards.

In the UI these are labelled Dynamic, Static and Organisation.

Specialization is how you tell the assistant: “this is the work I do”. It has three axes, each of which narrows or widens the set of sources:

  • Legal domains — the legal areas in which you practise (e.g. employment law, environment and planning law, healthcare law, tax). A single profile can hold multiple domains.
  • Role — the type of work you do (e.g. attorney, in-house counsel, consultant, mediator, account manager). Among other things, this determines which prompts and tools are foregrounded.
  • Industry — the sector(s) you operate in (e.g. healthcare, construction, retail). Especially relevant for VERA and ZIA.

The stepper on the Knowledge page walks through the three axes in three steps and auto-saves once you complete step 3. If your profile is still empty, you’ll get a one-time invitation in chat to fill it in. You can return any time via Knowledge → Specialization.

In practice, the set of sources shown in the Sources tab changes with your profile. For example, an in-house counsel in healthcare sees Federatie Medisch Specialisten and the IGJ disciplinary database prominently; an environment-and-planning attorney sees Omgevingswet/IPLO and the LiDO ECLI positioning map; a tax lawyer sees Belastingdienst, tax rulings and Vakstudie. Under the hood, PrudAI filters the source registry on the combination of your three axes and foregrounds only the matching sources.
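As a rough illustration of that filtering, here is a minimal sketch in Python. The `Source` and `Profile` types, the tags attached to each source, and the rule that an untagged axis means "no restriction" are all assumptions made for the example; they are not PrudAI's actual registry schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    name: str
    # Assumption: an empty set on an axis means "not restricted on this axis".
    domains: frozenset = frozenset()
    roles: frozenset = frozenset()
    industries: frozenset = frozenset()


@dataclass(frozen=True)
class Profile:
    domains: frozenset
    role: str
    industries: frozenset


def matches(source: Source, profile: Profile) -> bool:
    """A source matches only if it passes on all three axes."""
    domain_ok = not source.domains or bool(source.domains & profile.domains)
    role_ok = not source.roles or profile.role in source.roles
    industry_ok = not source.industries or bool(source.industries & profile.industries)
    return domain_ok and role_ok and industry_ok


def foregrounded(registry, profile):
    """Return the names of matching sources, in registry order."""
    return [s.name for s in registry if matches(s, profile)]


# Hypothetical tagging of sources that are named in the text.
REGISTRY = [
    Source("wetten.overheid.nl"),
    Source(
        "Federatie Medisch Specialisten",
        domains=frozenset({"healthcare law"}),
        industries=frozenset({"healthcare"}),
    ),
    Source("Omgevingswet/IPLO", domains=frozenset({"environment and planning law"})),
    Source("Belastingdienst", domains=frozenset({"tax"})),
]

healthcare_counsel = Profile(
    domains=frozenset({"healthcare law"}),
    role="in-house counsel",
    industries=frozenset({"healthcare"}),
)

print(foregrounded(REGISTRY, healthcare_counsel))
# prints ['wetten.overheid.nl', 'Federatie Medisch Specialisten']
```

Treating each axis as a set overlap keeps the rule symmetric: a profile with several legal domains simply matches more sources.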

Every source is linked to one or more products:

  • LEO — broad Dutch legal work; gets the widest set of public source families, plus research jobs and the partner inbox.
  • VERA — built environment and construction; gets the environment-and-planning stack (wetten.overheid.nl Omgevingswet, IPLO, official publications, Council of Arbitration) and BIM/IFC sources.
  • ZIA — internal knowledge and operational work; does not get public legal source families in chat, but does get projects, documents and an optional Nedap ONS connection.
  • Cross-cutting (any) — a handful of sources that apply to every product, such as PrudAI’s own product documentation.

Switching products therefore changes not only the logo and the sidebar, but also which source families the assistant has in view.

The assistant does not consult everything at once. For each chat message, it builds an active selection:

  • Turn allowlist: the set of sources that may be called in this specific turn, derived from your product, your specialization profile, your temporary selection in the Sources tab, and any organisation-wide opt-outs (see below).
  • Dedup: if a tool would fetch the same thing twice in one turn, the assistant reuses the first result. This avoids both rate-limit problems and confusing duplicate source cards in chat.
  • Byte budget: answers from live sources are trimmed to the essentials, so the model carries the substance forward without clogging its context.
  • Recent memory: searches can automatically pull in activated memories from your personal memory (LEO: all of them; VERA and ZIA: only those with relevant tags).
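The turn allowlist can be pictured as plain set arithmetic. The sketch below is an illustration under assumptions: the function name, the parameter names, and the choice to treat the temporary selection as a narrowing filter are hypothetical, and the real derivation may weigh these inputs differently.

```python
from typing import Optional


def turn_allowlist(
    product_sources: set,
    profile_sources: set,
    temporary_selection: Optional[set],
    org_opt_outs: set,
) -> set:
    """Compute which sources may be called in one turn.

    Assumptions: product and profile are intersected, a temporary
    selection (if any) narrows the result further, and organisation-wide
    opt-outs are always removed last.
    """
    allowed = product_sources & profile_sources
    if temporary_selection is not None:
        allowed &= temporary_selection
    return allowed - org_opt_outs


allowed = turn_allowlist(
    product_sources={"wetten.overheid.nl", "rechtspraak.nl", "KVK"},
    profile_sources={"wetten.overheid.nl", "rechtspraak.nl"},
    temporary_selection=None,
    org_opt_outs={"rechtspraak.nl"},
)
print(allowed)  # prints {'wetten.overheid.nl'}
```

Applying the opt-outs last means an organisation-wide block cannot be undone by a user's temporary selection.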

You always see in the tool card which source was called, with which parameters, and what the response was. No invisible calls.
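Dedup, the byte budget and the tool card can be sketched together. Everything here is hypothetical: `TurnToolRunner`, the `fetch` callback and the card fields are names invented for the example, and the plain truncation in `trim_to_budget` is a crude stand-in for whatever "trimmed to the essentials" means in the real pipeline.

```python
import json


def trim_to_budget(text: str, max_bytes: int) -> str:
    """Cut text to at most max_bytes of UTF-8 without splitting a character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops a trailing partial multi-byte character.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


class TurnToolRunner:
    """Runs tool calls for a single turn and records one card per real call."""

    def __init__(self, fetch, byte_budget: int = 2000):
        self._fetch = fetch              # callable(source, params) -> str
        self._byte_budget = byte_budget
        self._cache = {}
        self.tool_cards = []             # what the user would see in chat

    def call(self, source: str, **params) -> str:
        # Same source with the same parameters counts as "the same thing".
        key = (source, json.dumps(params, sort_keys=True))
        if key not in self._cache:
            raw = self._fetch(source, params)
            result = trim_to_budget(raw, self._byte_budget)
            self._cache[key] = result
            self.tool_cards.append(
                {"source": source, "params": params, "response": result}
            )
        return self._cache[key]


calls = []


def fake_fetch(source, params):
    calls.append(source)
    return "x" * 50


runner = TurnToolRunner(fake_fetch, byte_budget=10)
first = runner.call("rechtspraak.nl", query="ontslag")
second = runner.call("rechtspraak.nl", query="ontslag")
print(len(calls), len(runner.tool_cards), first == second)
# prints 1 1 True
```

Keying the cache on source plus serialised parameters means the second call neither hits the external site again nor produces a duplicate card.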

Public, non-public and shared knowledge bases

Besides the long list of external registers, PrudAI also has knowledge bases:

  • Public knowledge library — curated by PrudAI, visible to everyone, switched on organisation-wide via the Sources tab.
  • Non-public knowledge bases — marked is_proprietary. Off by default; you must explicitly confirm them before the assistant may consult them. An “enable everything” button skips proprietary KBs.
  • Organisation knowledge bases — created by your owners/admins in the RAG dashboard. Documents are indexed; new ones appear in the Sources tab in real time.

Each non-public KB requires a one-time consent from someone in your organisation; after that it is available to the appropriate roles.
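The opt-in rule for non-public knowledge bases can be sketched as follows. Only the `is_proprietary` flag comes from the text above; the `KnowledgeBase` type, the `enabled` field, the function names and the example KB names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class KnowledgeBase:
    name: str
    is_proprietary: bool = False
    enabled: bool = False  # off until switched on


def enable_everything(kbs) -> None:
    """Bulk-enable, deliberately skipping proprietary KBs."""
    for kb in kbs:
        if not kb.is_proprietary:
            kb.enabled = True


def confirm_proprietary(kb: KnowledgeBase, confirmed: bool) -> bool:
    """A proprietary KB is only enabled after explicit confirmation."""
    if kb.is_proprietary and confirmed:
        kb.enabled = True
    return kb.enabled


# Hypothetical knowledge bases.
kbs = [
    KnowledgeBase("Public knowledge library"),
    KnowledgeBase("Licensed commentary", is_proprietary=True),
]

enable_everything(kbs)
print([kb.enabled for kb in kbs])  # prints [True, False]

confirm_proprietary(kbs[1], confirmed=True)
print([kb.enabled for kb in kbs])  # prints [True, True]
```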

The assistant can remember things between conversations at your request — for example a fixed writing style, an ongoing case, or a personal preference. Memory:

  • lives strictly on your account; colleagues don’t see it;
  • is searchable and editable via Settings → Memory;
  • is automatically tagged with keywords so the assistant can easily find it again;
  • falls under your GDPR export right: one click, full content as JSON.

The Knowledge page exposes extra controls for owners and admins:

  • Disable organisation-wide sources — a source switched off here is no longer available to anyone in the organisation. Useful when you want to block a specific source during an audit, or when a legal domain isn’t relevant to your organisation.
  • Manage non-public KB links — only owners can attach a proprietary KB to the organisation.
  • Capability gates — some sources (such as Nedap ONS or partner exchange) are tied to a capability that is granted per organisation, independent of product. Owners can see which capabilities are active in Settings.

End users only get what their role, their product and their specialization profile all support: always the most restrictive of the three.

Not every obvious public source is available as a live source. The most notable deliberate blind spot today is the official collective labour agreement (CAO) database run by the Dutch Ministry of Social Affairs and Employment (MinSZW): PrudAI has no live integration with it. The reason is structural, not editorial.

The publication environment behind that site — internally known as the ANIBUS platform — was not designed for programmatic access:

  • search results live behind a session-bound POST form with no stable URLs;
  • document links contain internal IDs that rotate occasionally, so a previously captured reference can lead nowhere days later;
  • there is no public API or sitemap, and the HTML structure does not lend itself to reliable automated extraction;
  • there is no public content negotiation between PDF and text — only the final PDF.

A naive scraper therefore produces results that aren’t reproducible a week later, or that come back with random errors during peak load. For a citation platform where every reference must be clickable and verifiable, that’s not good enough.

How we cover it anyway:

  • For the largest sectors (Metalektro, Horeca, VVT-zorg, Retail Non-Food, ABU Uitzendkrachten), PrudAI has loaded the full CAO text as a static reference knowledge base. It lives under Arbeidsrecht — Referentie-CAO-teksten in the Sources tab.
  • For sectors outside that reference set, the assistant explicitly defers to cao.minszw.nl and instructs the user to verify the validity date by hand.
  • General-binding declarations (AVV decisions) are covered, because they are published via the Staatscourant, which does expose a stable API; we pick them up through the official-publications source.
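The three paths above amount to a small routing decision. The sketch below is illustrative only: the function, the path labels and the return shape are hypothetical, and the sector set simply repeats the sectors named in the text.

```python
# Sectors named in the text as having a static reference CAO text.
REFERENCE_CAO_SECTORS = frozenset(
    {"Metalektro", "Horeca", "VVT-zorg", "Retail Non-Food", "ABU Uitzendkrachten"}
)


def route_cao_question(sector: str, concerns_avv: bool = False) -> dict:
    """Pick a coverage path for a CAO question (hypothetical labels)."""
    if concerns_avv:
        # AVV decisions are published via the Staatscourant.
        return {"path": "official-publications", "note": ""}
    if sector in REFERENCE_CAO_SECTORS:
        return {"path": "static-reference-kb", "note": ""}
    return {
        "path": "defer-to-site",
        "note": "Consult cao.minszw.nl and verify the validity date by hand.",
    }


print(route_cao_question("Horeca")["path"])
# prints static-reference-kb
print(route_cao_question("Bouw & Infra")["path"])
# prints defer-to-site
print(route_cao_question("Bouw & Infra", concerns_avv=True)["path"])
# prints official-publications
```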

We track the upstream: as soon as MinSZW ships an API or replaces the platform, we’ll switch to a live-MCP integration. Until then, the combination of paths above is the honest workable solution.