Stage 10 — Event-Driven Ingestion via Kafka
How to complete this stage
Enable Kafka-based ingestion so the IA node can consume RDF data from an event stream rather than only from direct API uploads. This stage is optional; complete it if you require event-driven ingestion. Before starting, check that:
- ianode-access is running (Stage 4).
- Secure Agent Graph is running in secure mode (Stage 9), or you are prepared to restart it.
- Kafka is available (local or containerised, depending on your setup).
Open a terminal for Kafka-related commands.
Approach and rationale
Kafka ingestion demonstrates production-style integration, where data arrives asynchronously from other systems.
It allows the IA node to:
- Ingest data from a stream of events.
- Update the graph without direct client uploads.
- Support integration testing that mirrors real deployments.
This stage is optional but recommended for environments that require streaming integration.
10.1 Prepare the metadata directory
Kafka ingestion requires a persistent metadata directory used by the ingestion components. From the Secure Agent Graph project directory:
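For example, assuming the metadata directory is the `databases` directory referenced in the operational notes at the end of this stage (adjust the name if your configuration uses a different location):

```shell
# Create the persistent metadata directory if it does not already exist.
# "databases" is the directory name used by the ingestion components;
# run this from the Secure Agent Graph project directory.
mkdir -p databases
ls -ld databases
```

`mkdir -p` is safe to re-run: it succeeds silently if the directory already exists.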
Expected behaviour
Creating this directory lets Kafka ingestion store and reuse its internal state across restarts. Without it, ingestion may fail or stall.
10.2 Start Kafka
Start Kafka using the approach defined by your environment (for example using Docker Compose if provided by the project or your platform tooling).
Kafka should be running and reachable on the expected bootstrap address (commonly localhost:9092) before continuing.
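If your project does not provide its own Compose file, a minimal single-node Kafka for local development might look like the sketch below. The image tag, ports, and listener settings are illustrative, not prescribed by this project; adapt them to your platform tooling.

```yaml
# Illustrative single-node Kafka (KRaft mode) for local development.
services:
  kafka:
    image: apache/kafka:3.7.0
    ports:
      - "9092:9092"
    environment:
      KAFKA_NODE_ID: 1
      KAFKA_PROCESS_ROLES: broker,controller
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:9093
      # Advertise localhost so clients on the host can connect
      # (see the operational note on advertised listeners below).
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@localhost:9093
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
```

With a file like this in place, `docker compose up -d` starts the broker on localhost:9092.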
10.3 Restart Secure Agent Graph with Kafka enabled
Stop Secure Agent Graph (Ctrl+C) and restart it using the Kafka configuration:
cd ~/src/secure-agent-graph
USER_ATTRIBUTES_URL=http://localhost:8091 \
JWKS_URL="http://localhost:9229/${USER_POOL_ID}/.well-known/jwks.json" \
java \
-classpath "sag-server/target/classes:sag-system/target/classes:sag-docker/target/dependency/*" \
uk.gov.dbt.ndtp.secure.agent.graph.SecureAgentGraph \
--config sag-docker/mnt/config/dev-server-kafka.ttl
Expected behaviour
- Starts the IA node in Kafka-enabled mode.
- Listens for RDF messages on the configured Kafka topic.
- Ingests messages into the graph automatically.
10.4 Send RDF messages using the provided tooling
Use the Kafka tooling provided by the project (for example jena-kafka-client and the fk script) to publish RDF messages to the configured topic.
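The exact publishing command depends on the project tooling (the fk script's arguments are project-specific and not reproduced here), but the payload itself is ordinary RDF. A minimal Turtle message might look like the following; the prefix and triples are purely illustrative:

```turtle
@prefix ex: <http://example.org/> .

ex:sensor-42 ex:hasReading  "18.5" ;
             ex:recordedAt  "2024-01-01T00:00:00Z" .
```

Save a payload like this to a file and publish it to the configured topic using the project's tooling.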
Expected behaviour
When messages are published successfully:
- The IA node consumes them.
- The graph is updated.
- The data becomes queryable via SPARQL and GraphQL.
10.5 Verify ingested data is queryable
Fetch an authentication token if required by your secure configuration. Query the graph using SPARQL or GraphQL as demonstrated in Stage 9. Verify that:
- Data published via Kafka is visible.
- Data remains subject to authentication and ABAC filtering.
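As a quick check, a generic SPARQL query along these lines should return the triples you published in 10.4, once run against the secured endpoint with a valid token:

```sparql
# List a sample of triples; the Kafka-published data should appear
# among the results (subject to ABAC filtering for your identity).
SELECT ?s ?p ?o
WHERE {
  ?s ?p ?o .
}
LIMIT 10
```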
Operational notes
Kafka ingestion depends on setting the correct topic and bootstrap configuration in dev-server-kafka.ttl.
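As an illustration only, a Kafka connector definition in a configuration file like dev-server-kafka.ttl typically ties together the topic, bootstrap servers, state file, and target service. The vocabulary and property names below follow the jena-fuseki-kafka module and may differ in the version bundled with this project; treat this as a sketch, not the authoritative configuration:

```turtle
PREFIX fk: <http://jena.apache.org/fuseki/kafka#>

# Illustrative connector definition; property names and values
# must match the jena-fuseki-kafka version actually in use.
[] a fk:Connector ;
   fk:topic             "knowledge" ;
   fk:bootstrapServers  "localhost:9092" ;
   fk:stateFile         "databases/Replay-knowledge.state" ;
   fk:fusekiServiceName "/knowledge" .
```

Note that the state file lives under the `databases` directory created in 10.1, which is why that directory must exist and be writable.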
If ingestion appears stalled, check that:
- The databases directory exists and is writable.
- The topic exists and matches the configuration.
- The node has been restarted after enabling Kafka configuration.
- Kafka is reachable from your host environment.
If running Kafka in Docker, ensure the advertised listener configuration allows connections from your host.
10.6 Checkpoint
At the end of this stage:
- Secure Agent Graph is running with Kafka enabled.
- RDF messages published to Kafka are ingested into the graph.
- Ingested data is queryable.
- Authentication and ABAC filtering still apply.
If ingestion does not work as expected, verify Kafka connectivity and configuration before proceeding.