#005: Detect SHACL incompatibilities and assist data migration #5

Open
opened 2026-04-05 12:58:41 +00:00 by daniel · 0 comments

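Blocked by

  • #002 SHACL-driven entity editor (002-shacl-driven-entity-editor.org): the SHACL infrastructure must exist first
  • #004 Visual SHACL editor (004-shacl-editor.org): editing SHACL is the trigger for this feature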

Summary

When a SHACL shape is modified in a way that is incompatible with existing data, the system should detect the incompatibility, notify the user, and assist with data migration by suggesting SPARQL Update queries.

Motivation

As schemas evolve, shapes may change: properties are added, removed, renamed, or have their constraints tightened. Existing data that conformed to the old shape may violate the new shape. Without detection, data silently becomes invalid. Without migration assistance, users must manually write complex SPARQL updates to fix the data.

Incompatibility types

| Change                   | Example                                   | Impact                                    |
|--------------------------|-------------------------------------------|-------------------------------------------|
| New mandatory property   | Add `sh:minCount 1` for `ex:email`        | Existing entities missing `ex:email`      |
| Property removed         | Remove `ex:phone` from shape              | Existing triples with `ex:phone` orphaned |
| Datatype changed         | `xsd:string` → `xsd:integer`              | Existing string values invalid            |
| Cardinality tightened    | `sh:maxCount` reduced                     | Entities with too many values             |
| Class constraint changed | `sh:class ex:Org` → `sh:class ex:Company` | Existing references point to wrong type   |
| Property path changed    | `ex:name` → `ex:fullName`                 | Existing triples use old predicate        |
| Allowed values changed   | `sh:in` list modified                     | Existing values not in new list           |

Design

Step 1: Detect incompatibility

When a SHACL update is submitted (via _shacl/_update):

  1. Parse the new shapes.
  2. Compare with the previous shapes (diff).
  3. Run SHACL validation of existing data against the new shapes.
  4. Collect all violations.

If violations exist, the update is not rejected: the new shapes are saved, and the user is notified of the violations.
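The comparison in step 2 could be sketched as follows. This is a minimal illustration only: `PropertyConstraints`, `Shape`, `Incompatibility`, and `diff_shapes` are hypothetical names, a shape is reduced to a map from property path to three constraints, and only four of the incompatibility types listed above are covered. Counting the affected entities still requires the validation run in step 3.

```rust
use std::collections::BTreeMap;

/// Hypothetical, heavily simplified view of one property shape.
#[derive(Debug, Clone, PartialEq)]
pub struct PropertyConstraints {
    pub min_count: u32,
    pub max_count: Option<u32>,
    pub datatype: Option<String>,
}

/// A node shape reduced to a map from property path to its constraints.
pub type Shape = BTreeMap<String, PropertyConstraints>;

/// Kinds of change that can invalidate existing data.
#[derive(Debug, Clone, PartialEq)]
pub enum Incompatibility {
    NewMandatoryProperty { path: String },
    PropertyRemoved { path: String },
    DatatypeChanged { path: String, from: String, to: String },
    CardinalityTightened { path: String },
}

/// Compare the previous shape with the new one and list potentially breaking changes.
pub fn diff_shapes(old: &Shape, new: &Shape) -> Vec<Incompatibility> {
    let mut found = Vec::new();
    for (path, before) in old {
        let after = match new.get(path) {
            Some(after) => after,
            None => {
                found.push(Incompatibility::PropertyRemoved { path: path.clone() });
                continue;
            }
        };
        if let (Some(from), Some(to)) = (&before.datatype, &after.datatype) {
            if from != to {
                found.push(Incompatibility::DatatypeChanged {
                    path: path.clone(),
                    from: from.clone(),
                    to: to.clone(),
                });
            }
        }
        // A maximum that appears or shrinks, or a minimum that grows, tightens cardinality.
        let max_lowered = match (before.max_count, after.max_count) {
            (None, Some(_)) => true,
            (Some(b), Some(a)) => a < b,
            _ => false,
        };
        if max_lowered || after.min_count > before.min_count {
            found.push(Incompatibility::CardinalityTightened { path: path.clone() });
        }
    }
    // Properties that exist only in the new shape break data when they are mandatory.
    for (path, after) in new {
        if !old.contains_key(path) && after.min_count > 0 {
            found.push(Incompatibility::NewMandatoryProperty { path: path.clone() });
        }
    }
    found
}
```

A diff like this only flags changes that might break data; an optional new property (`min_count` of 0) produces no entry.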

Step 2: Notify

Present the list of violations grouped by type:

  • Number of entities affected per violation type
  • Sample entities showing the problem
  • Severity (data loss risk vs cosmetic)
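The grouping could look roughly like the sketch below. `Violation`, `ViolationKind`, `ViolationGroup`, and `group_violations` are hypothetical names, and a violation is reduced to its kind plus the IRI of the affected entity. Severity is left out because the issue does not yet say how it is assigned.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical violation categories, mirroring the incompatibility types above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum ViolationKind {
    MissingMandatoryProperty,
    OrphanedProperty,
    DatatypeMismatch,
    ExcessCardinality,
}

/// One validation result, reduced to its kind and the affected entity.
#[derive(Debug, Clone)]
pub struct Violation {
    pub kind: ViolationKind,
    pub focus_node: String,
}

/// Summary shown to the user for one violation type.
#[derive(Debug, PartialEq)]
pub struct ViolationGroup {
    pub affected: usize,
    pub samples: Vec<String>,
}

/// Group violations by kind, counting distinct entities and keeping a few samples.
pub fn group_violations(
    violations: &[Violation],
    max_samples: usize,
) -> BTreeMap<ViolationKind, ViolationGroup> {
    let mut nodes: BTreeMap<ViolationKind, BTreeSet<&str>> = BTreeMap::new();
    for v in violations {
        nodes.entry(v.kind).or_default().insert(v.focus_node.as_str());
    }
    nodes
        .into_iter()
        .map(|(kind, set)| {
            let group = ViolationGroup {
                affected: set.len(),
                samples: set.iter().take(max_samples).map(|s| s.to_string()).collect(),
            };
            (kind, group)
        })
        .collect()
}
```

Counting distinct focus nodes rather than raw results keeps an entity that violates the same constraint twice from being counted twice.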

Step 3: Suggest migration queries

For each violation type, generate a SPARQL Update suggestion:

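| Violation                  | Suggested query |
|----------------------------|-----------------|
| Missing mandatory property | `INSERT { ?s ex:email "" } WHERE { ?s a ex:Person . FILTER NOT EXISTS { ?s ex:email ?o } }` |
| Orphaned property          | `DELETE WHERE { ?s ex:phone ?o }` |
| Datatype mismatch          | `DELETE { ?s ex:age ?old } INSERT { ?s ex:age ?new } WHERE { ?s ex:age ?old . BIND(xsd:integer(?old) AS ?new) FILTER(BOUND(?new)) }` |
| Property renamed           | `DELETE { ?s ex:name ?o } INSERT { ?s ex:fullName ?o } WHERE { ?s ex:name ?o }` |
| Excess cardinality         | (complex: requires the user to choose which values to keep) |

The `FILTER(BOUND(?new))` guard in the datatype query leaves values that cannot be cast (for example `"thirty"`) untouched instead of deleting them; such values still need manual review.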

Suggestions are presented as editable SPARQL in the UI. The user can review, modify, and execute them.
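A suggestion generator along these lines could be a plain mapping from violation to query text. The sketch below is illustrative only: `Violation` and `suggest_update` are hypothetical names, the inputs are assumed to be trusted prefixed names taken from the shapes (no escaping is done), and prefix declarations are omitted. The datatype query includes a `FILTER(BOUND(?new))` guard so that values that fail the cast are left in place.

```rust
/// Hypothetical violation descriptions carrying what the query template needs.
#[derive(Debug, Clone, PartialEq)]
pub enum Violation {
    MissingMandatory { class: String, property: String, default_literal: String },
    Orphaned { property: String },
    DatatypeMismatch { property: String, target: String },
    Renamed { from: String, to: String },
    ExcessCardinality { property: String },
}

/// Return an editable SPARQL Update suggestion, or None when no query
/// can be generated without a user decision.
pub fn suggest_update(violation: &Violation) -> Option<String> {
    match violation {
        Violation::MissingMandatory { class, property, default_literal } => Some(format!(
            "INSERT {{ ?s {0} {1} }} WHERE {{ ?s a {2} . FILTER NOT EXISTS {{ ?s {0} ?o }} }}",
            property, default_literal, class
        )),
        Violation::Orphaned { property } => {
            Some(format!("DELETE WHERE {{ ?s {} ?o }}", property))
        }
        // The BOUND guard skips rows where the cast failed, so nothing is deleted for them.
        Violation::DatatypeMismatch { property, target } => Some(format!(
            "DELETE {{ ?s {0} ?old }} INSERT {{ ?s {0} ?new }} WHERE {{ ?s {0} ?old . BIND({1}(?old) AS ?new) FILTER(BOUND(?new)) }}",
            property, target
        )),
        Violation::Renamed { from, to } => Some(format!(
            "DELETE {{ ?s {0} ?o }} INSERT {{ ?s {1} ?o }} WHERE {{ ?s {0} ?o }}",
            from, to
        )),
        // Which values to keep is a user decision, so no query is suggested.
        Violation::ExcessCardinality { .. } => None,
    }
}
```

Returning `None` for excess cardinality lets the UI fall back to a manual workflow instead of showing a query that silently picks values to delete.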

Step 4: Validate after migration

After executing migration queries, re-run SHACL validation to confirm all violations are resolved.

History

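For this initial implementation, dataset history is not saved: SHACL changes overwrite the previous shapes. Dataset history strategies are addressed in separate issues:

  • #006 Dataset history via named graphs (006-dataset-history-named-graphs.org)
  • #007 Dataset history via external history dataset (007-dataset-history-external.org)
  • #008 Dataset history via file dumps (008-dataset-history-dumps.org)

The history strategy is a per-dataset configuration stored in the meta-dataset (see below).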

Meta-dataset

Each dataset has an associated meta-dataset that stores:

  • SHACL shapes (as named graphs)
  • Mapping between shapes and dataset graphs
  • Dataset configuration (history strategy, IRI generators, etc.)

The meta-dataset concept extends beyond SHACL — it is the place for all metadata about the dataset itself.

Locking during migration

While a migration is running:

  • The affected graph in the dataset is locked (read-only).
  • The SHACL file associated with that graph is locked (read-only).
  • Other graphs in the same dataset remain editable (if Oxigraph supports per-graph locking).
  • If Oxigraph does not support per-graph locks, the entire dataset is locked during migration.

Investigation needed: does Oxigraph 0.5.x support per-graph locking or transactions scoped to a named graph?
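Independently of what Oxigraph offers, the locking rule could be enforced at the application level. The sketch below assumes exactly that: `MigrationLocks` and `LockError` are hypothetical names, no Oxigraph API is used, and the registry only tracks which graph IRIs are currently under migration. It does not answer the investigation question above.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Error returned when a write targets a graph that is being migrated.
#[derive(Debug, PartialEq)]
pub enum LockError {
    GraphLocked(String),
}

/// Hypothetical in-process registry of graphs that are read-only during a migration.
#[derive(Default)]
pub struct MigrationLocks {
    locked: Mutex<HashSet<String>>,
}

impl MigrationLocks {
    /// Mark a graph as locked; returns false if a migration already holds it.
    pub fn try_lock(&self, graph: &str) -> bool {
        // unwrap: a poisoned mutex means another thread panicked; the sketch gives up too.
        let mut locked = self.locked.lock().unwrap();
        locked.insert(graph.to_string())
    }

    /// Release the lock once the migration has finished or failed.
    pub fn unlock(&self, graph: &str) {
        let mut locked = self.locked.lock().unwrap();
        locked.remove(graph);
    }

    /// Called by write paths before touching a graph.
    pub fn check_writable(&self, graph: &str) -> Result<(), LockError> {
        let locked = self.locked.lock().unwrap();
        if locked.contains(graph) {
            Err(LockError::GraphLocked(graph.to_string()))
        } else {
            Ok(())
        }
    }
}
```

A registry like this only protects writes that go through the application; it gives no isolation inside the store itself, which is why the per-graph transaction question still matters.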

Destructive migrations

Destructive changes (property removal, datatype narrowing, cardinality reduction, allowed values restriction) require explicit user confirmation before execution. The UI should clearly show:

  • What data will be deleted or modified
  • How many entities are affected
  • Whether the operation is reversible
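The confirmation requirement could be expressed as a small gate in front of execution. In this sketch `ChangeKind`, `MigrationPreview`, `GateError`, and `authorize` are hypothetical names; the set of destructive kinds follows the list above, and the error carries the figures the UI is expected to display.

```rust
/// Hypothetical classification of a migration by the change that caused it.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ChangeKind {
    PropertyRemoval,
    DatatypeNarrowing,
    CardinalityReduction,
    AllowedValuesRestriction,
    NewMandatoryProperty,
    PropertyRename,
}

/// Destructive kinds, as enumerated in this section.
pub fn is_destructive(kind: ChangeKind) -> bool {
    matches!(
        kind,
        ChangeKind::PropertyRemoval
            | ChangeKind::DatatypeNarrowing
            | ChangeKind::CardinalityReduction
            | ChangeKind::AllowedValuesRestriction
    )
}

/// What the UI shows before the user confirms.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MigrationPreview {
    pub kind: ChangeKind,
    pub affected_entities: usize,
    pub reversible: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum GateError {
    ConfirmationRequired { affected_entities: usize, reversible: bool },
}

/// Allow execution unless the migration is destructive and unconfirmed.
pub fn authorize(preview: &MigrationPreview, confirmed: bool) -> Result<(), GateError> {
    if is_destructive(preview.kind) && !confirmed {
        Err(GateError::ConfirmationRequired {
            affected_entities: preview.affected_entities,
            reversible: preview.reversible,
        })
    } else {
        Ok(())
    }
}
```

Keeping the check on the server side means a client that skips the dialog still cannot run a destructive migration without sending an explicit confirmation.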

Permissions

New actions for schema and migration management:

| Action                            | Description                        |
|-----------------------------------|------------------------------------|
| `concon:EditShaclAction`          | Modify SHACL shapes for a dataset  |
| `concon:ViewMigrationAction`      | View pending migration suggestions |
| `concon:ExecuteMigrationAction`   | Execute migration queries          |
| `concon:ConfirmDestructiveAction` | Confirm destructive migrations     |

By default:

  • Team admins have all four actions.
  • Team members have ViewMigrationAction only.
  • EditShaclAction and ExecuteMigrationAction can be granted via roles (same pattern as concon:FileWriteAction).
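The defaults above amount to a small lookup plus role-granted extras. The sketch below is an assumption about how that check might be shaped; `Action`, `TeamRole`, `default_actions`, and `is_allowed` are hypothetical names and do not reflect the existing permission code behind `concon:FileWriteAction`.

```rust
/// The four new actions from the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    EditShacl,
    ViewMigration,
    ExecuteMigration,
    ConfirmDestructive,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TeamRole {
    Admin,
    Member,
}

/// Actions a team role holds without any extra grant.
pub fn default_actions(role: TeamRole) -> &'static [Action] {
    match role {
        TeamRole::Admin => &[
            Action::EditShacl,
            Action::ViewMigration,
            Action::ExecuteMigration,
            Action::ConfirmDestructive,
        ],
        TeamRole::Member => &[Action::ViewMigration],
    }
}

/// An action is allowed if it is a default for the role or was granted via a role.
pub fn is_allowed(role: TeamRole, granted: &[Action], action: Action) -> bool {
    default_actions(role).contains(&action) || granted.contains(&action)
}
```

For example, a member with no grants can view suggestions but not execute them, while a member granted `ExecuteMigration` can run non-destructive migrations.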

Tests

Unit tests

#[test]
fn test_detect_new_mandatory_property() {
    // Old shape: no ex:email property
    // New shape: ex:email with sh:minCount 1
    // Existing data: entities without ex:email
    // Verify: violation detected, count of affected entities
}

#[test]
fn test_detect_removed_property() {
    // Old shape: has ex:phone
    // New shape: ex:phone removed
    // Existing data: entities with ex:phone triples
    // Verify: orphaned triples detected
}

#[test]
fn test_detect_datatype_change() {
    // Old shape: ex:age sh:datatype xsd:string
    // New shape: ex:age sh:datatype xsd:integer
    // Existing data: ex:age "thirty"
    // Verify: datatype mismatch violation
}

#[test]
fn test_suggest_migration_for_new_mandatory() {
    // New mandatory property added
    // Verify: suggested INSERT query with default value
}

#[test]
fn test_suggest_migration_for_rename() {
    // Property path changed from ex:name to ex:fullName
    // Verify: suggested DELETE/INSERT query
}

#[test]
fn test_compatible_change_no_violations() {
    // Add optional property (sh:minCount 0)
    // Verify: no violations, no migration needed
}

#[test]
fn test_locking_during_migration() {
    // Start a migration on a graph
    // Attempt to update the same graph
    // Verify: update blocked or returns lock error
}

Integration tests

#[tokio::test]
async fn test_shacl_update_with_violations_shows_notification() {
    // Update SHACL shapes creating violations
    // Verify: response includes violation report
}

#[tokio::test]
async fn test_execute_suggested_migration() {
    // Get migration suggestion
    // Execute it
    // Re-validate: no violations
}

#[tokio::test]
async fn test_destructive_migration_requires_confirmation() {
    // Attempt destructive migration without confirmation
    // Verify: rejected
    // Retry with confirmation
    // Verify: executed
}

#[tokio::test]
async fn test_permission_check_for_migration() {
    // User without ExecuteMigrationAction
    // Attempt to execute migration
    // Verify: forbidden
}

Manual tests

  1. Edit SHACL to add a mandatory property — verify violation notification
  2. Review suggested migration query — verify it looks correct
  3. Execute migration — verify data updated and violations resolved
  4. Attempt destructive migration — verify confirmation dialog appears
  5. Verify graph is locked during migration (edits rejected)

No further open questions

References

  • W3C SHACL Validation Report: https://www.w3.org/TR/shacl/#validation-report
  • Rudof, SHACL validation in Rust: https://github.com/rudof-project/rudof