#007: Dataset history via external history dataset #7

Open
opened 2026-04-05 12:58:42 +00:00 by daniel · 0 comments

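Blocked by

  • #005 SHACL migration detection (005-shacl-migration-detection.org): history is triggered by migrations
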
Summary

Save dataset history in a separate Concon dataset associated with the original dataset. Each historical snapshot is a named graph in the history dataset.

Design sketch

For a dataset MetaLearn/molecules, a history dataset is created at MetaLearn/molecules/_history (or a similar convention).
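
The path convention can be sketched as a pair of small helpers. This is only an illustration: the function names are hypothetical, and the `_history` suffix is the tentative convention from this issue, not a settled name.

```rust
/// Suffix for history datasets. Assumption: the "_history" convention
/// proposed in this issue; the final name may differ.
const HISTORY_SUFFIX: &str = "_history";

/// Derive the history dataset path for a dataset path such as
/// "MetaLearn/molecules". Hypothetical helper, not an existing Concon API.
fn history_dataset_path(dataset: &str) -> String {
    format!("{}/{}", dataset.trim_end_matches('/'), HISTORY_SUFFIX)
}

/// Inverse lookup: given a history dataset path, return the path of the
/// dataset it belongs to, or None if the path is not a history dataset.
fn original_dataset_path(history: &str) -> Option<&str> {
    history.strip_suffix(HISTORY_SUFFIX)?.strip_suffix('/')
}
```

Deriving the path is a convenience only; the design below still stores the association explicitly in the meta-dataset, so the stored value should win if the two ever disagree.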

When a migration is performed:

  1. Copy the affected graph into the history dataset as a named graph.
  2. Record metadata in the history dataset's default graph.
  3. Apply the migration to the original dataset.
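
The three steps above can be sketched against a minimal in-memory model. Everything here is an assumption made for illustration: the `Dataset` type, the `migrate_with_history` function, the `urn:snapshot:` naming scheme and the `urn:example:` metadata predicates are all hypothetical and do not reflect Concon's actual storage API or vocabulary.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// A triple as plain strings; a real store would use proper RDF terms.
type Triple = (String, String, String);
type Graph = BTreeSet<Triple>;

/// In-memory stand-in for a dataset: a default graph plus named graphs.
#[derive(Default, Debug)]
struct Dataset {
    default_graph: Graph,
    named_graphs: BTreeMap<String, Graph>,
}

fn triple(s: &str, p: &str, o: &str) -> Triple {
    (s.to_string(), p.to_string(), o.to_string())
}

/// Snapshot `graph_name` into `history`, then apply `migrate` to the
/// original. Returns the snapshot graph name, or None if the graph
/// does not exist in the original dataset.
fn migrate_with_history<F>(
    original: &mut Dataset,
    history: &mut Dataset,
    graph_name: &str,
    migration_id: &str,
    migrate: F,
) -> Option<String>
where
    F: FnOnce(&mut Graph),
{
    // Step 1: copy the affected graph into the history dataset as a named graph.
    let current = original.named_graphs.get(graph_name)?.clone();
    let snapshot_name = format!("urn:snapshot:{}:{}", migration_id, graph_name);
    history.named_graphs.insert(snapshot_name.clone(), current);

    // Step 2: record metadata in the history dataset's default graph.
    // The predicates are placeholders, not a defined vocabulary.
    history
        .default_graph
        .insert(triple(&snapshot_name, "urn:example:snapshotOf", graph_name));
    history
        .default_graph
        .insert(triple(&snapshot_name, "urn:example:migration", migration_id));

    // Step 3: apply the migration to the original dataset.
    migrate(original.named_graphs.get_mut(graph_name)?);
    Some(snapshot_name)
}
```

Taking the snapshot before the migration runs means a failed migration leaves at worst an extra snapshot rather than lost data. Whether the real implementation needs transactional guarantees across the two datasets is left open here.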

The association between a dataset and its history dataset is stored in the meta-dataset.

Pros

  • Original dataset stays clean and small
  • History can be managed independently (backup, archive, delete old snapshots)
  • Clear separation of concerns

Cons

  • Additional dataset to manage per dataset
  • Cross-dataset queries needed to compare current vs historical data
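
To illustrate the kind of comparison the second point refers to, here is a minimal sketch that diffs a snapshot graph against the current graph once both have been loaded into memory. The `GraphDiff` type and `diff_graphs` function are hypothetical. In practice the comparison would more likely be expressed as a query spanning both datasets.

```rust
use std::collections::BTreeSet;

/// A triple as plain strings; a real store would use proper RDF terms.
type Triple = (String, String, String);

/// Triples that differ between a historical snapshot and the current graph.
#[derive(Debug, PartialEq)]
struct GraphDiff {
    /// Present now, absent in the snapshot.
    added: BTreeSet<Triple>,
    /// Present in the snapshot, absent now.
    removed: BTreeSet<Triple>,
}

/// Compare a snapshot taken from the history dataset with the current
/// graph taken from the original dataset.
fn diff_graphs(snapshot: &BTreeSet<Triple>, current: &BTreeSet<Triple>) -> GraphDiff {
    GraphDiff {
        added: current.difference(snapshot).cloned().collect(),
        removed: snapshot.difference(current).cloned().collect(),
    }
}
```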

Configuration

Enabled per dataset via the meta-dataset:

<urn:config:history> concon:historyStrategy concon:ExternalDataset ;
    concon:historyDataset "MetaLearn/molecules/_history" .
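
On the Rust side, the configuration above might map to a small enum. This is a sketch under assumptions: `HistoryConfig` and `parse_history_config` are hypothetical names, the inputs are assumed to be already extracted from the meta-dataset (the local name of the `concon:historyStrategy` value and the `concon:historyDataset` literal), and both the `Disabled` default and the fallback to `<dataset>/_history` are choices this issue does not specify.

```rust
/// History configuration for one dataset. Only ExternalDataset is
/// described in this issue; Disabled is an assumed default.
#[derive(Debug, Clone, PartialEq, Eq)]
enum HistoryConfig {
    Disabled,
    ExternalDataset { history_dataset: String },
}

/// Build the configuration from values read from the meta-dataset.
/// `strategy` is the local name of the concon:historyStrategy value,
/// `history_dataset` the concon:historyDataset literal, if present.
fn parse_history_config(
    dataset: &str,
    strategy: Option<&str>,
    history_dataset: Option<&str>,
) -> Result<HistoryConfig, String> {
    match strategy {
        // No strategy configured: history is off (assumed default).
        None => Ok(HistoryConfig::Disabled),
        Some("ExternalDataset") => Ok(HistoryConfig::ExternalDataset {
            // Assumption: fall back to the path convention when no
            // explicit history dataset is given.
            history_dataset: history_dataset
                .map(str::to_string)
                .unwrap_or_else(|| format!("{}/_history", dataset)),
        }),
        Some(other) => Err(format!("unknown history strategy: {}", other)),
    }
}
```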

Tests

Unit tests

#[test]
fn test_history_dataset_created_automatically() {
    // Configure ExternalDataset strategy
    // Run first migration
    // Verify: history dataset created at expected path
}

#[test]
fn test_snapshot_in_history_dataset() {
    // Run migration
    // Query history dataset for snapshot named graph
    // Verify: contains pre-migration data
}

#[test]
fn test_original_dataset_unchanged_in_size() {
    // Run multiple migrations
    // Verify: original dataset size doesn't grow from snapshots
    // Verify: history dataset contains all snapshots
}

Manual tests

  1. Enable the ExternalDataset strategy for a dataset
  2. Run a migration and verify that the history dataset appears in the team view
  3. Query the history dataset via SPARQL and verify the snapshot data
  4. Delete old snapshots from the history dataset independently of the original dataset
Reference
daniel/concon#7