#003: Federated entity search across datasets #3

Open
opened 2026-04-05 12:58:40 +00:00 by daniel · 0 comments
Owner

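Blocked by

- [#002 SHACL-driven entity editor](002-shacl-driven-entity-editor.org) (entity search within a single dataset must work first)
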
Summary

Allow entity search to span multiple datasets when editing properties that reference entities in other datasets. A SHACL extension property (e.g., concon:targetDataset) specifies which dataset to search in for a given property shape.

Motivation

Research data often links across datasets. For example, a paper entity in a publications dataset may reference author entities in a people dataset. Without cross-dataset search, users must manually copy IRIs between datasets.

Design sketch

```turtle
ex:authorShape sh:path ex:author ;
    sh:class ex:Person ;
    concon:targetDataset "people-dataset" .
```

When concon:targetDataset is present, the autocomplete search queries the specified dataset instead of the current one. If it is absent, the search defaults to the current dataset (as in #001).
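
The fallback rule above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: `PropertyShape` and `resolve_search_dataset` are hypothetical names, not existing concon types, and the shape is assumed to have already been parsed into a struct with an optional `target_dataset` field.

```rust
/// Hypothetical in-memory view of a parsed SHACL property shape.
/// Only the fields relevant to dataset resolution are modelled here.
#[allow(dead_code)]
#[derive(Debug, Clone)]
struct PropertyShape {
    /// Value of sh:path, e.g. "ex:author".
    path: String,
    /// Value of concon:targetDataset, if the shape declares one.
    target_dataset: Option<String>,
}

/// Returns the dataset the autocomplete search should query:
/// the shape's target dataset if present, otherwise the current dataset.
fn resolve_search_dataset<'a>(shape: &'a PropertyShape, current_dataset: &'a str) -> &'a str {
    shape.target_dataset.as_deref().unwrap_or(current_dataset)
}

#[test]
fn resolves_target_or_falls_back() {
    let author = PropertyShape {
        path: "ex:author".to_string(),
        target_dataset: Some("people-dataset".to_string()),
    };
    let title = PropertyShape {
        path: "ex:title".to_string(),
        target_dataset: None,
    };
    assert_eq!(resolve_search_dataset(&author, "papers"), "people-dataset");
    assert_eq!(resolve_search_dataset(&title, "papers"), "papers");
}
```

Modelling the extension as an `Option` keeps the single-dataset behaviour the default path, so shapes that never mention concon:targetDataset need no changes.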

Tests

Unit tests

```rust
#[test]
fn test_parse_target_dataset_extension() {
    // SHACL shape with concon:targetDataset
    // Verify the extension property is read correctly
}

#[test]
fn test_search_in_target_dataset() {
    // Create two datasets: "papers" and "people"
    // Insert person entities in "people"
    // Search for person from context of "papers" with targetDataset
    // Verify results come from "people" dataset
}

#[test]
fn test_search_defaults_to_current_dataset() {
    // Shape without concon:targetDataset
    // Verify search queries the current dataset
}
```

Integration tests

```rust
#[tokio::test]
async fn test_cross_dataset_search_endpoint() {
    // Create two datasets, insert entities in each
    // Call /_api/search with dataset parameter pointing to other dataset
    // Verify results from the target dataset
}

#[tokio::test]
async fn test_permission_check_on_target_dataset() {
    // User without access to target dataset
    // Attempt cross-dataset search
    // Verify access denied or empty results
}
```
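
To make the permission test more concrete, here is a minimal sketch of one possible behaviour: an explicit "access denied" error rather than empty results. The issue leaves that choice open, so treat this as one option, not the decided design. `Store`, `SearchError`, and the in-memory ACL are hypothetical stand-ins, not concon's actual store or auth layer.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical error type for cross-dataset search.
#[derive(Debug, PartialEq)]
enum SearchError {
    AccessDenied { dataset: String },
    UnknownDataset { dataset: String },
}

/// Hypothetical in-memory stand-in for the store and its access rules.
struct Store {
    /// Dataset name -> entity labels in that dataset.
    datasets: HashMap<String, Vec<String>>,
    /// User name -> datasets that user may read.
    acl: HashMap<String, HashSet<String>>,
}

impl Store {
    /// Case-insensitive substring search over entity labels in `dataset`.
    /// The permission check runs before the existence check, so a user
    /// without access cannot probe which dataset names exist.
    fn search(&self, user: &str, dataset: &str, query: &str) -> Result<Vec<String>, SearchError> {
        let allowed = self.acl.get(user).map_or(false, |ds| ds.contains(dataset));
        if !allowed {
            return Err(SearchError::AccessDenied { dataset: dataset.to_string() });
        }
        let entities = self
            .datasets
            .get(dataset)
            .ok_or_else(|| SearchError::UnknownDataset { dataset: dataset.to_string() })?;
        let needle = query.to_lowercase();
        Ok(entities
            .iter()
            .filter(|label| label.to_lowercase().contains(&needle))
            .cloned()
            .collect())
    }
}
```

If the project prefers silent empty results instead, the `AccessDenied` branch could return `Ok(Vec::new())`. The test would then assert on an empty list rather than an error.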

Open questions

  1. How to handle permissions — can a user search a dataset they don't have access to?
  2. Should federated search use SPARQL SERVICE or direct store access?
  3. Can a property reference entities from multiple datasets?
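
For question 2, the SPARQL SERVICE option could look roughly like the sketch below, which builds a federated query string using the standard SPARQL 1.1 `SERVICE` clause. It rests on several assumptions: the function names, the endpoint URL shape, and matching on `rdfs:label` are all hypothetical, and the IRIs are assumed to come from trusted configuration rather than user input.

```rust
/// Escapes a user-supplied string for use inside a double-quoted SPARQL literal.
fn escape_literal(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for ch in raw.chars() {
        match ch {
            '\\' => out.push_str("\\\\"),
            '"' => out.push_str("\\\""),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            other => out.push(other),
        }
    }
    out
}

/// Builds a query that delegates the entity lookup to another dataset's
/// SPARQL endpoint. `endpoint` and `class_iri` are assumed to be trusted,
/// already-validated IRIs; only `needle` is escaped.
fn build_service_query(endpoint: &str, class_iri: &str, needle: &str, limit: usize) -> String {
    let needle = escape_literal(needle);
    [
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>".to_string(),
        "SELECT ?entity ?label WHERE {".to_string(),
        format!("  SERVICE <{endpoint}> {{"),
        format!("    ?entity a <{class_iri}> ;"),
        "            rdfs:label ?label .".to_string(),
        format!("    FILTER(CONTAINS(LCASE(STR(?label)), LCASE(\"{needle}\")))"),
        "  }".to_string(),
        "}".to_string(),
        format!("LIMIT {limit}"),
    ]
    .join("\n")
}
```

Direct store access would skip query construction and the extra endpoint hop, but it ties the search code to datasets hosted in the same store. The sketch is only meant to show what the SERVICE route would involve.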
Reference
daniel/concon#3