Document Steps 5 and 6 in README

Daniel Hernandez 2026-03-01 13:55:16 +01:00
parent c454189645
commit 2fcb1715c2

@ -159,3 +159,70 @@ does not intended to mean a comment "", but the lack of a comment. So, write a q
### Step 5 - Use well-known vocabularies
Some classes, properties, and individuals can be represented with Schema.org. For example, the class `migrants:person` can be represented by the class `schema:Person`. Please propose which of these elements could use the Schema.org vocabulary and write a SPARQL query to generate the next graph. Consider using other vocabularies beyond Schema.org if you consider them appropriate for representing the information in this dataset.
#### Summary
7 SPARQL UPDATE queries in `updates_step05/` add well-known vocabulary properties alongside the existing `migrants:` predicates:
| Query | Mapping |
|-------|---------|
| 001 | Person properties → `schema:givenName`, `schema:familyName`, `schema:birthDate`, `schema:deathDate`, `schema:gender`, `schema:birthPlace`, `schema:deathPlace`, `schema:image`, `schema:hasOccupation`, `schema:citation`, `rdfs:comment` |
| 002 | Person authority identifiers (Wikidata, GND, VIAF, CERL, LCCN, ISNI, SNAC) → `owl:sameAs` and `wdtn:` normalized properties |
| 003 | Location properties → `wgs84:lat`, `wgs84:long`; Wikipedia/Wikidata links → `owl:sameAs` |
| 004 | Organisation properties → `schema:name`, `schema:location`, `rdfs:comment` |
| 005 | Person labels → `rdfs:label` (generated from `first_name` + `family_name`) |
| 006 | Enumeration instances → `skos:Concept` + `skos:prefLabel` |
| 007 | Class types → `schema:Person`, `schema:Place`, `schema:Organization` |
The program `src/map/step_05.rs` loads `data/graph-04.ttl`, applies all queries, and writes `data/graph-05.ttl` (168,129 triples).
To run:
```sh
cargo run --release --bin step-05
```
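
As an illustration of the "add alongside" approach summarised above, a query in the style of 001 might look like the sketch below. It is not copied from `updates_step05/`: the predicate names `migrants:first_name` and `migrants:family_name` and the `http://schema.org/` prefix IRI are assumptions made for this example.

```sparql
PREFIX migrants: <http://example.org/migrants/>
PREFIX schema:   <http://schema.org/>   # assumed; the project may use https://schema.org/

# Add Schema.org name properties next to the existing migrants: predicates.
# There is no DELETE clause, so the original triples stay in the graph.
INSERT {
  ?person schema:givenName  ?first ;
          schema:familyName ?family .
}
WHERE {
  ?person a migrants:person .
  # Hypothetical predicate names; template triples with an unbound
  # variable are skipped, so a missing name does not block the other one.
  OPTIONAL { ?person migrants:first_name  ?first . }
  OPTIONAL { ?person migrants:family_name ?family . }
}
```

Because the update only inserts, it can be re-run on the same graph without creating duplicate triples.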
### Step 6 - Map to the Theatre Migrants ontology
#### Task
Define a custom OWL ontology (`teatre-migrants.ttl`) for domain-specific terms not covered by well-known vocabularies, published at `https://daniel.degu.cl/ontologies/theatre-migrants/` with prefix `tm:`. Reuse existing vocabularies where possible:
- **Schema.org** for persons, places, organizations, and occupations.
- **W3C Organization Ontology** (`org:`) for work engagements, modeled as `org:Membership` (replacing the original `migrants:work` class). Properties `org:member` and `org:organization` link the membership to the person and organization.
- **SKOS** for enumeration types as subclasses of `skos:Concept`.
Write SPARQL CONSTRUCT queries that produce a new graph using only the `tm:`, `schema:`, `org:`, `skos:`, `owl:`, `wgs84:`, and `wdtn:` vocabularies. The original `http://example.org/migrants/` predicates and class types are replaced; only entity IRIs retain the `migrants:` namespace.
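
A minimal sketch of such a CONSTRUCT query, here for the `org:Membership` modeling described in the list above. The class `migrants:work` comes from the task text; the two source predicates `migrants:person` and `migrants:organisation` are hypothetical names chosen for illustration and may differ in the actual data.

```sparql
PREFIX migrants: <http://example.org/migrants/>
PREFIX org:      <http://www.w3.org/ns/org#>

# Re-express each work engagement as an org:Membership.
# The entity IRI (?work) is kept; only its type and predicates change.
CONSTRUCT {
  ?work a org:Membership ;
        org:member       ?person ;
        org:organization ?organisation .
}
WHERE {
  ?work a migrants:work ;
        migrants:person       ?person ;        # hypothetical predicate name
        migrants:organisation ?organisation .  # hypothetical predicate name
}
```

Since a CONSTRUCT query returns only the triples in its template, any `migrants:` predicate not mentioned there is dropped from the new graph.
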
#### Summary
The ontology `teatre-migrants.ttl` defines:
- **6 domain-specific classes:** `tm:Migration`, `tm:Relationship`, `tm:PersonProfession`, `tm:PersonName`, `tm:ReligionAffiliation`, `tm:ImportSource`.
- **11 enumeration classes** (all `rdfs:subClassOf skos:Concept`): `tm:Continent`, `tm:Country`, `tm:State`, `tm:City`, `tm:MigrationReason`, `tm:InstitutionType`, `tm:NameType`, `tm:RelationshipType`, `tm:RelationshipTypePrecise`, `tm:Religion`, `tm:EmploymentType`.
- Object and datatype properties with domains, ranges, and temporal uncertainty modeling (`tm:dateStartMin`, `tm:dateStartMax`, `tm:dateEndMin`, `tm:dateEndMax`, `tm:dateStartFuzzy`, `tm:dateEndFuzzy`).
12 SPARQL CONSTRUCT queries in `constructs_step06/` transform the graph:
| Query | Description |
|-------|-------------|
| 001-persons | Persons with `schema:Person` properties and `tm:` extensions |
| 002-places | Places with `wgs84:` coordinates and `tm:` geographic hierarchy |
| 003-organisations | Organizations with `schema:name` and `tm:institutionType` |
| 004-migrations | Migration events with `tm:migrant`, `tm:startPlace`, `tm:destinationPlace` |
| 005-memberships | Work engagements as `org:Membership` with `org:member`, `org:organization` |
| 006-relationships | Interpersonal relationships with `tm:activePerson`, `tm:passivePerson` |
| 007-person-professions | Person-profession associations |
| 008-person-names | Historical/alternative person names |
| 009-religion-affiliations | Religion affiliations with temporal bounds |
| 010a-occupations-passthrough | Pass through existing `schema:Occupation` instances |
| 010b-occupations-from-profession | Retype `migrants:Profession` as `schema:Occupation` |
| 011-enumerations | Map enumeration instances to `skos:Concept` with `tm:` subtypes |
The program `src/map/step_06.rs` loads `data/graph-05.ttl`, runs all CONSTRUCT queries, collects the resulting triples into a new graph, and writes `data/graph-06.ttl` (148,985 triples).
To run:
```sh
cargo run --release --bin step-06
```
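
One way to check the rule that only entity IRIs keep the `migrants:` namespace is to query `data/graph-06.ttl` for predicates or class types that still start with `http://example.org/migrants/`. The query below is a sketch of such a check, not part of the repository; how it is run (for example with a SPARQL command-line tool or from the Rust program) is left open.

```sparql
# List every predicate or rdf:type value still in the migrants: namespace.
# If the transformation followed the rule, this should return no rows.
SELECT DISTINCT ?term
WHERE {
  { ?s ?term ?o }      # any predicate
  UNION
  { ?s a ?term }       # any class used as a type
  FILTER(STRSTARTS(STR(?term), "http://example.org/migrants/"))
}
```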