Add Step 2: SPARQL UPDATE queries to transform literals into objects.

19 queries in updates/ convert categorical columns (continent, country,
city, gender, profession, etc.) from literals to typed RDF objects with
rdfs:label. map/step-02.rb applies them to produce data/graph-02.ttl.
Also fix step-01.rb to sanitize column names with spaces and avoid
prefix serialization issues with fragment IRIs.
Daniel Hernandez 2026-02-26 19:45:08 +01:00
parent da22d312a9
commit d2481d6e80
24 changed files with 368421 additions and 7 deletions


@ -31,4 +31,83 @@ docker compose up -d
bundle exec ruby map/step-01.rb
```
### Step 2 -
### Step 2 - Generate Objects
Continents and countries should be objects instead of literals. To this end, we can transform the following data:
```
base:location\/ARG-BahBlanca-00 a base:location;
base:location\#City "Bahia Blanca";
base:location\#Continent "South America";
base:location\#Country "Argentina";
base:location\#GeoNamesID "3865086";
base:location\#IDLocation "ARG-BahBlanca-00";
base:location\#latitude -3.87253e1;
base:location\#longitude -6.22742e1;
base:location\#wikidata "Q54108";
base:location\#wikipedia "https://en.wikipedia.org/wiki/Bah%C3%ADa_Blanca" .
```
Into the following data:
```
base:location\/ARG-BahBlanca-00 a base:location;
base:location\#City base:City-BahiaBlanca;
base:location\#Continent base:Continent-SouthAmerica;
base:location\#Country base:Country-Argentina;
base:location\#GeoNamesID "3865086";
base:location\#IDLocation "ARG-BahBlanca-00";
base:location\#latitude -3.87253e1;
base:location\#longitude -6.22742e1;
base:location\#wikidata "Q54108";
base:location\#wikipedia "https://en.wikipedia.org/wiki/Bah%C3%ADa_Blanca" .
base:City-BahiaBlanca a base:City;
rdfs:label "Bahia Blanca"@en .
base:Continent-SouthAmerica a base:Continent;
rdfs:label "South America"@en .
base:Country-Argentina a base:Country;
rdfs:label "Argentina"@en .
```
Notice that every `rdfs:label` value carries the English language tag (`@en`).
Generate a SPARQL UPDATE query that performs this transformation for all rows of the table and save it in a new folder called `updates`. Do the same for the other tables, proposing which columns should become objects. Define a separate SPARQL UPDATE query for every table and save each one in the `updates` folder. Number the generated queries with a prefix such as 001, 002, 003, and so on.
After generating the update queries, generate a Ruby script that executes the updates on the RDF graph generated in the previous step and generates a new RDF graph to be saved: `data/graph-02.ttl`.
#### Summary
19 SPARQL UPDATE queries in `updates/` transform literal values into typed objects across all tables:
| Query | Table | Column | Object type |
|-------|-------|--------|-------------|
| 001 | location | Continent | Continent |
| 002 | location | Country | Country |
| 003 | location | State | State |
| 004 | location | City | City |
| 005 | migration_table | reason | MigrationReason |
| 006 | migration_table | reason2 | MigrationReason |
| 007 | organisation | InstType | InstitutionType |
| 008 | person | gender | Gender |
| 009 | person | Nametype | Nametype |
| 010 | person | Importsource | ImportSource |
| 011 | person_profession | Eprofession | Profession |
| 012 | personnames | Nametype | Nametype |
| 013 | relationship | Relationshiptype | RelationshipType |
| 014 | relationship | relationshiptype_precise | RelationshipTypePrecise |
| 015 | religions | religion | Religion |
| 016 | work | Profession | Profession |
| 017 | work | Profession2 | Profession |
| 018 | work | Profession3 | Profession |
| 019 | work | EmploymentType | EmploymentType |
Each query replaces a literal value with an object reference and creates the object with `rdf:type` and `rdfs:label` (in English). The script `map/step-02.rb` loads `data/graph-01.ttl`, applies all queries in order, and writes `data/graph-02.ttl` (164,632 triples).
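The IRI minting used by the queries can be sketched in plain Ruby. This is only an illustration of the `IRI(CONCAT(..., ENCODE_FOR_URI(REPLACE(?val, " ", ""))))` expression, not code from the repository: `object_iri` and `MIGRANTS_BASE` are hypothetical names, and `ERB::Util.url_encode` is assumed to be a close enough stand-in for `ENCODE_FOR_URI`.

```ruby
require 'erb'

# Base IRI as it appears in the queries under updates/ (constant name is hypothetical).
MIGRANTS_BASE = 'http://example.org/migrants/'

# Hypothetical helper: strip spaces, percent-encode what is left,
# and prepend "<base><Type>-", as the BIND in each query does.
def object_iri(type, value)
  local_name = ERB::Util.url_encode(value.to_s.delete(' '))
  "#{MIGRANTS_BASE}#{type}-#{local_name}"
end

puts object_iri('Continent', 'South America')
puts object_iri('City', 'Bahia Blanca')
```

Because spaces are dropped before encoding, two literals that differ only in spacing would end up sharing one object IRI and receive both labels.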
To run:
```sh
bundle exec ruby map/step-02.rb
```

182044
data/graph-01.ttl Normal file

File diff suppressed because it is too large

185949
data/graph-02.ttl Normal file

File diff suppressed because it is too large


@ -66,8 +66,12 @@ def row_iri(table, pk_value)
RDF::URI.new("#{BASE_IRI}#{table}/#{URI.encode_www_form_component(pk_value.to_s)}")
end
def sanitize_name(name)
name.to_s.gsub(/[^a-zA-Z0-9_-]/, '_').gsub(/_+/, '_').gsub(/\A_+|_+\z/, '')
end
def column_iri(table, column)
RDF::URI.new("#{BASE_IRI}#{table}##{column}")
RDF::URI.new("#{BASE_IRI}#{table}##{sanitize_name(column)}")
end
def class_iri(table)
@ -75,7 +79,7 @@ def class_iri(table)
end
def ref_iri(table, fk_col)
RDF::URI.new("#{BASE_IRI}#{table}#ref-#{fk_col}")
RDF::URI.new("#{BASE_IRI}#{table}#ref-#{sanitize_name(fk_col)}")
end
def to_rdf_literal(value)
@ -124,11 +128,10 @@ PRIMARY_KEYS.each do |table, pk_col|
end
end
output_path = File.expand_path('../graph-01.ttl', __dir__)
output_path = File.expand_path('../data/graph-01.ttl', __dir__)
RDF::Turtle::Writer.open(output_path, prefixes: {
rdf: RDF.to_uri,
xsd: RDF::XSD.to_uri,
base: RDF::URI.new(BASE_IRI)
rdf: RDF.to_uri,
xsd: RDF::XSD.to_uri
}) do |writer|
graph.each_statement { |stmt| writer << stmt }
end
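
To see what the new `sanitize_name` helper does to a column name before it becomes an IRI fragment, it can be run on its own. The function body is copied from the change above; the sample column names are made up for illustration and are not taken from the dataset.

```ruby
# Copied from the step-01.rb change: replace every character outside
# [a-zA-Z0-9_-] with "_", collapse runs of "_", then trim "_" at both ends.
def sanitize_name(name)
  name.to_s.gsub(/[^a-zA-Z0-9_-]/, '_').gsub(/_+/, '_').gsub(/\A_+|_+\z/, '')
end

# Hypothetical column names, chosen to exercise each substitution.
['Place of birth', ' Birth  date (approx.) ', 'IDLocation'].each do |column|
  puts "#{column.inspect} -> #{sanitize_name(column).inspect}"
end
```

One consequence worth keeping in mind: distinct column names such as `a b` and `a_b` sanitize to the same fragment, so this approach assumes that no table contains such a pair.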

38
map/step-02.rb Normal file

@ -0,0 +1,38 @@
#!/usr/bin/env ruby
# frozen_string_literal: true
# Step 2: Transform literal values into RDF objects using SPARQL UPDATE queries
require 'rdf'
require 'rdf/turtle'
require 'sparql'
input_path = File.expand_path('../data/graph-01.ttl', __dir__)
output_path = File.expand_path('../data/graph-02.ttl', __dir__)
updates_dir = File.expand_path('../updates', __dir__)
puts "Loading graph from #{input_path}..."
graph = RDF::Graph.load(input_path)
puts "Loaded #{graph.count} triples."
Dir.glob(File.join(updates_dir, '*.rq')).sort.each do |query_file|
query = File.read(query_file)
name = File.basename(query_file)
before = graph.count
SPARQL.execute(query, graph, update: true)
after = graph.count
puts "Applied #{name}: #{before} -> #{after} triples (#{after - before >= 0 ? '+' : ''}#{after - before})"
end
puts "Writing #{graph.count} triples to #{output_path}..."
RDF::Turtle::Writer.open(output_path, prefixes: {
rdf: RDF.to_uri,
rdfs: RDF::RDFS.to_uri,
xsd: RDF::XSD.to_uri,
base: RDF::URI.new('http://example.org/migrants/')
}) do |writer|
graph.each_statement { |stmt| writer << stmt }
end
puts "Done."
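
The script relies on `Dir.glob(...).sort` to apply the queries in order, which is why the zero-padded prefixes matter. A small sketch of the difference, using hypothetical file names (the real names in `updates/` are not shown in this diff):

```ruby
# Hypothetical file names; only the numeric prefix convention comes from the README.
padded   = %w[010-person-importsource.rq 002-location-country.rq 001-location-continent.rq]
unpadded = %w[10-person-importsource.rq 2-location-country.rq 1-location-continent.rq]

# String sort is lexicographic, so zero-padded prefixes sort numerically...
puts padded.sort

# ...while unpadded prefixes would place "10-" before "2-".
puts unpadded.sort
```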


@ -0,0 +1,16 @@
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
DELETE {
?s <http://example.org/migrants/location#Continent> ?val .
}
INSERT {
?s <http://example.org/migrants/location#Continent> ?obj .
?obj a <http://example.org/migrants/Continent> .
?obj rdfs:label ?label .
}
WHERE {
?s <http://example.org/migrants/location#Continent> ?val .
BIND(IRI(CONCAT("http://example.org/migrants/Continent-", ENCODE_FOR_URI(REPLACE(?val, " ", "")))) AS ?obj)
BIND(STRLANG(STR(?val), "en") AS ?label)
}


@ -0,0 +1,16 @@
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
DELETE {
?s <http://example.org/migrants/location#Country> ?val .
}
INSERT {
?s <http://example.org/migrants/location#Country> ?obj .
?obj a <http://example.org/migrants/Country> .
?obj rdfs:label ?label .
}
WHERE {
?s <http://example.org/migrants/location#Country> ?val .
BIND(IRI(CONCAT("http://example.org/migrants/Country-", ENCODE_FOR_URI(REPLACE(?val, " ", "")))) AS ?obj)
BIND(STRLANG(STR(?val), "en") AS ?label)
}


@ -0,0 +1,16 @@
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
DELETE {
?s <http://example.org/migrants/location#State> ?val .
}
INSERT {
?s <http://example.org/migrants/location#State> ?obj .
?obj a <http://example.org/migrants/State> .
?obj rdfs:label ?label .
}
WHERE {
?s <http://example.org/migrants/location#State> ?val .
BIND(IRI(CONCAT("http://example.org/migrants/State-", ENCODE_FOR_URI(REPLACE(?val, " ", "")))) AS ?obj)
BIND(STRLANG(STR(?val), "en") AS ?label)
}


@ -0,0 +1,16 @@
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
DELETE {
?s <http://example.org/migrants/location#City> ?val .
}
INSERT {
?s <http://example.org/migrants/location#City> ?obj .
?obj a <http://example.org/migrants/City> .
?obj rdfs:label ?label .
}
WHERE {
?s <http://example.org/migrants/location#City> ?val .
BIND(IRI(CONCAT("http://example.org/migrants/City-", ENCODE_FOR_URI(REPLACE(?val, " ", "")))) AS ?obj)
BIND(STRLANG(STR(?val), "en") AS ?label)
}


@ -0,0 +1,16 @@
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
DELETE {
?s <http://example.org/migrants/migration_table#reason> ?val .
}
INSERT {
?s <http://example.org/migrants/migration_table#reason> ?obj .
?obj a <http://example.org/migrants/MigrationReason> .
?obj rdfs:label ?label .
}
WHERE {
?s <http://example.org/migrants/migration_table#reason> ?val .
BIND(IRI(CONCAT("http://example.org/migrants/MigrationReason-", ENCODE_FOR_URI(REPLACE(?val, " ", "")))) AS ?obj)
BIND(STRLANG(STR(?val), "en") AS ?label)
}


@ -0,0 +1,15 @@
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
DELETE {
?s <http://example.org/migrants/migration_table#reason2> ?val .
}
INSERT {
?s <http://example.org/migrants/migration_table#reason2> ?obj .
?obj a <http://example.org/migrants/MigrationReason> .
?obj rdfs:label ?label .
}
WHERE {
?s <http://example.org/migrants/migration_table#reason2> ?val .
BIND(IRI(CONCAT("http://example.org/migrants/MigrationReason-", ENCODE_FOR_URI(REPLACE(?val, " ", "")))) AS ?obj)
BIND(STRLANG(STR(?val), "en") AS ?label)
}


@ -0,0 +1,16 @@
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
DELETE {
?s <http://example.org/migrants/organisation#InstType> ?val .
}
INSERT {
?s <http://example.org/migrants/organisation#InstType> ?obj .
?obj a <http://example.org/migrants/InstitutionType> .
?obj rdfs:label ?label .
}
WHERE {
?s <http://example.org/migrants/organisation#InstType> ?val .
BIND(IRI(CONCAT("http://example.org/migrants/InstitutionType-", ENCODE_FOR_URI(REPLACE(?val, " ", "")))) AS ?obj)
BIND(STRLANG(STR(?val), "en") AS ?label)
}


@ -0,0 +1,16 @@
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
DELETE {
?s <http://example.org/migrants/person#gender> ?val .
}
INSERT {
?s <http://example.org/migrants/person#gender> ?obj .
?obj a <http://example.org/migrants/Gender> .
?obj rdfs:label ?label .
}
WHERE {
?s <http://example.org/migrants/person#gender> ?val .
BIND(IRI(CONCAT("http://example.org/migrants/Gender-", ENCODE_FOR_URI(REPLACE(?val, " ", "")))) AS ?obj)
BIND(STRLANG(STR(?val), "en") AS ?label)
}


@ -0,0 +1,16 @@
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
DELETE {
?s <http://example.org/migrants/person#Nametype> ?val .
}
INSERT {
?s <http://example.org/migrants/person#Nametype> ?obj .
?obj a <http://example.org/migrants/Nametype> .
?obj rdfs:label ?label .
}
WHERE {
?s <http://example.org/migrants/person#Nametype> ?val .
BIND(IRI(CONCAT("http://example.org/migrants/Nametype-", ENCODE_FOR_URI(REPLACE(?val, " ", "")))) AS ?obj)
BIND(STRLANG(STR(?val), "en") AS ?label)
}


@ -0,0 +1,16 @@
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
DELETE {
?s <http://example.org/migrants/person#Importsource> ?val .
}
INSERT {
?s <http://example.org/migrants/person#Importsource> ?obj .
?obj a <http://example.org/migrants/ImportSource> .
?obj rdfs:label ?label .
}
WHERE {
?s <http://example.org/migrants/person#Importsource> ?val .
BIND(IRI(CONCAT("http://example.org/migrants/ImportSource-", ENCODE_FOR_URI(REPLACE(?val, " ", "")))) AS ?obj)
BIND(STRLANG(STR(?val), "en") AS ?label)
}


@ -0,0 +1,16 @@
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
DELETE {
?s <http://example.org/migrants/person_profession#Eprofession> ?val .
}
INSERT {
?s <http://example.org/migrants/person_profession#Eprofession> ?obj .
?obj a <http://example.org/migrants/Profession> .
?obj rdfs:label ?label .
}
WHERE {
?s <http://example.org/migrants/person_profession#Eprofession> ?val .
BIND(IRI(CONCAT("http://example.org/migrants/Profession-", ENCODE_FOR_URI(REPLACE(?val, " ", "")))) AS ?obj)
BIND(STRLANG(STR(?val), "en") AS ?label)
}


@ -0,0 +1,16 @@
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
DELETE {
?s <http://example.org/migrants/personnames#Nametype> ?val .
}
INSERT {
?s <http://example.org/migrants/personnames#Nametype> ?obj .
?obj a <http://example.org/migrants/Nametype> .
?obj rdfs:label ?label .
}
WHERE {
?s <http://example.org/migrants/personnames#Nametype> ?val .
BIND(IRI(CONCAT("http://example.org/migrants/Nametype-", ENCODE_FOR_URI(REPLACE(?val, " ", "")))) AS ?obj)
BIND(STRLANG(STR(?val), "en") AS ?label)
}


@ -0,0 +1,16 @@
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
DELETE {
?s <http://example.org/migrants/relationship#Relationshiptype> ?val .
}
INSERT {
?s <http://example.org/migrants/relationship#Relationshiptype> ?obj .
?obj a <http://example.org/migrants/RelationshipType> .
?obj rdfs:label ?label .
}
WHERE {
?s <http://example.org/migrants/relationship#Relationshiptype> ?val .
BIND(IRI(CONCAT("http://example.org/migrants/RelationshipType-", ENCODE_FOR_URI(REPLACE(?val, " ", "")))) AS ?obj)
BIND(STRLANG(STR(?val), "en") AS ?label)
}


@ -0,0 +1,16 @@
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
DELETE {
?s <http://example.org/migrants/relationship#relationshiptype_precise> ?val .
}
INSERT {
?s <http://example.org/migrants/relationship#relationshiptype_precise> ?obj .
?obj a <http://example.org/migrants/RelationshipTypePrecise> .
?obj rdfs:label ?label .
}
WHERE {
?s <http://example.org/migrants/relationship#relationshiptype_precise> ?val .
BIND(IRI(CONCAT("http://example.org/migrants/RelationshipTypePrecise-", ENCODE_FOR_URI(REPLACE(?val, " ", "")))) AS ?obj)
BIND(STRLANG(STR(?val), "en") AS ?label)
}


@ -0,0 +1,16 @@
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
DELETE {
?s <http://example.org/migrants/religions#religion> ?val .
}
INSERT {
?s <http://example.org/migrants/religions#religion> ?obj .
?obj a <http://example.org/migrants/Religion> .
?obj rdfs:label ?label .
}
WHERE {
?s <http://example.org/migrants/religions#religion> ?val .
BIND(IRI(CONCAT("http://example.org/migrants/Religion-", ENCODE_FOR_URI(REPLACE(?val, " ", "")))) AS ?obj)
BIND(STRLANG(STR(?val), "en") AS ?label)
}


@ -0,0 +1,16 @@
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
DELETE {
?s <http://example.org/migrants/work#Profession> ?val .
}
INSERT {
?s <http://example.org/migrants/work#Profession> ?obj .
?obj a <http://example.org/migrants/Profession> .
?obj rdfs:label ?label .
}
WHERE {
?s <http://example.org/migrants/work#Profession> ?val .
BIND(IRI(CONCAT("http://example.org/migrants/Profession-", ENCODE_FOR_URI(REPLACE(?val, " ", "")))) AS ?obj)
BIND(STRLANG(STR(?val), "en") AS ?label)
}


@ -0,0 +1,15 @@
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
DELETE {
?s <http://example.org/migrants/work#Profession2> ?val .
}
INSERT {
?s <http://example.org/migrants/work#Profession2> ?obj .
?obj a <http://example.org/migrants/Profession> .
?obj rdfs:label ?label .
}
WHERE {
?s <http://example.org/migrants/work#Profession2> ?val .
BIND(IRI(CONCAT("http://example.org/migrants/Profession-", ENCODE_FOR_URI(REPLACE(?val, " ", "")))) AS ?obj)
BIND(STRLANG(STR(?val), "en") AS ?label)
}


@ -0,0 +1,15 @@
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
DELETE {
?s <http://example.org/migrants/work#Profession3> ?val .
}
INSERT {
?s <http://example.org/migrants/work#Profession3> ?obj .
?obj a <http://example.org/migrants/Profession> .
?obj rdfs:label ?label .
}
WHERE {
?s <http://example.org/migrants/work#Profession3> ?val .
BIND(IRI(CONCAT("http://example.org/migrants/Profession-", ENCODE_FOR_URI(REPLACE(?val, " ", "")))) AS ?obj)
BIND(STRLANG(STR(?val), "en") AS ?label)
}


@ -0,0 +1,16 @@
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
DELETE {
?s <http://example.org/migrants/work#EmploymentType> ?val .
}
INSERT {
?s <http://example.org/migrants/work#EmploymentType> ?obj .
?obj a <http://example.org/migrants/EmploymentType> .
?obj rdfs:label ?label .
}
WHERE {
?s <http://example.org/migrants/work#EmploymentType> ?val .
BIND(IRI(CONCAT("http://example.org/migrants/EmploymentType-", ENCODE_FOR_URI(REPLACE(?val, " ", "")))) AS ?obj)
BIND(STRLANG(STR(?val), "en") AS ?label)
}
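
The 19 queries above differ only in the property IRI and the object type, so they could in principle be rendered from one template. The following is a sketch of that idea, not how the files in `updates/` were actually produced; `update_query` and its keyword arguments are hypothetical names.

```ruby
# Hypothetical helper: render one SPARQL UPDATE query for a table column
# and the object type its literals should be turned into.
def update_query(table:, column:, type:, base: 'http://example.org/migrants/')
  property = "<#{base}#{table}##{column}>"
  <<~SPARQL
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    DELETE {
      ?s #{property} ?val .
    }
    INSERT {
      ?s #{property} ?obj .
      ?obj a <#{base}#{type}> .
      ?obj rdfs:label ?label .
    }
    WHERE {
      ?s #{property} ?val .
      BIND(IRI(CONCAT("#{base}#{type}-", ENCODE_FOR_URI(REPLACE(?val, " ", "")))) AS ?obj)
      BIND(STRLANG(STR(?val), "en") AS ?label)
    }
  SPARQL
end

puts update_query(table: 'location', column: 'Continent', type: 'Continent')
```

Keeping one file per column, as this commit does, makes each step easy to inspect and rerun in isolation; a template mainly helps if further columns are added later.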