Daniel Hernandez da22d312a9 Add Step 1: Direct mapping from MariaDB to RDF.
Dockerfile and docker-compose.yml for MariaDB container,
map/step-01.rb implementing the W3C Direct Mapping for all 9 tables.
2026-02-26 16:42:30 +01:00

# Theatre Migrants
This project generates a knowledge graph about migrants in the theatre in Europe.
## Generating the ontology
The following steps describe how to generate the migrants RDF graph.
### Step 1 - Loading the input data into a relational database
#### Task
The file `teatre-migrants.sql` contains a dump of a MariaDB database whose tables are described in `db_schema.md`. We load this dump into MariaDB so that the data can be queried with SQL. To this end:
1. Write a Dockerfile that builds a Docker container for MariaDB.
2. Load the dump into a database in the container.
3. Create a Ruby script `map/step-01.rb` that uses the `sequel` gem to connect to the database. The script should write a file called `graph-01.ttl` containing all the data from the loaded tables, produced with the direct mapping from relational databases to RDF.
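As a rough illustration of the third task, the sketch below collects the primary-key columns and foreign keys of every table, which is the metadata the direct mapping needs. It is not the code in `map/step-01.rb`. The helper name `table_metadata` is hypothetical, and the connection settings in the trailing comment (database name, user, password) are placeholders. The calls `tables`, `schema`, and `foreign_key_list` are methods of `Sequel::Database`.

```ruby
# Collect, for every table, the primary-key columns and the foreign keys.
# `db` is assumed to behave like a Sequel::Database: it must respond to
# `tables`, `schema(table)` and `foreign_key_list(table)`.
def table_metadata(db)
  db.tables.sort.to_h do |table|
    # `schema` yields [column_name, info_hash] pairs; keep the key columns.
    pk = db.schema(table).select { |_col, info| info[:primary_key] }.map(&:first)
    # Keep only the fields of each foreign key that the mapping uses.
    fks = db.foreign_key_list(table).map do |fk|
      { columns: fk[:columns], table: fk[:table], key: fk[:key] }
    end
    [table, { primary_key: pk, foreign_keys: fks }]
  end
end

# Hypothetical usage against the container (all settings are placeholders):
#   require 'sequel'
#   db = Sequel.connect(adapter: 'mysql2', host: '127.0.0.1', port: 3306,
#                       database: 'migrants', user: 'root', password: 'secret')
#   pp table_metadata(db)
```

Passing the database handle in as an argument keeps the helper independent of how the connection is opened, so it can be exercised with a stub object when no database is running.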
#### Summary
The `Dockerfile` creates a MariaDB 10.11 container that automatically loads `teatre-migrants.sql` on first start. The `docker-compose.yml` exposes the database on port 3306 with a healthcheck.
The script `map/step-01.rb` connects to the database via `sequel` and implements the [W3C Direct Mapping](https://www.w3.org/TR/rdb-direct-mapping/) for all 9 tables (`location`, `migration_table`, `organisation`, `person`, `person_profession`, `personnames`, `relationship`, `religions`, `work`). Each table row becomes an RDF resource identified by its primary key, each column becomes a datatype property, and each foreign key becomes an object property linking to the referenced row. The output file `graph-01.ttl` contains 162,029 triples.
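The mapping rules above can be sketched for a single row as follows. This is a minimal illustration under stated assumptions, not the code in `map/step-01.rb`: the base IRI `http://example.org/`, the helper names, and the column names in the usage comment are hypothetical. It follows the IRI patterns of the W3C Direct Mapping (`table/pk=value` for rows, `table#column` for literal properties, `table#ref-column` for references), skips NULL values, and types only integers explicitly.

```ruby
require 'erb' # ERB::Util.url_encode percent-encodes IRI segments

RDF_TYPE = '<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>'
XSD = 'http://www.w3.org/2001/XMLSchema#'

def enc(value)
  ERB::Util.url_encode(value.to_s)
end

# Row IRI: <base/table/pk1=v1;pk2=v2>
def row_iri(base, table, pk_cols, row)
  key = pk_cols.map { |c| "#{enc(c)}=#{enc(row[c])}" }.join(';')
  "<#{base}#{enc(table)}/#{key}>"
end

# Integers become xsd:integer literals; everything else a plain string.
def literal(value)
  return "\"#{value}\"^^<#{XSD}integer>" if value.is_a?(Integer)

  escaped = value.to_s.gsub('\\') { '\\\\' }.gsub('"') { '\\"' }
                 .gsub("\n") { '\\n' }.gsub("\r") { '\\r' }
  "\"#{escaped}\""
end

# Returns the triples for one row as N-Triples-style lines (valid Turtle).
def map_row(base, table, pk_cols, fks, row)
  subject = row_iri(base, table, pk_cols, row)
  triples = ["#{subject} #{RDF_TYPE} <#{base}#{enc(table)}> ."]
  row.each do |col, value|
    next if value.nil? # NULL columns produce no triple

    triples << "#{subject} <#{base}#{enc(table)}##{enc(col)}> #{literal(value)} ."
  end
  fks.each do |fk|
    next if fk[:columns].any? { |c| row[c].nil? }

    cols = fk[:columns].map { |c| enc(c) }.join(';')
    target = fk[:key].zip(fk[:columns]).map { |k, c| "#{enc(k)}=#{enc(row[c])}" }.join(';')
    triples << "#{subject} <#{base}#{enc(table)}#ref-#{cols}> <#{base}#{enc(fk[:table])}/#{target}> ."
  end
  triples
end

# Hypothetical usage (column names are invented for illustration):
#   fks = [{ columns: [:birthplace], table: :location, key: [:id] }]
#   row = { id: 7, name: 'Ana', birthplace: 3, note: nil }
#   puts map_row('http://example.org/', :person, [:id], fks, row)
```

As in the W3C specification, a foreign-key column yields both a literal triple for the column value and a reference triple pointing at the referenced row.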
To run:
```sh
docker compose up -d
bundle exec ruby map/step-01.rb
```
### Step 2 -