SemASE
SemASE integrates the Atomic Simulation Environment (ASE) with knowledge graphs stored in a SPARQL endpoint. It extends ASE classes with methods to save and load atomic simulation data as RDF triples, enabling semantic interoperability and structured querying of simulation results.
SemASE works by patching ASE classes at runtime, so existing code requires
minimal changes. After activation, classes like Atoms and Cell gain
save_to_kg() and update_from_kg() methods, which serialize their state into an
RDF knowledge graph and restore it from the graph.
This project is still under development.
Installation
pip install semase
Dependencies
- ASE -- Atomic Simulation Environment
- rdflib -- RDF triple construction and serialization
- SPARQLWrapper -- SPARQL endpoint communication
Examples
Activating SemASE
Call semase.activate() once at the start of your script. After activation, all
ASE classes patched by SemASE gain knowledge graph methods.
SemASE can work against a remote SPARQL endpoint (e.g., Apache Jena Fuseki):
import semase
semase.activate(endpoint="http://localhost:3030/ase/sparql")
Alternatively, you can use pyoxigraph as a local store, which requires no server setup. Calling pyoxigraph.Store() with no arguments creates an in-memory store, so data is lost when the process exits:
import semase
import pyoxigraph
semase.activate()
store = pyoxigraph.Store() # in-memory, not persistent
To persist data across sessions, pass a directory path to pyoxigraph.Store(). Oxigraph saves the database in that directory and reloads it automatically:
store = pyoxigraph.Store("/path/to/my-data") # persistent on disk
Then pass store= to save_to_kg() and update_from_kg():
from ase import Atoms
water = Atoms('OH2', positions=[(0, 0, 0), (0, 0.76, 0.59), (0, -0.76, 0.59)])  # O listed first so it sits at the origin
water.save_to_kg(uri="ase:water-001", store=store)
loaded = Atoms()
loaded.update_from_kg(uri="ase:water-001", store=store)
print(loaded.get_chemical_formula()) # H2O
Saving a molecule to the knowledge graph
from ase import Atoms
# Build a water molecule (O listed first so the oxygen sits at the origin)
water = Atoms('OH2', positions=[(0, 0, 0), (0, 0.76, 0.59), (0, -0.76, 0.59)])
# Save it to the knowledge graph
water.save_to_kg(uri="ase:water-001")
The save_to_kg() method serializes the Atoms object -- including its Cell,
atomic symbols, and positions -- into RDF triples and inserts them into the
configured SPARQL endpoint.
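To make the idea of serializing an Atoms object into triples concrete, here is a rough sketch in plain Python. It is not SemASE's serializer. The namespace, semase:AtomicSystem and semase:chemicalFormula are taken from the query example in this README; the predicates hasAtom, symbol, x, y and z, and the atom URI scheme, are assumptions made up for illustration.

```python
SEMASE = "https://semase.org/ontology#"  # namespace from the query example
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def atoms_to_triples(uri, formula, symbols, positions):
    """Flatten one atomic system into (subject, predicate, object) tuples."""
    triples = [
        (uri, RDF_TYPE, SEMASE + "AtomicSystem"),
        (uri, SEMASE + "chemicalFormula", formula),
    ]
    for index, (symbol, position) in enumerate(zip(symbols, positions)):
        atom_uri = f"{uri}/atom/{index}"  # hypothetical naming scheme
        triples.append((uri, SEMASE + "hasAtom", atom_uri))  # hypothetical
        triples.append((atom_uri, SEMASE + "symbol", symbol))  # hypothetical
        for axis, value in zip("xyz", position):
            # One hypothetical predicate per Cartesian coordinate.
            triples.append((atom_uri, SEMASE + axis, float(value)))
    return triples


triples = atoms_to_triples(
    "ase:water-001",
    "H2O",
    ["O", "H", "H"],
    [(0, 0, 0), (0, 0.76, 0.59), (0, -0.76, 0.59)],
)
for triple in triples[:4]:
    print(triple)
```

Each atom becomes its own node linked from the system node, which is one common way to keep per-atom data queryable. The real graph layout used by SemASE may be organized differently.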
Loading a molecule from the knowledge graph
from ase import Atoms
# Reconstruct the Atoms object from the knowledge graph
water = Atoms()
water.update_from_kg(uri="ase:water-001")
print(water.get_chemical_formula()) # H2O
print(water.positions)
The update_from_kg() method queries the SPARQL endpoint for the given URI and
populates the object's attributes from the stored triples.
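The reverse direction, rebuilding state from stored triples, can be sketched in the same spirit. Again this is plain Python rather than SemASE code, and the predicates hasAtom, index, symbol, x, y and z are hypothetical. The point it illustrates is that triples come back unordered, so the loader has to regroup them by subject and restore the atom order explicitly.

```python
from collections import defaultdict

SEMASE = "https://semase.org/ontology#"  # namespace from the query example


def state_from_triples(uri, triples):
    """Rebuild (symbols, positions) for `uri` from unordered triples."""
    atom_uris = []
    properties = defaultdict(dict)  # subject -> {predicate: object}
    for subject, predicate, obj in triples:
        if subject == uri and predicate == SEMASE + "hasAtom":
            atom_uris.append(obj)
        else:
            properties[subject][predicate] = obj
    # Triples carry no order, so sort the atoms by their stored index.
    atoms = sorted(
        (properties[atom_uri] for atom_uri in atom_uris),
        key=lambda props: props[SEMASE + "index"],
    )
    symbols = [props[SEMASE + "symbol"] for props in atoms]
    positions = [
        tuple(props[SEMASE + axis] for axis in "xyz") for props in atoms
    ]
    return symbols, positions


def atom(uri, index, symbol, x, y, z):
    """Helper producing the hypothetical triples for one atom."""
    values = {"index": index, "symbol": symbol, "x": x, "y": y, "z": z}
    return [(uri, SEMASE + name, value) for name, value in values.items()]


# Deliberately shuffled, as a query result might be.
stored = (
    atom("ase:water-001/atom/2", 2, "H", 0.0, -0.76, 0.59)
    + [
        ("ase:water-001", SEMASE + "hasAtom", f"ase:water-001/atom/{i}")
        for i in (1, 2, 0)
    ]
    + atom("ase:water-001/atom/0", 0, "O", 0.0, 0.0, 0.0)
    + atom("ase:water-001/atom/1", 1, "H", 0.0, 0.76, 0.59)
)
symbols, positions = state_from_triples("ase:water-001", stored)
print(symbols)
print(positions)
```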
Saving a periodic crystal
from ase.build import bulk
si = bulk('Si', 'diamond', a=5.43)
si.save_to_kg(uri="ase:silicon-diamond")
Both the Atoms object and its associated Cell are committed together. The
cell vectors and periodic boundary conditions are stored as part of the graph.
Querying the knowledge graph directly
SemASE also exposes its SPARQL client for custom queries:
from semase.sparql.client import SparqlClient
client = SparqlClient(store=store) # or SparqlClient(endpoint="http://...")
results = client.query("""
    PREFIX semase: <https://semase.org/ontology#>
    SELECT ?system ?formula WHERE {
        ?system a semase:AtomicSystem ;
                semase:chemicalFormula ?formula .
    }
""")
for row in results:
    print(row["system"], row["formula"])
Updating an existing entry
If a URI already exists in the knowledge graph, save_to_kg() replaces its
triples with the current state of the object:
from ase import Atoms
water = Atoms()
water.update_from_kg(uri="ase:water-001")
# Modify the structure
water.positions[1] += [0, 0.01, 0]
# Save the updated version back
water.save_to_kg(uri="ase:water-001")
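One way to get "replace" semantics on a SPARQL endpoint is to delete every triple of the subject and insert the new ones in a single SPARQL 1.1 Update request. The sketch below only builds such a request as a string. Whether SemASE does it this way is not stated here, and both the helper build_replace_update and the IRI https://example.org/ase/water-001 are hypothetical.

```python
def build_replace_update(subject_iri, predicate_objects):
    """Return a SPARQL 1.1 Update request replacing all triples of a subject.

    `predicate_objects` holds (predicate IRI, object term) pairs; object
    terms must already be valid SPARQL terms such as '"H2O"' or '<iri>'.
    """
    delete = f"DELETE WHERE {{ <{subject_iri}> ?p ?o }}"
    rows = "\n".join(
        f"  <{subject_iri}> <{predicate}> {term} ."
        for predicate, term in predicate_objects
    )
    insert = f"INSERT DATA {{\n{rows}\n}}"
    # Operations in one update request are separated by ';' and run in order.
    return f"{delete} ;\n{insert}"


update = build_replace_update(
    "https://example.org/ase/water-001",  # hypothetical expansion of ase:water-001
    [("https://semase.org/ontology#chemicalFormula", '"H2O"')],
)
print(update)
```

This pattern removes only triples whose subject is the given IRI. If the stored structure spans several linked nodes, for example one node per atom, those nodes would need their own delete patterns.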
License
Roadmap
The following ideas are under consideration for the project roadmap:
- Test whether the package works end to end on a toy data example.
- SemASE currently saves objects to a knowledge graph shared among researchers. This could be improved by integrating a file store. The University of Stuttgart uses DARUS, so a module that saves simulation data to DARUS would be useful.
- Integrate these tools with research workflows to make computations repeatable.