2.8 KiB
Good Bye HIDA
Small script to transform XML Documents of the HIDA/MIDAS architecture to a sqlite database.
Prerequisites
create a virtual environment:
python3 -m venv venv
activate the virtual environment:
source venv/bin/activate
install requirements:
pip install -r requirements.txt
place the XML files in the docs folder or for evaluation purposes few files in the test-docs folder.
Purpose
The program iterates through the docs dir and in a first run, it builds database schemas from the structure of the XML files, which will be used to create a data model for an SQLite database. In a second iteration, every node with children or node with a "txt" element will be an entity (beginning with c__) all other elements will be attributes (beginning with f__). Entities will be connected with an uuid foreign key to their parent entity in relational tables (beginning with r__).
Usage
To have a test run, place XML-files in a dir named test-docs, then type
python3 goodByeHida.py --buildSchemas True
You will get a dir test-schemas and a sqlite database test.db with the imported data.
If everything looks good you can run the script with the docs folder:
python3 goodByeHida.py --production True --buildSchemas True
You will get a dir schemas and a sqlite database database.db with the imported data.
If you like to restart the process and delete the database, type:
python3 goodByeHida.py --production True --buildSchemas True --deleteDatabase True
Import data WissKI
Run skripts in this order: importMaterials.py importAdministrator.py importSource.py importLiterature.py importArtist.py importWorkshops.py importAdministratorStatus.py importArtistRelation.py importMarks.py importInspectionMarks.py importInspectionMarkLocation.py importInspectionMarkRelation.py importMarkDatingInfo.py importSourceReference.py importClient.py importGoldsmithRelation.py importOrigin.py importBirth.py importDeath.py importArtifactRelation.py importNumDation.py importArtifacts.py ImportMarkInfo.py importMarkInformation.py importArtifactToMarkAssignments.py
Roadmap
From XML to database
- Build database schemas from XML files
- Parse HIDA/MIDAS XML files to SQL database
From database to WissKI
- Importer for material
- Importer for artifacts
- Importer for marks
- Importer for mark information
- Importer for artifact to mark assignments
- Importer for artists
- Importer for inspection marks
- Importer for literature
- Importer for continuation of the workshop
- Importer for relation to artist
Other
- Reduce redundancy by importing features: first collect features possibilies in own table and then import them in a second step