.vscode new commit 2025-09-09 10:16:31 +02:00
models full import scripts 2025-02-21 22:26:38 +01:00
.$ngk.drawio.bkp full import scripts 2025-02-21 22:26:38 +01:00
.$ngk_side.drawio.dtmp full import scripts 2025-02-21 22:26:38 +01:00
.gitignore new commit 2025-09-09 10:16:31 +02:00
00_start.py new commit 2025-09-09 10:16:31 +02:00
01_importMaterialsAndTechnique.py new commit 2025-09-09 10:16:31 +02:00
02_importAdministrator.py new commit 2025-09-09 10:16:31 +02:00
03_importAdministratorStatus.py new commit 2025-09-09 10:16:31 +02:00
03_importSource.py new commit 2025-09-09 10:16:31 +02:00
04_importArtistSourceReferenceAssignment.py new commit 2025-09-09 10:16:31 +02:00
04_importMarks.py new commit 2025-09-09 10:16:31 +02:00
04_importSourceReferenceAssignment.py new commit 2025-09-09 10:16:31 +02:00
05_importArtist.py new commit 2025-09-09 10:16:31 +02:00
06_importLiterature.py new commit 2025-09-09 10:16:31 +02:00
07_importInspectionMark.py new commit 2025-09-09 10:16:31 +02:00
07_importJournalAssignment.py new commit 2025-09-09 10:16:31 +02:00
07_importLiteratureReferenceAssignment.py new commit 2025-09-09 10:16:31 +02:00
07_importParentLiteratureAssignment.py new commit 2025-09-09 10:16:31 +02:00
08_importInspectionMarkLocation.py new commit 2025-09-09 10:16:31 +02:00
09_importInspectionMarkRelation.py new commit 2025-09-09 10:16:31 +02:00
10_importMarkDatingInfo.py new commit 2025-09-09 10:16:31 +02:00
12_importBirth.py new commit 2025-09-09 10:16:31 +02:00
13_importDeath.py new commit 2025-09-09 10:16:31 +02:00
14_importDating.py new commit 2025-09-09 10:16:31 +02:00
15_importGoldsmithRelation.py new commit 2025-09-09 10:16:31 +02:00
16_importClient.py new commit 2025-09-09 10:16:31 +02:00
17_importMentioned.py new commit 2025-09-09 10:16:31 +02:00
18_importNumDating.py new commit 2025-09-09 10:16:31 +02:00
19_importOriginAssignment.py new commit 2025-09-09 10:16:31 +02:00
20_importWorkshops.py new commit 2025-09-09 10:16:31 +02:00
21_importArtifacts.py new commit 2025-09-09 10:16:31 +02:00
22_importArtifactRelation.py new commit 2025-09-09 10:16:31 +02:00
24_importArtistAssignment.py new commit 2025-09-09 10:16:31 +02:00
25_importMarkInformation.py new commit 2025-09-09 10:16:31 +02:00
26_importPhotographer.py new commit 2025-09-09 10:16:31 +02:00
98__r__importArtifactToArtistRelationRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importArtifactToClientAssignmentRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importArtifactToInspectionMarkLocationRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importArtifactToLiteratureReferenceAssignmentRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importArtifactToMarkInformationAssignmentRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importArtifactToMaterialRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importArtifactToNumericeDateRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importArtifactToPhotographRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importArtifactToRelationRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importArtifactToSourceRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importArtifactToStatusAdministratorRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importArtistToBirthRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importArtistToDeathRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importArtistToGoldsmithRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importArtistToLiteratureReferenceRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importArtistToMentionedRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importArtistToOriginRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importArtistToWorkshopRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importInspectionMarkDatingInformationAssignmentRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importInspectionMarkRelationRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importInspectionMarkToLiteratureReferenceRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importLiteratureToJournalRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importLiteratureToParentPublicationRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importMarkToDatingRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importMarkToLiteratureRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importMarkToMarkInformationRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importMarkToSourceRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importSourceToDateRelation.py new commit 2025-09-09 10:16:31 +02:00
98__r__importSourceToLiteratureReferenceAssignmentRelation.py new commit 2025-09-09 10:16:31 +02:00
99_deleter.py full import scripts 2025-02-21 22:26:38 +01:00
__init__.py first commit 2024-02-07 15:33:41 +01:00
buildSchemas.py first commit 2024-02-07 15:33:41 +01:00
cleanedProcessedRows.csv full import scripts 2025-02-21 22:26:38 +01:00
dataWrangler.py full import scripts 2025-02-21 22:26:38 +01:00
goodByeHida.py first commit 2024-02-07 15:33:41 +01:00
importer.py first commit 2024-02-07 15:33:41 +01:00
initDb.py new commit 2025-09-09 10:16:31 +02:00
initSchemas.py new commit 2025-09-09 10:16:31 +02:00
ngk.drawio full import scripts 2025-02-21 22:26:38 +01:00
prepareArtifact.py full import scripts 2025-02-21 22:26:38 +01:00
processedRows.csv full import scripts 2025-02-21 22:26:38 +01:00
README.md full import scripts 2025-02-21 22:26:38 +01:00
requirements.txt new commit 2025-09-09 10:16:31 +02:00
utils.py first commit 2024-02-07 15:33:41 +01:00

Good Bye HIDA

A small script that transforms XML documents in the HIDA/MIDAS format into an SQLite database.

Prerequisites

Create a virtual environment:

python3 -m venv venv

Activate the virtual environment:

source venv/bin/activate

Install the requirements:

pip install -r requirements.txt

Place the XML files in the docs folder or, for evaluation purposes, a few files in the test-docs folder.

Purpose

The program iterates through the docs directory twice. In the first pass, it derives database schemas from the structure of the XML files; these schemas define the data model for an SQLite database. In the second pass, every node that has children or a "txt" element becomes an entity (prefixed with c__), and all other elements become attributes (prefixed with f__). Each entity is linked to its parent entity by a UUID foreign key in a relation table (prefixed with r__).
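The classification rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in goodByeHida.py: the sample XML, the function names `is_entity` and `collect`, the reading of "txt" as a child element, and the way relation names are formed are all assumptions made for the example.

```python
import uuid
import xml.etree.ElementTree as ET

# Hypothetical sample document; real HIDA/MIDAS files will look different.
SAMPLE = """
<obj>
  <artist>
    <name>Example</name>
    <birth><txt>1650</txt></birth>
  </artist>
  <title>Cup</title>
</obj>
"""


def is_entity(element):
    # Assumption: a node counts as an entity when it has child elements,
    # which also covers nodes that contain a <txt> child.
    return len(element) > 0


def collect(element, parent_uuid=None, out=None):
    """Walk the tree and sort nodes into entities, attributes and relations."""
    if out is None:
        out = {"entities": [], "attributes": [], "relations": []}
    entity_uuid = str(uuid.uuid4())
    out["entities"].append(("c__" + element.tag, entity_uuid))
    if parent_uuid is not None:
        # The relation row links the child entity to its parent via UUIDs.
        out["relations"].append(("r__" + element.tag, parent_uuid, entity_uuid))
    for child in element:
        if is_entity(child):
            collect(child, entity_uuid, out)
        else:
            value = (child.text or "").strip()
            out["attributes"].append(("f__" + child.tag, entity_uuid, value))
    return out


result = collect(ET.fromstring(SAMPLE.strip()))
for kind, rows in result.items():
    print(kind, [row[0] for row in rows])
```

In this sketch, `obj`, `artist` and `birth` become entities, while `name`, `txt` and `title` become attributes of their enclosing entity.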

Usage

For a test run, place the XML files in a directory named test-docs, then run:

python3 goodByeHida.py --buildSchemas True 

This produces a directory test-schemas and an SQLite database test.db containing the imported data.

If everything looks good, run the script on the docs folder:

python3 goodByeHida.py --production True --buildSchemas True 

This produces a directory schemas and an SQLite database database.db containing the imported data.

To restart the process and delete the existing database, run:

python3 goodByeHida.py --production True --buildSchemas True --deleteDatabase True
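Because the flags above take a literal True as their value, one plausible way to parse them is with an explicit string-to-boolean converter (argparse's `type=bool` would treat any non-empty string, including "False", as true). The following is only a sketch under that assumption; `str_to_bool`, `build_parser` and `resolve_paths` are hypothetical names, and the actual parsing in goodByeHida.py may differ. The folder and database names are the ones mentioned in this README.

```python
import argparse


def str_to_bool(value):
    """Convert the literal strings used on the command line to booleans."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected True or False, got {value!r}")


def build_parser():
    parser = argparse.ArgumentParser(prog="goodByeHida.py")
    for flag in ("--production", "--buildSchemas", "--deleteDatabase"):
        parser.add_argument(flag, type=str_to_bool, default=False)
    return parser


def resolve_paths(production):
    """Pick input folder, schema folder and database file for the chosen mode."""
    if production:
        return {"docs": "docs", "schemas": "schemas", "database": "database.db"}
    return {"docs": "test-docs", "schemas": "test-schemas", "database": "test.db"}


args = build_parser().parse_args(["--production", "True", "--buildSchemas", "True"])
paths = resolve_paths(args.production)
print(args.buildSchemas, args.deleteDatabase, paths)
```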

Import data into WissKI

Run the scripts in this order:

  • importMaterials.py
  • importAdministrator.py
  • importSource.py
  • importLiterature.py
  • importArtist.py
  • importWorkshops.py
  • importAdministratorStatus.py
  • importArtistRelation.py
  • importMarks.py
  • importInspectionMarks.py
  • importInspectionMarkLocation.py
  • importInspectionMarkRelation.py
  • importMarkDatingInfo.py
  • importSourceReference.py
  • importClient.py
  • importGoldsmithRelation.py
  • importOrigin.py
  • importBirth.py
  • importDeath.py
  • importArtifactRelation.py
  • importNumDation.py
  • importArtifacts.py
  • ImportMarkInfo.py
  • importMarkInformation.py
  • importArtifactToMarkAssignments.py
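Running the scripts one after another could be automated with a small helper like the one below. This is a hypothetical convenience sketch, not part of the repository: `run_in_order` and `IMPORT_ORDER` are made-up names, and only the first three script names from the list are included for brevity. By default each script is started with the current Python interpreter, and the sequence stops at the first script that exits with an error.

```python
import subprocess
import sys

# First three entries of the import order listed above; extend as needed.
IMPORT_ORDER = ["importMaterials.py", "importAdministrator.py", "importSource.py"]


def run_in_order(scripts, run=None):
    """Run each script in sequence and return the names that completed."""
    if run is None:
        # check=True raises CalledProcessError when a script fails,
        # so later scripts never run against incomplete data.
        def run(script):
            subprocess.run([sys.executable, script], check=True)

    completed = []
    for script in scripts:
        run(script)
        completed.append(script)
    return completed


# Dry run with a stand-in runner that only prints what would be executed.
print(run_in_order(IMPORT_ORDER, run=lambda script: print("would run", script)))
```

Calling `run_in_order(IMPORT_ORDER)` without the `run` argument would execute the real scripts.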

Roadmap

From XML to database

  • Build database schemas from XML files
  • Parse HIDA/MIDAS XML files to SQL database

From database to WissKI

  • Importer for material
  • Importer for artifacts
  • Importer for marks
  • Importer for mark information
  • Importer for artifact to mark assignments
  • Importer for artists
  • Importer for inspection marks
  • Importer for literature
  • Importer for continuation of the workshop
  • Importer for relation to artist

Other

  • Reduce redundancy when importing features: first collect the possible feature values in a separate table, then import them in a second step
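The two-step idea in the last roadmap item could look roughly like this when illustrated with SQLite. The table names `material` and `artifact` and the sample rows are invented for the example; the actual feature tables and the WissKI import step are not shown here.

```python
import sqlite3

# Hypothetical source rows: (artifact title, material feature).
rows = [("cup", "silver"), ("bowl", "gold"), ("spoon", "silver")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE material (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.execute(
    "CREATE TABLE artifact ("
    "id INTEGER PRIMARY KEY, title TEXT, "
    "material_id INTEGER REFERENCES material(id))"
)

# Step 1: collect every distinct feature value once in its own table.
conn.executemany(
    "INSERT OR IGNORE INTO material (name) VALUES (?)",
    [(material,) for _, material in rows],
)

# Step 2: import the records and point them at the collected values.
for title, material in rows:
    material_id = conn.execute(
        "SELECT id FROM material WHERE name = ?", (material,)
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO artifact (title, material_id) VALUES (?, ?)",
        (title, material_id),
    )

material_count = conn.execute("SELECT COUNT(*) FROM material").fetchone()[0]
artifact_count = conn.execute("SELECT COUNT(*) FROM artifact").fetchone()[0]
print(material_count, artifact_count)
```

Each feature value is stored once, and the imported records reference it by id instead of repeating the text.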