
Good Bye HIDA

A small script that transforms XML documents of the HIDA/MIDAS architecture into an SQLite database.

Prerequisites

Create a virtual environment:

python3 -m venv venv

Activate the virtual environment:

source venv/bin/activate

Install the requirements:

pip install -r requirements.txt

Place the XML files in the docs folder or, for evaluation purposes, a few files in the test-docs folder.

Purpose

The program iterates through the docs directory twice. In the first pass, it builds database schemas from the structure of the XML files; these schemas are used to create the data model for an SQLite database. In the second pass, every node that has children or a "txt" element becomes an entity (prefixed with c__), and all other elements become attributes (prefixed with f__). Each entity is connected to its parent entity by a UUID foreign key in a relation table (prefixed with r__).
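The entity/attribute rule can be illustrated with a minimal sketch. This is not the code from importer.py or buildSchemas.py; the function names classify and walk and the sample XML are hypothetical, and the sketch assumes that a "txt" element means either a child element or an XML attribute named txt.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; real HIDA/MIDAS exports look different.
SAMPLE = """
<obj>
  <title>Portrait</title>
  <artist txt="Unknown"></artist>
  <location><city>Marburg</city></location>
</obj>
"""


def classify(node):
    """Return the prefixed name of a node: c__ for entities, f__ for attributes."""
    has_children = len(node) > 0
    # Assumption: "txt" may be a child element or an XML attribute.
    has_txt = node.find("txt") is not None or "txt" in node.attrib
    prefix = "c__" if has_children or has_txt else "f__"
    return prefix + node.tag


def walk(node, parent=None):
    """Yield (name, parent name) pairs in document order."""
    yield classify(node), (classify(parent) if parent is not None else None)
    for child in node:
        yield from walk(child, node)


root = ET.fromstring(SAMPLE)
for name, parent_name in walk(root):
    print(name, "<-", parent_name)
```

In this sketch, obj, artist and location are classified as entities, while title and city are classified as attributes of their parent entity.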

Usage

For a test run, place XML files in a directory named test-docs, then run:

python3 goodByeHida.py --buildSchemas True 

You will get a test-schemas directory and an SQLite database, test.db, containing the imported data.
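To check the result, you can list the generated tables grouped by prefix. The following is a minimal sketch using Python's built-in sqlite3 module. The helper tables_by_prefix and the table and column names in the demo are hypothetical; the actual names depend on your XML files. The demo builds an in-memory database so that it runs without any input data.

```python
import sqlite3
from collections import defaultdict


def tables_by_prefix(conn):
    """Group table names by their c__ (entity) or r__ (relation) prefix."""
    groups = defaultdict(list)
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    for (name,) in rows:
        prefix = name[:3] if name[:3] in ("c__", "r__") else "other"
        groups[prefix].append(name)
    return dict(groups)


# Demo with hypothetical tables; use sqlite3.connect("test.db") for real output.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE c__obj (uuid TEXT PRIMARY KEY, f__title TEXT)")
conn.execute("CREATE TABLE c__artist (uuid TEXT PRIMARY KEY, f__name TEXT)")
conn.execute("CREATE TABLE r__obj__artist (parent_uuid TEXT, child_uuid TEXT)")
print(tables_by_prefix(conn))
```

Replacing ":memory:" with "test.db" (and dropping the CREATE TABLE statements) should list the tables created by the test run.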

If everything looks good, you can run the script on the docs folder:

python3 goodByeHida.py --production True --buildSchemas True 

You will get a schemas directory and an SQLite database, database.db, containing the imported data.

To restart the process and delete the database, run:

python3 goodByeHida.py --production True --buildSchemas True --deleteDatabase True