Human-Friendly Exploration Tool for Archivematica METS Files
This tool is based on METSFlask by Tim Walsh.
METSFlask provides a foundation for interacting with and visualizing METS (Metadata Encoding and Transmission Standard) files. Building on this work, this tool offers additional features designed specifically for exploring and understanding Archivematica METS files, with a focus on accessibility and user-friendly navigation.
Install the latest version from the GitHub repository:
$ pip install git+https://github.com/nakamura196/mets_tools.git
Documentation is hosted on this repository's GitHub Pages site.
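from mets_tools.core import METSFile
# Path to a local Archivematica METS file
local_file = "./test.xml"
ins = METSFile(local_file)
# Parse the METS file before calling any of the other methods
ins.parse_mets()
# Table of the original files described in the METS file
original_files = ins.get_original_files()
original_files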
|   | amdsec_id | filepath | uuid | hashtype | hashvalue | bytes | format | version | puid | modified_date | fits_modified_unixtime | filename | size |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | amdSec_2 | S001.png | 386c4295-64fb-45b9-b5c6-3adfcd7f2bcf | sha256 | 5288586020b7ff120ad53f94432f719aa0ca1c5e094dc9... | 9097282 | Portable Network Graphics | 1.2 | <a href="http://nationalarchives.gov.uk/PRONOM... | 2024-10-25T03:00:20Z |  | S001.png | 9 MB |
ins.visualize_file_format_counts()
ins.visualize_file_events_count()
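For readers curious what a file format count involves, the following is a minimal sketch using only the Python standard library. It is not the implementation used by mets_tools. It assumes that format names are recorded in PREMIS `formatName` elements, and the `SAMPLE` document and the `count_formats` helper are hypothetical names introduced for illustration. The sample omits the `amdSec` and `techMD` wrappers for brevity.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, heavily trimmed METS document for illustration only.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:premis="http://www.loc.gov/premis/v3">
  <premis:object><premis:objectCharacteristics><premis:format><premis:formatDesignation>
    <premis:formatName>Portable Network Graphics</premis:formatName>
  </premis:formatDesignation></premis:format></premis:objectCharacteristics></premis:object>
  <premis:object><premis:objectCharacteristics><premis:format><premis:formatDesignation>
    <premis:formatName>Portable Network Graphics</premis:formatName>
  </premis:formatDesignation></premis:format></premis:objectCharacteristics></premis:object>
  <premis:object><premis:objectCharacteristics><premis:format><premis:formatDesignation>
    <premis:formatName>Tagged Image File Format</premis:formatName>
  </premis:formatDesignation></premis:format></premis:objectCharacteristics></premis:object>
</mets:mets>"""


def count_formats(xml_text):
    """Count how often each PREMIS formatName value occurs (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # "{*}" matches any namespace (Python 3.8+), so PREMIS v2 and v3 both work.
    names = [(el.text or "").strip() for el in root.findall(".//{*}formatName")]
    return Counter(name for name in names if name)


if __name__ == "__main__":
    for name, count in count_formats(SAMPLE).most_common():
        print(f"{name}: {count}")
```

The resulting `Counter` can be passed to any plotting library to draw a bar chart of formats.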
df = ins.parse_file_sec()
df
|   | USE | File ID | Group ID | ADMID | File Location | LOCTYPE | OTHERLOCTYPE |
|---|---|---|---|---|---|---|---|
| 0 | original | file-386c4295-64fb-45b9-b5c6-3adfcd7f2bcf | Group-386c4295-64fb-45b9-b5c6-3adfcd7f2bcf | amdSec_2 | objects/S001.png | OTHER | SYSTEM |
| 1 | submissionDocumentation | file-f6acdbb3-095f-4771-8fd9-8c5b74e296ec | Group-f6acdbb3-095f-4771-8fd9-8c5b74e296ec | amdSec_5 | objects/submissionDocumentation/transfer-png-c... | OTHER | SYSTEM |
| 2 | preservation | file-6f93b80d-fb6d-41b6-819e-b9bc3ad085e6 | Group-386c4295-64fb-45b9-b5c6-3adfcd7f2bcf | amdSec_1 | objects/S001-6f93b80d-fb6d-41b6-819e-b9bc3ad08... | OTHER | SYSTEM |
| 3 | text/ocr | file-bd353cfc-64aa-4f48-af61-4a4d8ee3a55b | Group-386c4295-64fb-45b9-b5c6-3adfcd7f2bcf | amdSec_3 | objects/metadata/OCRfiles/S001-386c4295-64fb-4... | OTHER | SYSTEM |
| 4 | metadata | file-d8aa2e08-71bd-4852-9d87-e6eba354affb | Group-d8aa2e08-71bd-4852-9d87-e6eba354affb | amdSec_4 | objects/metadata/transfers/png-c4688ddd-8bb7-4... | OTHER | SYSTEM |
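The columns in the table above correspond to attributes in the METS `fileSec`. As a rough illustration of where each value comes from, here is a minimal sketch using only the Python standard library. It is not the mets_tools implementation. The `SAMPLE` document, its shortened IDs, and the `file_sec_rows` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard METS and XLink namespace URIs.
NS = {
    "mets": "http://www.loc.gov/METS/",
    "xlink": "http://www.w3.org/1999/xlink",
}

# Hypothetical, trimmed-down fileSec for illustration only.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="original">
      <mets:file ID="file-0001" GROUPID="Group-0001" ADMID="amdSec_2">
        <mets:FLocat xlink:href="objects/S001.png" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>
      </mets:file>
    </mets:fileGrp>
    <mets:fileGrp USE="preservation">
      <mets:file ID="file-0002" GROUPID="Group-0001" ADMID="amdSec_1">
        <mets:FLocat xlink:href="objects/S001-0002.tif" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>"""


def file_sec_rows(xml_text):
    """Return one dict per mets:file element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    href_key = "{%s}href" % NS["xlink"]  # namespaced attribute name
    rows = []
    for grp in root.findall("mets:fileSec/mets:fileGrp", NS):
        for file_el in grp.findall("mets:file", NS):
            flocat = file_el.find("mets:FLocat", NS)
            has_loc = flocat is not None
            rows.append({
                "USE": grp.get("USE"),  # USE lives on the enclosing fileGrp
                "File ID": file_el.get("ID"),
                "Group ID": file_el.get("GROUPID"),
                "ADMID": file_el.get("ADMID"),
                "File Location": flocat.get(href_key) if has_loc else None,
                "LOCTYPE": flocat.get("LOCTYPE") if has_loc else None,
                "OTHERLOCTYPE": flocat.get("OTHERLOCTYPE") if has_loc else None,
            })
    return rows


if __name__ == "__main__":
    for row in file_sec_rows(SAMPLE):
        print(row)
```

In the table above, the original PNG and its preservation copy share a Group ID, which is how a derivative can be traced back to its source file.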
ins.visualize_structMap()
StructMap (TYPE: physical, LABEL: Archivematica default)
└── Directory: png-e5e48d8a-421a-461d-8e55-468bf37253a8
    └── Directory: objects
        ├── Item: S001.png
        ├── Item: S001-6f93b80d-fb6d-41b6-819e-b9bc3ad085e6.tif
        ├── Directory: submissionDocumentation
        │   └── Directory: transfer-png-c4688ddd-8bb7-4593-b4bf-f4302ea6882c
        │       └── Item: METS.xml
        └── Directory: metadata
            ├── Directory: OCRfiles
            │   └── Item: S001-386c4295-64fb-45b9-b5c6-3adfcd7f2bcf.txt
            └── Directory: transfers
                └── Directory: png-c4688ddd-8bb7-4593-b4bf-f4302ea6882c
                    └── Item: directory_tree.txt
StructMap (TYPE: logical, LABEL: Normative Directory Structure)
└── Directory: png-e5e48d8a-421a-461d-8e55-468bf37253a8
    └── Directory: objects
        ├── Item: S001.png
        ├── Item: S001-6f93b80d-fb6d-41b6-819e-b9bc3ad085e6.tif
        ├── Directory: submissionDocumentation
        │   └── Directory: transfer-png-c4688ddd-8bb7-4593-b4bf-f4302ea6882c
        │       └── Item: METS.xml
        └── Directory: metadata
            ├── Directory: OCRfiles
            │   └── Item: S001-386c4295-64fb-45b9-b5c6-3adfcd7f2bcf.txt
            └── Directory: transfers
                └── Directory: png-c4688ddd-8bb7-4593-b4bf-f4302ea6882c
                    └── Item: directory_tree.txt
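A tree like the ones above can be produced by walking the nested `mets:div` elements of each `structMap` recursively. The following is a minimal sketch of that idea using only the Python standard library. It is not the mets_tools implementation. The `SAMPLE` document and the `render_divs` and `render_struct_maps` helpers are hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "{http://www.loc.gov/METS/}"

# Hypothetical, trimmed-down structMap for illustration only.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:structMap TYPE="physical" LABEL="Archivematica default">
    <mets:div TYPE="Directory" LABEL="sip">
      <mets:div TYPE="Directory" LABEL="objects">
        <mets:div TYPE="Item" LABEL="S001.png"/>
        <mets:div TYPE="Directory" LABEL="metadata">
          <mets:div TYPE="Item" LABEL="notes.txt"/>
        </mets:div>
      </mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>"""


def render_divs(parent, prefix=""):
    """Return tree lines for the direct mets:div children of parent, recursively."""
    lines = []
    divs = parent.findall(METS_NS + "div")
    for index, div in enumerate(divs):
        is_last = index == len(divs) - 1
        connector = "└── " if is_last else "├── "
        lines.append(f"{prefix}{connector}{div.get('TYPE')}: {div.get('LABEL')}")
        # Children of the last sibling need no vertical guide line.
        child_prefix = prefix + ("    " if is_last else "│   ")
        lines.extend(render_divs(div, child_prefix))
    return lines


def render_struct_maps(xml_text):
    """Render every structMap in the document as an indented text tree."""
    root = ET.fromstring(xml_text)
    lines = []
    for struct_map in root.findall(METS_NS + "structMap"):
        lines.append(
            f"StructMap (TYPE: {struct_map.get('TYPE')}, "
            f"LABEL: {struct_map.get('LABEL')})"
        )
        lines.extend(render_divs(struct_map))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_struct_maps(SAMPLE))
```

Each nesting level adds four characters of prefix, so the indentation depth mirrors the directory depth in the package.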
ins.show_file_changes()
ファイル名の変更は見つかりませんでした。 (No file name changes were found.)