Exam metadata generation and ingest for DSpace

This is a generalized workflow followed by the University of Toronto Libraries for its DSpace-based repository of previous exam questions from its three campuses.

step1.py creates Dublin Core metadata from each PDF's file name and the department codes in the spreadsheet for the given campus.
step2.py packages each exam as a DSpace simple archive consisting of the PDF, the Dublin Core metadata in XML, and a "contents" file. These archives can then be imported using the DSpace admin batch import functionality.

System Requirements

Installation

Clone or download the scripts to your local machine. Ensure you have the prerequisite software installed before running the scripts.

You must run step1.py before running step2.py. More details about usage and the workflow are given below.

Usage

python step1.py /directory_path_to_pdf_exams/ campus[A, B or C]
python step2.py /directory_path_to_pdf_exams/
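
Because step1.py must finish before step2.py starts, the two commands can be chained from a small wrapper. The following is only a sketch: the function names build_commands and run_workflow are hypothetical, and the campus value is passed through unchanged because the exact form the script expects is not spelled out here.

```python
import subprocess
import sys


def build_commands(pdf_dir, campus):
    """Return the two command lines in the order they must run."""
    return [
        # Step 1: generate metadata (needs the campus argument).
        [sys.executable, "step1.py", pdf_dir, campus],
        # Step 2: package the archives (directory only).
        [sys.executable, "step2.py", pdf_dir],
    ]


def run_workflow(pdf_dir, campus):
    """Run step1.py, then step2.py, stopping at the first failure."""
    for command in build_commands(pdf_dir, campus):
        # check=True raises CalledProcessError if a step exits non-zero,
        # so step2.py never runs after a failed step1.py.
        subprocess.run(command, check=True)
```

Calling run_workflow("/directory_path_to_pdf_exams/", "A") from the directory that holds the scripts would run both steps in sequence.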

Workflow

1. Scanning & Filenaming

Exams are scanned or created as PDFs, with file names that follow the file naming convention (see exam-pdf-filename-conventions.png).
Each PDF file name must contain the course code, month and year.
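
To illustrate how a file name can carry the course code, month and year, here is a minimal parsing sketch. The pattern is an assumption inferred only from the sample name mat700h-ap18; the real convention may differ. The month table and the function name parse_exam_filename are hypothetical as well.

```python
import re
from pathlib import Path

# Hypothetical pattern inferred from the sample name "mat700h-ap18":
# <course code>-<two-letter month><two-digit year>.
FILENAME_RE = re.compile(
    r"^(?P<course>[a-z]{3}\d{3}[a-z])-(?P<month>[a-z]{2})(?P<year>\d{2})$"
)

# Assumed month abbreviations; only "ap" is attested by the sample name.
MONTHS = {"ap": "April", "au": "August", "de": "December"}


def parse_exam_filename(pdf_path):
    """Split an exam PDF name into course code, month and four-digit year."""
    stem = Path(pdf_path).stem.lower()
    match = FILENAME_RE.match(stem)
    if match is None:
        raise ValueError(f"unexpected exam file name: {pdf_path}")
    month_code = match["month"]
    if month_code not in MONTHS:
        raise ValueError(f"unknown month code {month_code!r} in {pdf_path}")
    return {
        "course": match["course"].upper(),
        "month": MONTHS[month_code],
        # Assumes every exam dates from 2000 or later.
        "year": 2000 + int(match["year"]),
    }
```

Rejecting names that do not match, rather than guessing, makes scanning or naming mistakes visible before any metadata is generated.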

2. Generate metadata

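Run step1.py to generate metadata from each PDF's file name.
The script also uses the CSV file of departmental codes for the selected campus (Campus_A.csv, Campus_B.csv or Campus_C.csv) for mapping.

A sample generated metadata file is included in the repository as mat700h-ap18.xml.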
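
The sketch below shows one way the parsed values and a departmental code lookup could be turned into a DSpace dublin_core.xml document. It is not the code in step1.py. The CSV column names ("code", "department"), the choice of Dublin Core fields, and both function names are assumptions made for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET


def load_departments(csv_text):
    """Map a three-letter course prefix to a department name.

    The column names "code" and "department" are hypothetical.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["code"].upper(): row["department"] for row in reader}


def build_dublin_core(course, month, year, departments):
    """Return the content of a DSpace dublin_core.xml file as a string."""
    root = ET.Element("dublin_core")
    # Illustrative field choices; the real script may use others.
    fields = [
        ("title", "none", f"{course} {month} {year} exam"),
        ("date", "issued", str(year)),
        ("subject", "none", departments.get(course[:3], "Unknown department")),
    ]
    for element, qualifier, value in fields:
        # Each metadata value is a <dcvalue> with element/qualifier attributes.
        dcvalue = ET.SubElement(root, "dcvalue", element=element, qualifier=qualifier)
        dcvalue.text = value
    return ET.tostring(root, encoding="unicode")
```

A campus file could be loaded with load_departments(open("Campus_A.csv", encoding="utf-8").read()), provided its header row matches the assumed column names.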

3. Package DSpace Simple Archive

Run step2.py to package the PDFs and metadata into DSpace simple archives for ingest.
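
As a rough picture of what one packaged item looks like, the sketch below builds a single Simple Archive Format item directory holding the PDF, dublin_core.xml and a contents file. The function name package_item and the item directory name are hypothetical; step2.py may lay things out differently.

```python
import shutil
from pathlib import Path


def package_item(pdf_path, dublin_core_xml, archive_root, item_name):
    """Create one Simple Archive Format item directory and return its path."""
    pdf_path = Path(pdf_path)
    item_dir = Path(archive_root) / item_name
    item_dir.mkdir(parents=True, exist_ok=True)

    # The bitstream itself.
    shutil.copy2(pdf_path, item_dir / pdf_path.name)
    # Item metadata; DSpace looks for this exact file name.
    (item_dir / "dublin_core.xml").write_text(dublin_core_xml, encoding="utf-8")
    # The contents file lists one bitstream file name per line.
    (item_dir / "contents").write_text(pdf_path.name + "\n", encoding="utf-8")
    return item_dir
```

Calling package_item once per exam, with a distinct item_name each time, yields one subdirectory per item under archive_root.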

4. Batch Import Into TSpace

Import the DSpace simple archives into their respective collections via DSpace batch import.
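
If the import is done by uploading a zip file through the DSpace admin interface (an assumption; this README does not say how the archives are delivered), the archive directory can be zipped with the standard library. The function name zip_archive is hypothetical.

```python
import shutil
from pathlib import Path


def zip_archive(archive_root, zip_base):
    """Zip the archive directory and return the path of the zip file.

    zip_base is the output path without the ".zip" suffix. Keep it outside
    archive_root so the zip file is not written into the tree being zipped.
    """
    # root_dir makes the item directories the top-level entries of the zip.
    zip_path = shutil.make_archive(str(zip_base), "zip", root_dir=str(archive_root))
    return Path(zip_path)
```

Each collection would need its own zip file, since the items are imported into their respective collections.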

License

DSpace Simple Archives Importer is licensed under the Apache License 2.0.
