including the processing log for process_transfers bin script in the transfer #37

Open
photomedia opened this issue Dec 15, 2021 · 1 comment
Labels
enhancement New feature or request

photomedia commented Dec 15, 2021

Currently, the transfer (and the resulting AIP) sent to Archivematica does not include the verbose processing log that the process_transfers command produces for each eprint. It would make sense to include this log as part of the "submission documentation", as described here:
https://www.archivematica.org/en/docs/archivematica-1.13/user-manual/transfer/transfer/#transfers-with-submission-documentation
That would mean adding an additional subfolder here:
amid>metadata>submissionDocumentation
and placing the verbose log from the process_transfers command for that eprint in there as a text file, named something like "processing_log.txt".
The log's summary (not the full log) is currently stored as comments on the EPrints Archivematica objects and, optionally, if you redirect the output of the bin script, as a log file on the file system.
Ideally, this log would actually be included in the transfer, along with each AIP.
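
As a rough illustration of the folder layout proposed above, here is a minimal Python sketch that writes a log into `metadata/submissionDocumentation/processing_log.txt` under a transfer folder. This is not the plugin's code: the function name `write_processing_log` and its arguments are hypothetical, and the sketch only assumes that the transfer folder for an eprint already has (or can have) a `metadata` subfolder.

```python
from pathlib import Path


def write_processing_log(transfer_dir, log_lines, filename="processing_log.txt"):
    """Write log lines to <transfer_dir>/metadata/submissionDocumentation/<filename>.

    Hypothetical helper: the name, arguments, and default filename are
    assumptions made for illustration only.
    """
    doc_dir = Path(transfer_dir) / "metadata" / "submissionDocumentation"
    # Create the subfolder (and any missing parents) if it does not exist yet.
    doc_dir.mkdir(parents=True, exist_ok=True)
    log_path = doc_dir / filename
    log_path.write_text("\n".join(log_lines) + "\n", encoding="utf-8")
    return log_path
```

Calling `write_processing_log("/archivematica-path/auto-transfers/50", lines)` would then place the file next to the other metadata for that transfer.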
Here is an example of what that log would look like for an eprint:

*Processing Archivematica ID: 50

  • [1] Export - start 50
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/01/Parish_MSc_S2021.pdf' '/archivematica-path/auto-transfers/50/objects/documents/documentid-171996/files/2470878/Parish_MSc_S2021.pdf'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/02/lightbox.jpg' '/archivematica-path/auto-transfers/50/objects/derivatives/2/lightbox.jpg'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/03/preview.jpg' '/archivematica-path/auto-transfers/50/objects/derivatives/3/preview.jpg'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/04/medium.jpg' '/archivematica-path/auto-transfers/50/objects/derivatives/4/medium.jpg'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/05/small.jpg' '/archivematica-path/auto-transfers/50/objects/derivatives/5/small.jpg'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/06/indexcodes.txt' '/archivematica-path/auto-transfers/50/objects/derivatives/6/indexcodes.txt'
  • [1] Write - /archivematica-path/auto-transfers/50/metadata/EP3.xml
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/revisions/14.xml' '/archivematica-path/auto-transfers/50/metadata/revisions/14.xml'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/revisions/7.xml' '/archivematica-path/auto-transfers/50/metadata/revisions/7.xml'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/revisions/12.xml' '/archivematica-path/auto-transfers/50/metadata/revisions/12.xml'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/revisions/8.xml' '/archivematica-path/auto-transfers/50/metadata/revisions/8.xml'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/revisions/1.xml' '/archivematica-path/auto-transfers/50/metadata/revisions/1.xml'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/revisions/10.xml' '/archivematica-path/auto-transfers/50/metadata/revisions/10.xml'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/revisions/21.xml' '/archivematica-path/auto-transfers/50/metadata/revisions/21.xml'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/revisions/20.xml' '/archivematica-path/auto-transfers/50/metadata/revisions/20.xml'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/revisions/2.xml' '/archivematica-path/auto-transfers/50/metadata/revisions/2.xml'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/revisions/19.xml' '/archivematica-path/auto-transfers/50/metadata/revisions/19.xml'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/revisions/5.xml' '/archivematica-path/auto-transfers/50/metadata/revisions/5.xml'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/revisions/6.xml' '/archivematica-path/auto-transfers/50/metadata/revisions/6.xml'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/revisions/11.xml' '/archivematica-path/auto-transfers/50/metadata/revisions/11.xml'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/revisions/16.xml' '/archivematica-path/auto-transfers/50/metadata/revisions/16.xml'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/revisions/9.xml' '/archivematica-path/auto-transfers/50/metadata/revisions/9.xml'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/revisions/15.xml' '/archivematica-path/auto-transfers/50/metadata/revisions/15.xml'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/revisions/4.xml' '/archivematica-path/auto-transfers/50/metadata/revisions/4.xml'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/revisions/17.xml' '/archivematica-path/auto-transfers/50/metadata/revisions/17.xml'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/revisions/13.xml' '/archivematica-path/auto-transfers/50/metadata/revisions/13.xml'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/revisions/3.xml' '/archivematica-path/auto-transfers/50/metadata/revisions/3.xml'
  • [1] Copy - '/opt/eprints3/archives/REPOID/documents/disk0/00/98/78/78/revisions/18.xml' '/archivematica-path/auto-transfers/50/metadata/revisions/18.xml'
  • [1] Manifest - Checksum correct for '/archivematica-path/auto-transfers/50/objects/derivatives/5/small.jpg' (74066631bd34ec0f30981588df3e0e0a)
  • [1] Manifest - Checksum correct for '/archivematica-path/auto-transfers/50/objects/derivatives/6/indexcodes.txt' (941453ba5b640531ec2011a1419ac808)
  • [1] Manifest - Checksum correct for '/archivematica-path/auto-transfers/50/objects/derivatives/4/medium.jpg' (707d925e129dab35ee2357d78163ee32)
  • [1] Manifest - Checksum correct for '/archivematica-path/auto-transfers/50/objects/derivatives/2/lightbox.jpg' (8670be38e3d766f28655d496d921c44c)
  • [1] Manifest - Checksum correct for '/archivematica-path/auto-transfers/50/objects/derivatives/3/preview.jpg' (d7e23469a2a1e9519da1591259f2e992)
  • [1] Manifest - Checksum correct for '/archivematica-path/auto-transfers/50/objects/documents/documentid-171996/files/2470878/Parish_MSc_S2021.pdf' (122f0e0b3a201f95f670a5a953a549f5)
  • [1] Export - end 50

This log would include warnings, for example, if a checksum was missing and was generated during export.
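
Once the log is stored as a text file, it can also be processed later, for example to count how many actions of each kind were logged for an eprint. The following is a minimal sketch that assumes every entry has the shape shown in the example above (an optional bullet, a bracketed number, an action name, a dash, and a detail string). The meaning of the bracketed number is not specified here, so the sketch just calls it `tag`; the function names are hypothetical.

```python
import re
from collections import Counter

# Matches lines such as "  • [1] Copy - '/src' '/dst'"; the bullet is optional.
LINE_RE = re.compile(
    r"^\s*(?:•\s*)?\[(?P<tag>\d+)\]\s+(?P<action>\w+)\s+-\s+(?P<detail>.*)$"
)


def parse_log_line(line):
    """Return (tag, action, detail) for a log entry, or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return int(m.group("tag")), m.group("action"), m.group("detail")


def summarize(lines):
    """Count log entries per action name (Export, Copy, Write, Manifest, ...)."""
    parsed = (parse_log_line(line) for line in lines)
    return Counter(entry[1] for entry in parsed if entry is not None)
```

Lines that do not follow this shape, such as the "*Processing Archivematica ID: 50" header, are skipped rather than treated as errors.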

I am proposing this as a future enhancement to the export plugin. Do you think it's worthwhile to add this?

@photomedia photomedia added the enhancement New feature or request label Dec 15, 2021

photomedia commented Jan 17, 2022

One way to include the processing log would be to use PREMIS format, so that Archivematica can actually use the metadata.

Information on including a premis.xml file in the transfer:
https://www.archivematica.org/en/docs/archivematica-1.13/user-manual/transfer/import-metadata/#premis-xml

Here is an updated example of what that would look like in XML (thanks to Artefactual for their assistance with this). The example describes 2 PDF objects, 4 events in total (2 on each file: a transfer and a fixity check), and 1 agent (the EPrintsArchivematica plugin itself):

<?xml version="1.0" encoding="UTF-8"?>

<premis:premis xmlns:premis="http://www.loc.gov/premis/v3" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/premis/v3 https://www.loc.gov/standards/premis/premis.xsd" version="3.0">
  
  <premis:object xsi:type="premis:file">
    <premis:objectIdentifier>
      <premis:objectIdentifierType>local</premis:objectIdentifierType>
      <premis:objectIdentifierValue>9999</premis:objectIdentifierValue>
    </premis:objectIdentifier>

    <premis:objectCharacteristics>
      <premis:format>
        <premis:formatDesignation>
          <premis:formatName></premis:formatName>
          <premis:formatVersion></premis:formatVersion>
        </premis:formatDesignation>
      </premis:format>
    </premis:objectCharacteristics>

    <premis:originalName>objects/documents/documentid-1111/files/9999/supplementalfile.pdf</premis:originalName>
    <premis:linkingEventIdentifier>
      <premis:linkingEventIdentifierType>EPrintsArchivematica Event ID</premis:linkingEventIdentifierType>
      <premis:linkingEventIdentifierValue>EA-3</premis:linkingEventIdentifierValue>
    </premis:linkingEventIdentifier>
    <premis:linkingEventIdentifier>
      <premis:linkingEventIdentifierType>EPrintsArchivematica Event ID</premis:linkingEventIdentifierType>
      <premis:linkingEventIdentifierValue>EA-4</premis:linkingEventIdentifierValue>
    </premis:linkingEventIdentifier>
  </premis:object>

  <premis:object xsi:type="premis:file">
    <premis:objectIdentifier>
      <premis:objectIdentifierType>local</premis:objectIdentifierType>
      <premis:objectIdentifierValue>2500117</premis:objectIdentifierValue>
    </premis:objectIdentifier>

    <premis:objectCharacteristics>
      <premis:format>
        <premis:formatDesignation>
          <premis:formatName></premis:formatName>
          <premis:formatVersion></premis:formatVersion>
        </premis:formatDesignation>
      </premis:format>
    </premis:objectCharacteristics>

    <premis:originalName>objects/documents/documentid-185158/files/2500117/TNeugebauer.pdf</premis:originalName>
    <premis:linkingEventIdentifier>
      <premis:linkingEventIdentifierType>EPrintsArchivematica Event ID</premis:linkingEventIdentifierType>
      <premis:linkingEventIdentifierValue>EA-1</premis:linkingEventIdentifierValue>
    </premis:linkingEventIdentifier>
    <premis:linkingEventIdentifier>
      <premis:linkingEventIdentifierType>EPrintsArchivematica Event ID</premis:linkingEventIdentifierType>
      <premis:linkingEventIdentifierValue>EA-2</premis:linkingEventIdentifierValue>
    </premis:linkingEventIdentifier>
  </premis:object>


  <premis:event>
    <premis:eventIdentifier>
      <premis:eventIdentifierType>EPrintsArchivematica Event ID</premis:eventIdentifierType>
      <premis:eventIdentifierValue>EA-1</premis:eventIdentifierValue>
    </premis:eventIdentifier>
    <premis:eventType>transfer</premis:eventType>
    <premis:eventDateTime>2022-07-04T22:46:07.773391+00:00</premis:eventDateTime>
    <premis:eventDetailInformation>
      <premis:eventDetail>Processed Transfer</premis:eventDetail>
    </premis:eventDetailInformation>
    <premis:eventOutcomeInformation>
      <premis:eventOutcome/>
      <premis:eventOutcomeDetail>
        <premis:eventOutcomeDetailNote/>
      </premis:eventOutcomeDetail>
    </premis:eventOutcomeInformation>
    <premis:linkingAgentIdentifier>
      <premis:linkingAgentIdentifierType>local</premis:linkingAgentIdentifierType>
      <premis:linkingAgentIdentifierValue>EPrintsArchivematica</premis:linkingAgentIdentifierValue>
    </premis:linkingAgentIdentifier>
  </premis:event>


  <premis:event>
    <premis:eventIdentifier>
      <premis:eventIdentifierType>EPrintsArchivematica Event ID</premis:eventIdentifierType>
      <premis:eventIdentifierValue>EA-2</premis:eventIdentifierValue>
    </premis:eventIdentifier>
    <premis:eventType>fixity check</premis:eventType>
    <premis:eventDateTime>2019-07-04T22:46:07.773391+00:00</premis:eventDateTime>
    <premis:eventDetailInformation>
      <premis:eventDetail>checking that EPrints checksum matches</premis:eventDetail>
    </premis:eventDetailInformation>
    <premis:eventOutcomeInformation>
      <premis:eventOutcome>Pass</premis:eventOutcome>
      <premis:eventOutcomeDetail>
        <premis:eventOutcomeDetailNote>Missing checksum in EPrints, generated new checksum and stored in EPrints
        </premis:eventOutcomeDetailNote>
      </premis:eventOutcomeDetail>
    </premis:eventOutcomeInformation>
    <premis:linkingAgentIdentifier>
      <premis:linkingAgentIdentifierType>local</premis:linkingAgentIdentifierType>
      <premis:linkingAgentIdentifierValue>EPrintsArchivematica</premis:linkingAgentIdentifierValue>
    </premis:linkingAgentIdentifier>
  </premis:event>


  <premis:event>
    <premis:eventIdentifier>
      <premis:eventIdentifierType>EPrintsArchivematica Event ID</premis:eventIdentifierType>
      <premis:eventIdentifierValue>EA-3</premis:eventIdentifierValue>
    </premis:eventIdentifier>
    <premis:eventType>transfer</premis:eventType>
    <premis:eventDateTime>2022-07-04T23:46:07.773391+00:00</premis:eventDateTime>
    <premis:eventDetailInformation>
      <premis:eventDetail>Processed Transfer</premis:eventDetail>
    </premis:eventDetailInformation>
    <premis:eventOutcomeInformation>
      <premis:eventOutcome/>
      <premis:eventOutcomeDetail>
        <premis:eventOutcomeDetailNote/>
      </premis:eventOutcomeDetail>
    </premis:eventOutcomeInformation>
    <premis:linkingAgentIdentifier>
      <premis:linkingAgentIdentifierType>local</premis:linkingAgentIdentifierType>
      <premis:linkingAgentIdentifierValue>EPrintsArchivematica</premis:linkingAgentIdentifierValue>
    </premis:linkingAgentIdentifier>
  </premis:event>


  <premis:event>
    <premis:eventIdentifier>
      <premis:eventIdentifierType>EPrintsArchivematica Event ID</premis:eventIdentifierType>
      <premis:eventIdentifierValue>EA-4</premis:eventIdentifierValue>
    </premis:eventIdentifier>
    <premis:eventType>fixity check</premis:eventType>
    <premis:eventDateTime>2019-07-04T23:48:07.773391+00:00</premis:eventDateTime>
    <premis:eventDetailInformation>
      <premis:eventDetail>checking that EPrints checksum matches</premis:eventDetail>
    </premis:eventDetailInformation>
    <premis:eventOutcomeInformation>
      <premis:eventOutcome>Pass</premis:eventOutcome>
      <premis:eventOutcomeDetail>
        <premis:eventOutcomeDetailNote>Missing checksum in EPrints, generated new checksum and stored in EPrints
        </premis:eventOutcomeDetailNote>
      </premis:eventOutcomeDetail>
    </premis:eventOutcomeInformation>
    <premis:linkingAgentIdentifier>
      <premis:linkingAgentIdentifierType>local</premis:linkingAgentIdentifierType>
      <premis:linkingAgentIdentifierValue>EPrintsArchivematica</premis:linkingAgentIdentifierValue>
    </premis:linkingAgentIdentifier>
  </premis:event>


  <premis:agent>
    <premis:agentIdentifier>
      <premis:agentIdentifierType>local</premis:agentIdentifierType>
      <premis:agentIdentifierValue>EPrintsArchivematica</premis:agentIdentifierValue>
    </premis:agentIdentifier>
    <premis:agentName>EPrintsArchivematica Plugin (Version 1.2)</premis:agentName>
    <premis:agentType>software</premis:agentType>
  </premis:agent>


</premis:premis>
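
To give an idea of how such a file could be produced programmatically, here is a minimal Python sketch that builds one `premis:event` element with the same structure as the events in the example above, using the standard library's `xml.etree.ElementTree`. It is an illustration only, not the plugin's implementation: the helper names `_el` and `build_event` are hypothetical, and the sketch omits the `premis:object` and `premis:agent` elements and the `xsi:schemaLocation` attribute.

```python
import xml.etree.ElementTree as ET

PREMIS_NS = "http://www.loc.gov/premis/v3"
# Serialize elements in this namespace with the "premis:" prefix.
ET.register_namespace("premis", PREMIS_NS)


def _el(parent, tag, text=None):
    """Append a premis:<tag> child to parent and return it (hypothetical helper)."""
    child = ET.SubElement(parent, f"{{{PREMIS_NS}}}{tag}")
    if text is not None:
        child.text = text
    return child


def build_event(root, event_id, event_type, when, detail, outcome="", note=""):
    """Append one event, mirroring the element layout of the example above."""
    event = _el(root, "event")
    ident = _el(event, "eventIdentifier")
    _el(ident, "eventIdentifierType", "EPrintsArchivematica Event ID")
    _el(ident, "eventIdentifierValue", event_id)
    _el(event, "eventType", event_type)
    _el(event, "eventDateTime", when)
    info = _el(event, "eventDetailInformation")
    _el(info, "eventDetail", detail)
    out = _el(event, "eventOutcomeInformation")
    _el(out, "eventOutcome", outcome)
    out_detail = _el(out, "eventOutcomeDetail")
    _el(out_detail, "eventOutcomeDetailNote", note)
    agent = _el(event, "linkingAgentIdentifier")
    _el(agent, "linkingAgentIdentifierType", "local")
    _el(agent, "linkingAgentIdentifierValue", "EPrintsArchivematica")
    return event


root = ET.Element(f"{{{PREMIS_NS}}}premis", {"version": "3.0"})
build_event(root, "EA-1", "transfer",
            "2022-07-04T22:46:07.773391+00:00", "Processed Transfer")
xml_text = ET.tostring(root, encoding="unicode")
```

Each processed file would get one call per event (transfer, fixity check), with the fixity outcome and any warning, such as a missing checksum, passed in as `outcome` and `note`.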

@photomedia photomedia pinned this issue Jan 20, 2022