-
Notifications
You must be signed in to change notification settings - Fork 83
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support bagging to a destination other than the source #92
base: master
Are you sure you want to change the base?
Support bagging to a destination other than the source #92
Conversation
…iple sources remain broken, awaiting clarification.
Looks good to me. @acdha is on vacation, but I am sure once he gets back he will take a look at it. |
Ping. |
@@ -132,7 +133,9 @@ def find_locale_dir(): | |||
UNICODE_BYTE_ORDER_MARK = u'\uFEFF' | |||
|
|||
|
|||
def make_bag(bag_dir, bag_info=None, processes=1, checksums=None, checksum=None, encoding='utf-8'): | |||
def make_bag(bag_dir, | |||
bag_info=None, processes=1, checksums=None, checksum=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer not to have formatting changes in this PR or to follow PEP-8 if we do. In this case, the existing code has been using the aligned indent style.
@@ -148,28 +151,57 @@ def make_bag(bag_dir, bag_info=None, processes=1, checksums=None, checksum=None, | |||
checksums = DEFAULT_CHECKSUMS | |||
|
|||
bag_dir = os.path.abspath(bag_dir) | |||
LOGGER.info(_("Creating bag for directory %s"), bag_dir) | |||
if dest_dir: | |||
LOGGER.info(_("Creating bag for directory %s"), dest_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer a single logging call rather than two near-duplicate lines. In this case, I think it might be easier to follow if we set dest_dir
using bag_dir
when called with dest_dir=None
so most of the code could assume that dest_dir
is what you use when assembling paths
# FIXME: we should do the permissions checks before changing directories | ||
old_dir = os.path.abspath(os.path.curdir) | ||
if dest_dir is not None: | ||
dest_dir = os.path.abspath(dest_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to normalize dest_dir
earlier in the function
old_dir = os.path.abspath(os.path.curdir) | ||
if dest_dir is not None: | ||
dest_dir = os.path.abspath(dest_dir) | ||
if os.path.exists(dest_dir): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This block of code is somewhat complicated. As a minor point, OSError
or EnvironmentError
is probably more appropriate than RuntimeError
but I'm leaning towards simplifying the entire block to:
if not os.path.isdir(dest_dir):
os.makedirs(dest_dir)
That will raise OSError
if the path already exists but has a component which is not a directory.
data_subdir = os.path.relpath( | ||
bag_dir, os.path.join(bag_dir, os.pardir) | ||
) | ||
if os.path.abspath(os.path.join(bag_dir, data_subdir)) == bag_dir: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could be os.path.samefile(os.path.join(bag_dir, data_subdir), bag_dir)
if not os.path.exists(data_dir): | ||
os.mkdir(data_dir) | ||
elif os.path.exists(os.path.join(data_dir, data_subdir)): | ||
shutil.rmtree(os.path.join(data_dir, data_subdir)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems risky at first read – shouldn't we just throw an error if the data directory already exists?
os.chmod('data', os.stat(cwd).st_mode) | ||
os.chmod(data_dir, os.stat(bag_dir).st_mode) | ||
|
||
if dest_dir: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering whether this should be one call with src_dir
either being bag_dir
or None
for clarity
|
||
LOGGER.info(_("Creating bagit.txt")) | ||
txt = """BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n""" | ||
with open_text_file('bagit.txt', 'w') as bagit_file: | ||
bagit_file.write(txt) | ||
if dest_dir: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This entire section is where I was thinking it might be cleaner to always have dest_dir
defined since I'm not happy about so many if statements in a row
@acdha Not sure if this branch is worth keeping around. I'm going to be out of pocket for at least a few months and don't have time to tighten it up before. |
See #35.