Feature/synced collections #484

Merged
merged 55 commits into from
Feb 20, 2021
Commits
7605b6b
Improve Sync Data Structures (#336)
vishav1771 Jul 31, 2020
1906db8
Add alternative backends for synced_collection (#364)
vishav1771 Aug 20, 2020
ee015af
Added validation layer to SyncedCollection (#378)
vishav1771 Sep 9, 2020
feca409
Added buffering to SyncedCollection (#363)
vishav1771 Dec 22, 2020
6d7388c
Feature/synced collection/reorg (#437)
vyasr Dec 22, 2020
f8b3cd2
Feature/synced collection/reorg tests (#438)
vyasr Dec 22, 2020
d1abf8f
Fix incomplete merge
vyasr Dec 22, 2020
e7118a6
Remove lingering old file.
vyasr Dec 22, 2020
df54278
Feature/synced collection/cleanup (#445)
vyasr Dec 25, 2020
71d5941
Merge master, apply all relevant formatting tools, and add documentat…
vyasr Dec 26, 2020
d11d05d
Feature/synced collection/cleanup2 (#447)
vyasr Dec 27, 2020
4e08121
Feature/synced collection/optimization (#453)
vyasr Jan 3, 2021
0b2bbcf
Remove unnecessary backend str from tests.
vyasr Jan 3, 2021
260583f
Feature/synced collection/test mongodb redis (#464)
vyasr Jan 5, 2021
ac93f37
Make SyncedCollections thread-safe (#463)
vyasr Jan 6, 2021
af991e7
Implements an in-memory buffer for SyncedCollections (#462)
vyasr Jan 6, 2021
25c68c1
Clean up miscellaneous outstanding to-do items (#466)
vyasr Jan 6, 2021
d5a9f57
Make buffering thread safe (#468)
vyasr Jan 10, 2021
8c72b08
Feature/synced collection/unify buffering (#469)
vyasr Jan 10, 2021
81aaef3
Feature/synced collection/contexts (#470)
vyasr Jan 12, 2021
63bc9fd
Merge remote-tracking branch 'origin/master' into feature/synced_coll…
vyasr Jan 12, 2021
f4fb0c6
Install pymongo on pypy.
vyasr Jan 13, 2021
8fda448
Don't sync on construction.
vyasr Jan 18, 2021
b289b5f
Add comparison operators to SyncedList and make sure modifying the fi…
vyasr Jan 19, 2021
fab18cb
Remove unnecessary constructor validation, providing both data and re…
vyasr Jan 19, 2021
57c6ef9
Merge remote-tracking branch 'origin/master' into feature/synced_coll…
bdice Jan 20, 2021
0f876d5
Fix unused imports.
bdice Jan 20, 2021
f77e17c
Feature/synced collection/replace jsondict (#472)
vyasr Jan 25, 2021
472722c
Merge remote-tracking branch 'origin/master' into feature/synced_coll…
vyasr Jan 25, 2021
7a408ab
Fix synced collection support for 0d numpy arrays.
vyasr Jan 25, 2021
8608412
Add oldest supported version of pymongo and ensure that zarr/mongo co…
vyasr Jan 25, 2021
d89ff95
Deprecate json module (#480)
vyasr Jan 25, 2021
dd3c565
Feature/synced collection/reorg (#481)
vyasr Jan 26, 2021
5ccda3f
Feature/synced collection/simplify global buffering (#482)
vyasr Jan 26, 2021
d280549
Feature/synced collection/deprecate old (#483)
vyasr Jan 27, 2021
0f90092
Feature/synced collection/fix buffer reload (#486)
vyasr Jan 28, 2021
23915b5
Fix lots of documentation issues.
vyasr Jan 29, 2021
0744d1e
Address first round of PR comments.
vyasr Jan 29, 2021
b63be11
Update changelog.
vyasr Jan 29, 2021
505b0e9
Fix mypy error.
vyasr Jan 29, 2021
5613ddc
Merge branch 'master' into feature/synced_collections
vyasr Jan 29, 2021
73dd05f
Don't error check uid unless it's provided.
vyasr Jan 29, 2021
c070ad5
First pass to address PR comments.
vyasr Feb 12, 2021
9bb8887
Feature/synced collection/remove attr access (#504)
vyasr Feb 17, 2021
d487007
Feature/synced collection/numpy (#503)
vyasr Feb 18, 2021
9f24fa7
Merge remote-tracking branch 'origin/master' into feature/synced_coll…
vyasr Feb 18, 2021
adddcb8
Remove add_validators and specify validators in class definition (#507)
vyasr Feb 19, 2021
e71942d
Don't use os.path.join where not needed. (#511)
vyasr Feb 19, 2021
48863c4
Disable recursive validation during recursive conversion of nested ty…
vyasr Feb 19, 2021
f0537c4
Feature/synced collection/optimize jsondict validation (#508)
vyasr Feb 19, 2021
28cfe46
Feature/synced collection/optimize protected key lookup (#510)
vyasr Feb 19, 2021
e5b8057
Defer statepoint instantiation, unify reset_statepoint logic (#497)
bdice Feb 20, 2021
9259316
Feature/synced collection/optimize (#513)
vyasr Feb 20, 2021
ca3397c
Optionally skip validation in SyncedCollection _update. (#512)
bdice Feb 20, 2021
9a73e9a
Merge remote-tracking branch 'origin/master' into feature/synced_coll…
vyasr Feb 20, 2021
1 change: 1 addition & 0 deletions .circleci/ci-oldest-reqs.txt
@@ -6,6 +6,7 @@ h5py==2.8.0
numpy==1.13.3
packaging==15.0
pandas==1.0.0
pymongo==3.0.0
pytest-cov==2.10.1
pytest==6.2.1
tables==3.3.0
15 changes: 14 additions & 1 deletion .circleci/config.yml
@@ -46,6 +46,8 @@ jobs:
linux-python-39: &linux-template
docker:
- image: circleci/python:3.9
- image: circleci/mongo:latest
- image: circleci/redis:latest

environment:
BENCHMARKS: "RUN"
@@ -138,16 +140,22 @@ jobs:
<<: *linux-template
docker:
- image: circleci/python:3.8
- image: circleci/mongo:latest
- image: circleci/redis:latest

linux-python-37:
<<: *linux-template
docker:
- image: circleci/python:3.7
- image: circleci/mongo:latest
- image: circleci/redis:latest

linux-python-36-oldest:
<<: *linux-template
docker:
- image: circleci/python:3.6
- image: circleci/mongo:latest
- image: circleci/redis:latest
environment:
BENCHMARKS: "SKIP"
DEPENDENCIES: "OLDEST"
@@ -180,7 +188,12 @@ jobs:
${PYTHON} -m pip install --progress-bar off -U pip>=20.3
${PYTHON} -m pip install --progress-bar off -U codecov
${PYTHON} -m pip install --progress-bar off -U -r requirements/requirements-test.txt
${PYTHON} -m pip install --progress-bar off -U -r requirements/requirements-test-optional.txt

# For some reason Zarr doesn't install correctly on Windows (runs
# into pip SSL errors), so we skip that test.
grep -v zarr requirements-test-optional.txt > requirements-test-optional-windows.txt
${PYTHON} -m pip install --progress-bar off -U -r requirements-test-optional-windows.txt

${PYTHON} -m pip install --progress-bar off -U -e .
- run:
name: Run tests
1 change: 1 addition & 0 deletions changelog.txt
@@ -13,6 +13,7 @@ next
Added
+++++

- New ``SyncedCollection`` class and subclasses to replace ``JSONDict`` with more general support for different types of resources (such as MongoDB collections or Redis databases) and more complete support for different data types synchronized with files (#196, #234, #249, #316, #383, #397, #465, #484). This change introduces a minor backwards-incompatible change; for users making direct use of signac buffering, the ``force_write`` parameter is no longer respected. If the argument is passed, a warning will now be raised to indicate that it is ignored and will be removed in signac 2.0.
- Unified querying for state point and document filters using 'sp' and 'doc' as prefixes (#332, #514). This change introduces a minor backwards-incompatible change to the ``Collection`` index schema ('statepoint'->'sp'), but this does not affect any APIs, only indexes saved to file using a previous version of signac. Indexing APIs will be removed in signac 2.0.
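
The first changelog entry describes collections whose contents stay synchronized with an underlying resource such as a file. The following is a minimal sketch of that general idea using only the standard library. The class name `ToySyncedDict` and its `_load`/`_save` helpers are hypothetical and are not taken from this PR; the real `SyncedCollection` classes add validation, buffering, thread safety, and multiple backends.

```python
import json
import os
import tempfile
from collections.abc import MutableMapping


class ToySyncedDict(MutableMapping):
    """Hypothetical dict that mirrors its contents to a JSON file."""

    def __init__(self, filename):
        self._filename = filename
        self._data = {}

    def _load(self):
        # Re-read the file so that changes made elsewhere become visible.
        try:
            with open(self._filename) as f:
                self._data = json.load(f)
        except FileNotFoundError:
            self._data = {}

    def _save(self):
        # Write to a temporary file and rename it into place, so a reader
        # never observes a partially written file.
        tmp = self._filename + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self._data, f)
        os.replace(tmp, self._filename)

    def __getitem__(self, key):
        self._load()
        return self._data[key]

    def __setitem__(self, key, value):
        self._load()
        self._data[key] = value
        self._save()

    def __delitem__(self, key):
        self._load()
        del self._data[key]
        self._save()

    def __iter__(self):
        self._load()
        return iter(list(self._data))

    def __len__(self):
        self._load()
        return len(self._data)


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "doc.json")
    doc = ToySyncedDict(path)
    doc["a"] = 1
    # A second object pointing at the same file sees and extends the data.
    other = ToySyncedDict(path)
    other["b"] = 2
    result = dict(doc)
```

Loading before every read and saving after every write is the simplest possible policy; it is also the cost that a buffering layer is meant to reduce.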


108 changes: 108 additions & 0 deletions doc/api.rst
@@ -206,3 +206,111 @@ signac.errors module
:members:
:undoc-members:
:show-inheritance:

synced\_collections package
===========================

Data Types
----------

synced\_collections.synced\_collection module
+++++++++++++++++++++++++++++++++++++++++++++

.. automodule:: signac.synced_collections.data_types.synced_collection
:members:
:private-members:
:show-inheritance:

synced\_collections.synced\_dict module
+++++++++++++++++++++++++++++++++++++++++++++

.. automodule:: signac.synced_collections.data_types.synced_dict
:members:
:show-inheritance:

synced\_collections.synced\_list module
+++++++++++++++++++++++++++++++++++++++++++++

.. automodule:: signac.synced_collections.data_types.synced_list
:members:
:show-inheritance:

Backends
--------

synced\_collections.backends.collection\_json module
+++++++++++++++++++++++++++++++++++++++++++

.. automodule:: signac.synced_collections.backends.collection_json
:members:
:show-inheritance:

synced\_collections.backends.collection\_mongodb module
++++++++++++++++++++++++++++++++++++++++++++++

.. automodule:: signac.synced_collections.backends.collection_mongodb
:members:
:show-inheritance:

synced\_collections.backends.collection\_redis module
++++++++++++++++++++++++++++++++++++++++++++

.. automodule:: signac.synced_collections.backends.collection_redis
:members:
:show-inheritance:

synced\_collections.backends.collection\_zarr module
+++++++++++++++++++++++++++++++++++++++++++

.. automodule:: signac.synced_collections.backends.collection_zarr
:members:
:show-inheritance:

Buffers
-------

synced\_collections.buffers.buffered\_collection module
+++++++++++++++++++++++++++++++++++++++++++++++

.. automodule:: signac.synced_collections.buffers.buffered_collection
:members:
:private-members:
:show-inheritance:

synced\_collections.buffers.file\_buffered\_collection module
+++++++++++++++++++++++++++++++++++++++++++++++++++++

.. automodule:: signac.synced_collections.buffers.file_buffered_collection
:members:
:show-inheritance:

synced\_collections.buffers.serialized\_file\_buffered\_collection module
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. automodule:: signac.synced_collections.buffers.serialized_file_buffered_collection
:members:
:show-inheritance:

synced\_collections.buffers.memory\_buffered\_collection module
+++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. automodule:: signac.synced_collections.buffers.memory_buffered_collection
:members:
:show-inheritance:

Miscellaneous Modules
---------------------

synced\_collections.utils module
++++++++++++++++++++++++++++++++

.. automodule:: signac.synced_collections.utils
:members:
:show-inheritance:

synced\_collections.validators module
+++++++++++++++++++++++++++++++++++++

.. automodule:: signac.synced_collections.validators
:members:
:show-inheritance:
7 changes: 5 additions & 2 deletions doc/conf.py
@@ -316,6 +316,9 @@ def __getattr__(cls, name):
intersphinx_mapping = {
"python": ("https://docs.python.org/3", None),
"pymongo": ("https://pymongo.readthedocs.io/en/stable/", None),
"pandas": ("https://pandas.pydata.org/docs/", None),
"h5py": ("https://docs.h5py.org/en/stable/", None),
"pandas": ("https://pandas.pydata.org/pandas-docs/stable/", None),
"h5py": ("http://docs.h5py.org/en/stable/", None),
"zarr": ("https://zarr.readthedocs.io/en/stable", None),
"redis": ("https://redis-py.readthedocs.io/en/stable/", None),
"numcodecs": ("https://numcodecs.readthedocs.io/en/stable/", None),
}
3 changes: 3 additions & 0 deletions requirements/requirements-test-optional.txt
@@ -2,4 +2,7 @@ h5py==3.1.0; implementation_name=='cpython'
numpy==1.20.0
pandas==1.2.1; implementation_name=='cpython'
pymongo==3.11.2; implementation_name=='cpython'
redis==3.5.3
ruamel.yaml==0.16.12
tables==3.6.1; implementation_name=='cpython'
zarr==2.4.0; platform_system!='Windows'
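
Several of the requirement lines above carry environment markers (`implementation_name=='cpython'`, `platform_system!='Windows'`), which tell pip to install the package only when the marker evaluates to true. As a rough illustration of how those two markers are resolved, the sketch below evaluates them by hand with the standard library. The `applies` helper is hypothetical and handles only `==` and `!=`; pip itself uses a full marker parser.

```python
import platform
import sys

# Values that the two marker variables used above resolve to on this machine.
marker_env = {
    "implementation_name": sys.implementation.name,
    "platform_system": platform.system(),
}


def applies(name, op, value):
    """Evaluate a single `name op 'value'` marker (hypothetical helper)."""
    current = marker_env[name]
    if op == "==":
        return current == value
    if op == "!=":
        return current != value
    raise ValueError(f"unsupported operator: {op}")


# h5py, pandas, pymongo, and tables are restricted to CPython.
install_cpython_only = applies("implementation_name", "==", "cpython")
# zarr is skipped on Windows.
install_zarr = applies("platform_system", "!=", "Windows")
```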
6 changes: 3 additions & 3 deletions setup.cfg
@@ -20,7 +20,7 @@ ignore = E123,E126,E203,E226,E241,E704,W503,W504
match = ^((?!\.sync-zenodo-metadata|setup|benchmark|mpipool|connection|crypt|host|filesystems|indexing).)*\.py$
match-dir = ^((?!\.|tests|configobj|db).)*$
ignore-decorators = "deprecated"
ignore = D105, D107, D203, D204, D213
add-ignore = D105, D107, D203, D204, D213

[mypy]
ignore_missing_imports = True
@@ -34,8 +34,8 @@ omit =
*/signac/common/configobj/*.py

[tool:pytest]
filterwarnings =
ignore: .*[The indexing module | get_statepoint] is deprecated.*: DeprecationWarning
filterwarnings =
ignore: .*[The indexing module | get_statepoint | Use of.+as key] is deprecated.*: DeprecationWarning

[bumpversion:file:setup.py]
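
A note on the `filterwarnings` pattern in the hunk above: in a Python regular expression, square brackets denote a character class, so `[The indexing module | get_statepoint | Use of.+as key]` matches any single character from that set rather than one of three phrases. The sketch below compares the pattern as written with a grouped alternation, which is only an assumption about what may have been intended. It uses `re.match` with case-insensitive matching as an approximation of how warning filters compare messages; the sample messages are made up.

```python
import re

# Pattern as it appears in setup.cfg (brackets form a character class).
BRACKET_PATTERN = (
    r".*[The indexing module | get_statepoint | Use of.+as key] is deprecated.*"
)
# Hypothetical variant using a group with alternation instead.
GROUP_PATTERN = (
    r".*(The indexing module|get_statepoint|Use of.+as key) is deprecated.*"
)


def matches(pattern, message):
    """Return True if the pattern matches at the start of the message."""
    return re.match(pattern, message, re.IGNORECASE) is not None


messages = [
    "The indexing module is deprecated.",
    "Something unrelated is deprecated.",
]

bracket_results = [matches(BRACKET_PATTERN, m) for m in messages]
group_results = [matches(GROUP_PATTERN, m) for m in messages]
```

With the bracketed form, both sample messages match, because each has a character from the class immediately before " is deprecated". With the grouped form, only the first one matches.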

15 changes: 11 additions & 4 deletions signac/__init__.py
@@ -27,15 +27,21 @@
from .contrib import filesystems as fs
from .contrib import get_job, get_project, index, index_files, init_project
from .core.h5store import H5Store, H5StoreManager
from .core.jsondict import JSONDict
from .core.jsondict import buffer_reads_writes as buffered
from .core.jsondict import flush_all as flush
from .core.jsondict import get_buffer_load, get_buffer_size
from .core.jsondict import in_buffered_mode as is_buffered
from .db import get_database
from .diff import diff_jobs
from .synced_collections.backends.collection_json import (
BufferedJSONAttrDict as JSONDict,
)
from .version import __version__

# Alias some properties related to buffering into the signac namespace.
buffered = JSONDict.buffer_backend
is_buffered = JSONDict.backend_is_buffered
get_buffer_load = JSONDict.get_current_buffer_size
get_buffer_size = JSONDict.get_buffer_capacity
set_buffer_size = JSONDict.set_buffer_capacity
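
The aliases above expose class-level buffering controls as module-level functions. The sketch below shows how that aliasing pattern behaves, using a toy class. Only the method names (`buffer_backend`, `backend_is_buffered`, `get_buffer_capacity`, `set_buffer_capacity`) are taken from the diff; `ToyBufferedDict` and its bodies are hypothetical and do not reflect the actual implementation.

```python
from contextlib import contextmanager


class ToyBufferedDict:
    """Hypothetical class; only the method names mirror the diff."""

    _buffer_depth = 0  # number of nested buffered contexts currently active
    _buffer_capacity = 1024  # arbitrary illustrative default

    @classmethod
    @contextmanager
    def buffer_backend(cls):
        # Entering the context turns buffering on for the whole class.
        cls._buffer_depth += 1
        try:
            yield
        finally:
            cls._buffer_depth -= 1

    @classmethod
    def backend_is_buffered(cls):
        return cls._buffer_depth > 0

    @classmethod
    def get_buffer_capacity(cls):
        return cls._buffer_capacity

    @classmethod
    def set_buffer_capacity(cls, capacity):
        cls._buffer_capacity = capacity


# Bound class methods can be assigned to plain names and called like functions.
buffered = ToyBufferedDict.buffer_backend
is_buffered = ToyBufferedDict.backend_is_buffered
get_buffer_size = ToyBufferedDict.get_buffer_capacity
set_buffer_size = ToyBufferedDict.set_buffer_capacity

before = is_buffered()
with buffered():
    during = is_buffered()
after = is_buffered()
set_buffer_size(2048)
```

Because the aliases are bound to the class, state changed through them is shared by every instance, which is what makes a single global buffered mode possible.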

__all__ = [
"__version__",
"contrib",
@@ -69,6 +75,7 @@
"flush",
"get_buffer_size",
"get_buffer_load",
"set_buffer_size",
"JSONDict",
"H5Store",
"H5StoreManager",
2 changes: 1 addition & 1 deletion signac/contrib/collection.py
@@ -16,6 +16,7 @@

import argparse
import io
import json
import logging
import operator
import re
@@ -24,7 +25,6 @@
from math import isclose
from numbers import Number

from ..core import json
from .filterparse import parse_filter_arg
from .utility import _nested_dicts_to_dotted_keys, _to_hashable

4 changes: 2 additions & 2 deletions signac/contrib/errors.py
@@ -28,8 +28,8 @@ class DestinationExistsError(Error, RuntimeError):

Parameters
----------
destination :
The destination object causing the error.
destination : str
The destination causing the error.

"""

3 changes: 1 addition & 2 deletions signac/contrib/filterparse.py
@@ -3,11 +3,10 @@
# This software is licensed under the BSD 3-Clause License.
"""Parse the filter arguments."""

import json
import sys
from collections.abc import Mapping

from ..core import json


def _print_err(msg=None):
"""Print the provided message to stderr.
4 changes: 3 additions & 1 deletion signac/contrib/hashing.py
@@ -6,6 +6,8 @@
import hashlib
import json

from ..synced_collections.utils import SyncedCollectionJSONEncoder

# We must use the standard library json for exact consistency in formatting


@@ -27,7 +29,7 @@ def calc_id(spec):
Encoded hash in hexadecimal format.

"""
blob = json.dumps(spec, sort_keys=True)
blob = json.dumps(spec, cls=SyncedCollectionJSONEncoder, sort_keys=True)
m = hashlib.md5()
m.update(blob.encode())
return m.hexdigest()
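
The change to `calc_id` passes a custom encoder class to `json.dumps` so that non-standard container types can be serialized before hashing. The sketch below illustrates why that keeps ids stable, under the assumption that the encoder converts wrapper objects to their plain equivalents. `Wrapper`, `ExampleEncoder`, and `example_calc_id` are hypothetical stand-ins; the behavior of `SyncedCollectionJSONEncoder` itself is not shown in this diff.

```python
import hashlib
import json


class Wrapper:
    """Hypothetical stand-in for a collection type that wraps plain data."""

    def __init__(self, data):
        self.data = data


class ExampleEncoder(json.JSONEncoder):
    """Hypothetical encoder that unwraps Wrapper objects."""

    def default(self, o):
        # json calls default() only for objects it cannot serialize itself.
        if isinstance(o, Wrapper):
            return o.data
        return super().default(o)


def example_calc_id(spec):
    # sort_keys=True makes the serialized form independent of key order.
    blob = json.dumps(spec, cls=ExampleEncoder, sort_keys=True)
    m = hashlib.md5()
    m.update(blob.encode())
    return m.hexdigest()


plain_id = example_calc_id({"a": {"x": 2, "y": 1}})
wrapped_id = example_calc_id({"a": Wrapper({"y": 1, "x": 2})})
```

Since the wrapped value serializes to the same sorted JSON text as the plain dictionary, both inputs hash to the same id.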
4 changes: 2 additions & 2 deletions signac/contrib/import_export.py
@@ -4,6 +4,7 @@
"""Provides features for importing and exporting data."""

import errno
import json
import logging
import os
import re
@@ -16,7 +17,6 @@
from tempfile import TemporaryDirectory
from zipfile import ZIP_DEFLATED, ZipFile

from ..core import json
from .errors import DestinationExistsError, StatepointParsingError
from .utility import _dotted_dict_to_nested_dicts, _mkdir_p

@@ -766,7 +766,7 @@ def _copy_to_job_workspace(src, job, copytree):
raise DestinationExistsError(job)
raise
else:
job._init()
job.init()
return dst

