uproot.update #530
Comments
As discussed in #381, updating existing ROOT files is unlikely to ever be implemented. When this is ported to Uproot 4, there will be a clearer placeholder in As it is, you should be able to write new ROOT files with Uproot 3, but not change them in place.

The short story is that it's much easier to maintain the consistency of the internal structures within a ROOT file when writing them fresh. An "update" feature would have to be able to accept any valid ROOT file and change it into another valid ROOT file, but we don't know the full inclusive set of what counts as "valid." Knowing an exclusive subset of what counts as "valid" is all you need to do "recreate," and that's why we have "recreate" but not "update."
#460 was fixed: you just need to update. Last week, this Uproot changed its name on PyPI: pip install uproot3 and use import uproot3 as uproot in your scripts. Hopefully today, but maybe tomorrow (at this rate), the PyPI package named
Thanks. I noticed there is a suggestion in #381 to implement uproot.update for files that were made by uproot. In our case, we are using uproot to produce the initial ROOT files, so being able to update such files would be very useful. As you suggested, I can use "recreate" to write the contents of the existing ROOT files, along with the updates, to a new file. The main problem is that we were planning to use uproot to create and update many ROOT files with many entries, so recreating them is not the most efficient approach.
If you're accumulating entries in batches, the best thing might be to create little files and concatenate them afterward with "hadd" (regardless of whether the files come from ROOT or Uproot). There are fast-clone, basket-combining, and basket-sorting options, which trade speed of "hadd" for speed of access later. If you're going to read the resultant files many times, you probably want to at least combine baskets, and maybe sort them in a way that benefits your reading pattern.

If you're reopening the files to add just one or two entries (not a "batch"), then it's not efficient in any sense. There's a lot of overhead in opening a file and rearranging the objects in it, which would make the "open, write one entry, close, reopen" pattern horribly slow (ROOT or Uproot). If the latter is your access pattern, you might want to consider a different file format, even if only for the intermediate files that you need to write in small bits.
The only thing lacking from all three of these suggestions (CSV, HDF5, NPY) is support for "jagged" arrays. They only accept flat tables, but I think that's the kind of data you have.
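As a rough illustration of the intermediate-file idea, here is a minimal sketch that appends small batches of flat rows to a CSV file using only the Python standard library. The column names in `FIELDS` and the helper names `append_batch` and `read_all` are hypothetical, chosen for this example; nothing here is part of Uproot. The accumulated table could then be written to a ROOT file once, in a single "recreate" pass, rather than reopening a ROOT file for each small addition.

```python
import csv
import os

# Hypothetical flat-table schema; replace with your own columns.
FIELDS = ["series", "event", "zip0"]


def append_batch(path, rows):
    """Append a batch of dict rows to a CSV file, writing the header only once."""
    # The header is needed only when the file does not exist yet or is empty.
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerows(rows)


def read_all(path):
    """Read the whole accumulated table back as a list of dicts (values are strings)."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))
```

Appending to a text file is cheap because nothing already written has to be rearranged. Note that CSV stores everything as text, so the values need to be converted back to numeric or boolean types before the final write.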
Hi,
I followed the uproot documentation to update a ROOT file by adding more trees, but I'm getting a "compression" error. Here is my script:

    import random
    import numpy as np
    import uproot

    with uproot.update("Test_files/" + "%s.root" % 'test') as f:
        # making 2 trees
        for i in range(2):
            seriesnumber = random.randint(10**10, 10**11 - 1)
            dic = {}
            # making 5 branches
            for j in range(5):
                dic['zip' + str(j)] = "bool"
            events = (np.random.rand(10) * 10**5).astype(int)

The error:

    TypeError: _openfile() missing 1 required positional argument: 'compression'

When I include a compression method, I get the error below:

    __init__() got an unexpected keyword argument 'compression'

Also, I noticed that uproot may raise a "NotImplementedError":

    ~/anaconda3/lib/python3.7/site-packages/uproot/write/TFile.py in __init__(self, path)
         27 class TFileUpdate(object):
         28     def __init__(self, path):
    ---> 29         self._openfile(path)
         30         raise NotImplementedError

So I'm a bit confused. Is this feature implemented, and if so, why is my script not working?
Thanks,
Ata @bloer @mdiamon @pibion (CDMS collaboration)