This repository has been archived by the owner on Nov 19, 2021. It is now read-only.

Bulk Exporter not exporting sub directories under the node ref specified #15

Open
choepeter opened this issue Sep 30, 2017 · 4 comments

Comments

@choepeter

I am trying to run the bulk exporter on an environment that has a large set of sub folders and documents.

It only seems to export the content files directly under the node; it doesn't recurse into the subfolders. No error is generated during the export, and it reports a successful export.

This is the url I used:
http://localhost:8080/alfresco/s/extensions/bulkexport/export?base=/opt/alfresco/export&nodeRef=workspace://SpacesStore/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx&ignoreExported=true&useNodeCache=true

I tried this on another environment and it worked. Is there a limit on the number of subfolders the bulk exporter can traverse?

Any assistance is greatly appreciated.

Peter Choe

@malcolmcif
Collaborator

Hi Peter,
We can assume your URL parameters are OK, and it appears you have specified the correct nodeRef (given that the content in the node is exported), so let's look at reasons why it may not work:

  1. Is there any log information in alfresco.log? Compare the working and non-working environments to get a feel for what it should look like, in particular the line "Nodes to export = ".
  2. There is a limitation in the code: it will not export more nodes than a Java List can hold (an int, I think). You are not hitting this problem, though; a shame it exists in the code, oops.
  3. Does it take a while before it stops short, or does it finish quickly?
    3.1 If it takes a long time, then maybe the Java virtual machine is running out of memory.
    3.2 If it is quick, then it is not identifying the folders, unsure why. What number does the log show for "Nodes to export = "?
  4. Maybe turn the cache off (i.e. useNodeCache=false) and see if it changes the behavior.
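
To compare the "Nodes to export = " value between the working and non-working environments, a small scan of alfresco.log can help. The following is only a sketch: it assumes the count appears as digits directly after the quoted prefix on the same line, which the thread does not confirm, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the bulk exporter.
final class NodesToExportScan {
    // Assumption: the count follows the prefix directly, e.g. "Nodes to export = 3511".
    private static final Pattern COUNT = Pattern.compile("Nodes to export = (\\d+)");

    // Returns every count found, in the order the lines appear.
    static List<Long> extractCounts(List<String> logLines) {
        List<Long> counts = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = COUNT.matcher(line);
            if (m.find()) {
                counts.add(Long.parseLong(m.group(1)));
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // First argument: path to the alfresco.log file to inspect.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        for (long count : extractCounts(lines)) {
            System.out.println("Nodes to export = " + count);
        }
    }
}
```

Running it once against each environment's log makes it easy to see whether the count drops only in the failing one.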

we will see if any of the above helps.

good luck.

@choepeter
Author

Thank you for getting back to me so quickly.

  1. The nodes to export show an initial number of 3511, but I can't tell if those are only files or include the sub folders.
  2. ???
  3. The export finishes, but it takes much less time than it should given the number of documents in the repository.
  4. I've run the query without useNodeCache explicitly set, which is supposed to default to false.

I also ran the export directly against the node reference of a subfolder, and that run does seem to export all of its subfolders as well (over 1 million objects).

@malcolmcif
Collaborator

You are hitting a bug somewhere in the code; at a guess, it has something to do with the amount you are trying to export.

Are you sure your JVM has enough memory?

  1. The number 3511 should represent every node (including folders) it intends to export. Given your last comment about 1 million items in a subfolder, 3511 is clearly incorrect. I presume that when you export the subfolder, "Nodes to export" shows a sensible number.

  2. The implication is that you will not be able to export more than 2147483647 items.

  3. Understood.

  4. Correct, the default is false. When you set it to true, do you see the .cache file generated in the root node?
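
For reference, the 2147483647 figure in point 2 is Java's `Integer.MAX_VALUE`: `List.size()` returns an `int`, so a single list cannot report more elements than that. The sketch below only illustrates the limit; it is not taken from the exporter's code, and the class name and the example nodeRef string are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not part of the bulk exporter.
final class ListSizeLimit {
    // List.size() is declared to return int, so this is the largest size a List can report.
    static int maxReportableSize() {
        return Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        List<String> nodes = new ArrayList<>();
        nodes.add("workspace://SpacesStore/example"); // placeholder value
        int size = nodes.size();
        System.out.println(size + " of at most " + maxReportableSize());
    }
}
```

At 3511 or even a few million nodes this ceiling is nowhere near being reached, which is why it is unlikely to explain the missing subfolders here.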

Do you see the message "Bulk Export finished" when doing either export?

@choepeter
Author

Yes. I do see the message Bulk Export finished when the export is run.

The Alfresco service is allocated 9 GB. I haven't seen any error messages related to memory issues in the log.
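
One way to double-check the log for memory problems is to count lines that mention `java.lang.OutOfMemoryError`, the error class the JVM raises when it runs out of memory. This is a rough sketch with hypothetical names; it assumes such errors would be written to the same log file, which may not hold if they go to a different file or to standard error.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper, not part of the bulk exporter.
final class OomLogCheck {
    // Counts lines that mention the JVM's out-of-memory error class.
    static long countOutOfMemoryLines(List<String> logLines) {
        return logLines.stream()
                .filter(line -> line.contains("java.lang.OutOfMemoryError"))
                .count();
    }

    public static void main(String[] args) throws IOException {
        // First argument: path to the log file to inspect.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println("OutOfMemoryError lines: " + countOutOfMemoryLines(lines));
    }
}
```

A count of zero is consistent with the observation above, although it does not rule out memory pressure that never surfaced as an error.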
