Error with Recognize - fcntl64: symbol not found #1157

Closed
RikkiBC opened this issue Jul 10, 2024 · 8 comments
Labels
bug Something isn't working

Comments

RikkiBC commented Jul 10, 2024

Which version of recognize are you using?

7.0.3

Enabled Modes

Face recognition

TensorFlow mode

WASM mode

Downstream App

Memories App

Which Nextcloud version do you have installed?

29.0.3

Which Operating system do you have installed?

Ubuntu 22.04, running official NextcloudAIO in Docker

Which database are you running Nextcloud on?

PostgreSQL

Which Docker container are you using to run Nextcloud? (if applicable)

No response

How much RAM does your server have?

10GB

What processor Architecture does your CPU have?

ARM64

Describe the Bug

A month ago, it looks like Recognize stopped working. I only just realised it as it's not often that I check on my recognised faces. In the admin section I saw these messages:

image

Please Note: It didn't originally say "Face recognition: 0 Queued files", it originally had something like 680~ queued files. It did still say 0 queued jobs though.

It also originally said that the models needed to be downloaded. Running "sudo docker exec --user www-data -it nextcloud-aio-nextcloud php occ recognize:download-models" fixed that problem, but no further jobs were ever scheduled, and the error message "An error occurred during face recognition, please check the Nextcloud logs" was still there too.

Please Also Note: There was another error message there too, originally: "There are queued files in the face recognition queue but no background job is scheduled to process them."

The following was then done in order.

1, attempted manual run of clustering

I tried to cluster faces with "sudo docker exec --user www-data -it nextcloud-aio-nextcloud php occ recognize:cluster-faces", but of course that's not the issue - it just went through each user and let me know that there were no face detections found for them.

2, checking Node

I checked Node in a few ways.

  • Running "sudo docker exec --user www-data -it nextcloud-aio-nextcloud node --version" returned "v20.13.1".
  • I checked the Recognize admin settings - Node.js is defined as located at "/var/www/html/custom_apps/recognize/bin/node". I have not touched this since AIO was first set up, as, originally at least, it was fine.
  • I checked the file exists by running "sudo docker exec --user www-data -it nextcloud-aio-nextcloud ls /var/www/html/custom_apps/recognize/bin/node".
    • It returned: "/var/www/html/custom_apps/recognize/bin/node".
  • Here's where things start looking a bit off, as far as I can tell? I tried to test the node version directly on that file, by running "sudo docker exec --user www-data -it nextcloud-aio-nextcloud /var/www/html/custom_apps/recognize/bin/node --version".
    • It returned "Error relocating /var/www/html/custom_apps/recognize/bin/node: fcntl64: symbol not found"
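
The comparison in this step (the bundled binary versus the one on the container's PATH) can be repeated with a small helper. This is only a sketch: the function name probe_node is made up for illustration, and it assumes the container name and paths quoted above. It drops "-it" because the output is captured rather than shown on a terminal.

```shell
#!/bin/sh
# probe_node CMD...: hypothetical helper, not part of Recognize or AIO.
# Appends --version to the given command and reports whether the binary starts.
# Prints "OK <version>" on success, "BROKEN <first line of output>" otherwise.
probe_node() {
    if out=$("$@" --version 2>&1); then
        printf 'OK %s\n' "$out"
    else
        printf 'BROKEN %s\n' "$(printf '%s\n' "$out" | head -n 1)"
        return 1
    fi
}

# Example calls (commented out so that sourcing this file does not touch Docker):
# probe_node sudo docker exec --user www-data nextcloud-aio-nextcloud node
# probe_node sudo docker exec --user www-data nextcloud-aio-nextcloud \
#     /var/www/html/custom_apps/recognize/bin/node
```

With the results reported in this step, the first call would be expected to report OK and the second BROKEN, which points at the bundled binary rather than at Node.js in general.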

3, manually running recrawl

Perhaps stupidly, I ran "sudo docker exec --user www-data -it nextcloud-aio-nextcloud php occ recognize:recrawl" because, as far as I could tell, it was the only way to get Recognize to schedule a job again. It did... and also set me up for re-crawling aaall my old photos too. Sigh.

Side Note: Is there any downside to running recognize:recrawl? Apart from taking a long time as I have tens of thousands of images... will it get rid of the work I've done in manually classifying faces into people?

After running recrawl, the "There are queued files in the face recognition queue but no background job is scheduled to process them." error message disappeared. Likewise, the number of jobs awaiting classification dropped to zero, before (after several minutes) climbing well into the thousands with 5 scheduled jobs.

4, manually stopping background jobs

I stopped the background jobs for now with "sudo docker exec --user www-data -it nextcloud-aio-nextcloud php occ recognize:clear-background-jobs". It's at this point that we get to the screenshot I shared above.

5, checking Nextcloud Error Logs

I couldn't see anything in the logs before this point. My logs only went back a week and it looks like the jobs stopped a while ago. But, thanks to Step 3 above, I now had some fresh logs to check.

image

It's the error "Classifier process output: Error relocating /var/www/html/custom_apps/recognize/bin/node: fcntl64: symbol not found" that really catches my eye. fcntl64: symbol not found is the same error I got when trying to check the node version at the end of Step 2 above.

It's at this point I'm stuck. I hope I've provided enough info but I really don't know what the issue is or where to go from here. Searching "fcntl64: symbol not found" online isn't providing much assistance - my Google Fu might be weak right now, but all I'm finding are esoteric pages about stuff which looks unrelated.

Please help.

Expected Behavior

Recognize continues working when it hasn't been touched (except maybe by AIO?).

To Reproduce

Not sure what caused this issue in the first place, sorry. I hope I added enough information in the Describe The Bug section, otherwise please ask for more details if needed.

Debug log

No response

@RikkiBC RikkiBC added the bug Something isn't working label Jul 10, 2024

Hello 👋

Thank you for taking the time to open this issue with recognize. I know it's frustrating when software
causes problems. You have made the right choice to come here and open an issue to make sure your problem gets looked at
and if possible solved.
I try to answer all issues and if possible fix all bugs here, but it sometimes takes a while until I get to it.
Until then, please be patient.
Note also that GitHub is a place where people meet to make software better together. Nobody here is under any obligation
to help you, solve your problems or deliver on any expectations or demands you may have, but if enough people come together we can
collaborate to make this software better. For everyone.
Thus, if you can, you could also look at other issues to see whether you can help other people with your knowledge
and experience. If you have coding experience it would also be awesome if you could step up to dive into the code and
try to fix the odd bug yourself. Everyone will be thankful for extra helping hands!
One last word: If you feel, at any point, like you need to vent, this is not the place for it; you can go to the forum,
to twitter or somewhere else. But this is a technical issue tracker, so please make sure to
focus on the tech and keep your opinions to yourself. (Also see our Code of Conduct. Really.)

I look forward to working with you on this issue
Cheers 💙

@RikkiBC RikkiBC changed the title from "Recognize can no longer ruin jobs in Nextcloud AIO" to "Recognize can no longer run jobs in Nextcloud AIO" Jul 10, 2024
@RikkiBC RikkiBC changed the title from "Recognize can no longer run jobs in Nextcloud AIO" to "Error with Recognize - fcntl64: symbol not found" Jul 11, 2024
@marcelklehr
Member

marcelklehr commented Jul 25, 2024

Hello @RikkiBC

Thank you for the feedback and for bearing with us -- we have quite some workload at the moment on multiple fronts. This issue seems to be related to the node binary. It seems that the binary that we install no longer works. This may be because you are on ARM64. The combination of Alpine Linux (thanks to AIO) and ARM64 is somewhat problematic. If you can, you can try to open a shell inside the nextcloud container and install a proper Alpine Linux Node.js package. In the recognize settings, you can then set the path to that binary.

Edit: I see in your post above that node.js is already installed, then you only need to set the path to the working node.js binary in the recognize settings and things should work again.
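
For anyone who wants to confirm the Alpine angle: the "Error relocating ...: symbol not found" wording is the form musl's dynamic loader (the one Alpine uses) prints for an unresolved symbol, and fcntl64 is a symbol exported by glibc. So a likely reading, though not confirmed in this thread, is that the bundled binary was built against glibc. A rough sketch for checking which loader a system has; the function name detect_libc is made up for illustration:

```shell
#!/bin/sh
# detect_libc [ROOT]: hypothetical helper. Prints "musl" if a musl dynamic
# loader (ld-musl-<arch>.so.1) exists under ROOT/lib, otherwise "other".
# ROOT defaults to the real filesystem root; it is a parameter only so the
# function can be pointed at a test directory.
detect_libc() {
    root=${1:-}
    for loader in "$root"/lib/ld-musl-*.so.1; do
        # If nothing matches, the pattern is left as a literal and -e is false.
        if [ -e "$loader" ]; then
            echo musl
            return 0
        fi
    done
    echo other
}
```

The function has to run where the files are, so for the container the same check can be done by listing /lib/ld-musl-*.so.1 through docker exec. A result of "musl" would be consistent with the explanation above, but does not by itself prove how the bundled binary was built.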

@szaimen
Contributor

szaimen commented Aug 1, 2024

Yes, AIO's Nextcloud container includes Node.js by default.

@RikkiBC
Author

RikkiBC commented Aug 1, 2024

Thanks for the response @marcelklehr and @szaimen , and apologies for my own late response, I've been unexpectedly busy.

Is there a simple way to find the location of the node.js included by AIO so I can link to that instead?

@szaimen
Contributor

szaimen commented Aug 2, 2024

Try "sudo docker exec nextcloud-aio-nextcloud which node".

@github-project-automation github-project-automation bot moved this to Backlog in Recognize Aug 28, 2024
@github-project-automation github-project-automation bot moved this from Backlog to Done in Recognize Sep 2, 2024
@RikkiBC
Author

RikkiBC commented Sep 7, 2024

Hi @marcelklehr and @szaimen

Apologies yet again for my tardiness. Once more, lots going on personally outside of this for me.

I understand this has already been closed, but for you and anyone else who comes across this issue I thought I'd update here.

I ran "sudo docker exec nextcloud-aio-nextcloud which node", which returned "/usr/bin/node". Plugging this into the admin settings cleared up the errors under the "Node.js" settings:

image

So it looks like it's ready to go again, which is great, thank you!

Last question on this: How do I get background jobs to be scheduled to start again? I can't find anything around a command that just re-schedules the jobs, rather than starting crawling/classifying/etc from scratch.

@marcelklehr
Member

> I can't find anything around a command that just re-schedules the jobs, rather than starting crawling/classifying/etc from scratch.

This is currently not possible. You'll need to start from scratch sadly.
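
From the command line, "starting from scratch" presumably means the recognize:recrawl command used earlier in this thread, optionally preceded by recognize:clear-background-jobs. A sketch of a wrapper for those calls, assuming the container name and user from the commands above; occ_recognize, CONTAINER and DRY_RUN are names made up for illustration:

```shell
#!/bin/sh
# Container name as used throughout this thread; override via the environment.
CONTAINER=${CONTAINER:-nextcloud-aio-nextcloud}

# occ_recognize SUBCOMMAND: hypothetical wrapper around
# "php occ recognize:SUBCOMMAND" inside the container.
# DRY_RUN defaults to 1, in which case the command is only printed.
occ_recognize() {
    set -- sudo docker exec --user www-data "$CONTAINER" php occ "recognize:$1"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Example sequence (prints the commands; set DRY_RUN=0 to actually run them):
# occ_recognize clear-background-jobs
# occ_recognize recrawl
```

The dry-run default is deliberate: as noted in step 3 above, recrawl queues every file again, so it is worth seeing the exact command before committing to it.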

@RikkiBC
Author

RikkiBC commented Sep 9, 2024

> > I can't find anything around a command that just re-schedules the jobs, rather than starting crawling/classifying/etc from scratch.
>
> This is currently not possible. You'll need to start from scratch sadly.

It is what it is 😄 I've hit the "Rescan All Files" button in the admin settings and it seems to be going without any more errors. There are some tens of thousands of photos to get through, and my server is in WASM mode... so I'll just check once or twice a day until it's done hah.

Thanks again.
