[BUG]: Web scraper stage produces output which cannot be consumed by cudf. #1342

Closed
2 tasks done
cwharris opened this issue Nov 6, 2023 · 0 comments · Fixed by #1478

cwharris commented Nov 6, 2023

Version

fea-sherlock

Which installation method(s) does this occur on?

No response

Describe the bug.

The WebScraperStage produces output that cudf cannot consume, so downstream stages must either keep using pandas or drop the offending columns.

# Not using cudf to avoid error: pyarrow.lib.ArrowInvalid: cannot mix list and non-list, non-null values
return MessageMeta(pd.DataFrame(final_rows))

Instead, the WebScraperStage should produce output that can be consumed by cudf.
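
One possible direction, sketched here only as an illustration: the Arrow error above is raised when a single column holds lists in some rows and plain scalars in others, so the rows could be normalized before the DataFrame is built. The helper name `normalize_list_columns` and the sample columns (`link`, `tags`) are hypothetical; the issue does not say which columns are affected, and this is not necessarily the fix adopted in #1478.

```python
from typing import Any


def normalize_list_columns(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Make every column either all-list or all-scalar (None is left alone)."""
    # Columns in which at least one row holds a list.
    list_columns = {
        key
        for row in rows
        for key, value in row.items()
        if isinstance(value, list)
    }

    normalized = []
    for row in rows:
        new_row = dict(row)  # copy so the caller's rows are not mutated
        for key in list_columns:
            value = new_row.get(key)
            # Wrap stray scalars so the column has one consistent list type.
            if value is not None and not isinstance(value, list):
                new_row[key] = [value]
        normalized.append(new_row)
    return normalized


# Hypothetical rows; the real WebScraperStage columns may differ.
final_rows = [
    {"link": "https://example.com/a", "tags": ["x", "y"]},
    {"link": "https://example.com/b", "tags": "z"},
    {"link": "https://example.com/c", "tags": None},
]
clean_rows = normalize_list_columns(final_rows)
```

With rows shaped like `clean_rows`, each column has a single consistent type, which is the condition the error message complains about. Whether cudf then accepts the resulting frame would still need to be checked against the stage's actual output.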

Minimum reproducible example

No response

Relevant log output

[Paste the error here, it will be hidden by default]

Full env printout

[Paste the results of print_env.sh here, it will be hidden by default]

Other/Misc.

No response

Code of Conduct

  • I agree to follow Morpheus' Code of Conduct
  • I have searched the open bugs and have found no duplicates for this bug report
@cwharris cwharris added the bug Something isn't working label Nov 6, 2023
@mdemoret-nv mdemoret-nv added the sherlock Issues/PRs related to Sherlock workflows and components label Dec 12, 2023
@dagardner-nv dagardner-nv self-assigned this Jan 27, 2024
@dagardner-nv dagardner-nv moved this from Todo to In Progress in Morpheus Boards Jan 27, 2024
@jarmak-nv jarmak-nv moved this from In Progress to Review - Ready for Review in Morpheus Boards Jan 29, 2024
@rapids-bot rapids-bot bot closed this as completed in 77cc0e5 Feb 12, 2024
@github-project-automation github-project-automation bot moved this from Review - Ready for Review to Done in Morpheus Boards Feb 12, 2024