Incorrect splitting for separating hashtags #31

Open
neumannm opened this issue Aug 28, 2017 · 0 comments

When a tweet embeds another tweet, a picture, or a similar attachment, the tweet text and the attachment link are not separated. See this example:
https://twitter.com/ACM_CHIIR/status/837247495864479744
Hooray - the proceedings arrived today! #chiir2017pic.twitter.com/lLZMfTpR0F

This is unsatisfying in itself, and it also breaks hashtag detection.

If, as in the example, the last token of a tweet before the attachment is a hashtag, tweet.getHashtags() returns #chiir2017pic as a hashtag, where it should simply return #chiir2017.
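
One possible workaround, sketched here under assumptions: insert a space in front of any `pic.twitter.com/` link that is glued to the preceding token, then extract hashtags from the separated text. The class `AttachmentSplitter` and its methods are hypothetical names used for illustration only; they are not part of this project's API, and the simple `#\w+` pattern only approximates whatever `tweet.getHashtags()` does internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the project: illustrates the idea only.
final class AttachmentSplitter {

    // Zero-width position directly before "pic.twitter.com/" when the
    // preceding character is neither whitespace nor '/', i.e. the link is
    // glued to the tweet text (and is not part of "https://pic.twitter.com/").
    private static final Pattern GLUED_PIC_LINK =
            Pattern.compile("(?<=[^\\s/])(?=pic\\.twitter\\.com/)");

    // Simplified hashtag pattern (assumption): '#' followed by word characters.
    private static final Pattern HASHTAG = Pattern.compile("#\\w+");

    // Inserts a space between the tweet text and a glued attachment link.
    static String separateAttachment(String text) {
        return GLUED_PIC_LINK.matcher(text).replaceAll(" ");
    }

    // Collects all hashtags found in the given text, in order of appearance.
    static List<String> extractHashtags(String text) {
        List<String> tags = new ArrayList<>();
        Matcher m = HASHTAG.matcher(text);
        while (m.find()) {
            tags.add(m.group());
        }
        return tags;
    }
}
```

With the tweet text from the example, calling `extractHashtags` on the raw string reproduces the reported `#chiir2017pic`, while calling it on the output of `separateAttachment` gives `#chiir2017`. A fix inside the scraper that keeps the attachment link in its own field would be cleaner; this only patches the text after the fact.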
