Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add HTML parser/replacer #2

Open
1 of 3 tasks
Gummibeer opened this issue Mar 4, 2021 · 3 comments · May be fixed by #11
Open
1 of 3 tasks

Add HTML parser/replacer #2

Gummibeer opened this issue Mar 4, 2021 · 3 comments · May be fixed by #11

Comments

@Gummibeer
Copy link
Member

Gummibeer commented Mar 4, 2021

The original JavaScript Twemoji client allows replacing all emojis in a given text with the corresponding Twemoji image tag in one go.
We should provide a similar method like Twemoji::parse($html) which will search for emojis and replace them.

As this requires a DOM/HTML parser to don't replace emojis in image alternate attributes for example this should be an opt-in feature. SO the DOM library shouldn't be part of the default dependencies but the suggestions.

The method should also be flagged as experimental so everyone knows that it's possible that this method makes trouble.

To test this we should use snapshot testing on some blog posts for example. (could be faked)
They should cover emojis in plain text .txt, in the displayed content of HTML (outside of tags, in tag(s), with body or without) and as part of HTML attributes.

content parser and emoji replacer for:

@mallardduck
Copy link

Had twemoji concepts on my mind recently because of how poorly I noticed Slack handles their implementation. Specifically, compared to twitters on-site implementation (meaning the results on twitter.com) it's really bad on slack. For instance on twitter you can freely copy tweet text and ensure the emojis are preserved.

Granted where you paste them to provides varying results based on that program/app. However any app that can take "plain text" and supports unicode/emoji will gladly take the paste and keep the emoji in place. (again with small edge-cases about device compatibility and such.)

All of that in mind, I was thinking it'd be cool to make sure this package is able to "do the right thing". The fix for this is kinda simple TBH. When rendering the twemoji image tag, set an alt text to the unicode for the emoji.

I think that the feature I'm talking about here and the idea of parsing a block of content have some important overlap. At the very least in the sense that for blog posts use case I'd want the generated HTML to include the proper alt text for accessibility. Long story short, LMK if you'd be open to me taking a pass at solving this issue and working in this accessibility feature too.

@Gummibeer
Copy link
Member Author

Could be that I'm dumb but so far I see and know my code it does exactly what you want!?

protected function replace(string $replacement, ?Closure $alt = null): string
{
$text = $this->text;
$text = preg_replace_callback(
$this->regexp(),
fn (array $matches): string => str_replace(
['%{alt}', '%{src}'],
[
$alt
? $alt($matches[0])
: $matches[0],
Twemoji::emoji($matches[0])
->base($this->base)
->type($this->type)
->url(),
],
$replacement
),
$text
);
return $text;
}

https://github.com/Astrotomic/php-twemoji/blob/960bc12c1e156a21a869efdb9045ed42f54a2c6c/tests/__snapshots__/ReplacerTest__it_can_replace_emojis_in_plain_text_to_html__1.txt

So the original Emoji 🚀 is the alt of the image!? 🤔

@Gummibeer
Copy link
Member Author

Regarding your offer: for sure you can start with the open part of that issue, parsing HTML.

@Gummibeer Gummibeer changed the title Add content parser/batch replacer Add HTML parser/replacer Apr 27, 2022
@mallardduck mallardduck linked a pull request Oct 2, 2022 that will close this issue
@Gummibeer Gummibeer linked a pull request Oct 4, 2022 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

2 participants