Add HTML parser/replacer #2

Gummibeer · 2021-03-04T16:20:08Z

The original JavaScript Twemoji client can replace all emojis in a given text with the corresponding Twemoji image tags in one go.
We should provide a similar method, such as Twemoji::parse($html), that searches for emojis and replaces them.

This requires a DOM/HTML parser so that emojis in image alt attributes, for example, are not replaced, which is why it should be an opt-in feature. So the DOM library shouldn't be part of the default dependencies but of the suggestions.
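To make the attribute concern concrete, here is a minimal sketch that only touches text nodes and therefore leaves attribute values alone. It is not the package's implementation: `replaceEmojisInHtml()`, the `$srcFor` callable and the `twemoji-root` wrapper id are hypothetical names, the emoji pattern is a deliberately simplified stand-in that only matches a few single-code-point ranges, and it assumes PHP 8 with the bundled `dom` extension.

```php
<?php

declare(strict_types=1);

// Simplified stand-in pattern (assumption): a few single-code-point emoji
// ranges only, no sequences, skin tones or flags.
const EMOJI_PATTERN = '/([\x{1F300}-\x{1FAFF}\x{2600}-\x{27BF}])/u';

// Hypothetical helper: $srcFor maps an emoji string to an image URL.
function replaceEmojisInHtml(string $html, callable $srcFor): string
{
    $previous = libxml_use_internal_errors(true); // tolerate sloppy markup
    $dom = new DOMDocument();
    // The XML encoding hint makes libxml read the input as UTF-8; the wrapper
    // div gives the fragment a single, known root.
    $dom->loadHTML('<?xml encoding="UTF-8"><div id="twemoji-root">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $root = $xpath->query('//div[@id="twemoji-root"]')->item(0);

    // Only text nodes are visited, so attribute values are never rewritten.
    $textNodes = iterator_to_array(
        $xpath->query('.//text()[not(ancestor::script or ancestor::style)]', $root)
    );

    foreach ($textNodes as $textNode) {
        $parts = preg_split(EMOJI_PATTERN, $textNode->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if ($parts === false || count($parts) < 2) {
            continue; // no emoji in this text node
        }

        $fragment = $dom->createDocumentFragment();
        foreach ($parts as $index => $part) {
            if ($index % 2 === 1) {
                // Odd indexes hold the captured emoji.
                $img = $dom->createElement('img');
                $img->setAttribute('alt', $part);
                $img->setAttribute('src', $srcFor($part));
                $fragment->appendChild($img);
            } elseif ($part !== '') {
                $fragment->appendChild($dom->createTextNode($part));
            }
        }
        $textNode->parentNode->replaceChild($fragment, $textNode);
    }

    // Serialize only the children of the wrapper, not the wrapper itself.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }

    return $out;
}

// The emoji in the title attribute stays as it is; only the one in the
// paragraph text becomes an image.
echo replaceEmojisInHtml(
    '<p title="🚀 launch">Go 🚀</p>',
    fn (string $emoji): string => 'https://example.test/' . bin2hex($emoji) . '.svg'
), PHP_EOL;
```

A real implementation would use the package's full emoji regular expression and URL builder instead of the stand-ins, and could swap `DOMDocument` for whichever DOM library ends up in the suggestions.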

The method should also be flagged as experimental so everyone knows that it may cause trouble.

To test this we should use snapshot testing, for example on some blog posts (these could be faked).
The snapshots should cover emojis in plain text (.txt), in the displayed content of HTML (outside of tags, inside one or more tags, with or without a body) and in HTML attributes.
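The snapshot idea can be sketched without any test framework. `matchesSnapshot()` below is a hypothetical helper, not part of this package or of any snapshot library: the first run records the output, later runs compare against the recorded file. The fixture strings are made-up stand-ins for the cases listed above.

```php
<?php

declare(strict_types=1);

// Hypothetical helper: record the output on the first run, compare afterwards.
function matchesSnapshot(string $name, string $actual, string $dir): bool
{
    $file = $dir . DIRECTORY_SEPARATOR . $name . '.txt';

    if (!is_file($file)) {
        file_put_contents($file, $actual); // first run: create the snapshot
        return true;
    }

    return file_get_contents($file) === $actual;
}

// Faked fixtures for the cases named above (assumed examples).
$fixtures = [
    'plain_text'     => 'Launch day 🚀',
    'html_in_tag'    => '<p>Launch day 🚀</p>',
    'html_attribute' => '<p title="🚀">Launch day</p>',
];

$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'twemoji-snapshots-' . uniqid();
mkdir($dir);

foreach ($fixtures as $name => $input) {
    // A real test would pass $input through the replacer first; the identity
    // is used here only to keep the sketch self-contained.
    $output = $input;
    echo $name, ': ', matchesSnapshot($name, $output, $dir) ? 'ok' : 'changed', PHP_EOL;
}
```

In the actual test suite a PHPUnit snapshot assertion would take the place of this helper; the point here is only the record-then-compare flow and the fixture coverage.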

Content parser and emoji replacer for:

- Text .txt (Add a batch replacer for plain text #3)
- HTML .html
- Markdown .md


mallardduck · 2022-04-27T16:54:09Z

Twemoji concepts have been on my mind recently because I noticed how poorly Slack handles its implementation. Compared to Twitter's on-site implementation (meaning the results on twitter.com), Slack's is really bad. For instance, on Twitter you can freely copy tweet text and be sure the emojis are preserved.

Granted, the result varies depending on the program or app you paste into. However, any app that accepts plain text and supports Unicode/emoji will gladly take the paste and keep the emoji in place (again, with small edge cases around device compatibility and such).

With all of that in mind, I was thinking it'd be cool to make sure this package is able to "do the right thing". The fix is kinda simple TBH: when rendering the Twemoji image tag, set the alt text to the Unicode character(s) of the emoji.

I think the feature I'm talking about here and the idea of parsing a block of content have some important overlap. At the very least, for the blog post use case I'd want the generated HTML to include the proper alt text for accessibility. Long story short, LMK if you'd be open to me taking a pass at solving this issue and working in this accessibility feature too.

Gummibeer · 2022-04-27T17:04:37Z

Could be that I'm dumb, but as far as I can see and know my code, it already does exactly what you want!?

php-twemoji/src/EmojiText.php

Lines 42 to 65 in 960bc12

    protected function replace(string $replacement, ?Closure $alt = null): string
    {
        $text = $this->text;

        $text = preg_replace_callback(
            $this->regexp(),
            fn (array $matches): string => str_replace(
                ['%{alt}', '%{src}'],
                [
                    $alt
                        ? $alt($matches[0])
                        : $matches[0],
                    Twemoji::emoji($matches[0])
                        ->base($this->base)
                        ->type($this->type)
                        ->url(),
                ],
                $replacement
            ),
            $text
        );

        return $text;
    }

https://github.com/Astrotomic/php-twemoji/blob/960bc12c1e156a21a869efdb9045ed42f54a2c6c/tests/__snapshots__/ReplacerTest__it_can_replace_emojis_in_plain_text_to_html__1.txt

So the original Emoji 🚀 is the alt of the image!? 🤔
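As a stand-alone illustration of the placeholder step in the method quoted above, the sketch below fills `%{alt}` and `%{src}` in a template. `fillTemplate()`, the template string and the example URL are hypothetical; only the two placeholder names come from the quoted code.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-alone helper mirroring the str_replace() call above.
function fillTemplate(string $template, string $emoji, string $src): string
{
    return str_replace(['%{alt}', '%{src}'], [$emoji, $src], $template);
}

$html = fillTemplate(
    '<img alt="%{alt}" src="%{src}" />', // assumed template
    '🚀',
    'https://example.test/1f680.svg'     // made-up URL
);

echo $html, PHP_EOL;
// <img alt="🚀" src="https://example.test/1f680.svg" />
```

Because the original character ends up in the alt attribute, a browser that copies image alt text along with the surrounding selection keeps the emoji when the text is pasted elsewhere.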

Gummibeer · 2022-04-27T17:07:01Z

Regarding your offer: sure, you can start with the open part of this issue, parsing HTML.

Gummibeer added enhancement help wanted labels Mar 4, 2021

Gummibeer mentioned this issue Mar 4, 2021

Add a batch replacer for plain text #3

Closed

Gummibeer changed the title ~~Add content parser/batch replacer~~ Add HTML parser/replacer Apr 27, 2022

mallardduck linked a pull request Oct 2, 2022 that will close this issue

Add HTML parsing features #11

Open

Gummibeer linked a pull request Oct 4, 2022 that will close this issue

Add HTML parsing features #11

Open
