feat: add Network URL non-ascii -> punycode warning #12813

Open
wants to merge 10 commits into
base: main

digiwand
Contributor

@digiwand digiwand commented Dec 20, 2024

Description

Similar to the extension, we want to warn the user if the URL has non-ASCII characters. Inline alerts are not supported on mobile, so we display the new alert as a banner.

Notes:

  • existing Network URL logic removes the path with hideKeyFromUrl
  • punycode is deprecated; however, the URL library does not seem to convert the URL into its Punycode-encoded form in React Native. This is tricky because the URL is converted into its Punycode form in Node.js, and therefore in Jest tests

Related issues

Fixes: https://github.com/MetaMask/MetaMask-planning/issues/2365
Related to: MetaMask/metamask-extension#29490 (fixes isValidASCIIURL to include path check in extension)

Manual testing steps

Test switching to a custom network

Screenshots/Recordings

Before

After

Pre-merge author checklist

Pre-merge reviewer checklist

  • I've manually tested the PR (e.g. pull and build branch, run the app, test code being changed).
  • I confirm that this PR addresses all acceptance criteria described in the ticket it closes and includes the necessary testing evidence such as recordings and or screenshots.

@digiwand digiwand requested a review from a team as a code owner December 20, 2024 16:15
Contributor

CLA Signature Action: All authors have signed the CLA. You may need to manually re-run the blocking PR check if it doesn't pass in a few minutes.

@metamaskbot metamaskbot added the team-confirmations Push issues to confirmations team label Dec 20, 2024
@digiwand digiwand added the Run Smoke E2E Triggers smoke e2e on Bitrise label Jan 6, 2025
Contributor

github-actions bot commented Jan 6, 2025

https://bitrise.io/ Bitrise

✅✅✅ pr_smoke_e2e_pipeline passed on Bitrise! ✅✅✅

Commit hash: b51117b
Build link: https://app.bitrise.io/app/be69d4368ee7e86d/pipelines/60aea724-ad25-4a52-a37e-4a1f67c759c3

Note

  • You can kick off another pr_smoke_e2e_pipeline on Bitrise by removing and re-applying the Run Smoke E2E label on the pull request

console.error(exp);
return false;
}
};
Contributor

@digiwand: why does the check for valid ASCII include only the host part of the URL?

Contributor Author

hey @jpuri, good question. I copied this method verbatim from the metamask-extension code. I think you created the one in the metamask-extension.

will double-check this method with consideration of @NicholasEllul's comment above https://github.com/MetaMask/metamask-mobile/pull/12813/files#r1905686031

Contributor Author

please see #12813 (comment)

@@ -24,6 +26,35 @@ export function isBridgeUrl(url: string) {
}
}

export const isValidASCIIURL = (urlString?: string) => {
try {
return urlString?.includes(punycode.toASCII(new URL(urlString).host));
Contributor

This check can be bypassed when a URL string such as https://iոfura.io/gnosis?x=xn--ifura-dig.io is provided, because the Punycode form of the host also appears in the query string. Perhaps we can add this as a test case.

We will need to ensure we are comparing only to the host itself.
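
The bypass can be illustrated with a small sketch. The helper names are hypothetical, the Punycode host is copied from the example URL in this comment rather than computed, and the host extraction is a simplified regex rather than a full URL parser. It contrasts a substring check with a comparison against the host alone.

```typescript
// Values taken from the example in the review comment above.
const spoofedUrl = 'https://iոfura.io/gnosis?x=xn--ifura-dig.io';
const asciiHost = 'xn--ifura-dig.io'; // assumed Punycode form of the host, as given in the comment

// Equivalent to urlString.includes(asciiHost): passes if the ASCII host
// appears anywhere in the string, including the query.
function naiveIncludesCheck(urlString: string, host: string): boolean {
  return urlString.indexOf(host) !== -1;
}

// Hypothetical helper: extract the raw host (everything between "://"
// and the first "/", "?" or "#") without normalising it.
function rawHost(urlString: string): string {
  const match = /^[a-z][a-z0-9+.-]*:\/\/([^\/?#]*)/i.exec(urlString);
  return match ? match[1] : '';
}

// Stricter check: the raw host itself must equal the ASCII host.
function hostOnlyCheck(urlString: string, host: string): boolean {
  return rawHost(urlString) === host;
}

console.log(naiveIncludesCheck(spoofedUrl, asciiHost)); // true: the spoofed URL slips through
console.log(hostOnlyCheck(spoofedUrl, asciiHost)); // false: the host itself is not ASCII
```

Comparing against the extracted host means extra occurrences of the Punycode string elsewhere in the URL no longer affect the result.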

Contributor Author

hey @NicholasEllul and @jpuri, thanks for flagging this! This helper method was copied over from the extension. I updated the extension code and the code in this PR to include the path in its check. can I get another look at this please?

related extension PR: MetaMask/metamask-extension#29490

@digiwand digiwand added Run Smoke E2E Triggers smoke e2e on Bitrise and removed Run Smoke E2E Triggers smoke e2e on Bitrise labels Jan 7, 2025
Contributor

github-actions bot commented Jan 7, 2025

https://bitrise.io/ Bitrise

✅✅✅ pr_smoke_e2e_pipeline passed on Bitrise! ✅✅✅

Commit hash: 3e2aac9
Build link: https://app.bitrise.io/app/be69d4368ee7e86d/pipelines/c850e466-34be-40f9-a878-f3e9ae77b565

Note

  • You can kick off another pr_smoke_e2e_pipeline on Bitrise by removing and re-applying the Run Smoke E2E label on the pull request

jpuri
jpuri previously approved these changes Jan 8, 2025
Contributor

@jpuri jpuri left a comment

Nice work

@digiwand digiwand requested a review from NicholasEllul January 8, 2025 07:59
const urlPunycodeString = punycode.toASCII(new URL(urlString).href);
return urlPunycodeString?.includes(urlString);
} catch (exp: unknown) {
console.error(exp);
Member

Minor, does this warrant an error log if it's handled?

And if we still want a log, should we add some context for clarity such as:

console.log('URL contains non-ASCII characters', urlString, urlPunycodeString)

Contributor Author

@matthewwalsh0

unsure if the punycode library could throw an error. Just in case, keeping the console.error and updating phrasing

updated console error to
console.error(`Failed to detect if URL contains non-ASCII characters: ${urlString}`);

const pathname =
url.pathname === '/' && !urlString.endsWith('/') ? '' : url.pathname;

return `${protocol}//${punycodeHostname}${port}${pathname}${search}${hash}`;
Member

Why do we need to build this manually and only convert the hostname to ASCII?

Could we pass the entire href in one go as we do in isValidASCIIURL?

Contributor Author

oh I originally proposed href then took it out to handle the pathname === '/' logic. Taking a second look at this we can keep the href and reword the logic. Thanks for the callout! updated 4c48863

Contributor

@NicholasEllul NicholasEllul left a comment

I noticed the following note in the punycode README: https://github.com/mathiasbynens/punycode.js?tab=readme-ov-file#installation

⚠️ Note that userland modules don't hide core modules. For example, require('punycode') still imports the deprecated core module even if you executed npm install punycode. Use require('punycode/') to import userland modules rather than core modules.

We should double-check that we are using the userland punycode package rather than the punycode module built into Node.js, which is soft-deprecated. I notice other files like this one import punycode as follows:

import punycode from 'punycode/punycode';

Comment on lines 33 to 34
const urlPunycodeString = punycode.toASCII(new URL(urlString).href);
return urlPunycodeString?.includes(urlString);
Contributor

From a security POV I believe this satisfies things! However, on the UX side of things there is another edge case.

The toASCII function is intended to process only domain names or email addresses, so passing it a full href may produce unintended results.

E.g if the url string in this case is https://opensea.io/language=français, the toASCII function will incorrectly turn it into https://example.xn--com/lang=franais-opb.

Potential solution:

In Node.js, if you call new URL(urlString) and then read url.hostname or url.href, the hostname in the returned value should already be Punycode-encoded.

With this in mind could we do something like

hasPunycodeHostname = urlString !== (new URL(urlString).href);
...

I'm not very familiar with mobile's runtime, so this may or may not be possible in this environment, but if it is, it could help with that decoding.
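
A minimal sketch of this idea, assuming a Node.js-like runtime whose URL parser returns the hostname in Punycode form. Comparing the full href would also flag harmless normalisation (for example, a trailing slash added after a bare host), so the sketch compares only the hostname. The helper names are hypothetical and the raw-host regex is simplified.

```typescript
// Hypothetical helper: pull the hostname out of the raw string without
// normalising it. Simplified: skips optional userinfo and port, and does
// not handle IPv6 literals.
function rawHostname(urlString: string): string {
  const match = /^[a-z][a-z0-9+.-]*:\/\/(?:[^\/?#@]*@)?([^\/?#:]*)/i.exec(urlString);
  return match ? match[1].toLowerCase() : '';
}

// Assumption: the runtime's URL parser Punycode-encodes hostnames
// (true for Node.js; reportedly not for React Native).
function parserRewritesHostname(urlString: string): boolean {
  return new URL(urlString).hostname !== rawHostname(urlString);
}

// Non-ASCII hostname: the parser returns an encoded form that differs from the input.
console.log(parserRewritesHostname('https://iոfura.io/gnosis'));
// Non-ASCII characters only in the path: the hostname is unchanged.
console.log(parserRewritesHostname('https://opensea.io/language=français'));
```

In a runtime that leaves hostnames untouched, both calls would report no difference, which is why this approach depends on the environment.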

Contributor

Ah I just noticed your note in the PR description:

punycode is deprecated; however, the URL library does not seem to convert the URL into its Punycode-encoded form in React Native. This is tricky because the URL is converted into its Punycode form in Node.js, and therefore in Jest tests

So looks like this may not be possible

Contributor Author

@NicholasEllul

yeah, the react-native runtime URL library didn't seem to support the punycode encoding


E.g if the url string in this case is https://opensea.io/language=français, the toASCII function will incorrectly turn it into https://example.xn--com/lang=franais-opb.

This is the reason why I was told we are displaying the given input rather than the punycode version. We show the punycode difference in the warning. If only we could mimic the behavior of the browser search bar. An attempt at this would be out of scope.

Contributor

@digiwand to avoid false flagging URLs that contain non-ascii characters in the path/query params, what if we switch to doing something like this:

  import punycode from 'punycode/punycode';

  const isValidASCIIURL = (urlString?: string) => {
    if (!urlString) {
      return false;
    }

    try {
      // Compare only the hostname so that non-ASCII characters in the
      // path or query params are not flagged.
      const { hostname: originalHostname } = new URL(urlString);
      const punycodeHostname = punycode.toASCII(originalHostname);
      return originalHostname === punycodeHostname;
    } catch (exp: unknown) {
      console.error(`Failed to detect if URL contains non-ASCII characters: ${urlString}. Error: ${exp}`);
      return false;
    }
  };

This would ensure that we only fail on cases where the hostname itself contains a non-ASCII character. This works because React Native's URL implementation does not Punycode-encode hostnames when they are accessed through the URL object.

However, we would need to add a test case that would fail if React Native ever changes this behaviour in the future.
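
One way such a guard might look: probe the URL parser with a known non-ASCII hostname and fail if the hostname comes back Punycode-encoded. In this sketch the parser is injected so that both behaviours can be exercised with stubs. All names are hypothetical, and the stub return values are assumptions based on the example earlier in this thread. A real test would need to run against React Native's URL implementation, since the PR description notes that Jest (Node.js) does encode hostnames.

```typescript
// Hypothetical type for "something that returns the hostname of a URL string".
type HostnameParser = (urlString: string) => string;

const PROBE_URL = 'https://iոfura.io';

// Reports whether the given parser returned a Punycode-encoded hostname
// (encoded labels start with the "xn--" prefix).
function parserEncodesHostname(parseHostname: HostnameParser): boolean {
  return parseHostname(PROBE_URL).indexOf('xn--') === 0;
}

// Throws if the assumption behind the hostname comparison no longer holds.
function assertHostnameLeftUnencoded(parseHostname: HostnameParser): void {
  if (parserEncodesHostname(parseHostname)) {
    throw new Error(
      'URL parser now Punycode-encodes hostnames; the hostname comparison in isValidASCIIURL needs revisiting.',
    );
  }
}

// Stub mimicking the React Native behaviour described in this PR.
const unencodedParser: HostnameParser = () => 'iոfura.io';
// Stub mimicking the Node.js behaviour described in this PR.
const encodingParser: HostnameParser = () => 'xn--ifura-dig.io';

assertHostnameLeftUnencoded(unencodedParser); // passes silently
console.log(parserEncodesHostname(encodingParser)); // true
```

In the app itself the injected parser would presumably be something like `(u) => new URL(u).hostname`, evaluated in the React Native runtime.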

@digiwand digiwand removed the Run Smoke E2E Triggers smoke e2e on Bitrise label Jan 8, 2025
@digiwand digiwand added the Run Smoke E2E Triggers smoke e2e on Bitrise label Jan 9, 2025
Contributor

github-actions bot commented Jan 9, 2025

https://bitrise.io/ Bitrise

❌❌❌ pr_smoke_e2e_pipeline failed on Bitrise! ❌❌❌

Commit hash: f891d73
Build link: https://app.bitrise.io/app/be69d4368ee7e86d/pipelines/fdbd03b6-5fa2-4d78-a480-9b565270bb4d

Note

  • You can kick off another pr_smoke_e2e_pipeline on Bitrise by removing and re-applying the Run Smoke E2E label on the pull request

Tip

  • Check the documentation if you have any doubts on how to understand the failure on bitrise

Labels
Run Smoke E2E Triggers smoke e2e on Bitrise team-confirmations Push issues to confirmations team
5 participants