-
Notifications
You must be signed in to change notification settings - Fork 950
Open
Description
Context - Why was this issue created?
Integration tests for WPSEO_Utils::sanitize_url method (tests/WP/Inc/Utils_Test.php) are failing on the following case:
'with_non_encoded_non_latin_url' => [
'expected' => 'https://example.com/%da%af%d8%b1%d9%88%d9%87-%d8%aa%d9%84%da%af%d8%b1%d8%a7%d9%85-%d8%b3%d8%a6%d9%88',
'url_to_sanitize' => 'https://example.com/گروه-تلگرام-سئو',
]
The issue seems to be with wp_parse_url() call, which returns a corrupted string for the URL's path (گر��-ت�گرا�-سئ�_ instead of روه-تلگرام-سئو).
Considering this test case has been written in March 2020, it might be that something has changed in wp_parse_url() implementation.
What is the goal of this issue?
- Restore the original
WPSEO_Utils::sanitize_urlbehaviour for URLs that have non-Unicode characters in their path.
What needs to be done to achieve the goal?
- Investigate
wp_parse_url()current behaviour - Change
WPSEO_Utils::wp_parse_url()accordingly
Does the issue still need UX or research?
No
If available, what are the tips for fixing the problem or possible solutions?
- If
wp_parse_url()needs its input to be encoded, changeWPSEO_Utils::sanitize_urlbehaviour accordingly (I would say by limiting the change only in the specific case covered by the test (i.e., when non-encoded non-latin characters are present in the URL's path)
What is the expected result/behavior?
WPSEO_Utils::sanitize_url()should return a correctly encoded URL, as expected in the integration test.
Should documentation be added or updated for this change? And if so, where?
No
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels