# Punycode

**TL;DR:** By swapping characters in a URL for similar-looking ones, attackers can exploit lesser-known URL features to trick users. Well... if they can do it, so can we! The same ideas make it easy to trick people into downloading malware and/or running arbitrary code.

Wikipedia:

> Punycode is a representation of Unicode with the limited [ASCII](https://en.wikipedia.org/wiki/ASCII) character subset used for Internet [hostnames](https://en.wikipedia.org/wiki/Hostname). Using Punycode, host names containing Unicode characters are transcoded to a subset of ASCII consisting of letters, digits, and hyphens, which is called the letter–digit–hyphen (LDH) subset.
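
As a rough illustration of the transcoding described in that quote, here is a minimal sketch using Python's built-in `idna` codec (which implements the older IDNA 2003 rules). The hostname `bücher.example` is a made-up example, not a real target:

```python
# Hypothetical hostname, used purely for illustration ("bücher.example").
unicode_host = "b\u00fccher.example"

# The built-in "idna" codec encodes each dot-separated label on its own;
# labels containing non-ASCII characters come out with an "xn--" prefix.
ascii_host = unicode_host.encode("idna")
print(ascii_host)  # b'xn--bcher-kva.example'

# Decoding reverses the transformation.
round_trip = ascii_host.decode("idna")
print(round_trip == unicode_host)  # True

# A pure-ASCII hostname passes through unchanged.
print("example.com".encode("idna"))  # b'example.com'
```

The same call works on any hostname containing Unicode characters, which makes it a quick way to see what a suspicious-looking domain actually resolves as.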

*"Cool! So... what exactly do we use this stuff for?"* I hear you ask. Well, it's not exactly *malware* malware material. Instead, let me explain it with an example. Take a moment to spot the difference between the following:

```
URL_1:     https://adidas.com/
URL_2:     https://αdidas.com/
```

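Easy to spot, right? (The first `a` is replaced with a Greek alpha, `α`.) Because DNS only carries ASCII, the second hostname is registered and resolved as a Punycode label starting with `xn--`, so it is a completely different domain from the first. Now try this: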

```
URL_1:     https://github.com/kubernetes/kubernetes/archive/refs/tags/v1.27.1.zip
URL_2:     https://github.com∕kubernetes∕kubernetes∕archive∕refs∕tags∕@v1.27.1.zip
```

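If you can't spot it, or if the characters look a bit messed up on your device, check [this Malwarebytes article](https://www.malwarebytes.com/blog/news/2023/05/zip-domains). The trick: the "slashes" after `github.com` in URL_2 are Unicode look-alikes rather than real `/` characters, so everything before the `@` is treated as user info and the browser actually goes to the host `v1.27.1.zip`. This relies on the relatively new `.zip` top-level domain, so if you want a very in-depth analysis, watch this: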

{% embed url="<https://youtu.be/LFriS1PICE0>" %}
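
To see why URL_2 in the GitHub example does not lead to GitHub, it can help to split the authority part of each URL by hand. The following is a simplified sketch, not a full URL parser: `split_authority` is a hypothetical helper, and the code assumes the look-alike slashes are U+2215 (DIVISION SLASH).

```python
from typing import Tuple

DIVISION_SLASH = "\u2215"  # looks like "/", but is NOT a path separator


def split_authority(url: str) -> Tuple[str, str]:
    """Return (userinfo, host) using a simplified reading of URI syntax."""
    _, _, rest = url.partition("://")
    # The authority ends at the first *real* "/", "?" or "#".
    end = len(rest)
    for sep in "/?#":
        idx = rest.find(sep)
        if idx != -1:
            end = min(end, idx)
    authority = rest[:end]
    # Everything before the last "@" is userinfo; the rest is host[:port].
    userinfo, _, hostport = authority.rpartition("@")
    host = hostport.split(":")[0]
    return userinfo, host


real = "https://github.com/kubernetes/kubernetes/archive/refs/tags/v1.27.1.zip"

# Rebuild URL_2 with explicit escapes so the look-alike character is visible.
fake_prefix = DIVISION_SLASH.join(
    ["github.com", "kubernetes", "kubernetes", "archive", "refs", "tags", ""]
)
fake = "https://" + fake_prefix + "@v1.27.1.zip"

print(split_authority(real))     # ('', 'github.com')
print(split_authority(fake)[1])  # v1.27.1.zip
```

Since no real `/` appears before the `@`, the whole `github.com∕kubernetes∕...` prefix ends up as user info, and the only thing that decides where the request goes is the host after the `@`.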

### References

* <https://en.wikipedia.org/wiki/Punycode>
* <https://www.jamf.com/blog/punycode-attacks/>
* <https://en.wikipedia.org/wiki/List_of_Unicode_characters>
* <https://www.malwarebytes.com/blog/news/2023/05/zip-domains>
