Base64, URL Encoding and Punycode — Guide
What Base64 is and when to use it, how URL encoding works and what Punycode is for international domain names.
What Is Base64 — and Why Does It Exist?
Base64 is an encoding scheme that converts binary data into a string of printable ASCII characters. The reason it exists is straightforward: many protocols and systems (email, HTTP headers, JSON) were designed for text and cannot transport raw bytes without corrupting them.
Base64 uses an alphabet of 64 characters: the letters A–Z and a–z (52), the digits 0–9 (10), and the symbols + and / (2). Every 3 input bytes become 4 output characters — a size increase of roughly 33%. If the byte count isn't divisible by 3, padding is added using the = character (one or two) so the output length is always a multiple of 4.
Input (3 bytes): M a n
Binary: 01001101 01100001 01101110
Base64 groups: 010011 010110 000101 101110
Output: T W F u → TWFu
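The regrouping of three 8-bit bytes into four 6-bit values can be sketched in a few lines of JavaScript. This is an illustration only: base64Encode and ALPHABET are hypothetical names, and real code would normally call a built-in encoder instead.

```javascript
// Base64 alphabet: index 0-63 maps to A-Z, a-z, 0-9, + and /
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper for illustration: encodes a Uint8Array as Base64.
function base64Encode(bytes) {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const hasSecond = i + 1 < bytes.length;
    const hasThird = i + 2 < bytes.length;

    // Pack up to 3 bytes into one 24-bit number (missing bytes count as 0).
    const triple =
      (bytes[i] << 16) |
      ((hasSecond ? bytes[i + 1] : 0) << 8) |
      (hasThird ? bytes[i + 2] : 0);

    // Slice the 24 bits into four 6-bit indexes; pad with "=" where
    // the input ran out of bytes.
    out += ALPHABET[(triple >> 18) & 63];
    out += ALPHABET[(triple >> 12) & 63];
    out += hasSecond ? ALPHABET[(triple >> 6) & 63] : "=";
    out += hasThird ? ALPHABET[triple & 63] : "=";
  }
  return out;
}

const utf8 = new TextEncoder();
console.log(base64Encode(utf8.encode("Man"))); // "TWFu"
console.log(base64Encode(utf8.encode("Ma")));  // "TWE="
console.log(base64Encode(utf8.encode("M")));   // "TQ=="
```

The last two calls show the padding rule: two input bytes give one = sign, one input byte gives two.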
Practical Uses of Base64
JWT Tokens
A JSON Web Token consists of three dot-separated parts: header.payload.signature. Each part is encoded with Base64URL, a URL-safe variant of Base64, and the header and payload are plain JSON underneath. This means anyone can decode them and read the contents:
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9
→ {"alg":"HS256","typ":"JWT"}
The signature (third part) guarantees integrity and nothing more: a signed JWT encodes data, it does not encrypt it.
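To make the point concrete, here is a minimal sketch that decodes the header segment shown above without any key. It assumes a recent Node.js version, where Buffer accepts the "base64url" encoding name; decodeJwtSegment is a hypothetical helper name.

```javascript
// Decode one Base64URL-encoded JWT segment into a JavaScript object.
// Assumes Node.js, whose Buffer understands the "base64url" encoding.
function decodeJwtSegment(segment) {
  const json = Buffer.from(segment, "base64url").toString("utf8");
  return JSON.parse(json);
}

const jwtHeader = decodeJwtSegment("eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9");
console.log(jwtHeader.alg); // "HS256"
console.log(jwtHeader.typ); // "JWT"
```

Note that this only reads the token. Verifying the signature is a separate step that requires the signing key.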
HTTP Basic Authentication
Basic Auth encodes credentials as username:password in Base64 and sends them in the request header:
Authorization: Basic dXNlcjpwYXNz
Decoding: dXNlcjpwYXNz → user:pass. This is why Basic Auth must always be used over HTTPS — the encoding offers no protection against interception.
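The header value above can be reproduced, and reversed, with a short sketch. It assumes Node.js for the Buffer global; basicAuthHeader is a hypothetical helper name.

```javascript
// Build the value of the Authorization header for HTTP Basic Auth.
function basicAuthHeader(username, password) {
  const credentials = `${username}:${password}`;
  return "Basic " + Buffer.from(credentials, "utf8").toString("base64");
}

const headerValue = basicAuthHeader("user", "pass");
console.log(headerValue); // "Basic dXNlcjpwYXNz"

// Anyone who sees the header can reverse it just as easily:
const token = headerValue.slice("Basic ".length);
console.log(Buffer.from(token, "base64").toString("utf8")); // "user:pass"
```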
Data URIs
Images, fonts and other files can be embedded directly in HTML or CSS without an additional HTTP request:
data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...
Useful for small icons in email templates or reducing HTTP requests on a page.
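A data URI is just a MIME type plus the Base64 form of the file's bytes. The sketch below assumes Node.js; toDataUri is a hypothetical helper, and the input is only the 8-byte PNG signature rather than a complete image, which is why the output matches the first characters of the example above.

```javascript
// Build a data URI from raw bytes and a MIME type.
function toDataUri(bytes, mimeType) {
  const encoded = Buffer.from(bytes).toString("base64");
  return `data:${mimeType};base64,${encoded}`;
}

// The fixed 8-byte signature that every PNG file starts with.
// This is NOT a displayable image, only a demonstration input.
const pngSignature = Uint8Array.from([
  0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a,
]);

console.log(toDataUri(pngSignature, "image/png"));
// "data:image/png;base64,iVBORw0KGgo="
```

In practice the bytes would come from a real file, for example one read from disk.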
Email MIME Attachments
The MIME standard uses Base64 to carry file attachments (PDFs, images) inside an email body, which is transmitted as plain text.
Base64 vs Base64URL — What's the Difference?
Standard Base64 uses + and /, which have special meaning in URLs (query strings, paths). Base64URL replaces them:
| Variant | Characters | Padding | Use case |
|---|---|---|---|
| Base64 | A–Z a–z 0–9 + / | = (required) | Email, data URIs, general use |
| Base64URL | A–Z a–z 0–9 - _ | Usually omitted | JWT, OAuth tokens, URL parameters |
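Converting between the two variants is a matter of swapping two characters and handling the padding. A minimal sketch follows; toBase64Url and fromBase64Url are hypothetical helper names, and neither function validates its input.

```javascript
// Standard Base64 -> Base64URL: swap + and /, drop trailing padding.
function toBase64Url(base64) {
  return base64.replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

// Base64URL -> standard Base64: swap back and restore the padding
// so the length is a multiple of 4 again.
function fromBase64Url(base64url) {
  const base64 = base64url.replace(/-/g, "+").replace(/_/g, "/");
  const padLength = (4 - (base64.length % 4)) % 4;
  return base64 + "=".repeat(padLength);
}

// The two bytes 0xFB 0xFF encode to "+/8=" in standard Base64.
console.log(toBase64Url("+/8="));  // "-_8"
console.log(fromBase64Url("-_8")); // "+/8="
```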
URL Encoding (Percent-Encoding)
URL encoding (also called percent-encoding) converts characters that are not allowed in URLs into the form %XX, where XX is the hexadecimal value of the byte in UTF-8. Every special character, space or non-ASCII letter is replaced this way:
Ελλάδα → %CE%95%CE%BB%CE%BB%CE%AC%CE%B4%CE%B1
space → %20 (or + in form data)
& → %26
= → %3D
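The mapping from character to UTF-8 bytes to %XX can be sketched by hand. percentEncode below is a hypothetical helper that leaves only the RFC 3986 unreserved characters (letters, digits, - . _ ~) untouched, so it is slightly stricter than JavaScript's built-in functions.

```javascript
// Percent-encode a string: every UTF-8 byte that is not an RFC 3986
// "unreserved" character becomes %XX with uppercase hex digits.
function percentEncode(text) {
  const unreserved = /^[A-Za-z0-9\-._~]$/;
  let out = "";
  for (const byte of new TextEncoder().encode(text)) {
    const ch = String.fromCharCode(byte);
    if (byte < 0x80 && unreserved.test(ch)) {
      out += ch; // safe ASCII character, keep as-is
    } else {
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncode("Ελλάδα")); // "%CE%95%CE%BB%CE%BB%CE%AC%CE%B4%CE%B1"
console.log(percentEncode(" "));      // "%20"
console.log(percentEncode("&"));      // "%26"
console.log(percentEncode("="));      // "%3D"
```

Each Greek letter occupies two bytes in UTF-8, which is why the six-letter word produces twelve %XX groups.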
JavaScript provides two different functions for this:
encodeURIComponent(): encodes everything except A–Z a–z 0–9 - _ . ! ~ * ' ( ). Use for individual query string values.
encodeURI(): leaves the same characters untouched, plus those that have a structural role in a URL (; / ? : @ & = + $ , #). Use for complete URLs.
encodeURIComponent("hello world&foo=bar")
→ "hello%20world%26foo%3Dbar"
encodeURI("https://example.com/search?q=hello world")
→ "https://example.com/search?q=hello%20world"
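The %20 versus + distinction for spaces shows up when a query string is built with the standard URLSearchParams class, which follows the form-data convention. A minimal sketch, using the same example.com URL as above:

```javascript
const value = "hello world&foo=bar";

// URLSearchParams serializes in form-data style: spaces become "+".
const formStyle = new URLSearchParams({ q: value }).toString();
console.log(formStyle); // "q=hello+world%26foo%3Dbar"

// encodeURIComponent uses %20 for the space instead.
const percentStyle = "q=" + encodeURIComponent(value);
console.log(percentStyle); // "q=hello%20world%26foo%3Dbar"

// Both forms decode back to the same value when parsed as a query string.
for (const query of [formStyle, percentStyle]) {
  const parsed = new URL("https://example.com/search?" + query);
  console.log(parsed.searchParams.get("q")); // "hello world&foo=bar"
}
```

Because & and = are escaped in both forms, the value cannot be mistaken for an extra foo=bar parameter.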
Punycode and Internationalized Domain Names (IDN)
Hostnames in DNS were designed for a restricted set of ASCII characters. To support domain names containing Greek, Chinese or Arabic letters (IDN, Internationalized Domain Names), Punycode (RFC 3492) was developed.
Punycode converts Unicode labels into ASCII using the xn-- prefix:
ελλάδα.gr → xn--hxakic4aa.gr
münchen.de → xn--mnchen-3ya.de
παράδειγμα.com → xn--hxajbheg2az3al.com
The browser displays the Unicode form in the address bar (more human-readable), but the Punycode form is always what travels on the network. This matters for DNS queries, SSL certificates and email server configuration.
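The same conversion can be observed from code. The sketch below relies on the standard URL class, which converts a Unicode hostname to its ASCII form when parsing; it assumes Node.js or a modern browser, and toAsciiHostname is a hypothetical helper name.

```javascript
// Return the ASCII (Punycode) form of a hostname by letting the
// URL parser perform the IDN conversion.
function toAsciiHostname(hostname) {
  return new URL(`https://${hostname}`).hostname;
}

console.log(toAsciiHostname("ελλάδα.gr"));      // "xn--hxakic4aa.gr"
console.log(toAsciiHostname("münchen.de"));     // "xn--mnchen-3ya.de"
console.log(toAsciiHostname("παράδειγμα.com")); // "xn--hxajbheg2az3al.com"

// Labels that are already ASCII pass through unchanged.
console.log(toAsciiHostname("example.com"));    // "example.com"
```

Only the labels containing non-ASCII characters receive the xn-- prefix; the .gr, .de and .com labels are left as they are.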
Comparison: Base64 / URL Encoding / Punycode
| Encoding | Purpose | Typical Input | Typical Output |
|---|---|---|---|
| Base64 | Binary → ASCII text | Files, bytes, credentials | dXNlcjpwYXNz |
| URL Encode | Safe embedding in URLs | Query params, non-ASCII text | %CE%B5%CE%BB |
| Punycode | Unicode domain → ASCII DNS | IDN domain names | xn--hxakic4aa.gr |
Frequently Asked Questions
A Base64 string uses only the characters A–Z a–z 0–9 + / =, its length is always a multiple of 4 (with padding), and it often ends with one or two = signs. If the length isn't divisible by 4 or it contains - and _ instead of + and /, it's likely Base64URL.
What is the difference between urlencode and rawurlencode in PHP?
urlencode() encodes spaces as + (HTML form data style), while rawurlencode() encodes them as %20 per RFC 3986. For URL paths and query parameters not originating from a form submission, use rawurlencode(): %20 is understood in every part of a URL, whereas + means a space only in form-encoded query strings.
... (xn--...) as well, since some certificate authorities issue certificates only for one form. Also note that sending email from an IDN domain requires special support (EAI / RFC 6531) from your mail server.