What are HTML entities and why are they used?

HTML entities are special codes that represent characters which have reserved meaning in HTML (like <, >, &, ") or aren't available on a standard keyboard (like ©, €, →). They start with & and end with ;. For example, &lt; represents <. Without entities, browsers would interpret reserved characters as HTML markup.

How does HTML encoding prevent XSS attacks?

Cross-Site Scripting (XSS) attacks inject malicious HTML/JavaScript into web pages. HTML encoding converts dangerous characters like < and > into their entity equivalents (&lt; and &gt;), so the browser renders them as text instead of executing them as code. Always encode user-generated content before inserting it into HTML.
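As a minimal sketch of that conversion, the snippet below runs a script payload through Python's standard-library html.escape. The payload string and the surrounding paragraph markup are made up for illustration.

```python
import html

# Hypothetical attacker-supplied comment text.
payload = '<script>alert("xss")</script>'

# html.escape replaces &, < and >, and by default also both quote characters.
encoded = html.escape(payload)

# Hypothetical page fragment: the encoded payload is inert text inside it.
page = f"<p>{encoded}</p>"
print(page)
```

A browser shows the original text of the payload to the reader, but no script element is ever created.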

What is the difference between HTML encoding and URL encoding?

HTML encoding converts characters to HTML entities (& → &amp;) for safe display in HTML documents. URL encoding converts characters to percent-encoded format (space → %20) for safe transmission in URLs. They serve different purposes and are not interchangeable.
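To make the difference concrete, here is a small sketch that encodes the same sample string both ways using only the Python standard library. The sample text is an arbitrary choice.

```python
import html
from urllib.parse import quote

text = "Tom & Jerry <3"

# For placing the text inside an HTML document.
html_encoded = html.escape(text)

# For placing the text inside a single URL component (safe="" encodes "/" too).
url_encoded = quote(text, safe="")

print(html_encoded)
print(url_encoded)
```

The two outputs share no escaped form, which is why neither encoder can stand in for the other.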

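What are the most common HTML entities?

The most common are the five reserved-character escapes (&amp;, &lt;, &gt;, &quot;, &#39;), followed by &nbsp;, &copy;, &mdash;, &rarr; and &hellip;. The reference table on this page lists each with its numeric form.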

Should I use named or numeric HTML entities?

Named entities like &amp; are more readable, but not all characters have named versions. Numeric entities like &#38; (decimal) or &#x26; (hex) work for any Unicode character. For common entities, use named versions for readability. For uncommon characters, use numeric. Both are valid in all modern browsers.
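The equivalence of the three spellings can be checked with a short sketch. It relies on html.unescape from the standard library; to_numeric_entity is a hypothetical helper name introduced here.

```python
import html

# All three spellings decode to the same character.
named = html.unescape("&amp;")
decimal = html.unescape("&#38;")
hexadecimal = html.unescape("&#x26;")
print(named == decimal == hexadecimal)

def to_numeric_entity(ch: str) -> str:
    """Return the decimal numeric entity for a single character (hypothetical helper)."""
    return f"&#{ord(ch)};"

print(to_numeric_entity("→"))
```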

What is the HTML entity for the ampersand character?

&amp; (5 characters) renders as a single ampersand. It is required because a literal ampersand inside HTML markup could start an entity reference. Other essential escapes: &lt; for less-than, &gt; for greater-than, &quot; for double quote, &#39; for apostrophe.
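Because the escaped form itself begins with an ampersand, escaping the same string twice changes the result. The sketch below illustrates that pitfall with html.escape; the sample string is arbitrary.

```python
import html

once = html.escape("AT&T")   # the ampersand becomes an entity
twice = html.escape(once)    # the entity's own ampersand is escaped again

print(once)
print(twice)
print(html.unescape(twice))  # one decode only undoes one level
```

If a page shows a visible "&amp;" to readers, the text was most likely escaped once too often somewhere along the way.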

How do HTML entities prevent XSS attacks?

Cross-site scripting (XSS) injects malicious HTML or JavaScript. Encoding user-controlled strings into HTML entities prevents them from being parsed as markup — a script payload becomes literal text instead of executing. Always escape on output, not input.

What is the non-breaking space HTML entity?

&nbsp; is a non-breaking space: visually identical to a regular space, but the browser will not break a line at this point. It is used to keep words together: Mr. Smith, 100 km, Section 3.2. Excessive &nbsp; use breaks responsive layouts.
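A brief sketch of how the entity maps to a real character: &nbsp; decodes to U+00A0, which is a different code point from the ordinary space. keep_together is a hypothetical helper used only for illustration.

```python
import html

NBSP = html.unescape("&nbsp;")  # the character U+00A0

def keep_together(left: str, right: str) -> str:
    """Join two words with a non-breaking space (hypothetical helper)."""
    return f"{left}{NBSP}{right}"

label = keep_together("100", "km")

# Turn the character back into its entity when writing HTML by hand.
markup = label.replace(NBSP, "&nbsp;")
print(markup)
```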

Free HTML Entity Encoder — XSS-Safe Online

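Char	Named	Numeric	Description
&	&amp;	&#38;	Ampersand
<	&lt;	&#60;	Less than
>	&gt;	&#62;	Greater than
"	&quot;	&#34;	Double quote
'	&apos;	&#39;	Single quote
(non-breaking space)	&nbsp;	&#160;	Non-breaking space
©	&copy;	&#169;	Copyright
—	&mdash;	&#8212;	Em dash
→	&rarr;	&#8594;	Right arrow
…	&hellip;	&#8230;	Ellipsis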

HTML entities are escape sequences (&amp;, &lt;, &gt;, &copy;) that represent characters which would otherwise be parsed as HTML markup or are difficult to type. Named entities are more readable (&copy;); numeric entities are more compatible across older renderers (&#169;). This free HTML entity encoder and decoder outputs both formats for XSS-safe content rendering.

Examples

Common entity reference
< → &lt;
> → &gt;
" → &quot;
& → &amp;
' → &#39;
These 5 entities are the minimum needed to escape a user-controlled string for safe rendering in HTML text and quoted attribute values.
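The five replacements listed above can be sketched as a single-pass translation table. This is an illustrative hand-rolled version, not a replacement for a maintained library; escape_html and _HTML_ESCAPES are hypothetical names.

```python
# One-pass translation table for the five reserved characters.
_HTML_ESCAPES = str.maketrans({
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
})

def escape_html(text: str) -> str:
    """Escape the five reserved characters in a single pass."""
    return text.translate(_HTML_ESCAPES)

print(escape_html('<a href="x">Tom & Jerry\'s</a>'))
```

Translating in one pass avoids the ordering problem of chained replace calls, where escaping & after the other characters would corrupt the entities already produced.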

Named vs numeric
© → &copy; (named, readable) or &#169; (numeric, universal compatibility)
Named entities are easier for humans; numeric entities are guaranteed in every browser, including legacy ones.

Non-breaking space
&nbsp; prevents line breaks at the space, useful for "Mr. Smith" or "100 km" where the words must stay together.

What are HTML entities and why do they matter?

An HTML entity is a special sequence of characters that represents a reserved or hard-to-type character in HTML markup. Entities start with an ampersand (&) and end with a semicolon (;). The entity &lt; renders as <, &amp; renders as &, &copy; renders as ©. Present since early HTML, standardized at roughly 250 names in HTML 4.01, and extended in HTML5 to over 2,000 named entities, they exist for two reasons: escaping reserved characters (so the browser doesn't think your text is markup) and typing characters that aren't on your keyboard (em dashes, math symbols, currency).

The five mandatory escapes — the ones that XSS-prevention guides drill into every web developer:

Char	Named entity	Numeric (decimal)	Numeric (hex)	Why escape?
`<`	`&lt;`	`&#60;`	`&#x3C;`	Starts a tag (the #1 XSS vector)
`>`	`&gt;`	`&#62;`	`&#x3E;`	Ends a tag
`&`	`&amp;`	`&#38;`	`&#x26;`	Starts an entity; escape it to get a literal ampersand
`"`	`&quot;`	`&#34;`	`&#x22;`	Closes attribute values
`'` (apostrophe)	`&apos;` (HTML5)	`&#39;`	`&#x27;`	Closes single-quoted attributes

If your application takes user input and renders it inside HTML without escaping these five characters, you have an XSS vulnerability. Most modern web frameworks and template engines escape by default; raw string concatenation or "trust-us" templating bypasses that protection. This is why assigning user data to innerHTML is dangerous, while textContent is safe: it never parses its value as markup.

Named vs numeric vs hex — three ways to encode the same character

Form	Example	Pros	Cons
Named entity	`&copy;`	Readable, memorable	~2,000 names; not all parsers support all of them
Decimal numeric	`&#169;`	Universal: every character has a decimal code point	Less readable than named
Hex numeric	`&#xA9;`	Matches Unicode references (U+00A9)	Slightly less common; same support as decimal

For the five mandatory escapes, named entities are universally supported and most readable (apart from &apos; in pre-HTML5 parsers). For uncommon symbols (em dash, em space, math operators, arrows), numeric entities are safer because every parser recognizes them. Always include the trailing semicolon: HTML5 browsers tolerate a missing semicolon only for a fixed set of legacy names such as &copy and &amp, while XML and strict parsers reject the entity entirely.
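The sketch below derives all three forms for a character and then shows how a missing semicolon behaves in Python's HTML5-style decoder. entity_forms is a hypothetical helper; codepoint2name is the standard library's HTML 4.01 name table, so newer HTML5-only names come back as None here.

```python
import html
from html.entities import codepoint2name

def entity_forms(ch: str):
    """Return (named, decimal, hex) entity strings for one character.

    The named form is None when the HTML 4.01 table in the standard
    library has no name for the code point.
    """
    cp = ord(ch)
    name = codepoint2name.get(cp)
    named = f"&{name};" if name else None
    return named, f"&#{cp};", f"&#x{cp:X};"

print(entity_forms("©"))
print(entity_forms("✓"))

# Missing semicolons: only legacy names are still recognized.
print(html.unescape("&copy 2024"))   # decoded
print(html.unescape("&rarr next"))   # left untouched
```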

HTML5's named-entity gotchas

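&apos; only became official in HTML5. It is defined in XML (and therefore XHTML) but not in HTML 4, so older HTML parsers do not recognize it. Use &#39; if compatibility with those parsers matters.
Some "obvious" names don't exist or are spelled differently than you'd guess. There's no &asterisk; (HTML5 spells it &ast;), and plain ASCII characters such as * and $ never need escaping anyway. Don't over-escape.
Case matters. &Auml; = Ä, &auml; = ä. They are different characters.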
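These points can be checked against the entity tables that ship with Python, as a rough sketch. html5 holds the HTML5 names (keys omit the leading & and usually include the trailing ;), while name2codepoint holds the older HTML 4.01 set.

```python
from html.entities import html5, name2codepoint

# The apostrophe name exists in HTML5 but not in the HTML 4.01 table.
print("apos;" in html5, "apos" in name2codepoint)

# Guessing at names is unreliable: only the short spelling exists.
print("asterisk;" in html5, "ast;" in html5)

# Entity names are case-sensitive.
print(html5["Auml;"], html5["auml;"])
```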

Useful named entities by category

Currency & math

Symbol	Named	Numeric
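©	`&copy;`	`&#169;`
®	`&reg;`	`&#174;`
™	`&trade;`	`&#8482;`
€	`&euro;`	`&#8364;`
£	`&pound;`	`&#163;`
¥	`&yen;`	`&#165;`
×	`&times;`	`&#215;`
÷	`&divide;`	`&#247;`
±	`&plusmn;`	`&#177;`
°	`&deg;`	`&#176;`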

Punctuation & whitespace

Symbol	Named	Numeric
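(non-breaking space)	`&nbsp;`	`&#160;`
—	`&mdash;`	`&#8212;`
–	`&ndash;`	`&#8211;`
…	`&hellip;`	`&#8230;`
“	`&ldquo;`	`&#8220;`
”	`&rdquo;`	`&#8221;`
‘	`&lsquo;`	`&#8216;`
’	`&rsquo;`	`&#8217;`
«	`&laquo;`	`&#171;`
»	`&raquo;`	`&#187;`
·	`&middot;`	`&#183;`
•	`&bull;`	`&#8226;`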

Arrows

Symbol	Named	Numeric
←	`&larr;`	`&#8592;`
→	`&rarr;`	`&#8594;`
↑	`&uarr;`	`&#8593;`
↓	`&darr;`	`&#8595;`
⇐	`&lArr;`	`&#8656;`
⇒	`&rArr;`	`&#8658;`

HTML entity encoding for XSS prevention — the critical rules

XSS (Cross-Site Scripting) happens when user-supplied content is rendered as HTML/JavaScript instead of as text. The fix: encode user input before placing it in an HTML context. But "an HTML context" is plural — different contexts need different encoding.

Context	Example	Required encoding
HTML body / text node	`<p>USER</p>`	Escape `< > &`
HTML attribute (quoted)	`<a title="USER">`	Escape `< > & "`
HTML attribute (unquoted)	`<a title=USER>`	Encode every non-alphanumeric (or quote the attribute)
JavaScript context	`<script>var x = "USER";</script>`	JavaScript escape (`\x3C`), NOT HTML escape
CSS context	`<style>.a { content: "USER" }</style>`	CSS escape (`\3C`), NOT HTML escape
URL parameter	`<a href="?q=USER">`	URL encode (`%3C`), NOT HTML escape

The classic mistake: HTML-escaping content that ends up inside JavaScript. Inside a script block, &lt; stays a literal four-character string in JS; the < is never recovered. Use the right escape for the destination context, not the source.
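A minimal sketch of per-context encoders, using only the Python standard library, might look like the following. The for_* function names are hypothetical, and the JavaScript variant is one common approach (a JSON string literal with <, > and & written as \u escapes) rather than the only correct one; a vetted encoder library is the safer choice in production.

```python
import html
import json
from urllib.parse import quote

def for_html_body(value: str) -> str:
    # Text node: escape &, < and > only.
    return html.escape(value, quote=False)

def for_html_attr(value: str) -> str:
    # Quoted attribute value: also escape both quote characters.
    return html.escape(value, quote=True)

def for_url_param(value: str) -> str:
    # Query-string component: percent-encode everything that is not unreserved.
    return quote(value, safe="")

def for_js_string(value: str) -> str:
    # json.dumps yields a quoted string literal; escaping <, > and & as well
    # keeps the literal from closing the enclosing <script> element.
    return (json.dumps(value)
            .replace("<", "\\u003c")
            .replace(">", "\\u003e")
            .replace("&", "\\u0026"))

user = '</script><script>alert(1)</script>'
print(for_html_body(user))
print(for_js_string(user))
```

The JavaScript form still decodes back to the original string, so unlike an HTML-escaped value nothing is lost when the script reads it.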

Defense in depth: use a Content Security Policy (CSP) header that bans inline scripts (script-src 'self'). If escaping fails somewhere, CSP can still stop an injected inline script from running.
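As a rough illustration of where that header goes, here is a minimal WSGI application that sends the policy with every response. The app function and its response body are hypothetical; real deployments usually set the header in the web server or framework configuration instead.

```python
def app(environ, start_response):
    """Minimal WSGI app (hypothetical) that sends a CSP header."""
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        # Only scripts loaded from the page's own origin may run; inline
        # <script> blocks and inline event handlers are refused.
        ("Content-Security-Policy", "script-src 'self'"),
    ]
    start_response("200 OK", headers)
    return [b"<p>Hello</p>"]

# To try it locally (blocks until interrupted):
#   from wsgiref.simple_server import make_server
#   make_server("", 8000, app).serve_forever()
```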

HTML escaping in 8 programming languages

JavaScript

javascript

// Modern: use textContent (no encoding bugs possible)
el.textContent = userInput;          // Safe ✓

// If you MUST build HTML strings, escape manually
function escapeHtml(s) {
  return s.replace(/[&<>"']/g, c => ({
    '&': '&amp;', '<': '&lt;', '>': '&gt;',
    '"': '&quot;', "'": '&#39;'
  }[c]));
}

// To DECODE entities (use a textarea — the browser does it)
function decodeHtml(s) {
  const ta = document.createElement('textarea');
  ta.innerHTML = s;
  return ta.value;
}

Python

python

import html

html.escape("<script>alert(1)</script>")
# '&lt;script&gt;alert(1)&lt;/script&gt;'

html.escape("It's & \"quoted\"", quote=True)
# 'It&#x27;s &amp; &quot;quoted&quot;'

# Decode
html.unescape("&lt;p&gt;Hello&lt;/p&gt;")  # '<p>Hello</p>'

PHP

php

// Mandatory: htmlspecialchars (escapes 5 reserved chars)
echo htmlspecialchars($input, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// htmlentities — encodes ALL applicable entities (rarely what you want)
echo htmlentities($input, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Decode
echo html_entity_decode($input, ENT_QUOTES | ENT_HTML5, 'UTF-8');

Java

java

// Apache Commons Text
import org.apache.commons.text.StringEscapeUtils;

String safe = StringEscapeUtils.escapeHtml4(userInput);
String back = StringEscapeUtils.unescapeHtml4(safe);

// OWASP Java Encoder (recommended for XSS prevention)
import org.owasp.encoder.Encode;
String safeHtml = Encode.forHtml(userInput);
String safeAttr = Encode.forHtmlAttribute(userInput);
String safeJs = Encode.forJavaScript(userInput);  // different escape!

Go

go

package main

import (
	"fmt"
	"html"
	"html/template"
	"os"
)

func main() {
	escaped := html.EscapeString("<script>")
	fmt.Println(escaped) // &lt;script&gt;

	unescaped := html.UnescapeString("&lt;p&gt;hi&lt;/p&gt;")
	fmt.Println(unescaped) // <p>hi</p>

	// In Go templates, html/template auto-escapes by default, so prefer it:
	t := template.Must(template.New("x").Parse(`<p>{{.}}</p>`))
	if err := t.Execute(os.Stdout, "<script>"); err != nil { // automatically escaped
		panic(err)
	}
}

Ruby

ruby

require 'cgi'

CGI.escapeHTML("<script>alert(1)</script>")
# "&lt;script&gt;alert(1)&lt;/script&gt;"

CGI.unescapeHTML("&lt;p&gt;hi&lt;/p&gt;")
# "<p>hi</p>"

# In Rails ERB templates, the <%= %> syntax auto-escapes its output:
#   <%= user.name %>       auto-escaped, safe
#   <%= raw user.name %>   NOT escaped; use only for content you have already sanitized

Rust

rust

use html_escape::{encode_text, decode_html_entities};

let safe = encode_text("<script>");
// "&lt;script&gt;"

let back = decode_html_entities("&lt;p&gt;");
// "<p>"

// Or use Askama / Tera templates — they auto-escape

Bash (recode / sed / xmlstarlet)

bash

# GNU recode (ascii..html encodes; html..ascii decodes)
echo '<p>Hello & world</p>' | recode ascii..html
# &lt;p&gt;Hello &amp; world&lt;/p&gt;

# Plain sed (5 mandatory chars)
sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' \
    -e 's/"/\&quot;/g' -e "s/'/\&#39;/g"

# Decode with xmlstarlet
echo '&lt;p&gt;hi&lt;/p&gt;' | xmlstarlet unesc

HTML entity best practices

Always use auto-escaping templating engines. Jinja2, ERB, Thymeleaf, html/template (Go), React's JSX — all escape by default. Manual escaping is bug-prone.
Use textContent in JavaScript, not innerHTML, when inserting user data. Eliminates the encoding question entirely.
Never trust input from anywhere. Database content, third-party APIs, your own admin panel — all can be sources of XSS payloads.
Encode at the boundary, not the storage. Store raw user input; encode only when rendering. Lets you change templates and re-render correctly.
Pick the right encoder for the context. HTML escape, JS escape, URL encode, CSS escape — they're all different. OWASP's library handles them all.
Layer defenses with CSP. Even if XSS slips through, a tight Content Security Policy blocks inline-script execution.
Sanitize, don't escape, when allowing some HTML. If users paste rich content, use DOMPurify (JS) or Bleach (Python) to strip dangerous tags. Don't try to write a regex sanitizer.
Don't HTML-escape data going into JSON. JSON has different rules — use JSON.stringify or your language's JSON library.
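Two of the practices above, storing raw input and choosing the encoder per output format, can be sketched together. The stored_comment record and the render_* helpers are hypothetical; the point is that the same unmodified value is encoded differently at each output boundary.

```python
import html
import json

# Stored exactly as submitted (hypothetical record).
stored_comment = {"author": "Ann", "text": 'I <3 "HTML" & CSS'}

def render_html(comment: dict) -> str:
    """HTML boundary: escape each value as it is written into markup."""
    author = html.escape(comment["author"])
    text = html.escape(comment["text"])
    return f"<p><b>{author}</b>: {text}</p>"

def render_json(comment: dict) -> str:
    """JSON boundary: the JSON library applies JSON's own escaping rules."""
    return json.dumps(comment)

print(render_html(stored_comment))
print(render_json(stored_comment))
```

Because the stored value is never pre-escaped, the JSON output contains no stray entities and the HTML output is escaped exactly once.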
