
HTML Entity Encoder & Decoder Online

Encode special characters to HTML entities or decode entities back to readable text. Converts reserved HTML characters (<, >, &, ", ') to their entity equivalents — essential for safely displaying user-generated content and preventing Cross-Site Scripting (XSS) attacks (OWASP Top 10). Supports named entities (&amp;), decimal numeric (&#38;), hex numeric (&#x26;), and full Unicode escape. Runs in your browser; even sensitive payloads stay local.

Last updated: May 2026
Char | Named | Numeric | Description
& | &amp; | &#38; | Ampersand
< | &lt; | &#60; | Less than
> | &gt; | &#62; | Greater than
" | &quot; | &#34; | Double quote
' | &#39; | &#39; | Single quote
(nbsp) | &nbsp; | &#160; | Non-breaking space
© | &copy; | &#169; | Copyright
— | &mdash; | &#8212; | Em dash
→ | &rarr; | &#8594; | Right arrow
… | &hellip; | &#8230; | Ellipsis

What are HTML entities and why do they matter?

An HTML entity (formally, a character reference) is a sequence of characters that stands for a reserved or hard-to-type character in HTML markup. It starts with an ampersand (&) and ends with a semicolon (;). The entity &lt; renders as <, &amp; renders as &, and &copy; renders as ©. Entities have been part of HTML since its early versions and were extended in HTML5 to over 2,000 named entities. They exist for two reasons: escaping reserved characters (so the browser doesn't think your text is markup) and typing characters that aren't on your keyboard (em dashes, math symbols, currency).

The five mandatory escapes — the ones that XSS-prevention guides drill into every web developer:

Char | Named entity | Numeric (decimal) | Numeric (hex) | Why escape?
< | &lt; | &#60; | &#x3C; | Starts a tag (the #1 XSS vector)
> | &gt; | &#62; | &#x3E; | Ends a tag
& | &amp; | &#38; | &#x26; | Starts an entity; escape it to get a literal &
" | &quot; | &#34; | &#x22; | Closes attribute values
' (apostrophe) | &apos; (HTML5) | &#39; | &#x27; | Closes single-quoted attributes

If your application takes user input and renders it inside HTML without escaping these five characters, you have an XSS vulnerability. Most modern web frameworks and template engines escape by default; raw string concatenation or "trust-us" templating bypasses that protection. This is why assigning user data to innerHTML is dangerous (the string is parsed as markup), while textContent is safe (the string is always treated as text).
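
The gap between raw concatenation and escaping first can be sketched in Python with the standard library's html.escape. This is a minimal illustration only: the render_comment helper and the sample payload are hypothetical names, not part of any framework.

```python
import html

def render_comment(user_input: str) -> str:
    """Hypothetical helper: wrap user text in a paragraph, escaping it first."""
    # quote=True (the default) also escapes " and ', covering all five characters.
    return "<p>" + html.escape(user_input, quote=True) + "</p>"

payload = '<img src=x onerror="alert(1)">'

unsafe = "<p>" + payload + "</p>"   # raw concatenation: markup reaches the browser
safe = render_comment(payload)      # escaped: the browser shows the text literally

print(unsafe)
print(safe)
```

In the escaped version the payload contains no < or " characters any more, so the browser has nothing to parse as a tag or attribute.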

Named vs numeric vs hex — three ways to encode the same character

Form | Example | Pros | Cons
Named entity | &copy; | Readable, memorable | ~2,000 names; not all parsers support all of them
Decimal numeric | &#169; | Universal: every character has a decimal code point | Less readable than named
Hex numeric | &#xA9; | Matches Unicode references (U+00A9) | Slightly less common; same support as decimal

For the mandatory escapes, the named entities &lt;, &gt;, &amp;, and &quot; are universally supported and the most readable; for the apostrophe, &#39; is the portable choice. For uncommon symbols (em dash, em space, math operators, arrows), numeric entities are safer because every parser recognizes them. Always include the trailing semicolon: HTML parsers tolerate a missing semicolon only for a small set of legacy names, and XML and other strict parsers reject the entity entirely.
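
To see that the three forms are interchangeable, the sketch below decodes each one with Python's html.unescape and builds the numeric forms from a character's code point. The to_decimal and to_hex helpers are illustrative names introduced here.

```python
import html

# Three spellings of the same character (U+00A9, the copyright sign).
forms = ["&copy;", "&#169;", "&#xA9;"]
decoded = [html.unescape(f) for f in forms]
print(decoded)

def to_decimal(ch: str) -> str:
    """Decimal numeric reference for a single character."""
    return f"&#{ord(ch)};"

def to_hex(ch: str) -> str:
    """Hexadecimal numeric reference for a single character."""
    return f"&#x{ord(ch):X};"

print(to_decimal("©"), to_hex("©"))
```

Because the numeric forms come straight from the code point, they work for any character, including ones that have no named entity.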

HTML5's named-entity gotchas

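  • &apos; only became official in HTML5. It is predefined in XML (and so valid in XHTML), but it was never part of HTML 4, and legacy HTML parsers may not recognize it. Use &#39; if compatibility with older HTML parsers matters.
  • Some "obvious" names don't exist. There's no &asterisk; (the HTML5 name is &ast;), and names such as &dollar; exist only in HTML5. These are plain ASCII characters that never need escaping, so don't over-escape.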
  • Case matters. &Auml; = Ä, &auml; = ä — different characters.
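
These gotchas can be checked against the entity tables that ship with Python, as a rough sketch. It assumes html.entities.name2codepoint as a stand-in for the HTML 4 name list and html.entities.html5 for the HTML5 list; a given browser or parser may differ.

```python
import html
from html.entities import html5, name2codepoint

# name2codepoint holds the HTML 4 names; html5 holds the HTML5 names
# (its keys include the trailing semicolon, e.g. "apos;").
print("apos" in name2codepoint)   # membership in the HTML 4 table
print("apos;" in html5)           # membership in the HTML5 table
print("asterisk;" in html5)       # checks whether this name is defined
print("ast;" in html5)            # the shorter HTML5 name for *

# Entity names are case-sensitive: these decode to different characters.
print(html.unescape("&Auml;"), html.unescape("&auml;"))
```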

Useful named entities by category

Currency & math

Symbol | Named | Numeric
© | &copy; | &#169;
® | &reg; | &#174;
™ | &trade; | &#8482;
€ | &euro; | &#8364;
£ | &pound; | &#163;
¥ | &yen; | &#165;
× | &times; | &#215;
÷ | &divide; | &#247;
± | &plusmn; | &#177;
° | &deg; | &#176;

Punctuation & whitespace

Symbol | Named | Numeric
(non-breaking space) | &nbsp; | &#160;
— | &mdash; | &#8212;
– | &ndash; | &#8211;
… | &hellip; | &#8230;
“ | &ldquo; | &#8220;
” | &rdquo; | &#8221;
‘ | &lsquo; | &#8216;
’ | &rsquo; | &#8217;
« | &laquo; | &#171;
» | &raquo; | &#187;
· | &middot; | &#183;
• | &bull; | &#8226;

Arrows

Symbol | Named | Numeric
← | &larr; | &#8592;
→ | &rarr; | &#8594;
↑ | &uarr; | &#8593;
↓ | &darr; | &#8595;
⇐ | &lArr; | &#8656;
⇒ | &rArr; | &#8658;

HTML entity encoding for XSS prevention — the critical rules

XSS (Cross-Site Scripting) happens when user-supplied content is rendered as HTML/JavaScript instead of as text. The fix: encode user input before placing it in an HTML context. But "an HTML context" is plural — different contexts need different encoding.

Context | Example | Required encoding
HTML body / text node | <p>USER</p> | Escape < > &
HTML attribute (quoted) | <a title="USER"> | Escape < > & "
HTML attribute (unquoted) | <a title=USER> | Encode every non-alphanumeric (or quote the attribute)
JavaScript context | <script>var x = "USER";</script> | JavaScript escape (\x3C), NOT HTML escape
CSS context | <style>.a { content: "USER" }</style> | CSS escape (\3C), NOT HTML escape
URL parameter | <a href="?q=USER"> | URL-encode the value (%3C) first; HTML escaping alone is not enough

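The classic mistake: HTML-escaping content that ends up inside JavaScript. Inside a <script> element the browser does not decode entities, so &lt; stays as the four literal characters &lt; in the JS string and the < is never recovered. Choose the escape for the destination context, not for where the data came from.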
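
As a rough sketch of "one input, several encoders", the snippet below runs the same hypothetical payload through three Python standard-library functions. Replacing < with \u003c in the JavaScript case is one common convention, assumed here for illustration rather than taken from a particular library's documented recipe.

```python
import html
import json
from urllib.parse import quote

user = '</script><script>alert("x")</script>'

# HTML body or quoted attribute: entity-escape.
as_html = html.escape(user, quote=True)

# URL query value: percent-encode (safe="" so "/" is encoded too).
as_url = quote(user, safe="")

# JavaScript string literal: JSON-encode, then neutralize "<" so the text
# cannot close the surrounding <script> element. The \u003c replacement is
# an assumption of this sketch.
as_js = json.dumps(user).replace("<", "\\u003c")

print(as_html)
print(as_url)
print(as_js)
```

Each result is only safe in its own context: the entity-escaped string is wrong inside a script block, and the percent-encoded string is wrong as visible body text.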

Defense in depth: send a Content Security Policy (CSP) header that bans inline scripts (for example, script-src 'self'). If escaping fails somewhere, CSP can still stop the injected script from running.

HTML escaping in 8 programming languages

JavaScript

javascript
// Modern: use textContent (no encoding bugs possible)
el.textContent = userInput;          // Safe ✓

// If you MUST build HTML strings, escape manually
function escapeHtml(s) {
  return s.replace(/[&<>"']/g, c => ({
    '&': '&amp;', '<': '&lt;', '>': '&gt;',
    '"': '&quot;', "'": '&#39;'
  }[c]));
}

// To DECODE entities (use a textarea — the browser does it)
function decodeHtml(s) {
  const ta = document.createElement('textarea');
  ta.innerHTML = s;
  return ta.value;
}

Python

python
import html

html.escape("<script>alert(1)</script>")
# '&lt;script&gt;alert(1)&lt;/script&gt;'

html.escape("It's & \"quoted\"", quote=True)
# 'It&#x27;s &amp; &quot;quoted&quot;'

# Decode
html.unescape("&lt;p&gt;Hello&lt;/p&gt;")  # '<p>Hello</p>'

PHP

php
// Mandatory: htmlspecialchars (escapes 5 reserved chars)
echo htmlspecialchars($input, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// htmlentities — encodes ALL applicable entities (rarely what you want)
echo htmlentities($input, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Decode
echo html_entity_decode($input, ENT_QUOTES | ENT_HTML5, 'UTF-8');

Java

java
// Apache Commons Text
import org.apache.commons.text.StringEscapeUtils;

String safe = StringEscapeUtils.escapeHtml4(userInput);
String back = StringEscapeUtils.unescapeHtml4(safe);

// OWASP Java Encoder (recommended for XSS prevention)
import org.owasp.encoder.Encode;
String safeHtml = Encode.forHtml(userInput);
String safeAttr = Encode.forHtmlAttribute(userInput);
String safeJs = Encode.forJavaScript(userInput);  // different escape!

Go

go
package main

import (
	"fmt"
	"html"
	"html/template"
	"os"
)

func main() {
	escaped := html.EscapeString("<script>")
	fmt.Println(escaped) // &lt;script&gt;

	unescaped := html.UnescapeString("&lt;p&gt;hi&lt;/p&gt;")
	fmt.Println(unescaped) // <p>hi</p>

	// In Go templates, html/template auto-escapes by default, so prefer it.
	t := template.Must(template.New("x").Parse("<p>{{.}}</p>\n"))
	if err := t.Execute(os.Stdout, "<script>"); err != nil { // <p>&lt;script&gt;</p>
		panic(err)
	}
}

Ruby

ruby
require 'cgi'

CGI.escapeHTML("<script>alert(1)</script>")
# "&lt;script&gt;alert(1)&lt;/script&gt;"

CGI.unescapeHTML("&lt;p&gt;hi&lt;/p&gt;")
# "<p>hi</p>"

# In Rails ERB, the <%= %> syntax auto-escapes
<%= user.name %>        # auto-escaped, safe
<%= raw user.name %>    # NOT escaped — only when you've already trusted the input

Rust

rust
use html_escape::{encode_text, decode_html_entities};

let safe = encode_text("<script>");
// "&lt;script&gt;"

let back = decode_html_entities("&lt;p&gt;");
// "<p>"

// Or use Askama / Tera templates — they auto-escape

Bash (recode / sed / xmlstarlet)

bash
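# GNU recode: "charset..html" encodes, "html..charset" decodes
echo '<p>Hello & world</p>' | recode ascii..html
# Expected (exact output can vary between recode versions):
# &lt;p&gt;Hello &amp; world&lt;/p&gt;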

# Plain sed (5 mandatory chars)
sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' \
    -e 's/"/\&quot;/g' -e "s/'/\&#39;/g"

# Decode with xmlstarlet
echo '&lt;p&gt;hi&lt;/p&gt;' | xmlstarlet unesc

HTML entity best practices

  • Prefer auto-escaping templating engines. Rails ERB, Thymeleaf, html/template (Go), and React's JSX escape by default, and Jinja2 does once autoescaping is enabled (as Flask does for HTML templates). Manual escaping is bug-prone.
  • Use textContent in JavaScript, not innerHTML, when inserting user data. Eliminates the encoding question entirely.
  • Never trust input from anywhere. Database content, third-party APIs, your own admin panel — all can be sources of XSS payloads.
  • Encode at the boundary, not the storage. Store raw user input; encode only when rendering. Lets you change templates and re-render correctly.
  • Pick the right encoder for the context. HTML escape, JS escape, URL encode, CSS escape — they're all different. OWASP's library handles them all.
  • Layer defenses with CSP. Even if XSS slips through, a tight Content Security Policy blocks inline-script execution.
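  • Sanitize, don't escape, when allowing some HTML. If users paste rich content, use a maintained sanitizer such as DOMPurify (JS) to strip dangerous tags. Bleach (Python) fills the same role, though its maintainers have marked it deprecated, so check its status first. Don't try to write a regex sanitizer.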
  • Don't HTML-escape data going into JSON. JSON has different rules — use JSON.stringify or your language's JSON library.
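
The "encode at the boundary" rule in the list above is easiest to appreciate by watching what happens when it is broken. The sketch below uses Python's html.escape with a made-up sample string, and shows how escaping once at render time differs from escaping before storage and again at render.

```python
import html

raw = "Tom & Jerry <3"           # what the database should hold

# Escape once, at render time.
rendered = f"<li>{html.escape(raw)}</li>"

# Escaping before storage and again at render time double-encodes:
stored_escaped = html.escape(raw)
double = html.escape(stored_escaped)

print(rendered)
print(double)
```

A browser decodes entities only once, so the double-encoded string would show up on the page as visible &amp; and &lt; sequences instead of the original text.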

How to use the HTML entity encoder

Convert special characters to HTML entities (&lt;, &amp;, &quot;) so they render as text inside HTML instead of being parsed as markup. This is essential for displaying user-submitted content safely (XSS prevention) and for embedding code samples in blog posts. For text headed into JavaScript strings, URLs, or CSS, use that context's own escaping instead.

Frequently Asked Questions

What are HTML entities and why are they used?
HTML entities are special codes that represent characters which have reserved meaning in HTML (like <, >, &) or characters not available on a standard keyboard (like ©). They follow the format &name; or &#number;. Without entities, browsers would misinterpret these characters as HTML tags or markup, breaking the page layout or creating security vulnerabilities.
How does HTML encoding prevent XSS attacks?
Cross-Site Scripting (XSS) attacks inject malicious HTML or JavaScript into web pages. HTML encoding converts dangerous characters like < and > into their entity equivalents (&lt; and &gt;), so the browser renders them as visible text instead of executing them as code. This is a critical defense: always encode user-generated content before inserting it into HTML, using the encoder that matches the output context. XSS has long featured in the OWASP Top 10, where it has been grouped under the Injection category since the 2021 edition.
What is the difference between HTML encoding and URL encoding?
HTML encoding converts characters to HTML entities (& → &amp;) for safe display inside HTML documents. URL encoding converts characters to percent-encoded format (space → %20) for safe use in URLs. They serve different purposes: HTML encoding prevents markup injection in web pages, URL encoding ensures special characters are transmitted correctly in URLs. They are not interchangeable.
Should I use named or numeric HTML entities?
Named entities like &amp; are more human-readable but not every character has a named version. Numeric entities — decimal (&#38;) or hexadecimal (&#x26;) — work for any Unicode character. For common entities (&lt;, &gt;, &amp;), use named versions for readability. For uncommon or Unicode characters, use numeric entities. Both are valid in all modern browsers and HTML5.
Which character encoding should I use — UTF-8 or ASCII?
UTF-8 — always. The W3C and WHATWG HTML5 spec both recommend UTF-8 as the default character encoding. UTF-8 supports all Unicode characters (emoji, CJK, Arabic, etc.) while remaining backward-compatible with ASCII. Set <meta charset="UTF-8"> in your HTML <head>. Over 98% of websites use UTF-8 as of 2026.
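
As a small illustration of how the character encoding and entities interact, the sketch below encodes the same made-up sample string two ways in Python: directly as UTF-8, and as pure ASCII with numeric references substituted by the built-in "xmlcharrefreplace" error handler.

```python
text = "café © 2026"

# With UTF-8 the characters are written directly; no entities are needed.
utf8_bytes = text.encode("utf-8")

# If the output must be pure ASCII, Python's "xmlcharrefreplace" error
# handler substitutes decimal numeric references for anything non-ASCII.
ascii_bytes = text.encode("ascii", errors="xmlcharrefreplace")

print(utf8_bytes)
print(ascii_bytes)
```

Note that this handler only replaces characters that ASCII cannot represent; it does not escape <, >, or &, so it is not a substitute for HTML escaping.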
