What counts as one SMS message?

A single SMS fits 160 characters using GSM-7 encoding (7 bits per character). Any character from the extended GSM-7 table (like the Euro sign) consumes two slots. Using UCS-2 encoding (for emoji or non-Latin scripts) drops the limit to 70 characters per message. Longer messages are split into segments that each carry a 6-byte User Data Header (UDH), which reduces each segment to 153 GSM-7 chars or 67 UCS-2 chars.

How are words counted?

Words are sequences of non-whitespace characters separated by whitespace (spaces, tabs, newlines). Multiple spaces count as a single separator. Numbers, hyphenated compounds, and contractions count as single words. This matches most word processors. Note that CJK text (Chinese, Japanese, Korean) has no whitespace word boundaries, so the count there reflects whitespace-delimited chunks of text, not linguistic words.
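As a minimal sketch of the whitespace rule described above, the following Python snippet counts words with `str.split()`, which splits on any run of Unicode whitespace. The function name `count_words` is an illustrative choice, and this is not necessarily how the tool itself is implemented.

```python
def count_words(text: str) -> int:
    """Count whitespace-separated tokens; runs of whitespace act as one separator."""
    return len(text.split())


# Hyphenated compounds, contractions, and numbers each stay one token.
print(count_words("well-known  don't\t42\nend"))  # 4

# CJK text without spaces is a single whitespace-delimited chunk.
print(count_words("今日は天気がいいですね"))  # 1

# Empty or whitespace-only input has no words.
print(count_words("   "))  # 0
```

The second call shows the CJK caveat in practice: a whole sentence registers as one "word" because it contains no whitespace.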

Why do UTF-8 bytes differ from characters?

UTF-8 encodes each code point as 1 to 4 bytes. ASCII characters (A–Z, 0–9, basic punctuation) use 1 byte. Most accented Latin, Greek, and Cyrillic glyphs use 2 bytes. Most CJK ideographs use 3 bytes, and supplementary characters like emoji or uncommon CJK extensions use 4 bytes. Database varchar columns, HTTP headers, and JSON payload sizes are measured in bytes, not characters, so the byte count is what affects storage and network cost.
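The byte widths above can be checked directly. This is a small Python sketch; the sample characters and the helper name `utf8_width` are illustrative choices, and the accented letter is written as an escape so it is the precomposed form.

```python
def utf8_width(ch: str) -> int:
    """Number of bytes a single code point occupies in UTF-8."""
    return len(ch.encode("utf-8"))


samples = {
    "ASCII letter": "A",
    "accented Latin": "\u00e9",  # é, precomposed
    "Cyrillic": "\u0416",        # Ж
    "CJK ideograph": "\u4e2d",   # 中
    "emoji": "\U0001F680",       # 🚀
}

for label, ch in samples.items():
    print(f"{label}: {utf8_width(ch)} byte(s)")

# Code points and bytes diverge as soon as non-ASCII text appears.
text = "na\u00efve caf\u00e9 \U0001F680"
print(len(text), "code points,", len(text.encode("utf-8")), "UTF-8 bytes")  # 12 and 17
```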

How accurate are the reading and speaking time estimates?

Reading time uses 200 words per minute, a round figure within the commonly cited 175–250 wpm silent-reading range for adults on screens. Speaking time uses 130 wpm, a relaxed pace typical of presentations and conference talks; polished podcast or broadcast delivery often runs faster. Actual numbers vary: dense technical content trends slower, conversational copy trends faster, and non-native readers typically sit at 100–150 wpm. Use the estimate as a starting point, then calibrate with your own test reads.

How are characters counted with emoji?

Depends on the counter. JavaScript string.length counts UTF-16 code units — a basic emoji is 2, a flag is 4. Grapheme cluster counting treats every emoji as 1. Twitter, SMS billing, and database char-limit checks use different rules — always test with real emoji input.

What is the optimal meta description length?

120-155 characters is the safe zone for Google desktop SERP. Google truncates descriptions at about 158 characters on desktop and 120 on mobile. Going under 120 wastes preview space; going over 158 risks truncation mid-sentence.

Why does an emoji count as 2 characters?

Most emoji are stored as surrogate pairs in UTF-16 — two 16-bit code units representing a single Unicode code point above U+FFFF. JavaScript and Java string length functions return code units, not characters as users see them. Use Intl.Segmenter for accurate human-perceived counts.
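To make the surrogate-pair arithmetic concrete, here is a hedged Python sketch. Python strings count code points, so the UTF-16 length is obtained by encoding; the helper names `utf16_units` and `surrogate_pair` are illustrative and not part of any standard API.

```python
from typing import Tuple


def utf16_units(text: str) -> int:
    """Length as JavaScript's .length or Java's .length() would report it."""
    return len(text.encode("utf-16-le")) // 2


def surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a code point above U+FFFF into its high and low surrogates."""
    if code_point <= 0xFFFF:
        raise ValueError("BMP code points are not encoded as surrogate pairs")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


joy = "\U0001F602"  # face with tears of joy
high, low = surrogate_pair(ord(joy))
print(len(joy), utf16_units(joy), hex(high), hex(low))  # 1 2 0xd83d 0xde02

flag = "\U0001F1EF\U0001F1F5"  # two regional-indicator code points (J + P)
print(utf16_units(flag))  # 4
```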

Character Counter — Words, Bytes, Tweet/SMS

Character counting is harder than it looks because emoji, accented letters, and combining marks can span multiple Unicode code points, and JavaScript's string.length returns UTF-16 code units, not what users see. A face-with-tears-of-joy emoji is 1 grapheme cluster and 1 code point but 2 UTF-16 code units. This free character counter shows both: code units (what JS sees) and grapheme clusters (what users see), which is critical for Twitter limits, SMS billing, and database column constraints.

Examples

ASCII vs emoji
"hello" = 5 grapheme clusters, 5 UTF-16 code units
"hello 👋" = 7 grapheme clusters, 8 UTF-16 code units
"👨‍👩‍👧‍👦 family" = 8 grapheme clusters, 18 UTF-16 code units
The family emoji is one visible character built from 4 emoji joined by 3 zero-width joiners.

Twitter character limit
Twitter counts weighted units rather than code points or UTF-16 code units. Every emoji costs 2 units, so a face-with-tears-of-joy counts as 2, and so does a multi-part family emoji.

SMS billing
Carriers charge per 160-character GSM-7 SMS or per 70-character UCS-2 SMS (when any emoji or non-Latin character is present). One emoji can cut your effective SMS length by more than half.

What does "character count" actually mean?

"Character count" sounds simple — count the letters — but it's deceptively complex once you account for Unicode, emoji, and platform-specific quirks. The same 30-letter sentence can register as 30, 60, or even 90 "characters" depending on whether you measure code points, UTF-16 code units, UTF-8 bytes, or graphemes (visible character clusters). Picking the right metric depends on where the text is going.

Metric	What it counts	When to use
Code points (Unicode chars)	Each Unicode character, regardless of byte size	Most "human" character counts — Word, Google Docs, blog editors
UTF-16 code units	JavaScript's `.length` — counts each 16-bit unit	Default in JavaScript, Java, .NET. Emoji often count as 2.
UTF-8 bytes	Bytes when stored in UTF-8	Database column sizing, network payloads, file sizes
Graphemes (visible chars)	What humans see as "one character"	Cursor positioning, deletion behavior, what users intuitively expect
Words	Sequences separated by whitespace	Reading time, content sizing, Word/Docs counts

Take the family emoji 👨‍👩‍👧‍👦. To a human, it's one character. It consists of 7 code points (4 emoji joined by 3 zero-width joiners). JavaScript's "👨‍👩‍👧‍👦".length returns 11 (UTF-16 code units). Stored as UTF-8, it's 25 bytes. Twitter counts it as 2 weighted units. All five answers are technically correct, just for different questions.
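The numbers for the family emoji can be reproduced with a few lines of Python. This sketch covers code points, UTF-16 code units, and UTF-8 bytes only. The standard library has no grapheme segmentation, so the grapheme count is left out here, and the function name `count_metrics` is illustrative.

```python
def count_metrics(text: str) -> dict:
    """Return three of the metrics from the table for one string."""
    return {
        "code_points": len(text),
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        "utf8_bytes": len(text.encode("utf-8")),
    }


# man, ZWJ, woman, ZWJ, girl, ZWJ, boy
family = "\U0001F468\u200d\U0001F469\u200d\U0001F467\u200d\U0001F466"
print(count_metrics(family))
# {'code_points': 7, 'utf16_code_units': 11, 'utf8_bytes': 25}
```

Each of the 4 emoji is 2 code units and 4 bytes, and each of the 3 joiners is 1 code unit and 3 bytes, which is where 11 and 25 come from.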

Platform-specific character limits in 2026

Platform	Limit	Counting rules
Twitter / X	280 (Free) / 25,000 (Premium)	Latin-range chars = 1, CJK + emoji = 2. URLs always 23 (t.co shortener).
SMS (GSM-7)	160 single / 153 per segment when concatenated	Latin alphabet only. Extended chars (€, [, ], ~) consume 2 slots.
SMS (UCS-2)	70 single / 67 per segment when concatenated	Triggered by any non-GSM-7 char (emoji, Cyrillic, Arabic, Chinese)
Email subject (RFC 5322)	998 hard / ~78 recommended	78 chars before wrapping; mobile previews truncate ~50.
Email body line	998 hard / 78 recommended	RFC 5322 §2.1.1. Many MTAs reject > 998.
HTML `<title>`	~60 chars visible in SERPs	Google truncates around 580 pixels (≈55–60 chars on desktop)
Meta description	~120 chars mobile / ~158 desktop	Truncated with "…" beyond.
Open Graph title	60 visible / 70 truncate	Facebook, LinkedIn, Slack previews
Open Graph description	200 visible / 297 truncate	Slack collapses to 1 line; Facebook expands.
YouTube title	100 char limit / 70 visible	Mobile truncates around 50.
YouTube description	5,000 chars total / first 157 visible "above the fold"	Use the first 157 wisely — that's the SERP snippet too.
Instagram caption	2,200 chars	Truncated with "...more" after 125 chars.
LinkedIn post	3,000 chars	Truncated after 210 chars on feed; click "see more" to expand.
Reddit title	300 chars	Most subreddits enforce shorter custom limits.
HN title	80 chars	Hard limit. Concise wins.
Slack message	40,000 chars	Effectively unlimited; mobile collapses long messages.
WhatsApp message	65,536 chars	Effectively unlimited.

Tweet character counting — the surprising rules

Twitter/X's "280 character" limit isn't 280 raw characters. It's 280 weighted units after Unicode NFC normalization, with two important rules:

Most characters = 1 weight. Latin letters, digits, common punctuation, and ASCII symbols such as $. Emoji are not in this group: each one counts as 2, including sequences with skin-tone modifiers.
CJK characters = 2 weight. Chinese, Japanese, Korean characters in specific Unicode blocks count double. This is intentional — they convey more information per glyph.
URLs = 23 weight always. Twitter wraps every URL in its t.co shortener, so https://example.com/an/extremely/long/url?with=params still costs only 23 units. This is why URL placement strategy matters.
Mentions and hashtags = standard char count. @username is 9 chars — no discount.

Tweet hack: writing in CJK is a 2× character cost in the counter, but a ~3–4× content-density gain in actual semantic information. For multilingual brands, mixing scripts can pack more meaning into 280.
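The weighting rules above can be approximated in a short Python sketch. Several things here are assumptions rather than confirmed behavior: the URL pattern is a deliberate simplification, the weight-1 code point ranges are taken from the open-source twitter-text configuration as commonly documented and should be verified against the current version, and emoji sequences (ZWJ families, skin-tone modifiers) are not parsed as single emoji, so this sketch overcounts them.

```python
import re
import unicodedata

URL_WEIGHT = 23
# Simplified URL matcher (assumption): real URL detection is far more involved.
URL_PATTERN = re.compile(r"https?://\S+")

# Code point ranges weighted 1; everything else is weighted 2 (assumed values).
LIGHT_RANGES = (
    (0x0000, 0x10FF),
    (0x2000, 0x200D),
    (0x2010, 0x201F),
    (0x2032, 0x2037),
)


def char_weight(ch: str) -> int:
    """Weight of a single code point under the assumed ranges."""
    cp = ord(ch)
    return 1 if any(low <= cp <= high for low, high in LIGHT_RANGES) else 2


def tweet_weight(text: str) -> int:
    """Approximate weighted length: NFC normalize, fixed URL cost, per-char weights."""
    normalized = unicodedata.normalize("NFC", text)
    url_count = len(URL_PATTERN.findall(normalized))
    remainder = URL_PATTERN.sub("", normalized)
    return URL_WEIGHT * url_count + sum(char_weight(ch) for ch in remainder)


print(tweet_weight("hello"))               # 5
print(tweet_weight("\u65e5\u672c\u8a9e"))  # 6 (three CJK characters)
print(tweet_weight("see https://example.com/an/extremely/long/url?with=params"))  # 27
```

For production use, the official twitter-text libraries are the safer choice, since they handle emoji sequences and URL edge cases that this sketch ignores.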

SMS character counting — GSM-7, UCS-2, and the segment break

SMS is even trickier than tweets. The 160-character limit comes from a 1980s decision: a 140-byte payload holds 140 × 8 = 1,120 bits, and at 7 bits per character that is 1,120 ÷ 7 = 160 chars per single message.

GSM-7 (default for Latin-alphabet languages)

The standard SMS encoding fits exactly 160 chars in one SMS. But it has a quirky character set — only 128 chars from a specific Latin/Greek table. Common symbols like €, [, ], {, }, ~, \, | aren't in the basic set; they live in the extended table and consume 2 slots each. So a message with 80 letters + 5 Euro signs = 80 + 10 = 90 effective chars.
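A hedged Python sketch of the slot counting described above. The character tables are typed from the GSM 03.38 default alphabet and its extension table and should be checked against the specification before being relied on; `gsm7_septets` is an illustrative name. The basic table is NFC-normalized so the accented letters compare reliably however this source file was saved.

```python
import unicodedata
from typing import Optional

# GSM 03.38 basic alphabet (the ESC slot is omitted because it is not a character).
GSM7_BASIC = frozenset(unicodedata.normalize("NFC", (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)))
# Extension table: each of these is sent as ESC + character, so it costs 2 slots.
GSM7_EXTENDED = frozenset("^{}\\[~]|€\f")


def gsm7_septets(text: str) -> Optional[int]:
    """Return the number of 7-bit slots needed, or None if GSM-7 cannot encode the text."""
    total = 0
    for ch in text:
        if ch in GSM7_BASIC:
            total += 1
        elif ch in GSM7_EXTENDED:
            total += 2
        else:
            return None  # the message would have to switch to UCS-2
    return total


print(gsm7_septets("a" * 80 + "\u20ac" * 5))  # 90 (80 letters + 5 Euro signs)
print(gsm7_septets("[ok]"))                   # 6
print(gsm7_septets("hello \U0001F44B"))       # None
```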

UCS-2 (when GSM-7 isn't enough)

Any character outside the GSM-7 table (a single emoji, a Cyrillic letter, a Chinese char, an em dash) automatically switches the entire message to UCS-2 — 2 bytes per char — capping single messages at 70 chars. One emoji can drop your limit from 160 to 70 in a single message.

Concatenated messages

Long messages get split into segments with a 6-byte User Data Header (UDH). Each segment fits 153 GSM-7 chars or 67 UCS-2 chars. The receiver's phone reassembles them. Billing is per segment — sending a 200-char SMS = 2 segments = 2× the cost.
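The segment limits follow from the payload arithmetic, which this Python sketch derives and then applies. It takes an already-computed unit count (7-bit slots for GSM-7, 16-bit code units for UCS-2). It ignores one edge case: a two-slot extended character or a surrogate pair cannot straddle a segment boundary, so real encoders may occasionally need one more segment. The names are illustrative.

```python
import math

PAYLOAD_BYTES = 140  # user data carried by one SMS
UDH_BYTES = 6        # User Data Header on each part of a concatenated message


def capacity(payload_bytes: int, bits_per_unit: int) -> int:
    """How many whole units of the given width fit in the payload."""
    return (payload_bytes * 8) // bits_per_unit


# encoding -> (single-message limit, per-segment limit when concatenated)
LIMITS = {
    "GSM-7": (capacity(PAYLOAD_BYTES, 7), capacity(PAYLOAD_BYTES - UDH_BYTES, 7)),
    "UCS-2": (capacity(PAYLOAD_BYTES, 16), capacity(PAYLOAD_BYTES - UDH_BYTES, 16)),
}


def sms_segments(units: int, encoding: str) -> int:
    """Number of billable segments for a message of the given length."""
    single, per_segment = LIMITS[encoding]
    if units == 0:
        return 0
    if units <= single:
        return 1
    return math.ceil(units / per_segment)


print(LIMITS)                      # {'GSM-7': (160, 153), 'UCS-2': (70, 67)}
print(sms_segments(200, "GSM-7"))  # 2
print(sms_segments(71, "UCS-2"))   # 2
```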

Reading and speaking time math

Activity	Average rate	Use case
Silent reading (adult, English prose)	200–250 wpm	Blog posts, articles. Use 200 wpm for skim-friendly content.
Silent reading (technical content)	50–125 wpm	Code-heavy or jargon-dense docs. Slower because comprehension matters.
Speaking (conversational)	120–150 wpm	Conversation, presentations
Speaking (broadcast / podcast)	140–160 wpm	Pro speakers; what most podcasts trend toward
Speaking (news anchor)	150–170 wpm	Faster than conversational; trained pace
Speaking (auctioneer)	250+ wpm	The extreme; not useful for content estimation

Practical estimates: 3-minute blog post = ~600 words. 5-minute podcast monologue = ~700 words. 10-minute conference talk = ~1,300 words spoken at relaxed pace. Use these to size content before writing.
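The estimates above are simple division. A minimal Python sketch, with illustrative function names and the rates from the table passed in as parameters, could look like this:

```python
def minutes_needed(word_count: int, wpm: int) -> float:
    """Raw estimate in minutes at the given words-per-minute rate."""
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    return word_count / wpm


def friendly_label(word_count: int, wpm: int = 200, noun: str = "read") -> str:
    """Round to whole minutes (minimum 1) so the label does not imply false precision."""
    minutes = max(1, round(minutes_needed(word_count, wpm)))
    return f"{minutes}-min {noun}"


print(friendly_label(600))                          # 3-min read
print(friendly_label(956))                          # 5-min read (4.78 minutes rounded)
print(friendly_label(1300, wpm=130, noun="talk"))   # 10-min talk
print(minutes_needed(700, 140))                     # 5.0
```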

Counting characters in 8 programming languages

JavaScript

javascript

// Naive — counts UTF-16 code units, NOT graphemes
"hello".length;          // 5  ✓
"👨‍👩‍👧‍👦".length;          // 11 ✗ (one family, 11 code units)

// Correct grapheme count (modern browsers)
[...new Intl.Segmenter().segment("👨‍👩‍👧‍👦")].length;  // 1  ✓

// UTF-8 byte count
new TextEncoder().encode("café").length;  // 5 (é = 2 bytes)

// Word count (whitespace-separated tokens)
const text = "  hello   wide world ";
text.trim().split(/\s+/).filter(Boolean).length;  // 3

Python

python

# Code-point count (Python 3 default — usually what you want)
len("café")          # 4  ✓

# UTF-8 byte count
len("café".encode("utf-8"))  # 5

# Grapheme count (uses regex package, not stdlib)
import regex
len(regex.findall(r'\X', "👨‍👩‍👧‍👦"))  # 1

# Word count (split() with no argument collapses runs of whitespace)
text = "hello   wide world"
len(text.split())  # 3

PHP

php

// strlen() returns BYTES (not chars) — common bug for UTF-8
strlen("café");          // 5  ✗ (counts bytes)

// mb_strlen — code-point count
mb_strlen("café", "UTF-8");  // 4  ✓

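// Word count: str_word_count() is byte-oriented and can miscount UTF-8 text
str_word_count($text);

// Whitespace-based word count that is safe for UTF-8 input
count(preg_split('/\s+/u', $text, -1, PREG_SPLIT_NO_EMPTY));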

Go

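go

package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	s := "café"

	// len() returns BYTES on strings
	fmt.Println(len(s)) // 5  (bytes)

	// Rune count (code points)
	fmt.Println(utf8.RuneCountInString(s)) // 4  ✓

	// Range over runes: i is the byte offset, r is the code point
	for i, r := range s {
		fmt.Printf("%d:%c ", i, r)
	}
}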

Rust

rust

// .len() returns bytes
"café".len();                        // 5

// .chars().count() — code points
"café".chars().count();              // 4 ✓

// Grapheme count via unicode-segmentation crate
use unicode_segmentation::UnicodeSegmentation;
"👨‍👩‍👧‍👦".graphemes(true).count();    // 1

Java

java

// .length() returns UTF-16 code units
"café".length();              // 4
"👨‍👩‍👧‍👦".length();              // 11

// Code-point count
"👨‍👩‍👧‍👦".codePointCount(0, "👨‍👩‍👧‍👦".length());  // 7

// UTF-8 byte count (needs: import java.nio.charset.StandardCharsets;)
"café".getBytes(StandardCharsets.UTF_8).length;  // 5

Ruby

ruby

"café".length              # 4 (chars in default encoding)
"café".bytesize            # 5 (bytes)
"café".chars.length        # 4

# Grapheme cluster count (Ruby 2.5+)
"👨‍👩‍👧‍👦".grapheme_clusters.length  # 1

Bash

bash

# wc — word/char/byte/line count
echo "hello world" | wc -c    # bytes (12 — includes newline!)
echo -n "hello world" | wc -c # bytes (11)
echo -n "hello world" | wc -m # chars (locale-aware)
echo "hello world" | wc -w    # words (2)
echo -e "line1\nline2" | wc -l # lines (2)

# String length in bash
str="café"
echo "${#str}"                # 4 (chars, in a UTF-8 locale)

Best character counter for 2026 — what to compare

Search results for "character counter online", "word counter", and "tweet character count" return many tools but most fail on real-world counts: they count UTF-16 code units instead of characters (so an emoji counts as 2), they ignore Twitter's CJK double-counting rule, or they don't surface SMS GSM-7 vs UCS-2 segment math. Here's how the most-used counters compare in 2026:

Tool	Unicode-correct	Tweet rule + CJK	SMS GSM-7 / UCS-2	Reading time	Cost
FreeDevTool Character Counter	NFC normalized + grapheme clusters	Yes (with CJK 2-unit rule)	Both with segment count	200 WPM read + 130 WPM speak	Free
charactercount.online	Code units only	No	Generic only	Yes	Free, ad-funded
wordcounter.net	Code units	No	No	Yes	Free, ad-heavy
twittercount.com	Tweet-specific	Yes	No	No	Free
Microsoft Word "Word Count"	Code units	No	No	Yes	Built into Office

How do I count characters for a tweet correctly (with CJK and emoji)?

Twitter's character count is NOT a simple JavaScript str.length. The rules: 1) apply Unicode NFC normalization first; 2) Latin-script letters, digits, and common punctuation count as 1 unit each; 3) Chinese, Japanese, and Korean characters and emoji count as 2 units (so a single Chinese ideograph eats 2 of your 280 budget); 4) URLs are always counted at a fixed 23-character t.co length regardless of original length, even though the visible URL stays full. This counter applies all four rules. Paste any tweet draft and the displayed count matches what twitter.com will count. Most generic counters get this wrong by 20-50% on multi-script content.

What's the difference between UTF-8 bytes, characters, and grapheme clusters?

Three distinct quantities are frequently confused:
UTF-8 bytes. How the text is encoded on disk or in transit. ASCII = 1 byte, accented Latin = 2 bytes, most CJK = 3 bytes, emoji = 4 bytes per code point. Database varchar columns and HTTP payloads measure this.
Characters (code points). Abstract Unicode characters. The letter "é" is 1 character but 2 bytes in UTF-8.
Grapheme clusters. What humans perceive as one character. The emoji 👨‍👩‍👧 (family) is 1 grapheme cluster, 5 code points, 18 UTF-8 bytes.
Different systems measure different things: databases often count bytes, Twitter counts weighted units, and JavaScript's str.length counts UTF-16 code units (a surrogate-pair emoji like 🚀 reads as 2). This counter shows all three so you can match whichever your downstream system measures.

Character counter alternative to wordcounter.net — 4 reasons writers switched

Unicode-correct counting. Emoji 🚀 counts as 1 grapheme (correct), not 2 (UTF-16 code units). Critical for any social, marketing, or i18n copy work.
Platform limit progress bars. Tweet (280), SMS (160/70), Meta description (155), SEO title (60), Google Ads headline (30), Instagram caption (2,200) — all visible simultaneously with overflow warnings.
SMS GSM-7 vs UCS-2 segment math. Drop a single emoji into an SMS and it switches from 160-char GSM-7 to 70-char UCS-2 — multi-segment cost balloons. This counter shows the segment count + per-segment cost in real time.
No ads, no popups, no upload. Tools indexed for "character counter online" almost universally inject ads. This page is browser-only and persists nothing.

Pair the character counter with the Lorem Ipsum Generator for placeholder copy, the Case Converter for naming-convention transforms, the String Escape Tool for character-level transformations, and the Code & Text Tools hub for the broader text toolkit.

Character counter best practices

Pick the right metric for the destination. Twitter wants weighted units; SMS wants GSM-7 slots or UCS-2 code units; databases want UTF-8 bytes; humans want graphemes.
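For UTF-8 columns, size by bytes. A column capped at 255 bytes can hold 255 ASCII chars or 63 four-byte emoji. Check whether your database measures "VARCHAR(255)" in bytes or in characters, and plan accordingly.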
Test with real-world content. Your average user has at least one accented character or emoji somewhere. Lorem ipsum doesn't catch encoding bugs.
Beware of String.length in JavaScript and Java. Both count UTF-16 code units, not characters. Use Intl.Segmenter or grapheme libraries for user-facing counts.
Count URLs at their fixed weight in tweets. Each URL costs 23 units regardless of length, so a draft with a long URL has more room than the raw count suggests.
Reading-time estimates are rough. Use 200 wpm as a default; show "5-min read" not "4 min 47 sec." Precision implies false confidence.
For SMS, test with a single emoji. One emoji drops your limit from 160 to 70. Marketing messages are often optimized for GSM-7 only.
Display "X / 280" not just "X". Users want to see the limit too. Color the counter red as it approaches the limit.
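For the byte-sizing advice above, here is a hedged Python sketch that checks a byte budget and truncates without cutting a code point in half. The function names are illustrative. It can still split a multi-code-point grapheme such as a ZWJ emoji sequence, so user-facing truncation may need a grapheme-aware library on top.

```python
def fits_in_bytes(text: str, max_bytes: int) -> bool:
    """True if the UTF-8 encoding of text fits in a byte-limited column."""
    return len(text.encode("utf-8")) <= max_bytes


def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut text to at most max_bytes of UTF-8 without splitting a code point."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops the incomplete multi-byte sequence left at the cut.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


rockets = "\U0001F680" * 64              # 64 emoji = 256 bytes
print(fits_in_bytes(rockets, 255))       # False
print(len(truncate_utf8(rockets, 255)))  # 63
print(len(truncate_utf8("a" * 300, 255)))  # 255
```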

