
Character & String Length Counter Online

Character count, instantly — paste any text and watch live totals for characters (with and without spaces), words, lines, paragraphs, sentences, and raw UTF-8 bytes. Built-in progress bars show how close you are to the Twitter/X 280-char limit, the SMS GSM-7 160-char limit, and the 78-character email subject sweet spot. Reading-time (200 wpm) and speaking-time (130 wpm) estimates help size blog posts, scripts, and podcast outlines. Top five most-frequent words, average word length, and longest word help spot filler and repetition. Runs in your browser — no upload.

Last updated: May 2026

What does "character count" actually mean?

"Character count" sounds simple — count the letters — but it's deceptively complex once you account for Unicode, emoji, and platform-specific quirks. The same 30-letter sentence can register as 30, 60, or even 90 "characters" depending on whether you measure code points, UTF-16 code units, UTF-8 bytes, or graphemes (visible character clusters). Picking the right metric depends on where the text is going.

Metric | What it counts | When to use
Code points (Unicode chars) | Each Unicode character, regardless of byte size | Most "human" character counts: Word, Google Docs, blog editors
UTF-16 code units | JavaScript's .length, which counts each 16-bit unit | Default in JavaScript, Java, .NET. Emoji often count as 2.
UTF-8 bytes | Bytes when stored in UTF-8 | Database column sizing, SMS billing, network payloads
Graphemes (visible chars) | What humans see as "one character" | Cursor positioning, deletion behavior, what users intuitively expect
Words | Sequences separated by whitespace | Reading time, content sizing, Word/Docs counts

Take the family emoji 👨‍👩‍👧‍👦. To a human, it's one character. It is 7 code points (four people joined by three zero-width joiners). JavaScript's "👨‍👩‍👧‍👦".length returns 11 (UTF-16 code units). Stored as UTF-8, it's 25 bytes. Twitter counts it as 2 weighted units. All five answers are technically correct, just for different questions.
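
As a rough illustration of the machine-level metrics above, the sketch below measures the same family emoji three ways using only the Python standard library. The helper name count_metrics is hypothetical. Grapheme counting is left out because it needs a third-party package in Python.

```python
# Family emoji written with explicit escapes:
# four person emoji joined by three zero-width joiners (U+200D).
FAMILY = "\U0001F468\u200D\U0001F469\u200D\U0001F467\u200D\U0001F466"


def count_metrics(text: str) -> dict:
    """Return three machine-level length metrics for the same string."""
    return {
        "code_points": len(text),                           # Python counts code points
        "utf16_units": len(text.encode("utf-16-le")) // 2,  # what JavaScript's .length reports
        "utf8_bytes": len(text.encode("utf-8")),            # storage / network size
    }


print(count_metrics(FAMILY))
# {'code_points': 7, 'utf16_units': 11, 'utf8_bytes': 25}
```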

Platform-specific character limits in 2026

Platform | Limit | Counting rules
Twitter / X | 280 (Free) / 25,000 (Premium) | Latin-script chars = 1, CJK + emoji = 2. URLs always 23 (t.co shortener).
SMS (GSM-7) | 160 single / 153 per segment when concatenated | Latin alphabet only. Extended chars (€, [, ], ~) consume 2 slots.
SMS (UCS-2) | 70 single / 67 per segment when concatenated | Triggered by any non-GSM-7 char (emoji, Cyrillic, Arabic, Chinese)
Email subject (RFC 5322) | 998 hard / ~78 recommended | 78 chars before wrapping; mobile previews truncate ~50.
Email body line | 998 hard / 78 recommended | RFC 5322 §2.1.1. Many MTAs reject > 998.
HTML <title> | ~60 chars visible in SERPs | Google truncates around 580 pixels (≈55–60 chars on desktop)
Meta description | ~155 chars mobile / 160 desktop | Truncated with "…" beyond.
Open Graph title | 60 visible / 70 truncate | Facebook, LinkedIn, Slack previews
Open Graph description | 200 visible / 297 truncate | Slack collapses to 1 line; Facebook expands.
YouTube title | 100 char limit / 70 visible | Mobile truncates around 50.
YouTube description | 5,000 chars total / first 157 visible "above the fold" | Use the first 157 wisely; they double as the SERP snippet.
Instagram caption | 2,200 chars | Truncated with "...more" after 125 chars.
LinkedIn post | 3,000 chars | Truncated after 210 chars on feed; click "see more" to expand.
Reddit title | 300 chars | Most subreddits enforce shorter custom limits.
HN title | 80 chars | Hard limit. Concise wins.
Slack message | 40,000 chars | Effectively unlimited; mobile collapses long messages.
WhatsApp message | 65,536 chars | Effectively unlimited.

Tweet character counting — the surprising rules

Twitter/X's "280 character" limit isn't 280 raw characters. It's 280 weighted units after Unicode NFC normalization, with four rules that matter:

  • Most Latin-script characters = 1 weight. Latin letters, digits, and common punctuation. In the open-source twitter-text configuration, this covers code points up to U+10FF plus a few general-punctuation ranges.
  • CJK characters and emoji = 2 weight. Chinese, Japanese, and Korean characters count double, which is intentional because they convey more information per glyph. An emoji also counts as 2, including multi-part sequences such as skin-tone variants.
  • URLs = 23 weight always. Twitter wraps every URL in its t.co shortener, so https://example.com/an/extremely/long/url?with=params still costs only 23 units. This is why URL placement strategy matters.
  • Mentions and hashtags = standard char count. @username is 9 chars — no discount.
Tweet hack: writing in CJK is a 2× character cost in the counter, but a ~3–4× content-density gain in actual semantic information. For multilingual brands, mixing scripts can pack more meaning into 280.
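
The weighting rules above can be sketched in a few lines. This is a simplified approximation, not Twitter's implementation: the weight-1 ranges are an assumption modeled on the open-source twitter-text configuration, the URL pattern is deliberately naive, and emoji sequences are weighted per code point rather than as a single 2-unit emoji. The function name weighted_tweet_length is hypothetical.

```python
import re
import unicodedata

# Assumed weight-1 code point ranges (modeled on the twitter-text config);
# everything outside them is treated as weight 2.
LIGHT_RANGES = [(0x0000, 0x10FF), (0x2000, 0x200D), (0x2010, 0x201F), (0x2032, 0x2037)]
URL_WEIGHT = 23                       # every URL costs a fixed 23 units
URL_RE = re.compile(r"https?://\S+")  # naive matcher, for illustration only


def weighted_tweet_length(text: str) -> int:
    """Approximate a tweet's weighted length (ignores emoji-sequence grouping)."""
    text = unicodedata.normalize("NFC", text)
    total = URL_WEIGHT * len(URL_RE.findall(text))
    for ch in URL_RE.sub("", text):
        cp = ord(ch)
        is_light = any(lo <= cp <= hi for lo, hi in LIGHT_RANGES)
        total += 1 if is_light else 2
    return total


print(weighted_tweet_length("@username"))  # 9
print(weighted_tweet_length("日本語"))      # 6
print(weighted_tweet_length("see https://example.com/an/extremely/long/url?with=params"))  # 27
```

For production use, a maintained port of twitter-text is the safer choice, since it also handles emoji sequences and real URL detection.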

SMS character counting — GSM-7, UCS-2, and the segment break

SMS is even trickier than tweets. The 160-character limit comes from a 1980s decision: each SMS carries 140 bytes of payload, and 140 bytes × 8 bits ÷ 7 bits per character = 160 chars per single message.

GSM-7 (default for Latin-alphabet languages)

The standard SMS encoding fits exactly 160 chars in one SMS. But it has a quirky character set: only 128 chars from a specific Latin/Greek table. Common symbols like €, [, ], {, }, ~, ^, \, | aren't in the basic set; they live in the extended table and consume 2 slots each. So a message with 80 letters + 5 Euro signs = 80 + 10 = 90 effective chars.

UCS-2 (when GSM-7 isn't enough)

Any character outside the GSM-7 table (a single emoji, a Cyrillic letter, a Chinese char, an em dash) automatically switches the entire message to UCS-2 — 2 bytes per char — capping single messages at 70 chars. One emoji can drop your limit from 160 to 70 in a single message.
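
To make the GSM-7 versus UCS-2 switch concrete, here is a minimal sketch that picks the encoding and counts the units a message consumes. The character tables are written out from the GSM 03.38 default alphabet as best understood here, so verify them against the specification before relying on them. The function name sms_units is hypothetical.

```python
# GSM 03.38 basic character set (1 septet each); the ESC code itself is omitted.
GSM7_BASIC = frozenset(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension table characters (escape + code = 2 septets each).
GSM7_EXTENDED = frozenset("^{}\\[~]|€\f")


def sms_units(text: str) -> tuple:
    """Return (encoding, units): GSM-7 septets, or UCS-2 16-bit units."""
    septets = 0
    for ch in text:
        if ch in GSM7_BASIC:
            septets += 1
        elif ch in GSM7_EXTENDED:
            septets += 2
        else:
            # One unsupported character switches the whole message to UCS-2.
            return "UCS-2", len(text.encode("utf-16-be")) // 2
    return "GSM-7", septets


print(sms_units("a" * 80 + "€" * 5))  # ('GSM-7', 90)
print(sms_units("hi \U0001F680"))     # ('UCS-2', 5)
```

Note that an emoji outside the Basic Multilingual Plane takes two 16-bit units, so it costs 2 of the 70 available, not 1.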

Concatenated messages

Long messages get split into segments with a 6-byte User Data Header (UDH). Each segment fits 153 GSM-7 chars or 67 UCS-2 chars. The receiver's phone reassembles them. Billing is per segment — sending a 200-char SMS = 2 segments = 2× the cost.
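
The per-segment limits follow directly from the 140-byte payload and the 6-byte header, as this small sketch shows. It assumes the unit count (septets or UCS-2 units) is already known, and it ignores the edge case where a 2-septet extended character cannot straddle a segment boundary. The names capacity and segments are hypothetical.

```python
import math

SMS_PAYLOAD_BYTES = 140  # payload of one SMS
UDH_BYTES = 6            # User Data Header added to each concatenated segment


def capacity(encoding: str, concatenated: bool) -> int:
    """Units that fit in one segment: septets for GSM-7, 16-bit units for UCS-2."""
    usable = SMS_PAYLOAD_BYTES - (UDH_BYTES if concatenated else 0)
    if encoding == "GSM-7":
        return usable * 8 // 7
    if encoding == "UCS-2":
        return usable // 2
    raise ValueError(f"unknown encoding: {encoding}")


def segments(units: int, encoding: str) -> int:
    """Number of billed segments for a message of the given size."""
    if units <= capacity(encoding, concatenated=False):
        return 1
    return math.ceil(units / capacity(encoding, concatenated=True))


print(capacity("GSM-7", False), capacity("GSM-7", True))  # 160 153
print(capacity("UCS-2", False), capacity("UCS-2", True))  # 70 67
print(segments(200, "GSM-7"))                             # 2
```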

Reading and speaking time math

Activity | Average rate | Use case
Silent reading (adult, English prose) | 200–250 wpm | Blog posts, articles. Use 200 wpm for skim-friendly content.
Silent reading (technical content) | 50–125 wpm | Code-heavy or jargon-dense docs. Slower because comprehension matters.
Speaking (conversational) | 120–150 wpm | Conversation, presentations
Speaking (broadcast / podcast) | 140–160 wpm | Pro speakers; what most podcasts trend toward
Speaking (news anchor) | 150–170 wpm | Faster than conversational; trained pace
Speaking (auctioneer) | 250+ wpm | The extreme; not useful for content estimation

Practical estimates: 3-minute blog post = ~600 words. 5-minute podcast monologue = ~700 words. 10-minute conference talk = ~1,300 words spoken at relaxed pace. Use these to size content before writing.
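
The arithmetic behind these estimates is a single division or multiplication. The sketch below uses hypothetical helper names, and the rates are the ones quoted above rather than measured values.

```python
def estimate_minutes(word_count: int, wpm: int) -> float:
    """Minutes needed to read or speak word_count words at wpm words per minute."""
    return word_count / wpm


def word_budget(minutes: float, wpm: int) -> int:
    """Words that fit into a time slot at the given pace."""
    return round(minutes * wpm)


def label(word_count: int, wpm: int = 200) -> str:
    """Rounded, human-friendly label such as '5-min read' (never below 1 minute)."""
    return f"{max(1, round(estimate_minutes(word_count, wpm)))}-min read"


print(word_budget(3, 200))   # 600  (blog post at 200 wpm)
print(word_budget(5, 140))   # 700  (podcast monologue at 140 wpm)
print(word_budget(10, 130))  # 1300 (conference talk at 130 wpm)
print(label(957))            # 5-min read
```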

Counting characters in 8 programming languages

JavaScript

javascript
// Naive — counts UTF-16 code units, NOT graphemes
"hello".length;          // 5  ✓
"👨‍👩‍👧‍👦".length;          // 11 ✗ (one family, 11 code units)

// Correct grapheme count (modern browsers)
[...new Intl.Segmenter().segment("👨‍👩‍👧‍👦")].length;  // 1  ✓

// UTF-8 byte count
new TextEncoder().encode("café").length;  // 5 (é = 2 bytes)

// Word count
text.trim().split(/\s+/).filter(Boolean).length;

Python

python
# Code-point count (Python 3 default — usually what you want)
len("café")          # 4  ✓

# UTF-8 byte count
len("café".encode("utf-8"))  # 5

# Grapheme count (uses regex package, not stdlib)
import regex
len(regex.findall(r'\X', "👨‍👩‍👧‍👦"))  # 1

# Word count
len(text.split())

PHP

php
// strlen() returns BYTES (not chars) — common bug for UTF-8
strlen("café");          // 5  ✗ (counts bytes)

// mb_strlen — code-point count
mb_strlen("café", "UTF-8");  // 4  ✓

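// Word count. str_word_count() is byte- and locale-based, so it can miscount UTF-8 text
str_word_count($text);

// Whitespace split that is safe for UTF-8 input
count(preg_split('/\s+/u', $text, -1, PREG_SPLIT_NO_EMPTY));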

Go

go
// len() returns BYTES on strings
len("café")              // 5  (bytes)

// Rune count (code points)
import "unicode/utf8"
utf8.RuneCountInString("café")  // 4  ✓

// Range over runes
for _, r := range s { ... }

Rust

rust
// .len() returns bytes
"café".len();                        // 5

// .chars().count() — code points
"café".chars().count();              // 4 ✓

// Grapheme count via unicode-segmentation crate
use unicode_segmentation::UnicodeSegmentation;
"👨‍👩‍👧‍👦".graphemes(true).count();    // 1

Java

java
// .length() returns UTF-16 code units
"café".length();              // 4
"👨‍👩‍👧‍👦".length();              // 11

// Code-point count
"👨‍👩‍👧‍👦".codePointCount(0, "👨‍👩‍👧‍👦".length());  // 7

// UTF-8 byte count (needs: import java.nio.charset.StandardCharsets;)
"café".getBytes(StandardCharsets.UTF_8).length;  // 5

Ruby

ruby
"café".length              # 4 (chars in default encoding)
"café".bytesize            # 5 (bytes)
"café".chars.length        # 4

# Grapheme cluster count (Ruby 2.5+)
"👨‍👩‍👧‍👦".grapheme_clusters.length  # 1

Bash

bash
# wc — word/char/byte/line count
echo "hello world" | wc -c    # bytes (12 — includes newline!)
echo -n "hello world" | wc -c # bytes (11)
echo -n "hello world" | wc -m # chars (locale-aware)
echo "hello world" | wc -w    # words (2)
echo -e "line1\nline2" | wc -l # lines (2)

# String length in bash
str="café"
echo "${#str}"                # 4 (chars, in a UTF-8 locale)

Best character counter for 2026 — what to compare

Search results for "character counter online", "word counter", and "tweet character count" return many tools but most fail on real-world counts: they count UTF-16 code units instead of characters (so an emoji counts as 2), they ignore Twitter's CJK double-counting rule, or they don't surface SMS GSM-7 vs UCS-2 segment math. Here's how the most-used counters compare in 2026:

Tool | Unicode-correct | Tweet rule + CJK | SMS GSM-7 / UCS-2 | Reading time | Cost
FreeDevTool Character Counter | NFC normalized + grapheme clusters | Yes (with CJK 2-unit rule) | Both with segment count | 200 WPM read + 130 WPM speak | Free
charactercount.online | Code units only | No | Generic only | Yes | Free, ad-funded
wordcounter.net | Code units | No | No | Yes | Free, ad-heavy
twittercount.com | Tweet-specific | Yes | No | No | Free
Microsoft Word "Word Count" | Code units | No | No | Yes | Built into Office

How do I count characters for a tweet correctly (with CJK and emoji)?

Twitter's character count is NOT a simple JavaScript str.length. The rules: 1) apply Unicode NFC normalization first; 2) Latin-script letters, digits, and common punctuation count as 1 unit each; 3) characters in Chinese, Japanese, Korean, and emoji ranges count as 2 units (so a single Chinese ideograph eats 2 of your 280 budget); 4) URLs always cost a fixed 23 units, the t.co length, regardless of original length, even though the visible URL stays full. This counter applies all four rules: paste any tweet draft and the displayed count matches what twitter.com will count. Most generic counters get this wrong by 20-50% on multi-script content.

What's the difference between UTF-8 bytes, characters, and grapheme clusters?

Three distinct quantities are frequently confused. UTF-8 bytes = how the text is encoded on disk or in transit. ASCII = 1 byte, accented Latin = 2 bytes, most CJK = 3 bytes, emoji = 4 bytes. Database varchar columns and HTTP payloads measure this. Characters (code points) = abstract Unicode characters. The letter "é" is 1 character but 2 bytes in UTF-8. Grapheme clusters = what humans perceive as one character. The emoji 👨‍👩‍👧 (family) is 1 grapheme cluster, 5 code points, 18 UTF-8 bytes (three 4-byte emoji plus two 3-byte zero-width joiners). Text editors move the cursor by grapheme cluster; Twitter applies its own weighted units; databases often count bytes; JavaScript's str.length counts UTF-16 code units (a surrogate-pair emoji like 🚀 reads as 2). This counter shows all three so you can match whichever your downstream system measures.
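
The byte widths quoted above are easy to confirm. This is a small check using only the standard library; the sample characters are chosen here for illustration, and utf8_width and utf16_units are hypothetical helper names.

```python
def utf8_width(ch: str) -> int:
    """Number of bytes a single code point occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def utf16_units(text: str) -> int:
    """Number of UTF-16 code units, i.e. what JavaScript's .length reports."""
    return len(text.encode("utf-16-le")) // 2


samples = {
    "A": "ASCII",
    "\u00e9": "accented Latin (é)",
    "\u4e2d": "CJK ideograph (中)",
    "\U0001F680": "emoji (rocket)",
}
for ch, kind in samples.items():
    print(f"{kind}: {utf8_width(ch)} byte(s), {utf16_units(ch)} UTF-16 unit(s)")
# ASCII: 1 byte(s), 1 UTF-16 unit(s)
# accented Latin (é): 2 byte(s), 1 UTF-16 unit(s)
# CJK ideograph (中): 3 byte(s), 1 UTF-16 unit(s)
# emoji (rocket): 4 byte(s), 2 UTF-16 unit(s)
```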

Character counter alternative to wordcounter.net — 4 reasons writers switched

  1. Unicode-correct counting. Emoji 🚀 counts as 1 grapheme (correct), not 2 (UTF-16 code units). Critical for any social, marketing, or i18n copy work.
  2. Platform limit progress bars. Tweet (280), SMS (160/70), Meta description (155), SEO title (60), Google Ads headline (30), Instagram caption (2,200) — all visible simultaneously with overflow warnings.
  3. SMS GSM-7 vs UCS-2 segment math. Drop a single emoji into an SMS and it switches from 160-char GSM-7 to 70-char UCS-2 — multi-segment cost balloons. This counter shows the segment count + per-segment cost in real time.
  4. No ads, no popups, no upload. Tools indexed for "character counter online" almost universally inject ads. This page is browser-only and persists nothing.

Pair the character counter with the Lorem Ipsum Generator for placeholder copy, the Case Converter for naming-convention transforms, the String Escape Tool for character-level transformations, and the Code & Text Tools hub for the broader text toolkit.

Character counter best practices

  • Pick the right metric for the destination. Twitter wants weighted units; SMS wants encoded bytes; databases want UTF-8 bytes; humans want graphemes.
  • For byte-limited columns, size by bytes. A 255-byte field holds 255 ASCII chars but only 63 four-byte emoji. Check whether your engine's VARCHAR(n) counts bytes or characters, and plan accordingly.
  • Test with real-world content. Your average user has at least one accented character or emoji somewhere. Lorem ipsum doesn't catch encoding bugs.
  • Beware of String.length in JavaScript and Java. Both count UTF-16 code units, not characters. Use Intl.Segmenter or grapheme libraries for user-facing counts.
  • Count URLs at their fixed cost in tweets. Each URL costs 23 units regardless of its length, so a long link leaves more room than the raw count suggests, and a very short one leaves less.
  • Reading-time estimates are rough. Use 200 wpm as a default; show "5-min read" not "4 min 47 sec." Precision implies false confidence.
  • For SMS, test with a single emoji. One emoji drops your limit from 160 to 70. Marketing messages are often optimized for GSM-7 only.
  • Display "X / 280" not just "X". Users want to see the limit too. Color the counter red as it approaches the limit.
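
For the byte-sizing advice above, one common approach is to truncate by bytes without cutting a code point in half. The sketch below is one way to do that, assuming a hypothetical 255-byte field. It keeps code points whole but can still split a multi-code-point grapheme such as a family emoji.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut text so its UTF-8 encoding fits in max_bytes, keeping code points whole."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops the partial multi-byte sequence left at the cut point.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


rockets = "\U0001F680" * 64             # 64 emoji = 256 bytes
print(len(truncate_utf8(rockets, 255)))    # 63
print(len(truncate_utf8("a" * 300, 255)))  # 255
```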

Frequently Asked Questions

Does a tweet count characters or bytes?
Twitter/X counts characters after Unicode NFC normalization, with one important twist: Latin-script letters, digits, and common punctuation count as 1 unit each, but characters from Chinese, Japanese, Korean, and emoji ranges count as 2 units. The effective limit is 280 units for standard accounts (Premium subscribers get a higher cap). URLs always cost a fixed 23 units, the t.co length, regardless of the original URL size, so a 500-character link still burns only 23 units.
What counts as one SMS message?
A single SMS fits 160 characters using the GSM-7 encoding (7 bits per character). If any character requires the extended GSM-7 table (like the Euro sign, square brackets, or the tilde), it consumes two slots. Using UCS-2 encoding (needed for emoji or non-Latin scripts like Arabic, Hebrew, Thai, or CJK) drops the limit to 70 characters per segment. Longer messages are concatenated with a 6-byte UDH header, reducing each segment to 153 GSM-7 chars or 67 UCS-2 chars.
How are words counted?
Words are sequences of non-whitespace characters separated by whitespace — spaces, tabs, and newlines. Multiple consecutive spaces collapse to a single separator. Numbers, hyphenated compounds ("state-of-the-art"), and contractions ("don't") each count as a single word, matching Microsoft Word and Google Docs behavior. Note that in CJK text (Chinese, Japanese, Korean) there are no whitespace word boundaries, so the count approximates chunks of contiguous text rather than linguistic words. For those languages, rely on the character count instead.
Why do UTF-8 bytes differ from characters?
UTF-8 encodes each code point as 1 to 4 bytes. ASCII characters (A–Z, 0–9, basic punctuation) use 1 byte. Most accented Latin, Greek, and Cyrillic glyphs use 2 bytes. Most CJK ideographs use 3 bytes, and supplementary-plane characters like emoji, uncommon CJK extensions, and mathematical symbols use 4 bytes. Database varchar columns, HTTP headers, JWT tokens, and JSON payload sizes are measured in bytes, not characters — so the byte count is what actually affects storage, bandwidth, and API request limits.
How accurate are the reading and speaking time estimates?
Reading time uses 200 words per minute, toward the lower end of the commonly cited 175–250 wpm silent-reading range for adults on screens. Speaking time uses 130 wpm, a relaxed pace for podcasts, audiobook narration, and conference talks; TED talks average around 150 wpm. Real numbers vary: dense technical content trends slower, conversational copy trends faster, and non-native readers typically sit at 100–150 wpm. Use the estimate as a starting point, then calibrate with your own test reads aloud.
Is my text sent anywhere?
No. All counting runs in your browser with vanilla JavaScript — no network request, no upload, no log. You can disconnect from the internet after the page loads and everything still works. Your text never leaves the device. This matters for proprietary content, client drafts, unreleased announcements, and anything covered by an NDA. If you need extra assurance, open your browser's network inspector while typing — you will see zero outgoing requests from this tool.
What is a good email subject-line length?
Most email clients truncate the subject line around 78 characters on desktop and as few as 30–40 characters on mobile preview panes. Studies from Mailchimp, Campaign Monitor, and HubSpot consistently find that subjects between 40 and 60 characters get the highest open rates. Put your hook in the first 30 characters so it survives mobile truncation. Avoid ALL CAPS, excessive punctuation, and spam-trigger words like "free," "guaranteed," and "act now," which can drop delivery into the promotions tab.
