Character & Word Limits 2026: Twitter, SMS, SEO, Instagram Guide

A character limit is the maximum number of Unicode code points a platform accepts in a single field: 280 for a Twitter post, 160 for a single-segment SMS in GSM-7, around 160 for a Google meta description before truncation. The number you care about depends on where you publish and whether your text contains emoji, smart quotes, or CJK characters, all of which change the math.

This guide is for social-media writers, SEO specialists, marketing copywriters, SMS senders billed per segment, and developers writing validation that has to match what Twitter, Instagram, or SMS gateways actually count. Jump to the quick reference table for the full cheat sheet, or check your draft live against six major platforms in the Word Counter, where progress bars turn red the moment you cross a limit.

Quick reference: every platform’s character and word limit

The table below covers the 30+ fields writers and developers run into most often. “Hard limit” is the platform-enforced ceiling; “Visible / above the fold” is what readers see before a truncation point; “Sweet spot” is the empirical range where content performs best.

Platform	Hard limit	Visible / above the fold	Sweet spot	Counts emoji as
Twitter / X post	280 chars	280	70-100 chars	1 codepoint
Twitter / X bio	160 chars	160	—	1 codepoint
Twitter / X display name	50 chars	50	—	1 codepoint
X Premium long-form	25,000 chars	—	—	1 codepoint
Instagram caption	2,200 chars	first 125 (then “more”)	<125 for hook	1 codepoint
Instagram bio	150 chars	150	—	1 codepoint
Instagram hashtags	max 30	—	5-10	—
LinkedIn post	3,000 chars	first 210 (then “see more”)	<1,300	1 codepoint
LinkedIn article	110,000 chars	—	—	1 codepoint
LinkedIn headline	220 chars	220	—	1 codepoint
Facebook post	63,206 chars	~477 desktop / ~125 mobile	<80 for organic	1 codepoint
TikTok caption	2,200 chars	first ~100	<150	1 codepoint
YouTube title	100 chars	70 (search)	<60	1 codepoint
YouTube description	5,000 chars	first 100-150 above fold	first 150 for hook	1 codepoint
YouTube comment	10,000 chars	—	—	1 codepoint
Reddit title	300 chars	—	<60 (subreddit-dependent)	1 codepoint
Reddit comment	10,000 chars	—	—	1 codepoint
Discord message	2,000 chars	2,000	—	1 codepoint
Discord embed description	4,096 chars	—	—	1 codepoint
Slack message	40,000 chars	—	<2,000 for readability	1 codepoint
Pinterest pin description	500 chars	first 50-60	<125	1 codepoint
Mastodon toot	500 chars (configurable)	500	—	1 codepoint
Bluesky post	300 chars	300	—	1 grapheme cluster
Threads post	500 chars	500	—	1 codepoint
SEO meta description (Google)	~160 chars desktop / ~120 mobile	150-160	150-160	1 codepoint
SEO page title (Google)	~60 chars desktop / ~50 mobile	50-60	50-60	1 codepoint
Open Graph description	~200 chars before LinkedIn/FB clip	150-200	150-200	1 codepoint
Twitter Card description	200 chars max	200	150-200	1 codepoint
SMS single segment (GSM-7)	160 chars	—	—	special — see below
SMS single segment (UCS-2 / emoji)	70 chars	—	—	1 codepoint
WhatsApp message text	65,536 chars	—	—	1 codepoint
Email subject line	no platform limit	~60 desktop / ~30 mobile	<50	1 codepoint
Google Ads headline	30 chars × 15 headlines	30 each	30	1 codepoint
Google Ads description	90 chars × 4 desc	90 each	90	1 codepoint
App Store title	30 chars	30	30	1 codepoint
App Store subtitle	30 chars	30	30	1 codepoint
App Store description	4,000 chars	first 252 above fold	252 hook	1 codepoint
Play Store short description	80 chars	80	80	1 codepoint
Play Store long description	4,000 chars	first 80 above fold	80 hook	1 codepoint

Content longer than the “sweet spot” tends to get truncated, downranked, or cropped off the visible card. X Premium long-form and Mastodon (configurable per instance) are the rare exceptions that let you write past 500 characters without penalty. Every count above, except where SMS rules apply, is a Unicode code-point count: a single-codepoint emoji such as 🌍 costs 1 character, not 2, while a joined emoji sequence costs one character per codepoint (Bluesky, which counts grapheme clusters, is the exception). To verify a draft against the six most common limits at once, paste it into the Word Counter; the progress bars catch over-limit text before you hit publish.
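For developers, the table translates into a small lookup plus a code-point count. The sketch below is illustrative only: `HARD_LIMITS`, `countCodePoints`, and `checkLimit` are hypothetical names, and the object holds just a handful of rows copied from the table above.

```javascript
// Hypothetical subset of the quick reference table (hard limits, in code points).
const HARD_LIMITS = {
  twitterPost: 280,
  instagramCaption: 2200,
  linkedinPost: 3000,
  discordMessage: 2000,
};

// Count Unicode code points rather than UTF-16 code units.
function countCodePoints(text) {
  return [...text].length;
}

// Report how a draft compares with the hard limit of one field.
function checkLimit(text, field) {
  const limit = HARD_LIMITS[field];
  if (limit === undefined) throw new Error(`Unknown field: ${field}`);
  const count = countCodePoints(text);
  return { count, limit, remaining: limit - count, ok: count <= limit };
}

console.log(checkLimit("Launch day \u{1F680}", "twitterPost"));
// { count: 12, limit: 280, remaining: 268, ok: true }
```

Fields that are not counted by code point (SMS segments, Bluesky graphemes) would need their own counting function rather than an entry in this lookup.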

How characters are actually counted (Unicode code points vs UTF-16)

Three different tools can hand you three different character counts for the same string. “Character” is not a single thing: it could mean a Unicode code point, a UTF-16 code unit, or a grapheme cluster, and each platform picks one.

What is a “character”: codepoint vs code unit vs grapheme

A codepoint is a Unicode scalar value: any integer from U+0000 to U+10FFFF that Unicode has assigned to a character or marked as reserved. A code unit is the smallest piece of an encoding; UTF-16 uses 16-bit code units, UTF-8 uses 8-bit code units. A grapheme cluster is what humans perceive as a single visible character. Sometimes that means one codepoint, sometimes a base codepoint plus combining marks, sometimes a zero-width-joiner sequence like the family emoji 👨‍👩‍👧‍👦 (seven codepoints joined into one visible glyph).

For the string "a🌍👨‍👩‍👧" the three counts disagree:

Counting method	Result	Used by
UTF-16 code units (JS `string.length`)	11	Naive JavaScript code
Unicode code points	7	Twitter, Instagram, SMS gateways
Grapheme clusters	3	Bluesky, screen readers, text editors
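All three counts are available in modern JavaScript. The sketch below is one way to get them, assuming a runtime that ships `Intl.Segmenter` (recent browsers and Node.js); `countAll` is a hypothetical helper name, and the sample is a different string from the one in the table: "Hi " followed by the four-person family emoji, written with escapes so the zero-width joiners are visible.

```javascript
// "Hi " + man ZWJ woman ZWJ girl ZWJ boy (the family emoji as 7 code points).
const sample = "Hi \u{1F468}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F466}";

function countAll(text) {
  // Grapheme segmentation follows the Unicode text segmentation rules.
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return {
    codeUnits: text.length,                          // UTF-16 code units
    codePoints: [...text].length,                    // Unicode code points
    graphemes: [...segmenter.segment(text)].length,  // user-perceived characters
  };
}

console.log(countAll(sample));
// { codeUnits: 14, codePoints: 10, graphemes: 4 }
```

Pick the property that matches the platform you are validating for: `codePoints` for most of the table, `graphemes` for Bluesky.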

Why `string.length` lies about emoji

JavaScript stores strings as UTF-16 internally. Any codepoint above U+FFFF (most emoji, all astral-plane characters) is encoded as a surrogate pair: two 16-bit code units. The .length property reports those two units, not one character.

"🌍".length              // 2   (UTF-16 code units)
[..."🌍"].length         // 1   (codepoints — what Twitter/SMS counts)
"🌍".match(/./gu).length // 1   (codepoints via regex with /u flag)

The spread operator and the /u regex flag both iterate by codepoint, which matches what Twitter, Instagram, and SMS gateways measure against their limits. Raw .length never undercounts codepoints, so a validation function built on it fails in one direction: it rejects drafts that are actually under the cap.

What about CJK and combining marks

Chinese, Japanese, and Korean ideographs are each a single codepoint and count as one character on every platform. Where they get expensive is SMS: any non-GSM-7 character flips the whole message to UCS-2 encoding, dropping the segment limit from 160 to 70 (covered in the next section).

Combining marks behave differently. The accented á written as á is one codepoint; the same á written as a + ́ (combining acute accent) is two codepoints but one grapheme cluster. Most platforms count by codepoint, so the second form costs one extra character. Bluesky is the visible exception: it counts grapheme clusters, so both forms cost 1.
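The two spellings of á can be compared directly, and `String.prototype.normalize` can fold the decomposed form back into the precomposed one. This is a small sketch; whether a given platform normalizes text on its side before counting is not something this guide establishes, so treat normalizing before you count as an optional precaution.

```javascript
const precomposed = "\u00E1"; // á as a single code point
const decomposed = "a\u0301"; // a + combining acute accent

// Count Unicode code points.
const codePoints = (s) => [...s].length;

console.log(codePoints(precomposed));                     // 1
console.log(codePoints(decomposed));                      // 2
console.log(codePoints(decomposed.normalize("NFC")));     // 1
console.log(decomposed.normalize("NFC") === precomposed); // true
```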

Counting in different languages: quick reference

// JavaScript
[...str].length                          // codepoints
Array.from(str).length                   // codepoints

// Python 3 — len() is codepoint by default
len(s)

// Go — utf8 package
utf8.RuneCountInString(s)

// Rust — chars() iterates codepoints
s.chars().count()

// Java — codePointCount
s.codePointCount(0, s.length())

Byte counts are a different measure again, and the Base64 encoder makes that visible: when text is encoded to Base64 for transmission, every 3 bytes of UTF-8 input become 4 ASCII output characters, so the encoded length depends on the byte count, not the codepoint count. Paste a single emoji and watch the Base64 output expand to 8 characters; the same emoji that costs 1 character on Twitter takes 4 bytes in UTF-8.
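The byte arithmetic can be checked in a few lines. This sketch uses the standard `TextEncoder` to get the UTF-8 byte length and the padded Base64 length formula (4 output characters per 3 input bytes, rounded up); `utf8ByteLength` and `base64Length` are hypothetical helper names.

```javascript
// UTF-8 byte length of a string.
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

// Length of padded Base64 output for a given number of input bytes.
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

const earth = "\u{1F30D}"; // the globe emoji
console.log([...earth].length);                   // 1 code point
console.log(utf8ByteLength(earth));               // 4 bytes
console.log(base64Length(utf8ByteLength(earth))); // 8 Base64 characters
```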

To see codepoint counts (the number Twitter actually measures) on any draft, the Word Counter is Unicode-correct by default.

SMS character limit: GSM-7, UCS-2, and multi-part messages

SMS is the only major channel where adding a single emoji can literally double your bill. The reason is encoding, and the math has been the same since 1985.

The 160-character magic number: GSM-7 history

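The SMS design work that began in 1985, later standardized in the GSM specifications (GSM 03.38 defines the 7-bit alphabet), fixed an SMS payload at 140 bytes. With a 7-bit character encoding, 140 bytes hold 1,120 bits ÷ 7 = 160 characters. That’s where the famous SMS character limit of 160 comes from. The GSM-7 character set covers 128 base characters plus a 10-character extension (covering { } [ ] | \ ~ ^ € and form feed). Inside that set you get the full 160-char budget per segment, with one caveat: each extension character is sent as an escape code plus the character, so it uses 2 of the 160.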

Characters that fall outside GSM-7 and force a switch:

All emoji
Curly / smart quotes (“ ” ‘ ’); note these are different from the ASCII straight quotes " '
Most accented Latin letters outside the small set GSM-7 includes (á í ó ú â ê ã õ etc. are out; GSM-7 does include è é ù ì ò à ä ö ü ñ å æ ø and a few uppercase forms)
Full-width punctuation, CJK characters, Arabic, Hebrew, Greek lowercase, Cyrillic
Backtick ` (the tilde ~ does not force a switch: it is in the GSM-7 extension table, so the message stays GSM-7 but the tilde costs 2 of your 160 chars)

UCS-2 trap: one emoji drops you from 160 to 70

The moment a single non-GSM-7 character appears anywhere in the message, the entire message switches to UCS-2 encoding. UCS-2 uses 16 bits per character, so 140 bytes ÷ 2 = 70 characters per segment. Some real examples:

"Hello, your code is 12345"            → 25 chars, GSM-7, 1 segment
"Hello, your code is 12345 ✓"          → 27 chars, UCS-2 (✓ is not in GSM-7), 1 segment
"Hello, your code is 12345 ✅"          → 27 chars, UCS-2 (emoji), 1 segment (under 70)
"Hello, “your” code is 12345 ✅"        → 29 chars, smart quotes + emoji → UCS-2, 1 segment
"Hi 你好"                                → CJK → UCS-2, 1 segment (5 chars)

That last “Hi 你好” example is the gotcha: it’s only 5 characters, but it is billed under UCS-2 rules. Only the next 65 characters you add fit in the first segment; after that the message becomes multi-part.

Multi-part SMS segments (concatenation)

Once you cross 160 (GSM-7) or 70 (UCS-2), the message splits into multiple segments. Each segment carries a 6-byte User Data Header (UDH) used for reassembly, which takes the space of 7 GSM-7 characters or 3 UCS-2 characters, so the available payload per segment drops:

GSM-7 multi-part: 153 characters per segment
UCS-2 multi-part: 67 characters per segment

The receiving phone reassembles the segments invisibly to the recipient, but billing is per segment, not per message. A 161-character GSM-7 message costs 2 segments. A 1,000-character GSM-7 message costs 7 segments (153 × 6 = 918, 7th segment carries the last 82).
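The segment arithmetic above fits in one function. This is a minimal sketch, assuming the caller already knows the encoding and the message length in that encoding's units (7-bit units for GSM-7, 16-bit units for UCS-2); `SEGMENT_SIZES` and `segmentsFor` are hypothetical names.

```javascript
// Per-segment capacity: single-message size and multi-part size (after the UDH).
const SEGMENT_SIZES = {
  "GSM-7": { single: 160, multipart: 153 },
  "UCS-2": { single: 70, multipart: 67 },
};

// Number of billable segments for a message of `units` length.
function segmentsFor(units, encoding) {
  const sizes = SEGMENT_SIZES[encoding];
  if (!sizes) throw new Error(`Unknown encoding: ${encoding}`);
  if (units <= sizes.single) return 1; // fits without a concatenation header
  return Math.ceil(units / sizes.multipart);
}

console.log(segmentsFor(160, "GSM-7"));  // 1
console.log(segmentsFor(161, "GSM-7"));  // 2
console.log(segmentsFor(1000, "GSM-7")); // 7
console.log(segmentsFor(80, "UCS-2"));   // 2
```

Multiplying the result by the per-segment price and the recipient count gives the campaign cost.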

Cost math: when one emoji doubles your bill

Take an 80-character plain-text marketing message:

Plain text: 80 chars → GSM-7 → 1 segment at price X
Add one emoji: 81 chars → UCS-2 → 81 > 70 → 2 segments at price 2X

Doubling the bill from one emoji is real and it scales. A campaign of 100,000 messages at $0.0075 per segment costs $750 in GSM-7 vs. $1,500 in UCS-2, a $750 emoji. Every major SMS provider (Twilio, Bandwidth, AWS SNS, MessageBird, Vonage) bills this way. The encoding rules are GSM standard, not vendor policy. The history of byte-level encoding tradeoffs, and why ASCII / UTF-8 / UCS-2 even exist as separate standards, is covered in Understanding Base64, which is the same family of “bits into characters” problem applied to email instead of SMS.

How to keep messages in GSM-7

Use ASCII straight quotes " ', not smart quotes
Use ASCII hyphen -, not em-dash — or en-dash –
Spell out (c) and (R), not © and ®
Avoid emoji unless the campaign budget assumes UCS-2 cost
Provider consoles (Twilio’s, Bandwidth’s, MessageBird’s) show “encoding: GSM-7” or “UCS-2” next to the preview; verify before broadcast

The fastest sanity check during drafting is the Word Counter’s SMS progress bar, which reports against the 160-char baseline. If your text triggers UCS-2, that baseline no longer applies: the per-segment budget shrinks by a factor of about 2.29 (160 ÷ 70). To estimate the segment count, divide your character count by 70 (or by 67 once the message is multi-part) and round up.

SEO limits: meta description, title tag, OG, Twitter Card

SEO character limits are softer than platform limits (Google won’t reject your page if a meta description hits 300 characters), but the practical truncation rules matter for click-through rate. The numbers below still apply in 2026.

Meta description: 150-160 character sweet spot

Google’s desktop search results truncate the meta description around 155-165 characters; mobile clips somewhere between 100 and 120. The exact truncation point varies because Google measures display pixels, not characters. A description full of W and M glyphs hits the truncation pixel earlier than one full of i and l.

Practical writing rules:

Target 150-160 characters total
Put core message in the first 120 characters (mobile-safe)
Lead with the page’s primary keyword in the first 30 characters
End with a CTA in the last 30 characters; at 150-160 characters total it stays inside the desktop cutoff

The 2017-2018 era saw Google briefly expand meta description display to 320 characters, and a generation of SEO tutorials still cites that number. Google reverted to 160 in mid-2018. Writing past 200 characters today just hides the second half.

A different failure mode: descriptions under 120 characters often get replaced entirely. Google decides your description doesn’t fully serve the query and pulls a different passage from the page body, so you lose CTR control without warning.

Title tag: 60 desktop, 50 mobile

Title tags clip at roughly 60 characters on desktop and 50 on mobile. Same pixel-based truncation as descriptions, same caveat about wide glyphs.

Sweet spot: 50-60 characters, with the target keyword in the first 30 so it survives any clip. Long-tail brand suffixes (| Brand Name) belong at the end, where truncation is least painful.

Pixel-width vs character-count: Google’s actual rule

Google’s SERP description container is roughly 920 pixels wide on desktop. Average character width sits around 6.5 pixels, yielding the 140-160 character empirical target. But the per-character spread is wide: i renders at about 3 pixels, M at about 11. A description of all-caps copy (“BEST WIDGETS FOR WINTER WEDDINGS”) clips substantially earlier than a lowercase equivalent.
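A rough pixel estimate can be built from the figures in this section alone. The sketch below is deliberately coarse and is not Google's algorithm: it assumes about 3 pixels for the narrow glyphs i, l, and t, about 11 pixels for the wide glyphs W, M, and K, the 6.5-pixel average for everything else, and a 920-pixel budget. The function names and the glyph classes are illustrative assumptions.

```javascript
// Coarse width classes (assumed): narrow, wide, and an average for the rest.
const NARROW_GLYPHS = new Set(["i", "l", "t"]);
const WIDE_GLYPHS = new Set(["W", "M", "K"]);
const NARROW_PX = 3;
const WIDE_PX = 11;
const AVERAGE_PX = 6.5;

// Estimate the rendered width of a string in pixels.
function estimatePixelWidth(text) {
  let total = 0;
  for (const ch of text) {
    if (NARROW_GLYPHS.has(ch)) total += NARROW_PX;
    else if (WIDE_GLYPHS.has(ch)) total += WIDE_PX;
    else total += AVERAGE_PX;
  }
  return total;
}

// Does the text fit inside the assumed desktop description budget?
function fitsDescriptionBudget(text, budgetPx = 920) {
  return estimatePixelWidth(text) <= budgetPx;
}

console.log(estimatePixelWidth("MMM"));             // 33
console.log(estimatePixelWidth("iii"));             // 9
console.log(fitsDescriptionBudget("M".repeat(90))); // false (990 px)
```

An estimate like this is only good for flagging risky drafts; a SERP simulator with real font metrics remains the better check.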

Pre-publish previews using pixel-accurate SERP simulators are more reliable than character counters for SEO copy.

OG description and Twitter Card description

The Open Graph protocol’s og:description is what Facebook, LinkedIn, Slack, and Discord render under a shared link preview. Display caps vary by platform: most clip around 200 characters, some extend to 300. The Twitter Card twitter:description is hard-capped at 200 characters in Twitter’s parser.

Sensible defaults:

150-200 characters for both OG and Twitter Card
They can match your meta description, but OG can run slightly longer because OG length doesn’t affect search ranking
Validate your structured-data choices (especially what gets pulled into OG by mistake) using the patterns in Security Best Practices, where untrusted OG metadata is a common phishing vector

What “no character limit” actually means

H1 tags, body content, and URL slugs have no platform-enforced SEO character limit, but soft limits still apply:

H1 > 70 characters breaks visual hierarchy and skim-ability
URL slugs technically unlimited; Google displays around 90 characters in the SERP, anything beyond is cosmetic
Body content has no length cap, but Google ranks helpful content over padding, so word count alone is not a ranking signal

The Word Counter tracks both meta description (160) and title tag (60) live as you draft, with progress bars that turn amber and red as you approach the truncation pixel.

Each platform’s character ceiling has a story behind it and a sweet spot below the hard limit where content actually performs.

Twitter / X: 280, premium 25,000, URL substitution rule

The standard Twitter character limit is 280 characters, doubled from 140 in November 2017. X Premium subscribers can post long-form content up to 25,000 characters with rich formatting, but the 280-char post is still the dominant form for organic reach.

The non-obvious rule is URL substitution. Twitter wraps every URL, no matter how long, in a 23-character t.co short link at publish time. The 23-character cost is fixed.

published_length = raw_length − URL_length + 23

Example: a draft like "Check this: https://example.com/very-long-path?id=12345" is 55 raw characters. The URL is 43 characters, so it gets replaced with a 23-char t.co link, and the published length is 55 − 43 + 23 = 35 characters. That saves 20 characters you didn’t know you had.

For pasting a long URL into a draft, the URL encoder/decoder is a quick way to verify what counts as a URL (Twitter recognizes URLs by RFC 3986 patterns, query strings and fragments included). Subdomains, schemes, ports, paths, queries, and fragments are all swallowed by the 23-character substitution.
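The substitution formula is easy to apply in code. The sketch below assumes a deliberately simple URL matcher (an http or https scheme followed by non-whitespace), which is not Twitter's actual detection logic and will, for instance, swallow trailing punctuation; `publishedLength` is a hypothetical helper name.

```javascript
const TCO_LENGTH = 23;

// Simplified URL matcher (an assumption, not Twitter's real parser).
const URL_PATTERN = /https?:\/\/\S+/g;

// Code-point length after every URL is swapped for a 23-character link.
function publishedLength(text) {
  let length = [...text].length;
  for (const url of text.match(URL_PATTERN) || []) {
    length += TCO_LENGTH - [...url].length;
  }
  return length;
}

// A 320-character draft that contains one 80-character URL.
const url = "https://example.com/" + "a".repeat(60);
const draft = "x".repeat(239) + " " + url;

console.log([...draft].length);      // 320
console.log(publishedLength(draft)); // 263
```

Because the cost is fixed at 23, a URL shorter than 23 characters makes the published length slightly longer than the raw draft.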

Other Twitter fields: display name 50 chars, bio 160 chars, handle 15 chars. Threads (Meta’s Twitter equivalent) uses a 500-character limit instead.

Instagram: 2,200 caption, 30 hashtags, 125-char hook

Instagram captions allow 2,200 characters, but the feed only shows the first 125 characters before collapsing the rest behind a “… more” tap. More than half of readers never tap. The Instagram caption limit that matters for engagement is therefore 125, even though the hard limit is 2,200.

The 30-hashtag cap is hard, and attempting a 31st hashtag fails the post. The 5-10 hashtag range tends to perform best; beyond 11 the discovery boost flattens and the post starts looking like spam to the algorithm.
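Both caption rules can be checked before posting. This is a minimal sketch using the numbers above; the hashtag matcher ("#" followed by letters, digits, or underscores) is an assumption for illustration rather than Instagram's own tokenizer, and `checkCaption` is a hypothetical name.

```javascript
const CAPTION_LIMIT = 2200;
const HOOK_LENGTH = 125;
const MAX_HASHTAGS = 30;

// Simplified hashtag matcher (assumed): "#" plus letters, digits, or underscores.
const HASHTAG_PATTERN = /#[\p{L}\p{N}_]+/gu;

function checkCaption(caption) {
  const codePoints = [...caption];
  const hashtags = caption.match(HASHTAG_PATTERN) || [];
  return {
    length: codePoints.length,
    withinLimit: codePoints.length <= CAPTION_LIMIT,
    hook: codePoints.slice(0, HOOK_LENGTH).join(""), // text shown before the fold
    hashtagCount: hashtags.length,
    hashtagsOk: hashtags.length <= MAX_HASHTAGS,
  };
}

const report = checkCaption("New drop this Friday #shoes #sale");
console.log(report.hashtagCount); // 2
console.log(report.hashtagsOk);   // true
```

Reading the `hook` field on its own is a quick way to judge whether the visible part of the caption works without the rest.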

Other fields: bio 150 chars, display name 30 chars, DM 1,000 chars.

LinkedIn: 3,000 post, 1,300 sweet spot, “see more” fold

The LinkedIn character limit for posts is 3,000, but the feed displays only the first 210 characters before the “see more” fold. Posts in the 1,200-1,500 character range win engagement on LinkedIn (multiple Buffer and Hootsuite studies converge on around 1,300 as the peak); they’re long enough to demonstrate value, short enough not to wear out the scroll.

LinkedIn Articles (the long-form publishing surface) allow 110,000 characters, which is effectively unlimited. Profile headlines cap at 220, about-section text at 2,600.

Facebook: 63,206 chars, 80-char organic sweet spot

Facebook’s 63,206-character post limit is mostly trivia; in practice posts under 80 characters get about 30% higher organic engagement than longer ones (HubSpot consistently reports this across years). Above the fold, desktop shows about 477 characters; mobile cuts at around 125.

Comment max is 8,000 characters. Reactions, shares, and click-throughs all skew toward shorter posts, so long copy belongs in the linked article, not the Facebook caption.

Newer platforms: Bluesky, Mastodon, Threads, TikTok

Bluesky posts cap at 300 characters and are the unusual case: Bluesky counts grapheme clusters, so the seven-codepoint family emoji 👨‍👩‍👧‍👦 costs 1 character, not 7
Mastodon defaults to 500 characters per toot, but instance admins can raise this to 5,000 or even unlimited; check the instance you’re posting from
Threads uses a 500-character limit with Twitter-style codepoint counting
TikTok captions allow 2,200 characters with about 100 shown above the fold

Reddit, Discord, Slack: long-form and community defaults

Reddit title 300 characters (subreddit moderators often enforce <60 via AutoModerator); comments 10,000 characters
Discord standard message 2,000 characters; embed descriptions 4,096; Nitro raises to 4,000 on plain messages
Slack message 40,000 characters; above 2,000 readability drops sharply and many recipients ignore long messages

Word count targets by content type

Character limits dominate social and SEO; word counts dominate everything else: academic work, billing, content marketing, manuscripts. The table below gives a target range and a reading-time estimate (230 wpm, the Brysbaert 2019 silent-reading meta-analysis median) for each common content type.

Content type	Word target	Reading time @ 230 wpm	Notes
Tweet	30-40 words	10 sec	optimize for character, not word
LinkedIn post (sweet spot)	170-250 words	1 min	about 1,300 chars
Instagram caption (hook)	20-25 words	<10 sec	first 125 chars
Blog post — short	500-700 words	2-3 min	listicle, news, hot take
Blog post — standard	1,000-1,500 words	4-7 min	tutorial, deep guide
Blog post — long	2,000-3,000 words	9-13 min	comprehensive guide
SEO pillar page	2,500-5,000 words	11-22 min	topical authority
Academic essay (high school)	500-1,500 words	2-7 min	varies by assignment
Academic essay (undergrad)	1,500-3,000 words	7-13 min	per assignment
NaNoWriMo daily	1,667 words/day	—	50K words in 30 days
Novel — short	50,000-70,000 words	—	YA, mystery
Novel — standard	80,000-100,000 words	—	adult fiction
Conference talk (12 min @ 130 wpm)	1,500-1,600 words	speaking	rehearse to confirm
Podcast episode (30 min @ 130 wpm)	3,900 words	speaking	scripted portion

Reading time is the more useful target unit for content marketing; readers respond to a “5-minute read” label more reliably than to a “1,150 words” label. Word count remains the unit for billing (translation invoiced per source word), platform compliance (NaNoWriMo’s 50K, an academic 2,000-word ceiling), and contract terms. The Word Counter shows both in real time as you type, plus speaking time at 130 wpm for talks and podcasts.
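The reading-time and speaking-time columns come from simple division. The sketch below assumes the 230 wpm and 130 wpm rates used in this section and counts words by splitting on whitespace, which only suits space-delimited scripts; the function names are hypothetical.

```javascript
const READING_WPM = 230;
const SPEAKING_WPM = 130;

// Whitespace-based word count (suitable for space-delimited scripts only).
function countWords(text) {
  const tokens = text.trim().split(/\s+/);
  return tokens[0] === "" ? 0 : tokens.length;
}

// Minutes needed for a word count at a given words-per-minute rate.
function minutesAt(words, wpm) {
  return words / wpm;
}

// Label in the "N-minute read" style, never below one minute.
function readLabel(words) {
  const minutes = Math.max(1, Math.round(minutesAt(words, READING_WPM)));
  return `${minutes}-minute read`;
}

console.log(readLabel(1150));               // "5-minute read"
console.log(minutesAt(3900, SPEAKING_WPM)); // 30
console.log(countWords("  two words "));    // 2
```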

6 counting mistakes that break real apps

Six recurring failures seen in shipped code and shipped marketing campaigns. Each one is paired with the symptom, the root cause, and the fix.

Mistake 1: Using `string.length` for character-limit validation

Symptom: A user pastes a tweet that is 278 codepoints long, three of them emoji. Your front-end validation computes 281 and refuses to submit, even though Twitter would accept it. Because .length never undercounts codepoints, the error always runs in this direction: valid drafts get rejected.

Root cause: String.prototype.length in JavaScript returns UTF-16 code units. Most emoji are encoded as a surrogate pair, costing 2 units. Every astral-plane character (math symbols, ancient scripts) does the same.

Fix: Iterate by codepoint with the spread operator or Array.from.

// ❌ wrong
function isUnderTwitterLimit(text) {
  return text.length <= 280;
}

// ✅ correct
function isUnderTwitterLimit(text) {
  return [...text].length <= 280;
}

For deeper regex-based codepoint iteration patterns (including grapheme cluster handling), the Regex Cheat Sheet covers the /u and /v flags and Unicode property escapes.

Mistake 2: Splitting CJK text on whitespace for word count

Symptom: A 500-character Chinese article reports as 1 word. The translation quote based on it is off by 500x.

Root cause: Chinese and Japanese don’t put spaces between words. text.split(/\s+/) returns a single token containing the entire essay.

Fix: Count each CJK ideograph as one word, which is the convention used by Microsoft Word, Google Docs, and every native CJK word processor.

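function countWordsMixed(text) {
  // CJK Unified Ideographs, Hiragana + Katakana, Hangul Syllables
  const cjkPattern = /[\u4E00-\u9FFF\u3040-\u30FF\uAC00-\uD7AF]/g;
  // Each CJK character counts as one word
  const cjk = (text.match(cjkPattern) || []).length;
  // Blank out CJK, then count Latin tokens (internal ' ’ - allowed)
  const latin = (text
    .replace(cjkPattern, ' ')
    .match(/[A-Za-z0-9]+(?:['’-][A-Za-z0-9]+)*/g) || []).length;
  return cjk + latin;
}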

The Unicode ranges cover CJK Unified Ideographs (U+4E00 to U+9FFF), Hiragana and Katakana (U+3040 to U+30FF), and Hangul Syllables (U+AC00 to U+D7AF), which are the four blocks Microsoft Word’s word-count counts as ideographs.

Mistake 3: Forgetting Twitter URL 23-char substitution

Symptom: A draft shows 320 characters in your counter, including an 80-character URL. You spend 10 minutes trimming it, only to realize Twitter would have accepted the original at 263 characters.

Root cause: Twitter replaces every URL with a 23-character t.co link at publish time. Your raw counter doesn’t know.

Fix: Pre-compute published length using raw − URL_length + 23 for each URL. For drafts containing multiple URLs, sum the corrections. URL detection in published content follows RFC 3986, the same parsing rules the URL Encoding & Decoding guide walks through.

Mistake 4: Writing meta description to 320 chars (old guideline)

Symptom: You crafted a 280-character meta description with the CTA at the end. In Google search results, the description cuts off mid-sentence at character 158 and the CTA never appears.

Root cause: Between December 2017 and May 2018, Google briefly expanded meta description display to 320 characters. Many SEO tutorials still cite that number. Google reverted to ~160 in mid-2018 and has held there ever since.

Fix: Write to 150-160 characters. Put the primary keyword in the first 30 characters and the CTA in the last 30. Use a pixel-accurate SERP simulator for high-stakes pages; wide glyphs (W, M, K) eat the budget faster than narrow ones (i, l, t).

Mistake 5: Confusing 280 characters with 280 words

Symptom: Someone on the team writes “we need a 280-word tweet” and produces 1,500 characters of perfectly fine prose. The tweet won’t post.

Root cause: Character-versus-word confusion. The two units differ by roughly 5-6x for English prose.

Fix: Pin the rule per platform. Twitter, SMS, and SEO meta count characters. NaNoWriMo, academic assignments, translation contracts, and most content-marketing briefs count words. When in doubt, check the platform’s own counter (Twitter’s compose box, Word’s Review > Word Count) before locking the spec.

Mistake 6: Pasting smart quotes that silently switch SMS to UCS-2

Symptom: You copy a customer-receipt template from a Google Doc into your SMS sender. The original was 145 characters and shipped as one GSM-7 segment. After paste, it’s the same 145 characters but bills as 3 UCS-2 segments (145 is over 70, and multi-part UCS-2 carries 67 characters per segment). Costs triple across a million-message campaign.

Root cause: Google Docs and Word auto-convert " and ' to typographer’s quotes “ ” and ‘ ’. Those quotes aren’t in the GSM-7 character set, which flips the entire message to UCS-2.

Fix: Normalize before transmit:

function toGsm7Quotes(s) {
  return s
    .replace(/[“”]/g, '"')   // “ ” → "
    .replace(/[‘’]/g, "'")   // ‘ ’ → '
    .replace(/[–—]/g, '-');  // – — → -
}

Run this before billing-sensitive sends. Twilio, MessageBird, and Bandwidth all expose an encoding field on the response; log it and alert when UCS-2 appears in templates you intended as GSM-7.

FAQ

What is the difference between character count and word count?

Character count counts every character including spaces, punctuation, and emoji, measured by Unicode codepoint on most modern platforms. Word count counts whitespace-separated tokens for Latin scripts and ideograph-by-ideograph for CJK. Twitter, SMS, and SEO meta descriptions use character count. Academic essays, NaNoWriMo manuscripts, and translation invoices use word count.

Why does Twitter count emoji as 1 character but JavaScript counts them as 2?

Twitter measures by Unicode code point, and a simple emoji such as 🌍 is one codepoint, so one character. JavaScript’s string.length measures UTF-16 code units. Most emoji are above U+FFFF and are encoded as surrogate pairs in UTF-16, so they take two code units and .length returns 2. Use [...text].length or Array.from(text).length to get the codepoint count Twitter actually counts.

Why is the SMS character limit 160 sometimes and 70 other times?

SMS uses 7-bit GSM-7 encoding by default, giving 160 characters in a 140-byte payload. If the message contains any non-GSM-7 character (emoji, smart quotes, CJK, accented Latin beyond a small set), the whole message switches to 16-bit UCS-2 encoding and the per-segment limit drops to 70 characters. One emoji anywhere in the message triggers the switch.

What is the ideal meta description length in 2026?

Aim for 150-160 characters. Google’s desktop SERP truncates around 155-165 depending on display pixel width; mobile clips between 100 and 120. Below 120 characters Google often replaces your description entirely with a passage from page body. Lead with the primary keyword in the first 30 characters and end with the CTA in the last 30, so the message survives truncation either direction.

Does character limit include spaces and emoji?

Yes, on virtually every platform. Spaces, line breaks, punctuation, and emoji each count as one Unicode codepoint. The two exceptions worth knowing: SMS where emoji trigger the encoding switch described above, and Bluesky which counts grapheme clusters so a multi-codepoint emoji like the family 👨‍👩‍👧‍👦 costs 1 character instead of 7.

How is word count calculated for Chinese, Japanese, Korean text?

Each CJK ideograph counts as one word, the convention used by Microsoft Word’s Chinese-mode word count, Google Docs, native CJK editors, and every commercial translation memory system. A 500-character Chinese essay reports as 500 words. Mixed text counts CJK ideographs by character and Latin tokens by whitespace, summing the two.

How does Twitter handle URL length in the 280-character limit?

Twitter automatically wraps every URL in a 23-character t.co short link at publish time, regardless of original length. The published length follows the formula published = raw − URL_length + 23 per URL. A draft of 320 characters containing one 100-character URL ships as 243 characters. Twitter recognizes URLs by RFC 3986 patterns, so query strings and fragments are absorbed into the URL token.

Regex Cheat Sheet: pattern matching for character validation, Unicode property escapes
Text Diff Online Guide: comparing two pieces of text, line by line and character by character
URL Encoding & Decoding Guide: character escaping rules when text travels through URLs
Understanding Base64: the other half of “bits into characters” encoding, applied to email and binary data