Have you ever had different tools give you conflicting word counts? Although counting words seems like a mundane task, it sometimes causes a lot of unnecessary stress in client-translator relationships. Your client's tool reports one word count, and your tool reports another. What is causing the difference? This tool will tell you. TM-Town's Word Count Analyzer searches your text for areas that are known to cause word count discrepancies across different tools and reports them to you. Try the live demo below!
Word Count Analyzer is an open source tool built by TM-Town. Currently this tool supports English; other language support is coming very soon. To see all advanced configuration options choose the "Advanced" radio button below.
TM-Town's recommended settings (aka the default settings) are highlighted in orange text
Ellipsis
Hyperlink
Contraction
Hyphenated word
Date
Number
Numbered List
XML and HTML
Forward Slash
Backslash
Dotted line
Dashed line
Underscore
Stray punctuation
Tool
Word Count
Your selected settings
TM-Town
Microsoft Word / wc (Unix)*
Pages*
* Word count totals for Microsoft Word, wc (Unix) and Pages are merely estimates. The algorithms behind those word counts are liable to change at any time.
Learn More
Common word count gray areas include:
Ellipses
Hyperlinks
Contractions
Hyphenated Words
Dates
Numbers
Numbered Lists
XML and HTML tags
Forward slashes and backslashes
Punctuation
Other gray areas not covered by this tool:
Headers
Footers
Hidden Text specific to Microsoft Word
Ellipsis
default = 'ignore'
'ignore'
Ignores all ellipses in the word count total.
'no_special_treatment'
Ellipses will not be searched for in the string.
Checks for any occurrences of ellipses in your text. Writers use several different formats for an ellipsis, and although style guides exist, their rules are rarely followed.
Three Consecutive Periods ...
Tool                         Word Count
TM-Town                      0
Microsoft Word / wc (Unix)   1
Pages                        0
Four Consecutive Periods ....
Tool                         Word Count
TM-Town                      0
Microsoft Word / wc (Unix)   1
Pages                        0
Three Periods With Spaces . . .
Tool                         Word Count
TM-Town                      0
Microsoft Word / wc (Unix)   3
Pages                        0
Four Periods With Spaces . . . .
Tool                         Word Count
TM-Town                      0
Microsoft Word / wc (Unix)   4
Pages                        0
Horizontal Ellipsis …
Tool                         Word Count
TM-Town                      0
Microsoft Word / wc (Unix)   1
Pages                        0
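The effect of the two ellipsis settings can be sketched in a few lines of Python. This is only an illustration, not the tool's actual implementation: the regular expression and the function name `count_words` are assumptions made for this example, and the baseline count simply splits on whitespace, which is roughly what wc does.

```python
import re

# Hypothetical pattern: three or more periods separated by single whitespace,
# three or more consecutive periods, or the horizontal ellipsis (U+2026).
ELLIPSIS = re.compile(r"(?:\.\s){2,}\.|\.{3,}|\u2026")

def count_words(text, ellipsis="ignore"):
    """Count whitespace-separated tokens, optionally dropping ellipses first."""
    if ellipsis == "ignore":
        # Replace every ellipsis with a space so it cannot form a token.
        text = ELLIPSIS.sub(" ", text)
    # With 'no_special_treatment' the text is counted as-is.
    return len(text.split())

print(count_words(". . ."))                          # 'ignore'
print(count_words(". . .", "no_special_treatment"))  # each period is a token
```

With 'ignore', all five formats listed above contribute nothing to the total; with 'no_special_treatment', the spaced formats are counted once per period, which matches the Microsoft Word / wc (Unix) rows.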
Hyperlink
default = 'count_as_one'
'count_as_one'
Counts a hyperlink as one word.
'no_special_treatment'
Hyperlinks will not be searched for in the string. Therefore, how a hyperlink is handled in the word count will depend on other settings (mainly slashes).
'split_at_period'
Pages will split hyperlinks at a period and count each token as a separate word.
http://www.example.com
Tool                         Word Count
TM-Town                      1
Microsoft Word / wc (Unix)   1
Pages                        4
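A rough Python sketch of the hyperlink settings follows. Everything in it is an assumption made for illustration: the URL pattern is deliberately simple, and `count_words` is a hypothetical helper, not part of the tool. Note that splitting http://www.example.com at periods alone would give 3 tokens, while the table reports 4 for Pages, so the sketch assumes every run of non-alphanumeric characters (including "://") is a break point. That is a guess about Pages' behavior, not a documented rule.

```python
import re

# Deliberately simple, hypothetical URL pattern.
URL = re.compile(r"https?://\S+")

def count_words(text, hyperlink="count_as_one"):
    """Count whitespace-separated tokens with optional hyperlink splitting."""
    total = 0
    for token in text.split():
        if hyperlink == "split_at_period" and URL.fullmatch(token):
            # Assumption: break the link at every run of non-alphanumerics.
            total += len(re.findall(r"[A-Za-z0-9]+", token))
        else:
            # 'count_as_one': a link without spaces is a single token.
            total += 1
    return total

print(count_words("http://www.example.com"))
print(count_words("http://www.example.com", "split_at_period"))
```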
Contraction
default = 'count_as_one'
'count_as_one'
Counts a contraction as one word.
'count_as_multiple'
Splits a contraction into the words that make it up. Examples:
don't => do not (2 words)
o'clock => of the clock (3 words)
Most tools count contractions as one word. Some might argue a contraction is technically more than one word.
can't
Tool                         Word Count
TM-Town                      1
Microsoft Word / wc (Unix)   1
Pages                        1
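One way to picture 'count_as_multiple' is a lookup table of expansions. The sketch below assumes such a table; the `EXPANSIONS` dictionary holds only the two examples given above and the `count_words` helper is hypothetical, so a real implementation would need a much larger list.

```python
# Hypothetical expansion table containing only the two examples from the text.
EXPANSIONS = {
    "don't": "do not",
    "o'clock": "of the clock",
}

def count_words(text, contraction="count_as_one"):
    """Count tokens, optionally expanding known contractions."""
    total = 0
    for token in text.split():
        # Normalize case, curly apostrophes and trailing punctuation.
        key = token.lower().replace("\u2019", "'").strip(".,;:!?")
        if contraction == "count_as_multiple" and key in EXPANSIONS:
            total += len(EXPANSIONS[key].split())
        else:
            # Unknown contractions fall back to one word.
            total += 1
    return total

print(count_words("I don't know"))
print(count_words("I don't know", "count_as_multiple"))
```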
Hyphenated word
default = 'count_as_one'
'count_as_one'
Counts a hyphenated word as one word.
'count_as_multiple'
Breaks a hyphenated word at each hyphen and counts each word separately. Example:
devil-may-care (3 words)
devil-may-care
Tool                         Word Count
TM-Town                      1
Microsoft Word / wc (Unix)   1
Pages                        3
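The two hyphen settings amount to a choice of where tokens break. A minimal sketch, assuming whitespace tokenization and a hypothetical `count_words` helper:

```python
def count_words(text, hyphenated_word="count_as_one"):
    """Count tokens, optionally breaking hyphenated words at each hyphen."""
    total = 0
    for token in text.split():
        if hyphenated_word == "count_as_multiple":
            # Drop empty pieces so a leading or trailing hyphen adds nothing;
            # a token made only of hyphens still counts once in this sketch.
            parts = [part for part in token.split("-") if part]
            total += len(parts) or 1
        else:
            total += 1
    return total

print(count_words("devil-may-care"))
print(count_words("devil-may-care", "count_as_multiple"))
```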
Date
default = 'no_special_treatment'
'no_special_treatment'
Dates will not be searched for in the string. Therefore, how a date is handled in the word count will depend on other settings.
'count_as_one'
Counts a date as one word. This is more commonly seen in translation CAT tools where a date is thought of as a placeable that can usually be automatically translated. Examples:
Monday, April 4th, 2011 (1 word)
April 4th, 2011 (1 word)
04/04/2011 (1 word)
04.04.2011 (1 word)
2011/04/04 (1 word)
2011-04-04 (1 word)
2003Nov9 (1 word)
2003 November 9 (1 word)
2003-Nov-9 (1 word)
and others...
Most word processing tools do not recognize dates, but translation CAT tools tend to treat a date as one word or placeable. TM-Town's tool checks for many date formats, including those that use day or month abbreviations. A few examples are listed below (not an exhaustive list).
Monday, April 4th, 2011
Tool                         Word Count
TM-Town                      4
Microsoft Word / wc (Unix)   4
Pages                        4
04/04/2011
Tool                         Word Count
TM-Town                      1
Microsoft Word / wc (Unix)   1
Pages                        3
04.04.2011
Tool                         Word Count
TM-Town                      1
Microsoft Word / wc (Unix)   1
Pages                        1
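To make the 'count_as_one' date setting concrete, here is a small Python sketch that collapses a date into a single placeholder token before counting. The pattern covers only a few of the formats mentioned above (long English dates and numeric dates separated by slashes, periods or hyphens); it and the `count_words` helper are assumptions for illustration and are far less thorough than the tool's own checks.

```python
import re

MONTH = (r"(?:January|February|March|April|May|June|July|August|"
         r"September|October|November|December)")
DAY_NAME = r"(?:Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)"

# Hypothetical pattern: "Monday, April 4th, 2011", "April 4th, 2011",
# or numeric dates such as 04/04/2011, 04.04.2011 and 2011-04-04.
# Caveat: the numeric branch also matches things like version numbers.
DATE = re.compile(
    rf"(?:{DAY_NAME},\s+)?{MONTH}\s+\d{{1,2}}(?:st|nd|rd|th)?,\s+\d{{4}}"
    r"|\d{1,4}[/.-]\d{1,2}[/.-]\d{1,4}"
)

def count_words(text, date="no_special_treatment"):
    """Count whitespace tokens, optionally treating each date as one word."""
    if date == "count_as_one":
        # Swap each date for a single placeholder token.
        text = DATE.sub(" DATE ", text)
    return len(text.split())

print(count_words("Monday, April 4th, 2011"))
print(count_words("Monday, April 4th, 2011", "count_as_one"))
```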
Number
default = 'count'
'count'
Counts a number as one word.
'ignore'
Ignores any numbers in the string (with the exception of dates and numbered lists) and does not count them towards the word count.
Simple number 200
Tool                         Word Count
TM-Town                      1
Microsoft Word / wc (Unix)   1
Pages                        1
Number with preceding unit $200
Tool                         Word Count
TM-Town                      1
Microsoft Word / wc (Unix)   1
Pages                        1
Number with unit following 50%
Tool                         Word Count
TM-Town                      1
Microsoft Word / wc (Unix)   1
Pages                        1
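The number settings can be illustrated by filtering out number-like tokens before counting. The sketch below is an assumption-laden simplification: the `NUMBER` pattern and `count_words` helper are hypothetical, and the exceptions for dates and numbered lists described above are not implemented (a date such as 04.04.2011 would be treated as a number here).

```python
import re

# Hypothetical pattern: digits with optional separators, optionally wrapped
# in symbols such as "$" or "%".
NUMBER = re.compile(r"[^\w\s]*\d[\d.,]*[^\w\s]*")

def count_words(text, number="count"):
    """Count whitespace tokens, optionally leaving out number-like tokens."""
    tokens = text.split()
    if number == "ignore":
        tokens = [t for t in tokens if not NUMBER.fullmatch(t)]
    return len(tokens)

print(count_words("Pay $200 now"))
print(count_words("Pay $200 now", "ignore"))
```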
Numbered list
default = 'count'
'count'
Counts a number in a numbered list as one word.
'ignore'
Ignores any numbers that are part of a numbered list and does not count them towards the word count.
1. List item a 2. List item b 3. List item c
Tool                         Word Count
TM-Town                      12
Microsoft Word / wc (Unix)   12
Pages                        9
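The gap between 12 and 9 in the example above comes entirely from the three list markers. A minimal sketch, assuming a marker is a token such as "1." or "2)" (the `LIST_MARKER` pattern and `count_words` helper are hypothetical):

```python
import re

# Hypothetical marker pattern: digits followed by "." or ")".
# Caveat: a number that ends a sentence ("in 2011.") would also match, so a
# real implementation needs a stricter check.
LIST_MARKER = re.compile(r"\d+[.)]")

def count_words(text, numbered_list="count"):
    """Count whitespace tokens, optionally skipping numbered-list markers."""
    tokens = text.split()
    if numbered_list == "ignore":
        tokens = [t for t in tokens if not LIST_MARKER.fullmatch(t)]
    return len(tokens)

sample = "1. List item a 2. List item b 3. List item c"
print(count_words(sample))
print(count_words(sample, "ignore"))
```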
XML and HTML
default = 'remove'
'remove'
Removes any XML or HTML opening and closing tags from the string.
Forward Slash
'count_as_multiple_except_dates'
Separates any tokens that include a forward slash (except dates) at each slash and counts each token individually. Example:
she/he/it 4/25/2014 (4 words)
'count_as_multiple'
Separates any tokens that include a forward slash at each slash and counts each token individually. Whether dates, hyperlinks and XML/HTML tags are included depends on what is set for those options. Example:
she/he/it (3 words)
'count_as_one'
Counts any tokens that include a forward slash as one word. Example:
she/he/it (1 word)
she/he/it
Tool                         Word Count
TM-Town                      3
Microsoft Word / wc (Unix)   1
Pages                        3
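The three forward slash settings can be compared side by side with a short Python sketch. It is illustrative only: the `NUMERIC_DATE` pattern and `count_words` helper are assumptions, and the setting is passed explicitly rather than given a default.

```python
import re

# Hypothetical pattern for slash-separated numeric dates such as 4/25/2014.
NUMERIC_DATE = re.compile(r"\d{1,4}/\d{1,2}/\d{1,4}")

def count_words(text, forward_slash):
    """Count whitespace tokens under one of the three forward slash settings."""
    total = 0
    for token in text.split():
        if "/" not in token or forward_slash == "count_as_one":
            total += 1
        elif (forward_slash == "count_as_multiple_except_dates"
              and NUMERIC_DATE.fullmatch(token)):
            # Dates keep their slashes and count once.
            total += 1
        else:
            # Split at each slash and count the non-empty pieces.
            total += len([part for part in token.split("/") if part])
    return total

sample = "she/he/it 4/25/2014"
for setting in ("count_as_multiple_except_dates", "count_as_multiple", "count_as_one"):
    print(setting, count_words(sample, setting))
```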
Backslash
default = 'count_as_one'
'count_as_one'
Counts any tokens that include a backslash as one word. Example:
c:\Users\johndoe (1 word)
'count_as_multiple'
Separates any tokens that include a backslash at each backslash and counts each token individually. Example:
c:\Users\johndoe (3 words)
c:\Users\johndoe
Tool                         Word Count
TM-Town                      1
Microsoft Word / wc (Unix)   1
Pages                        3
Dotted line
default = 'ignore'
'ignore'
Ignores any dotted lines in the string and does not count them towards the word count.
'count'
Counts a dotted line as one word.
.........
Tool                         Word Count
TM-Town                      0
Microsoft Word / wc (Unix)   1
Pages                        0
………………………
Tool                         Word Count
TM-Town                      0
Microsoft Word / wc (Unix)   1
Pages                        0
Dashed line
default = 'ignore'
'ignore'
Ignores any dashed lines in the string and does not count them towards the word count.
'count'
Counts a dashed line as one word.
-----------
Tool                         Word Count
TM-Town                      0
Microsoft Word / wc (Unix)   1
Pages                        0
Underscore
default = 'ignore'
'ignore'
Ignores any series of underscores in the string and does not count them towards the word count.
'count'
Counts a series of underscores as one word.
____________
Tool                         Word Count
TM-Town                      0
Microsoft Word / wc (Unix)   1
Pages                        0
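Dotted lines, dashed lines and underscores share the same two settings, so a single sketch can cover all three. The minimum run lengths in the patterns below are assumptions chosen for this example (the page does not say where an ellipsis ends and a dotted line begins), and `count_words` is a hypothetical helper.

```python
import re

# Hypothetical patterns; the minimum lengths are guesses for illustration.
LINE_PATTERNS = {
    "dotted_line": re.compile(r"\.{5,}|\u2026{2,}"),
    "dashed_line": re.compile(r"-{2,}"),
    "underscore": re.compile(r"_{2,}"),
}

def count_words(text, dotted_line="ignore", dashed_line="ignore",
                underscore="ignore"):
    """Count whitespace tokens, skipping line-like tokens set to 'ignore'."""
    settings = {
        "dotted_line": dotted_line,
        "dashed_line": dashed_line,
        "underscore": underscore,
    }
    total = 0
    for token in text.split():
        # Find which kind of line (if any) this token is.
        kind = next(
            (name for name, pattern in LINE_PATTERNS.items()
             if pattern.fullmatch(token)),
            None,
        )
        if kind is not None and settings[kind] == "ignore":
            continue
        total += 1
    return total

print(count_words("Name ____________ Date -----------"))
print(count_words("Name ____________ Date -----------", underscore="count"))
```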
Stray punctuation
default = 'ignore'
'ignore'
Ignores any punctuation marks surrounded on both sides by a whitespace in the string and does not count them towards the word count.
'count'
Counts a punctuation mark surrounded on both sides by a whitespace as one word.
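A stray punctuation mark is easy to spot once the text is split on whitespace: it is a token that consists of a single punctuation character. The sketch below assumes ASCII punctuation only and, as a simplification, also treats a lone mark at the very start or end of the text as stray; `count_words` is a hypothetical helper.

```python
import string

def count_words(text, stray_punctuation="ignore"):
    """Count whitespace tokens, optionally skipping lone punctuation marks."""
    tokens = text.split()
    if stray_punctuation == "ignore":
        tokens = [
            t for t in tokens
            # Assumption: only single ASCII punctuation characters are stray.
            if not (len(t) == 1 and t in string.punctuation)
        ]
    return len(tokens)

print(count_words("Hello , world"))
print(count_words("Hello , world", "count"))
```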