Bug #1246
Some works have no spaces after commas or are missing dashes
Start date: 03/07/2016
Due date:
% Done: 0%
Description
Some works include &mdash; and &ndash; entities. These are currently not converted by the front-end and are therefore ignored. As a result, commas end up directly next to the following words without a space:
<span class="placename" data-key="tgn,7012913">Danube</span> ,which after all is but a small fraction of this continent,and, by the destruction of the Persian Empire, they afterwards added to that the dominion of <span class="placename" data-key="tgn,1000004">Asia</span>
Here is the original XML:
<placeName key="tgn,7012913">Danube</placeName>,—which after all is but a small fraction of this continent,—and, by the destruction of the Persian Empire, they afterwards added to that the dominion of <placeName key="tgn,1000004">Asia</placeName>.
The parsed content stored in the database does not include the dashes, so they are being lost during the import process:
<placeName key="tgn,7012913">Danube</placeName>,which after all is but a small fraction of this continent,and, by the destruction of the Persian Empire, they afterwards added to that the dominion of <placeName key="tgn,1000004">Asia</placeName>
Apparently, mdash and ndash are dashes of different lengths:
&ndash; is an en dash (–)
&mdash; is an em dash (—)
See http://stackoverflow.com/questions/16550678/is-it-necessary-to-use-mdash-and-ndash-in-xhtml-or-html5.
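As a quick check of that mapping, Python's standard `html.entities` table lists both names. This is only an illustration; nothing in this ticket says the importer uses this module.

```python
from html import unescape
from html.entities import name2codepoint

# Look up the Unicode code point behind each named entity.
for name in ("ndash", "mdash"):
    codepoint = name2codepoint[name]
    print(f"&{name}; is U+{codepoint:04X}")

# unescape() replaces the entity reference with the actual character.
assert unescape("a&ndash;b") == "a\u2013b"
assert unescape("a&mdash;b") == "a\u2014b"
```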
History
#1 Updated by Luke Murphey almost 9 years ago
A couple of options:
- Add a filter that adds spaces after commas
- Fix the importer to include the &mdash; entities
I believe the importer is dropping the mdash entities because they are not valid XML entities (XML itself predefines only amp, lt, gt, quot and apos); that may be the harder fix.
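Both options could be sketched roughly as follows. This is a minimal sketch, not the project's actual code: the function names `add_space_after_commas` and `expand_named_entities` are hypothetical, and it assumes the importer can pre-process the raw XML text before handing it to the parser. The comma filter only touches text between tags, because attribute values such as `data-key="tgn,7012913"` contain commas that must stay as they are.

```python
import re
from html.entities import name2codepoint

# Entities that XML defines itself; these must stay escaped.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

_TAG_SPLIT_RE = re.compile(r"(<[^>]*>)")
_COMMA_BEFORE_LETTER_RE = re.compile(r",(?=[A-Za-z])")
_ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def add_space_after_commas(markup):
    """Option 1 (hypothetical filter): add a space after commas in text nodes."""
    parts = _TAG_SPLIT_RE.split(markup)
    # With one capturing group, even indices are text and odd indices are tags.
    for i in range(0, len(parts), 2):
        parts[i] = _COMMA_BEFORE_LETTER_RE.sub(", ", parts[i])
    return "".join(parts)


def expand_named_entities(xml_text):
    """Option 2 (hypothetical pre-import step): turn HTML named entities
    such as &mdash; into literal characters before the XML is parsed."""
    def _replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave XML's own or unknown entities alone
        return chr(name2codepoint[name])
    return _ENTITY_RE.sub(_replace, xml_text)


if __name__ == "__main__":
    print(add_space_after_commas('<span data-key="tgn,7012913">Danube</span> ,which'))
    print(expand_named_entities("Danube,&mdash;which &amp; so on"))
```

Note that the first option only hides the spacing symptom; the dashes would still be missing from the stored text, so the second option is the one that restores the original punctuation.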
#2 Updated by Luke Murphey over 8 years ago
- Subject changed from Some works have no spaces after commas to Some works have no spaces after commas or are missing dashes
#3 Updated by Luke Murphey over 8 years ago
Some works are missing dashes. Compare:
#4 Updated by Luke Murphey about 6 years ago
- Target version changed from 3.0 to Plus_3