Bug #2616

Berean bible has weird characters

Added by Luke Murphey over 4 years ago. Updated over 4 years ago.

Status: Closed
Priority: Normal
Assignee:
Target version:
Start date: 03/22/2020
Due date:
% Done: 100%


Description

The weird characters appear as &#147 where the double quotes should be.
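
For reference, &#147 is a numeric character reference to code point 147 (0x93), which HTML parsers map through the Windows-1252 table to a left double quotation mark. A minimal sketch using Python's standard html module, assuming the imported text contains the literal reference rather than a raw byte (the sample string is made up for illustration):

```python
import html

# Assumption: the text holds literal references such as "&#147" and "&#148".
raw = "&#147In the beginning&#148"

# html.unescape resolves numeric references, including the 0x80-0x9f range
# that HTML interprets as Windows-1252 characters.
fixed = html.unescape(raw)
print(fixed)
```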


Related issues

Related to TextCritical.net - Feature #2461: Add Berean Study Bible Closed 02/06/2020

Associated revisions

Revision 1100 (diff)
Added by luke.murphey over 4 years ago

Fixing Berean import failure

Reference #2616

Revision 1109 (diff)
Added by luke.murphey over 4 years ago

Fixing Berean import failure

Reference #2616

Revision 1101 (diff)
Added by luke.murphey over 4 years ago

Fixing test cases where the file got fixed in the test somehow

Reference #2616

Revision 1110 (diff)
Added by luke.murphey over 4 years ago

Fixing test cases where the file got fixed in the test somehow

Reference #2616

History

#1 Updated by Luke Murphey over 4 years ago

Obs:
Qs:
  • Is there something missing where the chars are?
    • Double quotes
  • What char is this?
  • What encoding is this file?
    • UTF-8 works with the double-quotes but complains about the start character: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa9 in position 40: invalid start byte
    • cp1252 cannot load: UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 614: character maps to <undefined>
    • iso-8859-1 fails to load the double-quotes
    • windows-1252: UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 614: character maps to <undefined>
  • What is the difference between cp1252 and windows-1252?
  • Where exactly is the problem in the file with cp1252?
    • The text at that position actually looks fine
    • The tests still fail on it with surrogateescape, though. Strange, because the import worked fine this way.
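
The encoding checks above can be reproduced on a small byte string. The sketch below assumes, as a guess the ticket does not confirm, that the file mixes a cp1252 copyright sign (0xa9) with UTF-8 curly quotes, whose closing quote ends in byte 0x9d. SAMPLE, try_decode and cp1252_fallback are hypothetical names, not code from the project. It also shows that Python treats windows-1252 as an alias of cp1252, which would explain the identical errors.

```python
import codecs

# Hypothetical sample: a cp1252 copyright sign followed by UTF-8 curly quotes.
SAMPLE = b"\xa9 BSB: \xe2\x80\x9cIn the beginning\xe2\x80\x9d"


def try_decode(data, encoding):
    """Return the decoded text, or a short description of the failure."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as err:
        return "failed on byte 0x%02x at position %d" % (data[err.start], err.start)


def cp1252_fallback(err):
    """Error handler: decode bytes that are invalid UTF-8 as cp1252 instead."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    return bad.decode("cp1252", errors="replace"), err.end


codecs.register_error("cp1252_fallback", cp1252_fallback)

# Both names resolve to the same codec.
print(codecs.lookup("windows-1252").name, codecs.lookup("cp1252").name)

# utf-8 and cp1252 each fail on a different byte; iso-8859-1 never fails
# but turns the UTF-8 quotes into mojibake.
for name in ("utf-8", "cp1252", "iso-8859-1"):
    print(name, "->", repr(try_decode(SAMPLE, name)))

# Decode as UTF-8 and fall back to cp1252 only for the invalid bytes.
print(SAMPLE.decode("utf-8", errors="cp1252_fallback"))

# surrogateescape turns the undecodable 0x9d into a lone surrogate, which a
# strict UTF-8 encode later rejects (one possible cause of the test failures).
escaped = SAMPLE.decode("cp1252", errors="surrogateescape")
try:
    escaped.encode("utf-8")
except UnicodeEncodeError as err:
    print("re-encode failed:", err.reason)
```

Whether a fallback handler like this suits the real import depends on what the file actually contains; inspecting the bytes around positions 40 and 614 would confirm or rule out the mixed-encoding assumption.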
Refs:

#2 Updated by Luke Murphey over 4 years ago

  • Assignee set to Luke Murphey
  • Target version set to 3.2.1

#3 Updated by Luke Murphey over 4 years ago

#4 Updated by Luke Murphey over 4 years ago

python3 manage.py import_berean_bible -f /db/bsb.txt

#5 Updated by Luke Murphey over 4 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100
