Bug #442
Imported works with no descriptor on first verse fail
100%
Description
- Symposium By Lucian
Associated revisions
Added option to ignore content before the first milestone. Reference #442.
History
#1 Updated by Luke Murphey about 12 years ago
- Subject changed from Imported works no descriptor on first verse fail to Imported works with no descriptor on first verse fail
#2 Updated by Luke Murphey about 12 years ago
These can be found by looking for events with the following in them: "Milestone observed that did not have an associated unit, the sequence number will be used instead"
#3 Updated by Luke Murphey about 12 years ago
Many of the works by Lucian are affected by this.
#4 Updated by Luke Murphey almost 12 years ago
The problem is that the text within the div1 node is getting created in a division that is at the same level as the section node after it. The div1 should be one level above:
<text>
  <body>
    <pb id="v.5.p.476"/>
    <div1>
      <p> a)pokhruxqei/s tis i)atrikh\n e)ce/maqen. mane/nta to\n pate/ra kai\ u(po\ tw=n a)llwn i)atrw=n a)pegnwsme/non i)asa/menos farma/kou do/sei a)nelh/fqh au)=qis e)s to\ ge/nos. meta\ tau=ta memhnui/an th\n mhtruia\n i)a/sasqai keleuo/menos <gap/> a)pokhru/ttetai.</p>
      <p> <milestone unit="section" n="1"/> ou) kaina\ me\n tau=ta,
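A minimal sketch of how the content that precedes the first milestone could be located, assuming Python's standard xml.etree.ElementTree. The sample XML is a shortened, well-formed stand-in for the excerpt above, and content_before_first_milestone() is a hypothetical helper, not a function from the importer.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample modelled on the excerpt above (text shortened).
SAMPLE = """<text><body><div1>
<p>summary before any milestone</p>
<p><milestone unit="section" n="1"/>first section text</p>
</div1></body></text>"""


def content_before_first_milestone(div):
    """Collect the text found in `div` before its first <milestone> element."""
    chunks = []

    def walk(node):
        # Returns True once a milestone has been reached, which stops the walk.
        if node.tag == "milestone":
            return True
        if node.text and node.text.strip():
            chunks.append(node.text.strip())
        for child in node:
            if walk(child):
                return True
            if child.tail and child.tail.strip():
                chunks.append(child.tail.strip())
        return False

    walk(div)
    return chunks


div1 = ET.fromstring(SAMPLE).find("./body/div1")
print(content_before_first_milestone(div1))  # ['summary before any milestone']
```

Anything returned here is text that belongs to the div1 itself rather than to a section, so it is the content that needs either its own division one level up or an explicit decision to skip it.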
#5 Updated by Luke Murphey almost 12 years ago
- Assignee set to Luke Murphey
- % Done changed from 0 to 30
#6 Updated by Luke Murphey almost 12 years ago
Need to rethink how levels are assigned. Below is a list of the types of divisions that exist:
- text nodes
- divisions (div0, div1)
- milestone (chapters, sections)
We could assign the nodes a static range of levels. Something like:
- text = level 0
- div = level 1-10
- milestone = level 11
The only problem with this is that the document will have gaps in the levels.
We could store a value which indicates the maximum level assigned to a div or text node.
Finally, we could pass the closest division object down to the calls to process the downstream nodes when processing the milestones.
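The static-range option could look roughly like the sketch below. The constants follow the ranges listed above; static_level() and the mapping of div0 to the first division level are assumptions made for illustration only.

```python
import re

# Level ranges proposed in the comment above.
TEXT_LEVEL = 0
DIV_LEVEL_MIN, DIV_LEVEL_MAX = 1, 10
MILESTONE_LEVEL = 11


def static_level(tag):
    """Return a level for a node tag using fixed ranges (hypothetical helper)."""
    if tag == "text":
        return TEXT_LEVEL
    if tag == "milestone":
        return MILESTONE_LEVEL
    match = re.fullmatch(r"div(\d+)", tag)
    if match is None:
        raise ValueError("unsupported tag: %s" % tag)
    # Assumption: div0 takes the first division level, div1 the next, and so on.
    level = DIV_LEVEL_MIN + int(match.group(1))
    if level > DIV_LEVEL_MAX:
        raise ValueError("division nested too deeply: %s" % tag)
    return level


levels = [static_level(tag) for tag in ("text", "div1", "milestone")]
print(levels)  # [0, 2, 11]
```

The output shows the gap problem directly: a document with only a text node, a div1, and milestones uses levels 0, 2, and 11 and leaves the rest unused.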
#7 Updated by Luke Murphey almost 12 years ago
Here is how the level is currently defined:
- text nodes are assigned a level 0
- division nodes are assigned the level that is parsed from the node tag name (e.g. div1 is level 1)
- milestones are assigned a level from the state info where the level is assigned based on the position in the state-info node (the first is level 1, the second level 2, etc.)
Assigning levels is important for:
- Determining if a division is under or above another division
- Determining if divisions in a given work are under the same division
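The rules above can be sketched as follows. This is an illustration under stated assumptions, not the importer's actual code: current_level() is a hypothetical name, and state_units stands for the ordered list of units taken from the state info.

```python
import re


def current_level(tag, unit=None, state_units=()):
    """Assign a level following the three rules described above (hypothetical)."""
    if tag == "text":
        return 0
    match = re.fullmatch(r"div(\d+)", tag)
    if match:
        # The level is parsed from the tag name, e.g. div1 -> 1.
        return int(match.group(1))
    if tag == "milestone":
        # The level is the 1-based position of the unit in the state info.
        return list(state_units).index(unit) + 1
    raise ValueError("unsupported tag: %s" % tag)


# Assumed state info with a single unit, as in a work divided only by section.
units = ["section"]
print(current_level("div1"))                          # 1
print(current_level("milestone", "section", units))   # 1
```

Under these assumptions a div1 and the first milestone unit both get level 1, which looks consistent with the symptom reported in #4, where the div1 content ends up at the same level as the section that follows it.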
#8 Updated by Luke Murphey almost 12 years ago
Works by Epictetus may be suffering from this. It seems like the divisions are awkwardly assigned.
#9 Updated by Luke Murphey almost 12 years ago
From what I can tell, the divisions are assigned correctly. The refs declaration seems to match the actual structure of the document:
<refsDecl doctype="TEI.2">
  <state unit="text"/>
  <state delim="." unit="book"/>
  <state delim="." unit="chapter"/>
  <state n="chunk" unit="section"/>
</refsDecl>
The only real problem is with verses that are not associated with milestones.
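For checking a declaration like the one above against the divisions that were actually imported, the ordered list of units can be read with a few lines of Python. This is a minimal sketch using the standard xml.etree.ElementTree module; state_units() is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The refsDecl quoted above.
REFS_DECL = """<refsDecl doctype="TEI.2">
  <state unit="text"/>
  <state delim="." unit="book"/>
  <state delim="." unit="chapter"/>
  <state n="chunk" unit="section"/>
</refsDecl>"""


def state_units(refs_decl_xml):
    """Return the unit names of the <state> elements in document order."""
    root = ET.fromstring(refs_decl_xml)
    return [state.get("unit") for state in root.findall("state")]


print(state_units(REFS_DECL))  # ['text', 'book', 'chapter', 'section']
```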
#10 Updated by Luke Murphey almost 12 years ago
The following works are affected:
- Abdicatus
- Anacharsis
- Bis accusatus sive tribunalia
- Cataplus
- Contemplantes
- De morte Peregrini
- De parasito sive artem esse parasiticam
- De saltatione
- Dearum judicium
- Deorum concilium
- Dialogi Marini
- Eunuchus
- Fugitivi
- Gallus
- Icaromenippus
- Imagines
- Juppiter confutatus
- Juppiter tragoedus
- Lexiphanes
- Necyomantia
- Nigrinus
- Philopseudes sive incredulus
- Piscator
- Pro imaginibus
- Prometheus
- Symposium
- Timon
- Toxaris vel amicitia
- Tyrannicida
- Vitarum auctio
Also, Dialogi Marini has a weird first verse with no content.
#11 Updated by Luke Murphey almost 12 years ago
The question is whether I handle this when chopping up the divisions or when creating the verses. However, milestone node content is not included in the division since the functions that import the verse nodes take this to be a new verse.
#12 Updated by Luke Murphey almost 12 years ago
- % Done changed from 30 to 70
After making this change, I am getting the following error:
2012-11-28 02:12:11,811 [ERROR] reader.importer.PerseusBatchImporter: Exception generated when attempting to process file="07_gk.xml"
Traceback (most recent call last):
  File "/Users/lmurphey/Documents/SP/Workspace/TextCritical.com/src/reader/importer/PerseusBatchImporter.py", line 327, in process_directory
    if self.__process_file__( os.path.join( root, f) ):
  File "/Users/lmurphey/Documents/SP/Workspace/TextCritical.com/src/reader/importer/PerseusBatchImporter.py", line 278, in __process_file__
    return self.process_file(file_path, document_xml, title, author, language)
  File "/Users/lmurphey/Documents/SP/Workspace/TextCritical.com/src/reader/importer/PerseusBatchImporter.py", line 456, in process_file
    perseus_importer.import_file(file_path)
  File "/Users/lmurphey/Documents/SP/Workspace/TextCritical.com/src/reader/importer/Perseus.py", line 144, in import_file
    return self.import_xml_document(doc)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/django/db/transaction.py", line 209, in inner
    return func(*args, **kwargs)
  File "/Users/lmurphey/Documents/SP/Workspace/TextCritical.com/src/reader/importer/Perseus.py", line 700, in import_xml_document
    raise Exception("No verses were discovered, title=%s" % (self.work.title) )
Exception: No verses were discovered, title=Nigrinus
#13 Updated by Luke Murphey almost 12 years ago
- Status changed from New to Closed
- % Done changed from 70 to 100
When no division id is provided in the URL, get_chapters_list() returns all of the