Bug #439

Feature #403: Perseus Book Importer

Importer fails on works with no state set

Added by Luke Murphey about 12 years ago. Updated about 12 years ago.

Status: Closed
Priority: High
Assignee:
Target version:
Start date:
Due date:
% Done: 100%

Description

The following works have no state sets and cannot be imported:

aristoph.wasps_gk.xml
aristoph.thes_gk.xml
aristoph.pl_gk.xml
aristoph.peace_gk.xml
aristoph.lys_gk.xml
aristoph.kn_gk.xml
aristoph.frogs_gk.xml
aristoph.eccl_gk.xml
aristoph.cl_gk.xml
aristoph.birds_gk.xml
aristoph.ach_gk.xml

reader.importer.PerseusBatchImporter: Exception generated when attempting to process file="aristoph.wasps_gk.xml"
Traceback (most recent call last):
  File "/Users/lmurphey/Documents/SP/Workspace/TextCritical.com/src/reader/importer/PerseusBatchImporter.py", line 326, in process_directory
    if self.__process_file__( os.path.join( root, f) ):
  File "/Users/lmurphey/Documents/SP/Workspace/TextCritical.com/src/reader/importer/PerseusBatchImporter.py", line 277, in __process_file__
    return self.process_file(file_path, document_xml, title, author, language)
  File "/Users/lmurphey/Documents/SP/Workspace/TextCritical.com/src/reader/importer/PerseusBatchImporter.py", line 453, in process_file
    perseus_importer.import_file(file_path)
  File "/Users/lmurphey/Documents/SP/Workspace/TextCritical.com/src/reader/importer/Perseus.py", line 139, in import_file
    return self.import_xml_document(doc)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/django/db/transaction.py", line 209, in inner
    return func(*args, **kwargs)
  File "/Users/lmurphey/Documents/SP/Workspace/TextCritical.com/src/reader/importer/Perseus.py", line 571, in import_xml_document
    current_state_set = state_sets[self.state_set]
IndexError: list index out of range
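
The last frame of the traceback indexes state_sets with the configured self.state_set without checking that the list is long enough. A minimal sketch of a guarded lookup follows, assuming state_sets is a plain list and the configured index is an integer. select_state_set is a hypothetical helper for illustration and is not part of the importer:

```python
def select_state_set(state_sets, index):
    """Return state_sets[index], or None when the document has no such state set.

    Hypothetical helper for illustration; not part of PerseusTextImporter.
    """
    # Reject negative indexes too, so -1 does not silently pick the last set.
    if index is None or index < 0 or index >= len(state_sets):
        return None
    return state_sets[index]


# A work with no state sets, as with the Aristophanes files listed above.
assert select_state_set([], 0) is None
# A work with two state sets still resolves normally.
assert select_state_set(["first", "second"], 1) == "second"
```

Returning None rather than raising would leave it to the caller to decide whether a missing state set means "import without one" or "skip the file".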

History

#1 Updated by Luke Murphey about 12 years ago

  • Description updated (diff)
  • Parent task set to #403

#2 Updated by Luke Murphey about 12 years ago

  • Status changed from New to In Progress

#3 Updated by Luke Murphey about 12 years ago

  • Description updated (diff)

The following methods of PerseusTextImporter will likely need changes to accept documents that have no state sets:

  • is_milestone_chunk
  • get_state_for_milestone
  • is_milestone_in_state_set

#4 Updated by Luke Murphey about 12 years ago

This appears to have been a problem with the import policy: it specified that state set one should be used, but that state set does not exist in these works.
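
One way to catch this kind of mismatch before an import runs would be to compare the state set index requested by the policy with the number of state sets each document actually declares. The sketch below assumes the policy can be reduced to a mapping from file name to index and that the state set counts are already known. find_policy_mismatches, both dictionaries, and the second file name are hypothetical and do not reflect the real import policy format:

```python
def find_policy_mismatches(policy_indexes, state_set_counts):
    """Return sorted file names whose requested state set index does not exist.

    policy_indexes: {file name: state set index requested by the policy}
    state_set_counts: {file name: number of state sets found in the document}
    Both structures are assumptions made for this illustration.
    """
    mismatches = []
    for file_name, index in policy_indexes.items():
        # A file missing from the counts is treated as having no state sets.
        count = state_set_counts.get(file_name, 0)
        if not 0 <= index < count:
            mismatches.append(file_name)
    return sorted(mismatches)


# "example_with_two_sets.xml" is a made-up file name used only for contrast.
policy_indexes = {"aristoph.wasps_gk.xml": 1, "example_with_two_sets.xml": 1}
state_set_counts = {"aristoph.wasps_gk.xml": 0, "example_with_two_sets.xml": 2}
print(find_policy_mismatches(policy_indexes, state_set_counts))
```

A check like this reports every affected file in one pass instead of stopping at the first IndexError.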

#5 Updated by Luke Murphey about 12 years ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100
