Bug #1247
Word lookup fails to return all forms
Start date:
03/07/2016
Due date:
% Done:
100%
Description
Here are some cases that don't work:
- νόμος (νόμον works)
- ἔργα (doesn't return all forms, though ἔργον returns many of them)
This means the search engine isn't expanding words into all of the possible forms.
History
#1 Updated by Luke Murphey over 8 years ago
- Subject changed from "Word lookup fails" to "Word lookup fails to return all forms"
- Description updated (diff)
#2 Updated by Luke Murphey over 8 years ago
Call stack:
- GreekVariations::get_variations()
- reader.utils.get_all_related_forms()
- reader.utils.get_word_descriptions()
#3 Updated by Luke Murphey over 8 years ago
νόμος has a word form:
http://127.0.0.1:8080/admin/reader/wordform/?q=%CE%BD%CF%8C%CE%BC%CE%BF%CF%82
from reader.models import *

descriptions = WordDescription.objects.filter(word_form__form="νόμος")

# Collect the distinct lemmas behind the form that was looked up
lemmas = []
for d in descriptions:
    if d.lemma not in lemmas:
        lemmas.append(d.lemma)

# Gather every word form that shares one of those lemmas
word_forms = []
for l in lemmas:
    matching_descs = WordDescription.objects.filter(lemma=l)
    for m in matching_descs:
        if m.word_form not in word_forms:
            word_forms.append(m.word_form)
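For reference, the same form -> lemma -> related forms expansion can be tried without the database. This is only a sketch: expand_forms() and the SAMPLE pairs are made up for illustration and are not part of reader.utils or the real word data.

```python
def expand_forms(form, descriptions):
    """Return every form that shares a lemma with `form`.

    `descriptions` is an iterable of (form, lemma) pairs; it stands in for
    the WordDescription rows and is a hypothetical structure.
    """
    descriptions = list(descriptions)

    # Step 1: find the lemmas that the looked-up form belongs to
    lemmas = []
    for known_form, lemma in descriptions:
        if known_form == form and lemma not in lemmas:
            lemmas.append(lemma)

    # Step 2: collect every form attached to one of those lemmas
    related = []
    for known_form, lemma in descriptions:
        if lemma in lemmas and known_form not in related:
            related.append(known_form)
    return related


# Made-up sample data, just enough to exercise the two steps
SAMPLE = [
    ("νόμος", "νόμος"),
    ("νόμον", "νόμος"),
    ("νόμου", "νόμος"),
    ("ἔργα", "ἔργον"),
    ("ἔργον", "ἔργον"),
]

print(expand_forms("νόμον", SAMPLE))
```

If the looked-up form is not matched in step 1 (for example because the stored string differs by one character), step 2 has no lemmas to work with and nothing comes back.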
#4 Updated by Luke Murphey over 8 years ago
I'm seeing uppercase characters in the forms:
[<WordForm: Νόμε>, <WordForm: Νόμοι>, <WordForm: Νόμοιν>, <WordForm: Νόμοιο>, <WordForm: Νόμοις>, <WordForm: Νόμοισι>, <WordForm: Νόμοισιν>, <WordForm: Νόμον>, <WordForm: Νόμος>, <WordForm: Νόμου>, <WordForm: Νόμους>, <WordForm: Νόμω>, <WordForm: Νόμωι>, <WordForm: Νόμων>, <WordForm: Νόμως>, <WordForm: Νόμῳ>, <WordForm: νόμε>, <WordForm: νόμοι>, <WordForm: νόμοιν>, <WordForm: νόμοιο>, <WordForm: νόμοις>, <WordForm: νόμοισι>, <WordForm: νόμοισιν>, <WordForm: νόμον>, <WordForm: νόμος>, <WordForm: νόμου>, <WordForm: νόμους>, <WordForm: νόμω>, <WordForm: νόμωι>, <WordForm: νόμων>, <WordForm: νόμως>, <WordForm: νόμῳ>]
#5 Updated by Luke Murphey over 8 years ago
- Status changed from New to In Progress
- % Done changed from 0 to 20
#6 Updated by Luke Murphey over 8 years ago
I'm finding that a search doesn't return any word descriptions: http://127.0.0.1:8080/api/word_forms/%CE%BD%CF%8C%CE%BC%CE%BF%CF%82
This works for some reason though:
from reader.utils import *
get_word_descriptions(u"νόμος")
#7 Updated by Luke Murphey over 8 years ago
I think this has to do with the two forms of sigma (medial σ and final ς).
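A quick way to see how the two sigmas can break an exact match: ς (U+03C2) and σ (U+03C3) are different code points, so a string ending in one never equals the same word ending in the other. The helpers below are hypothetical and only illustrate the comparison; they are not taken from the reader code.

```python
FINAL_SIGMA = "\u03c2"   # ς, used at the end of a word
MEDIAL_SIGMA = "\u03c3"  # σ, used everywhere else


def fold_sigma(word):
    """Replace final sigma with medial sigma so both spellings compare equal."""
    return word.replace(FINAL_SIGMA, MEDIAL_SIGMA)


def same_word(a, b):
    """Compare two words while ignoring which sigma form was used."""
    return fold_sigma(a) == fold_sigma(b)


with_final = "νόμο" + FINAL_SIGMA
with_medial = "νόμο" + MEDIAL_SIGMA

print(with_final == with_medial)           # plain comparison
print(same_word(with_final, with_medial))  # comparison after folding
```

If a conversion step emits σ where the stored form has ς, an exact filter on the form would miss it even though the two strings look nearly identical on screen.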
#8 Updated by Luke Murphey over 8 years ago
Removing Greek.beta_code_to_unicode() from line 426 in get_variations() makes this work.
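An alternative to dropping the call entirely would be to convert only input that looks like beta code. This is a sketch under the assumption that beta code input is always plain ASCII and that Unicode input should pass through untouched. to_unicode_if_beta_code() is a hypothetical helper, and the converter is passed in as a parameter because the exact behaviour of Greek.beta_code_to_unicode() on Unicode input is not shown here.

```python
def is_ascii(text):
    """True when every character is plain ASCII, as beta code is assumed to be."""
    return all(ord(ch) < 128 for ch in text)


def to_unicode_if_beta_code(text, convert):
    """Run `convert` only on ASCII input; return Unicode Greek unchanged.

    `convert` stands in for a beta-code-to-Unicode function (an assumption:
    any callable that takes and returns a string).
    """
    if is_ascii(text):
        return convert(text)
    return text


def fake_convert(text):
    """Stub converter used only for demonstration; it just tags its input."""
    return "<converted:" + text + ">"


print(to_unicode_if_beta_code("no/mos", fake_convert))  # ASCII, goes through convert
print(to_unicode_if_beta_code("νόμος", fake_convert))   # already Greek, left alone
```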
#9 Updated by Luke Murphey over 8 years ago
- Status changed from In Progress to Closed
- % Done changed from 20 to 100