Feature #307

Support Parsing of Content with jQuery-like Syntax

Added by Luke Murphey over 13 years ago. Updated about 13 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Scan Engine
Target version: -
Start date: 01/12/2011
Due date: 01/15/2011
% Done: 100%


Description

Supporting jQuery-like selector syntax would make it easier to extract data from parsed content. See http://jsoup.org/cookbook/extracting-data/selector-syntax

History

#1 Updated by Luke Murphey over 13 years ago

  • Assignee set to Luke Murphey
  • Target version set to 1.0.3

#2 Updated by Luke Murphey over 13 years ago

  • Category set to Scan Engine

#3 Updated by Luke Murphey over 13 years ago

  • Due date set to 01/15/2011
  • Start date set to 01/12/2011
  • % Done changed from 0 to 40

jsoup has been added. However, the parsed content is not stored, which means that definitions will have to reparse the content each time (slowing down definitions). An alternative would be to:

  1. Cache the jsoup document in HttpResponseData
  2. Replace the older HTML parsing library completely with jsoup
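
The caching idea in option 1 could look roughly like the sketch below. This is only an illustration: `CachedResponseData` is a hypothetical stand-in for HttpResponseData (whose real fields and methods are not shown in this ticket), and the parser is passed in as a function so the sketch does not depend on any particular HTML library.

```java
import java.util.function.Function;

// Hypothetical stand-in for HttpResponseData; the real class is not shown in this ticket.
final class CachedResponseData<D> {
    private final String body;                 // raw response content
    private final Function<String, D> parser;  // e.g. a function that calls the HTML parser
    private D parsed;                          // cached parse result, null until first use

    CachedResponseData(String body, Function<String, D> parser) {
        this.body = body;
        this.parser = parser;
    }

    // Parses the body on the first call and returns the cached result afterwards,
    // so several definitions can share one parsed document.
    synchronized D getDocument() {
        if (parsed == null) {
            parsed = parser.apply(body);
        }
        return parsed;
    }
}
```

With jsoup on the classpath, the parser argument could be something like `html -> Jsoup.parse(html)`, so that each definition calls `getDocument()` instead of reparsing the content.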

#4 Updated by Luke Murphey over 13 years ago

Two options are available for this:
  • Could add the ability to store items about the specimen in the specimen itself (like decorations)
  • Could use the ability provided in r1157 to store variables for the current URL evaluation (this would have to be done by convention, not something enforced by the scan engine)
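
As a rough illustration of the second option, the sketch below stores a parsed document under a key that definitions agree on by convention. The class `EvaluationVariables` and the key name are assumptions made for this example; the actual mechanism added in r1157 is not shown in this ticket.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical per-URL variable store; not the actual r1157 implementation.
final class EvaluationVariables {
    // Key agreed on by convention among definitions; the name is an assumption.
    static final String PARSED_DOCUMENT_KEY = "parsed.document";

    private final Map<String, Object> values = new HashMap<>();

    // Returns the stored value for the key, computing and storing it on first use.
    Object getOrCompute(String key, Supplier<?> supplier) {
        return values.computeIfAbsent(key, k -> supplier.get());
    }

    // Called when evaluation moves on to the next URL.
    void clear() {
        values.clear();
    }
}
```

Because nothing enforces the key, every definition has to use the same key and expect the same value type, which is the "by convention" caveat noted above.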

#5 Updated by Luke Murphey over 13 years ago

  • Target version deleted (1.0.3)

#6 Updated by Luke Murphey about 13 years ago

  • Status changed from New to Closed
  • % Done changed from 40 to 100

Having all of the content pre-processed should not be necessary.
