Feature #1168
Output raw data
Start date:
05/18/2016
Due date:
% Done:
100%
Description
Add the ability to output the raw content of the matched page without parsing it. The user can then parse the content themselves in SPL (for example, with rex). This would let the user import things such as:
- JSON files
- HTML that is not valid or is rendered by JavaScript (especially document.write() calls).
- Raw content that needs no parsing (like the Internet Storm Center status)
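As a rough illustration of the idea, the sketch below mimics in Python what a user might do with rex in SPL: take the raw page content as-is and pull a value out with a named capture group. The sample content, the field name `level`, and the helper `extract_field` are hypothetical and not taken from the app.

```python
import re

def extract_field(raw_content, pattern):
    """Return the named groups captured by the first match of pattern, or {}."""
    match = re.search(pattern, raw_content)
    return match.groupdict() if match else {}

# Hypothetical raw page body that needs no HTML parsing
raw = "infocon status: green"
fields = extract_field(raw, r"status:\s+(?P<level>\w+)")
```

The point is only that the input hands over the unparsed body and the extraction happens afterwards, at search time.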
Subtasks
Related issues
History
#1 Updated by Luke Murphey almost 9 years ago
- Description updated (diff)
#3 Updated by Luke Murphey over 8 years ago
Here are some issues that would need to be addressed to get this to work:
- How to handle sourcetyping.
- For XML, the input would need to output raw XML so that xpath (http://docs.splunk.com/Documentation/Splunk/6.0.7/SearchReference/Xpath) and KV_MODE=xml could be used.
- For JSON, the sourcetype would need to be defined so that Splunk treats the content as JSON (using INDEXED_EXTRACTIONS = json).
- How to handle the extra fields that are usually included as key-value pairs.
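One way to think about the points above, sketched in Python: choose a sourcetype from the response's content type, and keep the extra fields separate from the raw body so the body stays valid XML or JSON. The sourcetype names and the helpers `pick_sourcetype` and `build_event` are hypothetical; real names would have to match props.conf stanzas that set KV_MODE=xml or INDEXED_EXTRACTIONS = json.

```python
# Hypothetical mapping from content type to sourcetype name
SOURCETYPES = {
    "application/json": "web_input_json",
    "application/xml": "web_input_xml",
    "text/xml": "web_input_xml",
}

def pick_sourcetype(content_type, default="web_input_raw"):
    # Drop parameters such as "; charset=utf-8" before the lookup
    media_type = content_type.split(";")[0].strip().lower()
    return SOURCETYPES.get(media_type, default)

def build_event(raw_content, content_type, extra_fields):
    # Keep the body untouched; carry the extra key-value pairs alongside it
    return {
        "raw": raw_content,
        "sourcetype": pick_sourcetype(content_type),
        "fields": dict(extra_fields),
    }
```

Where the extra fields should finally end up (a separate event, the source field, or somewhere else) is left open here, as it is in the comment above.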
#4 Updated by Luke Murphey over 8 years ago
#5 Updated by Luke Murphey over 8 years ago
Not sure how to output raw data while also defining the sourcetype and index.
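One possible answer, assuming the input runs as a Splunk modular input in XML streaming mode (whether this app does so is an assumption): each event in the stream can carry its own sourcetype and index elements next to the data element. A minimal sketch, with a hypothetical helper name and hypothetical sourcetype and index values:

```python
import xml.etree.ElementTree as ET

def make_stream_event(raw_content, sourcetype, index):
    """Wrap raw content in a <stream><event> element with per-event metadata."""
    event = ET.Element("event")
    # ElementTree escapes characters such as < and & in the text for us
    ET.SubElement(event, "data").text = raw_content
    ET.SubElement(event, "sourcetype").text = sourcetype
    ET.SubElement(event, "index").text = index
    stream = ET.Element("stream")
    stream.append(event)
    return ET.tostring(stream, encoding="unicode")

# Hypothetical values for illustration only
xml_out = make_stream_event('{"status": "<ok>"}', "web_input_json", "main")
```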
#6 Updated by Luke Murphey over 8 years ago
I currently have to strip line endings. To preserve them, I would need to configure a better line breaker (http://docs.splunk.com/Documentation/Splunk/6.2.2/Data/Indexmulti-lineevents).
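The stripping step could look roughly like the sketch below; the helper name and the choice of a single space as separator are assumptions, not the app's actual code. The alternative of preserving line endings would instead be a props.conf change (for example via LINE_BREAKER), which is not shown here.

```python
def strip_line_endings(raw_content, separator=" "):
    """Collapse multi-line content onto one line so it arrives as one event."""
    # splitlines() splits on \n, \r\n, \r and other line boundaries;
    # blank lines are dropped so they do not produce doubled separators
    lines = [line for line in raw_content.splitlines() if line.strip()]
    return separator.join(lines)
```

The trade-off is that the original layout is lost, which matters for content where line structure carries meaning.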
#7 Updated by Luke Murphey over 8 years ago
- Status changed from New to Closed
#8 Updated by Luke Murphey almost 8 years ago
- Related to Feature #1748: Add parsing of JSON fields added