Version 1 (Luke Murphey, 03/28/2015 01:25 AM) → Version 2/4 (Luke Murphey, 09/17/2015 04:53 PM)
h1. Notes
h2. lxml refuses to parse Unicode content if the content is XML with an encoding declaration
lxml refuses to parse a Unicode string containing XML with an encoding declaration, even when the declared encoding matches the one actually used. The app handles this by attempting to parse the content a second time if parsing the Unicode string failed. This is necessary because I cannot let lxml discover the encoding itself: it doesn't know what the HTTP headers are and cannot detect the encoding as well as the input does (the input uses several methods to determine the encoding). See #987.
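The retry described above might look roughly like the sketch below. This is not the app's actual code: the function name @parse_xml@ and the choice to re-encode the text and pass the already-determined encoding to the parser are assumptions made for illustration. lxml raises @ValueError@ when given a Unicode string that carries an encoding declaration, which is what triggers the second attempt.

```python
from lxml import etree


def parse_xml(content, encoding="utf-8"):
    """Parse decoded XML text, retrying with bytes if lxml rejects the string.

    `content` is text that has already been decoded; `encoding` is the
    encoding to use for the retry (hypothetical parameter for this sketch).
    """
    try:
        # First attempt: hand lxml the Unicode string directly.
        return etree.fromstring(content)
    except ValueError:
        # lxml rejects Unicode strings that include an encoding declaration.
        # Second attempt: encode the text ourselves and tell the parser
        # which encoding to use, so it does not have to guess.
        parser = etree.XMLParser(encoding=encoding)
        return etree.fromstring(content.encode(encoding), parser=parser)


sample = '<?xml version="1.0" encoding="UTF-8"?><root><item>caf\u00e9</item></root>'
root = parse_xml(sample)
```

Passing the encoding explicitly keeps the decision with the input, which has already worked out the encoding, instead of leaving it to lxml.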
h2. I changed the sourcetype and now the match field is no longer a multi-value field; what do I do?
You can use rex to extract the values into a field of your choice. The example below extracts up to 50 matches into a multi-value field called "file", then uses mvexpand to split them into one event per value.
<pre>
sourcetype="downloads" | rex field=_raw "match=(?<file>[.a-zA-Z0-9_]+)" max_match=50 | mvexpand file
</pre>
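To see what the rex expression captures, the same pattern can be tried outside Splunk. The following is only an illustrative sketch: the sample event text and the helper name @extract_files@ are made up, and Python spells the named group as @(?P<file>...)@ where rex uses @(?<file>...)@.

```python
import re

# Same character class as the rex expression above.
MATCH_RE = re.compile(r"match=(?P<file>[.a-zA-Z0-9_]+)")


def extract_files(raw, max_match=50):
    """Return up to max_match captured values, like rex with max_match=50."""
    found = [m.group("file") for m in MATCH_RE.finditer(raw)]
    return found[:max_match]


# Hypothetical raw event; real events will look different.
raw = "url=http://example.com match=setup.exe match=readme.txt match=data_2015.csv"
files = extract_files(raw)
```

Each entry in @files@ corresponds to one value of the multi-value field, which mvexpand would then turn into a separate event.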