Feature #2220
Stream results without caching them to reduce memory usage
100%
Associated revisions
Adding test server that includes a large chunk of text
This is useful for performance testing
Reference #2220
Adding streaming of results as they arrive
Reference #2220
Increasing the size of test output to increase the chance of seeing performance problems
Reference #2220
Forcing garbage collection to make performance analysis easier
This also reduces overall memory usage by cleaning some things up
while the input waits for the next run
Reference #2220
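The garbage-collection revision above is not shown here, but the idea can be sketched as follows. This is a minimal illustration under assumptions: run_and_collect() and its run_once parameter are hypothetical names, not taken from the actual code. gc.collect() is the standard-library call that forces a full collection.

```python
import gc
from typing import Callable


def run_and_collect(run_once: Callable[[], None]) -> int:
    """Run the input once, then force a full garbage collection.

    `run_once` is a hypothetical stand-in for one execution of the input.
    Returns the number of unreachable objects the collector found.
    """
    run_once()
    # Collect now instead of waiting for the interpreter to decide, so that
    # memory held by reference cycles is released while the input is idle
    # between runs.
    return gc.collect()
```

Collecting at a fixed point after each run also makes memory graphs easier to compare between runs, since leftover garbage from one run no longer overlaps with the next.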
History
#1 Updated by Luke Murphey over 6 years ago
web_input.py: calls output_event in the modular input class
To make this work, I would need to:
- Pass an output result function to scrape_page()
- Pass the output result function to get_result_single()
- Keep a result count around, perhaps by having get_result_single() return only a small set of fields, such as just the URL.
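The steps above could look roughly like the following. This is a minimal sketch, not the actual web_input.py code: the signatures of scrape_page() and get_result_single(), the fetch parameter, and the result fields are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable

Result = Dict[str, str]


def get_result_single(url: str,
                      fetch: Callable[[str], str],
                      output_result: Callable[[Result], None]) -> Result:
    """Fetch one page, emit the full result at once, return a small summary.

    `fetch` and `output_result` are hypothetical parameters for this sketch.
    """
    content = fetch(url)
    # Hand the full result to the caller-supplied function immediately, so it
    # does not have to be cached until the whole run finishes.
    output_result({"url": url, "content": content})
    # Return only a small set of fields (here just the URL).
    return {"url": url}


def scrape_page(urls: Iterable[str],
                fetch: Callable[[str], str],
                output_result: Callable[[Result], None]) -> int:
    """Scrape each URL, streaming results out; return how many were emitted."""
    result_count = 0
    for url in urls:
        get_result_single(url, fetch, output_result)
        result_count += 1
    return result_count
```

In the modular input, the function passed as output_result would presumably be a thin wrapper around output_event, so each result is written out as soon as it arrives and only the count is kept.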
#2 Updated by Luke Murphey over 6 years ago
Before I make the change, I should test the existing code and monitor its memory usage. I could do this by patching the internal web-server to return content with variable URLs that should cause the existing design to load up memory.
Then I can run the same test after the change to show that memory usage is down.
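A test server along these lines could be sketched with the standard library as below. This is one possible reading of the plan, not the patch that was actually made: make_body(), LargeTextHandler, run(), the port, and the default body size are all hypothetical, and the sketch assumes "variable" means that every request path gets different large content.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

CHUNK = "Lorem ipsum dolor sit amet. "


def make_body(path: str, size: int = 1_000_000) -> bytes:
    """Build a large text body that differs for every request path."""
    header = f"Content for {path}\n"
    filler = CHUNK * (size // len(CHUNK) + 1)
    # The path-specific header makes each URL's content unique; the filler
    # pads the body with `size` characters of text.
    return (header + filler[:size]).encode("utf-8")


class LargeTextHandler(BaseHTTPRequestHandler):
    """Answer every GET with a large, path-dependent chunk of text."""

    def do_GET(self):
        body = make_body(self.path)
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port: int = 8888) -> None:
    """Serve on localhost until interrupted (port is an arbitrary choice)."""
    HTTPServer(("127.0.0.1", port), LargeTextHandler).serve_forever()
```

Calling run() and pointing the input at many distinct paths on this server should give the same workload before and after the change, which keeps the memory comparison fair.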
#3 Updated by Luke Murphey over 6 years ago
- Target version changed from 4.5.3 to 4.5.2
#4 Updated by Luke Murphey over 6 years ago
Monitoring memory with:
source=top PID=57782 | timechart max(RES) as memory
#5 Updated by Luke Murphey over 6 years ago
It seems like some memory isn't being reclaimed. When I output only when contents change, memory usage actually seems lower, and memory is reclaimed faster once the input is done.
Next steps:
- Disable streaming, see if the problem changes
- Disable parts of the output function, see if the problem changes
#6 Updated by Luke Murphey over 6 years ago
- % Done changed from 0 to 70
#9 Updated by Luke Murphey over 6 years ago
- Status changed from New to Closed
- % Done changed from 70 to 100