Feature #1882
Restrict inputs to HTTPS sites if on cloud
100%
Associated revisions
Restricting access on Splunk Cloud to HTTPS
Reference #1882
Changes examples to use HTTPS
Reference #1882
Making icon call use HTTPS
Reference #1882
Making the web_scrape command ensure that connections use HTTPS on Cloud
Reference #1882
Making sure link extraction requires HTTPS on Cloud
Reference #1882
History
#1 Updated by Luke Murphey over 7 years ago
- Status changed from New to In Progress
- Assignee set to Luke Murphey
#2 Updated by Luke Murphey over 7 years ago
- [done] Mod input editor
- [done] Modular input code
- [done] Wizard view
- [done] Preview controller
- [done] Search command
- [done] Spider link extraction
#3 Updated by Luke Murphey over 7 years ago
For some reason the freaking endpoint isn't showing up in SplunkWeb.
Observations:
- Removing the other entries from web.conf doesn't help
- The entry looks equivalent to the one used in Website Monitoring
- http://127.0.0.1:8000/en-US/splunkd/services/admin/app_website_input/default fails
- https://127.0.0.1:8090/services/admin/app_website_input works
- Is not included in C:\Program Files\Splunk\var\run\splunk\merged
#4 Updated by Luke Murphey over 7 years ago
It is available under http://127.0.0.1:8000/en-US/splunkd/__raw/services/admin/app_website_input/default
#5 Updated by Luke Murphey over 7 years ago
- % Done changed from 0 to 80
#6 Updated by Luke Murphey over 7 years ago
Weird, the non-__raw endpoint works on Mac and 6.6.0.
#7 Updated by Luke Murphey over 7 years ago
I set the form value of the URL with the following to force the call to attempt to scrape the page:
document.getElementById('inputURL').value = "http://textcritical.net"
#8 Updated by Luke Murphey over 7 years ago
I have to update several calls to pass the https_only parameter to:
- scrape_page
- get_result_single
- extract_links
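A minimal sketch of how an https_only flag could affect link extraction. The function name filter_links and its signature are hypothetical and are not taken from the app's code; only the https_only parameter name comes from the note above.

```python
from urllib.parse import urljoin, urlparse


def filter_links(base_url, hrefs, https_only=False):
    """Resolve hrefs against base_url and keep only the web links that are allowed.

    Hypothetical helper: it illustrates the effect of https_only,
    not the actual body of extract_links.
    """
    links = []
    for href in hrefs:
        # Turn relative links into absolute ones before checking the scheme
        absolute = urljoin(base_url, href)
        scheme = urlparse(absolute).scheme.lower()

        # Ignore anything that is not a web link (mailto:, javascript:, etc.)
        if scheme not in ("http", "https"):
            continue

        # When https_only is set, drop plain HTTP links
        if https_only and scheme != "https":
            continue

        links.append(absolute)
    return links
```

Relative links inherit the scheme of the page they were found on, so a page fetched over HTTPS keeps its relative links even when https_only is set.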
#9 Updated by Luke Murphey over 7 years ago
Tested with:
| webscrape selector="h3" url="https://www.reddit.com/r/popular/" page_limit=50 url_filter="*" depth_limit=25 empty_matches=0
#10 Updated by Luke Murphey over 7 years ago
I want to make tests for this.
To do this, I need:
- decorator for run_only_on_cloud
- decorator for run_only_on_enterprise
- Tests for Cloud:
  - Scrape page doesn't extract non-HTTPS links
  - Controller doesn't extract non-HTTPS links
  - Scrape page won't scan non-HTTPS links
  - Controller won't scan non-HTTPS links
  - Wizard: rejects non-HTTPS
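One possible shape for the two decorators, built on unittest's skip helpers. The is_on_cloud helper and the TEST_ON_CLOUD environment variable are assumptions made for this sketch; a real check would need to ask the Splunk instance under test.

```python
import os
import unittest


def is_on_cloud():
    # Hypothetical check: reads an environment variable set by the test runner.
    # This is a stand-in for asking the Splunk instance whether it is on Cloud.
    return os.environ.get("TEST_ON_CLOUD", "0") == "1"


def run_only_on_cloud(test_func):
    """Skip the decorated test unless the target host is on Splunk Cloud."""
    return unittest.skipUnless(is_on_cloud(), "requires Splunk Cloud")(test_func)


def run_only_on_enterprise(test_func):
    """Skip the decorated test when the target host is on Splunk Cloud."""
    return unittest.skipIf(is_on_cloud(), "requires Splunk Enterprise")(test_func)
```

The check runs when the test function is decorated (that is, at import time), so the environment has to be set before the test module is loaded.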
#11 Updated by Luke Murphey over 7 years ago
I wonder if I should change scrape_page to throw an exception if the URL provided is not HTTPS. Currently, https_only is only applicable to link extractions.
#12 Updated by Luke Murphey over 7 years ago
I recall now why I didn't add this into scrape_page: it doesn't have the session key to look up whether the host is in the Cloud.
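A sketch of one way such a check could work without a session key: the caller, which does have the session key, decides whether HTTPS is required and passes that in as a flag. The names check_url_allowed and InsecureURLError are hypothetical and are not part of the app.

```python
from urllib.parse import urlparse


class InsecureURLError(ValueError):
    """Hypothetical exception for a URL that is not HTTPS when HTTPS is required."""


def check_url_allowed(url, https_only=False):
    """Return url unchanged, or raise if https_only is set and url is not HTTPS.

    The function never looks up whether the host is on Cloud; the caller
    is assumed to have done that and to pass the result in as https_only.
    """
    scheme = urlparse(url).scheme.lower()
    if https_only and scheme != "https":
        raise InsecureURLError("URL must use HTTPS: " + url)
    return url
```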
#13 Updated by Luke Murphey over 7 years ago
- Status changed from In Progress to Closed
- % Done changed from 80 to 100