Feature #1882

Restrict inputs to HTTPS sites if on cloud

Added by Luke Murphey almost 7 years ago. Updated almost 7 years ago.

Status: Closed
Priority: Normal
Assignee: Luke Murphey
Category: Input: Web Spider
Target version:
Start date: 05/25/2017
Due date:
% Done: 100%

Associated revisions

Revision 444 (diff)
Added by lukemurphey almost 7 years ago

Restricting access on Splunk Cloud to HTTPS

Reference #1882

Revision 446 (diff)
Added by luke.murphey almost 7 years ago

Changes examples to use HTTPS

Reference #1882

Revision 447 (diff)
Added by luke.murphey almost 7 years ago

Making icon call use HTTPS

Reference #1882

Revision 448 (diff)
Added by lukemurphey almost 7 years ago

Making the web_scrape command ensure that connections use HTTPS on Cloud

Reference #1882

Revision 450 (diff)
Added by lukemurphey almost 7 years ago

Making sure link extraction requires HTTPS on Cloud

Reference #1882

History

#1 Updated by Luke Murphey almost 7 years ago

  • Status changed from New to In Progress
  • Assignee set to Luke Murphey

#2 Updated by Luke Murphey almost 7 years ago

To update:
  • [done] Mod input editor
  • [done] Modular input code
  • [done] Wizard view
  • [done] Preview controller
  • [done] Search command
  • [done] Spider link extraction

#3 Updated by Luke Murphey almost 7 years ago

For some reason the freaking endpoint isn't showing up in SplunkWeb.

Observations:

#5 Updated by Luke Murphey almost 7 years ago

  • % Done changed from 0 to 80

#6 Updated by Luke Murphey almost 7 years ago

Weird, the non-__raw endpoint works on Mac and 6.6.0.

#7 Updated by Luke Murphey almost 7 years ago

I set the form value of the URL with the following to force the call to attempt to scrape the page:

document.getElementById('inputURL').value = "http://textcritical.net" 

#8 Updated by Luke Murphey almost 7 years ago

I have to update several calls to pass the https_only parameter to the following (a sketch follows the list):

  • scrape_page
  • get_result_single
  • extract_links
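
A minimal sketch of how the parameter might be threaded through. Only the names scrape_page, get_result_single, extract_links, and https_only come from this ticket; the signatures and bodies are assumptions, not the app's actual code.

from urllib.parse import urlparse

class NonHttpsUrlException(Exception):
    """Raised when a plain-HTTP URL is used while https_only is in effect."""
    pass

def ensure_https(url, https_only):
    # Reject non-HTTPS URLs when the flag is set (i.e. on Splunk Cloud)
    if https_only and urlparse(url).scheme != "https":
        raise NonHttpsUrlException("URL must use HTTPS: " + url)

def extract_links(html, base_url, https_only=False):
    ensure_https(base_url, https_only)
    # ... parse the HTML; when https_only is set, drop links that
    # do not use HTTPS rather than returning them ...
    return []

def get_result_single(url, selector, https_only=False):
    ensure_https(url, https_only)
    # ... fetch the page and apply the selector ...
    return {}

def scrape_page(url, selector, https_only=False):
    ensure_https(url, https_only)
    # Pass the flag down so link extraction honors it too
    return get_result_single(url, selector, https_only=https_only)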

#9 Updated by Luke Murphey almost 7 years ago

Tested with:

| webscrape selector="h3" url="https://www.reddit.com/r/popular/" page_limit=50 url_filter="*" depth_limit=25 empty_matches=0

#10 Updated by Luke Murphey almost 7 years ago

I want to make tests for this.

To do this, I need:

  • decorator for run_only_on_cloud
  • decorator for run_only_on_enterprise (both sketched after this list)
  • Tests for Cloud:
    • Scrape page doesn't extract non-HTTPS links
    • Controller doesn't extract non-HTTPS links
    • Scrape page won't scan non-HTTPS links
    • Controller won't scan non-HTTPS links
    • Wizard: rejects non-HTTPS
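
A sketch of what those decorators could look like using unittest's skip helpers. The decorator names come from the list above; the is_on_cloud() helper is an assumption standing in for however the test harness detects Splunk Cloud.

import unittest

def is_on_cloud():
    # Hypothetical helper: the real check would ask splunkd whether
    # this instance is Splunk Cloud (see the sketch in comment 12)
    return False

def run_only_on_cloud(test):
    # Skip unless the tests are running against Splunk Cloud
    return unittest.skipUnless(is_on_cloud(), "Splunk Cloud only")(test)

def run_only_on_enterprise(test):
    # Skip unless the tests are running against Splunk Enterprise
    return unittest.skipIf(is_on_cloud(), "Splunk Enterprise only")(test)

class TestCloudHttpsRestriction(unittest.TestCase):
    @run_only_on_cloud
    def test_scrape_page_skips_non_https_links(self):
        # On Cloud, non-HTTPS links must not be extracted or scanned
        pass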

#11 Updated by Luke Murphey almost 7 years ago

I wonder if I should change scrape_page to throw an exception if the URL provided is not HTTPS. Currently, https_only only applies to link extraction.

#12 Updated by Luke Murphey almost 7 years ago

I recall now why I didn't add this to scrape_page: it doesn't have the session key to look up whether the host is on Cloud.
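
For reference, a hedged sketch of the missing piece: a session-key-based lookup against splunkd's server/info endpoint (Cloud instances report an instance_type of "cloud"), plus the exception scrape_page could raise. The surrounding function names and signatures are assumptions, not the app's actual code.

import json
import ssl
from urllib.request import Request, urlopen

def is_on_cloud(session_key, splunkd_uri="https://127.0.0.1:8089"):
    # Ask splunkd whether this instance is Splunk Cloud; server/info
    # reports instance_type == "cloud" on Cloud stacks
    request = Request(
        splunkd_uri + "/services/server/info?output_mode=json",
        headers={"Authorization": "Splunk " + session_key},
    )
    # splunkd's default certificate is self-signed
    context = ssl._create_unverified_context()
    with urlopen(request, context=context) as response:
        info = json.load(response)
    return info["entry"][0]["content"].get("instance_type") == "cloud"

def scrape_page(url, selector, session_key=None):
    # scrape_page has no session key today (comment 12), so a caller
    # would have to pass one in for this check to be possible
    if session_key is not None and is_on_cloud(session_key):
        if not url.startswith("https://"):
            raise ValueError("Non-HTTPS URLs are not allowed on Splunk Cloud: " + url)
    # ... proceed with the scrape ...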

#13 Updated by Luke Murphey almost 7 years ago

  • Status changed from In Progress to Closed
  • % Done changed from 80 to 100
