Scripting file downloads from secure sites

From Michael's Information Zone
 
 
References

  1. https://www.linuxquestions.org/questions/programming-9/click-%27submit%27-in-a-remote-web-form-from-bash-really-882839/
  2. https://developer.mozilla.org/en-US/docs/Web/HTTP/Cookies
  3. https://williamjxj.wordpress.com/2010/12/17/curl-vs-wget-vs-lynx/
  4. https://stackoverflow.com/questions/51882330/curl-or-lynx-scripting-with-chrome-cookie
  5. http://arquivo.splat-n.com/scripts/lynx-saving-cookies-and-scripting/

Latest revision as of 12:50, 17 August 2018

Purpose

I need to automate downloading files from a site that requires credentials. The site also uses a "Trusted this device" cookie.

Process

I started with Lynx, since I could log into the web form with it. That let me save session cookies, but I had an issue with the trusted device cookie. Curl seems to be the answer, as I can POST data, save cookies, and create an additional "session" using the saved data. Though I was looking forward to using Lynx, learning curl will be more beneficial in the long run. [1][2][3][4][5]

I figured out that I need to do the following:

  1. POST login information to the login form, while sending the trusted device cookie.
  2. Save the session cookie.
  3. Download the files using both the session cookie and the trusted device cookie.
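
The three steps above can be sketched as a small shell script. This is only a sketch under assumptions: the example.com URLs, the username and password field names, the function names, and the CURL dry-run variable are all placeholders, and the real login form may need more fields. Merging the saved cookies into a single Cookie header for the download is one possible approach, not necessarily what the site requires. The jar_cookies helper reads curl's Netscape-format cookie file and ignores domain, path, and expiry.

```shell
#!/bin/sh
# Sketch only: URLs, cookie values, and form field names are placeholders.
LOGIN_URL="${LOGIN_URL:-https://example.com/login}"
DEVICE_COOKIE="${DEVICE_COOKIE:-xxx=xxx}"   # trusted device cookie (name=value)
COOKIE_JAR="${COOKIE_JAR:-cookie.file}"     # file that curl -c writes cookies to
CURL="${CURL:-curl}"                        # CURL=echo prints the command instead

# Steps 1 and 2: POST the login form while sending the device cookie,
# and save the cookies the server sets (including the session cookie).
login() {
    "$CURL" -s -o /dev/null -c "$COOKIE_JAR" \
        -H "Cookie: $DEVICE_COOKIE" \
        --data-urlencode "username=$1" \
        --data-urlencode "password=$2" \
        "$LOGIN_URL"
}

# Turn a Netscape-format cookie jar into "name=value; name=value".
jar_cookies() {
    awk -F '\t' '
        { sub(/^#HttpOnly_/, "") }      # HttpOnly entries carry this prefix
        /^#/ || NF < 7 { next }         # skip comments and blank lines
        { printf "%s%s=%s", sep, $6, $7; sep = "; " }
        END { if (sep != "") print "" }
    ' "$1"
}

# Step 3: download one file, sending the saved cookies together with
# the device cookie in a single Cookie header.
download() {
    cookies=$(jar_cookies "$COOKIE_JAR")
    "$CURL" -s -o "$2" \
        -H "Cookie: ${cookies:+$cookies; }$DEVICE_COOKIE" \
        "$1"
}
```

After sourcing the script, something like `login myuser mypassword` followed by `download https://example.com/files/report.csv report.csv` would run the whole flow. Setting CURL=echo first prints the curl arguments without contacting any server.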

I worked out what was needed by opening the site in a browser, selecting "Copy as cURL", trimming the output, and filling in the required fields. With that, I was able to get the session cookie by sending the device cookie. This did not match what I expected from the curl man page.

Instead of

curl -b "xxx=xxx"

I needed

curl -H "Cookie: XXXX="%"2Fwpsnew; xxx=xxx"

Here XXXX="%"2Fwpsnew requests a new session cookie, and xxx=xxx is the trusted device cookie. Adding "-c cookie.file" saves the generated session cookie, which I can then reuse to download the files from another page.
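
The quoting in that Cookie header is easy to misread, so here is a small check of what a POSIX shell actually passes to curl. It assumes the command is run from sh or bash. The "%" piece looks like escaping left over from the command the browser copied, but that is a guess. The variable names are only for illustration.

```shell
#!/bin/sh
# Quoted and unquoted pieces written back to back form a single word,
# so the "%" segment contributes just a literal percent sign.
header="Cookie: XXXX="%"2Fwpsnew; xxx=xxx"
plain="Cookie: XXXX=%2Fwpsnew; xxx=xxx"

printf '%s\n' "$header"
[ "$header" = "$plain" ] && echo "identical to the plainly quoted form"
```

Run as is, this prints the header as `Cookie: XXXX=%2Fwpsnew; xxx=xxx` followed by the confirmation line, so in these shells the simpler quoting sends the same header.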