In this article, you can find out how to get all subrequests for a page with Python. This is equivalent to the Network tab of Chrome or Firefox, which shows all subrequests of a given page, such as assets, images, JS files, etc.

To check for redirects or broken links with Python you can use: Python Script to Check for Broken Links And Redirects

To capture subrequests, we will use an additional library: selenium-wire (version 5.1.0 at the time of writing). This library is installed by:

pip install selenium-wire

Then we can use it to get all subrequests by:

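from seleniumwire import webdriver
import pandas as pd

pages = [
    'http://httpbin.org/',
    'http://wikipedia.org/',
    'http://google.com/'
]

urls = []
driver = webdriver.Firefox()

for page in set(pages):
    # Swap a www host for a dev host (has no effect on the URLs above)
    page = page.replace('//www', '//dev')
    # selenium-wire keeps requests across page loads, so clear the earlier ones
    del driver.requests
    driver.get(page)

    # Record every captured request that received a response
    for request in driver.requests:
        if request.response:
            content_type = request.response.headers['Content-Type']
            print(request.url, request.response.status_code, content_type)
            urls.append([page, request.url, request.response.status_code, content_type])

driver.quit()
df = pd.DataFrame(urls, columns=['url', 'requests', 'status', 'content-type'])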

The code above processes each of the links in the pages list.

The Firefox driver is used in headful mode (with a visible browser window) to load each page. Then all subrequests are processed one by one.
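
If a visible browser window is not wanted, for example on a server, Firefox can also be started headless. The following is a minimal sketch, assuming selenium-wire passes the standard Selenium options argument through to Firefox. The helper name make_headless_driver is hypothetical and not part of the original script:

```python
def make_headless_driver():
    """Return a selenium-wire Firefox driver that runs without a window."""
    # Imported lazily so the helper can be defined without a browser installed.
    from selenium.webdriver.firefox.options import Options
    from seleniumwire import webdriver

    options = Options()
    options.add_argument("-headless")  # Firefox command-line flag for headless mode
    return webdriver.Firefox(options=options)
```

Replacing webdriver.Firefox() with make_headless_driver() should leave the rest of the loop unchanged.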

Finally, we collect all links and their statuses in a Pandas DataFrame.

The final result contains all requests from those 3 URLs. A sample of the rows:

| url | requests | status | content-type |
|---|---|---|---|
| http://httpbin.org/ | http://detectportal.firefox.com/canonical.html | 200 | text/html |
| http://google.com/ | http://detectportal.firefox.com/canonical.html | 200 | text/html |
| http://wikipedia.org/ | https://tracking-protection.cdn.mozilla.net/ads-track-digest256/1695941350 | 200 | application/octet-stream |
| http://google.com/ | https://tracking-protection.cdn.mozilla.net/content-track-digest256/1695941350 | 200 | application/octet-stream |
| http://wikipedia.org/ | https://wikipedia.org/ | 301 | text/html; charset=iso-8859-1 |

The total number of requests is 117.
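
Some of the captured rows come from Firefox itself rather than from the page, such as the captive-portal check and the tracking-protection list downloads in the results above. The following is a minimal sketch of separating those out and counting the remaining requests per page and status code, using only the standard library. The BROWSER_HOSTS set and sample_rows are illustrative assumptions based on the rows shown above, not a complete list:

```python
from collections import Counter
from urllib.parse import urlsplit

# Hosts Firefox contacts on its own (assumed list, based on the sample output).
BROWSER_HOSTS = {
    "detectportal.firefox.com",
    "tracking-protection.cdn.mozilla.net",
}

def is_browser_request(url):
    """True if the request targets one of the known browser-internal hosts."""
    return urlsplit(url).hostname in BROWSER_HOSTS

def summarize(rows):
    """Count page-initiated requests per (page, status) pair."""
    return Counter(
        (page, status)
        for page, url, status, _content_type in rows
        if not is_browser_request(url)
    )

# Rows in the same [page, request URL, status, content type] shape as `urls`.
sample_rows = [
    ["http://httpbin.org/", "http://detectportal.firefox.com/canonical.html", 200, "text/html"],
    ["http://wikipedia.org/", "https://wikipedia.org/", 301, "text/html; charset=iso-8859-1"],
]

for (page, status), count in summarize(sample_rows).items():
    print(page, status, count)
```

Passing the urls list built by the script to summarize() would give the same breakdown for the full run.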

You can compare the Python results with the Firefox Network tab as follows:

  • Open Firefox
  • Right-click the page and select Inspect
  • Open the Network tab
  • Reload the page
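
For a more systematic comparison than reading the two lists side by side, the contents of the Network tab can be exported as a HAR file (Firefox offers a "Save All As HAR" option in the Network panel) and compared with the URLs collected in Python. The following is a minimal sketch, assuming the standard HAR layout (log.entries[].request.url). The function names and the sample data are hypothetical:

```python
import json

def har_urls(har_text):
    """Extract the set of request URLs from HAR JSON text."""
    har = json.loads(har_text)
    return {entry["request"]["url"] for entry in har["log"]["entries"]}

def compare(python_urls, browser_urls):
    """Return URLs seen only by Python and URLs seen only by the browser."""
    python_urls, browser_urls = set(python_urls), set(browser_urls)
    return python_urls - browser_urls, browser_urls - python_urls

# Tiny inline HAR document standing in for a file exported from Firefox.
sample_har = json.dumps({
    "log": {"entries": [
        {"request": {"url": "https://wikipedia.org/"}},
        {"request": {"url": "https://www.wikipedia.org/"}},
    ]}
})

only_python, only_browser = compare(
    ["https://wikipedia.org/", "http://detectportal.firefox.com/canonical.html"],
    har_urls(sample_har),
)
print("Only in Python:", sorted(only_python))
print("Only in browser:", sorted(only_browser))
```

With a real export, the text of the saved HAR file would take the place of sample_har, and the request URLs collected by the script would take the place of the hard-coded list.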

[Image: get-the-network-tab-with-python-requests.webp]