Finding all images over a certain size in HTML files with Python and Beautiful Soup

This example shows how to use the Beautiful Soup library to find every image referenced in a set of HTML files, then filter the results to a particular size range. The size filter works well for excluding header images, logos, tracking pixels, etc. It assumes you have mirrored a website’s directory structure locally with wget. Unlike …
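As a rough illustration of the approach described above, here is a minimal sketch, not necessarily the original script. It makes several assumptions: "size" is taken to mean file size on disk in bytes (if pixel dimensions are what you need, you would measure those with an imaging library instead), the function name `find_images` and the default thresholds are hypothetical, and root-relative `src` values such as `/img/a.jpg` are assumed to resolve against the top of the mirrored directory.

```python
import os
from urllib.parse import urlparse, unquote

from bs4 import BeautifulSoup


def find_images(mirror_root, min_bytes=10_000, max_bytes=5_000_000):
    """Return sorted paths of locally mirrored images within a byte-size range.

    Hypothetical helper: the name and the default thresholds are illustrative.
    """
    matches = set()
    for dirpath, _dirnames, filenames in os.walk(mirror_root):
        for name in filenames:
            if not name.lower().endswith((".html", ".htm")):
                continue
            html_path = os.path.join(dirpath, name)
            with open(html_path, encoding="utf-8", errors="replace") as fh:
                soup = BeautifulSoup(fh, "html.parser")
            # src=True keeps only <img> tags that actually carry a src attribute
            for img in soup.find_all("img", src=True):
                src = urlparse(img["src"])
                if src.scheme or src.netloc:
                    continue  # remote or data: URL, not a file in the mirror
                rel = unquote(src.path)
                if rel.startswith("/"):
                    # Assumption: root-relative paths map to the mirror root
                    img_path = os.path.join(mirror_root, rel.lstrip("/"))
                else:
                    # Relative paths resolve against the HTML file's directory
                    img_path = os.path.join(dirpath, rel)
                img_path = os.path.normpath(img_path)
                if not os.path.isfile(img_path):
                    continue  # referenced but not downloaded
                if min_bytes <= os.path.getsize(img_path) <= max_bytes:
                    matches.add(img_path)
    return sorted(matches)
```

Calling `find_images("path/to/mirror", min_bytes=20_000)` (the path is a placeholder) would return each matching image once, even if several pages reference it. Raising `min_bytes` is what drops small assets such as logos and tracking pixels.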