Website watcher script
1. Scrape a website and store its text contents in a .txt or PDF file on AWS or Google Cloud. Note: the file should contain only the readable text of the page, not the HTML markup etc.
2. Do step #1 periodically (e.g. every 24 hrs) to check whether the site contents have changed.
3. If the site contents have changed, generate a diff for each version.
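A minimal sketch of steps 1 and 3, assuming Python and only its standard library. The names `TextExtractor`, `readable_text`, and `text_diff` are hypothetical, and a real page may need a more forgiving HTML parser than this one. It keeps the visible text, drops script and style contents, and produces a unified diff between two snapshots.

```python
import difflib
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of non-readable tags."""

    SKIPPED = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def readable_text(html):
    """Return the readable text of an HTML document, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def text_diff(old, new):
    """Return a unified diff of two text snapshots (empty string if equal)."""
    lines = difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        fromfile="previous", tofile="current", lineterm="",
    )
    return "\n".join(lines)
```

An empty result from `text_diff` means the readable text is unchanged, so the periodic job in step 2 can skip storing a new version.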
14 freelancers are bidding on average $198 for this job
Hi, I can do this for you, but I have a couple of questions. Is this for just one website? Is there a specific section of the website where you expect changes to happen within 24 hours? Let's discuss. Thanks!
I can scrape the website using Python and BeautifulSoup, and run the script on AWS using Lambda with a CloudWatch Events rule that triggers it once a day. Please contact me if interested.
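The approach in this bid could look roughly like the sketch below. It is not the bidder's code: the bucket name, the URL, and the `snapshot_key` helper are hypothetical. It assumes `beautifulsoup4` is packaged with the function (it is not part of the Lambda runtime) and that a CloudWatch Events rule with a schedule such as `rate(1 day)` invokes `handler`.

```python
import datetime
import urllib.request

BUCKET = "example-site-snapshots"  # hypothetical bucket name
URL = "https://example.com/"       # hypothetical site to watch


def snapshot_key(url, day):
    """Build an S3 key such as 'example.com/2024-01-02.txt' (illustrative scheme)."""
    host = url.split("//", 1)[-1].strip("/").replace("/", "_")
    return f"{host}/{day.isoformat()}.txt"


def handler(event, context):
    """Lambda entry point: fetch the page, keep readable text, store it in S3."""
    # Imported here so the helper above can be used without these packages;
    # in a deployed function they would normally sit at module level.
    import boto3
    from bs4 import BeautifulSoup

    with urllib.request.urlopen(URL, timeout=30) as resp:
        html = resp.read().decode("utf-8", errors="replace")

    soup = BeautifulSoup(html, "html.parser")
    for tag in soup(["script", "style"]):
        tag.decompose()  # drop non-readable content
    text = soup.get_text(separator="\n", strip=True)

    key = snapshot_key(URL, datetime.date.today())
    boto3.client("s3").put_object(
        Bucket=BUCKET, Key=key, Body=text.encode("utf-8")
    )
    return {"stored": key, "characters": len(text)}
```

Keying snapshots by date keeps every daily version, so a diff can later be computed between any two stored files.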