I have a lot of experience with web scraping using Python, Selenium, CasperJS, etc., for financial info, job info, and similar data.
I am not sure why C++ is needed. However, I have many years of C/C++ experience.
It is unclear to me what will be crawled and at what scale. Do you need a large distributed crawling farm?
Milestones:
1. A simple scraping script that scrapes the page and parses it, then serializes the results to a database (e.g. MongoDB) or a flat file.
2. Scraper and job scheduling.
3. Further performance improvements, etc.
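
To make milestone 1 concrete, here is a rough sketch of what the first script could look like. It uses only the Python standard library and writes to a flat JSON-lines file. The names `LinkParser`, `fetch`, `parse`, and `save`, and the choice to extract the page title and links, are my assumptions for illustration only; the real fields depend on what needs to be crawled.

```python
import json
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the page title and the href of every <a> tag (illustrative fields)."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def fetch(url, timeout=10):
    """Download a page and return its body as text."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def parse(html):
    """Turn raw HTML into a plain dict that is easy to serialize."""
    parser = LinkParser()
    parser.feed(html)
    return {"title": parser.title.strip(), "links": parser.links}


def save(record, path):
    """Append one record per line to a flat JSON-lines file."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

Usage would be something like `save(parse(fetch(url)), "pages.jsonl")` for each target URL. Swapping the flat file for MongoDB later should only require changing `save`.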