This is my website: [url removed, login to view], which was built by a [url removed, login to view] member who has since been BANNED for being a scammer.
[url removed, login to view] is a SEARCH ENGINE which scrapes data from a list of 150-200 websites selling watches.
I need a data-scraping expert to complete the search engine, which scrapes data from 150-200 websites. These things are incomplete:
a. data is not fetched from all 150-200 websites
b. data from the websites already being scraped is not fetched consistently
c. email alert function is not working
d. email alert admin panel for approving signups
e. social media interface incomplete
These are the comments from a data-scraping expert on the issues you need to fix:
"Current setup is not flexible: we have to add a scraping script for each URL we need to scrape, even if the URLs are similar and are just paginated pages of the same website. So there are 2 scripts for [url removed, login to view] and [url removed, login to view]
This needs to improve so it can automatically scrape the other pages of the same URL, even if they do not exist now but will be added later.
There are about 35 files supposed to be executed to scrape all of the URLs you have in the list. I have not confirmed whether they cover all of the URLs, as I would need to match them against each file. It is difficult to execute all of these 35 files manually and/or as cron jobs. There should be just one process that scrapes all of the pages.
Database details are defined in each of these files and need to be updated in each of them if you later decide to change the password or set up on a different host. Tiring :)
Current setup is not suitable to run as cron jobs, as each script scrapes over 20 web pages, and the failure of one of them results in the failure of all.
Current scraping setup is very slow, as it is done sequentially: the next web page can only be scraped once the previous one completes successfully.
Not all of the scraping scripts are complete: for some web URLs they contain incomplete code. This means you are currently not getting data from these web pages. For example, [url removed, login to view]
Jobs are being executed manually in a browser, with a fixed page refresh interval. This causes the page to reload after the specified time and start from the beginning again. As a result, the pages scheduled to be scraped further down the list might not be executed if the ones above take longer than the reload time."
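To make the expert's points concrete, here is a minimal sketch of a single process that scrapes every URL concurrently, isolates failures per URL, and reads database settings from one place. It is written in Python as an assumption (the language of the existing scripts is not stated). The names `load_db_config`, `run_all`, `scrape_one` and the `WATCH_DB_*` environment variables are hypothetical and for illustration only.

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed


def load_db_config(env=os.environ):
    """Read database settings from one place (hypothetical variable names).

    Changing the password or host then means changing the environment,
    not editing every scraping script.
    """
    return {
        "host": env.get("WATCH_DB_HOST", "localhost"),
        "user": env.get("WATCH_DB_USER", "scraper"),
        "password": env.get("WATCH_DB_PASSWORD", ""),
        "name": env.get("WATCH_DB_NAME", "watches"),
    }


def run_all(urls, scrape_one, max_workers=8):
    """Scrape all URLs in one process, several at a time.

    `scrape_one` is a caller-supplied function taking a URL and returning
    the scraped records. An exception raised for one URL is recorded in
    `errors` and does not stop the remaining URLs.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(scrape_one, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # isolate the failure to this URL
                errors[url] = repr(exc)
    return results, errors
```

A single cron entry could then call `run_all` with the full URL list, and the `errors` mapping shows which websites need attention after each run.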
If you wish to be hired, you must
a. PROVE THAT you can make this website 100% functional
b. SPEAK GOOD ENGLISH and can LISTEN well
c. ANTICIPATE client needs well. DO NOT just WAIT TO BE TOLD.
d. be PROFESSIONAL and do what is needed to get the search engine up and running
e. advise what hosting requirements are needed. This is the current plan: VPS 1180 [url removed, login to view]
f. you must be the TECHNICAL PERSON - I DO NOT wish to speak to the BUSINESS DEVELOPMENT PERSON.
g. Advise how FUTURE WEBSITES can be uploaded and data fetched from them.
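One possible answer to the point about future websites, sketched in Python under stated assumptions: each website is described as a data entry with a paginated URL template, so adding a new site means adding one entry rather than writing a new script, and new pages are picked up automatically. The `Site` class, `iter_pages`, `fetch_items` and the example.com URL are hypothetical and for illustration only. The sketch also assumes a site signals its last page by returning no items.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    url_template: str    # must contain "{page}"
    first_page: int = 1
    max_pages: int = 500  # safety cap so a misbehaving site cannot loop forever


def iter_pages(site, fetch_items):
    """Yield (url, items) for each page until a page returns no items.

    `fetch_items` is a caller-supplied function that takes a URL and
    returns the list of items found on that page.
    """
    for page in range(site.first_page, site.first_page + site.max_pages):
        url = site.url_template.format(page=page)
        items = fetch_items(url)
        if not items:
            break  # assumed end of pagination
        yield url, items


# Adding a future website is one more entry in this list.
SITES = [
    Site("example-shop", "https://example.com/watches?page={page}"),
]
```

Because pages are discovered at run time, a page that does not exist today is scraped as soon as the site adds it.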
Deadline: 20 days
If your completion rate is less than 90%, please do not bid for this project.
19 freelancers are bidding on average $736 for this job
Hi, I have done a similar assignment before and anticipate this would take me around 8 hours to complete. I'm an EXPERT in scraping. Please send me a message if interested. Thank you.
I have expertise in web scraping, and my regular expression skills are strong, especially when working with PHP. I can do all of the tasks with preg_match or preg_match_all to fetch all necessary data.