
Build web crawler & screenshot comparison app in PHP/SQL

$30-5000 USD

Cancelled
Posted over 11 years ago

Paid on delivery
**Summary:** I'd like someone to develop a prototype of a web application. The app lets you input a URL, automatically records the AB tests taking place on that URL, and then tells you (by inference) which versions ultimately won and lost each AB test.

**Bidding instructions:**

1. Please state how long you estimate it would take you to complete this project.
2. Please ask any questions you may have. Questions that show you've thoroughly read the requirements are a very good sign.
3. Please include in your bid a short summary of how you believe this app will work.

**Technical challenges to overcome in this app:**

* Build a web crawler that can crawl a URL X times per day, every day.
* Have the web crawler take screenshots from Internet Explorer 10's perspective.
* Automatically clear Internet Explorer's cookies and cache before each crawl of a URL.
* Image comparison: compare each new screenshot against those already taken and determine whether they're the same or different.

Please find additional details below... Thanks for your bids!

## Deliverables

**Here's an explanation of how the app works.**

1) You input a URL.
2) You tell the app how many times per day to crawl the URL (ex: 50 times/day).
3) The app visits the URL 50 times every day, clearing cookies and cache each time, so that it will see all test versions (most AB test platforms rely on cookies so that a user sees the same version on repeat visits).
4) The app takes a screenshot each time it visits the URL and stores it in a database.
5) When the app takes a screenshot, it compares the new image with existing images in the database to determine whether it has found a new test version or is seeing an image that is already stored.
6) Over the course of 1 day, the app determines how many different unique images appear, and that group of unique images makes up the "test group".
So the 1st day that the app crawls a new URL, the set of unique images it finds will compose the 1st test group.
7) The app will then seek to determine the winners and losers of that AB test group by recrawling the URL every day.
8) To decide winners and losers, the app must first detect that the test group has changed. When it has, images from the old test group that remain in the new test group are inferred to be winners, and images from the old test group that do not remain are inferred to be losers.

**INPUT**

1) URL to be crawled
2) How many times per day to crawl that URL (default is 50)

**OUTPUT**

The output is a chronologically organized list of thumbnail images, with the oldest on top. If you click on a thumbnail, a full-sized screenshot pops up in a lightbox. Each test group occupies one row. Below each thumbnail are two pieces of data:

1) % distribution of how often that image is being served within the test group. For example, if there are 4 unique images and they're all being shown equally, it would say 25% under each image. This percentage can change if the URL owner adjusts it mid-test; because the URL is recrawled every day, the app can readjust the percentages. The numbers are finalized when the test group is closed and a new test group begins.
2) After the app determines winners and losers, it marks "winner" or "loser" below each image, along with the date that it was determined.

**NAVIGATION**

1) Homepage: password protected. After login, there's a list of URLs that have been set up to crawl, along with a button to "Enter a new URL," which is how you start monitoring a URL.
2) URL setup page: After selecting "Enter new URL" there is a field for the URL, a field for the daily crawl frequency, and an "enter" button to get started.
3) URL detail page: On the homepage, after you click on a URL, you are brought to the output page, as described above in the OUTPUT section.
There is also a "recrawl" button, which overrides the normal daily crawl settings and immediately recrawls the URL. There should be a progress display that shows, in real time, how many crawls have been completed (x/50).

**TECHNICAL DETAILS**

* I'd like this coded in PHP/SQL.
* I don't care about pretty design. I just want a functional prototype, so the design only needs to be good enough to operate the app.
* Daily crawl: The daily crawls should always start at 6am PST. Crawls should be spaced between 10 and 30 seconds apart, with a random number of seconds chosen between 10 and 30.
* Screenshots: The crawler should access the URL on the latest version of Internet Explorer 10, and the screenshot should be of the entire window, including parts that you would need to scroll to see, not just above the fold.
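The winner/loser inference in step 8 and the % distribution described under OUTPUT can be sketched roughly as follows. This is only an illustration in Python, not the PHP/SQL the posting asks for. It assumes each screenshot has already been reduced to an identifier (for instance a hash of the image bytes) so that identical screenshots share the same ID. The names `distribution` and `infer_outcomes` are hypothetical.

```python
from collections import Counter


def distribution(crawl_ids):
    """Return {image_id: percent of crawls that served it}, rounded to 1 decimal."""
    counts = Counter(crawl_ids)
    total = len(crawl_ids)
    return {image_id: round(100.0 * n / total, 1) for image_id, n in counts.items()}


def infer_outcomes(old_group, new_group):
    """Compare two test groups (collections of image IDs).

    Returns None while the group is unchanged (the test is still running).
    Once it changes, images that survive into the new group are labelled
    "winner" and images that disappear are labelled "loser".
    """
    old_group, new_group = set(old_group), set(new_group)
    if old_group == new_group:
        return None
    return {
        image_id: "winner" if image_id in new_group else "loser"
        for image_id in old_group
    }


# Hypothetical data: day 1 serves four variants equally, day 2 drops three of them.
day1 = ["a", "b", "c", "d"] * 3
day2 = ["a", "a", "e", "a", "e", "e"]

print(distribution(day1))
print(infer_outcomes(day1, day2))
```

In a real build, the image IDs would come from whatever screenshot comparison is chosen. An exact hash only matches pixel-identical screenshots, so a tolerance-based comparison may be needed if pages contain dynamic content.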
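The daily crawl timing described under TECHNICAL DETAILS could be planned along these lines. Again this is a hedged sketch in Python rather than PHP. It assumes "6am PST" is taken literally as UTC-8 with no daylight-saving adjustment, that the first crawl lands at 6am, and that each later crawl follows the previous one by a random 10 to 30 seconds. `crawl_times` is a hypothetical helper name.

```python
import random
from datetime import date, datetime, timedelta, timezone

# Assumption: "PST" is treated as a fixed UTC-8 offset (no daylight saving).
PST = timezone(timedelta(hours=-8), "PST")


def crawl_times(day, crawls_per_day=50, rng=random):
    """Return the planned datetime of each crawl for the given date."""
    if crawls_per_day < 1:
        return []
    current = datetime(day.year, day.month, day.day, 6, 0, tzinfo=PST)
    times = [current]
    for _ in range(crawls_per_day - 1):
        # randint is inclusive on both ends, so gaps range from 10 to 30 seconds.
        current += timedelta(seconds=rng.randint(10, 30))
        times.append(current)
    return times


schedule = crawl_times(date(2024, 1, 15))
print(schedule[0].isoformat(), "->", schedule[-1].isoformat())
```

With the default of 50 crawls, the 49 gaps add up to somewhere between about 8 and 25 minutes, so a full daily run would finish well before 6:30am.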
Project ID: 2786413

About the project

Remote project
Active 12 yrs ago

About the client

Seattle, United States
5.0
235
Payment method verified
Member since Mar 2, 2007
