I need a script that scrapes either [login to view URL] or [login to view URL] and populates a MySQL database (it has to be a script because there are probably 700,000+ entries in total). I would prefer the script to be in PHP, but if you have other, faster methods, you can use those as well.
The scraped data should go into 2 tables in MySQL.
1st table: artists
fields:
a_name (name of the artist, i.e. "Britney Spears"),
a_id (auto-incrementing artist ID, increasing by 1 and starting from 1),
a_alias_plain (URL field with the structure "artist-name": multiple words are separated by dashes. All words are lowercase. All non-alphanumeric characters must be stripped out. Make sure there is only 1 dash separating each word)
a_alias_lyrics (URL field with the structure "artist-name-lyrics": multiple words are separated by dashes and "-lyrics" is appended at the end. All words are lowercase. All non-alphanumeric characters must be stripped out. Make sure there is only 1 dash separating each word)
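As a rough illustration of the alias rules above (the same rules apply to the song aliases), here is a minimal Python sketch. The function name `make_alias` and its `suffix` parameter are hypothetical. Two details are assumptions, since the requirements do not settle them: accented letters are folded to plain ASCII before filtering, and only whitespace and existing dashes count as word separators, so "AC/DC" becomes "acdc".

```python
import re
import unicodedata


def make_alias(name: str, suffix: str = "") -> str:
    """Build a URL alias such as "artist-name" or "artist-name-lyrics"."""
    # Assumption: fold accented letters to ASCII ("é" -> "e") instead of dropping them.
    folded = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    lowered = folded.lower()
    # Assumption: whitespace and existing dashes separate words.
    spaced = re.sub(r"[\s\-]+", " ", lowered)
    # Strip every character that is not a lowercase letter, a digit, or a space.
    cleaned = re.sub(r"[^a-z0-9 ]", "", spaced)
    # split() discards empty pieces, so the join yields exactly one dash between words.
    words = cleaned.split()
    if suffix:
        words.append(suffix)
    return "-".join(words)


print(make_alias("Britney Spears"))            # britney-spears
print(make_alias("Britney Spears", "lyrics"))  # britney-spears-lyrics
print(make_alias("Guns N' Roses"))             # guns-n-roses
```

Passing "lyrics" as the suffix produces the a_alias_lyrics form; omitting it produces the a_alias_plain form.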
2nd table: songs
fields:
s_id (ID of the song, auto-incrementing by 1 and starting from 1)
s_name (the name of the song, i.e. "Feel The Way")
s_text (the actual text of the song, I only want the text and not any other stuff on the page)
s_artist (the artist's ID from a_id, so that I can associate each song with its artist)
s_alias_plain (URL field with the structure "song-name": each word is separated by dashes. All words are lowercase. All non-alphanumeric characters must be stripped out. Make sure there is only 1 dash separating each word)
s_alias_lyrics (a 2nd URL field, just in case: each word is separated by dashes, with "-lyrics" appended at the end. All words are lowercase. All non-alphanumeric characters must be stripped out. Make sure there is only 1 dash separating each word)
The database should use a proper character set and collation so that all special characters are stored and displayed correctly.
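One possible way to lay out the two tables is sketched below as MySQL DDL held in Python strings. The field names come from the lists above. The column types and lengths, the InnoDB engine, the index and foreign key names, and the choice of `utf8mb4` with `utf8mb4_unicode_ci` are all assumptions, not part of the stated requirements.

```python
# Hypothetical MySQL DDL for the two tables.
# Assumption: utf8mb4 / utf8mb4_unicode_ci is taken as the "proper collation".
TABLE_OPTIONS = "ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci"

# Assumption: 255-character limits for names and aliases.
ARTISTS_DDL = f"""
CREATE TABLE artists (
    a_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    a_name VARCHAR(255) NOT NULL,
    a_alias_plain VARCHAR(255) NOT NULL,
    a_alias_lyrics VARCHAR(255) NOT NULL,
    PRIMARY KEY (a_id)
) {TABLE_OPTIONS};
""".strip()

# s_artist references artists.a_id so each song is tied to its artist.
SONGS_DDL = f"""
CREATE TABLE songs (
    s_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    s_name VARCHAR(255) NOT NULL,
    s_text MEDIUMTEXT NOT NULL,
    s_artist INT UNSIGNED NOT NULL,
    s_alias_plain VARCHAR(255) NOT NULL,
    s_alias_lyrics VARCHAR(255) NOT NULL,
    PRIMARY KEY (s_id),
    KEY idx_s_artist (s_artist),
    CONSTRAINT fk_songs_artist FOREIGN KEY (s_artist) REFERENCES artists (a_id)
) {TABLE_OPTIONS};
""".strip()


def schema_statements() -> list[str]:
    """Return the statements in creation order (artists before songs)."""
    return [ARTISTS_DDL, SONGS_DDL]
```

With MySQL's default settings, AUTO_INCREMENT columns start at 1 and increase by 1, which matches the ID requirement. The scraper's database connection would also need to use utf8mb4 for special characters to survive the round trip.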
The whole database should probably have 700,000+ entries. I don't want to wait more than 5 days, so if you can complete it within that time frame, feel free to bid. I am not paying more than $100 so please don't bid higher. I need to start as soon as possible, so if you give me a good bid, you could even start working today.
Please only bid if you have read the requirements fully.
We can do this for you.
The task is interesting. Please let us know which of the two given sites you want us to prioritize when fetching data.
The task is not difficult at all, but it needs to be done with proper care, since we have to deal with the HTML formats to fetch the data.
Please check your PM.
We are ready to start :) and are just waiting for you to select us.
We will definitely deliver the expected output with no compromise.
Regards