
automated ec2 python daily web crawl & scrape. starts & stops ec2 programmatically. aurora db.

$250-750 USD

Cancelled
Posted over 4 years ago

Paid on delivery
please read this in detail. answer the 11 questions below fully and clearly. do not say anything else. we've been getting a lot of spam and need your response to be as concise as possible. responses that violate this request will be ignored.

we need a service written in python to crawl through urls on two web domains, extracting data found in json objects on the default page source. this needs to crawl thousands of instances of around 8 unique web pages. data will be saved in a db with about 5 tables with about 10 columns each. this needs to be completed each day.

a scheduler should start an ec2 instance and the code should begin executing. when the crawl is finished for the day, the ec2 instance should be terminated. also, if the IP address ever gets blocked by the website being crawled, then that ec2 instance should be shut down, and a new one started (with a unique IP address).

all required data is held in the page source accessible with a simple curl or GET of a url. no clicking is necessary for this web scraping project.

QUESTIONS - YOU MUST ANSWER ALL. please number your answers for clarity

1. we need to use the aws serverless sql-based db. what is it called?
2. how would you start the ec2 instances automatically each day?
3. how would you terminate the ec2 instances when the crawl was completed?
4. visit [login to view URL] -- the name, location, date, price, and age limit for this event can all be found in a single json object in the html returned by this url. what is this json value? copy and paste this entire json object in your response.
5. how would you programmatically extract this json from the url?
6. how would you programmatically extract this json from the url if there were multiple similar json objects on the page?
7. when would you use dynamodb instead of aurora?
8. what is clean code?
9. what is dry code?
10. how long would this project take you?
11. how much $ would you require for this?
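
for questions 5 and 6, one possible approach (a minimal sketch, not the client's or any bidder's actual method) is to collect the text of every script tag in the page source and try to decode a json object at each opening brace. the sketch assumes the json sits inside script tags, which the posting does not confirm, and the html string would come from a plain GET of the url. the names ScriptCollector, extract_json_objects, and find_by_keys are hypothetical, and the sample keys in the usage note are made up for illustration.

```python
import json
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Collect the text content of every <script> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Script bodies may arrive in several chunks, so append to the last entry.
        if self._in_script:
            self.scripts[-1] += data


def extract_json_objects(html):
    """Return every top-level JSON object found inside <script> tags.

    Assumption: the data of interest is embedded in script tags.
    """
    collector = ScriptCollector()
    collector.feed(html)
    collector.close()
    decoder = json.JSONDecoder()
    found = []
    for text in collector.scripts:
        pos = text.find("{")
        while pos != -1:
            try:
                # raw_decode parses one JSON value starting at pos and
                # returns it together with the index where it ended.
                obj, end = decoder.raw_decode(text, pos)
            except json.JSONDecodeError:
                # Not JSON (for example a JavaScript function body): move on.
                pos = text.find("{", pos + 1)
                continue
            found.append(obj)
            pos = text.find("{", end)
    return found


def find_by_keys(objects, required_keys):
    """Keep only the objects that contain all required keys.

    Useful when a page holds several similar JSON objects (question 6).
    """
    wanted = set(required_keys)
    return [o for o in objects if isinstance(o, dict) and wanted.issubset(o)]
```

with a page that holds several objects, something like find_by_keys(extract_json_objects(html), {"name", "location", "price"}) would narrow the result to the event object, assuming those key names match what the site actually uses.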
Project ID: 20853546

About the project

10 proposals
Remote project
Active 5 yrs ago

10 freelancers are bidding on average $607 USD for this job
Hi there. My team and I can deliver your tasks with great quality. We are focused on Web Development and have created many beautiful sites, mostly in Python. We like to use Laravel as a REST API and Vue.js as an SPA for new app projects; for CMS we use WordPress and Elementor. To be able to give clients exactly what they want, we work at an hourly rate of US$40 per hour. Contact me for an enjoyable and reliable development experience. Thank you.
$500 USD in 5 days
5.0 (95 reviews)
8.6
Hi, how are you? I am very interested in your project and I have read your description carefully. I can answer your questions. As you can see from my profile, I have enough experience with Linux, scraping, crawling, etc., but I want to discuss more via chat. Thanks.
$500 USD in 7 days
4.9 (73 reviews)
7.5
Hi there,
a. We can develop the Python program you want us to code for you.
b. Please check our replies to the questions you asked.
1. Aurora DB
2. Using Lambda
3. Lambda
4, 5, 6. Using Python with Django
7. DynamoDB is not an RDBMS, while Aurora is a relational DBMS. This is just about crawling the data. We have used MongoDB for web crawling with Nutch and Solr.
8. Our professionals have 15+ years of experience in various programming technologies such as Java, Python, Oracle, etc., so we understand the advantages of clean code.
9. DRY stands for "don't repeat yourself". We do reusable and modular programming, so code written once can be reused rather than rewritten.
10 & 11. We need to check your 8 different JSON objects from which we need to read the data.
$750 USD in 7 days
5.0 (9 reviews)
6.4
1. we need to use the aws serverless sql-based db. what is it called?
You want to use AWS Lambda and the RDS service with Node.js/Python. We can use the Serverless Framework for this and have great experience with it.
2. how would you start the ec2 instances automatically each day?
Many ways: we can use AWS Batch, OpsWorks Chef scripts, ECS/cloud-init, Auto Scaling features, or the AWS SDK EC2 APIs.
3. how would you terminate the ec2 instances when the crawl was completed?
I would recommend using AWS Batch for this use case, as it will automatically kill the service after each run, and you can even save money by using spot instances instead.
5. how would you programmatically extract this json from the url?
We can use APIs such as Puppeteer, cheerio, etc., to parse the page and return the data as JSON.
6. how would you programmatically extract this json from the url if there were multiple similar json objects on the page?
Using cheerio, we can parse every HTML node and loop over them to retrieve by id and class.
7. when would you use dynamodb instead of aurora?
If the data is not very relational in nature.
8. what is clean code?
Less code; efficient, performant, commented, maintenance-friendly code.
9. what is dry code?
Code aimed at reusability.
10. how long would this project take you?
1 week
11. how much $ would you require for this?
$750
$700 USD in 7 days
4.9 (8 reviews)
6.1
Nice to meet you. I am an Amazon Cloud Architect for a web infrastructure serving 90 million page impressions and 12 TB of Internet traffic per month. The AWS services I use are EC2, ELB, MySQL RDS, VPC, CloudFront, ElastiCache, CloudWatch, CloudFormation, OpsWorks, Elastic Beanstalk, CodeDeploy, S3, SES, SQS and SNS. I have 20 years of Linux SysAdmin experience. I currently use Apache, Nginx, Ldirectord, MySQL, Perl, PHP, Memcached, Sphinx, Bind, Typo3, WordPress, Sendmail, Postfix, NFS, Samba, Snort, Vsftpd, aide, Nagios, Cacti, Puppet and a bunch of other traditional Linux software. I am good at amazon-web-services, linux, python, and software-architecture. If you’re looking for a developer who is truly an expert, driven by passion, not afraid to take on a challenge, and will be there with you every step of the way, then look no further, as I’m your guy.
$637 USD in 9 days
5.0 (2 reviews)
2.4
Hi there, I am writing in response to your post for "automated ec2 python daily web crawl & scrape. starts & stops ec2 programmatically. aurora db." After carefully reviewing the description, I feel that I am a suitable match for the job. I lead a team of professionals with 2 - 10 years of experience in the IT industry. We have young and energetic experts working in almost all technologies and spheres of IT. Our developers and coders are highly proficient in Object-Oriented Programming and ensure high coding standards, documentation, and easy maintainability. We follow strict testing and Quality Assurance procedures to ensure we deliver the highest quality work. We are a team of IT service providers focused on providing highly scalable business solutions to the services and manufacturing sectors with innovative approaches and advanced methodologies. We provide strategic development for the global business community with our wide array of solutions and services customized for a range of key verticals and horizontals. Our functional knowledge covers applications in the areas of banking, consumer durables, manufacturing, real estate, retail and logistics, POS and billing, sales and distribution, inventory control, HR, etc. We also provide Online Identity and Branding to help companies establish a strong presence and serve the online user base. Looking forward to hearing from you soon. Thanks & Regards, Maan Singh
$500 USD in 35 days
0.0 (0 reviews)
0.0

About the client

Durham, United States
0.0
0
Payment method verified
Member since Aug 18, 2019

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)