
Spark Scala JSON data parsing

₹600-2000 INR

Closed
Posted over 4 years ago


Paid on delivery
Parsing a JSON file yields duplicate rows because records contain multiple array items. Need to flatten the data so each record becomes a single row, using Spark or Spark SQL.
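A minimal sketch of one common approach, assuming a hypothetical file `input.json` whose records carry an `id` field, a `name` field, and an `items` array (all file and column names here are illustrative, not from the brief): exploding the array produces the duplicate rows described above, and grouping back by the record key collapses them into a single row.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode, collect_list, first}

object FlattenJson {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("FlattenJson")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical input: records like {"id": 1, "name": "a", "items": [...]}
    val df = spark.read.json("input.json")

    // explode() yields one row per array element -- the source of the
    // duplicate rows described in the brief
    val exploded = df.withColumn("item", explode(col("items")))

    // Collapse the duplicates back into a single record per id,
    // gathering the array items into one column
    val flattened = exploded
      .groupBy("id")
      .agg(
        first("name").alias("name"),
        collect_list("item").alias("items")
      )

    flattened.show(truncate = false)
    spark.stop()
  }
}
```

Depending on the output shape wanted, `collect_list` could be swapped for `concat_ws` (one delimited string column) or for positional `col("items").getItem(i)` selections (one column per array slot).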
Project ID: 21834901

About the project

12 proposals
Remote project
Active 4 yrs ago

12 freelancers are bidding on average ₹5,777 INR for this job
Hi, I have 8 years of experience working with Hadoop, Spark, NoSQL, Java, BI tools (Tableau, Power BI), and cloud platforms (AWS, Google Cloud, Microsoft Azure). I have delivered end-to-end data warehouse projects on AWS with Hadoop, Hive, Spark, and PrestoDB, and have worked on multiple ETL projects involving Spring Boot, Angular, Node, PHP, Kafka, NiFi, Flume, MapReduce, and Spark with XML/JSON, as well as Cassandra, MongoDB, HBase, Redis, Oracle, SAP HANA, and ASE. Let's discuss the requirements in detail. I am committed to getting the work done and strong at issue resolution. Thanks
₹2,250 INR in 1 day
5.0 (3 reviews)
3.5
Hi, I have good experience in Scala and have been working with it for 2.6 years. I have developed REST APIs and a Spark streaming project in Scala, and have used Kafka, Akka, and spray-json. We can discuss the requirement in more detail.
₹1,750 INR in 1 day
5.0 (3 reviews)
3.1
Hi, I have about 15 years of experience in the Java stack and 2 years in Spark. I recently implemented a solution to flatten a hierarchical JSON structure. Kindly share your JSON file and I will share a solution with you in about 2-3 hours, though in Java. Please let me know if you absolutely need this in Scala. Regards, Rabiya
₹1,250 INR in 1 day
5.0 (4 reviews)
3.0
Don't worry about the money; I will do it for you. Please be ready with the test data and the expected output. I'm waiting!
₹750 INR in 1 day
0.0 (0 reviews)
0.0
I have been working on data cleansing with Spark, and storing the results, for the past 6 years. This requires expertise in setting up a cluster of machines and in estimating its cost. I have worked on a deduplication framework that removes duplicates within a file or, for a stream of data, within a particular time window. Even though I will only receive Rs. 500 on this bid, I am doing it to build a reputation and gain clients' trust, as I am just starting freelance work (despite having more than 6.5 years of experience). I will fix any reported issues and provide all the support needed.
₹750 INR in 5 days
0.0 (0 reviews)
0.0
I have almost 3 years of experience in Spark with Scala and have handled JSON data in my two previous projects. I have read JSON files through Spark Scala, flattened them, and used array explode as well, so I am confident I can handle this.
₹1,700 INR in 7 days
0.0 (0 reviews)
0.0
I can do it. Please share the input and expected output, and I will code it as per the requirement. Let me know the other details in chat.
₹750 INR in 2 days
0.0 (0 reviews)
0.0
Send me a sample JSON and the development language (Scala/Python) and I will do it in a few hours. I have 8+ years of big data experience.
₹1,750 INR in 1 day
0.0 (0 reviews)
0.0
I have worked on the same scenario in my company's projects, so I can do this in less time, using Spark Core or Spark SQL as per your requirements.
₹650 INR in 2 days
0.0 (0 reviews)
0.0
Hi, I can deliver your requirement in 2 days. I also need to understand whether this is for real-time data or a simple flat file.
₹2,250 INR in 2 days
0.0 (0 reviews)
0.0
I am currently doing a very similar role for another client on an hourly basis and am available immediately. Below is a summary. I have extensive big data experience and have developed a number of metadata-driven PySpark/Scala frameworks that ingest data from various Westpac data sources into the Westpac Data Hub (HDFS), then integrate, transform, and publish it to targets including Kafka, RDBMS (Teradata, Oracle, and SQL Server), and SFTP. Technologies used include Python, Spark, Spark SQL, Hadoop, HDFS, Hive, Cassandra, HBase, Kafka, NiFi, and Atlas. I created a Scala/PySpark framework that ingests data from credit rating bureaus including Equifax, illion, and Experian, and designed the entire JSON/XML explosion pattern, involving multi-level JSON/XML explosion and normalized table creation on the HDFS platform using Scala/PySpark, Hive, Spark SQL, and HBase. I also created the full downstream conceptual, logical, and physical data models for users including credit risk analysts and data scientists.
₹53,221 INR in 3 days
0.0 (0 reviews)
0.0

About the client

Tempe, United States
Member since Oct 17, 2019

