Hello,
I need a proof of concept done for an ETL pipeline into MongoDB for time-series data. I am attaching some sample CSV files.
The attachment contains two files, which are generated every 15 minutes. Both files contain information for the same object, so the ETL process should load them into the same element — I mean one record for each C_ID (this is how each item is uniquely identified).
We also have some reference data that tells us where each C_ID belongs; we need hourly aggregation done at each level.
Attached is the Access DB, which describes the queries I am interested in performing. (Incremental aggregation is the key thing.)
Scenarios to check
a) We should have two threads reading the two files; the idea is to check the upsert of an entry.
b) Say we load 5 files for one hour (8 are expected) and then load the other 3 files later; see how incremental aggregation takes care of it.
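To make scenario (b) concrete, here is a minimal Python sketch of the incremental-aggregation idea. All names (C_ID values, the sum/count fields, timestamps) are illustrative, not from the sample files; the in-memory dict stands in for a MongoDB collection, where the same step would be `update_one({"c_id": ..., "hour": ...}, {"$inc": {"sum": v, "count": 1}}, upsert=True)` via pymongo. The point is that loading 5 records and then the remaining 3 later converges to the same hourly totals as loading all 8 at once.

```python
from datetime import datetime

def hour_bucket(ts: str) -> str:
    """Truncate a 15-minute timestamp to its hour bucket."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M").strftime("%Y-%m-%d %H:00")

def upsert_increment(store: dict, c_id: str, hour: str, value: float) -> None:
    """Mimics a MongoDB upsert with $inc: create the hourly doc if missing,
    then fold the new 15-minute value into its running aggregates."""
    doc = store.setdefault((c_id, hour), {"sum": 0.0, "count": 0})
    doc["sum"] += value
    doc["count"] += 1

# Hypothetical data: two files per 15-minute slot -> 8 records per hour per C_ID.
records = [("C1", f"2024-01-01 10:{m:02d}", 10.0) for m in (0, 15, 30, 45)] * 2

store = {}
# First batch: only 5 of the 8 records have arrived.
for c_id, ts, v in records[:5]:
    upsert_increment(store, c_id, hour_bucket(ts), v)
# Late batch: the remaining 3 records arrive; $inc-style upserts fold them in.
for c_id, ts, v in records[5:]:
    upsert_increment(store, c_id, hour_bucket(ts), v)

print(store[("C1", "2024-01-01 10:00")])  # {'sum': 80.0, 'count': 8}
```

Because `$inc` is additive, the aggregate does not care how the 8 records are batched, which is exactly what scenario (b) is meant to verify.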
Hi,
I have hands-on experience working on a similar project using Hadoop, MongoDB and Apache Mahout.
I have read and understood your requirements; however, I need some more information about them. If you find my bid suitable, please do contact me.
Thanks and Regards,
Chandrashekhar