I need complete guidance on restructuring data using a streaming data technique.
#BigData #datascience #machinelearning #spark #hadoop #HDFS #tech
4 answers
Vijayakumar’s Answer
Udaya’s Answer
Anuj’s Answer
Jane’s Answer
It depends on the scale of the data and your SLA requirements. Let's say you want to restructure a small dataset in memory. If you are a Java or Scala developer, you can use Java 8 lambdas or Scala to manipulate the data with functions like map, flatMap, and reduce.
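As a rough illustration of the in-memory case, here is a minimal Java 8 sketch that uses map, flatMap, and reduce on a small list. The input format ("user:tag1,tag2"), the class name `RestructureExample`, and the method names are made up for the example; adapt them to your own data.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class RestructureExample {

    // Hypothetical input shape: each line is "user:tag1,tag2,...".
    // Restructures the lines into a sorted map of tag -> occurrence count.
    // Assumes every line contains a ':'; malformed lines are not handled here.
    static Map<String, Integer> countTags(List<String> lines) {
        return lines.stream()
                // map: keep only the comma-separated tag list after the ':'
                .map(line -> line.split(":", 2)[1])
                // flatMap: turn each tag list into individual tags
                .flatMap(tags -> Arrays.stream(tags.split(",")))
                // per-key reduction: sum a 1 for every occurrence of a tag
                .collect(Collectors.toMap(tag -> tag, tag -> 1, Integer::sum, TreeMap::new));
    }

    // reduce: fold the per-tag counts into a single total.
    static int totalTags(Map<String, Integer> counts) {
        return counts.values().stream().reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("alice:spark,kafka", "bob:spark", "carol:storm,spark");
        Map<String, Integer> counts = countTags(lines);
        System.out.println(counts);            // {kafka=1, spark=3, storm=1}
        System.out.println(totalTags(counts)); // 5
    }
}
```

This works fine as long as the whole dataset fits comfortably in one JVM's memory.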
But if your dataset is in the terabyte range and you need low-latency processing, then you should consider building a large-scale, distributed streaming pipeline. The following tech stack may help you start your evaluation:
Kafka (for pub/sub)
Kinesis (for pub/sub)
Storm (for compute)
Spark Streaming (for compute)
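To make the pub/sub plus compute split a bit more concrete, here is a toy, single-JVM sketch in plain Java. It does not use the Kafka, Kinesis, Storm, or Spark APIs: a `BlockingQueue` stands in for the broker topic, and a simple loop stands in for the compute stage. The record format ("id, name"), the `END` sentinel, and the class name `MiniStreamPipeline` are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class MiniStreamPipeline {

    // Hypothetical sentinel that tells the consumer the stream has ended.
    static final String END = "__END__";

    // Publishes each input record to a queue and restructures records
    // one at a time as they arrive, instead of loading everything first.
    static List<String> run(List<String> input) throws InterruptedException {
        // Stand-in for a broker topic; the small capacity gives simple backpressure.
        BlockingQueue<String> topic = new LinkedBlockingQueue<>(16);
        List<String> output = new ArrayList<>();

        // "Publisher" side: pushes records, then the end marker.
        Thread producer = new Thread(() -> {
            try {
                for (String record : input) {
                    topic.put(record);
                }
                topic.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        // "Compute" side: take one record at a time and reshape "id, name" into "name=id".
        while (true) {
            String record = topic.take();
            if (record.equals(END)) {
                break;
            }
            String[] parts = record.split(",");
            output.add(parts[1].trim() + "=" + parts[0].trim());
        }
        producer.join();
        return output;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(Arrays.asList("1, alice", "2, bob"))); // [alice=1, bob=2]
    }
}
```

In a real deployment the queue would be replaced by a distributed broker and the loop by a distributed compute engine, which is where the tools listed above come in.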
Hope it helps!