If you have landed on this page, then you are most likely aspiring to learn the Hadoop ecosystem of technologies and tools. Why not make your goal measurable and timeboxed? Why not have a plan in place and an approach to get certified? Let me help you with these two questions in this series of blog posts.
Getting a certification is not about obtaining a piece of paper; it is about the learning you go through in order to bring your skills up to an industry-standard benchmark, and about leveraging that learning to solve real, production-grade problems.
This series of posts is intended to help aspiring big data professionals increase their chances of getting certified. I am going to create a set of scenarios that focus on the bare-minimum technologies that any big data professional should be hands-on with (especially those on the technical side, like developers and architects). I am also going to post a video walkthrough of the solution to each of those problems. Going through a walkthrough helps you learn the concepts in a more realistic setup, which is why I am not inclined towards creating individual tutorial videos.
All of the videos and posts are going to be certification-focused, true to the spirit of this series. Solving these problem scenarios not only helps you prepare for the certification but also goes a long way towards validating the skills you have acquired in the Hadoop ecosystem.
I strongly recommend that you go through the certification preparation plan/strategy video before going through the rest of the videos and links on this blog. Click the link below for the YouTube playlist of videos related to this topic.
- Environment – you need one of the following to practice:
- Cloudera QuickStart VM (5.8 is used in the demos)
- Hortonworks Sandbox
- Any cloud based labs such as labs.itversity.com
- Programming using Scala or Python
- Apache Spark
- SQL and Big Data based SQL frameworks such as Hive, Spark SQL
- Flume, Kafka and Spark Streaming
- HDFS Command line
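As a warm-up for the SQL skills listed above, here is a minimal sketch of the kind of aggregation query that comes up in certification-style exercises. Python's built-in `sqlite3` module is used here purely as a stand-in engine (it is not part of the Hadoop stack); the table name, columns, and sample rows are hypothetical, but the `GROUP BY` pattern itself carries over unchanged to Hive and Spark SQL.

```python
import sqlite3

# sqlite3 as a lightweight stand-in; the same SQL runs on Hive / Spark SQL.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical orders table, similar to the retail-style datasets
# commonly used in certification practice scenarios.
cur.execute("CREATE TABLE orders (order_id INTEGER, status TEXT, amount REAL)")
cur.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "COMPLETE", 100.0), (2, "PENDING", 50.0),
     (3, "COMPLETE", 75.0), (4, "CLOSED", 20.0)],
)

# Order count and revenue per status -- a classic GROUP BY aggregation.
cur.execute(
    "SELECT status, COUNT(*), SUM(amount) FROM orders "
    "GROUP BY status ORDER BY status"
)
print(cur.fetchall())
# -> [('CLOSED', 1, 20.0), ('COMPLETE', 2, 175.0), ('PENDING', 1, 50.0)]
```

Once you are comfortable writing queries like this, practicing them against real datasets in Hive or Spark SQL on one of the environments above is a natural next step.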