Let’s take a step back and understand what exactly happened in the Hadoop analysis.
The Hadoop analysis involves two important files:
1 - Log File
2 - Hadoop Log Analysis Script
So, the objective here is to use Linux commands to analyze the data inside the Hadoop log file.
The Hadoop Log Analysis Script is a shell script that contains all the commands used to analyze the Hadoop log file. If you open it, you will see a sequence of Linux commands that count lines, characters, words, etc.
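As a rough illustration of the kind of counting commands described above, here is a minimal sketch using wc. It is not the actual content of the Hadoop script; the sample log lines and the variable names (log_file, line_count, word_count, char_count) are assumptions made up for this example.

```shell
# Hypothetical sample file standing in for the real Hadoop log.
log_file="$(mktemp)"
printf '%s\n' \
  'INFO main: Starting job' \
  'WARN main: Slow disk' \
  'ERROR main: Task failed' > "$log_file"

# Reading from stdin (<) makes wc print only the number, not the file name.
line_count=$(wc -l < "$log_file")   # number of lines
word_count=$(wc -w < "$log_file")   # number of whitespace-separated words
char_count=$(wc -m < "$log_file")   # number of characters

echo "Lines: $line_count"
echo "Words: $word_count"
echo "Characters: $char_count"

# Remove the temporary sample file.
rm -f "$log_file"
```

To try this on a real log, set log_file to the path of that log and drop the printf and rm lines.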
So, in this milestone, you have to analyze the OpenSSH/Apache logs in the same way.
How can you do that?
1 - You can write a shell script similar to hadoop_log_analysis.sh
2 - You can write individual Linux commands, execute them in the terminal, and see the results.
For reference, you can always open the Hadoop analysis shell script to see the commands.
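The first approach could look roughly like the sketch below. It is only an illustration, not the contents of hadoop_log_analysis.sh: the sample OpenSSH-style lines are made up, and the variable names and search patterns are assumptions you would adapt to your own log.

```shell
#!/bin/sh
# Sketch of an analysis script in the spirit of hadoop_log_analysis.sh.
# The sample lines below are invented for illustration only;
# point log_file at your real OpenSSH/Apache log instead.
log_file="$(mktemp)"
cat > "$log_file" <<'EOF'
Dec 10 06:55:46 host sshd[1001]: Failed password for invalid user admin from 192.0.2.10 port 4242 ssh2
Dec 10 06:55:48 host sshd[1001]: Failed password for root from 192.0.2.10 port 4243 ssh2
Dec 10 07:02:11 host sshd[1002]: Accepted password for alice from 192.0.2.20 port 5151 ssh2
EOF

# Total number of log entries (one per line).
total_lines=$(wc -l < "$log_file")

# grep -c prints the number of lines that match the pattern.
failed_logins=$(grep -c 'Failed password' "$log_file")
accepted_logins=$(grep -c 'Accepted password' "$log_file")

echo "Total lines: $total_lines"
echo "Failed logins: $failed_logins"
echo "Accepted logins: $accepted_logins"

# Remove the temporary sample file.
rm -f "$log_file"
```

For the second approach, the same wc and grep commands can be typed one at a time in the terminal against the real log file.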