After comparing different guides on the internet, I ended up writing my own version, based on the official Hadoop guide and using a manual download. If you prefer Homebrew, this one would be your best choice. Apart from the file directories, the configuration is the same for both methods. Here I extend the official guide with more details in case you need them.
This guide is part of my Hadoop tutorial 1. It aims to set up pseudo-distributed mode on a single-node cluster. I will explain the HDFS configuration and command lines in Hadoop tutorial 2.
1. Required software
1) Java
Run the following command in a terminal:
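The check is most likely the standard version query. Here is a minimal sketch, assuming the `java` launcher is on your PATH (the variable name is just for illustration):

```shell
# Ask the Java launcher for its version; it writes the answer to stderr,
# so 2>&1 folds it into the captured output.
# `|| true` keeps the shell going if no JDK is installed yet.
JAVA_VERSION_OUTPUT="$(java -version 2>&1 || true)"
echo "$JAVA_VERSION_OUTPUT"
```

Typing plain `java -version` in the terminal gives you the same information.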
If Java is already installed, the command prints the installed version.
If not, the terminal will prompt you to install it, or you can download and install a Java JDK yourself.
2) SSH
First enable Remote Login in System Preferences -> Sharing.
Now check that you can ssh to the localhost without a passphrase:
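The plain check is `ssh localhost`. If you want a version that never stops to ask for input, here is a sketch using two standard OpenSSH options (`SSH_OK` is a name I made up for the example):

```shell
# BatchMode=yes makes ssh fail instead of prompting, so success here means
# key-based login to localhost works without a passphrase.
# If the host key of localhost is not trusted yet, run a plain
# `ssh localhost` once first and accept it.
if ssh -o BatchMode=yes -o ConnectTimeout=5 localhost exit 2>/dev/null; then
  SSH_OK=yes
else
  SSH_OK=no
fi
echo "passphrase-less ssh to localhost: $SSH_OK"
```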
If you cannot ssh to localhost without a passphrase, execute the following commands:
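These are the commands from the official single-node guide, sketched here with two small additions of mine: the `.ssh` directory is created if missing, and an existing key is left alone instead of being overwritten.

```shell
KEY="$HOME/.ssh/id_rsa"

# Make sure the .ssh directory exists (mode 700 only applies if it is created).
mkdir -p -m 700 "$HOME/.ssh"

# Generate an RSA key with an empty passphrase, unless a key is already there.
[ -f "$KEY" ] || ssh-keygen -t rsa -P '' -f "$KEY"

# Authorize that key for logins to this machine and tighten the permissions.
cat "$KEY.pub" >> "$HOME/.ssh/authorized_keys"
chmod 0600 "$HOME/.ssh/authorized_keys"
```

Afterwards `ssh localhost` should log you in without asking for a passphrase.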
2. Get a Hadoop distribution
You can download it from an Apache download mirror.
3. Prepare to start the Hadoop cluster
1) Unpack the downloaded Hadoop distribution.
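A sketch of the unpacking step. The tarball location and the `x.y.z` version are placeholders of mine, so adjust both to match the file you actually downloaded:

```shell
# Hypothetical path: replace x.y.z with the version you downloaded.
HADOOP_TARBALL="$HOME/Downloads/hadoop-x.y.z.tar.gz"

if [ -f "$HADOOP_TARBALL" ]; then
  # Extract next to the tarball; this creates a hadoop-x.y.z directory.
  tar -xzf "$HADOOP_TARBALL" -C "$(dirname "$HADOOP_TARBALL")"
else
  echo "tarball not found: $HADOOP_TARBALL" >&2
fi
```

The extracted directory is what I call "your Hadoop directory" in the rest of this guide.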
2) Run the following command to find your Java home directory:
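On macOS the usual tool for this is the `/usr/libexec/java_home` helper. A small sketch, assuming that helper is what you want to query (`JAVA_HOME_DIR` is an illustrative name):

```shell
# macOS ships this helper; it prints the home directory of the default JDK.
# `|| true` keeps the shell going when the helper or a JDK is missing.
JAVA_HOME_DIR="$(/usr/libexec/java_home 2>/dev/null || true)"
echo "Java home: ${JAVA_HOME_DIR:-not found}"
```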
The command prints the path of your Java home directory, which you will need in the next step.
3) In the distribution, edit the file etc/hadoop/hadoop-env.sh to define some parameters as follows:
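At a minimum this means pointing `JAVA_HOME` at your JDK. The sketch below assumes that is the only parameter you need to change; if your setup requires more, add them the same way.

```shell
# Line to put in etc/hadoop/hadoop-env.sh.
# Assumption: you are on macOS, where /usr/libexec/java_home prints the JDK path.
# You can also paste the literal path from the previous step instead.
export JAVA_HOME="$(/usr/libexec/java_home 2>/dev/null)"
```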
4) Try the following command:
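In the official guide this is simply the `hadoop` script run without arguments. A sketch, assuming your current directory is the unpacked distribution:

```shell
HADOOP_CMD=bin/hadoop   # path relative to the Hadoop directory

if [ -x "$HADOOP_CMD" ]; then
  # With no arguments the script just prints its usage text.
  "$HADOOP_CMD"
else
  echo "$HADOOP_CMD not found; cd into the Hadoop directory first" >&2
fi
```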
This will display the usage documentation for the hadoop script.
Now you are ready to start your Hadoop cluster in one of the three supported modes:
- Standalone mode
- Pseudo-distributed mode
- Fully-distributed mode
We will go through pseudo-distributed mode and run a MapReduce job on YARN here. In this mode, Hadoop runs on a single node and each Hadoop daemon runs in a separate Java process.
4. Configuration
Edit the following config files in your Hadoop directory:
1) etc/hadoop/core-site.xml:
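The sketch below writes the file from the shell, assuming the value used in the official single-node guide (`hdfs://localhost:9000`). Note that it overwrites the existing file, so paste the `<property>` element by hand if you would rather keep what is already there.

```shell
# Run from the Hadoop directory; mkdir -p is a no-op inside a real distribution.
CONF_DIR=etc/hadoop
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Default filesystem: the HDFS NameNode on this machine, port 9000 -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
```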
2) etc/hadoop/hdfs-site.xml:
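With only one DataNode there is nowhere to put extra copies of a block, so the official guide sets the replication factor to 1. A sketch under that assumption (again, this overwrites the file):

```shell
# Run from the Hadoop directory.
CONF_DIR=etc/hadoop
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/hdfs-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Keep a single copy of each block: this cluster has one DataNode -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
```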
3) etc/hadoop/mapred-site.xml:
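This file tells MapReduce to submit jobs to YARN instead of running them locally. The sketch assumes the single property from the official guide is enough for your release; newer releases may list more in their version of the guide.

```shell
# Run from the Hadoop directory. Some releases ship only a
# mapred-site.xml.template, so this file may not exist yet.
CONF_DIR=etc/hadoop
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Run MapReduce jobs on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
```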
4) etc/hadoop/yarn-site.xml:
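The NodeManager needs the shuffle auxiliary service so that map output can reach the reducers. A sketch with that one property, taken from the official guide; treat it as a starting point rather than a complete YARN configuration.

```shell
# Run from the Hadoop directory.
CONF_DIR=etc/hadoop
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/yarn-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Enable the shuffle service that MapReduce jobs rely on -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF
```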
5. Execution
1) Format and start HDFS and YARN
Format the filesystem:
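A sketch of the format command from the official guide, assuming you run it from the Hadoop directory. Do this only once for a new installation, since formatting wipes the NameNode's existing metadata.

```shell
HDFS_CMD=bin/hdfs   # path relative to the Hadoop directory

if [ -x "$HDFS_CMD" ]; then
  # Initialise a fresh, empty HDFS namespace.
  "$HDFS_CMD" namenode -format
else
  echo "$HDFS_CMD not found; cd into the Hadoop directory first" >&2
fi
```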
Start NameNode daemon and DataNode daemon:
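The distribution ships a helper script for this. A sketch, assuming the default layout with scripts under `sbin/`:

```shell
START_DFS=sbin/start-dfs.sh   # path relative to the Hadoop directory

if [ -x "$START_DFS" ]; then
  # Starts the HDFS daemons over ssh, which is why passphrase-less
  # login to localhost was set up earlier.
  "$START_DFS"
else
  echo "$START_DFS not found; cd into the Hadoop directory first" >&2
fi
```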
Now you can browse the web interface for the NameNode at http://localhost:50070/ (on Hadoop 3.x the port is 9870).
Make the HDFS directories required to execute MapReduce jobs:
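MapReduce jobs resolve relative HDFS paths against `/user/<your username>`, so that directory has to exist. A sketch that uses `whoami` for the username and `-p` to create both levels in one call:

```shell
HDFS_USER_DIR="/user/$(whoami)"

if [ -x bin/hdfs ]; then
  # -p also creates the parent /user directory if it is missing.
  bin/hdfs dfs -mkdir -p "$HDFS_USER_DIR"
else
  echo "bin/hdfs not found; cd into the Hadoop directory first" >&2
fi
```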
Start ResourceManager daemon and NodeManager daemon:
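YARN has its own start script next to the HDFS one. A sketch, again assuming the default `sbin/` layout:

```shell
START_YARN=sbin/start-yarn.sh   # path relative to the Hadoop directory

if [ -x "$START_YARN" ]; then
  # Brings up the ResourceManager and NodeManager daemons.
  "$START_YARN"
else
  echo "$START_YARN not found; cd into the Hadoop directory first" >&2
fi
```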
Browse the web interface for the ResourceManager at http://localhost:8088/
2) Test the example code that comes with the Hadoop distribution
Copy the input files into the distributed filesystem:
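The official guide uses Hadoop's own config directory as sample input. A sketch under that assumption; the relative name `input` ends up under your HDFS home directory:

```shell
LOCAL_INPUT=etc/hadoop   # local directory used as sample input
HDFS_INPUT=input         # relative HDFS path, i.e. /user/<your username>/input

if [ -x bin/hdfs ]; then
  # Upload the local directory into HDFS.
  bin/hdfs dfs -put "$LOCAL_INPUT" "$HDFS_INPUT"
else
  echo "bin/hdfs not found; cd into the Hadoop directory first" >&2
fi
```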
Run some of the examples provided:
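A sketch of the `grep` example from the official guide. The examples jar carries the Hadoop version in its file name, so I look it up with a wildcard instead of guessing the version:

```shell
# Locate the examples jar without hard-coding the Hadoop version.
EXAMPLES_JAR="$(ls share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar 2>/dev/null | head -n 1)"

if [ -n "$EXAMPLES_JAR" ]; then
  # Read the HDFS directory "input" and write every match of the regex,
  # with its count, to the HDFS directory "output".
  # The job fails if "output" already exists.
  bin/hadoop jar "$EXAMPLES_JAR" grep input output 'dfs[a-z.]+'
else
  echo "examples jar not found; run this from the Hadoop directory" >&2
fi
```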
This example counts the words starting with 'dfs' in the input.
Examine the output files:
Copy the output files from the distributed filesystem to the local filesystem and examine them:
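A sketch of the copy-then-read variant, assuming the job wrote to the HDFS directory `output` and that no local directory of that name exists yet:

```shell
LOCAL_OUT=output   # local directory that will receive the copy

if [ -x bin/hdfs ]; then
  # Copy the HDFS directory "output" to the local filesystem, then print it.
  bin/hdfs dfs -get output "$LOCAL_OUT"
  cat "$LOCAL_OUT"/*
else
  echo "bin/hdfs not found; cd into the Hadoop directory first" >&2
fi
```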
Or view the output files directly on the distributed filesystem:
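A sketch of reading the results straight from HDFS. I quote the wildcard so that the Hadoop shell expands it, not your local shell:

```shell
HDFS_OUT_GLOB='output/*'   # every file in the job's HDFS output directory

if [ -x bin/hdfs ]; then
  # Print the result files without copying them to the local disk.
  bin/hdfs dfs -cat "$HDFS_OUT_GLOB"
else
  echo "bin/hdfs not found; cd into the Hadoop directory first" >&2
fi
```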
Either way, you will see each matched string listed with its count.
3) Stop YARN and HDFS
When you're done, stop the daemons with:
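A sketch of the shutdown, using the stop scripts that mirror the start scripts. YARN goes first, then HDFS, which is the reverse of the start order.

```shell
# Stop YARN first, then HDFS.
STOP_SCRIPTS="sbin/stop-yarn.sh sbin/stop-dfs.sh"

for script in $STOP_SCRIPTS; do
  if [ -x "$script" ]; then
    "$script"
  else
    echo "$script not found; cd into the Hadoop directory first" >&2
  fi
done
```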