I. Preparation
Prepare four machines: node0001, node0002, node0003, node0004.
1. Download jdk-8u341-linux-x64.tar.gz
2. Download hadoop-2.6.5.tar.gz
3. Set the hostnames
Edit /etc/hostname on the four machines, setting node0001, node0002, node0003, and node0004 respectively.
4. Configure /etc/hosts
For convenience, add hostname-to-IP mappings on every machine:
172.17.0.2 node0001
172.17.0.3 node0002
172.17.0.4 node0003
172.17.0.5 node0004
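The mappings above have to land in /etc/hosts on every node. A small helper can append them in one shot; this is just a sketch — `add_cluster_hosts` is a made-up name, and the addresses are the ones from the table above.

```shell
# add_cluster_hosts HOSTS_FILE — append the four cluster mappings
# (hypothetical helper; run against /etc/hosts as root on each node).
add_cluster_hosts() {
  cat >> "$1" <<'EOF'
172.17.0.2 node0001
172.17.0.3 node0002
172.17.0.4 node0003
172.17.0.5 node0004
EOF
}
```

Then run `add_cluster_hosts /etc/hosts` as root on each machine.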
II. Install Hadoop
1. Extract the archives
Extract jdk-8u341-linux-x64.tar.gz and hadoop-2.6.5.tar.gz into /opt/:
tar -xf jdk-8u341-linux-x64.tar.gz -C /opt/
tar -xf hadoop-2.6.5.tar.gz -C /opt/
2. Point the env scripts at JAVA_HOME
In /opt/hadoop-2.6.5/etc/hadoop/ there are hadoop-env.sh, mapred-env.sh, and yarn-env.sh; set JAVA_HOME in each of them:
export JAVA_HOME=/opt/jdk1.8.0_341/
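Editing three files by hand is easy to get wrong; a sed loop can apply the same line to all of them. This is a sketch — `set_java_home` is a made-up helper, it assumes each file already carries an uncommented `export JAVA_HOME=...` line, and the paths are the ones used in this guide.

```shell
# set_java_home CONF_DIR JDK_PATH — rewrite the JAVA_HOME export
# in the three Hadoop env scripts (hypothetical helper).
set_java_home() {
  conf="$1"; jdk="$2"
  for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do
    # only touch files that exist; replace the whole export line
    [ -f "$conf/$f" ] && sed -i "s|^export JAVA_HOME=.*|export JAVA_HOME=$jdk|" "$conf/$f"
  done
  return 0
}

# As used in this guide:
set_java_home /opt/hadoop-2.6.5/etc/hadoop /opt/jdk1.8.0_341/
```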
3. Configure the NameNode address and the data directory
Edit /opt/hadoop-2.6.5/etc/hadoop/core-site.xml; fs.defaultFS names the NameNode and hadoop.tmp.dir sets where HDFS keeps its data:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://node0001:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/var/hadoop/full</value>
    </property>
</configuration>
4. Configure the worker nodes
Edit /opt/hadoop-2.6.5/etc/hadoop/slaves and list the DataNode hosts, one per line:
node0002
node0003
node0004
5. Configure the replication factor and the secondary NameNode
Edit /opt/hadoop-2.6.5/etc/hadoop/hdfs-site.xml; here each block is stored twice and the secondary NameNode runs on node0002:
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>node0002:50090</value>
    </property>
</configuration>
III. Configure SSH
Install openssh-client on node0001 and generate a key pair:
ssh-keygen -t rsa
Start the ssh server on node0001, node0002, node0003, and node0004.
Append node0001's public key to the .ssh/authorized_keys file on node0001, node0002, node0003, and node0004.
(node0001 can then log in to all four machines, itself included, without a password.)
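The append step on each node boils down to the following; `install_pubkey` is a made-up name, and it assumes node0001's id_rsa.pub has already been copied onto the target machine (e.g. with scp).

```shell
# install_pubkey KEY_FILE — append a public key to ~/.ssh/authorized_keys
# with the permissions sshd requires (700 on the dir, 600 on the file).
install_pubkey() {
  sshdir="$HOME/.ssh"
  mkdir -p "$sshdir" && chmod 700 "$sshdir"
  cat "$1" >> "$sshdir/authorized_keys"
  chmod 600 "$sshdir/authorized_keys"
}
```

The permission bits matter: sshd silently ignores an authorized_keys file that is writable by anyone but its owner.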
IV. Usage
1. Format the NameNode
On node0001, format the NameNode:
hdfs namenode -format
A successful format prints output like the following (abridged):
root@node0001:/opt/hadoop-2.6.5/bin# ./hdfs namenode -format
23/10/11 07:26:07 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = node0001/172.17.0.2
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.6.5
23/10/11 07:26:07 INFO util.GSet: Computing capacity for map NameNodeRetryCache
23/10/11 07:26:07 INFO util.GSet: VM type = 64-bit
23/10/11 07:26:07 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB
23/10/11 07:26:07 INFO util.GSet: capacity = 2^15 = 32768 entries
23/10/11 07:26:07 INFO namenode.NNConf: ACLs enabled? false
23/10/11 07:26:07 INFO namenode.NNConf: XAttrs enabled? true
23/10/11 07:26:07 INFO namenode.NNConf: Maximum size of an xattr: 16384
23/10/11 07:26:07 INFO namenode.FSImage: Allocated new BlockPoolId: BP-950352295-172.17.0.2-1697009167802
23/10/11 07:26:07 INFO common.Storage: Storage directory /var/hadoop/full/dfs/name has been successfully formatted.
23/10/11 07:26:07 INFO namenode.FSImageFormatProtobuf: Saving image file /var/hadoop/full/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
23/10/11 07:26:07 INFO namenode.FSImageFormatProtobuf: Image file /var/hadoop/full/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 321 bytes saved in 0 seconds.
23/10/11 07:26:07 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
23/10/11 07:26:07 INFO util.ExitUtil: Exiting with status 0
23/10/11 07:26:07 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at node0001/172.17.0.2
************************************************************/
2. Start HDFS
On node0001, run start-dfs.sh; daemons come up on node0001, node0002, node0003, and node0004 (in the transcript below the DataNodes and the secondary NameNode were still running from an earlier start):
start-dfs.sh
root@node0001:/opt/hadoop-2.6.5/sbin# ./start-dfs.sh
Starting namenodes on [node0001]
node0001: starting namenode, logging to /opt/hadoop-2.6.5/logs/hadoop-root-namenode-node0001.out
node0002: datanode running as process 4561. Stop it first.
node0003: datanode running as process 4508. Stop it first.
node0004: datanode running as process 4535. Stop it first.
Starting secondary namenodes [node0002]
node0002: secondarynamenode running as process 4646. Stop it first.
Check the Java processes on each node with jps:
node0001
root@node0001:/opt/jdk1.8.0_341/bin# ./jps
5463 Jps
5224 NameNode
node0002
root@node0002:/opt/jdk1.8.0_341/bin# ./jps
4561 DataNode
4803 Jps
4646 SecondaryNameNode
node0003
root@node0003:/opt/jdk1.8.0_341/bin# ./jps
4629 Jps
4508 DataNode
node0004
root@node0004:/opt/jdk1.8.0_341/bin# ./jps
4656 Jps
4535 DataNode
Open node0001:50070 in a browser to see the NameNode web UI.
3. Create a directory in HDFS
Create the directory /user/root:
hdfs dfs -mkdir -p /user/root
root@node0001:/opt/hadoop-2.6.5/bin# ./hdfs dfs -mkdir -p /user/root
4. Upload files
Upload two files to /user/root:
hdfs dfs -put /root/test.tar.gz /user/root
hdfs dfs -put /root/test1.tar.gz /user/root
Each source file is 141.3 MB.
root@node0001:/opt/hadoop-2.6.5/bin# ./hdfs dfs -put /root/test.tar.gz /user/root
root@node0001:/opt/hadoop-2.6.5/bin# ./hdfs dfs -put /root/test1.tar.gz /user/root
5. Where the data lives
Each source file is 141.3 MB.
The NameNode metadata on node0001 is under:
root@node0001:/var/hadoop/full/dfs/name/current# ls -lh
total 2.1M
-rw-r--r-- 1 root root 201 Oct 11 15:26 VERSION
-rw-r--r-- 1 root root 42 Oct 11 15:33 edits_0000000000000000001-0000000000000000002
-rw-r--r-- 1 root root 1.0M Oct 11 15:33 edits_0000000000000000003-0000000000000000003
-rw-r--r-- 1 root root 42 Oct 11 15:41 edits_0000000000000000004-0000000000000000005
-rw-r--r-- 1 root root 1.0M Oct 11 15:50 edits_inprogress_0000000000000000006
-rw-r--r-- 1 root root 321 Oct 11 15:33 fsimage_0000000000000000002
-rw-r--r-- 1 root root 62 Oct 11 15:33 fsimage_0000000000000000002.md5
-rw-r--r-- 1 root root 321 Oct 11 15:41 fsimage_0000000000000000005
-rw-r--r-- 1 root root 62 Oct 11 15:41 fsimage_0000000000000000005.md5
-rw-r--r-- 1 root root 2 Oct 11 15:41 seen_txid
The DataNode block files on node0002 are under:
root@node0002:/var/hadoop/full/dfs/data/current/BP-950352295-172.17.0.2-1697009167802/current/finalized/subdir0/subdir0# ls -lh
total 259M
-rw-r--r-- 1 root root 128M Oct 11 15:42 blk_1073741825
-rw-r--r-- 1 root root 1.1M Oct 11 15:42 blk_1073741825_1001.meta
-rw-r--r-- 1 root root 128M Oct 11 15:50 blk_1073741827
-rw-r--r-- 1 root root 1.1M Oct 11 15:50 blk_1073741827_1003.meta
The DataNode block files on node0003 are under:
root@node0003:/var/hadoop/full/dfs/data/current/BP-950352295-172.17.0.2-1697009167802/current/finalized/subdir0/subdir0# ls -lh
total 27M
-rw-r--r-- 1 root root 14M Oct 11 15:42 blk_1073741826
-rw-r--r-- 1 root root 107K Oct 11 15:42 blk_1073741826_1002.meta
-rw-r--r-- 1 root root 14M Oct 11 15:50 blk_1073741828
-rw-r--r-- 1 root root 107K Oct 11 15:50 blk_1073741828_1004.meta
The DataNode block files on node0004 are under:
root@node0004:/var/hadoop/full/dfs/data/current/BP-950352295-172.17.0.2-1697009167802/current/finalized/subdir0/subdir0# ls -lh
total 285M
-rw-r--r-- 1 root root 128M Oct 11 15:42 blk_1073741825
-rw-r--r-- 1 root root 1.1M Oct 11 15:42 blk_1073741825_1001.meta
-rw-r--r-- 1 root root 14M Oct 11 15:42 blk_1073741826
-rw-r--r-- 1 root root 107K Oct 11 15:42 blk_1073741826_1002.meta
-rw-r--r-- 1 root root 128M Oct 11 15:50 blk_1073741827
-rw-r--r-- 1 root root 1.1M Oct 11 15:50 blk_1073741827_1003.meta
-rw-r--r-- 1 root root 14M Oct 11 15:50 blk_1073741828
-rw-r--r-- 1 root root 107K Oct 11 15:50 blk_1073741828_1004.meta
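The listings above are what the default 128 MB block size predicts: each 141.3 MB file becomes one full 128 MB block plus one ~14 MB tail block, and with dfs.replication=2 every block appears on two of the three DataNodes. A quick sanity check of the arithmetic (the byte count is an illustrative approximation of 141.3 MB, not the exact file size):

```shell
# HDFS splits a file into fixed-size blocks; only the last block is smaller.
BLOCK=$((128 * 1024 * 1024))        # default dfs.blocksize, in bytes
FILE=$((1413 * 1024 * 1024 / 10))   # ~141.3 MB, illustrative figure
FULL=$((FILE / BLOCK))              # number of full blocks
TAIL=$((FILE % BLOCK))              # size of the trailing block
echo "$FULL full block(s) + one ${TAIL}-byte tail block"
```

The tail works out to roughly 13.3 MiB, which ls -lh rounds up to the 14M shown in the listings.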
6. Shut down
On node0001, run stop-dfs.sh to stop HDFS:
stop-dfs.sh
The shutdown output lists each daemon as it stops:
root@node0001:/opt/hadoop-2.6.5/sbin# ./stop-dfs.sh
Stopping namenodes on [node0001]
node0001: stopping namenode
node0002: stopping datanode
node0004: stopping datanode
node0003: stopping datanode
Stopping secondary namenodes [node0002]
node0002: stopping secondarynamenode