Hadoop implements a distributed file system, the Hadoop Distributed File System (HDFS). HDFS is highly fault-tolerant and designed to run on low-cost hardware; it provides high-throughput access to application data and suits applications with very large data sets. HDFS relaxes some POSIX requirements to enable streaming access to file system data.
1. Create a hadoop user and set up passwordless SSH login
[root@ipython ~]# groupadd hadoop
[root@ipython ~]# useradd hadoop -g hadoop
[root@ipython ~]# passwd hadoop
[root@ipython ~]# mkdir /tools
[root@ipython ~]# chown hadoop:hadoop /tools/
##SSH##
[root@ipython ~]# su - hadoop
[hadoop@ipython ~]$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
+--[ DSA 1024]----+
|BE* |
|.*.= |
|+ o o . . |
|. o . o + |
| . . S o . |
| = o . |
| o o |
| . |
| |
+-----------------+
[hadoop@ipython ~]$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
[hadoop@ipython ~]$ chmod 0600 ~/.ssh/authorized_keys
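Key-based login fails silently if sshd dislikes the file permissions, so it is worth checking them before moving on. A small sketch (`check_ssh_perms` is a helper name invented here, not a standard tool):

```shell
#!/bin/sh
# Verify the permissions sshd requires before it accepts key logins:
# the .ssh directory must be 700 and authorized_keys must be 600.
check_ssh_perms() {
    dir="$1"
    [ "$(stat -c %a "$dir")" = "700" ] || { echo "need: chmod 700 $dir"; return 1; }
    [ "$(stat -c %a "$dir/authorized_keys")" = "600" ] || { echo "need: chmod 600 $dir/authorized_keys"; return 1; }
    echo "ok"
}
# usage: check_ssh_perms "$HOME/.ssh"
```

After this reports "ok", `ssh localhost` from the hadoop account should log in without a password prompt.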
2. Java JDK already installed (see: [Installing JDK 1.8 on CentOS])
[hadoop@ipython ~]$ java -version
java version "1.8.0_25"
Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
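Hadoop 2.2 requires JDK 1.6 or newer; rather than eyeballing the banner, the major.minor part can be extracted from the `java -version` output (a sketch; `java_major_minor` is a helper invented for this post):

```shell
#!/bin/sh
# Extract e.g. "1.8" from the `java -version` banner, which is
# printed on stderr as: java version "1.8.0_25"
java_major_minor() {
    "$1" -version 2>&1 | sed -n 's/.*version "\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}
# usage: java_major_minor java
```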
3. Download and unpack Hadoop
[hadoop@ipython ~]$ cd /tools/
[hadoop@ipython tools]$ wget https://archive.apache.org/dist/hadoop/core/hadoop-2.2.0/hadoop-2.2.0.tar.gz
[hadoop@ipython tools]$ tar zxf hadoop-2.2.0.tar.gz
[hadoop@ipython tools]$ ln -s /tools/hadoop-2.2.0 /tools/hadoop
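Apache publishes checksum files next to each release on archive.apache.org, so it is prudent to verify the tarball before unpacking it. A sketch (`verify_tarball` is an illustrative name; take the digest from the release's checksum file):

```shell
#!/bin/sh
# Check a downloaded file against an expected SHA-256 digest.
# Returns 0 when the digest matches, non-zero otherwise.
verify_tarball() {
    file="$1"
    expected="$2"
    echo "$expected  $file" | sha256sum -c - > /dev/null 2>&1
}
# usage: verify_tarball hadoop-2.2.0.tar.gz "<digest from the release page>"
```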
4. Add Hadoop environment variables
[hadoop@ipython tools]$ cat >> ~/.bashrc << EOF
export HADOOP_PREFIX="/tools/hadoop"
export PATH=\$PATH:\$HADOOP_PREFIX/bin
export PATH=\$PATH:\$HADOOP_PREFIX/sbin
export HADOOP_MAPRED_HOME=\${HADOOP_PREFIX}
export HADOOP_COMMON_HOME=\${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=\${HADOOP_PREFIX}
export YARN_HOME=\${HADOOP_PREFIX}
####hadoop-env####
export JAVA_HOME="/tools/java"
export HADOOP_COMMON_LIB_NATIVE_DIR=\${HADOOP_PREFIX}/lib/native
export HADOOP_OPTS="-Djava.library.path=\$HADOOP_PREFIX/lib"
####yarn-env####
export HADOOP_COMMON_LIB_NATIVE_DIR=\${HADOOP_PREFIX}/lib/native
export HADOOP_OPTS="-Djava.library.path=\$HADOOP_PREFIX/lib"
EOF
[hadoop@ipython tools]$ source ~/.bashrc
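A typo in `HADOOP_PREFIX` only surfaces much later as confusing "command not found" errors, so a quick structural check after sourcing `~/.bashrc` is cheap insurance (a sketch; `hadoop_layout_ok` is a name made up here):

```shell
#!/bin/sh
# A Hadoop install prefix should contain bin/, sbin/ and etc/hadoop/;
# anything else usually means a broken symlink or a mistyped path.
hadoop_layout_ok() {
    [ -d "$1/bin" ] && [ -d "$1/sbin" ] && [ -d "$1/etc/hadoop" ]
}
# usage: hadoop_layout_ok "$HADOOP_PREFIX" || echo "check HADOOP_PREFIX"
```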
5. Edit the configuration files
[hadoop@ipython tools]$ cd $HADOOP_PREFIX/etc/hadoop
[hadoop@ipython hadoop]$ vi core-site.xml
#-------------------------------------------------------#
<!-- minimal single-node settings; hostname and port assumed for this box -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://ipython.me:9000</value>
  </property>
</configuration>
#-------------------------------------------------------#
[hadoop@ipython hadoop]$ vi hdfs-site.xml
#-------------------------------------------------------#
<!-- the dfs paths match the namenode format log in step 6 -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/tools/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/tools/hadoop/dfs/data</value>
  </property>
</configuration>
#-------------------------------------------------------#
[hadoop@ipython hadoop]$ cp mapred-site.xml.template mapred-site.xml
[hadoop@ipython hadoop]$ vi mapred-site.xml
#-------------------------------------------------------#
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
#-------------------------------------------------------#
[hadoop@ipython hadoop]$ vi yarn-site.xml
#-------------------------------------------------------#
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
#-------------------------------------------------------#
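After editing four XML files it is easy to lose track of what actually got set. For flat `*-site.xml` files like these, a crude sed lookup is enough to read a property back (a sketch, not a general XML parser; `get_prop` is a name invented here and assumes `<value>` sits on the line after `<name>`):

```shell
#!/bin/sh
# Print the <value> of a named property from a Hadoop *-site.xml.
# Relies on the conventional layout where <value> follows <name>
# on the next line; it is not a real XML parser.
get_prop() {
    sed -n "/<name>$2<\/name>/{n;s/.*<value>\(.*\)<\/value>.*/\1/p;}" "$1"
}
# usage: get_prop hdfs-site.xml dfs.replication
```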
6. Start HDFS
[hadoop@ipython hadoop]$ hdfs namenode -format
15/01/23 23:55:40 INFO namenode.FSImage: Saving image file /tools/hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
15/01/23 23:55:40 INFO namenode.FSImage: Image file /tools/hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 198 bytes saved in 0 seconds.
15/01/23 23:55:40 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
15/01/23 23:55:40 INFO util.ExitUtil: Exiting with status 0
15/01/23 23:55:40 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ipython.me/10.211.55.40
************************************************************/
##Start All(namenode,datanode,yarn)###
[hadoop@ipython hadoop]$ cd $HADOOP_PREFIX/sbin
[hadoop@ipython sbin]$ start-all.sh
##Jps##
[hadoop@ipython sbin]$ jps
2656 Jps
2000 DataNode
2275 ResourceManager
1892 NameNode
2374 NodeManager
2141 SecondaryNameNode
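All five daemons must be present in the `jps` output for a working single-node cluster; a missing DataNode, for example, often points at a stale `dfs/data` directory left over from an earlier format. A checking sketch (`daemons_ok` is an invented helper):

```shell
#!/bin/sh
# Confirm every single-node daemon appears in captured `jps` output.
daemons_ok() {
    out="$1"
    for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        echo "$out" | grep -q " $d\$" || { echo "missing: $d"; return 1; }
    done
    echo "ok"
}
# usage: daemons_ok "$(jps)"
```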
Visit the HDFS NameNode web page (default port 50070 in Hadoop 2.x, e.g. http://ipython.me:50070):
(screenshot: hadoop-hadoop-cluster)
Visit the NameNode web UI:
(screenshot: hadoop-namenode-info)
Visit the ResourceManager interface (default port 8088):
(screenshot: hadoop-node-manager)
Test Hadoop
[hadoop@ipython hadoop]$ hdfs dfs -mkdir /user
[hadoop@ipython hadoop]$ hdfs dfs -mkdir -p /test
[hadoop@ipython hadoop]$ hdfs dfs -put /tmp /test/logs
(screenshot: hadoop_test)