I have set up AWS EMR and SSH'd into the master node. I want to copy a file into HDFS. The relevant snippet from my program is:
os.system('/home/hadoop/bin/hdfs dfs -put %s PATH_to_HADOOP' % tmp_output)
I want to fill in a path on my HDFS file system there.
I run:
[ec2-user@ip-172-31-0-185 input]$ /home/hadoop/bin/hdfs dfs -ls /
Found 2 items
drwxr-xr-x   - hadoop supergroup          0 2014-04-14 22:21 /hbase
drwxrwx---   - hadoop supergroup          0 2014-04-14 22:19 /tmp
Then I try:
[ec2-user@ip-172-31-0-185 input]$ /home/hadoop/bin/hdfs dfs -mkdir /tmp/stockmarkets
mkdir: Permission denied: user=ec2-user, access=EXECUTE, inode="/tmp":hadoop:supergroup:drwxrwx---
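The error above can be read directly off the mode string: /tmp is owned by hadoop:supergroup with mode drwxrwx---, so any user who is neither the owner nor in the group falls into the "other" class, which has no bits set. Below is a rough Python sketch of that check. The function name `can_access` is hypothetical, the model is simplified (it ignores the HDFS superuser and ACLs), and the group memberships passed in are assumptions for illustration only.

```python
def can_access(mode, owner, group, user, user_groups, need):
    """Simplified POSIX-style permission check (no superuser, no ACLs).

    mode        -- string such as "drwxrwx---" as printed by `hdfs dfs -ls`
    need        -- one of "r", "w", "x"
    user_groups -- set of groups the user belongs to (assumed, not queried)
    """
    perms = mode[1:]              # drop the leading file-type character
    if user == owner:
        bits = perms[0:3]         # owner class
    elif group in user_groups:
        bits = perms[3:6]         # group class
    else:
        bits = perms[6:9]         # other class
    return need in bits


# ec2-user is not the owner and is assumed not to be in supergroup:
print(can_access("drwxrwx---", "hadoop", "supergroup",
                 "ec2-user", {"ec2-user"}, "x"))
# The hadoop user owns /tmp, so the owner bits apply:
print(can_access("drwxrwx---", "hadoop", "supergroup",
                 "hadoop", {"hadoop"}, "x"))
```

Creating /tmp/stockmarkets requires traversing /tmp, which is why the denied access is reported as EXECUTE on the /tmp inode.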
So, to allow ec2-user to use Hadoop, I followed these instructions:
http://cloudcelebrity.wordpress.com/2013/06/05/handling-permission-denied-error-on-hdfs/
But after I ran the following (with ubuntu replaced by ec2-user):
sudo adduser ec2-user hadoop
instead of the "adding user" message, I get:
Usage: useradd [options] LOGIN
Options:
  -b, --base-dir BASE_DIR       base directory for the home directory of the new account
  -c, --comment COMMENT         GECOS field of the new account
  -d, --home-dir HOME_DIR       home directory of the new account
  -D, --defaults                print or change default useradd configuration
  -e, --expiredate EXPIRE_DATE  expiration date of the new account
  -f, --inactive INACTIVE       password inactivity period of the new account
  -g, --gid GROUP               name or ID of the primary group of the new account
  -G, --groups GROUPS           list of supplementary groups of the new account
  -h, --help                    display this help message and exit
  -k, --skel SKEL_DIR           use this alternative skeleton directory
  -K, --key KEY=VALUE           override /etc/login.defs defaults
  -l, --no-log-init             do not add the user to the lastlog and faillog databases
  -m, --create-home             create the user's home directory
  -M, --no-create-home          do not create the user's home directory
  -N, --no-user-group           do not create a group with the same name as the user
  -o, --non-unique              allow to create users with duplicate (non-unique) UID
  -p, --password PASSWORD       encrypted password of the new account
  -r, --system                  create a system account
  -s, --shell SHELL             login shell of the new account
  -u, --uid UID                 user ID of the new account
  -U, --user-group              create a group with the same name as the user
  -Z, --selinux-user SEUSER     use a specific SEUSER for the SELinux user mapping
So I am both confused and stuck. Please help.
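For Amazon EMR, SSH in as hadoop@(publicIP).
From there you can do whatever you like with HDFS without having to "su". I just did a mkdir, ran distcp, and ran a streaming job. Per the EMR instructions, I do everything as hadoop@.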
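Once logged in as hadoop, the copy step from the question could look roughly like the sketch below. It is only an illustration: the helper names `build_put_commands` and `put_file` are hypothetical, the hdfs path and the /tmp/stockmarkets target are taken from the question, and the `-mkdir -p` flag assumes a Hadoop 2.x `hdfs dfs` client. It uses `subprocess.check_call` with an argument list instead of `os.system`, so a failed command raises an error and file names with spaces are not split by the shell.

```python
import subprocess

# Path to the hdfs launcher as shown in the question; adjust if yours differs.
HDFS_BIN = "/home/hadoop/bin/hdfs"


def build_put_commands(local_path, hdfs_dir):
    """Return the two hdfs invocations as argument lists (nothing is run here)."""
    return [
        # Create the target directory; -p makes this a no-op if it exists.
        [HDFS_BIN, "dfs", "-mkdir", "-p", hdfs_dir],
        # Copy the local file into that directory.
        [HDFS_BIN, "dfs", "-put", local_path, hdfs_dir],
    ]


def put_file(local_path, hdfs_dir):
    """Run the commands in order; raises CalledProcessError on failure."""
    for cmd in build_put_commands(local_path, hdfs_dir):
        subprocess.check_call(cmd)


# Example call, to be run on the master node as the hadoop user:
#   put_file(tmp_output, "/tmp/stockmarkets")
```

Keeping the command construction separate from execution makes it easy to print or log the exact commands before running them on the cluster.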