问答文章1 问答文章501 问答文章1001 问答文章1501 问答文章2001 问答文章2501 问答文章3001 问答文章3501 问答文章4001 问答文章4501 问答文章5001 问答文章5501 问答文章6001 问答文章6501 问答文章7001 问答文章7501 问答文章8001 问答文章8501 问答文章9001 问答文章9501

如何在linux下安装hadoop

发布网友 发布时间:2022-04-22 10:00

我来回答

1个回答

热心网友 时间:2022-05-06 07:42

一、前期准备:
下载hadoop: http://hadoop.apache.org/core/releases.html
http://hadoop.apache.org/common/releases.html
http://www.apache.org/dyn/closer.cgi/hadoop/core/
http://labs.xiaonei.com/apache-mirror/hadoop/core/hadoop-0.20.1/hadoop-0.20.1.tar.gz
http://labs.xiaonei.com/apache-mirror/hadoop/
二、硬件环境
共有3台机器,均使用的CentOS,Java使用的是jdk1.6.0。
三、安装JAVA6
sudo apt-get install sun-java6-jdk
/etc/environment
打开之后加入:#中间是以英文的冒号隔开,记得windows中是以英文的分号做为分隔的
CLASSPATH=.:/usr/local/java/lib
JAVA_HOME=/usr/local/java
三、配置host表
[root@hadoop ~]# vi /etc/hosts
127.0.0.1 localhost
192.168.13.100 namenode
192.168.13.108 datanode1
192.168.13.110 datanode2
[root@test ~]# vi /etc/hosts
127.0.0.1 localhost
192.168.13.100 namenode
192.168.13.108 datanode1
[root@test2 ~]# vi /etc/host
127.0.0.1 localhost
192.168.13.100 namenode
192.168.13.110 datanode2
添加用户和用户组
addgroup hadoop
adser hadoop
usermod -a -G hadoop hadoop
passwd hadoop
配置ssh:
服务端:
su hadoop
ssh-keygen -t rsa
cp id_rsa.pub authorized_keys
客户端
chmod 700 /home/hadoop
chmod 755 /home/hadoop/.ssh
su hadoop
cd /home
mkdir .ssh
服务端:
chmod 644 /home/hadoop/.ssh/authorized_keys
scp authorized_keys datanode1:/home/hadoop/.ssh/
scp authorized_keys datanode2:/home/hadoop/.ssh/
ssh datanode1
ssh datanode2
 如果ssh配置好了就会出现以下提示信息
The authenticity of host [dbrg-2] can't be established.
Key fingerpr is 1024 5f:a0:0b:65:d3:82:df:ab:44:62:6d:98:9c:fe:e9:52.
Are you sure you want to continue connecting (yes/no)?
  OpenSSH告诉你它不知道这台主机但是你不用担心这个问题你是第次登录这台主机键入“yes”这将把
这台主机“识别标记”加到“~/.ssh/know_hosts”文件中第 2次访问这台主机时候就不会再显示这条提示信
不过别忘了测试本机ssh dbrg-1
 
mkdir /home/hadoop/HadoopInstall
tar -zxvf hadoop-0.20.1.tar.gz -C /home/hadoop/HadoopInstall/
cd /home/hadoop/HadoopInstall/
ln -s hadoop-0.20.1 hadoop
export JAVA_HOME=/usr/local/java
export CLASSPATH=.:/usr/local/java/lib
export HADOOP_HOME=/home/hadoop/HadoopInstall/hadoop
export HADOOP_CONF_DIR=/home/hadoop/hadoop-conf
export PATH=$HADOOP_HOME/bin:$PATH
cd $HADOOP_HOME/conf/
mkdir /home/hadoop/hadoop-conf
cp hadoop-env.sh core-site.xml hdfs-site.xml mapred-site.xml masters slaves /home/hadoop/hadoop-conf
vi $HADOOP_HOME/hadoop-conf/hadoop-env.sh
# The java implementation to use. Required. --修改成你自己jdk安装的目录
export JAVA_HOME=/usr/local/java

export HADOOP_CLASSPATH=.:/usr/local/java/lib
# The maximum amount of heap to use, in MB. Default is 1000.--根据你的内存大小调整
export HADOOP_HEAPSIZE=200
vi /home/hadoop/.bashrc
export JAVA_HOME=/usr/local/java
export CLASSPATH=.:/usr/local/java/lib
export HADOOP_HOME=/home/hadoop/HadoopInstall/hadoop
export HADOOP_CONF_DIR=/home/hadoop/hadoop-conf
export PATH=$HADOOP_HOME/bin:$PATH
配置
namenode
#vi $HADOOP_CONF_DIR/slaves
192.168.13.108
192.168.13.110
#vi $HADOOP_CONF_DIR/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://192.168.13.100:9000</value>
</property>
</configuration>
#vi $HADOOP_CONF_DIR/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
</configuration>
#vi $HADOOP_CONF_DIR/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>mapred.job.tracker</name>
<value>192.168.13.100:11000</value>
</property>
</configuration>
~
在slave上的配置文件如下(hdfs-site.xml不需要配置):
[root@test12 conf]# cat core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://namenode:9000</value>
</property>
</configuration>
[root@test12 conf]# cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>namenode:11000</value>
</property>
</configuration>
启动
export PATH=$HADOOP_HOME/bin:$PATH
hadoop namenode -format
start-all.sh
停止stop-all.sh
在hdfs上创建danchentest文件夹,上传文件到此目录下
$HADOOP_HOME/bin/hadoop fs -mkdir danchentest
$HADOOP_HOME/bin/hadoop fs -put $HADOOP_HOME/README.txt danchentest
cd $HADOOP_HOME
hadoop jar hadoop-0.20.1-examples.jar wordcount /user/hadoop/danchentest/README.txt output1
09/12/21 18:31:44 INFO input.FileInputFormat: Total input paths to process : 1
09/12/21 18:31:45 INFO mapred.JobClient: Running job: job_200912211824_0002
09/12/21 18:31:46 INFO mapred.JobClient: map 0% rece 0%
09/12/21 18:31:53 INFO mapred.JobClient: map 100% rece 0%
09/12/21 18:32:05 INFO mapred.JobClient: map 100% rece 100%
09/12/21 18:32:07 INFO mapred.JobClient: Job complete: job_200912211824_0002
09/12/21 18:32:07 INFO mapred.JobClient: Counters: 17
09/12/21 18:32:07 INFO mapred.JobClient: Job Counters
09/12/21 18:32:07 INFO mapred.JobClient: Launched rece tasks=1
查看输出结果文件,这个文件在hdfs上
[root@test11 hadoop]# hadoop fs -ls output1
Found 2 items
drwxr-xr-x - root supergroup 0 2009-09-30 16:01 /user/root/output1/_logs
-rw-r--r-- 3 root supergroup 1306 2009-09-30 16:01 /user/root/output1/part-r-00000
[root@test11 hadoop]# hadoop fs -cat output1/part-r-00000
(BIS), 1
(ECCN) 1
查看hdfs运行状态,可以通过web界面来访问http://192.168.13.100:50070/dfshealth.jsp;查看map-rece信息,
可以通过web界面来访问http://192.168.13.100:50030/jobtracker.jsp;下面是直接命令行看到的结果。
出现08/01/25 16:31:40 INFO ipc.Client: Retrying connect to server: foo.bar.com/1.1.1.1:53567. Already tried 1 time(s).
的原因是没有格式化:hadoop namenode -format
声明声明:本网页内容为用户发布,旨在传播知识,不代表本网认同其观点,若有侵权等问题请及时与本网联系,我们将在第一时间删除处理。E-MAIL:11247931@qq.com
电脑wifi已禁用怎么打开电脑无线网络禁用了怎么恢复 ...禁用网络在哪重开win7笔记本无线网络被禁用了怎么办 win7网络禁用怎么恢复 windows7网络被禁用怎么恢复 Win7系统本地连接禁用了怎么恢复Win7系统启动本地连接的两种方法图文... 梦见家人去世什么预兆 ...经缝针现在基本痊愈,一个月过去了现在就是小腿还不能贴大腿,最近感... 小腿缝针拆线三个月了表皮长好了里面的肉怎么有点带黑红色还有点白色... 小腿迎面骨掉快深宽都1厘米左右的肉。当时没缝针。已经20天了。天天... 运费和快递费各走 什么科目? 快递费用放什么科目 吃桃子的季节是哪个? 桃子哪个季节成熟? linux怎么安装hadoop 桃子在什么时候开花结果 linux搭建hadoop步骤linux搭建hadoop 桃子是在哪个季节结果的? 梨和桃在什么季节结果实? 404 Not Found QTreeView和QTreewidget的区别? 怎么设置QTreeWidget的宽度 如何QTreeWidget的水平滚动条自动出现 qtreewidget重命名判断重名 QTreeWidget打开节点 如何使QTreeWidget不显示虚线边框 璇烽棶闱掓槬涓囧瞾杩欐湰涔〉湪闾i噷鍙&#xFFFD;互涔板埌锛岀帇钂% 关于qt中的QTreeWidget的拖放问题 QTreeWidget 的弹出菜单怎么设置快捷键 如何不显示Qtreewidget左边的减号和加号 404 Not Found 404 Not Found 桃子是什么季节的? 怎么在linux上安装hadoop 桃树什么时候开花结果 几月份桃树开花 Linux命令 将该hadoop文件夹的属主用户设为hadoop, $ sudo chown -R hadoop:hadoop hadoop 404 Not Found 在Linux虚拟机上配置Hadoop,在初始化时显示权限不够 linux系统里面为什么安装完一个服务,要建一个相应的nologin用户呢? 为什么用瑞星杀毒时,CUP100%暴满 404 Not Found linux下怎么知道hadoop安装成功 linux中Hadoop 安装和配置问题 如何辨别电热油汀取暖器是否漏油? 怎样判断油汀漏油 linux 在执行初始化hadoopt时,提示没有权限 油丁漏油怎么回事 404 Not Found 电热油汀漏油。。。 油汀漏油是怎么回事。拜托各位了 3Q 上泉村冰雕在哪里 油汀电暖器漏油有毒吗