Big Data Platform Setup Tutorial (CDH)
Contents

1. Introduction
   1.1 Purpose
2. Detailed setup steps
   2.1 Preliminary preparation
       2.1.1 Add hostnames
       2.1.2 Add sub-users
       2.1.3 Set up passwordless login
       2.1.4 Disable SELinux
       2.1.5 Disable the firewall
       2.1.6 Install the JDK
   2.2 Install the Hadoop cluster
       2.2.1 Zookeeper
             2.2.1.1 Configure Zookeeper
             2.2.1.2 Using Zookeeper
       2.3.2 Hadoop
             2.3.2.1 Configure Hadoop
             2.3.2.2 Starting Hadoop for the first time
       2.3.3 Spark
             2.3.3.1 Install Scala (all nodes)
             2.3.3.2 Install Spark
       2.3.4 Hive
             2.3.4.1 Deploy the MySQL master-slave cluster
             2.3.4.2 Configure Hive
       2.3.5 Sqoop
             2.3.5.1 Configure Sqoop
             2.3.5.2 Using Sqoop
   2.4 Install the HBase cluster
       2.4.1 HBase
             2.4.1.2 Deploy the distributed HBase cluster
             2.4.1.3 Operating HBase
       2.4.2 Kafka
             2.4.2.1 Distributed Kafka deployment
             2.4.2.2 Using Kafka
       2.4.3 Kafka-Monitor
             2.4.3.1 Configure Kafka-Monitor
   2.5 Environment variables
       2.5.1 Environment variables added on the Hadoop nodes
       2.5.2 Environment variables configured on the HBase cluster nodes

1. Introduction

1.1 Purpose
This tutorial is written for CentOS 7.3 and describes how to build a big data platform with the following components: Zookeeper, HDFS, YARN, MapReduce2, HBase, Spark, Hive, and Sqoop. The system consists of two clusters: a Hadoop cluster and an HBase cluster.

Hadoop cluster, management nodes (2): NameNode (hadoop), DFSZKFailoverController (hadoop), ResourceManager (hadoop), Hive (MySQL), Sqoop, MySQL
Hadoop cluster, data nodes (3) - hadoop01, hadoop02, hadoop03: JournalNode (hadoop), DataNode (hadoop), QuorumPeerMain (Zookeeper), Spark (master/worker), NodeManager (hadoop)
HBase cluster, management nodes (2) - hbaseManager01, hbaseManager02: NameNode (hadoop), DFSZKFailoverController (hadoop), ResourceManager (hadoop), HMaster (hbase), KafkaOffsetMonitor
HBase cluster, data nodes (3) - hbase01, hbase02, hbase03: JournalNode (hadoop), DataNode (hadoop), Zookeeper, HRegionServer (hbase), Kafka, NodeManager (hadoop)

Figure 1.1 Components

2. Detailed setup steps

2.1 Preliminary preparation

Perform the following on all nodes.

2.1.1 Add hostnames

Change each node's hostname and add the hostname-to-IP mappings to the /etc/hosts file on every node (this can be skipped if a DNS server is available):

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.19.31 hadoop01
192.168.19.32 hadoop02
192.168.19.33 hadoop03
192.168.19.34 hadoop04
192.168.19.35 hadoop05

Figure 2-1-1 Adding hostnames
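The hostname change itself is not shown as a command; a minimal sketch for CentOS 7.3, using the names and addresses above (run as root on each node, substituting that node's own name), could look like this:

# Sketch only: set this node's hostname (repeat with the matching name on every node).
hostnamectl set-hostname hadoop01

# Append the cluster mappings to /etc/hosts on every node.
cat >> /etc/hosts <<'EOF'
192.168.19.31 hadoop01
192.168.19.32 hadoop02
192.168.19.33 hadoop03
192.168.19.34 hadoop04
192.168.19.35 hadoop05
EOF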
2.1.2 Add sub-users

Add the sub-users on all hosts. The sub-user for the Hadoop cluster is hadoop, and the sub-user for the HBase cluster is hbase:

adduser hadoop
adduser hbase

2.1.3 Set up passwordless login

Generate SSH keys and configure passwordless login for the sub-users between the hosts: collect the sub-user's id_rsa.pub from every host into authorized_keys, copy that authorized_keys file to all nodes, and change its permissions to 644.

chown -R hadoop:hadoop /home/hadoop
chmod 700 /home/hadoop
chmod 700 /home/hadoop/.ssh
chmod 644 /home/hadoop/.ssh/authorized_keys
chmod 600 /home/hadoop/.ssh/id_rsa

After configuration, verify it: the setup is successful when the hosts can log in to each other without a password.
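The key generation and distribution steps above are described but not shown as commands; a minimal sketch, assuming it is run as the hadoop sub-user on the Hadoop cluster hosts (the hbase user on the HBase cluster is handled the same way), could be:

# Sketch only: run on every host to create the key pair (password prompts are expected
# until the keys are in place).
mkdir -p ~/.ssh && chmod 700 ~/.ssh
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa

# On one node, collect every host's public key into a single authorized_keys file:
for h in hadoop01 hadoop02 hadoop03 hadoop04 hadoop05; do
    ssh "$h" 'cat ~/.ssh/id_rsa.pub' >> ~/.ssh/authorized_keys
done

# Push the merged authorized_keys back out to every node and fix its permissions:
for h in hadoop01 hadoop02 hadoop03 hadoop04 hadoop05; do
    scp ~/.ssh/authorized_keys "$h":~/.ssh/
    ssh "$h" 'chmod 644 ~/.ssh/authorized_keys'
done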
2.1.4 Disable SELinux

On every node, set the value in /etc/selinux/config to disabled, then reboot:

SELINUX=disabled

Check the result with /usr/sbin/sestatus.
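A minimal sketch of that edit, assuming the stock /etc/selinux/config layout on CentOS 7:

# Sketch only: switch SELinux to disabled and reboot.
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
reboot
# After the reboot, confirm:
/usr/sbin/sestatus    # should report "SELinux status: disabled"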
2.1.5 Disable the firewall

Disable the firewall on all nodes with the following commands:

systemctl stop firewalld.service
systemctl disable firewalld.service
systemctl status firewalld.service

2.1.6 Install the JDK

All Hadoop components require the JDK, so it must be installed in advance. This tutorial uses jdk-8u162-linux-x64.rpm. Download the package from the official site, copy it to every node, and install it with:

yum install -y jdk-8u162-linux-x64.rpm

[root@hadoop01 ~]# java -version
java version "1.8.0_162"
Java(TM) SE Runtime Environment (build 1.8.0_162-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.162-b12, mixed mode)
[root@hadoop01 ~]#

Figure 2-1-5 Installing the JDK
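Copying the rpm to each node and installing it can be scripted; a rough sketch, assuming the package sits in the current directory on hadoop01 and that root can SSH to the other nodes (it will prompt for passwords if root keys are not set up):

# Sketch only: push the rpm to the remaining nodes and install it there.
for h in hadoop02 hadoop03 hadoop04 hadoop05; do
    scp jdk-8u162-linux-x64.rpm "$h":/tmp/
    ssh "$h" 'yum install -y /tmp/jdk-8u162-linux-x64.rpm && java -version'
done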
2.2 Install the Hadoop cluster

Install the environment in the following order: Zookeeper -> hadoop -> spark -> hive -> sqoop

2.2.1 Zookeeper

Configure and install Zookeeper on nodes hadoop01, hadoop02 and hadoop03, as the hadoop sub-user.

2.2.1.1 Configure Zookeeper

1. Create the required directories:

mkdir -p /home/hadoop/opt/data/zookeeper
mkdir -p /home/hadoop/opt/data/zookeeper/zookeeper_log

2. Upload the ZK package to /home/hadoop/zookeeper-3.4.5-cdh5.10.0.tar.gz and extract it:

tar -zxvf zookeeper-3.4.5-cdh5.10.0.tar.gz

3. Create /home/hadoop/zookeeper-3.4.5-cdh5.10.0/conf/zoo.cfg:

[root@hadoop01 conf]# cat zoo.cfg
tickTime=2000
initLimit=5
syncLimit=2
dataDir=/home/hadoop/opt/data/zookeeper
dataLogDir=/home/hadoop/opt/data/zookeeper/zookeeper_log
clientPort=2181
server.33=hadoop01:2888:3888
server.34=hadoop02:2888:3888
server.35=hadoop03:2888:3888

4. On each node, create the file myid in /home/hadoop/opt/data/zookeeper and write the matching value (see the sketch below): write 33 into myid on hadoop01, 34 on hadoop02, and 35 on hadoop03.
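A minimal sketch of step 4, run from hadoop01 as the hadoop user (relies on the passwordless SSH configured in 2.1.3):

# Sketch only: write each node's myid to match the server.N ids in zoo.cfg.
echo 33 > /home/hadoop/opt/data/zookeeper/myid                  # on hadoop01
ssh hadoop02 'echo 34 > /home/hadoop/opt/data/zookeeper/myid'
ssh hadoop03 'echo 35 > /home/hadoop/opt/data/zookeeper/myid'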
2.2.1.2 Using Zookeeper

1. Start ZK with the following command on each node:

/home/hadoop/zookeeper-3.4.5-cdh5.10.0/bin/zkServer.sh start

2. Test a client connection to ZK:

/home/hadoop/zookeeper-3.4.5-cdh5.10.0/bin/zkCli.sh -server hadoop01:2181

3. Check the status:

/home/hadoop/zookeeper-3.4.5-cdh5.10.0/bin/zkServer.sh status
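Once all three servers are started, the ensemble should have exactly one leader; a small sketch for checking every member from one node (same paths and passwordless SSH as above):

# Sketch only: report the role of each ensemble member.
for h in hadoop01 hadoop02 hadoop03; do
    echo "== $h =="
    ssh "$h" /home/hadoop/zookeeper-3.4.5-cdh5.10.0/bin/zkServer.sh status
done
# Expected: one node reports "Mode: leader", the other two "Mode: follower".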
2.3.2 Hadoop

Configure Hadoop on all nodes, as the hadoop sub-user; a sketch for copying the configured installation to every node follows the configuration steps below.

2.3.2.1 Configure Hadoop

1. Extract hadoop-2.6.0-cdh5.10.0.tar.gz into /home/hadoop:

tar -zxvf hadoop-2.6.0-cdh5.10.0.tar.gz
2. Create the directories:

mkdir -p /home/hadoop/opt/data/hadoop/tmp
mkdir -p /home/hadoop/opt/data/hadoop/hadoop-name
mkdir -p /home/hadoop/opt/data/hadoop/hadoop-data
mkdir -p /home/hadoop/opt/data/hadoop/editsdir/dfs/journalnode
mkdir -p /home/hadoop/opt/data/hadoop/nm-local-dir
mkdir -p /home/hadoop/opt/data/hadoop/hadoop_log
mkdir -p /home/hadoop/opt/data/hadoop/userlogs

3. Edit /home/hadoop/hadoop-2.6.0-cdh5.10.0/etc/hadoop/hadoop-env.sh:

# The java implementation to use.
export JAVA_HOME=/usr/java/jdk1.8.0_162

4. Configure HDFS HA. The configuration files are listed below; the detailed settings are in the accompanying hadoop folder:

core-site.xml
hdfs-site.xml

5. Configure YARN HA. The configuration files are listed below; the detailed settings are in the accompanying hadoop folder:

yarn-site.xml (on each management node, additionally set yarn.resourcemanager.ha.id to that management node's own id)
mapred-site.xml

6. YARN NodeManager: on the NodeManager (data) nodes, place the file spark-2.3.0-yarn-shuffle.jar at /home/hadoop/hadoop-2.6.0-cdh5.10.0/share/hadoop/yarn/spark-2.3.0-yarn-shuffle.jar.
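As mentioned at the start of 2.3.2, the extracted and configured installation has to be present on every node. A rough sketch for pushing it out from hadoop01, assuming rsync is installed, the passwordless hadoop-user SSH from 2.1.3 is in place, and an illustrative node list (adjust to your cluster):

# Sketch only: copy the configured Hadoop tree and recreate the data directories
# on the remaining nodes.
for h in hadoop02 hadoop03 hadoop04 hadoop05; do
    rsync -a /home/hadoop/hadoop-2.6.0-cdh5.10.0/ "$h":/home/hadoop/hadoop-2.6.0-cdh5.10.0/
    ssh "$h" 'mkdir -p /home/hadoop/opt/data/hadoop/{tmp,hadoop-name,hadoop-data,editsdir/dfs/journalnode,nm-local-dir,hadoop_log,userlogs}'
done
# Remember that yarn.resourcemanager.ha.id in yarn-site.xml still has to be set
# individually on each management node afterwards, as noted in step 5.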
2.3.2.2 Starting Hadoop for the first time

1. Execute the following on namenode1 to create the namespace in ZK:

/home/hadoop/hadoop-2.6.0-cdh5.10.0/bin/hdfs zkfc -formatZK

Check: Ka-Act
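As a hedged illustration (not necessarily the exact check this tutorial intends), one common way to confirm that zkfc -formatZK created the HA znode is to list the ZK root, using the client path and port from the Zookeeper section above:

# Sketch only: the root listing should now include the hadoop-ha znode.
/home/hadoop/zookeeper-3.4.5-cdh5.10.0/bin/zkCli.sh -server hadoop01:2181 ls /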