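Environment Setup
Hadoop does not ship an official installation package for Windows 10, so it has to be built from source by hand. BUILDING.txt in the official source tree lists the software needed for the build:
- Hadoop source code
- Windows operating system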
- JDK 1.6+
- Maven 3.0 or later
- Findbugs 1.3.9 (optional)
- ProtocolBuffer 2.5.0
- CMake 2.6 or newer
- Windows SDK or Visual Studio 2010 Professional
- Cygwin: provides commands such as sh, mkdir, rm, cp, tar, and gzip
- zlib
- An internet connection
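The command-line tools on this list can be checked in one pass before starting. What follows is only a minimal sketch, assuming a Cygwin sh/bash prompt; `check_tools` is a helper name made up for this post, not something Hadoop provides.

```shell
#!/bin/sh
# check_tools is a hypothetical helper: it reports, for each name given,
# whether a command of that name can be found on PATH.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=$((missing + 1))
    fi
  done
  # Succeed only when nothing is missing.
  [ "$missing" -eq 0 ]
}

# Command-line tools named in this post. zlib and Visual Studio are not
# commands, so they still have to be checked by hand.
check_tools java mvn protoc cmake sh mkdir rm cp tar gzip \
  || echo "Install the missing tools before building."
```

Finding a tool on PATH says nothing about its version, so the version checks in the sections below are still needed.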
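Hadoop source code
Use git or download the archive directly from https://github.com/apache/hadoop/releases/tag/rel%2Frelease-2.6.5 . After downloading, extract it to the root of the C: drive and rename the directory to dfs265, which keeps the path short and free of spaces.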
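For the git route, one possible approach is a shallow clone of the release tag. This is a sketch only: the tag name is read off the release URL above (`rel%2Frelease-2.6.5` decodes to `rel/release-2.6.5`), `clone_cmd` is a made-up helper, and the `/cygdrive/c/dfs265` target assumes a Cygwin shell.

```shell
#!/bin/sh
# Tag and repository taken from the release URL in this post.
HADOOP_TAG="rel/release-2.6.5"
HADOOP_REPO="https://github.com/apache/hadoop.git"

# clone_cmd (hypothetical helper) prints the git command for a target
# directory, so it can be reviewed before it is run.
clone_cmd() {
  printf 'git clone --depth 1 --branch %s %s %s\n' \
    "$HADOOP_TAG" "$HADOOP_REPO" "$1"
}

# Show the command for the directory used in this post.
clone_cmd /cygdrive/c/dfs265
# To run it for real: eval "$(clone_cmd /cygdrive/c/dfs265)"
```

Cloning straight into dfs265 skips the extract-and-rename step and keeps the path short and free of spaces.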
JDK
JAVA_HOME=<JDK directory>
CLASSPATH=.;%JAVA_HOME%\lib;%JAVA_HOME%\jre\lib;
PATH=%JAVA_HOME%\bin;%JAVA_HOME%\jre\bin;
Note: Hadoop 2.6.5 is best built with JDK 1.7; building with 1.8 runs into problems. The version I used:
$ java -version
java version "1.7.0_51"
Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)
Maven
Download: http://maven.apache.org/download.cgi . Configure it as follows:
M2_HOME=<Maven directory>
PATH=%M2_HOME%\bin;
I used version 3.3.9:
$ mvn -version
Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-11T00:41:47+08:00)
Maven home: C:\java\maven\bin\..
Java version: 1.7.0_51, vendor: Oracle Corporation
Java home: C:\java\JDK\jdk170\jre
Default locale: zh_CN, platform encoding: GBK
OS name: "windows 8", version: "6.2", arch: "amd64", family: "windows"
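As the output shows, Windows 10 is identified as "windows 8" (Maven prints the OS name reported by the JVM). This will probably affect how the build detects the system; it is dealt with in the Visual Studio section.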
ProtocolBuffer
Download: https://github.com/google/protobuf/releases/tag/v2.5.0 . Choose protoc-2.5.0-win32.zip, extract it to the install directory, and add that directory to PATH:
PATH=C:\java\protoc-2.5.0-win32;
Test:
$ protoc --version
libprotoc 2.5.0
CMake
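CMake 2.6 or newer is required. Download: https://cmake.org/download/ . Extract it to the install directory and add its bin directory to PATH. Cygwin also ships its own cmake, so in Path the CMake bin directory has to come before the Cygwin bin directory:
CMAKE_HOME=<CMake directory>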
PATH=%CMAKE_HOME%\bin;
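The ordering requirement can be checked mechanically against a Windows-style Path value (for example, the output of `echo %PATH%` pasted into a variable). The sketch below assumes `awk` is available, as it is under Cygwin; `entry_position` and the sample directories are made up for illustration.

```shell
#!/bin/sh
# entry_position NEEDLE PATH_VALUE (hypothetical helper)
# Prints the 1-based position of the first ';'-separated entry that
# contains NEEDLE (case-insensitive), or 0 if no entry matches.
entry_position() {
  printf '%s' "$2" | NEEDLE="$1" awk -v RS=';' '
    index(tolower($0), tolower(ENVIRON["NEEDLE"])) { print NR; found = 1; exit }
    END { if (!found) print 0 }'
}

# Made-up sample value; substitute your real Path.
sample='C:\java\cmake\bin;C:\cygwin64\bin;C:\Windows\system32'
cmake_pos=$(entry_position 'cmake' "$sample")
cygwin_pos=$(entry_position 'cygwin' "$sample")

if [ "$cmake_pos" -gt 0 ] && { [ "$cygwin_pos" -eq 0 ] || [ "$cmake_pos" -lt "$cygwin_pos" ]; }; then
  echo "ok: the CMake entry comes before the Cygwin entry"
else
  echo "check the Path order"
fi
```

The match is a plain substring test, so it only works if "cmake" and "cygwin" actually appear in the respective directory names.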
Test:
$ cmake --version
cmake version 3.7.0-rc2
CMake suite maintained and supported by Kitware (kitware.com/cmake).
Visual Studio
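There are two choices of VS version: the recommended VS2010, which is the simpler route, or the latest VS2015, which requires upgrading a few files. I chose the ISO image of Visual Studio Community 2015 with Update 3 (Free), from https://www.visualstudio.com/downloads/ . Because Maven identified the system as windows 8 during the Maven setup above, install SDK 8.1 along with Visual Studio 2015. After installation, if you are using VS2015, the following files need to be upgraded: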
- C:\dfs265\hadoop-common-project\hadoop-common\src\main\native\native.sln
- C:\dfs265\hadoop-common-project\hadoop-common\src\main\winutils\winutils.sln
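Open each of these files in VS2015 and confirm when prompted to upgrade. A successful upgrade reports the following (translated from the Chinese-locale output):
Upgrading project 'native'...
Configuration 'Release|x64': changing Platform Toolset to 'v140' (was 'v100').
Retargeting End: 1 completed, 0 failed, 0 skipped
Upgrading project 'winutils'...
Configuration 'Debug|x64': changing Platform Toolset to 'v140' (was 'v100').
Configuration 'Release|x64': changing Platform Toolset to 'v140' (was 'v100').
Upgrading project 'libwinutils'...
Configuration 'Debug|x64': changing Platform Toolset to 'v140' (was 'v100').
Configuration 'Release|x64': changing Platform Toolset to 'v140' (was 'v100').
Retargeting End: 2 completed, 0 failed, 0 skipped
The following file also needs to be edited: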
- C:\dfs265\hadoop-hdfs-project\hadoop-hdfs\pom.xml
Change "Visual Studio 10" to "Visual Studio 14", then save and exit.
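The same edit can be scripted instead of done in an editor. A minimal sketch, assuming a sed that accepts `-i.bak` (GNU sed under Cygwin does); `retarget_generator` is a made-up name, and the sample line below is invented for the demo rather than copied from the real pom.xml.

```shell
#!/bin/sh
# retarget_generator FILE (hypothetical helper)
# Replaces every "Visual Studio 10" with "Visual Studio 14" in FILE,
# keeping the original as FILE.bak.
retarget_generator() {
  sed -i.bak 's/Visual Studio 10/Visual Studio 14/g' "$1"
}

# Demo on a throwaway file with an invented line.
tmp=$(mktemp)
echo "-G 'Visual Studio 10 Win64'" > "$tmp"
retarget_generator "$tmp"
cat "$tmp"
rm -f "$tmp" "$tmp.bak"
```

Against the real tree the call would be `retarget_generator /cygdrive/c/dfs265/hadoop-hdfs-project/hadoop-hdfs/pom.xml`, again assuming Cygwin path conventions; the .bak copy makes it easy to diff or roll back.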
Cygwin
Download: https://cygwin.com/install.html . Install it and add its bin directory to PATH:
CYGWIN_HOME=<Cygwin install directory>
PATH=%CYGWIN_HOME%\bin;
Zlib
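The build seems to succeed even without Zlib, but it is worth installing anyway. You can find a prebuilt version online or build it from source yourself. I no longer have the download link; search for a prebuilt version that works on Windows:
ZLIB_HOME=<zlib directory>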
PATH=%ZLIB_HOME%\bin;
Other settings
With the above in place, the following also need to be added to the environment variables:
Platform=x64
VCTargetsPath=C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V140
MSBUILD_HOME=C:\Program Files (x86)\MSBuild\14.0\Bin
PATH=%MSBUILD_HOME%;
Build
The usual command is:
mvn clean package -Pdist,native-win -DskipTests -Dtar
Adding -Dmaven.javadoc.skip=true to the command skips Javadoc generation. With JDK 1.8 this parameter is required for the build to pass:
mvn clean package -Pdist,native-win -DskipTests -Dtar -Dmaven.javadoc.skip=true
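The choice between the two commands can be derived from the first line of `java -version`. The sketch below only understands the old `1.x` version scheme that this post deals with; `build_cmd` is a made-up helper that prints the command rather than running it.

```shell
#!/bin/sh
# build_cmd VERSION_LINE (hypothetical helper)
# Prints the mvn command from this post, adding the Javadoc skip flag
# when the version line reports Java 1.8 or later in the 1.x scheme.
build_cmd() {
  minor=$(printf '%s\n' "$1" | sed -n 's/.*version "1\.\([0-9][0-9]*\)\..*/\1/p')
  cmd="mvn clean package -Pdist,native-win -DskipTests -Dtar"
  if [ "${minor:-0}" -ge 8 ]; then
    cmd="$cmd -Dmaven.javadoc.skip=true"
  fi
  printf '%s\n' "$cmd"
}

build_cmd 'java version "1.7.0_51"'    # prints the plain command
build_cmd 'java version "1.8.0_111"'   # prints the command with the skip flag
```

With a JDK on PATH, the live version line can be passed in as `build_cmd "$(java -version 2>&1 | head -n 1)"`, since `java -version` writes to stderr.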
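A successful build takes about half an hour, possibly less on a fast machine, and skipping Javadoc generation also speeds it up. The build output lands in C:\dfs265\hadoop-dist\target , and a successful run ends like this: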
[INFO] Executing tasks
main:
[exec] $ tar cf hadoop-2.6.5.tar hadoop-2.6.5
[exec] $ gzip -f hadoop-2.6.5.tar
[exec]
[exec] Hadoop dist tar available at: C:\dfs265\hadoop-dist\target/hadoop-2.6.5.tar.gz
[exec]
[INFO] Executed tasks
[INFO]
[INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ hadoop-dist ---
[INFO] Building jar: C:\dfs265\hadoop-dist\target\hadoop-dist-2.6.5-javadoc.jar
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Hadoop Main ................................. SUCCESS [ 5.302 s]
[INFO] Apache Hadoop Build Tools .......................... SUCCESS [ 3.245 s]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [ 3.383 s]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [ 5.051 s]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [ 0.782 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [ 2.886 s]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [ 8.258 s]
[INFO] Apache Hadoop MiniKDC .............................. SUCCESS [ 16.756 s]
[INFO] Apache Hadoop Auth ................................. SUCCESS [ 13.092 s]
[INFO] Apache Hadoop Auth Examples ........................ SUCCESS [ 5.844 s]
[INFO] Apache Hadoop Common ............................... SUCCESS [03:30 min]
[INFO] Apache Hadoop NFS .................................. SUCCESS [ 13.650 s]
[INFO] Apache Hadoop KMS .................................. SUCCESS [ 28.320 s]
[INFO] Apache Hadoop Common Project ....................... SUCCESS [ 0.095 s]
[INFO] Apache Hadoop HDFS ................................. SUCCESS [05:42 min]
[INFO] Apache Hadoop HttpFS ............................... SUCCESS [ 55.960 s]
[INFO] Apache Hadoop HDFS BookKeeper Journal .............. SUCCESS [ 13.488 s]
[INFO] Apache Hadoop HDFS-NFS ............................. SUCCESS [ 12.245 s]
[INFO] Apache Hadoop HDFS Project ......................... SUCCESS [ 0.096 s]
[INFO] hadoop-yarn ........................................ SUCCESS [ 0.089 s]
[INFO] hadoop-yarn-api .................................... SUCCESS [03:22 min]
[INFO] hadoop-yarn-common ................................. SUCCESS [01:44 min]
[INFO] hadoop-yarn-server ................................. SUCCESS [ 0.105 s]
[INFO] hadoop-yarn-server-common .......................... SUCCESS [ 29.556 s]
[INFO] hadoop-yarn-server-nodemanager ..................... SUCCESS [ 43.315 s]
[INFO] hadoop-yarn-server-web-proxy ....................... SUCCESS [ 9.429 s]
[INFO] hadoop-yarn-server-applicationhistoryservice ....... SUCCESS [ 16.351 s]
[INFO] hadoop-yarn-server-resourcemanager ................. SUCCESS [ 58.206 s]
[INFO] hadoop-yarn-server-tests ........................... SUCCESS [ 17.863 s]
[INFO] hadoop-yarn-client ................................. SUCCESS [ 16.329 s]
[INFO] hadoop-yarn-applications ........................... SUCCESS [ 0.094 s]
[INFO] hadoop-yarn-applications-distributedshell .......... SUCCESS [ 4.064 s]
[INFO] hadoop-yarn-applications-unmanaged-am-launcher ..... SUCCESS [ 6.835 s]
[INFO] hadoop-yarn-site ................................... SUCCESS [ 0.110 s]
[INFO] hadoop-yarn-registry ............................... SUCCESS [ 14.566 s]
[INFO] hadoop-yarn-project ................................ SUCCESS [ 4.299 s]
[INFO] hadoop-mapreduce-client ............................ SUCCESS [ 0.355 s]
[INFO] hadoop-mapreduce-client-core ....................... SUCCESS [01:27 min]
[INFO] hadoop-mapreduce-client-common ..................... SUCCESS [ 38.155 s]
[INFO] hadoop-mapreduce-client-shuffle .................... SUCCESS [ 12.238 s]
[INFO] hadoop-mapreduce-client-app ........................ SUCCESS [ 26.710 s]
[INFO] hadoop-mapreduce-client-hs ......................... SUCCESS [ 23.336 s]
[INFO] hadoop-mapreduce-client-jobclient .................. SUCCESS [ 20.261 s]
[INFO] hadoop-mapreduce-client-hs-plugins ................. SUCCESS [ 3.353 s]
[INFO] Apache Hadoop MapReduce Examples ................... SUCCESS [ 17.791 s]
[INFO] hadoop-mapreduce ................................... SUCCESS [ 2.927 s]
[INFO] Apache Hadoop MapReduce Streaming .................. SUCCESS [ 11.117 s]
[INFO] Apache Hadoop Distributed Copy ..................... SUCCESS [ 22.965 s]
[INFO] Apache Hadoop Archives ............................. SUCCESS [ 5.038 s]
[INFO] Apache Hadoop Rumen ................................ SUCCESS [ 13.747 s]
[INFO] Apache Hadoop Gridmix .............................. SUCCESS [ 10.653 s]
[INFO] Apache Hadoop Data Join ............................ SUCCESS [ 6.081 s]
[INFO] Apache Hadoop Ant Tasks ............................ SUCCESS [ 3.085 s]
[INFO] Apache Hadoop Extras ............................... SUCCESS [ 4.128 s]
[INFO] Apache Hadoop Pipes ................................ SUCCESS [ 0.074 s]
[INFO] Apache Hadoop OpenStack support .................... SUCCESS [ 14.496 s]
[INFO] Apache Hadoop Amazon Web Services support .......... SUCCESS [ 10.876 s]
[INFO] Apache Hadoop Client ............................... SUCCESS [ 10.715 s]
[INFO] Apache Hadoop Mini-Cluster ......................... SUCCESS [ 1.497 s]
[INFO] Apache Hadoop Scheduler Load Simulator ............. SUCCESS [ 15.287 s]
[INFO] Apache Hadoop Tools Dist ........................... SUCCESS [ 10.746 s]
[INFO] Apache Hadoop Tools ................................ SUCCESS [ 0.073 s]
[INFO] Apache Hadoop Distribution ......................... SUCCESS [ 55.172 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 28:23 min
[INFO] Finished at: 2016-11-01T11:03:50+08:00
[INFO] Final Memory: 120M/1007M
[INFO] ------------------------------------------------------------------------
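With this many modules, a failed build is easier to read if the log is saved (for example by piping mvn through `tee build.log`) and filtered for anything that did not succeed. A rough sketch, assuming the Reactor Summary layout shown above; `not_successful` is a made-up helper, and the FAILURE and SKIPPED lines in the demo are invented.

```shell
#!/bin/sh
# not_successful LOG (hypothetical helper)
# Prints the Reactor Summary lines whose status is FAILURE or SKIPPED.
# Exits non-zero when there are none, i.e. when every module succeeded.
not_successful() {
  grep -E '^\[INFO\] .+ \.{2,} (FAILURE|SKIPPED)' "$1"
}

# Demo on an invented log excerpt.
log=$(mktemp)
cat > "$log" <<'EOF'
[INFO] Apache Hadoop Main ................................. SUCCESS [  5.302 s]
[INFO] Apache Hadoop Common ............................... FAILURE [03:30 min]
[INFO] Apache Hadoop HDFS ................................. SKIPPED
EOF
not_successful "$log" || echo "all modules built"
rm -f "$log"
```

The first FAILURE line names the module to investigate; the modules after it are normally reported as SKIPPED.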
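Installation and configuration
See another post, 《Hadoop,Spark,HBase 开发环境配置》 (Hadoop, Spark, HBase development environment setup). Note that the sh configuration files need to be swapped for their cmd counterparts.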
Document info
- Author: Bob.Zhu
- Link: https://adolphor.github.io/2016/11/01/build-hadoop-on-windows-10/
- License: free to repost; non-commercial, no derivatives, attribution required (Creative Commons 3.0 license)