Building the Complete Hadoop 3.3.6 Native Libraries Suite on aarch64

后知后觉

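The official prebuilt packages lack support for several Native Libraries extensions, so Hadoop has to be rebuilt to get them. This article walks through the build on an ARMv8 machine.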

Procedure

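The base environment is a minimal installation of Debian 11 (aarch64) on a Neoverse-N1 CPU. It is advisable to fully upgrade the system and reboot before starting the build.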

$ cat /etc/os-release 
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
Tip: Debian 12 can also be used, but it has dropped support for openssl v1.1.1, so the openssl check in the Native Library will report an error.
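
A quick way to tell which OpenSSL series a build host ships is to look at the banner printed by `openssl version`. The following is only a sketch: the function name `openssl_is_1_1` is made up for illustration, and it assumes the banner starts with `OpenSSL <version>`.

```shell
# Sketch: decide whether an `openssl version` banner belongs to the 1.1.x series.
# openssl_is_1_1 is a hypothetical helper name, not part of any tool.
openssl_is_1_1() {
    # $1: a banner such as "OpenSSL 1.1.1w  11 Sep 2023"
    case "$1" in
        "OpenSSL 1.1."*) return 0 ;;
        *)               return 1 ;;
    esac
}

# Example (run on the build host):
#   openssl_is_1_1 "$(openssl version)" && echo "OpenSSL 1.1.x found"
```

If the function returns non-zero on your system, expect the openssl line of the native library check to need extra attention.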

Environment

Oracle JDK

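Hadoop and its related components are written in Java, so install the Java environment first. OpenJDK should work in theory, but to be on the safe side, download the JDK package from the Oracle JDK website.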

## Assuming the downloaded version is '8u381', extract it into /opt
sudo tar xf jdk-8u381-linux-aarch64.tar.gz -C /opt/

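Configure the environment variables (the Maven entries take effect once Maven is installed in the next step)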

cat <<'EOF' | sudo tee /etc/profile.d/maven.sh
export JAVA_HOME="/opt/jdk1.8.0_381"
export PATH=$JAVA_HOME/bin:$PATH
export M2_HOME="/opt/apache-maven-3.9.6"
export MAVEN_HOME="/opt/apache-maven-3.9.6"
export PATH=$M2_HOME/bin:$PATH
EOF

Log in to the terminal again, then check the version

$ java -version
java version "1.8.0_381"
Java(TM) SE Runtime Environment (build 1.8.0_381-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.381-b09, mixed mode)

Maven

Next, deploy Maven to provide the Java build environment.

wget https://dlcdn.apache.org/maven/maven-3/3.9.6/binaries/apache-maven-3.9.6-bin.tar.gz
sudo tar xf apache-maven-3.9.6-bin.tar.gz -C /opt/
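
The download above is not verified. Apache publishes SHA-512 checksums alongside its release artifacts, so a check before extracting can be sketched as follows. The helper name `verify_sha512` is hypothetical, and the expected hash has to be copied from the official Apache download page; none is given here.

```shell
# Sketch: compare a file's SHA-512 digest with an expected hex string.
# verify_sha512 is a hypothetical helper; it relies on sha512sum from coreutils.
verify_sha512() {
    # $1: file to check, $2: expected digest in hex (upper or lower case)
    file=$1
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    actual=$(sha512sum "$file" | awk '{ print $1 }') || return 1
    [ "$actual" = "$expected" ]
}

# Example (the hash placeholder must be replaced with the published value):
#   verify_sha512 apache-maven-3.9.6-bin.tar.gz "<published sha512>" || echo "checksum mismatch"
```

The same helper can be reused for the Hadoop source tarball later on.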

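System Dependencies

Next, prepare the build environment for the Native Libraries. The official build instructions use Ubuntu as the reference environment; Debian also uses apt, so the dependencies are installed under their Debian package names.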

sudo apt install -y build-essential

Then install the native dependencies

sudo apt install -y zlib1g-dev libzstd-dev libbz2-dev libssl-dev

Build Tools

Next, install the build tools. The cmake version in the official Debian repository is too old to meet the requirement, so build and install it manually:

wget https://cmake.org/files/v3.21/cmake-3.21.7.tar.gz
tar xf cmake-3.21.7.tar.gz
cd cmake-3.21.7/
./bootstrap
make -j$(nproc)
sudo make install

Then install the native build toolchain

sudo apt install -y protobuf-compiler doxygen libsasl2-dev libfuse-dev libprotoc-dev

Once installation is complete, check the component versions

$ cmake --version
cmake version 3.21.7

CMake suite maintained and supported by Kitware (kitware.com/cmake).
$ protoc --version
libprotoc 3.12.4
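
If you prefer to check versions in a script rather than by eye, a comparison helper can be sketched with GNU `sort -V`. The function name `version_ge` is made up for this example, and the threshold in the usage comment is only a placeholder, not a requirement taken from the Hadoop documentation.

```shell
# Sketch: succeed when version $1 is greater than or equal to version $2.
# version_ge is a hypothetical helper; it relies on GNU sort's -V (version sort).
version_ge() {
    # The smaller of the two versions sorts first; if that is $2, then $1 >= $2.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example (3.19 is a placeholder threshold):
#   version_ge "$(cmake --version | awk 'NR==1 { print $3 }')" 3.19 || echo "cmake too old"
```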

Getting the Source

Download the Hadoop 3.3.6 source package and extract it

wget https://dlcdn.apache.org/hadoop/common/hadoop-3.3.6/hadoop-3.3.6-src.tar.gz
tar xf hadoop-3.3.6-src.tar.gz
cd hadoop-3.3.6-src/

Node 12 has been deprecated and the build cannot complete with it, so change the following properties in hadoop-project/pom.xml (around line 216):

...
    <nodejs.version>v16.20.2</nodejs.version>
    <yarnpkg.version>v1.22.21</yarnpkg.version>
...
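
The same edit can be scripted instead of done by hand. This is a minimal sketch using GNU `sed -i`; the helper name `set_pom_property` is hypothetical, and it assumes each property sits on a single line and that the new value contains no `|` or `&` characters.

```shell
# Sketch: replace the value of a single-line <name>value</name> property in a pom.xml.
# set_pom_property is a hypothetical helper; it edits the file in place (GNU sed).
set_pom_property() {
    # $1: path to pom.xml, $2: property name, $3: new value
    pom=$1
    name=$2
    value=$3
    # Escape dots so a name like "nodejs.version" is matched literally.
    name_re=$(printf '%s' "$name" | sed 's/\./\\./g')
    sed -i "s|<${name_re}>[^<]*</${name_re}>|<${name}>${value}</${name}>|" "$pom"
}

# Example (run from the hadoop-3.3.6-src/ directory):
#   set_pom_property hadoop-project/pom.xml nodejs.version v16.20.2
#   set_pom_property hadoop-project/pom.xml yarnpkg.version v1.22.21
```

It is worth checking the result with `grep -n 'nodejs.version' hadoop-project/pom.xml` before building.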

Optional Components

ISA-L Support

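ISA-L (Intelligent Storage Acceleration Library) is a storage acceleration library developed by Intel that can improve HDFS performance. It can be built on both ARMv8 (aarch64) and AMD64 (x86_64).

The Hadoop build detects this component automatically and enables support for it, so install the ISA-L library with the steps below before building, and the resulting package supports ISA-L natively.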

## Clone the source and enter the directory
git clone https://github.com/intel/isa-l
cd isa-l/
## Install additional dependencies
sudo apt install -y autoconf libtool pkgconf
## Build and install
./autogen.sh
./configure --prefix=/usr --libdir=/usr/lib
make
sudo make install

PMDK Support

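PMDK (Persistent Memory Development Kit) is a set of user-space programming libraries for persistent memory. Reading and writing data through them reduces switches between user mode and kernel mode as well as file system overhead, which improves the cluster's read and write performance.

Unlike the other extensions, PMDK support is not compiled in by default even when the required libraries are detected on the system; an extra parameter has to be passed at build time. First install the required dependencies.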

  1. Install with the distribution's package manager:

    ## Runtime libraries only (install on the deployment machines)
    sudo apt -y install libpmem1 librpmem1 libpmemblk1 libpmemlog1 libpmemobj1 libpmempool1
    ## Development packages only (install on the build machine)
    sudo apt -y install libpmem-dev librpmem-dev libpmemblk-dev libpmemlog-dev libpmemobj-dev libpmempool-dev
  2. To install to a specific location, build from source instead:

    ## Install dependencies
    sudo apt install -y pkg-config pandoc
    ## Clone the source and enter the directory
    git clone https://github.com/pmem/pmdk
    cd pmdk/
    ## Build and install
    make
    sudo make install prefix=/usr

Then add the -Drequire.pmdk parameter when building Hadoop:

mvn clean package -Pdist,native -DskipTests -Dtar -Dmaven.javadoc.skip=true -Drequire.pmdk -X

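Start the Build

Build with the following command:

mvn clean package -Pdist,native -DskipTests -Dtar -Dmaven.javadoc.skip=true -X
Tip: if you need the optional PMDK component, use the command with -Drequire.pmdk from the PMDK section instead.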
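
Since the two build variants differ only by one flag, they can be produced by a small wrapper. This is a sketch under a few assumptions: the function name `hadoop_build_cmd` is made up, it only prints the command rather than running it, it leaves out the verbose `-X` switch, and it uses `maven.javadoc.skip`, the property name documented by the Maven Javadoc plugin.

```shell
# Sketch: print the Maven command for a native Hadoop build.
# hadoop_build_cmd is a hypothetical helper; it prints the command and does not run it.
hadoop_build_cmd() {
    # $1: "pmdk" to require PMDK support; omit it for the default build
    cmd="mvn clean package -Pdist,native -DskipTests -Dtar -Dmaven.javadoc.skip=true"
    if [ "${1:-}" = "pmdk" ]; then
        cmd="$cmd -Drequire.pmdk"
    fi
    printf '%s\n' "$cmd"
}

# Example: review the command, then run it from the source directory.
#   hadoop_build_cmd pmdk
#   sh -c "$(hadoop_build_cmd pmdk)"
```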

The build has succeeded when it ends with output like the following.

[INFO] CycloneDX: Creating BOM version 1.4 with 0 component(s)
[INFO] CycloneDX: Writing and validating BOM (XML): /home/kane/hadoop-3.3.6-src/hadoop-cloud-storage-project/target/bom.xml
[INFO]            attaching as hadoop-cloud-storage-project-3.3.6-cyclonedx.xml
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Apache Hadoop Main 3.3.6:
[INFO] 
[INFO] Apache Hadoop Main ................................. SUCCESS [  2.368 s]
[INFO] Apache Hadoop Build Tools .......................... SUCCESS [  1.499 s]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [  2.061 s]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [  2.728 s]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [  0.237 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [  1.601 s]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [  8.984 s]
[INFO] Apache Hadoop MiniKDC .............................. SUCCESS [  8.531 s]
[INFO] Apache Hadoop Auth ................................. SUCCESS [ 25.536 s]
[INFO] Apache Hadoop Auth Examples ........................ SUCCESS [ 11.744 s]
[INFO] Apache Hadoop Common ............................... SUCCESS [02:39 min]
[INFO] Apache Hadoop NFS .................................. SUCCESS [  6.789 s]
[INFO] Apache Hadoop KMS .................................. SUCCESS [ 24.841 s]
[INFO] Apache Hadoop Registry ............................. SUCCESS [ 15.685 s]
[INFO] Apache Hadoop Common Project ....................... SUCCESS [  4.352 s]
[INFO] Apache Hadoop HDFS Client .......................... SUCCESS [01:00 min]
[INFO] Apache Hadoop HDFS ................................. SUCCESS [02:18 min]
[INFO] Apache Hadoop HDFS Native Client ................... SUCCESS [03:49 min]
[INFO] Apache Hadoop HttpFS ............................... SUCCESS [ 10.966 s]
[INFO] Apache Hadoop HDFS-NFS ............................. SUCCESS [ 23.265 s]
[INFO] Apache Hadoop HDFS-RBF ............................. SUCCESS [01:07 min]
[INFO] Apache Hadoop HDFS Project ......................... SUCCESS [  0.093 s]
[INFO] Apache Hadoop YARN ................................. SUCCESS [  3.623 s]
[INFO] Apache Hadoop YARN API ............................. SUCCESS [ 41.717 s]
[INFO] Apache Hadoop YARN Common .......................... SUCCESS [02:04 min]
[INFO] Apache Hadoop YARN Server .......................... SUCCESS [  5.830 s]
[INFO] Apache Hadoop YARN Server Common ................... SUCCESS [ 42.703 s]
[INFO] Apache Hadoop YARN NodeManager ..................... SUCCESS [01:05 min]
[INFO] Apache Hadoop YARN Web Proxy ....................... SUCCESS [ 11.800 s]
[INFO] Apache Hadoop YARN ApplicationHistoryService ....... SUCCESS [ 28.103 s]
[INFO] Apache Hadoop YARN Timeline Service ................ SUCCESS [ 19.408 s]
[INFO] Apache Hadoop YARN ResourceManager ................. SUCCESS [01:26 min]
[INFO] Apache Hadoop YARN Server Tests .................... SUCCESS [ 26.091 s]
[INFO] Apache Hadoop YARN Client .......................... SUCCESS [ 19.255 s]
[INFO] Apache Hadoop YARN SharedCacheManager .............. SUCCESS [ 20.411 s]
[INFO] Apache Hadoop YARN Timeline Plugin Storage ......... SUCCESS [ 25.229 s]
[INFO] Apache Hadoop YARN TimelineService HBase Backend ... SUCCESS [  2.417 s]
[INFO] Apache Hadoop YARN TimelineService HBase Common .... SUCCESS [ 25.709 s]
[INFO] Apache Hadoop YARN TimelineService HBase Client .... SUCCESS [ 35.170 s]
[INFO] Apache Hadoop YARN TimelineService HBase Servers ... SUCCESS [  4.684 s]
[INFO] Apache Hadoop YARN TimelineService HBase Server 1.2  SUCCESS [ 28.158 s]
[INFO] Apache Hadoop YARN TimelineService HBase tests ..... SUCCESS [ 26.595 s]
[INFO] Apache Hadoop YARN Router .......................... SUCCESS [ 17.968 s]
[INFO] Apache Hadoop YARN TimelineService DocumentStore ... SUCCESS [ 34.780 s]
[INFO] Apache Hadoop YARN Applications .................... SUCCESS [  4.704 s]
[INFO] Apache Hadoop YARN DistributedShell ................ SUCCESS [ 20.645 s]
[INFO] Apache Hadoop YARN Unmanaged Am Launcher ........... SUCCESS [ 14.177 s]
[INFO] Apache Hadoop MapReduce Client ..................... SUCCESS [ 11.323 s]
[INFO] Apache Hadoop MapReduce Core ....................... SUCCESS [ 30.879 s]
[INFO] Apache Hadoop MapReduce Common ..................... SUCCESS [ 32.335 s]
[INFO] Apache Hadoop MapReduce Shuffle .................... SUCCESS [ 16.921 s]
[INFO] Apache Hadoop MapReduce App ........................ SUCCESS [ 32.149 s]
[INFO] Apache Hadoop MapReduce HistoryServer .............. SUCCESS [ 38.130 s]
[INFO] Apache Hadoop MapReduce JobClient .................. SUCCESS [ 24.188 s]
[INFO] Apache Hadoop Mini-Cluster ......................... SUCCESS [ 15.063 s]
[INFO] Apache Hadoop YARN Services ........................ SUCCESS [  5.188 s]
[INFO] Apache Hadoop YARN Services Core ................... SUCCESS [ 41.947 s]
[INFO] Apache Hadoop YARN Services API .................... SUCCESS [ 21.226 s]
[INFO] Apache Hadoop YARN Application Catalog ............. SUCCESS [  4.114 s]
[INFO] Apache Hadoop YARN Application Catalog Webapp ...... SUCCESS [02:24 min]
[INFO] Apache Hadoop YARN Application Catalog Docker Image  SUCCESS [  0.920 s]
[INFO] Apache Hadoop YARN Application MaWo ................ SUCCESS [  1.814 s]
[INFO] Apache Hadoop YARN Application MaWo Core ........... SUCCESS [ 15.226 s]
[INFO] Apache Hadoop YARN Site ............................ SUCCESS [  2.783 s]
[INFO] Apache Hadoop YARN Registry ........................ SUCCESS [  9.964 s]
[INFO] Apache Hadoop YARN UI .............................. SUCCESS [  2.873 s]
[INFO] Apache Hadoop YARN CSI ............................. SUCCESS [ 27.413 s]
[INFO] Apache Hadoop YARN Project ......................... SUCCESS [ 31.951 s]
[INFO] Apache Hadoop MapReduce HistoryServer Plugins ...... SUCCESS [ 15.285 s]
[INFO] Apache Hadoop MapReduce NativeTask ................. SUCCESS [ 56.629 s]
[INFO] Apache Hadoop MapReduce Uploader ................... SUCCESS [  3.940 s]
[INFO] Apache Hadoop MapReduce Examples ................... SUCCESS [ 18.966 s]
[INFO] Apache Hadoop MapReduce ............................ SUCCESS [ 16.108 s]
[INFO] Apache Hadoop MapReduce Streaming .................. SUCCESS [ 26.449 s]
[INFO] Apache Hadoop Distributed Copy ..................... SUCCESS [ 28.258 s]
[INFO] Apache Hadoop Client Aggregator .................... SUCCESS [ 18.785 s]
[INFO] Apache Hadoop Dynamometer Workload Simulator ....... SUCCESS [ 25.240 s]
[INFO] Apache Hadoop Dynamometer Cluster Simulator ........ SUCCESS [ 24.243 s]
[INFO] Apache Hadoop Dynamometer Block Listing Generator .. SUCCESS [ 23.564 s]
[INFO] Apache Hadoop Dynamometer Dist ..................... SUCCESS [ 24.972 s]
[INFO] Apache Hadoop Dynamometer .......................... SUCCESS [  4.352 s]
[INFO] Apache Hadoop Archives ............................. SUCCESS [ 33.729 s]
[INFO] Apache Hadoop Archive Logs ......................... SUCCESS [ 25.850 s]
[INFO] Apache Hadoop Rumen ................................ SUCCESS [ 25.954 s]
[INFO] Apache Hadoop Gridmix .............................. SUCCESS [ 22.775 s]
[INFO] Apache Hadoop Data Join ............................ SUCCESS [ 24.023 s]
[INFO] Apache Hadoop Extras ............................... SUCCESS [ 23.918 s]
[INFO] Apache Hadoop Pipes ................................ SUCCESS [  5.174 s]
[INFO] Apache Hadoop Amazon Web Services support .......... SUCCESS [01:19 min]
[INFO] Apache Hadoop Kafka Library support ................ SUCCESS [ 15.371 s]
[INFO] Apache Hadoop Azure support ........................ SUCCESS [ 29.753 s]
[INFO] Apache Hadoop Aliyun OSS support ................... SUCCESS [ 24.773 s]
[INFO] Apache Hadoop Scheduler Load Simulator ............. SUCCESS [ 28.102 s]
[INFO] Apache Hadoop Resource Estimator Service ........... SUCCESS [ 17.202 s]
[INFO] Apache Hadoop Azure Data Lake support .............. SUCCESS [ 11.797 s]
[INFO] Apache Hadoop Image Generation Tool ................ SUCCESS [ 18.804 s]
[INFO] Apache Hadoop Tools Dist ........................... SUCCESS [ 34.335 s]
[INFO] Apache Hadoop OpenStack support .................... SUCCESS [  0.096 s]
[INFO] Apache Hadoop Common Benchmark ..................... SUCCESS [01:04 min]
[INFO] Apache Hadoop Tools ................................ SUCCESS [  2.841 s]
[INFO] Apache Hadoop Client API ........................... SUCCESS [02:10 min]
[INFO] Apache Hadoop Client Runtime ....................... SUCCESS [01:55 min]
[INFO] Apache Hadoop Client Packaging Invariants .......... SUCCESS [  0.985 s]
[INFO] Apache Hadoop Client Test Minicluster .............. SUCCESS [03:14 min]
[INFO] Apache Hadoop Client Packaging Invariants for Test . SUCCESS [  1.325 s]
[INFO] Apache Hadoop Client Packaging Integration Tests ... SUCCESS [  3.386 s]
[INFO] Apache Hadoop Distribution ......................... SUCCESS [ 54.833 s]
[INFO] Apache Hadoop Client Modules ....................... SUCCESS [  0.077 s]
[INFO] Apache Hadoop Tencent COS Support .................. SUCCESS [  9.521 s]
[INFO] Apache Hadoop Cloud Storage ........................ SUCCESS [  5.921 s]
[INFO] Apache Hadoop Cloud Storage Project ................ SUCCESS [  0.087 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  55:46 min
[INFO] Finished at: 2023-12-24T03:01:54-05:00
[INFO] ------------------------------------------------------------------------

The generated packages are in the hadoop-dist/target/ directory; hadoop-3.3.6.tar.gz is the final build artifact.

After deploying with the new package, run the check again:

$ hadoop checknative
2023-12-24 04:52:38,220 INFO bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native
2023-12-24 04:52:38,224 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
2023-12-24 04:52:38,230 WARN erasurecode.ErasureCodeNative: Loading ISA-L failed: Loading functions from ISA-L failed: Failed to load symbolec_init_tables
2023-12-24 04:52:38,230 WARN erasurecode.ErasureCodeNative: ISA-L support is not available in your platform... using builtin-java codec where applicable
2023-12-24 04:52:38,264 INFO nativeio.NativeIO: The native code was built with PMDK support, and PMDK libs were loaded successfully.
Native library checking:
hadoop:  true /opt/hadoop-3.3.6/lib/native/libhadoop.so.1.0.0
zlib:    true /lib/aarch64-linux-gnu/libz.so.1
zstd  :  true /lib/aarch64-linux-gnu/libzstd.so.1
bzip2:   true /lib/aarch64-linux-gnu/libbz2.so.1
openssl: true /lib/aarch64-linux-gnu/libcrypto.so.1.1
ISA-L:   false Loading ISA-L failed: Loading functions from ISA-L failed: Failed to load symbolec_init_tables
PMDK:    true /usr/lib/aarch64-linux-gnu/libpmem.so.1.0.0
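
When the check runs across many nodes, it helps to reduce the output to just the components that failed. A minimal sketch, assuming the `name: true|false ...` line layout shown above; the function name `failed_native_libs` is hypothetical.

```shell
# Sketch: list the components that `hadoop checknative` reports as false.
# failed_native_libs is a hypothetical helper; it reads the command's output on stdin.
failed_native_libs() {
    # Match lines such as "ISA-L:   false ..." and print the name before the colon.
    awk -F':' '/^[A-Za-z0-9_-]+[ \t]*:[ \t]+false/ {
        gsub(/[ \t]/, "", $1)
        print $1
    }'
}

# Example:
#   hadoop checknative 2>&1 | failed_native_libs
```

For the output above this prints the single line ISA-L, which is the problem covered in the next section.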

Common Issues

ISA-L error: Failed to load symbolec_init_tables

The ARMv8 patch has to be merged in manually: (link)

Then rebuild and reinstall ISA-L; the check then reports normally:

Native library checking:
hadoop:  true /opt/hadoop-3.3.6/lib/native/libhadoop.so.1.0.0
zlib:    true /lib/aarch64-linux-gnu/libz.so.1
zstd  :  true /lib/aarch64-linux-gnu/libzstd.so.1
bzip2:   true /lib/aarch64-linux-gnu/libbz2.so.1
openssl: true /lib/aarch64-linux-gnu/libcrypto.so
ISA-L:   true /lib/libisal.so.2
PMDK:    true /usr/lib/aarch64-linux-gnu/libpmem.so.1.0.0
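
Before redeploying, you can confirm that the rebuilt library actually exports the symbol Hadoop failed to load. The sketch below filters the output of `nm -D --defined-only`; the helper name `exports_symbol` is hypothetical, and the library path in the usage comment is the one reported by the check above.

```shell
# Sketch: succeed when a symbol appears in `nm -D --defined-only` output read from stdin.
# exports_symbol is a hypothetical helper name.
exports_symbol() {
    # $1: symbol name; nm prints the symbol name as the last field of each line
    awk -v sym="$1" '$NF == sym { found = 1 } END { exit found ? 0 : 1 }'
}

# Example:
#   nm -D --defined-only /lib/libisal.so.2 | exports_symbol ec_init_tables \
#       && echo "ec_init_tables is exported"
```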
