Compile Paddle Deep Learning Framework On Jetson Xavier NX

First and foremost, the Jetson Xavier NX configuration I compiled on is given below. For other JetPack versions it may be possible to follow the same steps and get the expected result, but you may end up facing issues.
NVIDIA Jetson Xavier NX (Developer Kit Version)
- Jetpack : 4.5 [L4T 32.5.0]
- Board :
* Type : Xavier NX (Developer Kit Version)
* SOC Family : tegra194 ID: 25
- Libraries:
* CUDA : 10.2.89
* OpenCV : 4.1.1 compiled CUDA: NO
* TensorRT : 7.1.3.0
* VPI : 1.0.12 ii libnvvpi1 arm64
* VisionWorks: 1.6.0.501
* Vulkan : 1.2.70
* cuDNN : 8.0.0.180
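If you are not sure which L4T / JetPack release your own board runs, one quick way to check it (on a standard L4T install) is:
cat /etc/nv_tegra_release
The first line reports the release and revision; R32 with revision 5.0 corresponds to the JetPack 4.5 used here.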
Compiling Paddle on Jetson for inference is a bit tedious. Following the existing documentation on the Paddle docs, it may take hours to compile, and even more if you get stuck on CMake issues.
Leaving the build process running overnight is the best idea.
Why compile?
An aarch64 wheel for Python 3 is not available, and even when it exists it tends to be outdated. Also, to work around the wheel installation issue on JetPack 4.5, I decided to compile it myself.
The steps mentioned in the Paddle docs will not work as you'd expect them to. The docs are a little outdated.
Let’s begin the build process:
1. Installing apt dependencies.
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install \
python3-pip \
gcc \
g++ \
make \
cmake \
git \
vim \
unrar \
python3-dev \
swig \
wget \
patchelf \
libopencv-dev
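Before moving on, it can save a failed configure step later to confirm which toolchain versions apt actually pulled in:
gcc --version
cmake --version
On JetPack 4.5 (Ubuntu 18.04 base) these are typically GCC 7.x and CMake 3.10.x.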
2. Installing python dependencies:
sudo -H pip3 install numpy==1.19.4 protobuf==3.14.0 wheel==0.30.0 setuptools==39.0.1
Takeaway:
Make sure the numpy version is 1.19.4, as I stumbled on an Illegal instruction (core dumped) error with other versions. You may end up compiling all over again thinking you screwed the build process up somewhere; I did :'( . A quick sanity check is shown below.
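A minimal sanity check (importing numpy is typically enough to trigger the crash when the version is incompatible):
python3 -c "import numpy; print(numpy.__version__)"
If this prints 1.19.4 and exits cleanly instead of dying with Illegal instruction, you are good to continue.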
3. Include CUDA on PATH
echo "export PATH=/usr/local/cuda/bin:\$PATH" >> ~/.bashrc
echo "export LD_LIBRARY_PATH=/usr/local/cuda:\$LD_LIBRARY_PATH" >> ~/.bashrc# Increase the number of files open limit to 102400. It is required to cope up with 'Too many files open' issue.
ulimit -n 102400
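To pick up the new variables in the current shell and confirm both changes took effect:
source ~/.bashrc
nvcc --version # should report CUDA 10.2
ulimit -n # should print 102400
Note that ulimit -n only applies to the current shell session, so re-run it in the shell you actually start the build from.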
4. Prepare directories for compilation:
mkdir ~/paddle-compilation
cd ~/paddle-compilation
5. Swap space is really helpful on Xavier NX
sudo fallocate -l 5G /var/swapfile
sudo chmod 600 /var/swapfile
sudo mkswap /var/swapfile
sudo swapon /var/swapfile
sudo bash -c 'echo "/var/swapfile swap swap defaults 0 0" >> /etc/fstab'
Takeaway:
a) Without any swap space and with the number of build procs set to 6, your build process is going to run out of memory at around 27% completion.
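Before kicking off the build you can confirm the extra swap is active:
swapon --show
free -h
Both should now list the 5G swapfile on top of whatever zram swap L4T configures by default.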
6. Start compiling:
sudo nvpmodel -m 0 && sudo jetson_clocks #(max power mode)
git clone https://github.com/PaddlePaddle/Paddle.git
cd Paddle
git checkout release/2.0
mkdir build
cd build
cmake .. \
-DWITH_CONTRIB=OFF \
-DWITH_MKL=OFF \
-DWITH_AVX=OFF \
-DWITH_MKLDNN=OFF \
-DWITH_TESTING=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DON_INFER=ON \
-DWITH_PYTHON=ON \
-DWITH_XBYAK=OFF \
-DWITH_NV_JETSON=ON \
-DWITH_TENSORRT=ON \
-DWITH_NCCL=OFF \
-DPY_VERSION=3
make -j4
Takeaways:
a) Make sure you turn the -DWITH_NCCL flag off, else Paddle will not compile and will throw an "NCCL not found" C++ compilation error. NCCL is usually only needed when the hardware is equipped with more than one GPU.
b) -DPY_VERSION=3 is a must if you'd like to compile for Python 3.
c) -DWITH_TENSORRT=ON is also needed to speed up inference by leveraging the TensorRT inference engine.
d) Since the process runs for hours, utilizing all cores throttles the CPU (on Nano) or the memory (on Xavier). Huh, figured that out after successive hits and trials.
e) Use make -j2 on Nano and make -j4 on Xavier for a peaceful build.
It takes around 5 hrs to complete the build process on Xavier (8–9 hours on Nano).
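If you want to keep an eye on memory, swap, and CPU load while the build runs (handy for catching an out-of-memory situation early), L4T ships tegrastats:
sudo tegrastats --interval 5000
This prints one usage line every 5 seconds; stop it with Ctrl+C.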
7. Finally install paddlepaddle-gpu on your Xavier… :)
pip3 install ./python/dist/paddlepaddle_gpu-0.0.0-cp36-cp36m-linux_aarch64.whl
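To confirm the freshly installed wheel actually works against the GPU, Paddle 2.0 provides a built-in self-check (paddle.utils.run_check()); a quick smoke test looks like this:
python3 -c "import paddle; paddle.utils.run_check()"
It should report that PaddlePaddle is installed successfully and is running on the GPU.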
If you reached the end and didn't manage to compile it, here's my final giveaway:
Wheel directly from G-Drive.