
Software Solutions for Migration Guide from AArch32 to AArch64

1. Introduction

This document applies to the i.MX 8 series of chips. The i.MX 8 series uses the Armv8-A architecture, which supports both the AArch64 and AArch32 Execution states. This document describes the software changes required when migrating from AArch32 to AArch64.

The Armv8-A architecture introduces a 64-bit instruction set (used in the AArch64 state) and remains compatible with the Armv7 32-bit architecture. If the processor is running a 64-bit operating system, it can still run unmodified Armv7 32-bit binaries. For Android/Yocto this means that once the kernel has been ported to 64 bits (which is supported by the Linux community), the rest of the OS, from core libraries to apps and games, can be either 32-bit or 64-bit.

NXP Semiconductors Document Number: AN12212

Application Note Rev. 0 , 07/2018

Contents

1. Introduction
2. The difference between AARCH64 and AARCH32
3. EL Model Implementation
4. What’s ATF
5. What’s SCFW
6. Image structure
7. How to build Aarch64 toolchain
8. How to build uboot/kernel/Application with Aarch64 Toolchain
9. How to build the rootfs to support 32-bit application
10. How to build the ATF
11. How to package scfw/atf/uboot
12. Utilizing the Codebase
13. Linux Kernel support of AARCH64
14. Reference document


Figure 1. ARMv8-A Architecture

2. The difference between AARCH64 and AARCH32

Difference between AARCH64 and AARCH32

AArch64 state: Provides 31 64-bit general-purpose registers, of which X30 is used as the procedure link register.
AArch32 state: Provides 13 32-bit general-purpose registers, and a 32-bit PC, SP, and link register (LR). The LR is used as both an ELR and a procedure link register. Some of these registers have multiple banked instances for use in different PE modes.

AArch64 state: Provides a 64-bit program counter (PC), stack pointers (SPs), and exception link registers (ELRs).
AArch32 state: Provides a single ELR, for exception returns from Hyp mode.

AArch64 state: Provides 32 128-bit registers for SIMD vector and scalar floating-point support.
AArch32 state: Provides 32 64-bit registers for Advanced SIMD vector and scalar floating-point support.

AArch64 state: Provides a single instruction set, A64.
AArch32 state: Provides two instruction sets, A32 and T32.

AArch64 state: Defines the Armv8 Exception model, with up to four Exception levels, EL0 - EL3, that provide an execution privilege hierarchy.
AArch32 state: Supports the Armv7-A exception model, based on PE modes, and maps this onto the Armv8 Exception model, that is based on the exception levels.

AArch64 state: Provides support for 64-bit virtual addressing.
AArch32 state: Provides support for 32-bit virtual addressing.

AArch64 state: Defines a number of Process state (PSTATE) elements that hold PE state. The A64 instruction set includes instructions that operate directly on various PSTATE elements.
AArch32 state: Defines a number of Process state (PSTATE) elements that hold PE state. The A32 and T32 instruction sets include instructions that operate directly on various PSTATE elements, and instructions that access PSTATE by using the Application Program Status Register (APSR) or the Current Program Status Register (CPSR).

AArch64 state: Names each System register using a suffix that indicates the lowest exception level at which the register can be accessed.

Transferring control between the AArch64 and AArch32 Execution states is known as interprocessing.

The PE can move between Execution states only on a change of Exception level, and subject to the rules

given in Interprocessing on page D1-1962 of document DDI0487C_a_armv8_arm. This means different

software layers, such as an application, an operating system kernel, and a hypervisor, executing at

different exception levels, can execute in different execution states.
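For user-space code, the Execution state is fixed by the target the compiler was invoked for. The following is a minimal sketch, not taken from any NXP or Arm API: the function names are hypothetical, and it assumes a GCC- or Clang-compatible compiler, which predefines __aarch64__ for AArch64 targets and __arm__ for AArch32 targets.

```c
#include <limits.h>
#include <stddef.h>

/* Name of the Execution state this translation unit was compiled for.
 * __aarch64__ and __arm__ are predefined by GCC and Clang; any other
 * target (for example an x86 build host) falls through to "other". */
const char *build_execution_state(void)
{
#if defined(__aarch64__)
    return "AArch64";
#elif defined(__arm__)
    return "AArch32";
#else
    return "other";
#endif
}

/* Width of a data pointer in bits: 64 for an LP64 AArch64 build,
 * 32 for an AArch32 build. */
unsigned pointer_width_bits(void)
{
    return (unsigned)(sizeof(void *) * CHAR_BIT);
}
```

Printing both values from a small test program built once with each toolchain is a quick way to confirm which state a binary targets before it is copied to the board.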

3. EL Model Implementation

The exception levels in AArch64 are the same as those in the Cortex®-A15 through the addition of the

HYP mode for hypervisor support inserted between the OS kernel mode and the TrustZone® monitor

mode. The secure world still only supports a single OS instance for reasons of simplicity associated with

approvals for secure software.

The exception hierarchy also defines a strict set of rules for what transitions are valid between 32-bit and

64-bit operation. As execution elevates or takes exception at an increased level of privilege, operation

can only be either at the same width, or an increased width of ISA. It is not possible, for example, for a

32-bit hypervisor to support a 64-bit operating system while executing in the HYP mode.
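The width rule above can be expressed as a simple check. This is an illustrative sketch only: the function name and the array representation are assumptions made for the example, and the check models nothing beyond the rule that the register width never decreases as the Exception level increases.

```c
#include <stdbool.h>

/* widths[n] is the register width (32 or 64) used at ELn, for n = 0..3.
 * The configuration is valid when the width never decreases as the
 * Exception level increases. */
bool el_widths_valid(const int widths[4])
{
    for (int el = 0; el < 4; el++) {
        if (widths[el] != 32 && widths[el] != 64)
            return false;               /* only 32 and 64 are meaningful */
        if (el > 0 && widths[el] < widths[el - 1])
            return false;               /* a higher level may not be narrower */
    }
    return true;
}
```

A 32-bit application under a 64-bit kernel, {32, 64, 64, 64}, passes the check, while a 64-bit operating system under a 32-bit hypervisor, {32, 64, 32, 64}, is rejected.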

Figure 2. Armv8-A Exception Levels Model

To help support this new exception model, A64 also introduces a dedicated exception link register (ELR) that is written on every exception entry. When an exception is taken from a 32-bit mode into a 64-bit Exception level, the value is automatically zero-extended. Interrupt masks are also set automatically on exception entry. Each Exception level has its own vector base address register, and each vector is distinguished by type: synchronous, IRQ, FIQ, or Error. The origin of the exception is also available from the vector, and additional details about the exception are supplied in a syndrome register. This is particularly useful for virtualizing I/O devices: any virtual machine access to the device is trapped to an exception in the hypervisor, which can simply read this information to determine which operation is to be virtualized.
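As a sketch of how software reads the syndrome information, the helpers below split an Exception Syndrome Register (ESR_ELx) value into its fields. The field positions follow the Armv8-A architecture (EC in bits [31:26], IL in bit 25, ISS in bits [24:0]); the function names are hypothetical and chosen for this example only.

```c
#include <stdint.h>

/* Exception class: identifies the reason for the exception. */
unsigned esr_exception_class(uint64_t esr)
{
    return (unsigned)((esr >> 26) & 0x3f);
}

/* IL bit: 1 when the trapped instruction was 32 bits long. */
unsigned esr_instruction_length_bit(uint64_t esr)
{
    return (unsigned)((esr >> 25) & 0x1);
}

/* Instruction specific syndrome: meaning depends on the exception class. */
uint32_t esr_iss(uint64_t esr)
{
    return (uint32_t)(esr & 0x1ffffff);
}
```

For example, the value 0x96000045 decodes to exception class 0x25, which the architecture assigns to a Data Abort taken without a change in Exception level, with IL set and an ISS of 0x45.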

In the i.MX 8 implementation of the EL model:

1. EL3 is implemented in Arm® Trusted Firmware (ATF).

2. EL0/EL1/EL2 are implemented in the Linux kernel.

4. What’s ATF

Arm Trusted Firmware (ATF) provides a reference implementation of secure world software for Armv8-A, including a “Secure Monitor” executing at Exception Level 3 (EL3). It implements various Arm interface standards, such as:

- The Power State Coordination Interface (PSCI)

- Trusted Board Boot Requirements (TBBR, ARM DEN0006C-1)

- SMC Calling Convention

- System Control and Management Interface

As far as possible the code is designed for reuse or porting to other Armv8-A model and hardware

platforms. Arm will continue development in collaboration with interested parties to provide a full

reference implementation of Secure Monitor code and Arm standards to the benefit of all developers

working with Armv8-A TrustZone technology.
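Calls into the Secure Monitor are identified by a 32-bit function ID defined by the SMC Calling Convention, and the ID itself records whether the 32-bit or 64-bit calling convention is used. The sketch below decodes such an ID. The helper names are hypothetical; the bit positions and the two PSCI CPU_ON values come from the SMC Calling Convention and PSCI specifications.

```c
#include <stdbool.h>
#include <stdint.h>

/* PSCI CPU_ON function IDs for the 32-bit and 64-bit calling conventions. */
#define PSCI_CPU_ON_SMC32 0x84000003u
#define PSCI_CPU_ON_SMC64 0xC4000003u

/* Bit 31: fast call (1) or yielding call (0). */
bool smccc_is_fast_call(uint32_t fid) { return ((fid >> 31) & 1u) != 0; }

/* Bit 30: SMC64 (1) or SMC32 (0) calling convention. */
bool smccc_is_smc64(uint32_t fid) { return ((fid >> 30) & 1u) != 0; }

/* Bits [29:24]: owning entity; 4 is the standard secure service range. */
unsigned smccc_owner(uint32_t fid) { return (fid >> 24) & 0x3fu; }

/* Bits [15:0]: function number within the owning entity. */
unsigned smccc_function_number(uint32_t fid) { return fid & 0xffffu; }
```

Both CPU_ON IDs share owner 4 and function number 3; only bit 30 differs. That bit is what a caller migrating from AArch32 to AArch64 changes when it switches to passing 64-bit arguments.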

For the NXP i.MX 8 platform implementation, you can find the imx-atf source code at /tmp/work/imx8qxpmek-poky-linux/imx-atf/ in the Yocto project, or in the following git project:

https://source.codeaurora.org/external/imx/imx-atf.git

5. What’s SCFW

The i.MX 8 introduces a System Controller (SC) that provides an abstraction of many of the underlying features of the hardware. The SC runs on a Cortex-M processor, which executes the SC firmware (SCFW). This overview describes the features of the SCFW and the APIs exposed to other software components.

The features of the SC include:

• System Initialization and Boot

• Initial power and clock configuration

• DRAM controller configuration

• Security configuration

• Power Management

• Power, Clock, and Reset Control

• Resource Management

• SoC peripheral, memory, I/O management and partitioning

• Access / permission control

• System Counter

• Pin multiplexing and Pad Configuration

• Temperature Monitoring

• Watchdog

• Misc Control

• Misc chip GPR control

You can find the SC firmware binary at tmp/work/imx8qxpmek-poky-linux/imx-sc-firmware/0.2-r0/imx-sc-firmware-0.2/. The source code is not open, so you use the binary directly. You need this binary when you package the images using the mkimage tool.

6. Image structure

For the Yocto project:

$ ls tmp/work/imx8qxpmek-poky-linux/imx-sc-firmware/0.2-r0/imx-sc-firmware-0.2/
COPYING mx8qm-scfw-tcm.bin mx8qx-scfw-tcm.bin SCR-imx-sc-firmware.txt

The binaries needed to run on the imx8qxp are the SCFW firmware, the ATF binary, U-Boot, the kernel Image, and the rootfs. They are placed on the SD card as follows:

SCFW (scfw_tcm.bin) + ATF (bl31.bin) + U-Boot ---> flash.bin

dd if=flash.bin of=/dev/mmcblk1 bs=1k seek=33 conv=fsync

or

dd if=flash.bin of=/dev/mmcblk1 bs=1k seek=32 conv=fsync

The flash.bin location can differ slightly between SoC revisions; check the user guide for your specific chip.

uImage + dtb ---> mmcblk1p1 (FAT file system)

rootfs ---> mmcblk1p2 (ext4 file system)
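The two dd commands above differ only in the seek count. As a small worked example (the function name is hypothetical), the byte offset at which dd starts writing is the block size multiplied by the seek count, where bs=1k means 1024 bytes:

```c
#include <stdint.h>

/* Byte offset at which dd starts writing the output:
 * block size in bytes times the number of blocks skipped by seek=. */
uint64_t dd_seek_offset_bytes(uint64_t block_size_bytes, uint64_t seek_blocks)
{
    return block_size_bytes * seek_blocks;
}
```

With bs=1k, seek=33 places flash.bin at byte offset 33792 (0x8400) and seek=32 places it at 32768 (0x8000), so the two variants differ by exactly 1 KB.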

The binaries used by each processor:

Cortex-M4 (SCU) ---> SCFW (scfw_tcm.bin)

CA72/53/35 (EL3) ---> ATF (bl31.bin)

CA72/53/35 (EL0-EL2) ---> U-Boot/kernel/rootfs

7. How to build Aarch64 toolchain

# repo init -u https://source.codeaurora.org/external/imx/imx-manifest -b imx-linux-morty -m imx-4.9.51-8mq_beta.xml

# repo sync

# DISTRO=fsl-imx-wayland MACHINE=imx8qxpmek source fsl-setup-release.sh -b build-wayland

# bitbake meta-toolchain

# ./tmp/deploy/sdk/fsl-imx-wayland-glibc-x86_64-meta-toolchain-aarch64-toolchain-4.9.51-mx8-beta.sh

If the build succeeds, running the installer prompts you for the toolchain installation directory:

# ./tmp/deploy/sdk/fsl-imx-wayland-glibc-x86_64-meta-toolchain-aarch64-toolchain-4.9.51-mx8-beta.sh

NXP i.MX Release Distro SDK installer version 4.9.51-mx8-beta

Enter target directory for SDK (default: /opt/fsl-imx-wayland/4.9.51-mx8-beta):

For more details, download the i.MX Yocto documents:

https://cache.nxp.com/secured/assets/documents/en/supporting-information/fsl-yocto-L4.9.51_imx8mq-beta.tar.gz?__gda__=1516942644_3c882b72a61fbe1553c59e615a116d1e&fileExt=.gz

8. How to build uboot/kernel/Application with Aarch64 Toolchain

1. Set up the AArch64 build environment:

a. source /opt/fsl-imx-wayland/4.9.51-mx8-beta/environment-setup-aarch64-poky-linux

b. Run printenv to confirm that the environment variables have been set:

ARCH=arm64

RANLIB=aarch64-poky-linux-ranlib

CROSS_COMPILE=aarch64-poky-linux-

CC=aarch64-poky-linux-gcc --sysroot=/opt/fsl-imx-wayland/4.9.51-mx8-beta/sysroots/aarch64-poky-linux

XDG_RUNTIME_DIR=/run/user/1001

OBJDUMP=aarch64-poky-linux-objdump

LESSCLOSE=/usr/bin/lesspipe %s %s

SDKTARGETSYSROOT=/opt/fsl-imx-wayland/4.9.51-mx8-beta/sysroots/aarch64-poky-linux

2. Build the kernel:

a. # unset CFLAGS CPPFLAGS CXXFLAGS LDFLAGS MACHINE

b. # make defconfig

c. # make -j4

d. The dtb and Image files are then created under arch/arm64/boot/.

3. Build the u-boot:

a. # unset CFLAGS CPPFLAGS CXXFLAGS LDFLAGS MACHINE

b. # make imx8qxp_mek_defconfig (Take imx8qxp mek board as an example here)

c. # make

d. The u-boot.imx binary is then available in the top directory of the U-Boot tree.

4. Build a Linux application:

Test@mpusesz:~$ cat hello.c

#include <stdio.h>

int main(void)
{
    printf("hello world\n");
    return 0;
}

Test@mpusesz:~$ source /opt/fsl-imx-xwayland/4.9.51-mx8-beta/environment-setup-aarch64-poky-

Test@mpusesz:~$ aarch64-poky-linux-gcc --sysroot=/opt/fsl-imx-xwayland/4.9.51-mx8-beta/sysroots/aarch64-poky-linux -L/opt/fsl-imx-xwayland/4.9.51-mx8-beta/sysroots/aarch64-poky-linux/usr/lib -lm -lc hello.c -o hello

Run on the imx8qxpmek board:

root@imx8qxpmek:~# ./hello

hello world

Alternatively, you can use aarch64-linux-gnu-gcc to build the application:

Test@mpusesz:~$ source /opt/fsl-imx-xwayland/4.9.51-mx8-beta/environment-setup-aarch64-poky-

Test@mpusesz:~$ aarch64-linux-gnu-gcc hello.c -o hello

9. How to build the rootfs to support 32-bit application

Add the following lines to the local.conf file:

require conf/multilib.conf

MULTILIBS = "multilib:lib32"

DEFAULTTUNE_virtclass-multilib-lib32 = "armv7athf-neon"

IMAGE_INSTALL_append = " lib32-glibc lib32-libgcc lib32-libstdc++"

Then rebuild the Yocto project using the bitbake command.

After the rebuild, the root file system contains two library directories, lib and lib64, and two dynamic loaders, ld-linux-aarch64.so.1 and ld-linux-armhf.so.3, which load 64-bit and 32-bit binaries respectively.

Test on the imx8qxp MEK board with a simple hello application:

root@imx8qxpmek:~# file hello

hello: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-armhf.so.3, for GNU/Linux 2.6.32, BuildID[sha1]=1175974bb5c2b31bc10c4483f018bbd3bb858ec5, not stripped

root@imx8qxpmek:~# ./hello

###hello

If necessary, change "armv7athf-neon" to the tune that matches the toolchain used for your 32-bit application. Alternatively, you can build your 32-bit application with the -static link flag.
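The file output above distinguishes 32-bit from 64-bit binaries by reading the ELF identification bytes at the start of the file. The following is a minimal sketch of that check (the function name is hypothetical). It relies only on the ELF format: the magic bytes 0x7f 'E' 'L' 'F', followed by the EI_CLASS byte, which is 1 for 32-bit and 2 for 64-bit objects.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 32 or 64 according to the EI_CLASS byte of an ELF header,
 * or 0 if the buffer does not start with a valid ELF identification. */
int elf_class_bits(const uint8_t *buf, size_t len)
{
    if (len < 5 || buf[0] != 0x7f || buf[1] != 'E' ||
        buf[2] != 'L' || buf[3] != 'F')
        return 0;

    switch (buf[4]) {           /* e_ident[EI_CLASS] */
    case 1:  return 32;         /* ELFCLASS32 */
    case 2:  return 64;         /* ELFCLASS64 */
    default: return 0;
    }
}
```

Passing the first five bytes of a binary to this function tells you which of the two dynamic loaders, and therefore which library directory, the binary will need on the target.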

For details on how Armv8 supports transitions from 32-bit to 64-bit and from 64-bit to 32-bit, see the document ARMv8_white_paper_v5.pdf, which is available from the Arm website.

10. How to build the ATF

Download the ATF from:

# git clone https://source.codeaurora.org/external/imx/imx-atf.git

# cd imx-atf

# git checkout imx_4.9.51_imx8_beta2

# make PLAT=imx8qxp bl31

The ATF binary is then created at:

build/imx8qxp/release/bl31.bin

11. How to package scfw/atf/uboot

Download imx-mkimage from:

# git clone https://source.codeaurora.org/external/imx/imx-mkimage.git

# cd imx-mkimage/

# git checkout imx_4.9.51_imx8_beta2

# cp bl31.bin imx-mkimage/iMX8QX/ (imx8qxp is used as the example)

# cp u-boot.bin imx-mkimage/iMX8QX/

# cp mx8qm-scfw-tcm.bin imx-mkimage/iMX8QX/

# make SOC=iMX8QX flash_dcd

The resulting binary is then available:

imx-mkimage/iMX8QX$ ls flash.bin

flash.bin

Now write the combined SCFW + ATF + U-Boot image to the boot device:

# dd if=flash.bin of=/dev/mmcblk1 bs=1k seek=33 conv=fsync

12. Utilizing the Codebase

Because AArch64 provides more general-purpose registers, code compiled for it can perform better. Where possible, rebuild your code with the AArch64 toolchain and fix all warnings reported by the new toolchain.
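Many of the warnings that appear after such a rebuild come from code that assumes pointers, long, or size_t are 32 bits wide. The following sketch shows the portable alternatives from the C standard library. The function names are hypothetical; the example assumes the usual Linux data models (ILP32 for AArch32, LP64 for AArch64).

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Store a pointer in an integer. On AArch32 an unsigned int happened to
 * be wide enough; on AArch64 int is still 32 bits while pointers are 64,
 * so uintptr_t is the portable choice. */
uintptr_t pointer_to_integer(const void *p)
{
    return (uintptr_t)p;
}

const void *integer_to_pointer(uintptr_t v)
{
    return (const void *)v;
}

/* Format a size_t and a 64-bit value without width-specific format
 * strings such as %u or %lu, which differ between the two data models. */
int format_sizes(char *out, size_t out_len, size_t n, uint64_t big)
{
    return snprintf(out, out_len, "%zu %" PRIu64, n, big);
}
```

The same source then compiles without truncation or format warnings for both the 32-bit and the 64-bit toolchain.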

Armv8 compilers support standard high-level code such as C and C++ natively; this code will compile

and run after an appropriate Armv8 Board Support Package (BSP) is in place. Assembly code, by

contrast, requires that careful attention be paid to how the code is used. While many assembly

instructions from Armv7 still exist in Armv8, their syntax or behavior can differ in very subtle ways.

Some coding constructs that do not compile or behave as expected relative to Armv7 include hard-coded

memory locations (true of any software porting project), access to the ARMv7 coprocessors (such as

CP15) and register names, and data alignment. The Arm® Cortex™-A Programmer's Guide for Armv8-

A (DEN0024A), published by Arm, presents a detailed analysis of porting concerns.
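As an illustration of the coprocessor change, the sketch below reads the generic timer frequency register in both Execution states. In AArch32 it is reached through CP15 with an MRC instruction; in AArch64 the same register is a named System register read with MRS. The function name is hypothetical, the inline assembly assumes a GCC- or Clang-compatible compiler, and the read only succeeds at EL0 when the operating system has enabled user access to the counter registers, which is an assumption here.

```c
#include <stdint.h>

/* Read the generic timer frequency (CNTFRQ).
 * Returns 0 when built for a non-Arm target such as an x86 build host. */
uint64_t read_counter_frequency(void)
{
#if defined(__aarch64__)
    uint64_t value;
    /* AArch64: named System register with an Exception level suffix. */
    __asm__ volatile("mrs %0, cntfrq_el0" : "=r"(value));
    return value;
#elif defined(__arm__)
    uint32_t value;
    /* AArch32: the same register accessed through coprocessor CP15. */
    __asm__ volatile("mrc p15, 0, %0, c14, c0, 0" : "=r"(value));
    return value;
#else
    return 0;
#endif
}
```

Isolating such accesses behind one function keeps the architecture-specific assembly in a single place, so the rest of the code base does not change when the Execution state does.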

13. Linux Kernel support of AARCH64

Armv8-A exception support:

Linux can be used as a guest OS or as a hypervisor. In the kernel code base, EL0/EL1/EL2 are implemented by default. EL3 is implemented in ATF.

The following code fragment from the kernel code base sets up the EL0/EL1 exception handling:

Figure 3. kernel code base setup the EL0/EL1 level exception

UEFI protocol interface support:

For AArch64, the Linux kernel enables CONFIG_EFI by default, so it can be loaded by UEFI firmware as well as by U-Boot.

Figure 4. UEFI protocol interface support

Boot up linux:

Essentially, the boot loader should provide (as a minimum) the following:

1. Set up and initialize the RAM

2. Set up the device tree

3. Decompress the kernel image

4. Call the kernel image

The AArch64 kernel does not currently provide a decompressor and therefore requires decompression

(gzip etc.) to be performed by the bootloader if a compressed Image target (e.g. Image.gz) is used. For

bootloaders that do not implement this requirement, the uncompressed Image target is available instead.

The bootloader places the physical address of the device tree blob (dtb) in system RAM into general-purpose register x0, and the kernel reads this address from x0 when it boots.
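Before jumping to the kernel, a bootloader can sanity-check the blob whose address it is about to pass in x0. The following is a minimal sketch with hypothetical function names. It relies on two facts: a flattened device tree begins with the big-endian magic value 0xd00dfeed, and Documentation/arm64/booting.txt requires the blob to be placed on an 8-byte boundary.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FDT_MAGIC 0xd00dfeedu   /* stored big-endian at the start of the blob */

/* Read a 32-bit big-endian value regardless of host endianness. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* True when the buffer starts with the flattened device tree magic. */
bool dtb_magic_ok(const uint8_t *blob, size_t len)
{
    return len >= 4 && read_be32(blob) == FDT_MAGIC;
}

/* True when the physical address is on an 8-byte boundary. */
bool dtb_address_aligned(uint64_t phys_addr)
{
    return (phys_addr & 0x7) == 0;
}
```

A full validation would also check the header's total size and version fields; this sketch covers only the two conditions that most commonly cause an early boot failure.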

The NXP i.MX 8 BSP release supports 64-bit by default. If you want to port a newer community kernel to i.MX 64-bit processors, refer to the arm64 documents in the kernel code base under Documentation/arm64/.

Memory management:

The memory management unit defined by AArch64 is fundamentally the same as that used in the

Cortex-A15 except for the ability to now support both 48-bits of virtual and physical address. Bounding

support to 48-bit has the advantage that we could again simplify the hardware such that they are required

to only support up to four levels of page table when required to decode an address, while also more

importantly limiting the scope of complexity for validation. In fact, AArch64 also now natively supports

a 64 KB minimum page size in addition to the more familiar 4 KB page and as such can reduce the

required walk from four to two levels where a 42-bit address is sufficient.
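The level counts quoted above follow from simple arithmetic. Each translation table occupies one page of 8-byte descriptors, so a table in a page of 2^n bytes resolves n - 3 address bits. The sketch below (the function name is hypothetical) computes the number of levels needed for a given virtual address size and page size.

```c
/* Number of translation table levels needed to resolve va_bits of
 * virtual address with pages of 2^page_shift bytes. The page offset
 * consumes page_shift bits; each level resolves (page_shift - 3) bits. */
unsigned page_table_levels(unsigned va_bits, unsigned page_shift)
{
    unsigned bits_per_level = page_shift - 3;
    unsigned bits_to_resolve = va_bits - page_shift;

    /* Round up: a partially used top-level table still counts as a level. */
    return (bits_to_resolve + bits_per_level - 1) / bits_per_level;
}
```

For 4 KB pages (page_shift 12) and a 48-bit address this gives (48 - 12) / 9 = 4 levels, and for 64 KB pages (page_shift 16) and a 42-bit address it gives (42 - 16) / 13 = 2 levels, matching the figures in the text.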

Figure 5. Memory management unit defined by AArch64

Any 32-bit code will of course be limited to operating in the first 4 GB of address space, and as such the hardware will automatically zero-extend the virtual address into any elevated 64-bit call. To provide a basic memory map, the architecture also provides two base addresses from which the virtual addresses used for access to the OS services and the application data can grow. Virtual address spaces from 2^32 to 2^48 bytes in size are supported from the top and bottom of the 64-bit address space.
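The split into a bottom and a top region can be illustrated with a small classifier. This is a sketch with hypothetical names: it treats an address as belonging to the lower region when all bits above the configured size are zero, to the upper region when they are all one, and as invalid otherwise. It assumes va_bits is between 32 and 48 and ignores address tagging.

```c
#include <stdint.h>

enum va_region { VA_LOWER, VA_UPPER, VA_INVALID };

/* Classify a 64-bit virtual address for a given address space size. */
enum va_region classify_va(uint64_t va, unsigned va_bits)
{
    uint64_t top = va >> va_bits;               /* bits above the VA size */
    uint64_t all_ones = UINT64_MAX >> va_bits;  /* same bits, all set */

    if (top == 0)
        return VA_LOWER;    /* grows up from the bottom of the address space */
    if (top == all_ones)
        return VA_UPPER;    /* grows down from the top of the address space */
    return VA_INVALID;      /* falls in the unmapped gap between the two */
}
```

With 48-bit addressing, 0x0000ffffffffffff is the last address of the lower region and 0xffff000000000000 is the first address of the upper region; everything in between is invalid.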

As with the Cortex-A15, the Armv8 memory management unit provides two stage translations from an

application virtual address (VA), to the intermediate physical address (IPA) used by any hypervisor, and

then through to the actual physical address (PA) placed on the memory bus. The IPA and PA are

implementation defined to support between 32 and 48-bits of address space.

You can select the virtual address space size through a kernel Kconfig option:

Figure 6. Choose virtual address space size number through kernel kconfig option

You can also select a page size of 64 KB or another supported size.

Figure 7. Choose page size

Here is the virtual kernel memory layout of the 4 KB page size case in imx8qxp:

Figure 8. Virtual kernel memory layout of the 4 KB page size

Here is the virtual kernel memory layout of 64 KB page size case in imx8qxp:

Figure 9. virtual kernel memory layout of 64 KB page size

Compared to the arm32 memory layout, AArch64 has ample virtual address space, so the lowmem/highmem split is no longer needed. All of the vmalloc address space can be used like legacy lowmem. The following picture shows the imx6sx-sdb Linux memory layout.

Figure 10. imx6sx-sdb linux memory layout

The memory layout definition in the kernel code:

Figure 11. memory layout definition in the kernel code

The kernel text starts at KIMAGE_VADDR + TEXT_OFFSET.

Figure 12. Text start address

The physical memory configuration is the same as before and can be set in the dts file:

Figure 13. DDR Memory Setting in the dts file

You can configure the DDR memory size, start address and reserve the memory from the system

memory in the dts file.

14. Reference document

• DDI0487C_a_armv8_arm.pdf

• ARMv8_white_paper_v5.pdf

• Documentation/arm64/memory.txt

• Documentation/arm64/booting.txt

Document Number: AN12212 Rev. 0

07/2018

How to Reach Us:

Home Page:

nxp.com

Web Support:

nxp.com/support

Information in this document is provided solely to enable system and software implementers to use NXP

products. There are no express or implied copyright licenses granted hereunder to design or fabricate any

integrated circuits based on the information in this document. NXP reserves the right to make changes

without further notice to any products herein.

NXP makes no warranty, representation, or guarantee regarding the suitability of its products for any

particular purpose, nor does NXP assume any liability arising out of the application or use of any product

or circuit, and specifically disclaims any and all liability, including without limitation consequential or

incidental damages. “Typical” parameters that may be provided in NXP data sheets and/or specifications

can and do vary in different applications, and actual performance may vary over time. All operating

parameters, including “typicals,” must be validated for each customer application by customer's technical

experts. NXP does not convey any license under its patent rights nor the rights of others. NXP sells

products pursuant to standard terms and conditions of sale, which can be found at the following address:

nxp.com/SalesTermsandConditions.

While NXP has implemented advanced security features, all products may be subject to unidentified

vulnerabilities. Customers are responsible for the design and operation of their applications and products

to reduce the effect of these vulnerabilities on customer’s applications and products, and NXP accepts no

liability for any vulnerability that is discovered. Customers should implement appropriate design and

operating safeguards to minimize the risks associated with their applications and products.

NXP, the NXP logo, NXP SECURE CONNECTIONS FOR A SMARTER WORLD, COOLFLUX,

EMBRACE, GREENCHIP, HITAG, I2C BUS, ICODE, JCOP, LIFE VIBES, MIFARE, MIFARE

CLASSIC, MIFARE DESFire, MIFARE PLUS, MIFARE FLEX, MANTIS, MIFARE ULTRALIGHT,

MIFARE4MOBILE, MIGLO, NTAG, ROADLINK, SMARTLX, SMARTMX, STARPLUG, TOPFET,

TRENCHMOS, UCODE, Freescale, the Freescale logo, AltiVec, C‑5, CodeTEST, CodeWarrior,

ColdFire, ColdFire+, C‑Ware, the Energy Efficient Solutions logo, Kinetis, Layerscape, MagniV,

mobileGT, PEG, PowerQUICC, Processor Expert, QorIQ, QorIQ Qonverge, Ready Play, SafeAssure, the

SafeAssure logo, StarCore, Symphony, VortiQa, Vybrid, Airfast, BeeKit, BeeStack, CoreNet, Flexis,

MXC, Platform in a Package, QUICC Engine, SMARTMOS, Tower, TurboLink, and UMEMS are

trademarks of NXP B.V. All other product or service names are the property of their respective owners.

Arm, AMBA, Artisan, Cortex, Jazelle, Keil, SecurCore, Thumb, TrustZone, and μVision are registered

trademarks of Arm Limited (or its subsidiaries) in the EU and/or elsewhere. Arm7, Arm9, Arm11,

big.LITTLE, CoreLink, CoreSight, DesignStart, Mali, Mbed, NEON, POP, Sensinode, Socrates, ULINK

and Versatile are trademarks of Arm Limited (or its subsidiaries) in the EU and/or elsewhere. All rights

reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. The Power Architecture

and Power.org word marks and the Power and Power.org logos and related marks are trademarks and

service marks licensed by Power.org.
