2. About us.
● Kynetics is a full software stack engineering firm
○ Embedded Unit
○ Application Unit
● We support NXP embedded application processors
● Custom Android OS for different industries.
○ Kernel development
○ API Device Integration (HAL)
○ Custom system services (native, Java)
● Continuous building and delivery of artifacts
3. Outline (1/2)
1. Introduction to AMP
○ SMP vs AMP
○ i.MX7 overview
○ OpenAMP framework
2. RPMsg in Android kernel
○ RPMsg character driver overview
○ Implementation in the Linux kernel
3. Android porting on Colibri i.MX7
○ Kynetics’ Cohesys BSP for Colibri i.MX7
4. SMP vs AMP
SMP on homogeneous architectures:
● Single OS controlling two or more identical cores sharing system resources
● Dynamic scheduling and load balancing
[Figure: applications on a single OS with an SMP kernel spanning Core 1 ... Core n]
6. SMP vs AMP
AMP on heterogeneous architectures:
● Different OS on each core: a full-featured OS alongside a real-time kernel
● Inter-processor communication protocol
● Efficient when the application can be statically partitioned across cores: high performance is achieved locally
[Figure: applications on an OS and tasks on an OS/RTOS, each on its own cores (Core 1 ... Core n), communicating through MCAPI]
7. Why Heterogeneous Systems?
A growing number of embedded systems require concurrent execution in
segregated environments:
● Real-time performance when accessing certain devices/peripherals
● Power consumption (MCU + MPU systems were used in the past)
○ Data aggregation from sensors
● System integrity: segregation (Rich OS + critical subsystem)
○ Multi-chip approach
○ Virtualization
○ HMP (Heterogeneous multiprocessing) ←
8. NXP i.MX7 overview
● Cortex-A7 core + Cortex-M4 core
● Master - Slave architecture
○ A7 is the master
○ M4 is the slave
● Inter-processor communication
○ MU - Messaging Unit
○ RPMsg component (OpenAMP framework)
● Safe sharing of I/O resources
○ RDC - Resource Domain Controller
i.MX7 Reference Manual: https://www.nxp.com/docs/en/reference-manual/IMX7DRM.pdf
9. Why Embedded Android
● Very application oriented: provides an abstraction between the low-level hardware and the application layers
● Rich UI SDK
○ Native (NDK)
○ Java (SDK)
● Great debugging tools
● Productive development environment
○ Android Studio
○ Gradle based build system
● Almost any Java developer can be an “embedded application developer”
10. OpenAMP framework: Inter-processor communication
● Transport Layer: RPMsg (RPMsg Lite, OpenAMP RPMsg, ...)
● MAC Layer: VirtIO/Virtqueue (VirtIO, Virtqueue, Vring)
● Physical Layer: shared memory and inter-core interrupts (Shmem, MU, Mailbox)
11. OpenAMP framework: MAC (VirtIO)
[Figure: virtqueue and vring layout in shared memory]
● virtqueue
○ struct vring
○ short used_idx
○ short avail_idx
○ int (*add_buff)(..)
○ void *(*get_buff)(..)
○ void (*kick)(..)
● VRING
○ vring_desc entries (descriptor table)
○ vring_avail
○ vring_used
● Buffer list: buffers located in shared memory
12. VirtIO Communication
Master (A7) transmits to Remote (M4)
● Master get_buff() from virtqueue1
○ get idx from USED ring
● Master fills the buffer
● Master add_buff() to virtqueue1
○ write buffer idx in AVAIL ring and increment idx
● Remote get_buff() from AVAIL ring
○ Remote add_buff() to USED ring (freed)
● Master writes buffer idx to USED ring and increments idx
Master (A7) receives from Remote (M4)
● Master get_buff() from virtqueue2
○ get idx from USED ring tail
● Master add_buff() to virtqueue2
○ write buffer idx in AVAIL ring and increment
● Remote get_buff() from AVAIL ring and fills the buffer
○ Remote add_buff() to USED ring and increment
● Master get_buff() from USED ring
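The transmit path above can be sketched as a toy model in plain C. This is a simplified illustration, not the real VirtIO memory layout or the OpenAMP API: the names `toy_vq`, `idx_ring`, `master_send` and `remote_receive`, the ring size and the buffer size are all assumptions made for the example, and the inter-core interrupt (the "kick") is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_BUFS 4   /* ring size, illustrative only */
#define BUF_SIZE 32  /* bytes per shared buffer, illustrative only */

/* A ring of buffer indices with a free-running producer index. */
struct idx_ring {
    uint16_t ring[NUM_BUFS];
    uint16_t idx;            /* incremented by whoever produces into this ring */
};

/* Toy stand-in for one virtqueue placed in shared memory. */
struct toy_vq {
    char bufs[NUM_BUFS][BUF_SIZE];
    struct idx_ring avail;   /* written by the master, read by the remote */
    struct idx_ring used;    /* written by the remote, read by the master */
    uint16_t last_avail;     /* remote's private read position in AVAIL */
    uint16_t last_used;      /* master's private read position in USED */
};

static void ring_push(struct idx_ring *r, uint16_t buf)
{
    r->ring[r->idx % NUM_BUFS] = buf;
    r->idx++;
}

/* Returns the next buffer index, or -1 if nothing new was produced. */
static int ring_pop(const struct idx_ring *r, uint16_t *last_seen)
{
    if (*last_seen == r->idx)
        return -1;
    return r->ring[(*last_seen)++ % NUM_BUFS];
}

/* All buffers start in the USED ring, i.e. free for the master to take. */
void toy_vq_init(struct toy_vq *vq)
{
    memset(vq, 0, sizeof(*vq));
    for (uint16_t i = 0; i < NUM_BUFS; i++)
        ring_push(&vq->used, i);
}

/* Master TX: take a free buffer from USED, fill it, publish it in AVAIL. */
int master_send(struct toy_vq *vq, const char *msg)
{
    assert(msg != NULL);
    int b = ring_pop(&vq->used, &vq->last_used);
    if (b < 0)
        return -1;                        /* no free buffer */
    strncpy(vq->bufs[b], msg, BUF_SIZE - 1);
    vq->bufs[b][BUF_SIZE - 1] = '\0';
    ring_push(&vq->avail, (uint16_t)b);
    return b;
}

/* Remote RX: take a buffer from AVAIL, consume it, hand it back via USED. */
int remote_receive(struct toy_vq *vq, char *out)
{
    int b = ring_pop(&vq->avail, &vq->last_avail);
    if (b < 0)
        return -1;                        /* nothing to receive */
    memcpy(out, vq->bufs[b], BUF_SIZE);
    ring_push(&vq->used, (uint16_t)b);    /* buffer is free again */
    return b;
}
```

In this model a buffer is owned by exactly one side at a time: its index sits either in the USED ring (free for the master) or in the AVAIL ring (pending for the remote), which is why no lock is needed around the buffer contents.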
14. RPMsg character driver
The Linux RPMsg char driver exposes RPMsg endpoints to user-space processes.
● Supports the creation of multiple endpoints for each RPMsg device
● Each created endpoint device shows up as a single character device in /dev
● Provides multiple interfaces:
○ Control interface: allows creation/destruction of endpoint interfaces
○ Endpoint interface (one for each exposed endpoint): allows creation, destruction
and interaction with endpoints
The driver was first introduced in Linux 4.11 (sources are in the drivers/rpmsg folder of the mainline kernel). More information is available in our technical note.
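As a rough illustration of the two interfaces, the sketch below creates an endpoint through the control interface and destroys it through the endpoint interface, using the `RPMSG_CREATE_EPT_IOCTL` and `RPMSG_DESTROY_EPT_IOCTL` requests from `<linux/rpmsg.h>`. The helper names, and any device paths, endpoint names or addresses passed to them, are assumptions for the example; the actual device nodes depend on the platform.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/rpmsg.h>

/*
 * Ask the control interface (e.g. /dev/rpmsg_ctrl0, path is an assumption)
 * to create an endpoint. On success a new endpoint character device
 * appears in /dev. Returns 0 on success, -1 on failure.
 */
int rpmsg_create_endpoint(const char *ctrl_path, const char *name,
                          unsigned src, unsigned dst)
{
    assert(ctrl_path != NULL && name != NULL);

    struct rpmsg_endpoint_info info;
    memset(&info, 0, sizeof(info));
    strncpy(info.name, name, sizeof(info.name) - 1);
    info.src = src;   /* local endpoint address */
    info.dst = dst;   /* remote endpoint address */

    int fd = open(ctrl_path, O_RDWR);
    if (fd < 0)
        return -1;

    int ret = ioctl(fd, RPMSG_CREATE_EPT_IOCTL, &info);
    close(fd);
    return ret < 0 ? -1 : 0;
}

/*
 * Destroy an endpoint through its own character device
 * (e.g. /dev/rpmsg0, path is an assumption).
 * Returns 0 on success, -1 on failure.
 */
int rpmsg_destroy_endpoint(const char *ept_path)
{
    assert(ept_path != NULL);

    int fd = open(ept_path, O_RDWR);
    if (fd < 0)
        return -1;

    int ret = ioctl(fd, RPMSG_DESTROY_EPT_IOCTL);
    close(fd);
    return ret < 0 ? -1 : 0;
}
```

Once an endpoint device exists, ordinary `read()` and `write()` calls on it exchange messages with the remote core.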
16. Cohesys BSP
Board Support Package for Toradex Colibri i.MX7 SoM:
● Android 7.1.2
● U-Boot 2017.03 (from NXP) + support for .ELF files
● Linux Kernel 4.9 + RPMsg character driver backported from Kernel 4.11
This build is compatible with:
● Colibri i.MX7 eMMC SOM 1GB RAM
● Toradex Iris carrier board
● 7” capacitive parallel display from Toradex
17. Hybrid Android/FreeRTOS Demo - Goal
● FreeRTOS binary running on Cortex-M4
○ Sample IMU sensor
○ Send data upon configuration:
➢ VECTOR mode - raw acc, mag, gyro data
➢ NORM mode - norm of acc, mag, gyro vectors
● Android executable running on Cortex-A7 [i.e. “headless” mode]
○ Check inter-core communication and log received data to a text file
● Android app running on Cortex-A7 [i.e. “headful” mode]
○ Sensor data plotting
25. JNI - native to Java code
Java: cannot interact directly with the hardware
JNI: glue layer between Java and the lower layers of the OS
● Provides support for interacting with native code such as C/C++
● Maps native methods that interact directly with the hardware
● Java code declares static native methods in any class
● The main Activity loads the native libraries (.so or .dll) in which the native methods are implemented (in C) and binds them to the class where they are declared as native
27. Native IPC library (JNI)
Motivation: the Android app needs to interact with the control interface exposed by the RPMsg char driver:
● Endpoint creation requires an ioctl operation on the control interface
● ioctl operations cannot be done from Java code
● Low-level operations on RPMsg devices (e.g. creating/destroying endpoints) are handled by native C methods
[Figure: UI app Linux process (Activity, JNI wrapper, native library) on top of the Android kernel (RPMsg char driver)]
29. GUI
Plotting library used: MPAndroidChart (https://github.com/PhilJay/MPAndroidChart)
● VECTOR mode: plots the raw x, y and z components of the acc, mag and gyro vectors.
● NORM mode: plots the norms of the acc, mag and gyro vectors. There is one plot for each sensor.
● NORM mode is selected by default at application startup.
30. I/O Data Rate
Remote core:
● Sample IMUs every 10 ms (100 Hz)
● Buffer of 300 elements = 3 KB (TCM memory is only 32 KB; a bigger buffer is possible if the application is moved to DDR)
○ In NORM mode each element is 12 bytes (3 floats, 4 bytes each)
○ In VECTOR mode each element is 36 bytes
● Items are dequeued and sent to the master 10 at a time, every 100 ms
○ In NORM mode sending speed is 1.32 KB/s (with RPMsg header)
○ In VECTOR mode sending speed is 3.67 KB/s (with RPMsg header)
Master core:
● At the driver layer
○ In NORM mode receiving speed is ~0.93 KB/s (without RPMsg header)
○ In VECTOR mode receiving speed is ~3.51 KB/s (without RPMsg header)
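The sending rates can be checked with a little arithmetic. The sketch below assumes that the 10 elements of each batch travel in a single RPMsg message and that each message carries the standard 16-byte RPMsg header; the struct layouts and the helper name are hypothetical, chosen only to match the element sizes given above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical payload layouts matching the stated element sizes. */
struct norm_sample   { float acc, mag, gyro; };            /* 3 floats = 12 bytes */
struct vector_sample { float acc[3], mag[3], gyro[3]; };   /* 9 floats = 36 bytes */

#define RPMSG_HDR_SIZE 16u  /* RPMsg header: src, dst, reserved, len, flags */
#define ELEMS_PER_MSG  10u  /* elements dequeued per batch */
#define MSGS_PER_SEC   10u  /* one batch every 100 ms */

/*
 * Bytes per second for one stream, assuming one message per batch.
 * Pass hdr_size = 0 to get the payload-only rate.
 */
unsigned wire_bytes_per_sec(size_t elem_size, unsigned hdr_size)
{
    assert(elem_size > 0);
    return (unsigned)(elem_size * ELEMS_PER_MSG + hdr_size) * MSGS_PER_SEC;
}
```

With 1 KB taken as 1024 bytes, `wire_bytes_per_sec(sizeof(struct vector_sample), RPMSG_HDR_SIZE)` gives 3760 bytes/s, about 3.67 KB/s, and the payload-only VECTOR rate is 3600 bytes/s, about 3.51 KB/s. The NORM rate with header comes out at 1360 bytes/s, roughly 1.3 KB/s. The ~0.93 KB/s NORM receive figure is not reproduced by this simple model.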
33. References
● Kynetics Technical Notes: http://kynetics.com/docs
○ Android Asymmetric Multiprocessing on Toradex Colibri i.MX7D
○ RPMsg device and driver on Linux and Android
○ Android Asymmetric Multiprocessing on i.MX7: Remote Core Sensors Data Streaming in Java
● Kynetics GitHub: https://github.com/kynetics
● OpenAMP project page
● An Introduction to Asymmetric Multiprocessing: When this Architecture can be a Game Changer (ELC 2018)