RenderScript is a high-performance computation API that allows apps to run operations with automatic parallelization across CPU and GPU cores. It provides access to these features without needing to write code for different architectures or processor types. RenderScript code is compiled at runtime, so apps do not need to be recompiled for different devices. The document provides an overview of RenderScript and its system, basic usage, entry points, variables, pointers, memory management, and comparisons with OpenGL ES. It also includes examples of using RenderScript for single-image sampling, multi-image sampling, multi-pass effects, and generating a histogram.
2. About RenderScript
• RenderScript computation
– a high-performance computation API at the native level, written in C (C99 standard)
– gives your apps the ability to run operations with automatic parallelization across all available processor cores
• different types of processors, such as the CPU, GPU, or DSP
– useful for apps that do image processing, mathematical modeling, or any operations that require a lot of mathematical computation
– gives access to all of these features without having to write code for different architectures or different numbers of processing cores
– no need to recompile your application for different processor types, because RenderScript code is compiled on the device at runtime
4. RenderScript on Android 4.1
• Deprecation Notice
– Earlier versions of RenderScript included an experimental graphics engine component, which is deprecated as of Android 4.1
• this covers most of the APIs in rs_graphics.rsh and the corresponding APIs in android.renderscript
– If you have apps that render graphics with RenderScript, we highly recommend that you convert your code to another Android graphics rendering option.
• http://docs.huihoo.com/android/4.2/guide/topics/renderscript/compute.html#overview
5. RenderScript basic 1/2
1. Write a .rs file
2. Run ‘clean’; ‘ScriptC_scriptname.java’ is generated
3. ScriptC_mono.java
7. RenderScript entry points
Function (in .rs) : Comment
• void root(const uchar4 *v_in, uchar4 *v_out)
– [Default] forEach_root(in, out)
• void root(const uchar4 *v_in, uchar4 *v_out, const uchar4 *data, uint32_t x, uint32_t y)
– forEach_root(in, out)
– The data parameter can be renamed
– x and y must keep their names and their order
• void root(const uchar4 *v_in, uchar4 *v_out, uint32_t x, uint32_t y)
– forEach_root(in, out)
• void root(const uchar4 *v_in, uchar4 *v_out, const uchar4 *data)
– forEach_root(in, out)
• void root(uchar4 *v_out)
– forEach_root(out)
• uchar4 __attribute__((kernel)) functionname(uchar4 in, uint32_t x, uint32_t y)
– forEach_functionname(in, out)
• int root()
– For graphics (deprecated); called by RenderScriptGL
• void functionname()
– invoke_functionname()
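The kernel-style entry point can be pictured as a function that the runtime calls once per cell. The sketch below is plain Java with no Android classes; `ForEachSketch`, `Kernel`, and `forEach` are hypothetical names used only for illustration, and a sequential loop stands in for the parallel dispatch that the real runtime performs.

```java
final class ForEachSketch {
    /** Hypothetical stand-in for a kernel-style entry point taking (in, x, y). */
    interface Kernel {
        int apply(int in, int x, int y);
    }

    /** Calls the kernel once per cell of a row-major width x height grid. */
    static int[] forEach(Kernel kernel, int[] in, int width, int height) {
        if (in.length != width * height) {
            throw new IllegalArgumentException("input size does not match dimensions");
        }
        int[] out = new int[in.length];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int i = y * width + x;
                // The kernel sees only its own input cell and its coordinates.
                out[i] = kernel.apply(in[i], x, y);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] in = {10, 20, 30, 40, 50, 60}; // 3 x 2 grid, row-major
        int[] out = forEach((v, x, y) -> v + x + 100 * y, in, 3, 2);
        System.out.println(java.util.Arrays.toString(out));
    }
}
```

Because each output cell depends only on its own input and coordinates, the cells can be computed in any order, which is what makes this style of entry point suitable for automatic parallelization.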
8. FilterScript
• FilterScript
– Introduced in Android 4.2 (API Level 17), FilterScript defines a subset of RenderScript that focuses on image processing operations, such as those you would typically write with an OpenGL ES fragment shader.
– The current RenderScript material on the Android Developers site is mostly written in terms of FilterScript.
• http://developer.android.com/guide/topics/renderscript/compute.html
• Usage
– Inputs and return values of root functions cannot contain pointers. The default root function signature contains pointers, so you must use the __attribute__((kernel)) attribute to declare a custom root function when using FilterScript.
– Built-in types cannot exceed 32 bits.
– FilterScript must always use relaxed floating-point precision, set with the rs_fp_relaxed pragma.
• Most applications can use rs_fp_relaxed without any side effects. This may be very beneficial on some architectures due to additional optimizations only available with relaxed precision (such as SIMD CPU instructions).
– FilterScript files must end with an .fs extension instead of an .rs extension
• API level 18 (android-18) allows using .rs instead of .fs
10. Script
Nested Classes
class Script.Builder Only intended for use by generated reflected code.
class Script.FieldBase Only intended for use by generated reflected code.
class Script.FieldID FieldID is an identifier for a Script + exported field pair.
class Script.KernelID KernelID is an identifier for a Script + root function pair.
class Script.LaunchOptions Class used to specify clipping for a kernel launch.
Protected Methods
Script.FieldID createFieldID(int slot, Element e) Only to be used by generated reflected classes.
Script.KernelID createKernelID(int slot, int sig, Element ein, Element eout) Only to be used by generated reflected classes.
void forEach(int slot, Allocation ain, Allocation aout, FieldPacker v, Script.LaunchOptions sc) Only intended for use by generated reflected code.
void forEach(int slot, Allocation ain, Allocation aout, FieldPacker v) Only intended for use by generated reflected code.
void invoke(int slot) Only intended for use by generated reflected code.
void invoke(int slot, FieldPacker v) Only intended for use by generated reflected code.
Public Methods (Script.LaunchOptions)
int getXEnd() Returns the current X end
int getXStart() Returns the current X start
int getYEnd() Returns the current Y end
int getYStart() Returns the current Y start
int getZEnd() Returns the current Z end
int getZStart() Returns the current Z start
Script.LaunchOptions setX(int xstartArg, int xendArg) Set the X range.
Script.LaunchOptions setY(int ystartArg, int yendArg) Set the Y range.
Script.LaunchOptions setZ(int zstartArg, int zendArg) Set the Z range.
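To make the idea of clipping a kernel launch concrete, here is a plain-Java sketch with no Android classes. `LaunchClipSketch` and `Clip` are hypothetical names. The sketch assumes the ranges are half-open (start inclusive, end exclusive) and requires both ranges to be set explicitly, which is a simplification and not a statement about the real class.

```java
final class LaunchClipSketch {
    /** Hypothetical mirror of the X and Y ranges carried by Script.LaunchOptions. */
    static final class Clip {
        int xStart, xEnd, yStart, yEnd;

        Clip setX(int start, int end) {
            if (start < 0 || end <= start) throw new IllegalArgumentException("invalid X range");
            xStart = start;
            xEnd = end;
            return this;
        }

        Clip setY(int start, int end) {
            if (start < 0 || end <= start) throw new IllegalArgumentException("invalid Y range");
            yStart = start;
            yEnd = end;
            return this;
        }
    }

    /**
     * Inverts 8-bit values inside the clip rectangle only.
     * Assumption of this sketch: cells outside the clip are copied from the input.
     */
    static int[] invertClipped(int[] in, int width, int height, Clip clip) {
        if (in.length != width * height || clip.xEnd > width || clip.yEnd > height) {
            throw new IllegalArgumentException("clip or input does not fit the grid");
        }
        int[] out = in.clone();
        for (int y = clip.yStart; y < clip.yEnd; y++) {
            for (int x = clip.xStart; x < clip.xEnd; x++) {
                int i = y * width + x;
                out[i] = 255 - in[i];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Clip clip = new Clip().setX(1, 3).setY(0, 1);
        int[] out = invertClipped(new int[8], 4, 2, clip); // 4 x 2 grid of zeros
        System.out.println(java.util.Arrays.toString(out));
    }
}
```

The chained setters mirror the way setX, setY, and setZ return Script.LaunchOptions in the listing above.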
11. Variables
// uchar4 *gPixels;
private final static int mExportVarIdx_gPixels = 5;
private Allocation mExportVar_gPixels;
public void bind_gPixels(Allocation v) {
    mExportVar_gPixels = v;
    if (v == null) bindAllocation(null, mExportVarIdx_gPixels);
    else bindAllocation(v, mExportVarIdx_gPixels);
}
public Allocation get_gPixels() {
    return mExportVar_gPixels;
}
// rs_allocation gIn;
private final static int mExportVarIdx_gIn = 0;
private Allocation mExportVar_gIn;
public synchronized void set_gIn(Allocation v) {
    setVar(mExportVarIdx_gIn, v);
    mExportVar_gIn = v;
}
public Allocation get_gIn() {
    return mExportVar_gIn;
}
public Script.FieldID getFieldID_gIn() {
    return createFieldID(mExportVarIdx_gIn, null);
}
// float gFactor = 6;
private final static int mExportVarIdx_gFactor = 6;
private float mExportVar_gFactor;
public synchronized void set_gFactor(float v) {
    setVar(mExportVarIdx_gFactor, v);
    mExportVar_gFactor = v;
}
public float get_gFactor() {
    return mExportVar_gFactor;
}
public Script.FieldID getFieldID_gFactor() {
    return createFieldID(mExportVarIdx_gFactor, null);
}
//void root(…)
private final static int mExportForEachIdx_root = 0;
public Script.KernelID getKernelID_root() {
    return createKernelID(mExportForEachIdx_root, 3, null, null);
}
public void forEach_root(Allocation ain, Allocation aout) {
    // check ain
    if (!ain.getType().getElement().isCompatible(__U8_4)) {
        throw new RSRuntimeException("Type mismatch with U8_4!");
    }
    // check aout
    if (!aout.getType().getElement().isCompatible(__U8_4)) {
        throw new RSRuntimeException("Type mismatch with U8_4!");
    }
    // Verify dimensions
    Type tIn = ain.getType();
    Type tOut = aout.getType();
    if ((tIn.getCount() != tOut.getCount()) ||
        (tIn.getX() != tOut.getX()) ||
        (tIn.getY() != tOut.getY()) ||
        (tIn.getZ() != tOut.getZ()) ||
        (tIn.hasFaces() != tOut.hasFaces()) ||
        (tIn.hasMipmaps() != tOut.hasMipmaps())) {
        throw new RSRuntimeException("Dimension mismatch between input and output parameters!");
    }
    forEach(mExportForEachIdx_root, ain, aout, null);
}
12. Structs
/* typedef struct __attribute__((packed, aligned(4))) Point {
    float2 delta;
    float2 position;
    //uchar4 color;
} Point_t;
Point_t *point;
*/
public class ScriptField_Point extends android.renderscript.Script.FieldBase {
    static public class Item {
        public static final int sizeof = 16;
        Float2 delta;
        Float2 position;
        Item() {
            delta = new Float2();
            position = new Float2();
        }
    }
    private Item mItemArray[];
    private FieldPacker mIOBuffer;
    private static java.lang.ref.WeakReference<Element> mElementCache =
        new java.lang.ref.WeakReference<Element>(null);
    public static Element createElement(RenderScript rs) {
        Element.Builder eb = new Element.Builder(rs);
        eb.add(Element.F32_2(rs), "delta");
        eb.add(Element.F32_2(rs), "position");
        return eb.create();
    }
    public synchronized void set(Item i, int index, boolean copyNow) {
        if (mItemArray == null) mItemArray = new Item[getType().getX() /* count */];
        mItemArray[index] = i;
        if (copyNow) {
            copyToArray(i, index);
            FieldPacker fp = new FieldPacker(Item.sizeof);
            copyToArrayLocal(i, fp);
            mAllocation.setFromFieldPacker(index, fp);
        }
    }
    public synchronized Item get(int index) {
        if (mItemArray == null) return null;
        return mItemArray[index];
    }
    public synchronized void set_delta(int index, Float2 v, boolean copyNow) {
        if (mIOBuffer == null) mIOBuffer = new FieldPacker(Item.sizeof * getType().getX() /* count */);
        if (mItemArray == null) mItemArray = new Item[getType().getX() /* count */];
        if (mItemArray[index] == null) mItemArray[index] = new Item();
        mItemArray[index].delta = v;
        if (copyNow) {
            mIOBuffer.reset(index * Item.sizeof);
            mIOBuffer.addF32(v);
            FieldPacker fp = new FieldPacker(8);
            fp.addF32(v);
            mAllocation.setFromFieldPacker(index, 0, fp);
        }
    }
    public synchronized void set_position(int index, Float2 v, boolean copyNow) {
        if (mIOBuffer == null) mIOBuffer = new FieldPacker(Item.sizeof * getType().getX() /* count */);
        if (mItemArray == null) mItemArray = new Item[getType().getX() /* count */];
        if (mItemArray[index] == null) mItemArray[index] = new Item();
        mItemArray[index].position = v;
        if (copyNow) {
            mIOBuffer.reset(index * Item.sizeof + 8);
            mIOBuffer.addF32(v);
            FieldPacker fp = new FieldPacker(8);
            fp.addF32(v);
            mAllocation.setFromFieldPacker(index, 1, fp);
        }
    }
    // ...
}
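The byte offsets in the listing above (a 16-byte item, delta at offset 0, position at offset 8) can be reproduced with a plain java.nio.ByteBuffer. This is only a sketch of the layout arithmetic, not the generated code: `PointLayoutSketch` and its methods are hypothetical names, and little-endian byte order is an assumption made for the example.

```java
final class PointLayoutSketch {
    static final int SIZEOF = 16;          // two float2 fields, 8 bytes each
    static final int DELTA_OFFSET = 0;     // first field of the struct
    static final int POSITION_OFFSET = 8;  // second field, after delta

    /** Allocates room for count items (byte order is an assumption of this sketch). */
    static java.nio.ByteBuffer allocate(int count) {
        return java.nio.ByteBuffer.allocate(count * SIZEOF)
                .order(java.nio.ByteOrder.LITTLE_ENDIAN);
    }

    static void setDelta(java.nio.ByteBuffer buf, int index, float x, float y) {
        putFloat2(buf, index * SIZEOF + DELTA_OFFSET, x, y);
    }

    static void setPosition(java.nio.ByteBuffer buf, int index, float x, float y) {
        putFloat2(buf, index * SIZEOF + POSITION_OFFSET, x, y);
    }

    private static void putFloat2(java.nio.ByteBuffer buf, int offset, float x, float y) {
        buf.putFloat(offset, x);      // absolute put: does not move the buffer position
        buf.putFloat(offset + 4, y);
    }

    /** Returns {delta.x, delta.y, position.x, position.y} for one item. */
    static float[] read(java.nio.ByteBuffer buf, int index) {
        int base = index * SIZEOF;
        return new float[] {
            buf.getFloat(base), buf.getFloat(base + 4),
            buf.getFloat(base + 8), buf.getFloat(base + 12)
        };
    }

    public static void main(String[] args) {
        java.nio.ByteBuffer buf = allocate(2);
        setDelta(buf, 1, 1.5f, -2f);
        setPosition(buf, 1, 3f, 4f);
        System.out.println(java.util.Arrays.toString(read(buf, 1)));
    }
}
```

Updating a single field touches only 8 of the item's 16 bytes, which matches the `new FieldPacker(8)` used by set_delta and set_position.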
13. Pointers
/* typedef struct Point {
    float2 position;
    float size;
} Point_t;
Point_t *touchPoints;
int32_t *intPointer;
*/
private ScriptField_Point mExportVar_touchPoints;
public void bind_touchPoints(ScriptField_Point v) {
    mExportVar_touchPoints = v;
    if (v == null) bindAllocation(null, mExportVarIdx_touchPoints);
    else bindAllocation(v.getAllocation(), mExportVarIdx_touchPoints);
}
public ScriptField_Point get_touchPoints() {
    return mExportVar_touchPoints;
}
private Allocation mExportVar_intPointer;
public void bind_intPointer(Allocation v) {
    mExportVar_intPointer = v;
    if (v == null) bindAllocation(null, mExportVarIdx_intPointer);
    else bindAllocation(v, mExportVarIdx_intPointer);
}
public Allocation get_intPointer() {
    return mExportVar_intPointer;
}
14. Memory
Android Object Type : Description
• Element
– An element describes one cell of a memory allocation and can have two forms: basic or complex.
– Basic, for example: a single float value, a 4-element float vector, a single RGB-565 color, a single unsigned 16-bit integer
– Complex: structs
• Type
– A type is a memory allocation template and consists of an element and one or more dimensions. It describes the layout of the memory (basically an array of Elements) but does not allocate the memory for the data that it describes.
• Allocation
– An allocation provides the memory for applications based on a description of the memory that is represented by a Type. Allocated memory can exist in many memory spaces concurrently. If memory is modified in one space, you must explicitly synchronize the memory so that it is updated in all the other spaces in which it exists.
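The relationship between the three objects can be pictured with a small plain-Java model. None of these classes are the Android ones: `CellElement`, `GridType`, and `GridAllocation` are hypothetical names chosen to avoid confusion with the real API, and the model covers only the "template versus memory" distinction, not memory spaces or synchronization.

```java
final class MemoryModelSketch {
    /** Describes one cell: here just a label and its size in bytes. */
    static final class CellElement {
        final String name;
        final int bytes;

        CellElement(String name, int bytes) {
            this.name = name;
            this.bytes = bytes;
        }
    }

    /** A template: an element plus dimensions. It holds no data. */
    static final class GridType {
        final CellElement element;
        final int x, y;

        GridType(CellElement element, int x, int y) {
            if (x < 1 || y < 1) throw new IllegalArgumentException("dimensions must be >= 1");
            this.element = element;
            this.x = x;
            this.y = y;
        }

        int cellCount() { return x * y; }

        int byteSize() { return cellCount() * element.bytes; }
    }

    /** The actual memory, created from a template. */
    static final class GridAllocation {
        final GridType type;
        final byte[] data;

        GridAllocation(GridType type) {
            this.type = type;
            this.data = new byte[type.byteSize()]; // memory is reserved only here
        }
    }

    public static void main(String[] args) {
        CellElement rgba = new CellElement("U8_4", 4);
        GridType type = new GridType(rgba, 3, 2);
        GridAllocation alloc = new GridAllocation(type);
        System.out.println(type.cellCount() + " cells, " + alloc.data.length + " bytes");
    }
}
```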
15. Allocation
• Lifecycle
– Allocations should be treated as immutable once created
– To change one, create a new allocation and copy the contents
• Memory usages
– USAGE_SCRIPT: Allocates in the script memory space. This is the default memory space if you do not specify a memory space.
– USAGE_GRAPHICS_TEXTURE: Allocates in the texture memory space of the GPU.
• This was deprecated in API level 16.
– USAGE_GRAPHICS_VERTEX: Allocates in the vertex memory space of the GPU.
• This was deprecated in API level 16.
– USAGE_GRAPHICS_CONSTANTS: Allocates in the constants memory space of the GPU that is used by the various program objects.
• This was deprecated in API level 16.
– USAGE_SHARED
• Memory management
– synchronized void destroy()
• Frees any native resources associated with this object. The primary use is to force immediate cleanup of resources when it is believed the GC will not respond quickly enough.
– synchronized void resize(int dimX)
• This method was deprecated in API level 18. RenderScript objects should be immutable once created. The replacement is to create a new allocation and copy the contents.
16. RenderScript vs OpenGL ES
• Pros
– No swizzling needed (ARGB -> BGRA); in OpenGL ES this is a cause of low performance
– No Y flipping of bitmaps needed: RenderScript follows the Android Bitmap order (flipping is another cause of low performance)
– Uses the bitmap's real pixel array index instead of texture coordinates
– GLSL is based on textures and float texture coordinates, so sampling an exact pixel value from a texture is hard
– Android standard
– No lifecycle management needed
– Results can be rendered together with normal views (RenderScript itself has no drawing routine)
– Parallel processing can be shared between CPU and GPU (if the GPU doesn't support parallel processing, the CPU is used)
– Higher-level, C-like language compared with GLSL
• Cons
– No vertex shader
• Work a vertex shader would do must be done per pixel, which adds overhead
• RSSurfaceView can use a vertex shader (but the API is now deprecated)
– Few RenderScript developers
• Documentation and learning materials are scarce
18. RenderScript sample: single image sampling
• Single-image sampling (default)
– *v_in and *v_out point to the current pixel of the input and output images
– Uses forEach_root()
– rsUnpackColor8888 unpacks uchar4 -> float4
– rsPackColorTo8888 packs float4 -> uchar4
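The unpack, compute, pack pattern for one pixel can be sketched in plain Java. This is not the RenderScript implementation: `MonoSketch`, `unpack`, and `pack` are hypothetical stand-ins for the uchar4 <-> float4 conversions, the rounding rule is an assumption, and the grayscale weights are a common luminance approximation chosen for the example.

```java
final class MonoSketch {
    /** Converts four 0..255 channel values to 0..1 floats (uchar4 -> float4 analogue). */
    static float[] unpack(int[] rgba) {
        float[] out = new float[4];
        for (int i = 0; i < 4; i++) {
            out[i] = rgba[i] / 255f;
        }
        return out;
    }

    /** Converts four floats back to 0..255, clamping to [0, 1] and rounding to nearest. */
    static int[] pack(float[] color) {
        int[] out = new int[4];
        for (int i = 0; i < 4; i++) {
            float v = Math.max(0f, Math.min(1f, color[i]));
            out[i] = (int) (v * 255f + 0.5f);
        }
        return out;
    }

    /** Per-pixel grayscale: the kind of work a single-image root function would do. */
    static int[] mono(int[] rgba) {
        float[] f = unpack(rgba);
        // Assumed luminance weights for the example.
        float gray = 0.299f * f[0] + 0.587f * f[1] + 0.114f * f[2];
        return pack(new float[] {gray, gray, gray, f[3]});
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(mono(new int[] {100, 150, 200, 255})));
    }
}
```

Working in floats between the two conversions keeps the arithmetic simple and leaves clamping to a single place, the pack step.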
19. RenderScript sample: multi image sampling
• Multi-image sampling
– Samples not only the pixel given through *v_in and *v_out, but also other pixels through an additionally bound uchar4* pointer
– Typically used for filters that reference neighboring pixels
– Uses invoke_filter() instead of forEach_root()
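A filter that references neighboring pixels needs random access to the whole image, which is what the bound pointer provides. The plain-Java sketch below shows the indexing involved with a 3x3 box average. `NeighborSketch` is a hypothetical name, the image is reduced to one channel for brevity, and clamping at the edges is just one possible border policy.

```java
final class NeighborSketch {
    private static int clamp(int v, int lo, int hi) {
        return Math.max(lo, Math.min(hi, v));
    }

    /** Averages the 3x3 neighborhood of (x, y), clamping coordinates at the borders. */
    static int boxAt(int[] pixels, int width, int height, int x, int y) {
        int sum = 0;
        for (int dy = -1; dy <= 1; dy++) {
            for (int dx = -1; dx <= 1; dx++) {
                int sx = clamp(x + dx, 0, width - 1);
                int sy = clamp(y + dy, 0, height - 1);
                // Index the pixel array directly, as a bound pointer would allow.
                sum += pixels[sy * width + sx];
            }
        }
        return sum / 9;
    }

    /** Applies boxAt to every pixel of a row-major image. */
    static int[] filter(int[] pixels, int width, int height) {
        if (pixels.length != width * height) {
            throw new IllegalArgumentException("pixel count does not match dimensions");
        }
        int[] out = new int[pixels.length];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                out[y * width + x] = boxAt(pixels, width, height, x, y);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] img = {0, 0, 0, 0, 90, 0, 0, 0, 0}; // 3 x 3 with a bright center
        System.out.println(java.util.Arrays.toString(filter(img, 3, 3)));
    }
}
```

The output is written to a separate array so that every read sees the original, unfiltered neighbors.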
20. RenderScript sample: multi pass effect
• Multi-pass
– Runs RenderScript’s root function n times, then calls getBitmap() to read the result from an allocation.
– Between pass n-1 and pass n, data is handed over through allocations, not by copying bitmaps.
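The hand-over between passes can be sketched as two buffers that swap roles, so the output of one pass becomes the input of the next without an intermediate copy. This is a plain-Java illustration: `MultiPassSketch` and `run` are hypothetical names, the int arrays stand in for allocations, and a simple per-element function stands in for the root function.

```java
final class MultiPassSketch {
    /** Applies the kernel to every element, 'passes' times, swapping two buffers between passes. */
    static int[] run(int[] source, int passes, java.util.function.IntUnaryOperator kernel) {
        if (passes < 0) {
            throw new IllegalArgumentException("passes must be >= 0");
        }
        int[] front = source.clone();        // input of the current pass
        int[] back = new int[source.length]; // output of the current pass
        for (int p = 0; p < passes; p++) {
            for (int i = 0; i < front.length; i++) {
                back[i] = kernel.applyAsInt(front[i]);
            }
            // Swap roles instead of copying data back.
            int[] tmp = front;
            front = back;
            back = tmp;
        }
        return front; // read out once, after the last pass
    }

    public static void main(String[] args) {
        int[] result = run(new int[] {1, 2, 3}, 3, v -> v * 2);
        System.out.println(java.util.Arrays.toString(result));
    }
}
```

Only the final result is read out, which corresponds to converting to a bitmap once at the end instead of after every pass.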