A Quick Taste of Tiramisu
Halide Study Group @ Fixstars
2018/07/28 (Sat)
@Vengineer
Logo: https://github.com/Tiramisu-Compiler/tiramisu/
Blog (since 2007): Vengineerの戯言
 http://blogs.yahoo.co.jp/verification_engineer
SlideShare :
 https://www.slideshare.net/ssuser479fa3
Twitter (since 2009):
@Vengineer
Source code analysis craftsman
Tiramisu: A Code Optimization Framework for
High Performance Systems
https://www.csail.mit.edu/research/tiramisu-framework-code-optimization-and-code-generation
MIT CSAIL
Tiramisu Compiler
・A framework for code optimization and code generation
・Code optimization (loop optimization)
  => can be embedded in your own compiler
  => loop tiling, loop fusion/distribution, loop splitting,
    loop interchange, loop shifting, loop unrolling,
    loop parallelization, loop vectorization,
    storage reordering, modulo storage
・Code generation
  => multicore CPUs (LLVM), GPUs (CUDA),
    distributed systems (MPI), FPGAs (Xilinx Vivado HLS)
https://github.com/Tiramisu-Compiler/tiramisu#tiramisu-compiler
Tiramisu uses Halide and ISL
・Halide
https://github.com/halide/Halide
・ISL (Integer Set Library)
http://isl.gforge.inria.fr/
Facebook Research : Tensor Comprehensions
https://github.com/facebookresearch/TensorComprehensions
Tensor Comprehensions (TC) is a fully-functional C++ library to automatically
synthesize high-performance machine learning kernels
using Halide, ISL and NVRTC or LLVM.
Differences between Halide and Tiramisu
Halide supports only rectangular regions
Tiramisu also works with non-rectangular regions!
Why?
 Because it uses a polyhedral representation!
Four challenges
1). MPI+OpenMP+CUDA+HLS
2). Memory dependences
3). Optimization and efficient code generation
4). representation
4). representation
The challenge of representation is
addressed by using a unified framework based on
polyhedral sets to represent the four layers.
"polyhedral sets"
I don't really understand these,
so could someone please explain them to me?
Let's take a closer look
at the paper
https://arxiv.org/pdf/1804.10694.pdf
Writing the code by hand
 ・Layer I : Abstract Algorithm
 ・Layer II : Computation Management
 ・Layer III : Data Management
 ・Layer IV : Communication Management
 ・Code generation: Abstract Syntax Tree
https://arxiv.org/pdf/1804.10694.pdf
The first layer defines abstract computations,
which are not yet scheduled or mapped to memory.
Each computation represents an expression to compute.
https://arxiv.org/pdf/1804.10694.pdf
Layer I : Abstract Algorithm
{b1(i, j, c) : 0 ≤ i < N ∧ 0 ≤ j < M ∧ 0 ≤ c < 3}
The iteration domain is the set of tuples b1(i, j, c) such that
0 ≤ i < N ∧ 0 ≤ j < M ∧ 0 ≤ c < 3
https://arxiv.org/pdf/1804.10694.pdf
Iteration domain
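As a rough illustration (plain C++, not Tiramisu code), the iteration domain above can be read as the set of points visited by a three-deep loop nest. The helper name `iteration_domain_b1` and the tuple layout are assumptions made only for this sketch.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical helper: enumerate
// {b1(i, j, c) : 0 <= i < N and 0 <= j < M and 0 <= c < 3}
// as a list of (i, j, c) tuples.
std::vector<std::array<int, 3>> iteration_domain_b1(int N, int M)
{
    std::vector<std::array<int, 3>> points;
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < M; ++j)
            for (int c = 0; c < 3; ++c)
                points.push_back(std::array<int, 3>{i, j, c});
    return points;
}
```

Calling `iteration_domain_b1(2, 3)` yields 2 x 3 x 3 = 18 tuples. Layer I fixes only this set of points and the expression attached to them; the order in which the points run is decided later.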
Code generation in Tiramisu proceeds in the following two steps:
1) time-space mapping
   This mapping is done by applying an affine relation
2) adding new statements
Commands are added in Layers II, III, and IV
https://arxiv.org/pdf/1804.10694.pdf
Steps toward code generation
Affine transformations including loop tiling, skewing, loop fusion, distribution,
splitting, reordering, and many others can be expressed as an affine map that maps
computations from Layer I into the time-space domain in Layer II.
We call this map a time-space map.
Transform the Layer I iteration domain into the time-space domain
https://arxiv.org/pdf/1804.10694.pdf
Time-space Maps
Layer I: iteration domain
{C(i, j) : 0 ≤ i < N ∧ 0 ≤ j < N} : A(i, j) + B(i, j)
 Time-space mapping: tile into 16 x 16 tiles!
{C(i, j) → C(i1, j1, i2, j2) : i1 = floor(i/16) ∧ i2 = i%16 ∧
j1 = floor(j/16) ∧ j2 = j%16 ∧ 0 ≤ i < N ∧ 0 ≤ j < N}
Layer II: time-space domain
{C(i1, j1, i2, j2) : i1 = floor(i/16) ∧ i2 = i%16 ∧ j1 = floor(j/16) ∧ j2 = j%16 ∧
0 ≤ i < N ∧ 0 ≤ j < N} :
A(i1 ∗ 16 + i2, j1 ∗ 16 + j2) + B(i1 ∗ 16 + i2, j1 ∗ 16 + j2)
https://arxiv.org/pdf/1804.10694.pdf
Example: Time-space Maps
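The 16 x 16 tiling map above can be sketched in plain C++ as a pair of functions: one that maps a Layer I point (i, j) to the Layer II point (i1, j1, i2, j2), and one that recovers (i, j) the way the Layer II expression indexes A and B. The names `tile16` and `untile16` are assumptions for this sketch, not part of Tiramisu.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper: the map C(i, j) -> C(i1, j1, i2, j2) with 16 x 16 tiles.
// Assumes i >= 0 and j >= 0, so integer division agrees with floor().
std::array<int, 4> tile16(int i, int j)
{
    return {i / 16, j / 16, i % 16, j % 16};
}

// Hypothetical helper: the index expressions used on the Layer II side,
// (i1, j1, i2, j2) -> (i1 * 16 + i2, j1 * 16 + j2).
std::array<int, 2> untile16(const std::array<int, 4> &t)
{
    return {t[0] * 16 + t[2], t[1] * 16 + t[3]};
}
```

For example, `tile16(35, 7)` gives (2, 0, 3, 7), and `untile16` of that tuple gives back (35, 7), which is why A(i1 ∗ 16 + i2, j1 ∗ 16 + j2) reads the same element as A(i, j).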
Time dimensions: => When
Specify the order of execution (relative to other computations)
Space dimensions: => Where
Specify the processor on which each computation executes
Time-space domain (Time-space Maps)
https://arxiv.org/pdf/1804.10694.pdf
Layer II: Computation Management
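Space dimensions
Specify the processor on which each computation executes.
cpu: run on a CPU in a shared-memory system
node: run on a node in a distributed system
gpu_thread_X: run on the X dimension of the GPU threads
gpu_block_X: run on the X dimension of the GPU blocks
vec(s): vectorize (s is the vector width)
unroll: unroll the loop
pipeline: pipeline the loop (FPGA only)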
https://arxiv.org/pdf/1804.10694.pdf
Layer II: Computation Management
Data Management specifies the memory locations where computed results are stored
allocation/deallocation statements
a set of access relations, which map a computation from Layer
II to array elements read or written by that computation.
https://arxiv.org/pdf/1804.10694.pdf
Layer III: Data Management
Communication commands (including synchronization) are added and scheduled
The allocation and deallocation operations added in Layer III are
scheduled in Layer IV
https://arxiv.org/pdf/1804.10694.pdf
Layer IV: Communication Management
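Let's trace through the test code!
[Figure: build flow with the labels test_XX, xxx.o, wrapper_test_XX.o, and wrapper_test_XX, joined by "+"]
In this file, you write the Tiramisu code
and generate the object file (xxx.o).
Like Halide, Tiramisu generates an object file,
which is then linked into the program that uses it.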
Roughly speaking, Tiramisu looks like this
[Figure: a function made up of several computations, with buffers for input and output]
int main(int, char **)
{
    Halide::Buffer<uint8_t> reference_buf(NN, MM);
    init_buffer(reference_buf, (uint8_t)7);
    Halide::Buffer<uint8_t> output_buf(NN, MM);
    init_buffer(output_buf, (uint8_t)13);
    assign_7_to_10x10_2D_array_with_tiling_parallelism(
        output_buf.raw_buffer());
    compare_buffers("assign_7_to_10x10_2D_array_with_tiling_parallelism",
                    output_buf, reference_buf);
    return 0;
}
Test code (tests/wrapper_test_01.cpp)
https://github.com/rbaghdadi/tiramisu/blob/master/tests/wrapper_test_01.cpp
In this example, there is only a result
Generate the object file
int main(int argc, char **argv)
{
    generate_function_1(
        "assign_7_to_10x10_2D_array_with_tiling_parallelism",
        10, 3, 4);
    return 0;
}
Test code (tests/developers/test_01.cpp)
https://github.com/rbaghdadi/tiramisu/blob/master/tests/test_01.cpp#L41
void generate_function_1(std::string name, int size, int val0, int val1)
{
    tiramisu::global::set_default_tiramisu_options();
    tiramisu::function function0(name);
    tiramisu::constant N("N", tiramisu::expr((int32_t) size), p_int32, true,
                         NULL, 0, &function0);
Test code (tests/test_01.cpp)
https://github.com/rbaghdadi/tiramisu/blob/master/tests/test_01.cpp#L16
static void set_default_tiramisu_options()
{
    global::loop_iterator_type = p_int32;
    set_auto_data_mapping(true);
    // GPU: path to NVIDIA NVCC
    auto location = std::getenv(NVCC_BIN_DIR_ENV_VAR);
    if (location)
        nvcc_bin_dir = location;
}
global::set_default_tiramisu_options method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/expr.h#L93
A class to represent functions in Tiramisu.
A function in Tiramisu is composed of a set of computations (tiramisu::computation).
Example:
std::string name("sample");
tiramisu::function function0(name);
function class
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L97
A set of computations!
A class that represents loop invariants.
An object of the invariant class can be an expression,
a symbolic constant
or a variable that is invariant to all the loops of the function.
Example:
tiramisu::constant N("N", tiramisu::expr((int32_t) size),
                     p_int32, true, NULL, 0, &function0);
constant class
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L3667
    tiramisu::var i("i"), j("j"), i0("i0"), j0("j0"), i1("i1"), j1("j1");
    tiramisu::expr e1 = tiramisu::expr(tiramisu::o_add,
                                       tiramisu::expr((uint8_t) val0),
                                       tiramisu::expr((uint8_t) val1));
    tiramisu::computation S0("[N]->{S0[i,j]: 0<=i<N and 0<=j<N}",
                             e1, true, p_uint8, &function0);
    tiramisu::buffer buf0("buf0", {size, size}, tiramisu::p_uint8,
                          a_output, &function0);
Test code (tests/test_01.cpp)
https://github.com/rbaghdadi/tiramisu/blob/master/tests/test_01.cpp#L16
A class that represents constant variable references
Example:
tiramisu::var i("i"), j("j"), i0("i0"), j0("j0"), i1("i1"), j1("j1");
var class
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/expr.h#L1641
A class to represent tiramisu expressions.
Example:
tiramisu::expr e1 = tiramisu::expr(tiramisu::o_add,
                                   tiramisu::expr((uint8_t) val0),
                                   tiramisu::expr((uint8_t) val1));
expr class
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/expr.h#L128
A class that represents computations.
A computation is an expression associated with an iteration domain.
A computation indicates what needs to be computed
(the expression that should be computed).
A computation has three representations:
Level I
Level II
Level III
(The latest paper calls these Layer I/II/III/IV.
Layer IV is Communication Management.)
computation class
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L1225
Example:
tiramisu::var i = tiramisu::var("i");
tiramisu::computation input("[N]->{input[i]}",
                            tiramisu::expr(), false,
                            p_uint8, &function0);
tiramisu::computation result("[N]->{result[0]}",
                             tiramisu::expr(input(0)), true,
                             p_uint8, &function0);
result.add_definitions("[N]->{result[i]: 1<=i<N}",
                       (result(i - 1) + input(i)), true,
                       p_uint8, &function0);
computation class
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L1225
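One way to read the two definitions of `result` above is as a running sum: the first definition covers the point 0, and the added definition covers 1 <= i < N in terms of the previous point. The plain C++ sketch below mirrors that reading. The function name `running_sum` and the choice to keep one stored element per point are assumptions of this sketch, since in Tiramisu the storage is decided separately through the access relation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical plain-C++ reading of the two definitions of `result`.
std::vector<std::uint8_t> running_sum(const std::vector<std::uint8_t> &input)
{
    std::vector<std::uint8_t> result(input.size());
    if (input.empty())
        return result;
    result[0] = input[0];                       // {result[0]} : input(0)
    for (std::size_t i = 1; i < input.size(); ++i)
        // {result[i]: 1<=i<N} : result(i - 1) + input(i), wrapping as uint8_t
        result[i] = static_cast<std::uint8_t>(result[i - 1] + input[i]);
    return result;
}
```

With the input {1, 2, 3, 4} this produces {1, 3, 6, 10}.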
A class that represents buffers.
Buffers have two use cases:
- used to store the results of computations, and
- used to represent input arguments to functions.
Example: input buffer
tiramisu::buffer input_buffer("input_buffer", {size},
                              tiramisu::p_uint8, a_input, &function0);
Example: buffer for the result
tiramisu::buffer result_scalar("result_scalar", {1},
                               tiramisu::p_uint8, a_output, &function0);
buffer class
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L957
S0.set_access("{S0[i,j]->buf0[i,j]}");
S0.tile(i, j, 2, 2, i0, j0, i1, j1);
S0.tag_parallel_level(i0);
Test code (tests/test_01.cpp)
https://github.com/rbaghdadi/tiramisu/blob/master/tests/test_01.cpp#L16
void set_access(std::string access_str);
void set_access(isl_map *access);
Set the access relation of the computation.
The access relation is a relation from computations to buffer
locations. access_str is a string that represents the relation.
It is encoded in the ISL format,
(http://isl.gforge.inria.fr/user.html#Sets-and-Relations)
Example:
S0.set_access("{S0[i,j]->buf0[i,j]}");
computation::set_access method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L3130
void tile(tiramisu::var L0, tiramisu::var L1, int sizeX, int sizeY,
tiramisu::var L0_outer, tiramisu::var L1_outer,
tiramisu::var L0_inner, tiramisu::var L1_inner );
Tile the two loop levels L0 and L1 with rectangular tiling.
sizeX and sizeY represent the tile size.
L0 and L1 should be two consecutive loop levels.
L0_outer, L1_outer, L0_inner, L1_inner are the names
of the new dimensions created after tiling.
Example:
S0.tile(i, j, 2, 2, i0, j0, i1, j1);
computation::tile method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L3424
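To picture what `S0.tile(i, j, 2, 2, i0, j0, i1, j1)` does to the loop nest, here is a hand-written plain C++ sketch of a 2 x 2 tiled traversal that writes a constant into an N x N array. It is an illustration under stated assumptions, not the code Tiramisu generates: the function name `fill_tiled`, the row-major layout, and the bounds guard are choices made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical picture of the loop nest after rectangular tiling:
// (i0, j0) select a tile, (i1, j1) select a point inside that tile.
// Returns a row-major N x N array in which every element is set to `value`.
std::vector<std::uint8_t> fill_tiled(int N, std::uint8_t value, int tile = 2)
{
    std::vector<std::uint8_t> buf(static_cast<std::size_t>(N) * N, 0);
    const int tiles = (N + tile - 1) / tile;   // tiles per dimension, rounded up
    for (int i0 = 0; i0 < tiles; ++i0)         // outer tile loop over rows
        for (int j0 = 0; j0 < tiles; ++j0)     // outer tile loop over columns
            for (int i1 = 0; i1 < tile; ++i1)
                for (int j1 = 0; j1 < tile; ++j1) {
                    const int i = i0 * tile + i1;
                    const int j = j0 * tile + j1;
                    if (i < N && j < N)        // guard when N is not a multiple of tile
                        buf[static_cast<std::size_t>(i) * N + j] = value;
                }
    return buf;
}
```

`fill_tiled(10, 7)` fills all 100 elements with 7, matching the 10 x 10 size and the 3 + 4 expression used in the test. The tiling changes only the order of the writes, not the set of elements written.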
void tag_parallel_level(tiramisu::var L);
void tag_parallel_level(int L);
Tag the loop level L to be parallelized.
Example:
S0.tag_parallel_level(i0);
computation::tag_parallel_level method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L3424
    // Set the argument (buf0)
    function0.set_arguments({&buf0});
    // iteration domain => time-space domain
    function0.gen_time_space_domain();
    // Generate the ISL abstract syntax tree
    function0.gen_isl_ast();
    // Generate the Halide statement
    function0.gen_halide_stmt();
    // Generate the object file
    function0.gen_halide_obj("build/generated_fct_test_01.o");
}
Test code (tests/test_01.cpp)
https://github.com/rbaghdadi/tiramisu/blob/master/tests/test_01.cpp#L16
void set_arguments(const std::vector<tiramisu::buffer *> &buffer_vec );
Set the arguments of the function.
The arguments of the function are provided as a vector of
pointers to buffers. Each buffer represents an argument
to the function.
During code generation, the arguments in the vector will
become the arguments of the generated function
(with the order of their appearance in the vector).
function::set_arguments method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L918
void gen_time_space_domain();
Generate the time-space domain of the function.
In this representation, the logical time of execution
and the processor where the computation
will be executed are both specified.
function::gen_time_space_domain method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L910
void gen_isl_ast();
Generate an isl AST that represents the function.
function::gen_isl_ast method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L905
void gen_halide_stmt();
Generate a Halide stmt that represents the function.
function::gen_halide_stmt method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L897
void gen_halide_obj(const std::string &obj_file_name,
Halide::Target::OS os,
Halide::Target::Arch arch, int bits ) const;
Generate an object file that contains the compiled function.
This function relies on Halide to generate the object file.
obj_file_name : the name of the generated file.
os : the target operating system (Halide::Target::OS).
arch : the architecture of the target (the instruction set).
bits : the bit-width of the target machine.
(must be 0 for unknown, or 32 or 64 )
function::gen_halide_obj method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L897
void tiramisu::function::codegen(
    const std::vector<tiramisu::buffer *> &buffer_vec,
    const std::string obj_filename) {
    this->set_arguments(buffer_vec);
    this->lift_dist_comps(); // <= only takes effect for MPI/CUDA
    this->gen_time_space_domain();
    this->gen_isl_ast();
    this->gen_halide_stmt();
    this->gen_halide_obj(obj_filename);
}
All-in-one code generation: function::codegen
https://github.com/Tiramisu-Compiler/tiramisu/blob/master/src/tiramisu_core.cpp#L8508
Actually, everything up to this point has been the
Low Level Tiramisu API
Tiramisu expressions:
what are they?
// C++ code with a Tiramisu expression.
#include "tiramisu.h"
void foo(int N, int array_a[N], int array_b[N], int array_c[N])
{
tiramisu::init();
// Declare an iterator and inputs
tiramisu::iter i, j;
tiramisu::in A(i,j), B(i,j);
Tiramisu expressions (README.md)
https://github.com/Tiramisu-Compiler/tiramisu/blob/master/README.md#example
// Declare the Tiramisu expression (algorithm)
tiramisu::comp C(i,j) = A(i,j) + B(i,j);
// Specify optimizations
C.parallelize(i).vectorize(j, 4);
// Realize, compile and run the expression
C.realize(tiramisu::int32_t, {N});
C.compile({(A, array_a), (B, array_b), (C, array_c)});
C.run();
}
Tiramisu expressions (README.md)
https://github.com/Tiramisu-Compiler/tiramisu/blob/master/README.md#example
Ah, this looks rather like Halide.
But the implementation
does not exist yet.
https://arxiv.org/pdf/1804.10694.pdf
Without a DSL compiler, you write the code by hand.
Wait, could that perhaps mean
Tiramisu expressions?
Thank you very much
Bonus
Layer I
virtual void add_definitions(std::string iteration_domain_str, tiramisu::expr e,
bool schedule_this_computation, tiramisu::primitive_t t,
tiramisu::function *fct );
Add definitions of computations that have the same name as this
computation.
The arguments of this function are identical to the arguments of
the computation constructor.
In general, this function is used to express reductions
and to express computation updates.
computation::add_definitions method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2541
Example:
// [N]->{C[0,i]: 0<=i<N} : 10
tiramisu::computation C("[N]->{C[0,i]: 0<=i<N}",
                        tiramisu::expr((uint8_t) 10), true,
                        p_uint8, &function0);
// [N]->{C[1,i]: 0<=i<N} : C(0, i) + 10
C.add_definitions("[N]->{C[1,i]: 0<=i<N}",
                  C(0, i) + tiramisu::expr((uint8_t) 10), true,
                  p_uint8, &function0);
computation::add_definitions method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2541
tiramisu::computation& get_update(int index);
Returns the update with the given index that has been added to this computation, such that:
- If index == 0, this computation itself is returned.
- If index > 0, the index-th computation added
through add_definitions is returned.
computation::get_update method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L3065
Example:
tiramisu::computation result("[N]->{result[0]}",
    tiramisu::expr(input(0)), true, p_uint8, &function0);
result.add_definitions("[N]->{result[i]: 1<=i<N}",
    (result(i - 1) + input(i)), true, p_uint8, &function0);
// result.get_update(1) is the definition added above (result[i], 1<=i<N)
// Run result[0] first, then result[i]
result.get_update(1).after(result, computation::root);
computation::get_update method
https://github.com/rbaghdadi/tiramisu/blob/master/tutorials/tutorial_06.cpp
void tiramisu::computation::set_expression(const tiramisu::expr &e );
Set the expression of the computation.
Example:
computation c_C("[N]->{c_C[i,j,0]: 0<=i<N and 0<=j<N}",
                expr((uint8_t) 0), true, p_uint8, &matmul);
c_C.add_definitions("[N]->{c_C[i,j,k]: 0<=i<N and 0<=j<N and 0<=k<N}",
                    expr(), true, p_uint8, &matmul);
expr e1 = c_C(i, j, k - 1) + c_A(i, k) * c_B(k, j);
c_C.get_update(1).set_expression(e1);
computation::set_expression method
https://github.com/rbaghdadi/tiramisu/blob/master/src/tiramisu_core.cpp#L7470
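The `c_C` example above expresses matrix multiplication as a reduction over k: an initial definition that is 0, plus an update that adds c_A(i, k) * c_B(k, j) to the previous value. A plain C++ sketch of the same recurrence follows. It assumes that every k for a given (i, j) is stored in the same element C[i][j], and it uses `int` rather than `uint8_t` to keep the arithmetic easy to check. The names `Matrix` and `matmul_reduction` are made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;  // hypothetical square-matrix type

// Hypothetical plain-C++ reading of the c_C recurrence for N x N matrices.
Matrix matmul_reduction(const Matrix &A, const Matrix &B)
{
    const std::size_t N = A.size();
    Matrix C(N, std::vector<int>(N, 0));       // initial definition: 0
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            for (std::size_t k = 0; k < N; ++k)
                // update: previous value of C(i, j) plus A(i, k) * B(k, j)
                C[i][j] = C[i][j] + A[i][k] * B[k][j];
    return C;
}
```

For A = {{1, 2}, {3, 4}} and B = {{5, 6}, {7, 8}} the result is {{19, 22}, {43, 50}}.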
Layer II
void after(computation &comp, tiramisu::var iterator);
Schedule this computation to run after the computation comp.
This computation is placed after comp at the loop level given by iterator.
iterator is a loop level of this computation.
computation::after method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2598
Example:
{S0[i,j]: 0<=i<N and 0<=j<N} and {S1[i,j]: 0<=i<N and 0<=j<N}
S1.after(S0, i)
for (i=0; i<N; i++)
{
    for (j=0; j<N; j++)
        S0;
    for (j=0; j<N; j++)
        S1;
}
computation::after method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2598
Example:
{S0[i,j]: 0<=i<N and 0<=j<N} and {S1[i,j]: 0<=i<N and 0<=j<N}
S1.after(S0, j)
for (i=0; i<N; i++)
    for (j=0; j<N; j++)
    {
        S0;
        S1;
    }
computation::after method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2598
Example:
{S0[i,j]: 0<=i<N and 0<=j<N} and {S1[i,j]: 0<=i<N and 0<=j<N}
S1.after(S0, computation::root)
for (i=0; i<N; i++)
    for (j=0; j<N; j++)
        S0;
for (i=0; i<N; i++)
    for (j=0; j<N; j++)
        S1;
computation::after method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2598
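The three `after` examples differ only in how many loops S0 and S1 share. The plain C++ sketch below records the order in which the statement instances would run in each case, so the three schedules can be compared side by side. The enum `Level` and the function `schedule_trace` are hypothetical stand-ins written for this sketch and are not Tiramisu APIs.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the second argument of S1.after(S0, ...).
enum class Level { Root, I, J };

// Returns the execution order of S0 and S1 instances on an N x N domain.
std::string schedule_trace(int N, Level level)
{
    std::string trace;
    auto run = [&trace](const char *name, int i, int j) {
        trace += std::string(name) + "(" + std::to_string(i) + "," +
                 std::to_string(j) + ") ";
    };
    if (level == Level::Root) {            // S1.after(S0, computation::root)
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j)
                run("S0", i, j);
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j)
                run("S1", i, j);
    } else if (level == Level::I) {        // S1.after(S0, i): shared i loop
        for (int i = 0; i < N; ++i) {
            for (int j = 0; j < N; ++j)
                run("S0", i, j);
            for (int j = 0; j < N; ++j)
                run("S1", i, j);
        }
    } else {                               // S1.after(S0, j): shared i and j loops
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j) {
                run("S0", i, j);
                run("S1", i, j);
            }
    }
    return trace;
}
```

With N = 2, `Level::J` starts with `S0(0,0) S1(0,0) S0(0,1) S1(0,1)`, while `Level::Root` runs all four S0 instances before the first S1.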
Example:
{S0[i,j]: 0<=i<N and 0<=j<N}, {S1[i,j]: 0<=i<N and 0<=j<N}
and {S2[i,j]: 0<=i<N and 0<=j<N}.
for (i=0; i<N; i++)
    for (j=0; j<N; j++)
        S0;
for (i=0; i<N; i++)
    for (j=0; j<N; j++)
        S1;
for (i=0; i<N; i++)
    for (j=0; j<N; j++)
        S2;
computation::fuse_after method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2939
Example:
S2.fuse_after(j, S1);
S1.fuse_after(j, S0);
for (i=0; i<N; i++)
    for (j=0; j<N; j++)
    {
        S0;
        S1;
        S2;
    }
computation::fuse_after method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2939
Example:
S2.fuse_after(i, S1);
S1.fuse_after(i, S0);
for (i=0; i<N; i++)
{
    for (j=0; j<N; j++)
        S0;
    for (j=0; j<N; j++)
        S1;
    for (j=0; j<N; j++)
        S2;
}
computation::fuse_after method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2939
void before(computation &consumer, tiramisu::var L);
Schedule this computation to run
before the computation consumer at the loop level L
computation::before method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2598
void between(computation &before_comp, tiramisu::var before_l,
computation &after_comp, tiramisu::var after_l );
Schedule this computation to run
after before_comp at the loop level before_l,
and before after_comp at loop level after_l.
The outermost loop level is 0.
computation::between method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2598
void bind_to(buffer *buff);
Bind this computation to a buffer. i.e., create a one-to-one data
mapping between the computation and the buffer.
In Tiramisu, a tiramisu computation cannot directly consume
values from buffers.
Buffers should first be wrapped in computations.
computation::bind_to method
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2840
Example:
tiramisu::buffer N_input_b("N_input_b", {1},
                           tiramisu::p_int32, a_input, &function0);
N_input.bind_to(&N_input_b);
tiramisu::buffer S0_b("S0_b", {N_input(0), N_input(0)},
                      tiramisu::p_uint8, a_temporary, &function0);
S0.bind_to(&S0_b);
tiramisu::buffer S1_b("S1_b", {tiramisu::var("N"), tiramisu::var("N")},
                      tiramisu::p_uint8, a_output, &function0);
S1.bind_to(&S1_b);
computation::bind_to method
https://github.com/rbaghdadi/tiramisu/blob/master/tests/test_87.cpp
void compute_at(computation &consumer, tiramisu::var L );
void compute_at(computation &consumer, int L );
void interchange(tiramisu::var L0, tiramisu::var L1 );
void set_inline(bool is_inline = true );
Various methods of the computation class
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h
void shift(tiramisu::var L0, int n );
void split(tiramisu::var L0, int sizeX );
void split(tiramisu::var L0, int sizeX, tiramisu::var L0_outer, tiramisu::var L0_inner );
void tile(int L0, int L1, int sizeX, int sizeY );
void tile(int L0, int L1, int L2, int sizeX, int sizeY, int sizeZ );
void unroll(tiramisu::var L, int fac );
void unroll(tiramisu::var L, int fac, tiramisu::var L_outer, tiramisu::var L_inner );
void vectorize(tiramisu::var L, int v );
void vectorize(tiramisu::var L, int v, tiramisu::var L_outer, tiramisu::var L_inner );
Various methods of the computation class
https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h
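Of the methods listed above, `split` is perhaps the easiest to picture by hand: one loop over i becomes an outer loop and an inner loop of fixed size. The plain C++ sketch below shows one possible shape of such a split loop and returns the values of i in the order they are visited. The function name `split_order` and the bounds guard for sizes that do not divide N are assumptions of this sketch, not a description of the code Tiramisu emits.

```cpp
#include <cassert>
#include <vector>

// Hypothetical picture of a loop over 0 <= i < N split by `size`
// into an outer loop and an inner loop.
std::vector<int> split_order(int N, int size)
{
    std::vector<int> visited;
    for (int outer = 0; outer * size < N; ++outer)
        for (int inner = 0; inner < size; ++inner) {
            const int i = outer * size + inner;
            if (i < N)                 // guard when size does not divide N
                visited.push_back(i);
        }
    return visited;
}
```

`split_order(5, 2)` visits 0, 1, 2, 3, 4: the split keeps both the set of iterations and their order, and only introduces the extra loop level that later steps such as vectorization or unrolling can work on.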

Mais conteúdo relacionado

Mais procurados

개발 과정 최적화 하기 내부툴로 더욱 강력한 개발하기 Stephen kennedy _(11시40분_103호)
개발 과정 최적화 하기 내부툴로 더욱 강력한 개발하기 Stephen kennedy _(11시40분_103호)개발 과정 최적화 하기 내부툴로 더욱 강력한 개발하기 Stephen kennedy _(11시40분_103호)
개발 과정 최적화 하기 내부툴로 더욱 강력한 개발하기 Stephen kennedy _(11시40분_103호)changehee lee
 
GPU Programming on CPU - Using C++AMP
GPU Programming on CPU - Using C++AMPGPU Programming on CPU - Using C++AMP
GPU Programming on CPU - Using C++AMPMiller Lee
 
Egor Bogatov - .NET Core intrinsics and other micro-optimizations
Egor Bogatov - .NET Core intrinsics and other micro-optimizationsEgor Bogatov - .NET Core intrinsics and other micro-optimizations
Egor Bogatov - .NET Core intrinsics and other micro-optimizationsEgor Bogatov
 
How to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJITHow to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJITEgor Bogatov
 
Google Edge TPUで TensorFlow Liteを使った時に 何をやっているのかを妄想してみる 2 「エッジAIモダン計測制御の世界」オ...
Google Edge TPUで TensorFlow Liteを使った時に 何をやっているのかを妄想してみる 2  「エッジAIモダン計測制御の世界」オ...Google Edge TPUで TensorFlow Liteを使った時に 何をやっているのかを妄想してみる 2  「エッジAIモダン計測制御の世界」オ...
Google Edge TPUで TensorFlow Liteを使った時に 何をやっているのかを妄想してみる 2 「エッジAIモダン計測制御の世界」オ...Mr. Vengineer
 
Kirk Shoop, Reactive programming in C++
Kirk Shoop, Reactive programming in C++Kirk Shoop, Reactive programming in C++
Kirk Shoop, Reactive programming in C++Sergey Platonov
 
Евгений Крутько, Многопоточные вычисления, современный подход.
Евгений Крутько, Многопоточные вычисления, современный подход.Евгений Крутько, Многопоточные вычисления, современный подход.
Евгений Крутько, Многопоточные вычисления, современный подход.Platonov Sergey
 
深入淺出C語言
深入淺出C語言深入淺出C語言
深入淺出C語言Simen Li
 
Boost.Python - domesticating the snake
Boost.Python - domesticating the snakeBoost.Python - domesticating the snake
Boost.Python - domesticating the snakeSławomir Zborowski
 
CodiLime Tech Talk - Grzegorz Rozdzialik: What the java script
CodiLime Tech Talk - Grzegorz Rozdzialik: What the java scriptCodiLime Tech Talk - Grzegorz Rozdzialik: What the java script
CodiLime Tech Talk - Grzegorz Rozdzialik: What the java scriptCodiLime
 
C++ idioms by example (Nov 2008)
C++ idioms by example (Nov 2008)C++ idioms by example (Nov 2008)
C++ idioms by example (Nov 2008)Olve Maudal
 
PVS-Studio team experience: checking various open source projects, or mistake...
PVS-Studio team experience: checking various open source projects, or mistake...PVS-Studio team experience: checking various open source projects, or mistake...
PVS-Studio team experience: checking various open source projects, or mistake...Andrey Karpov
 
Best Bugs from Games: Fellow Programmers' Mistakes
Best Bugs from Games: Fellow Programmers' MistakesBest Bugs from Games: Fellow Programmers' Mistakes
Best Bugs from Games: Fellow Programmers' MistakesAndrey Karpov
 
.NET 2015: Будущее рядом
.NET 2015: Будущее рядом.NET 2015: Будущее рядом
.NET 2015: Будущее рядомAndrey Akinshin
 
Vc4c development of opencl compiler for videocore4
Vc4c  development of opencl compiler for videocore4Vc4c  development of opencl compiler for videocore4
Vc4c development of opencl compiler for videocore4nomaddo
 
Anomalies in X-Ray Engine
Anomalies in X-Ray EngineAnomalies in X-Ray Engine
Anomalies in X-Ray EnginePVS-Studio
 

Mais procurados (20)

개발 과정 최적화 하기 내부툴로 더욱 강력한 개발하기 Stephen kennedy _(11시40분_103호)
개발 과정 최적화 하기 내부툴로 더욱 강력한 개발하기 Stephen kennedy _(11시40분_103호)개발 과정 최적화 하기 내부툴로 더욱 강력한 개발하기 Stephen kennedy _(11시40분_103호)
개발 과정 최적화 하기 내부툴로 더욱 강력한 개발하기 Stephen kennedy _(11시40분_103호)
 
GPU Programming on CPU - Using C++AMP
GPU Programming on CPU - Using C++AMPGPU Programming on CPU - Using C++AMP
GPU Programming on CPU - Using C++AMP
 
Egor Bogatov - .NET Core intrinsics and other micro-optimizations
Egor Bogatov - .NET Core intrinsics and other micro-optimizationsEgor Bogatov - .NET Core intrinsics and other micro-optimizations
Egor Bogatov - .NET Core intrinsics and other micro-optimizations
 
How to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJITHow to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJIT
 
Google Edge TPUで TensorFlow Liteを使った時に 何をやっているのかを妄想してみる 2 「エッジAIモダン計測制御の世界」オ...
Google Edge TPUで TensorFlow Liteを使った時に 何をやっているのかを妄想してみる 2  「エッジAIモダン計測制御の世界」オ...Google Edge TPUで TensorFlow Liteを使った時に 何をやっているのかを妄想してみる 2  「エッジAIモダン計測制御の世界」オ...
Google Edge TPUで TensorFlow Liteを使った時に 何をやっているのかを妄想してみる 2 「エッジAIモダン計測制御の世界」オ...
 
Kirk Shoop, Reactive programming in C++
Kirk Shoop, Reactive programming in C++Kirk Shoop, Reactive programming in C++
Kirk Shoop, Reactive programming in C++
 
Евгений Крутько, Многопоточные вычисления, современный подход.
Евгений Крутько, Многопоточные вычисления, современный подход.Евгений Крутько, Многопоточные вычисления, современный подход.
Евгений Крутько, Многопоточные вычисления, современный подход.
 
Joel Falcou, Boost.SIMD
Joel Falcou, Boost.SIMDJoel Falcou, Boost.SIMD
Joel Falcou, Boost.SIMD
 
深入淺出C語言
深入淺出C語言深入淺出C語言
深入淺出C語言
 
C++11 & C++14
C++11 & C++14C++11 & C++14
C++11 & C++14
 
Boost.Python - domesticating the snake
Boost.Python - domesticating the snakeBoost.Python - domesticating the snake
Boost.Python - domesticating the snake
 
CodiLime Tech Talk - Grzegorz Rozdzialik: What the java script
CodiLime Tech Talk - Grzegorz Rozdzialik: What the java scriptCodiLime Tech Talk - Grzegorz Rozdzialik: What the java script
CodiLime Tech Talk - Grzegorz Rozdzialik: What the java script
 
C++ idioms by example (Nov 2008)
C++ idioms by example (Nov 2008)C++ idioms by example (Nov 2008)
C++ idioms by example (Nov 2008)
 
PVS-Studio team experience: checking various open source projects, or mistake...
PVS-Studio team experience: checking various open source projects, or mistake...PVS-Studio team experience: checking various open source projects, or mistake...
PVS-Studio team experience: checking various open source projects, or mistake...
 
Best Bugs from Games: Fellow Programmers' Mistakes
Best Bugs from Games: Fellow Programmers' MistakesBest Bugs from Games: Fellow Programmers' Mistakes
Best Bugs from Games: Fellow Programmers' Mistakes
 
.NET 2015: Будущее рядом
.NET 2015: Будущее рядом.NET 2015: Будущее рядом
.NET 2015: Будущее рядом
 
Vc4c development of opencl compiler for videocore4
Vc4c  development of opencl compiler for videocore4Vc4c  development of opencl compiler for videocore4
Vc4c development of opencl compiler for videocore4
 
Clang tidy
Clang tidyClang tidy
Clang tidy
 
C++11
C++11C++11
C++11
 
Anomalies in X-Ray Engine
Anomalies in X-Ray EngineAnomalies in X-Ray Engine
Anomalies in X-Ray Engine
 

A Quick Taste of Tiramisu

  • 2. Blog (since 2007): Vengineerの戯言 http://blogs.yahoo.co.jp/verification_engineer SlideShare: https://www.slideshare.net/ssuser479fa3 Twitter (since 2009): @Vengineer Source code analysis craftsman
  • 3. Tiramisu: A Code Optimization Framework for High Performance Systems https://www.csail.mit.edu/research/tiramisu-framework-code-optimization-and-code-generation MIT CSAIL
  • 4. Tiramisu Compiler ・A framework for code optimization and code generation ・Code optimization (loop optimization) => can be embedded in your own compiler => loop tiling, loop fusion/distribution, loop splitting, loop interchange, loop shifting, loop unrolling, loop parallelization, loop vectorization, storage reordering, modulo storage ・Code generation => multicore CPUs (LLVM), GPUs (CUDA), distributed systems (MPI), FPGAs (Xilinx Vivado HLS) https://github.com/Tiramisu-Colib/tiramisu#tiramisu-compiler
  • 5. Tiramisu uses Halide and ISL ・Halide https://github.com/halide/Halide ・ISL (Integer Set Library) http://isl.gforge.inria.fr/ Facebook Research: Tensor Comprehensions https://github.com/facebookresearch/TensorComprehensions Tensor Comprehensions (TC) is a fully-functional C++ library to automatically synthesize high-performance machine learning kernels using Halide, ISL and NVRTC or LLVM.
  • 6. Differences between Halide and Tiramisu: Halide supports only rectangular regions. Tiramisu also handles non-rectangular regions. Why? Because it uses a polyhedral representation!
  • 7. Four challenges: 1) MPI+OpenMP+CUDA+HLS 2) memory dependences 3) optimization and efficient code generation 4) representation
  • 8. 4) Representation: The challenge of representation is addressed by using a unified framework based on polyhedral sets to represent the four layers. I don't really understand "polyhedral sets", so could someone please explain them to me?
  • 11. ・Layer I : Abstract Algorithm ・Layer II : Computation Management ・Layer III : Data Management ・Layer IV : Communication Management ・Code generation: Abstract Syntax Tree https://arxiv.org/pdf/1804.10694.pdf
  • 12. The first layer defines abstract computations, which are not yet scheduled or mapped to memory. Each computation represents an expression to compute. https://arxiv.org/pdf/1804.10694.pdf Layer I : Abstract Algorithm
  • 13. {b1(i, j, c) : 0 ≤ i < N ∧ 0 ≤ j < M ∧ 0 ≤ c < 3} The iteration domain is the set of tuples b1(i, j, c) such that 0 ≤ i < N ∧ 0 ≤ j < M ∧ 0 ≤ c < 3 https://arxiv.org/pdf/1804.10694.pdf Iteration domain
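An iteration domain like the one on this slide can be read as the set of index tuples a loop nest would visit. The following is a minimal sketch in plain C++ (not the Tiramisu or ISL API); the function names are hypothetical, and the triangular set `T` is an assumed extra example, added only to show a non-rectangular polyhedral set of the kind slide 6 refers to.

```cpp
#include <array>
#include <vector>

// One point (i, j, c) of an iteration domain.
using Point = std::array<int, 3>;

// Enumerate {b1(i, j, c) : 0 <= i < N and 0 <= j < M and 0 <= c < 3}
// in lexicographic order. N and M play the role of the set's parameters.
std::vector<Point> enumerate_b1_domain(int N, int M) {
    std::vector<Point> points;
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < M; ++j)
            for (int c = 0; c < 3; ++c)
                points.push_back({i, j, c});
    return points;
}

// Hypothetical non-rectangular set {T(i, j) : 0 <= i < N and 0 <= j <= i}.
// The bound on j depends on i, so the domain is a triangle, not a box.
std::vector<std::array<int, 2>> enumerate_triangle_domain(int N) {
    std::vector<std::array<int, 2>> points;
    for (int i = 0; i < N; ++i)
        for (int j = 0; j <= i; ++j)
            points.push_back({i, j});
    return points;
}
```

Both sets are described purely by affine inequalities over the indices, which is what makes them polyhedral; only the first one is rectangular.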
  • 14. Code generation in Tiramisu proceeds in two steps: 1) time-space mapping (this mapping is done by applying an affine relation); 2) adding new statements (commands are added in Layers II, III, and IV). https://arxiv.org/pdf/1804.10694.pdf Steps toward code generation
  • 15. Affine transformations including loop tiling, skewing, loop fusion, distribution, splitting, reordering, and many others can be expressed as an affine map that maps computations from Layer I into the time-space domain in Layer II. We call this map a time-space map. The Layer I iteration domain is transformed into the time-space domain. https://arxiv.org/pdf/1804.10694.pdf Time-space Maps
  • 16. Layer I (iteration domain): {C(i, j) : 0 ≤ i < N ∧ 0 ≤ j < N} : A(i, j) + B(i, j). Time-space mapping, using 16 x 16 tiles: {C(i, j) → C(i1, j1, i2, j2) : i1 = floor(i/16) ∧ i2 = i%16 ∧ j1 = floor(j/16) ∧ j2 = j%16 ∧ 0 ≤ i < N ∧ 0 ≤ j < N}. Layer II (time-space domain): {C(i1, j1, i2, j2) : i1 = floor(i/16) ∧ i2 = i%16 ∧ j1 = floor(j/16) ∧ j2 = j%16 ∧ 0 ≤ i < N ∧ 0 ≤ j < N} : A(i1 ∗ 16 + i2, j1 ∗ 16 + j2) + B(i1 ∗ 16 + i2, j1 ∗ 16 + j2) https://arxiv.org/pdf/1804.10694.pdf Example: Time-space Maps
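The 16 x 16 tiling map on this slide can be checked by hand with a small sketch. The helper names below are hypothetical and this is plain C++, not Tiramisu code; it relies on the slide's constraint that i and j are non-negative, so integer division equals floor.

```cpp
#include <array>

// Apply the map C(i, j) -> C(i1, j1, i2, j2) with
// i1 = floor(i/16), i2 = i % 16, j1 = floor(j/16), j2 = j % 16.
// Assumes i >= 0 and j >= 0, as in the slide's domain.
std::array<int, 4> tile_16x16(int i, int j) {
    return {i / 16, j / 16, i % 16, j % 16};
}

// Inverse direction: recover (i, j) = (i1*16 + i2, j1*16 + j2),
// which is the index expression used for A and B in Layer II.
std::array<int, 2> untile_16x16(const std::array<int, 4>& t) {
    return {t[0] * 16 + t[2], t[1] * 16 + t[3]};
}
```

For instance, the point (35, 7) lands in tile (2, 0) at offset (3, 7) inside that tile, and applying the inverse returns (35, 7).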
  • 17. Time dimensions => when: specify the order of execution (relative to other computations). Space dimensions => where: specify the processor on which each computation runs. Time-space domain (Time-space Maps) https://arxiv.org/pdf/1804.10694.pdf Layer II: Computation Management
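To make the "when" part concrete, here is a rough sketch of how time dimensions could determine execution order. It assumes the usual polyhedral convention that time tuples (i1, j1, i2, j2) run in lexicographic order, which the slide does not state explicitly. The function name and the tile-size parameter `T` are hypothetical, chosen so the effect is visible on small inputs.

```cpp
#include <array>
#include <vector>

// Return the original (i, j) points of an N x N domain in the order they
// would execute when the tiled tuple (i1, j1, i2, j2) is scanned
// lexicographically: tile by tile, then row by row inside each tile.
std::vector<std::array<int, 2>> tiled_execution_order(int N, int T) {
    std::vector<std::array<int, 2>> order;
    for (int i1 = 0; i1 * T < N; ++i1)
        for (int j1 = 0; j1 * T < N; ++j1)
            for (int i2 = 0; i2 < T; ++i2)
                for (int j2 = 0; j2 < T; ++j2) {
                    const int i = i1 * T + i2;
                    const int j = j1 * T + j2;
                    if (i < N && j < N)  // skip points outside partial tiles
                        order.push_back({i, j});
                }
    return order;
}
```

With N = 4 and T = 2, the first four points are the whole of tile (0, 0), and the fifth point is (0, 2), the first point of tile (0, 1). An untiled row-major scan would instead reach (0, 2) third.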
  • 18. Space dimensions specify the processor on which each computation runs. cpu: run on a CPU in a shared-memory system. node: run on a node in a distributed system. gpu_thread_X: run on the X dimension of the GPU threads. gpu_block_X: run on the X dimension of the GPU blocks. vec(s): vectorize (s is the vector width). unroll: unroll. pipeline: pipeline (FPGA only). https://arxiv.org/pdf/1804.10694.pdf Layer II: Computation Management
  • 19. Data Management specifies the memory locations where computation results are stored: allocation/deallocation statements, and a set of access relations, which map a computation from Layer II to array elements read or written by that computation. https://arxiv.org/pdf/1804.10694.pdf Layer III: Data Management
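An access relation can be pictured as a function from a Layer II tuple to an array element. The sketch below assumes a relation of the form {C(i1, j1, i2, j2) → bufC[i1*16 + i2, j1*16 + j2]}, matching the tiled example from slide 16. The buffer name `bufC`, the row-major layout, and the helper name are assumptions for illustration, not taken from the slides.

```cpp
#include <array>
#include <cstddef>

// Map the computation C(i1, j1, i2, j2) to the flat offset of the element
// bufC[i1*16 + i2][j1*16 + j2] in an N x N row-major buffer.
// Row-major storage is an assumption of this sketch.
std::size_t access_bufC(const std::array<int, 4>& t, int N) {
    const std::size_t row = static_cast<std::size_t>(t[0]) * 16 + t[2];
    const std::size_t col = static_cast<std::size_t>(t[1]) * 16 + t[3];
    return row * static_cast<std::size_t>(N) + col;
}
```

Because the relation is separate from the computation itself, the same Layer II schedule could be paired with a different storage mapping by changing only this function.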
  • 20. Communication commands (including synchronous communication) are added and scheduled. The allocation or deallocation operations added in Layer III are scheduled in Layer IV. https://arxiv.org/pdf/1804.10694.pdf Layer IV: Communication Management
  • 22. test_XX: write the Tiramisu code in this file and generate the object file (xxx.o). xxx.o + wrapper_test_XX.o => wrapper_test_XX. Like Halide, Tiramisu generates an object file, which is then linked in and used.
  • 24. int main(int, char **) { Halide::Buffer<uint8_t> reference_buf(NN, MM); init_buffer(reference_buf, (uint8_t)7); Halide::Buffer<uint8_t> output_buf(NN, MM); init_buffer(output_buf, (uint8_t)13); assign_7_to_10x10_2D_array_with_tiling_parallelism( output_buf.raw_buffer()); compare_buffers("assign_7_to_10x10_2D_array_with_tiling_parallelism", output_buf, reference_buf); return 0; } Test code (tests/wrapper_test_01.cpp) https://github.com/rbaghdadi/tiramisu/blob/master/tests/wrapper_test_01.cpp This example has only a result.
  • 25. Generate the object file: int main(int argc, char **argv) { generate_function_1( "assign_7_to_10x10_2D_array_with_tiling_parallelism", 10, 3, 4); return 0; } Test code (tests/developers/test_01.cpp) https://github.com/rbaghdadi/tiramisu/blob/master/tests/test_01.cpp#L41
  • 26. void generate_function_1(std::string name, int size, int val0, int val1 ) { tiramisu::global::set_default_tiramisu_options(); tiramisu::function function0(name); tiramisu::constant N("N", tiramisu::expr((int32_t) size), p_int32, true, NULL, 0, &function0 ); Test code (tests/test_01.cpp) https://github.com/rbaghdadi/tiramisu/blob/master/tests/test_01.cpp#L16
  • 27. static void set_default_tiramisu_options() { global::loop_iterator_type = p_int32; set_auto_data_mapping(true); /* GPU: path to NVIDIA NVCC */ auto location = std::getenv(NVCC_BIN_DIR_ENV_VAR); if (location) nvcc_bin_dir = location; } global::set_default_tiramisu_options method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/expr.h#L93
  • 29. A class to represent functions in Tiramisu. A function in Tiramisu is composed of a set of computations (tiramisu::computation). Example: std::string name("sample"); tiramisu::function function0(name); function class https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L97 A set of computations!
  • 31. A class that represents loop invariants. An object of the invariant class can be an expression, a symbolic constant or a variable that is invariant to all the loops of the function. Example: tiramisu::constant N("N", tiramisu::expr((int32_t) size), p_int32, true, NULL, 0, &function0); constant class https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L3667
  • 32. tiramisu::var i("i"), j("j"), i0("i0"), j0("j0"), i1("i1"), j1("j1"); tiramisu::expr e1 = tiramisu::expr(tiramisu::o_add, tiramisu::expr((uint8_t) val0), tiramisu::expr((uint8_t) val1)); tiramisu::computation S0("[N]->{S0[i,j]: 0<=i<N and 0<=j<N}", e1, true, p_uint8, &function0); tiramisu::buffer buf0("buf0", {size, size}, tiramisu::p_uint8, a_output, &function0); Test code (tests/test_01.cpp) https://github.com/rbaghdadi/tiramisu/blob/master/tests/test_01.cpp#L16
  • 33. A class that represents constant variable references. Example: tiramisu::var i("i"), j("j"), i0("i0"), j0("j0"), i1("i1"), j1("j1"); var class https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/expr.h#L1641
  • 35. A class to represent Tiramisu expressions. Example: tiramisu::expr e1 = tiramisu::expr(tiramisu::o_add, tiramisu::expr((uint8_t) val0), tiramisu::expr((uint8_t) val1)); expr class https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/expr.h#L128
  • 37. A class that represents computations. A computation is an expression associated with an iteration domain. A computation indicates what needs to be computed (the expression that should be computed). A computation has three representations: Level I, Level II and Level III. (The latest paper calls these Layer I/II/III/IV; Layer IV is Communication Management.) computation class https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L1225
  • 38. Example: tiramisu::var i = tiramisu::var("i"); tiramisu::computation input("[N]->{input[i]}", tiramisu::expr(), false, p_uint8, &function0); tiramisu::computation result("[N]->{result[0]}", tiramisu::expr(input(0)), true, p_uint8, &function0); result.add_definitions("[N]->{result[i]: 1<=i<N}", (result(i - 1) + input(i)), true, p_uint8, &function0); computation class https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L1225
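The two definitions of `result` on this slide amount to a running sum. As a rough illustration, here is a hand-written plain C++ equivalent that does not use Tiramisu at all. The function name `running_sum` is made up for this sketch, and the wraparound at 256 is assumed from the `p_uint8` element type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hand-written equivalent (not Tiramisu code) of:
//   result[0] = input[0]
//   result[i] = result[i-1] + input[i]   for 1 <= i < N
// Elements are uint8_t (p_uint8 on the slide), so sums wrap modulo 256.
std::vector<std::uint8_t> running_sum(const std::vector<std::uint8_t>& input) {
    assert(!input.empty());  // the first definition needs result[0] to exist
    std::vector<std::uint8_t> result(input.size());
    result[0] = input[0];  // first definition: result[0]
    for (std::size_t i = 1; i < input.size(); ++i) {
        // second definition, the one added with add_definitions
        result[i] = static_cast<std::uint8_t>(result[i - 1] + input[i]);
    }
    return result;
}
```

Each element depends on the previous one, which is why the second definition has to be scheduled after the first.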
  • 40. A class that represents buffers. Buffers have two use cases: - used to store the results of computations, and - used to represent input arguments to functions. Example: input buffer: tiramisu::buffer input_buffer("input_buffer", {size}, tiramisu::p_uint8, a_input, &function0); result buffer: tiramisu::buffer result_scalar("result_scalar", {1}, tiramisu::p_uint8, a_output, &function0); buffer class https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L957
  • 41. S0.set_access("{S0[i,j]->buf0[i,j]}"); S0.tile(i, j, 2, 2, i0, j0, i1, j1); S0.tag_parallel_level(i0); Test code (tests/test_01.cpp) https://github.com/rbaghdadi/tiramisu/blob/master/tests/test_01.cpp#L16
  • 42. void set_access(std::string access_str); void set_access(isl_map *access); Set the access relation of the computation. The access relation is a relation from computations to buffer locations. access_str is a string that represents the relation. It is encoded in the ISL format (http://isl.gforge.inria.fr/user.html#Sets-and-Relations). Example: S0.set_access("{S0[i,j]->buf0[i,j]}"); computation::set_access method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L3130
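To make the idea of an access relation more concrete, the following plain C++ sketch applies two different mappings from iteration points to buffer locations: the identity `{S0[i,j]->buf0[i,j]}` from the slide, and a transposed mapping added only for contrast. The row-major flat buffer and all names (`identity_access`, `transposed_access`, `run_with_access`) are assumptions for illustration, not Tiramisu internals.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Index2 = std::pair<int, int>;

// {S0[i,j] -> buf0[i,j]}: the relation used on the slide.
Index2 identity_access(int i, int j) { return {i, j}; }
// {S0[i,j] -> buf0[j,i]}: hypothetical alternative, shown for contrast.
Index2 transposed_access(int i, int j) { return {j, i}; }

// Run S0 over 0<=i<n, 0<=j<n and store the value i*n+j at the buffer
// location chosen by `access` (buffer is row-major, an assumption).
std::vector<std::uint8_t> run_with_access(int n, Index2 (*access)(int, int)) {
    assert(n > 0);
    std::vector<std::uint8_t> buf(static_cast<std::size_t>(n) * n, 0);
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            const Index2 p = access(i, j);
            buf[static_cast<std::size_t>(p.first) * n + p.second] =
                static_cast<std::uint8_t>(i * n + j);
        }
    }
    return buf;
}
```

The computation itself is unchanged between the two runs; only where each result lands differs, which is the separation the access relation is meant to express.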
  • 44. void tile(tiramisu::var L0, tiramisu::var L1, int sizeX, int sizeY, tiramisu::var L0_outer, tiramisu::var L1_outer, tiramisu::var L0_inner, tiramisu::var L1_inner); Tile the two loop levels L0 and L1 with rectangular tiling. sizeX and sizeY represent the tile size. L0 and L1 should be two consecutive loop levels. L0_outer, L1_outer, L0_inner, L1_inner are the names of the new dimensions created after tiling. Example: S0.tile(i, j, 2, 2, i0, j0, i1, j1); computation::tile method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L3424
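As a rough picture of what rectangular tiling does to the loop nest, the plain C++ sketch below records the order in which iteration points (i, j) are visited when two loops are tiled. It is hand-written, not Tiramisu output. The name `tiled_order` is made up, and the bounds check for partial tiles is an assumption, since the slide does not show how Tiramisu handles sizes that are not a multiple of the tile size.

```cpp
#include <cassert>
#include <utility>
#include <vector>

using Point = std::pair<int, int>;

// Visit order of an n x n domain tiled with size_x by size_y tiles.
// i0/j0 select a tile, i1/j1 select a point inside that tile.
std::vector<Point> tiled_order(int n, int size_x, int size_y) {
    assert(n >= 0 && size_x > 0 && size_y > 0);
    std::vector<Point> order;
    for (int i0 = 0; i0 * size_x < n; ++i0) {
        for (int j0 = 0; j0 * size_y < n; ++j0) {
            for (int i1 = 0; i1 < size_x; ++i1) {
                for (int j1 = 0; j1 < size_y; ++j1) {
                    const int i = i0 * size_x + i1;
                    const int j = j0 * size_y + j1;
                    // Skip points of partial tiles that fall outside the domain.
                    if (i < n && j < n) order.emplace_back(i, j);
                }
            }
        }
    }
    return order;
}
```

With `tiled_order(4, 2, 2)`, the first tile covers (0,0), (0,1), (1,0), (1,1) before the traversal moves on to (0,2), so each 2x2 block is finished before the next one starts.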
  • 46. void tag_parallel_level(tiramisu::var L); void tag_parallel_level(int L); Tag the loop level L to be parallelized. Example: S0.tag_parallel_level(i0); computation::tag_parallel_level method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L3424
  • 47. /* Set the argument (buf0) */ function0.set_arguments({&buf0}); /* iteration domain => time-space domain */ function0.gen_time_space_domain(); /* Generate the ISL abstract syntax tree */ function0.gen_isl_ast(); /* Generate the Halide statement */ function0.gen_halide_stmt(); /* Generate the object file */ function0.gen_halide_obj("build/generated_fct_test_01.o"); } Test code (tests/test_01.cpp) https://github.com/rbaghdadi/tiramisu/blob/master/tests/test_01.cpp#L16
  • 48. void set_arguments(const std::vector<tiramisu::buffer *> &buffer_vec); Set the arguments of the function. The arguments of the function are provided as a vector of pointers to buffers. Each buffer represents an argument to the function. During code generation, the arguments in the vector will become the arguments of the generated function (in the order of their appearance in the vector). function::set_arguments method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L918
  • 50. void gen_time_space_domain(); Generate the time-space domain of the function. In this representation, the logical time of execution and the processor where the computation will be executed are both specified. function::gen_time_space_domain method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L910
  • 52. void gen_isl_ast(); Generate an isl AST that represents the function. function::gen_isl_ast method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L905
  • 54. void gen_halide_stmt(); Generate a Halide stmt that represents the function. function::gen_halide_stmt method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L897
  • 56. void gen_halide_obj(const std::string &obj_file_name, Halide::Target::OS os, Halide::Target::Arch arch, int bits) const; Generate an object file that contains the compiled function. This function relies on Halide to generate the object file. obj_file_name: the name of the generated file. os: the target operating system (Halide::Target::OS). arch: the architecture of the target (the instruction set). bits: the bit-width of the target machine (must be 0 for unknown, or 32 or 64). function::gen_halide_obj method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L897
  • 57. void tiramisu::function::codegen(const std::vector<tiramisu::buffer *> &buffer_vec, const std::string obj_filename) { this->set_arguments(buffer_vec); this->lift_dist_comps(); /* <= only takes effect for MPI/CUDA */ this->gen_time_space_domain(); this->gen_isl_ast(); this->gen_halide_stmt(); this->gen_halide_obj(obj_filename); } Code generation, all steps in a single call: function::codegen https://github.com/Tiramisu-Colib/tiramisu/blob/master/src/tiramisu_core.cpp#L8508
  • 60. /* C++ code with a Tiramisu expression. */ #include "tiramisu.h" void foo(int N, int array_a[N], int array_b[N], int array_c[N]) { tiramisu::init(); /* Declare an iterator and inputs */ tiramisu::iter i, j; tiramisu::in A(i,j), B(i,j); Tiramisu expressions (README.md) https://github.com/Tiramisu-Compiler/tiramisu/blob/master/README.md#example
  • 61. /* Declare the Tiramisu expression (algorithm) */ tiramisu::comp C(i,j) = A(i,j) + B(i,j); /* Specify optimizations */ C.parallelize(i).vectorize(j, 4); /* Realize, compile and run the expression */ C.realize(tiramisu::int32_t, {N}); C.compile({(A, array_a), (B, array_b), (C, array_c)}); C.run(); } Tiramisu expressions (README.md) https://github.com/Tiramisu-Compiler/tiramisu/blob/master/README.md#example
  • 64. Thank you very much. Blog (since 2007): Vengineerの戯言 http://blogs.yahoo.co.jp/verification_engineer SlideShare: https://www.slideshare.net/ssuser479fa3 Twitter (since 2009): @Vengineer, source code analysis craftsman
  • 67. virtual void add_definitions(std::string iteration_domain_str, tiramisu::expr e, bool schedule_this_computation, tiramisu::primitive_t t, tiramisu::function *fct); Add definitions of computations that have the same name as this computation. The arguments of this function are identical to the arguments of the computation constructor. In general, this function is used to express reductions and to express computation updates. computation::add_definitions method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2541
  • 68. Example: /* [N]->{C[0,i]: 0<=i<N} : 10 */ tiramisu::computation C("[N]->{C[0,i]: 0<=i<N}", tiramisu::expr((uint8_t) 10), true, p_uint8, &function0); /* [N]->{C[1,i]: 0<=i<N} : C(0, i) + 10 */ C.add_definitions("[N]->{C[1,i]: 0<=i<N}", C(0, i) + tiramisu::expr((uint8_t) 10), true, p_uint8, &function0); computation::add_definitions method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2541
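Read as ordinary array code, the two definitions of `C` fill row 0 with a constant and then derive row 1 from row 0. The plain C++ sketch below shows that reading. It is not Tiramisu code, the name `two_definitions` is made up, and storing each row in its own vector is an assumption for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hand-written equivalent of the two definitions of C:
//   C[0,i] = 10             for 0 <= i < n
//   C[1,i] = C[0,i] + 10    for 0 <= i < n
std::array<std::vector<std::uint8_t>, 2> two_definitions(int n) {
    assert(n >= 0);
    std::array<std::vector<std::uint8_t>, 2> C;
    C[0].assign(static_cast<std::size_t>(n), 10);  // initial definition
    C[1].resize(static_cast<std::size_t>(n));
    for (std::size_t i = 0; i < C[1].size(); ++i) {
        // definition added with add_definitions; it reads the first one
        C[1][i] = static_cast<std::uint8_t>(C[0][i] + 10);
    }
    return C;
}
```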
  • 69. tiramisu::computation& get_update(int index); Returns the update with the given index that has been added to this computation, such that: - If index == 0, then this computation is returned. - If index > 0, then it returns the index-th computation added through add_definitions. computation::get_update method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L3065
  • 70. Example: tiramisu::computation result("[N]->{result[0]}", tiramisu::expr(input(0)), true, p_uint8, &function0); result.add_definitions("[N]->{result[i]: 1<=i<N}", (result(i - 1) + input(i)), true, p_uint8, &function0); /* result.get_update(1) is the first definition added above (result[i], 1<=i<N) */ /* Execute result[0] first, then result[i] */ result.get_update(1).after(result, computation::root); computation::get_update method https://github.com/rbaghdadi/tiramisu/blob/master/tutorials/tutorial_06.cpp
  • 71. void tiramisu::computation::set_expression(const tiramisu::expr &e); Set the expression of the computation. Example: computation c_C("[N]->{c_C[i,j,0]: 0<=i<N and 0<=j<N}", expr((uint8_t) 0), true, p_uint8, &matmul); c_C.add_definitions("[N]->{c_C[i,j,k]: 0<=i<N and 0<=j<N and 0<=k<N}", expr(), true, p_uint8, &matmul); expr e1 = c_C(i, j, k - 1) + c_A(i, k) * c_B(k, j); c_C.get_update(1).set_expression(e1); computation::set_expression method https://github.com/rbaghdadi/tiramisu/blob/master/src/tiramisu_core.cpp#L7470
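The `c_C` example expresses matrix multiplication as a reduction over k. One way to read it is the plain C++ sketch below, which is hand-written and does not use Tiramisu. It assumes that every `c_C[i,j,k]` is stored in the same cell `C[i][j]` (the access relation is not shown on the slide), that `c_A` and `c_B` are the two square input matrices, and that values wrap at 256 because of `p_uint8`. The name `matmul_reduction` is made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Matrix = std::vector<std::vector<std::uint8_t>>;

// Hand-written reading of the reduction:
//   C[i][j] starts at 0 (initial definition of c_C), then for each k
//   C[i][j] = C[i][j] + A[i][k] * B[k][j]   (the update's expression e1).
Matrix matmul_reduction(const Matrix& A, const Matrix& B) {
    const std::size_t n = A.size();
    assert(B.size() == n);
    for (std::size_t r = 0; r < n; ++r) {
        assert(A[r].size() == n && B[r].size() == n);  // square inputs only
    }
    Matrix C(n, std::vector<std::uint8_t>(n, 0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            for (std::size_t k = 0; k < n; ++k) {
                C[i][j] = static_cast<std::uint8_t>(C[i][j] + A[i][k] * B[k][j]);
            }
        }
    }
    return C;
}
```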
  • 73. void after(computation &comp, tiramisu::var iterator); Schedule this computation to run after the computation comp. This computation is placed after comp at the loop level given by iterator, which must be a loop level of this computation. computation::after method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2598
  • 74. Example: given {S0[i,j]: 0<=i<N and 0<=j<N} and {S1[i,j]: 0<=i<N and 0<=j<N}, S1.after(S0, i) gives: for (i=0; i<N; i++) { for (j=0; j<N; j++) S0; for (j=0; j<N; j++) S1; } computation::after method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2598
  • 75. Example: given {S0[i,j]: 0<=i<N and 0<=j<N} and {S1[i,j]: 0<=i<N and 0<=j<N}, S1.after(S0, j) gives: for (i=0; i<N; i++) for (j=0; j<N; j++) { S0; S1; } computation::after method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2598
  • 76. Example: given {S0[i,j]: 0<=i<N and 0<=j<N} and {S1[i,j]: 0<=i<N and 0<=j<N}, S1.after(S0, computation::root) gives: for (i=0; i<N; i++) for (j=0; j<N; j++) S0; for (i=0; i<N; i++) for (j=0; j<N; j++) S1; computation::after method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2598
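The three `after` examples differ only in where S1 is placed relative to S0. The plain C++ sketch below writes out the three loop nests from the slides and records the order in which statement instances run, so the difference can be compared directly. It is not Tiramisu code, and all function names are made up for illustration.

```cpp
#include <cassert>
#include <string>

// Append one statement instance such as "S0(0,1)" to the trace.
static void emit(std::string& trace, const char* name, int i, int j) {
    if (!trace.empty()) trace += ' ';
    trace += name;
    trace += '(' + std::to_string(i) + ',' + std::to_string(j) + ')';
}

// Loop nest shown for S1.after(S0, i): the i loop is shared, the j loops are separate.
std::string after_at_i(int n) {
    assert(n >= 0);
    std::string trace;
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) emit(trace, "S0", i, j);
        for (int j = 0; j < n; ++j) emit(trace, "S1", i, j);
    }
    return trace;
}

// Loop nest shown for S1.after(S0, j): both loops are shared.
std::string after_at_j(int n) {
    assert(n >= 0);
    std::string trace;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            emit(trace, "S0", i, j);
            emit(trace, "S1", i, j);
        }
    return trace;
}

// Loop nest shown for S1.after(S0, computation::root): no loop is shared.
std::string after_at_root(int n) {
    assert(n >= 0);
    std::string trace;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) emit(trace, "S0", i, j);
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) emit(trace, "S1", i, j);
    return trace;
}
```

For N = 2, `after_at_j` interleaves S0 and S1 point by point, while `after_at_root` finishes all of S0 before any S1 instance runs.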
  • 77. Example: given {S0[i,j]: 0<=i<N and 0<=j<N}, {S1[i,j]: 0<=i<N and 0<=j<N} and {S2[i,j]: 0<=i<N and 0<=j<N}, the loop nests before fusion are: for (i=0; i<N; i++) for (j=0; j<N; j++) S0; for (i=0; i<N; i++) for (j=0; j<N; j++) S1; for (i=0; i<N; i++) for (j=0; j<N; j++) S2; computation::fuse_after method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2939
  • 78. Example: S2.fuse_after(j, S1); S1.fuse_after(j, S0); gives: for (i=0; i<N; i++) for (j=0; j<N; j++) { S0; S1; S2; } computation::fuse_after method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2939
  • 79. Example: S2.fuse_after(i, S1); S1.fuse_after(i, S0); gives: for (i=0; i<N; i++) { for (j=0; j<N; j++) S0; for (j=0; j<N; j++) S1; for (j=0; j<N; j++) S2; } computation::fuse_after method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2939
  • 80. void before(computation &consumer, tiramisu::var L); Schedule this computation to run before the computation consumer at the loop level L. computation::before method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2598
  • 81. void between(computation &before_comp, tiramisu::var before_l, computation &after_comp, tiramisu::var after_l); Schedule this computation to run after before_comp at the loop level before_l, and before after_comp at the loop level after_l. The outermost loop level is 0. computation::between method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2598
  • 82. void bind_to(buffer *buff); Bind this computation to a buffer, i.e., create a one-to-one data mapping between the computation and the buffer. In Tiramisu, a computation cannot directly consume values from buffers. Buffers should first be wrapped in computations. computation::bind_to method https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h#L2840
  • 83. Example: tiramisu::buffer N_input_b("N_input_b", {1}, tiramisu::p_int32, a_input, &function0); N_input.bind_to(&N_input_b); tiramisu::buffer S0_b("S0_b", {N_input(0), N_input(0)}, tiramisu::p_uint8, a_temporary, &function0); S0.bind_to(&S0_b); tiramisu::buffer S1_b("S1_b", {tiramisu::var("N"), tiramisu::var("N")}, tiramisu::p_uint8, a_output, &function0); S1.bind_to(&S1_b); computation::bind_to method https://github.com/rbaghdadi/tiramisu/blob/master/tests/test_87.cpp
  • 84. void compute_at(computation &consumer, tiramisu::var L); void compute_at(computation &consumer, int L); void interchange(tiramisu::var L0, tiramisu::var L1); void set_inline(bool is_inline = true); Various methods of the computation class https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h
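Of the methods listed on this slide, `interchange` is the easiest to picture: it swaps two loop levels. The plain C++ sketch below records the visit order of a small rectangular domain before and after swapping the i and j loops. It is a hand-written illustration of loop interchange in general, not Tiramisu code, and the function names are made up.

```cpp
#include <cassert>
#include <utility>
#include <vector>

using Point = std::pair<int, int>;

// Original nest: i is the outer loop, j the inner loop.
std::vector<Point> original_order(int rows, int cols) {
    assert(rows >= 0 && cols >= 0);
    std::vector<Point> order;
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j) order.emplace_back(i, j);
    return order;
}

// After interchanging the two levels: j is the outer loop, i the inner loop.
std::vector<Point> interchanged_order(int rows, int cols) {
    assert(rows >= 0 && cols >= 0);
    std::vector<Point> order;
    for (int j = 0; j < cols; ++j)
        for (int i = 0; i < rows; ++i) order.emplace_back(i, j);
    return order;
}
```

Both versions visit the same set of points; only the order changes, which is why interchange is legal only when no dependence is violated by the new order.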
  • 85. void shift(tiramisu::var L0, int n); void split(tiramisu::var L0, int sizeX); void split(tiramisu::var L0, int sizeX, tiramisu::var L0_outer, tiramisu::var L0_inner); void tile(int L0, int L1, int sizeX, int sizeY); void tile(int L0, int L1, int L2, int sizeX, int sizeY, int sizeZ); void unroll(tiramisu::var L, int fac); void unroll(tiramisu::var L, int fac, tiramisu::var L_outer, tiramisu::var L_inner); void vectorize(tiramisu::var L, int v); void vectorize(tiramisu::var L, int v, tiramisu::var L_outer, tiramisu::var L_inner); Various methods of the computation class https://github.com/rbaghdadi/tiramisu/blob/master/include/tiramisu/core.h
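For `split` and `unroll`, a plain C++ sketch of the general loop transformations may help. The first function splits one loop into an outer and an inner loop of a given size, and the second unrolls the loop body by a factor of 4 with a remainder loop. Both are hand-written illustrations under the assumption that these methods correspond to the textbook transformations. They are not generated by Tiramisu, and the names `sum_split` and `sum_unrolled4` are made up.

```cpp
#include <cassert>
#include <vector>

// Loop split: i = outer * size_x + inner, with a guard for the last,
// possibly partial, chunk.
long sum_split(const std::vector<int>& data, int size_x) {
    assert(size_x > 0);
    const int n = static_cast<int>(data.size());
    long total = 0;
    for (int outer = 0; outer * size_x < n; ++outer) {
        for (int inner = 0; inner < size_x; ++inner) {
            const int i = outer * size_x + inner;
            if (i < n) total += data[i];
        }
    }
    return total;
}

// Loop unrolled by a factor of 4, followed by a remainder loop.
long sum_unrolled4(const std::vector<int>& data) {
    const int n = static_cast<int>(data.size());
    long total = 0;
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        total += data[i];
        total += data[i + 1];
        total += data[i + 2];
        total += data[i + 3];
    }
    for (; i < n; ++i) total += data[i];  // leftover iterations
    return total;
}
```

Both functions return the same value as a simple loop over `data`; the transformations change only the loop structure, not the result.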