Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
DreamWorks Animation*:
Slashing the cost of 3d Matrix
Math using X-Form
(Transform) Building Blocks
Alex Wells (presenter)
& Martin Watt (DWA)
August 12 & 13, 2015
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS
DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO
SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER
INTELLECTUAL PROPERTY RIGHT.
A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION
CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS
COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH
MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS.
Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves
these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this
information.
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products.
Intel does not control or audit the design or implementation of third party benchmarks or Web sites referenced in this document. Intel encourages all of its customers to visit the referenced Web sites or others where similar
performance benchmarks are reported and confirm whether the referenced benchmarks are accurate and reflect performance of systems available for purchase.
Relative performance is calculated by assigning a baseline value of 1.0 to one benchmark result, and then dividing the actual benchmark result for the baseline platform into each of the specific benchmark results of each of the other
platforms, and assigning them a relative performance number that correlates with the performance improvements reported.
SPEC, SPECint, SPECfp, SPECrate. SPECpower, SPECjAppServer, SPECjbb, SPECjvm, SPECWeb, SPECompM, SPECompL, SPEC MPI, SPECjEnterprise* are trademarks of the Standard Performance Evaluation Corporation. See
http://www.spec.org for more information. TPC-C, TPC-H, TPC-E are trademarks of the Transaction Processing Council. See http://www.tpc.org for more information.
Hyper-Threading Technology requires a computer system with a processor supporting HT Technology and an HT Technology-enabled chipset, BIOS and operating system. Performance will vary depending on the specific hardware and
software you use. For more information including details on which processors support HT Technology, see here
Intel® Turbo Boost Technology requires a Platform with a processor with Intel Turbo Boost Technology capability. Intel Turbo Boost Technology performance varies depending on hardware, software and overall system configuration.
Check with your platform manufacturer on whether your system delivers Intel Turbo Boost Technology. For more information, see http://www.intel.com/technology/turboboost
No computer system can provide absolute security. Requires an enabled Intel® processor and software optimized for use of the technology. Consult your system manufacturer and/or software vendor for more information.
Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families: Go to:
Learn About Intel® Processor Numbers
Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative to obtain Intel’s current plan of record product roadmaps.
Copyright © 2014 Intel Corporation. All rights reserved. Intel, the Intel logo, Xeon and Intel Core are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. All dates and
products specified are for planning purposes only and are subject to change without notice
*Other names and brands may be claimed as the property of others.
Legal Disclaimers
The above statements and any others in this document that refer to plans and expectations for the third quarter, the year and the future are forward-looking statements that
involve a number of risks and uncertainties. Words such as “anticipates,” “expects,” “intends,” “plans,” “believes,” “seeks,” “estimates,” “may,” “will,” “should” and their variations
identify forward-looking statements. Statements that refer to or are based on projections, uncertain events or assumptions also identify forward-looking statements. Many
factors could affect Intel’s actual results, and variances from Intel’s current expectations regarding such factors could cause actual results to differ materially from those
expressed in these forward-looking statements. Intel presently considers the following to be the important factors that could cause actual results to differ materially from the
company’s expectations. Demand could be different from Intel's expectations due to factors including changes in business and economic conditions; customer acceptance of
Intel’s and competitors’ products; supply constraints and other disruptions affecting customers; changes in customer order patterns including order cancellations; and changes
in the level of inventory at customers. Uncertainty in global economic and financial conditions poses a risk that consumers and businesses may defer purchases in response to
negative financial events, which could negatively affect product demand and other related matters. Intel operates in intensely competitive industries that are characterized by
a high percentage of costs that are fixed or difficult to reduce in the short term and product demand that is highly variable and difficult to forecast. Revenue and the gross
margin percentage are affected by the timing of Intel product introductions and the demand for and market acceptance of Intel's products; actions taken by Intel's competitors,
including product offerings and introductions, marketing programs and pricing pressures and Intel’s response to such actions; and Intel’s ability to respond quickly to
technological developments and to incorporate new features into its products. The gross margin percentage could vary significantly from expectations based on capacity
utilization; variations in inventory valuation, including variations related to the timing of qualifying products for sale; changes in revenue levels; segment product mix; the
timing and execution of the manufacturing ramp and associated costs; start-up costs; excess or obsolete inventory; changes in unit costs; defects or disruptions in the supply of
materials or resources; product manufacturing quality/yields; and impairments of long-lived assets, including manufacturing, assembly/test and intangible assets. Intel's
results could be affected by adverse economic, social, political and physical/infrastructure conditions in countries where Intel, its customers or its suppliers operate, including
military conflict and other security risks, natural disasters, infrastructure disruptions, health concerns and fluctuations in currency exchange rates. Expenses, particularly certain
marketing and compensation expenses, as well as restructuring and asset impairment charges, vary depending on the level of demand for Intel's products and the level of
revenue and profits. Intel’s results could be affected by the timing of closing of acquisitions and divestitures. Intel's results could be affected by adverse effects associated with
product defects and errata (deviations from published specifications), and by litigation or regulatory matters involving intellectual property, stockholder, consumer, antitrust,
disclosure and other issues, such as the litigation and regulatory matters described in Intel's SEC reports. An unfavorable ruling could include monetary damages or an
injunction prohibiting Intel from manufacturing or selling one or more products, precluding particular business practices, impacting Intel’s ability to design its products, or
requiring other remedies such as compulsory licensing of intellectual property. A detailed discussion of these and other factors that could affect Intel’s results is included in
Intel’s SEC filings, including the company’s most recent reports on Form 10-Q, Form 10-K and earnings release.
Risk Factors

DWA* Character Animation: Speedup After XBB
[Chart: Before vs. After]
• Overall speedup: 1.2x
• Motion System speedup: 1.6x

Agenda
• Motion System in DWA Character Animation
• Observed performance bottlenecks in Motion System
• 3d Matrix transforms
• How would an ideal transform behave?
• XBB representation
• XBB deferred evaluation
• Results

How is a skeleton represented for animation?
• To represent the bones of a skeleton in 3d space, an animation tool builds a Hierarchy of Joints that records how they are connected.
– Typically a Directed Acyclic Graph of Joints

How is each Joint represented?
• Relative to its parent Joint (in Local Space), each Joint needs to model:
– Rotation as Euler angles (around the X, Y, and Z axes) and rotation order
– Scale (of the X, Y, and Z axes)
– Shear (along the X, Y, and Z axes)
– Translation (X, Y, and Z components)
• Animation curves change values over time
– They drive the Joint’s attributes (rotation, translation, etc.)

How does the skeleton influence the skin?
• Deformers, which compute the final 3d vertices of a character’s skin, need a “Frame” of reference to apply offsets from.
• The “World Space” Position and Orientation of the Joints from the Hierarchy (skeleton) provide that “Frame” of reference.

Representing a “Frame” of reference
• A 4x4 Matrix can represent the Position and Orientation of a Joint in World Space.
• When used in this manner, the 4x4 Matrix is commonly referred to as a 3d transform (x-form).
• A 4x4 Matrix is typically implemented literally as a 4x4 array of floating point values.
struct Matrix4x4
{
    double m[4][4];
};

Why a 4x4 Matrix?
• Rotation, Scale, Shear, and Translation can all be represented as 4x4 Matrices.
• Multiple 4x4 Matrices can be concatenated (multiplied) together into a single 4x4 Matrix.
• 3d points and 3d vectors (offsets) can be multiplied through a 4x4 Matrix to transform them to the position and orientation in “World Space” that it represents.
• For each Joint
– Matrices representing Scale, Shear, Rotation, and Translation are combined into a single “Local Space” 4x4 Matrix.

How To Calculate The World Space Transform Of A Joint?
• By recursively combining the “Local Space” transform of a Joint with its parent Joint’s “Local Space” transform until the root of the hierarchy is reached, a 4x4 Matrix can be accumulated that represents the World Space of that Joint.
• As there are many Joints, it pays off to cache a “World Space” 4x4 Matrix at each Joint, so that a recursive walk up the hierarchy can stop early if a clean “World Space” has been cached.
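
The cached recursive walk described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Joint`, `multiply`, `makeTranslation`, `setLocal`, `invalidate`, and `world` are hypothetical names, not DWA's Hierarchy API, and the row-vector convention (translation in the last row, world = local * parent world) is taken from the matrices shown later in the deck.

```cpp
#include <cassert>
#include <vector>

// Hypothetical types for illustration only; not DWA's Hierarchy API.
struct Matrix4x4 {
    double m[4][4];
};

// General 4x4 concatenation, row-vector convention (translation in row 3).
inline Matrix4x4 multiply(const Matrix4x4& a, const Matrix4x4& b) {
    Matrix4x4 r{};  // value-initialized: all elements start at zero
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

inline Matrix4x4 makeTranslation(double x, double y, double z) {
    return Matrix4x4{{{1.0, 0.0, 0.0, 0.0},
                      {0.0, 1.0, 0.0, 0.0},
                      {0.0, 0.0, 1.0, 0.0},
                      {x, y, z, 1.0}}};
}

struct Joint {
    Joint* parent = nullptr;
    std::vector<Joint*> children;
    Matrix4x4 local = makeTranslation(0.0, 0.0, 0.0);
    Matrix4x4 cachedWorld{};
    bool worldClean = false;

    // Changing a Joint dirties its own cache and every descendant's cache.
    void setLocal(const Matrix4x4& m) {
        local = m;
        invalidate();
    }

    void invalidate() {
        worldClean = false;
        for (Joint* child : children) child->invalidate();
    }

    // Walk up toward the root, stopping at the first clean cached World Space.
    const Matrix4x4& world() {
        if (!worldClean) {
            cachedWorld = parent ? multiply(local, parent->world()) : local;
            worldClean = true;
        }
        return cachedWorld;
    }
};
```

Note that every attribute change still forces a full general-purpose multiply per dirty Joint, which is the cost the rest of the deck sets out to reduce.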

Hierarchy is the core of DWA’s Motion System
• Each time step, 1000’s of Joint attributes change, invalidating a Hierarchy’s cached World Space and Local Space transforms.
• 1000’s of operations on Hierarchy objects build up a complex skeleton.
• Imagine how many bones are used to represent a 4-legged creature with a tail & wings.
• Due to the recursion, there is little opportunity for data vectorization or threading.

Motion System Is On The Critical Path
• Despite heavy parallelization of the Deformation System (green & yellow), it can’t start until the Motion System (red) finishes assembling a Hierarchy.

Wall Time Spent in Each Category
• The Motion System dwarfs the other systems.
• Amdahl’s law keeps our threading & vectorization improvements in the Deformation System from having a larger overall impact.

Time Spent inside each type of Operator
• “hier_apply_fk_around_pivot” is the hottest operator
– Operates on a Hierarchy
– Verified in Intel® VTune™ Amplifier XE
• Several other “hier”-related operations take up other top hot spots.

Matrix Concatenation (Multiplication)
• Typical implementation
– Loop over rows
– Loop over columns
– Compute each result element by multiplying one row of the first matrix across one column of the other
• Simple enough, but how much work did we really just do?
struct Matrix4x4
{
    double m[4][4];

    // Member operator: "m" is this matrix, "iOther" is the right-hand operand.
    Matrix4x4 operator * (const Matrix4x4 &iOther) const
    {
        Matrix4x4 result;
        for (int r = 0; r < 4; ++r)
        {
            for (int c = 0; c < 4; ++c)
            {
                // Dot product of row r of this matrix with column c of iOther.
                double sum = 0.0;
                for (int k = 0; k < 4; ++k)
                {
                    sum += m[r][k] * iOther.m[k][c];
                }
                result.m[r][c] = sum;
            }
        }
        return result;
    }
};

Expensive Matrix Concatenation
• 64 Multiplies (double precision)
• 48 Additions (double precision)
Matrix4x4 operator * (const Matrix4x4 &iOther)
{
Matrix4x4 result;
result.m[0][0] =
m[0][0]*iOther.m[0][0] +
m[0][1]*iOther.m[1][0] +
m[0][2]*iOther.m[2][0] +
m[0][3]*iOther.m[3][0];
result.m[0][1] =
m[0][0]*iOther.m[0][1] +
m[0][1]*iOther.m[1][1] +
m[0][2]*iOther.m[2][1] +
m[0][3]*iOther.m[3][1];
result.m[0][2] =
m[0][0]*iOther.m[0][2] +
m[0][1]*iOther.m[1][2] +
m[0][2]*iOther.m[2][2] +
m[0][3]*iOther.m[3][2];
result.m[0][3] =
m[0][0]*iOther.m[0][3] +
m[0][1]*iOther.m[1][3] +
m[0][2]*iOther.m[2][3] +
m[0][3]*iOther.m[3][3];
result.m[1][0] =
m[1][0]*iOther.m[0][0] +
m[1][1]*iOther.m[1][0] +
m[1][2]*iOther.m[2][0] +
m[1][3]*iOther.m[3][0];
result.m[1][1] =
m[1][0]*iOther.m[0][1] +
m[1][1]*iOther.m[1][1] +
m[1][2]*iOther.m[2][1] +
m[1][3]*iOther.m[3][1];
result.m[1][2] =
m[1][0]*iOther.m[0][2] +
m[1][1]*iOther.m[1][2] +
m[1][2]*iOther.m[2][2] +
m[1][3]*iOther.m[3][2];
result.m[1][3] =
m[1][0]*iOther.m[0][3] +
m[1][1]*iOther.m[1][3] +
m[1][2]*iOther.m[2][3] +
m[1][3]*iOther.m[3][3];
result.m[2][0] =
m[2][0]*iOther.m[0][0] +
m[2][1]*iOther.m[1][0] +
m[2][2]*iOther.m[2][0] +
m[2][3]*iOther.m[3][0];
result.m[2][1] =
m[2][0]*iOther.m[0][1] +
m[2][1]*iOther.m[1][1] +
m[2][2]*iOther.m[2][1] +
m[2][3]*iOther.m[3][1];
result.m[2][2] =
m[2][0]*iOther.m[0][2] +
m[2][1]*iOther.m[1][2] +
m[2][2]*iOther.m[2][2] +
m[2][3]*iOther.m[3][2];
result.m[2][3] =
m[2][0]*iOther.m[0][3] +
m[2][1]*iOther.m[1][3] +
m[2][2]*iOther.m[2][3] +
m[2][3]*iOther.m[3][3];
result.m[3][0] =
m[3][0]*iOther.m[0][0] +
m[3][1]*iOther.m[1][0] +
m[3][2]*iOther.m[2][0] +
m[3][3]*iOther.m[3][0];
result.m[3][1] =
m[3][0]*iOther.m[0][1] +
m[3][1]*iOther.m[1][1] +
m[3][2]*iOther.m[2][1] +
m[3][3]*iOther.m[3][1];
result.m[3][2] =
m[3][0]*iOther.m[0][2] +
m[3][1]*iOther.m[1][2] +
m[3][2]*iOther.m[2][2] +
m[3][3]*iOther.m[3][2];
result.m[3][3] =
m[3][0]*iOther.m[0][3] +
m[3][1]*iOther.m[1][3] +
m[3][2]*iOther.m[2][3] +
m[3][3]*iOther.m[3][3];
return result;
}
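
The 64 and 48 figures follow from 16 result elements, each needing 4 products and 3 additions. A small instrumented multiply can confirm the tally; `OpCounts` and `countedMultiply` are hypothetical helper names introduced only for this check.

```cpp
#include <cassert>

struct Matrix4x4 {
    double m[4][4];
};

// Tally of arithmetic performed (illustrative instrumentation only).
struct OpCounts {
    int multiplies = 0;
    int additions = 0;
};

// General 4x4 multiply that counts its own work. "out" must not alias a or b.
inline OpCounts countedMultiply(const Matrix4x4& a, const Matrix4x4& b, Matrix4x4& out) {
    OpCounts n;
    for (int r = 0; r < 4; ++r) {
        for (int c = 0; c < 4; ++c) {
            double sum = a.m[r][0] * b.m[0][c];  // first product seeds the sum
            ++n.multiplies;
            for (int k = 1; k < 4; ++k) {
                sum += a.m[r][k] * b.m[k][c];    // one multiply and one add per term
                ++n.multiplies;
                ++n.additions;
            }
            out.m[r][c] = sum;
        }
    }
    return n;
}
```

The counts are the same whatever the operands are, even when one of them is the identity, because a general multiply has no knowledge of its inputs.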

Are Any of Those 16 Matrix Values Known At Compile Time?
• Good news! YES!
• If you knew the exact transform a 4x4 matrix was representing, you would know quite a few 0 and 1 values at compile time.
Identity
[1][0][0][0]
[0][1][0][0]
[0][0][1][0]
[0][0][0][1]
Translation(x,y,z)
[1][0][0][0]
[0][1][0][0]
[0][0][1][0]
[x][y][z][1]
Shear(x,y,z)
[1][0][0][0]
[x][1][0][0]
[y][z][1][0]
[0][0][0][1]
Scale(x,y,z)
[x][0][0][0]
[0][y][0][0]
[0][0][z][0]
[0][0][0][1]

What About Rotations?
• Building rotation matrices is more expensive because of the need to call sine and cosine on the angle
• Rotations also have 0 and 1 values
Rotate X axis(angle)
[1][0][0][0]
[0][c][s][0]
[0][-s][c][0]
[0][0][0][1]
Rotate Y axis(angle)
[c][0][-s][0]
[0][1][0][0]
[s][0][c][0]
[0][0][0][1]
Rotate Z axis(angle)
[c][s][0][0]
[-s][c][0][0]
[0][0][1][0]
[0][0][0][1]
where s = sine(angle) and c = cosine(angle)

Huge Optimization Potential!
• Unfortunately, the matrix multiply method doesn’t know that the 4x4 Matrix it was passed has any 0 or 1 values
– So it cannot avoid performing math operations.
• Even if we had separate classes to represent the different transformations, and a version of the matrix multiply method for each,
– The result becomes a general 4x4 matrix.
– Chains of multiplications would only benefit on the 1st multiply operation.

Can we expand the math by hand?
• Pseudo algorithm to compute a Joint’s World Space
– 10 4x4 matrix multiplications
– 1 matrix inversion (very expensive) in the middle
JointWorldSpace = Scale*Shear*
                  ParentScale*ParentShear*
                  RotZ*RotY*RotX*
                  ((ParentScale*ParentShear).inverse())*
                  Translate*
                  ParentWorldSpace;
• YES… But you won’t even want to try
• Good luck getting the expanded math right

Ideal Transform Behavior
• Must keep the high-level representation of the algorithm
• Perform the absolute minimum required number of math operations
– It must track known values
– Continue tracking values through matrix multiplications
• Utilize known information to provide a cheaper alternative to full matrix inversions
• Interface/adapt to existing 4x4 Matrix data types
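
As an illustration of why knowing the transform type makes inversion cheaper, the sketch below inverts a scale and a translation directly instead of running a general 4x4 inverse. The `inverse()` members are assumptions made for this example; the slides do not show how XBB exposes inversion.

```cpp
#include <cassert>

// Hypothetical stand-ins; the real XBB interface is not shown on the slides.
struct Scale {
    double x, y, z;
    // The inverse of a scale is a scale by the reciprocals: three divisions.
    // Assumes none of the factors is zero.
    Scale inverse() const { return Scale{1.0 / x, 1.0 / y, 1.0 / z}; }
};

struct Translation {
    double x, y, z;
    // The inverse of a translation is the opposite translation: three negations.
    Translation inverse() const { return Translation{-x, -y, -z}; }
};
```

In both cases the result keeps its specific type, so the known 0 and 1 elements remain known after inversion.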

Introducing Xform Building Blocks (XBB)
• C++ library to enable composition of 3d transforms
• Instead of a general purpose 4x4 matrix, it provides specific types for different transforms.
• Tracks known values through multiplication chains
• Deferred evaluation
• Only localized source code changes are required to take advantage of it

XBB Scale, Shear3, & Translation
Before:
ref::Matrix4x4 S;
S.makeScale(scaleX, scaleY, scaleZ);
ref::Matrix4x4 SH;
SH.makeShear3(shearX, shearY, shearZ);
ref::Matrix4x4 T;
T.makeTranslation(transX, transY, transZ);
– 128 bytes of stack used per 4x4 Matrix
– Overhead to initialize to Identity(), then overwrite elements
After XBB:
xbb::Scale S(scaleX, scaleY, scaleZ);
xbb::Shear3 SH(shearX, shearY, shearZ);
xbb::Translation T(transX, transY, transZ);
– 24 bytes of stack
– No overhead to initialize 4x4 elements that are known to be 0 or 1 for each type of transform

XBB Transform Representation
• Stores only the non-constant data needed to represent a 4x4 matrix of the transform type
• Provides methods for element-level access to a 4x4 matrix
– Return known constant values
Translation(x,y,z)
[1][0][0][0]
[0][1][0][0]
[0][0][1][0]
[x][y][z][1]
struct Translation
{
    double x;
    double y;
    double z;
    …
};
double e00() const { return 1.0; }
double e01() const { return 0.0; }
double e02() const { return 0.0; }
double e03() const { return 0.0; }
double e10() const { return 0.0; }
double e11() const { return 1.0; }
double e12() const { return 0.0; }
double e13() const { return 0.0; }
double e20() const { return 0.0; }
double e21() const { return 0.0; }
double e22() const { return 1.0; }
double e23() const { return 0.0; }
double e30() const { return x; }
double e31() const { return y; }
double e32() const { return z; }
double e33() const { return 1.0; }

XBB Transform Constancy
• Each transform identifies whether each 4x4 matrix element is a constant 0, a constant 1, or Not Constant
• Constancy is suitable as a template parameter
– Matrix Multiply will make use of it
enum Constancy
{
    ConstantZero,
    ConstantOne,
    NotConstant
};
Translation(x,y,z)
[1][0][0][0]
[0][1][0][0]
[0][0][1][0]
[x][y][z][1]
static const Constancy c00 = ConstantOne;
static const Constancy c01 = ConstantZero;
static const Constancy c02 = ConstantZero;
static const Constancy c03 = ConstantZero;
static const Constancy c10 = ConstantZero;
static const Constancy c11 = ConstantOne;
static const Constancy c12 = ConstantZero;
static const Constancy c13 = ConstantZero;
static const Constancy c20 = ConstantZero;
static const Constancy c21 = ConstantZero;
static const Constancy c22 = ConstantOne;
static const Constancy c23 = ConstantZero;
static const Constancy c30 = NotConstant;
static const Constancy c31 = NotConstant;
static const Constancy c32 = NotConstant;
static const Constancy c33 = ConstantOne;
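
To see how Constancy values used as template parameters can remove arithmetic, consider one term of a matrix multiply. The sketch below is an assumption about the general idea, not XBB's code: `productConstancy`, `product`, and `translationTimesTranslationE30` are hypothetical names, and it relies on plain `if` tests over compile-time constants (which an optimizing compiler can fold away) rather than the template specialization the slides mention.

```cpp
#include <cassert>

enum Constancy { ConstantZero, ConstantOne, NotConstant };

// Constancy of a product a*b, computable at compile time from the operands' Constancy.
constexpr Constancy productConstancy(Constancy a, Constancy b) {
    return (a == ConstantZero || b == ConstantZero) ? ConstantZero
         : (a == ConstantOne && b == ConstantOne)   ? ConstantOne
                                                    : NotConstant;
}

// One term of a row-by-column product. The conditions depend only on template
// parameters, so each instantiation reduces to a single return statement.
template <Constancy A, Constancy B>
inline double product(double a, double b) {
    if (A == ConstantZero || B == ConstantZero) return 0.0;  // no multiply needed
    if (A == ConstantOne) return b;                          // 1 * b
    if (B == ConstantOne) return a;                          // a * 1
    return a * b;                                            // both values unknown
}

// Element [3][0] of Translation(x1,y1,z1) * Translation(x2,y2,z2):
//   row 3 of the left     = [x1][y1][z1][1]
//   column 0 of the right = [1][0][0][x2]
inline double translationTimesTranslationE30(double x1, double y1, double z1, double x2) {
    return product<NotConstant, ConstantOne>(x1, 1.0)
         + product<NotConstant, ConstantZero>(y1, 0.0)
         + product<NotConstant, ConstantZero>(z1, 0.0)
         + product<ConstantOne, NotConstant>(1.0, x2);
}
```

Here four multiplies collapse to none, leaving `x1 + x2` plus additions of zero; a fuller version would also use the Constancy of each term to drop those additions.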

XBB Rotations
Before:
ref::Matrix4x4 Rx;
Rx.makeRotationX(rotX);
ref::Matrix4x4 Ry;
Ry.makeRotationY(rotY);
ref::Matrix4x4 Rz;
Rz.makeRotationZ(rotZ);
– 128 bytes of stack used per 4x4 Matrix
– Overhead to initialize to Identity(), then overwrite elements
After XBB:
xbb::RotationX Rx(rotX);
xbb::RotationY Ry(rotY);
xbb::RotationZ Rz(rotZ);
– 16 bytes of stack
– No overhead to initialize 4x4 elements that are known to be 0 or 1 for each type of transform
[Slide callouts: sine(angle), cosine(angle)]

XBB Rotation Representation
• Stores the sine and cosine of the angle, not the angle itself.
• Provides methods for element-level access to a 4x4 matrix
– Return known constant values
Rotate X axis(angle)
[1][0][0][0]
[0][c][s][0]
[0][-s][c][0]
[0][0][0][1]
struct RotationX
{
    double cosineOfAngle;
    double sineOfAngle;
    …
};
double e00() const { return 1.0; }
double e01() const { return 0.0; }
double e02() const { return 0.0; }
double e03() const { return 0.0; }
double e10() const { return 0.0; }
double e11() const { return cosineOfAngle; }
double e12() const { return sineOfAngle; }
double e13() const { return 0.0; }
double e20() const { return 0.0; }
double e21() const { return -sineOfAngle; }
double e22() const { return cosineOfAngle; }
double e23() const { return 0.0; }
double e30() const { return 0.0; }
double e31() const { return 0.0; }
double e32() const { return 0.0; }
double e33() const { return 1.0; }

XBB Multiply
Before:
ref::Matrix4x4 SxSH;
SxSH = S*SH;
After XBB:
auto SxSH = S*SH;
xbb::Matrix4x3 SxSH_Matrix;
SxSH.to(SxSH_Matrix);
– No math is performed by S*SH. Instead, a new type Multiply<Scale, Shear3> is returned.
– Math is deferred until you explicitly export to a general purpose matrix.
– XBB’s Multiply uses the Constancy of its template parameters to define its own Constancy values.
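
The deferred multiply can be pictured with a small expression-template sketch. Everything here is an assumption for illustration: the `xbb_sketch` namespace, the runtime `e(r, c)` accessor, and this naive `to()` are not XBB's implementation, which resolves element accessors and Constancy at compile time.

```cpp
#include <cassert>
#include <type_traits>

namespace xbb_sketch {

struct Matrix4x4 {
    double m[4][4];
};

struct Scale {
    double x, y, z;
    // Element access: diagonal holds x, y, z, 1; everything else is 0.
    double e(int r, int c) const {
        if (r != c) return 0.0;
        return r == 0 ? x : (r == 1 ? y : (r == 2 ? z : 1.0));
    }
};

struct Translation {
    double x, y, z;
    // Element access: identity except for x, y, z in row 3.
    double e(int r, int c) const {
        if (r == 3 && c < 3) return c == 0 ? x : (c == 1 ? y : z);
        return r == c ? 1.0 : 0.0;
    }
};

// Holds its operands; no arithmetic happens until an element is requested.
template <typename L, typename R>
struct Multiply {
    L left;
    R right;

    double e(int r, int c) const {
        double sum = 0.0;
        for (int k = 0; k < 4; ++k) sum += left.e(r, k) * right.e(k, c);
        return sum;
    }

    // Explicit export to a general purpose matrix: this is where math runs.
    void to(Matrix4x4& out) const {
        for (int r = 0; r < 4; ++r)
            for (int c = 0; c < 4; ++c) out.m[r][c] = e(r, c);
    }
};

// operator* only records the operands in the returned type.
template <typename L, typename R>
Multiply<L, R> operator*(const L& l, const R& r) {
    return Multiply<L, R>{l, r};
}

}  // namespace xbb_sketch
```

Because `Multiply` also provides `e(r, c)`, products nest, so `S * T * T2` yields `Multiply<Multiply<Scale, Translation>, Translation>`. This naive version recomputes inner elements repeatedly and does not skip known 0 and 1 values.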

Multiplication Chains
Before:
ref::Matrix4x4 jointLocalSpace;
jointLocalSpace = S*SH*Rz*Ry*Rx*T;
– 5 matrix multiplications: 320 multiplications, 240 adds
After XBB:
xbb::Matrix4x3 jointLocalSpace;
(S*SH*Rz*Ry*Rx*T).to(jointLocalSpace);
– The expression has the type:
Multiply<Multiply<Multiply<Multiply<Multiply<Scale, Shear3>,
                                             RotationZ>,
                                    RotationY>,
                           RotationX>,
         Translation>
– Confirmed assembly has minimum math operations
– Speedup 2.45x
Deferred Evaluation (reduce)
35
typedef ReducedMatrix
<
c00, c01, c02, c03,
c10, c11, c12, c13,
c20, c21, c22, c23,
c30, c31, c32, c33
> ReducedType;
• ReducedMatrix is based on a transform’s Constancy.
– Only has data members for NotConstant matrix elements
• Multiply’s reduce recursively expands its left and right operands
– Expands out the entire multiplication chain
• Each of the 4x4 elements is set by setByMatrixMultiply
– Actually multiplies a row of the left operand by a column of the right operand
– Knows the Constancy of the elements from the reduced left and right transforms
• Using template specialization based on the Constancy
– Only the exact terms necessary are accessed
– Emits only the necessary multiplications & additions
ReducedType Multiply::reduce() const
{
    const auto tl = left.reduce();
    const auto tr = right.reduce();
    ReducedType r;
    r.setByMatrixMultiply<0,0>(tl,tr);
    r.setByMatrixMultiply<0,1>(tl,tr);
    ...
    r.setByMatrixMultiply<3,3>(tl,tr);
    return r;
}
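How Constancy lets a row-by-column product drop terms can be illustrated with compile-time tags. The sketch below is an assumption-laden stand-in: the enum values Zero and One, the helpers mulConstancy and term, and the Shear3 layout (column 0 holding 1, xy, xz, 0) are hypothetical, and it uses constant conditions on template parameters rather than the template specializations the slide describes.

```cpp
#include <cassert>

// Hypothetical Constancy tags; only NotConstant appears by name in the slides.
enum class Constancy { Zero, One, NotConstant };

// Constancy of a product of two elements, known at compile time.
constexpr Constancy mulConstancy(Constancy a, Constancy b) {
    return (a == Constancy::Zero || b == Constancy::Zero) ? Constancy::Zero
         : (a == Constancy::One && b == Constancy::One)   ? Constancy::One
                                                          : Constancy::NotConstant;
}

static_assert(mulConstancy(Constancy::Zero, Constancy::NotConstant) == Constancy::Zero,
              "anything times a constant zero is a constant zero");

// One term of a row-by-column product. The conditions depend only on template
// parameters, so each instantiation keeps exactly one branch after optimization.
template <Constancy A, Constancy B>
inline double term(double a, double b) {
    if (A == Constancy::Zero || B == Constancy::Zero) return 0.0;  // no multiply
    if (A == Constancy::One) return b;                             // no multiply
    if (B == Constancy::One) return a;                             // no multiply
    return a * b;                                                  // real work
}

// Element [1][0] of Scale * Shear3 under the assumed layout:
// row 1 of Scale is (0, sy, 0, 0); column 0 of Shear3 is (1, xy, xz, 0).
inline double scaleShearElement10(double sy, double xy, double xz) {
    using C = Constancy;
    return term<C::Zero, C::One>(0.0, 1.0)
         + term<C::NotConstant, C::NotConstant>(sy, xy)  // the only multiplication
         + term<C::Zero, C::NotConstant>(0.0, xz)
         + term<C::Zero, C::Zero>(0.0, 0.0);
}
```

This sketch still writes out the additions of constant zeros; a specialization-based design such as the one the slide describes can avoid emitting those as well.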
XBB: Cached Rotations
• Many Hierarchy operations change only the Translation of a Joint.
– If we could cache the Rotation transforms, then many expensive sin/cos calls could be avoided.
– Matrix4x4 is too big (128 bytes) to cache one for each of Rotation X, Y, and Z.
• XBB rotations are only 16 bytes each
– Small enough to cache inside the Joint object
Use cached sin/cos of angles:
(S*SH*cached.Rz*cached.Ry*cached.Rx*T).to(jointLocalSpace);
Speedup 12.71x
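A rotation that stores only the sine and cosine of its angle fits in two doubles, which is what makes caching it per joint practical. The sketch below assumes hypothetical RotationX and Joint types (the member names and setters are illustrative, not DWA's or XBB's API) and reuses the Rotate X layout shown earlier in the deck.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical XBB-style rotation: stores only sin and cos of the angle.
struct RotationX {
    double s;
    double c;
    explicit RotationX(double angle) : s(std::sin(angle)), c(std::cos(angle)) {}
};
static_assert(sizeof(RotationX) == 2 * sizeof(double),
              "two doubles: 16 bytes where double is 8 bytes");

// Hypothetical joint that keeps its rotation cached between evaluations.
struct Joint {
    double tx, ty, tz;
    RotationX cachedRx;

    Joint() : tx(0.0), ty(0.0), tz(0.0), cachedRx(0.0) {}

    // sin/cos are paid only when the angle itself changes.
    void setRotateX(double angle) { cachedRx = RotationX(angle); }

    // Translation-only edits leave the cached rotation untouched: no trig calls.
    void setTranslate(double x, double y, double z) { tx = x; ty = y; tz = z; }

    // Fills the 3x3 rotation block: rows [1 0 0], [0 c s], [0 -s c].
    void rotationRows(double out[3][3]) const {
        out[0][0] = 1.0; out[0][1] = 0.0;         out[0][2] = 0.0;
        out[1][0] = 0.0; out[1][1] = cachedRx.c;  out[1][2] = cachedRx.s;
        out[2][0] = 0.0; out[2][1] = -cachedRx.s; out[2][2] = cachedRx.c;
    }
};
```

Three such cached rotations cost 48 bytes per joint under these assumptions, compared with 384 bytes for three cached Matrix4x4 values at the 128 bytes each quoted above.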
XBB Identity & Transpose
• Identity is free in any multiplication chain
– Optimized out entirely
– Only 1 byte of stack space (empty struct)
Identity id;
(S*SH*id*R*T).to(result);
• Transpose is free in any multiplication chain
– Deferred evaluation pulls results out in a different order
– No additional math or data movement
(S*SH*R*T).transpose().to(result);
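One way to make Identity cost nothing in a chain is to drop it at the type level. The sketch below is an assumption, not XBB's mechanism (the slides suggest XBB lets Constancy remove the math instead): overloads of operator* in a hypothetical xbb_sketch namespace return the other operand unchanged whenever one side is Identity.

```cpp
#include <cassert>
#include <type_traits>

namespace xbb_sketch {

struct Identity {};                      // empty struct: no data members
struct Translation { double x, y, z; };  // hypothetical primitive

template <typename Left, typename Right>
struct Multiply { Left left; Right right; };

// Generic deferred product.
template <typename Left, typename Right>
Multiply<Left, Right> operator*(const Left& l, const Right& r) {
    return Multiply<Left, Right>{l, r};
}

// Identity on either side: return the other operand, so no Multiply is built.
template <typename T>
T operator*(const Identity&, const T& t) { return t; }

template <typename T>
T operator*(const T& t, const Identity&) { return t; }

// Identity * Identity needs its own overload to avoid an ambiguous call.
inline Identity operator*(const Identity&, const Identity&) { return Identity{}; }

static_assert(std::is_empty<Identity>::value, "Identity carries no data");
static_assert(sizeof(Identity) == 1, "an empty struct still occupies one byte");

// Usage: the Identity disappears from the resulting type.
inline Translation dropIdentity(const Translation& T) {
    Identity id;
    auto result = id * T * id;
    static_assert(std::is_same<decltype(result), Translation>::value,
                  "Identity was removed from the chain");
    assert(result.x == T.x);
    return result;
}

}  // namespace xbb_sketch
```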
Before: Inverse of (Scale*Shear)
• Inverse is very expensive
– Determinant
– Cofactor
– Transpose
– Division
– Scalar matrix multiply
inverseOfSxSH = (S*SH).inverse();
After XBB: Inverse of (Scale*Shear)
(S*SH).inverse().to(inverseOfSxSH);
• MAGIC happens
– Inverse becomes part of deferred evaluation!
• Because we have a representation of the multiplication chain
– we can move the inverse inside the multiplication chain and reverse its order:
(SH.inverse()*S.inverse()).to(inverseOfSxSH);
• Inverse of most transform primitives is free
– except Scale, which costs 3 divisions
• During deferred evaluation
– the logical 4x4 matrix values are reordered and their signs flipped where needed to represent the inverse
Speedup 6.43x
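Moving the inverse inside the chain relies on the identity (A*B)^-1 = B^-1 * A^-1. The sketch below shows that reversal on a deferred product type, using hypothetical Scale, Translation, and RotationX primitives in an assumed xbb_sketch namespace. It computes the primitive inverses eagerly for brevity, whereas the slides describe XBB folding them into deferred evaluation, and it leaves out Shear3, whose layout is not given here.

```cpp
#include <cassert>
#include <type_traits>

namespace xbb_sketch {

struct Scale       { double x, y, z; };
struct Translation { double x, y, z; };
struct RotationX   { double s, c; };  // sin and cos of the angle

template <typename Left, typename Right>
struct Multiply { Left left; Right right; };

// Scale: the one primitive here whose inverse needs divisions (three of them).
inline Scale inverse(const Scale& v) { return {1.0 / v.x, 1.0 / v.y, 1.0 / v.z}; }

// Translation: the inverse translates by the opposite offset (sign flips only).
inline Translation inverse(const Translation& t) { return {-t.x, -t.y, -t.z}; }

// Rotation: the inverse rotates by -angle, so sin changes sign and cos does not.
inline RotationX inverse(const RotationX& r) { return {-r.s, r.c}; }

// Product: invert each operand and reverse the order.
template <typename Left, typename Right>
auto inverse(const Multiply<Left, Right>& m)
    -> Multiply<decltype(inverse(m.right)), decltype(inverse(m.left))> {
    return {inverse(m.right), inverse(m.left)};
}

// Usage: the inverse of (Scale * Translation) has type Translation * Scale.
inline Multiply<Translation, Scale> invertScaleTranslate(const Scale& S,
                                                         const Translation& T) {
    Multiply<Scale, Translation> chain{S, T};
    auto inv = inverse(chain);
    static_assert(std::is_same<decltype(inv), Multiply<Translation, Scale>>::value,
                  "operand order is reversed");
    assert(inv.left.x == -T.x);
    return inv;
}

}  // namespace xbb_sketch
```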
XBB Integration to DWA Motion System
• Provide template specializations for adapters to map between DWA math classes and XBB’s.
– Allows XBB deferred evaluation directly into DWA matrix types
• In many scenarios, the transforms could have been Identity based on logic inside the Joint.
– To take full advantage of XBB, we needed to know the exact type of the transforms involved.
• Templatized the Hierarchy algorithm, making conditional logic controlled by template parameters, e.g.
– Order of Rotations
– Scale Propagation Mode
• Specialized templates based on parameters to
– Use the correct type of XBB transform
  • Identity whenever possible
– Multiply the Rotations in the correct order
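Choosing the transform type from a template parameter, so that Identity is used whenever the option makes a transform a no-op, might look like the sketch below. The ScalePropagation enum, its two values, and the JointTraits name are hypothetical; the slides name the option but not its modes or the real DWA types.

```cpp
#include <cassert>
#include <type_traits>

struct Identity {};                // empty transform, costs nothing in a chain
struct Scale { double x, y, z; };  // hypothetical primitive

// Hypothetical option values for the Scale Propagation Mode.
enum class ScalePropagation { None, Inherit };

// The option is a template parameter, so the transform type is fixed at
// compile time instead of being decided by a runtime branch inside the Joint.
template <ScalePropagation Mode>
struct JointTraits {
    using ParentScale =
        typename std::conditional<Mode == ScalePropagation::None,
                                  Identity,  // nothing to propagate
                                  Scale>::type;
};

static_assert(std::is_same<JointTraits<ScalePropagation::None>::ParentScale,
                           Identity>::value,
              "no propagation selects Identity");
static_assert(std::is_same<JointTraits<ScalePropagation::Inherit>::ParentScale,
                           Scale>::value,
              "inherited scale selects a real Scale");

// Runtime helper used only to demonstrate the selection.
template <ScalePropagation Mode>
bool parentScaleIsIdentity() {
    return std::is_same<typename JointTraits<Mode>::ParentScale, Identity>::value;
}
```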
XBB Integration to DWA Motion System (continued…)
• Built a jump table with instances of the algorithm for all the different combinations of options and rotation orders.
– Used enums as indexes into a multi-dimensional array of function pointers to the corresponding algorithm instance to execute.
• Used XBB for decomposing a World Space Matrix4x4 into individual Joint attributes.
• Rewrote the expensive “hier_apply_fk_around_pivot” with XBB directly vs. going through the Hierarchy object
– Avoids the high overhead of building the Hierarchy on the fly
• Performed non-XBB-related optimizations
– Reduced dynamic memory allocation by replacing local std::vector<T> with a stack-based array when possible
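The jump table described above can be sketched as a two-dimensional array of function pointers indexed by enums. Everything named here (RotationOrder, ScaleMode, evaluateJoint, dispatch) is hypothetical, and the algorithm body is a stand-in that only reports which instance ran.

```cpp
#include <cassert>

// Hypothetical option enums; the values double as array indexes.
enum RotationOrder { kXYZ = 0, kZYX = 1, kNumRotationOrders = 2 };
enum ScaleMode { kNoScale = 0, kWithScale = 1, kNumScaleModes = 2 };

// One algorithm instance per option combination. In a real system the template
// parameters would select transform types and rotation order at compile time.
template <RotationOrder Order, ScaleMode Mode>
int evaluateJoint() {
    // Stand-in body: returns an id identifying this instance.
    return static_cast<int>(Order) * 10 + static_cast<int>(Mode);
}

using EvalFn = int (*)();

// Jump table: runtime enum values pick a fully specialized instance.
static const EvalFn kJumpTable[kNumRotationOrders][kNumScaleModes] = {
    { &evaluateJoint<kXYZ, kNoScale>, &evaluateJoint<kXYZ, kWithScale> },
    { &evaluateJoint<kZYX, kNoScale>, &evaluateJoint<kZYX, kWithScale> },
};

// A single indirect call replaces per-joint branching on the options.
inline int dispatch(RotationOrder order, ScaleMode mode) {
    return kJumpTable[order][mode]();
}
```

The runtime decision is made once per call at the table lookup, while each instance keeps the exact transform types the deferred evaluation needs.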
XBB DWA Motion System Results
[Chart: Before vs. After]
hier_apply_fk_around_pivot: Speedup 2.8x
Motion System: Speedup 1.6x
Overall: Speedup 1.2x
XBB DWA Motion System Scaling
• Reducing the Critical Path helped Thread Scaling.
Reached goal of 30 fps on a single Avoton cartridge
Call to Action
• A good way to improve the impact of vectorization or threading is to reduce the amount of work being done outside those data-parallel regions.
– Ideally, do less work in the first place.
• Complex optimization problems can be represented in C++ and presented back to the compiler in a form it can excel at optimizing.
– Expanding math by hand is untenable.
• You can do much more with C++11/14 to encapsulate problems while retaining the original high-level algorithm
– Look for optimization problems that might be representable at a higher level.
Future Work
• XBB has exactly the features required to support the DWA Motion System.
• For general-purpose use
– more transformations and math operations might be required, e.g.
  • Inverse of a general 4x4 matrix
  • Single-precision version or template-based data type
• XBB can be licensed or potentially open sourced upon request.
– Could be of use to CAD, Animation Tools, and Gaming.
• Contact Alex Wells (alex.m.wells@intel.com)
Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...
 
Advanced Techniques to Accelerate Model Tuning | Software for AI Optimization...
Advanced Techniques to Accelerate Model Tuning | Software for AI Optimization...Advanced Techniques to Accelerate Model Tuning | Software for AI Optimization...
Advanced Techniques to Accelerate Model Tuning | Software for AI Optimization...
 
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...
Reducing Deep Learning Integration Costs and Maximizing Compute Efficiency| S...
 
AWS & Intel Webinar Series - Accelerating AI Research
AWS & Intel Webinar Series - Accelerating AI ResearchAWS & Intel Webinar Series - Accelerating AI Research
AWS & Intel Webinar Series - Accelerating AI Research
 
Intel Developer Program
Intel Developer ProgramIntel Developer Program
Intel Developer Program
 
Intel AIDC Houston Summit - Overview Slides
Intel AIDC Houston Summit - Overview SlidesIntel AIDC Houston Summit - Overview Slides
Intel AIDC Houston Summit - Overview Slides
 
AIDC NY: BODO AI Presentation - 09.19.2019
AIDC NY: BODO AI Presentation - 09.19.2019AIDC NY: BODO AI Presentation - 09.19.2019
AIDC NY: BODO AI Presentation - 09.19.2019
 
AIDC NY: Applications of Intel AI by QuEST Global - 09.19.2019
AIDC NY: Applications of Intel AI by QuEST Global - 09.19.2019AIDC NY: Applications of Intel AI by QuEST Global - 09.19.2019
AIDC NY: Applications of Intel AI by QuEST Global - 09.19.2019
 
Advanced Single Instruction Multiple Data (SIMD) Programming with Intel® Impl...
Advanced Single Instruction Multiple Data (SIMD) Programming with Intel® Impl...Advanced Single Instruction Multiple Data (SIMD) Programming with Intel® Impl...
Advanced Single Instruction Multiple Data (SIMD) Programming with Intel® Impl...
 
Build a Deep Learning Video Analytics Framework | SIGGRAPH 2019 Technical Ses...
Build a Deep Learning Video Analytics Framework | SIGGRAPH 2019 Technical Ses...Build a Deep Learning Video Analytics Framework | SIGGRAPH 2019 Technical Ses...
Build a Deep Learning Video Analytics Framework | SIGGRAPH 2019 Technical Ses...
 
Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019...
Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019...Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019...
Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019...
 
RenderMan*: The Role of Open Shading Language (OSL) with Intel® Advanced Vect...
RenderMan*: The Role of Open Shading Language (OSL) with Intel® Advanced Vect...RenderMan*: The Role of Open Shading Language (OSL) with Intel® Advanced Vect...
RenderMan*: The Role of Open Shading Language (OSL) with Intel® Advanced Vect...
 
AIDC India - AI on IA
AIDC India  - AI on IAAIDC India  - AI on IA
AIDC India - AI on IA
 
AIDC India - Intel Movidius / Open Vino Slides
AIDC India - Intel Movidius / Open Vino SlidesAIDC India - Intel Movidius / Open Vino Slides
AIDC India - Intel Movidius / Open Vino Slides
 
AIDC India - AI Vision Slides
AIDC India - AI Vision SlidesAIDC India - AI Vision Slides
AIDC India - AI Vision Slides
 
Enhance and Accelerate Your AI and Machine Learning Solution | SIGGRAPH 2019 ...
Enhance and Accelerate Your AI and Machine Learning Solution | SIGGRAPH 2019 ...Enhance and Accelerate Your AI and Machine Learning Solution | SIGGRAPH 2019 ...
Enhance and Accelerate Your AI and Machine Learning Solution | SIGGRAPH 2019 ...
 

Último

Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceanilsa9823
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 

Último (20)

Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 

DreamWorks Animation

  • 1. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. DreamWorks Animation*: Slashing the cost of 3d Matrix Math using X-Form (Transform) Building Blocks
  • 4. DreamWorks Animation: Slashing the cost of 3d Matrix Math using X-Form (Transform) Building Blocks. Alex Wells (presenter) & Martin Watt (DWA), August 12 & 13, 2015
  • 5. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information. 
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. Intel does not control or audit the design or implementation of third party benchmarks or Web sites referenced in this document. Intel encourages all of its customers to visit the referenced Web sites or others where similar performance benchmarks are reported and confirm whether the referenced benchmarks are accurate and reflect performance of systems available for purchase. Relative performance is calculated by assigning a baseline value of 1.0 to one benchmark result, and then dividing the actual benchmark result for the baseline platform into each of the specific benchmark results of each of the other platforms, and assigning them a relative performance number that correlates with the performance improvements reported. SPEC, SPECint, SPECfp, SPECrate. 
SPECpower, SPECjAppServer, SPECjbb, SPECjvm, SPECWeb, SPECompM, SPECompL, SPEC MPI, SPECjEnterprise* are trademarks of the Standard Performance Evaluation Corporation. See http://www.spec.org for more information. TPC-C, TPC-H, TPC-E are trademarks of the Transaction Processing Council. See http://www.tpc.org for more information. Hyper-Threading Technology requires a computer system with a processor supporting HT Technology and an HT Technology-enabled chipset, BIOS and operating system. Performance will vary depending on the specific hardware and software you use. For more information including details on which processors support HT Technology, see here Intel® Turbo Boost Technology requires a Platform with a processor with Intel Turbo Boost Technology capability. Intel Turbo Boost Technology performance varies depending on hardware, software and overall system configuration. Check with your platform manufacturer on whether your system delivers Intel Turbo Boost Technology. For more information, see http://www.intel.com/technology/turboboost No computer system can provide absolute security. Requires an enabled Intel® processor and software optimized for use of the technology. Consult your system manufacturer and/or software vendor for more information. Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families: Go to: Learn About Intel® Processor Numbers Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative to obtain Intel’s current plan of record product roadmaps. Copyright © 2014 Intel Corporation. All rights reserved. Intel, the Intel logo, Xeon and Intel Core are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. 
All dates and products specified are for planning purposes only and are subject to change without notice *Other names and brands may be claimed as the property of others. Legal Disclaimers 5
  • 6. Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. The above statements and any others in this document that refer to plans and expectations for the third quarter, the year and the future are forward-looking statements that involve a number of risks and uncertainties. Words such as “anticipates,” “expects,” “intends,” “plans,” “believes,” “seeks,” “estimates,” “may,” “will,” “should” and their variations identify forward-looking statements. Statements that refer to or are based on projections, uncertain events or assumptions also identify forward-looking statements. Many factors could affect Intel’s actual results, and variances from Intel’s current expectations regarding such factors could cause actual results to differ materially from those expressed in these forward-looking statements. Intel presently considers the following to be the important factors that could cause actual results to differ materially from the company’s expectations. Demand could be different from Intel's expectations due to factors including changes in business and economic conditions; customer acceptance of Intel’s and competitors’ products; supply constraints and other disruptions affecting customers; changes in customer order patterns including order cancellations; and changes in the level of inventory at customers. Uncertainty in global economic and financial conditions poses a risk that consumers and businesses may defer purchases in response to negative financial events, which could negatively affect product demand and other related matters. Intel operates in intensely competitive industries that are characterized by a high percentage of costs that are fixed or difficult to reduce in the short term and product demand that is highly variable and difficult to forecast. 
Revenue and the gross margin percentage are affected by the timing of Intel product introductions and the demand for and market acceptance of Intel's products; actions taken by Intel's competitors, including product offerings and introductions, marketing programs and pricing pressures and Intel’s response to such actions; and Intel’s ability to respond quickly to technological developments and to incorporate new features into its products. The gross margin percentage could vary significantly from expectations based on capacity utilization; variations in inventory valuation, including variations related to the timing of qualifying products for sale; changes in revenue levels; segment product mix; the timing and execution of the manufacturing ramp and associated costs; start-up costs; excess or obsolete inventory; changes in unit costs; defects or disruptions in the supply of materials or resources; product manufacturing quality/yields; and impairments of long-lived assets, including manufacturing, assembly/test and intangible assets. Intel's results could be affected by adverse economic, social, political and physical/infrastructure conditions in countries where Intel, its customers or its suppliers operate, including military conflict and other security risks, natural disasters, infrastructure disruptions, health concerns and fluctuations in currency exchange rates. Expenses, particularly certain marketing and compensation expenses, as well as restructuring and asset impairment charges, vary depending on the level of demand for Intel's products and the level of revenue and profits. Intel’s results could be affected by the timing of closing of acquisitions and divestitures. 
Intel's results could be affected by adverse effects associated with product defects and errata (deviations from published specifications), and by litigation or regulatory matters involving intellectual property, stockholder, consumer, antitrust, disclosure and other issues, such as the litigation and regulatory matters described in Intel's SEC reports. An unfavorable ruling could include monetary damages or an injunction prohibiting Intel from manufacturing or selling one or more products, precluding particular business practices, impacting Intel’s ability to design its products, or requiring other remedies such as compulsory licensing of intellectual property. A detailed discussion of these and other factors that could affect Intel’s results is included in Intel’s SEC filings, including the company’s most recent reports on Form 10-Q, Form 10-K and earnings release. Risk Factors 6
  • 8. DWA* Character Animation Speedup After XBB
      • Motion System speedup: 1.6x
      • Overall speedup: 1.2x
      [Chart: Before vs. After]
  • 9. Agenda
      • Motion System in DWA Character Animation
      • Observed performance bottlenecks in Motion System
      • 3d Matrix transforms
      • How would an ideal transform behave
      • XBB representation
      • XBB deferred evaluation
      • Results
  • 10. How is a skeleton represented for animation?
      • To represent the bones of a skeleton in 3d space, an animation tool builds a Hierarchy of Joints that records how they are connected.
          – Typically a Directed Acyclic Graph of Joints
  • 11. How is each Joint represented?
      • Relative to a parent Joint (in Local Space), each Joint needs to model:
          – Rotational Euler Angles (around the X, Y, and Z axes) & Order
          – Scale (of the X, Y, and Z axes)
          – Shear (along the X, Y, and Z axes)
          – Translation (X, Y, and Z components)
      • Animation curves change values over time and drive the Joint’s attributes (rotation, translation, etc.)
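The per-joint attributes listed on this slide could be grouped as in the sketch below. The names `JointAttributes`, `RotationOrder`, and `isIdentity`, as well as the field layout, are hypothetical and chosen only for illustration; the deck does not show DWA's actual data structures.

```cpp
#include <array>

// Hypothetical tag for the order in which the three Euler rotations are applied.
enum class RotationOrder { XYZ, XZY, YXZ, YZX, ZXY, ZYX };

// Hypothetical container for one joint's Local Space attributes.
// Defaults describe a joint that leaves its parent's space unchanged.
struct JointAttributes {
    std::array<double, 3> rotation = {0.0, 0.0, 0.0};     // Euler angles around X, Y, Z
    RotationOrder order = RotationOrder::XYZ;             // rotation order
    std::array<double, 3> scale = {1.0, 1.0, 1.0};        // scale of X, Y, Z
    std::array<double, 3> shear = {0.0, 0.0, 0.0};        // shear along X, Y, Z
    std::array<double, 3> translation = {0.0, 0.0, 0.0};  // X, Y, Z components
};

// Returns true when the attributes describe the identity transform.
bool isIdentity(const JointAttributes& a) {
    const std::array<double, 3> zero = {0.0, 0.0, 0.0};
    const std::array<double, 3> one = {1.0, 1.0, 1.0};
    return a.rotation == zero && a.scale == one &&
           a.shear == zero && a.translation == zero;
}
```

An animation curve would then write into one of these fields at each time step, for example `attrs.translation[0]`.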
  • 12. How does the skeleton influence the skin?
      • Deformers, which compute the final 3d vertices of a character’s skin, need a “Frame” of reference to apply offsets from.
      • The “World Space” Position and Orientation of the Joints from the Hierarchy (skeleton) provide that “Frame” of reference.
  • 13. Representing a “Frame” of reference
      • A 4x4 Matrix can represent the Position and Orientation of a Joint in World Space.
      • When used in this manner, the 4x4 Matrix is commonly referred to as a 3d transform (x-form).
      • A 4x4 Matrix is typically implemented literally as a 4x4 array of floating point values.

      struct Matrix4x4 { double m[4][4]; };
  • 14. Why a 4x4 Matrix?
      • Rotation, Scale, Shear, and Translation can all be represented as 4x4 Matrices.
      • Multiple 4x4 Matrices can be concatenated (multiplied) together into a single 4x4 Matrix.
      • 3d points and 3d vectors (offsets) can be multiplied through a 4x4 Matrix to transform them to the “World Space” position and orientation that the Matrix represents.
      • For each Joint, the matrices representing Scale, Shear, Rotation, and Translation are combined into a single “Local Space” 4x4 Matrix.
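A minimal sketch of concatenation and point transformation follows. It assumes a column-vector convention (a point is transformed as M * p, with translation stored in the last column), which the slides do not specify. The helpers `makeScale`, `makeTranslation`, `multiply`, and `transformPoint`, and the `Point3` type, are hypothetical names introduced for illustration.

```cpp
struct Matrix4x4 { double m[4][4]; };
struct Point3 { double x, y, z; };

// Identity matrix: value-initialize to zero, then set the diagonal.
Matrix4x4 identity() {
    Matrix4x4 r{};
    for (int i = 0; i < 4; ++i) r.m[i][i] = 1.0;
    return r;
}

// Scale along the X, Y, and Z axes.
Matrix4x4 makeScale(double sx, double sy, double sz) {
    Matrix4x4 r = identity();
    r.m[0][0] = sx; r.m[1][1] = sy; r.m[2][2] = sz;
    return r;
}

// Translation stored in the last column (column-vector convention, an assumption).
Matrix4x4 makeTranslation(double tx, double ty, double tz) {
    Matrix4x4 r = identity();
    r.m[0][3] = tx; r.m[1][3] = ty; r.m[2][3] = tz;
    return r;
}

// Concatenate two transforms: the result applies b first, then a.
Matrix4x4 multiply(const Matrix4x4& a, const Matrix4x4& b) {
    Matrix4x4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

// Transform a 3d point, treating it as (x, y, z, 1).
// Assumes an affine matrix whose bottom row is (0, 0, 0, 1).
Point3 transformPoint(const Matrix4x4& t, const Point3& p) {
    return Point3{
        t.m[0][0] * p.x + t.m[0][1] * p.y + t.m[0][2] * p.z + t.m[0][3],
        t.m[1][0] * p.x + t.m[1][1] * p.y + t.m[1][2] * p.z + t.m[1][3],
        t.m[2][0] * p.x + t.m[2][1] * p.y + t.m[2][2] * p.z + t.m[2][3]};
}
```

Under these assumptions, `multiply(makeTranslation(10, 0, 0), makeScale(2, 2, 2))` scales first and then translates, so the point (1, 1, 1) maps to (12, 2, 2).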
  • 15. How To Calculate The World Space Transform Of A Joint?
      • By recursively combining the “Local Space” transform of a Joint with its parent Joint’s “Local Space” transform until the root of the hierarchy is reached, a 4x4 Matrix can be accumulated that represents the World Space of that Joint.
      • As there are many Joints, it pays off to cache a “World Space” 4x4 Matrix at each Joint, so that a recursive walk up the hierarchy can stop early if a clean “World Space” has been cached.
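The cached recursive walk described on this slide can be sketched as below. This is an illustration under stated assumptions, not DWA's Hierarchy implementation: the `Joint` layout, the `worldClean` flag, and the functions `attach`, `setLocal`, `markDirty`, and `worldTransform` are hypothetical, and the code assumes a column-vector convention in which a Joint's World Space is its parent's World Space multiplied by its own Local Space.

```cpp
#include <vector>

struct Matrix4x4 { double m[4][4]; };

Matrix4x4 makeTranslation(double tx, double ty, double tz) {
    Matrix4x4 r{};
    for (int i = 0; i < 4; ++i) r.m[i][i] = 1.0;
    r.m[0][3] = tx; r.m[1][3] = ty; r.m[2][3] = tz;
    return r;
}

Matrix4x4 multiply(const Matrix4x4& a, const Matrix4x4& b) {
    Matrix4x4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

// Hypothetical joint with a cached World Space transform.
struct Joint {
    Joint* parent = nullptr;
    std::vector<Joint*> children;
    Matrix4x4 local = makeTranslation(0.0, 0.0, 0.0);  // identity by default
    Matrix4x4 world{};                                 // valid only when worldClean
    bool worldClean = false;
};

// Invalidate the cached World Space of a joint and all of its descendants.
void markDirty(Joint& j) {
    j.worldClean = false;
    for (Joint* c : j.children) markDirty(*c);
}

void attach(Joint& parent, Joint& child) {
    child.parent = &parent;
    parent.children.push_back(&child);
    markDirty(child);
}

// Changing a Local Space transform invalidates everything below it.
void setLocal(Joint& j, const Matrix4x4& local) {
    j.local = local;
    markDirty(j);
}

// Walk up the hierarchy, stopping early at the first clean cached transform.
const Matrix4x4& worldTransform(Joint& j) {
    if (!j.worldClean) {
        j.world = j.parent ? multiply(worldTransform(*j.parent), j.local)
                           : j.local;
        j.worldClean = true;
    }
    return j.world;
}
```

Note that the recursion makes each Joint's result depend on its parent's, which is the dependency chain that limits vectorization and threading.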
  • 16. Hierarchy is the core of DWA’s Motion System
      • Each time step, 1000’s of Joint attributes change, invalidating a Hierarchy’s cached World Space and Local Space transforms.
      • 1000’s of operations on Hierarchy objects build up a complex skeleton.
      • Imagine how many bones are used to represent a 4 legged creature with a tail & wings.
      • Due to the recursion, there is little opportunity for data vectorization or threading.
  • 17. Motion System Is On The Critical Path
      - Despite heavy parallelization of the Deformation System (green & yellow), it can’t start until the Motion System (red) finishes assembling a Hierarchy.
  • 18. Wall Time Spent in Each Category
      - The Motion System dwarfs the other systems.
      - Amdahl’s law keeps our threading & vectorization improvements in the Deformation System from having a larger overall impact.
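For reference, Amdahl’s law can be written as a one-line function. The fractions used in the note below are hypothetical and are not measurements from this talk; `amdahlSpeedup` is an illustrative name.

```cpp
#include <cassert>
#include <cmath>

// Amdahl's law: overall speedup when a fraction p of the run time
// (0 <= p <= 1) is accelerated by a factor s, and the rest is unchanged.
inline double amdahlSpeedup(double p, double s) {
    return 1.0 / ((1.0 - p) + p / s);
}
```

If, say, an optimized system accounted for only 20% of wall time, `amdahlSpeedup(0.2, s)` would stay below 1.25 no matter how large `s` becomes, which is why the serial portion on the critical path has to shrink too.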
  • 19. Time Spent inside each type of Operator
      - “hier_apply_fk_around_pivot” is the hottest operator.
          - Operates on a Hierarchy
          - Verified in Intel® VTune™ Amplifier XE
      - Several other “hier” related operations take up other top hot spots.
  • 20. Matrix Concatenation (Multiplication)
      - Typical implementation
          - Loop over rows
          - Loop over columns
          - Compute each result element by multiplying one row of the first matrix across one column of the other
      - Simple enough, but how much work did we really just do?

      struct Matrix4x4 {
          double m[4][4];

          // Concatenate this matrix with iOther (row times column).
          Matrix4x4 operator * (const Matrix4x4 &iOther) const {
              Matrix4x4 result;
              for (int r = 0; r < 4; ++r) {
                  for (int c = 0; c < 4; ++c) {
                      double sum = 0.0;
                      for (int k = 0; k < 4; ++k) {
                          sum += m[r][k] * iOther.m[k][c];
                      }
                      result.m[r][c] = sum;
                  }
              }
              return result;
          }
      };
  • 21. Expensive Matrix Concatenation
      - 64 multiplies (double precision): 16 result elements x 4 multiplies each
      - 48 additions (double precision): 16 result elements x 3 additions each

      // Member of Matrix4x4, with the loops fully unrolled.
      Matrix4x4 operator * (const Matrix4x4 &iOther) const {
          Matrix4x4 result;
          result.m[0][0] = m[0][0]*iOther.m[0][0] + m[0][1]*iOther.m[1][0] + m[0][2]*iOther.m[2][0] + m[0][3]*iOther.m[3][0];
          result.m[0][1] = m[0][0]*iOther.m[0][1] + m[0][1]*iOther.m[1][1] + m[0][2]*iOther.m[2][1] + m[0][3]*iOther.m[3][1];
          result.m[0][2] = m[0][0]*iOther.m[0][2] + m[0][1]*iOther.m[1][2] + m[0][2]*iOther.m[2][2] + m[0][3]*iOther.m[3][2];
          result.m[0][3] = m[0][0]*iOther.m[0][3] + m[0][1]*iOther.m[1][3] + m[0][2]*iOther.m[2][3] + m[0][3]*iOther.m[3][3];
          result.m[1][0] = m[1][0]*iOther.m[0][0] + m[1][1]*iOther.m[1][0] + m[1][2]*iOther.m[2][0] + m[1][3]*iOther.m[3][0];
          result.m[1][1] = m[1][0]*iOther.m[0][1] + m[1][1]*iOther.m[1][1] + m[1][2]*iOther.m[2][1] + m[1][3]*iOther.m[3][1];
          result.m[1][2] = m[1][0]*iOther.m[0][2] + m[1][1]*iOther.m[1][2] + m[1][2]*iOther.m[2][2] + m[1][3]*iOther.m[3][2];
          result.m[1][3] = m[1][0]*iOther.m[0][3] + m[1][1]*iOther.m[1][3] + m[1][2]*iOther.m[2][3] + m[1][3]*iOther.m[3][3];
          result.m[2][0] = m[2][0]*iOther.m[0][0] + m[2][1]*iOther.m[1][0] + m[2][2]*iOther.m[2][0] + m[2][3]*iOther.m[3][0];
          result.m[2][1] = m[2][0]*iOther.m[0][1] + m[2][1]*iOther.m[1][1] + m[2][2]*iOther.m[2][1] + m[2][3]*iOther.m[3][1];
          result.m[2][2] = m[2][0]*iOther.m[0][2] + m[2][1]*iOther.m[1][2] + m[2][2]*iOther.m[2][2] + m[2][3]*iOther.m[3][2];
          result.m[2][3] = m[2][0]*iOther.m[0][3] + m[2][1]*iOther.m[1][3] + m[2][2]*iOther.m[2][3] + m[2][3]*iOther.m[3][3];
          result.m[3][0] = m[3][0]*iOther.m[0][0] + m[3][1]*iOther.m[1][0] + m[3][2]*iOther.m[2][0] + m[3][3]*iOther.m[3][0];
          result.m[3][1] = m[3][0]*iOther.m[0][1] + m[3][1]*iOther.m[1][1] + m[3][2]*iOther.m[2][1] + m[3][3]*iOther.m[3][1];
          result.m[3][2] = m[3][0]*iOther.m[0][2] + m[3][1]*iOther.m[1][2] + m[3][2]*iOther.m[2][2] + m[3][3]*iOther.m[3][2];
          result.m[3][3] = m[3][0]*iOther.m[0][3] + m[3][1]*iOther.m[1][3] + m[3][2]*iOther.m[2][3] + m[3][3]*iOther.m[3][3];
          return result;
      }
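One way to double-check the operation counts is to run the same arithmetic on a scalar wrapper that counts its own operations. This is only a sketch; `Counted`, `OpCounts`, and `countGeneralMultiply` are names invented for the example.

```cpp
#include <cassert>

struct OpCounts { int multiplies = 0; int additions = 0; };

// Scalar wrapper that records every multiply and add it takes part in.
struct Counted { double value; OpCounts* counts; };

inline Counted operator*(Counted a, Counted b) {
    ++a.counts->multiplies;
    return Counted{a.value * b.value, a.counts};
}
inline Counted operator+(Counted a, Counted b) {
    ++a.counts->additions;
    return Counted{a.value + b.value, a.counts};
}

// Performs one general 4x4 multiply in the unrolled form
// (4 products joined by 3 additions per element) and returns the tally.
inline OpCounts countGeneralMultiply() {
    OpCounts counts;
    Counted a[4][4], b[4][4], result[4][4];
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c) {
            a[r][c] = Counted{1.0, &counts};
            b[r][c] = Counted{2.0, &counts};
        }
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c) {
            Counted sum = a[r][0] * b[0][c];
            for (int k = 1; k < 4; ++k) sum = sum + a[r][k] * b[k][c];
            result[r][c] = sum;
        }
    assert(result[0][0].value == 8.0);  // 4 terms of 1.0 * 2.0
    return counts;
}
```

The tally comes out to 64 multiplies and 48 additions, matching the slide.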
  • 22. Are Any of Those 16 Matrix Values Known At Compile Time?
      - Good news! YES!
      - If you knew the exact transform a 4x4 matrix was representing, you would know quite a few 0 and 1 values at compile time.

      Identity        Translation(x,y,z)   Shear(x,y,z)    Scale(x,y,z)
      [1][0][0][0]    [1][0][0][0]         [1][0][0][0]    [x][0][0][0]
      [0][1][0][0]    [0][1][0][0]         [x][1][0][0]    [0][y][0][0]
      [0][0][1][0]    [0][0][1][0]         [y][z][1][0]    [0][0][z][0]
      [0][0][0][1]    [x][y][z][1]         [0][0][0][1]    [0][0][0][1]
  • 23. What About Rotations?
      - Building rotation matrices is more expensive because of the need to call sine and cosine on the angle.
      - Rotations also have 0 and 1 values.

      let s = sine(angle)
      let c = cosine(angle)

      Rotate X axis(angle)   Rotate Y axis(angle)   Rotate Z axis(angle)
      [1][0][0][0]           [c][0][-s][0]          [c][s][0][0]
      [0][c][s][0]           [0][1][0][0]           [-s][c][0][0]
      [0][-s][c][0]          [s][0][c][0]           [0][0][1][0]
      [0][0][0][1]           [0][0][0][1]           [0][0][0][1]
  • 24. Huge Optimization Potential!
      - Unfortunately, the matrix multiply method doesn’t know that the 4x4 matrix it was passed has any 0 or 1 values.
          - So it cannot avoid performing math operations.
      - Even if we had separate classes to represent the different transformations, and a version of the matrix multiply method for each:
          - The result becomes a general 4x4 matrix.
          - Chains of multiplication would only benefit on the first multiply operation.
  • 25. Can we expand the math by hand?
      - Pseudo algorithm to compute a Joint’s World Space
          - 10 4x4 matrix multiplications
          - 1 matrix inversion (very expensive) in the middle

      JointWorldSpace = Scale*Shear*
                        ParentScale*ParentShear*
                        RotZ*RotY*RotX*
                        ((ParentScale*ParentShear).inverse())*
                        Translate*
                        ParentWorldSpace;

      - YES… But you won’t even want to try.
      - Good luck getting the expanded math right.
  • 26. Ideal Transform Behavior
      - Must keep the high level representation of the algorithm
      - Perform the absolute minimum required number of math operations
          - It must track known values
          - Continue tracking values through matrix multiplications
      - Utilize known information to provide a cheaper alternative to full matrix inversions
      - Interface/adapt to existing 4x4 Matrix data types
  • 27. Introducing Xform Building Blocks (XBB)
      - C++ library to enable composition of 3d transforms
      - Instead of a general purpose 4x4 matrix, it provides specific types for different transforms.
      - Tracks known values through multiplication chains
      - Deferred evaluation
      - Only localized source code changes are required to take advantage of it
  • 28. XBB Scale, Shear3, & Translation
      - Before
          ref::Matrix4x4 S;
          S.makeScale(scaleX, scaleY, scaleZ);
          ref::Matrix4x4 SH;
          SH.makeShear3(shearX, shearY, shearZ);
          ref::Matrix4x4 T;
          T.makeTranslation(transX, transY, transZ);
          - 128 bytes of stack used per 4x4 matrix
          - Overhead to initialize to Identity(), then overwrite elements
      - After XBB
          xbb::Scale S(scaleX, scaleY, scaleZ);
          xbb::Shear3 SH(shearX, shearY, shearZ);
          xbb::Translation T(transX, transY, transZ);
          - 24 bytes of stack
          - No overhead to initialize 4x4 elements that are known to be 0 or 1 for each type of transform
  • 29. XBB Transform Representation
      - Stores only the non-constant data needed to represent a 4x4 matrix of the transform type
      - Provides methods for element level access to a 4x4 matrix
          - Return known constant values

      Translation(x,y,z)
      [1][0][0][0]
      [0][1][0][0]
      [0][0][1][0]
      [x][y][z][1]

      struct Translation {
          double x;
          double y;
          double z;
          …
          double e00() const { return 1.0; }
          double e01() const { return 0.0; }
          double e02() const { return 0.0; }
          double e03() const { return 0.0; }
          double e10() const { return 0.0; }
          double e11() const { return 1.0; }
          double e12() const { return 0.0; }
          double e13() const { return 0.0; }
          double e20() const { return 0.0; }
          double e21() const { return 0.0; }
          double e22() const { return 1.0; }
          double e23() const { return 0.0; }
          double e30() const { return x; }
          double e31() const { return y; }
          double e32() const { return z; }
          double e33() const { return 1.0; }
      };
  • 30. XBB Transform Constancy
      - Each transform identifies whether each 4x4 matrix element is a constant 0, a constant 1, or Not Constant
      - Constancy is suitable as a template parameter
          - Matrix Multiply will make use of it

      enum Constancy { ConstantZero, ConstantOne, NotConstant };

      Translation(x,y,z)
      [1][0][0][0]
      [0][1][0][0]
      [0][0][1][0]
      [x][y][z][1]

      static const Constancy c00 = ConstantOne;
      static const Constancy c01 = ConstantZero;
      static const Constancy c02 = ConstantZero;
      static const Constancy c03 = ConstantZero;
      static const Constancy c10 = ConstantZero;
      static const Constancy c11 = ConstantOne;
      static const Constancy c12 = ConstantZero;
      static const Constancy c13 = ConstantZero;
      static const Constancy c20 = ConstantZero;
      static const Constancy c21 = ConstantZero;
      static const Constancy c22 = ConstantOne;
      static const Constancy c23 = ConstantZero;
      static const Constancy c30 = NotConstant;
      static const Constancy c31 = NotConstant;
      static const Constancy c32 = NotConstant;
      static const Constancy c33 = ConstantOne;
  • 31. XBB Rotations
      - Before
          ref::Matrix4x4 Rx;
          Rx.makeRotationX(rotX);
          ref::Matrix4x4 Ry;
          Ry.makeRotationY(rotY);
          ref::Matrix4x4 Rz;
          Rz.makeRotationZ(rotZ);
          - 128 bytes of stack used per 4x4 matrix
          - Overhead to initialize to Identity(), then overwrite elements
      - After XBB
          xbb::RotationX Rx(rotX);
          xbb::RotationY Ry(rotY);
          xbb::RotationZ Rz(rotZ);
          - 16 bytes of stack each: sine(angle) and cosine(angle)
          - No overhead to initialize 4x4 elements that are known to be 0 or 1 for each type of transform
  • 32. XBB Rotation Representation
      - Stores the sine and cosine of the angle, not the angle itself.
      - Provides methods for element level access to a 4x4 matrix
          - Return known constant values

      Rotate X axis(angle)
      [1][0][0][0]
      [0][c][s][0]
      [0][-s][c][0]
      [0][0][0][1]

      struct RotationX {
          double cosineOfAngle;
          double sineOfAngle;
          …
          double e00() const { return 1.0; }
          double e01() const { return 0.0; }
          double e02() const { return 0.0; }
          double e03() const { return 0.0; }
          double e10() const { return 0.0; }
          double e11() const { return cosineOfAngle; }
          double e12() const { return sineOfAngle; }
          double e13() const { return 0.0; }
          double e20() const { return 0.0; }
          double e21() const { return -sineOfAngle; }
          double e22() const { return cosineOfAngle; }
          double e23() const { return 0.0; }
          double e30() const { return 0.0; }
          double e31() const { return 0.0; }
          double e32() const { return 0.0; }
          double e33() const { return 1.0; }
      };
  • 33. XBB Multiply
      - Before
          ref::Matrix4x4 SxSH;
          SxSH = S*SH;
      - After XBB
          auto SxSH = S*SH;
          xbb::Matrix4x3 SxSH_Matrix;
          SxSH.to(SxSH_Matrix);
          - No math is performed by S*SH. Instead, a new type Multiply<Scale, Shear3> is returned.
          - Math is deferred until you explicitly export to a general purpose matrix.
          - XBB’s Multiply uses the Constancy of its template parameters to define its own Constancy values.
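The deferred multiply can be illustrated with a small expression-template sketch. This is not the XBB implementation: `IsTransform`, the free function `to`, and the hand-written Scale*Shear3 evaluation are assumptions for this example (XBB derives the minimal math automatically from Constancy and exposes `to` as a member).

```cpp
#include <cassert>
#include <type_traits>

struct Scale  { double x, y, z; };
struct Shear3 { double x, y, z; };
struct Matrix4x4 { double m[4][4]; };

// Deferred product: it only stores its operands, no arithmetic happens here.
template <typename L, typename R>
struct Multiply { L left; R right; };

// Hypothetical marker trait so operator* applies only to transform types.
template <typename T> struct IsTransform : std::false_type {};
template <> struct IsTransform<Scale>  : std::true_type {};
template <> struct IsTransform<Shear3> : std::true_type {};
template <typename L, typename R>
struct IsTransform<Multiply<L, R>> : std::true_type {};

template <typename L, typename R,
          typename = typename std::enable_if<IsTransform<L>::value &&
                                             IsTransform<R>::value>::type>
Multiply<L, R> operator*(const L& l, const R& r) {
    return Multiply<L, R>{l, r};
}

// Hand-expanded evaluation of Scale*Shear3 using the slide's matrix layouts:
// only 3 multiplies survive once the known 0 and 1 entries drop out.
inline void to(const Multiply<Scale, Shear3>& e, Matrix4x4& out) {
    const Scale& s = e.left;
    const Shear3& sh = e.right;
    out = {{ {s.x,        0.0,        0.0, 0.0},
             {s.y * sh.x, s.y,        0.0, 0.0},
             {s.z * sh.y, s.z * sh.z, s.z, 0.0},
             {0.0,        0.0,        0.0, 1.0} }};
}
```

Writing `auto e = S * SH;` produces a value of type `Multiply<Scale, Shear3>`, and longer chains simply nest the type; arithmetic only runs inside `to`.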
  • 34. Multiplication Chains
      - Before
          ref::Matrix4x4 jointLocalSpace;
          jointLocalSpace = S*SH*Rz*Ry*Rx*T;
          - 5 matrix multiplications: 320 multiplications, 240 adds
      - After XBB
          xbb::Matrix4x3 jointLocalSpace;
          (S*SH*Rz*Ry*Rx*T).to(jointLocalSpace);
          - The expression has the type Multiply<Multiply<Multiply<Multiply<Multiply<Scale, Shear3>, RotationZ>, RotationY>, RotationX>, Translation>
          - Confirmed the assembly has the minimum math operations
      - Speedup 2.45x
  • 35. Deferred Evaluation (reduce)
      - ReducedMatrix is based on a transform’s Constancy.
          - Only has data members for NotConstant matrix elements
      - Multiply’s reduce recursively expands its left and right operands
          - Expands out the entire multiplication chain
      - 4x4 elements are set by setByMatrixMultiply
          - Actually multiplies a row by a column
          - Knows the Constancy of the elements from the reduced left and right transforms
      - Using template specialization based on the Constancy
          - Only the exact terms necessary are accessed
          - Emits only the necessary multiplications & additions

      typedef ReducedMatrix < c00, c01, c02, c03,
                              c10, c11, c12, c13,
                              c20, c21, c22, c23,
                              c30, c31, c32, c33 > ReducedType;

      ReducedType Multiply::reduce() const {
          const auto tl = left.reduce();
          const auto tr = right.reduce();
          ReducedType r;
          r.setByMatrixMultiply<0,0>(tl,tr);
          r.setByMatrixMultiply<0,1>(tl,tr);
          ...
          r.setByMatrixMultiply<3,3>(tl,tr);
          return r;
      }
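To make the Constancy bookkeeping concrete, here is a compile-time sketch that propagates Constancy through one product and counts the multiplies that cannot be eliminated. It does not emit the math itself, and it uses constexpr functions rather than XBB's template specializations. `ScaleConstancy`, `Shear3Constancy`, `TranslationConstancy`, `productConstancy`, and `multipliesNeeded` are hypothetical names; the Constancy tables follow the matrix layouts shown earlier in the deck.

```cpp
#include <cassert>

enum Constancy { ConstantZero, ConstantOne, NotConstant };

// Constancy of a product a*b.
constexpr Constancy mulConstancy(Constancy a, Constancy b) {
    return (a == ConstantZero || b == ConstantZero) ? ConstantZero
         : (a == ConstantOne && b == ConstantOne)   ? ConstantOne
         : NotConstant;
}
// Constancy of a sum a+b (1+1 is neither 0 nor 1, so it is NotConstant).
constexpr Constancy addConstancy(Constancy a, Constancy b) {
    return a == ConstantZero ? b : (b == ConstantZero ? a : NotConstant);
}
// A multiply instruction is only needed when both factors are unknown.
constexpr int mulCost(Constancy a, Constancy b) {
    return (a == NotConstant && b == NotConstant) ? 1 : 0;
}

struct ScaleConstancy {        // diag(x, y, z, 1)
    static constexpr Constancy at(int r, int c) {
        return r != c ? ConstantZero : (r == 3 ? ConstantOne : NotConstant);
    }
};
struct Shear3Constancy {       // unknowns at (1,0), (2,0), (2,1)
    static constexpr Constancy at(int r, int c) {
        return r == c ? ConstantOne
             : ((r < 3 && c < r) ? NotConstant : ConstantZero);
    }
};
struct TranslationConstancy {  // unknowns at (3,0), (3,1), (3,2)
    static constexpr Constancy at(int r, int c) {
        return r == c ? ConstantOne : (r == 3 ? NotConstant : ConstantZero);
    }
};

// Constancy of element (r,c) of the product L*R.
template <typename L, typename R>
constexpr Constancy productConstancy(int r, int c) {
    return addConstancy(
        addConstancy(mulConstancy(L::at(r, 0), R::at(0, c)),
                     mulConstancy(L::at(r, 1), R::at(1, c))),
        addConstancy(mulConstancy(L::at(r, 2), R::at(2, c)),
                     mulConstancy(L::at(r, 3), R::at(3, c))));
}

// Multiplies left after elimination; i encodes (r, c, k) as r*16 + c*4 + k.
template <typename L, typename R>
constexpr int multipliesNeeded(int i = 0) {
    return i == 64 ? 0
         : mulCost(L::at(i / 16, i % 4), R::at(i % 4, (i / 4) % 4))
           + multipliesNeeded<L, R>(i + 1);
}

static_assert(productConstancy<ScaleConstancy, Shear3Constancy>(0, 1) == ConstantZero, "");
static_assert(productConstancy<ScaleConstancy, Shear3Constancy>(1, 0) == NotConstant, "");
static_assert(productConstancy<ScaleConstancy, Shear3Constancy>(3, 3) == ConstantOne, "");
static_assert(multipliesNeeded<ScaleConstancy, Shear3Constancy>() == 3, "");
static_assert(multipliesNeeded<TranslationConstancy, TranslationConstancy>() == 0, "");
```

Under these assumptions Scale*Shear3 needs 3 multiplies instead of 64, and concatenating two translations needs none.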
  • 36. XBB: Cached Rotations
      - Many Hierarchy operations change only the Translation of a Joint.
          - If we could cache the Rotation transforms, then many expensive sin/cos calls could be avoided.
          - Matrix4x4 is too big (128 bytes) to cache one for each of Rotation X, Y, and Z.
      - XBB rotations are only 16 bytes each
          - Small enough to cache inside the Joint object
      - Use cached sin/cos of angles:
          (S*SH*cached.Rz*cached.Ry*cached.Rx*T).to(jointLocalSpace);
      - Speedup 12.71x
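A rough sketch of the caching idea follows. The `Joint` members, setters, and `trigCalls` counter are invented for this example; only the `RotationX` layout (cosine and sine of the angle) comes from the slides.

```cpp
#include <cassert>
#include <cmath>

// Same layout as the slide's RotationX: two doubles, no angle stored.
struct RotationX {
    double cosineOfAngle;
    double sineOfAngle;
    explicit RotationX(double angle)
        : cosineOfAngle(std::cos(angle)), sineOfAngle(std::sin(angle)) {}
};
static_assert(sizeof(RotationX) == 2 * sizeof(double),
              "small enough to keep one per axis inside each joint");

// Hypothetical joint that keeps its rotation cached between updates.
struct Joint {
    double tx = 0.0, ty = 0.0, tz = 0.0;
    RotationX cachedRx{0.0};
    int trigCalls = 0;  // instrumentation for this example only

    // sin/cos are paid only when the angle attribute actually changes.
    void setRotateX(double angle) {
        cachedRx = RotationX(angle);
        ++trigCalls;
    }
    // Translation-only edits leave the cached rotation untouched.
    void setTranslate(double x, double y, double z) {
        tx = x; ty = y; tz = z;
    }
};
```

Repeated translation edits on such a joint reuse `cachedRx` without any further trigonometric calls.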
  • 37. XBB Identity & Transpose
      - Identity is free in any multiplication chain
          - Optimized out entirely
          - Only 1 byte of stack space (empty struct)

          Identity id;
          (S*SH*id*R*T).to(result);

      - Transpose is free in any multiplication chain
          - Deferred evaluation pulls results out in a different order
          - No additional math or data movement

          (S*SH*R*T).transpose().to(result);
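Why transpose can be free is easiest to see with a view type that swaps the row and column index at access time. This is an illustrative sketch: the `e(r, c)` accessor, `Transposed`, and the free function `transpose` are assumptions, whereas XBB uses per-element accessors such as `e30()` and a `transpose()` member.

```cpp
#include <cassert>
#include <type_traits>

// Empty struct: every element is a compile-time constant.
struct Identity {
    double e(int r, int c) const { return r == c ? 1.0 : 0.0; }
};
static_assert(std::is_empty<Identity>::value, "Identity stores no data");

struct Translation {
    double x, y, z;
    // Layout from the slides: x, y, z live in the bottom row.
    double e(int r, int c) const {
        return r == c ? 1.0
             : (r == 3 ? (c == 0 ? x : (c == 1 ? y : z)) : 0.0);
    }
};

// Transpose as a view: no data is moved, reads simply swap the indices.
template <typename T>
struct Transposed {
    T inner;
    double e(int r, int c) const { return inner.e(c, r); }
};

template <typename T>
Transposed<T> transpose(const T& t) { return Transposed<T>{t}; }
```

Reading element (0,3) of `transpose(T)` returns the value stored at (3,0) of `T`, so the transposed result falls out of the order in which elements are read.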
  • 38. Before: Inverse of (Scale*Shear)
      inverseOfSxSH = (S*SH).inverse();
      - Inverse is very expensive
          - Determinant
          - Cofactor
          - Transpose
          - Division
          - Scalar matrix multiply
  • 39. After XBB: Inverse of (Scale*Shear)
      (S*SH).inverse().to(inverseOfSxSH);
      - MAGIC happens
          - Inverse becomes part of deferred evaluation!
      - Because we have a representation of the multiplication chain
          - We can move the inverse inside the multiplication chain and reverse its order:
            (SH.inverse()*S.inverse()).to(inverseOfSxSH);
      - Inverse of most transform primitives is free
          - Except Scale, which costs 3 divisions
      - During deferred evaluation
          - The logical 4x4 matrix values are reordered, and signs are flipped where needed, to represent the inverse
      - Speedup 6.43x
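The order reversal can be sketched with free functions over a deferred product. Everything here is illustrative rather than XBB code: `inverse`, `apply`, and `Point3` are hypothetical names, and the closed-form Shear3 inverse is derived for this example from the slide's shear layout (it costs one multiply and one subtract in this form).

```cpp
#include <cassert>
#include <type_traits>

struct Scale  { double x, y, z; };
struct Shear3 { double x, y, z; };  // x at (1,0), y at (2,0), z at (2,1)
struct Point3 { double x, y, z; };

template <typename L, typename R>
struct Multiply { L left; R right; };

// Scale inverse: the 3 divisions mentioned on the slide.
inline Scale inverse(const Scale& s) {
    return Scale{1.0 / s.x, 1.0 / s.y, 1.0 / s.z};
}
// Shear3 inverse stays a Shear3 (assumption: derived by hand for this layout).
inline Shear3 inverse(const Shear3& sh) {
    return Shear3{-sh.x, sh.x * sh.z - sh.y, -sh.z};
}
// (L*R)^-1 = R^-1 * L^-1: invert each operand and swap the order.
template <typename L, typename R>
auto inverse(const Multiply<L, R>& m)
    -> Multiply<decltype(inverse(m.right)), decltype(inverse(m.left))> {
    return {inverse(m.right), inverse(m.left)};
}

// Row-vector application, used here only to verify the round trip.
inline Point3 apply(const Point3& p, const Scale& s) {
    return Point3{p.x * s.x, p.y * s.y, p.z * s.z};
}
inline Point3 apply(const Point3& p, const Shear3& sh) {
    return Point3{p.x + p.y * sh.x + p.z * sh.y, p.y + p.z * sh.z, p.z};
}
template <typename L, typename R>
Point3 apply(const Point3& p, const Multiply<L, R>& m) {
    return apply(apply(p, m.left), m.right);
}
```

Inverting a `Multiply<Scale, Shear3>` yields a `Multiply<Shear3, Scale>`, and applying the product followed by its inverse returns the original point.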
  • 40. XBB Integration to DWA Motion System
      - Provided template specializations for adapters to map between DWA math classes and XBB’s.
          - Allows XBB deferred evaluation directly into DWA matrix types
      - In many scenarios, the transforms could have been Identity based on logic inside the Joint.
          - To take full advantage of XBB, we needed to know the exact types of the transforms involved.
      - Templatized the Hierarchy algorithm, making conditional logic controlled by template parameters, e.g.
          - Order of Rotations
          - Scale Propagation Mode
      - Specialized templates based on those parameters to
          - Use the correct type of XBB transform (Identity whenever possible)
          - Multiply the Rotations in the correct order
  • 41. XBB Integration to DWA Motion System (continued…)
      - Built a jump table with instances of the algorithm for all the different combinations of options and rotation orders.
          - Used enums as indexes into a multi-dimensional array of function pointers to the corresponding algorithm instance to execute.
      - Used XBB for decomposing a World Space Matrix4x4 into individual Joint attributes.
      - Rewrote the expensive “hier_apply_fk_around_pivot” with XBB directly vs. going through a Hierarchy object
          - Avoids the high overhead of building a Hierarchy on the fly
      - Performed non-XBB related optimizations
          - Reduced dynamic memory allocation by replacing local std::vector<T> with a stack based array when possible
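The jump-table dispatch might look something like the following sketch. The enum names and values, `localSpaceChain`, and the string output are placeholders chosen so the example is easy to check; the real table selects instances of the templatized Hierarchy algorithm.

```cpp
#include <cassert>
#include <string>

// Hypothetical option enums; their values double as table indexes.
enum RotationOrder { XYZ, ZYX, NumRotationOrders };
enum ScaleMode { WithScale, NoScale, NumScaleModes };

template <RotationOrder Order> struct RotationChain;
template <> struct RotationChain<XYZ> {
    static const char* name() { return "Rx*Ry*Rz"; }
};
template <> struct RotationChain<ZYX> {
    static const char* name() { return "Rz*Ry*Rx"; }
};

// One instance per option combination. Order and Mode are template
// parameters, so each instance is specialized at compile time.
template <RotationOrder Order, ScaleMode Mode>
std::string localSpaceChain() {
    return std::string(Mode == WithScale ? "S*SH*" : "")
         + RotationChain<Order>::name() + "*T";
}

using ChainFn = std::string (*)();

// Multi-dimensional array of function pointers indexed by the enums.
static const ChainFn kJumpTable[NumRotationOrders][NumScaleModes] = {
    { &localSpaceChain<XYZ, WithScale>, &localSpaceChain<XYZ, NoScale> },
    { &localSpaceChain<ZYX, WithScale>, &localSpaceChain<ZYX, NoScale> },
};

// The run-time options select a fully specialized instance exactly once.
inline std::string dispatch(RotationOrder order, ScaleMode mode) {
    return kJumpTable[order][mode]();
}
```

The run-time cost of choosing an option combination is a single indexed indirect call, after which the selected instance runs with its options fixed at compile time.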
  • 42. DWA Motion System Results (before vs. after XBB)
      - hier_apply_fk_around_pivot speedup: 2.8x
      - Motion System speedup: 1.6x
      - Overall speedup: 1.2x
  • 43. XBB DWA Motion System Scaling
      - Reducing the Critical Path helped Thread Scaling.
      - Reached the goal of 30 fps on a single Avoton cartridge.
  • 44. Call to Action
      - A good way to improve the impact of vectorization or threading is to reduce the amount of work being done outside those data parallel regions.
          - Ideally, do less work in the first place.
      - Complex optimization problems can be represented in C++ and presented back to the compiler in a form it can excel at optimizing.
          - Expanding math by hand is untenable.
      - You can do much more with C++11/14 to encapsulate problems while retaining the original high level algorithm.
          - Look for optimization problems that might be representable at a higher level.
  • 45. Future Work
      - XBB has exactly the features required to support the DWA Motion System.
      - For general purpose use, more transformations and math operations might be required, e.g.
          - Inverse of a general 4x4 matrix
          - Single precision version or template based data type
      - XBB can be licensed or potentially open sourced upon request.
          - Could be of use to CAD, Animation Tools, and Gaming.
      - Contact Alex Wells (alex.m.wells@intel.com)