Contents
Chapter 1: Overview of SAP BusinessObjects Data Services
    SAP BusinessObjects Data Services and the SAP BusinessObjects solution portfolio
    Software benefits
        Unification with the platform
        Ease of use and high productivity
        High availability and performance
    Associated software
        SAP BusinessObjects Metadata Management
    Interfaces
Chapter 2: Architecture
    Standard components
        Designer
        Repository
        Job Server
        Engine
        Access Server
        Address Server
        Administrator
        Metadata Reports applications
        Metadata Integrator
        Service
        SNMP Agent
        Adapter SDK
    Optional components
SAP BusinessObjects Data Services Getting Started Guide
Chapter 1
Overview of SAP BusinessObjects Data Services
About this section
This section introduces SAP BusinessObjects Data Services and explains
its place in the SAP BusinessObjects solution portfolio.
Related Topics
• SAP BusinessObjects Data Services and the SAP BusinessObjects solution
portfolio
• Software benefits
• Interfaces
SAP BusinessObjects Data Services and the SAP BusinessObjects solution portfolio
The SAP BusinessObjects solution portfolio delivers extreme insight through
specialized end-user tools on a single, trusted business intelligence platform.
This entire platform is supported by SAP BusinessObjects Data Services.
On top of SAP BusinessObjects Data Services, the SAP BusinessObjects
solution portfolio layers the most reliable, scalable, flexible, and manageable
business intelligence (BI) platform which supports the industry's best
integrated end-user interfaces: reporting, query and analysis, and
performance management dashboards, scorecards, and applications.
True data integration blends batch extraction, transformation, and loading
(ETL) technology with real-time bi-directional data flow across multiple
applications for the extended enterprise.
By building a relational datastore and intelligently blending direct real-time
and batch data-access methods to access data from enterprise resource
planning (ERP) systems and other sources, SAP has created a powerful,
high-performance data integration product that allows you to fully leverage
your ERP and enterprise application infrastructure for multiple uses.
SAP provides a batch and real-time data integration system to drive today's
new generation of analytic and supply-chain management applications. Using
the highly scalable data integration solution provided by SAP, your enterprise
can maintain a real-time, on-line dialogue with customers, suppliers,
employees, and partners, providing them with the critical information they
need for transactions and business analysis.
Software benefits
Use SAP BusinessObjects Data Services to develop enterprise data
integration for batch and real-time uses. With the software:
• You can create a single infrastructure for batch and real-time data
  movement to enable faster and lower cost implementation.
• Your enterprise can manage data as a corporate asset independent of
  any single system. Integrate data across many systems and reuse that
  data for many purposes.
• You have the option of using pre-packaged data solutions for fast
  deployment and quick ROI. These solutions extract historical and daily
  data from operational systems and cache this data in open relational
  databases.
The software customizes and manages data access and uniquely combines
industry-leading, patent-pending technologies for delivering data to analytic,
supply-chain management, customer relationship management, and Web
applications.
Unification with the platform
SAP BusinessObjects Data Services provides several points of platform
unification:
• Get end-to-end data lineage and impact analysis
• Create the semantic layer (universe) and manage change within the ETL
  design environment
SAP deeply integrates the entire ETL process with the business intelligence
platform so you benefit from:
• Easy metadata management
• Simplified and unified administration
• Life cycle management
• Trusted information
Ease of use and high productivity
SAP BusinessObjects Data Services combines both batch and real-time
data movement and management to provide a single data integration platform
for information management from any information source, for any information
use.
Using the software, you can:
• Stage data in an operational datastore, data warehouse, or data mart.
• Update staged data in batch or real-time modes.
• Create a single graphical development environment for developing, testing,
  and deploying the entire data integration platform.
• Manage a single metadata repository to capture the relationships between
  different extraction and access methods and provide integrated lineage
  and impact analysis.
High availability and performance
The high-performance engine and proven data movement and management
capabilities of SAP BusinessObjects Data Services include:
• Scalable, multi-instance data-movement for fast execution
• Load balancing
• Changed-data capture
• Parallel processing
Associated software
Choose from other SAP BusinessObjects solution portfolio software options
to further support and enhance the power of your SAP BusinessObjects Data
Services software.
SAP BusinessObjects Metadata Management
SAP BusinessObjects Metadata Management provides an integrated view
of metadata and its multiple relationships for a complete Business Intelligence
project spanning some or all of the SAP BusinessObjects solution portfolio.
Use the software to:
• View metadata about reports, documents, and data sources from a single
  repository.
• Analyze lineage to determine data sources of documents and reports.
• Analyze the impact of changing a source table, column, element, or field
  on existing documents and reports.
• Track different versions (changes) to each object over time.
• View operational metadata (such as the number of rows processed and
  CPU utilization) as historical data with a datetime.
• View metadata in different languages.
For more information on SAP BusinessObjects Metadata Management,
contact your SAP sales representative.
Interfaces
SAP BusinessObjects Data Services provides many types of interface
components. Your version of the software may provide some or all of them.
You can use the Interface Development Kit to develop adapters that read
from and/or write to other applications.
In addition to the interfaces listed above, the Nested Relational Data Model
(NRDM) allows you to apply the full power of SQL transforms to manipulate,
process, and enrich hierarchical business documents.
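As a rough illustration of the idea only (this is Python, not Data Services syntax), the sketch below unnests a hierarchical document into flat rows, which is the kind of relational manipulation of nested data that the paragraph above describes. The document layout, field names, and the `unnest` helper are hypothetical.

```python
# Hypothetical hierarchical business document: one order with nested line items.
order = {
    "order_id": 1001,
    "customer": "ACME",
    "items": [
        {"sku": "A-1", "qty": 2},
        {"sku": "B-7", "qty": 5},
    ],
}


def unnest(document, nested_key):
    """Return one flat row per nested element, repeating the parent columns."""
    parent = {k: v for k, v in document.items() if k != nested_key}
    return [{**parent, **child} for child in document[nested_key]]


rows = unnest(order, "items")
for row in rows:
    print(row)
```

Each resulting row carries the parent columns alongside one nested element, so ordinary row-based operations can be applied to it.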
For a detailed list of supported environments and hardware requirements,
see the Supported Platforms document available in the SAP BusinessObjects
Support > Documentation > Supported Platforms/PARs section of the SAP
Service Marketplace: https://service.sap.com/bosap-support. This document
includes specific version and patch-level requirements for databases,
applications, web application servers, web browsers, and operating systems.
Related Topics
• Designer Guide: Nested Data
Chapter 2
Architecture
This section describes SAP BusinessObjects Data Services components
and their distribution on your network.
This section contains the following topics:
• Standard components
• Optional components
• Management tools
• Operating system platforms
• Distributed architecture
The architecture is layered to allow data integration to occur over a variety
of open, industry-standard APIs for optimal data and metadata management.
Related Topics
• Standard components
• Optional components
• Management tools
• Operating system platforms
• Distributed architecture
Standard components
The following diagram summarizes the relationships among SAP
BusinessObjects Data Services components.
For a detailed list of supported environments and hardware requirements,
see the Supported Platforms document available in the SAP BusinessObjects
Support > Documentation > Supported Platforms/PARs section of the SAP
Service Marketplace: https://service.sap.com/bosap-support. This document
includes specific version and patch-level requirements for databases,
applications, web application servers, web browsers, and operating systems.
Related Topics
• Designer
• Repository
• Job Server
• Engine
• Access Server
• Address Server
• Administrator
• Metadata Reports applications
• Service
• SNMP Agent
• Adapter SDK
Designer
The Designer is a development tool with an easy-to-use graphical user
interface. It enables developers to define data management applications that
consist of data mappings, transformations, and control logic.
Use the Designer to create applications containing work flows (job execution
definitions) and data flows (data transformation definitions).
To use the Designer, create objects, then drag, drop, and configure them by
selecting icons in flow diagrams, table layouts, and nested workspace pages.
The objects in the Designer represent metadata. The Designer interface
allows you to manage metadata stored in a repository. From the Designer,
you can also trigger the Job Server to run your jobs for initial application
testing.
Related Topics
• Repository
• Job Server
Repository
The SAP BusinessObjects Data Services repository is a set of tables that
hold user-created and predefined system objects, source and target metadata,
and transformation rules. Set up repositories on an open client/server platform
to facilitate sharing metadata with other enterprise tools. Store each repository
on an existing RDBMS.
Each repository is associated with one or more Job Servers which run the
jobs you create. There are two types of repositories:
• A local repository is used by an application designer to store definitions
  of objects (like projects, jobs, work flows, and data flows) and source/target
  metadata.
• A central repository is an optional component that can be used to support
  multi-user development. The central repository provides a shared object
  library allowing developers to check objects in and out of their local
  repositories.
Job Server
The SAP BusinessObjects Data Services Job Server starts the data
movement engine that integrates data from multiple heterogeneous sources,
performs complex data transformations, and manages extractions and
transactions from ERP systems and other sources. The Job Server can move
data in either batch or real-time mode and uses distributed query optimization,
multi-threading, in-memory caching, in-memory data transformations, and
parallel processing to deliver high data throughput and scalability.
While designing a job, you can run it from the Designer, which tells the Job
Server to run the job. The Job Server gets the job from its associated
repository, then starts an engine to process the job. In your production
environment, the Job Server runs jobs triggered by a scheduler or by a
real-time service managed by the Access Server. In production environments,
you can balance job loads by creating a Job Server Group (multiple Job
Servers) which executes jobs according to overall system load.
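The load-balancing idea behind a Job Server Group can be sketched as follows. This is a minimal illustration that assumes a hypothetical per-server load metric and a simple least-loaded policy; the guide does not describe how the software actually measures system load or selects a Job Server.

```python
from dataclasses import dataclass


@dataclass
class JobServer:
    name: str    # hypothetical identifier
    load: float  # hypothetical load metric, 0.0 (idle) to 1.0 (saturated)


def pick_job_server(group):
    """Choose the least-loaded Job Server in the group (assumed policy)."""
    if not group:
        raise ValueError("Job Server Group is empty")
    return min(group, key=lambda server: server.load)


group = [JobServer("js_a", 0.72), JobServer("js_b", 0.31), JobServer("js_c", 0.55)]
chosen = pick_job_server(group)
print(chosen.name)
```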
Related Topics
• Engine
• Access Server
Engine
When SAP BusinessObjects Data Services jobs are executed, the Job Server
starts engine processes to perform data extraction, transformation, and
movement. The engine processes use parallel processing and in-memory
data transformations to deliver high data throughput and scalability.
Access Server
The SAP BusinessObjects Data Services Access Server is a real-time,
request-reply message broker that collects message requests, routes them
to a real-time service, and delivers a message reply within a user-specified
time frame. The Access Server queues messages and sends them to the
next available real-time service across any number of computing resources.
This approach provides automatic scalability because the Access Server
can initiate additional real-time services on additional computing resources
if traffic for a given real-time service is high. You can configure multiple
Access Servers.
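The request-reply pattern described above can be sketched in a few lines of Python. This is a toy model, not the Access Server's implementation: the `Broker` class, its method names, and the use of threads as stand-ins for real-time services are all assumptions made for illustration.

```python
import queue
import threading


class Broker:
    """Toy request-reply broker: queues requests for the next free service."""

    def __init__(self):
        self.requests = queue.Queue()

    def start_service(self, handler):
        """Start one service (a worker thread) that processes queued requests."""
        def run():
            while True:
                message, reply_box = self.requests.get()
                reply_box.put(handler(message))
        threading.Thread(target=run, daemon=True).start()

    def request(self, message, timeout):
        """Queue a message and wait up to `timeout` seconds for the reply."""
        reply_box = queue.Queue(maxsize=1)
        self.requests.put((message, reply_box))
        try:
            return reply_box.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError("no reply within the time frame") from None


broker = Broker()
broker.start_service(lambda msg: msg.upper())
broker.start_service(lambda msg: msg.upper())  # a second service adds capacity
reply = broker.request("status?", timeout=2.0)
print(reply)
```

Because every service reads from the same queue, starting another service adds capacity without any change on the requesting side.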
Address Server
The SAP BusinessObjects Data Services Address Server provides address
validation and correction for the Global Address Cleanse EMEA engine and
Global Suggestion Lists. The Address Server must be started prior to
processing data flows that contain the Global Suggestion List transform or
the Global Address Cleanse transform with the EMEA engine enabled.
Administrator
The Administrator provides browser-based administration of SAP
BusinessObjects Data Services resources including:
• Scheduling, monitoring, and executing batch jobs
• Configuring, starting, and stopping real-time services
• Configuring Job Server, Access Server, and repository usage
• Configuring and managing adapters
• Managing users
• Publishing batch jobs and real-time services via Web services
Metadata Reports applications
The Metadata Reports applications provide browser-based analysis and
reporting capabilities on metadata that is associated with:
• your SAP BusinessObjects Data Services jobs
• other SAP BusinessObjects solution portfolio applications associated with
  SAP BusinessObjects Data Services
Metadata Reports provide four applications for exploring your metadata:
• Impact and lineage analysis
• Operational dashboards
• Auto documentation
• Data validation
Impact and Lineage Analysis reports
Impact and Lineage Analysis reports include:
• Datastore Analysis: For each datastore connection, view overview,
  table, function, and hierarchy reports. SAP BusinessObjects Data Services
  users can determine:
  • What data sources populate their tables
  • What target tables their tables populate
  • Whether one or more of the following SAP BusinessObjects solution
    portfolio reports uses data from their tables:
    • Business Views
    • Crystal Reports
    • SAP BusinessObjects BW Universes Builder
    • SAP BusinessObjects Web Intelligence documents
    • SAP BusinessObjects Desktop Intelligence documents
• Universe analysis: View Universe, class, and object lineage. Universe
  users can determine what data sources populate their Universes and
  what reports use their Universes.
• Business View analysis: View the data sources for Business Views in
  the Central Management Server (CMS). You can view business element
  and business field lineage reports for each Business View. Crystal
  Business View users can determine what data sources populate their
  Business Views and what reports use their views.
• Report analysis: View data sources for reports in the Central
  Management Server (CMS). You can view table and column lineage
  reports for each Crystal Report and Web Intelligence Document managed
  by CMS. Report writers can determine what data sources populate their
  reports.
• Dependency analysis: Search for specific objects in your repository
  and understand how those objects impact or are impacted by other SAP
  BusinessObjects Data Services or SAP BusinessObjects BW Universe
  Builder objects and reports. Metadata search results provide links back
  into associated reports.
To view impact and lineage analysis for SAP BusinessObjects solution
portfolio applications, you must configure the Metadata Integrator.
Related Topics
• Installation Guide: Installing and Configuring the Metadata Integrator
Operational Dashboard reports
Operational dashboard reports provide graphical depictions of SAP
BusinessObjects Data Services job execution statistics. This feedback allows
you to view at a glance the status and performance of your job executions
for one or more repositories over a given time period. You can then use this
information to streamline and monitor your job scheduling and management
for maximizing overall efficiency and performance.
Auto Documentation reports
Auto documentation reports provide a convenient and comprehensive way
to create printed documentation for all of the objects you create in SAP
BusinessObjects Data Services. Auto documentation reports capture critical
information for understanding your jobs so you can see at a glance the entire
ETL process.
After creating a project, you can use Auto documentation reports to quickly
create a PDF or Microsoft Word file that captures a selection of job, work
flow, and/or data flow information including graphical representations and
key mapping details.
Data Validation dashboard
Data Validation dashboard reports provide graphical depictions that let you
evaluate the reliability of your target data based on the validation rules you
created in your SAP BusinessObjects Data Services batch jobs. This feedback
allows business users to quickly review, assess, and identify potential
inconsistencies or errors in source data.
Metadata Integrator
The Metadata Integrator allows SAP BusinessObjects Data Services to
seamlessly share metadata with SAP BusinessObjects business intelligence
(BI) solutions. Run the Metadata Integrator to collect metadata into the SAP
BusinessObjects Data Services repository for Business Views and Universes
used by Crystal Reports, SAP BusinessObjects Desktop Intelligence
documents, and SAP BusinessObjects Web Intelligence documents.
Service
The SAP BusinessObjects Data Services Service is installed when Job and
Access Servers are installed. The Service starts Job Servers and Access
Servers when you restart your system. The Windows service name is Data
Services Service. The UNIX equivalent is a daemon named AL_JobService.
SNMP Agent
SAP BusinessObjects Data Services error events can be communicated
using applications supported by simple network management protocol
(SNMP) for better error monitoring. Install an SAP BusinessObjects Data
Services SNMP agent on any computer running a Job Server. The SNMP
agent monitors and records information about the Job Servers and jobs
running on the computer where the agent is installed. You can configure
network management software (NMS) applications to communicate with the
SNMP agent. Thus, you can use your NMS application to monitor the status
of jobs.
Adapter SDK
The SAP BusinessObjects Data Services Adapter SDK provides a Java
platform for rapid development of adapters to other applications and
middleware products such as EAI systems. Adapters use industry-standard
XML and Java technology to ease the learning curve. Adapters provide all
necessary styles of interaction including:
• reading, writing, and request-reply from SAP BusinessObjects Data
  Services to other systems
• request-reply from other systems to SAP BusinessObjects Data Services
Optional components
Multi-user
SAP BusinessObjects Data Services Multi-user is an advanced optional
component that enables your development team to work together on
interdependent parts of an application through all phases of development.
While each user works on applications in a unique local repository, the team
uses a central repository to store the master copy of the entire project. The
central repository preserves all versions of an application's objects, so you
can revert to a previous version if needed.
Multi-user development includes other advanced features such as labeling
and filtering to provide you with more flexibility and control in managing
application objects.
For more details, see the Management Console: Administrator Guide and
the Advanced Development Guide.
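To make the versioning behavior concrete, here is a minimal sketch of a central repository that keeps every checked-in version of an object so that an earlier one can be retrieved. The `CentralRepository` class, its method names, and the sample object definitions are hypothetical; they model the concept only and are not the product's interface.

```python
class CentralRepository:
    """Toy central repository that keeps every checked-in version of an object."""

    def __init__(self):
        self._versions = {}  # object name -> list of definitions, oldest first

    def check_in(self, name, definition):
        """Store a new version and return its 1-based version number."""
        history = self._versions.setdefault(name, [])
        history.append(definition)
        return len(history)

    def check_out(self, name, version=None):
        """Return the latest version, or a specific earlier one."""
        history = self._versions[name]
        if version is None:
            return history[-1]
        if not 1 <= version <= len(history):
            raise IndexError(f"{name} has no version {version}")
        return history[version - 1]


central = CentralRepository()
central.check_in("df_load_customers", "map NAME -> CUST_NAME")
central.check_in("df_load_customers", "map NAME -> CUSTOMER_NAME")
latest = central.check_out("df_load_customers")
previous = central.check_out("df_load_customers", version=1)
print(latest)
print(previous)
```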
Management tools
SAP BusinessObjects Data Services has several management tools to assist
you in managing your components.
License Manager
The License Manager displays the SAP BusinessObjects Data Services
components for which you currently have a license.
Repository Manager
The Repository Manager allows you to create, upgrade, and check the
versions of local and central repositories.
Server Manager
The Server Manager allows you to add, delete, or edit the properties of Job
Servers and Access Servers. It is automatically installed on each computer
on which you install a Job Server or Access Server.
Use the Server Manager to define links between Job Servers and repositories.
You can link multiple Job Servers on different machines to a single repository
(for load balancing) or each Job Server to multiple repositories (with one
default) to support individual repositories (separating test from production,
for example).
You can also specify a Job Server as SNMP-enabled.
The Server Manager is also where you specify SMTP server settings for the
smtp_to email function.
Related Topics
• Designer Guide: Monitoring Jobs, SNMP support
• Reference Guide: To define and enable the smtp_to function
Operating system platforms
For a detailed list of supported environments and hardware requirements,
see the Supported Platforms document available in the SAP BusinessObjects
Support > Documentation > Supported Platforms/PARs section of the SAP
Service Marketplace: https://service.sap.com/bosap-support. This document
includes specific version and patch-level requirements for databases,
applications, web application servers, web browsers, and operating systems.
Distributed architecture
SAP BusinessObjects Data Services has a distributed architecture. An Access
Server can serve multiple Job Servers and repositories. The multi-user
licensed extension allows multiple Designers to work from a central repository.
The following diagram illustrates both of these features.
You can distribute software components across multiple computers, subject
to the following rules:
• Engine processes run on the same computer as the Job Server that
  spawns them
• Adapters require a local Job Server
Distribute components across a number of computers to best support the
traffic and connectivity requirements of your network. You can create a
minimally distributed system, designed for developing and testing or a highly
distributed system designed to scale with the demands of a production
environment.
Host names and port numbers
Communication between a Web application, the Access Server, the Job
Server, and real-time services occurs through TCP/IP connections specified
by IP addresses (or host names) and port numbers.
If your network does not use static addresses, use the name of the computer
as the host name. If connecting to a computer that uses a static IP address,
use that number as the host name for Access Server and Job Server
configurations.
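Before entering a host name and port in a configuration, it can help to confirm that a TCP connection to that pair can be opened at all. The sketch below is a generic Python check, not a Data Services tool; the `can_connect` helper is hypothetical, and the demonstration listens on a local port chosen by the operating system rather than on any real Job Server or Access Server port.

```python
import socket


def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Self-contained demonstration: listen on an ephemeral local port, then connect.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

reachable = can_connect("127.0.0.1", port)
listener.close()
print(reachable)
```

The `host` argument accepts either an IP address or a computer name, matching the two cases described above.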
To allow for a highly scalable system, each component maintains its own list
of connections. You define these connections through the Server Manager,
the Administrator, Repository Manager, and the Message Client library calls
(from Web client).
Related Topics
• Installation Guide: Preparing to Install the software, Check port assignments
Appendix A
Glossary
ABAP
Advanced Business Application Programming. A fourth-generation
programming language developed by SAP in which SAP Applications
are written.
ABAP data flow
A data flow that extracts data from an SAP Applications source table.
Data Services translates steps you define in an ABAP data flow into
ABAP and then passes the ABAP program back to your SAP
Application system for execution. The resulting table or file resides
on the SAP Application system to be used as a source in the parent
data flow.
ABAP program
A program that executes database operations on an SAP
Applications server. Data Services ABAP data flows generate ABAP
programs.
Access Server
The Access Server dispatches requests to real-time services,
ensuring optimal load balancing and complete life cycle management.
Adapter
An external Data Services interface. There are two types of adapters:
• Custom adapters — Adapters developed using the Adapter SDK
(Software Development Kit)
• Prepackaged adapters — Adapters prebuilt and purchased from
SAP, such as the Data Services Salesforce.com adapter
Address Cleanse
Transforms that produce a correct and complete standardized form
of an input address. The transform can also assign codes for postal
automation and append other useful address information.
address line
A line of data in an address that contains the primary and, possibly,
secondary address. The primary address contains components such
as the primary range, primary name, directionals (post- and pre-),
and the suffix. The secondary address normally contains components
such as the unit designator and the secondary range.
Address Server
A process that provides address validation and correction for the
Global Address Cleanse transform's EMEA engine and Global
Suggestion Lists transform.
Administrator
A browser-based system administration application on the Data
Services Management Console. Use the Administrator to do the
following:
• Execute, schedule and monitor batch jobs
• Add connections to repositories
• Configure the profiler
• Define users for multi-user development (central repository)
• Manage the retention of log files
• Monitor Access Server status and inbound/outbound messages
• Configure Adapter instances (a prerequisite for creating adapter
datastores)
• Configure SAP application client interfaces (to read IDocs)
• Configure, start, stop and monitor real-time services
• Configure Data Services jobs callable as Web services and
generate WSDL
• Set up the SAP RFC Server (to load data into or read data from
an SAP NetWeaver BW system).
after-image
The values in an UPDATE row after the row changes. You use
before- and after-images of UPDATE rows for log-based
changed-data capture (CDC) jobs which Data Services supports.
aggregate function
A function that summarizes data (sums, calculates an average,
identifies a maximum value, and so on). Where possible, Data
Services pushes down the execution of the aggregate function to
the underlying relational database management system (RDBMS).
aggregated data
Data that results when a process combines elements. This data can
be presented collectively or in summary form.
ALE (Application Link Enabling)
An SAP Applications programming-related interface designed to
allow reliable communication across a distributed environment.
Implemented in Data Services with the IDoc interface.
alias
Alternate form or name. Data Services uses aliases in multiple ways,
including the following:
• Aliases are alternate forms that could potentially be matched to
the word. For example, Robert is a personal name alias for Bob.
Alias data is output in the Match_Std fields.
• In the Address Cleanse transforms, an alias is an alternative form
of a primary address line. Aliases apply only to primary addresses
(usually streets), not secondary addresses or last lines.
• You can also create multiple aliases for table owners in a
datastore and then use datastore configurations to change the
alias values. By using aliases instead of real owner names, you
limit the amount of time it takes to port jobs to different
environments.
AMAS
Australia Post’s Address Matching Approval System (AMAS). To
receive postal discounts in Australia, you are required to file an
AMAS report.
application
Another term for a software program.
association matching
A method of matching that combines the results of two or more Match
transforms by using the Associate transform. Association matching
is used to find duplicates based on multiple different match criteria (for
example, based on Name+Address and then SSN+DOB) and bring
them together.
A common use for association matching is to identify customers who
have multiple residences. Examples of such customers could include
students and snowbirds.
attribute
A property created for a type of object.
BAPI
Business Application Programming Interface. A standardized SAP
Applications programming interface that allows non-SAP applications
to access specific business processes and data.
Basis
The SAP infrastructure. Basis is the foundation for all SAP products
based on ABAP.
batch
Executes one job or a series of jobs all at one time. After batch
processing begins, it continues until it is done or until an error occurs.
batch job
The unit of work that can be scheduled independently for execution
by the Administrator. Jobs are special work flows that can be
scheduled for execution, but cannot be called by other work flows
or jobs.
before-image
The values in an UPDATE row before the row changes. You use
before- and after-images of UPDATE rows for log-based
changed-data capture (CDC) jobs which Data Services supports.
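The relationship between a before-image and an after-image can be shown with a short Python sketch. The row layout and both helper functions are hypothetical and only model the two definitions; they do not reflect how Data Services reads database logs.

```python
def update_images(before_row, changes):
    """Return the (before-image, after-image) pair for one UPDATE."""
    after_row = {**before_row, **changes}
    return dict(before_row), after_row


def changed_columns(before_image, after_image):
    """List the columns whose values differ between the two images."""
    return sorted(c for c in after_image if before_image.get(c) != after_image[c])


before, after = update_images(
    {"id": 7, "city": "Berlin", "phone": "555-0100"},
    {"city": "Munich"},
)
print(before)
print(after)
print(changed_columns(before, after))
```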
best record
Contains the most complete, accurate, and up-to-date information.
A best record is created by consolidating data elements from
matching records into a single record. For example, suppose you
found two records that match. One record has a phone number that
is different and more current than the other. You can move the more
current phone number into the other record to create your best
record.
A master record in a match group is also considered a “best” record,
based on the best record priority assigned to the source that the
record was in.
best record priority
Best record priority is a way for you to designate data from a
particular source as having more importance than other data. For
example, because your data warehouse meets your standards for
data, it might carry more weight in the matching process than would
a rented source.
The smaller the priority number, the higher the priority, and the more
likely that records from that source will rise to the top of their match
groups to become master records. Assign a priority of 0 to your best
source, and larger numbers to other sources.
The blank penalty can affect the value of the best record priority.
blank penalty
In the Match transform, tells Data Services that records with blank
fields should be considered less important (as driver or as Master
record) than records with completed fields (blank data = bad data).
Blank penalties increase the value of the best record priority for the
source that the blank field exists in, thereby reducing the priority of
the source. Lowering the priority of a source helps ensure that the
records in that source will not become the master record (or “best”
record) of a match group.
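The interaction between best record priority and blank penalty can be sketched in Python. This is only an illustration, not how the Match transform is implemented: the function name `effective_priority`, the record layout, and the idea of adding one fixed penalty per blank field are all assumptions made for the example.

```python
def effective_priority(record, source_priority, blank_penalty):
    """Return the source priority raised by a penalty for each blank field.

    Smaller numbers mean higher priority, so a penalty lowers the priority.
    """
    blank_fields = sum(1 for value in record.values() if not value.strip())
    return source_priority + blank_fields * blank_penalty


# Hypothetical match group: (record, best record priority of its source).
match_group = [
    ({"name": "Jane Doe", "phone": ""}, 0),          # best source, blank phone
    ({"name": "Jane Doe", "phone": "555-0100"}, 1),  # rented source, complete
]

BLANK_PENALTY = 2  # assumed penalty applied per blank field

# The record with the smallest effective priority becomes the master record.
master, _ = min(
    match_group,
    key=lambda item: effective_priority(item[0], item[1], BLANK_PENALTY),
)
print(master)
```

In this sketch the record from the priority-0 source loses its place as master record because its blank phone field raises its effective priority value above that of the complete record.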
BLOB
A field whose data consists of Binary Large Objects—such as bitmap
graphics, images, OLE objects, metafiles, and so on.
blueprint
A sample Data Quality job that can be used by Data Services without
modification.
Boolean expression
An expression that defines a logical relationship between two or
more items. The expression is either TRUE or FALSE.
breadcrumb
A visual path of your location in the application.
break group
Places records into groups that are more likely to match. For
example, you might want to create a break group based on the first
three digits of the postcode. This break group will ensure that records
with a postcode of 546 are never even compared with records that
have a postcode of 611, saving valuable processing time for all but
the smallest jobs.
Break groups consist of driver and passenger records. Fields
commonly used for creating break groups are postcodes, account
or Social Security numbers, or the first two positions of a street name.
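To make the saving concrete, the following Python sketch groups records by the first three digits of the postcode and counts the record pairs that remain to be compared. It is an illustration only, not Data Services code; the record layout and the helper names `break_groups` and `comparison_count` are assumptions.

```python
from collections import defaultdict
from itertools import combinations


def break_groups(records, key_length=3):
    """Group records by the first `key_length` characters of their postcode."""
    groups = defaultdict(list)
    for record in records:
        groups[record["postcode"][:key_length]].append(record)
    return dict(groups)


def comparison_count(groups):
    """Count the record pairs compared when matching only within each group."""
    return sum(len(list(combinations(group, 2))) for group in groups.values())


records = [
    {"name": "A. Smith", "postcode": "54601"},
    {"name": "Al Smith", "postcode": "54603"},
    {"name": "B. Jones", "postcode": "61101"},
    {"name": "Bob Jones", "postcode": "61107"},
]

groups = break_groups(records)
all_pairs = len(list(combinations(records, 2)))

print(sorted(groups))            # break group keys
print(all_pairs)                 # pairs compared without break groups
print(comparison_count(groups))  # pairs compared within break groups
```

Records in the 546 group are never paired with records in the 611 group, so fewer comparisons are needed than when every record is compared with every other record.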
break key
A user-defined field that is used to create break groups. Create a
break key if the data you want to break on is contained in multiple
fields, such as the postcode and street name.
bulk loading
A software-based mechanism that moves large amounts of data into
a database to achieve optimal performance. Bulk loading is faster
than traditional INSERT statements. This mechanism supports
compression, blocking, and buffering to optimize transfer times.
business component
A set of tables Siebel applications use to create a logical object
called a business object.
business rules
1. Settings within your Data Quality transforms that explain how you
want to process your data. These include things like telling the
Global Address Cleanse transform how to case output data, or
setting up match criteria for a matching process.
2. Business rules can also be used to group validation rules from
Validation transforms for display in the Data Validation reports
in the Management Console.
Business views
Business views in Crystal Reports enable you to control the
presentation of your database to report designers and users.
case-sensitive
Pertaining to the differentiation between upper-case and lower-case
letters. A case-sensitive program differentiates between upper-case
and lower-case letters when evaluating a text string.
CASS
A United States Postal Service (USPS) certification that requires
software vendors to go through a series of tests to prove that their
software correctly codes addresses according to USPS requirements,
and produces the required USPS reports. Long form: Coding
Accuracy Support System
CDC checkpoint
A CDC checkpoint enables Data Services to restrict CDC subscription
reads. After you enable a checkpoint, the next time the CDC job
runs, it reads only the rows inserted into the CDC table since the
last checkpoint.
CDC datastore
A CDC datastore allows you to limit extracted data to changed data
only. A CDC datastore connects a changed-data capture table on a
source database to Data Services.
CDC subscription
A CDC subscription is an option on a source CDC table. You can
define multiple subscriptions on the same CDC table to allow different
data flows to extract data from the same table without corrupting
data extracted by other data flows. A subscription defines the start
and end of your data set, and it is often used with the checkpoint
option.
changed-data capture (CDC)
The process of retrieving changes made to a production data source.
This process consolidates units of work, ensures data is synchronized
with the original source, and reduces load times by loading only
changed data in a warehouse environment.
Citrix MetaFrame XP
Citrix MetaFrame XP software provides an access infrastructure for
enterprise applications. You can use this software to run Data
Services on a server which publishes instances of the Designer and
other Data Services interfaces to users on client computers.
classifications
Indicators to Data Cleanse of the types of situations that apply to
this word. For example, Hewlett is assigned the Firm_Name and
Name_Weak_Family_Name classifications, because it can be used
in both firm and personal names.
client/server
A distributed technology approach where the processing is divided
by function. The server performs shared functions (such as managing
communications and providing database services), while the client
performs individual user functions.
command
A directive given to a program to initiate an action.
Communication Structure
In SAP NetWeaver BW, a data structure that defines a set of
InfoObjects available from an InfoSource to put into InfoCubes.
compare buffer
A part of memory reserved for processing break groups (one break
group at a time) in the Match or Associate transform. A larger buffer
typically helps improve performance.
conditional
A single-use object, available in work flows, that allows you to branch
the execution logic based on the results of an expression. The
conditional takes the form of an if/then/else statement.
constant
A data string that does not change from one record to the next.
content type
Specifies the type of data in a field in your data source. This helps
you map your fields when you set up downstream transforms.
contribution value
A value you assign to a match criteria that represents the importance
(or weight) you place on that criteria’s data. For example, your
organization may place a high degree of importance on the customer
number. For these types of criteria you would assign a higher
contribution value to reflect a higher importance.
The contribution value is part of weighted scoring.
Crystal Reports
A reporting tool that allows users to create feature-rich reports and
integrate them into web and Windows applications.
Ctrl-click
An action to select multiple values within an application. This is
accomplished by pressing the Control key and clicking with the mouse.
cube
1. A multi-dimensional or OLAP database in which data is
summarized, consolidated, and stored in "dimensions" (each
representing information such as customer or product line) and
"measures" (for example sales, cost, or profit), enabling improved
processing time and storage space requirements over traditional
data storage methods such as relational databases.
2. The combination of indexes (dimensions and measures) stored
in SAP NetWeaver BW Accelerator.
custom ABAP program
A custom ABAP program runs an ABAP program and generates a
data set. With a custom ABAP program, you can run an existing
ABAP program as part of a job. Use a custom ABAP program as a
source in a data flow or an ABAP data flow.
custom adapter
An adapter developed using the Data Services Adapter Development
Kit.
custom function
A script you create to evaluate or make calculations on input values
and produce a return value.
Data Cleanse
A transform that identifies and isolates specific parts of mixed data,
and then standardizes the data based on information stored in the
parsing dictionary, business rules defined in the rule file, and
expressions defined in the pattern file.
data extraction
The process of moving data from a database or application source
to a database target (either from a legacy database to a data mart,
or from one data mart to another).
data flow
A reusable object containing steps to define the transformation of
data from source to target. Data flows are called from inside a work
flow or job. You can pass information into or out of data flows using
parameters.
data loading
The process of populating a data warehouse. Data loading is
provided by DBMS-specific load processes, DBMS insert processes,
and independent fast-load processes.
data mapping
The process of assigning a source data element to a target data
element.
data mart
A highly focused version of a data warehouse. Typically created by
a department or division of a company, data marts contain data for
a specific subject area, such as finance or sales. Data Services can
populate a data mart.
data movement
The aspect of the data integration process that includes extraction,
data transformation, and loading (ETL). That which the application
accomplishes as a whole. Do not confuse with data transformation,
which is what happens within one phase of a data flow.
data record
A row of data that is constructed at runtime. The data remains in the
form of the data record throughout the Data Services job.
data salvage
The process of temporarily copying data from a passenger record
to the driver record after the two records are compared. The data
that’s copied is data that is found in the passenger record, but is
missing or incomplete (initials, for example) in the driver record. Data
salvaging prevents blank matching or initials matching from matching
records that you may not want to match.
Data Services
A software system that allows users to build and execute applications
with which they can create and maintain data warehouses.
Data Services consists of several components:
Data Services engine
The core process that reads job information from the Data Services
repository and sets up run-time processes that execute the job. The
run-time processes extract, transform, and load relational and
hierarchical data. The Job Server starts the Data Services engine
to execute batch or real-time jobs.
Data Services interface
A program that Data Services uses to access data sources. Specific
interfaces vary by installation. There are internal interfaces (those
native to the installation) and external interfaces (those that you
install separately). Internal interfaces allow Data Services to access
applications like SAP Applications and SAP NetWeaver BW,
messages, relational database systems, and legacy systems. An
external interface is also known as an adapter. It allows Data
Services to access applications using information exchange
technologies such as JMS (Java Messaging Services) or
Salesforce.com.
Data Services repository
The database that contains information about a Data Services
application. The repository contains information about defined
reusable objects, the metadata for sources and targets, transforms
and functions. The repository also contains the job history and
runtime statistics information. When you invoke Data Services, you
log in to the repository containing the objects you want to use. You
can use a local repository or a central (shared) repository.
The Data Services profiler uses a profiler repository to store profiling
data. The Cleansing Packages repository stores reference data for
the data cleansing transform.
All repositories are created and maintained with the Repository
Manager.
Data Services service
The process that ensures that the Access Server and the Job Server
are running. You can configure the Data Services service to restart
the Access Server and Job Server whenever the computer where
they are located restarts.
data set
Rows of data with a defined schema. A step in a data flow (such
as reading data from a source, joining data in a Query transform, or
transforming data through another transform) yields a data set. You
can view individual data sets by placing a target table or file at that
point in the data flow.
data source name (DSN)
Provides connectivity for a Windows user to a database through an
Open Database Connectivity (ODBC) driver. The DSN may contain:
database name, directory, database driver, user ID, password, and
other information.
data transformation
The phase of the data movement process that occurs between
extraction and loading. Do not confuse with data movement, which
is what the data flow accomplishes as a whole. Data transformation
describes a process, while a transform is a tool (a step, icon, or
object) in Data Services that enacts the transformation (such as
query, merge, or data cleanse).
data transport
A step in an ABAP data flow that defines a target to store the data
set extracted during the flow. You can locate the target file on the
SAP Application server or in a location accessible to both the SAP
Application server and to Data Services across a network.
data type
The format used to store a value. Data types can imply a default
format for displaying and entering the value. Data read from a source
is converted to the appropriate Data Services data types; data loaded
to a target is converted from its Data Services data type to the type
appropriate for the target.
data validation
Defining rules to which correct data should conform. In Data Services,
you define these rules in the Validation transform. You can separate
data that passes the validation rules from failed data.
Data Validation dashboard
A category of graphical reports in the Management Console to
evaluate the reliability of your target data based on the validation
rules you created in your Data Services batch jobs. This feedback
allows business users to quickly review, assess, and identify potential
inconsistencies or errors in source data.
data warehouse
A Data Warehouse houses a standardized, consistent, clean and
integrated form of data sourced from various operational systems
in use in the organization, structured in a way to specifically address
the reporting and analytic requirements. Data Services can populate
a data warehouse.
database
A collection of tables managed by a DBMS such as Microsoft SQL
Server or Oracle.
database link
Communication path from one database server to another. The
datastores in a database link relationship are called linked datastores.
Data Services uses linked datastores to enhance its performance
by pushing down operations to a target database using a target
datastore.
DataConnector
DataConnector operator instances are used to read data files
generated by Data Services when performing bulk loading using the
Teradata Warehouse Builder.
datastore
A logical channel connecting Data Services to a source or target
application. Different datastore types include database, application,
web service, and adapters. The datastore definition typically includes
the name and location of the database as well as user authentication
information. Data Services uses a datastore definition to qualify a
table name wherever a table is indicated in a diagram or expression.
You can access the datastore definition through the object library.
datastore configuration
Defines a connection to a particular database from a single datastore.
DBMS (database management system)
A software system that builds and maintains database tables.
debug mode
Allows you to diagnose errors while executing a job using the
interactive debugger features in the Designer.
degree of parallelism (DOP)
A property of a data flow that defines how many times each transform
defined in the data flow replicates for use on a parallel subset of
data.
For example, if you set the Degree of parallelism to 4, then when
the job executes, Data Services replicates each transform in the
data flow four times. Each of these replicated transforms executes
in parallel using a separate thread. The operating system will
distribute the threads among the available CPUs.
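The general idea of replicating a transform over parallel subsets of data can be sketched in Python. This is a simplified illustration, not the Data Services engine: the stand-in `transform` function, the helper `run_with_dop`, and the round-robin split of rows are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def transform(rows):
    """Stand-in transform: upper-case every value in a subset of rows."""
    return [row.upper() for row in rows]


def run_with_dop(rows, degree_of_parallelism):
    """Split rows into subsets and run one copy of the transform per subset."""
    subsets = [rows[i::degree_of_parallelism] for i in range(degree_of_parallelism)]
    # One worker thread per replicated transform.
    with ThreadPoolExecutor(max_workers=degree_of_parallelism) as pool:
        results = list(pool.map(transform, subsets))
    # Recombine the partial results into one data set.
    return [row for subset in results for row in subset]


rows = ["a", "b", "c", "d", "e", "f", "g", "h"]
output = run_with_dop(rows, 4)
print(sorted(output))
```

With a degree of parallelism of 4, the eight rows are divided into four subsets of two rows each, and each subset is processed by its own copy of the transform.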
delimited flat file
A data file in which each column value is separated by a delimiter,
such as a comma, semicolon, tab, space, and so on. Each row starts
a new line.
delimiter
Data Services has three types of delimiters: column, row, and text
(character string). To separate columns, a delimiter can be a tab,
semicolon, comma, space, or any character sequence. To separate
rows of data, a delimiter can be a {new line} or any other character
sequence. To denote the start and end of a character string, a
delimiter can be single quotation marks ('), double quotation marks
("), or {none}.
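The three delimiter roles can be illustrated with Python's standard `csv` module, used here only as an analogy for how a delimited file is read. The sample data is invented: a semicolon is the column delimiter, a new line is the row delimiter, and a single quotation mark is the text delimiter.

```python
import csv
import io

# Column delimiter: semicolon. Row delimiter: new line. Text delimiter: '
raw = "id;name;city\n1;'Doe; Jane';Ottawa\n2;'Roe, Rick';La Crosse\n"

reader = csv.reader(io.StringIO(raw), delimiter=";", quotechar="'")
rows = list(reader)

for row in rows:
    print(row)
```

Because the name values are enclosed in text delimiters, the semicolon inside `'Doe; Jane'` is read as part of the value rather than as a column separator.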
delivery point code
A two-digit number derived from the primary range (house number).
This number is used in the generation of a DPBC barcode.
Delivery Point Validation (DPV)
A technology that assists you in validating the accuracy of your
address information with the USA Regulatory Address Cleanse
transform. With DPV, you can identify addresses that are
undeliverable as addressed and determine whether or not an address
is a Commercial Mail Receiving Agency (CMRA).
Designer
A graphical user interface that allows you to design and test Data
Services jobs.
destination record
A location where you place your updated or “best” data when creating
a best record. A destination record can be either a master record, a
subordinate record, or both in a match group.
diacritical character
A character that contains an accent, dieresis (umlaut), tilde, cedilla,
or other distinguishing marks (for example, ä or Ç). You can choose
to have standardized data with these types of characters. The
application uses the Latin-1 code page for assigning these accents.
diagram
The icons and connections between the icons that make up the
definition of a job, work flow, or data flow. Diagrams appear in the
Designer workspace.
dictionary
Relational database that contains a lexicon of words and phrases
that the data cleansing packages and the Data Cleanse transform
use to identify, parse, and standardize data.
directional
A component of the address line that indicates direction. For
example, North in “211 N. 115th St.”
discrete field
Input or output data that has separate fields for each piece of
information, such as addresses and names.
discrete format
Input source format in which pieces of data are parsed down to
nearly the most distinct level. For example, a “first name” field would
be discrete, whereas a “name” field that could contain first, middle,
or last name information would not be discrete.
domain value
In PeopleSoft, the category name (or link) between a value and its
description.
downstream
A data flow object, such as a transform, that is placed after another
data flow object in a job.
DPBC (Delivery Point Barcode)
A form of Postnet barcode, consisting of 62 bars and based on the
combination of ZIP Code, ZIP+4, DPBC, and a check digit.
drill down
A method of exploring detailed data that was used in creating a
summary level of data. Drill-down levels depend on the granularity
of the data in the data warehouse.
driver record
A record that drives the comparison process. Driver records are part
of a break group and are compared with passenger records to
determine matches.
Driver records are chosen based on the driver order you assign to
a source. (In general, a source with your best data should be used
first.) After a driver record has been compared with all of the
passenger records, the next passenger record in the break group
becomes the driver record.
If you do not reorder your break groups using Group Prioritization,
the driver record is the first record in the break group.
DTD
Document type definition. A text file that describes the elements
(tags) in an XML document and the relationship among them. When
an XML document is used to describe a transaction, the DTD
describes the data schema used in the transaction.
dual address
A dual address occurs when a record contains two address lines.
Two combinations are typical:
• PO box and street address:
1000 Main Street, Suite 51
PO Box 2342
• Rural route or Highway Contract and street address:
RR 1 Box 345
12784 Old Columbus Road
dual names
Two names included on an address line, for example, John and Jane
Doe.
Early Warning System (EWS)
A solution for matching valid delivery points that have been created
between updates to the national ZIP+4 directory. EWS uses four
months of rolling data found in an intermediate directory that is
updated weekly with data from the USPS.
EDI
Electronic Data Interchange. Electronic exchange of structured data
between businesses. This exchange is not dependent on hardware,
software, or communication protocols.
element
A component found within XML Schemas and DTDs.
eLOT
Enhanced Line of Travel (eLOT) takes Line of Travel one step further
in the presorting process. The original line of travel (LOT) narrowed
down the mail carrier’s delivery route to the block face level (ZIP+4
level) by discerning whether an address resided on the odd or even
side of a street or thoroughfare.
eLOT narrows the mail carrier’s delivery route walk sequence to the
house (delivery point) level. This allows you to sort your mailings to
a more precise level.
embedded data flow
A data flow with an open begin or an open end point that can be
used inside another data flow. An embedded data flow can be a
combination of sources or targets and transforms, and is mainly
used to reduce the visual complexity of a diagram in a data flow. An
embedded data flow can be reused in multiple other data flows.
Enterprise application
Enterprise applications enable enterprises to execute and optimize
business and IT strategies in domains like ERP (Enterprise Resource
Planning), CRM (Customer Relationship Management) or SCM
(Supply Chain Management). Enterprise applications usually store
data in a relational database optimized for operational use. SAP
provides these solutions through the SAP Business Suite. Data
Services supports both SAP's own solution as well as third-party
solutions like Oracle e-Business Suite, Siebel, JD Edwards or
PeopleSoft.
ERP system (Enterprise resource planning system)
An enterprise application from which Data Services can extract
data. SAP offers this system as a solution part of the SAP Business
Suite.
exception
An error that occurs while executing a job. You can catch individual
or groups of exceptions using a try/catch block inside a work flow.
Catching an exception allows you to automatically execute a solution
for the error.
expression
A combination of variables, parameters, constants, and functions
linked by operation symbols and any required punctuation that
describe a rule for calculating a value. Expressions are used in
conditionals, functions, scripts, transforms, and while conditions to
route information and change fields.
extract date
The date that data was extracted.
extract frequency
The interval at which data is extracted, such as daily, weekly,
monthly, or quarterly. The frequency that data extracts are needed
in the data warehouse is determined by the shortest frequency
requested through an order, or by the frequency required to maintain
consistency of the other associated data types in the source data
warehouse.
fault code
A numeric value that is assigned to a record after the USA Regulatory
Address Cleanse transform validation process that signifies that the
particular record was not successfully validated. Each numeric value
represents a different type of fault.
file format
A description of how data is or should be organized in a file that Data
Services reads from or loads to. A file format can be specific to a
single file or generic for many files.
filter
An expression that limits the data returned.
fixed-width flat file
A data file in which each column of data is the same width.
flat file
A flat file is a file containing records, generally one record per line.
Fields may have a fixed width with padding, or be delimited by tabs,
commas (CSV), or other characters. There are no structural
relationships. The data is “flat” like a sheet of paper, in contrast to
more complex models such as a relational database.
FSA (Forward Sortation Area)
The first three characters of a Canadian alphanumeric postal code.
For example, K1A in the postal code for Canada Post’s Ottawa
headquarters, K1A 0B1.
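As a small illustration, the FSA can be taken from a postal code with a few lines of Python. The function name `forward_sortation_area` is hypothetical, and the check is limited to length; it does not validate the letter and digit pattern.

```python
def forward_sortation_area(postal_code):
    """Return the first three characters of a Canadian postal code."""
    normalized = postal_code.replace(" ", "").upper()
    if len(normalized) != 6:
        raise ValueError(f"expected a six-character postal code: {postal_code!r}")
    return normalized[:3]


print(forward_sortation_area("K1A 0B1"))  # K1A
```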
function
A program that operates on values that are passed to it. Data
Services functions are available through a function wizard in a script,
conditional, or Query transform. Data Services also gives you access
to functions provided by the DBMS you are using. In addition, you
can define your own functions using the Data Services scripting
language.
gathering
Recombines terms that belong together, such as alphanumeric terms
that you would look up together in the dictionary. For example, if
Data Cleanse breaks 1st into "1" and "st", then gathering recombines
them to 1st.
gender
A code that indicates the likelihood of a record being a certain
gender. This code is derived from the name and has the following possible
values: strong male, strong female, weak male, weak female,
ambiguous, and unassigned. For example, a record marked as
“strong male” indicates a high likelihood that the person is male.
generated field
A field that is generated on output by a transform. For example, a
postcode field generated by the Global Address Cleanse transform.
GeoCensus
A directory that contains latitude, longitude, census tract, and block
information. That information sets the stage for mapping,
demographic marketing, and other applications of your address data.
global suggestion lists
Global suggestion lists offer a way to complete and populate
addresses with minimal data, or it can offer suggestions for possible
matches. This address-entry system is ideal in call center
environments or any transactional environment where data cleansing
is necessary at the point of entry. It's also a research tool to manage
bad addresses from a previous batch process.
Global suggestion lists are available with the Global Suggestion Lists
transform.
highest level object
The object that is not a dependent of any object in the object
hierarchy.
host name
The computer’s network name (or IP address). Used most often in
Data Services to specify a computer where the Web application, the
Access Server, the Job Server, and real-time services reside.
hybrid format
A format for records in which some fields are discrete, whereas
others are in a multiline format.
IDoc
Intermediate Document. An SAP-specific format. Used for EDI
(Electronic Data Interchange) and ALE (Application Link Enabling).
IDoc type
Indicates the SAP format that is used to interpret the data of a
business transaction. Consists of the following components:
• A control record: Identical for each IDoc type.
• Several data records: A single data record consists of a fixed key
part and a variable data part. The data part is interpreted using
segments, which differ depending on the IDoc type selected.
• Several status records: Identical for each IDoc type. Describe
the status states an IDoc has already passed through or the
status an IDoc has attained.
impact and lineage analysis
The category of reports on the Management Console that shows
the relationship between source and target tables on Data Services,
and with SAP BusinessObjects Enterprise objects such as universes,
business views, and reports.
import
The process of acquiring information for the Data Services repository.
Import the following kinds of information into Data Services:
• The metadata for source and target databases
• Descriptions and code for user-defined and DBMS functions and
transforms
• ATL or XML files with definitions of Data Services objects that
were previously exported out of another Data Services
repository.
InfoArea
In SAP NetWeaver BW, an element for grouping meta-objects in the
BW system. Each InfoProvider is assigned an InfoArea. The resulting
hierarchy is displayed in the Data Warehousing Workbench.
In addition to their properties as InfoProviders, InfoObjects can
also be assigned to different InfoAreas.
InfoCube
In SAP NetWeaver BW, a type of InfoProvider.
An InfoCube describes a self-contained dataset (from the reporting
view), for example, for a business-oriented area. This dataset can
be evaluated with the BEx query.
An InfoCube is a set of relational tables that are created in
accordance with the star schema: a large fact table in the center,
with several dimension tables surrounding it.
InfoObject
In SAP NetWeaver BW, Business evaluation objects (for example,
customers or sales) are called InfoObjects.
InfoObjects are subdivided into characteristics, key figures, units,
time characteristics, and technical characteristics (such as request
numbers).
InfoPackage
In SAP NetWeaver BW, describes which data in a DataSource should
be requested from a source system. The data can be precisely
selected using selection parameters (for example, only controlling
area 001 in period 10.1997).
An InfoPackage can request the following types of data:
• Transaction data
• Attributes for master data
• Hierarchies for master data
• Master data texts
InfoPackages are also used to start Data Services jobs to load data
into SAP NetWeaver BW.
InfoSource
In SAP NetWeaver BW, a structure that consists of InfoObjects and
is used as a non-persistent store to connect two transformations.
input fields
Original fields in your input sources.
interactive debugger
A Designer feature that allows you to step through the data of a job
one row at a time using filters and breakpoints on a line. As with
executing a job, you can start the interactive debugger from the
Debug menu when a job is active in the workspace. While in debug
mode, all other Designer features are set to read-only.
interface
Data Services offers two types of interfaces:
An internal Data Services interface allows you to create datastore
connections to natively supported applications.
An external Data Services interface (or adapter) allows Data Services
to communicate with information exchange technologies. The
Salesforce.com adapter is one example.
intersource match
Match between records of different sources.
intrasource match
Match between records within a source.
JDBC
A Java API developed by Sun Microsystems that acts as an interface
between a developer’s Java code and a database. It provides a
mechanism for the developer to connect to a specified database,
request information about the database, and then select information
from it. Long form: Java Database Connectivity
job
The unit of work that can be scheduled independently for execution
by the Administrator. Jobs are special work flows that can be
scheduled for execution, but cannot be called by other work flows
or jobs.
Job Server
A process that receives requests from the Designer and the
Administrator to start and stop jobs. To start batch or real-time jobs,
the Job Server triggers the Data Services engine. Engine processes
run on the same computer as the Job Server process that triggers
them.
join rank
A value given to or calculated for all data sets in a data flow. Data
Services uses the join rank to determine which source to read first
when assembling the data set in a join. Data Services uses the
source with the lower join rank as the inner source of the join and
uses the source with the higher join rank as the outer source of the
join.
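The ordering rule in the join rank definition can be sketched in Python. This is an illustration only, not Data Services code: the Source class, the pick_join_roles function, and the table names are hypothetical, and the sketch assumes exactly two sources with distinct join ranks.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    join_rank: int


def pick_join_roles(a: Source, b: Source) -> tuple[Source, Source]:
    """Return (outer, inner): the higher join rank becomes the outer source."""
    if a.join_rank == b.join_rank:
        raise ValueError("this sketch assumes distinct join ranks")
    return (a, b) if a.join_rank > b.join_rank else (b, a)


# Hypothetical sources; the rank values are arbitrary.
orders = Source("orders", join_rank=100)
customers = Source("customers", join_rank=10)

outer, inner = pick_join_roles(customers, orders)
print(outer.name, inner.name)
```

The argument order does not matter; only the relative join ranks decide which source is treated as outer.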
key
A value used to identify a record in a database.
key figure
In SAP NetWeaver BW, an InfoObject that represents a numeric
fact.
lastline
The lastline of an address contains components such as the locality,
region, and postcode (and it may contain the country name).
license-controlled feature
A Data Services feature that is enabled or disabled based on the
product license. The product license controls which icons and settings
are available in Data Services as an internal Data Services interface.
line of travel (LOT)
A sorting sequence in which ZIP+4 codes are arranged in the order
that they are served by the mail carrier. LOT sequencing is required
for some bulk mailing discounts.
linked datastores
The datastores in a database link relationship. A database link stores
information about how to connect to a remote data source, such as
its host name, database name, user name, password, and database
type. Data Services uses linked datastores to enhance its
performance by pushing down operations to a target database using
a target datastore.
Local Delivery Unit (LDU)
The last three characters of a Canadian alphanumeric postal code.
For example, 0B1 in the postal code for Canada Post’s Ottawa
headquarters, K1A 0B1.
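As a minimal sketch of the LDU definition, the following Python function splits a Canadian postal code into its first three characters (the FSA named under postal code) and its last three characters (the LDU). The function name is hypothetical, and the sketch only checks the length of the code, not whether it is a valid Canada Post code.

```python
def split_canadian_postal_code(code: str) -> tuple[str, str]:
    """Split a Canadian postal code into (first three, last three) characters."""
    compact = code.replace(" ", "").upper()
    if len(compact) != 6:
        raise ValueError(f"expected 6 characters, got {len(compact)}")
    return compact[:3], compact[3:]


fsa, ldu = split_canadian_postal_code("K1A 0B1")
print(ldu)  # 0B1
```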
locale
A set of parameters that define the user's language, country and
any special variant preferences that the user wants to see in their
user interface. A locale identifier consists of a codepage, a language
identifier and a region identifier.
locality
A part of the address line of a record. Locality most often refers to
the city or town. In some countries, such as the United Kingdom,
locality can extend to include district.
Locatable Address Conversion System (LACS)
A database of addresses that have been permanently converted,
usually due to 911 emergency system implementation. The changes
often consist of conversion from rural-style addressing to
standardized, city-style addressing, or renumbering of existing
city-style addresses.
lookup table
Contains data that other tables can reference with lookup functions
that return one or more output columns.
mail piece unit
Typically referred to as a version identifier for printers, it represents
the unique characteristics of a portion of a mailing. Every segment
within a Mail.dat must have at least one mail piece unit.
mapped field
A field in a specific transform, for which it has been defined which
field it should read from upstream transforms.
master record
The first record in a match group. You can control which record is
the master record by using the Group Prioritization operation in the
Match transform.
match criteria
A group of options that determine the rules for matching on particular
data.
match group
A group of records found to be matching with each other. A match
group consists of a master record and subordinate records.
match level
A match level designates the level in "hierarchical" type matching.
One Match set can have one or more match levels. Duplicates that
are found at one level are passed to the next level, where they are
compared based on that level’s keys, and so on. For example, you
could use multiple match levels if you wanted to detect duplicates
at the household (residence), family, and individual level.
The order of the match levels is important because duplicates are
found at each level, and only the results are made available for the
next level. Usually, you will define your “broadest” match levels first,
followed by more specific match levels.
match set
A group of criteria used to perform matching on your data.
A typical setup might have only select data reaching each match set
for comparison. For example, you might want to exclude blank SSNs
(Social Security Numbers), certain foreign addresses, and so on
from reaching a particular match set. A match set also allows for
multiple match sets to be considered for association in a combined
match set.
matching record
A group of records found to be matches based on the criteria and
business rules you choose. The records do not necessarily have the
same data.
memory datastore
A datastore connection/container for memory tables.
memory table
Internal Data Services table used to store a data set in memory while
a job runs. Use instead of staging tables to improve performance of
a real-time job built with multiple data flows. Use a memory table to
move a data set between data flows.
message
Represents hierarchical data (such as a header with line items) for
document-oriented transactions (such as a purchase order).
metadata
In Data Services, information acquired and maintained to describe
tables in source and target databases. This information includes the
names of tables and their columns, and the data types of the
columns.
In general, metadata typically includes a description of data models,
a description of the layouts used in database design, the definition
of the system of record, the mapping of data from the system of
record to other places in the environment, and specific database
design definitions.
multi-source
Records that appear on two or more sources. For example, let’s say
you’re bringing together customer sources from several direct
marketers or publishers. Your best prospects may be the people
whose names appear on two or more sources, indicating they may
be most receptive to your offer.
multiline
The multiline format is a database record format in which address
data is not consistently located in the same arrangement in all
records. That is, data items “float” among fields. For example, an
input source may have fields named Line1, Line2, Line3, and Line4
that contain various categories of name and address data, as well
as non-address data.
nested data
Data in one table that is related to a single row of another table. A
nested table appears in Data Services as a column in a parent table.
Columns in the nested table can themselves contain tables.
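The shape of nested data can be pictured with a plain Python structure. This is only an analogy under stated assumptions: every field name and value below is hypothetical, and Data Services stores nested tables in its own schema format, not as Python dictionaries.

```python
# One parent row. The "line_items" column holds a nested table, and each
# nested row has a "shipments" column that holds another nested table.
order = {
    "order_id": 1001,
    "customer": "Acme",
    "line_items": [
        {"item": "widget", "qty": 2,
         "shipments": [{"date": "2010-01-05", "qty": 2}]},
        {"item": "gadget", "qty": 1, "shipments": []},
    ],
}

# The nested rows all relate to the single parent row above.
total_qty = sum(item["qty"] for item in order["line_items"])
print(total_qty)
```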
normal source
A source of records that the application should consider to be good,
eligible records in a matching or association process.
North American Numbering Plan (NANP)
Telephone numbering plan shared by 19 North American countries.
These countries include the United States and territories, Canada,
Bermuda, Anguilla, Antigua & Barbuda, the Bahamas, Barbados,
the British Virgin Islands, the Cayman Islands, Dominica, the
Dominican Republic, Grenada, Jamaica, Montserrat, St. Kitts and
Nevis, St. Lucia, St. Vincent and the Grenadines, Trinidad and
Tobago, and Turks & Caicos.
null
The absence of a value within a database field for a given record. It
does not mean zero because zero is a value.
object
Any item that you create in the Designer. Data Services distinguishes
two classes of objects: reusable objects that are complete and can
be reused in your projects (such as data flows) and single-use objects
that only appear as components of other objects (such as a try/catch
block). This distinction affects how you create and retrieve each type
of object.
object definition
The options that describe the operation of an object. To view and
modify an object definition, open the object so that its definition
appears in the workspace.
object dependent
An object associated beneath the highest level object in the
hierarchy.
object library
A tool in the Designer that gives you access to reusable objects.
object version
An instance of an object. Each time you add or check in an object
to the central repository, Data Services creates a new version of the
object. The latest version of an object is the last or most recent
version created.
ODBC (Open Database Connectivity)
A standard developed by the Microsoft Corporation. It is an interface
that gives applications the ability to retrieve data in data management
systems using SQL for accessing the data. Such an interface allows
a developer to develop, compile, and ship applications without
targeting specific database management systems.
ODS (Operational data store)
An OLAP-designed relational database that an enterprise has
designated as the operational database of record (for example, a
finance department might use an ODS to close its books).
OLAP (Online Analytical Processing)
An approach to quickly answer multi-dimensional analytical queries.
Databases configured for OLAP use a multidimensional data model,
allowing for complex analytical and ad-hoc queries with a rapid
execution time. OLAP systems are used in a query environment,
such as for a business intelligence application.
OLTP
Online transaction processing. A relational database design optimized
for operational use. OLTP systems are used in an operational
environment, such as for an enterprise application.
open hub destination
An object within the open hub service that contains all information
about a target system for data in an InfoProvider. The open hub
service enables you to share data from an SAP NetWeaver BW
system to non-SAP data marts, analytical applications, and other
applications such as Data Services. It ensures controlled distribution
and the consistency of data across several systems.
operation code
A flag associated with a row in a data set that indicates the status
of the data in the row. The operation codes are INSERT, UPDATE,
DELETE, and NORMAL.
operational dashboard
A category of reports on the Management Console to see at a glance
the status and performance of job and data flow executions over a
given time period.
option
Business rules that can be set for a Data Quality transform that
specify how you want to process your data. Each Data Quality
transform has a different set of available options. Options and their
values are displayed in the Option Editor.
Option Editor
A tab in a Data Quality transform editor through which you can
change the value for each option within the transform.
Option Explorer
A pane in the Associate, Match, and User-Defined transform editors.
The Option Explorer shows a list of the option groups within a
transform.
option group
Contains a set of options that allow you to set different business rules
for a transform. These are displayed in the Option Explorer.
other source
In a Match transform, a source of records that should be treated as
transparent, such as seed sources. They are not counted in
determining how to characterize a match group—for example,
multi-source or single-source. For example, some mailers use a
seed source of potential buyers who report back to the mailer when
they receive a mail piece so that the mailer can measure delivery.
parameter
A value passed to a work flow or data flow when that flow is called.
partition
To divide table data into sets based on criteria such as a range or
list of values in each row. You can configure Data Services to read
and write partitioned table data in parallel threads. Designing jobs
with partitioned table data can improve job performance if a Job
Server's computer memory and number of CPUs support the job's
parallel-processing configuration settings.
passenger record
The records that are compared against driver records in a break
group. After a driver record has been compared with every passenger
record in a break group, a passenger record can become the new
driver record in the break group, or it can be found to be a match
with a driver record. At this point it is taken out of the comparison
process.
pattern file
User-defined patterns are stored in a pattern file. The pattern file is
a plain text file and can be edited in any text editing program. The
pattern file is used by the Data Cleanse transform.
pick list
A type of list returned by the Global Suggestion Lists transform that
is used to narrow down an address by starting with minimal
information. A pick list returns possibilities in a similar manner to a
suggestion list. You can pick an entry from this list to continue
processing.
PMB (Private mail box)
Private mail boxes are like post-office boxes but they are hosted by
private companies. The USA Regulatory Address Cleanse and the
Global Address Cleanse transforms can recognize certain forms of
PMB data when it appears in an address line.
postal address
A delivery address that is a rural route or box number.
postal code
A system of letters and/or digits used for sorting mail. Examples
include the ZIP Code used in the United States and the alphanumeric
FSA LDU system used in Canada.
postcode move
A valid postcode that has been split or moved, so that part of the
area formerly covered by one postcode is now covered by two or
more postcodes, including the original one.
Postcode2
The secondary part of a postal code. For example, in the United
States, a postcode is composed of two parts (54601-4051). The first
five digits are followed by a hyphen and a four-digit code. The
four-digit code is the Postcode2 for a US postcode.
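A minimal Python sketch of the Postcode2 definition, assuming the two parts are separated by a hyphen as in the example. The function name is hypothetical, and the sketch does not validate digit counts.

```python
def split_us_postcode(postcode: str) -> tuple[str, str]:
    """Return (primary part, Postcode2); Postcode2 is empty if there is no hyphen."""
    primary, _, postcode2 = postcode.partition("-")
    return primary, postcode2


print(split_us_postcode("54601-4051"))  # ('54601', '4051')
```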
prepackaged adapter
An adapter prebuilt and purchased from SAP, such as the Data
Services Salesforce.com adapter.
primary entry
A word or phrase in the dictionary that the data cleansing packages
and Data Cleanse transform use to identify, parse, and standardize
data.
primary key
A column that is guaranteed to contain unique values, and whose
values identify all of the rows in a table.
project
The collection of jobs available in the Designer at a given time. A
project provides a way to organize the objects you create.
property
Detailed descriptive information about objects that you display on
the Designer. It includes information such as when it was created.
Query transform
A data transformation object that you can use to map columns from
a source to a target schema, add new columns to the target schema,
determine the data to extract, and perform operations on the data.
Similar to an SQL SELECT statement, a query creates a data set
that satisfies the conditions you specify.
Rapid Mart
Rapid Mart packages provide prebuilt data mart solutions for
enterprise applications, such as SAP, PeopleSoft, Oracle, and Siebel.
These powerful solutions combine domain knowledge and data
integration best practices in prebuilt data models, transformation
logic, and data extraction. Rapid Mart packages are add-ons to
Data Services.
real-time job
A group of objects (data flows, work flows, conditionals, scripts, and
so forth) that execute on-demand as a "request-response" system.
You design real-time jobs in the Designer, then configure them as
real-time services and associate them with an Access Server in the
Administrator, where they are started, managed and monitored.
When a real-time service receives a request from a caller, it
processes the request and returns a reply.
reference file
A file of address data used by Data Services to match, assign,
standardize, and verify addresses. Reference files are also referred
to as postal directories. These files have a .dir extension.
relational data
A data set in which data in each column contains a scalar value.
Data Services can process relational data; it can also process nested
data.
repository
See Data Services repository.
request/acknowledge operation
This operation is used to execute a remote HTTP service in the
Request Acknowledge mode. In other words, it makes the request
to the remote machine where the HTTP Adapter server is running
and does not wait for the reply; instead, it sends an acknowledgement
if the operation is successful.
request/reply
This operation is used to execute a remote HTTP service in the
Request Reply mode. In other words, it makes the request to the
remote machine where the HTTP server is running and waits for the
reply.
reusable object
An object (such as a data flow, datastore, or job) that can be defined,
stored, and reused independent of other objects. Any object that is
visible in the object library.
RFC (Remote Function Call) server
The Data Services RFC server allows third-party programs, including
SAP Applications and SAP NetWeaver BW, to schedule and initiate
Data Services jobs and return the results to Data Services.
RFC server interface
The node on the Administrator application of the Data Services
Management Console where you configure SAP connections to load
data into or read data from an SAP NetWeaver BW system. Data
Services uses the RFC server interface to schedule SAP jobs,
read from SAP open hub destinations, load data into SAP NetWeaver
BW, and to view Data Services logs from SAP NetWeaver BW.
rule file
For the Data Cleanse transform, the rule file controls how the
application parses groups of output type subcomponents for name,
firm, phone, SSN, and other non-address data.
For example, if you input “Mr. and Mrs. John Smith,” the application
could parse it into the individual components “Mr.,” “and,” “Mrs.,”
“John,” “Smith.” This is very useful, but generally, you would also
want to parse the whole group of related data “Mr. and Mrs. John
Smith.” To parse data in this way, you must create rules.
rule matching
Matches the token classifications against defined rules.
sample size
The number of rows to display in the View Data feature.
sampling rate
The number of rows processed after which Data Services writes
information to the monitor log file and updates job events.
sampling rows
The frequency to select a sample row to profile, starting with the first
row of the specified number of sampling rows. For example, if you
set Profiling size to 1000000 and set Sampling rows to 100, the
Profiler profiles rows number 1, 101, 201, and so forth until 1000000
rows are profiled.
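The row selection in the sampling rows example can be sketched in Python. The function name is hypothetical, and the sketch assumes that "until 1000000 rows are profiled" means the Profiler stops once that many sampled rows have been selected.

```python
from itertools import count, islice


def sampled_row_numbers(profiling_size: int, sampling_rows: int):
    """Yield the 1-based row numbers selected for profiling."""
    # Start at row 1, step by the sampling interval, stop after
    # profiling_size rows have been selected.
    return islice(count(start=1, step=sampling_rows), profiling_size)


first_five = list(islice(sampled_row_numbers(1_000_000, 100), 5))
print(first_five)  # [1, 101, 201, 301, 401]
```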
SAP Applications
An ERP system. Formerly known as SAP R/3 or SAP ERP.
SAP BusinessObjects Enterprise
A business intelligence platform that powers the management and
secure deployment of specialized end-user tools for reporting, query
and analysis, and performance management on a scalable and open
services-oriented architecture.
SAP BusinessObjects InfoView
A web-based interface that end users access to view, schedule, and
keep track of published reports. InfoView consolidates the
presentation of a company's business intelligence information and
allows it to be accessed in a way that is secure, focused, and
personalized to users inside and outside an organization.
SAP BusinessObjects Rapid Mart
SAP BusinessObjects Rapid Mart packages provide prebuilt data
mart solutions for enterprise applications, such as SAP, PeopleSoft,
Oracle, and Siebel. These powerful solutions combine domain
knowledge and data integration best practices in prebuilt data
models, transformation logic, and data extraction. Rapid Mart
packages are add-ons to Data Services.
SAP BusinessObjects Web Intelligence
A web-based query and analysis tool that enables users to track,
understand, and manage corporate data using a simple browser as
their interface, while maintaining tight security over data access.
Long form:
SAP NetWeaver Business Warehouse (SAP NetWeaver BW)
SAP NetWeaver Business Warehouse. Formerly known as SAP
Business Information Warehouse.
script
A step in a job or work flow that allows you to calculate values to
pass to other parts of the job or work flow. The script can call
functions, execute if-then-else statements, and assign values to
variables. Write a script in the Data Services scripting language.
secondary information
Assists Data Cleanse in determining how to process the word when
it is used in different ways. Secondary information can include how
Data Cleanse will standardize the output data for the word or
alternate forms that could potentially be matched to the word.
segment
Format with which the data records of IDocs are interpreted.
SERP
Canada Post Corporation’s Software Evaluation and Recognition
Program. Data Quality is certified under this program, allowing you
to receive postage discounts for mailings to and within Canada.
server group
A defined collection of Job Servers on different computers. A server
group automatically measures resource availability on each Job
Server in the group and distributes batch jobs or part of a job to the
Job Server with the lightest load at run time. Use the Server Groups
node in the Administrator’s navigation tree to group Job Servers that
are associated with the same repository into a server group.
service request
Any message sent from a Web client that requires processing by a
real-time job.
similarity score
A percentage that indicates how much two fields or values are
considered alike. This percentage is calculated by the application
after the comparison process. For example, Ron and Rob are
considered 67% alike because two of the three characters are alike.
Similarity scores are used in a number of situations, not just in the
Match transform. For example, they can be used to determine which
suggestions to return for suggestion lists.
The similarity score is not always a direct result of a one-to-one
comparison; it can be altered by some options, such as those defined
in the Match transform, for example.
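The Ron and Rob figure can be reproduced with a simple position-by-position comparison, sketched below in Python. This is an assumption made for illustration: the function name is hypothetical, and the actual comparison performed by the application may differ and, as noted, can be altered by options.

```python
def positional_similarity(a: str, b: str) -> int:
    """Percentage of character positions at which a and b agree."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100  # two empty values are treated as identical
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return round(100 * matches / longest)


print(positional_similarity("Ron", "Rob"))  # 67
```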
single-use object
A step in a work flow or data flow that cannot be saved independently
of the flow. Create single-use objects (such as a try/catch block,
script, or conditional) from the tool palette.
smart editor
A flexible editing tool in Data Services used for creating scripts,
expressions, and custom functions without having to type the names
of existing elements like column, function, and variable names.
SNMP (Simple Network Management Protocol)
A protocol that helps network administrators manage network routing
hardware. The protocol can manage a variety of hardware and
software devices. Data Services supports monitoring through SNMP.
snowbird
A casual term to describe someone who has multiple residences.
This term is derived from individuals who reside in a cooler-climate
region during the summer, and relocate to a home in a
warmer-climate region during the winter.
SOAP (Simple Object Access Protocol)
An XML-based message protocol used to encode the information in
web service request and response messages before sending them
over a network or the Internet.
source
1. An object (table, file, or legacy system) from which Data Services
reads data.
2. For the Match transform, the grouping of records on the basis of
some data characteristic that you can identify. A source might
be all records from one input file, or all records that contain a
particular value in a particular field. Sources are abstract and
arbitrary—there is no physical boundary line between sources.
Source membership can cut across data sources as well as
distinguish among records within a data source, based on how
you define the source.
source group
A group of sources that you can use to prepare a second set of
match statistics, combining the statistics for two or more regular
sources. For example, suppose you define five sources—two house
sources and three rented sources. You would get match statistics
for each individual source. But suppose that you also wanted a
summary for the house sources and a summary for the rented
sources. You could create two source groups—one for the house
sources and one for the rented sources.
Source groups affect only the way that match statistics are reported.
They do not affect matching or record priority.
source record
The record that contains the data you want to use to update or
create your best record. A source record can be the master or
subordinate record of a match group.
SQL (Structured Query Language)
A query language for accessing relational, ODBC, DRDA (Distributed
Relational Database Architecture), or non-relational database
systems.
SQL query tool
An end-user tool that accepts SQL to be processed against one or
more relational databases.
standards
Define how Data Cleanse will standardize capitalization or other
output formatting on data.
star schema
A database design you can use to format data in a data mart. This
design is based on a single fact table to which any number of
dimensional tables may be joined. This type of database design
supports multi-dimensional database analysis.
step
An object that is part of the definition of a work flow or data flow.
Each step is represented by an icon in the diagram of the flow and
is connected to other steps to indicate the flow of data through the
data flow or the order of execution in the work flow.
street address
A delivery address that is the street name and house number.
subordinate record
Records that are part of a match group, and are found to be matches
with (and subordinate to) a master record. Subordinate records can
contain data that may be used to update a master record and, thus,
create a best record.
substitution parameter
A text string "alias" that you can use within your job and transforms.
You define a substitution parameter and its value in a substitution
parameter configuration. Then, at runtime, that parameter is replaced
with its value anywhere it is used in your job.
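The replace-at-runtime behavior can be sketched as plain string substitution in Python. Everything here is hypothetical: the function name, the parameter names, the "$$" prefix, and the paths are illustrative assumptions, not the syntax or mechanism that Data Services itself uses.

```python
def apply_substitutions(text: str, configuration: dict[str, str]) -> str:
    """Replace every parameter name in text with its configured value."""
    # Replace longer names first so that a short name cannot clobber a
    # longer name that starts with it.
    for name in sorted(configuration, key=len, reverse=True):
        text = text.replace(name, configuration[name])
    return text


# Two hypothetical configurations for different run-time environments.
dev = {"$$DataDir": "/data/dev", "$$DataDirArchive": "/archive/dev"}
prod = {"$$DataDir": "/data/prod", "$$DataDirArchive": "/archive/prod"}

print(apply_substitutions("$$DataDir/in.csv", dev))
print(apply_substitutions("$$DataDir/in.csv", prod))
```

Switching from the dev dictionary to the prod dictionary changes every resolved value without touching the text that uses the parameter.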
substitution parameter configuration
The definition of the substitution parameters used throughout your
job in a particular run-time environment. If you change the run-time
environment, you can change the substitution parameter
configuration before you execute the job.
suggestion lists
Normally, when an address cleansing transform looks up an address
in the postal directories, it finds one matching record. Sometimes,
due to incomplete information, there may be two or more records
(or suggestions) in the postal directories that could possibly be the
correct record. Suggestion lists provide you with a list of “matching”
addresses, so that you can choose which is the best address.
suppression source
A source that contains records of information that should be excluded
from other output destinations. The records in the suppression source
are used for matching in other sources. The records that match the
suppression source could then be removed from further processing.
For example, suppression sources may be your own bad-account
file or no-mail sources provided by the government or
direct-marketing association (DMA) to prevent wasted mailings and
offending consumers.
system configuration
Groups together a set of datastore configurations and a substitution
parameter configuration. Datastore configurations define datastore
connections. A substitution parameter configuration can be
associated with one or more system configurations. For example,
you might create one system configuration for your local system and
a different system configuration for another system. When executing
a job, you can specify which system configuration to use.
table
A database table that Data Services reads data from or loads data
into. The path and mechanisms for reading and loading data and
apportioning the data among rows and columns are defined in the
datastore that the table is associated with. Writing a data set to a
database table means sending a combination of rows with the
appropriate operation codes to the database table.
target
An object in which Data Services loads extracted and transformed
data in a data flow. Data Services loads rows flagged as INSERT,
UPDATE, or DELETE.
TCP/IP
Transmission Control Protocol/Internet Protocol. The basic
communication protocol of the internet, and often intranets and
extranets. A computer having direct access to the internet contains
a copy of the TCP/IP program. TCP/IP makes it possible for
computers to communicate with each other.
Tdpid
(Teradata Director Program ID) The server name Data Services
uses when loading with the bulk loader option. Data Services uses
tdpid as a Teradata Warehouse Builder operator attribute.
territory
The locale value for a geographical location (usually the country)
where a locale language is used. The pairing of a language with a
territory determines factors such as date format, time format, decimal
separator, currency format, and so on.
thread
The instance of the program running on behalf of some process.
Data Services typically creates one thread per data flow object. If
you are using parallel objects in data flows, the thread count will
increase to approximately one thread for each source or target table
partition. If you set the Degree of parallelism (DOP) option for your
data flow to a value greater than one, the thread count per transform
will increase. The operating system will distribute the threads among
the available CPUs.
tokenization
Assigns specific meanings to each of the pieces that result from
word breaking. Data Cleanse looks up each individual input word in
the dictionary. A list of tokens is created using the classifications
associated with each word in the dictionary.
tooltip
A small pop-up window with descriptive text.
transfer rule
In SAP NetWeaver BW, transfer rules help you determine how the
fields for the transfer structure are assigned to the InfoObjects of
the communication structure.
transfer structure
In SAP NetWeaver BW, a structure in which data is transferred from
the source system into BW. It displays a selection of fields for an
extract structure of the source system. To an ETL tool like Data
Services, a transfer structure looks like a table.
transform
A step in a data flow that acts on a data set. Data Services transforms
are available through the object library in three categories: Data
Integrator, Data Quality, and Platform.
transparent network substrate (TNS)
The Oracle networking technology that provides a single application
interface to all industry-standard networking protocols. It is stored
in the tnsnames.ora network configuration file. Use a TNS to connect
to your Oracle database or the Data Services Repository (stored in
an Oracle database).
try/catch block
A combination of a try object and one or more catch objects that
define alternate execution paths in case an error occurs during the
execution of a job. You can tune try/catch blocks to trap specific
errors and to provide general alerts or messages if an error occurs.
Unicode