Whitepaper: Converting Data into Information

Process Relations GmbH

Abstract
Every new product or product enhancement starts with a new idea. In the area of
process and device design for high-tech products like MEMS, NEMS, PV-Cells and
Nanomaterials, personal experiences gained through previous developments
provide a major contribution to new developments. Other sources for information
and inspiration include colleagues, scientific papers and old lab books. However,
this is where problems arise. Colleagues are not always available and it is not
always clear if a certain experiment has already been conducted. Lab books are a
great resource for historical data, but in most cases, they are only useful to the
people who wrote them, as they know where to look and how to read them. Even if
computer files are available, they are often scattered across several file servers or
hidden in obscure locations, and they are sorted by only a single, one-dimensional
criterion. Searching from another perspective is almost impossible. Furthermore, every
engineer has his or her own way of storing information, which means that a variety of
office and freeware tools are used to create documentation.
XperiDesk by Process Relations is a Process Development Execution System
(PDES). It is used to organize and track the data and information gathered during
the research phase of process development efforts. XperiDesk offers various tools
to load, to manage and to retrieve data from various sources. It allows engineers to
look at historical and current data and to make connections between the results
gathered. By doing so, XperiDesk enriches the data and converts it into information
that can be used for new product developments.
This whitepaper introduces two clients for the XperiDesk system that address the
“heap” challenge, that is, the heap of historical data and of digital data constantly
being generated on file servers. It explains how these clients can be used to convert
the raw data into usable information within the XperiDesk PDES and what advantages
engineers gain by using them.

www.process-relations.com

Table of Contents
Converting Data into Information
Abstract
The Challenge
Addressing the problem
Excel Client
  Analyzing historical data
  Automated loading of data
File Loading Client
  Analyzing historical data
  Automated loading of data
Summary

The Challenge
Insufficient internal information and knowledge management, which results in recurring engineering
“déjà vus”, is one of the main issues in today’s development organizations. Experience gained in
previous developments, scientific papers, and old lab books make the major contribution to the
realization of new product ideas. When this data has little or no structure, it causes considerable
trouble and duplicated work. Experts in semiconductor process development estimate that 10-15% of failed
and duplicated experiments could be avoided if previous results were more easily accessible. This
ties in with issues arising from engineers moving between projects. Moving the project expert
to a different project might jeopardize the previous project, while engineers joining a running project
are flooded with unstructured information.
Additionally, traditional means of data storage provide only a one-dimensional search criterion.
Cluttered storage of result data and important data kept on local disk drives lead to tedious,
error-prone manual data collection and sometimes even to data loss. Furthermore, often only the bare
data points or result data sets are stored, with limited or no context information. Limited context poses
problems when trying to reproduce previously seen effects and can lead to wrong conclusions in
cause-effect analysis. These circumstances produce “déjà vus” of the form “Once we had a result ...”,
which can be very annoying and costly.
Documenting and reporting the development progress can be tedious at best. Cluttered results storage
places a major manual burden on development engineers, requiring them to collect data by hand from
diverse machinery. Additionally, assembling the collected result data into reports and evaluating it
can take up a major part of engineering time. Reporting on the development status is often more a
manual assembly of reports than an automated process. The input data is often not up to date, so
the Work In Progress (WIP) status is not necessarily accurate. These effects are aggravated by
quality assurance and compliance demands such as ISO 900X, CMMI, SOX etc. Because these increasingly
apply in development as well as in production, there is a strong demand to fulfill the
imposed documentation requirements.

Addressing the problem
One of the most commonly used tools in process development is Excel. Excel is very versatile and can
quickly convert raw numbers into diagrams that visualize relationships. However, Excel was
never intended to replace a database. The data is structured in columns, but even simple searches such as
“show me all results where the resistance is between 500 kOhm and 1 MOhm and the thickness is between
5 nm and 7 nm” are difficult at best. If you want to access context information, such as which project
a specific wafer run belonged to or what other wafers were produced in the same project, the limits of
Excel are reached very quickly. Tables can of course be extended to contain that data, but then they
become unreadable. Additionally, copy and paste tends to be used as a starting point for entering new
data, which is a major source of errors.
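To illustrate why such a range query becomes simple once results are held as structured records rather
than free-form worksheet cells, here is a minimal Python sketch. The record fields (wafer,
resistance_ohm, thickness_nm) and the sample values are invented for illustration; they are not the
XperiDesk data model or query interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    wafer: str              # hypothetical wafer identifier
    resistance_ohm: float   # resistance normalized to ohms
    thickness_nm: float     # layer thickness normalized to nanometers


def in_range(results, r_min, r_max, t_min, t_max):
    """Return results whose resistance and thickness both lie in the given closed ranges."""
    return [
        r for r in results
        if r_min <= r.resistance_ohm <= r_max and t_min <= r.thickness_nm <= t_max
    ]


# Invented sample data, for illustration only.
RESULTS = [
    Result("W01", 750e3, 6.0),
    Result("W02", 1.2e6, 5.5),
    Result("W03", 600e3, 8.0),
    Result("W04", 500e3, 5.0),
]

# Resistance between 500 kOhm and 1 MOhm, thickness between 5 nm and 7 nm.
matches = in_range(RESULTS, 500e3, 1e6, 5.0, 7.0)
print([r.wafer for r in matches])
```

The point of the sketch is that both conditions are evaluated on typed, unit-normalized fields, which a
worksheet of loosely formatted cells does not guarantee.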
Despite these disadvantages, Excel has one big advantage: it is a known tool. Most engineers know how to
work with Excel, and many software products can import and export Excel files. The key is to integrate
the best of both worlds: the familiar, easy-to-use Excel with the power of database functionality.
Combining the two brings many advantages to an R&D organization:
• Existing Excel files can be analyzed and preprocessed to enable comprehensive searches.
• Excel templates can be used to ease the adoption of new methodologies. There is no hard cut
in tool usage; the familiar Excel can be used to feed data into the new database.
• Established data collection procedures can be kept and need to be changed only gradually,
enabling a smooth transition to new procedures.
• Other tools that export Excel files can be integrated without much programming overhead.

Another problem today is the heap of digital data. Digital data in the form of images, analysis result files,
diagrams and other formats is the backbone of any research organization. Many hours are spent
archiving this data and later searching for it. However, common approaches like file servers or even
document management systems don’t account for the complexity of research data.
In many cases, the result files alone are useless. Without knowledge of the production process, a wafer
image provides little data or information value. Result diagrams are likewise of little value without
knowing the conditions under which they were created. More or less complicated hierarchies are used to
compensate for limitations in file systems or in the metadata structures of Document Management Systems
(DMS). However, as soon as a new dimension is added to the problem, all of these approaches fail.
The following two sections introduce two standalone clients for the XperiDesk system that address the
issues described above. The first, the Excel Client, can import table-based data into
XperiDesk, formalize the data on the fly and relate it to pre-existing data, thereby building
information. The second standalone client is the File Loading Client. It can import all types of
files into the system, add metadata to them, and relate them to pre-existing items or create new items
other than files from the information in the file paths. Managed files are also indexed, which enables
fast searching in text files.

Excel Client
The XperiDesk Excel Client was developed to overcome the limitations of Excel as a data store. It allows
users to continue working with Excel, the tool they are used to, while enabling them to ask detailed
questions of the XperiDesk database. The Excel Client can thus be used to extract historical data and to
analyze ongoing experimentation. In addition, it gives the data real meaning: raw numbers are converted
into meaningful measurement results with units and parameters. To make the most of these advantages,
the Excel Client is a standalone client that can be deployed to multiple servers and
workstations. This makes it possible to extract data from different sources and centralize it on servers.

Figure 1: The Excel Client

Analyzing historical data
Often a significant amount of historical data is stored in Excel worksheets. If the same or similar templates
were used to create these worksheets, this large amount of data can easily be imported into
XperiDesk. The Excel Client offers a graphical user interface for defining loading scenarios for Excel
worksheets. The content of the worksheets can be analyzed and modified while the data is loaded into
XperiDesk.
Another benefit comes from the automated generation of relationships. For instance, the name of the
wafer an experiment was performed on can be extracted from any cell or combination of cells in the
worksheet, or even from the sheet, file or path name. This information can be used to automatically link the
experimental result to the specific wafer involved in the experiment. The new information, together with
its relationships to existing data, is thus generated on the fly during the import. This is more than a
raw data import, as it provides a searchable and browsable structure for the data, converting it into
information that can be used for effective decision-making.
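The linking idea can be sketched in a few lines of Python. This is only an illustration of the principle,
not the Excel Client's implementation: the naming convention ("W" followed by digits), the function names
and the dictionary used as a "link" are all assumptions made for this example.

```python
import re

# Assumed naming convention: a wafer name is "W" followed by digits, e.g. "W0042".
WAFER_PATTERN = re.compile(r"W\d+")


def find_wafer_name(*sources):
    """Return the first wafer name found in the given strings (cell, sheet, file or path name)."""
    for text in sources:
        match = WAFER_PATTERN.search(text)
        if match:
            return match.group(0)
    return None


def link_result(result, known_wafers, *sources):
    """Link the result to an already known wafer; return the link, or None if no wafer matches."""
    name = find_wafer_name(*sources)
    if name in known_wafers:
        return {"wafer": name, "result": result}
    return None


known = {"W0041", "W0042"}
link = link_result({"resistance_ohm": 2800.0}, known, "Sheet1", "W0042_sheet_resistance.xlsx")
print(link)
```

Here the sheet name carries no wafer name, so the file name is used instead; the result is only linked
if the extracted name refers to a wafer that already exists in the (hypothetical) set of known wafers.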

Figure 2: Formalized data with relation in the XperiDesk Graph View

From this exercise the user gains a structured repository of the historical data. Data points from different
sources are now related and can be searched using the relationships. If, for instance, two workgroups
performed measurements on the same device, the data can now be merged and referenced. A new quality of
search becomes possible, for example asking for device data from different research groups. All numbers
have a meaning after the import. Units and parameters are attached and are searchable. Dedicated searches,
e.g. for everything with a resistance of less than 3 kΩ, become possible.
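Why attached units matter for such a search can be shown with a small Python sketch. It assumes that
each imported value carries a unit string such as "Ohm", "kOhm" or "MOhm"; the prefix table, the unit
spelling and the sample measurements are assumptions for illustration, not XperiDesk's unit handling.

```python
# SI prefixes used in this sketch; the mapping is deliberately small.
PREFIX_FACTORS = {"": 1.0, "k": 1e3, "M": 1e6, "m": 1e-3}


def to_ohms(value, unit):
    """Convert a value with a unit such as 'Ohm', 'kOhm' or 'MOhm' to ohms."""
    if not unit.endswith("Ohm"):
        raise ValueError(f"not a resistance unit: {unit}")
    prefix = unit[: -len("Ohm")]
    return value * PREFIX_FACTORS[prefix]


# Invented measurements from different sources, each recorded in a different unit.
measurements = [
    ("dev-A", 2.5, "kOhm"),
    ("dev-B", 4700.0, "Ohm"),
    ("dev-C", 0.002, "MOhm"),
]

# "Everything with a resistance of less than 3 kOhm", evaluated in a common base unit.
below_3k = [name for name, value, unit in measurements if to_ohms(value, unit) < 3e3]
print(below_3k)
```

Without the unit, the bare numbers 2.5, 4700.0 and 0.002 could not be compared against the 3 kΩ
threshold at all.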

Figure 3: Search query and result on formalized data

Automated loading of data
The automated loading of data is enabled by the batch mode of the Excel Client. Once a job has been
defined in the graphical user interface, the client can be started from the command line or by a
scheduling service. The Excel files can thus be checked for changes at given intervals, and any changes
are imported into the XperiDesk system.
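How the Excel Client detects changes internally is not described here. As a minimal sketch of the
general idea, a scheduled job could compare file modification times against those recorded on the
previous run. The function name and the dictionary used to remember the previous state are assumptions
for this example.

```python
import os


def changed_files(paths, last_seen):
    """Return the paths whose modification time differs from the one recorded in last_seen.

    last_seen maps path -> modification time (in nanoseconds) from the previous run
    and is updated in place, so the next call only reports newer changes.
    """
    changed = []
    for path in paths:
        mtime = os.stat(path).st_mtime_ns
        if last_seen.get(path) != mtime:
            changed.append(path)
            last_seen[path] = mtime
    return changed
```

A scheduler (for example cron or the Windows Task Scheduler) would call such a function at the chosen
interval and hand only the reported files to the import step.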
The described method can also be used to import external data from project partners. Results from,
e.g., external lab measurements can be linked to the internal research database, where they become
searchable and part of the context. This is extremely valuable for XperiDesk customers who outsource
experiments or parts of experiments to 3rd parties but want to track the results of the overall
experiment internally.
The result of this exercise is again a structured and searchable collection of information. New
questions can be put to the system in seconds instead of the hours it previously took to collect the data
from the different Excel sheets. Imagine a search for all wafers manufactured with a certain combination of
process steps and given processing intervals where the resistance measurement showed a
median resistance of 5 kΩ. Additionally, the “raw” Excel data can be attached so that the source of the
data is available for extended reference.

Figure 4: Import Preview to check data before the import
In summary, the raw Excel data from different worksheets can be collected and structured. Links and
relationships between the data sets are established, and everything is included in the steadily growing
information network of the company, turning the raw data into usable information.

File Loading Client
Because research and process environments rely on file servers just as heavily as on Excel spreadsheets,
the File Loading Client was developed using similar principles. It can be used to analyze distributed file
servers, even at different locations. It can create file containers, called artifacts, containing the files.
Other entities like wafers, lots or experiments can also be created using the file name and path information
from the file server hierarchy. All these entities can be linked together, applying context to the files.

Figure 5: The File Loading Client

Analyzing historical data
One of the tasks the File Loading Client can be applied to is the analysis of historical file hierarchies.
Existing file servers and backups can be loaded and information can be extracted from them. To do so, the
File Loading Client uses customizable patterns. These patterns can be applied to any file and path name to
extract meaningful data. This data can then be used to create new entries in the XperiDesk database,
to update existing ones and to attach files to newly created or existing wafers, experiments and other
entities under management.
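The pattern syntax of the File Loading Client is not shown in this whitepaper. As a rough sketch of the
principle, a regular expression with named groups can pull the same kind of information out of a path.
The directory layout (project/lot/wafer/file), the naming conventions and the function name are
hypothetical.

```python
import re

# Hypothetical hierarchy: <project>/<lot>/<wafer>/<measurement>.<extension>
PATH_PATTERN = re.compile(
    r"(?P<project>[^/]+)/(?P<lot>LOT\d+)/(?P<wafer>W\d+)/(?P<measurement>[^/]+)\.(?P<ext>\w+)$"
)


def extract_entities(path):
    """Return the named parts of a path that matches the pattern, or None if it does not match."""
    match = PATH_PATTERN.search(path)
    return match.groupdict() if match else None


print(extract_entities("mems_gyro/LOT017/W03/sem_crosssection.tif"))
```

Each extracted name (project, lot, wafer) could then be used to create or look up the corresponding
entity and to attach the file to it.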
The File Loading Client is a standalone client that can be deployed to multiple servers. This makes it
possible to extract file data from different sources and to centralize the data. Additionally, new views of
the data become available to everyone working with the system. Users are now able to look at the result
files of other departments (if security rights permit it). Raw data is transformed into information with
context that can now be used to further research projects throughout the company.

Automated loading of data
The File Loading Client is also of use once the historical structures have been analyzed. Many tools will
continue to generate digital data, and it is important to continuously archive these results in the context
of the original research project. If a certain hierarchical structure is established in the file system and
standardized file names are used, the File Loading Client is able to load and link these files. No additional
overhead is necessary. Users can continue to use their tools as they are used to.
Like the Excel Client, the File Loading Client can be run in batch mode. A graphical user interface is
used to define the loading jobs, and updates can then be run regularly using a scheduler. Any changes in
files or in the hierarchy are detected, and with the versioning system even multiple versions of an
artifact (e.g., a project document) can be managed automatically.
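The versioning mechanism itself is not detailed in this whitepaper. One common way to decide whether a
file constitutes a new version is to compare content hashes, as in the following minimal in-memory
sketch. The class name, its methods and the sample file name are assumptions for illustration.

```python
import hashlib


class VersionStore:
    """Keeps an ordered list of content hashes per artifact name (in memory only)."""

    def __init__(self):
        self._versions = {}

    def add(self, name, content):
        """Record content as a new version if it differs from the latest recorded one.

        Returns the resulting version count for this artifact (1-based).
        """
        digest = hashlib.sha256(content).hexdigest()
        history = self._versions.setdefault(name, [])
        if not history or history[-1] != digest:
            history.append(digest)
        return len(history)


store = VersionStore()
store.add("project_plan.docx", b"draft")   # first version
store.add("project_plan.docx", b"draft")   # unchanged content, still one version
store.add("project_plan.docx", b"final")   # changed content, second version
```

Re-running a loading job on unchanged files then leaves the version history untouched, while an edited
file adds exactly one new version.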

Figure 6: Extraction of data from filepath using pattern matching
Again, the advantage is that raw data becomes information. Result files are now available in their context.
Searches become much easier, and critical results can be found much faster. Engineers can now spend their
time analyzing the results rather than searching for them.

Summary
The Excel Client and the File Loading Client offer ways to get rid of “the heap” of raw data. By analyzing
the raw data, both tools can generate entries in the XperiDesk database. Additionally, they can create links
between these entries, enabling the user to navigate to the needed data much faster, e.g. by visualizing
the graph structure. New relationships can be found, helping the engineer to better understand the ongoing
work. The links or relations kept in XperiDesk enable complex search queries that deliver only the results
searched for. Time spent on organizing raw data is reduced significantly. In summary, these tools enable
organizations to convert the heap of raw data into information usable for current and future research
projects.
