1. conceptClassifier for SharePoint: Unlocking Enterprise Content To Drive Business Agility
Paul Billingham, Sales Director, Concept Searching. +44 7866476691 [email_address]
Carla Mulley, VP Marketing, Concept Searching. +1 (412) 567-4948 [email_address]
13. Concept Searching’s Approach
Begins with highly accurate automatic semantic metadata capture to enable content to become a business driver to improve organizational performance, compliance, and data security
Concept Searching • Martin Garland • (703) 531-8567 • marting@conceptsearching.com
15. Semantic Metadata Generation & Content Tagging to Deliver Transparency & Improve ECM, Records Management, Compliance, Search, & Data Privacy in a SharePoint Environment
Source: Mission Critical Symposium 2009 – AFMS Presentation
[Diagram: ECM phases and activities]
- Pre-Capture: Defining Business Rules; Identifying Types of Information for Capture; Taxonomy Development; Creating a Metadata Environment (MDE) Based upon Org. Mission
- Options: Use Existing Guidelines (File Plans, Records Retention Schedules, etc.) and Automatic Metadata Generation (Use Enterprise Content to Create MDE)
- Manual (Subjective, Inaccurate, Time Consuming, Expensive) versus Automatic (Objective, Precise, Rapid, Cost Effective)
- Capture: Generating, Capturing, Preparing & Processing Information; Metadata Tagging & Content Type Definition; Metadata Drives Update of Content Types Using MOSS Feature
- Manage: Management, Processing & Use of Information (Document Mgmt, Collaboration, Web Content Mgmt, Records Mgmt, Workflow/BP Mgmt); Admin/Retrieval Databases & Access Authorization System
- Store (temporary): Repositories (File Systems, CMS, Databases, Data Warehouses); Library Services; Storage Technologies (Online, Nearline, & Offline Storage; RAID, SAN, NAS; Magnetic Tape; CD/DVD/MO)
- Preserve: Long Term Storage Media (WORM, Optical Disk, Tape, Hard Disk, Storage Networks, Microfilm, Paper); Long Term Preservation (Migration, Emulation); Location, Administration & Media Selection
- Deliver: Output Management; Transformation (XML, PDFs); Security (PKI, Digital Rights Management); Distribution (Internet, Extranet, Intranet, Portals, RSS Feeds)
24. Using Microsoft EA & Concept Searching to Address Enterprise Capability Gaps
- Increasing Data Exposure Events
- Poor Search Result Precision
- Inappropriate Data Storage & Preservation
- Lack of Detection using Data Analytics
Source: Air Force Medical Service InterSymp 2010 Presentation
Editor's Notes
The key points on this slide are:
- In business since 2002, with first customers in 2003.
- Major enterprises with up to 66,000 users have deployed successfully to manage unstructured data.
- Owned by the founders, with no external investment.
- Profitable, with 35% growth in 2008 and already trading for similar growth in 2009.
- An increasing number of specialized partners in this space are buying into our value proposition.
Concept Searching was founded in 2002 with the goal of developing statistical search and classification products that delivered critical functionality unavailable in the marketplace at the time. The products were launched in 2003, and Concept Searching has experienced growth and profitability every year since. Concept Searching is the only statistical classification software company in the world that uses concept extraction and compound term processing to achieve the highest precision without loss of recall. Our products are the only solutions that are fully integrated with MOSS and Microsoft Search. In side-by-side comparisons against industry leaders, Concept Searching has been able to dramatically illustrate the strength of the technology. Concept Searching counts an ever-growing number of global, Fortune 500, and Fortune 1000 clients. We have built a strong partnership channel with Microsoft Partners. Continuing to invest in product development, Concept Searching is defining new standards for the search and classification industry and is committed to delivering quantifiable business benefits to organizations around the world.
Traditional search assumes the end user knows what they are looking for, or must enter the ‘right’ combination of words to get the ‘right’ result
- FINANCIAL RISK – MAJOR LEGAL EXPOSURE – LOST PRODUCTIVITY
Implementing ECM solutions where content is inappropriately or insufficiently tagged with metadata, or where inappropriate content types are deployed, gives rise to ineffective capture within the five-phase ECM process. Put simply, ineffective capture means a corporation is unable to manage, store, preserve, and deliver content effectively enough to meet the goals of the organization. Furthermore, inconsistent tagging causes content to be mismanaged, which invariably leads to data and security breaches and, in turn, to non-compliance, increased risk, litigation, and fines. The traditional answer is to implement applications (point solutions, if you will) that address individual elements of the problem. The true answer is to enable content within the enterprise with appropriate metadata and tagging in order to drive process, increase business productivity, and reduce risk.
Concept Searching’s conceptClassifier for SharePoint is the enabler of enterprise content. Through the automatic tagging of content with semantic metadata, content is tagged in a consistent manner. The metadata can then be used by any application or process that utilizes metadata.
The Issues:
- Unstructured content is doubling every 3 months, and yet 80% of business decisions are made using unstructured data
- The failure rate of Enterprise Content Management (ECM) initiatives is 50% in large organizations
- Keyword search captures only 33% of relevant information
- Inability to find information across disparate content stores
- Only 50% of content is correctly indexed, meta-tagged, or efficiently searchable
The Technology:
- Highly scalable: thousands of users, millions of documents
- Taxonomy development savings of 3-6 months and $150K-$300K
- Able to classify terabytes of documents
- Unique technology: compound term processing, semantic metadata generation
Benefits:
- Identifies new relationships within content and uncovers new insights
- Intuitive, requires no training
- Enables the retrieval of relevant information and identification of highly correlated content that normally would not be found
- Reduces time and cost associated with managing content
- Reduces time spent finding information
- Enables re-use and repurposing of existing content
- Expedites access to real-time information
- Optimizes existing investments in technology
- Rapidly installed and implemented
- Delivers the ability to make better informed decisions
The Issues:
The public sector, hospitals, financial and educational institutions, and private businesses face continuing pressure and government regulation to protect information from unauthorized access, use, and disclosure. Both the public and private sectors routinely collect confidential information regarding their employees, customers, products, research, and financial status. The inability to protect confidential information can cause irreparable harm to individuals as well as to the organization, and the consequences can include loss of business, litigation, and both criminal and civil penalties.
- A Seattle-based healthcare company paid $100,000 for HIPAA violations in addition to the $7-$9 million spent on the breach itself
- TJX compromised 94 million accounts at a cost of $256 million
- ValueClick paid the U.S. Federal Trade Commission to settle a charge that consumers’ data was not secured
- The average cost of a data breach is $6.3 million, with a range from $225K to $35 million
- 70% of all breaches are due to mistakes or malicious intent by the organization’s own staff
PIIdiscovery enables organizations to define unknown Personally Identifiable Information (PII) according to their specific requirements and needs. Types of PII can include social security numbers, credit card numbers, dates of birth, bank account numbers, passports, driver’s licenses, or any unique organizational or mandatory descriptors (for example, HIPAA). PII can be identified in diverse repositories, including email servers, fax servers, forms and scanned documents, Microsoft Office applications, websites, servers, and PCs. Once identified, it can be automatically aggregated into a central location for review and disposition.
Benefits:
- Protects the organization from costs associated with a data breach, civil and criminal penalties, sanctions, and loss of business and reputation
- Automatic identification of unknown PII mitigates risks associated with PII exposure
- Standardizes and improves organizational processes associated with the identification and segregation of PII
- Reduces organizational costs and effort in protecting and identifying PII
- Reduces costs and risk exposure through automatic identification of PII from disparate content repositories
- Eliminates risk associated with end user non-compliance issues
- Reduces the portability and transmissibility of protected data assets
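To make the idea of automatically identifying PII types such as social security numbers and credit card numbers concrete, here is a minimal sketch. It is not how PIIdiscovery works internally (the notes do not describe that); the regular expressions, the `find_pii` function, and the choice of a Luhn checksum to filter card-like numbers are all assumptions made for illustration.

```python
import re

# Hypothetical patterns for illustration; real PII rules are organization-specific.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in the string pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return len(digits) >= 13 and total % 10 == 0


def find_pii(text: str) -> dict[str, list[str]]:
    """Return candidate PII strings found in the text, grouped by type."""
    ssns = SSN_RE.findall(text)
    cards = [m.group() for m in CARD_RE.finditer(text) if luhn_valid(m.group())]
    return {"ssn": ssns, "credit_card": cards}
```

Calling `find_pii` on a document's extracted text returns the matches per type, which a surrounding process could then use to flag the document for review. Pattern matching of this kind only catches well-formed identifiers; it says nothing about PII that appears in free text or handwriting.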
The Issues:
- End user adoption is cited as the single most critical barrier to success in Records Management
- Enforcing governance at the end user level is rarely successful and requires management and time to enforce policies
- Non-compliance results when documents are never subjected to enterprise policies
- Metadata is often non-descriptive because it does not capture the essence of the record, making it less useful to the end user and the organization
- Lack of automated tools that can categorize content without user intervention so that retention policies can be assigned
- Inability to ensure that all content is identified and correctly processed within the organization
Benefits:
- Automated classification and integration with Microsoft Office and Exchange eliminate end user adoption issues
- Automated records collection, classification, and organization reduce the costs of implementation and ongoing management
- Protects the record’s integrity and the native security model
- Fully integrated with SharePoint
- A custom router or workflow can be configured to automatically send uploaded documents to the Records Center
- As documents are uploaded to the Libraries, they can automatically be declared records
The only statistical metadata, classification, and taxonomy software that uses concept extraction through our compound term processing technology.
Concepts in Context: Compound Term Processing
- Triple Heart Bypass (Baseball or three? Organ or center? Road or avoid?)
- Life Sciences vs. Life or Sciences
- Michigan State University vs. Michigan or State or University
- Respiratory & Inflammation vs. Respiratory or/& Inflammation
“At last a tool set that enables enterprise content to be the driver for business productivity”
Concept Searching provides a comprehensive suite of tools for automatic semantic metadata generation, automated classification, and taxonomy management of enterprise content. Metadata generation is a growing concern in large enterprises. A comprehensive approach requires more than syntactic metadata (e.g. date, author, title), and requiring end users to add rich metadata is haphazard and subjective at best. Because Concept Searching’s technology is not restricted to keyword identification, compound term metadata can be generated automatically when the content is created or ingested. Concept-based metadata generation extracts compound terms and keywords that are highly correlated to a particular concept from a document or corpus of documents. By identifying the most significant patterns in any text, these compound terms can then be used to generate non-subjective metadata based on an understanding of conceptual meaning. The ability to identify ‘concepts in context’ generates far richer metadata, improving precision and relevancy in the information retrieval process.
Meta-tags are automatically added to the properties field of each document, making the document more valuable to the organization by making it easier to retrieve with Microsoft Search products that use keywords and metadata to retrieve information. conceptClassifier for SharePoint is fully integrated with SharePoint, Microsoft Office, Exchange, FAST, and Microsoft Enterprise Search. The automatic extraction of compound terms enables the Subject Matter Expert (SME) to use the terms within the taxonomy generation process, reducing the time to build out and maintain taxonomies by 80%.
Compound Term Processing performs matching on the basis of compound terms as opposed to keywords. Compound terms are built by combining two (or more) simple terms; for example, ‘triple’ is a single-word term but ‘triple heart bypass’ is a compound term. By identifying and forming compound (multi-word) terms and placing these in the search engine’s index, the search can be performed with a greater degree of accuracy because the ambiguity inherent in single words is no longer a problem. A search for ‘survival rate after triple bypass surgery’ will locate documents about this topic even if the precise phrase is not contained in any of the documents. A traditional keyword search would return all documents that contain the word ‘triple’, all documents that contain ‘heart’, and all documents that contain ‘bypass’.
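The difference between keyword matching and compound term matching can be illustrated with a toy sketch. This is not Concept Searching's algorithm, which is statistical and not described in these notes; the stopword list, the `compound_terms` and `score` functions, and the use of plain word n-grams as "compound terms" are all simplifying assumptions.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; a real system would use something richer.
STOPWORDS = {"a", "after", "an", "and", "at", "in", "of", "on", "the", "to"}


def compound_terms(text: str, max_len: int = 3) -> Counter:
    """Count multi-word terms (2..max_len words) after dropping stopwords."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    terms = Counter()
    for n in range(2, max_len + 1):
        for i in range(len(words) - n + 1):
            terms[" ".join(words[i:i + n])] += 1
    return terms


def score(query: str, document: str) -> int:
    """Number of distinct compound terms shared by the query and the document."""
    return len(compound_terms(query).keys() & compound_terms(document).keys())
```

With the query "survival rate after triple heart bypass", a document about bypass surgery shares terms such as "triple heart bypass" and scores above zero, while a baseball report that happens to contain the words "triple", "heart", and "bypass" separately shares no compound term and scores zero, even though a single-word index would match it on all three words.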
Features:
- Downloadable in 30 minutes, no programming required
- Automatic classification and compound term metadata extraction
- Classification technology uses concept extraction and compound term processing
- Taxonomy-based and faceted navigation
- Robust suite of tools to build and maintain taxonomies
- Fully integrated with Content Types
- Automatic classification from MS Office and Outlook
- Taxonomy browse, faceted navigation, and preview functionality from the search interface
- Can automatically classify from SharePoint, folders, and web sites, providing a single interface to all permissible content
- Simple, intuitive interface designed for the SME
- Fully SOA compliant, delivered as Web Parts, based on open standards
- Integrates with Microsoft Office, Microsoft Records Center, and the Microsoft Business Data Catalog
Concept Searching’s conceptClassifier for SharePoint enables enterprise content to drive business productivity. Fully integrated with SharePoint, conceptClassifier for SharePoint delivers robust taxonomy management, semantic metadata tagging, and auto-classification. Based upon the content classified and the tags it carries, it can automatically update the document content types that drive process, compliance management, storage, and preservation. It should be noted that, to round out the full ECM solution, other third-party Microsoft and Concept Searching partners may be required for such things as scanning and paper capture, physical records management, business process workflow, etc.
A taxonomy is a classification structure represented by a hierarchical view of topics that have been grouped together because they share the same quality or characteristic. A taxonomy provides a unified view of, and access to, relevant information across often dispersed silos of information. Concept Searching supports multiple taxonomies within an organization. Taxonomy development is traditionally a very time-consuming and costly activity. Our Taxonomy Manager has been proven to reduce taxonomy development time by 80%, generating a time savings of 6-12 months and a cost savings of $150K - $300K. Concept Searching also has a robust and frequently expanding library of off-the-shelf taxonomies covering a wide variety of domains, which can jumpstart a classification project in nearly any industry. The taxonomy (or multiple taxonomies) can be used by Subject Matter Experts (SMEs) to easily build out taxonomies and classify documents into predefined categories based on a small number of descriptors or clues. Once classified, the documents can then be applied to a corporate taxonomy and made available to the organization. The taxonomy management features include:
- Ability to change the node weighting (score)
- Auto clue suggestion: automatic generation of node clues from compound terms found in the document corpus, eliminating training sets and complex Boolean rules
- Dynamic screen updating: the user interface is fully AJAX enabled, so changes to the taxonomy are immediately available for further refinement
- Document movement feedback: this feature enables the SME to see the cause and effect on the taxonomy without re-indexing
Compound term processing is a new approach to an old problem. Instead of identifying single keywords, compound term processing identifies multi-word terms that form a complex entity and treats them as a concept. By deriving these compound terms from the client’s own document corpus, we can tag content with meaningful semantic metadata and enable Microsoft’s Enterprise Search to filter across that metadata at retrieval, delivering a higher degree of accuracy because the ambiguity inherent in searching against single words in isolation is no longer a problem. As a result, a search for “survival rates following a triple heart bypass” will locate documents about this topic even if this precise phrase is not contained in any document. Compound term processing can address many challenges facing large enterprises and provide many benefits. Identifying concepts within a large corpus of information removes ambiguity in search and eliminates inconsistent meta-tagging, while automatic classification and taxonomy management based on concept identification simplify development and ongoing maintenance. The unique compound term processing enables the identification of compound terms (not keywords) from highly relevant content that can be used to trigger the automatic meta-tagging and auto-classification processes. This conceptual metadata is added to the original metadata for the category/folder.
The more semantic metadata that can be linked to a document or record, the more useful the information becomes to the organization. Meta-tags are automatically added to the properties field of each document, making the document more valuable to the organization by making it easier to retrieve with Microsoft Search products that use keywords and metadata to retrieve information.
Following the automatic generation (tagging) of compound terms and semantic metadata, the documents in the document libraries are automatically classified to multiple categories within the taxonomy. The terms generated can be edited from within SharePoint or from within the Taxonomy Manager tool. The content remains in, and can be accessed from, its original location but can be linked to multiple categories/nodes.
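Classification of one document into several taxonomy nodes, driven by weighted clues, can be sketched as follows. The taxonomy contents, the clue weights, the threshold, and the `classify` function are hypothetical, and simple substring matching stands in for whatever scoring the product actually uses.

```python
def classify(text: str, taxonomy: dict[str, dict[str, float]],
             threshold: float = 1.0) -> dict[str, float]:
    """Score the text against each node; keep nodes whose matched clue weights reach the threshold."""
    lowered = text.lower()
    scores = {}
    for node, clues in taxonomy.items():
        # Sum the weights of every clue (compound term) that occurs in the text.
        total = sum(weight for clue, weight in clues.items() if clue in lowered)
        if total >= threshold:
            scores[node] = total
    return scores


# Hypothetical taxonomy: node name -> {clue: weight}.
TAXONOMY = {
    "Cardiology": {"heart bypass": 1.0, "cardiac surgery": 1.0},
    "HR/Benefits": {"medical leave": 1.0, "health insurance": 0.5},
    "Finance": {"purchase order": 1.0},
}
```

A sentence such as "Employees recovering from heart bypass surgery may request medical leave." lands in both "Cardiology" and "HR/Benefits" while staying in its original location, which mirrors the linking to multiple categories described above. Changing a clue weight changes which nodes clear the threshold, which is the kind of effect the node weighting feature is meant to expose.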
Enterprises increasingly understand the value of, and critical need for, Content Types to structure their content and identify the type of a document regardless of its physical site or library storage location. Content Types can be used to enforce metadata governance, adhere to policies, and drive workflows in line with business processes. Included in the new release is the ability to assign taxonomies to specific Content Types. Documents that correspond to the selected Content Types will be classified; documents that do not correspond to a Content Type, or that lack metadata elements a specific Content Type requires, will not be classified. This essential functionality allows different taxonomies to be assigned to different Content Types. For example, assign the HR taxonomy to all Content Types of type “HR”, including any Content Types derived from “HR”, and assign the Finance taxonomy to all Content Types of type “Finance”, including any Content Types derived from “Finance”. The configuration can be performed using a wizard that runs inside SharePoint. The taxonomies will be available for these documents regardless of their location. conceptClassifier’s site columns and Event Handlers are associated with the Content Types. This delivers the ability to automatically add classification functionality to new sites when they are created.
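The HR/Finance example, where a taxonomy assigned to a Content Type also applies to Content Types derived from it, amounts to a lookup that walks up the inheritance chain. The sketch below assumes a hypothetical hierarchy (`PARENTS`), hypothetical assignments (`ASSIGNED`), and a hypothetical `taxonomy_for` function; it does not use the SharePoint object model.

```python
from typing import Optional

# Hypothetical content type hierarchy: child -> parent (None for a root type).
PARENTS = {
    "HR": None,
    "HR Policy": "HR",
    "Finance": None,
    "Invoice": "Finance",
    "Memo": None,
}

# Taxonomies assigned to base content types, as in the HR/Finance example.
ASSIGNED = {"HR": "HR taxonomy", "Finance": "Finance taxonomy"}


def taxonomy_for(content_type: str) -> Optional[str]:
    """Walk up the hierarchy until a content type with an assigned taxonomy is found."""
    current: Optional[str] = content_type
    while current is not None:
        if current in ASSIGNED:
            return ASSIGNED[current]
        current = PARENTS.get(current)
    return None  # no taxonomy applies, so the document is not classified
```

Here "Invoice" resolves to the Finance taxonomy through its parent, while "Memo" resolves to nothing and would be left unclassified, matching the behavior described for documents that do not correspond to a selected Content Type.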
conceptClassifier for SharePoint fully supports Content Types. An add-on feature provides the ability to update Content Types based on the identification of content during the classification process. This is particularly useful in records management and in data privacy and security, because it allows the organization to define a series of actions that occur when content contains specific metadata.
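One way to picture "a series of actions that occur when content contains specific metadata" is a small rule table evaluated after tagging. The `Document` class, the tag names, the content type names, and the actions below are all hypothetical; in SharePoint this logic would live in an event handler or workflow rather than in standalone code.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    name: str
    tags: set[str]
    content_type: str = "Document"
    actions: list[str] = field(default_factory=list)


# Hypothetical rules: (required tag, new content type, follow-up action).
RULES = [
    ("pii", "Sensitive Document", "apply rights management template"),
    ("retention:7y", "Record", "route to Records Center"),
]


def apply_rules(doc: Document) -> Document:
    """The first matching rule sets the content type; every matching rule adds its action."""
    for tag, new_type, action in RULES:
        if tag in doc.tags:
            if doc.content_type == "Document":
                doc.content_type = new_type
            doc.actions.append(action)
    return doc
```

Ordering the rules so that the most restrictive content type comes first is a design choice of this sketch: a document tagged with both "pii" and "retention:7y" becomes a "Sensitive Document" but still collects both follow-up actions.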
Knowledge workers need to identify content in the context of what they are seeking. The fundamental problem with most enterprise search solutions, and all statistical search solutions, is that they are based on an index of single words. Yet most queries are expressed as short patterns of words, not as single words in isolation, which are highly ambiguous. A concept search engine can isolate the key meaning that is normally expressed in proper nouns, noun phrases, and verb phrases. Although linguistic products can do this, their performance is highly variable depending upon the vocabulary and language in use. A statistically based, language-independent concept search can accept queries in natural language, with the user typing words, phrases, or whole sentences. The system then analyzes the natural language query to extract the keywords and phrases, identify the main concepts, and retrieve content that is highly relevant.
Precision and recall are the two key performance measures for information retrieval. Precision is the retrieval of only those items that are relevant to the query. Recall is the retrieval of all items that are relevant to the query. Yet most information retrieval technologies are less than 22% accurate for both precision and recall. The ideal goal is to have them balanced. Compound Term Processing has the ability to increase precision with no loss of recall.
Documents that have been auto-classified are now accessible by searching for all the content within a folder and by using Microsoft Enterprise Search, which can now filter on highly relevant metadata that has been created with Taxonomy Manager. Search results are clustered into categories or facets, enabling an end user to rapidly drill into a result set based on organizational, functional, product line, and geographic metadata that has been generated using Taxonomy Manager and automatically tagged to relevant documents and records within document libraries.
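For readers who want the two measures in computable form: precision is the share of retrieved items that are relevant, and recall is the share of relevant items that were retrieved. The function name and the document IDs below are illustrative only.

```python
def precision_recall(retrieved: set[str], relevant: set[str]) -> tuple[float, float]:
    """Return (precision, recall) for one query's result set."""
    hits = len(retrieved & relevant)  # relevant items that were actually retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Example: 4 documents returned, 3 documents are truly relevant, 2 overlap.
p, r = precision_recall({"d1", "d2", "d3", "d4"}, {"d1", "d2", "d5"})
```

In this example precision is 2/4 and recall is 2/3. Returning fewer, better-targeted results raises precision but risks dropping relevant items, which is why the two measures are usually traded off against each other.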
As the end user refines the search, new facets are generated each time the query changes.
conceptClassifier for SharePoint integration with Microsoft Office and Microsoft Exchange enables automatic metadata generation and classification without end user participation. Alternatively, the Subject Matter Expert (SME) or knowledge worker can be granted the authority to modify the results from within the traditional Microsoft Office interface. The knowledge worker is the most qualified person to anticipate how the asset will be searched for and how to make it easy to find. The automatic classification returns not only single words but also identifies concepts within the document to assist the knowledge worker in the classification process. This guided approach enables the knowledge worker to precisely and accurately classify the document for reuse and retrieval. Placing the ability to classify documents into the hands of knowledge workers results in rich and comprehensive metadata, significantly improving the organization’s ability to leverage its information capital.
· Gives business experts the ability to classify critical business information with highly relevant metadata
· Greatly improves the search and retrieval process by ensuring accurate and complete metadata
· Expedites organizational access to real-time information
· Provides a consistent content management approach
· Delivers metadata-rich information retrieval, thereby maximizing productivity and organizational agility
U.S. Air Force Medical Service
The U.S. Air Force Medical Service rolled out Concept Searching to over 66,000 users. In their analysis of vendors, Concept Searching was selected based on its technologies. In evaluating the Taxonomy Manager against other vendors, they estimated that Concept Searching technologies could reduce taxonomy development time by 80%, saving considerable man hours, resources, and costs. Cost savings were estimated at $150K - $300K. The U.S. Air Force wrote a paper about the solution and was subsequently selected to present the paper and findings at the International Institute for Advanced Studies in Systems Research and Cybernetics in Baden, Germany in the fall of 2008.
U.S. Defense Center for Excellence for Psychological Health and Traumatic Brain Injury
The client initially purchased the solution for their 24/7 Customer Service Center. This was fully deployed within 3 weeks. During the deployment engagement they saw other uses for the technology and immediately upgraded to an Enterprise License to use Concept Searching as their classification standard, as well as to identify ‘personally identifiable’ and potentially unknown data exposures. This was not a MOSS environment and was included in the solution.
Let’s take a look at how the Air Force Medical Service is using their existing Concept Searching and Microsoft Enterprise license agreements to address enterprise-wide capability gaps relating to:
1. Inadequate search precision across every search platform in the federal sector used by AFMS members;
2. Increasing numbers of unauthorized data releases involving PII, PHI, Classified Message Incidents, and Sensitive Information;
3. Non-compliance with data storage and data preservation requirements set forth in federal records management programs; and
4. An inability to use data analytics to make leadership aware of sensitive information data breaches and other events such as upcoming records destruction schedules.
Just about every organization has some type of migration plan, and the USAF is no different, with over 74 organizations faced with having to migrate content to a SharePoint environment. Organizations that are looking to migrate their content to a SharePoint environment only have to copy the content, or use migration scripts, to place it into SharePoint. Documents, messaging/chat logs, e-mail, and other content in SharePoint are then automatically tagged and classified in accordance with the organizational enterprise metadata environment model that is managed and maintained in Concept Searching’s Taxonomy Manager. After the tagging process, an event handler identifies documents whose metadata requires the update of a Content Type. This step is very important, since Content Types drive activities associated with every document. The manual or blanket application of a Content Type is no different from the manual application of metadata: it is subjective, inefficient, and costly to do one record at a time. By automatically updating Content Types in SharePoint to reflect the actual content of a particular data asset, the organization makes its information actionable within SharePoint. What does this mean?
RMS templates can automatically be applied to documents containing sensitive information without anyone having to read each and every document to decide whether it contains sensitive information. Records Retention Codes can be automatically applied as metadata and then updated as their own Content Type to drive the appropriate data storage location and preservation. To dramatically increase search precision, Concept Searching then applies different taxonomies and their associated metadata to records based on their unique Content Type. Organizations using Search Server Express, SharePoint Search, or FAST will all experience increased search precision as a result of conceptClassifier for SharePoint automatically tagging documents and records with highly correlated metadata. Organizations that have deployed PerformancePoint can then use their declared Content Types to report daily on prevented data exposure events, identify which members are consistently putting the organization at risk of fines and litigation, and identify how many and which documents and records are coming due for destruction. The AFMS is using Concept Searching to automatically generate PII metadata from the content sources that are being migrated. This metadata is then placed into Concept Searching’s Taxonomy Manager in order to ascertain the location of sensitive information during the classification process. Since PII, PHI, and other types of sensitive information are also collected on forms that contain handwriting, Taxonomy Manager is also used to create a metadata environment around how the organization collects sensitive information. During the classification process, Concept Searching automatically identifies sensitive information and then migrates that information to a “staging” location on the network where Information Rights Management templates are applied.