19. Search returns results based on the concept even if the exact terms do not appear in the document (e.g. ‘coronary artery surgery’, ‘heart surgery’)
123. Enables import of FAST Entities into the conceptClassifier taxonomy manager to fine-tune them with metadata generated from your own content and nomenclature
124. Runs natively as a FAST Pipeline Stage eliminating integration and customization issues
160. Cons: high customization costs, increase in end-user labor costs, less end-user productivity, non-standardized application of metadata across enterprise
169. Uses content types derived from metadata to drive individual and group access to data assets using inherent SharePoint Security;
Editor's Notes
Traditional search assumes the end user knows what they are looking for, or requires them to enter the ‘right’ combination of words to get the ‘right’ result. Knowledge workers need to identify content in the context of what they are seeking. The fundamental problem with search solutions is that they are based on an index of single words. Yet most queries are expressed as short patterns of words, not as single words in isolation, which are highly ambiguous. In the example above, a search engine would return every document that contains the words ‘triple’, ‘heart’, and ‘bypass’ rather than the documents that contain the concept of ‘triple heart bypass’. Once the concept has been identified, documents with related concepts are also found even if they do not contain that exact phrase.
Metadata generation is a growing concern in enterprises, not only for search but also for records management, compliance, and enterprise content management. A comprehensive approach requires more than syntactic metadata, and requiring end users to add rich metadata is haphazard and subjective at best. Since conceptClassifier for SharePoint is not restricted to keyword identification, compound term metadata can be generated automatically when content is created or ingested.
Concept-based metadata generation extracts the compound terms and keywords from a document, or a corpus of documents, that are highly correlated to a particular concept. By identifying the most significant patterns in any text, these compound terms can then be used to generate non-subjective metadata based on an understanding of conceptual meaning. Compound term processing can address many challenges facing large enterprises and provide many benefits.
Identifying concepts within a large corpus of information removes the ambiguity in search and eliminates inconsistent meta-tagging, while automatic classification and taxonomy management based on concept identification simplify development and ongoing maintenance.
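The difference between a single-word index and compound-term matching described in this note can be illustrated with a small sketch. This is not conceptClassifier's statistical engine; it is a minimal, hypothetical Python example (the function names and sample documents are assumptions made for illustration) that only contrasts "all words present somewhere" with "words present together as one term". It does not attempt the related-concept matching the note also describes.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def matches_all_words(query, document):
    """Single-word index behaviour: every query word occurs somewhere."""
    words = set(tokenize(document))
    return all(word in words for word in tokenize(query))


def matches_compound_term(query, document):
    """Compound-term behaviour: the query words occur adjacent and in order."""
    q = tokenize(query)
    d = tokenize(document)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


# Hypothetical sample documents, invented for this illustration.
docs = {
    "a": "The patient recovered well after a triple heart bypass.",
    "b": "A triple espresso can raise the heart rate; take the bypass road home.",
}

for name, text in docs.items():
    print(name,
          matches_all_words("triple heart bypass", text),
          matches_compound_term("triple heart bypass", text))
```

Both documents satisfy the single-word test, but only document "a" contains the compound term, which is the ambiguity the note refers to.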
It is important to note that metadata, auto-classification, and taxonomies are not applications. The business value of these tools is often realized when they are integrated with other solutions, such as the offerings of the other participants in this panel. Let’s look at where these tools can complement other solutions and improve business processes.
CLICK: Migration
With vast amounts of content, moving all of it doesn’t make sense, and using valuable resources to identify what should or should not be migrated isn’t a good use of time or money. Before the migration you can use these technologies to:
• Eliminate duplicate documents
• Identify documents that contain confidential or privacy data
• Identify and declare records
• Identify high value content
Savings: We had one client who needed to manually tag 45K marketing documents and estimated that it would take 6 months with 2 full-time people. With our tools it took 2 weeks.
CLICK: Search
The age-old problem is how to get end users to tag content. It’s estimated that less than 50% of content is correctly indexed, meta tagged, or efficiently searchable, and it isn’t about what search engine you use. Statistics still claim that end users spend 15% of their time duplicating information and 25% searching, and that 40% can’t find what they need to do their jobs. Automatic generation of conceptual metadata removes the end user from the tagging process. HUMANS WON’T TAG CONTENT THROUGH FORMS, PICKLISTS, OR DROP-DOWNS; WE WILL ALWAYS FIND WAYS TO AVOID TAGGING. Content, once tagged, can be provided to any search engine index to deliver more accurate search results. Using the taxonomy, users can more efficiently find relevant information via the hierarchical structure.
Savings: 2.5 hours per day per user
CLICK: Records Management
The problem cited most frequently is inconsistent end user tagging in the declaration of records. With metadata generation and a taxonomy that mirrors the file plan, documents can be automatically declared records based on the concepts and descriptors within the document. Based on custom Content Types in SharePoint, the document can be declared a record and routed to the RM repository.
Savings: $4 - $7.04 per document record
CLICK: Data Privacy Protection
One or more taxonomies can be created to identify any organizationally defined confidential information. When content is created or ingested, the document can be identified as containing confidential information and, by updating its Content Type, routed to a secure location and locked down using Windows Rights Management.
Cost Avoidance: Average cost of a data exposure is $225K - $35 million
Can have multiple instances of Managed Metadata Services. The ideal approach is to have an ‘Enterprise’ taxonomy, and then there could be multiple ‘local’ or ‘regional’ taxonomies. Groups are the security boundary and make it possible to assign the users who will manage them.
SharePoint 2010 Element | Comments
Site Collection/Site Structure | Can be organized by a hierarchical taxonomy structure
Document Library Structure | Can be organized by a hierarchical taxonomy structure
Columns | Where terms are applied to content in Document Libraries and Lists
Term | A metadata value
Term Set | Hierarchical metadata with values
Managed Metadata | SP 2010’s ability to manage terms and term sets outside of columns
Keywords | Allows adding metadata from Term Sets or creating a new keyword
Content Types | Ability to manage metadata associated with particular types of content
• Always use a core Managed Metadata Service term store for the enterprise taxonomy
• Allow local Managed Metadata Services for isolated, locally managed term stores
• Always use synonyms when defining terms; consistent content tagging is essential for content management and for driving findability
• Use term translation to support other languages for the term
• Avoid random or haphazard tagging due to unintelligible terms
• Enable managed keywords for user-driven freeform tagging of content
• Ensure that term sets are evolved according to best practices
• Define and enforce a policy for reviewing open term sets for improper usage
Note that search does not include term synonyms or translations when searching; it only finds the stored key term. The same applies to faceted search, or 'refinement panels' as they are called.
You can have multiple Term Set stores and Content Type Hub inventories in SharePoint 2010. This allows for combining both enterprise definitions and local definitions to support both shared and isolated taxonomy configurations. See "Plan to share terminology and content types" on TechNet.
The Only Microsoft Solution that Runs Natively in ... FAST Search, SharePoint 2007, 2010, Windows Server R2 FCI, and Microsoft Office
conceptClassifier provides the tools to rapidly build and easily manage unstructured content. By providing automatic conceptual metadata generation, automated classification, and taxonomy management, it lets organizations harness the power of content not only to improve findability within the FAST Search product suite, but also to drive additional business processes such as records management and compliance, and to enforce governance.
The Only FAST Search Solution that ... Automatically Generates Conceptual Metadata
Utilizing our unique concept identification and extraction capabilities, conceptClassifier’s statistical engine can identify, out of the box, all the meaningful concepts resident within an organization’s own information repositories and automatically generate semantic metadata that is unique to the organization and its nomenclature. Because conceptual multi-word term metadata is generated automatically and placed in the FAST Search index, search can be performed with a higher degree of accuracy: the ambiguity inherent in single words is no longer a problem. Utilizing the Concept Searching technology framework, end users can now search on concepts, which delivers a multi-dimensional view of relevant information and makes it easy to identify relationships between content assets that otherwise may not have been found.
The Only FAST Search Solution that ... Eliminates Manual Metadata Tagging
The Only FAST Search Solution that ... Delivers Innovative, Intuitive, & Rapidly Deployed Taxonomy Management Managed by Business Users
BY ADDRESSING THE TECHNOLOGY AND PROCESS INSTEAD OF THE HUMAN BEHAVIOR, ORGANIZATIONS CAN IMPROVE SEARCH OUTCOMES, BRING ABOUT COMPLIANCE WITH INFORMATION AND RECORDS MANAGEMENT POLICIES, AND DECREASE POTENTIAL DATA EXPOSURE EVENTS.
IN THIS SCENARIO THE CLIENT IS USING SHAREPOINT (BUT IT CAN BE ANY REPOSITORY). THE END USER SIMPLY LOADS A DOCUMENT OR SET OF DOCUMENTS INTO SHAREPOINT. CONCEPTCLASSIFIER AUTOMATICALLY APPLIES CONCEPTUAL METADATA FOUND WITHIN THE DOCUMENT SO IT CAN BE USED TO IMPROVE SEARCH. WHERE APPROPRIATE, THE CORRECT CONTENT TYPE IS APPLIED TO ENABLE WINDOWS RIGHTS MANAGEMENT, KICK OFF WORKFLOWS, AND APPLY RECORDS RETENTION CODES FOR STORAGE AND PRESERVATION.
FOR CONTENT RESIDING IN ARCHIVE AND BACKUP SYSTEMS, CONCEPTCLASSIFIER APPLIES THE SAME CATEGORIES OF METADATA WITHOUT ANY END-USER INTERVENTION.
On this slide, content enters the MOSS environment from multiple sources. Once in that environment, an event handler triggers conceptClassifier for SharePoint to apply metadata to each data asset based upon the organizational metadata environment maintained in Taxonomy Manager. That metadata environment can include both organizationally created metadata and third-party metadata that has been aligned to organizational functions, data privacy and security guidelines, and records retention codes. Once conceptClassifier for SharePoint has applied the metadata, Custom Content Types that have been aligned to specific metadata tags are automatically applied. These two critical steps (automatic application of both metadata and Custom Content Types) provide the following value to organizations:
• Organizations can discern in real time “what is a document” and “what is a record” and can immediately take action to ensure that declared records are stored in the right location and preserved for the correct period of time;
• Individual and group access permissions can be automatically applied to data assets/documents based on Custom Content Type; and,
• Data assets/documents can be automatically migrated to the appropriate document library for the automatic application of Windows Rights Management services to control data usage (for example, a document can be viewed but not downloaded, e-mailed, or printed).
By automating the metadata and content type application processes, organizations not only provide transparency and findability for their end users, they can also ensure that appropriate document access permissions are applied across the enterprise while controlling how documents are used by end users who have been granted access.
In summary – Concept Searching’s enabling technology improves findability and reduces costs by enabling automated compliance with organizational e-Discovery, Records Management, and Information Management (data privacy and security) guidelines.
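The flow described in this note (metadata tags applied, a Custom Content Type selected from those tags, then routing to a document library with or without rights management) can be sketched as a simple rule lookup. This is a hypothetical Python illustration only: the rule table, tag names, library names, and the `route` function are assumptions invented for the example and do not represent the SharePoint or conceptClassifier APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    tag: str               # metadata tag that triggers the rule (hypothetical)
    content_type: str      # Custom Content Type to apply (hypothetical)
    library: str           # destination document library (hypothetical)
    rights_managed: bool   # whether rights management applies in that library


# Hypothetical alignment of metadata tags to content types; first match wins.
RULES = [
    Rule("patient record", "Confidential", "Secure Library", True),
    Rule("retention schedule", "Record", "Records Center", False),
]

# Fallback for documents whose tags match no rule.
DEFAULT = Rule("", "Document", "General Library", False)


def route(tags):
    """Return the first rule whose tag appears in the document's metadata."""
    tagset = {tag.lower() for tag in tags}
    for rule in RULES:
        if rule.tag in tagset:
            return rule
    return DEFAULT


decision = route(["Patient Record", "cardiology"])
print(decision.content_type, decision.library, decision.rights_managed)
```

In this sketch the order of the rule list decides which content type wins when a document carries several matching tags; a real deployment would need an explicit policy for that case.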
On this slide, data assets and documents that have been automatically tagged with metadata and custom content types have been automatically migrated to document libraries based on their custom content type. Based on organizational data privacy and security guidelines, document libraries containing sensitive information are now only accessible to certain individuals and groups, and each library contains content where Windows Rights Management services have been applied to control how the documents in that library are used. When these documents are “checked out”, a key is issued that allows the end user to access and use the data asset/document in a manner that has been pre-approved by the organization. This of course occurs after the Active Directory Rights Management Services (AD-RMS) database and server have already communicated with the MOSS farm and both publishing and user license credentials have been established and provisioned.
For individuals who may attempt to access the SharePoint Content Database directly and bypass AD-RMS protection, the linking of Custom Content Types to SharePoint Security Services prevents unauthorized users from gaining “back door” access to organizational content.