The best content strategists are limited by how much content they can analyze. There comes a point where a content set becomes too large to analyze using usual methods. Do you have the skills to scale your strategy? Enter XML and XPath, two languages that provide deep insight into content with superhuman efficiency. This session teaches code-shy content strategists enough of these languages to be effective. You may be used to interviewing your users. Now learn how to query your content itself!
The document discusses XPath, which is a language for finding information in an XML document. It defines XPath syntax using path expressions to select nodes. It describes XPath terminology like nodes, relationships between nodes, and functions. Examples are provided to demonstrate XPath expressions for selecting elements, attributes, and filtering nodes. Predicates are also described for finding specific nodes or values.
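To make the idea of path expressions and predicates concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module, which supports only a limited subset of XPath. The `inventory`, `page`, `title`, and `owner` names are hypothetical, invented for illustration; they do not come from the deck.

```python
import xml.etree.ElementTree as ET

# Hypothetical content inventory; all element and attribute names are
# assumptions made up for this sketch.
SAMPLE = """
<inventory>
  <page id="p1" type="article"><title>Getting started</title><owner>Docs</owner></page>
  <page id="p2" type="faq"><title>Billing questions</title><owner>Support</owner></page>
  <page id="p3" type="article"><title>Release notes</title></page>
</inventory>
""".strip()

root = ET.fromstring(SAMPLE)

# Path expression: every <title> that is a child of a <page>.
titles = [t.text for t in root.findall("./page/title")]

# Attribute predicate: only pages whose type attribute equals "article".
article_ids = [p.get("id") for p in root.findall("./page[@type='article']")]

# Child-element predicate: only pages that contain an <owner> child.
owned_ids = [p.get("id") for p in root.findall("./page[owner]")]

print(titles)
print(article_ids)
print(owned_ids)
```

A full XPath processor also offers functions and more axes; the standard-library subset is enough for simple selection and filtering like the above.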
XML is a markup language used to define custom document formats and data exchange standards. It allows users to define tags and attributes to structure text-based data. XML documents must adhere to rules like having matching start/end tags and a single root element to be considered well-formed. Document Type Definitions (DTDs) can be used to establish a fixed vocabulary and structure for XML documents in an application. XPath and XQuery are query languages that allow retrieving and manipulating parts of XML documents and datasets based on element names, attributes, values and structures.
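The well-formedness rules mentioned above (matching start and end tags, a single root element) can be checked by simply trying to parse a document. The following is a rough sketch using Python's standard library; the helper name `is_well_formed` and the sample strings are assumptions for illustration. It checks well-formedness only, not validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents.
good = "<note><to>Ana</to><body>Hi</body></note>"
mismatched = "<note><to>Ana</body></note>"  # <to> is closed by </body>
two_roots = "<note/><note/>"                # more than one root element

for label, doc in [("good", good), ("mismatched", mismatched), ("two_roots", two_roots)]:
    print(label, is_well_formed(doc))
```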
This document provides an overview of XML including:
- XML stands for Extensible Markup Language and is used to carry data, not display it. Tags are user-defined.
- An XML example shows a simple note with predefined tags.
- XML schemas define valid elements, attributes, structure and data types for XML documents.
- XML documents form a tree structure with elements nested within a root element. Syntax rules ensure documents are well-formed.
- XML parsers like SAX and DOM are used to read and build a model of an XML document programmatically.
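As a rough illustration of the two parser styles named in the last bullet, the sketch below reads the same small document with a DOM parser (which builds the whole tree in memory) and with a SAX parser (which reports events as it reads). It uses Python's standard `xml.dom.minidom` and `xml.sax` modules; the note content and the `TagCollector` class are hypothetical.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical note document, invented for this sketch.
NOTE = b"<note><to>Ana</to><from>Ben</from><body>Hello</body></note>"

# DOM: the whole document is loaded as a tree that can be walked afterwards.
doc = minidom.parseString(NOTE)
dom_names = [
    node.tagName
    for node in doc.documentElement.childNodes
    if node.nodeType == node.ELEMENT_NODE
]


class TagCollector(xml.sax.ContentHandler):
    """SAX handler that records each element name as its start tag is read."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


# SAX: the parser streams through the document and calls the handler.
handler = TagCollector()
xml.sax.parseString(NOTE, handler)

print(dom_names)
print(handler.names)
```

The DOM list holds only the children of the root, while the SAX handler sees every start tag, including the root, in document order.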
This document provides an overview of XML, including:
- XML is not a replacement for HTML, a presentation format, programming language, or network transfer protocol, but can be used with these.
- XML examples demonstrating tags, elements, attributes, and how XML documents form ordered trees.
- Key aspects of XML like namespaces, DTDs, schemas, and how XML documents are linked to external definitions.
Introduction to ArchiveSpark, given at the WebSci' 2016 Hackathon: Exploring the Past of the Web: Alexandria & Archive-It Hackathon http://www.websci16.org/hackathon https://github.com/helgeho/ArchiveSpark
- XML (eXtensible Markup Language) is a markup language that is designed to store and transport data. It was released in the late 1990s and became a W3C recommendation in 1998.
- XML is not meant to display data like HTML, but rather to carry data. It is designed to be self-descriptive, platform independent, and language independent. Tags are defined by the user rather than being predefined.
- A markup language uses tags to highlight or underline parts of a document. Modern markup languages like XML use tags to replace highlighting and underlining.
By now, you have heard how important structured content is. But, maybe you poked around with something like DITA and were baffled by the complexity. Or, maybe you still aren’t sure what XSLT stands for. This workshop will take participants back to the basics, to provide a foundation for higher-level concepts that have taken hold of our industry. Topics will include:
- What XML looks like, what it does, and how to create it.
- How to define a structure model, including whether to use a DTD, Schema, etc.
- What XSLT looks like, what it does, and how to make it work.
- What DITA and DocBook really are and whether one is right for you.
Russell Ward is an experienced technical writer and structured technologies developer. He has spent many years working with structured content to maximize efficiency in the techcomm environment, both as an employee and as an independent consultant. He is also an experienced trainer and speaks periodically at conferences and other peer events.
This document provides an introduction to HTML and CSS. It begins with an overview of HTML, including its history and purpose. It then covers HTML5 updates and differences from previous versions. The document also introduces CSS, explaining concepts like rules, selectors, properties, and values. It describes different methods for adding CSS to HTML, such as internal, external, and imported stylesheets. Finally, the document discusses CSS selectors like type, ID, and class selectors, as well as inheritance in CSS.
This document provides an overview of key concepts and processes in text analysis, including identifying themes, building codebooks, coding data, describing codes, making comparisons, and building and testing models. Some of the main points covered include defining what a theme is, different approaches to identifying themes (inductive vs. deductive), tips for building codebooks, the open and axial coding process, using code descriptions to help coders, comparing findings to prior literature, and testing conceptual models that are developed.
ATLAS.ti Training - Covering the Basics (Mac edition) Arun Verma
Covering the basics for ATLAS.ti users Mac edition (With some Windows version information as well). For information on introductory workshops, support or advice on ATLAS.ti, please get in touch.
(User must get permission from author to re-use any part of this presentation. Any use of presentation must be referenced clearly to the author with relevant hyperlinks and contact information to author)
ATLAS.ti training presentation: Covering the basics Arun Verma
This is a short introduction to using ATLAS.ti on Mac. This presentation provides you with all the basics to get you started with your qualitative data analysis.
Introduction to XML and Structured Authoring • Overview of DITA • Topics: The Basic Information Types • Maps: Assembling Topics into Deliverables • Common elements and attributes • Metadata • Examples and exercises
XPath is a language for finding information in an XML document, using path expressions to navigate elements and attributes. It supports operators, functions and axes to locate nodes and return node sets, booleans, strings, numbers or other values. XSLT uses XPath to select nodes for transformation and XSL-FO uses the document structure defined by XSLT for formatting and layout.
Post-conference workshop at tcworld India 2012. Provides background on structured authoring, XML, planning your topics, writing topics, and writing for re-use.
Decoding and developing the online finding aid kgerber
Workshop for the Library Technology Conference on Encoded Archival Description, and the mark-up languages involved in its use including HTML, XML, and XSLT.
Python was created by Guido van Rossum in the late 1980s and named after Monty Python. It is a general purpose, high-level programming language that supports multiple paradigms like object-oriented, functional, and imperative programming. Django is a Python web framework that grew out of a newspaper project and follows the MVC pattern, separating concerns into models, views, templates. It provides tools for authentication, forms, administration, and more so that developers can focus on their specific applications.
The document discusses the objectives and structure of an HTML5 tutorial, including exploring the history of the web, creating the structure of an HTML document, inserting elements and attributes, and linking to other resources. It covers the basics of HTML5 such as the document type declaration, element tags, attributes, comments, and different types of elements like headings, paragraphs, images, and links.
This document provides an overview of XML programming and XML documents. It discusses the physical and logical views of an XML document, document structure including the root element, and how XML documents are commonly stored as text files. It also summarizes how an XML parser reads and validates an XML document by checking its syntax and structure. The document then covers various XML components in more detail, such as elements, attributes, character encoding, entities, processing instructions, well-formedness, validation via DTDs, and document modeling.
Molly is a knowledge discovery system that aims to infer the desired document from a user's keyword query by building indices from a corpus database and linking related entities. It takes in a configuration file to specify the entities and values to index from tables, transforms rows into labeled entities, and stores the entities and their relationships in documents to allow for recursive searches across linked groups. Future work includes improving the system's ability to learn user intent and developing an interface to clearly present inferred results.
What's With The 1S And 0S? Making Sense Of Binary Data At Scale With Tika And... gagravarr
This document provides an overview of Apache Tika, an open source toolkit for detecting and extracting metadata and structured text from various file formats. It discusses Tika's capabilities for detecting file types using filename extensions, magic bytes, and parsing file containers. It also describes how Tika extracts metadata, plain text, and XHTML from files and supports detecting text encodings and languages. The document outlines different ways to extend and customize Tika, as well as various options for integrating and running Tika programs.
ProjectPro offers Solved End-to-End, Ready to Deploy, Enterprise-Grade Big Data, and Data Science Projects for Reuse and Upskilling. Each project solves a real business problem end-to-end and comes with solution code, explanation videos, cloud lab, and tech support.
ProjectPro offers a hands-on approach to mastering machine learning and data science through 150+ solved end-to-end deployable machine learning and data science projects. They also provide 2000+ FREE data science code examples that can help one master the foundations of basic data science and machine learning concepts.
A presentation from ApacheCon Europe 2015 / Apache Big Data Europe 2015
Apache Tika detects and extracts metadata and text from a huge range of file formats and types. From Search to Big Data, single file to internet scale, if you've got files, Tika can help you get out useful information!
Apache Tika has been around for nearly 10 years now, and in that time, a lot has changed. Not only has the number of formats supported gone up and up, but the ways of using Tika have expanded, and some of the philosophies on the best way to handle things have altered with experience. Tika has gained support for a wide range of programming languages too, and more recently, Big-Data scale support, and ways to automatically compare effects of changes to the library.
Whether you're an old-hand with Tika looking to know what's hot or different, or someone new looking to learn more about the power of Tika, this talk will have something in it for you!
DC SPUG Feb 2015 The Secret Sauce to Information Architecture Jill Hannemann
The Secret Sauce to Information Architecture for SharePoint. This presentation discusses challenges, strategy, and techniques for creating appropriate information architecture for your SharePoint or Office 365 Sites environment.
Using the Archivists' Toolkit: Hands-on practice and related tools Audra Eagle Yun
The document provides an overview of the Archivists' Toolkit (AT), a free and open-source archival management application. It discusses the key functions of AT including recording accessions, describing archival materials and digital objects, managing locations, and exporting records. The document then demonstrates how to get started with AT by setting up a repository record and user accounts. It guides the user through activities like adding locations, creating accession and resource records, and linking them together. Finally, it discusses how to export finding aids from AT in EAD format and submit them to online archives like the Online Archive of California.
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack shyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
Building Production Ready Search Pipelines with Spark and Milvus Zilliz
Spark is a widely used ETL tool for processing, indexing, and ingesting data into the serving stack for search. Milvus is a production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data, extract vector representations, and push the vectors to the Milvus vector database for search serving.
More related content
Similar to Structured Strategy: How to Supercharge Your Content Analysis with XML and XPath
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence IndexBug
Imagine a world where machines not only perform tasks but also learn, adapt, and make decisions. This is the promise of Artificial Intelligence (AI), a technology that's not just enhancing our lives but revolutionizing entire industries.
CAKE: Sharing Slices of Confidential Data on Blockchain Claudio Di Ciccio
Presented at the CAiSE 2024 Forum, Intelligent Information Systems, June 6th, Limassol, Cyprus.
Synopsis: Cooperative information systems typically involve various entities in a collaborative process within a distributed environment. Blockchain technology offers a mechanism for automating such processes, even when only partial trust exists among participants. The data stored on the blockchain is replicated across all nodes in the network, ensuring accessibility to all participants. While this aspect facilitates traceability, integrity, and persistence, it poses challenges for adopting public blockchains in enterprise settings due to confidentiality issues. In this paper, we present a software tool named Control Access via Key Encryption (CAKE), designed to ensure data confidentiality in scenarios involving public blockchains. After outlining its core components and functionalities, we showcase the application of CAKE in the context of a real-world cyber-security project within the logistics domain.
Paper: https://doi.org/10.1007/978-3-031-61000-4_16
Driving Business Innovation: Latest Generative AI Advancements & Success Story Safe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
Fueling AI with Great Data with Airbyte Webinar Zilliz
This talk will focus on how to collect data from a variety of sources, leveraging this data for RAG and other GenAI use cases, and finally charting your course to productionalization.
Essentials of Automations: The Art of Triggers and Actions in FME Safe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Taking AI to the Next Level in Manufacturing.pdf ssuserfac0301
Read Taking AI to the Next Level in Manufacturing to gain insights on AI adoption in the manufacturing industry, such as:
1. How quickly AI is being implemented in manufacturing.
2. Which barriers stand in the way of AI adoption.
3. How data quality and governance form the backbone of AI.
4. Organizational processes and structures that may inhibit effective AI adoption.
6. Ideas and approaches to help build your organization's AI strategy.
UiPath Test Automation using UiPath Test Suite series, part 6 DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
Programming Foundation Models with DSPy - Meetup SlidesZilliz
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
Best 20 SEO Techniques To Improve Website Visibility In SERPPixlogix Infotech
Boost your website's visibility with proven SEO techniques! Our latest blog dives into essential strategies to enhance your online presence, increase traffic, and rank higher on search engines. From keyword optimization to quality content creation, learn how to make your site stand out in the crowded digital landscape. Discover actionable tips and expert insights to elevate your SEO game.
2. Who I am
• Information Architect at Precision Content
• Certified Professional Technical Communicator (CPTC) Foundation
• Co-organized World Information Architecture Day 2023 Toronto
• Master of Information from the University of Toronto
3. 3
We are experts in structured content.
We’re a full-service, end-to-end technical
communications consultancy, technology
innovator, and systems integrator offering
professional services, training, and technology.
Areas of expertise
• structured authoring methods
• content lifecycle management
• DITA/XML design and implementation
• information architecture
• content strategy, and
• structured content delivery.
4. 4
Who is this presentation for?
People who…
• strategize, plan for, and otherwise work with text
content
• understand the benefits of structured content
• are familiar with XML but perhaps not with XPath
• want to learn how to take their content analysis
skills to the next level
5. 5
Structured
content
Content is easier to use and
understand when organized in a
predictable way.
Content is written to fit a model:
• Title
• Presenter
• Description
• Speaker Bio
6. 6
Structure makes content FAIR
Findable
Accessible
Interoperable
Reusable
“The FAIR Guiding Principles for scientific data management and stewardship” was published in Scientific Data in 2016
7. 7
An example of structure: HTML
Content is contained
inside opening and
closing tags.
Sometimes elements
contain other
elements.
All elements are
contained within a
single root.
8. 8
There’s just one problem…
These elements don’t tell me anything about what the content is about!
9. 9
XML
• XML is a way to store information
• XML stands for “eXtensible Markup Language”
• “Extensible” means that you define your own structure
HTML: Pre-defined tags
XML: Define your own tags
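To make the "define your own tags" idea concrete, here is a minimal sketch that uses Python's standard-library ElementTree parser. The `session` record and its tag names are hypothetical. They only mirror the Title / Presenter / Description / Speaker Bio model from the structured content slide and are not taken from a real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical session record; the tag names are illustrative only.
SESSION_XML = """
<session>
  <title>Querying your content with XPath</title>
  <presenter>A. Presenter</presenter>
  <description>An introduction to XML and XPath for content strategists.</description>
  <speakerBio>Information architect.</speakerBio>
</session>
"""

root = ET.fromstring(SESSION_XML)

# The single root element contains every other element.
child_tags = [child.tag for child in root]
print(root.tag, child_tags)

# Because the tags say what the content is, we can ask for it by meaning.
presenter = root.findtext("presenter")
print(presenter)
```

Unlike generic HTML tags, these element names say what each piece of content is about, which is what makes the later queries possible.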
10. 10
How do you define your own structure?
You define your structure, or your content model,
in a Document Type Definition (DTD).
11. 11
Defining your structure
What you can define with a DTD
• Elements
• Attributes
• If an element can contain text, another element, or both
• Order of elements
• If something is required or optional
What you can’t define with a DTD
• Length of content
• Occurrence constraints
• What text can go inside elements
12. 12
Structure helps you analyze your content
• Structure is a prerequisite to performing content
analysis at scale
• You want a way to tell if your content is valid or invalid
• Semantic structures can be understood by both people
and computers
• Using a widely adopted standard like XML lets us take
advantage of specialized tools
• Oftentimes you can adopt a standard structure rather
than inventing your own
13. 13
XML-based standards
Some extensions of XML have become standards in their own right
• Scalable Vector Graphics (SVG)
• Resource Description Framework (RDF)
• Darwin Information Typing Architecture (DITA)
14. 14
Finding structure
• What if your content is unstructured?
• Look for patterns in
• attributes
• classes
• common parent/sibling elements, and
• common text strings.
15. 15
Creating structure
• Break your content down into microcontent
• about one primary idea, fact, or concept
• easily scannable
• labelled for clear identification
and meaning, and
• appropriately written and formatted
for use anywhere and anytime it is needed.
16. 16
Microcontent structure
Source: The DITA Style Guide – Best Practices for Authors. Tony Self. www.ditastyle.com
• You do not need code to have structure
• Structure means
• systematic labelling
• modular, topic-based architecture
• constrained writing environments, and
• separation of content and form.
17. 17
Focus
Information about hours of work
Requirement for unplanned absences
Information about lunch breaks
Requirement for planned absences
21. 21
What is XPath?
• XPath is a language that lets you identify particular parts
of XML documents
• In XPath, we write “location paths”
• Example of an XPath location path: //bookstore/book/@id
• XPath can help you answer queries like…
• “Show me every element called ‘book’.”
• “Show me the parent element of the element called
‘price’.”
• “Show me all the elements that have the attribute
‘language’ set to ‘English’.”
• … and much more
• XPath is used in other XML-related languages like
XQuery and XSLT
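The three sample questions above can be tried out with a short sketch in Python's standard-library ElementTree, which supports only a subset of XPath. The bookstore document is a made-up sample. Because ElementTree returns elements rather than attribute nodes, the slide's `//bookstore/book/@id` is approximated here by reading the attribute after selecting the element.

```python
import xml.etree.ElementTree as ET

# Made-up sample document for illustration.
BOOKSTORE_XML = """
<bookstore>
  <book id="b1" language="English">
    <title>Harry Potter</title>
    <price>29.99</price>
  </book>
  <book id="b2" language="French">
    <title>Le Petit Prince</title>
    <price>12.50</price>
  </book>
</bookstore>
"""
root = ET.fromstring(BOOKSTORE_XML)

# "Show me every element called 'book'."
books = root.findall(".//book")

# "Show me the parent element of the element called 'price'."
price_parents = root.findall(".//price/..")

# "Show me all the elements that have the attribute 'language' set to 'English'."
english = root.findall(".//*[@language='English']")

# Approximation of //bookstore/book/@id: select the elements, then read the attribute.
book_ids = [book.get("id") for book in books]
print(book_ids)
```

A full XPath 1.0 engine would accept the slide's expressions as written. The leading `.` is needed here because ElementTree evaluates paths relative to the element they are called on.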
23. 23
Seven kinds of XML nodes
1. The root node
2. Element nodes
3. Text nodes
4. Attribute nodes
5. Comment nodes
6. Processing instruction nodes
7. Namespace nodes
29. 29
Node selectors
Expression   Description
/            Selects the document root node
//           Selects from all descendants of the context node and the context node itself
.            Selects the current node
..           Selects the parent of the current node
@            Selects attribute nodes
*            Selects any element node, regardless of name
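As a rough illustration of these selectors, the sketch below runs each one against a tiny made-up document using Python's standard-library ElementTree. This is a limited XPath subset: it does not accept absolute paths (a leading `/`) on an element, so every path starts from the context node with `.`, and `@` is available only as a test inside a predicate.

```python
import xml.etree.ElementTree as ET

# Made-up sample document for illustration.
root = ET.fromstring(
    "<bookstore>"
    "<book id='b1'><title>Harry Potter</title><price>29.99</price></book>"
    "<magazine><title>Wired</title></magazine>"
    "</bookstore>"
)

# "*" matches child elements whatever their name.
children = [el.tag for el in root.findall("*")]

# "//" searches all descendants, not just direct children.
all_titles = [el.text for el in root.findall(".//title")]

# "." is the current (context) node.
same_node = root.find(".") is root

# ".." steps up to the parent.
title_parents = [el.tag for el in root.findall(".//title/..")]

# "@" refers to an attribute; here it filters for elements that carry an id.
with_id = [el.tag for el in root.findall(".//*[@id]")]

print(children, all_titles, same_node, title_parents, with_id)
```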
43. 43
How to select nodes in XPath
Select the comment nodes that are children of book elements
• /bookstore/book/comment()
• //book/comment()
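Python's standard-library ElementTree does not support the `comment()` node test, so the sketch below swaps in a different technique: it asks the parser to keep comments (Python 3.8 or later) and then filters the children of each `book` in plain Python. The sample document is made up. Treat this as an approximation of `//book/comment()`, not as XPath itself.

```python
import xml.etree.ElementTree as ET

# Made-up sample document for illustration.
BOOKSTORE_XML = """
<bookstore>
  <!-- store-level note -->
  <book id="b1">
    <!-- signed copy -->
    <title>Harry Potter</title>
  </book>
  <book id="b2">
    <title>Le Petit Prince</title>
  </book>
</bookstore>
"""

# By default the parser discards comments; this TreeBuilder keeps them.
parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
root = ET.fromstring(BOOKSTORE_XML, parser=parser)

# Comment nodes that are direct children of book elements.
book_comments = [
    child.text.strip()
    for book in root.iter("book")
    for child in book
    if child.tag is ET.Comment
]
print(book_comments)
```

The store-level comment is skipped because its parent is `bookstore`, not `book`.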
44. 44
Axes
An axis is a direction that we travel along to get to different parts of an XML document.
Every step in an XPath location path has an axis. So far, we have used “abbreviated location paths.”
Unabbreviated, they use a double colon between the axis name and the node test. It looks like this:
//child::bookstore
Image source: https://jrebecchi.github.io/xpath-helper/xpath-axes.html
50. 50
Selecting with axes in XPath
Select the sibling elements following the title element
• //title/following-sibling::element()
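Named axes such as `following-sibling::` are not part of the XPath subset in Python's standard-library ElementTree, so this sketch emulates the axis by walking each parent's children in document order. The helper `following_siblings` and the sample book are hypothetical and for illustration only.

```python
import xml.etree.ElementTree as ET

# Made-up sample document for illustration.
root = ET.fromstring(
    "<bookstore><book>"
    "<author>A. Author</author>"
    "<title>Harry Potter</title>"
    "<year>2005</year>"
    "<price>29.99</price>"
    "</book></bookstore>"
)

def following_siblings(tree_root, tag):
    """Collect the elements that come after a child named `tag` under the same parent."""
    found = []
    for parent in tree_root.iter():
        children = list(parent)
        for index, child in enumerate(children):
            if child.tag == tag:
                found.extend(children[index + 1:])
                break  # later matches would only repeat elements already found
    return found

sibling_tags = [el.tag for el in following_siblings(root, "title")]
print(sibling_tags)
```

The `author` element is left out because it comes before `title`. The `preceding-sibling::` axis would select it instead.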
51. 51
Predicates
• Predicates are like a filter on your results
• Predicates appear inside [square brackets]
• Predicates are Boolean expressions
• The full syntax of a step in an XPath location path is
axis::node[predicate]
• In the unabbreviated syntax, axis and node are required. Predicate is optional.
• In the abbreviated syntax, if you do not specify an axis, it is assumed to be “child::”
53. 53
Selecting with predicates in XPath
Select the book with
the title “Harry Potter”
• //book[title="Harry Potter"]
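The same predicate can be run with Python's standard-library ElementTree, which supports child-text, attribute, and position predicates. The two-book document below is a made-up sample. The path starts with `.` because ElementTree evaluates paths relative to the element it is called on.

```python
import xml.etree.ElementTree as ET

# Made-up sample document for illustration.
root = ET.fromstring(
    "<bookstore>"
    "<book id='b1'><title>Harry Potter</title></book>"
    "<book id='b2'><title>Le Petit Prince</title></book>"
    "</bookstore>"
)

# //book[title="Harry Potter"]: keep only books whose title child has this text.
matches = root.findall(".//book[title='Harry Potter']")
match_ids = [book.get("id") for book in matches]

# A predicate can also test an attribute value...
by_id = root.findall(".//book[@id='b2']")

# ...or a position among siblings (XPath positions start at 1).
first_book = root.find(".//book[1]")

print(match_ids, by_id[0].findtext("title"), first_book.get("id"))
```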
62. 64
Real-world content analysis with XPath
• From experience, I know that tables inside tables often
have unpredictable issues. I want to check on them.
• //table//table
• I need to change all the section titles called
“Introduction” to “Overview.” Did I miss any?
• //section[title="Introduction"]
• The client wants a disclaimer paragraph at the very end
of the topic. Are there any disclaimers that are in the
wrong place?
• //p[@outputclass="disclaimer"]/following-sibling::element()
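Here is a minimal sketch of the three checks, run with Python's standard-library ElementTree. The topic markup is invented and only loosely DITA-like. ElementTree has no `following-sibling::` axis, so the disclaimer check is emulated in Python, and it reports the misplaced disclaimer itself rather than the elements that follow it.

```python
import xml.etree.ElementTree as ET

# Invented, loosely DITA-like sample topic for illustration.
TOPIC_XML = """
<topic>
  <section><title>Introduction</title>
    <table><row><entry>
      <table><row><entry>nested</entry></row></table>
    </entry></row></table>
  </section>
  <section><title>Overview</title>
    <p outputclass="disclaimer">For reference only.</p>
    <p>Trailing paragraph.</p>
  </section>
</topic>
"""
root = ET.fromstring(TOPIC_XML)

# Tables nested at any depth inside another table.
nested_tables = root.findall(".//table//table")

# Sections still titled "Introduction", meaning the rename missed them.
leftover_intros = root.findall(".//section[title='Introduction']")

# Disclaimers that are followed by at least one sibling element.
misplaced = []
for parent in root.iter():
    children = list(parent)
    for index, child in enumerate(children):
        is_disclaimer = child.tag == "p" and child.get("outputclass") == "disclaimer"
        if is_disclaimer and children[index + 1:]:
            misplaced.append(child)

print(len(nested_tables), len(leftover_intros), len(misplaced))
```

In each case an empty result means the content passes the check, so the queries can be rerun after every round of fixes.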
64. 66
Ideas for content analysis with XPath
• Look for outliers
• Ensure that elements are used for their intended
purpose (not just for some formatting shortcut)
• Check consistency across different types of elements
• Track down unnecessary child elements
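One possible way to start on these ideas is a simple tally, sketched here with Python's standard library. The sample topic and the thresholds (a tag used exactly once, an element with no text and no children) are assumptions chosen for illustration. Real content sets would need their own definitions of "outlier."

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Invented sample topic for illustration.
root = ET.fromstring(
    "<topic><p>a</p><p>b</p><p><b>c</b></p><note>d</note><p/></topic>"
)

# How often is each element used across the content?
tag_counts = Counter(el.tag for el in root.iter())

# Candidate outliers: elements that appear only once (ignoring the root).
rare = sorted(tag for tag, n in tag_counts.items() if n == 1 and tag != root.tag)

# Candidate clutter: elements with no text and no child elements.
empty = [el.tag for el in root.iter() if not (el.text or "").strip() and len(el) == 0]

print(tag_counts, rare, empty)
```

Rare or empty elements are not necessarily errors, but they are good places to check whether an element is being used for its intended purpose.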
66. 68
XML and XPath resources
• W3Schools tutorials
• https://www.w3schools.com/xml/
• XPath cheat sheet
• https://devhints.io/xpath
67. Thank You!
Are you ready to upgrade, transform, and future-enable your content?
Contact us and we’ll show you what’s possible.
precisioncontent.com | more-info@precisioncontent.com | 1(647)265-8500
Editor's Notes
Precision Content is a consultancy specializing in end-to-end services for technical communications.
We provide services in writer training, content strategy, information architecture, content lifecycle management, systems integration, and content publishing.
We use our expertise in microcontent and structured authoring with DITA/XML to empower our clients across a variety of industries to modernize their content. [click]
[Image – “Hours of Work” section from the old handbook]
[Image – The series of briefer microcontent topics in the updated handbook. “Work Hour Limits,” “Time Tracking Requirement,” “Your Work Environment,” etc.]
[Image – highlight both reference and principle information in the original employee handbook topic “Hours of Work”]
[Image – show two separate topics (with type info, if possible) that were broken out of the single mixed-function topic “Hours of Work”]
(Maybe what I can do for this is go on Heretto, find a topic, then delete the headings and paragraph breaks and such and use that as my example of “unstructured” content) Maybe “Hours of work” from the old employee handbook compared to the rewritten passage in the new one
Link to old employee handbook: https://ascan.sharepoint.com/CorpCommunications/Forms/AllItems.aspx?id=%2FCorpCommunications%2FPrecision%20Content%20Employee%20Handbook%2Epdf&parent=%2FCorpCommunications
Look at some of the other PCAS microcontent presentations for some stuff about what we mean by structure. In fact, use material from those presentations throughout your talk.