Free Webinar on the Lynx Services Platform LySP: Architecture and basic Services
The main objective of the Lynx research and innovation project is to create an ecosystem of smart cloud services for better compliance management, based on a Legal Knowledge Graph (LKG) that integrates and links multilingual and heterogeneous compliance data sources, including legislation, case law, standards, regulations and private contracts, among others.
This webinar provides insights into the smart services of the Lynx Services Platform (LySP), including demos of services such as Named Entity Recognition (NER) by DFKI, Relation Extraction and Question Answering by SWC, Machine Translation by Tilde, and the Lexicala cross-lingual lexical data service by K Dictionaries.
Lynx Webinar #4: Lynx Services Platform (LySP) - Part 2 - The Services
1. BUILDING THE LEGAL KNOWLEDGE GRAPH FOR SMART COMPLIANCE SERVICES IN MULTILINGUAL EUROPE
http://lynx-project.eu/
Lynx - Compliance made easy
Legal Knowledge Graph for Multilingual Compliance Services
Webinar: Lynx Services Platform (LySP) - Part 2: The Services
18/02/2021, 10.30am-11.30am CET
2. Agenda
• Introduction & the Lynx project - 5’
Martin Kaltenböck (Co-Founder and CFO of Semantic Web Company, SWC)
• Lynx Services Platform: The Services - Introduction - 5’
Artem Revenko (Director of Research & Innovation, Semantic Web Company)
• Lynx Services Platform: The Services in Detail - 40’
María Navas-Loro (Ontology Engineering Group, Artificial Intelligence Department, UPM)
Julian Moreno (Researcher at DFKI)
Ruben Martinez (Manager Customer Service, Tilde)
Pablo Calleja (Ontology Engineering Group, Artificial Intelligence Department, UPM)
Ilan Kernerman (CEO at Lexicala by K Dictionaries)
Christian Sageder (CEO at Cybly)
• Questions & Answers - 10’
3. The Lynx project
ICT14-2016-2017 (IA) Innovation action
Pillar: Industrial Leadership
Work Programme Year: H2020-2016-2017
Work Programme Part: Information and Communication Technologies
Topic: Big Data PPP: cross-sectorial and cross-lingual data integration and experimentation
Duration: 40 months
Start date: 1st December 2017
Estimated Project Cost: €3,638,065.00
Requested EU Contribution: €2,959,247.52
Project Officer: Johan BODENKAMP/Pierre-Paul SONDAG
9. Temporal Expression Service (1)
Finds the following types of expressions:
• DATE: April, 23/05, in 1998.
• TIME: At 2 o’clock, 5pm.
• SET: every Thursday, twice a month.
• DURATION: two days and a half, three years.
• INTERVALS (ongoing): From 3rd of April to 6th May.
10. Temporal Expression Service (2)
Once the previous expressions are found, they are normalized.
(...) In 1998 (1) it increased exponentially; that summer (2) (...)
(1) → 1998
(2) → summer of 1998 (1998-SU)
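The normalization step can be sketched roughly as follows. This is a minimal illustration and not the Lynx service itself: the two regular expressions, the `normalize` function and the rule that an underspecified season inherits the most recently mentioned year are all assumptions made for this example. Only the target values `1998` and `1998-SU` come from the slide.

```python
import re

# Hypothetical patterns; a real rule-based tagger uses far richer rules.
YEAR = re.compile(r"\b(?:in\s+)?(1[89]\d{2}|20\d{2})\b", re.IGNORECASE)
SEASON = re.compile(r"\bthat\s+(spring|summer|autumn|fall|winter)\b", re.IGNORECASE)
SEASON_CODES = {"spring": "SP", "summer": "SU", "autumn": "FA", "fall": "FA", "winter": "WI"}


def normalize(text):
    """Return (surface form, normalized value) pairs in text order."""
    matches = [(m.start(), "year", m) for m in YEAR.finditer(text)]
    matches += [(m.start(), "season", m) for m in SEASON.finditer(text)]
    results = []
    last_year = None  # anchor used to resolve underspecified expressions
    for _, kind, m in sorted(matches, key=lambda item: item[0]):
        if kind == "year":
            last_year = m.group(1)
            results.append((m.group(0), last_year))
        elif last_year is not None:
            # "that summer" is resolved against the last explicit year
            code = SEASON_CODES[m.group(1).lower()]
            results.append((m.group(0), f"{last_year}-{code}"))
    return results


pairs = normalize("In 1998 it increased exponentially; that summer it peaked.")
for surface, value in pairs:
    print(f"{surface!r} -> {value}")
```

The second expression cannot be normalized on its own; it only gets a value because the first one was found earlier in the text.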
11. Temporal Expression Service (3)
Languages covered:
● Legal-focused rule-based approaches:
○ Spanish
○ English
○ German
● Standard external tool for:
○ Italian
○ Dutch
For more information, please check:
https://www.youtube.com/watch?v=6-CwPal2ArE
12. Named Entity Recognition Service
• Four model families:
• General Domain:
• Statistical language models (EN, DE)
• BERT based Neural Networks (EN, DE, ES)
• Legal Domain (DE):
• Conditional Random Fields (CRF)
• Bidirectional Long Short-Term Memory Neural Networks (BiLSTM)
• Corpus: German court decisions
• 67,000 sentences and 54,000 entities
• 7 coarse-grained classes and 19 fine-grained classes
14. Geolocation Service
● Three approaches:
• Statistical language model
• Trained with a specific German and English corpus
• 17 fine-grained classes
• Dictionary based approach
• Spanish dictionary of companies
• Rule-based approach
• Regular expressions for recognition of addresses
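As an illustration of the rule-based approach, the sketch below recognizes one simple address layout with a regular expression. The pattern (single-word street, house number, five-digit postcode, city) and the `find_addresses` helper are assumptions for this example; the slide does not show the expressions actually used by the service.

```python
import re

# Assumed layout: "<Street> <number>, <5-digit postcode> <City>"
ADDRESS = re.compile(
    r"(?P<street>[A-ZÄÖÜ][\w.\-]*)\s"
    r"(?P<number>\d+[a-z]?),\s"
    r"(?P<postcode>\d{5})\s"
    r"(?P<city>[A-ZÄÖÜ][\w\-]+)"
)


def find_addresses(text):
    """Return one dict per recognized address, keyed by component name."""
    return [m.groupdict() for m in ADDRESS.finditer(text)]


found = find_addresses("Die Kanzlei sitzt in der Hauptstraße 12, 10115 Berlin.")
print(found)
```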
17. Entity Linking
Link a target (“Jaguar”) in a context to the correct entity in a knowledge base.
Assumption: All senses of the target are present in the knowledge base.
Usually suitable for large knowledge bases, for example DBpedia, WordNet.
Relax assumption -> decide if a target should be linked to some entity in the knowledge base.
Suitable for smaller enterprise knowledge graphs.
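The relaxed setting can be sketched as a linker that is allowed to return no link at all. The toy knowledge base, the Jaccard overlap score and the `threshold` value below are assumptions chosen for illustration; the slides do not say how the Lynx service scores candidates.

```python
def tokenize(text):
    """Lowercased word set with surrounding punctuation stripped."""
    return {w.strip(".,;:!?\"'").lower() for w in text.split()} - {""}


# Toy knowledge base: entity id -> descriptive keywords (hypothetical).
KB = {
    "kb:Jaguar_Cars": "british car manufacturer luxury vehicle engine",
    "kb:Jaguar_animal": "large cat species americas predator jungle",
}


def link(context, kb, threshold=0.2):
    """Return the best-matching entity id, or None if no sense fits well enough."""
    ctx = tokenize(context)
    best_id, best_score = None, 0.0
    for entity_id, description in kb.items():
        desc = tokenize(description)
        score = len(ctx & desc) / len(ctx | desc)  # Jaccard similarity
        if score > best_score:
            best_id, best_score = entity_id, score
    # Relaxed assumption: the right sense may be missing from the knowledge base.
    return best_id if best_score >= threshold else None


car = link("The Jaguar is a luxury car with a powerful engine", KB)
nil = link("Jaguar released a new operating system version", KB)
print(car, nil)
```

With a large knowledge base the threshold could be dropped and the best candidate always returned; in a small enterprise graph the `None` branch is what prevents forced, wrong links.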
20. Machine Translation - Benefits
1 Internal & External multilingual communication
Improve the organization's communication culture, from your internal team to
speaking the language of the customer
2 Increase translation productivity by 35%
Provide immediate human-like translations, facilitate processes of large
volume text translation
3 Enter new markets
Scale your business, move content quickly and enter new markets as fast as
possible while reducing the time and capital spent on projects
21. Machine Translation - in Lynx
- External service: used directly from the most up-to-date cloud
platform with Neural MT technology & terminology capabilities.
Regular technological updates
- Source Document, Text and Annotation translation
- Use-case specific: contracts, labor law, renewable energy
(trained on Lynx partners' documents and identified sources)
22. Extractive Summarization Service
● Selection of relevant sentences
• TF-IDF
• Encode documents and calculate weights for sentences using TF-IDF
• Centroids and composability of word embeddings
• Extract keywords and concepts
• Compose embeddings
• Create the document's centroid
• Project sentences into the embedding space
• Relevance scores (distance to centroid)
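The centroid idea can be sketched in a few lines. As a simplification, the example uses TF-IDF bag-of-words vectors in place of word embeddings, so it combines the two listed approaches rather than reproducing either one; the function names and the smoothed IDF formula are assumptions for this illustration.

```python
import math
from collections import Counter


def tfidf_vectors(sentences):
    """One TF-IDF vector per sentence over a shared, sorted vocabulary."""
    docs = [Counter(s.lower().split()) for s in sentences]
    n = len(docs)
    df = Counter(term for doc in docs for term in doc)  # document frequency
    vocab = sorted(df)
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    return [[doc[t] * idf[t] for t in vocab] for doc in docs]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def summarize(sentences, k=1):
    """Keep the k sentences closest to the document centroid, in original order."""
    vectors = tfidf_vectors(sentences)
    centroid = [sum(col) / len(vectors) for col in zip(*vectors)]
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: cosine(vectors[i], centroid),
        reverse=True,
    )
    return [sentences[i] for i in sorted(ranked[:k])]


sentences = [
    "the contract covers maternity leave",
    "maternity leave lasts sixteen weeks",
    "the weather was sunny",
]
summary = summarize(sentences, k=2)
print(summary)
```

Swapping `tfidf_vectors` for a function that averages pretrained word embeddings per sentence would bring the sketch closer to the second approach on the slide.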
25. Cross-Lingual Search (1)
• Full-text search in multilingual corpora
• APIs for
• Add / delete Lynx Documents to / from the search index; each Lynx Document
Part is its own document in the index
• Search documents / parts
• Possibility of complex search queries
• AND, OR, NOT, MUST, NEAR, (), phrases
• Filters for metadata
• Search terms are translated into the language of the corpus,
based on the targeted jurisdiction
26. Cross-Lingual Search (2)
Example: Maternity leave Spain AND Austria
Query pipeline (from the slide diagram):
• Detect language: English
• Detect jurisdiction(s): Austria, Spain
• Jurisdiction(s) to query + language: Austria, Spain, EU; German, Spanish, English
• Create an AST (abstract syntax tree)
• Translate and expand query: permiso por maternidad, Karenzzeit, maternity leave
• Query index
• Annotation query, using the services: GEO, NER, EL, Translation, Dictionary, Terminology
Resulting query:
(licencia de maternidad AND metadata.jurisdiction:ES) OR
(Karenzzeit AND metadata.jurisdiction:AT) OR
(maternity leave AND metadata.jurisdiction:EU) OR
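The last step, turning translated terms into a jurisdiction-scoped query string, can be sketched as below. The `build_query` helper and its input format are assumptions for this example; only the clause syntax with `metadata.jurisdiction` and the three term/jurisdiction pairs are taken from the slide.

```python
def build_query(translations):
    """Join one clause per (jurisdiction code, translated term) pair with OR."""
    clauses = [
        f"({term} AND metadata.jurisdiction:{code})"
        for code, term in translations
    ]
    return " OR ".join(clauses)


query = build_query([
    ("ES", "licencia de maternidad"),
    ("AT", "Karenzzeit"),
    ("EU", "maternity leave"),
])
print(query)
```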
27. Search and Information Retrieval (1)
http://lkg.lynx-project.eu/
• Web Portal & RESTful API
• Relies on an Elasticsearch DCM
• Manages parts of documents as
independent documents
28. Search and Information Retrieval (2)
Parameters of search query
• words/terms
• collection
• jurisdiction
• language
• part of another document
• rows
• ...
Evaluation
• Gold standard created by Cuatrecasas
• Spanish Workers' Statute document
• 152 questions (en/es) with answers (sections)
• Achieved >85% accuracy
• Experimentation with:
• stems, synonyms, term extraction
31. Dictionary Services: Entry Components
• headwords and expressions, inflections and variants
• phonetic transcription (IPA) and alternative script
• part of speech, grammatical gender and number
• subcategorization and valency
• definitions, sense indication and disambiguation
• examples of usage
• synonyms, antonyms, domains, context
• range of application, register, sentiment, geo usage
• translations
33. Dictionary Services: RDF Pipeline (1)
• data modelled with OntoLex, adhering to the lexicog module
• XML → JSON → JSON-LD conversion pipeline
• incremental approach
• mapping XML paths to the corresponding Linked Data element
• URI naming strategy established
• implementation of the model
• validation
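The conversion steps listed above can be sketched as a small XML → JSON → JSON-LD pipeline. The XML element names, the `BASE` URI and the minimal `validate` check are placeholders invented for this example and do not reflect the actual K Dictionaries data format or URI strategy; the OntoLex terms (`LexicalEntry`, `canonicalForm`, `Form`, `writtenRep`) are standard vocabulary.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical source entry; real dictionary XML is much richer.
XML_ENTRY = (
    '<Entry id="bow-n"><Headword>bow</Headword>'
    "<PartOfSpeech>noun</PartOfSpeech></Entry>"
)
BASE = "http://example.org/LexiconEN/"  # placeholder URI naming strategy


def xml_to_json(xml_text):
    """Step 1: flatten the XML entry into a plain dictionary."""
    root = ET.fromstring(xml_text)
    record = {"id": root.get("id")}
    for child in root:
        record[child.tag] = child.text
    return record


def json_to_jsonld(record):
    """Step 2: map each JSON key to a Linked Data element."""
    return {
        "@context": {
            "ontolex": "http://www.w3.org/ns/lemon/ontolex#",
            "lexinfo": "http://www.lexinfo.net/ontology/2.0/lexinfo#",
        },
        "@id": BASE + record["id"],
        "@type": "ontolex:LexicalEntry",
        "ontolex:canonicalForm": {
            "@type": "ontolex:Form",
            "ontolex:writtenRep": {"@value": record["Headword"], "@language": "en"},
        },
        "lexinfo:partOfSpeech": {"@id": "lexinfo:" + record["PartOfSpeech"]},
    }


def validate(doc):
    """Step 3: a minimal structural check, standing in for real validation."""
    required = ("@context", "@id", "@type", "ontolex:canonicalForm")
    return all(key in doc for key in required)


entry = json_to_jsonld(xml_to_json(XML_ENTRY))
serialized = json.dumps(entry, indent=2)
print(serialized)
```

Keeping the intermediate JSON step separate makes the incremental approach easier: new XML paths can be added to the first function and mapped to Linked Data elements in the second, one at a time.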
35. Dictionary Services: Sample Query
• A response to querying all lexical senses linked to the RDF entry :LexiconEN/bow-n, gathering the information originating from the different homographs as well as from other resources in which bow is given as a translation.
• The query currently results in 56 possible senses in different languages of bow as an English noun across the Global series.
36. Terminology Service
● Corpus based terminologies per pilot:
Labour Law, Contracts, Industrial Standards
● Multilingual, disambiguated knowledge
retrieved from the LLOD
● Languages covered:
○ Dutch
○ English
○ German
○ Spanish
Available at: http://lkg.lynx-project.eu/kos
37. Lynx Webinar Series
• Webinar 1: Lynx overall introduction
When: 10.12.2020, 10.30am CET (1 hour)
Recording: https://youtube.com/playlist?list=PLxa__IZYjIaiGbl3a-PyK3DqNhhMdnnHv
• Webinar 2: 3 Business Cases on top of the Lynx Legal Knowledge Graph
When: 14.1.2021, 10.30am CET
Recording: https://youtube.com/playlist?list=PLxa__IZYjIaiDL2O22ureD_nLmtgRq9LB
• Webinar 3: The Lynx Services Platform (LySP) - Part 1: Overview
When: 11/02/2021, 11.30am CET
Recording: https://youtube.com/playlist?list=PLxa__IZYjIahhiSXoJbVyxv_iAliExH5e
• Webinar 4: The Lynx Services Platform (LySP) - Part 2: The Services
When: 18/02/2021, 10.30am CET
Recording: https://youtube.com/playlist?list=PLxa__IZYjIaiv5MeV7uZsujv-MOi6SE-a