Automatic Key Term Extraction and Summarization from Spoken Course Lectures
Yun-Nung (Vivian) Chen
National Taiwan University - MS Oral Defense (Jun., 2011)
Recommended
IEEE SLT Best Student Paper Award
Automatic Key Term Extraction from Spoken Course Lectures
Yun-Nung (Vivian) Chen
These slides are a slightly modified version of those used for the author’s doctoral defense at NAIST on December 16, 2021.
Word Segmentation and Lexical Normalization for Unsegmented Languages
hs0041
Vanitadevi Patil Sapthagiri College of Engineering, Department of Computer Science and Engineering, Bangalore, INDIA
An Intuitive Natural Language Understanding System
inscit2006
In this talk we cover traditional search models, as well as the state-of-the-art approaches based on language models.
Language Models for Information Retrieval
Nik Spirin
J. Anurag, P. Nupur and Agrawal, S.S. School of Information Technology, Guru Gobind Singh Indraprastha University, Delhi, India Centre for Development of Advanced Computing, Noida, India
Improvement in Quality of Speech associated with Braille codes - A Review
inscit2006
Filled pauses and L2 proficiency: Finnish Australians speaking English
Wybo Wiersma
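Invited talk at the 141st Spoken Language Processing (SLP) research meeting / Speech research meeting. W.-C. Huang, E. Cooper, Y. Tsao, H.-M. Wang, T. Toda, J. Yamagishi: The VoiceMOS Challenge 2022, Mar. 2022. Toda Laboratory, Department of Intelligent Systems, Graduate School of Informatics, Nagoya University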
The VoiceMOS Challenge 2022
NU_I_TODALAB
Words and sentences are the basic units of text. In this lecture we discuss the basics of operations on words and sentences, such as tokenization, text normalization, tf-idf, cosine similarity measures, vector space models, and word representation.
Natural Language Processing: L02 words
ananth
Presentation at WAPOR Buenos Aires, June 2015
Packing and Unpacking the Bag of Words: Introducing a Toolkit for Inductive A...
Department of Communication Science, University of Amsterdam
The paper was accepted at ICSE'17 and TSE'19. https://se-thesaurus.appspot.com/ https://pypi.org/project/DomainThesaurus/ Informal discussions on social platforms (e.g., Stack Overflow) accumulate a large body of programming knowledge in natural language text. Natural language processing (NLP) techniques can be exploited to harvest this knowledge base for software engineering tasks. To make effective use of NLP techniques, a consistent vocabulary is essential. Unfortunately, the same concepts are often intentionally or accidentally mentioned in many different morphological forms in informal discussions, such as abbreviations, synonyms and misspellings. Existing techniques to deal with such morphological forms are either designed for general English or predominantly rely on domain-specific lexical rules. A thesaurus of software-specific terms and commonly used morphological forms is desirable for normalizing software engineering text, but very difficult to build manually. In this work, we propose an automatic approach to build such a thesaurus. Our approach identifies software-specific terms by contrasting software-specific and general corpuses, and infers morphological forms of software-specific terms by combining distributed word semantics, domain-specific lexical rules and transformations, and graph analysis of morphological relations. We evaluate the coverage and accuracy of the resulting thesaurus against community-curated lists of software-specific terms, abbreviations and synonyms. We also manually examine the correctness of the identified abbreviations and synonyms in our thesaurus. We demonstrate the usefulness of our thesaurus in a case study of normalizing questions from Stack Overflow and CodeProject.
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
Chunyang Chen
L. Damas and C. Million-Rousseau Condillac Group, LISTIC, Université de Savoie. 73370 Le Bourget du Lac, France Ontologos Corp. 6, route de Nanfray, 74000 Cran-Gevrier, France
Taking into account communities of practice’s specific vocabularies in inform...
inscit2006
Roger Labahn (University of Rostock, DE): Handwritten Text Recognition. Key concepts co:op-READ-Convention Marburg Technology meets Scholarship, or how Handwritten Text Recognition will Revolutionize Access to Archival Collections. With a special focus on biographical data in archives Hessian State Archives Marburg Friedrichsplatz 15, D - 35037 Marburg 19-21 January 2016
co:op-READ-Convention Marburg - Roger Labahn
ICARUS - International Centre for Archival Research
Describes at a high level how automated abstracts work and how these algorithms can be scaled to a massive corpus.
Automated Abstracts and Big Data
Sameer Wadkar
Presentation on the Frontiers of Natural Language Processing held at the Deep Learning Indaba 2018 (http://www.deeplearningindaba.com/).
Frontiers of Natural Language Processing
Sebastian Ruder
Asr
alexisronquillo
On Efficient Cross-modal Distillation
2010 INTERSPEECH
WarNik Chow
Slides from Neural Text Embeddings for Information Retrieval tutorial at WSDM 2017
Neural Text Embeddings for Information Retrieval (WSDM 2017)
Bhaskar Mitra
K-means Algorithm Latent Dirichlet Allocation
Topic Extraction on Domain Ontology
Keerti Bhogaraju
Automatic summarization, a difficult but pressing problem in natural language processing, aims at shortening source documents while retaining the main information. In recent years, more statistical machine learning methods have been applied to automatic summarization. In this paper, we propose a novel approach for summarization, based on a hierarchical Bayesian model of topic-semantic indexing (TSI) and an extraction strategy of average log-likelihood. The new method is tested on the Brown corpus, and its performance is analyzed by a well-designed blind experiment of one-way ANOVA on human reviews. The experimental results show that the TSI model is promising for topic-driven summarization.
Latent Topic-semantic Indexing based Automatic Text Summarization
Elaheh Barati
PhD Defense from Stanford University. Full dissertation: https://stacks.stanford.edu/file/druid:cg721hb0673/thesis-augmented.pdf
Processing short-message communications in low-resource languages