Atualizámos a nossa política de privacidade. Clique aqui para ver os detalhes. Toque aqui para ver os detalhes.
Ative o seu período de avaliaçõo gratuito de 30 dias para desbloquear leituras ilimitadas.
Ative o seu teste gratuito de 30 dias para continuar a ler.
Baixar para ler offline
Natural Language Processing (NLP) practitioners often have to deal with analyzing large corpora of unstructured documents and this is often a tedious process. Python tools like NLTK do not scale to large production data sets and cannot be plugged into a distributed scalable framework like Apache Spark or Apache Flink.
The Apache OpenNLP library is a popular machine learning based toolkit for processing unstructured text. Combining a permissive licence, a easy-to-use API and set of components which are highly customize and trainable to achieve a very high accuracy on a particular dataset. Built-in evaluation allows to measure and tune OpenNLP’s performance for the documents that need to be processed.
From sentence detection and tokenization to parsing and named entity finder, Apache OpenNLP has the tools to address all tasks in a natural language processing workflow. It applies Machine Learning algorithms such as Perceptron and Maxent, combined with tools such as word2vec to achieve state of the art results. In this talk, we’ll be seeing a demo of large scale Name Entity extraction and Text classification using the various Apache OpenNLP components wrapped into Apache Flink stream processing pipeline and as an Apache NiFI processor.
NLP practitioners will come away from this talk with a better understanding of how the various Apache OpenNLP components can help in processing large reams of unstructured data using a highly scalable and distributed framework like Apache Spark/Apache Flink/Apache NiFi.
Natural Language Processing (NLP) practitioners often have to deal with analyzing large corpora of unstructured documents and this is often a tedious process. Python tools like NLTK do not scale to large production data sets and cannot be plugged into a distributed scalable framework like Apache Spark or Apache Flink.
The Apache OpenNLP library is a popular machine learning based toolkit for processing unstructured text. Combining a permissive licence, a easy-to-use API and set of components which are highly customize and trainable to achieve a very high accuracy on a particular dataset. Built-in evaluation allows to measure and tune OpenNLP’s performance for the documents that need to be processed.
From sentence detection and tokenization to parsing and named entity finder, Apache OpenNLP has the tools to address all tasks in a natural language processing workflow. It applies Machine Learning algorithms such as Perceptron and Maxent, combined with tools such as word2vec to achieve state of the art results. In this talk, we’ll be seeing a demo of large scale Name Entity extraction and Text classification using the various Apache OpenNLP components wrapped into Apache Flink stream processing pipeline and as an Apache NiFI processor.
NLP practitioners will come away from this talk with a better understanding of how the various Apache OpenNLP components can help in processing large reams of unstructured data using a highly scalable and distributed framework like Apache Spark/Apache Flink/Apache NiFi.
Parece que você já adicionou este slide ao painel
Você recortou seu primeiro slide!
Recortar slides é uma maneira fácil de colecionar slides importantes para acessar mais tarde. Agora, personalize o nome do seu painel de recortes.A família SlideShare acabou de crescer. Desfrute do acesso a milhões de ebooks, áudiolivros, revistas e muito mais a partir do Scribd.
Cancele a qualquer momento.Leitura ilimitada
Aprenda de forma mais rápida e inteligente com os maiores especialistas
Transferências ilimitadas
Faça transferências para ler em qualquer lugar e em movimento
Também terá acesso gratuito ao Scribd!
Acesso instantâneo a milhões de e-books, audiolivros, revistas, podcasts e muito mais.
Leia e ouça offline com qualquer dispositivo.
Acesso gratuito a serviços premium como Tuneln, Mubi e muito mais.
Atualizámos a nossa política de privacidade de modo a estarmos em conformidade com os regulamentos de privacidade em constante mutação a nível mundial e para lhe fornecer uma visão sobre as formas limitadas de utilização dos seus dados.
Pode ler os detalhes abaixo. Ao aceitar, está a concordar com a política de privacidade atualizada.
Obrigado!