This presentation gives an overview of the challenges in building Natural Language Processing (NLP) tools for the Nepali language and explains why Python is a good fit for NLP development.
3. NLP Task                        English     Indic Languages   Nepali
   Machine Translation             Very Good   Good              Very Poor (Google/M$)
   Named Entity Recognition        Very Good   Fair              None (Few Ground work)
   Optical Character Recognition   Very Good   Poor              Very Poor
   POS Tagging                     Good        Poor              Very Poor
   Sentiment Analysis              Very Good   Fair              Poor (works on-going)
   Speech Recognition              Good        Poor              None (Google’s on-work)
What So Far?
12. –Prof. James A. Hendler
University of Maryland
“I have the students learn Python in our
undergraduate and graduate Semantic Web
courses. Why? Because basically there's nothing
else with the flexibility and as many web
libraries”
13. WHY PYTHON?
• NLTK, although not the most efficient
implementation, provides a lot of awesome tools
to quickly prototype a hypothesis
Source: Quora
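As an illustration of how quickly a hypothesis can be prototyped in Python, here is a minimal word-frequency sketch for Nepali text. It deliberately uses only the standard library rather than NLTK, and the helper name `word_frequencies` and the sample sentences are assumptions made up for this example.

```python
import re
from collections import Counter

# Devanagari block (U+0900-U+097F) minus the danda and double danda
# (U+0964, U+0965), which are sentence punctuation rather than word characters.
DEVANAGARI_WORD = re.compile(r"[\u0900-\u0963\u0966-\u097F]+")


def word_frequencies(text):
    """Count Devanagari word tokens in `text` (hypothetical helper)."""
    return Counter(DEVANAGARI_WORD.findall(text))


# Made-up sample: "Nepal is a beautiful country. Nepal is my country."
sample = "नेपाल सुन्दर देश हो। नेपाल मेरो देश हो।"
freqs = word_frequencies(sample)
print(freqs.most_common(3))
```

The explicit character range is used instead of `\w+` because Python's `re` module does not treat Devanagari vowel signs as word characters, so `\w+` would likely break words apart. Real Nepali text may also contain zero-width joiners and mixed scripts, which this sketch ignores.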
14. WHY PYTHON?
• SciPy + NumPy: Everything that isn't in NLTK is
definitely in these libraries. If you want to use more
advanced algorithms like Latent Semantic
Indexing or Latent Dirichlet Allocation, Python has
libraries to do that.
Source: Quora
15. WHY PYTHON?
• Python has really great XML/HTML parsing
libraries such as Beautiful Soup and Scrape.py.
You can use these libraries to quickly scrape the web and generate large
data sets to improve the performance of your models (because let's face
it, big data trumps complexity)
Source: Quora
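The idea of turning web pages into a text data set can be sketched as follows. This is a rough illustration that uses the standard library's `html.parser` as a stand-in for Beautiful Soup; the `ParagraphExtractor` class and the inline sample page are assumptions, and a real scraper would fetch pages over HTTP and handle messier markup.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text of every <p> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            self.paragraphs.append("".join(self._chunks).strip())
            self._in_p = False

    def handle_data(self, data):
        # Keep text only while inside a paragraph; inline tags such as <b>
        # are ignored but their text is still collected.
        if self._in_p:
            self._chunks.append(data)


# Made-up page standing in for a downloaded document.
page = """
<html><body>
  <h1>समाचार</h1>
  <p>नेपाल सुन्दर देश हो।</p>
  <div class="ad">Buy now!</div>
  <p>Python is <b>handy</b> for scraping.</p>
</body></html>
"""

parser = ParagraphExtractor()
parser.feed(page)
parser.close()
corpus = parser.paragraphs
print(corpus)
```

Only paragraph text ends up in `corpus`; the heading and the advertisement block are skipped.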
16. WHY PYTHON?
• Python has great web-frameworks like Django/
Pylons/Tornado.
If you invent a revolutionary sarcasm detector that can predict trends in
the stock market, you can quickly integrate it into a web service, make
millions, and buy a large island in a third-world country.
Source: Quora
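To give a feel for how little glue code a web service needs, here is a minimal sketch using the standard library's `wsgiref` instead of Django, Pylons, or Tornado. The `looks_sarcastic` function is a toy placeholder invented for this example, not a real detector, and the `text` query parameter, host, and port are likewise assumptions.

```python
import json
from urllib.parse import parse_qs
from wsgiref.simple_server import make_server


def looks_sarcastic(text):
    """Toy stand-in for a real model: flags text containing 'yeah right'."""
    return "yeah right" in text.lower()


def app(environ, start_response):
    """WSGI application: reads ?text=... and answers with a JSON verdict."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    text = query.get("text", [""])[0]
    body = json.dumps(
        {"text": text, "sarcastic": looks_sarcastic(text)}
    ).encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "application/json"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]


def serve(host="localhost", port=8000):
    """Run the app on a local development server (blocks until interrupted)."""
    with make_server(host, port, app) as server:
        server.serve_forever()
```

Calling `serve()` starts a local development server; a production deployment would put the same `app` callable behind a proper WSGI server or port the logic to one of the frameworks named above.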
17. WHY PYTHON?
• Consider your other options: It would not make
sense to use a compiled language like C++/Java
for this type of work unless you needed to increase
performance (computational speed, not model
accuracy).
As far as I can tell, Ruby is completely useless for any Machine Learning,
Data Mining, or Natural Language Processing task. Maybe you could use
Lisp, but at this point, Python has a larger ecosystem.
Source: Quora