Rangarajan Chari
US Citizen
6709 Winnipeg Cove
Austin, TX 78759
512-461-1810 (C)
512-346-4616 (H)
rangarajan.chari@gmail.com
LinkedIn Profile: https://www.linkedin.com/in/mlengineer
SUMMARY
Data Scientist, Machine Learning and Natural Language Processing Specialist, and Software Engineer
with a unique combination of solid algorithm design skills and research acumen. Always eager to learn
and apply new technologies. Relevant experience includes:
● Research in AI at the PhD level
● Applying word and sentence embeddings, CNNs and RNNs for NLP tasks
● Implementing Convolutional Neural Nets for face recognition
● Using Machine Learning methods for classification
● Using “Big Data” technologies like Apache Spark (with Scala and PySpark), AWS, Hadoop and
Cascading
● Excellent programming in Python, C/C++ and Java
● Keeping pace with the latest developments and trends in ML and NLP
● Graph mining in social network analysis
● Conducting research in DoD SBIR projects
EDUCATION
● Georgia Institute of Technology, Atlanta, Georgia, M.S., Computer Science. Admitted into PhD
program.
Major Focus in PhD program: AI (Computational Vision, Case-Based Reasoning)
Minor in Statistics (Stochastic Processes, Queuing Theory, Nonparametric Statistics)
Research in Ph.D. program on Computational Vision (Pre-attentive vision, texture perception,
visual routines). Visiting Student, CMU Computer Vision Lab under Prof. Takeo Kanade.
● University of Denver, Denver, Colorado, M.S., Math and Computer Science.
Thesis on Computational Vision.
● Indian Institute of Technology, Bombay, India, M.Sc., Physics.
Recent Courses
● Certificate from deeplearning.ai (October 2017) for “Neural Networks and Deep Learning”,
“Structuring Machine Learning Projects”, “Convolutional Networks” and “Sequence Models” in
the Deep Learning Specialization on Coursera.
Built several CNNs for object detection from scratch as well as with TensorFlow and Keras. Built
character-level as well as word-level sequence to sequence models for NLP.
● NLP courses taken on Coursera: “Introduction to Natural Language Processing” by Prof.
Dragomir Radev, Univ. of Michigan, “Introduction to Natural Language Processing” by Jurafsky
and Manning, Stanford U.
● Currently taking a fast.ai course on Deep Learning.
PATENTS AND PUBLICATIONS
● “Entity Resolution from Inferred Relationships and Behavior”, Jonathan Mugan, Rangarajan
Chari, et al., IEEE BigData 2014.
● Method and Computer System for Identifying Entities in Interaction Data, Laura Hitt,
Rangarajan Chari, et al., U.S. Patent Application #14/525040, 2014.
● Alternative Methodology and Tool for Analyzing Competitive Benchmarks (US Pat. # 6,381,558,
IBM, 2002)
● System and Method for Compressing and Decompressing Fonts Based Upon Font Stroke
Regularities (US Pat. # 5,524,182, Hewlett-Packard, 1996)
PROFESSIONAL EXPERIENCE
General Motors, IT Innovation Center Aug 2018 – Present
Sr. Machine Learning Scientist (Contract)
NLP
● Experimenting with various approaches, techniques and tools to solve the business problem of
helping technicians, and in the longer term, owners, diagnose and troubleshoot problems with
their GM cars/trucks in a much easier way than consulting manuals.
● Attacking the immediate problem of finding relevant material about the most probable cause from
manuals and bulletins.
● Techniques and tools include word embeddings (word2vec, GloVe, fastText) and sentence
embeddings (e.g., the Google Universal Sentence Encoder and Facebook’s InferSent) as well as
machine learning and scientific computing libraries like sklearn and scipy. The Universal
Sentence Encoder and InferSent are less than two years old as of October 2018.
● Investigating an algorithm for text segmentation into paragraphs or topical sections based on a
signal processing algorithm for salient peak detection in time series data.
● The continuation of this project is contingent upon funding in the next quarter.
Cognizant Technology Solutions Nov 2017 – Aug 2018
Sr. Data Scientist
Deep Learning
● Assignments with:
o A well-known server technology company intending to offer machine learning and AI
services to their high-end customers.
o A large clinical laboratory services company seeking to get improved reimbursement
outcomes from health insurers by analyzing reasons for denial.
o The largest pharmacy network and prescription benefit plan manager in the US, with the
goal of reducing human effort in interpreting changing rules.
● Constructed a static Knowledge Graph of the prescription benefit plan and drug coverage domain
and devised a process to enrich it with entities and relations extracted from dynamic human-
generated text and enable it to be queried via SPARQL. The goal was to reduce human effort in
interpreting complex free-text annotations tied to a particular plan and particular drugs.
● Built a CNN with TensorFlow for predicting failures of freezers and coolers in a large chain of
stores using time series of sensor readings.
● Benchmarked the performance of Dell servers with and without GPUs and with and without
Intel-optimized TensorFlow on well-known Convolutional Neural Network architectures such as
LeNet-5, GoogLeNet, AlexNet, VGG17 and ResNet-50.
Infosys Limited May 2017 – Sep 2017
Data Scientist
Natural Language Processing
● Developed sentiment analysis and summarization algorithms for a world leader in the Oil & Gas
industry.
● Developed an algorithm to draw word clouds from text based on relative prominence.
● For another Fortune 10 company, developed an algorithm to extract information related to
customer care (the cause and resolution of customer complaints) from email threads.
Visual Semantics (now Third Insight) Nov 2016 – April 2017
Machine Learning Consultant
Deep Neural Networks
Implemented a compact, three-stage cascaded convolutional network for use in a system for face
detection. The network, built using Torch, reproduced results of a 2015 publication which had minimal
information about the architecture. Other frameworks like Theano/Lasagne were also evaluated. Also
looked at face detection libraries such as FaceNet (Google) and OpenFace (CMU), which is partly based
on OpenCV.
Jobs2Careers (now Talroo) Sep 2015 – Oct 2016
Data Scientist
Semantic Search
Enabling job seekers to find jobs relevant to the intent of their queries by understanding job
descriptions – classifying them, tagging them, and finding the dominant “topics” in them.
● Used word2vec, a neural embedding algorithm, on millions of job descriptions along with the
graph clustering algorithm mcl to assign “signatures” to the descriptions for job retrieval.
Developed heuristics for word-sense disambiguation and for automatically determining term
specificity.
● Tried a community detection approach to document clustering.
● Applied the probabilistic topic modeling techniques LDA (Latent Dirichlet Allocation) and HDP
(Hierarchical Dirichlet Processes) available in the gensim package to find the major themes in a
large job description corpus and use the model for Information Retrieval. Also experimented
with methods that combine LDA with word2vec (e.g., Topical Word Embeddings).
● Some of the above computations were done with Spark/Scala and PySpark in Databricks
notebooks and AWS S3. Gained experience with Spark DataFrames, RDDs and Spark SQL.
● Extensively used Python machine learning and NLP stacks (scikit-learn, nltk, scipy, numpy as
well as gensim, spaCy and chainer, a Python neural network library with CUDA and GPU
computation support) plus open source Java libraries like OpenNLP, Stanford Core NLP and
GATE.
● Developed a gold standard of responses to a carefully engineered set of queries and a random
sample of job descriptions to evaluate search engine versions rapidly and without expensive and
time-consuming A/B testing.
● Tried developing folksonomy-style tagging methods for documents. In this context,
experimented with keyword extraction techniques (Kea, Maui-indexer, KP-Miner and TextRank).
● Correlated click-through data with presented jobs and combined this with clustering of word
neighborhood graphs to find jobs likely to be clicked on.
Skills Used: research aptitude, machine learning algorithms, document clustering, text
classification, graph clustering, neural networks, word2vec, lda2vec, spaCy, chainer, mcl,
statistical NLP, python, nltk, scikit-learn, numpy, scipy, Spark, Scala, Spark MLLib, Databricks,
Spark Data Frames and Datasets, SQL, MySQL, AWS, parquet files, gensim package, WordNet,
Stanford Core NLP, OpenNLP, Solr, LDA, HDP, IR, Information Retrieval
The Home Depot, Atlanta, GA Sep 2014 – June 2015
Data Scientist
Online Search
Searching for products that are relevant to a customer by looking at product descriptions in
natural language in addition to structured data about them.
● Applied recent research on neural-net generated distributed, dense vector representations of
words and phrases in experiments to understand the context and intent of a user query by
mining a hitherto unexploited corpus of descriptions of ~1M products sold online by The Home
Depot.
● Used word2vec to overcome vocabulary mismatch by suggesting related search terms with the
objective of improving online customer experience on homedepot.com and increasing conversion
rates by an order of magnitude.
● Devised and selected algorithms that scale to millions of product descriptions.
● Categorized and provided insight into the reasons for “No Results Found” pages by mining
query logs containing tens of millions of unique queries. Assessed the potential impacts of better
spellchecking, model number recognition and automatic rephrasing of queries on the customer’s
experience and conversion rate.
● Discovered a way to use word2vec for correcting spelling errors in O(1) time.
Technologies Used: Python, Java, C/C++, bash, Linux, cygwin, awk, sed, Maven, Ant,
ontologies, OWL, RDF, Protégé, OpenRDF, WordNet, neural networks, word2vec, clustering, k-means,
kNN, R, Dragon Toolkit, aspell, Hunspell, Jazzy, LingPipe, ARK TurboParser
Dependency Parser, Stanford NLP, GATE, OpenNLP, Statistical
NLP, TF-IDF, Jaro-Winkler, Levenshtein distance, fuzzy search algorithms, recommendation
systems, collaborative filtering, Named Entity Recognition (NER), POS tagging.
21st Century Technologies (21CT), Austin, TX Dec 2013 – July 2014
Senior R&D Software Engineer
Social Network Analytics
Analyzing local neighborhood structure of social network nodes in a graph-theoretic way to
discover and quantify similarities between them.
● Developed a highly scalable and fast technique for analyzing and characterizing roles of
individuals within large social networks, by importing ideas from the analysis of protein
interaction networks in bioinformatics. This innovative application of graphlets to social networks
with ~10^5 edges is able to precisely identify in a matter of seconds individuals who play similar
roles to a single exemplar. It made a US Navy project for identifying potential terrorist threats in a
large social network enormously successful and is now part of the core IP of 21CT.
● Employed R packages for principal components analysis, k-means clustering and decision trees to
analyze results of using graphlet methods on Facebook100, a complete set of Facebook friendship
data from 100 American Universities in 2005.
● Implemented the graphlet application in C++ as well as Java for incorporation into company
codebase as a Maven project.
● Participated in a project to study collective entity resolution by fusing network data coming from
sources in different modalities. The system is aimed at coalescing multiple monikers belonging to the
same individual.
● Gained experience working on DoD SBIR research projects with tight deadlines.
● Created a small OWL ontology with RDF n-triples using Protégé and Sesame. Experimented with
Rya, a distributed RDF repository on top of the Accumulo key-value store. Generated and ran
SPARQL queries against the repository.
● Converted a group detection algorithm to MapReduce, using the Cascading abstraction layer on
top of Hadoop.
● Worked with several Python scripts and libraries as well as R packages for classification,
clustering, principal components analysis and visualization.
Tools Used: Terrorism Intelligence Analytics, DoD contracts, Social Network Analysis, Java,
Maven, C++, NoSQL, Accumulo, R, principal components analysis (PCA), machine learning,
Python, iPython, Scipy, Numpy, Eclipse, Netbeans, Cytoscape, graphlets, graph mining, RDF,
SPARQL, OpenRdf, ontologies, OWL, Protégé, Sesame, Hadoop, Cascading, MapReduce, Big
Data, Cloud, Linux, cygwin, bash, sed, awk, svn, Agile, SCRUM, software integration.
RenewData Corp. Oct. 2012 – July 2013
Senior Software Engineer
Information Retrieval from free text databases
Retrieving legal documents relevant to a litigation with high precision and recall, expanding
queries where needed and dealing with “vocabulary mismatch”.
● Researched and implemented some of the latest IR techniques for query suggestion, relevance
feedback and ranked retrieval to modernize and differentiate the company’s two main products
in the eDiscovery marketplace.
● Experimented with Latent Semantic Indexing (LSI) as implemented in the “semantic vectors”
package to create models which represent collections of documents in terms of underlying
concepts.
● Enhanced components which are written in Java, Ruby and C#, use MongoDB, MySQL and SQL
Server databases and communicate via SOAP/REST web services. Technologies employed include
Apache Lucene and Solr (for free text search), JBoss, Spring and Maven.
Skills Used: C++, Boost, g++, cygwin, Visual C++, Java, JBoss, Maven, Spring, Svn, QuickBuild,
web services, SOAP, REST, XML, Big Data, SQL, NoSQL, MongoDB, Agile, SCRUM, Rally,
Applied Research in Information retrieval (IR), TF-IDF, machine learning, algorithm design and
implementation, universal hash functions, Bloom filters, performance analysis and optimization,
Eclipse, Mockito, Junit, document classification.
Polycom, Inc. Mar. 2012 – Oct. 2012
Senior Staff Software Engineer
Videoconferencing Systems
Developed RESTful web services in the Java Restlet framework on the Android platform to expose
functionalities of an embedded videoconferencing system with Java and C++ components communicating
via Google protobuf.
Skills Used: Java, REST, Restlet framework, web services, JSON, XML, C++, Google protobuf,
Android, Agile, SCRUM, Jira, svn
Consulting Software Engineer June 2006 – February 2012
Notable clients include:
● ShoreTel, Austin, TX, VoIP Phones
Refactored and completely re-implemented the Qt 4.6/C++-based Network Access Controller
eliminating critical bugs and memory leaks in next-generation phones.
● PayPal, Austin, TX, Infrastructural Software
Modified enterprise-wide C++ software to use a standard version of the Xerces XML Parser. Led
a pilot project to prevent buffer overflow, code injection and other vulnerabilities in PayPal
software by introducing Fortify, a static analysis tool, into the development process.
● Advanced Micro Devices, Austin, TX, CPU Diagnostics
Developed a remote diagnostics tool using the XML-RPC protocol.
● IBM, Austin, TX, AIX Kernel Technical Support
Interfaced with IBM customers worldwide as well as AIX kernel developers to resolve code defects
in the loader and linker.
Tools and Skills: C, C++, Qt 4.6, Ubuntu Linux, Embedded Linux, CentOS Linux, Visual Studio
2008, Eclipse IDE, KDevelop, Subversion, CVS, Perforce, Rational ClearCase/ClearQuest, Jira,
XML, Xerces DOM Parser, Agile methodology, SCRUM, MVC architecture, design patterns,
multi-threaded systems, compiler front-end, XML-RPC protocol, http, ftp, AIX, bash, ksh, gcc,
gdb, Oracle VirtualBox, network programming, sockets, client interaction
Tanisys Technology, Austin, TX Nov 2004 – June 2006
Senior Engineer
Embedded compiler in C and message-oriented middleware in C++ for a memory tester
● Facilitated the use of the M1000 high-end memory tester by defining a custom language and
implementing an embedded compiler (target – PPC405) for it using flex, bison and GNU crosstool
on Linux.
● Developed and deployed multi-threaded middleware in Visual C++ for a distributed system with
socket communication between modules.
Skills: C, gcc, g++, flex, bison, compilers, GNU crosstool, embedded Linux, cygwin, Visual C++, sockets,
multi-threaded programming, UML, use-case scenarios, SQL, Database Template Library (DTL)
Pattern Discovery, Austin, TX Jan 2004 – Nov 2004
Owner
Startup; Federal contracts under SBIR/STTR programs; collaboration with National Labs
● Researched ways of using swarm intelligence techniques for target identification by a collection
of power-constrained unmanned aerial vehicles (UAVs) in response to a US Navy SBIR
solicitation.
● Gained extensive field and industry knowledge by collaborating with Sandia National Labs and
faculty at the University of New Mexico.
Skills: Research, Artificial Intelligence, autonomous, decentralized systems, collaboration, knowledge
transfer
Intel Corp., Austin, TX July 2003 – Jan 2004
Compiler Engineer Consultant
Dynamic, profile-directed compiler development
● Implemented parts of an experimental dynamically retranslating binary-to-binary compiler
conceptually similar to H-P Labs' Dynamo to enable x86 (IA-32) code to be run in the Itanium 2
(IA-64) environment.
Skills: dynamic, profile-directed compilers, x86 and Itanium 2 instruction sets, research, Visual C++
Metrowerks, then a Motorola Company, Austin, TX July 2000 – June 2003
Compiler Engineer
Compiler Development; Performance Analysis and Measurement; Software Integration
● Initiated and led a project to integrate the Metrowerks re-targetable compiler with HiWare,
Switzerland's Static Single-Assignment (SSA) based compiler to modernize it and demonstrate
how new, more powerful optimizations enabled by SSA Form can improve code quality without
degrading performance.
● Re-implemented Global Common Sub Expression Elimination and other major dataflow
optimizations in the Metrowerks Intermediate Representation Optimizer to remove flaws and
enhance code quality.
● Measured compilation speed and code quality using Intel VTune and EEMBC, gcc and SPEC92
benchmarks.
Skills Used: C, compiler design, CodeWarrior IDE, CVS, SSA Form, performance analysis, Intel VTune,
collaboration
HIGHLIGHTS OF PREVIOUS EXPERIENCE
● Improved the performance of the IBM Java Just-in-Time (JIT) compiler in conjunction with IBM
Tokyo Research Labs, achieving benchmark scores exceeding those of Microsoft Internet Explorer
by more than 35%. Discovered patterns of suboptimal code in regions outside busy loops using
Intel VTune as profiler, leading to further improvement in scores on the order of 12%.
● Modified the JVM and tested a new criterion for JIT compilation based on actuarial lifetime
prediction algorithms.
● Invented an alternative profiling methodology for analyzing competitive Java benchmarks.
● Studied human perception of fonts and patented a font compression algorithm based on
discovering patterns in font “strokes”.
● Doubled the throughput of GTSTRUDL, a widely used computer-aided structural engineering
tool, by drastically refactoring its C-based kernel in which more than 90% of runtime was spent.
This performance optimization helped make the product viable in the face of competition from
similar products from other companies such as McDonnell Douglas.
● Showed in MS Thesis at Univ. of Denver that the Relative Neighborhood Graph of a dot pattern
is a strong predictor of how humans connect dots in that pattern, and whether the random-dot
Moiré effect is perceived in it, leading to some hypotheses about early vision.
● Graduate Research Assistant, University of Denver Dept. of Geography and Georgia Tech, Dept.
of Computer Science, AI Group.

Mais conteúdo relacionado

Mais procurados

Streaming HYpothesis REasoning
Streaming HYpothesis REasoningStreaming HYpothesis REasoning
Streaming HYpothesis REasoningWilliam Smith
 
Streaming Hypothesis Reasoning - William Smith, Jan 2016
Streaming Hypothesis Reasoning - William Smith, Jan 2016Streaming Hypothesis Reasoning - William Smith, Jan 2016
Streaming Hypothesis Reasoning - William Smith, Jan 2016Seattle DAML meetup
 
Ashisdeb analytics new_cv_doc
Ashisdeb analytics new_cv_docAshisdeb analytics new_cv_doc
Ashisdeb analytics new_cv_docashis deb
 
Exploring Qualitative Data Analytics with NVivo 12 Plus
Exploring Qualitative Data Analytics with NVivo 12 Plus Exploring Qualitative Data Analytics with NVivo 12 Plus
Exploring Qualitative Data Analytics with NVivo 12 Plus Shalin Hai-Jew
 
Character Recognition using Data Mining Technique (Artificial Neural Network)
Character Recognition using Data Mining Technique (Artificial Neural Network)Character Recognition using Data Mining Technique (Artificial Neural Network)
Character Recognition using Data Mining Technique (Artificial Neural Network)Sudipto Krishna Dutta
 
AAAI 2016 - A Visual Semantic Framework For Innovation Analytics
AAAI 2016 - A Visual Semantic Framework For Innovation AnalyticsAAAI 2016 - A Visual Semantic Framework For Innovation Analytics
AAAI 2016 - A Visual Semantic Framework For Innovation AnalyticsKripa (कृपा) Rajshekhar
 
Lexalytics Text Analytics Workshop: Perfect Text Analytics
Lexalytics Text Analytics Workshop: Perfect Text AnalyticsLexalytics Text Analytics Workshop: Perfect Text Analytics
Lexalytics Text Analytics Workshop: Perfect Text AnalyticsLexalytics
 
IRJET - Conversion of Unsupervised Data to Supervised Data using Topic Mo...
IRJET -  	  Conversion of Unsupervised Data to Supervised Data using Topic Mo...IRJET -  	  Conversion of Unsupervised Data to Supervised Data using Topic Mo...
IRJET - Conversion of Unsupervised Data to Supervised Data using Topic Mo...IRJET Journal
 
Probablistic information retrieval
Probablistic information retrievalProbablistic information retrieval
Probablistic information retrievalNisha Arankandath
 
Requirementv4
Requirementv4Requirementv4
Requirementv4stat
 
Deep Recommender Systems - PAPIs.io LATAM 2018
Deep Recommender Systems - PAPIs.io LATAM 2018Deep Recommender Systems - PAPIs.io LATAM 2018
Deep Recommender Systems - PAPIs.io LATAM 2018Gabriel Moreira
 
A Semantic Question Answering through Heterogeneous Data Source in the Domain...
A Semantic Question Answering through Heterogeneous Data Source in the Domain...A Semantic Question Answering through Heterogeneous Data Source in the Domain...
A Semantic Question Answering through Heterogeneous Data Source in the Domain...ijnlc
 
Popular Text Analytics Algorithms
Popular Text Analytics AlgorithmsPopular Text Analytics Algorithms
Popular Text Analytics AlgorithmsPromptCloud
 
Language Technologies for Geomatics: From Intelligence to Agility
Language Technologies for Geomatics: From Intelligence to AgilityLanguage Technologies for Geomatics: From Intelligence to Agility
Language Technologies for Geomatics: From Intelligence to AgilityVisionGEOMATIQUE2014
 

Mais procurados (18)

Streaming HYpothesis REasoning
Streaming HYpothesis REasoningStreaming HYpothesis REasoning
Streaming HYpothesis REasoning
 
Streaming Hypothesis Reasoning - William Smith, Jan 2016
Streaming Hypothesis Reasoning - William Smith, Jan 2016Streaming Hypothesis Reasoning - William Smith, Jan 2016
Streaming Hypothesis Reasoning - William Smith, Jan 2016
 
Ashisdeb analytics new_cv_doc
Ashisdeb analytics new_cv_docAshisdeb analytics new_cv_doc
Ashisdeb analytics new_cv_doc
 
Exploring Qualitative Data Analytics with NVivo 12 Plus
Exploring Qualitative Data Analytics with NVivo 12 Plus Exploring Qualitative Data Analytics with NVivo 12 Plus
Exploring Qualitative Data Analytics with NVivo 12 Plus
 
Character Recognition using Data Mining Technique (Artificial Neural Network)
Character Recognition using Data Mining Technique (Artificial Neural Network)Character Recognition using Data Mining Technique (Artificial Neural Network)
Character Recognition using Data Mining Technique (Artificial Neural Network)
 
AAAI 2016 - A Visual Semantic Framework For Innovation Analytics
AAAI 2016 - A Visual Semantic Framework For Innovation AnalyticsAAAI 2016 - A Visual Semantic Framework For Innovation Analytics
AAAI 2016 - A Visual Semantic Framework For Innovation Analytics
 
Kamakhya_Python
Kamakhya_Python Kamakhya_Python
Kamakhya_Python
 
Lexalytics Text Analytics Workshop: Perfect Text Analytics
Lexalytics Text Analytics Workshop: Perfect Text AnalyticsLexalytics Text Analytics Workshop: Perfect Text Analytics
Lexalytics Text Analytics Workshop: Perfect Text Analytics
 
IRJET - Conversion of Unsupervised Data to Supervised Data using Topic Mo...
IRJET -  	  Conversion of Unsupervised Data to Supervised Data using Topic Mo...IRJET -  	  Conversion of Unsupervised Data to Supervised Data using Topic Mo...
IRJET - Conversion of Unsupervised Data to Supervised Data using Topic Mo...
 
Probablistic information retrieval
Probablistic information retrievalProbablistic information retrieval
Probablistic information retrieval
 
Requirementv4
Requirementv4Requirementv4
Requirementv4
 
Deep Recommender Systems - PAPIs.io LATAM 2018
Deep Recommender Systems - PAPIs.io LATAM 2018Deep Recommender Systems - PAPIs.io LATAM 2018
Deep Recommender Systems - PAPIs.io LATAM 2018
 
A Semantic Question Answering through Heterogeneous Data Source in the Domain...
A Semantic Question Answering through Heterogeneous Data Source in the Domain...A Semantic Question Answering through Heterogeneous Data Source in the Domain...
A Semantic Question Answering through Heterogeneous Data Source in the Domain...
 
Transform unstructured e&p information
Transform unstructured e&p informationTransform unstructured e&p information
Transform unstructured e&p information
 
Cv zamir siddiqui
Cv zamir siddiquiCv zamir siddiqui
Cv zamir siddiqui
 
Popular Text Analytics Algorithms
Popular Text Analytics AlgorithmsPopular Text Analytics Algorithms
Popular Text Analytics Algorithms
 
Case study
Case studyCase study
Case study
 
Language Technologies for Geomatics: From Intelligence to Agility
Language Technologies for Geomatics: From Intelligence to AgilityLanguage Technologies for Geomatics: From Intelligence to Agility
Language Technologies for Geomatics: From Intelligence to Agility
 

Semelhante a Data science nlp_resume-2018-abridged

Christine_Straub - ML Engineer.pdf
Christine_Straub - ML Engineer.pdfChristine_Straub - ML Engineer.pdf
Christine_Straub - ML Engineer.pdfChristine Straub
 
Premanand naik data_scientist_4years_pune
Premanand naik data_scientist_4years_punePremanand naik data_scientist_4years_pune
Premanand naik data_scientist_4years_punepremanand naik
 
Introduction To Data Science with Apache Spark
Introduction To Data Science with Apache Spark Introduction To Data Science with Apache Spark
Introduction To Data Science with Apache Spark ZaranTech LLC
 
Unlocking Value from Unstructured Data
Unlocking Value from Unstructured DataUnlocking Value from Unstructured Data
Unlocking Value from Unstructured DataAccenture Insurance
 
Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...
Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...
Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...Neo4j
 
Introduction to Daigo Tanaka @ Anelen
Introduction to Daigo Tanaka @ AnelenIntroduction to Daigo Tanaka @ Anelen
Introduction to Daigo Tanaka @ AnelenDaigo Tanaka, Ph.D.
 
Data science presentation
Data science presentationData science presentation
Data science presentationMSDEVMTL
 
How to Become a Big Data Professional.pdf
How to Become a Big Data Professional.pdfHow to Become a Big Data Professional.pdf
How to Become a Big Data Professional.pdfCareervira
 
Navigating the Era of Big Data Analytics: A Roadmap for Data Analyst Courses ...
Navigating the Era of Big Data Analytics: A Roadmap for Data Analyst Courses ...Navigating the Era of Big Data Analytics: A Roadmap for Data Analyst Courses ...
Navigating the Era of Big Data Analytics: A Roadmap for Data Analyst Courses ...BayaReddy M
 
Data science Nagarajan and madhav.pptx
Data science Nagarajan and madhav.pptxData science Nagarajan and madhav.pptx
Data science Nagarajan and madhav.pptxNagarajanG35
 
SiddharthaMitra_resume_pdf
SiddharthaMitra_resume_pdfSiddharthaMitra_resume_pdf
SiddharthaMitra_resume_pdfSiddhartha Mitra
 
Is Spark the right choice for data analysis ?
Is Spark the right choice for data analysis ?Is Spark the right choice for data analysis ?
Is Spark the right choice for data analysis ?Ahmed Kamal
 
Scaling Knowledge Graph Architectures with AI
Scaling Knowledge Graph Architectures with AIScaling Knowledge Graph Architectures with AI
Scaling Knowledge Graph Architectures with AIEnterprise Knowledge
 
Multiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier DominguezMultiplatform Spark solution for Graph datasources by Javier Dominguez
Multiplatform Spark solution for Graph datasources by Javier DominguezBig Data Spain
 
Heet detroja.resume
Heet detroja.resumeHeet detroja.resume
Heet detroja.resumeHeetDetroja
 
PPT5: Neuron Introduction
PPT5: Neuron IntroductionPPT5: Neuron Introduction
PPT5: Neuron Introductionakira-ai
 

Semelhante a Data science nlp_resume-2018-abridged (20)

Christine_Straub - ML Engineer.pdf
Christine_Straub - ML Engineer.pdfChristine_Straub - ML Engineer.pdf
Christine_Straub - ML Engineer.pdf
 
sudipto_resume
sudipto_resumesudipto_resume
sudipto_resume
 
Premanand naik data_scientist_4years_pune
Premanand naik data_scientist_4years_punePremanand naik data_scientist_4years_pune
Premanand naik data_scientist_4years_pune
 
Introduction To Data Science with Apache Spark
Introduction To Data Science with Apache Spark Introduction To Data Science with Apache Spark
Introduction To Data Science with Apache Spark
 
Unlocking Value from Unstructured Data
Data science nlp_resume-2018-abridged

  • 1. Rangarajan Chari
US Citizen
6709 Winnipeg Cove
Austin, TX 78759
512-461-1810 (C)
512-346-4616 (H)
rangarajan.chari@gmail.com
LinkedIn Profile: https://www.linkedin.com/in/mlengineer
SUMMARY
Data Scientist, Machine Learning and Natural Language Processing Specialist and Software Engineer with a unique combination of solid algorithm design skills and research acumen. Always eager to learn and apply new technologies. Relevant experience includes:
● Research in AI at the PhD level
● Applying word and sentence embeddings, CNNs and RNNs for NLP tasks
● Implementing Convolutional Neural Nets for face recognition
● Using Machine Learning methods for classification
● Using “Big Data” technologies like Apache Spark (with Scala and PySpark), AWS, Hadoop and Cascading
● Excellent programming in Python, C/C++ and Java
● Keeping pace with the latest developments and trends in ML and NLP
● Graph mining in social network analysis
● Conducting research in DoD SBIR projects
EDUCATION
● Georgia Institute of Technology, Atlanta, Georgia, M.S., Computer Science. Admitted into PhD program.
Major Focus in PhD program: AI (Computational Vision, Case-Based Reasoning)
Minor in Statistics (Stochastic Processes, Queuing Theory, Nonparametric Statistics)
Research in Ph.D. program on Computational Vision (pre-attentive vision, texture perception, visual routines). Visiting Student, CMU Computer Vision Lab under Prof. Takeo Kanade.
● University of Denver, Denver, Colorado, M.S., Math and Computer Science.
Thesis on Computational Vision.
● Indian Institute of Technology, Bombay, India, M.Sc., Physics.
Recent Courses
● Certificate from deeplearning.ai (October 2017) for “Neural Networks and Deep Learning”, “Structuring Machine Learning Projects”, “Convolutional Networks” and “Sequence Models” in the Deep Learning Specialization on Coursera.
Built several CNNs for object detection from scratch as well as with TensorFlow and Keras. Built character-level as well as word-level sequence-to-sequence models for NLP.
  • 2. ● NLP courses taken on Coursera: “Introduction to Natural Language Processing” by Prof. Dragomir Radev, Univ. of Michigan, “Introduction to Natural Language Processing” by Jurafsky and Manning, Stanford U.
● Currently taking a fast.ai course on Deep Learning.
PATENTS AND PUBLICATIONS
● “Entity Resolution from Inferred Relationships and Behavior”, Jonathan Mugan, Rangarajan Chari, et al., IEEE BigData 2014.
● Method and Computer System for Identifying Entities in Interaction Data, Laura Hitt, Rangarajan Chari, et al., U.S. Patent Application #14/525040, 2014.
● Alternative Methodology and Tool for Analyzing Competitive Benchmarks (US Pat. # 6,381,558, IBM, 2002)
● System and Method for Compressing and Decompressing Fonts Based Upon Font Stroke Regularities (US Pat. # 5,524,182, Hewlett-Packard, 1996)
PROFESSIONAL EXPERIENCE
General Motors, IT Innovation Center Aug 2018 – Present
Sr. Machine Learning Scientist (Contract)
NLP
● Experimenting with various approaches, techniques and tools to solve the business problem of helping technicians, and in the longer term, owners, diagnose and troubleshoot problems with their GM cars/trucks in a much easier way than consulting manuals.
● Attacking the immediate problem of finding relevant material about the most probable cause from manuals and bulletins.
● Techniques and tools include word embeddings -- word2vec, GloVe, fastText -- and sentence embeddings (e.g., the Google Universal Sentence Encoder and Facebook’s InferSent) as well as machine learning and scientific computing libraries like sklearn and scipy. The Universal Sentence Encoder and InferSent are less than two years old as of October 2018.
● Investigating an algorithm for text segmentation into paragraphs or topical sections based on a signal processing algorithm for salient peak detection in time series data.
● The continuation of this project is contingent upon funding in the next quarter.
Cognizant Technology Solutions Nov 2017 – Aug 2018
Sr. Data Scientist
Deep Learning
● Assignments with:
o A well-known server technology company intending to offer machine learning and AI services to their high-end customers.
o A large clinical laboratory services company seeking to get improved reimbursement outcomes from health insurers by analyzing reasons for denial.
o The largest pharmacy network and prescription benefit plan manager in the US, with the goal of reducing human effort in interpreting changing rules.
● Constructed a static Knowledge Graph of the prescription benefit plan and drug coverage domain and devised a process to enrich it with entities and relations extracted from dynamic human-generated text and enable it to be queried via SPARQL. The goal was to reduce human effort in interpreting complex free-text annotations tied to a particular plan and particular drugs.
  • 3. ● Built a CNN with TensorFlow for predicting failures of freezers and coolers in a large chain of stores using time series of sensor readings.
● Benchmarked the performance of Dell servers with and without GPUs and with and without Intel-optimized TensorFlow on well-known Convolutional Neural Network architectures such as Lenet5, GoogLenet, Alexnet, VGG17 and Resnet50.
Infosys Limited May 2017 – Sep 2017
Data Scientist
Natural Language Processing
● Developed sentiment analysis and summarization algorithms for a world leader in the Oil & Gas industry.
● Developed an algorithm to draw word clouds from text based on relative prominence.
● For another Fortune 10 company, developed an algorithm to extract information related to customer care (the cause and resolution of customer complaints) from email threads.
Visual Semantics (now Third Insight) Nov 2016 – April 2017
Machine Learning Consultant
Deep Neural Networks
Implemented a compact, three-stage cascaded convolutional network for use in a system for face detection. The network, built using Torch, reproduced results of a 2015 publication which had minimal information about the architecture. Other frameworks like Theano/Lasagne were also evaluated. Also looked at face detection libraries such as FaceNet (Google) and OpenFace (CMU), which is partly based on OpenCV.
Jobs2Careers (now Talroo) Sep 2015 – Oct 2016
Data Scientist
Semantic Search
Enabling job seekers to find jobs relevant to the intent of their queries by understanding job descriptions: classifying them, tagging them, and finding the dominant “topics” in them.
● Used word2vec, a neural embedding algorithm, on millions of job descriptions along with the graph clustering algorithm mcl to assign “signatures” to the descriptions for job retrieval.
Developed heuristics for word-sense disambiguation and for automatically determining term specificity.
● Tried a community detection approach to document clustering.
● Applied the probabilistic topic modeling techniques LDA (Latent Dirichlet Allocation) and HDP (Hierarchical Dirichlet Processes) available in the gensim package to find the major themes in a large job description corpus and use the model for Information Retrieval. Also experimented with methods that combine LDA with word2vec (e.g., Topical Word Embeddings).
● Some of the above computations were done with Spark/Scala and PySpark in Databricks notebooks and AWS S3. Gained experience with Spark DataFrames, RDDs and Spark SQL.
● Extensively used Python machine learning and NLP stacks (scikit-learn, nltk, scipy, numpy as well as gensim, spaCy and chainer - a Python neural network library with CUDA and GPU computation support) plus open source Java libraries like OpenNLP, Stanford Core NLP and GATE.
  • 4. ● Developed a gold standard of responses to a carefully engineered set of queries and a random sample of job descriptions to evaluate search engine versions rapidly and without expensive and time-consuming A/B testing.
● Tried developing folksonomy-style tagging methods for documents. In this context, experimented with keyword extraction techniques (Kea, Maui-indexer, KP-Miner and TextRank).
● Correlated click-through data with presented jobs and combined this with clustering of word neighborhood graphs to find jobs likely to be clicked on.
Skills Used: research aptitude, machine learning algorithms, document clustering, text classification, graph clustering, neural networks, word2vec, lda2vec, spaCy, chainer, mcl, statistical NLP, python, nltk, scikit-learn, numpy, scipy, Spark, Scala, Spark MLLib, Databricks, Spark Data Frames and Datasets, SQL, MySQL, AWS, parquet files, gensim package, WordNet, Stanford Core NLP, OpenNLP, Solr, LDA, HDP, IR, Information Retrieval
The Home Depot, Atlanta, GA Sep 2014 – June 2015
Data Scientist
Online Search
Searching for products that are relevant to a customer by looking at product descriptions in natural language in addition to structured data about them.
● Applied recent research on neural-net generated distributed, dense vector representations of words and phrases in experiments to understand the context and intent of a user query by mining a hitherto unexploited corpus of descriptions of ~1M products sold online by The Home Depot.
● Used word2vec to overcome vocabulary mismatch by suggesting related search terms with the objective of improving online customer experience on homedepot.com and increasing conversion rates by an order of magnitude.
● Devised and selected algorithms that scale to millions of product descriptions.
● Categorized and provided insight into the reasons for “No Results Found” pages by mining query logs containing tens of millions of unique queries.
Assessed the potential impacts of better spellchecking, model number recognition and automatic rephrasing of queries on the customer’s experience and conversion rate.
● Discovered a way to use word2vec for correcting spelling errors in O(1) time.
Technologies Used: Python, Java, C/C++, bash, Linux, cygwin, awk, sed, Maven, Ant, ontologies, OWL, RDF, Protégé, OpenRDF, WordNet, neural networks, word2vec, clustering, k-means, kNN, R, Dragon Toolkit, aspell, Hunspell, Jazzy, LingPipe, ARK TurboParser Dependency Parser, Stanford NLP, GATE, OpenNLP, Named Entity Recognition (NER), Statistical NLP, TF-IDF, Jaro-Winkler, Levenshtein distance, fuzzy search algorithms, recommendation systems, collaborative filtering, POS tagging.
21st Century Technologies (21CT), Austin, TX Dec 2013 – July 2014
Senior R&D Software Engineer
Social Network Analytics
Analyzing local neighborhood structure of social network nodes in a graph-theoretic way to discover and quantify similarities between them.
  • 5. ● Developed a highly scalable and fast technique for analyzing and characterizing roles of individuals within large social networks, by importing ideas from the analysis of protein interaction networks in bioinformatics. This innovative application of graphlets to social networks with ~10^5 edges is able to precisely identify in a matter of seconds individuals who play similar roles to a single exemplar. It made a US Navy project for identifying potential terrorist threats in a large social network enormously successful and is now part of the core IP of 21CT.
● Employed R packages for principal components analysis, k-means clustering and decision trees to analyze results of using graphlet methods on Facebook100, a complete set of Facebook friendship data from 100 American universities in 2005.
● Implemented the graphlet application in C++ as well as Java for incorporation into company codebase as a Maven project.
● Participated in a project to study collective entity resolution by fusing network data coming from sources in different modalities. System is aimed at coalescing multiple monikers belonging to the same individual.
● Gained experience working on DoD SBIR research projects with tight deadlines.
● Created a small OWL ontology with RDF n-triples using Protégé and Sesame. Experimented with Rya, a distributed RDF repository on top of the Accumulo key-value store. Generated and ran SPARQL queries against the repository.
● Converted a group detection algorithm to MapReduce, using the Cascading abstraction layer on top of Hadoop.
● Worked with several Python scripts and libraries as well as R packages for classification, clustering, principal components analysis and visualization.
Tools Used: Terrorism Intelligence Analytics, DoD contracts, Social Network Analysis, Java, Maven,
C++, NoSQL, Accumulo, R, principal components analysis (PCA), machine learning, Python, IPython, Scipy, Numpy, Eclipse, Netbeans, Cytoscape, graphlets, graph mining, RDF, SPARQL, OpenRdf, ontologies, OWL, Protégé, Sesame, Hadoop, Cascading, MapReduce, Big Data, Cloud, Linux, cygwin, bash, sed, awk, svn, Agile, SCRUM, software integration.
RenewData Corp. Oct. 2012 – July 2013
Senior Software Engineer
Information Retrieval from free text databases
Retrieving legal documents relevant to a litigation with high precision and recall, expanding queries where needed and dealing with “vocabulary mismatch”.
● Researched and implemented some of the latest IR techniques for query suggestion, relevance feedback and ranked retrieval to modernize and differentiate the company’s two main products in the eDiscovery marketplace.
● Experimented with Latent Semantic Indexing (LSI) as implemented in the “semantic vectors” package to create models which represent collections of documents in terms of underlying concepts.
● Enhanced components which are written in Java, Ruby and C#, use MongoDB, MySQL and SQL Server databases and communicate via SOAP/REST web services. Technologies employed include Apache Lucene and Solr (for free text search), JBoss, Spring and Maven.
Skills Used: C++, Boost, g++, cygwin, Visual C++, Java, JBoss, Maven, Spring, Svn, QuickBuild, web services, SOAP, REST, XML, Big Data, SQL, NoSQL, MongoDB, Agile, SCRUM, Rally, Applied Research in Information retrieval (IR), TF-IDF, machine learning, algorithm design and
  • 6. implementation, universal hash functions, Bloom filters, performance analysis and optimization, Eclipse, Mockito, JUnit, document classification.
Polycom, Inc. Mar. 2012 – Oct. 2012
Senior Staff Software Engineer
Videoconferencing Systems
Developed RESTful web services in the Java Restlet framework on the Android platform to expose functionalities of an embedded videoconferencing system with Java and C++ components communicating via Google protobuf.
Skills Used: Java, REST, Restlet framework, web services, JSON, XML, C++, Google protobuf, Android, Agile, SCRUM, Jira, svn
Consulting Software Engineer June 2006 – February 2012
Notable clients include:
● ShoreTel, Austin, TX, VoIP Phones
Refactored and completely re-implemented the Qt 4.6/C++-based Network Access Controller, eliminating critical bugs and memory leaks in next-generation phones.
● PayPal, Austin, TX, Infrastructural Software
Modified enterprise-wide C++ software to use a standard version of the Xerces XML Parser. Led a pilot project to prevent buffer overflow, code injection and other vulnerabilities in PayPal software by introducing Fortify, a static analysis tool, into the development process.
● Advanced Micro Devices, Austin, TX, CPU Diagnostics
Developed a remote diagnostics tool using the XML-RPC protocol.
● IBM, Austin, TX, AIX Kernel Technical Support
Interfaced with IBM customers worldwide as well as AIX kernel developers to resolve code defects in the loader and linker.
Tools and Skills: C, C++, Qt 4.6, Ubuntu Linux, Embedded Linux, CentOS Linux, Visual Studio 2008, Eclipse IDE, KDevelop, Subversion, CVS, Perforce, Rational ClearCase/ClearQuest, Jira, XML, Xerces DOM Parser, Agile methodology, SCRUM, MVC architecture, design patterns, multi-threaded systems, compiler front-end, XML-RPC protocol, http, ftp, AIX, bash, ksh, gcc, gdb, Oracle VirtualBox, network programming, sockets, client interaction
Tanisys Technology, Austin, TX Nov 2004 – June 2006
Senior Engineer
Embedded compiler in C and message-oriented middleware in C++ for a memory tester
● Facilitated the use of the M1000 high-end memory tester by defining a custom language and implementing an embedded compiler (target – PPC405) for it using flex, bison and GNU crosstool on Linux.
● Developed and deployed multi-threaded middleware in Visual C++ for a distributed system with socket communication between modules.
Skills: C, gcc, g++, flex, bison, compilers, GNU crosstool, embedded Linux, cygwin, Visual C++, sockets, multi-threaded programming, UML, use-case scenarios, SQL, Database Template Library (DTL)
Pattern Discovery, Austin, TX Jan 2004 – Nov 2004
Owner
  • 7. Startup; Federal contracts under SBIR/STTR programs; collaboration with National Labs
● Researched ways of using swarm intelligence techniques for target identification by a collection of power-constrained unmanned aerial vehicles (UAVs) in response to a US Navy SBIR solicitation.
● Gained extensive field and industry knowledge by collaborating with Sandia National Labs and faculty at the University of New Mexico.
Skills: Research, Artificial Intelligence, autonomous, decentralized systems, collaboration, knowledge transfer
Intel Corp., Austin, TX July 2003 – Jan 2004
Compiler Engineer Consultant
Dynamic, profile-directed compiler development
● Implemented parts of an experimental dynamically retranslating binary-to-binary compiler conceptually similar to H-P Labs' Dynamo to enable x86 (IA-32) code to be run in the Itanium 2 (IA-64) environment.
Skills: dynamic, profile-directed compilers, x86 and Itanium 2 instruction sets, research, Visual C++
Metrowerks, then a Motorola Company, Austin, TX July 2000 – June 2003
Compiler Engineer
Compiler Development; Performance Analysis and Measurement; Software Integration
● Initiated and led a project to integrate the Metrowerks re-targetable compiler with HiWare, Switzerland's Static Single-Assignment (SSA) based compiler to modernize it and demonstrate how new, more powerful optimizations enabled by SSA Form can improve code quality without degrading performance.
● Re-implemented Global Common Sub Expression Elimination and other major dataflow optimizations in the Metrowerks Intermediate Representation Optimizer to remove flaws and enhance code quality.
● Measured compilation speed and code quality using Intel VTune and EEMBC, gcc and SPEC92 benchmarks.
Skills Used: C, compiler design, CodeWarrior IDE, CVS, SSA Form, performance analysis, Intel VTune, collaboration
HIGHLIGHTS OF PREVIOUS EXPERIENCE
● Improved the performance of the IBM Java Just-in-Time (JIT) compiler in conjunction with IBM Tokyo Research Labs, achieving benchmark scores exceeding that of Microsoft Internet Explorer by more than 35%. Discovered patterns of suboptimal code in regions outside busy loops using Intel VTune as profiler, leading to further improvement in scores on the order of 12%.
● Modified the JVM and tested a new criterion for JIT compilation based on actuarial lifetime prediction algorithms.
● Invented an alternative profiling methodology for analyzing competitive Java benchmarks.
● Studied human perception of fonts and patented a font compression algorithm based on discovering patterns in font “strokes”.
● Doubled the throughput of GTSTRUDL, a widely-used computer-aided structural engineering tool, by drastically refactoring its C-based kernel in which more than 90% of runtime was spent. This performance optimization helped make the product viable in the face of competition from similar products from other companies such as McDonnell Douglas.
  • 8. ● Showed in MS Thesis at Univ. of Denver that the Relative Neighborhood Graph of a dot pattern is a strong predictor of how humans connect dots in that pattern, and whether the random-dot Moiré effect is perceived in it, leading to some hypotheses about early vision.
● Graduate Research Assistant, University of Denver Dept. of Geography and Georgia Tech, Dept. of Computer Science, AI Group.