Private & Confidential
Transforming unstructured web into actionable insights using AI
March-2018
Using AI to accelerate Digital Transformation
Ugam is a data and analytics company helping leading corporations to improve business decisions
• Analytics applications
• Analytics services
• E-commerce operations
17 years
Manufacturer, Distributor
B2C, B2B
Manufacturer, Retailer
Problem Definition & Business Impact
Volume
• Amazon: 562 million (2018), 372 million (2017)
• ~20 K every hour
Variety
• Every retailer has different site-cat-path
• Photo, video, Social, Mobile
Velocity
• Periodic, near Real Time, Real Time
Structure
• Unstructured data representation
• Schema varies per retailer
200K+ Categories
500K+ Brands
800K+ Attributes
8M+ Sellers
400 Million Products
Processing Performance
Curse of Modularity
Class Imbalance
Curse of Dimensionality
Feature Engineering
Heterogeneity & Noise
Cleaning, Deduping, Classification, Attribution, Compression, Matching
What - How - Why
The Holy Grail
Retailers
• Price Intelligence & Optimization
• Assortment Intelligence
• Product Content solutions
• Analytics for Merchandising & Marketing Decisions
Brands
• Dynamic Pricing
• MAP Monitoring
Data Aggregation, Data Synthesis, Data Analysis, Data Delivery
Category Research
Hierarchical Classification
Multiclass Linear SVM
Convolutional NN
Ensemble
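A minimal sketch of the top-down idea behind hierarchical classification: one model picks the top-level category, then a model trained only on that branch picks the leaf. The slide names linear SVMs and CNNs as the actual models; the token-count scorer below is a deliberately simple stand-in, and `TokenCountClassifier`, `ROWS`, and the category names in it are hypothetical.

```python
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


class TokenCountClassifier:
    """Toy stand-in for a real model: scores a label by how often the
    title's tokens were seen with that label during training."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def fit(self, titles, labels):
        for title, label in zip(titles, labels):
            self.counts[label].update(tokenize(title))
        return self

    def predict(self, title):
        tokens = tokenize(title)
        scores = {label: sum(c[t] for t in tokens)
                  for label, c in self.counts.items()}
        # Ties go to the alphabetically first label.
        return max(sorted(scores), key=scores.get)


def train_hierarchy(rows):
    """rows: (title, top_category, leaf_category) triples."""
    top = TokenCountClassifier().fit([r[0] for r in rows], [r[1] for r in rows])
    by_top = defaultdict(list)
    for title, top_cat, leaf in rows:
        by_top[top_cat].append((title, leaf))
    # One leaf-level model per top-level category.
    leaves = {cat: TokenCountClassifier().fit(*zip(*pairs))
              for cat, pairs in by_top.items()}
    return top, leaves


def classify(title, top, leaves):
    top_cat = top.predict(title)
    return top_cat, leaves[top_cat].predict(title)


# Hypothetical training rows, for illustration only.
ROWS = [
    ("black leather stiletto heel", "Clothing", "Shoes"),
    ("red running shoe", "Clothing", "Shoes"),
    ("blue cotton shirt", "Clothing", "Shirts"),
    ("white linen shirt", "Clothing", "Shirts"),
    ("gaming laptop 16gb ram", "Electronics", "Laptops"),
    ("ultrabook laptop 13 inch", "Electronics", "Laptops"),
    ("wireless bluetooth headphones", "Electronics", "Audio"),
    ("noise cancelling headphones", "Electronics", "Audio"),
]

if __name__ == "__main__":
    top, leaves = train_hierarchy(ROWS)
    print(classify("black pointed-toe stiletto", top, leaves))
```

Splitting the decision this way keeps each leaf model small, which is one common reason to prefer a hierarchy when the label space runs to hundreds of thousands of categories.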
Bootstrap Aggregating for improved performance
Original Data Set
Multiple Data sets: D1, D2, ..., Dn-1, Dn
Multiple Classifiers: C1, C2, ..., Cn-1, Cn
Combining Classifiers: Ensemble (Σ)
Clothing, Laptops, Electronics, Toys, Handbags & Luggage, Health, Beauty, Antiques, Kitchen, Miscellaneous, Personal care, Baby
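The bootstrap aggregating flow in this slide (original data set, resampled data sets D1..Dn, one classifier per data set, combined output) can be sketched in plain Python as follows. This is an illustration under assumptions, not the deck's implementation: the base learner is a toy token-count model, and all function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def train_token_model(rows):
    """Toy base learner: per-label token counts from (title, label) rows."""
    counts = defaultdict(Counter)
    for title, label in rows:
        counts[label].update(title.lower().split())
    return counts


def predict_one(model, title):
    tokens = title.lower().split()
    scores = {label: sum(c[t] for t in tokens) for label, c in model.items()}
    # Ties go to the alphabetically first label.
    return max(sorted(scores), key=scores.get)


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (one data set D_i)."""
    return [rng.choice(rows) for _ in rows]


def train_bagged(rows, n_models=5, seed=0):
    """Train one classifier C_i per bootstrap data set D_i."""
    rng = random.Random(seed)
    return [train_token_model(bootstrap_sample(rows, rng))
            for _ in range(n_models)]


def majority_vote(votes):
    """Combine classifier outputs; ties go to the alphabetically first label."""
    counts = Counter(votes)
    best = max(counts.values())
    return min(label for label, n in counts.items() if n == best)


def predict_bagged(models, title):
    return majority_vote([predict_one(m, title) for m in models])
```

Because each model sees a different resample, their individual errors tend to differ, and the vote averages some of that variance away.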
• Black Shoe
• Black Pointed-toe stiletto
• Black High Heel
• Black studded leather pointed-toe Christian Louboutin 6” glided heel stiletto for night out
Category Research
Text Attributes: CNN, Sequence Labeling
Image Feature Extraction: CNN
• Type: Casual
• Heel Height: 0.5 Inch
• Heel Type: Flat
• Material: PVC
• ASIN: B077BMVXLQ
• Brand: Footsoul
Managed Attributes
Unmanaged Attributes
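To make the input and output of text attribution concrete, here is a small sketch that turns a product title into attribute/value pairs. The slide names a CNN and sequence labeling for this step; the sketch swaps in lexicon lookups and a regular expression purely for illustration. The lexicons, the `extract_attributes` name, and the assumption that any inch measurement in a shoe title is the heel height are all hypothetical.

```python
import re

# Hypothetical lexicons mapping a lowercase token to its display value.
COLORS = {"black": "Black", "red": "Red", "white": "White"}
MATERIALS = {"leather": "Leather", "pvc": "PVC", "suede": "Suede"}

# A number followed by an inch marker: 6", 6”, 0.5 Inch, 4 inches, 3 in
HEEL_HEIGHT = re.compile(
    r'(\d+(?:\.\d+)?)\s*(?:["”]|inch(?:es)?\b|in\b)', re.IGNORECASE
)


def extract_attributes(title):
    """Return a dict of attribute name -> value found in the title."""
    tokens = re.findall(r"[a-z]+", title.lower())
    attrs = {}
    for name, lexicon in (("Color", COLORS), ("Material", MATERIALS)):
        # Take the first token that appears in the lexicon, if any.
        hit = next((lexicon[t] for t in tokens if t in lexicon), None)
        if hit is not None:
            attrs[name] = hit
    height = HEEL_HEIGHT.search(title)
    if height:
        # Assumption: the first inch measurement is the heel height.
        attrs["Heel Height"] = f"{height.group(1)} Inch"
    return attrs


if __name__ == "__main__":
    print(extract_attributes("Black High Heel"))
```

A learned sequence labeler replaces the fixed lexicons with per-token tag predictions, which is what lets it cope with values it has never seen listed.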
Info Bundle delivered through Image Processing API
Pre-classified Input image
Feature: Object identification, Image clustering
• Libraries/Functions: TensorFlow, Keras (CNN), Caffe
• Data used for training: Internal product database, CIFAR-100, CIFAR-10
Feature: Foreground extraction / Edge & contour
• Libraries/Functions: OpenCV, Keras (CNN)
• Data used for training: KITTI vision benchmark, GTI image database
Feature: Template matching / Brand detection
• Libraries/Functions: Keras, OpenCV
• Data used for training: Internal product database, CIFAR-100, KITTI, Gait dataset
Feature: Text/Color extraction
• Libraries/Functions: TensorFlow, Tesseract, OpenCV
• Data used for training: Internal product database, CIFAR-10, CIFAR-100
Managed Features: coverage achieved by merchandise category
• Hardline (Consumer Electronics, etc.): Brand, Color, Product; up to 95%
• Soft line (Apparel): up to 80%
Unmanaged: coverage achieved by merchandise category
• Hardline (Consumer Electronics, etc.): MPN, UPC; up to 80%
• Soft line (Apparel): up to 70%
Product Matching
01 Getting Classification done
• Correct classification gives us the right set of attributes.
02 Attribute Extraction
• Maximizing attribute coverage
• Brand, MPN/UPC, Category specific enforcer attribute
03 Compression / Clustering
• Allows us to work at scale
• Hierarchical Agglomerative Clustering
04 Associations
• Associative rule matching
• Exact, Similar matches
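As a rough illustration of the compression step, the sketch below runs hierarchical agglomerative clustering over product titles: every title starts as its own cluster, and the two closest clusters are merged repeatedly until no pair is close enough. The distance measure (Jaccard distance on title tokens), the single-linkage rule, and the `max_distance` threshold are assumptions chosen for brevity; the deck does not say which ones were used.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |a ∩ b| / |a ∪ b| for two token sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def single_linkage(c1, c2, token_sets):
    """Cluster distance = distance between the closest pair of members."""
    return min(jaccard_distance(token_sets[i], token_sets[j])
               for i in c1 for j in c2)


def agglomerative_clusters(titles, max_distance=0.6):
    """Return clusters as sorted lists of indices into `titles`."""
    token_sets = [set(t.lower().split()) for t in titles]
    clusters = [[i] for i in range(len(titles))]
    while len(clusters) > 1:
        # Find the closest pair of clusters (a < b by construction).
        (a, b), dist = min(
            (((x, y), single_linkage(clusters[x], clusters[y], token_sets))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if dist > max_distance:
            break  # nothing left that is close enough to merge
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    titles = [
        "acme laptop 15 inch silver",   # hypothetical listings
        "acme laptop 15 inch grey",
        "black leather stiletto heel",
        "black leather stiletto pump",
    ]
    print(agglomerative_clusters(titles))
```

Matching can then compare cluster representatives instead of every pair of listings, which is what makes the later steps tractable at scale. This naive loop recomputes all pairwise distances on each pass, so it is only suitable for small examples.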
Reinforcement: Validation of attributes - Tool
Matching Engine - Tool
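One way to picture the decision a matching engine makes is a small cascade of rules that labels a pair of listings as an exact match, a similar match, or no match. The sketch below is a guess at the general shape only: the record fields (`upc`, `brand`, `mpn`, `category`, `attributes`), the rule order, and the 0.75 threshold are all hypothetical and are not taken from the tool shown in the deck.

```python
def normalize(value):
    """Lowercase and keep only letters and digits, so 'X-100' equals 'x100'."""
    return "".join(ch for ch in str(value).lower() if ch.isalnum())


def match_type(a, b, similar_threshold=0.75):
    """Classify a pair of product records as 'exact', 'similar', or 'none'."""
    # Rule 1: identical UPC means the same product.
    if a.get("upc") and normalize(a["upc"]) == normalize(b.get("upc", "")):
        return "exact"
    # Rule 2: identical brand and manufacturer part number.
    if a.get("brand") and a.get("mpn") and all(
        normalize(a[k]) == normalize(b.get(k, "")) for k in ("brand", "mpn")
    ):
        return "exact"
    # Rule 3: same category and most shared attributes agree.
    if a.get("category") != b.get("category"):
        return "none"
    shared = set(a.get("attributes", {})) & set(b.get("attributes", {}))
    if not shared:
        return "none"
    agree = sum(
        normalize(a["attributes"][k]) == normalize(b["attributes"][k])
        for k in shared
    )
    return "similar" if agree / len(shared) >= similar_threshold else "none"
```

Identifier rules come first because they are cheap and decisive; attribute agreement is the fallback when identifiers are missing, which is why attribute coverage matters to match quality.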
www.ugamsolutions.com
Disclaimer:
The information set out in this presentation is produced by Ugam Solutions (“the Company” or “Ugam”) and is being made available AS IS to recipients
solely for information purposes only. This presentation and its contents are strictly confidential to Ugam and may not be used, reproduced, redistributed
or transmitted, passed on or published, in whole or in part, to any other person for any purpose whatsoever.

Transforming Unstructured Web into Actionable Insights Using AI - Abhimanyu - Ugam Solutions
