To easily tidy up "almost identical" images
(「ほとんど同じ」画像を簡単に整理するために)
- To casually throw "almost the same" pictures into the trash bin -

2012-09-29 LT (day 2) @ YAPC::Asia2012

@turugina

What I talked about
at yesterday's LT-thon:
making a Perl script
to download **** images :D

Building on that...

Do you have any photos like these?
[Two copies of one photo: JPEG quality 85 (original) and JPEG quality 50]
http://www.gatag.net/07/02/2009/090000.html

Like these:
[Two copies of one image: 450×299 (original) and 160×106 (different size)]
https://twitter.com/lestrrat

Or like these:
[Two illustrations: without glasses and with glasses]
らとみく | あらたとしひら http://p.tl/i/18558811

Let's throw the unnecessary one away.
[JPEG quality 85 (original) vs. JPEG quality 50]
http://www.gatag.net/07/02/2009/090000.html

Throw the unnecessary one away...
[450×299 (original) vs. 160×106 (different size)]
https://twitter.com/lestrrat

Oh, I want to keep both of these. :)
[Without glasses vs. with glasses]
らとみく | あらたとしひら http://p.tl/i/18558811

But how can we do that?

One answer:
do it by eye and by hand,
for every target picture.

“Are you kidding? I have tens of thousands of image files here!”

“Very well, then let's build a similar image search system.”

Previous research on
“Similar Image Search” algorithms
●   Compare pixel by pixel (Image::Compare?)
●   By (reduced) color histogram
●   By extracted outlines of the image
●   By representative projection vectors from fractal image
    compression
●   By characteristic values of divided regions of the image
    (average RGB, hue, saturation, value, and so on)
Algorithm (1/3)
Divide the image into regions (3x3, for example)
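
The region split can be sketched as follows. This is a minimal illustration in Python rather than the talk's Perl/XS code; the function name `region_boxes` and the integer-boundary rounding are assumptions, not taken from the slides.

```python
def region_boxes(width, height, rows=3, cols=3):
    """Return (left, top, right, bottom) boxes, row by row, that tile the image.

    Boundaries use integer division, so sizes that are not multiples of
    rows/cols are still covered exactly once (hypothetical rounding choice).
    """
    boxes = []
    for r in range(rows):
        top = r * height // rows
        bottom = (r + 1) * height // rows
        for c in range(cols):
            left = c * width // cols
            right = (c + 1) * width // cols
            boxes.append((left, top, right, bottom))
    return boxes


# The two image sizes mentioned earlier in the talk.
boxes_large = region_boxes(450, 299)
boxes_small = region_boxes(160, 106)
print(boxes_large[0], boxes_large[-1])
```

Because the grid is relative to the image size, a 450×299 image and its 160×106 copy both yield nine regions that cover corresponding parts of the picture.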
Algorithm (2/3)
Calculate a characteristic value for each region

Left image         Right image
0.7  0.3  1.0      0.7  0.4  0.9
0.8  0.7  0.8      0.7  0.7  0.7
0.1  0.2  0.2      0.2  0.2  0.2
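
One way to compute such per-region values is sketched below in Python (not the talk's Perl/XS implementation). It assumes the characteristic value is the mean brightness of each region, with pixels given as nested lists of values in 0.0..1.0; the slides only say "average RGB, hue, saturation, value, or so", so this choice and the name `region_means` are illustrative assumptions.

```python
def region_means(pixels, rows=3, cols=3):
    """Mean value of each grid region, row by row.

    pixels: list of image rows, each a list of brightness values in 0.0..1.0.
    The image must be at least rows x cols pixels.
    """
    height, width = len(pixels), len(pixels[0])
    means = []
    for r in range(rows):
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            cell = [pixels[y][x]
                    for y in range(top, bottom)
                    for x in range(left, right)]
            means.append(sum(cell) / len(cell))
    return means


# A 3x3 "image" and a 6x6 enlargement in which every pixel is doubled.
grid = [[0.7, 0.3, 1.0],
        [0.8, 0.7, 0.8],
        [0.1, 0.2, 0.2]]
big = [[grid[y // 2][x // 2] for x in range(6)] for y in range(6)]

means_small = region_means(grid)
means_big = region_means(big)
```

Both sizes produce (up to floating-point rounding) the same nine values, which is why a region-average signature is a plausible fit for detecting resized copies.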
Algorithm (3/3)
Calculate the RMSE (root mean square error) between the characteristic values of the two images

Left image         Right image
0.7  0.3  1.0      0.7  0.4  0.9
0.8  0.7  0.8      0.7  0.7  0.7
0.1  0.2  0.2      0.2  0.2  0.2

RMSE scale: 0.0 (same) < ~0.0x (similar) <<<<<<< 1.0 (different)
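
As a worked example, the RMSE of the two 3x3 grids shown on the slide can be computed like this. The sketch is in Python rather than the talk's Perl; the function name `rmse` and the flattened row-by-row ordering are assumptions for illustration.

```python
import math


def rmse(a, b):
    """Root mean square error between two equal-length value lists."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Characteristic values from the slide, flattened row by row.
left = [0.7, 0.3, 1.0, 0.8, 0.7, 0.8, 0.1, 0.2, 0.2]
right = [0.7, 0.4, 0.9, 0.7, 0.7, 0.7, 0.2, 0.2, 0.2]

score = rmse(left, right)
print(round(score, 4))  # 0.0745
```

Five of the nine regions differ by 0.1, so the squared errors sum to 0.05, the mean is 0.05 / 9, and the square root is about 0.0745. That falls in the "~0.0x" band the slide labels as similar.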
Implementation
●   Image::Characteristics (not on CPAN)
    –   lp:~turugina/+junk/p5-image-characteristics
    –   Uses the Imager API from XS
●   Samples (no pictures included)
    –   lp:~turugina/+junk/img_detect
        ●   gather.pl (gathers files using File::Find and
            calculates characteristic values)
        ●   matching.pl (makes pairs of suspicious files)
        ●   rmse.pl (calculates the RMSE of pairs)
        ●   web.pl (GUI using Mojolicious::Lite)
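
The pairing and scoring steps might look roughly like the following Python sketch. It is not the code from the repositories above: the file names, the `similar_pairs` helper, the all-pairs comparison, and the 0.1 threshold are all hypothetical choices made for illustration.

```python
import math
from itertools import combinations


def rmse(a, b):
    """Root mean square error between two equal-length value lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def similar_pairs(features, threshold=0.1):
    """Return (score, name_a, name_b) for every pair scoring below threshold.

    features: dict mapping a file name to its list of characteristic values.
    The threshold is an assumed cut-off, not a value from the talk.
    """
    pairs = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(sorted(features.items()), 2):
        score = rmse(vec_a, vec_b)
        if score < threshold:
            pairs.append((score, name_a, name_b))
    return sorted(pairs)


# Hypothetical file names with precomputed 3x3 characteristic values.
features = {
    "a.jpg":       [0.7, 0.3, 1.0, 0.8, 0.7, 0.8, 0.1, 0.2, 0.2],
    "a_small.jpg": [0.7, 0.4, 0.9, 0.7, 0.7, 0.7, 0.2, 0.2, 0.2],
    "b.jpg":       [0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1],
}

pairs = similar_pairs(features)
for score, name_a, name_b in pairs:
    print(f"{name_a} ~ {name_b}: RMSE {score:.4f}")
```

Comparing every pair is quadratic in the number of files, so with tens of thousands of images a real tool would likely narrow down candidates first; how matching.pl selects its pairs is not described on the slides.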
DEMO
Thank you.
