A Locality Sensitive Hashing Filter
 for Encrypted Vector Databases
           Junpei Kawamoto
     (University of Tsukuba, Japan)




         This work is partly supported by The Nakajima Foundation
Dec. 4, 2012            A Locality Sensitive Hashing Filter for Encrypted Vector Databases   2




Vector databases
• A kind of database that consists of vectors and values.
  • e.g., a picture database

                   feature vector                       picture (value)
               (129, 251, 94, …)^T                      [picture]
               (98, 112, 49, …)^T                       [picture]




Vector databases
• A kind of database that consists of vectors and values.


• For simplicity, assume the schema is (k, v)
   • k: key vector attribute (feature vector, etc.)
   • v: value attribute (its data type does not matter)


• Queries
  • Only over the key vector attribute.
  • Find tuples having key vectors k s.t. sim(k, q) ≧ α
       • q: query vector
       • α: threshold
  • We employ cosine similarity and assume all vectors are normalised.
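
The query model above can be sketched in a few lines of Python. This is only an illustration: the three-component vectors are hypothetical truncations of the slide's example vectors, the third tuple and the threshold 0.99 are made up, and `threshold_query` is a name introduced here.

```python
import math

def normalise(v):
    """Scale v to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(u, v):
    # For unit vectors, cosine similarity is just the dot product.
    return sum(a * b for a, b in zip(u, v))

def threshold_query(db, q, alpha):
    """Return every (k, v) tuple whose key vector satisfies sim(k, q) >= alpha."""
    q = normalise(q)
    return [(k, v) for k, v in db if cosine(normalise(k), q) >= alpha]

# Hypothetical three-dimensional picture database.
db = [
    ([129.0, 251.0, 94.0], "picture-1"),
    ([98.0, 112.0, 49.0], "picture-2"),
    ([-5.0, 3.0, 200.0], "picture-3"),
]
hits = threshold_query(db, [129.0, 251.0, 90.0], alpha=0.99)
print([value for _, value in hits])
```

Note that this plain version scans every tuple, which is exactly the cost the rest of the talk tries to reduce.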




Vector databases
• A query example.
  • Query vector: (129, 251, 90, …)^T
  • Threshold: 0.8
  • Database:

          feature vector                  picture (value)
          (129, 251, 94, …)^T             [picture]
          (98, 112, 49, …)^T              [picture]

  • Result: the tuple whose feature vector is (129, 251, 94, …)^T




Cloud sourced vector databases
• A database owner wants to deploy a vector database (VDB) on a cloud service
  • To share data easily

[Figure: the database owner deploys the VDB on a cloud service; colleagues (database users) access it there.]

   • The owner does not have to manage any servers.




Privacy and security concerns
• Can the owner and the users trust the cloud services?
   • Malicious services can read data in the VDB.
   • Malicious services can read queries from users.

[Figure: the database owner deploys the VDB on a cloud service and a database user accesses it; the service can see both the data and the queries.]

   • The VDB might contain sensitive information.
   • Queries (i.e. query vectors) might also be sensitive information.




Encrypted vector databases
• All tuples are encrypted before deploying them.
  • Malicious services cannot read any data in the EVDB.
• Queries are also encrypted.
  • Malicious services cannot read any queries from users.

[Figure: the database owner encrypts the VDB and deploys it as an EVDB; colleagues (database users) access the EVDB.]

   • Many approaches have been proposed.
   • We use those methods as basic protocols.




Encrypted vector databases
• All tuples are encrypted before deploying them.
  • Enck: An algorithm to encrypt key vectors
  • Encv: An algorithm to encrypt values
  • A plain tuple (k, v) is encrypted to (Enck(k), Encv(v))


• Queries are also encrypted.
  • Encq: An algorithm to encrypt query vector
  • A query vector q is encrypted to Encq(q)


• An important property of those encryption algorithms
  • Invariance of similarity
  • k・q = Enck(k)・Encq(q) (cosine similarities are the same after encryption)
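
The invariance property can be illustrated with a toy scheme. The slides do not define Enck and Encq, so the matrix-based construction below (Enck(k) = M^T k and Encq(q) = M^-1 q for a secret invertible matrix M) is purely an assumption chosen because it preserves inner products; it is not the protocol used in this work and it is not meant to be secure.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical secret key: any invertible matrix works for this toy scheme.
M = [[2.0, 1.0], [0.0, 3.0]]

def enc_k(k):
    """Toy key-vector encryption: M^T k."""
    return mat_vec(transpose(M), k)

def enc_q(q):
    """Toy query encryption: M^-1 q."""
    return mat_vec(inverse_2x2(M), q)

k = [0.6, 0.8]
q = [0.8, 0.6]
# (M^T k) . (M^-1 q) = k^T M M^-1 q = k . q
print(dot(k, q), dot(enc_k(k), enc_q(q)))
```

The server only ever sees `enc_k(k)` and `enc_q(q)`, yet it can still evaluate the query condition on their inner product.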




Encrypted vector databases
• Decryption algorithms are also shared between the owner and users.
  • Deck: A decryption algorithm for key vectors
  • Decv: A decryption algorithm for values


   • Decryption algorithms for query vectors are not necessary.


• All encryption/decryption algorithms are
  • defined by each existing protocol,
  • secret from servers.
• We do not define those algorithms in this work.




Encrypted vector databases
• Malicious cloud services cannot read any data.

  • They cannot decrypt any data.

[Figure: the database owner deploys encrypted tuples (Enck(k), Encv(v)) to the EVDB; a database user asks the EVDB to find tuples s.t. Enck(k)・Encq(q) ≧ α.]

• Cloud services also cannot optimise query processing.
  • They must compute similarities for all tuples.




The problem of existing protocols
• Cloud services (servers) must check all tuples.
• Because of encryption
   • The structure of vectors is not the same after encryption.
   • Structure-based indexing such as an R-tree cannot work well.


   • Servers also cannot cache query results, since they cannot know which
      queries are the same.


• We introduce a filtering method based on LSH.
  • We focus on the fact that similarities are unchanged
    even after encryption.
  • LSH is a compressed data structure for estimating similarities of vectors.




Locality sensitive hashing (LSH)
• Approximates similarities with a small amount of data
• LSH consists of m functions: hi (i = 1, 2, …, m)

$$h_i(v) = \begin{cases} 1 & \text{if } v \cdot b_i \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

   (bi is the base vector of function hi)
• LSH value of a vector v
   • lsh(v) = (h1(v), h2(v), …, hm(v))
• Property

$$\cos(u, v) \approx \cos\bigl(\pi\,(1 - \Pr[\mathrm{lsh}(u) = \mathrm{lsh}(v)])\bigr)$$

   • Pr[lsh(u)=lsh(v)]:
      the fraction of the m hash functions on which the two vectors u and v have the same value,
      i.e. hi(u) = hi(v)
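
A minimal sketch of these definitions follows. The slides do not say how the base vectors are chosen; drawing them from a Gaussian distribution in `make_base_vectors` is an assumption (the usual random-hyperplane construction), and all function names are introduced here for illustration.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def make_base_vectors(m, dim, seed=0):
    """Draw m random base vectors (assumed Gaussian; not specified in the slides)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(m)]

def lsh(v, bases):
    """lsh(v) = (h1(v), ..., hm(v)) with hi(v) = 1 if v . bi >= 0 else 0."""
    return tuple(1 if dot(v, b) >= 0 else 0 for b in bases)

def agreement(a, b):
    """Fraction of positions where two LSH values agree, i.e. Pr[lsh(u) = lsh(v)]."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def estimate_cosine(a, b):
    """cos(u, v) is approximated by cos(pi * (1 - Pr[lsh(u) = lsh(v)]))."""
    return math.cos(math.pi * (1.0 - agreement(a, b)))

# Two LSH values that agree on 2 of 3 positions give cos(pi / 3).
print(estimate_cosine((1, 1, 0), (1, 1, 1)))
```

Only the bit tuples are needed to estimate the similarity, which is why a server can use them without seeing plaintext vectors.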




Locality sensitive hashing (LSH)
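• e.g., with three base vectors b1, b2, b3
   • lsh(u) = (1, 1, 0)
   • lsh(v) = (1, 1, 1)
   • Pr[lsh(u) = lsh(v)] = 2/3
   • cos(u, v) ≈ cos(π(1 – 2/3)) = 1/2

[Figure: vectors u and v drawn together with the base vectors b1, b2 and b3.]
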
• The accuracy of the approximation depends on
  • the number of base vectors m
  • the distribution of target vectors
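
The dependence on m can be observed with a small experiment. This sketch assumes Gaussian random base vectors and two hand-picked vectors whose true cosine is 1/2; neither choice comes from the slides, and the printed numbers depend on the seed.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def lsh(v, bases):
    return tuple(1 if dot(v, b) >= 0 else 0 for b in bases)

def estimate_cosine(u, v, m, dim, seed=0):
    """Estimate cos(u, v) from m random base vectors (assumed Gaussian)."""
    rng = random.Random(seed)
    bases = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(m)]
    a, b = lsh(u, bases), lsh(v, bases)
    agree = sum(x == y for x, y in zip(a, b)) / m
    return math.cos(math.pi * (1.0 - agree))

dim = 8
theta = math.pi / 3  # true cosine is cos(pi / 3) = 0.5
u = [1.0] + [0.0] * (dim - 1)
v = [math.cos(theta), math.sin(theta)] + [0.0] * (dim - 2)

for m in (16, 256, 4096):
    est = estimate_cosine(u, v, m, dim)
    print(f"m={m:5d}  estimate={est:+.3f}  error={abs(est - 0.5):.3f}")
```

With few base vectors the estimate is coarse, because the agreement rate can only take m + 1 distinct values.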




Locality sensitive hashing (LSH)
• If the distribution of encrypted vectors is lopsided,
  LSH cannot distinguish those vectors efficiently.
  • To distinguish v1-v3, additional base vectors are needed.

[Figure: left, vectors v1-v3 clustered between the base vectors b1, b2 and b3; right, the same vectors with additional base vectors b4 and b5.]

   • In the worst case, the number of base vectors m = the number of tuples.

• We employ a whitening transformation to reduce the skew of the
  vector space.




Whitening transformation
• A technique to remove correlations from vectors
  • At first, compute the average vector μ and covariance matrix Σ.

$$\Sigma = E\bigl((v - \mu)(v - \mu)^{T}\bigr)$$

   • Then, decompose Σ.

$$\Sigma = \Phi \Lambda \Phi^{-1}$$

   • The whitening matrix Wk is

$$W_k = \Phi \Lambda^{-1/2}$$




Whitening transformation
• For any vector v, the whitened vector vw is

$$v_w = W_k^{T}(v - \mu)$$

• The covariance matrix of whitened vectors is

$$\begin{aligned}
E(v_w v_w^{T}) &= E\bigl(W_k^{T}(v - \mu)(v - \mu)^{T} W_k\bigr) \\
&= \Lambda^{-1/2}\Phi^{T}\,\Sigma\,\Phi\Lambda^{-1/2} = I
\end{aligned}$$

   • There are no correlations between the components of the whitened vectors.
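
The transformation can be checked numerically. The sketch below is restricted to two dimensions so that the eigendecomposition of Σ has a closed form and only the standard library is needed; the sample data and all function names are illustrative assumptions.

```python
import math

def mean_and_cov(vectors):
    """Average vector and (population) covariance entries of 2-D vectors."""
    n = len(vectors)
    mx = sum(x for x, _ in vectors) / n
    my = sum(y for _, y in vectors) / n
    sxx = sum((x - mx) ** 2 for x, _ in vectors) / n
    syy = sum((y - my) ** 2 for _, y in vectors) / n
    sxy = sum((x - mx) * (y - my) for x, y in vectors) / n
    return (mx, my), (sxx, sxy, syy)

def eig_sym_2x2(sxx, sxy, syy):
    """Eigenvalues and orthonormal eigenvectors of [[sxx, sxy], [sxy, syy]]."""
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    half_gap = math.hypot(0.5 * (sxx - syy), sxy)
    centre = 0.5 * (sxx + syy)
    lams = (centre + half_gap, centre - half_gap)
    vecs = ((math.cos(theta), math.sin(theta)),
            (-math.sin(theta), math.cos(theta)))
    return lams, vecs

def whiten(vectors):
    """Apply v_w = W^T (v - mu) with W = Phi Lambda^(-1/2)."""
    mu, (sxx, sxy, syy) = mean_and_cov(vectors)
    lams, vecs = eig_sym_2x2(sxx, sxy, syy)
    out = []
    for x, y in vectors:
        dx, dy = x - mu[0], y - mu[1]
        # Row i of W^T is eigenvector i scaled by 1 / sqrt(lambda_i).
        out.append(tuple((ex * dx + ey * dy) / math.sqrt(lam)
                         for lam, (ex, ey) in zip(lams, vecs)))
    return out

data = [(2.0, 1.0), (3.0, 4.0), (5.0, 3.0), (6.0, 7.0), (9.0, 8.0)]
_, cov_after = mean_and_cov(whiten(data))
print(cov_after)  # close to (1, 0, 1), i.e. the identity matrix
```

The data must not be degenerate (all eigenvalues of Σ must be positive), otherwise the scaling by Λ^(-1/2) is undefined.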




Applying Whitening
• Original protocol (typical EVDB protocols)
  • encrypted vector of k: Enck(k)
  • query condition of q: find Enck(k) s.t. Enck(k)・Encq(q) ≧ α

• Our proposed protocol
  • encrypted vector of k (whitening): Wk^T(Enck(k) – μ)
  • query condition of q (counter-whitening by Wk^-1):
    find Wk^T(Enck(k) – μ) s.t. Wk^T(Enck(k) – μ)・Wk^-1 Encq(q) ≧ α – μ・Encq(q)

• The following two conditions are equivalent
   • find Enck(k) s.t. Enck(k)・Encq(q) ≧ α
   • find Wk^T(Enck(k) – μ) s.t. Wk^T(Enck(k) – μ)・Wk^-1 Encq(q) ≧ α – μ・Encq(q)
   • because Wk^T(Enck(k) – μ)・Wk^-1 Encq(q) = (Enck(k) – μ)・Encq(q) = Enck(k)・Encq(q) – μ・Encq(q)




Applying Whitening
• Define wrapped algorithms:
  • Enck*(k) = Wk^T(Enck(k) – μ)
  • Encq*(q) = Wk^-1 Encq(q)
  • Deck*(ke) = Deck((Wk^T)^-1 ke + μ)


• These algorithms are shared between owner and users.
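
The wrapped algorithms can be sketched as thin wrappers around a base protocol. Everything concrete below is an assumption for illustration: the base protocol is a toy matrix scheme (not one of the existing EVDB protocols), and `W` and `MU` are arbitrary stand-ins for the whitening matrix Wk and the average vector μ.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

M = [[2.0, 1.0], [0.0, 3.0]]    # hypothetical key of the toy base protocol
W = [[0.5, -1.0], [0.25, 2.0]]  # stand-in for the whitening matrix Wk
MU = [0.3, 0.4]                 # stand-in for the average vector mu

# Toy base protocol (an assumption): it preserves inner products.
def enc_k(k): return mat_vec(transpose(M), k)
def enc_q(q): return mat_vec(inverse_2x2(M), q)
def dec_k(e): return mat_vec(inverse_2x2(transpose(M)), e)

def enc_k_star(k):
    """Enck*(k) = Wk^T (Enck(k) - mu)."""
    return mat_vec(transpose(W), [x - m for x, m in zip(enc_k(k), MU)])

def enc_q_star(q):
    """Encq*(q) = Wk^-1 Encq(q)."""
    return mat_vec(inverse_2x2(W), enc_q(q))

def dec_k_star(ke):
    """Deck*(ke) = Deck((Wk^T)^-1 ke + mu)."""
    unwhitened = mat_vec(inverse_2x2(transpose(W)), ke)
    return dec_k([x + m for x, m in zip(unwhitened, MU)])

k, q, alpha = [0.6, 0.8], [0.8, 0.6], 0.9
alpha_star = alpha - dot(MU, enc_q(q))
original_hit = dot(enc_k(k), enc_q(q)) >= alpha
wrapped_hit = dot(enc_k_star(k), enc_q_star(q)) >= alpha_star
print(original_hit, wrapped_hit, dec_k_star(enc_k_star(k)))
```

The wrapped condition gives the same answer as the original one because the whitening and counter-whitening matrices cancel inside the inner product.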




Preparing the LSH filter
• At first, servers add LSH values to all tuples
  • deployed by the database owner: (Enck*(k), Encv(v))
  • converted by the server: (lsh(Enck*(k)), Enck*(k), Encv(v))




Preparing the LSH filter
• Make groups by LSH values.

               LSH value                                         tuple
          (1, 0, …, 0)                        ((1, 0, …, 0), Enck*(k1), Encv(v1))
                                              ((1, 0, …, 0), Enck*(k2), Encv(v2))
          (1, 1, …, 0)                        ((1, 1, …, 0), Enck*(k3), Encv(v3))
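
The server-side preparation can be sketched as a dictionary keyed by LSH value. The key vectors below stand for already encrypted and whitened vectors Enck*(k), the values for Encv(v); the data, the base vectors and the function names are hypothetical.

```python
from collections import defaultdict

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def lsh(v, bases):
    return tuple(1 if dot(v, b) >= 0 else 0 for b in bases)

def build_groups(tuples, bases):
    """Group (key vector, value) tuples by the LSH value of the key vector."""
    groups = defaultdict(list)
    for key_vec, value in tuples:
        groups[lsh(key_vec, bases)].append((key_vec, value))
    return dict(groups)

bases = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical base vectors
tuples = [
    ([1.0, 2.0], "enc-v1"),
    ([3.0, 1.0], "enc-v2"),
    ([-1.0, 2.0], "enc-v3"),
]
groups = build_groups(tuples, bases)
sizes = [len(members) for members in groups.values()]
# Number of distinct LSH values, and smallest and largest group.
print(len(groups), min(sizes), max(sizes))
```

The number of groups and the spread of group sizes indicate how well the LSH values distinguish the key vectors.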




Filtering
• After receiving a query, the server computes the LSH value of the query vector, lsh(Encq*(q)).
  • The database user sends: find Enck*(k) s.t. Enck*(k)・Encq*(q) ≧ α*,
    where α* = α – μ・Encq(q)
• For each group, the server estimates the similarity between Encq*(q) and the group
  from the group's LSH value, e.g. by Pr[(1, 0, …, 0) = lsh(Encq*(q))] for the group (1, 0, …, 0).
  • If the estimated similarity < α*, skip this group.
  • If the estimated similarity ≧ α*, check the actual query condition
    for all tuples in this group, i.e. compute Enck*(k)・Encq*(q).
• We can omit computing similarities for less similar vectors.
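
The filtering steps can be sketched as follows. The vectors stand for Enck*(k) and Encq*(q), the strings for Encv(v); the data, base vectors, threshold and function names are all hypothetical, and the sketch follows the slides in comparing the estimated similarity directly with α*.

```python
import math
from collections import defaultdict

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def lsh(v, bases):
    return tuple(1 if dot(v, b) >= 0 else 0 for b in bases)

def build_groups(tuples, bases):
    groups = defaultdict(list)
    for key_vec, value in tuples:
        groups[lsh(key_vec, bases)].append((key_vec, value))
    return dict(groups)

def filtered_query(groups, bases, q_star, alpha_star):
    """Return matching values and the number of exact similarity checks."""
    q_sig = lsh(q_star, bases)
    hits, checked = [], 0
    for sig, members in groups.items():
        agree = sum(a == b for a, b in zip(sig, q_sig)) / len(sig)
        if math.cos(math.pi * (1.0 - agree)) < alpha_star:
            continue  # estimated similarity too low: skip the whole group
        for key_vec, value in members:
            checked += 1  # exact check only inside surviving groups
            if dot(key_vec, q_star) >= alpha_star:
                hits.append(value)
    return hits, checked

bases = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0], [1.0, 1.0]]
tuples = [
    ([1.0, 0.2], "v1"),
    ([0.5, 0.4], "v2"),
    ([-1.0, 0.1], "v3"),
]
groups = build_groups(tuples, bases)
hits, checked = filtered_query(groups, bases, [1.0, 0.3], alpha_star=0.8)
print(hits, checked)
```

In this toy run the group containing v3 is skipped, so one of the three exact inner products is never computed. Because the estimate is approximate, a group holding a true match could also be skipped, which is the source of false negatives.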




Summary of our methodology
• Client side
  • Use Enck*(k), Encq*(q), and Deck*(ke)
    instead of the original algorithms defined by the associated protocol.


   • Use the query condition Enck*(k)・Encq*(q) ≧ α – μ・Encq(q)


• Server side
   • Add LSH values to all tuples.
   • Filter out less similar vectors using LSH values.




Experimental evaluations
• Effectiveness of whitening transformation.


• Recall of query results.
  • Our filter uses the LSH approximation,
    so query results may contain errors.


• Query processing time.




Effectiveness of whitening transformation
• Comparing
  • how many different LSH values exist (size)
  • how many vectors have the same LSH value (min, max)

[Figure: two panels, with and without whitening transformation (the number of tuples = 10000).]




Effectiveness of whitening transformation
• Comparing
  • how many different LSH values exist (size)
  • how many vectors have the same LSH value (min, max)

[Figure: two panels, with and without whitening transformation (the number of tuples = 100000).]
  • With whitening transformation: the LSH filter can distinguish key vectors minutely.
  • Without whitening transformation: there is only one LSH value, which means the LSH filter doesn't work.




Effectiveness of whitening transformation
• Comparing
  • how many different LSH values exist (size)
  • how many vectors have the same LSH value (min, max)
  • In all cases, min. = 1

[Figure: two panels, with and without whitening transformation (the number of tuples = 100000).]
  • With whitening transformation: a bigger m provides better distinguishability.
  • Without whitening transformation: almost all vectors have the same LSH value.




Recall of query results
• Recall depends on the number of base vectors.
• More base vectors achieve higher recall.

[Figure: recall of query results (the number of tuples = 10000).]
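
The slides do not spell out how recall is computed; the sketch below assumes the usual definition, the fraction of truly matching tuples that the filtered query returns. The function name and the sample identifiers are illustrative.

```python
def recall(returned, relevant):
    """Fraction of relevant tuples that appear in the returned result."""
    relevant = set(relevant)
    if not relevant:
        return 1.0  # nothing to find, so nothing was missed
    return len(relevant & set(returned)) / len(relevant)

# Hypothetical tuple ids: the exact scan finds five, the filter returns three.
print(recall(returned=[1, 2, 3], relevant=[1, 2, 3, 4, 5]))
```

A recall below 1 corresponds to false negatives, i.e. matching tuples that sat in a skipped group.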




Query processing time
• Measure query processing times on an IPP EVDB.
  • IPP EVDB is an encrypted vector database†.
  • We omit the details of IPP EVDB and the x-axis of the following figure.
• We can reduce query processing time.

[Figure: query processing time; y-axis: time (sec), log scale; annotation: m = 128 (recall = 0.6); the number of tuples = 100000.]

†J. Kawamoto, M. Yoshikawa: Private Range Query by Perturbation and Matrix Based Encryption. In
Proc. of the 6th IEEE International Conf. on Digital Information Management, pp. 211–216. (2011)




Conclusion and future work
• We introduced a filtering method for EVDBs based on
   • locality sensitive hashing (LSH)
   • whitening transformation



• Our filter uses an approximation
  • Query results may have false negative errors
  • Applicable when users aren’t expecting perfect query results
  • We will modify our filter to increase the accuracy of query results




                                                                                      Thank you!

Mais conteúdo relacionado

Mais procurados

Connected Mobile and Web Applications with Vortex
Connected Mobile and Web Applications with VortexConnected Mobile and Web Applications with Vortex
Connected Mobile and Web Applications with VortexAngelo Corsaro
 
DDS Advanced Tutorial - OMG June 2013 Berlin Meeting
DDS Advanced Tutorial - OMG June 2013 Berlin MeetingDDS Advanced Tutorial - OMG June 2013 Berlin Meeting
DDS Advanced Tutorial - OMG June 2013 Berlin MeetingJaime Martin Losa
 
DDS over Low Bandwidth Data Links - Connext Conf London October 2014
DDS over Low Bandwidth Data Links - Connext Conf London October 2014DDS over Low Bandwidth Data Links - Connext Conf London October 2014
DDS over Low Bandwidth Data Links - Connext Conf London October 2014Jaime Martin Losa
 
ORM and distributed caching
ORM and distributed cachingORM and distributed caching
ORM and distributed cachingaragozin
 
Managing Big Data (Chapter 2, SC 11 Tutorial)
Managing Big Data (Chapter 2, SC 11 Tutorial)Managing Big Data (Chapter 2, SC 11 Tutorial)
Managing Big Data (Chapter 2, SC 11 Tutorial)Robert Grossman
 
Mcitp server administrator
Mcitp server administratorMcitp server administrator
Mcitp server administrator97148881557
 

Mais procurados (7)

Connected Mobile and Web Applications with Vortex
Connected Mobile and Web Applications with VortexConnected Mobile and Web Applications with Vortex
Connected Mobile and Web Applications with Vortex
 
DDS Advanced Tutorial - OMG June 2013 Berlin Meeting
DDS Advanced Tutorial - OMG June 2013 Berlin MeetingDDS Advanced Tutorial - OMG June 2013 Berlin Meeting
DDS Advanced Tutorial - OMG June 2013 Berlin Meeting
 
DDS over Low Bandwidth Data Links - Connext Conf London October 2014
DDS over Low Bandwidth Data Links - Connext Conf London October 2014DDS over Low Bandwidth Data Links - Connext Conf London October 2014
DDS over Low Bandwidth Data Links - Connext Conf London October 2014
 
ORM and distributed caching
ORM and distributed cachingORM and distributed caching
ORM and distributed caching
 
Managing Big Data (Chapter 2, SC 11 Tutorial)
Managing Big Data (Chapter 2, SC 11 Tutorial)Managing Big Data (Chapter 2, SC 11 Tutorial)
Managing Big Data (Chapter 2, SC 11 Tutorial)
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
 
Mcitp server administrator
Mcitp server administratorMcitp server administrator
Mcitp server administrator
 

Semelhante a A Locality Sensitive Hashing Filter for Encrypted Vector Databases

NoSQL overview #phptostart turin 11.07.2011
NoSQL overview #phptostart turin 11.07.2011NoSQL overview #phptostart turin 11.07.2011
NoSQL overview #phptostart turin 11.07.2011David Funaro
 
Lync Server 2010: High Availability [I3004]
Lync Server 2010: High Availability [I3004] Lync Server 2010: High Availability [I3004]
Lync Server 2010: High Availability [I3004] Fabrizio Volpe
 
SDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and modelsSDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and modelsKorea Sdec
 
Introducing the WSO2 Elastic Load Balancer
Introducing the WSO2 Elastic Load BalancerIntroducing the WSO2 Elastic Load Balancer
Introducing the WSO2 Elastic Load BalancerWSO2
 
An introduction to Pincaster
An introduction to PincasterAn introduction to Pincaster
An introduction to PincasterFrank Denis
 
Dynamodb Presentation
Dynamodb PresentationDynamodb Presentation
Dynamodb Presentationadvaitdeo
 
DSD-INT 2020 Scripting a Delft-FEWS configuration - Verkade
DSD-INT 2020 Scripting a Delft-FEWS configuration - VerkadeDSD-INT 2020 Scripting a Delft-FEWS configuration - Verkade
DSD-INT 2020 Scripting a Delft-FEWS configuration - VerkadeDeltares
 
OpenStack and OpenDaylight Workshop: ONUG Spring 2014
OpenStack and OpenDaylight Workshop: ONUG Spring 2014OpenStack and OpenDaylight Workshop: ONUG Spring 2014
OpenStack and OpenDaylight Workshop: ONUG Spring 2014mestery
 
Decade architecture discussion 20110311
Decade architecture discussion 20110311Decade architecture discussion 20110311
Decade architecture discussion 20110311chenlijiang
 
Pyramid: A large-scale array-oriented active storage system
Pyramid: A large-scale array-oriented active storage systemPyramid: A large-scale array-oriented active storage system
Pyramid: A large-scale array-oriented active storage systemViet-Trung TRAN
 
OrientDB the graph database
OrientDB the graph databaseOrientDB the graph database
OrientDB the graph databaseartem_orobets
 
OrientDB the graph database
OrientDB the graph databaseOrientDB the graph database
OrientDB the graph databaseArtem Orobets
 
Directory services by SAJID
Directory services by SAJIDDirectory services by SAJID
Directory services by SAJIDSajid khan
 
Fedbench - A Benchmark Suite for Federated Semantic Data Processing
Fedbench - A Benchmark Suite for Federated Semantic Data ProcessingFedbench - A Benchmark Suite for Federated Semantic Data Processing
Fedbench - A Benchmark Suite for Federated Semantic Data ProcessingPeter Haase
 
Data Lake and the rise of the microservices
Data Lake and the rise of the microservicesData Lake and the rise of the microservices
Data Lake and the rise of the microservicesBigstep
 
How Kafka Powers the World's Most Popular Vector Database System with Charles...
How Kafka Powers the World's Most Popular Vector Database System with Charles...How Kafka Powers the World's Most Popular Vector Database System with Charles...
How Kafka Powers the World's Most Popular Vector Database System with Charles...HostedbyConfluent
 

Semelhante a A Locality Sensitive Hashing Filter for Encrypted Vector Databases (20)

Couchbase 101
Couchbase 101 Couchbase 101
Couchbase 101
 
Methods of NoSQL database systems benchmarking
Methods of NoSQL database systems benchmarkingMethods of NoSQL database systems benchmarking
Methods of NoSQL database systems benchmarking
 
NoSQL overview #phptostart turin 11.07.2011
NoSQL overview #phptostart turin 11.07.2011NoSQL overview #phptostart turin 11.07.2011
NoSQL overview #phptostart turin 11.07.2011
 
Lync Server 2010: High Availability [I3004]
Lync Server 2010: High Availability [I3004] Lync Server 2010: High Availability [I3004]
Lync Server 2010: High Availability [I3004]
 
SDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and modelsSDEC2011 NoSQL concepts and models
SDEC2011 NoSQL concepts and models
 
Introducing the WSO2 Elastic Load Balancer
Introducing the WSO2 Elastic Load BalancerIntroducing the WSO2 Elastic Load Balancer
Introducing the WSO2 Elastic Load Balancer
 
An introduction to Pincaster
An introduction to PincasterAn introduction to Pincaster
An introduction to Pincaster
 
Dynamodb Presentation
Dynamodb PresentationDynamodb Presentation
Dynamodb Presentation
 
DSD-INT 2020 Scripting a Delft-FEWS configuration - Verkade
DSD-INT 2020 Scripting a Delft-FEWS configuration - VerkadeDSD-INT 2020 Scripting a Delft-FEWS configuration - Verkade
DSD-INT 2020 Scripting a Delft-FEWS configuration - Verkade
 
OpenStack and OpenDaylight Workshop: ONUG Spring 2014
OpenStack and OpenDaylight Workshop: ONUG Spring 2014OpenStack and OpenDaylight Workshop: ONUG Spring 2014
OpenStack and OpenDaylight Workshop: ONUG Spring 2014
 
Decade architecture discussion 20110311
Decade architecture discussion 20110311Decade architecture discussion 20110311
Decade architecture discussion 20110311
 
Pyramid: A large-scale array-oriented active storage system
Pyramid: A large-scale array-oriented active storage systemPyramid: A large-scale array-oriented active storage system
Pyramid: A large-scale array-oriented active storage system
 
OrientDB the graph database
OrientDB the graph databaseOrientDB the graph database
OrientDB the graph database
 
OrientDB the graph database
OrientDB the graph databaseOrientDB the graph database
OrientDB the graph database
 
Directory services by SAJID
Directory services by SAJIDDirectory services by SAJID
Directory services by SAJID
 
Fedbench - A Benchmark Suite for Federated Semantic Data Processing
Fedbench - A Benchmark Suite for Federated Semantic Data ProcessingFedbench - A Benchmark Suite for Federated Semantic Data Processing
Fedbench - A Benchmark Suite for Federated Semantic Data Processing
 
Big data stores
Big data  storesBig data  stores
Big data stores
 
No Sql
No SqlNo Sql
No Sql
 
Data Lake and the rise of the microservices
Data Lake and the rise of the microservicesData Lake and the rise of the microservices
Data Lake and the rise of the microservices
 
How Kafka Powers the World's Most Popular Vector Database System with Charles...
How Kafka Powers the World's Most Popular Vector Database System with Charles...How Kafka Powers the World's Most Popular Vector Database System with Charles...
How Kafka Powers the World's Most Popular Vector Database System with Charles...
 

A Locality Sensitive Hashing Filter for Encrypted Vector Databases

  • 1. A Locality Sensitive Hashing Filter for Encrypted Vector Databases Junpei Kawamoto (University of Tsukuba, Japan) This work is partly supported by The Nakajima Foundation
  • 2. Dec. 4, 2012 A Locality Sensitive Hashing Filter for Encrypted Vector Databases • Vector databases • A kind of database that consists of vectors and values. • e.g. a picture database, where each tuple pairs a feature vector such as (129, 251, 94, …)^T or (98, 112, 49, …)^T with a picture (value).
  • 3. Vector databases • A kind of database that consists of vectors and values. • Simply assume the schema is (k, v) • k: key vector attribute (feature vector, etc.) • v: value attribute (any data type) • Queries • Only over the key vector attribute. • Find tuples having key vectors k s.t. sim(k, q) ≥ α • q: query vector • α: threshold • We employ cosine similarity and assume all vectors are normalised.
  • 4. Vector databases • A query example. • Query vector: (129, 251, 90, …)^T • Threshold: 0.8 • [Table: tuples of (feature vector, picture (value)) with feature vectors (129, 251, 94, …)^T and (98, 112, 49, …)^T; the result shown is the tuple with feature vector (129, 251, 94, …)^T.]
  • 5. Cloud sourced vector databases • A database owner wants to deploy the database on a cloud service • To share data easily • The owner does not have to manage any servers. • [Diagram: the database owner deploys the VDB to the cloud; colleagues (database users) access it.]
  • 6. Privacy and security concerns • Can the owner and the users trust the cloud services? • Malicious services can read data in the VDB. • Malicious services can read queries from users. • The VDB might hold sensitive information. • Queries (i.e. query vectors) might also be sensitive information. • [Diagram: the database owner deploys the VDB; a database user accesses it.]
  • 7. Encrypted vector databases • All tuples are encrypted before deploying them. • Queries are also encrypted. • Malicious services cannot read any data in the EVDB. • Malicious services cannot read any queries from users. • Many approaches have been proposed. • We use those methods as basic protocols. • [Diagram: the database owner deploys the EVDB; colleagues (database users) access it.]
  • 8. Encrypted vector databases • All tuples are encrypted before deploying them. • Enc_k: an algorithm to encrypt key vectors • Enc_v: an algorithm to encrypt values • A plain tuple (k, v) is encrypted to (Enc_k(k), Enc_v(v)) • Queries are also encrypted. • Enc_q: an algorithm to encrypt query vectors • A query vector q is encrypted to Enc_q(q) • An important property of those encryption algorithms • Invariance of similarity • k・q = Enc_k(k)・Enc_q(q) (cosine similarities are the same after encryption)
  • 9. Encrypted vector databases • Decryption algorithms are also shared between the owner and users. • Dec_k: a decryption algorithm for key vectors • Dec_v: a decryption algorithm for values • Decryption algorithms for query vectors are not necessary. • All encryption/decryption algorithms are • defined by each existing protocol, • kept secret from servers. • We do not define those algorithms in this work.
  • 10. Encrypted vector databases • Malicious cloud services cannot read any data. • Cloud services also cannot optimise query processing. • They must compute similarities for all tuples. • [Diagram: the database owner deploys tuples (Enc_k(k), Enc_v(v)) to the EVDB; a database user asks it to find tuples s.t. Enc_k(k)・Enc_q(q) ≥ α; the cloud service cannot decrypt any data.]
  • 11. The problem of existing protocols • Cloud services (servers) must check all tuples. • Because of encryption • The structure of the vectors is not preserved after encryption • Structure-based indexing such as an R-tree does not work well • Servers also cannot cache query results, since they cannot know which queries are the same. • We introduce a filtering method based on LSH. • We focus on the fact that similarities are unchanged even after encryption. • LSH is a compressed data structure for estimating similarities of vectors.
  • 12. Locality sensitive hashing (LSH) • Approximate similarities with small data • LSH consists of m functions h_i (i = 1, 2, …, m): h_i(v) = 1 if v・b_i ≥ 0, and 0 otherwise (b_i is the base vector of function h_i) • LSH value of a vector v • lsh(v) = (h_1(v), h_2(v), …, h_m(v)) • Property: cos(u, v) ≈ cos(π(1 − Pr[lsh(u) = lsh(v)])) • Pr[lsh(u) = lsh(v)]: the fraction of hash functions on which u and v agree, i.e. h_i(u) = h_i(v)
  • 13. Locality sensitive hashing (LSH) • e.g. with three base vectors b_1, b_2, b_3 • lsh(u) = (1, 1, 0) • lsh(v) = (1, 1, 1) • Pr[lsh(u) = lsh(v)] = 2/3 • cos(u, v) ≈ cos(π(1 − 2/3)) = 1/2 • The accuracy of the approximation depends on • the number of base vectors m • the distribution of target vectors
  • 14. Locality sensitive hashing (LSH) • If the distribution of encrypted vectors is lopsided, LSH cannot distinguish those vectors efficiently. • [Figure: vectors v_1, v_2, v_3 lie close together relative to base vectors b_1, b_2, b_3; to distinguish v_1 to v_3, additional base vectors b_4 and b_5 are needed.] • In the worst case, the number of base vectors m = the number of tuples • We employ a whitening transformation to reduce the skew of the vector space.
  • 15. Whitening transformation • A technique to remove correlations from vectors • First, compute the average vector μ and the covariance matrix Σ: Σ = E((v − μ)(v − μ)^T) • Then, decompose Σ: Σ = ΦΛΦ^{-1} • The whitening matrix W_k is W_k = ΦΛ^{-1/2}
  • 16. Whitening transformation • For any vector v, the whitened vector v_w is v_w = W_k^T(v − μ) • The covariance matrix of whitened vectors is E(v_w v_w^T) = E(W_k^T(v − μ)(v − μ)^T W_k) = Λ^{-1/2}Φ^T Σ ΦΛ^{-1/2} = I • There are no correlations between the components of the whitened vectors.
  • 17. Applying whitening • Original protocol (typical EVDB protocols) • encrypted vector of k: Enc_k(k) • query condition of q: find Enc_k(k) s.t. Enc_k(k)・Enc_q(q) ≥ α • Our proposed protocol • encrypted vector of k (whitening): W_k^T(Enc_k(k) − μ) • query condition of q (counter whitening): find W_k^T(Enc_k(k) − μ) s.t. W_k^T(Enc_k(k) − μ)・W_k^{-1}Enc_q(q) ≥ α − μ・Enc_q(q) • The following two conditions are the same • find Enc_k(k) s.t. Enc_k(k)・Enc_q(q) ≥ α • find W_k^T(Enc_k(k) − μ) s.t. W_k^T(Enc_k(k) − μ)・W_k^{-1}Enc_q(q) ≥ α − μ・Enc_q(q)
  • 18. Applying whitening • Define wrapped algorithms: • Enc_k*(k) = W_k^T(Enc_k(k) − μ) • Enc_q*(q) = W_k^{-1}Enc_q(q) • Dec_k*(k_e) = Dec_k((W_k^T)^{-1}k_e + μ) • These algorithms are shared between the owner and users.
  • 19. Preparing the LSH filter • First, the server adds LSH values to all tuples. • [Diagram: the database owner deploys (Enc_k*(k), Enc_v(v)); the server converts each tuple to (lsh(Enc_k*(k)), Enc_k*(k), Enc_v(v)).]
  • 20. Preparing the LSH filter • Make groups by LSH values. • [Table: LSH value (1, 0, …, 0) groups the tuples ((1, 0, …, 0), Enc_k*(k_1), Enc_v(v_1)) and ((1, 0, …, 0), Enc_k*(k_2), Enc_v(v_2)); LSH value (1, 1, …, 0) groups the tuple ((1, 1, …, 0), Enc_k*(k_1), Enc_v(v_1)).]
  • 21. Filtering • After receiving a query, the server computes the LSH value of the query vector, lsh(Enc_q*(q)). • Query from the database user: find Enc_k*(k) s.t. Enc_k*(k)・Enc_q*(q) ≥ α*, where α* = α − μ・Enc_q(q). • [Table: the LSH groups (1, 0, …, 0) and (1, 1, …, 0) from the previous slide.]
  • 22. Filtering (continued) • Estimate the similarity between Enc_q*(q) and the group (1, 0, …, 0) by Pr[(1, 0, …, 0) = lsh(Enc_q*(q))].
  • 23. Filtering (continued) • If the estimated similarity < α*, skip this group.
  • 24. Filtering (continued) • Estimate the similarity between Enc_q*(q) and the group (1, 1, …, 0) by Pr[(1, 1, …, 0) = lsh(Enc_q*(q))].
  • 25. Filtering (continued) • If the estimated similarity ≥ α*, check the actual query condition for all tuples in this group.
  • 26. Filtering (continued) • For such a group, compute Enc_k*(k)・Enc_q*(q) for each tuple. • We can omit computing similarities for less similar vectors.
  • 27. Summary of our methodology • Client side • Use Enc_k*(k), Enc_q*(q), and Dec_k*(k_e) instead of the original algorithms defined by the associated protocol. • Use the query condition Enc_k*(k)・Enc_q*(q) ≥ α − μ・Enc_q(q) • Server side • Add LSH values to all tuples • Filter out less similar vectors using LSH values.
  • 28. Experimental evaluations • Effectiveness of the whitening transformation. • Recall of query results. • Our filter uses the LSH approximation, so query results have errors. • Query processing time.
  • 29. Effectiveness of whitening transformation • Comparing • how many different LSH values exist (size) • how many vectors have the same LSH value (min, max) • [Figure: with vs. without whitening transformation; the number of tuples = 10000.]
  • 30. Effectiveness of whitening transformation (the number of tuples = 100000) • With whitening transformation: the LSH filter can distinguish key vectors finely. • Without whitening transformation: there is only one LSH value, which means the LSH filter doesn't work.
  • 31. Effectiveness of whitening transformation (the number of tuples = 100000) • In all cases, min. = 1. • With whitening transformation: a bigger m provides better distinguishability. • Without whitening transformation: almost all vectors have the same LSH value.
  • 32. Recall of query results • Recall depends on the number of base vectors • More base vectors achieve higher recall. • [Figure: the number of tuples = 10000.]
  • 33. Query processing time • Measure query processing times on an IPP EVDB. • IPP EVDB is an encrypted vector database†. • We omit the details of IPP EVDB and the x-axis of the following figure. • [Figure: time (sec), log scale; the number of tuples = 100000.] †J. Kawamoto, M. Yoshikawa: Private Range Query by Perturbation and Matrix Based Encryption. In Proc. of the 6th IEEE International Conf. on Digital Information Management, pp. 211–216. (2011)
  • 34. Query processing time (continued) • We can reduce query processing time. • [Figure annotation: m = 128 (recall = 0.6); the number of tuples = 100000.]
  • 35. Conclusion and future work • We introduced a filtering methodology for EVDBs based on • locality sensitive hashing (LSH) • whitening transformation • Our filter uses an approximation • Query results may have false negative errors • Applicable when users do not expect perfect query results • We will modify our filter to increase the accuracy of query results • Thank you!
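
The LSH filter described in slides 12 to 26 can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: plain normalised vectors stand in for the whitened, encrypted vectors Enc_k*(k) and Enc_q*(q), the base vectors are drawn at random (the slides do not say how they are chosen), and every function name here (`make_base_vectors`, `lsh`, `estimate_cosine`, `build_groups`, `filtered_search`) is hypothetical.

```python
import math
import random
from collections import defaultdict


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalise(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def make_base_vectors(m, dim, seed=0):
    # Assumption: base vectors b_i are random Gaussian directions.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(m)]


def lsh(v, bases):
    # h_i(v) = 1 if v . b_i >= 0, else 0; lsh(v) = (h_1(v), ..., h_m(v)).
    return tuple(1 if dot(v, b) >= 0 else 0 for b in bases)


def estimate_cosine(sig_u, sig_v):
    # cos(u, v) ~ cos(pi * (1 - Pr[lsh(u) = lsh(v)])), where Pr is the
    # fraction of positions on which the two signatures agree.
    agree = sum(a == b for a, b in zip(sig_u, sig_v)) / len(sig_u)
    return math.cos(math.pi * (1.0 - agree))


def build_groups(vectors, bases):
    # Server-side preparation: group stored vectors by their LSH value.
    groups = defaultdict(list)
    for v in vectors:
        groups[lsh(v, bases)].append(v)
    return groups


def filtered_search(groups, q, bases, alpha):
    # Skip every group whose estimated similarity is below the threshold;
    # check the actual condition only for tuples in the remaining groups.
    sig_q = lsh(q, bases)
    hits = []
    for sig, members in groups.items():
        if estimate_cosine(sig, sig_q) < alpha:
            continue
        hits.extend(v for v in members if dot(v, q) >= alpha)
    return hits


if __name__ == "__main__":
    # Slide 13's example: signatures agreeing on 2 of 3 positions.
    print(estimate_cosine((1, 1, 0), (1, 1, 1)))  # approximately 0.5

    rng = random.Random(1)
    data = [normalise([rng.gauss(0, 1) for _ in range(8)]) for _ in range(200)]
    bases = make_base_vectors(m=16, dim=8)
    groups = build_groups(data, bases)
    result = filtered_search(groups, data[0], bases, alpha=0.8)
    print(len(result), "tuples returned")
```

Because whole groups are skipped on the basis of an estimate, the search can miss tuples that actually satisfy the condition, which matches the false negatives mentioned in the conclusion; every returned tuple, however, has passed the exact check.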