Astronomical Data Processing Using
   SciQL, an SQL Based Query
     Language for Array Data



 Ying Zhang, Bart Scheers, Martin Kersten, Milena Ivanova, Niels Nes
                          CWI Amsterdam

             ADASS XXI, Nov. 06-10, 2011, Paris, France

Why Not RDBMS?

             SQL is difficult

                No appropriate array denotations

                No functionally complete operation set

             DBMSs are slow

                Too much overhead

                Size limitations (due to BLOB representations)

                Existing foreign files

                Scale

                ...




2011-11-09                               ADASS XXI                  3
SciQL
             An array query language based on SQL:2003

             To lower the entrance fee to RDBMSs



             Distinguishing features (Kersten et al. AD ’11; Zhang et al.
             IDEAS2011):

                Arrays and tables as first class citizens of DBMSs

                Seamless integration of relational and array paradigms

                Named dimensions with constraints

                Flexible structure-based grouping




             LOFAR Transient Key Project use case

Array Definitions
                        y               null

                    3       0.0   0.0          0.0     0.0
                    2       0.0   0.0          0.0     0.0
             null                                                null
                    1       0.0   0.0          0.0     0.0
                    0       0.0   0.0          0.0     0.0
                                                             x
                            0     1            2       3
                                        null

             CREATE ARRAY A1 (
              x INT DIMENSION [0:1:4], y INT DIMENSION [0:1:4],
              v FLOAT DEFAULT 0.0);
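
             As a rough illustration of what this definition declares, the
             sketch below expands the two bounded dimensions and fills every
             cell with the DEFAULT value. It is plain Python, not SciQL; the
             helper name dimension_values is hypothetical, and treating the
             stop value as exclusive is an assumption read off the grid,
             which shows indices 0 to 3 for the range [0:1:4].

```python
def dimension_values(start, step, stop):
    """Expand a bounded [start:step:stop] range.

    Assumption: stop is exclusive, as suggested by the 0..3 axes in the figure.
    """
    return list(range(start, stop, step))


# x INT DIMENSION [0:1:4], y INT DIMENSION [0:1:4]
xs = dimension_values(0, 1, 4)
ys = dimension_values(0, 1, 4)

# v FLOAT DEFAULT 0.0: every cell inside the dimension bounds holds the default
a1 = {(x, y): 0.0 for x in xs for y in ys}

print(xs)       # index values of one dimension
print(len(a1))  # number of cells in the array
```

             Under the same assumption, a range such as [30:10:241] would
             expand to 30, 40, ..., 240.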




             Annotations on the definition above:

                x, y: dimensions, any scalar data type

                [0:1:4]: dimensional range, [(start|∗) : (step|∗) : (stop|∗)]

                v: cell values, any column data type

Array Tiling
                SELECT [x], [y], AVG(v) FROM A1
                GROUP BY A1[x:x+2][y:y+2];


                        y              null

                    3       0.0   0.0         0.0   0.0

                    2       0.0   0.0         0.0   0.0
             null                                         null
                    1       0.0   0.5         0.5   0.5

                    0       0.0   0.0         0.0   0.0
                             0     1           2     3
                                                           x
                                       null




             Anchor point: A1[x][y]

Array Tiling
                SELECT [x], [y], AVG(v) FROM A1
                GROUP BY A1[x:x+2][y:y+2];


                        y               null

                    3        0.0   0.0         0.0   0.0

                    2        0.0   0.0         0.0   0.0
             null                                          null
                    1       0.125 0.25 0.25 0.25

                    0       0.125 0.25 0.25 0.25
                              0     1           2     3
                                                            x
                                        null
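
             The tiling query can be mimicked in a few lines of plain Python
             to check the averages shown in the grid. This is only a sketch
             of the semantics, not of how a DBMS would evaluate it: the
             function name tile_avg is hypothetical, and it assumes that
             cells outside the array bounds (the null border in the figure)
             are skipped by AVG, as SQL aggregates skip NULLs.

```python
def tile_avg(cells, xs, ys, width=2, height=2):
    """Average the width x height tile anchored at every (x, y).

    Assumption: positions outside the array are skipped, like NULLs in AVG.
    """
    result = {}
    for x in xs:
        for y in ys:
            tile = [cells[(i, j)]
                    for i in range(x, x + width)
                    for j in range(y, y + height)
                    if (i, j) in cells]
            result[(x, y)] = sum(tile) / len(tile)
    return result


# A1 from the slides: all 0.0 except three cells in row y = 1
xs = ys = range(4)
a1 = {(x, y): 0.0 for x in xs for y in ys}
for x in (1, 2, 3):
    a1[(x, 1)] = 0.5

avg = tile_avg(a1, xs, ys)

# Print the result grid with y = 3 at the top, as in the figure
for y in reversed(ys):
    print(y, [avg[(x, y)] for x in xs])
```

             With the input grid from the slides, rows y = 0 and y = 1 come
             out as 0.125, 0.25, 0.25, 0.25, and the two upper rows stay 0.0.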




LOFAR Catalogue
         [Figure: the LOFAR catalogue as a multidimensional array.
          Dimensions: zone (-90 ... 90; Gray et al. 2006), meridian (0 ... 359),
          time (t1, t2, t3, t4, ...), frequency (ν1, ν2, ν3, ν4, ...),
          Stokes parameter (I, Q, U, V).
          Cell values: ra DOUBLE, decl DOUBLE, ra_err DOUBLE, decl_err DOUBLE,
          flux DOUBLE, ...]



         CREATE ARRAY LOFARsrc (
           zone INT DIMENSION[-90:1:91], mrdn INT DIMENSION[0:1:360],
           ts   TIMESTAMP DIMENSION,     freq INT DIMENSION[30:10:241],
           id   INT DIMENSION[0:1:*],    stks CHAR(1) DIMENSION
                   CHECK(stks='I' OR stks='Q' OR stks='U' OR stks='V'),
           ra DOUBLE, decl DOUBLE, ra_err DOUBLE, decl_err DOUBLE,
           flux DOUBLE, ...);

LOFAR Use Case



             Similarity of the flux of a LOFAR source at frequencies 30
             MHz and 200 MHz

                          cross-correlation of two time series


Cross-Correlation

                  idx       0       1       2       3
             F
                  val       4       3       6       2

                  idx       0       1       2
             G
                  val       1       5       7

             Cr is built one index at a time from overlapping windows of F and G:

                  Cr.idx    F window     G window     Cr.val
                    -3      F [3 : 4]    G [0 : 1]       2
                    -2      F [2 : 4]    G [0 : 2]      16
                    -1      F [1 : 4]    G [0 : 3]      47
                     0      F [0 : 3]    G [0 : 3]      61
                     1      F [0 : 2]    G [1 : 3]      41
                     2      F [0 : 1]    G [2 : 3]      28

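
             The walkthrough above can be checked with a direct
             implementation of the sum it computes. The sketch below is
             plain Python rather than SciQL, and the function name
             cross_correlate is hypothetical. It pairs G[j] with F[j - k]
             at index k, which is the pairing implied by the windows on the
             slides (for example F[3] with G[0] at Cr.idx = -3).

```python
def cross_correlate(f, g):
    """Cr[k] = sum over j of f[j - k] * g[j], for k in -(len(f) - 1) .. len(g) - 1."""
    cr = {}
    for k in range(-len(f) + 1, len(g)):
        total = 0
        for j, g_val in enumerate(g):
            i = j - k  # index into f that overlaps g[j] at this lag
            if 0 <= i < len(f):
                total += f[i] * g_val
        cr[k] = total
    return cr


F = [4, 3, 6, 2]
G = [1, 5, 7]
cr = cross_correlate(F, G)

for k in sorted(cr):
    print(k, cr[k])
```

             For the F and G of the slides this yields 2, 16, 47, 61, 41, 28
             for the indices -3 to 2.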
LOFAR Use Case
         -- number of samples in each time series (source id 11, Stokes I)
         DECLARE fcnt INT, gcnt INT;
         SET fcnt = SELECT COUNT(*) FROM LOFARsrc[*][*][*][30][11]['I'];
         SET gcnt = SELECT COUNT(*) FROM LOFARsrc[*][*][*][200][11]['I'];

         -- the two time series, at 30 MHz and 200 MHz
         CREATE ARRAY VIEW F (idx INT DIMENSION[0:1:fcnt], flux DOUBLE DEFAULT 0.0) AS SELECT flux FROM
            LOFARsrc[*][*][*][30][11]['I'];
         CREATE ARRAY VIEW G (idx INT DIMENSION[0:1:gcnt], flux DOUBLE DEFAULT 0.0) AS SELECT flux FROM
            LOFARsrc[*][*][*][200][11]['I'];

         -- one result cell per index; each cell sums the overlapping windows of F and G
         CREATE ARRAY CrCorr30_200 (idx INT DIMENSION[-fcnt+1:1:gcnt], val DOUBLE DEFAULT 0.0);
         INSERT INTO CrCorr30_200 SELECT SUM(F.flux * G.flux) FROM F, G, CrCorr30_200 AS C
           GROUP BY F[MAX(0, -C.idx) : MIN(fcnt, gcnt-C.idx)], G[MAX(0, C.idx) : MIN(gcnt, fcnt+C.idx)];
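
         To see how the MAX/MIN bounds in the GROUP BY clause select the
         windows, the following sketch evaluates the same bounds in plain
         Python for small stand-in series. The function names and the
         stand-in values (taken from the cross-correlation slides, not from
         LOFAR data) are assumptions for illustration, and ranges are
         assumed to be right-open.

```python
def windows(fcnt, gcnt):
    """Yield (idx, F window, G window) using the bounds of the GROUP BY clause."""
    for k in range(-fcnt + 1, gcnt):
        f_win = slice(max(0, -k), min(fcnt, gcnt - k))
        g_win = slice(max(0, k), min(gcnt, fcnt + k))
        yield k, f_win, g_win


def cross_correlate_grouped(f, g):
    """SUM(F.flux * G.flux) over the pair of windows selected for each idx."""
    return {k: sum(a * b for a, b in zip(f[fw], g[gw]))
            for k, fw, gw in windows(len(f), len(g))}


# Stand-in time series; real input would be the flux values at 30 and 200 MHz
flux_30 = [4, 3, 6, 2]
flux_200 = [1, 5, 7]

bounds = [(k, fw.start, fw.stop, gw.start, gw.stop)
          for k, fw, gw in windows(len(flux_30), len(flux_200))]
result = cross_correlate_grouped(flux_30, flux_200)

for k, f_lo, f_hi, g_lo, g_hi in bounds:
    print(f"idx={k}: F[{f_lo}:{f_hi}] G[{g_lo}:{g_hi}] -> {result[k]}")
```

         With fcnt = 4 and gcnt = 3, the bounds give F[3:4] and G[0:1] for
         index -3 and F[0:1] and G[2:3] for index 2.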



         Annotations on the statements above:

            SET and CREATE ARRAY VIEW statements: retrieve the time series

            GROUP BY clause: dynamic grouping for every iteration

Conclusion

             SciQL: a novel query language for scientific data

                A symbiosis of relational and array paradigms

             Simplifies expression of complex scientific algorithms

             Leaves optimisation to the DBMS kernel

             Opens opportunities to enhance scientific data mining



             Under active implementation



                   www.scilens.org          www.monetdb.org




2011-11-09                             ADASS XXI                                                 21

More Related Content

More from PlanetData Network of Excellence

Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstreamDemo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstreamPlanetData Network of Excellence
 
On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingPlanetData Network of Excellence
 
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...PlanetData Network of Excellence
 
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatchLinking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatchPlanetData Network of Excellence
 
SciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMSSciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMSPlanetData Network of Excellence
 
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduceScalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReducePlanetData Network of Excellence
 
Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...PlanetData Network of Excellence
 
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsTowards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsPlanetData Network of Excellence
 
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...PlanetData Network of Excellence
 
Adaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of EndpointsAdaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of EndpointsPlanetData Network of Excellence
 
Exploring The Hubness-Related Properties of Oceanographic Sensor Data
Exploring The Hubness-Related Properties of Oceanographic Sensor DataExploring The Hubness-Related Properties of Oceanographic Sensor Data
Exploring The Hubness-Related Properties of Oceanographic Sensor DataPlanetData Network of Excellence
 

Astronomical Data Processing Using SciQL, an SQL Based Query Language for Array Data

  • 1. Astronomical Data Processing Using SciQL, an SQL Based Query Language for Array Data. Ying Zhang, Bart Scheers, Martin Kersten, Milena Ivanova, Niels Nes, CWI Amsterdam. ADASS XXI, Nov. 06-10, 2011, Paris, France.
  • 2.
  • 3. Why Not RDBMS?
      SQL is difficult
        No appropriate array denotations
        No functionally complete operation set
      DBMSs are slow
        Too much overhead
        Size limitations (due to BLOB representations)
        Existing foreign files
        Scale
        ...
      2011-11-09 ADASS XXI
  • 4. SciQL
      An array query language based on SQL:2003
      To lower the entrance fee to RDBMSs
      Distinguishing features (Kersten et al. AD ’11; Zhang et al. IDEAS2011):
        Arrays and tables as first class citizens of DBMSs
        Seamless integration of relational and array paradigms
        Named dimensions with constraints
        Flexible structure-based grouping
      LOFAR Transient Key Project use case
  • 5. Array Definitions
      CREATE ARRAY A1 (
        x INT DIMENSION [0:1:4],
        y INT DIMENSION [0:1:4],
        v FLOAT DEFAULT 0.0);
      [Figure: array A1 as a 4 x 4 grid over x = 0..3 and y = 0..3, every cell 0.0; positions outside the grid are null.]
  • 6. Array Definitions (same statement and figure as slide 5). Dimensions: any scalar data type.
  • 7. Array Definitions (same statement and figure as slide 5). Dimensional range: [(start|∗) : (step|∗) : (stop|∗)]
  • 8. Array Definitions (same statement and figure as slide 5). Cell values: any column data type.
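The range notation on slides 5 to 8 can be illustrated with a small Python sketch. It is not SciQL and not how MonetDB stores arrays; the helper names `dimension_values` and `create_array` are hypothetical. It assumes, as the 4 x 4 figure for `[0:1:4]` suggests, that the stop value is exclusive, and it only handles bounded ranges (no `*`).

```python
def dimension_values(start, step, stop):
    """Expand a bounded range [start:step:stop]; stop is treated as exclusive."""
    return list(range(start, stop, step))


def create_array(x_range, y_range, default=0.0):
    """Build a dict keyed by (x, y) in which every cell holds the default value."""
    xs = dimension_values(*x_range)
    ys = dimension_values(*y_range)
    return {(x, y): default for x in xs for y in ys}


# Mirrors CREATE ARRAY A1 (x INT DIMENSION [0:1:4], y INT DIMENSION [0:1:4],
#                          v FLOAT DEFAULT 0.0)
a1 = create_array((0, 1, 4), (0, 1, 4), default=0.0)
print(dimension_values(0, 1, 4))
print(len(a1))
```

Under the same assumption, `dimension_values(-90, 1, 91)` would run from -90 to 90, which agrees with the zone axis drawn on the LOFAR catalogue slide.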
  • 9. Array Tiling
      SELECT [x], [y], AVG(v) FROM A1
      GROUP BY A1[x:x+2][y:y+2];
      Input array A1 (positions outside the grid are null):
        y
        3 | 0.0  0.0  0.0  0.0
        2 | 0.0  0.0  0.0  0.0
        1 | 0.0  0.5  0.5  0.5
        0 | 0.0  0.0  0.0  0.0
            0    1    2    3    x
  • 10. Array Tiling (same query and input as slide 9). Anchor point: A1[x][y]
  • 15. Array Tiling (same query as slide 9). Result:
        y
        3 | 0.0    0.0   0.0   0.0
        2 | 0.0    0.0   0.0   0.0
        1 | 0.125  0.25  0.25  0.25
        0 | 0.125  0.25  0.25  0.25
            0      1     2     3     x
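The tiling query on slides 9 to 15 can be mimicked in plain Python to see where the result values come from. This is a sketch, not the SciQL implementation: the function name `tile_average` is hypothetical, and it assumes that tile positions falling outside the array are null and therefore skipped by AVG, which is what the 0.25 values in the x = 3 column of the result suggest.

```python
def tile_average(cells, xs, ys, width=2, height=2):
    """Average each width x height tile anchored at (x, y), skipping missing cells."""
    result = {}
    for x in xs:
        for y in ys:
            tile = [
                cells[(i, j)]
                for i in range(x, x + width)
                for j in range(y, y + height)
                if (i, j) in cells  # positions outside the array count as null
            ]
            result[(x, y)] = sum(tile) / len(tile)
    return result


xs = ys = range(4)
a1 = {(x, y): 0.0 for x in xs for y in ys}
for x in (1, 2, 3):
    a1[(x, 1)] = 0.5  # the non-zero row of the input grid on slide 9

avg = tile_average(a1, xs, ys)
for y in reversed(ys):
    print(y, [avg[(x, y)] for x in xs])
```

Each anchor point A1[x][y] yields one output cell, so the result has the same shape as the input.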
  • 16. LOFAR Catalogue
      [Figure: the sky divided into zones (-90 to 90; Gray et al. 2006) and meridians (0 to 359). Each cell extends along time (t1, t2, t3, t4, ...), frequency (ν1, ν2, ν3, ν4, ...) and stks (I, Q, U, V), and holds ra DOUBLE, decl DOUBLE, ra_err DOUBLE, decl_err DOUBLE, flux DOUBLE, ...]
      CREATE ARRAY LOFARsrc (
        zone INT DIMENSION[-90:1:91],
        mrdn INT DIMENSION[0:1:360],
        ts TIMESTAMP DIMENSION,
        freq INT DIMENSION[30:10:241],
        id INT DIMENSION[0:1:*],
        stks CHAR(1) DIMENSION CHECK(stks='I' OR stks='Q' OR stks='U' OR stks='V'),
        ra DOUBLE, decl DOUBLE, ra_err DOUBLE, decl_err DOUBLE, flux DOUBLE, ...);
  • 17. LOFAR Use Case (catalogue figure as on slide 16). Similarity of the flux of a LOFAR source at frequencies 30 MHz and 200 MHz: cross-correlation of two time series.
  • 18. Cross-Correlation
      F:   idx   0   1   2   3
           val   4   3   6   2
      G:   idx   0   1   2
           val   1   5   7
      Cr:  idx  -3  -2  -1   0   1   2
           val  (empty)
  • 19. Cross-Correlation, Cr.idx = -3: F[3 : 4], G[0 : 1]. Cr val so far: 2
  • 20. Cross-Correlation, Cr.idx = -2: F[2 : 4], G[0 : 2]. Cr val so far: 2 16
  • 21. Cross-Correlation, Cr.idx = -1: F[1 : 4], G[0 : 3]. Cr val so far: 2 16 47
  • 22. Cross-Correlation, Cr.idx = 0: F[0 : 3], G[0 : 3]. Cr val so far: 2 16 47 61
  • 23. Cross-Correlation, Cr.idx = 1: F[0 : 2], G[1 : 3]. Cr val so far: 2 16 47 61 41
  • 24. Cross-Correlation, Cr.idx = 2: F[0 : 1], G[2 : 3]. Cr val: 2 16 47 61 41 28
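The walk-through on slides 18 to 24 corresponds to summing F[i] * G[i + idx] over every position where the two series overlap. A minimal Python sketch of that definition follows; the function name `cross_correlate` is hypothetical and the code is only a cross-check of the slide values, not part of SciQL.

```python
def cross_correlate(f, g):
    """Return {lag: sum of f[i] * g[i + lag]} for every lag with some overlap."""
    result = {}
    for lag in range(-len(f) + 1, len(g)):
        result[lag] = sum(
            f[i] * g[i + lag]
            for i in range(len(f))
            if 0 <= i + lag < len(g)  # keep only index pairs inside both series
        )
    return result


F = [4, 3, 6, 2]
G = [1, 5, 7]
cr = cross_correlate(F, G)
print([cr[lag] for lag in sorted(cr)])
```

The lags run from -(len(F) - 1) to len(G) - 1, matching the Cr.idx values -3 to 2 used on the slides.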
  • 25. LOFAR Use Case (catalogue figure as on slide 16)
      DECLARE fcnt INT, gcnt INT;
      SET fcnt = SELECT COUNT(*) FROM LOFARsrc[*][*][*][30][11]['I'];
      SET gcnt = SELECT COUNT(*) FROM LOFARsrc[*][*][*][200][11]['I'];
      CREATE ARRAY VIEW F (idx INT DIMENSION[0:1:fcnt], flux DOUBLE DEFAULT 0.0)
        AS SELECT flux FROM LOFARsrc[*][*][*][30][11]['I'];
      CREATE ARRAY VIEW G (idx INT DIMENSION[0:1:gcnt], flux DOUBLE DEFAULT 0.0)
        AS SELECT flux FROM LOFARsrc[*][*][*][200][11]['I'];
      CREATE ARRAY CrCorr30_200 (idx INT DIMENSION[-fcnt+1:1:gcnt], val DOUBLE DEFAULT 0.0);
      INSERT INTO CrCorr30_200
        SELECT SUM(F.flux * G.flux)
        FROM F, G, CrCorr30_200 AS C
        GROUP BY F[MAX(0, -C.idx) : MIN(fcnt, gcnt-C.idx)],
                 G[MAX(0, C.idx) : MIN(gcnt, fcnt+C.idx)];
  • 26. LOFAR Use Case (same statements as slide 25). Annotation: retrieve the time series.
  • 27. LOFAR Use Case (same statements as slide 25). Annotation: dynamic grouping for every iteration.
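The MAX/MIN bounds in the GROUP BY clause can be checked against the small example from the cross-correlation slides (fcnt = 4, gcnt = 3). The Python sketch below is an illustration only: `windows` and `cross_correlate_by_windows` are hypothetical names, and it assumes the upper bound of each slice is exclusive, as in the F[3 : 4] notation on the slides.

```python
def windows(fcnt, gcnt):
    """For each Cr.idx, return the (start, stop) slices of F and G to combine."""
    out = {}
    for idx in range(-fcnt + 1, gcnt):
        f_win = (max(0, -idx), min(fcnt, gcnt - idx))
        g_win = (max(0, idx), min(gcnt, fcnt + idx))
        out[idx] = (f_win, g_win)
    return out


def cross_correlate_by_windows(f, g):
    """Sum the element-wise products of the two windows chosen for each idx."""
    result = {}
    for idx, ((f0, f1), (g0, g1)) in windows(len(f), len(g)).items():
        result[idx] = sum(a * b for a, b in zip(f[f0:f1], g[g0:g1]))
    return result


for idx, (f_win, g_win) in windows(4, 3).items():
    print(idx, "F", f_win, "G", g_win)
print(cross_correlate_by_windows([4, 3, 6, 2], [1, 5, 7]))
```

The two windows always have the same length, so each group pairs the overlapping parts of the two time series element by element.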
  • 28. Conclusion
      SciQL: a novel query language for scientific data
        A symbiosis of the relational and array paradigms
        Simplifies expression of complex scientific algorithms
        Leaves optimisation to the DBMS kernel
        Opens opportunities to enhance scientific data mining
      Under active implementation
      www.scilens.org
      www.monetdb.org