O slideshow foi denunciado.
Utilizamos seu perfil e dados de atividades no LinkedIn para personalizar e exibir anúncios mais relevantes. Altere suas preferências de anúncios quando desejar.
an introduction to pinot
Jean-François Im <jfim@linkedin.com>
2016-01-04 Tue
outline
Introduction
When to use Pinot?
An overview of the Pinot architecture
Managing Data in Pinot
Data storage
Realtime...
introduction
what is pinot?
∙ Distributed near-realtime OLAP datastore
∙ Used at LinkedIn for various user-facing (“Who viewed
my profil...
what is pinot
∙ Offers a SQL query interface on top of a custom-written
data store
∙ Offers near-realtime ingestion of eve...
example of queries
SELECT
weeksSinceEpochSunday,
distinctCount(viewerId)
FROM mirrorProfileViewEvents
WHERE vieweeId = ......
example of queries
7/38
how does “who viewed my profile” work?
8/38
usage of pinot at linkedin
∙ Over 50 use cases at LinkedIn
∙ Several thousands of queries per second across
multiple data ...
when to use pinot?
design limitations
∙ Pinot is designed for analytical workloads (OLAP), not
transactional ones (OLTP)
∙ Data in Pinot is i...
when to use pinot?
∙ When you have an analytics problem (How many of “x”
happened?)
∙ When you have many queries per day a...
an overview of the pinot
architecture
controller, broker and server
∙ There are three components in Pinot: Controller, broker
and server
∙ Controller: Handles c...
controller, broker and server
15/38
controller, broker and server
∙ All of these components are redundant, so there is no
single point of failure by design
∙ ...
managing data in pinot
getting data into pinot
∙ Let’s first look at the offline case. We have data in
Hadoop that we would like to get into Pinot....
getting data into pinot
∙ Data in pinot is packaged into segments, which contain
a set of rows
∙ These are then uploaded i...
getting data into pinot
∙ A segment is a pre-built index over this set of rows
∙ Data in Pinot is stored in columnar forma...
getting data into pinot
∙ Each segment file that is generated contains both the
minimum and maximum timestamp contained in ...
getting data into pinot
∙ Data uploaded into Pinot is stored on a segment basis
∙ Uploading a segment with the same name o...
data storage
data orientation: rows and columns
∙ Most OLTP databases store data in a row-oriented
format
∙ Pinot stores its data in a ...
data orientation: rows and columns
25/38
benefits of column-orientation
∙ Queries only read the data they need (columns not
used in a query are not read)
∙ Individ...
a couple of tricks
∙ Pinot uses a couple of techniques to reduce data size
∙ Dictionary encoding allows us to deduplicate ...
realtime data in pinot
tables: offline and realtime
∙ Pinot has two kinds of tables: offline and realtime
∙ An offline table stores data that has b...
data ingestion
∙ Realtime data ingestion is done through Kafka
∙ In the open source release, there is a JSON decoder
and a...
hybrid querying
∙ Since realtime and offline tables are disjoint, how are
they queried?
∙ If an offline and realtime table h...
hybrid querying
∙ Data is partitioned according to a time column, with a
preference given to offline data
32/38
data
∙ Since there are two data sources for the same data, if
there is an issue with one (eg. Kafka/Samza issue or
Hadoop ...
retention
retention
∙ Tables in Pinot can have a customizable retention
period
∙ Segments will be expunged automatically when their
...
retention
∙ Offline and realtime tables have different retention
periods. For example, “who viewed my profile?” has a
realti...
conclusion
conclusion
∙ Pinot is a realtime distributed analytical data store that
can handle interactive analytical queries running ...
Próximos SlideShares
Carregando em…5
×

Intro to Pinot (2016-01-04)

21.185 visualizações

Publicada em

A short introduction to Linkedin's Pinot (http://github.com/linkedin/pinot)

Publicada em: Software
  • Love spell from dr.Unity brought my husband back"contact Unityspelltemple@gmail.com to help you get your Ex lover back urgently after breakup/divorce" After 12years of marriage, me and my husband has been into one quarrel or the other until he finally left me and moved to California to be with another woman. I felt my life was over and my kids thought they would never see their father again. i tried to be strong just for the kids but i could not control the pains that torments my heart, my heart was filled with sorrows and pains because i was really in love with my husband. Every day and night i think of him and always wish he could come back to me, I was really worried and i needed help, so i searched for help online and I came across a website that suggested that Dr Unity can help get ex back fast. So, I felt I should give him a try. I contacted him and he told me what to do and i did it then he did a Love spell for me. 11hours later, my husband really called me and told me that he miss me and the kids so much, So Amazing!! So that was how he came back that same day,with lots of love and joy,and he apologized for his mistake,and for the pain he caused me and the kids. Then from that day,our Marriage was now stronger than how it were before, All thanks to Dr Unity. he is so powerful and i decided to share my story on the internet that Dr.Unity is real spell caster who i will always pray to live long to help his children in the time of trouble, if you are here and you need your ex lover back or save your marriage fast. Do not cry anymore, contact this powerful spell caster Dr.Unity now. Here’s his contact: Call/WhatsApp him: +2348071622464 ,Email him at: Unityspelltemple@gmail.com , Website:https://unityspelltemples.blogspot.com ,Your kindness will never be forgotten.
       Responder 
    Tem certeza que deseja  Sim  Não
    Insira sua mensagem aqui
  • heal from broken heart" Love spell from dr.Unity brought my husband back"contact Unityspelltemple@gmail.com to help you get your Ex lover back urgently after breakup/divorce" After 12years of marriage, me and my husband has been into one quarrel or the other until he finally left me and moved to California to be with another woman. I felt my life was over and my kids thought they would never see their father again. i tried to be strong just for the kids but i could not control the pains that torments my heart, my heart was filled with sorrows and pains because i was really in love with my husband. Every day and night i think of him and always wish he could come back to me, I was really worried and i needed help, so i searched for help online and I came across a website that suggested that Dr Unity can help get ex back fast. So, I felt I should give him a try. I contacted him and he told me what to do and i did it then he did a Love spell for me. 11hours later, my husband really called me and told me that he miss me and the kids so much, So Amazing!! So that was how he came back that same day,with lots of love and joy,and he apologized for his mistake,and for the pain he caused me and the kids. Then from that day,our Marriage was now stronger than how it were before, All thanks to Dr Unity. he is so powerful and i decided to share my story on the internet that Dr.Unity is real spell caster who i will always pray to live long to help his children in the time of trouble, if you are here and you need your ex lover back or save your marriage fast. Do not cry anymore, contact this powerful spell caster Dr.Unity now. Here’s his contact: Call/WhatsApp him: +2348071622464 ,Email him at: Unityspelltemple@gmail.com , Website:https://unityspelltemples.blogspot.com ,Your kindness will never be forgotten. Natasha Wanderly form USA.
       Responder 
    Tem certeza que deseja  Sim  Não
    Insira sua mensagem aqui
  • Love spell from dr.Unity brought my husband back"contact Unityspelltemple@gmail.com to help you get your Ex lover back urgently after breakup/divorce" After 12years of marriage, me and my husband has been into one quarrel or the other until he finally left me and moved to California to be with another woman. I felt my life was over and my kids thought they would never see their father again. i tried to be strong just for the kids but i could not control the pains that torments my heart, my heart was filled with sorrows and pains because i was really in love with my husband. Every day and night i think of him and always wish he could come back to me, I was really worried and i needed help, so i searched for help online and I came across a website that suggested that Dr Unity can help get ex back fast. So, I felt I should give him a try. I contacted him and he told me what to do and i did it then he did a Love spell for me. 11hours later, my husband really called me and told me that he miss me and the kids so much, So Amazing!! So that was how he came back that same day,with lots of love and joy,and he apologized for his mistake,and for the pain he caused me and the kids. Then from that day,our Marriage was now stronger than how it were before, All thanks to Dr Unity. he is so powerful and i decided to share my story on the internet that Dr.Unity is real spell caster who i will always pray to live long to help his children in the time of trouble, if you are here and you need your ex lover back or save your marriage fast. Do not cry anymore, contact this powerful spell caster Dr.Unity now. Here’s his contact: Call/WhatsApp him: +2348071622464 ,Email him at: Unityspelltemple@gmail.com , Website:https://unityspelltemples.blogspot.com ,Your kindness will never be forgotten. Natasha Wanderly form USA.
       Responder 
    Tem certeza que deseja  Sim  Não
    Insira sua mensagem aqui
  • heal from broken heart" Love spell from dr.Unity brought my husband back"contact Unityspelltemple@gmail.com to help you get your Ex lover back urgently after breakup/divorce" After 12years of marriage, me and my husband has been into one quarrel or the other until he finally left me and moved to California to be with another woman. I felt my life was over and my kids thought they would never see their father again. i tried to be strong just for the kids but i could not control the pains that torments my heart, my heart was filled with sorrows and pains because i was really in love with my husband. Every day and night i think of him and always wish he could come back to me, I was really worried and i needed help, so i searched for help online and I came across a website that suggested that Dr Unity can help get ex back fast. So, I felt I should give him a try. I contacted him and he told me what to do and i did it then he did a Love spell for me. 11hours later, my husband really called me and told me that he miss me and the kids so much, So Amazing!! So that was how he came back that same day,with lots of love and joy,and he apologized for his mistake,and for the pain he caused me and the kids. Then from that day,our Marriage was now stronger than how it were before, All thanks to Dr Unity. he is so powerful and i decided to share my story on the internet that Dr.Unity is real spell caster who i will always pray to live long to help his children in the time of trouble, if you are here and you need your ex lover back or save your marriage fast. Do not cry anymore, contact this powerful spell caster Dr.Unity now. Here’s his contact: Call/WhatsApp him: +2348071622464 ,Email him at: Unityspelltemple@gmail.com , Website:https://unityspelltemples.blogspot.com ,Your kindness will never be forgotten.
       Responder 
    Tem certeza que deseja  Sim  Não
    Insira sua mensagem aqui
  • DOWNLOAD FULL BOOKS INTO AVAILABLE FORMAT ......................................................................................................................... ......................................................................................................................... 1.DOWNLOAD FULL PDF EBOOK here { https://tinyurl.com/y8nn3gmc } ......................................................................................................................... 1.DOWNLOAD FULL EPUB Ebook here { https://tinyurl.com/y8nn3gmc } ......................................................................................................................... 1.DOWNLOAD FULL doc Ebook here { https://tinyurl.com/y8nn3gmc } ......................................................................................................................... 1.DOWNLOAD FULL PDF EBOOK here { https://tinyurl.com/y8nn3gmc } ......................................................................................................................... 1.DOWNLOAD FULL EPUB Ebook here { https://tinyurl.com/y8nn3gmc } ......................................................................................................................... 1.DOWNLOAD FULL doc Ebook here { https://tinyurl.com/y8nn3gmc } ......................................................................................................................... ......................................................................................................................... ......................................................................................................................... .............. Browse by Genre Available eBooks ......................................................................................................................... Art, Biography, Business, Chick Lit, Children's, Christian, Classics, Comics, Contemporary, Cookbooks, Crime, Ebooks, Fantasy, Fiction, Graphic Novels, Historical Fiction, History, Horror, Humor And Comedy, Manga, Memoir, Music, Mystery, Non Fiction, Paranormal, Philosophy, Poetry, Psychology, Religion, Romance, Science, Science Fiction, Self Help, Suspense, Spirituality, Sports, Thriller, Travel, Young Adult,
       Responder 
    Tem certeza que deseja  Sim  Não
    Insira sua mensagem aqui

Intro to Pinot (2016-01-04)

  1. 1. an introduction to pinot Jean-François Im <jfim@linkedin.com> 2016-01-04 Tue
  2. 2. outline Introduction When to use Pinot? An overview of the Pinot architecture Managing Data in Pinot Data storage Realtime data in Pinot Retention Conclusion 2/38
  3. 3. introduction
  4. 4. what is pinot? ∙ Distributed near-realtime OLAP datastore ∙ Used at LinkedIn for various user-facing (“Who viewed my profile,” publisher analytics, etc.), client-facing (ad campaign creation and tracking) and internal analytics (XLNT, EasyBI, Raptor, etc.) 4/38
  5. 5. what is pinot ∙ Offers a SQL query interface on top of a custom-written data store ∙ Offers near-realtime ingestion of events from Kafka (a few seconds latency at most) ∙ Supports pushing data from Hadoop ∙ Can combine data from Hadoop and Kafka at runtime ∙ Scales horizontally and linearly if data size or query rate increases ∙ Fault tolerant (any component can fail without causing availability issues, no single point of failure) ∙ Automatic data expiration 5/38
  6. 6. example of queries SELECT weeksSinceEpochSunday, distinctCount(viewerId) FROM mirrorProfileViewEvents WHERE vieweeId = ... AND (viewerPrivacySetting = ’F’ OR ... OR viewerPrivacySetting = ’’) AND daysSinceEpoch >= 16624 AND daysSinceEpoch <= 16714 GROUP BY weeksSinceEpochSunday TOP 20 LIMIT 0 6/38
  7. 7. example of queries 7/38
  8. 8. how does “who viewed my profile” work? 8/38
  9. 9. usage of pinot at linkedin ∙ Over 50 use cases at LinkedIn ∙ Several thousands of queries per second across multiple data centers ∙ Operates 24x7, exposes metrics for production monitoring ∙ The internal de facto solution for scalable data querying 9/38
  10. 10. when to use pinot?
  11. 11. design limitations ∙ Pinot is designed for analytical workloads (OLAP), not transactional ones (OLTP) ∙ Data in Pinot is immutable (eg. no UPDATE statement), though it can be overwritten in bulk ∙ Realtime data is append-only (can only load new rows) ∙ There is no support for JOINs or subselects ∙ There are no UDFs for aggregation (work in progress) 11/38
  12. 12. when to use pinot? ∙ When you have an analytics problem (How many of “x” happened?) ∙ When you have many queries per day and require low query latency (otherwise use Hadoop for one-time ad hoc queries) ∙ When you can’t pre-aggregate data to be stored in some other storage system (otherwise use Voldemort or an OLAP cubing solution) 12/38
  13. 13. an overview of the pinot architecture
  14. 14. controller, broker and server ∙ There are three components in Pinot: Controller, broker and server ∙ Controller: Handles cluster-wide coordination using Apache Helix and Apache Zookeeper ∙ Broker: Handles query fan out and query routing to servers ∙ Server: Responds to query requests originating from the brokers 14/38
  15. 15. controller, broker and server 15/38
  16. 16. controller, broker and server ∙ All of these components are redundant, so there is no single point of failure by design ∙ Uses Zookeeper as a coordination mechanism 16/38
  17. 17. managing data in pinot
  18. 18. getting data into pinot ∙ Let’s first look at the offline case. We have data in Hadoop that we would like to get into Pinot. 18/38
  19. 19. getting data into pinot ∙ Data in pinot is packaged into segments, which contain a set of rows ∙ These are then uploaded into Pinot 19/38
  20. 20. getting data into pinot ∙ A segment is a pre-built index over this set of rows ∙ Data in Pinot is stored in columnar format (we’ll get to this later) ∙ Each input Avro file maps to one Pinot segment 20/38
  21. 21. getting data into pinot ∙ Each segment file that is generated contains both the minimum and maximum timestamp contained in the data ∙ Each segment file also has a sequential number appended to the end ∙ mirrorProfileViewEvents_2015-10-04_2015-10-04_0 ∙ mirrorProfileViewEvents_2015-10-04_2015-10-04_1 ∙ mirrorProfileViewEvents_2015-10-04_2015-10-04_2 21/38
  22. 22. getting data into pinot ∙ Data uploaded into Pinot is stored on a segment basis ∙ Uploading a segment with the same name overwrites the data that currently exists in that segment ∙ This is the only way to update data in Pinot 22/38
  23. 23. data storage
  24. 24. data orientation: rows and columns ∙ Most OLTP databases store data in a row-oriented format ∙ Pinot stores its data in a column-oriented format ∙ If you have heard the terms array of structures (AoS) and structure of arrays (SoA), this is the same idea 24/38
  25. 25. data orientation: rows and columns 25/38
  26. 26. benefits of column-orientation ∙ Queries only read the data they need (columns not used in a query are not read) ∙ Individual row lookups are slower, aggregations are faster ∙ Compression can be a lot more effective, as related data is packed together 26/38
  27. 27. a couple of tricks ∙ Pinot uses a couple of techniques to reduce data size ∙ Dictionary encoding allows us to deduplicate repetitive data in a single column (eg. country, state, gender) ∙ Bit packing allows us to pack multiple values in the same byte/word/dword 27/38
  28. 28. realtime data in pinot
  29. 29. tables: offline and realtime ∙ Pinot has two kinds of tables: offline and realtime ∙ An offline table stores data that has been pushed from Hadoop, while a realtime sources its data from Kafka ∙ These two tables are disjoint and can contain the same data 29/38
  30. 30. data ingestion ∙ Realtime data ingestion is done through Kafka ∙ In the open source release, there is a JSON decoder and an Avro decoder for messages ∙ This architecture allows plugging in new data ingestion sources (eg. other message queuing systems), though at this time there are no other sources implemented 30/38
  31. 31. hybrid querying ∙ Since realtime and offline tables are disjoint, how are they queried? ∙ If an offline and realtime table have the same name, when a broker receives a query, it rewrites it to two queries, one for the offline and one for the realtime table 31/38
  32. 32. hybrid querying ∙ Data is partitioned according to a time column, with a preference given to offline data 32/38
  33. 33. data ∙ Since there are two data sources for the same data, if there is an issue with one (eg. Kafka/Samza issue or Hadoop cluster issue), the other one is used to answer queries ∙ This means that you don’t get called in the middle of the night for data-related issues and there’s a large time window for fixing issues 33/38
  34. 34. retention
  35. 35. retention ∙ Tables in Pinot can have a customizable retention period ∙ Segments will be expunged automatically when their last timestamp is past the retention period ∙ This is done by a process called the retention manager 35/38
  36. 36. retention ∙ Offline and realtime tables have different retention periods. For example, “who viewed my profile?” has a realtime retention of seven days and an offline retention period of 90 days. ∙ This means that even if the Hadoop job doesn’t run for a couple of days, data from the realtime flow will answer the query 36/38
  37. 37. conclusion
  38. 38. conclusion ∙ Pinot is a realtime distributed analytical data store that can handle interactive analytical queries running on large amounts of data ∙ It’s used for various internal and external use-cases at LinkedIn ∙ It’s open source! (github.com/linkedin/pinot) ∙ Ping me if you want to deploy it, I’ll help you out 38/38

×