Hackconf 2016 - Writing code for thousands of servers

Nikolay Stoitsev

Very often, when we want to become better backend programmers, we try to learn different programming languages and their libraries. The problem is that Rails, Express.js, Django, and Zend Framework share more or less the same concepts. If we want to learn how to write code for large systems that scale well and recover on their own from errors and unexpected situations, we need to master another branch of human knowledge, called distributed systems. In my presentation we will see why we should dig into them and what the core principles are: consistency, availability, and partition tolerance. We will also look at steps anyone can take to learn more about the topic and keep their knowledge up to date.

Writing code for thousands of
servers
@stoitsev

Backend
or server side

MVC structure
ORM
Testing library
Database migrations
Templating library
Caching library
Translation and localization
Scaffolding
Logging
Security
Form validation

Scalability

Vertical scaling

Horizontal scaling

Distributed system
“A distributed system is a group of
independent servers that work together and
appear from the outside as a single coherent system”

120 servers
=
1 server failure per month

1200 servers
=
1 server failure every 3 days

12000 servers
=
1 server failure about every 7.5 hours
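
The arithmetic behind these slides can be sketched in a few lines. This is only an illustration, assuming (as the speaker notes do) that a single machine fails about once every 10 years, that failures are spread evenly over time, and that a month has roughly 730 hours. The names `MTBF_MONTHS`, `months_between_failures`, and `describe` are made up for this sketch.

```python
# Assumption from the speaker notes: one machine fails about once every 10 years.
MTBF_MONTHS = 10 * 12   # 120 months between failures for a single machine
HOURS_PER_MONTH = 730   # rough average, used only for unit conversion


def months_between_failures(servers: int, mtbf_months: float = MTBF_MONTHS) -> float:
    """Expected time between failures somewhere in a fleet of `servers` machines."""
    return mtbf_months / servers


def describe(servers: int) -> str:
    """Render the expected failure interval in the most readable unit."""
    months = months_between_failures(servers)
    hours = months * HOURS_PER_MONTH
    if hours >= HOURS_PER_MONTH:
        return f"{servers} servers: one failure every {months:.1f} months"
    if hours >= 24:
        return f"{servers} servers: one failure every {hours / 24:.1f} days"
    return f"{servers} servers: one failure every {hours:.1f} hours"


if __name__ == "__main__":
    for fleet in (120, 1200, 12000):
        print(describe(fleet))
```

The interval shrinks linearly with fleet size: ten times more servers means failures arrive ten times more often, which is why failure handling has to be part of the design rather than an afterthought.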

There is no Stack Overflow

Decentralized
algorithms
1. No machine has information about the state of the
whole system.
2. Each machine makes decisions based on its local
information.
3. A failure of one machine does not break the whole algorithm.
4. No global clock is assumed to exist.

Gossip based membership

1. No centralized
knowledge
2. Every node keeps its own list
3. If one machine fails,
the algorithm keeps
working
4. No global clock
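
The four properties above can be illustrated with a toy, in-process simulation of heartbeat-style gossip. This is a minimal sketch under stated assumptions, not the protocol of any particular system: `Node`, `simulate`, `FANOUT`, and `FAIL_AFTER` are hypothetical names, messages are plain function calls, and each node pushes its table to a couple of random peers per tick. Every node keeps its own list, compares timestamps only against its own local tick counter, and drops a peer it has heard no news about for a while.

```python
import random

FANOUT = 2       # hypothetical: peers contacted per tick
FAIL_AFTER = 20  # hypothetical: local ticks without news before a peer is dropped


class Node:
    def __init__(self, name, seeds):
        self.name = name
        self.clock = 0  # purely local tick counter, never compared across nodes
        # peer name -> (highest heartbeat seen, local tick when it last increased)
        self.table = {peer: (0, 0) for peer in seeds}
        self.table[name] = (0, 0)

    def tick(self):
        """Advance the local clock and bump this node's own heartbeat."""
        self.clock += 1
        beat, _ = self.table[self.name]
        self.table[self.name] = (beat + 1, self.clock)

    def digest(self):
        """The message gossiped to peers: every known heartbeat counter."""
        return {peer: beat for peer, (beat, _) in self.table.items()}

    def merge(self, digest):
        """Keep the highest heartbeat per peer, stamped with the local clock."""
        for peer, beat in digest.items():
            known, _ = self.table.get(peer, (-1, 0))
            if beat > known:
                self.table[peer] = (beat, self.clock)

    def alive(self):
        """Peers whose heartbeat increased within the last FAIL_AFTER local ticks."""
        return sorted(p for p, (_, seen) in self.table.items()
                      if self.clock - seen < FAIL_AFTER)


def simulate(names, rounds, crashed, crash_at, rng):
    """Run the gossip; every node initially knows only itself and names[0]."""
    nodes = {n: Node(n, seeds=[names[0]]) for n in names}
    for r in range(rounds):
        for node in nodes.values():
            if node.name == crashed and r >= crash_at:
                continue  # a crashed node neither ticks nor gossips
            node.tick()
            peers = [p for p in node.table if p != node.name]
            for target in rng.sample(peers, min(FANOUT, len(peers))):
                if target == crashed and r >= crash_at:
                    continue  # messages to a crashed node are lost
                nodes[target].merge(node.digest())
    return nodes


if __name__ == "__main__":
    names = ["n0", "n1", "n2", "n3", "n4"]
    nodes = simulate(names, rounds=80, crashed="n4", crash_at=20, rng=random.Random(1))
    print(nodes["n0"].alive())
```

Because peers are chosen at random, detection is probabilistic: a larger `FAIL_AFTER` makes false suspicions rarer at the cost of slower detection of real failures.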

Consistency

Availability

Replication

Network partition
Partition tolerance

100 BGN

100 BGN 100 BGN

CAP theorem

Proof
Seth Gilbert and Nancy Lynch. 2002. Brewer's conjecture and the feasibility of consistent, available,
partition-tolerant web services.

Consistency
or
Availability

Quorum

“A distributed system
is one in which the
failure of a computer
you didn't even know
existed can render
your own computer
unusable.”

Resources
https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing
https://www.goodreads.com/book/show/405614.Distributed_Systems
https://www.coursera.org/specializations/cloudcomputing
http://the-paper-trail.org/blog/consensus-protocols-paxos/
http://dl.acm.org/citation.cfm?id=564601
https://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf
http://static.googleusercontent.com/media/research.google.com/en//archive/gfs-sosp2003.pdf

Questions?

Speaker notes

Hello, everyone. I'm Niki, and I'm here today to tell you how to write code for thousands of servers…
First, a few words about me. I'm a software engineer at Uber, a large platform where I work on distributed systems that handle payments and various other aspects of working with money. Yesterday, during the first talk, Alex Todorov asked whether we do mutation testing. The answer is yes: we do it for Java and for JavaScript, because a large part of our most important systems run on JavaScript with Node.js and we want them to be extremely well tested. For Java we have a library that you can find in our GitHub organization.
I'm also part of the FMI mafia: I study there, and for several years I have been teaching lab sessions.
So, obviously, today we'll talk about the backend. Let me ask: how many of you consider yourselves frontend developers? How many of you have written some backend code? How many have run a backend with more than 100 servers? How many have gotten up at 3 a.m. because your backend broke? My presentation is quite different from the ones you have seen so far at this conference. It is fairly technical, with terms and theorems. But one thing every speaker told you is that it doesn't matter which programming language or framework you choose, as long as you understand the fundamentals. That is exactly what we'll do today: we'll look at some principles behind backend systems.
When we choose a technology we have a lot of options, and most of the time we choose based on the programming language we already know. Here we see a few examples. Let's have a little contest. How many people write Rails? Does anyone write Django? Anyone on Zend or PHP? And Spring?
No matter which modern framework we choose, they all have more or less the same features...
The first thing we want to achieve is scalability. Scalability is the ability of a system to increase the amount of work it performs, or to increase its size or capacity in order to handle that growth.
If we have an application that runs on one server and we want to scale it, we can do it in two ways.
The first is vertical scaling. We buy better hardware and start using it. This is cheaper at the beginning and much easier.
But there are three problems with vertical scaling. The first is Moore's law (the number of transistors in a processor doubles roughly every 18 months). The second is that the data on the network doubles every 12 months. The third is that human knowledge also doubles every 12 months.
That's why we'll look at the second way, horizontal scaling: the system increases its capacity by adding more servers.
This way of scaling is good because it can be done quite dynamically: you just add more machines, possibly through some web portal. The other benefit is that it makes the system more reliable, because even if one of the servers breaks, the others can keep working.
This is how we end up with a distributed system...
A distributed system is much, much harder to manage, for several reasons. The first is that things break a lot. Suppose a computer breaks once every 10 years… that is, the chance of it failing in any given month is 1 in 120.
The network breaks too. The network will go down for annoying reasons: power failures, broken hardware, someone tripping a cord, vortex to other dimensions engulfing mission-critical components, headcrabs infestation, copper theft, etc. The speed of the network itself is also unpredictable. A study by the University of Toronto and Microsoft found an average failure rate of 5.2 devices per day and 40.8 links per day, with a median time to repair of approximately five minutes (and a maximum of one week), and packet loss of 59,000 packets per failure. The first year of a new Google cluster involves:
• Five racks going wonky (40-80 machines seeing 50 percent packet loss).
• Eight network maintenance events (four of which might cause ~30-minute random connectivity losses).
• Three router failures (resulting in the need to pull traffic immediately for an hour).
Even an entire data center can go down. I have seen it happen more than once. For example, an engineer misconfigures the network and the whole network stops working. You can find plenty of examples from companies such as Amazon and Google. When we design our distributed system, it is worth thinking about how to handle such situations. We had a real case: a rack caught fire and the power was cut centrally… (the rack burned the switch on top of it, it started smoking, the fire alarm went off, and the power was shut down). In another data center, in India, people decided to open the windows.
The problem is that you can paste your Active Record query into Stack Overflow and ask why it doesn't work… but you can't ask why your system is so slow in a particular operation… you can ask some small architectural questions… but nobody on Stack Overflow will sit down and design our system so that it solves exactly our problem with our tools.
A very simple example of this protocol: three teaching assistants at a university have to grade some homework.
If we have thousands of servers and want them to work together, we can implement an algorithm in which every server knows how many other servers it is working with, and which ones. The problem is how to find out when a host has been added, or when it has failed and can no longer be reached.
Besides these principles there are also some core concepts. Everyone knows the same thing. In the previous example, consistency would mean that the system, whether 2 or 1000 nodes can answer queries, sees exactly the same amount of money in the account at a given time.
The key to high availability is redundancy.
It is cheaper not to write the data everywhere.
The whole point of partition tolerance is that the system can work with messages possibly being lost between components. The example.
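
The account example from the slides (two replicas, each holding 100 BGN) can be sketched as code to show the choice a partition forces. This is a deliberately tiny model, not how a real database behaves: `Replica`, `make_pair`, `partition`, and the `"CP"`/`"AP"` mode switch are hypothetical names invented for the illustration. While the replicas cannot talk to each other, a replica that prefers consistency refuses the request, and one that prefers availability answers from local state and lets the copies diverge.

```python
class Replica:
    """One copy of an account balance; `mode` is a hypothetical switch for this sketch."""

    def __init__(self, balance, mode):
        self.balance = balance
        self.mode = mode          # "CP" prefers consistency, "AP" prefers availability
        self.peer = None
        self.partitioned = False  # True while messages to the peer are being lost

    def withdraw(self, amount):
        if self.partitioned:
            if self.mode == "CP":
                # Refuse to answer rather than risk the replicas disagreeing.
                raise RuntimeError("unavailable: cannot reach the other replica")
            # AP: answer from local state only; the peer never hears about it.
            self.balance -= amount
            return self.balance
        # Healthy network: apply the change on both replicas.
        self.balance -= amount
        self.peer.balance -= amount
        return self.balance


def make_pair(mode, balance=100):
    """Two replicas of the same account that know about each other."""
    a, b = Replica(balance, mode), Replica(balance, mode)
    a.peer, b.peer = b, a
    return a, b


def partition(a, b):
    """Simulate a network partition between the two replicas."""
    a.partitioned = b.partitioned = True


if __name__ == "__main__":
    a, b = make_pair("AP")
    partition(a, b)
    a.withdraw(100)
    b.withdraw(100)  # the same 100 BGN is paid out twice
    print("AP balances:", a.balance, b.balance)

    c, d = make_pair("CP")
    partition(c, d)
    try:
        c.withdraw(100)
    except RuntimeError as err:
        print("CP:", err)
```

In the AP run both withdrawals succeed, so 200 BGN leaves an account that held 100; in the CP run the balance stays correct but the customer gets an error until the partition heals.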
In the real world, we can have various things like quorum systems, where we turn this 'yes/no' question into a dial we can turn to choose how much consistency we want. By raising M, the number of required nodes, up to N (the total number of nodes), you can have a fully consistent system. By giving M the value 1, you have a fully AP system, with no consistency guarantees.
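
The dial described here can be sketched as a write that only succeeds when at least M of the N replicas can acknowledge it. This is a simplified illustration with hypothetical helper names (`quorum_write`, `quorums_overlap`); real quorum systems also deal with reads, versions, and repair, none of which is modeled.

```python
def quorums_overlap(n, m):
    """Any two groups of m nodes out of n share at least one node iff 2*m > n."""
    return 2 * m > n


def quorum_write(replicas, reachable, value, m):
    """Write `value` to the reachable replicas, but only if at least m can acknowledge."""
    if len(reachable) < m:
        return False  # not enough nodes: give up availability to protect consistency
    for i in reachable:
        replicas[i] = value
    return True


if __name__ == "__main__":
    n = 5
    for m in (1, 3, 5):
        replicas = [100] * n
        # Assume a partition leaves only replicas 0, 1 and 2 reachable.
        ok = quorum_write(replicas, reachable=[0, 1, 2], value=0, m=m)
        print(f"M={m}: write accepted={ok}, overlap guaranteed={quorums_overlap(n, m)}")
```

With M = N the write is rejected as soon as a single replica is unreachable; with M = 1 it always goes through, but two sides of a partition can each accept conflicting writes. A majority (M greater than N/2) sits in between, since any two majorities share at least one node.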
Leslie Lamport, winner of the 2013 Turing Award