Building a Fast, Reliable SQL Server for kCura Relativity
© kCura LLC. All rights reserved.
Brent Ozar
• You have multiple concurrent reviewers that cost real money.
• You have dozens (or perhaps hundreds) of active cases,
with more being added all the time, without you being involved.
• Your users want the database to go as fast as possible,
but you don’t have a million-dollar budget for SQL Server.
• You’re not building a disaster recovery (DR) solution just yet,
but you want a roadmap of what it would take to do it.
About you
Tens of thousands of workspaces, or tens of terabytes in one database
Shops with only a handful of really small (1GB) workspaces
About your environment
Normal distribution graph source, licensed with Creative Commons CC BY 2.5:
https://en.wikipedia.org/wiki/Standard_deviation#/media/File:Standard_deviation_diagram.svg
Agenda
• How SQL Server responds to Relativity workloads
• What this means for SQL Server hardware
• How your users respond to SQL Server problems
• What this means for SQL Server hardware
• How you respond to outages
• What this means for SQL Server capacity management
• The simple cheat sheet that ties it all together
How your users query
and how SQL Server responds
SQL Server stores data in 8KB pages.
SELECT *
FROM dbo.Users
WHERE LastAccessDate >= '20090825'
Users send in queries.
In normal databases, we can build indexes to improve speed.
SELECT TOP 1000 *
FROM dbo.Users
WHERE Location LIKE '%chicago%'
AND DisplayName LIKE 'Bob'
AND LastAccessDate BETWEEN '20140101' AND '20140501'
ORDER BY LastAccessDate
But Relativity users run very unpredictable queries.
You can’t predict
what users will filter for,
or the order they want it in.
Even worse, users can add fields.
That means each page
can hold fewer and fewer rows.
And their search patterns change.
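
A rough sketch of why extra fields hurt scans. It assumes about 8,096 bytes of each 8KB page are usable for rows and that each row costs 2 more bytes for its slot entry, which are the figures Microsoft's table-size estimation guidance uses. The row sizes and row count are made-up examples, not Relativity measurements.

```python
import math

PAGE_USABLE_BYTES = 8096   # 8KB page minus the 96-byte page header
SLOT_ENTRY_BYTES = 2       # per-row entry in the page's row offset array
MAX_IN_ROW_BYTES = 8060    # largest row SQL Server keeps on a single page


def rows_per_page(row_size_bytes: int) -> int:
    """Estimate how many rows of a given size fit on one 8KB data page."""
    if not 0 < row_size_bytes <= MAX_IN_ROW_BYTES:
        raise ValueError("row size must be between 1 and 8060 bytes")
    return PAGE_USABLE_BYTES // (row_size_bytes + SLOT_ENTRY_BYTES)


def pages_to_scan(row_count: int, row_size_bytes: int) -> int:
    """Estimate how many pages a full table scan has to read."""
    return math.ceil(row_count / rows_per_page(row_size_bytes))


# Hypothetical Document table: 10 million rows, widening as users add fields.
for row_size in (200, 400, 800):
    pages = pages_to_scan(10_000_000, row_size)
    print(f"{row_size}-byte rows: {rows_per_page(row_size)} per page, "
          f"{pages * 8 / 1024 / 1024:.1f} GB to scan")
```

Doubling the row width roughly doubles the pages every scan has to touch, even though the row count has not changed.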
The result:
we’re constantly scanning
the whole Document table.
SQL Server Enterprise Edition helps with this.
Source: Microsoft Books Online
https://technet.microsoft.com/en-us/library/ms191475(v=sql.105).aspx
• Enterprise Edition: $7,000 per processor core.
• Thankfully, Relativity queries aren’t CPU-bottlenecked at all:
we don’t need many CPU cores. (Other apps do.)
• We need Enterprise Edition anyway to scale Relativity with multiple simultaneous reviewers
hitting tables we can’t index, and that makes the hardware decision simple.
The problem: Enterprise Edition is expensive. Really expensive.
What this means
For the “Compute” part of SQL Server hardware
• CPUs: as few processor cores as we can get.
2 sockets x 4 cores each = 8 cores x $7,000 = $56,000 of SQL Server licensing.
• Memory: as much as we can get, so we can cache these tables that users are constantly
scanning.
• Storage: the more memory we get, the less storage speed matters, because Enterprise Edition
can read ahead of what we’re scanning.
How to buy SQL Server compute power for Relativity
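
The licensing arithmetic above, as a small sketch. The $7,000 per-core price is the figure quoted in this deck; the larger core counts are hypothetical comparisons, not recommendations.

```python
PRICE_PER_CORE_USD = 7_000  # Enterprise Edition price per core quoted in this deck


def license_cost(sockets: int, cores_per_socket: int) -> int:
    """Enterprise Edition licensing cost for one server, licensed per core."""
    return sockets * cores_per_socket * PRICE_PER_CORE_USD


# The 2 x 4 configuration from the slide, next to two hypothetical larger servers.
for sockets, cores in ((2, 4), (2, 8), (2, 16)):
    print(f"{sockets} sockets x {cores} cores: ${license_cost(sockets, cores):,}")
```

Per-core licensing carries a four-core minimum per physical processor, so going below quad-core CPUs would not shrink the bill any further.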
• Dell PowerEdge R730XD or HP ProLiant DL380
– 2 quad-core CPUs
– 768GB RAM
– 2 local SSDs for TempDB
• Hardware: $20k
• SQL Server licensing: $56k
• Total: about $80k ($76k as itemized)
What that looks like
So now we have the compute. What about storage?
How your users react
when SQL Server stops responding
Reviewers are constantly active, 24/7.
When the database goes down, firms lose money and time.
Attorney work product is valuable.
The database can go down, but it had better not lose data.
Relativity is mission-critical
Your Firm
Bossy McManager
“We can’t lose data,
and we need automatic failover in a minute, max.”
What this means
For the “Storage” part of SQL Server hardware
• All user data is written to shared storage,
accessible by two or more compute nodes.
• Compute nodes watch each other: if one gets into trouble and hangs, another
node can take over automatically with no human intervention.
• Ideally, this happens in under fifteen seconds.*
(If things were really ideal, we wouldn’t be failing over. Figure a minute or two.
And we still do have one single point of failure: our shared storage.)
Failover Clustered Instance (FCI)
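
A toy model of the "nodes watch each other" idea, for intuition only. It is not how Windows failover clustering actually decides ownership; the node names, the heartbeat timestamps, and the 5-second timeout are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    last_heartbeat: float  # seconds, on a shared clock


def pick_owner(active: Node, passive: Node, now: float, timeout: float) -> Node:
    """Return the node that should own the SQL Server instance right now.

    The passive node takes over only if the active node has been silent for
    longer than `timeout` and the passive node itself is still healthy.
    """
    active_silent = now - active.last_heartbeat > timeout
    passive_healthy = now - passive.last_heartbeat <= timeout
    if active_silent and passive_healthy:
        return passive
    return active


node_a = Node("sql-a", last_heartbeat=100.0)  # hung 20 seconds before "now"
node_b = Node("sql-b", last_heartbeat=119.0)
print(pick_owner(node_a, node_b, now=120.0, timeout=5.0).name)
```

Because both nodes read the same shared storage, the takeover moves ownership of the instance rather than copying data, which is also why that storage remains the single point of failure.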
As you grow, add more building blocks, spreading cases around.
• Lets you cycle in hardware gradually over time.
• Enables manual load balancing across compute nodes.
• Does require SQL Server Enterprise Edition
(but we needed that anyway).
• Has lots of options for disaster recovery down the road.
Multi-instance Failover Clusters
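
One way to picture "spreading cases around" is a simple placement routine. This is a sketch under assumptions: the workspace names and sizes are hypothetical, and the per-block capacity is whatever limit you have settled on for a building block.

```python
def place_workspaces(sizes_gb: dict[str, int], block_capacity_gb: int) -> list[dict[str, int]]:
    """Assign workspaces to building blocks, largest first, emptiest block first.

    A new block is added whenever the emptiest existing block has no room.
    """
    blocks: list[dict[str, int]] = []
    for name, size in sorted(sizes_gb.items(), key=lambda kv: kv[1], reverse=True):
        if size > block_capacity_gb:
            raise ValueError(f"{name} does not fit in a single block")
        # The emptiest block has the most free space, so if it is full, all are.
        target = min(blocks, key=lambda b: sum(b.values()), default=None)
        if target is None or sum(target.values()) + size > block_capacity_gb:
            target = {}
            blocks.append(target)
        target[name] = size
    return blocks


cases = {"case-a": 600, "case-b": 500, "case-c": 300, "case-d": 200, "case-e": 100}
layout = place_workspaces(cases, block_capacity_gb=1000)
for number, block in enumerate(layout, start=1):
    print(f"block {number}: {sum(block.values())} GB {sorted(block)}")
```

In practice the balancing is manual, as the slide says; the routine only shows the bookkeeping involved when a new case arrives and a block is near its limit.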
How you (not users) react
when SQL Server goes bump in the night
Recovery Time Objective (RTO):
the length of time the business
is willing to be down for
an unplanned outage
Your Firm
Bossy McManager
“We can be down for an hour, but no more.”
• Mess up a patch for Windows, SQL Server, or Relativity
• Accidentally drop a table or an entire database
• Write a script that does something it’s not supposed to
• Discover database corruption
• Discover widespread database corruption affecting multiple databases
Things I’ve seen Relativity admins do
• Saturday midnight: you run a full backup.
• Sunday morning: you run a weekly CHECKDB, and it comes back OK.
• Sunday midnight: you run a full backup.
• Monday midnight: you run a full backup.
• Tuesday midnight: you run a full backup.
• Wednesday 11:00 AM: a user reports that their queries are failing.
You check the event log, and SQL Server is reporting corruption.
What do you do?
Take database corruption
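
One reading of that timeline, sketched in code: only backups taken before the last clean CHECKDB have anything vouching for them. The event labels come from the slide; the classification rule is an assumption for illustration, and it still presumes the older backup file itself is readable.

```python
# Events in chronological order: (when, kind, result). Labels follow the slide.
timeline = [
    ("Saturday midnight", "full_backup", None),
    ("Sunday morning", "checkdb", "ok"),
    ("Sunday midnight", "full_backup", None),
    ("Monday midnight", "full_backup", None),
    ("Tuesday midnight", "full_backup", None),
    ("Wednesday 11:00 AM", "corruption_reported", None),
]


def classify_backups(events):
    """Split full backups into those taken before the last clean CHECKDB
    and those taken after it, whose state nobody has checked."""
    last_clean = max(
        (i for i, (_, kind, result) in enumerate(events)
         if kind == "checkdb" and result == "ok"),
        default=-1,
    )
    before, after = [], []
    for i, (when, kind, _) in enumerate(events):
        if kind == "full_backup":
            (before if i < last_clean else after).append(when)
    return before, after


vouched_for, unknown = classify_backups(timeline)
print("Taken before the last clean CHECKDB:", vouched_for)
print("Never checked, may contain the corruption:", unknown)
```

With a weekly CHECKDB, three of the four backups fall into the second list, so the safest restore point may be days old.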
The really, really, really sad truth about corruption
• You’d do a full backup daily, and transaction logs every minute.
• You’d check for corruption every day with DBCC CHECKDB.
• You would know with confidence that every full backup was clean,
so when corruption strikes, you just have to use the most recent full
backup and the transaction logs since.
• You’d have a corner office, beautiful hair, and a 50% raise.
In a perfect world
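
The perfect-world restore, "the most recent full backup and the transaction logs since", can be sketched as a selection over a backup history. The timestamps are hypothetical, and the function is an illustration rather than anything SQL Server provides.

```python
from datetime import datetime


def restore_chain(backups, failure_time):
    """Pick the backups to restore, in order: the most recent full backup
    before the failure, then every log backup taken after that full backup.

    `backups` is a list of (finished_at, kind) tuples; kind is "full" or "log".
    """
    usable = sorted(b for b in backups if b[0] <= failure_time)
    fulls = [b for b in usable if b[1] == "full"]
    if not fulls:
        raise ValueError("no full backup available before the failure")
    last_full = fulls[-1]
    logs = [b for b in usable if b[1] == "log" and b[0] > last_full[0]]
    return [last_full] + logs


# Hypothetical history: nightly fulls, log backups every minute.
backups = [
    (datetime(2015, 10, 5, 0, 0), "full"),
    (datetime(2015, 10, 5, 0, 1), "log"),
    (datetime(2015, 10, 6, 0, 0), "full"),
    (datetime(2015, 10, 6, 0, 1), "log"),
    (datetime(2015, 10, 6, 0, 2), "log"),
    (datetime(2015, 10, 6, 0, 3), "log"),
]
chain = restore_chain(backups, failure_time=datetime(2015, 10, 6, 0, 2, 30))
for finished_at, kind in chain:
    print(finished_at, kind)
```

A real restore would also try to back up the tail of the log before starting; that step is left out here.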
Recovery Time Objective (RTO):
the length of time the business is willing to be down.
In that time window,
you have to be able to restore all of your databases,
as well as do troubleshooting and corruption repair.
Back here in reality, capacity is dictated by RTO.
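
A back-of-the-envelope sketch of how RTO turns into a capacity limit. The 15 minutes of overhead and the 400 MB/s restore speed are hypothetical inputs; the real numbers have to come from timing your own restores.

```python
def max_capacity_tb(rto_minutes: float, overhead_minutes: float,
                    restore_mb_per_sec: float) -> float:
    """Largest amount of data (in TB) that can be restored inside the RTO.

    overhead_minutes is time reserved for noticing the outage,
    troubleshooting, and any corruption repair.
    """
    restore_seconds = (rto_minutes - overhead_minutes) * 60
    if restore_seconds <= 0:
        return 0.0
    return restore_seconds * restore_mb_per_sec / 1_000_000  # MB -> TB (decimal)


# Hypothetical: one-hour RTO, 15 minutes reserved, restores measured at 400 MB/s.
print(f"{max_capacity_tb(60, 15, 400):.2f} TB per building block")
```

Every minute spent troubleshooting comes straight out of the restore budget, which is why faster alerting buys capacity too.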
Your Firm
Bossy McManager
“We can be down for an hour, but no more.”
Your Firm, said after testing how fast your restores can go
You
“That means I can only fit 1TB of cases in each of
our SQL Server building blocks.”
Your Firm
Bossy McManager
“That’s crazy. Fit in more.”
Your Firm, said like a boss
You
“Sure, but now your acceptable downtime is ____,
agreed?”
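
Filling in that blank is the same arithmetic run in the other direction. A sketch, again with a hypothetical measured restore speed of 400 MB/s and 15 minutes of troubleshooting overhead:

```python
def downtime_minutes(data_tb: float, restore_mb_per_sec: float,
                     overhead_minutes: float) -> float:
    """Estimated outage length if every database on the block must be restored."""
    restore_minutes = data_tb * 1_000_000 / restore_mb_per_sec / 60
    return restore_minutes + overhead_minutes


# Hypothetical figures: 400 MB/s measured restore speed, 15 minutes of overhead.
for tb in (1, 2, 4):
    print(f"{tb} TB per block -> about {downtime_minutes(tb, 400, 15):.0f} minutes down")
```

Under these assumptions, doubling the data on a block pushes the outage well past the one-hour target, and that is the number to put in front of management.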
• Buy one building block
• Restore your databases to it, timing how long they take
• Do performance tuning on the:
• Backup target so it can send data to SQL Server faster
• Network so it can transmit this data faster
• SQL Server so it can write the restored data/log files faster
• Monitoring systems so they give you more time to react
• Measure your restores again
• Communicate your current RTO in writing to management
The keys to making this work
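
Timing restores only helps if the results get summarized the same way each round. A minimal sketch, assuming you record the size and elapsed time of each test restore; the sample figures are invented.

```python
def restore_throughput_mb_per_sec(timings):
    """Overall restore speed from a list of (size_gb, seconds) test restores."""
    total_mb = sum(size_gb * 1000 for size_gb, _ in timings)
    total_seconds = sum(seconds for _, seconds in timings)
    return total_mb / total_seconds


# Hypothetical test restores of three workspaces, before and after tuning.
before_tuning = [(200, 1000), (50, 250), (150, 750)]
after_tuning = [(200, 500), (50, 125), (150, 375)]

for label, timings in (("before", before_tuning), ("after", after_tuning)):
    print(f"{label} tuning: {restore_throughput_mb_per_sec(timings):.0f} MB/s")
```

The after-tuning figure is the one that belongs in the written RTO statement, along with the date it was measured.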
Recap
If you only see one slide this hour, it should be this one
(Well not this one, the next one)
• Architecture: multi-instance failover cluster with shared storage.
Start with just 2 nodes, 1 of which is licensed with Enterprise Edition.
• Hardware: 2-socket servers with 4 cores per socket, 512-768GB RAM, and a pair of local
solid state drives for TempDB.
• These building blocks will make it easy for your business to scale and adapt,
without blowing big bucks on the best shared storage. Just use what you have.
SQL Server building blocks for Relativity

Building a Fast, Reliable SQL Server for kCura Relativity

  • 1. Building a Fast, Reliable SQL Server
  • 2. © kCura LLC. All rights reserved. Brent Ozar
  • 3. © kCura LLC. All rights reserved. • You have multiple concurrent reviewers that cost real money. • You have dozens (or perhaps hundreds) of active cases, with more being added all the time, without you being involved. • Your users want the database to go as fast as possible, but you don’t have a million-dollar budget for SQL Server. • You’re not building a disaster recovery (DR) solution just yet, but you want a roadmap of what it would take to do it. About you
  • 4. © kCura LLC. All rights reserved. Tens of thousands of workspaces, or tens of terabytes in one database Shops with only a handful of really small (1GB) workspaces About your environment Normal distribution graph source, licensed with Creative Commons CC BY 2.5: https://en.wikipedia.org/wiki/Standard_deviation#/media/File:Standard_deviation_diagram.svg
  • 5. © kCura LLC. All rights reserved. Agenda • How SQL Server responds to Relativity workloads • What this means for SQL Server hardware • How your users respond to SQL Server problems • What this means for SQL Server hardware • How you respond to outages • What this means for SQL Server capacity management • The simple cheat sheet that ties it all together
  • 6. © kCura LLC. All rights reserved. How your users query and how SQL Server responds
  • 7. © kCura LLC. All rights reserved. SQL Server stores data in 8KB pages.
  • 8. © kCura LLC. All rights reserved. SELECT * FROM dbo.Users WHERE LastAccessDate >= ‘8/25/09’ Users send in queries.
  • 9. © kCura LLC. All rights reserved. In normal databases, we can build indexes to improve speed.
  • 10. © kCura LLC. All rights reserved. SELECT TOP 1000 * FROM dbo.Users WHERE Location LIKE ‘%chicago%’ AND DisplayName LIKE ‘Bob’ AND LastAccessDate BETWEEN ‘2014/01/01’ AND ‘2014/05/01’ ORDER BY LastAccessDate But Relativity users run very unpredictable queries.
  • 11. © kCura LLC. All rights reserved. You can’t predict what users will filter for, or the order they want it in.
  • 12. © kCura LLC. All rights reserved. Even worse, users can add fields. That means each page can hold less and less rows. And their search patterns change.
  • 13. © kCura LLC. All rights reserved. The result: we’re constantly scanning the whole Document table.
  • 14. © kCura LLC. All rights reserved. SQL Server Enterprise Edition helps with this. Source: Microsoft Books Online https://technet.microsoft.com/en-us/library/ms191475(v=sql.105).aspx
  • 15. © kCura LLC. All rights reserved. • Enterprise Edition: $7,000 per processor core. • Thankfully, Relativity queries aren’t CPU-bottlenecked at all: we don’t need many CPU cores. (Other apps do.) • Knowing that we have to have Enterprise Edition to scale Relativity with multiple simultaneous reviewers, hitting tables we can’t index, makes the hardware decision simple. The problem: Enterprise Edition is expensive. Really expensive.
  • 16. © kCura LLC. All rights reserved. What this means For the “Compute” part of SQL Server hardware
  • 17. © kCura LLC. All rights reserved. • CPUs: as few processor cores as we can get. 2 sockets x 4 cores each = 8 cores = $56,000 of SQL licensing. • Memory: as much as we can get, so we can cache these tables that users are constantly scanning. • Storage: the more memory we get, the less storage speed matters, because Enterprise Edition can read ahead of what we’re scanning. How to buy SQL Server compute power for Relativity
  • 18. © kCura LLC. All rights reserved. • Dell PowerEdge R730XD or HP ProLiant DL380 – 2 quad-core CPUs – 768GB RAM – 2 local SSDs for TempDB • Hardware: $20k • SQL Server licensing: $56k • Total: $80k What that looks like
  • 19. © kCura LLC. All rights reserved. So now we have the compute. What about storage?
  • 20. © kCura LLC. All rights reserved. How your users react when SQL Server stops responding
  • 21. © kCura LLC. All rights reserved. Reviewers are constantly active, 24/7. When the database goes down, firms lose money and time. Attorney work product is valuable. The database can go down, but it had better not lose data. Relativity is mission-critical
  • 22. Your Firm Bossy McManager “We can’t lose data, and we need automatic failover in a minute, max.”
  • 23. © kCura LLC. All rights reserved.
  • 24. © kCura LLC. All rights reserved.
  • 25. © kCura LLC. All rights reserved.
  • 26. © kCura LLC. All rights reserved. What this means For the “Storage” part of SQL Server hardware
  • 27. © kCura LLC. All rights reserved.
  • 28. © kCura LLC. All rights reserved. • All user data is written to shared storage, accessible by two or more compute nodes. • Compute nodes watch each other: if one gets into trouble and hangs, another node can take over automatically with no human intervention. • Ideally, this happens in under fifteen seconds.* (If things were really ideal, we wouldn’t be failing over. Figure a minute or two. And we still do have one single point of failure: our shared storage.) Failover Clustered Instance (FCI)
  • 29. © kCura LLC. All rights reserved. As you grow, add more building blocks, spreading cases around.
  • 30. © kCura LLC. All rights reserved. • Lets you cycle in hardware gradually over time. • Enables manual load balancing across compute nodes. • Does require SQL Server Enterprise Edition (but we needed that anyway.) • Has lots of options for disaster recovery down the road. Multi-instance Failover Clusters
  • 31. How you (not users) react when SQL Server goes bump in the night
  • 32. Recovery Time Objective (RTO): the length of time the business is willing to be down for an unplanned outage
  • 35. Bossy McManager (Your Firm): “We can be down for an hour, but no more.”
  • 36. Things I’ve seen Relativity admins do
      • Mess up a patch for Windows, SQL Server, or Relativity
      • Accidentally drop a table or an entire database
      • Write a script that does something it’s not supposed to
      • Discover database corruption
      • Discover widespread database corruption affecting multiple databases
  • 37. Take database corruption
      • Saturday midnight: you run a full backup.
      • Sunday morning: you run a weekly CHECKDB, and it comes back OK.
      • Sunday midnight: you run a full backup.
      • Monday midnight: you run a full backup.
      • Tuesday midnight: you run a full backup.
      • Wednesday 11:00 AM: a user reports that their queries are failing. You check the event log, and SQL Server is reporting corruption.
      What do you do?
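One way to reason about this timeline is to ask which full backups were taken before the last CHECKDB that came back OK. The sketch below encodes the slide's events and splits the backups accordingly. The interpretation is an assumption on our part: a backup taken before a clean check came from a database that later checked clean, while any backup taken after that check may already contain the corruption. The names `TIMELINE` and `classify_backups` are hypothetical, and the sketch says nothing about whether a backup file itself is readable.

```python
# Events in chronological order, as listed on the slide: (label, kind, result).
TIMELINE = [
    ("Saturday midnight", "full_backup", None),
    ("Sunday morning", "checkdb", "ok"),
    ("Sunday midnight", "full_backup", None),
    ("Monday midnight", "full_backup", None),
    ("Tuesday midnight", "full_backup", None),
    ("Wednesday 11:00 AM", "corruption_reported", None),
]


def classify_backups(timeline):
    """Split full backups into those taken before the last clean CHECKDB
    and those taken after it (whose cleanliness was never verified)."""
    # Position of the most recent CHECKDB that came back OK, or -1 if none.
    last_clean_idx = max(
        (i for i, (_, kind, result) in enumerate(timeline)
         if kind == "checkdb" and result == "ok"),
        default=-1,
    )
    before_clean_check, unverified = [], []
    for i, (label, kind, _) in enumerate(timeline):
        if kind != "full_backup":
            continue
        if i < last_clean_idx:
            before_clean_check.append(label)
        else:
            unverified.append(label)
    return before_clean_check, unverified


if __name__ == "__main__":
    clean, unverified = classify_backups(TIMELINE)
    print("Taken before the last clean CHECKDB:", clean)
    print("Never verified:", unverified)
```

With a weekly check, three of the four backups in this scenario fall into the unverified group, which is the gap that daily checks would close.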
  • 38. The really, really, really sad truth about corruption
  • 39. In a perfect world
      • You’d do a full backup daily, and transaction logs every minute.
      • You’d check for corruption every day with DBCC CHECKDB.
      • You would know with confidence that every full backup was clean, so when corruption strikes, you just have to use the most recent full backup and the transaction logs since.
      • You’d have a corner office, beautiful hair, and a 50% raise.
  • 40. Back here in reality, capacity is dictated by RTO.
      Recovery Time Objective (RTO): the length of time the business is willing to be down. In that time window, you have to be able to restore all of your databases, as well as do troubleshooting and corruption repair.
  • 41. Bossy McManager (Your Firm): “We can be down for an hour, but no more.”
  • 42. You (Your Firm, said after testing how fast your restores can go): “That means I can only fit 1TB of cases in each of our SQL Server building blocks.”
  • 43. Bossy McManager (Your Firm): “That’s crazy. Fit in more.”
  • 44. You (Your Firm, said like a boss): “Sure, but now your acceptable downtime is ____, agreed?”
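The trade-off in this exchange can be sketched as simple arithmetic: the data you can hold per building block is the restore rate multiplied by whatever part of the RTO is left after troubleshooting. The restore rate (25 GB per minute) and the troubleshooting allowance (20 minutes) below are hypothetical numbers chosen so that a one-hour RTO yields the 1TB from the slide; measure your own. The function names are also hypothetical.

```python
def max_capacity_gb(rto_minutes, restore_gb_per_minute, overhead_minutes=0.0):
    """Largest amount of data (GB) that can be restored inside the RTO,
    after reserving overhead_minutes for troubleshooting and repair."""
    restore_window = rto_minutes - overhead_minutes
    if restore_window <= 0:
        return 0.0
    return restore_window * restore_gb_per_minute


def required_rto_minutes(data_gb, restore_gb_per_minute, overhead_minutes=0.0):
    """Downtime (minutes) the business must accept to host data_gb."""
    if restore_gb_per_minute <= 0:
        raise ValueError("restore rate must be positive")
    return data_gb / restore_gb_per_minute + overhead_minutes


if __name__ == "__main__":
    RATE = 25      # GB per minute: a hypothetical measured restore rate
    OVERHEAD = 20  # minutes reserved for troubleshooting: also hypothetical

    # "We can be down for an hour, but no more."
    print(max_capacity_gb(60, RATE, OVERHEAD), "GB fits in a 60-minute RTO")

    # "Fit in more." Filling in the blank for 3TB per building block:
    print(required_rto_minutes(3000, RATE, OVERHEAD), "minutes of downtime")
```

The second function is what fills in the blank in the last line of the conversation: once the restore rate is measured, more data per building block translates directly into a longer agreed downtime.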
  • 45. The keys to making this work
      • Buy one building block
      • Restore your databases to it, timing how long they take
      • Do performance tuning on the:
          • Backup target so it can send data to SQL Server faster
          • Network so it can transmit this data faster
          • SQL Server so it can write the restored data/log files faster
          • Monitoring systems so they give you more time to react
      • Measure your restores again
      • Communicate your current RTO in writing to management
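The measure, tune, and measure-again loop above can be tracked with a small helper that turns restore timings into an overall restore rate. This is a minimal sketch: the database sizes and durations are made-up sample data, and `restore_rate_gb_per_minute` is a hypothetical name, not part of any SQL Server tooling.

```python
def restore_rate_gb_per_minute(timings):
    """Overall restore rate from a list of (size_gb, minutes) measurements."""
    total_gb = sum(size_gb for size_gb, _ in timings)
    total_minutes = sum(minutes for _, minutes in timings)
    if total_minutes <= 0:
        raise ValueError("total restore time must be positive")
    return total_gb / total_minutes


if __name__ == "__main__":
    # Hypothetical timings for the same two databases, before and after tuning.
    before_tuning = [(200, 20), (300, 30)]
    after_tuning = [(200, 10), (300, 15)]

    print("Before tuning:", restore_rate_gb_per_minute(before_tuning), "GB/min")
    print("After tuning: ", restore_rate_gb_per_minute(after_tuning), "GB/min")
```

The rate from the second measurement is the number to use when you put your current RTO in writing.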
  • 46. Recap: If you only see one slide this hour, it should be this one. (Well, not this one, the next one.)
  • 47. SQL Server building blocks for Relativity
      • Architecture: multi-instance failover cluster with shared storage. Start with just 2 nodes, 1 of which is licensed with Enterprise Edition.
      • Hardware: 2-socket servers with 4 cores per socket, 512-768GB RAM, and a pair of local solid state drives for TempDB.
      • These building blocks will make it easy for your business to scale and adapt, without blowing big bucks on the best shared storage. Just use what you have.
  • 48. Let Us Know What You Think You’ll receive a short survey via email for each breakout session and the overall event. Please take a minute to tell us what you think.

Editor's Notes

  1. You want Relativity to run as fast as possible, but you don't have an unlimited budget. Microsoft Certified Master Brent Ozar will give you real-life stories from shops with high-performance environments. You’ll learn how to choose between virtual and physical servers, understand different high-availability methods, and discover why clustering is such a no-brainer.