Building a Fast, Reliable SQL Server for kCura Relativity
3. © kCura LLC. All rights reserved.
• You have multiple concurrent reviewers that cost real money.
• You have dozens (or perhaps hundreds) of active cases,
with more being added all the time, without you being involved.
• Your users want the database to go as fast as possible,
but you don’t have a million-dollar budget for SQL Server.
• You’re not building a disaster recovery (DR) solution just yet,
but you want a roadmap of what it would take to do it.
About you
4. © kCura LLC. All rights reserved.
[Figure: normal distribution graph with its two extremes labeled “Tens of thousands of workspaces, or tens of terabytes in one database” and “Shops with only a handful of really small (1GB) workspaces.”]
About your environment
Normal distribution graph source, licensed with Creative Commons CC BY 2.5:
https://en.wikipedia.org/wiki/Standard_deviation#/media/File:Standard_deviation_diagram.svg
5. © kCura LLC. All rights reserved.
Agenda
• How SQL Server responds to Relativity workloads
• What this means for SQL Server compute hardware
• How your users respond to SQL Server problems
• What this means for SQL Server storage hardware
• How you respond to outages
• What this means for SQL Server capacity management
• The simple cheat sheet that ties it all together
6. © kCura LLC. All rights reserved.
How your users query
and how SQL Server responds
7. © kCura LLC. All rights reserved.
SQL Server stores data in 8KB pages.
8. © kCura LLC. All rights reserved.
SELECT *
FROM dbo.Users
WHERE LastAccessDate >= '2009-08-25'
Users send in queries.
9. © kCura LLC. All rights reserved.
In normal databases, we can build indexes to improve speed.
10. © kCura LLC. All rights reserved.
SELECT TOP 1000 *
FROM dbo.Users
WHERE Location LIKE '%chicago%'
AND DisplayName LIKE 'Bob'
AND LastAccessDate BETWEEN '2014/01/01' AND '2014/05/01'
ORDER BY LastAccessDate
But Relativity users run very unpredictable queries.
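To see why a filter like `LIKE '%chicago%'` hurts, here is a minimal Python sketch. It is not how SQL Server implements indexes; it only models an index as a sorted list. A date-range filter can binary-search straight to the matching entries, while a leading-wildcard match has to inspect every row. The sample rows and function names are made up for illustration.

```python
from bisect import bisect_left, bisect_right

# Hypothetical sample rows: (last_access_date, location)
rows = [
    ("2014-01-15", "Chicago, IL"),
    ("2013-11-02", "New York, NY"),
    ("2014-03-09", "South Chicago Heights"),
    ("2014-06-30", "Austin, TX"),
]

# A toy "index": the rows sorted by date, plus the list of sorted keys.
by_date = sorted(rows)
date_keys = [r[0] for r in by_date]

def range_seek(lo, hi):
    """Binary-search to the first and last match; returns only matching rows."""
    start = bisect_left(date_keys, lo)
    end = bisect_right(date_keys, hi)
    return by_date[start:end]

def contains_scan(needle):
    """A leading-wildcard match has no sort order to exploit: check every row."""
    return [r for r in rows if needle.lower() in r[1].lower()]

in_range = range_seek("2014-01-01", "2014-05-01")
chicago = contains_scan("chicago")
```

The range lookup does work proportional to the number of matches, while the substring search always reads the whole table, which is the pattern the following slides describe.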
11. © kCura LLC. All rights reserved.
You can’t predict
what users will filter for,
or the order they want it in.
12. © kCura LLC. All rights reserved.
Even worse, users can add fields.
That means each page
can hold fewer and fewer rows.
And their search patterns change.
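A rough back-of-the-envelope sketch of that effect, in Python. It uses the 8KB page size with its 96-byte header and a 2-byte row-offset entry per row, and ignores fill factor, large-object storage, and row overflow. The row sizes and the 1,000,000-row table are assumptions chosen only to show the trend.

```python
import math

PAGE_BYTES = 8192        # SQL Server page size (8KB)
PAGE_HEADER_BYTES = 96   # fixed page header
SLOT_BYTES = 2           # row-offset entry per row at the end of the page

def rows_per_page(row_bytes):
    """Approximate rows per page; ignores fill factor, LOB and overflow storage."""
    usable = PAGE_BYTES - PAGE_HEADER_BYTES
    return usable // (row_bytes + SLOT_BYTES)

def pages_to_scan(total_rows, row_bytes):
    """Pages a full table scan must read for the given average row size."""
    return math.ceil(total_rows / rows_per_page(row_bytes))

# Hypothetical: a 1,000,000-document table as the average row grows.
for row_bytes in (200, 400, 800):
    print(row_bytes, rows_per_page(row_bytes), pages_to_scan(1_000_000, row_bytes))
```

Under these assumptions, doubling the average row size roughly doubles the number of pages every scan has to read.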
13. © kCura LLC. All rights reserved.
The result:
we’re constantly scanning
the whole Document table.
14. © kCura LLC. All rights reserved.
SQL Server Enterprise Edition helps with this.
Source: Microsoft Books Online
https://technet.microsoft.com/en-us/library/ms191475(v=sql.105).aspx
15. © kCura LLC. All rights reserved.
• Enterprise Edition: $7,000 per processor core.
• Thankfully, Relativity queries aren’t CPU-bottlenecked at all:
we don’t need many CPU cores. (Other apps do.)
• Scaling Relativity with multiple simultaneous reviewers, all hitting tables we can’t index, requires Enterprise Edition. Knowing that makes the hardware decision simple.
The problem: Enterprise Edition is expensive. Really expensive.
16. © kCura LLC. All rights reserved.
What this means
For the “Compute” part of SQL Server hardware
17. © kCura LLC. All rights reserved.
• CPUs: as few processor cores as we can get.
2 sockets x 4 cores each = 8 cores = $56,000 of SQL licensing.
• Memory: as much as we can get, so we can cache these tables that users are constantly scanning.
• Storage: the more memory we get, the less storage speed matters, because Enterprise Edition can read ahead of what we’re scanning.
How to buy SQL Server compute power for Relativity
18. © kCura LLC. All rights reserved.
• Dell PowerEdge R730XD or HP ProLiant DL380
– 2 quad-core CPUs
– 768GB RAM
– 2 local SSDs for TempDB
• Hardware: $20k
• SQL Server licensing: $56k
• Total: $76k (call it $80k)
What that looks like
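The arithmetic behind these numbers, as a small Python sketch. The $7,000 per-core price and the $20k hardware figure come from the slides; the 8-cores-per-socket comparison is a hypothetical added to show why fewer cores is cheaper.

```python
PRICE_PER_CORE = 7_000  # Enterprise Edition price per core, from the slides

def license_cost(sockets, cores_per_socket, price_per_core=PRICE_PER_CORE):
    """Per-core licensing: every core in the server is licensed."""
    return sockets * cores_per_socket * price_per_core

def building_block_cost(hardware_cost, sockets, cores_per_socket):
    """Hardware plus SQL Server licensing for one server."""
    return hardware_cost + license_cost(sockets, cores_per_socket)

quad = building_block_cost(20_000, sockets=2, cores_per_socket=4)
# Hypothetical comparison: same hardware price, but 8 cores per socket.
octo = building_block_cost(20_000, sockets=2, cores_per_socket=8)
```

Here `quad` comes to $76,000, while the hypothetical 16-core server costs $132,000 for the same hardware price, so licensing dominates the bill either way.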
19. © kCura LLC. All rights reserved.
So now we have the compute. What about storage?
20. © kCura LLC. All rights reserved.
How your users react
when SQL Server stops responding
21. © kCura LLC. All rights reserved.
Reviewers are constantly active, 24/7.
When the database goes down, firms lose money and time.
Attorney work product is valuable.
The database can go down, but it had better not lose data.
Relativity is mission-critical
26. © kCura LLC. All rights reserved.
What this means
For the “Storage” part of SQL Server hardware
28. © kCura LLC. All rights reserved.
• All user data is written to shared storage,
accessible by two or more compute nodes.
• Compute nodes watch each other: if one gets into trouble and hangs, another
node can take over automatically with no human intervention.
• Ideally, this happens in under fifteen seconds.*
(If things were really ideal, we wouldn’t be failing over. Figure a minute or two.
And we still do have one single point of failure: our shared storage.)
Failover Clustered Instance (FCI)
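A toy Python model of the "nodes watch each other" idea. This is not how Windows failover clustering is implemented; the node names, the heartbeat table, and the 10-second timeout are all assumptions made for illustration.

```python
def pick_owner(current_owner, last_heartbeat, now, timeout_s=10):
    """Return the node that should own the SQL Server instance.

    last_heartbeat maps node name -> time (seconds) it last reported healthy.
    """
    def healthy(node):
        return now - last_heartbeat[node] <= timeout_s

    if healthy(current_owner):
        return current_owner
    # Fail over to any other healthy node; sorted() keeps the choice deterministic.
    for node in sorted(last_heartbeat):
        if node != current_owner and healthy(node):
            return node
    return None  # no healthy node: the instance stays down

heartbeats = {"node-a": 100, "node-b": 118}
owner = pick_owner("node-a", heartbeats, now=120)
```

In this example `node-a` has been silent for 20 seconds, so ownership moves to `node-b` without anyone being paged. Both nodes read the same shared storage, which is why that storage remains the single point of failure.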
29. © kCura LLC. All rights reserved.
As you grow, add more building blocks, spreading cases around.
30. © kCura LLC. All rights reserved.
• Lets you cycle in hardware gradually over time.
• Enables manual load balancing across compute nodes.
• Does require SQL Server Enterprise Edition
(but we needed that anyway).
• Has lots of options for disaster recovery down the road.
Multi-instance Failover Clusters
31. © kCura LLC. All rights reserved.
How you (not users) react
when SQL Server goes bump in the night
32. © kCura LLC. All rights reserved.
Recovery Time Objective (RTO):
the length of time the business
is willing to be down for
an unplanned outage
36. © kCura LLC. All rights reserved.
• Mess up a patch for Windows, SQL Server, or Relativity
• Accidentally drop a table or an entire database
• Write a script that does something it’s not supposed to
• Discover database corruption
• Discover widespread database corruption affecting multiple databases
Things I’ve seen Relativity admins do
37. © kCura LLC. All rights reserved.
• Saturday midnight: you run a full backup.
• Sunday morning: you run a weekly CHECKDB, and it comes back OK.
• Sunday midnight: you run a full backup.
• Monday midnight: you run a full backup.
• Tuesday midnight: you run a full backup.
• Wednesday 11:00 AM: a user reports that their queries are failing.
You check the event log, and SQL Server is reporting corruption.
What do you do?
Take database corruption
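One way to reason about this timeline, sketched in Python. Times are hours since the Saturday-midnight backup; the hour chosen for "Sunday morning" is an assumption. The sketch also assumes that a passing CHECKDB on the live database means data backed up before it was sound. It does not verify the backup files themselves.

```python
# Hours since the Saturday-midnight backup (the 8 for "Sunday morning" is assumed).
backups = {"Sat midnight": 0, "Sun midnight": 24, "Mon midnight": 48, "Tue midnight": 72}
clean_checks = [8]          # CHECKDB passed Sunday morning
corruption_reported = 83    # Wednesday 11:00 AM

last_clean = max(clean_checks)

# A backup taken before a passing check captured a database that later tested clean.
known_clean = [name for name, t in backups.items() if t <= last_clean]
# Anything taken after the last passing check may already contain the corruption.
unverified = [name for name, t in backups.items() if t > last_clean]

# Window in which the corruption could have appeared, in hours.
exposure_hours = corruption_reported - last_clean
```

With a weekly check, only the Saturday backup is known to predate a clean result; the three newer backups could all contain the corruption, and the window of uncertainty is 75 hours.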
38. © kCura LLC. All rights reserved.
The really, really, really sad truth about corruption
39. © kCura LLC. All rights reserved.
• You’d do a full backup daily, and transaction logs every minute.
• You’d check for corruption every day with DBCC CHECKDB.
• You would know with confidence that every full backup was clean,
so when corruption strikes, you just have to use the most recent full
backup and the transaction logs since.
• You’d have a corner office, beautiful hair, and a 50% raise.
In a perfect world
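The "most recent full backup and the transaction logs since" rule can be sketched in Python as follows. Times are minutes on a shared clock and every value is hypothetical; the sketch assumes each backup is known to be clean, which is the perfect-world premise above.

```python
def restore_chain(full_backups, log_backups, failure_time):
    """Pick the latest full backup before the failure, plus every log backup after it.

    Times are minutes on a shared clock; all values are hypothetical.
    """
    usable_fulls = [t for t in full_backups if t <= failure_time]
    if not usable_fulls:
        raise ValueError("no full backup exists before the failure")
    base = max(usable_fulls)
    logs = sorted(t for t in log_backups if base < t <= failure_time)
    return base, logs

MINUTES_PER_DAY = 24 * 60
fulls = [0, MINUTES_PER_DAY]            # one full backup per day
logs = range(1, 2 * MINUTES_PER_DAY)    # one log backup every minute
base, chain = restore_chain(fulls, logs, failure_time=MINUTES_PER_DAY + 90)
```

A failure 90 minutes after the second full backup needs that full backup plus 90 log backups, and loses at most the last minute of work.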
40. © kCura LLC. All rights reserved.
Recovery Time Objective (RTO):
the length of time the business is willing to be down.
In that time window,
you have to be able to restore all of your databases,
as well as do troubleshooting and corruption repair.
Back here in reality, capacity is dictated by RTO.
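A hedged Python sketch of how RTO turns into a capacity limit. The 2-hour RTO, the 1 hour reserved for troubleshooting and repair, and the 300 MB/s restore throughput are hypothetical inputs; measure your own.

```python
def max_capacity_tb(rto_hours, overhead_hours, restore_mb_per_s):
    """Data (decimal TB) restorable within the RTO after reserving time for
    troubleshooting and repair. Assumes one sustained restore throughput."""
    restore_seconds = max(rto_hours - overhead_hours, 0) * 3600
    return restore_seconds * restore_mb_per_s / 1_000_000

# Hypothetical inputs: 2-hour RTO, 1 hour reserved, 300 MB/s measured restore speed.
capacity = max_capacity_tb(rto_hours=2, overhead_hours=1, restore_mb_per_s=300)
```

With these inputs a server can hold about 1.08TB of databases and still meet its RTO. Storing more than that means either restoring faster or agreeing to a longer outage.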
42. “That means I can only fit 1TB of cases in each of our SQL Server building blocks.”
You, Your Firm, said after testing how fast your restores can go
44. “Sure, but now your acceptable downtime is ____, agreed?”
You, Your Firm, said like a boss
45. © kCura LLC. All rights reserved.
• Buy one building block
• Restore your databases to it, timing how long they take
• Do performance tuning on the:
• Backup target so it can send data to SQL Server faster
• Network so it can transmit this data faster
• SQL Server so it can write the restored data/log files faster
• Monitoring systems so they give you more time to react
• Measure your restores again
• Communicate your current RTO in writing to management
The keys to making this work
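A small Python sketch of the measure, tune, and measure again loop. It turns timed test restores into an RTO figure you could put in writing. The database names, timings, and 30-minute troubleshooting allowance are hypothetical, and the `parallel_streams` parameter is a crude assumption that simply divides the total.

```python
def current_rto_minutes(restore_minutes, overhead_minutes, parallel_streams=1):
    """Estimate the RTO you can commit to from timed test restores.

    restore_minutes maps database name -> measured restore time.
    parallel_streams > 1 is a crude model: it simply divides the total.
    """
    total = sum(restore_minutes.values()) / parallel_streams
    return total + overhead_minutes

# Hypothetical timings before and after tuning the backup target and network.
before = {"case_a": 50, "case_b": 35, "case_c": 20}
after = {"case_a": 30, "case_b": 20, "case_c": 10}

rto_before = current_rto_minutes(before, overhead_minutes=30)
rto_after = current_rto_minutes(after, overhead_minutes=30)
```

In this made-up example tuning brings the achievable RTO from 135 minutes down to 90, and the second number is the one to communicate to management.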
46. © kCura LLC. All rights reserved.
Recap
If you only see one slide this hour, it should be this one
(Well, not this one. The next one.)
47. © kCura LLC. All rights reserved.
• Architecture: multi-instance failover cluster with shared storage.
Start with just 2 nodes, 1 of which is licensed with Enterprise Edition.
• Hardware: 2-socket servers with 4 cores per socket, 512-768GB RAM, and a pair of local solid state drives for TempDB.
• These building blocks will make it easy for your business to scale and adapt, without blowing big bucks on the best shared storage. Just use what you have.
SQL Server building blocks for Relativity
Editor's notes: You want Relativity to run as fast as possible, but you don't have an unlimited budget. Microsoft Certified Master Brent Ozar will give you real-life stories from shops with high-performance environments. You’ll learn how to choose between virtual and physical servers, understand different high-availability methods, and discover why clustering is such a no-brainer.