![]() If it does become a performance problem, then it can be dealt with then, no big deal. net, java or SSIS packages that can be supported over time are not ubiquitous, are expensive, and are in an international shortage.įor me, turning most of the ETLs that use to run in a maze of SSIS flow hell, into something where most of the data manipulation being done by TSQL through linked servers that 3-4 other staff members could work on without creating a mess, allowed me to actually have the time to work on performance tuning other things where the poor performance was not tolerable, and also deal with far less spaghettification from other staff not knowing how to put together something else better, so that it can later be appended to and modified without becoming a mess. Resources who can write at least fair quality SQL are ubiquitous. They are ok whenever you can tolerate the poor performance. I don’t agree that use of linked servers are only good for one off uses. The ‘learn a new tool’ thing works great until you end up having to support all the people who learned the new tools poorly and the problems they create. ![]() But if maybe you asked me directly, the answer might be different. If you need fast results, that’s the most surefire game in town.Īnd for the record, about that note your friend passed to me, the answer is no, I do not want to be your valentine. Otherwise, if you need data from a different SQL Server, I’ve got some tough advice: connect to the server that has the data you want. That way, you don’t risk accidentally restoring the backup over the entire production database, and you can take your time picking and choosing the specific rows you want. Pull or push just the rows you need (or all of them) across a linked server connection.Restore the database backup onto a different server. ![]() Since SQL Server still can’t restore a freakin’ table, a common approach is to: For example, say someone really screws up a table, and you need to restore that table’s contents. They’re fine for one-off utility queries, things you only have to do a couple of times. ![]() There are many more reasons linked server queries perform poorly – but those two alone are showstoppers. It’s hard on the local server, and it’s hard on the remote server that holds the single source of truth for the table. This is another great example of a development pattern that works fine in development, especially with small database sizes – but then falls down dramatically at scale, with larger databases and more concurrent queries.Įven worse, it penalizes both servers involved with the linked server query. If you run the same linked server query a dozen times – even if the rows aren’t changing, even if the database is read-only, even if all twelve queries run at the same time, SQL Server makes a dozen different connections over to the linked server and fetches the data from scratch, every single freakin’ time. Problem #2: linked servers don’t cache data. The linked server query copied the entire Posts table across the network wire, one row at a time. Eventually, the plan finishes, and here’s the actual plan: So the local SQL Server keeps on dragging rows across the network from the remote server, checking its owner one at a time, and eventually exhausts the entire content of the Posts table. Unfortunately, that assumption is not correct. (That’s really a good chunk of query tuning right there, and as I talk about in my Fundamentals of Query Tuning class, large variances between estimated and actual rows are usually the place to focus on when you’re tuning.) In some cases, when SQL Server’s assumptions match, queries perform just fine. This plan would work great – if SQL Server’s assumptions about a large number of matching Users rows were correct. The remote SQL Server sends rows, and for each row, the local SQL Server checks to see if the matching user has a low number of votes. The first thing SQL Server decided to do was a Remote Query – scanning the Posts table across the network, starting from the most recent posts first. Read the query plan from right to left, top to bottom – the thing at the top right was the first thing SQL Server chose to do: In this query, SQL Server believes that LOTS of users are going to match, so it decides to just go start fetching rows from the Posts table across the network. What two problems are causing this? Problem #1: linked server queries can copy entire tables across the network. Performance is catastrophic, taking minutes of query runtime – and no rows are even returned. I’m looking for the most recent posts written by the users with a low number of votes.
0 Comments
Leave a Reply. |