Wednesday, September 21, 2016

Filtered Indexes on SQL Server


Filtered Indexing can be a good strategy to lower the amount of data you are looking at.  Lets say you have a table, like we do, called Members which has 50 million records in it.  Lets say you know that there is an active flag in it, and that for 16 million of those records the flag is set to one and the rest are zero.  If you have a query that is grabbing the Last and first name, AND it wants Active = 1, when then this type of index might be for you:

CREATE INDEX IX_Members_byLastNameFirstName_filtered on Members
(
    LastName,
    FirstName,
    Active
)
Where (Active = 1)

Now, there are some caveats, and that's what I want to point out in this blog post.  Looking at the Index I have included a column that might not really need to be in there, but the optimizer might want it there... Or not... That comes down to testing.. And sometimes it can be really frustrating when SQL Server just doesn't use your filtered indexes for whatever reason.  There are plenty a posts out there in Blog land about these caveats so feel free to Internet Out -- 

What I want to focus on here are some important settings that need to be in place before running down this road towards the filtered index party:

SET ANSI_NULLS ON
SET ANSI_PADDING ON
SET ANSI_WARNINGS ON
SET ARITHABORT ON
SET CONCAT_NULL_YIELDS_NULL ON
SET QUOTED_IDENTIFIER ON
SET NUMERIC_ROUNDABORT OFF

Now it's possible that your sprocs may have some settings that don't fall inside these restrictions. I found several that had ANSI_NULLS off at the time, and here is what I cobbled together to get that list:



SELECT 
    a.ROUTINE_NAME, 
    a.ROUTINE_TYPE, 
    OBJECT_DEFINITION(object_id(a.ROUTINE_NAME)) as SprocText, 
    b.uses_ansi_nulls 
FROM 
    INFORMATION_SCHEMA.ROUTINES as a 
JOIN sys.sql_modules as b 
ON.routine_name = object_name(b.[object_id]) 
WHERE 
    OBJECT_DEFINITION(object_id(ROUTINE_NAME)) LIKE '%Table_With_Filtered_Index%' and b.uses_ansi_nulls = 0 


Thursday, September 15, 2016

Entity Framework Woes


I CAN HAS DYNAMIC SQLZ PLZ??

Dynamic SQL is a serious issue for Entity Framework.  Don't believe me?

Look herehere, and here for some non-answers and more frustration.  I did dig further and we tried some other things, but nothing was 'easy' and just fell into place.

I even tried the SET FMTONLY OFF, to no avail.

Whats the real issue here?  Entity framework needs to be able to scan the stored proc to get the values it returns to create a complex object.  That's all we are trying to do.  If you put the stuff in a stored procedure inside of a dynamic sql, Entity Framework can't see it, and then the Software Engineers (some of them anyway) cry.  A real solution is to build out the complex type by hand.  We all know that many computer programmers are pretty freaking lazy (myself included) so this 'by hand' term is not in our dictionary. (Another solution would be to program a thing that builds out the complex type, but i digress.)  Sometimes people let their laziness get the better of them and we get these abominations:


SELECT TOP 100
 m.Id,
 c.CompanyName AS ClientName,
 m.FirstName,
 m.LastName,
 m.MiddleName,
 m.Address1,
 m.Address2,
 m.City,
 m.Suffix,
 m.StateProvince,
 m.PostalCode,
 m.HomePhone,
 m.Email,
 m.Active,
 m.MemberShipNo
FROM Members m
WHERE (@MembershipNumber IS NULL OR (m.MemberShipNo LIKE @MembershipNumber + '%'))
 AND (@FirstName IS NULL OR (FirstName LIKE @FirstName + '%'))
 AND (@LastName IS NULL OR (LastName LIKE @LastName + '%'))
 AND (@ZipCode IS NULL OR (m.PostalCode LIKE @ZipCode + '%'))
 AND (@PhoneNumber IS NULL OR (m.HomePhone LIKE @PhoneNumber + '%'))
 AND (@EmailAddress IS NULL OR (m.Email LIKE @EmailAddress + '%'))

Don't do this because of this.  Every time you do, you murder a kitten and risk being chased by a DBA with a hatchet.

So the following is what we did -- we broke one of the statements out of the dynamic SQL and stuck it at the top of the stored procedure like so:

 SELECT DISTINCT TOP 2500  
 m.[Id],
 m.[ClientId],
 m.[MemberShipNo],
 m.[Active],
 m.[SignedAuth],
 m.[EnrollmentDate],
 m.[SSN_NIN],
 m.[DOB],
 m.[FirstName],
 m.[MiddleName],
 m.[LastName],
 m.[Address1],
 m.[City],
 m.[StateProvince],
 m.[PostalCode],                                                                                     
 m.[Country],
 m.[Void],
 m.[PackageId],
 p.[ProductLabel] AS PackageLabel,
 m.[VendorSelection],
 m.[OnlineCredit],
 m.[MinorProductID],
 r.[Id] AS CaseId
 FROM [dbo].[Members] AS m
 LEFT OUTER JOIN [dbo].[Packages] AS p ON m.[PackageId] = p.[Id]
 OUTER APPLY (SELECT TOP 1 r.[Id] 
     FROM [dbo].[Rq_FraudCase] AS r
     WHERE r.[MembersId] = m.[Id] 
     ORDER BY r.[Id] DESC) AS r
 WHERE m.[Void] = 0 AND 1=2
  END
-- Above sproc is what the Dynamic SQL After this returns

This almost got it, but now I am returning two sets of records to Entity Framework and it poops its pants. I thought why not put the 1=2 at the beginning?

IF 1=2
  BEGIN
 SELECT DISTINCT TOP 2500  
 m.[Id],
 m.[ClientId],
 m.[MemberShipNo],
 m.[Active],
 m.[SignedAuth],
 m.[EnrollmentDate],
 m.[SSN_NIN],
 m.[DOB],
 m.[FirstName],
 m.[MiddleName],
 m.[LastName],
 m.[Address1],
 m.[City],
 m.[StateProvince],
 m.[PostalCode],                                                                                     
 m.[Country],
 m.[Void],
 m.[PackageId],
 p.[ProductLabel] AS PackageLabel,
 m.[VendorSelection],
 m.[OnlineCredit],
 m.[MinorProductID],
 r.[Id] AS CaseId
 FROM [dbo].[Members] AS m
 LEFT OUTER JOIN [dbo].[Packages] AS p ON m.[PackageId] = p.[Id]
 OUTER APPLY (SELECT TOP 1 r.[Id] 
     FROM [dbo].[Rq_FraudCase] AS r
     WHERE r.[MembersId] = m.[Id] 
     ORDER BY r.[Id] DESC) AS r
 WHERE m.[Void] = 0
  END
-- Dynamic SQL below this sproc

And Entity Framework is now happy, and the Dynamic SQL runs without issue.  If someone comes along and tells me this borks the explain plan or something else horrible happens, I will investigate and change my tune.  For now, this is a good way to get a tool to allow us to run a stored proc with Dynamic SQL without needing to do tedious amounts of work to get there.

Tuesday, September 13, 2016

Interviewing Candidates


*Scowling face*
So, why do you think we should hire you?

We have recently gone through a string of interviews trying to find the perfect candidates for our company's vacant positions.  When I go through this exercise, I like to pay as much attention as possible to the what's going on and absorb as much as possible.  Interviewing is not a magic ability only endowed to a select few, it is a skill which takes practice to cultivate.

When I interview -- if there is any inkling of them doing programming I typically start off with something like the FizzBuzz problem.  What I typically see is either someone nails the FizzBuzz, or almost gets it, or they can't tie their shoes.  If they nail it or almost get it I move on to some stuff associated with their field -- whether its QA or SE etc.  I don't tend to spend too much time in their technical area as a lot of that can be branded into their brains the night before.  After that I have a short sheet with  SQL queries on it and ask the candidate to tell me if they are good or bad and why.  None of the queries are good, and they all send DBA's screaming into the forest at night.  Finally I enter into questions like the following:

  • Explain a mistake you recently made and what you did to correct it.  Also what will you do to keep from making that mistake in the future?
  • What makes a good (x)?
  • Tell me about a time when you realized something was critically wrong and what you did after that?
  • What new technologies are you excited about with regard to (their profession)?
  • What is the most innovative thing you have implemented?
  • What would you attribute your success to?
I could go on.. What I am generally trying to do here is get a gist of whether or not they can demonstrate a capacity to learn and adapt.  Past that they need to demonstrate the ability to work with others and be part of a team.  Ego's are really a rampant part of IT for some reason.  Somebody thinks they can code an app and suddenly they are a god to be worshiped.  It's wonderful that you can write code, but my five year old nephew can do that.  Working with others can be one of the more difficult aspects of working in IT. and It can eclipse even the most difficult of technical problems.  I attribute my success to that of the team of individuals I work with -- without them I cannot achieve my goals and be successful.

And since I wouldn't ask someone to do something I wouldn't do myself:

create procedure FizzBuzz 
as
begin
   declare @i int = 1, @string varchar(20)
   while (@i <= 100)
      begin
          select @string = ''
           if (@i % 3 = 0)
           begin
               select @string = @string + 'Fizz'
           end
           if (@i % 5 = 0)
           begin
             select @string = @string + 'Buzz'
           end
           if (@string = '')
           begin
             select @string = @i
           end

           select @string
           select @i = @i + 1
       end
end

The tools I use


Snorkie could have drowned

As a DBA who is currently involved in some HA/DR aspects of the job, a lot of the core DBA work (backups, monitoring, users), and is also a performance tuner/developer -- and is the lone DBA -- I need access to tools to help me get this job done.  Below is what I am currently sporting in no particular order.

Core Work(Monitoring, Backups, Troubleshooting):
We currently use Idera SQL Diagnostic Manager for alerts on blocking and long running queries.  We also have Nagios and Zabbix running which tell us things from and OS/File perspective.  If I am on a server I am using sp_whoistactive to get whats going on in the instance.  If I need to drill into an actual spid to do a trace, I use the SQL Server Profiler.  Backups are handled by Ola Hallengren's lovely scripts, as are Index and Statistics maintenance.

For Development:
I am currently using ERStudio for design and are pushing everything up to a team project server for version control.  That is, unless I just decide it's less frustrating to use my hands.

For Tuning:
I personally use the sp_Blitz scripts quite a bit.  These provide a lot of information, and save a lot of running around putting together queries on the DMV's.  To get those table statistics, or whats going on in the cache, this is a good go to.  Past that there are some other useful one off queries I have accrued in my travels.  As for explain plans, SQL Sentry Plan Explorer Pro just went free so that's something I have added to my repertoire recently.  And of course -- set statistics io, time on...


Saturday, September 10, 2016

Bad to Good Queries


I got to search for this stuff but its not running good!


A developer was tasked with finding some data and he came to me with the following query.  I am very glad he did so:


1
2
3
4
5
6
7
select m.MemberShipNo, m.FirstName, m.LastName, m.Address1, m.City, m.StateProvince, m.PostalCode, m.PackageId, p.ProductLabel from Members m
join Packages p on p.Id = m.PackageId
where 
(FirstName='something' and LastName='something' and Address1='somewhere') or
(FirstName='something' and LastName='something' and Address1='somewhere') or
(FirstName='something' and LastName='something' and Address1='somewhere') or
....

He recognized that this was going to be at the worst a full table scan,  at best an index scan, and this does not need to run in prod.

As first I said, 'Generally speaking, you will want to do something like this':

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
select m.MemberShipNo, m.FirstName, m.LastName, m.Address1, m.City, m.StateProvince, m.PostalCode, m.PackageId, p.ProductLabel from Members m
join Packages p on p.Id = m.PackageId
where 
(FirstName='something' and LastName='something' and Address1='somewhere')
UNION
select m.MemberShipNo, m.FirstName, m.LastName, m.Address1, m.City, m.StateProvince, m.PostalCode, m.PackageId, p.ProductLabel from Members m
join Packages p on p.Id = m.PackageId
(FirstName='something' and LastName='something' and Address1='somewhere') 
UNION
select m.MemberShipNo, m.FirstName, m.LastName, m.Address1, m.City, m.StateProvince, m.PostalCode, m.PackageId, p.ProductLabel from Members m
join Packages p on p.Id = m.PackageId
(FirstName='something' and LastName='something' and Address1='somewhere') 
UNION
...

This will do an index seek (there is an index on LastName.. I put that there after examining queries and the query cache and decided an index with LastName had the closest selectivity to 1 given the columns I wanted to include).  Its important to put your thinking in sets hat on when accessing a database.  IF this then this, If this equals this or this equals that kinds of logic works great in a programming language.  You try to wrap or's in your select and things get nasty pretty quickly.

But it gets worse -- He is doing this for 1048 members.  Just thinking about building that explain plan is enough to give a DBA a nose bleed.

The query also runs for 1 minute and 14 seconds (I ran on reporting while getting the below solution done).  In Oracle Land, where selects don't lock up the database because it's playing around with rollback segments, we don't worry about this as much.  In SQL Server non Read-Committed Snapshot Isolation, a query running for a minute and 14 seconds is fairly egregious.  People are on the phone trying to help customers, trying to get information -- So the phone call goes from 'Alright let me pull that up for you.. Wow this is taking awhile.. How are things?'

I took the 1048 records and put them into a table, and then I did something like this:



1
2
3
4
5
6
select m.MemberShipNo, m.FirstName, m.LastName, m.Address1, m.City, m.StateProvince, m.PostalCode, m.PackageId, p.ProductLabel 
from Members m 
JOIN #MemberInfo as mi
on m.LastName = mi.LastName and m.Firstname = mi.FirstName and m.Address1 = mi.Address1
join Packages p 
on p.Id = m.PackageId

This runs in 88ms.

Friday, September 9, 2016

SPAM Marketing Emails


If you could change someone's life, would you?

That was the cute subject line of a sales e-mail sent to my account.. Oooo!! Click-Baity!!

It has taken me awhile to get a solution to this problem.  Getting inundated with worthless crap e-mails from essentially car salesmen is now a part of corporate life.  There is one really nice thing about switching jobs -- You get a new e-mail address!! After awhile the sale zombies catch up and soon your e-mail box starts filling up with junk again.

Help! Please! Someone make it stop!

I realize that for some people, it's their job to sell stuff.  Those people I reach out to are the ones I want to deal with, however.  The rest of you can find a quiet corner of the room and...

It took me awhile to find an approach to this that worked.  You see, originally I actually would try to unsubscribe -- and for some odd reason five more unwanted emails got sent in my face.  Gee, I wonder how that happened? Could it be the morally reprehensible sales company agreed to stop sending me e-mails in thirty days (What are you people writing my e-mail down? Send a mule to go tell bob to erase this guy from our email list.. Oh here's an eraser.), then sent my e-mail to five other trolls to spew garbage in my inbox?

Needless to say that exercise is futile, and irritating.

So here's how you duct-tape their mouths shut in Outlook.  The idea is the same for any e-mail client.

You get a silly e-mail:



OMG!! The event of the year! Hope I can get tickets!

Right-Click, Junk,  and no.. not block sender, click on Junk Email Options:


Go to the Blocked Senders tab -- Click add.  Now, refer to the email that the sender sent and put in the @ symbol and everything after it.

example @stopbotheringme.com

If you do block sender the sender only.. Well, some companies are especially evil and you might block something like this:

info+472347610-36525223638470@evilcompany.com

And they can just randomize that stuff over and over again and keep getting through.

After completion get a cup of coffee, pat yourself on the back, and get back to work.


Wednesday, September 7, 2016

Moving a primary key


Moving a primary key

I recently had some fun un-partitioning some tables.  Deleting the schemes, functions, and the files/filegroups means getting the data out of there.

Now if you have a clustered primary key, moving the data can be a real pain.  You typically will have to drop all the constraints using that primary key, drop the primary constraint, and then add it back on the new filegroup.

While playing with the syntax I stumbled upon this, and it actually worked:

CREATE UNIQUE CLUSTERED INDEX PK_NAME ON TableName(PK_Column)
WITH (DROP_EXISTING = ON) ON [YourFileGroupHere]

example:

create unique clustered index PK_Members_ID on Members(ID)
WITH (DROP_EXISTING = ON, FILLFACTOR = 100)
ON [PRIMARY]

The wonderful thing about this is you don't have to drop the foreign keys, drop the primary key, recreate everything.  One command, in and out.

Make sure you have the database to yourself! It can be a timely operation to complete.

What happens when you let developers run wild


A long time ago, in a galaxy exactly like this one....

My company decided to write an app for a third party interface which was to communicate with my trusty dusty SQL Server database.  It's an ancient hand me down database with warts and bees in it, but its the hacked together duct taped thing I keep running.

At any rate, after several revisions of sprocs without me involved, and deploying said sprocs at the last minute we wound up with this.  This wonderful 530,000 logical read mess.


SELECT TOP 100
m.Id,
c.CompanyName AS ClientName,
m.FirstName,
m.LastName,
m.MiddleName,
m.Address1,
m.Address2,
m.City,
m.Suffix,
m.StateProvince,
m.PostalCode,
m.HomePhone,
m.Email,
m.Active,
m.MemberShipNo
FROM Members m
JOIN Client c
ON m.ClientId = c.Id
AND m.FirstName LIKE '%' + @FirstName + '%'
AND m.LastName LIKE '%' + @LastName + '%'
AND m.PostalCode LIKE '%' + @ZipCode + '%'
AND (m.Void = 0 OR m.Void IS NULL)
JOIN Client_Vendor_Xref x
ON x.ClientId = c.ID
JOIN Vendor v
ON v.Id = x.VendorID
AND (v.Void = 0 OR v.Void IS NULL)
WHERE x.VendorId = @VendorId
END

How do you get the logical reads? Throw this in front of the thing you want to examine:

SET STATISTICS IO ON

After the above query I would get something like this returned:

Table 'Client_Vendor_Xref'. Scan count 9, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Client'. Scan count 9, logical reads 28, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Members'. Scan count 26, logical reads 532616, physical reads 148, read-ahead reads 2088, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Vendor'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

I have gone through and corrected all sprocs to do something like below.  This falls in line with that made up word SARGability;  the Search ARGuments supplied do not allow us to take advantage of index seeks.  Instead we get scans and scans inside every column desperately looking for those matches.  So, we got rid of preceding wildcard, and after assuring folks there aren't any nulls in these tables for the Void column (anymore), and there won't be(anymore), we got rid of the VOID quantum logic search:

SELECT TOP 100
m.Id,
c.CompanyName AS ClientName,
m.FirstName,
m.LastName,
m.MiddleName,
m.Address1,
m.Address2,
m.City,
m.Suffix,
m.StateProvince,
m.PostalCode,
m.HomePhone,
m.Email,
m.Active,
m.MemberShipNo
FROM Members m
JOIN Client c
ON m.ClientId = c.Id
AND m.FirstName LIKE @FirstName + '%'
AND m.LastName LIKE '@LastName + '%'
AND m.PostalCode LIKE '@ZipCode + '%'
AND m.Void = 0
JOIN Client_Vendor_Xref x
ON x.ClientId = c.ID
JOIN Vendor v
ON v.Id = x.VendorID
AND v.Void =  0
WHERE x.VendorId = @VendorId
END

The results?  It now uses the covering index and performs seeks.

Table 'Client'. Scan count 0, logical reads 116, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Client_Vendor_Xref'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Members'. Scan count 1, logical reads 850, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Vendor'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Pretty freakin spiffy.

As to the NULLS... For some reason the table design was done in such a way where Void was a bit which could be NULL.  Not sure what the point of Yes/No/Maybe was, but it seems to be something we are stuck on.  Regardless, if I can just check for Void = 0 and not Void IS NULL or Void = 0, I am getting a little more towards that 'Simplicate and Add Lightness' model of thinking.  Another item to point out,  you do not want to search for Void <> 0 -- that's inequality.  Indexing lets us seek to Void = 0, otherwise you have to compare all the rows for "Does this not equal zero?"

TLDR ; This is going to make my warty bees nest purr for a little while.

RedBox issues


My Redbox Problem:

So I decided to rent a game from Redbox for someone else.  My girlfriends son has fallen ill, and I went out and tried to pick up a title for his PS4 to take his mind off of it.  I picked out a title, swiped my card, and I got a empty RedBox cartridge with a sticker in the middle.  Well, isn't that cute.

I am sure I am late to the party on how people are scamming RedBox out of 60$ games and circumventing their nifty method of renting.  This however, was the first time I managed to come across it.  It's not fun to be the guy on the receiving end of fraud.  It's also not fun to be the guy on hold waiting to tell the company about said fraud.  After waiting fifteen minutes for someone to get on the phone, they granted me a free rental coupon.  Whoopee.

So now I have to rent another game from another kiosk and hope this time the game is actually in the sleeve.  For the trouble a person was going through I might give a free coupon, but also reimburse the person's credit card.

The problem with today's 'I think everyone cares what I think' generation:

As soon as corporations jumped on twitter, it took a serious downturn.  Mobile app games 'Unleashing Stupid Promo Codes', Wendy's want's to ask everyone how the new Bacon Cheese Grease Heart Attack Burger tastes, and what do you think will happen in the next episode of who gives a crap? On top of that, companies seem to have followed suit by demanding you fill out a twenty page dissertation on why you think you should be allowed to cancel membership.

It's nonsense that really needs to stop.  Some kind of common sense needs to be infused back into these corporations.  We have technology, and smart people.  Fix the kiosks RedBox.  Make your fraud problem either disappear, or make it so its not a pain in the rear to get your money back if you can't do it right.  Don't make your customers angry? Isn't that business 101?

I fully realize that me writing in this BLOG puts me in the 'I think I everyone cares what I think' generation.  What can I say?