
Goodmorning everyone! This month Brent Ozar (b | t) asked us to blog about our least favorite datatype.
Datatypes
Now, when you look in SQL Server, you’ll find that it will hold 34 types.
select name, max_length, precision, scale from sys.types order by name
And personally, I think every datatype will have it’s use at some point.
But the datatype I despise is the one that’s not used correctly.
A few examples.
First up. A large amount of my work has to do with ETL processes. There are a lot of things that can go wrong there, but one of the main issues is wrong estimations on size. When we read data from a source system to transfer it to the datawarehouse environment we have to match the datatypes. A varchar(10) in the source will have to be a varchar(10) in the target. Easy enough. But now the source gets an update and with that update the source datatype goes from varchar(10) to varchar(12). When the supplier informs us, we change the datatype accordingly and everything is fine. When for some reason the update is missed, issues will arise. Because off course it’s the primary key that got enlarged and duplicates will start to form.
The other way around happens as well. Some tools check out the source, see a varchar column and, when no-one notices, will create a nvarchar(2000) column. Joy will arise when this column contains one or two characters when the optimizer expects at least a thousand characters.
Second. I’ve seen some horrible columns where a primary key was an integer. Maybe an int, maybe a bigint. But a number nonetheless. Why not save this number in a varchar column. In a table with millions of rows and when performance starts to suck, complain about SQL Server. No, it’s not the database. It’s a design flaw.
Last. Working with dates is hard, I get that. The datediff() and dateadd() functions are extremely difficult. No, they’re not. Unless for some reason you decide to record dates as integers (20210309 for instance). When you need to subtract a number of days from that number, you’re up the creek. because you have to convert from integer to date, subtract the days and reconvert to integer again. The date datatype is there for a reason.
This is why my least favorite datatype isn’t just a datatype, but the datatype that’s being abused.
Thanks for reading!
2 thoughts on “TSQL2sday 136: my least favorite datatype”