Category Archives: SSIS

New Session: Profiling Your Data

Data Quality is a huge issue, especially now with more and more data being created daily.  According to Micro Focus back in 2016, we were creating 44 exabytes of data per day, and we’re expected to produce 463 exabytes per day by 2025.  That’s a LOT of data.

With all this data, it’s more important now than ever to profile your data BEFORE your warehouse or ETL projects start.  No one wants to be several months into a project only to realize the design has to be reworked due to data quality issues.  So I’ve created a session called Profiling Your Data that covers what data profiling is, why you should do it and how you can do it with the tools already included in the SQL Server BI stack.

If you’re interested in having me present this for your group or event, just let me know, I’d be happy to.

Using the Analysis Services Execute DDL Task

My grandfather used to say, “There’s more than one way to skin a cat.”  I always thought it was a disgusting saying; I mean, who would ever want to “skin a cat”?!  Gross!  But the underlying meaning was clear: there’s always more than one way to do something.  And no, my grandfather never actually skinned any cats; he was just a country boy from a small town in Iowa, where you quite often had to come up with creative ways to get things done.

There are times when we get so focused on what we are doing, we forget to look around.  This happened to me a few weeks ago.  We were in the middle of a big upgrade at work; not only were we upgrading our ERP system, we were changing platforms as well (read this as: no one got more than 3-4 hours of sleep per night for about two weeks, including myself).  We also have a BI solution that uses SQL Server Integration Services (SSIS) to pull data from our ERP system and store it in a data warehouse on SQL Server, with a multidimensional SQL Server Analysis Services (SSAS) cube that sits on top of the warehouse.  Pretty common scenario.

The last part of our ETL process is to process the dimensions and facts for the newly loaded data.  I have always used the Analysis Services Process Task in SSIS to accomplish this.  It works great, it’s stable and it’s easy to use.  Just click a few items and you are off and running.

My ETL code had been stable for so long that I hadn’t needed to make any changes since I upgraded from Visual Studio 2015 to Visual Studio 2017.  Little did I know that there is a bug in Visual Studio 2017/SSDT 15.5.2 that won’t let you use the Analysis Services Process Task if your target version of SSIS is 2016.  You will get a couple of different errors depending on what you are doing.  If you try to modify an existing package that uses an Analysis Services Process Task and your target version of SSIS is 2016, you will get the following error:

[Screenshot: the error shown when editing an existing package that uses the Analysis Services Process Task]

If you try to create a new package using an Analysis Services Process Task and your target version of SSIS is 2016 you will get the following error after it allows you to configure your processing tasks and you click OK:

[Screenshot: the error shown in a new package after configuring the processing tasks and clicking OK]

In both of these scenarios you will not be able to save the package.  So what the heck are you supposed to do?!  Here’s where my tunnel vision (and panic) set in.  How was I supposed to get my SSAS objects processed?

I could always script out my processing tasks using SSMS and drop them in a SQL Agent job step.  But I have multiple environments and multiple cubes, so each one would have to be hard-coded.  Not a great idea, so scratch that.

[Screenshot: scripting the processing tasks from SSMS]
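For context, the XMLA that SSMS scripts out for a processing task looks roughly like the sketch below (the database and dimension IDs are made up for illustration).  Notice that the object IDs are baked right into the script, which is exactly why this approach means hard-coding every environment and cube.

```xml
<!-- Rough sketch of an SSMS-scripted processing command; IDs are illustrative. -->
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Parallel>
    <Process>
      <Object>
        <DatabaseID>MyWarehouseSSASDB</DatabaseID>  <!-- hard-coded per environment -->
        <DimensionID>Dim Customer</DimensionID>     <!-- hard-coded per object -->
      </Object>
      <Type>ProcessFull</Type>
    </Process>
  </Parallel>
</Batch>
```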

I could create a new SSIS project, targeting SSIS 2014, just for the processing tasks.  (Thanks to Teo Lachev for the idea.  And yes, you can deploy SSIS 2014 targeted packages/projects to SSIS 2016.)  It’s better than my first idea.  I could parameterize all the connection managers so nothing would be hard-coded, but then my code would live in two different projects.  Not ideal, but it would definitely be better.

Then Gerhard Brueckl pointed me to the Analysis Services Execute DDL Task.

[Screenshot: the Analysis Services Execute DDL Task]

I did some quick research on it here and, as it turns out, it was exactly what I needed to “skin this cat”.  It would allow me to keep my SSAS processing packages in the same project as the rest of the ETL, and I could parameterize it to my heart’s content (cue beams of light from above and singing angels).  Why hadn’t I thought of this before?!  To be honest, I had never paid attention to this task.  It had the term DDL in it, and I wasn’t writing DDL, so why would I ever need it, especially since I was dealing with SSAS?  Yes, I know, need I remind you about my lack of sleep and my tunnel vision.

[Screenshot: the Analysis Services Execute DDL Task editor options]

The cool thing about the Analysis Services Execute DDL Task is that you have a choice when it comes to the source for your DDL: direct input, a file connection, or a variable.  I opted for the variable because I needed to parameterize it with the SSAS database name.

[Screenshot: the available DDL source types]

I was able to script my processing tasks and dump that code into a variable that I built with expressions so that it could be parameterized.  One thing to keep in mind is that the XMLA produced for SSAS processing tasks has double quotes in it.  You will have to either a) escape the double quotes or b) create a variable to hold a double quote and use it in the expression that builds your XMLA variable.  I opted for b, a variable to hold the double quote; it seemed cleaner to me.  Here’s a good post on Stack Overflow on how to do both.

As you can see below, it looks very intimidating with all the XMLA & expressions, but it creates a very pretty XMLA processing task in the end.

[Screenshot: the variable expression that builds the XMLA]
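In case the screenshot is hard to read, here is a rough sketch of the general shape of such an expression (not my exact code).  It assumes a String variable named User::Quote that holds a single double-quote character and a project parameter named $Project::SSASDatabaseName; both names are hypothetical.

```
"<Batch xmlns=" + @[User::Quote] + "http://schemas.microsoft.com/analysisservices/2003/engine" + @[User::Quote] + ">" +
"<Parallel><Process><Object>" +
"<DatabaseID>" + @[$Project::SSASDatabaseName] + "</DatabaseID>" +
"<DimensionID>Dim Customer</DimensionID>" +
"</Object><Type>ProcessFull</Type></Process></Parallel></Batch>"
```

If you prefer option a instead, the SSIS expression language lets you escape a double quote inside a string literal with \", so you could skip the quote variable; I just found the variable easier on the eyes.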

I created one task for processing dimensions and one task for processing facts, mostly because I want to be able to track their performance separately, but you could create one task to do it all.

I was able to add these new packages to my SSIS project and deploy it without issue.  They’ve been running ever since with no problems.

I hope this post helps someone else break out of their tunnel vision.

PSA:  No cats were harmed in the writing of this blog post.

Speaking at SQL Saturday Denver (#673)

Wow, I can hardly contain my excitement.  I was selected to speak at SQL Saturday Denver on September 16, 2017.  I’ll be presenting my Intro to Integration Services session.  I’m excited for so many reasons, but firstly, this will be my first trip back “home” in seven years.  We planned this trip back in February so that we could see Muse at Red Rocks.  I lived in Denver for 15 years and never once saw a show at Red Rocks, so I was super excited when I found out they were playing there.  Secondly, I get to talk about Integration Services, which is my favorite piece of SQL Server.  It’s an introductory session aimed at folks who want to learn more about what it is and how it can be used.

So, if you’re in the Denver area on Saturday, September 16, 2017, stop by and say “Hi”, I’d love to see you and chat for a little while.

#SQLSatDenver here we come!

[Photo: downtown Denver, Colorado]

Import Export Wizard Mapping Files

Recently I had to copy data from our AS400 to a SQL Server database.  Normally I would whip up my fancy SQL Server Integration Services (SSIS) project and get it hammered out pretty quickly.  Unfortunately there were over 4,000 tables that I needed to pull data from, and there was no way in HELL I was going to manually create 4,000+ packages.  Now most of my BIML friends would say, “I could BIML that for you in 2 hours,” and if my BIML weren’t so rusty, I probably could have too.  But I didn’t have to do any fancy transformations on the data, I just had to copy it.  So I decided to take the “easy way out” and use the Import Export Wizard in SQL Server Management Studio.  Shoot, all I would have to do is a few clicks and be done with it, right?  Boy was I wrong.

This post talks about the issue I ran into with SSIS Mapping Files.

We currently run DB2 on an IBM iSeries AS400 for our ERP system.  I was tasked with copying data from the AS400 to a SQL Server database for some consultants to use.  The C-Suite didn’t want to give the consultants access to our AS400, so this was the workaround that was put forth and accepted (and no, no one asked me before I was “voluntold” for the task).  Since this would essentially be a “one-time” thing, I chose to use the Import Export Wizard, but I would save the package just in case they wanted this process repeated.

I fired up the Import Export Wizard and selected my source, the IBM DB2 for i IBMDA400 OLE DB Provider.  Before you can select this data source you must install the IBM DB2 drivers.  You can find out more about them here; unfortunately you have to have a maintenance contract and an account with IBM before you can download them <sigh>.  It’s a straightforward install once you have the installation package.

EZ peazy, lemon squeezy.

I selected my destination, SQL Server Native Client 11.0, of course!


On a roll now, should only be another 30 seconds and I can put this project to bed.  Well, we all know that’s not what happened, otherwise you wouldn’t be reading this.

When I clicked on the Edit mappings button in the wizard to make sure all the datatypes had been mapped successfully, I got “<unknown type>” for every single column.  WTH?!  This doesn’t happen when I’m working in Visual Studio with my SSIS projects.  After some frantic googling, I found a couple of very good articles on the Mapping Files for the Import Export Wizard.

Data Type Mapping

Import Export Wizard can show numbers..

I took the advice of the articles and made copies of my Mapping Files before I modified them.  I made my modifications to include the column types and their respective mappings, courtesy of the Data Type Mapping article, and saved my changes.  I made sure the Import Export Wizard was closed, then started it again.  This isn’t so hard, no big deal, they’ll all populate correctly now… WHAT?!  Still <unknown type> for all columns!  Now it had become a matter of principle: I was going to solve it using this method, I would NOT resort to brushing up on my BIML.

After many attempts I finally figured out what the issues were.  There were two.  First, the order in which the Import Export Wizard searches through the Mapping Files.  Second, the SourceType within the Mapping File.

According to the Import Export Wizard, my source provider was IBMDA400 and it couldn’t find any mapping file for it, yet it had no trouble finding the Mapping File for my destination.

For the first issue, a little background on how the Import Export Wizard works.  When you select a source and destination, the wizard has to know how to map the data types from source to destination so you don’t end up with gobbledygook in your destination.  So it searches through all the files in the following directories, depending on your architecture (I’m using SQL Server 2016, hence the 130 in the path):

C:\Program Files\Microsoft SQL Server\130\DTS\MappingFiles

C:\Program Files (x86)\Microsoft SQL Server\130\DTS\MappingFiles

The key word there is ALL the files in the directory, so if you just copy your original files to the same folder (with the famous “ – copy” so courteously appended by Windows Explorer), it will search through both your original AND the modified copy of the respective Mapping File.  In my case it was finding the source/destination match in the original Mapping File and completely ignoring my modified copy <sigh>.  Easy enough to fix: I moved my original “copies” to a completely different folder.

For the second issue, the SourceType within the Mapping File.  Now I will admit that I had been working on this for about 5 hours straight and had become so obsessed with making this work with the Import Export Wizard that I stopped paying attention to detail.  I want to see how long it takes you to find the issue; this is the file that I thought should work:

[Screenshot: my modified Mapping File, the one I thought should work]

This is the file that actually works:

[Screenshot: the Mapping File that actually works]

Did you find it?  How long did it take you?  It took me about an hour to figure it out.

In case you still haven’t found the issue, the answer is: the SourceType in the first file is IBMDADB2* and in the second file it is IBMDA*.  Since our source provider is IBMDA400 and the first file uses IBMDADB2*, there will be no match on the source.  As soon as we change the SourceType to IBMDA* (the * is used as a wildcard), we get a match and it works.  Three little letters, that’s all it took for me to waste half a day.

What I ended up doing, instead of modifying the original Mapping File, was creating a copy of it, renaming it to something meaningful to me (while still following the naming convention of the Mapping Files), changing the SourceType value to IBMDA*, and adding all the data types that were missing.  This way there is no conflict if I ever need to use the original Mapping File with the IBMDADB2 SourceType.
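To give a sense of the shape of the change, here is a heavily abridged sketch of the top of such a Mapping File.  The namespace, the other attributes and the child elements are approximated from memory, so copy those from a real file in your MappingFiles folder; the only part that matters for this post is the SourceType attribute.

```xml
<!-- Abridged sketch; copy the real root element from an existing mapping file.
     The shipped file uses SourceType="IBMDADB2*", which never matches the
     IBMDA400 provider.  "IBMDA*" does match it (the * is a wildcard). -->
<dtm:DataTypeMappings xmlns:dtm="http://www.microsoft.com/SqlServer/Dts/DataTypeMapping.xsd"
                      SourceType="IBMDA*"
                      DestinationType="SQLNCLI*">
  <!-- ...one dtm:DataTypeMapping element per source data type... -->
</dtm:DataTypeMappings>
```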

I hope this helps someone else.  There are tons of posts out there about data type mapping, but none of them tell you to pay special attention to the issues I had.  Granted, my issues were self-created, but they were issues nonetheless.

Achievement Unlocked: 24 Hours of PASS Session Submitted

I have finally submitted a session for a speaking engagement. Those who know me know that this is a very BIG deal. I have a horrible fear of public speaking, but I am slowly starting to embrace that fear. Last year I presented at the Triad SQL BI User Group and I didn’t hyperventilate, no one was more surprised than me.

I submitted the same session that I did at the Triad SQL BI User Group, Introduction to Integration Services (SSIS), to the upcoming 24 Hours of PASS: Growing Our Community.

I wanted to be a teacher when I was in college until I discovered my fear of public speaking. Now, more than 20 years later, maybe I can finally reach that goal.

Fingers crossed that my session is accepted.

I Didn’t Hyperventilate!

I gave what is officially my second presentation this week. I presented at the Triad SQL BI (Twitter | PASS) user group meeting and I didn’t hyperventilate! That’s a huge deal for someone like me, who is petrified of public speaking.

It started out a little rough though.

Timeline
Friday, 8/22 (just 4 days before presentation date) – I met Katherine Fraser (SQLSassy) for lunch and she mentioned that their scheduled speaker had just cancelled on them the day before. I asked her what she was going to do and she said that unless I wanted to present for her, she had no idea. I jokingly said, “Yeah, sure, I’ll present.” Do not EVER tell a chapter lead you will present, even if you’re joking around, because they will pounce on you! Lesson learned there. I agreed to present a very introductory session on SSIS. I then went home and started to panic.

Saturday, 8/23 – I woke up with a horrible sinus headache and thought I was at the beginning of a nasty sinus infection. Now I really started to panic. I sent Martin to the drugstore to buy every sinus medication they had on the shelf. There was no way I could be sick; I could not cancel on Katherine after I had just agreed to present the day before. I proceeded to pound down some Emergen-C and drink about a gallon of water an hour for the rest of the day.

Sunday, 8/24 – I woke up at 4:30 a.m. to take part in the upgrade of a major system at work. I felt about the same as Saturday. I pounded some more Emergen-C and worked until 11:30 a.m. After we got the green light from the testers at 3:30 p.m., I went to bed and collapsed.

Monday, 8/25 – Woke up feeling much better, but not great. Pounded more Emergen-C. Started to work on my presentation. Did I mention that I didn’t have anything prepped for a presentation? I’m not a speaker, why on earth would I have a presentation ready to go? Got a call from my boss that the system upgrade wasn’t going so smoothly and had to start firefighting in production.

Tuesday, 8/26 – Presentation day. Got the word from my boss that the system upgrade was still up in the air, but none of the pieces that were broken were anything I could help with or fix. I started to work on my presentation. Just before lunch time I was told I had two conference calls I needed to participate in. Great, another two hours I don’t get to work on my presentation! I was finally done with the conference calls when I got a call from my boss: we were rolling back the upgrade and I needed to bring the old server back online. Luckily I had been able to create the content of the presentation and test it. I just didn’t have any time to do a practice run-through. That was going to have to be enough; it was time to go to Winston-Salem.

I arrived in plenty of time, but I forgot: the power supply for my laptop, my external mouse, speaker evaluation forms and my list of things I needed to take with me to the meeting. Luckily my laptop was fully charged and didn’t die during the presentation (in fact I could have gone on for another 2 ½ hours, thankfully no one wanted to stay that long!). A mouse was provided by our wonderful host, Matt Clepper of Inmar, but not before I had a slight mishap using that @!#$%^& embedded mouse on my laptop. Katherine was well prepared and brought speaker evaluation forms. As for my list of things I needed to bring with me, well, I just had to “adapt and overcome”.

The presentation went pretty well, I didn’t hyperventilate. Sometimes you have to have a very simple goal, just survive without a trip to the ER.

Wrap up
Overall it was a good experience. I think I did a good job of presenting and the feedback I got reinforced that. There were some great ideas on what I could have done better and some great comments on what I did well.

Will I speak again? Probably. I’m not sure I’m ready for a SQL Saturday quite yet, but maybe another couple of times at UG meetings and I’ll think about it. A huge “Thank you” goes out to Katherine for taking a chance and believing in me.

Of course I didn’t sleep at all Tuesday night. I kept thinking, “I forgot to tell them…”

Custom Reports for SSMS 2012

I presented on this topic at the TriadSQL User’s Group meeting on 3/26/13.  I promised I would post my code/solutions, so here it is.  For those that were not able to attend the meeting, here is a brief description of what I am posting.

I am lucky enough to be using SQL Server 2012 in a production environment.  Not only do I get to use the database engine, but I also get to use Integration Services (SSIS),  Analysis Services (SSAS) and Reporting Services (SSRS).

This post is mostly about SSIS 2012 and the aches and pains it relieves compared to previous versions.

One of the great things about SSIS 2012, is the new project deployment model.  Gone is the day when you had to search file servers high and low to figure out where all your SSIS packages lived (if you used file deployment, which I did).   Now all your packages are located in one place.

Another great thing about SSIS 2012 is that it has logging built in and the entries all go to one location, so gone is the day when a rogue developer would send his logging to some obscure table in some obscure database on some obscure server never to be seen by anybody ever again.

Where is this magical place you ask?  The Integration Services Catalog (ISC), that’s where.  The ISC stores your packages along with the metrics for running those packages and sits atop a nice little database called SSISDB.

The ISC also comes with some canned reports for looking at those metrics.  The stepping-off point is a report called All Executions.  It can be viewed by right-clicking on your project and then selecting Reports | Standard Reports | All Executions.  This report is very cool; it provides basic execution information, e.g., how many times the package has succeeded, failed, etc.  It also provides start times, end times and durations for each package in the project.  As I mentioned, it is the stepping-off point; it has a section with links to three other reports: Overview, All Messages and Execution Performance.

The Overview report is nice; it shows all the parameter values that were used when the package was executed and some detailed execution information.  Definitely comes in handy when troubleshooting.

The All Messages report is just as it sounds; it shows all the messages that were logged during the execution of the package.  Again, good information for troubleshooting.

My favorite is the Execution Performance report.  Prior to my first glimpse of this report, I had visions of graphs and KPIs that would rival dancing sugar-plums.  Needless to say, I was a little disappointed.  Don’t get me wrong, it has a graph on it, but only the last 10 executions are included.  There is also a listing of the last 10 execution durations.  The thing that gets me is that right above that listing is a three-month average and standard deviation.  Really, they couldn’t have included those months of data in the graph?  But I digress.  So I said to myself, “I need more data.”  That’s when the light bulb went off and I decided to get more data all on my own.

The SSISDB is easily accessible and easy enough to follow, especially when using the catalog schema and its views.  I created some SSRS reports in SQL Server Data Tools and was happy as a clam, until I discovered that SSMS custom reports don’t support images, drill-down or basically any other kind of interactivity.  That’s okay, I published the cool reports for the managers to see via the SSRS site and I created a stripped-down copy of the reports that I can use within SSMS.
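If you want to roll your own, a minimal sketch of the kind of query these reports start from is below.  It only uses SSISDB’s catalog.executions view; the three-month window and the column list are just illustrative, so adjust them to taste.

```sql
-- Minimal starting point for a custom execution-duration report.
-- catalog.executions is the SSISDB view behind the built-in reports;
-- status 7 = succeeded, 4 = failed.
SELECT  e.folder_name,
        e.project_name,
        e.package_name,
        e.start_time,
        e.end_time,
        DATEDIFF(SECOND, e.start_time, e.end_time) AS duration_seconds,
        e.status
FROM    SSISDB.catalog.executions AS e
WHERE   e.start_time >= DATEADD(MONTH, -3, SYSDATETIMEOFFSET())  -- well beyond the last 10 runs
ORDER BY e.start_time DESC;
```

From there it’s mostly a matter of aggregating by package and dressing the results up in SSRS or in a stripped-down SSMS custom report.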

I end up maintaining two code bases, which is kind of a headache, but I work in a small shop where I am the only SQL database related employee, so it’s not that big of a headache.  If you work in a larger shop, you may want to figure out some cool PowerShell script that will sync them up.

I hope you find these reports useful and possibly a jumping off point for you to start your own collection of custom reports.  I am only providing some of the very basic reports that I created, I wanted to leave some of the creativity to you  😉