I would like to invite you to my presentations at the SQL Saturday #176 event in Pordenone, Italy, on November 17th, 2012. I am giving two presentations: Market Basket Analysis and High Performance Statistical Queries. I want to give you a little more background about the latter.
The SQL Server suite has nearly everything you need for a good BI project. Nearly. Compared to some competing products, it lacks statistical procedures and functions. Statistics are very useful for understanding your data. You can use them as the final result of a report or, as I mainly do, for a data overview in the first stage of a data mining project.
I started writing my own statistical queries back in the days of SQL Server 2000. With version 2012, because of the important new support for analytical queries in Transact-SQL, I decided to rewrite most of them. The most important gain is much better query performance. However, when I talk about performance for these queries, I mean the performance of the algorithm. My main goal was to calculate everything I need in a single pass through the data. Of course, performance can be improved further with indexes. However, index tuning is fairly widespread knowledge, while understanding the mathematics, knowing the language, and being able to find an efficient algorithm is not that simple.
In the "High Performance Statistical Queries" presentation I explain the statistics and the algorithms, and show those efficient queries. Besides the queries, there is another important part of this presentation. Many people say that statistics lie. However, this is not true; during the presentation, I explain the meaning of each statistic, how it is calculated, and how to interpret the results correctly. Attendees therefore get the following from this presentation:
- an explanation of how to use the new T-SQL window functions and other T-SQL elements efficiently;
- a correct understanding of quite a few statistics;
- ideas for their own BI projects;
- working queries that they can use immediately in their reports or for data overview.
Of course, the presentation is at an advanced level, and a good knowledge of Transact-SQL is a prerequisite.
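
To give a rough idea of what "a single pass through the data" can mean, here is a minimal sketch. It is not one of the queries from the presentation. It assumes an in-memory SQLite database as a stand-in for SQL Server, and the table `t`, the column `x`, and the function `single_pass_stats` are hypothetical names chosen only for illustration. One aggregate scan collects the count, the sum, and the sum of squares, and the mean and sample standard deviation are then derived from those three numbers.

```python
import math
import sqlite3


def single_pass_stats(values):
    """Return (n, mean, sample standard deviation) from one aggregate scan.

    Illustrative only: SQLite stands in for SQL Server, and the table
    and column names are hypothetical. Needs at least two values.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (x REAL)")
    conn.executemany("INSERT INTO t (x) VALUES (?)", [(v,) for v in values])

    # One scan of the table yields every intermediate value we need.
    n, sum_x, sum_x2 = conn.execute(
        "SELECT COUNT(*), SUM(x), SUM(x * x) FROM t"
    ).fetchone()
    conn.close()

    mean = sum_x / n
    # Sample variance from the power sums; clamp tiny negative
    # rounding errors to zero before taking the square root.
    variance = max((sum_x2 - n * mean * mean) / (n - 1), 0.0)
    return n, mean, math.sqrt(variance)


if __name__ == "__main__":
    print(single_pass_stats([2, 4, 4, 4, 5, 5, 7, 9]))
```

The sum-of-squares shortcut keeps everything in one scan, but it can lose precision when the values are large relative to their spread, so treat it as a sketch of the idea and not as a production formula.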