Thursday, June 24, 2021
SQL GROUPING allows you to segregate data into groups so that you can work on it separately from the rest of the table records. Let's say you want to get the number of products in a category in the Northwind database Products table.
You would write the following query:
The query above gives you the following results:

The query gives you the number of products in each category, however it's not very useful. You don't really know what category the count is for in each record. You might want to try to change the query into something like this:

The above query is more useful the preceding one, however it only gives you the CategoryID number not the CategoryName in the Categories table. Being the perfectionist that you are you say to yourself, I can do better. "Yes, I can". I think that was a campaign slogan. So you try a join like this:
The above query joins the Categories table with the Products table to be able to select from both tables therefore giving the ability to select the CategoryName field.

You probably think to yourself, hey if I can get the CategoryName with the RIGHT JOIN I can just add the GROUP BY filter and be done with it. Yah, my Shaolin Master from the SQL Wing will be proud. So you write the following query:
But behold you get an error! OMG! !@@##$%##$% Luckily your Shaolin Master walked into your room and saw that you were distraught. Your master ask you why you so distraught? You tell the Master, I don't understand, the join was working by itself but when I add the GROUP BY it throws me this stupid SQL error:
Column 'Categories.CategoryName' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Your master says, if the expression is used in the SELECT statement then the same expression must be used in the GROUP BY part of the statement. You think ah ha! I can just add the columns in the SELECT list of columns to the GROUP BY filtering in the query. So you write:
And like magic the result came to your screen. Tears of joy were streaming down your face because you've gotten the results that you were looking for.

Option 2: You can select what you want from the Products table and then use the GROUP BY in another select in a RIGHT JOIN to the get the COUNT() for each category. You will get the correct results with this query as well.
As you have noticed the GROUP BY queries are missing the WHERE clause. What happened to the powerful WHERE clause? The reason we don't have a where clause is because WHERE does not work with GROUP BY results. That's where HAVING comes in. HAVING takes care of the filtering of the GROUP BY, while the WHERE clause takes care of the non GROUP BY results. For example let's say you want to get the categories with the most expensive products you can write a query that filters the product price with the WHERE clause and then filter the group data with the HAVING clause. You can write the following query to get the expensive products and their categories:
The query above gets products that are more than $10, and gets the categories that has more than 9 products. The two filters are separate but they work together nicely with the WHERE and HAVING
clause.
SELECT COUNT(*) NumberOfProductsByCategory FROM Products GROUP BY CategoryID
The query above gives you the following results:

The query gives you the number of products in each category, however it's not very useful. You don't really know what category the count is for in each record. You might want to try to change the query into something like this:
SELECT CategoryID,COUNT(*) NumberOfProductsByCategory FROM Products GROUP BY CategoryID

The above query is more useful the preceding one, however it only gives you the CategoryID number not the CategoryName in the Categories table. Being the perfectionist that you are you say to yourself, I can do better. "Yes, I can". I think that was a campaign slogan. So you try a join like this:
SELECT c.CategoryName,c.CategoryID RIGHT JOIN Categories AS c ON c.CategoryID = p.CategoryID
The above query joins the Categories table with the Products table to be able to select from both tables therefore giving the ability to select the CategoryName field.

You probably think to yourself, hey if I can get the CategoryName with the RIGHT JOIN I can just add the GROUP BY filter and be done with it. Yah, my Shaolin Master from the SQL Wing will be proud. So you write the following query:
SELECT c.CategoryName,p.CategoryID,COUNT(*) AS NumberOfProductsByCategory FROM Products p RIGHT JOIN Categories AS c ON p.CategoryID = c.CategoryID GROUP BY c.CategoryID
But behold you get an error! OMG! !@@##$%##$% Luckily your Shaolin Master walked into your room and saw that you were distraught. Your master ask you why you so distraught? You tell the Master, I don't understand, the join was working by itself but when I add the GROUP BY it throws me this stupid SQL error:
Column 'Categories.CategoryName' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Your master says, if the expression is used in the SELECT statement then the same expression must be used in the GROUP BY part of the statement. You think ah ha! I can just add the columns in the SELECT list of columns to the GROUP BY filtering in the query. So you write:
SELECT c.CategoryID,c.CategoryName,COUNT(*) AS NumberOfProductsByCategory FROM Products p RIGHT JOIN Categories AS c ON c.CategoryID = p.CategoryID GROUP BY c.CategoryID,c.CategoryName
And like magic the result came to your screen. Tears of joy were streaming down your face because you've gotten the results that you were looking for.

Option 2: You can select what you want from the Products table and then use the GROUP BY in another select in a RIGHT JOIN to the get the COUNT() for each category. You will get the correct results with this query as well.
SELECT c.CategoryID,c.CategoryName,NumberOfProductsByCategory FROM Categories AS c RIGHT JOIN (SELECT CategoryID, COUNT(*) AS NumberOfProductsByCategory FROM Products GROUP BY CategoryID) AS p ON p.CategoryID = c.CategoryID
As you have noticed the GROUP BY queries are missing the WHERE clause. What happened to the powerful WHERE clause? The reason we don't have a where clause is because WHERE does not work with GROUP BY results. That's where HAVING comes in. HAVING takes care of the filtering of the GROUP BY, while the WHERE clause takes care of the non GROUP BY results. For example let's say you want to get the categories with the most expensive products you can write a query that filters the product price with the WHERE clause and then filter the group data with the HAVING clause. You can write the following query to get the expensive products and their categories:
SELECT c.CategoryID,c.CategoryName,COUNT(*) AS NumberOfProductsByCategory FROM Products p RIGHT JOIN Categories AS c ON c.CategoryID = p.CategoryID WHERE p.UnitPrice > 10 GROUP BY c.CategoryID,c.CategoryName HAVING COUNT(*) >=5 ORDER BY NumberOfProductsByCategory DESC
The query above gets products that are more than $10, and gets the categories that has more than 9 products. The two filters are separate but they work together nicely with the WHERE and HAVING

Subscribe to:
Post Comments (Atom)
Search This Blog
Tags
Web Development
Linux
Javascript
DATA
CentOS
ASPNET
SQL Server
Cloud Computing
ASP.NET Core
ASP.NET MVC
SQL
Virtualization
AWS
Database
ADO.NET
AngularJS
C#
CSS
EC2
Iaas
System Administrator
Azure
Computer Programming
JQuery
Coding
ASP.NET MVC 5
Entity Framework Core
Web Design
Infrastructure
Networking
Visual Studio
Errors
T-SQL
Ubuntu
Stored Procedures
ACME Bank
Bootstrap
Computer Networking
Entity Framework
Load Balancer
MongoDB
NoSQL
Node.js
Oracle
VirtualBox
Container
Docker
Fedora
Java
Source Control
git
ExpressJS
MySQL
NuGet
Blogger
Blogging
Bower.js
Data Science
JSON
JavaEE
Web Api
DBMS
DevOps
HTML5
MVC
SPA
Storage
github
AJAX
Big Data
Design Pattern
Eclipse IDE
Elastic IP
GIMP
Graphics Design
Heroku
Linux Mint
Postman
R
SSL
Security
Visual Studio Code
ASP.NET MVC 4
CLI
Linux Commands
Powershell
Python
Server
Software Development
Subnets
Telerik
VPC
Windows Server 2016
angular-seed
font-awesome
log4net
servlets
tomcat
AWS CloudWatch
Active Directory
Angular
Blockchain
Collections
Compatibility
Cryptocurrency
DIgital Life
DNS
Downloads
Google Blogger
Google Chrome
Google Fonts
Hadoop
IAM
KnockoutJS
LINQ
Linux Performance
Logging
Mobile-First
Open Source
Prototype
R Programming
Responsive
Route 53
S3
SELinux
Software
Unix
View
Web Forms
WildFly
XML
cshtml
githu
This comment has been removed by the author.
ReplyDeleteGood teaching skills you have I must say. I was so dumb at SQL. Now I think I learned so many things from here. It was a good day for me. But i am hopping you will publish composable storage related article soon.
ReplyDeleteNow you won't have to stress about working Top Audit Firms In Dubai with a mass of inexperienced and incompetent financial accountants.
ReplyDelete