Visit my Youtube Channel
Use SQL Group By in Excel VBA
Home
Note: This webpage is designed to accompany the Youtube video posted above. The video offers detailed explanation on all topics; while this webpage will serve as a repository for the code presented in the video.
This tutorial will cover how to use the SQL Group By clause in Excel VBA.
While this may seem like a random video about a specific topic, I assure you, its not. We will cover the broader topic of
- How to use Excel like a database
- How to use ADO or ActiveX Data Objects to connect to the Excel application
- How to use SQL in Excel
- How to aggregate multiple columns of data quickly and efficiently using Group By
- How to sum data after aggregating it
So, in addition to the Group By clause, hopefully this video will open more possibilities for you to add to your coding arsenal. For example, you could use the same technique to extract data out of Ms Access or SQL server by making a few modifications.
Basic VBA Query with Group By
In this section, we will look at how to create a basic procedure where we will group the data as per the requirements.
In order to be able to run SQL queries in VBA, we will need to reference the ActiveX Data Object (ADO) Library.
You can add the reference by going to VB Editor -> Tools -> Reference and tick mark against the latest ADO library.
The connection string used in this code can be found at https://www.connectionstrings.com/excel/.
We will follow the below steps.
- Add a reference to ADO library
- Create a new ADO connection
- Open the ADO connection
- Create a new ADO recordset
- Define SQL query
- Open the ADO recordset
- Extract data into worksheet
- Close the ADO recordset
- Close the ADO connection
- Release the objects
What is missing is some error handling, which we will look at in the next section.
Option Explicit
Sub Ado_Vanilla()
Dim conn As ADODB.Connection
Set conn = New ADODB.Connection
conn.Open "Provider=Microsoft.ACE.OLEDB.12.0;Data Source= " & ThisWorkbook.FullName & ";Extended Properties=""Excel 12.0;HDR=YES"";"
Dim rs As ADODB.Recordset
Set rs = New ADODB.Recordset
Dim qSelectInvDetails As String
qSelectInvDetails = "Select InvoiceNumber, InvoiceDate, Customer, sum(InvoiceAmount) as InvAmt, sum(Tax) as Tax, sum(GrossAmount) as GrossAmt" & _
" From [InvoiceDetail$]" & _
" Group By InvoiceNumber, InvoiceDate, Customer "
rs.Open qSelectInvDetails, conn
wsInvHeader.Cells.ClearContents
Dim i As Long
For i = 0 To rs.Fields.Count - 1
wsInvHeader.Cells(1, i + 1).Value = rs.Fields(i).Name
Next i
wsInvHeader.Range("A2").CopyFromRecordset rs
wsInvHeader.Columns("A:F").AutoFit
rs.Close
conn.Close
Set rs = Nothing
Set conn = Nothing
End Sub
De-construct the Group By Statement
Here, we will deconstruct the SQL query used in the previous section.
SQL Query 1: Select every Invoice Number field from the Invoice Detail worksheet.
qSelectInvDetails = Select InvoiceNumber" & _
" From [InvoiceDetail$]"
SQL Query 2: Select unique Invoice Numbers from the Invoice Detail worksheet. Note: The more appropriate way to find unique values would be by using the Select Distinct statement. However, for the purpose of this video, we will stick to Group By, as this will build up to our full SQL statement as seen in the first section.
qSelectInvDetails = Select InvoiceNumber" & _
" From [InvoiceDetail$]" & _
" Group By InvoiceNumber"
SQL Query 3: Select unique Invoice Numbers and sum the Gross Amounts for that invoice.
qSelectInvDetails = Select InvoiceNumber, sum(GrossAmount) as GrossAmt " & _
" From [InvoiceDetail$]" & _
" Group By InvoiceNumber"
SQL Query 4: Select unique combination of Invoice Numbers, Invoice Dates and Customers and sum the Gross Amounts for that combination.
qSelectInvDetails = Select InvoiceNumber, InvoiceDate, Customer, sum(GrossAmount) as GrossAmt " & _
" From [InvoiceDetail$]" & _
" Group By InvoiceNumber, InvoiceDate, Customer"
SQL Query 5: Select unique combination of Invoice Numbers, Invoice Dates and Customers and sum the Gross Amounts, Tax amounts and Invoice Amounts for that combination.
qSelectInvDetails = Select InvoiceNumber, InvoiceDate, Customer, sum(InvoiceAmount) as InvAmt, sum(Tax) as Tax, sum(GrossAmount) as GrossAmt " & _
" From [InvoiceDetail$]" & _
" Group By InvoiceNumber, InvoiceDate, Customer"
SQL Query 6: Select unique combination of Invoice Numbers, Invoice Dates and Customers and sum the Gross Amounts, Tax amounts and Invoice Amounts for that combination. Do this only for customers starting with the name Mighty.
qSelectInvDetails = Select InvoiceNumber, InvoiceDate, Customer, sum(InvoiceAmount) as InvAmt, sum(Tax) as Tax, sum(GrossAmount) as GrossAmt " & _
" From [InvoiceDetail$]" & _
" Where Customer like ‘Mighty%’" & _
" Group By InvoiceNumber, InvoiceDate, Customer"
SQL Query 7: Select unique combination of Invoice Numbers, Invoice Dates and Customers and sum the Gross Amounts, Tax amounts and Invoice Amounts for that combination. Sort these results by Customer (ascending) and Invoice Date (descending).
qSelectInvDetails = Select InvoiceNumber, InvoiceDate, Customer, sum(InvoiceAmount) as InvAmt, sum(Tax) as Tax, sum(GrossAmount) as GrossAmt " & _
" From [InvoiceDetail$]" & _
" Group By InvoiceNumber, InvoiceDate, Customer" & _
" Order By Customer, InvoiceDate Desc"
SQL Query 8: Select unique combination of Invoice Numbers, Invoice Dates and Customers and sum the Gross Amounts for that combination and count the number of lines within that combination.
qSelectInvDetails = "Select InvoiceNumber, Customer, InvoiceDate, sum(GrossAmount) as GrossAmt, count(InvoiceNumber) as Count_of_Lines" & _
" From [InvoiceDetail$]" & _
" Group By InvoiceNumber, Customer, InvoiceDate"
Error Handling
Now, we will cover some error handling scenarios. This will ensure that we gracefully exit the sub by closing any open connections or recordsets. We will place the error handler right at the beginning of the sub. Any error will take us directly to the ErrorHandler line, where we will print the error description. Next, the code will direct us to the CleanExit line, where we first check if an instance of the recordset has been created. If so, we will discard it. Next we check, whether the recordset has been opened. If so, we will close it. And we will repeat the same steps for the connection.
Option Explicit
Sub Ado_w_ErrorHandling()
On Error GoTo ErrorHandler
Dim conn As ADODB.Connection
Set conn = New ADODB.Connection
conn.Open "Provider=Microsoft.ACE.OLEDB.12.0;Data Source= " & ThisWorkbook.FullName & ";Extended Properties=""Excel 12.0;HDR=YES"";"
Dim rs As ADODB.Recordset
Set rs = New ADODB.Recordset
Dim qSelectInvDetails As String
qSelectInvDetails = "Select InvoiceNumber, InvoiceDate, Customer, sum(InvoiceAmount) as InvAmt, sum(Tax) as Tax, sum(GrossAmount) as GrossAmt" & _
"From [InvoiceDetail$]" & _
" Group By InvoiceNumber, InvoiceDate, Customer "
rs.Open qSelectInvDetails, conn
wsInvHeader.Cells.ClearContents
Dim i As Long
For i = 0 To rs.Fields.Count - 1
wsInvHeader.Cells(1, i + 1).Value = rs.Fields(i).Name
Next i
wsInvHeader.Range("A2").CopyFromRecordset rs
wsInvHeader.Columns("A:F").AutoFit
CleanExit:
If Not rs Is Nothing Then
If rs.State = 1 Then
rs.Close
End If
Set rs = Nothing
End If
If Not conn Is Nothing Then
If conn.State = 1 Then
conn.Close
End If
Set conn = Nothing
End If
Exit Sub
ErrorHandler:
Debug.Print "Error#: " & Err.Number & vbCrLf & vbCrLf & Err.Description
Resume CleanExit
End Sub
Dumping Results into an Array
In this section, we will look at how to dump the results of a SQL query directly into an array. Up until now, the results from the recordset were been dumped directly into the worksheet using the CopyFromRecordset() method of the Range object. But, now we will use the GetRows() of the recordset, to offload the results of the query directly into an array. These results are in transposed form. We will need to transpose these back into a second array using the Application.Transpose function.
Option Explicit
Sub Ado_w_Array()
On Error GoTo ErrorHandler
Dim conn As ADODB.Connection
Set conn = New ADODB.Connection
conn.Open "Provider=Microsoft.ACE.OLEDB.12.0;Data Source= " & ThisWorkbook.FullName & ";Extended Properties=""Excel 12.0;HDR=YES"";"
Dim rs As ADODB.Recordset
Set rs = New ADODB.Recordset
Dim qSelectInvDetails As String
qSelectInvDetails = "Select InvoiceNumber, InvoiceDate, Customer, sum(InvoiceAmount) as InvAmt, sum(Tax) as Tax, sum(GrossAmount) as GrossAmt" & _
" From [InvoiceDetail$]" & _
" Group By InvoiceNumber, InvoiceDate, Customer "
rs.Open qSelectInvDetails, conn
wsInvHeader.Cells.ClearContents
Dim i As Long
For i = 0 To rs.Fields.Count - 1
wsInvHeader.Cells(1, i + 1).Value = rs.Fields(i).Name
Next i
Dim arrRs1() As Variant, arrRs2() As Variant
arrRs1 = rs.GetRows
ReDim arrRs2(1 To UBound(arrRs1, 2) + 1, 1 To UBound(arrRs1, 1) + 1)
arrRs2 = Application.Transpose(arrRs1)
wsInvHeader.Range(wsInvHeader.Cells(2, 1), wsInvHeader.Cells(UBound(arrRs2, 1) + 1, UBound(arrRs2, 2))).Value = arrRs2
'wsInvHeader.Range("A2").CopyFromRecordset rs
wsInvHeader.Columns("A:F").AutoFit
CleanExit:
If Not rs Is Nothing Then
If rs.State = 1 Then
rs.Close
End If
Set rs = Nothing
End If
If Not conn Is Nothing Then
If conn.State = 1 Then
conn.Close
End If
Set conn = Nothing
End If
Exit Sub
ErrorHandler:
Debug.Print "Error#: " & Err.Number & vbCrLf & vbCrLf & Err.Description
Resume CleanExit
End Sub