Use SQL Group By in Excel VBA

Note: This webpage is designed to accompany the Youtube video posted above. The video offers detailed explanation on all topics; while this webpage will serve as a repository for the code presented in the video.

This tutorial will cover how to use the SQL Group By clause in Excel VBA.

While this may seem like a random video about a specific topic, I assure you, its not. We will cover the broader topic of

  1. How to use Excel like a database
  2. How to use ADO or ActiveX Data Objects to connect to the Excel application
  3. How to use SQL in Excel
  4. How to aggregate multiple columns of data quickly and efficiently using Group By
  5. How to sum data after aggregating it

So, in addition to the Group By clause, hopefully this video will open more possibilities for you to add to your coding arsenal. For example, you could use the same technique to extract data out of Ms Access or SQL server by making a few modifications.

Basic VBA Query with Group By

In this section, we will look at how to create a basic procedure where we will group the data as per the requirements.

In order to be able to run SQL queries in VBA, we will need to reference the ActiveX Data Object (ADO) Library.

You can add the reference by going to VB Editor -> Tools -> Reference and tick mark against the latest ADO library.

The connection string used in this code can be found at https://www.connectionstrings.com/excel/.

We will follow the below steps.

  1. Add a reference to ADO library
  2. Create a new ADO connection
  3. Open the ADO connection
  4. Create a new ADO recordset
  5. Define SQL query
  6. Open the ADO recordset
  7. Extract data into worksheet
  8. Close the ADO recordset
  9. Close the ADO connection
  10. Release the objects

What is missing is some error handling, which we will look at in the next section.

Option Explicit

Sub Ado_Vanilla()

Dim conn As ADODB.Connection
Set conn = New ADODB.Connection

conn.Open "Provider=Microsoft.ACE.OLEDB.12.0;Data Source= " & ThisWorkbook.FullName & ";Extended Properties=""Excel 12.0;HDR=YES"";"

Dim rs As ADODB.Recordset
Set rs = New ADODB.Recordset

Dim qSelectInvDetails As String
qSelectInvDetails = "Select InvoiceNumber, InvoiceDate, Customer, sum(InvoiceAmount) as InvAmt, sum(Tax) as Tax, sum(GrossAmount) as GrossAmt" & _
                    " From [InvoiceDetail$]" & _
                    " Group By InvoiceNumber, InvoiceDate, Customer "

rs.Open qSelectInvDetails, conn

wsInvHeader.Cells.ClearContents

Dim i As Long
For i = 0 To rs.Fields.Count - 1
    wsInvHeader.Cells(1, i + 1).Value = rs.Fields(i).Name
Next i

wsInvHeader.Range("A2").CopyFromRecordset rs
wsInvHeader.Columns("A:F").AutoFit

rs.Close
conn.Close

Set rs = Nothing
Set conn = Nothing

End Sub

De-construct the Group By Statement

Here, we will deconstruct the SQL query used in the previous section.

SQL Query 1: Select every Invoice Number field from the Invoice Detail worksheet.

qSelectInvDetails = Select InvoiceNumber" & _
" From [InvoiceDetail$]"

SQL Query 2: Select unique Invoice Numbers from the Invoice Detail worksheet. Note: The more appropriate way to find unique values would be by using the Select Distinct statement. However, for the purpose of this video, we will stick to Group By, as this will build up to our full SQL statement as seen in the first section.

qSelectInvDetails = Select InvoiceNumber" & _
" From [InvoiceDetail$]" & _
" Group By InvoiceNumber"

SQL Query 3: Select unique Invoice Numbers and sum the Gross Amounts for that invoice.

qSelectInvDetails = Select InvoiceNumber, sum(GrossAmount) as GrossAmt " & _
" From [InvoiceDetail$]" & _
" Group By InvoiceNumber"

SQL Query 4: Select unique combination of Invoice Numbers, Invoice Dates and Customers and sum the Gross Amounts for that combination.

qSelectInvDetails = Select InvoiceNumber, InvoiceDate, Customer, sum(GrossAmount) as GrossAmt " & _
" From [InvoiceDetail$]" & _
" Group By InvoiceNumber, InvoiceDate, Customer"

SQL Query 5: Select unique combination of Invoice Numbers, Invoice Dates and Customers and sum the Gross Amounts, Tax amounts and Invoice Amounts for that combination.

qSelectInvDetails = Select InvoiceNumber, InvoiceDate, Customer, sum(InvoiceAmount) as InvAmt, sum(Tax) as Tax, sum(GrossAmount) as GrossAmt " & _
" From [InvoiceDetail$]" & _
" Group By InvoiceNumber, InvoiceDate, Customer"

SQL Query 6: Select unique combination of Invoice Numbers, Invoice Dates and Customers and sum the Gross Amounts, Tax amounts and Invoice Amounts for that combination. Do this only for customers starting with the name Mighty.

qSelectInvDetails = Select InvoiceNumber, InvoiceDate, Customer, sum(InvoiceAmount) as InvAmt, sum(Tax) as Tax, sum(GrossAmount) as GrossAmt " & _
" From [InvoiceDetail$]" & _
" Where Customer like ‘Mighty%’" & _
" Group By InvoiceNumber, InvoiceDate, Customer"

SQL Query 7: Select unique combination of Invoice Numbers, Invoice Dates and Customers and sum the Gross Amounts, Tax amounts and Invoice Amounts for that combination. Sort these results by Customer (ascending) and Invoice Date (descending).

qSelectInvDetails = Select InvoiceNumber, InvoiceDate, Customer, sum(InvoiceAmount) as InvAmt, sum(Tax) as Tax, sum(GrossAmount) as GrossAmt " & _
" From [InvoiceDetail$]" & _
" Group By InvoiceNumber, InvoiceDate, Customer" & _
" Order By Customer, InvoiceDate Desc"

SQL Query 8: Select unique combination of Invoice Numbers, Invoice Dates and Customers and sum the Gross Amounts for that combination and count the number of lines within that combination.

qSelectInvDetails = "Select InvoiceNumber, Customer, InvoiceDate, sum(GrossAmount) as GrossAmt, count(InvoiceNumber) as Count_of_Lines" & _
" From [InvoiceDetail$]" & _
" Group By InvoiceNumber, Customer, InvoiceDate"

Error Handling

Now, we will cover some error handling scenarios. This will ensure that we gracefully exit the sub by closing any open connections or recordsets. We will place the error handler right at the beginning of the sub. Any error will take us directly to the ErrorHandler line, where we will print the error description. Next, the code will direct us to the CleanExit line, where we first check if an instance of the recordset has been created. If so, we will discard it. Next we check, whether the recordset has been opened. If so, we will close it. And we will repeat the same steps for the connection.

Option Explicit

Sub Ado_w_ErrorHandling()

On Error GoTo ErrorHandler

Dim conn As ADODB.Connection
Set conn = New ADODB.Connection

conn.Open "Provider=Microsoft.ACE.OLEDB.12.0;Data Source= " & ThisWorkbook.FullName & ";Extended Properties=""Excel 12.0;HDR=YES"";"

Dim rs As ADODB.Recordset
Set rs = New ADODB.Recordset

Dim qSelectInvDetails As String
qSelectInvDetails = "Select InvoiceNumber, InvoiceDate, Customer, sum(InvoiceAmount) as InvAmt, sum(Tax) as Tax, sum(GrossAmount) as GrossAmt" & _
                    "From [InvoiceDetail$]" & _
                    " Group By InvoiceNumber, InvoiceDate, Customer "

rs.Open qSelectInvDetails, conn

wsInvHeader.Cells.ClearContents

Dim i As Long
For i = 0 To rs.Fields.Count - 1
    wsInvHeader.Cells(1, i + 1).Value = rs.Fields(i).Name
Next i

wsInvHeader.Range("A2").CopyFromRecordset rs
wsInvHeader.Columns("A:F").AutoFit

CleanExit:

If Not rs Is Nothing Then
    If rs.State = 1 Then
        rs.Close
    End If
    Set rs = Nothing
End If

If Not conn Is Nothing Then
    If conn.State = 1 Then
        conn.Close
    End If
    Set conn = Nothing
End If

Exit Sub

ErrorHandler:
Debug.Print "Error#: " & Err.Number & vbCrLf & vbCrLf & Err.Description
Resume CleanExit

End Sub

Dumping Results into an Array

In this section, we will look at how to dump the results of a SQL query directly into an array. Up until now, the results from the recordset were been dumped directly into the worksheet using the CopyFromRecordset() method of the Range object. But, now we will use the GetRows() of the recordset, to offload the results of the query directly into an array. These results are in transposed form. We will need to transpose these back into a second array using the Application.Transpose function.

Option Explicit

Sub Ado_w_Array()

On Error GoTo ErrorHandler

Dim conn As ADODB.Connection
Set conn = New ADODB.Connection

conn.Open "Provider=Microsoft.ACE.OLEDB.12.0;Data Source= " & ThisWorkbook.FullName & ";Extended Properties=""Excel 12.0;HDR=YES"";"

Dim rs As ADODB.Recordset
Set rs = New ADODB.Recordset

Dim qSelectInvDetails As String
qSelectInvDetails = "Select InvoiceNumber, InvoiceDate, Customer, sum(InvoiceAmount) as InvAmt, sum(Tax) as Tax, sum(GrossAmount) as GrossAmt" & _
                    " From [InvoiceDetail$]" & _
                    " Group By InvoiceNumber, InvoiceDate, Customer "

rs.Open qSelectInvDetails, conn

wsInvHeader.Cells.ClearContents

Dim i As Long
For i = 0 To rs.Fields.Count - 1
    wsInvHeader.Cells(1, i + 1).Value = rs.Fields(i).Name
Next i

Dim arrRs1() As Variant, arrRs2() As Variant

arrRs1 = rs.GetRows
ReDim arrRs2(1 To UBound(arrRs1, 2) + 1, 1 To UBound(arrRs1, 1) + 1)
arrRs2 = Application.Transpose(arrRs1)
wsInvHeader.Range(wsInvHeader.Cells(2, 1), wsInvHeader.Cells(UBound(arrRs2, 1) + 1, UBound(arrRs2, 2))).Value = arrRs2

'wsInvHeader.Range("A2").CopyFromRecordset rs
wsInvHeader.Columns("A:F").AutoFit

CleanExit:

If Not rs Is Nothing Then
    If rs.State = 1 Then
        rs.Close
    End If
    Set rs = Nothing
End If

If Not conn Is Nothing Then
    If conn.State = 1 Then
        conn.Close
    End If
    Set conn = Nothing
End If

Exit Sub

ErrorHandler:
Debug.Print "Error#: " & Err.Number & vbCrLf & vbCrLf & Err.Description
Resume CleanExit

End Sub