Excel VBA: Compare Data [Before & After Versions]

Note: This webpage is designed to accompany the Youtube video posted above. The video offers detailed explanation on all topics; while this webpage will serve as a repository for the code presented in the video.

Macro file can be downloaded from Github here.

In this Project, we will cover a useful way to compare two versions of data from the same source i.e. a before and after version; using Excel VBA. This would be applicable for conducting audits of master data sets such as Supplier Master, Customer Master, Price Data, Inventory SOH etc. It could also, be incorporated into data governance models or error reporting.

In our project example, we have two data sets representing Item Master data which lists attributes for each product code. One is a Previous version of the data while the other is the Current version. Our task is to identify what’s changed -> new products added, products deleted and any changes to attributes in existing products. The results will be listed in the New, Deleted and Change worksheets.

Product codes highlighted in Green are New, since they don’t exist in the previous data. Product codes highlighted in Red are Deleted, since they don’t exist in the current data. The fields highlighted in Yellow have changed values. 

Current Data Set
Previous Data Set

Any lookup logic involves comparison on a certain column or group of columns that are unique. In our case, this column is the product code which we will call Item Ids in the macro code.

We will look at different techniques to compare data starting with long-form way of using Nested For Loops. Then, we will perform the same task using Range.Find(). Later, we will replace Find with dictionaries, which is my preferred way. Lastly, we will modify the code to make it work on any dataset. All you’ll need to do is specify which column to lookup against.

Common Update Sheets macro

This macro will be used or called in every main sub procedure that we create below. It clears contents in the Change New and Deleted worksheets and adds in the Column headers.

Sub UpdateSheets()

wsChng.Cells.ClearContents
wsNew.Cells.ClearContents
wsDel.Cells.ClearContents
wsChng.Range("A1").Value = "Item"
wsChng.Range("B1").Value = "Changed Field"
wsChng.Range("C1").Value = "Previous Value"
wsChng.Range("D1").Value = "Current Value"
wsNew.Range("A1").Value = "Item"
wsDel.Range("A1").Value = "Item"

End Sub

Section One: Nested For Loop

This is the brute force technique where we will preform an outer loop on the Current data set, grab the product code (item id) and for each product code, loop over the Previous data on the product codes to find a match, and for each match loop across the columns to check whether a value has changed. Changes are logged in the first available row in the Change worksheet. If no match is found, then its a new product code and will be logged in the New worksheet.

Then, we will perform another loop over Previous data and check for the product code in the Current worksheet. If no match is found, then that product code has been deleted and would need to be logged in the Deleted worksheet.

Sub CompareData01()

Call UpdateSheets

Dim lrowPrev As Long, lrowCur As Long
Dim lcolCur As Long
lrowPrev = wsPrev.Range("A1").CurrentRegion.Rows.Count
lrowCur = wsCur.Range("A1").CurrentRegion.Rows.Count
lcolCur = wsCur.Range("A1").CurrentRegion.Columns.Count

Dim i As Long, j As Long, x As Long
Dim itemIdCur As Long, itemIdPrev As Long
Dim lrowNew As Long, lrowChng As Long

For i = 2 To lrowCur
    itemIdCur = wsCur.Cells(i, 1).Value
    For j = 2 To lrowPrev
        itemIdPrev = wsPrev.Cells(j, 1).Value
        If itemIdCur = itemIdPrev Then
            For x = 2 To lcolCur
                If wsCur.Cells(i, x).Value <> wsPrev.Cells(j, x).Value Then
                    lrowChng = wsChng.Range("A1").CurrentRegion.Rows.Count + 1
                    wsChng.Cells(lrowChng, 1).Value = itemIdCur
                    wsChng.Cells(lrowChng, 2).Value = wsPrev.Cells(1, x).Value
                    wsChng.Cells(lrowChng, 3).Value = wsPrev.Cells(j, x).Value
                    wsChng.Cells(lrowChng, 4).Value = wsCur.Cells(i, x).Value
                End If
            Next x
            GoTo LineNext
        End If
    Next j
' No Match was found
    lrowNew = wsNew.Range("A1").CurrentRegion.Rows.Count + 1
    wsNew.Cells(lrowNew, 1).Value = itemIdCur
LineNext:
Next i

''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'Find Deleted Lines
Dim lrowDel As Long
For i = 2 To lrowPrev
    itemIdPrev = wsPrev.Cells(i, 1).Value
    For j = 2 To lrowCur
        itemIdCur = wsCur.Cells(j, 1).Value
        If itemIdCur = itemIdPrev Then
            GoTo LineNext2
        End If
    Next j
' No Match was found
    lrowDel = wsDel.Range("A1").CurrentRegion.Rows.Count + 1
    wsDel.Cells(lrowDel, 1).Value = itemIdPrev
LineNext2:
Next i
End Sub

Section Two: Range.Find

This is much faster than the previous code. Here, we will only loop over the Current Data, grab the product code and search (Range.Find) each product code within the Previous Data. If nothing is found, log it in the New worksheet. For matches found, loop across the columns to check for changes and log any change in the Change worksheet.

Then, do the opposite and search each product code from the Previous data within the Current data column of product codes. If nothing is found, log it in the Deleted worksheet.

Sub CompareData02()

Call UpdateSheets

Dim lrowPrev As Long, lrowCur As Long
Dim lcolCur As Long
lrowPrev = wsPrev.Range("A1").CurrentRegion.Rows.Count
lrowCur = wsCur.Range("A1").CurrentRegion.Rows.Count
lcolCur = wsCur.Range("A1").CurrentRegion.Columns.Count

Dim i As Long, j As Long, x As Long
Dim itemIdCur As Long, itemIdPrev As Long
Dim lrowNew As Long, lrowChng As Long
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
    'Declare Range for Find
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Dim rngPrev As Range
Set rngPrev = wsPrev.Range("A1").CurrentRegion.Columns(1)
Dim rngFound As Range
Dim itemRow As Long
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
For i = 2 To lrowCur
    itemIdCur = wsCur.Cells(i, 1).Value
'Find New Items
    Set rngFound = rngPrev.Find( _
                        what:=itemIdCur, _
                        lookat:=xlWhole)
    If rngFound Is Nothing Then
        lrowNew = wsNew.Range("A1").CurrentRegion.Rows.Count + 1
        wsNew.Cells(lrowNew, 1).Value = itemIdCur
        GoTo LineNext
    End If
'Assign Item Row
    itemRow = rngFound.Row
    Set rngFound = Nothing
    For x = 2 To lcolCur
        If wsCur.Cells(i, x).Value <> wsPrev.Cells(itemRow, x).Value Then
            lrowChng = wsChng.Range("A1").CurrentRegion.Rows.Count + 1
            wsChng.Cells(lrowChng, 1).Value = itemIdCur
            wsChng.Cells(lrowChng, 2).Value = wsPrev.Cells(1, x).Value
            wsChng.Cells(lrowChng, 3).Value = wsPrev.Cells(itemRow, x).Value
            wsChng.Cells(lrowChng, 4).Value = wsCur.Cells(i, x).Value
        End If
    Next x
LineNext:
Next i

''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'Find Deleted Lines
Dim lrowDel As Long
Dim rngCur As Range
Set rngCur = wsCur.Range("A1").CurrentRegion
For i = 2 To lrowPrev
    itemIdPrev = wsPrev.Cells(i, 1).Value
'Find New Items
    Set rngFound = rngCur.Find( _
                        what:=itemIdPrev, _
                        lookat:=xlWhole)
    If rngFound Is Nothing Then
        lrowDel = wsDel.Range("A1").CurrentRegion.Rows.Count + 1
        wsDel.Cells(lrowDel, 1).Value = itemIdPrev
    End If
    Set rngFound = Nothing
Next i

End Sub

Section Three: Dictionary

This would be my preferred way of conducting this operation. Dictionaries would be more efficient than Range.Find as the size of the data set increases.

Loop over Previous data and add product codes as key and row number as value into a dictionary.

Then, loop over the Current Data, grab the product code and check whether each product code exists within the dictionary of previous codes. If the current code doesn’t exists in the dictionary, log it in the New worksheet. if it does, grab the row number from the dictionary and loop across the columns to check for changes and log any change in the Change worksheet.

Then, do the opposite and search each product code from the Previous data in a dictionary of current product codes. If it doesn’t exists, log it in the Deleted worksheet.

Sub CompareData03()

Call UpdateSheets

Dim lrowPrev As Long, lrowCur As Long
Dim lcolCur As Long
lrowPrev = wsPrev.Range("A1").CurrentRegion.Rows.Count
lrowCur = wsCur.Range("A1").CurrentRegion.Rows.Count
lcolCur = wsCur.Range("A1").CurrentRegion.Columns.Count

Dim i As Long, j As Long, x As Long
Dim itemIdCur As Long, itemIdPrev As Long
Dim lrowNew As Long, lrowChng As Long
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
    'Put Prev Item Ids into a Dictionary
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Dim dicPrev As Object
Set dicPrev = CreateObject("Scripting.Dictionary")
Dim sKey As Long
Dim lValue As Long
For i = 2 To lrowPrev
    sKey = wsPrev.Cells(i, 1).Value
    lValue = i
    dicPrev(sKey) = lValue
Next i
Dim itemRow As Long
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
For i = 2 To lrowCur
    itemIdCur = wsCur.Cells(i, 1).Value
    If Not dicPrev.exists(itemIdCur) Then
        lrowNew = wsNew.Range("A1").CurrentRegion.Rows.Count + 1
        wsNew.Cells(lrowNew, 1).Value = itemIdCur
        GoTo LineNext
    End If
'Assign Item Row
    itemRow = dicPrev(itemIdCur)
    For x = 2 To lcolCur
        If wsCur.Cells(i, x).Value <> wsPrev.Cells(itemRow, x).Value Then
            lrowChng = wsChng.Range("A1").CurrentRegion.Rows.Count + 1
            wsChng.Cells(lrowChng, 1).Value = itemIdCur
            wsChng.Cells(lrowChng, 2).Value = wsPrev.Cells(1, x).Value
            wsChng.Cells(lrowChng, 3).Value = wsPrev.Cells(itemRow, x).Value
            wsChng.Cells(lrowChng, 4).Value = wsCur.Cells(i, x).Value
        End If
    Next x
LineNext:
Next i

''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'Find Deleted Lines

Dim dicCur As Object
Set dicCur = CreateObject("Scripting.Dictionary")
For i = 2 To lrowCur
    sKey = wsCur.Cells(i, 1).Value
    lValue = i
    dicCur(sKey) = lValue
Next i

Dim lrowDel As Long
For i = 2 To lrowPrev
    itemIdPrev = wsPrev.Cells(i, 1).Value
'Find New Items
    If Not dicCur.exists(itemIdPrev) Then
        lrowDel = wsDel.Range("A1").CurrentRegion.Rows.Count + 1
        wsDel.Cells(lrowDel, 1).Value = itemIdPrev
    End If
Next i

End Sub

Section Four: Dynamic Code

Here, we will make our code dynamic i.e. it will work on any dataset. Till now we have hard-coded the column 1 (with product codes) as the column number to perform the lookup logic on. Now, we will create a constant at the top of the macro, which represents the column number. The user can change this to a new lookup column number for a different dataset with no further changes to the code.

Public Const compareID As Long = 1

Sub CompareData04()

Call UpdateSheets

Dim lrowPrev As Long, lrowCur As Long
Dim lcolCur As Long
lrowPrev = wsPrev.Range("A1").CurrentRegion.Rows.Count
lrowCur = wsCur.Range("A1").CurrentRegion.Rows.Count
lcolCur = wsCur.Range("A1").CurrentRegion.Columns.Count


Dim i As Long, j As Long, x As Long
'Change to Variant
Dim itemIdCur As Variant, itemIdPrev As Variant
Dim lrowNew As Long, lrowChng As Long

For i = 2 To lrowCur
    itemIdCur = wsCur.Cells(i, compareID).Value
    For j = 2 To lrowPrev
        itemIdPrev = wsPrev.Cells(j, compareID).Value
        If itemIdCur = itemIdPrev Then
'Change to 1
            For x = 1 To lcolCur
                If wsCur.Cells(i, x).Value <> wsPrev.Cells(j, x).Value Then
                    lrowChng = wsChng.Range("A1").CurrentRegion.Rows.Count + 1
                    wsChng.Cells(lrowChng, 1).Value = itemIdCur
                    wsChng.Cells(lrowChng, 2).Value = wsPrev.Cells(1, x).Value
                    wsChng.Cells(lrowChng, 3).Value = wsPrev.Cells(j, x).Value
                    wsChng.Cells(lrowChng, 4).Value = wsCur.Cells(i, x).Value
                End If
            Next x
            GoTo LineNext
        End If
    Next j
' No Match was found
    lrowNew = wsNew.Range("A1").CurrentRegion.Rows.Count + 1
    wsNew.Cells(lrowNew, 1).Value = itemIdCur
LineNext:
Next i

''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'Find Deleted Lines
Dim lrowDel As Long
For i = 2 To lrowPrev
    itemIdPrev = wsPrev.Cells(i, compareID).Value
    For j = 2 To lrowCur
        itemIdCur = wsCur.Cells(j, compareID).Value
        If itemIdCur = itemIdPrev Then
            GoTo LineNext2
        End If
    Next j
' No Match was found
    lrowDel = wsDel.Range("A1").CurrentRegion.Rows.Count + 1
    wsDel.Cells(lrowDel, 1).Value = itemIdPrev
LineNext2:
Next i
End Sub