For many professionals, Microsoft Excel is an indispensable tool, yet a deep understanding of its advanced functionalities, particularly VBA, remains elusive for some. This guide addresses a common need: efficiently merging data from multiple Excel files and subsequently cleaning it by removing duplicate entries. We’ll provide VBA macro solutions designed for ease of use, allowing even novice Excel users to perform these complex operations routinely.
Consolidating Data from Multiple Excel Files
The first challenge is to create a macro that combines data from several Excel workbooks into a single new worksheet. The provided VBA code, CombineData, is designed to achieve this. It lets users select multiple files, opens each one, copies the data from its first sheet, and appends it to a new sheet named “Combined Data.” In the original version that sheet is created in the active workbook; the adaptation for personal.xlsb described in the note below places it in a new workbook instead.
How the CombineData Macro Works:
- Initialization: The macro declares necessary variables and sets up a FileDialog object to enable multi-file selection.
- User Selection: A dialog box prompts the user to select the Excel files they wish to combine.
- New Sheet Creation: If files are selected, a new worksheet titled “Combined Data” is created in the active workbook.
- Data Copying:
- For the first selected file, the entire used range, including headers, is copied to the “Combined Data” sheet.
- For subsequent files, only the data rows (excluding headers) are copied and appended below the existing data in “Combined Data.”
- File Closure: Each source workbook is closed without saving changes to preserve the original files.
- Confirmation: A message box confirms the successful combination of data. If no files are selected, a cancellation message is displayed.
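The steps above can be sketched in VBA roughly as follows. This is a minimal sketch rather than the original macro: the variable names, the file filter, and the message texts are assumptions, and it assumes each source sheet has its header in row 1 with data starting in column A.

```vba
Option Explicit

Sub CombineData()
    Dim fd As FileDialog
    Dim wsDest As Worksheet
    Dim wbSrc As Workbook
    Dim rngSrc As Range
    Dim nextRow As Long
    Dim i As Long

    ' Initialization: file picker that allows selecting several files
    Set fd = Application.FileDialog(msoFileDialogFilePicker)
    With fd
        .AllowMultiSelect = True
        .Title = "Select the Excel files to combine"
        .Filters.Clear
        .Filters.Add "Excel files", "*.xlsx; *.xlsm; *.xls"   ' assumed filter
    End With

    ' User selection: Show returns -1 when the user confirms, 0 on cancel
    If fd.Show <> -1 Then
        MsgBox "No files selected. Operation cancelled."
        Exit Sub
    End If

    ' New sheet creation in the active workbook
    ' (renaming fails if a sheet called "Combined Data" already exists)
    Set wsDest = ActiveWorkbook.Worksheets.Add
    wsDest.Name = "Combined Data"

    nextRow = 1
    For i = 1 To fd.SelectedItems.Count
        Set wbSrc = Workbooks.Open(Filename:=fd.SelectedItems(i), ReadOnly:=True)
        Set rngSrc = wbSrc.Worksheets(1).UsedRange

        ' For every file after the first, drop the header row
        If i > 1 Then
            If rngSrc.Rows.Count > 1 Then
                Set rngSrc = rngSrc.Offset(1, 0).Resize(rngSrc.Rows.Count - 1)
            Else
                Set rngSrc = Nothing   ' header only, nothing to append
            End If
        End If

        ' Append below whatever has been copied so far
        If Not rngSrc Is Nothing Then
            rngSrc.Copy Destination:=wsDest.Cells(nextRow, 1)
            nextRow = nextRow + rngSrc.Rows.Count
        End If

        ' File closure: leave the source file unchanged
        wbSrc.Close SaveChanges:=False
    Next i

    MsgBox "Data combined successfully."
End Sub
```

The destination sheet is stored in a variable before any source file is opened, because opening a workbook makes it the active one.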
Important Note: The original CombineData macro was intended to be saved within the workbook where data consolidation occurs. For routine use across different workbooks, it must be adapted to work when stored in personal.xlsb, so that it operates on a new workbook rather than altering personal.xlsb itself. With that change, running the macro prompts the user to select files and places the combined data in a new workbook, leaving both the source files and personal.xlsb untouched.
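One way to make the destination independent of where the macro is stored is to create the target workbook explicitly, instead of relying on ThisWorkbook (which would be personal.xlsb) or on whichever workbook happens to be active. The helper below is a hypothetical sketch; the name NewCombinedSheet is an assumption, not part of the original macro.

```vba
Option Explicit

' Hypothetical helper: creates a fresh workbook and returns its only sheet,
' renamed to "Combined Data", as the destination for the combined rows.
Function NewCombinedSheet() As Worksheet
    Dim wbDest As Workbook

    ' xlWBATWorksheet creates a new workbook containing a single worksheet
    Set wbDest = Workbooks.Add(xlWBATWorksheet)

    Set NewCombinedSheet = wbDest.Worksheets(1)
    NewCombinedSheet.Name = "Combined Data"
End Function
```

A macro that keeps its destination in a worksheet variable (say, a hypothetical wsDest) could then call `Set wsDest = NewCombinedSheet()` in place of adding a sheet to the active workbook.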
Efficiently Removing Duplicate Rows
Once the data is combined, the next critical step is to eliminate redundant entries. The initial RemoveDuplicates macro recorded by the user had a limitation: it was hardcoded to a specific range ($A$1:$V$5907) and a fixed set of columns. This makes it ineffective for datasets of varying sizes.
Enhancing the RemoveDuplicates Macro:
To address this, the RemoveDuplicates macro needs to be dynamic. Instead of a hardcoded address, a corrected version would work on ActiveSheet.UsedRange, which covers whatever range of data is currently present on the active sheet (Cells.Select is a blunter alternative that selects every cell on the sheet). It should then apply the RemoveDuplicates method to that range.
For instance, to make the macro work with any number of rows and columns, you can modify it as follows:
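The original listing is not reproduced here, so what follows is a minimal sketch under stated assumptions: the procedure name RemoveDuplicatesDynamic is hypothetical, the first row of the sheet is treated as a header, and a row counts as a duplicate only when it matches another row in every column.

```vba
Option Explicit

Sub RemoveDuplicatesDynamic()
    Dim rng As Range
    Dim cols As Variant
    Dim i As Long

    ' Detect the populated range instead of hardcoding $A$1:$V$5907
    Set rng = ActiveSheet.UsedRange

    ' Build a zero-based list of column indexes 1..n covering every column
    ReDim cols(0 To rng.Columns.Count - 1)
    For i = 0 To rng.Columns.Count - 1
        cols(i) = i + 1
    Next i

    ' The parentheses around cols pass the array by value, which
    ' RemoveDuplicates needs when the column list is held in a variable
    rng.RemoveDuplicates Columns:=(cols), Header:=xlYes
End Sub
```

To compare on a fixed subset of columns instead, the loop could be replaced with a literal such as `Array(1, 3)`. If the data has no header row, `Header:=xlNo` would be the appropriate setting.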

