When merging two datasets with a many-to-many relationship on the join key, what is the most critical preprocessing step?
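One way to ground this question is to look at what a many-to-many join does to row counts: for each key present on both sides, an inner join emits the product of that key's occurrence counts. The sketch below is a minimal illustration in plain Python, not a prescribed method. The helper names (`key_counts`, `duplicated_keys`, `predicted_join_rows`) and the sample `orders` and `contacts` records are hypothetical, introduced only for this example. It profiles duplicate keys on each side and predicts the size of the inner join before any merge is run.

```python
from collections import Counter

def key_counts(rows, key):
    """Count how many times each join-key value occurs in a list of dict rows."""
    return Counter(row[key] for row in rows)

def duplicated_keys(rows, key):
    """Return the sorted key values that occur more than once."""
    return sorted(k for k, n in key_counts(rows, key).items() if n > 1)

def predicted_join_rows(left, right, key):
    """Predict the inner-join row count: sum over shared keys of left_count * right_count."""
    left_counts = key_counts(left, key)
    right_counts = key_counts(right, key)
    shared = left_counts.keys() & right_counts.keys()
    return sum(left_counts[k] * right_counts[k] for k in shared)

# Hypothetical sample data: customer_id 1 is duplicated on BOTH sides.
orders = [
    {"customer_id": 1, "order": "A"},
    {"customer_id": 1, "order": "B"},
    {"customer_id": 2, "order": "C"},
]
contacts = [
    {"customer_id": 1, "email": "x@example.com"},
    {"customer_id": 1, "email": "y@example.com"},
    {"customer_id": 2, "email": "z@example.com"},
]

left_dupes = duplicated_keys(orders, "customer_id")
right_dupes = duplicated_keys(contacts, "customer_id")
expected_rows = predicted_join_rows(orders, contacts, "customer_id")

print("duplicated on left:", left_dupes)
print("duplicated on right:", right_dupes)
print("predicted inner-join rows:", expected_rows)
```

If the predicted count is much larger than either input, that suggests the key is duplicated on both sides. Whether to deduplicate, aggregate, or add columns to the join key then depends on what each row is meant to represent.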