score:1

The concat function in pandas allows one to accomplish something similar to a merge.

I encountered a similar issue and was able to develop a function that "merges" two dataframes assuming a shared index. In my case, I turned a field "PrimaryKey" into the index for each dataframe and then merged the two dataframes using this field.

import pandas as pd

    def merge_dataframe_diff(target_df, source_df, merge_on_fields: [list] = ["PrimaryKey"], drop_index: [bool] = True)):
        """Merges two dataframes based on shared columns
        Assumptions:
            1) The dataframes share all the same columns
            2) There is a "PrimaryKey" column to merge on; this could be modified to include multiple columns
        Args:
            source_df [dataframe]: source dataframe
            target_df [dataframe]: target dataframe (to merge/upsert into)
            merge_on_fields [list]: field(s) to merge the two dataframes on
            drop_index [dataframe]: whether to drop the merge_on_fields from the merged dataframes and the returned dataframe

        Returns:
            full_df [dataframe]: merged dataframe
        """
        source_df.set_index(merge_on_fields, drop=drop_index, append=False, inplace=True, verify_integrity=True)  # Set Pk to index to enable merging
        target_df.set_index(merge_on_fields, drop=drop_index, append=False, inplace=True, verify_integrity=True)
        full_df = pd.concat([target_df[~target_df.index.isin(source_df.index)], source_df])  # SQL merge aka upsert the two dfs using index as "merged on" field          
        full_df = diff_df  # it is assumed to be the first run and the source dataframe is returned
        if drop_index:
            full_df.drop(columns=merge_on_fields, inplace=True)
        return full_df


More questions

More questions with similar tag