The pandas concat function can be used to accomplish something similar to a SQL MERGE (upsert).
I ran into a similar issue and wrote a function that "merges" two dataframes, assuming they share an index. In my case, I turned a "PrimaryKey" field into the index of each dataframe and then merged the two dataframes on that index: rows of the target whose key also appears in the source are replaced by the source rows, and source rows with new keys are appended.
import pandas as pd

def merge_dataframe_diff(target_df, source_df, merge_on_fields: list = ["PrimaryKey"], drop_index: bool = True):
    """Merges (upserts) two dataframes based on shared key columns

    Assumptions:
    1) The dataframes share all the same columns
    2) There is a "PrimaryKey" column to merge on; this could be modified to include multiple columns
    Args:
        target_df [dataframe]: target dataframe (to merge/upsert into)
        source_df [dataframe]: source dataframe
        merge_on_fields [list]: field(s) to merge the two dataframes on
        drop_index [bool]: whether to drop the merge_on_fields columns from the merged and returned dataframe (the keys remain available as the index)
    Returns:
        full_df [dataframe]: merged dataframe, indexed by merge_on_fields
    """
    # Set the key field(s) as the index to enable merging. Reassigning instead of using
    # inplace=True leaves the caller's dataframes untouched; verify_integrity rejects duplicate keys.
    source_df = source_df.set_index(merge_on_fields, drop=drop_index, append=False, verify_integrity=True)
    target_df = target_df.set_index(merge_on_fields, drop=drop_index, append=False, verify_integrity=True)
    # SQL merge aka upsert: keep the target rows whose key is absent from the source, then append all source rows
    full_df = pd.concat([target_df[~target_df.index.isin(source_df.index)], source_df])
    # No further column drop is needed: set_index(drop=drop_index) has already removed the key columns
    return full_df
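The upsert step itself can be tried on two small frames. The following is a minimal sketch, not part of the original answer: the sample data and the "value" column are made up for illustration, and it performs the same set_index / isin / concat steps inline rather than calling the function.

```python
import pandas as pd

# Hypothetical sample data; "PrimaryKey" and "value" are illustrative column names.
target = pd.DataFrame({"PrimaryKey": [1, 2, 3], "value": ["a", "b", "c"]})
source = pd.DataFrame({"PrimaryKey": [2, 4], "value": ["B", "d"]})

# Index both frames on the key; verify_integrity raises if a key is duplicated.
target_ix = target.set_index("PrimaryKey", verify_integrity=True)
source_ix = source.set_index("PrimaryKey", verify_integrity=True)

# Keep the target rows whose key does not occur in the source ...
unchanged = target_ix[~target_ix.index.isin(source_ix.index)]
# ... then append every source row (updates and new rows alike).
upserted = pd.concat([unchanged, source_ix]).sort_index()

print(upserted)
```

In this sketch the row with key 2 takes its value from the source frame, the row with key 4 is added, and the rows with keys 1 and 3 are carried over from the target unchanged.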
Source:
stackoverflow.com