score:1

You seem to have the basic idea right. Hitting the database once for every line in the CSV is going to be way too slow. You can create a "where in" statement via LINQ like so:

var addresses = GetEmailAddresses();
var entries = ctx.Entries.Where(e => addresses.Contains(e.EmailAddress));

However, if you have too many addresses in your list, it'll take a long, long time to generate and evaluate your query. I'd recommend splitting your input list up into batches of a reasonable size (200 entries?), and then using the trick above to handle each batch with a single database check.

Once you've got that working, you can try a few other things to see if they make a measurable difference performance-wise:

  1. Tweak the batch size.
  2. Run the batches independently with varying degrees of parallelism.
  3. Play with indexes on the database tables, particularly on the email address field.
  4. Order the email addresses before breaking them into batches. It's possible that the db queries will take better advantage of hard disk caching strategies.

score:0

You could put the contents of your csv list objects into a table value parameter. Then call a stored procedure, passing in that TVP. The stored procedure could then run a cursor through the 300 databases and joins to your table value parameter (using ad-hoc sql). It will basically be a loop that iterates 300 times which isn't too bad. Something like this:

CREATE PROCEDURE yourNewProcedure
(
    @TableValueParameter dbo.udtTVP READONLY
)
AS

DECLARE @dbName varchar(255)
DECLARE @SQL nvarchar(3000)

DECLARE DB_Cursor CURSOR LOCAL FOR
    SELECT DISTINCT name
    FROM sys.databases
    WHERE Name like '%yourdbs%'
OPEN DB_Cursor
FETCH NEXT FROM DB_Cursor INTO @dbName
WHILE @@FETCH_STATUS  = 0
BEGIN
    SET @SQL = 'UPDATE t
                SET t2.Field = t.Field              
                FROM @TableValueParameter t
                JOIN [' + @dbName + ']..TableYouCareAbout t2 ON t.Field = t2.Field '

    EXEC sp_executesql @SQL, N'@TableValueParameter dbo.udtTVP', @TableValueParamete

    FETCH NEXT FROM DB_Cursor INTO @dbName
END
CLOSE DB_Cursor
DEALLOCATE DB_Cursor

Related Articles