score:0

I hope this helps. After calling this function, you can use the result to populate each field in another table in your database, or something similar.

Public Function divideThis(CompleteString As String) As String()

    ' Text before " by " (InStr is 1-based, Substring is 0-based)
    Dim n As Integer = InStr(CompleteString, " by ")
    Dim Field1 As String = CompleteString.Substring(0, n - 1)

    ' Text between " by " and " on "
    Dim m As Integer = InStr(CompleteString, " on ")
    Dim Field2 As String = CompleteString.Substring(n + 3, m - n - 4)

    ' Path up to (and including) the last "/"
    Dim o As Integer = InStrRev(CompleteString, "/")
    Dim Field3 As String = CompleteString.Substring(m + 3, o - m - 3)

    ' File name after the last "/"
    Dim Field4 As String = CompleteString.Substring(o, CompleteString.Length - o)

    Dim fragmented As String() = {Field1, Field2, Field3, Field4}

    Return fragmented

End Function
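
For example (the input string here is made up, in the same shape as the audit lines discussed in this question):

```vb.net
Dim parts As String() = divideThis("Successful Write by DOMAIN\USERLOGIN on /Share/Folder/Log.txt")
' parts(0) = "Successful Write"
' parts(1) = "DOMAIN\USERLOGIN"
' parts(2) = "/Share/Folder/"   (note the trailing slash is kept)
' parts(3) = "Log.txt"
```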

score:0

OK, try this... It will import the CSV file, splitting the required field and merging it as it goes.

I have modified the code to include the regex in the LINQ statement to split that field. It will work if there is NO file name, but there HAS to be a forward slash at the end, and this MUST be the ONLY field in your CSV file that contains forward slashes. The reason is that you will lose a field if the forward slash is missing and there is no file name, and you will get too many fields if other fields in your CSV file contain a forward slash.

Imports System.IO
Imports System.Data
Imports System.Data.SqlClient
Imports System.Xml.Linq

Sub Main()
    Dim readText() As String = File.ReadAllLines("<Path to your CSV file>")

    Using connection As SqlConnection = New SqlConnection("<DB connection string>")

        Dim command As SqlCommand = connection.CreateCommand()
        connection.Open()

        command.CommandType = CommandType.StoredProcedure
        command.CommandText = "SP_InsertCSVData"

        ' Build one XML document covering every line in the file.
        ' The regex turns "by", "on" and the LAST forward slash into commas,
        ' so the ordinary comma split then yields eight fields per line.
        Dim CSVData As XElement = New XElement("Root",
                        From str In readText
                        Let fields = Text.RegularExpressions.Regex.Replace(str, "(by|on|\/(?=[^\/]*$))", ",").Split(","c)
                        Select New XElement("CSVDataRecord",
                            New XAttribute("Field_1", fields(0)),
                            New XAttribute("Field_2", fields(1)),
                            New XAttribute("Field_3", fields(2)),
                            New XAttribute("Field_4", fields(3)),
                            New XAttribute("Field_5", fields(4)),
                            New XAttribute("Field_6", fields(5)),
                            New XAttribute("Field_7", fields(6)),
                            New XAttribute("Field_8", fields(7))
                        )
                    )

        command.Parameters.Add(New SqlParameter With
        {
            .ParameterName = "@InputXML",
            .DbType = DbType.Xml,
            .Value = CSVData.CreateReader
        })

        ' One round trip sends the whole file to the stored procedure.
        command.ExecuteNonQuery()

    End Using

End Sub

Here is the new table...

/****** Object:  Table [dbo].[CSV_Import_Table]    Script Date: 06/02/2016 01:55:32 ******/   
SET ANSI_NULLS ON  
GO  

SET QUOTED_IDENTIFIER ON  
GO  

CREATE TABLE [dbo].[CSV_Import_Table](  
[Field_1] [nvarchar](max) NULL,  
[Field_2] [nvarchar](max) NULL,  
[Field_3] [nvarchar](max) NULL,  
[Field_4] [nvarchar](max) NULL,  
[Field_5] [nvarchar](max) NULL,  
[Field_6] [nvarchar](max) NULL,  
[Field_7] [nvarchar](max) NULL,  
[Field_8] [nvarchar](max) NULL  
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]  

GO  

And here is the SP...

/****** Object:  StoredProcedure [dbo].[SP_InsertCSVData]    Script Date: 06/02/2016 01:56:20 ******/  
SET ANSI_NULLS ON  
GO  
SET QUOTED_IDENTIFIER ON  
GO  
CREATE PROCEDURE [dbo].[SP_InsertCSVData] (@InputXML xml)
AS
BEGIN
    MERGE CSV_Import_Table AS main
    USING (SELECT Row.id.value('@Field_1','[nvarchar](MAX)') AS Field_1,
                  Row.id.value('@Field_2','[nvarchar](MAX)') AS Field_2,
                  Row.id.value('@Field_3','[nvarchar](MAX)') AS Field_3,
                  Row.id.value('@Field_4','[nvarchar](MAX)') AS Field_4,
                  Row.id.value('@Field_5','[nvarchar](MAX)') AS Field_5,
                  Row.id.value('@Field_6','[nvarchar](MAX)') AS Field_6,
                  Row.id.value('@Field_7','[nvarchar](MAX)') AS Field_7,
                  Row.id.value('@Field_8','[nvarchar](MAX)') AS Field_8
           FROM @InputXML.nodes('/Root/CSVDataRecord') AS Row(id)) AS stage
    ON main.Field_1 = stage.Field_1
    WHEN MATCHED THEN
        UPDATE SET main.Field_2 = stage.Field_2, main.Field_3 = stage.Field_3,
                   main.Field_4 = stage.Field_4, main.Field_5 = stage.Field_5,
                   main.Field_6 = stage.Field_6, main.Field_7 = stage.Field_7,
                   main.Field_8 = stage.Field_8
    WHEN NOT MATCHED THEN
        INSERT (Field_1, Field_2, Field_3, Field_4, Field_5, Field_6, Field_7, Field_8)
        VALUES (stage.Field_1, stage.Field_2, stage.Field_3, stage.Field_4, stage.Field_5, stage.Field_6, stage.Field_7, stage.Field_8);
END

And I'm guessing that your CSV file looks something like this...

Record 1 Field 1, Successful Write by DOMAIN\USERLOGIN on /SOMENETWORKSHARE/FOLDER/MyFolder/USERLOGIN/DesktopBackUp/Log.txt, Field 6, Field 7, Field 8
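
As a quick sanity check, here is the split from the code above applied to that sample line as a standalone console sketch (the line itself is made up):

```vb.net
Imports System.Text.RegularExpressions

Module SplitDemo
    Sub Main()
        Dim sample As String = "Record 1 Field 1, Successful Write by DOMAIN\USERLOGIN on /SOMENETWORKSHARE/FOLDER/MyFolder/USERLOGIN/DesktopBackUp/Log.txt, Field 6, Field 7, Field 8"
        ' "by", "on" and the LAST forward slash become commas, then the normal CSV split runs.
        Dim fields As String() = Regex.Replace(sample, "(by|on|\/(?=[^\/]*$))", ",").Split(","c)
        ' Yields 8 fields: the path lands in fields(3) and the file name in fields(4).
        For Each f As String In fields
            Console.WriteLine("[" & f.Trim() & "]")
        Next
    End Sub
End Module
```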

score:0

Here's a select that you should be able to run over your table to test.

Replace the FROM ( .... ) as tbl bit with your real table.

You can see that the expressions in the select are pretty ugly and unwieldy, so have a think about whether you are happy maintaining this.

There is a WHERE at the end to stop invalid data getting in. If data with missing delimiters gets in, CHARINDEX returns 0, the computed lengths go negative, and you'll get all kinds of errors.

With these kinds of things, data quality is always an issue, so we don't want to compound that by assuming the domain will have the same name forever, or that you'll only ever import data from one domain.

So you need to run this against your table and see if it works. If so, the next steps are:

  • take the file name out of the path
  • change it into an UPDATE

That should be straightforward.

SELECT 
SRC,
LEFT(SRC,CHARINDEX(' by ',SRC)) AS Operation,
SUBSTRING(
    SRC,
    CHARINDEX(' by ',SRC)+4,
    CHARINDEX(' on ',SRC) - CHARINDEX(' by ',SRC)-4
    ) AS LoginName,

RIGHT(
    SRC,  
    LEN(SRC) - CHARINDEX(' on ',SRC)-4
) AS FullPath,

REVERSE(LEFT(
    REVERSE(SRC),
    CHARINDEX('/',REVERSE(SRC))-1)
    ) AS [FileName]
FROM 
(
SELECT 'Successful Write by DOMAIN\USERLOGIN on /SOMENETWORKSHARE/FOLDER/MyFolder/USERLOGIN/DesktopBackUp/log.txt' AS SRC
UNION ALL
SELECT 'some invalid stuff' AS SRC
) as tbl
WHERE SRC LIKE '% by %\% on %/%'

An update statement that applies this would be:

UPDATE [dbo].[SACL]
SET 
[Event]=
LEFT(Headline,CHARINDEX(' by ',Headline)),
[Username] = 
SUBSTRING(
    Headline,
    CHARINDEX(' by ',Headline)+4,
    CHARINDEX(' on ',Headline) - CHARINDEX(' by ',Headline)-4
    ),
[Path] = 
RIGHT(
    Headline,
    LEN(Headline) - CHARINDEX(' on ',Headline)-4
    ),

[FileName]=
REVERSE(LEFT(
    REVERSE(Headline),
    CHARINDEX('/',REVERSE(Headline))-1)
    )
WHERE 
-- Only update rows that fit the pattern
Headline LIKE '% by %\% on %/%'
-- Only update rows that haven't been processed
AND ([Event] IS NULL  OR [Event]  = '')

Please note:

  • I don't know which field contains the full string; I assumed Headline.
  • I suggest you standardise on the contents of the Event field: NULL = not processed, and blank = processed but the string couldn't be parsed.
  • I also suggest you add ImportedDate and ProcessedDate fields of type datetime. Give ImportedDate a default of GETDATE() so it records when the record was imported; the UPDATE statement should set ProcessedDate to GETDATE(), which tells you when the record was processed.
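
A sketch of that last suggestion, assuming your table is [dbo].[SACL] as in the UPDATE above (the constraint name is made up):

```sql
ALTER TABLE [dbo].[SACL] ADD
    [ImportedDate]  datetime NULL
        CONSTRAINT [DF_SACL_ImportedDate] DEFAULT (GETDATE()),
    [ProcessedDate] datetime NULL;

-- then add this to the SET list of the UPDATE above:
--     [ProcessedDate] = GETDATE()
```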
