I was wondering if it was possible to add an identity column to a flat file data source as it is being processed in a data flow. I need to know the record number of each row in the file. Can this be done with the derived column task or is it possible to return the value of row count on each row of the data?
Any help on this is greatly recieved.
Cheers,
Grant
Generating Surrogate Keys
(http://www.sqlis.com/default.aspx?37)
-Jamie
|||
You may have more than one option here; I would try with a script task in control flow that uses a variable to generate and adds it as a column to the pipeline.
Create a SSIS variable Int32. eg MyIdentityColSeed that is set to zero.
Then, in your data flow, after the flat file source create a script task with 1 output column (named for example MyIdentityCol) and use and script like this:
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
Row.MyIdentityCol = Variables.MyIdentityColSeed + Counter
Counter += 1
End Sub
Even when the code is very simple; I did not tested it, so take the time to debug it.
|||Rafael, Jamie,Thanks for both your posts on this matter. I'm looking into Rafael's suggestion at present and see that the only way to access variables in the data flow section is within the PostExecute phase. That problem with this is that it doesn't seem to update the variable as i move through the rows in the dataflow. Is this because of how SSIS has been developped. Can you not set a variable anywhere else within the script. I'll look at Jamie's suggestion once this one has been exhausted.
Thanks,
Grant|||
The Pre and Post execute limitation is just that. It also makes more sense as you only need to read and write the variables pre/post, which is faster than accessing them for every row. Use local variables in the actual main execution code.
You may want to take a quick look at the Row Number Transformation, it may do exactly what you want with seed and increment settings already- http://www.sqlis.com/default.aspx?93
|||Got Rafaels suggestion to work. Didn't need to use a variable in the end just a counter that was initialized on the preexecute event which was incremented for each row. Many thanks for everyone's help.Yet more new things learned today :) .
Thank you all.
Grant|||
Grant Swan wrote:
Rafael, Jamie, Thanks for both your posts on this matter. I'm looking into Rafael's suggestion at present and see that the only way to access variables in the data flow section is within the PostExecute phase. That problem with this is that it doesn't seem to update the variable as i move through the rows in the dataflow. Is this because of how SSIS has been developped. Can you not set a variable anywhere else within the script. I'll look at Jamie's suggestion once this one has been exhausted.
Thanks,
Grant
Errr...both Rafael's and my suggestions were exactly the same
Glad you got it working anyhow!
-Jamie
|||
Grant,
I think Jamie is right; the approaches are identical. I do not keep records of web resources I used; so I was unable, like Jamie, to post a link to the original sorce. I guess a owe some credits to author of the script
Rafael Salas
|||Jamie,Sorry, I'd had a previous link open from the same website and was looking at this which had some resemblence to what i wanted to do:
http://www.sqlis.com/default.aspx?93
As apposed to the link i'd clicked from your post. Apologies and i'll mark your post as an answer also.
Its been a bad day and i've got so many SSIS things in my head :)
Cheers,
Grant|||
Grant Swan wrote:
Jamie, Sorry, I'd had a previous link open from the same website and was looking at this which had some resemblence to what i wanted to do:
http://www.sqlis.com/default.aspx?93
As apposed to the link i'd clicked from your post. Apologies and i'll mark your post as an answer also.
Its been a bad day and i've got so many SSIS things in my head :)
Cheers,
Grant
Ha ha. Cool. I'm not bothered about getting my "answer count" up, as long as the thread has at least one reply marked as the answer. thanks anyway tho!
-Jamie
No comments:
Post a Comment