Wednesday, 15 August 2012

Automate Import of a 10 GB XML file to SQL Server




I get a 10 GB XML file from a vendor every day, and I want to import the file into SQL Server tables. What is the right approach?

That's a big old XML file. Here's how I'd do it, based on a recent project where we were receiving big files to import.

Firstly, I'd make sure you receive the file as a zip or gzip file. I'd do this in Java, but it could be done in Python or C#. I'd uncompress it as a stream (not the whole file at once, but reading from the compressed stream).
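A minimal sketch of the streaming decompression step, assuming the vendor drops a gzip file (the file name is hypothetical):

```java
import java.io.*;
import java.util.zip.*;

// Read a gzipped file as a stream, without ever holding the whole
// uncompressed 10 GB in memory.
public class GzipStreamReader {

    // Wraps any InputStream (e.g. a FileInputStream over the vendor's
    // .gz drop) in a GZIPInputStream and returns a line-oriented reader.
    public static BufferedReader open(InputStream compressed) throws IOException {
        return new BufferedReader(
                new InputStreamReader(new GZIPInputStream(compressed), "UTF-8"));
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) return;           // pass the .gz path on the command line
        try (BufferedReader r = open(new FileInputStream(args[0]))) {
            String line;
            while ((line = r.readLine()) != null) {
                // hand the data to the XML parser here; in practice you'd
                // pass the Reader itself to StAX rather than go line by line
            }
        }
    }
}
```

In practice you'd hand the `Reader` straight to the XML parser rather than reading lines, but the point is the same: decompression happens incrementally as the parser pulls bytes.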

Then I'd parse the file with a streaming parser. In Java I'd use StAX; in other languages there are other choices available. As I read the XML, I'd gather the data and write it out to CSV (tab-separated) files that can be passed to bcp.exe.
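A sketch of the StAX pass: stream through the XML, collect the fields of each record, and emit one tab-separated line per record. The element names (`order`, `id`, `amount`) are made up for illustration; substitute your vendor's schema.

```java
import java.io.*;
import javax.xml.stream.*;

// Streaming XML-to-TSV conversion with StAX: constant memory use
// regardless of input size.
public class XmlToTsv {

    public static void convert(Reader xml, Writer tsv)
            throws XMLStreamException, IOException {
        XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(xml);
        String id = null, amount = null;        // fields of the current record
        StringBuilder text = new StringBuilder();
        while (r.hasNext()) {
            switch (r.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    text.setLength(0);          // start collecting element text
                    break;
                case XMLStreamConstants.CHARACTERS:
                    text.append(r.getText());
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    if ("id".equals(r.getLocalName())) id = text.toString().trim();
                    else if ("amount".equals(r.getLocalName())) amount = text.toString().trim();
                    else if ("order".equals(r.getLocalName()))
                        tsv.write(id + "\t" + amount + "\n");  // one bcp row per record
                    break;
            }
        }
        tsv.flush();
    }
}
```

Real feeds will need escaping of embedded tabs/newlines in the data, but the shape of the loop is the same.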

I'm not sure of the structure of your data; maybe it can all go in one CSV file, or maybe you'll need multiple types of CSV file. Either way, I'd try not to make the CSVs bigger than 50 MB. Once a CSV file has gone past that size threshold, I'd close it, pass it to a second thread, and continue with the XML parsing.
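One way to sketch the rotation: write TSV rows, and when the current file crosses the threshold, close it and put it on a queue for the loader thread, then carry on into a fresh file. The class and file names are hypothetical.

```java
import java.io.*;
import java.util.concurrent.*;

// Writes TSV rows into chunk files, rotating at a byte threshold and
// handing each finished chunk to a BlockingQueue for the bcp thread.
public class ChunkedCsvWriter implements Closeable {
    private final File dir;
    private final long maxBytes;               // e.g. 50 * 1024 * 1024
    private final BlockingQueue<File> ready;   // consumed by the loader thread
    private Writer out;
    private File current;
    private long written;
    private int seq;

    public ChunkedCsvWriter(File dir, long maxBytes, BlockingQueue<File> ready) {
        this.dir = dir; this.maxBytes = maxBytes; this.ready = ready;
    }

    public void writeRow(String tsvLine) throws IOException, InterruptedException {
        if (out == null) openNext();
        out.write(tsvLine);
        out.write('\n');
        written += tsvLine.length() + 1;       // rough byte count is fine here
        if (written >= maxBytes) rotate();
    }

    private void openNext() throws IOException {
        current = new File(dir, "chunk-" + (seq++) + ".tsv");
        out = new BufferedWriter(new FileWriter(current));
        written = 0;
    }

    private void rotate() throws IOException, InterruptedException {
        out.close();
        ready.put(current);                    // wake the loader thread
        out = null;
    }

    @Override public void close() throws IOException {
        if (out != null) {
            out.close();
            try { ready.put(current); }        // flush the final partial chunk
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            out = null;
        }
    }
}
```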

In the second thread I'd then fork out bcp.exe to load the data.
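The loader thread can be sketched like this: take finished chunk files off the queue and fork bcp.exe for each. The table, server name, and exact flags are illustrative; adjust them for your environment (for example `-U`/`-P` instead of `-T` if you're not using a trusted connection).

```java
import java.io.File;
import java.util.concurrent.BlockingQueue;

// Second thread: drains the queue of finished chunk files and forks
// bcp.exe to bulk-load each one.
public class BcpLoader implements Runnable {
    private final BlockingQueue<File> ready;

    public BcpLoader(BlockingQueue<File> ready) { this.ready = ready; }

    @Override public void run() {
        try {
            while (true) {
                File chunk = ready.take();      // blocks until the parser hands one over
                Process p = new ProcessBuilder(
                        "bcp.exe", "dbo.ImportView", "in", chunk.getAbsolutePath(),
                        "-S", "myserver", "-T",         // hypothetical server; trusted connection
                        "-c", "-t", "\t",               // character mode, tab field terminator
                        "-h", "FIRE_TRIGGERS")          // so instead-of triggers fire on bulk load
                        .inheritIO()
                        .start();
                if (p.waitFor() != 0)
                    throw new RuntimeException("bcp failed for " + chunk);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // clean shutdown when parsing is done
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Running the loads on their own thread means bcp can be pushing one chunk into SQL Server while the parser is still producing the next one.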

If you need to load multiple tables, you can still do it via one CSV file: bcp into a view, and have an 'instead of insert' trigger on the view. The trigger can normalise the data, look up primary keys, insert into the child tables, and so on.
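A minimal T-SQL sketch of the view-plus-trigger idea, with a hypothetical two-table schema (a Customer parent and an Order child). Note that bcp skips triggers by default, so you must pass `-h "FIRE_TRIGGERS"` for this to work:

```sql
-- View exposing the union of columns that the single CSV file carries.
CREATE VIEW dbo.ImportView AS
SELECT c.Name AS CustomerName, o.Amount AS OrderAmount
FROM dbo.Customer c
JOIN dbo.[Order] o ON o.CustomerId = c.CustomerId;
GO

-- bcp loads into the view; this trigger runs instead of the insert.
CREATE TRIGGER dbo.ImportView_Insert ON dbo.ImportView
INSTEAD OF INSERT AS
BEGIN
    -- Normalise: create any customers we have not seen before.
    INSERT INTO dbo.Customer (Name)
    SELECT DISTINCT i.CustomerName
    FROM inserted i
    WHERE NOT EXISTS (SELECT 1 FROM dbo.Customer c
                      WHERE c.Name = i.CustomerName);

    -- Lookup: resolve the primary key, then insert the child rows.
    INSERT INTO dbo.[Order] (CustomerId, Amount)
    SELECT c.CustomerId, i.OrderAmount
    FROM inserted i
    JOIN dbo.Customer c ON c.Name = i.CustomerName;
END;
```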

If you're doing this in C#, maybe you don't need to use bcp.exe at all, since the native bulk-loading support there is better than what the Java API offers.

This overall approach of converting to chunked CSVs, uploading in parallel, and using trigger lookups has worked for us.

I had a version that could take a folder with 6 GB of XML spread over hundreds of files and load it into the DB in a few minutes. And that was across 4 tables, using one CSV file with a union of the columns.

sql-server
