Automate Import of a 10 GB XML file to SQL Server
I get a 10 GB XML file from a vendor every day, and I want to import the file into SQL Server tables. What's the right approach?
That's a big old XML file. Here's how I'd tackle it; on a recent project we were receiving big files to import.
Firstly, I'd make sure you're receiving the file as a zip or gzip file. I'd do this in Java, but it could be done in Python or C#. I'd uncompress it as a stream (not the whole file at once, but reading from the compressed stream).
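A minimal sketch of the streaming decompression in Java, assuming the vendor sends a gzipped file (the class and method names here are mine, purely for illustration):

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

public class StreamingDecompress {
    // Wraps the file in a GZIPInputStream so downstream code pulls
    // decompressed bytes on demand; the 10 GB of XML never sits in memory.
    static InputStream openGzip(String path) throws Exception {
        return new GZIPInputStream(new BufferedInputStream(new FileInputStream(path)));
    }
}
```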
Then I'd parse the file with a streaming parser. In Java I'd use StAX; in other languages there are other choices available. As I read the XML I'd gather the data and write it out to CSV (tab-separated) files that can be passed to bcp.exe.
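Here's roughly what that StAX loop could look like. It's a sketch that assumes the XML is a flat sequence of <record> elements, each holding simple field elements; real element names and any nesting would change the logic:

```java
import java.io.InputStream;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class XmlToTsv {
    // Walks the XML one event at a time, so memory use stays flat no
    // matter how big the file is. Emits one tab-separated row per <record>.
    static void convert(InputStream xml, Writer tsv) throws Exception {
        XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(xml);
        List<String> fields = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        while (r.hasNext()) {
            int event = r.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                text.setLength(0);                    // a new field begins
            } else if (event == XMLStreamConstants.CHARACTERS) {
                text.append(r.getText());
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                if ("record".equals(r.getLocalName())) {
                    tsv.write(String.join("\t", fields));  // one row per record
                    tsv.write("\n");
                    fields.clear();
                } else {
                    fields.add(text.toString().trim());    // a field just closed
                }
            }
        }
    }
}
```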
I'm not sure of the structure of your data; maybe it can all go into one CSV file, or maybe you need multiple types of CSV file. Either way, I'd try not to create CSVs bigger than 50 MB. Once a CSV file has gone past that size threshold, I'd close it, pass it to a second thread, and carry on with the XML parsing.
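The rotation could look something like this sketch (the 50 MB threshold and the file naming are just illustrative):

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.util.concurrent.BlockingQueue;

public class ChunkedCsvWriter {
    // Writes TSV rows and rolls over to a fresh file once ~50 MB has been
    // written, handing the finished chunk to a queue for the loader thread.
    private static final long MAX_BYTES = 50L * 1024 * 1024;
    private final BlockingQueue<File> finishedChunks;
    private BufferedWriter out;
    private File current;
    private long written;
    private int chunkNo;

    ChunkedCsvWriter(BlockingQueue<File> finishedChunks) throws Exception {
        this.finishedChunks = finishedChunks;
        openNext();
    }

    void writeRow(String tsvRow) throws Exception {
        out.write(tsvRow);
        out.write('\n');
        written += tsvRow.length() + 1;
        if (written >= MAX_BYTES) {    // past the threshold: close and rotate
            close();
            openNext();
        }
    }

    void close() throws Exception {
        out.close();
        finishedChunks.put(current);   // hand off to the bcp thread
    }

    private void openNext() throws Exception {
        current = new File("chunk-" + (chunkNo++) + ".tsv");
        out = new BufferedWriter(new FileWriter(current));
        written = 0;
    }
}
```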
In that second thread I'd then fork out to bcp.exe to load the data.
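In Java that second thread could be a simple queue consumer that shells out with ProcessBuilder. The server, database, and view names below are placeholders; -T means trusted connection and -c means character mode, which is tab-delimited by default:

```java
import java.io.File;
import java.util.concurrent.BlockingQueue;

public class BcpLoader implements Runnable {
    // Takes finished TSV chunks off the queue and runs bcp.exe for each one.
    private final BlockingQueue<File> chunks;

    BcpLoader(BlockingQueue<File> chunks) { this.chunks = chunks; }

    @Override
    public void run() {
        try {
            while (true) {
                File chunk = chunks.take();    // blocks until a chunk is ready
                Process p = new ProcessBuilder(
                        "bcp.exe", "MyDb.dbo.StagingView", "in",
                        chunk.getAbsolutePath(), "-S", "myserver", "-T", "-c")
                        .inheritIO()
                        .start();
                if (p.waitFor() != 0) {
                    throw new RuntimeException("bcp failed for " + chunk);
                }
                chunk.delete();                // done with this chunk
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```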
If you need to load multiple tables, you can still do it via one CSV file: bcp into a view, and put an INSTEAD OF INSERT trigger on the view. The trigger can normalise the data, look up primary keys, insert into child tables, etc.
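As an illustration only (the view, tables, and columns are invented), the view-plus-trigger pattern might look like the T-SQL below. One caveat worth checking: bulk loads don't fire triggers by default, so bcp would need the -h "FIRE_TRIGGERS" hint.

```sql
CREATE VIEW dbo.StagingView AS
    SELECT c.CustomerName, o.OrderRef, o.Amount
    FROM dbo.Customers c
    JOIN dbo.Orders o ON o.CustomerId = c.CustomerId;
GO
CREATE TRIGGER dbo.StagingView_Insert ON dbo.StagingView
INSTEAD OF INSERT AS
BEGIN
    SET NOCOUNT ON;
    -- insert any customers we have not seen before
    INSERT INTO dbo.Customers (CustomerName)
    SELECT DISTINCT i.CustomerName
    FROM inserted i
    WHERE NOT EXISTS (SELECT 1 FROM dbo.Customers c
                      WHERE c.CustomerName = i.CustomerName);
    -- then insert the orders, looking up the primary keys just created
    INSERT INTO dbo.Orders (CustomerId, OrderRef, Amount)
    SELECT c.CustomerId, i.OrderRef, i.Amount
    FROM inserted i
    JOIN dbo.Customers c ON c.CustomerName = i.CustomerName;
END;
```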
If you're doing this in C#, you maybe don't need to use bcp.exe at all; the native bulk-loading API (SqlBulkCopy) is better than what's available from Java.
This overall approach of converting to chunked CSVs, uploading in parallel, and using trigger lookups worked well for us. I had a version that would take a folder with 6 GB of XML spread over hundreds of files and load it into the DB in a few minutes, into 4 tables, using one CSV file with a union of all the columns.
sql-server