sayanosauras's Avatar
sayanosauras 3
7 Asked
1 Answered
0 Best
0
No one has voted on this question yet :(
3 years, 3 months ago

I want to use SAX parser in SQL Server to load a table from a complex xml file. Inputs anyone?

Normally DOM is used to parse XML trees in SQL Server, but since SAX is faster so would like to try that out.
I want to get some references about how to accomplish:

To load a SQL Server table by parsing a complex XML file(the source) in SSIS.

Thanks,
Sayan
Tip for best answer: M$0.00
Separate topics with commas, or by pressing return. Use the delete or backspace key to edit or remove existing topics.

You can leave an optional "tip" with Mahalo's virtual currency, Mahalo Dollars. If you are asking a difficult question that might require some research, or if you'd like a wide variety of feedback, a higher tip often leads to more answers to your question.

M$

What is Your Answer?

0
0
0

2 Answers

0
shakespearegeek's Avatar
shakespearegeek | 3 years, 3 months ago
4
I have to admit that I'm more familiar with XML than SQL Server, but your question left me wondering if you're familiar with the differences between SAX and DOM? It's more than just one being faster. Specifically, DOM is used to load the entire tree into memory (or at least, accessible space) at once, so that you can jump around as needed. SAX is more of a flat, event-driven parse that goes through the source just once. So there's a possibility that what you want to do (load a table by parsing an XML file) cannot actually be done in SAX. For instance, if the parse needs to know all the potential Table columns up front, it couldn't do it with SAX, but could with DOM.

Sorry if that's not more helpful, but seemed like it had to be said. Good luck!
source(s):
Tried researching "sax parsers in sql server" but while there appear to be many articles, they always seem to go to premium magazine subscription sources and thus don't make good links.

You can leave an optional "tip" with Mahalo's virtual currency, Mahalo Dollars. If you are asking a difficult question that might require some research, or if you'd like a wide variety of feedback, a higher tip often leads to more answers to your question.

M$
sayanosauras's Avatar
sayanosauras | 3 years, 2 months ago Report

I am a newbie in XML, so know almost nothing, but hey, I get your point....yes now I reckon the question was shaky right from the start. Thanks for showing the way!

Report Abuse

Post Reply Cancel
0
bucabay's Avatar
bucabay | 3 years, 3 months ago
4
" For instance, if the parse needs to know all the potential Table columns up front, it couldn't do it with SAX, but could with DOM. "

It would really depend on the structure of the XML. If the columns were added to a parent node, which is common, parsing code would not need to cache much information. But, since you treat the XML as a stream when using a SAX parser, you would need to cache each stream until you have enough data for an insert. This would mean multiple small inserts into SQL server which would not be efficient as a bulk insert especially if you have a seperate machine or cluster for the DB and if the inserts are transaction based.

If you have large files, there would be a point where using SAX would outweigh the overhead of small SQL inserts, compared to the memory usage of DOM.

Here is some information on both DOM and SAX in SQL server. 
http://www.informit.com/articles/article.aspx?p=102307&seqNum=1

Apparently there isn't that much information to find, and since I am not familiar with SQL server, take my advice as generic based on Database and XML experience.

You can leave an optional "tip" with Mahalo's virtual currency, Mahalo Dollars. If you are asking a difficult question that might require some research, or if you'd like a wide variety of feedback, a higher tip often leads to more answers to your question.

M$
sayanosauras's Avatar
sayanosauras | 3 years, 2 months ago Report

Thanks, yes I guess in my case DOM would be a better option because the XML is really complex.

shakespearegeek's Avatar
shakespearegeek | 3 years, 3 months ago Report

I don't know, B -- in order to determine the number of columns, you pretty much have to stream through the entire file to find the one at the end (by definition). And if your table creation logic is such that you can't really do anything until you have that number (which is where my lack of SQL Server knowledge shows), you would essentially have to cache everything. And if you do that, you're basically doing the job the DOM would have done in the first place.

I look at SAX for cases where I can reasonably say, "I will know what to do with this information when I get it." For each time I have to say, "I'm not sure what to do with it, I'll have to save it over here until I get more information..." I lean more towards DOM.

Report Abuse

Post Reply Cancel

Learn something new with our FREE educational apps!

Private lessons in the comfort of your own home. Get back in shape or finally pick up a guitar with our great experts guiding you the whole way!
Learn Guitar
Learn Hip Hop
Learn Pilates