The last few days I’ve been working on some crazy SQL. The project will be storing an xml representation of data (let’s just say it’s a resume), in one of it’s columns. The data spans a bunch of tables with lookups and such. Version 1 of the script ended up being 2251 lines long and just a wee bit of a pain in the arse to manage.
We paired it down into smaller stored proc chunks, which not only made managing the beast easier but made it way more flexible.
We decided to put the heavy lifting on SQL server 2k5. THe data is already there so why bring it down to CF just to put it back up into the DB?
Using SQL servers XML functions is somewhere between easy and hard. I suppose we started with the hardest approach, "For Explicit".
The jist is this. For Explicit puts all the control on the developer. Auto and Raw will generate XML based on how the data comes out of the query. For Explicit on the other hand, puts it all squarely in the hands of the developer. Writing an XML file is easy. Conceptualizing how it should look in SQL script in the form of a query, not so much.
Basically you use your first query to outline the layout of the xml file. all subsequent queries populate different nodes. Each query is unioned to the first. This is why it gets hairy. YOu have a first query with 30 or more fields, each following query has to have the same number of columns, some NULL, some not depending on where in the XML you are. In my example that meant the same basic query over 20 times, with about 25 or so columns. Here’s the first query.
set @xmldata = (
select *
from (
SELECT
1 as Tag,
NULL as Parent,
cr.ID as [ResumeCore!1!ID],
cr.GUID as [ResumeCore!1!GUID],
cr.CandidateResumeTypeID as [ResumeCore!1!typeID],
cr.ResumeName as [ResumeCore!1!Name],
NULL as [resume_content!2!!ELEMENT],
NULL as [resume_contact_info!3!ID],
NULL as [resume_contact_info!3!GUID],
NULL as [resume_contact_info!3!resFirstName!cdata],
NULL as [resume_contact_info!3!resMiddleName!cdata],
NULL as [resume_contact_info!3!resLastName!cdata],
NULL as [resume_contact_info!3!resProfessionalTitle!cdata],
NULL as [resAddress!4!ID],
NULL as [resAddress!4!GUID],
NULL as [resAddress!4!res_address_addressLine1!cdata],
NULL as [resAddress!4!res_address_addressLine2!cdata],
NULL as [resAddress!4!res_address_addressCity!cdata],
NULL as [resAddress!4!res_address_addressState!cdata],
NULL as [resAddress!4!res_address_addressCountry!cdata],
NULL as [resAddress!4!res_address_addressPostalCode!cdata],
NULL as [resContacts!5!!ELEMENT],
NULL as [resContact!6!ID],
NULL as [resContact!6!GUID],
NULL as [resContact!6!TypeID],
NULL as [resContact!6!!cdata]
from candidateResume cr
inner join candidateResumeType rt on rt.id = cr.candidateresumeTypeID
where cr.id = @resumeID AND cr.active = @active and cr.deleted = 0
union all
As you can see the column name is weird. THat’s what controls the output. Imagine this over and over. Each column having to line up.
One thing I learned later. SQL ignores column names on all succeeding queries. Only the first one is used for laying out the XML output.
After about three straight days of nothing but SQL I consider myself a little bit more "senior" in my SQL skillset.