
ArtOfTest, Inc. Community Forums

Discuss and ask questions about ArtOfTest's products.

IE DOM structure doesn't match View Source

Last post 03-19-2008 7:53 PM by admin. 5 replies.
  • 02-21-2008 6:42 PM

    IE DOM structure doesn't match View Source

    It appears that the DOM presented by the AOT Browser.DomTree.Root does not match the View->Source output for IE 7.

     

    I've used the Web Developer BHO plugin (supplied by Microsoft) for IE to verify that the internal DOM in IE matches the IE View->Source.

    Here's a snippet of the DOM where I'm seeing the problem. The first part is the View->Source output. The second part is the output of a DOM walker that I wrote that just displays the tag, with its index, the tag attributes and values, and recursively displays the ChildNodes (properly indented). The ":-1:" are just #text# nodes. For some reason they always show up as TagName Index of -1.

    The part that shows the problem is the node '<p:0>' (the ':0' is the TagNameIndex of the node).

    In the IE View->Source, this <P> node is a parent node to a whole set of nested nodes. In AOT, it is a childless node.

    I believe this is a bug.

    ------------------------------- attachment ------------------------------------------

    [[[[[[[[[[[ --- IE VIEW SOURCE Output --- ]]]]]]]]]]]]]]

    <TD class=search_BODY>
        <SPAN class=search_TITLE>
            <!-- Start DISPLAY_ORGANIZATION tag -->
            <SPAN></SPAN>
            <!-- End DISPLAY_ORGANIZATION tag -->
        </SPAN>
        <BR>
        <!-- Start BODY tag -->
        <SPAN>
            Progressive...
            <A href="mailto:npri.employment@gmail.com">
                ...
            </A>
            .
        </SPAN>
        <!-- End BODY tag -->
        <P>
            <STRONG>
                <FONT color=#246485>Contact Information:</FONT>
            </STRONG>
            <BR>
            <!-- Start DISPLAY_CONTACT_INFO tag -->
            <SPAN>
                <P>
                    <A href="mailto:...">...</A>
                    &nbsp;
                </P>
            </SPAN>
            <!-- End DISPLAY_CONTACT_INFO tag -->
        </P>
        <P>
            <STRONG>
                <FONT color=#246485>Closing Date:</FONT>
            </STRONG>
            <SPAN>
                Open until filled
            </SPAN>
            <!-- Start EXPIRE_DATE tag -->
            <!-- End EXPIRE_DATE tag -->
            <BR>
            <STRONG>
                <FONT color=#246485>Posted:</FONT>
            </STRONG>
            <SPAN>
                February&nbsp;21,&nbsp;2008
            </SPAN>
            <!-- Start DATEPOSTED tag -->
            <!-- End DATEPOSTED tag -->
        </P>
    </TD>

    [[[[[[[[[[ -- displaying the DOM structure using AoT -- ]]]]]]]]]]]]]]
    2008-02-21:20.09.24.415>
    <td:37 class=search_BODY>
    |   <span:6 class=search_TITLE>
    |   |   <!--:-1: <!-- Start DISPLAY_ORGANIZATION tag -->-->
    |   |   <span:7 />
    |   |   <!--:-1: <!-- End DISPLAY_ORGANIZATION tag -->-->
    |   </span>
    |   <br:9 />
    |   <!--:-1: <!-- Start BODY tag -->-->
    |   <span:8>
    |   |   :-1: Progressive...
    |   |   <a:19 href="...">
    |   |   |   :-1: ...
    |   |   </a>
    |   |   :-1: .
    |   </span>
    |   <!--:-1: <!-- End BODY tag -->-->
    |   <p:0 />
    |   <strong:0>
    |   |   <font:0 color=#246485>
    |   |   |   :-1: Contact Information:
    |   |   </font>
    |   </strong>
    |   <br:10 />
    |   <!--:-1: <!-- Start DISPLAY_CONTACT_INFO tag -->-->
    |   <span:9>
    |   |   <p:1>
    |   |   |   <a:20 href="...">
    |   |   |   |   :-1: ...
    |   |   |   </a>
    |   |   |   :-1: &nbsp;
    |   |   </p>
    |   </span>
    |   <!--:-1: <!-- End DISPLAY_CONTACT_INFO tag -->-->
    |   <p:2 />
    |   <p:3>
    |   |   <strong:1>
    |   |   |   <font:1 color=#246485>
    |   |   |   |   :-1: Closing Date:
    |   |   |   </font>
    |   |   </strong>
    |   |   <span:10>
    |   |   |   :-1: Open until filled
    |   |   </span>
    |   |   <!--:-1: <!-- Start EXPIRE_DATE tag -->-->
    |   |   <!--:-1: <!-- End EXPIRE_DATE tag -->-->
    |   |   <br:11 />
    |   |   <strong:2>
    |   |   |   <font:2 color=#246485>
    |   |   |   |   :-1: Posted:
    |   |   |   </font>
    |   |   </strong>
    |   |   <span:11>
    |   |   |   :-1: February&nbsp;21,&nbsp;2008
    |   |   </span>
    |   |   <!--:-1: <!-- Start DATEPOSTED tag -->-->
    |   |   <!--:-1: <!-- End DATEPOSTED tag -->-->
    |   </p>
    </td>
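
For anyone who wants to produce a dump in the same format, here is a minimal sketch of such a walker in Python. The `Node` class is a hypothetical stand-in for the real element type (the original walker ran against Browser.DomTree.Root and is not shown in this thread), and numbering each element per tag name in document order is an assumption about how the TagNameIndex is assigned.

```python
from collections import defaultdict


class Node:
    """Hypothetical stand-in for a DOM node; tag is None for a text node."""

    def __init__(self, tag, attrs=None, children=None, text=None):
        self.tag = tag
        self.attrs = attrs or {}
        self.children = children or []
        self.text = text


def dump(node, counters=None, depth=0, out=None):
    """Return the tree as a list of lines in '<tag:index attr=value>' form."""
    counters = defaultdict(int) if counters is None else counters
    out = [] if out is None else out
    pad = "|   " * depth
    if node.tag is None:
        # Text nodes carry no tag-name index, so they are printed with -1.
        out.append(f"{pad}:-1: {node.text}")
        return out
    index = counters[node.tag]  # assumed: per-tag counter in document order
    counters[node.tag] += 1
    attrs = "".join(f" {name}={value}" for name, value in node.attrs.items())
    if not node.children:
        out.append(f"{pad}<{node.tag}:{index}{attrs} />")
        return out
    out.append(f"{pad}<{node.tag}:{index}{attrs}>")
    for child in node.children:
        dump(child, counters, depth + 1, out)
    out.append(f"{pad}</{node.tag}>")
    return out


tree = Node("td", {"class": "search_BODY"}, [
    Node("span", children=[Node(None, text="Progressive...")]),
    Node("br"),
    Node("p"),
])
print("\n".join(dump(tree)))
```

A childless element is printed in the self-closing form, which is how the '<p:0 />' case stands out in the dump above.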


     

  • 02-21-2008 7:39 PM In reply to

    Re: IE DOM structure doesn't match View Source

     Here's a smaller case with a similar problem. AOT DOM shows a <TD> node to be a direct child of a <TBODY> node when in fact it should have been a child node of a <TR> node as in the View->Source for IE7. Note that <TD:2> is a sibling of <TR:1> when it is really a child of that node.

    Here are the snippets.

     [[[[[[[[[[[[[[ -- IE VIEW->SOURCE snippet -- ]]]]]]]]]]]]]]]]

    <table width=600>
        <tr>
            <td align="right" bgcolor="#C0C0C0">
                <b>
                    <i>Position Title:
                </b>
            </td>
            <td width=450>
                ...
            </td>
        </tr>
        <tr>
            <td align="right" bgcolor="#C0C0C0">
                <b>
                    <i>Salary:
                </b>
            </td>
            <td>
                ...
            </td>
        <tr>


    [[[[[[[[[[[[[[[[ --- AOT DOM tree snippet -- ]]]]]]]]]]]]]]]]]]]

    <table:0 width=600>
    |   <tbody:0>
    |   |   <tr:0>
    |   |   |   <td:0 align=right bgColor=#c0c0c0>
    |   |   |   |   <b:0>
    |   |   |   |   |   <i:0>
    |   |   |   |   |   |   :-1: Position Title:
    |   |   |   |   |   </i>
    |   |   |   |   </b>
    |   |   |   </td>
    |   |   |   <td:1 width=450>
    |   |   |   |   :-1: ...
    |   |   |   </td>
    |   |   </tr>
    |   |   <tr:1 />
    |   |   <td:2 align=right bgColor=#c0c0c0>
    |   |   |   <b:1>
    |   |   |   |   <i:1>
    |   |   |   |   |   :-1: Salary:
    |   |   |   |   </i>
    |   |   |   </b>
    |   |   </td>
    |   |   <td:3>
    |   |   |   :-1: ...
    |   |   </td>
    |   |   <tr:2 />

    I have a screen capture from Visual Studio showing the absurd children problem but I don't know how to attach it (Chris, I'll send it via a separate email).

    -David 

     

  • 02-21-2008 9:39 PM In reply to

    Re: IE DOM structure doesn't match View Source

    Actually, looking at the View Source, it looks like the last <TR> tag is malformed:

    <table width=600>
        <tr>
            <td align="right" bgcolor="#C0C0C0">
                <b>
                    <i>Position Title:
                </b>
            </td>
            <td width=450>
                ...
            </td>
        </tr>
        <tr> 
            <td align="right" bgcolor="#C0C0C0">
                <b>
                    <i>Salary:
                </b>
            </td>
            <td>
                ...
            </td>
        <tr>   <--- This should be a </TR>

    WebAii natively parses the markup from IE's DOM and does not rely on IE's parser. We are a bit stricter about malformed markup, since we think part of our job is helping customers identify where the markup is malformed instead of masking these errors.

     So WebAii, in your case above, does not find a closing tag for the <TR>, so it simply assumes the inner contents are siblings of it. I haven't looked at your earlier sample, but I'm guessing the <BR> tag is causing the problem; it should be a <BR />.
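
The rule described above (no closing tag found, so the element stays childless and its contents become siblings) can be sketched as follows. This is a hypothetical illustration in Python, not WebAii's actual parser: `Node`, `find_close`, `build`, `parse_strict` and `outline` are names made up for the sketch, text nodes are ignored, and no <tbody> is inserted.

```python
from html.parser import HTMLParser


class Node:
    def __init__(self, tag):
        self.tag = tag
        self.children = []


class Tokens(HTMLParser):
    """Collect start and end tags in source order without repairing anything."""

    def __init__(self):
        super().__init__()
        self.items = []

    def handle_starttag(self, tag, attrs):
        self.items.append(("start", tag))

    def handle_endtag(self, tag):
        self.items.append(("end", tag))


def find_close(tokens, i, hi):
    """Index of the end tag matching the start tag at i, or None if absent."""
    tag = tokens[i][1]
    depth = 0
    for j in range(i, hi):
        kind, name = tokens[j]
        if name != tag:
            continue
        depth += 1 if kind == "start" else -1
        if depth == 0:
            return j
    return None


def build(tokens, lo, hi):
    nodes = []
    i = lo
    while i < hi:
        kind, tag = tokens[i]
        if kind != "start":
            i += 1  # stray end tag: skip it
            continue
        node = Node(tag)
        close = find_close(tokens, i, hi)
        if close is None:
            # Strict rule: no end tag, so the element stays childless and
            # whatever follows it is parsed as its siblings.
            i += 1
        else:
            node.children = build(tokens, i + 1, close)
            i = close + 1
        nodes.append(node)
    return nodes


def parse_strict(html):
    tokens = Tokens()
    tokens.feed(html)
    tokens.close()
    return build(tokens.items, 0, len(tokens.items))


def outline(node):
    if not node.children:
        return node.tag
    return f"{node.tag}({','.join(outline(c) for c in node.children)})"


html = (
    "<table width=600>"
    "<tr><td><b><i>Position Title:</b></td><td>...</td></tr>"
    "<tr><td><b><i>Salary:</b></td><td>...</td><tr>"
    "</table>"
)
print(outline(parse_strict(html)[0]))
```

For the table snippet this prints table(tr(td(b(i)),td),tr,td(b(i)),td,tr): the second <tr> has no </tr> anywhere after it, so it comes out empty and its two <td> cells end up next to it, which has the same shape as the AOT dump in the earlier post.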

     

    Hope that helps.

    ArtOfTest, Inc.
  • 02-22-2008 7:09 AM In reply to

    Re: IE DOM structure doesn't match View Source

    Unfortunately, the HTML is not mine and I do not control it.

    I took a closer look at the View->Source and indeed it is malformed with respect to several of the tags. The <i> is not matched with a corresponding </i> and some of the <tr> nodes aren't properly closed by a corresponding </tr>.

    Nevertheless, I have several observations:

    1) The AOT DOM is correct when the browser is Firefox - the malformed <tr> is correctly set up as the child of the <tbody> even though the </tr> is physically missing from the original HTML. In other words, Firefox does the parsing properly even though the original HTML is malformed.

    2) The IE DOM, as displayed by the Web Developer BHO plugin, is also correct - the <tr> was elevated to be a child of the <tbody>. In other words, IE also gets the DOM structure right even though the original HTML is malformed.

    3) If you look carefully at where the first malformation of the <tr> occurs, you see that it happens after the AOT DOM error of the previous <td:2>. I would expect the malformation of the <tr> to affect subsequent nodes, not prior nodes. It does not make sense that a later error in the <tr:2> should have caused a <td:2> which was properly nested (in the HTML) to a <tr:1> to become improperly nested and become a child of the <tbody:0>.

    4) The AOT DOM parser handled the missing </i> nodes properly. It recognized that the closing </b> node must be implicitly closing the <i> node. I do see that this is a different type of implicit closing of a tag. In the <i> case the presence of a '</' on the </b:0> is clear indication of implicit closing, but in the <tr> case, there is no closing </tr> to trigger a simple implicit closing. In this case it is triggered by 'global' knowledge that <tr> nodes need to be children of <tbody> nodes.

    Suggestion: Improve the AOT IE DOM parser so that it better recognizes implicit tag closing. The parser should recognize that <tr> nodes are only valid as children of the <tbody>; hence, when a <tr> shows up at the wrong nesting level, it should promote it to be a direct child of the <tbody> and thus implicitly close all the open tags.

    Also, in IE's case, perhaps your native DOM parser can 'consult' the real IE DOM to discover when it parses differently from IE's parser.
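
To make the suggestion concrete, here is a minimal sketch in Python of a tree builder that closes table rows and cells implicitly. Everything in it is an assumption for illustration: the `IMPLICIT_CLOSE` rule table is deliberately tiny, the class and function names are made up, text nodes are dropped, and no <tbody> is inserted, so it is far from what IE or Firefox actually do.

```python
from html.parser import HTMLParser

# For each start tag: (open tags it implicitly closes, tags that stop the search).
# Hypothetical, minimal rule table covering only rows and cells.
IMPLICIT_CLOSE = {
    "tr": ({"tr", "td", "th"}, {"table"}),
    "td": ({"td", "th"}, {"tr", "table"}),
    "th": ({"td", "th"}, {"tr", "table"}),
}
VOID = {"br", "hr", "img", "input", "meta", "link"}  # never get children


class Node:
    def __init__(self, tag):
        self.tag = tag
        self.children = []


class LenientBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.stack = [self.root]  # currently open elements

    def handle_starttag(self, tag, attrs):
        closes, stops = IMPLICIT_CLOSE.get(tag, (set(), set()))
        cut = None
        # Walk up the open elements; remember the outermost one this tag closes.
        for i in range(len(self.stack) - 1, 0, -1):
            open_tag = self.stack[i].tag
            if open_tag in stops:
                break
            if open_tag in closes:
                cut = i
        if cut is not None:
            del self.stack[cut:]  # implicit close, children are kept
        node = Node(tag)
        self.stack[-1].children.append(node)
        if tag not in VOID:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest matching open element and everything inside it,
        # e.g. </b> also closes a dangling <i>. Stray end tags are ignored.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                return


def outline(node):
    if not node.children:
        return node.tag
    return f"{node.tag}({','.join(outline(c) for c in node.children)})"


builder = LenientBuilder()
builder.feed(
    "<table width=600>"
    "<tr><td><b><i>Position Title:</b></td><td>...</td></tr>"
    "<tr><td><b><i>Salary:</b></td><td>...</td><tr>"
    "</table>"
)
builder.close()
print(outline(builder.root.children[0]))
```

With the malformed table from this thread it prints table(tr(td(b(i)),td),tr(td(b(i)),td),tr): the second row keeps its cells, and the stray trailing <tr> becomes an empty row of its own.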

    -David 

  • 03-17-2008 8:36 AM In reply to

    • Jony.cs
    • Top 25 Contributor
    • Joined on 12-03-2007
    • Simferopol, Ukraine
    • Posts 18

    Re: IE DOM structure doesn't match View Source

    Actually, I have the same issue. Let me try to explain what I ran into. I have a div element, <div>...</div>. I append a table to this div as a child, and I append a tbody with some structure to the table. The final HTML looks like this:
    <div>
      <table>
        <tbody>
          <tr>
            ...
          </tr>
        </tbody>
      </table>
    </div>
    

    The IE and Mozilla parsers represent the HTML correctly.
    But for some reason the WebAii parser thinks the structure of the HTML looks like this:
    <div>
      <table>
        <tbody>
        <tr>
            ...
        </tr>
      </table>
    </div>
    
    What am I doing wrong?
  • 03-19-2008 7:53 PM In reply to

    Re: IE DOM structure doesn't match View Source

    We are going to apply a few fixes to the parser to help address the issues reported in this thread. Expect these fixes to be in the final WebAii 1.1 release.

     

    ArtOfTest, Inc.