New project.
Main form size: 840, 440. WindowState: Maximized
Add a TextBox at the top. Size: around 300, 20
Add a Button to the right of the TextBox. Text: Navigate
Add a WebBrowser below the first TextBox. Size: 750, 160. Anchor: Top, Bottom, Left, Right
Add another TextBox below the WebBrowser. Multiline: True (needed for the height and the scroll bar to take effect). Size: around 750, 100. Anchor: Bottom, Left, Right. ScrollBars: Vertical
You can use the Wikipedia homepage as the test website.
Double click the navigate button:
WebBrowser1.Navigate(TextBox1.Text)
TextBox2.Clear()
Double click on the WebBrowser (this creates its DocumentCompleted sub, called the "webbrowser sub" from here on), add this:
For Each i In WebBrowser1.Document.All
If i.getattribute("classname").Length > 0 Then
TextBox2.AppendText(i.getattribute("classname") & vbNewLine)
End If
Next
- This lists the value of the "class" attribute for every element that has one (the WebBrowser control exposes it under the name "classname").
After trying it, comment out this code inside the webbrowser sub with the ' symbol.
Add another button on the right of the "Navigate" button, text: Active Element. Resize the button to fit the name inside. Double click the button, add this:
MsgBox(WebBrowser1.Document.ActiveElement.TagName)
- Start the program, navigate to a site, click on various parts of the page to move the focus around, then press the button. You should get the tag name of the focused HTML element.
Add 2 more buttons to the right of the "Active Element" button, text of the first button: "OuterHtml", text of the second button: "InnerHtml"
Double click on button3, named "OuterHtml", insert into the sub:
MsgBox(WebBrowser1.Document.ActiveElement.OuterHtml)
Double click on the "InnerHtml" button:
MsgBox(WebBrowser1.Document.ActiveElement.InnerHtml)
- Click on various parts of the navigated website and press those buttons. If the HTML section you have chosen is large, you might have to wait a while for the WebBrowser to pick up the HTML code.
You should notice that the InnerHtml contains the same code as the OuterHtml, minus the opening and closing tags of the focused element itself.
For example:
<li various attributes> <a various attributes> Some text here </a> </li>
- <li> is the "parent" HTML element of <a>, while <a>, and all the other elements inside <a>, are regarded as the "children" of <li>. If you get the inner and outer HTML for the same focused element, the InnerHtml will be missing that element's own outermost tags; that's what the "inner" represents.
Try this: press and hold the left mouse button on a link, drag the cursor off the link while still holding the button, then release it. The link gets the focus without being followed. Now press the "OuterHtml" and "InnerHtml" buttons.
Go to the "Innerhtml" button sub and add another msgbox alert below the first one:
MsgBox(WebBrowser1.Document.ActiveElement.InnerText)
- InnerText doesn't return the HTML code, only the textual content shown on the page. This is very useful if you just want to retrieve the readable parts of a website and store them in a textbox or a file, for example to download the latest headlines from a news site or the latest site updates.
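As a rough sketch of the "store them in a file" idea, the helper below writes the visible text of the whole page to disk. The module name PageTextSaver, the function name SavePageText and the file path parameter are made up for this example; only the WebBrowser, HtmlDocument.Body, InnerText and File.WriteAllText members are real framework calls.

```vbnet
Imports System.IO
Imports System.Windows.Forms

Public Module PageTextSaver
    ' Hypothetical helper: saves the readable text of the loaded page to a file.
    ' Returns False when no document (or no body) has been loaded yet.
    Public Function SavePageText(ByVal browser As WebBrowser, ByVal filePath As String) As Boolean
        If browser.Document Is Nothing OrElse browser.Document.Body Is Nothing Then
            Return False
        End If

        ' InnerText can be Nothing for an empty body, so fall back to an empty string.
        Dim pageText As String = browser.Document.Body.InnerText
        If pageText Is Nothing Then pageText = String.Empty

        File.WriteAllText(filePath, pageText)
        Return True
    End Function
End Module
```

You could call it from a button sub, for example SavePageText(WebBrowser1, "page.txt"), once the page has finished loading.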
Let's retrieve all the focusable elements from a webpage.
Above the webbrowser sub, in the declaration area, insert this:
Dim g As Integer
Go to the webbrowser sub, below the previous comments add:
For Each i In WebBrowser1.Document.All
i.focus()
g = g + 1
TextBox2.Text = TextBox2.Text & "Name: " & WebBrowser1.Document.ActiveElement.TagName & vbNewLine
Next
WebBrowser1.Document.ActiveElement.RemoveFocus()
MsgBox(g)
- Start the program and navigate to a site. The problem with this "For Each" loop is that it records a name for every element rather than for the focusable elements only. There are many more elements on the page than the ones you can focus on (switch to with the Tab key). So, the loop takes the first HTML element and tries to focus it, then moves on to the next element and tries again, but the focus often stays in the same location as before. This can happen several times in a row until the focus finally moves to a new location.
Inside the webbrowser sub, at the start, declare this:
Dim kk As New Point
- A Point holds two integer values, the "x" and "y" coordinates. Since we haven't put anything in the brackets during declaration, it defaults to (0, 0). So, the "kk" variable carries the (0, 0) coordinates at the moment.
Take all the content of webbrowser sub and put it inside this "if" statement:
If g = 0 Then
End If
- If the navigated website triggers the webbrowser sub more than once (pages with frames raise DocumentCompleted several times), this stops the code from running again. The code runs only once after the program is started; after that, "g" will be higher than 0. To run it again, add "g = 0" to the "Navigate" button sub and press the button again. You should also clear the textbox there.
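A different technique from the "g" flag, shown here only as a sketch: a commonly used check compares the URL reported by the event with the URL of the browser itself, so that completions raised by frames are skipped. The form and field names (GuardedForm, Browser) are invented for this example, and the check is an assumption that may not hold on pages that redirect.

```vbnet
Imports System
Imports System.Windows.Forms

' Hypothetical stand-alone form used only to illustrate the check.
Public Class GuardedForm
    Inherits Form

    Private WithEvents Browser As New WebBrowser()

    Public Sub New()
        Browser.Dock = DockStyle.Fill
        Controls.Add(Browser)
    End Sub

    Private Sub Browser_DocumentCompleted(ByVal sender As Object, ByVal e As WebBrowserDocumentCompletedEventArgs) Handles Browser.DocumentCompleted
        ' Frames report their own URL here; only continue for the main document.
        If e.Url <> Browser.Url Then Return

        ' The element scan from the tutorial would go here.
    End Sub
End Class
```

Unlike the "g" flag, this does not need to be reset in the "Navigate" button, but the flag is simpler to reason about, so both approaches are reasonable.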
Inside the webbrowser sub, select these two lines:
g = g + 1
TextBox2.Text = TextBox2.Text & "Name: " & WebBrowser1.Document.ActiveElement.TagName & vbNewLine
and paste this over them:
If WebBrowser1.Document.ActiveElement.OffsetRectangle.Location <> kk Then
g = g + 1
TextBox2.Text = TextBox2.Text & "Name: " & WebBrowser1.Document.ActiveElement.TagName & vbNewLine
TextBox2.Text = TextBox2.Text & "Location: " & WebBrowser1.Document.ActiveElement.OffsetRectangle.Location.ToString & vbNewLine
TextBox2.Text = TextBox2.Text & "Element Size: " & WebBrowser1.Document.ActiveElement.OffsetRectangle.Size.ToString & vbNewLine
TextBox2.Text = TextBox2.Text & "Focused Element Number: " & g & vbNewLine & vbNewLine
End If
kk = WebBrowser1.Document.ActiveElement.OffsetRectangle.Location
- What the new "If" statement does: it compares the location of the element that is active after the current focus attempt with the location recorded on the previous pass (the first "kk" location is 0, 0). If they are different, the details of the newly focused element are written into the textbox.
Below the "If" statement we update the "kk" variable with the location of the current element.
WebBrowser1.Document.ActiveElement.OffsetRectangle - this property holds the location and the size of the currently active element.
- We retrieve the Location and Size here, but we could have read X, Y, Width or Height individually instead.
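To illustrate reading the individual values, here is a small sketch that formats the four members of a Rectangle (the type returned by OffsetRectangle). The module and function names are hypothetical.

```vbnet
Imports System
Imports System.Drawing

Public Module RectangleInfo
    ' Hypothetical helper: formats the four individual values of a rectangle.
    Public Function DescribeRectangle(ByVal rect As Rectangle) As String
        Return String.Format("X={0}, Y={1}, Width={2}, Height={3}",
                             rect.X, rect.Y, rect.Width, rect.Height)
    End Function
End Module
```

Inside the loop it could be used as TextBox2.AppendText(DescribeRectangle(WebBrowser1.Document.ActiveElement.OffsetRectangle) & vbNewLine).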
Start the program and navigate to a website. The "g" value should now be lower than before.
Technically, the same focusable element might still appear more than once, because that's just how some web pages are structured: if you keep pressing the Tab key on your keyboard you might eventually come back to an already focused location.
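The location comparison can be looked at in isolation. The sketch below applies the same rule as the "kk" check to a plain list of points, so it can be tried without a web page; the module and function names are made up for this example.

```vbnet
Imports System.Collections.Generic
Imports System.Drawing

Public Module FocusCounter
    ' Hypothetical helper: counts how often a location differs from the previous
    ' one, starting from (0, 0) just like the "kk" variable.
    Public Function CountLocationChanges(ByVal locations As IEnumerable(Of Point)) As Integer
        Dim previous As New Point()   ' defaults to (0, 0)
        Dim count As Integer = 0

        For Each current As Point In locations
            If current <> previous Then count += 1
            previous = current
        Next

        Return count
    End Function
End Module
```

Note that only consecutive repeats are filtered out. A location that shows up again later is counted again, and an element sitting exactly at (0, 0) at the start is not counted at all.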
Let's get the attribute of an element we hover over with the mouse pointer.
For the sake of time, go to the webbrowser sub and turn the event it has:
Handles WebBrowser1.DocumentCompleted
into a comment:
'Handles WebBrowser1.DocumentCompleted
- This disables the sub; you should re-enable it at the end of this post.
Below the button4 sub, declare this:
Dim g1 As Integer
Add a button below the "Navigate" button, text: Start Hover. Double click on this button, add this:
If g1 = 1 Then
RemoveHandler WebBrowser1.Document.MouseOver, AddressOf aa1
Button5.Text = "Start Hover"
g1 = 0
Exit Sub
End If
AddHandler WebBrowser1.Document.MouseOver, AddressOf aa1
Button5.Text = "End Hover"
g1 = 1
- We attach a handler to the MouseOver event of our WebBrowser document. The sub that handles the event is "aa1". The first click on the button attaches the handler, the second click removes it.
Create the "aa1" sub below the previous sub:
Private Sub aa1(ByVal sender As Object, ByVal e As System.Windows.Forms.HtmlElementEventArgs)
TextBox2.Clear()
TextBox2.Text = WebBrowser1.Document.GetElementFromPoint(e.ClientMousePosition).TagName & vbNewLine
TextBox2.Text = TextBox2.Text & WebBrowser1.Document.GetElementFromPoint(e.ClientMousePosition).GetAttribute("href")
End Sub
- This sub only does its job once we attach the MouseOver handler by pressing the "Start Hover" button.
We can hover over an HTML element, get its tag name and also get the link it holds (if it has one).
WebBrowser1.Document.GetElementFromPoint(e.ClientMousePosition) - this lets us retrieve an HTML element from the mouse position. If you wanted the tag name of the element at some known coordinates, like 50, 70, you would write:
MsgBox(WebBrowser1.Document.GetElementFromPoint(New Point(50, 70)).TagName)
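A slightly more defensive sketch of the same lookup: it calls GetElementFromPoint once and checks the result before using it, on the assumption that the call may return Nothing when no element is found at the given point. The module and function names are hypothetical.

```vbnet
Imports System
Imports System.Drawing
Imports System.Windows.Forms

Public Module ElementLookup
    ' Hypothetical helper: returns "TAGNAME" plus the href (if any) of the
    ' element at the given client position, or an empty string if none is found.
    Public Function DescribeElementAt(ByVal document As HtmlDocument, ByVal position As Point) As String
        If document Is Nothing Then Return String.Empty

        Dim element As HtmlElement = document.GetElementFromPoint(position)
        If element Is Nothing Then Return String.Empty

        Return element.TagName & Environment.NewLine & element.GetAttribute("href")
    End Function
End Module
```

Inside "aa1" the two textbox lines could then be replaced with TextBox2.Text = DescribeElementAt(WebBrowser1.Document, e.ClientMousePosition).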
Below the aa1 sub declare this:
Dim g2 As Integer
Add another button on the right of the "Start Hover" button, text: Start Click, double click on it:
If g2 = 0 Then
TextBox2.Clear()
AddHandler WebBrowser1.Document.Click, AddressOf aa2
WebBrowser1.AllowNavigation = False
Button6.Text = "End Click"
g2 = 1
Exit Sub
End If
RemoveHandler WebBrowser1.Document.Click, AddressOf aa2
WebBrowser1.AllowNavigation = True
Button6.Text = "Start Click"
g2 = 0
- This button attaches the "aa2" sub to the document's Click event and turns off navigation. This allows us to click on links without moving away from the current page.
Create the "aa2" sub:
Private Sub aa2(ByVal sender As Object, ByVal e As System.Windows.Forms.HtmlElementEventArgs)
If WebBrowser1.Document.ActiveElement.GetAttribute("href").Length > 0 Then
TextBox2.Text = TextBox2.Text & WebBrowser1.Document.ActiveElement.GetAttribute("href") & vbNewLine
End If
End Sub
- We collect the link of every clicked element that has one into the textbox.
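One possible variation, sketched under the assumption that the clicked element is sometimes nested inside the link (an image or a span within an <a>) rather than being the link itself: walk up through the Parent elements until an <a> tag is found. The module and function names are invented for this example.

```vbnet
Imports System
Imports System.Windows.Forms

Public Module LinkFinder
    ' Hypothetical helper: returns the href of the nearest enclosing <a> element,
    ' or an empty string when the element is not inside a link.
    Public Function FindEnclosingLink(ByVal element As HtmlElement) As String
        Dim current As HtmlElement = element

        While current IsNot Nothing
            If String.Equals(current.TagName, "A", StringComparison.OrdinalIgnoreCase) Then
                Return current.GetAttribute("href")
            End If
            current = current.Parent   ' Nothing once the top of the tree is reached
        End While

        Return String.Empty
    End Function
End Module
```

In "aa2" it could be called as FindEnclosingLink(WebBrowser1.Document.GetElementFromPoint(e.ClientMousePosition)), appending the result to the textbox when it is not empty.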
Code:
Public Class Form1
Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
g = 0 'let the webbrowser sub run again for the new page
TextBox2.Clear()
WebBrowser1.Navigate(TextBox1.Text)
End Sub
Dim g As Integer
Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
If g = 0 Then
Dim kk As New Point
'For Each i In WebBrowser1.Document.All
'If i.getattribute("classname").Length > 0 Then
'TextBox2.AppendText(i.getattribute("classname") & vbNewLine)
'End If
'Next
For Each i In WebBrowser1.Document.All
i.focus()
If WebBrowser1.Document.ActiveElement.OffsetRectangle.Location <> kk Then
g = g + 1
TextBox2.Text = TextBox2.Text & "Name: " & WebBrowser1.Document.ActiveElement.TagName & vbNewLine
TextBox2.Text = TextBox2.Text & "Location: " & WebBrowser1.Document.ActiveElement.OffsetRectangle.Location.ToString & vbNewLine
TextBox2.Text = TextBox2.Text & "Element Size: " & WebBrowser1.Document.ActiveElement.OffsetRectangle.Size.ToString & vbNewLine
TextBox2.Text = TextBox2.Text & "Focused Element Number: " & g & vbNewLine & vbNewLine
End If
kk = WebBrowser1.Document.ActiveElement.OffsetRectangle.Location
Next
WebBrowser1.Document.ActiveElement.RemoveFocus()
MsgBox(g)
End If
End Sub
Private Sub Button2_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button2.Click
MsgBox(WebBrowser1.Document.ActiveElement.TagName)
End Sub
Private Sub Button3_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button3.Click
MsgBox(WebBrowser1.Document.ActiveElement.OuterHtml)
End Sub
Private Sub Button4_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button4.Click
MsgBox(WebBrowser1.Document.ActiveElement.InnerHtml)
MsgBox(WebBrowser1.Document.ActiveElement.InnerText)
End Sub
Dim g1 As Integer
Private Sub Button5_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button5.Click
If g1 = 1 Then
RemoveHandler WebBrowser1.Document.MouseOver, AddressOf aa1
Button5.Text = "Start Hover"
g1 = 0
Exit Sub
End If
AddHandler WebBrowser1.Document.MouseOver, AddressOf aa1
Button5.Text = "End Hover"
g1 = 1
End Sub
Private Sub aa1(ByVal sender As Object, ByVal e As System.Windows.Forms.HtmlElementEventArgs)
TextBox2.Clear()
TextBox2.Text = WebBrowser1.Document.GetElementFromPoint(e.ClientMousePosition).TagName & vbNewLine
TextBox2.Text = TextBox2.Text & WebBrowser1.Document.GetElementFromPoint(e.ClientMousePosition).GetAttribute("href")
End Sub
Dim g2 As Integer
Private Sub Button6_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button6.Click
If g2 = 0 Then
TextBox2.Clear()
AddHandler WebBrowser1.Document.Click, AddressOf aa2
WebBrowser1.AllowNavigation = False
Button6.Text = "End Click"
g2 = 1
Exit Sub
End If
RemoveHandler WebBrowser1.Document.Click, AddressOf aa2
WebBrowser1.AllowNavigation = True
Button6.Text = "Start Click"
g2 = 0
End Sub
Private Sub aa2(ByVal sender As Object, ByVal e As System.Windows.Forms.HtmlElementEventArgs)
If WebBrowser1.Document.ActiveElement.GetAttribute("href").Length > 0 Then
TextBox2.Text = TextBox2.Text & WebBrowser1.Document.ActiveElement.GetAttribute("href") & vbNewLine
End If
End Sub
End Class
Start the program, navigate to a site. See how the buttons work.