VB.NET: extract specific URLs using Regex
I have working code, but it extracts all the links from the site.
' Lazy (.+?) keeps the link text from running on to a later </a> on the same line
Dim strReg As String = "<a\s+href\s*=\s*""?([^"" >]+)""?>(.+?)</a>"
Dim reg As New Regex(strReg, RegexOptions.IgnoreCase)
I want to modify the code to match only specific URLs. For example, I only
want to extract URLs that contain /test/. My program should display only
links that have /test/ in them.
For example:
http://www.website.com/sample/test/
http://www.website.com/test/
What should I change in my regex? Thanks in advance.
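One possible way to narrow the pattern, as a minimal sketch rather than a definitive answer: require the captured href itself to contain /test/. The module name, the ExtractTestLinks helper, and the sample HTML below are made up for illustration. It assumes the same simple <a href="..."> shape the original pattern expects, with href as the first attribute and the link text on one line.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module TestLinkFilter
    ' Variant of the pattern from the question: group 1 (the href) must contain "/test/".
    Private Const Pattern As String =
        "<a\s+href\s*=\s*""?([^"" >]*/test/[^"" >]*)""?>(.+?)</a>"

    ' Hypothetical helper: returns every href in html that contains "/test/".
    Function ExtractTestLinks(html As String) As List(Of String)
        Dim result As New List(Of String)()
        Dim reg As New Regex(Pattern, RegexOptions.IgnoreCase)
        For Each m As Match In reg.Matches(html)
            result.Add(m.Groups(1).Value)
        Next
        Return result
    End Function

    Sub Main()
        ' Made-up sample input: two links contain /test/, one does not.
        Dim html As String =
            "<a href=""http://www.website.com/sample/test/"">Sample</a>" &
            "<a href=""http://www.website.com/other/"">Other</a>" &
            "<a href=""http://www.website.com/test/"">Test</a>"
        For Each url As String In ExtractTestLinks(html)
            Console.WriteLine(url)
        Next
    End Sub
End Module
```

Because RegexOptions.IgnoreCase is kept from the original code, /TEST/ would match as well; drop the option if the match should be case-sensitive.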
Here is my updated working code:
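Dim links As New List(Of String)()
Dim htmlDoc As New HtmlAgilityPack.HtmlDocument()
htmlDoc.LoadHtml(WebSource)
' SelectNodes returns Nothing when the page has no <a href> elements
Dim anchors As HtmlNodeCollection = htmlDoc.DocumentNode.SelectNodes("//a[@href]")
If anchors IsNot Nothing Then
    For Each link As HtmlNode In anchors
        Dim att As HtmlAttribute = link.Attributes("href")
        ' Keep only links whose URL contains /test/
        If att.Value.Contains("/test/") Then
            ListBox1.Items.Add(att.Value)
        End If
    Next
End If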
It now displays all URLs containing /test/, but I also want to extract URLs from a
Google search result. Is that possible?
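A rough sketch of one way this might be approached, under a loud assumption: that the hrefs on the result page are redirect links of the form /url?q=<escaped target>&..., which may not match what Google actually serves. The module name, the UnwrapRedirect helper, and the sample hrefs are hypothetical. In the loop above, each att.Value would be passed through such a helper before the /test/ check.

```vbnet
Imports System

Module SearchResultSketch
    ' ASSUMPTION: result links look like "/url?q=<escaped target>&...".
    Private Const RedirectPrefix As String = "/url?q="

    ' Hypothetical helper: returns the target URL, or Nothing if href is not a redirect link.
    Function UnwrapRedirect(href As String) As String
        If href Is Nothing OrElse Not href.StartsWith(RedirectPrefix, StringComparison.Ordinal) Then
            Return Nothing
        End If
        Dim target As String = href.Substring(RedirectPrefix.Length)
        ' Cut off any further query parameters after the target URL.
        Dim amp As Integer = target.IndexOf("&"c)
        If amp >= 0 Then target = target.Substring(0, amp)
        ' Undo percent-encoding such as %3A and %2F.
        Return Uri.UnescapeDataString(target)
    End Function

    Sub Main()
        ' Made-up hrefs standing in for values read from att.Value.
        Dim hrefs As String() = {
            "/url?q=http://www.website.com/test/&sa=U",
            "/search?q=something",
            "/url?q=http%3A%2F%2Fwww.website.com%2Fsample%2Ftest%2F&sa=U"
        }
        For Each href As String In hrefs
            Dim target As String = UnwrapRedirect(href)
            If target IsNot Nothing AndAlso target.Contains("/test/") Then
                Console.WriteLine(target)
            End If
        Next
    End Sub
End Module
```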