How do I extract specific content from a web page's source code?
A database web page has the following structure:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
2015年9月6日 17:18
农业药械:
瞬时流量:175.30 m3/h
累计流量:79438 m3
My program downloads the page to an HTML file named 1400.html, whose source is shown above. I now want to extract the line
农业药械:
How should I code this?
------ Solution ------
Imports System.Text.RegularExpressions
Imports System.Text
Public Class Form1
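Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
' Application.StartupPath has no trailing separator, so build the path with Path.Combine.
Dim strWeb As String = IO.File.ReadAllText(IO.Path.Combine(Application.StartupPath, "1400.html"), Encoding.Default)
' Show the raw page source to confirm the file was read.
MsgBox(strWeb)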
Dim re As Regex
re = New Regex("
(.*?)
", RegexOptions.IgnoreCase)
If re.IsMatch(strWeb) Then
MsgBox(re.Match(strWeb).Groups(1).Value)
End If
re = New Regex("
(.*?)
", RegexOptions.IgnoreCase)
If re.IsMatch(strWeb) Then
For Each mat As Match In re.Matches(strWeb)
MsgBox(mat.Groups(1).Value)
Next
End If
End Sub
End Class
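
The tags that enclosed (.*?) in the two patterns above are not visible in the post, so here is a minimal self-contained sketch of the same approach under a stated assumption: that each value sits inside a <td> cell. The <td> tag, the SampleHtml string, and the ExtractCells helper are hypothetical and used only for illustration. Replace the tag with whatever actually surrounds the target text in 1400.html.

```vb
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module ExtractDemo
    ' Returns the inner text of every <td> cell in the input.
    ' ASSUMPTION: values are wrapped in <td>...</td>; the real tags are not shown in the post.
    Function ExtractCells(html As String) As List(Of String)
        Dim cells As New List(Of String)
        ' Singleline lets "." cross line breaks inside a cell; "*?" keeps each match as short as possible.
        Dim re As New Regex("<td[^>]*>(.*?)</td>", RegexOptions.IgnoreCase Or RegexOptions.Singleline)
        For Each m As Match In re.Matches(html)
            cells.Add(m.Groups(1).Value.Trim())
        Next
        Return cells
    End Function

    Sub Main()
        ' Hypothetical stand-in for the contents of 1400.html.
        Dim SampleHtml As String =
            "<tr><td>2015年9月6日 17:18</td></tr>" & vbCrLf &
            "<tr><td>农业药械:</td></tr>" & vbCrLf &
            "<tr><td>瞬时流量:175.30 m3/h</td></tr>"

        Dim cells As List(Of String) = ExtractCells(SampleHtml)
        ' In this sample the target line is the second cell (index 1).
        Console.WriteLine(cells(1))
    End Sub
End Module
```

If the position of the target cell can change, it may be safer to loop over the returned list and pick the entry that starts with the label you want instead of relying on a fixed index.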