Playing “Fetch” with ASP.NET

August 17th, 2012

ASP.NET has many inherent advantages, which is why many .NET development companies prefer it. Because it has access to the same .NET Framework classes that every other .NET application uses, ASP.NET removes the need for COM components in many common tasks, which speeds up custom .NET development. There is a lot you can accomplish with ASP.NET:

  • Screen scraping
  • Sending email messages
  • Regular expressions
  • Dynamic GIF and JPG image creation
  • And much more…

In this article we shall focus on screen scraping with ASP.NET. In screen scraping, one program extracts data from the HTML output of another program. Typical uses are pulling weather data from the Met department’s website into another portal, or showing scores from a sports website on your personal blog.

The basic tasks involved here can be broken down into:

  • Creating a WebResponse object and feeding its response stream into a StreamReader instance
  • Skipping blank lines and appending the remaining lines to a StringBuilder with the StringBuilder.Append method
  • Converting the StringBuilder to a string to get the entire scraped HTML content
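
To see what the last two steps do in isolation, here is a minimal sketch that runs without a network connection. The class name ScrapeSketch, the method CollectNonBlankLines and the sample markup are made up for illustration; a StringReader stands in for the StreamReader that would normally wrap the response stream.

```csharp
using System;
using System.IO;
using System.Text;

public static class ScrapeSketch
{
    // Reads every line from the reader, skips empty ones,
    // and appends the rest to a StringBuilder (no separator is added).
    public static string CollectNonBlankLines(TextReader reader)
    {
        var builder = new StringBuilder();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Length > 0)
                builder.Append(line);
        }
        return builder.ToString();
    }

    public static void Main()
    {
        // Hypothetical page content used in place of a real web response
        string fakeResponse = "<html>\n\n<body>Live Score: <b>245/3</b></body>\n</html>";
        using (var reader = new StringReader(fakeResponse))
        {
            // Prints: <html><body>Live Score: <b>245/3</b></body></html>
            Console.WriteLine(CollectNonBlankLines(reader));
        }
    }
}
```

Note that Append joins the lines with nothing in between, so the result is a single line of HTML. That is fine for searching the markup, but text that was split across lines in the source page will run together.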

Once these things are done, we should have the required content from the desired URL in a string variable, which we can manipulate or display in whichever way we like. What would we use this output for? Let’s take a practical example. Suppose you want to display the score of a cricket match from the (fictitious) website http://cricketultra/latestscore.aspx.

ASPX Code:

<table>
  <tr>
    <td align="left">Live Score:</td>
    <td align="right"><asp:Label ID="LivScr" runat="server" /></td>
  </tr>
</table>

HTML Conversion Function:

// Requires: using System.IO; using System.Net; using System.Text;
private string ScrapHTML()
{
    string strLine;
    string strHTML = string.Empty;
    string strURL = "http://cricketultra/latestscore.aspx";
    StringBuilder StringBuilder1 = new StringBuilder();

    try
    {
        // Open the requested URL
        WebRequest WebRequest1 = WebRequest.Create(strURL);

        // Get the stream from the returned web response.
        // The using blocks close the response and the reader even if reading fails.
        using (WebResponse WebResponse1 = WebRequest1.GetResponse())
        using (StreamReader StreamReader1 = new StreamReader(WebResponse1.GetResponseStream()))
        {
            // Read the stream a line at a time and place each one into the StringBuilder
            while ((strLine = StreamReader1.ReadLine()) != null)
            {
                // Ignore blank lines
                if (strLine.Length > 0)
                    StringBuilder1.Append(strLine);
            }
        }

        // Keep the scraped page in a string so it can be used without reconnecting
        strHTML = StringBuilder1.ToString();
    }
    catch (WebException)
    {
        // The request failed (bad host, timeout, HTTP error): return an empty string
    }
    catch (IOException)
    {
        // The connection dropped while reading: return an empty string
    }
    return strHTML;
}

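On the Page:

protected void Page_Load(object sender, EventArgs e)
{
    string strHTML = ScrapHTML();
    if (strHTML != string.Empty)
    {
        // Find the label, then the <b>...</b> pair that follows it
        int intPos1 = strHTML.IndexOf("Live Score:", StringComparison.Ordinal);
        int intPos2 = intPos1 >= 0 ? strHTML.IndexOf("<b>", intPos1, StringComparison.Ordinal) : -1;
        int intPos3 = intPos2 >= 0 ? strHTML.IndexOf("</b>", intPos2, StringComparison.Ordinal) : -1;

        if (intPos3 >= 0)
        {
            // Skip the 3 characters of "<b>" and take everything up to "</b>"
            int intStart = intPos2 + 3;
            LivScr.Text = strHTML.Substring(intStart, intPos3 - intStart);
        }
        // To inspect the whole page instead: LivScr.Text = strHTML;
    }
}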
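
The index arithmetic in Page_Load is easy to get wrong, so it can help to move it into a small helper that can be tried against sample markup. The sketch below is one way to do that; the class name ScoreParser, the method ExtractAfterLabel and the sample HTML are hypothetical, and it assumes, as the example above does, that the score sits inside the first <b> element after the label.

```csharp
using System;

public static class ScoreParser
{
    // Returns the text between openTag and closeTag that follows label,
    // or null when the input is empty or any of the three markers is missing.
    public static string ExtractAfterLabel(string html, string label, string openTag, string closeTag)
    {
        if (string.IsNullOrEmpty(html)) return null;

        int labelPos = html.IndexOf(label, StringComparison.Ordinal);
        if (labelPos < 0) return null;

        int openPos = html.IndexOf(openTag, labelPos + label.Length, StringComparison.Ordinal);
        if (openPos < 0) return null;

        // The wanted text starts right after the opening tag
        int start = openPos + openTag.Length;
        int closePos = html.IndexOf(closeTag, start, StringComparison.Ordinal);
        if (closePos < 0) return null;

        return html.Substring(start, closePos - start);
    }

    public static void Main()
    {
        // Hypothetical markup shaped like the fictitious score page
        string html = "<td>Live Score:</td><td><b>245/3</b></td>";

        Console.WriteLine(ExtractAfterLabel(html, "Live Score:", "<b>", "</b>"));       // 245/3
        Console.WriteLine(ExtractAfterLabel(html, "Overs:", "<b>", "</b>") == null);    // True
    }
}
```

In Page_Load this could be called as LivScr.Text = ScoreParser.ExtractAfterLabel(strHTML, "Live Score:", "<b>", "</b>") ?? string.Empty; so a change in the source page's layout leaves the label blank instead of throwing an exception.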

GoodCore is a leading offshore software development company. We at GoodCore are committed to making the most of ASP.NET, and have built a team of experienced ASP.NET experts. You can hire high-caliber .NET developers from us who are committed to delivering cost-effective solutions within defined timeframes.
