Last time we left off with a GUI for our keyword checker program. What would be the next logical building block to add now that we have our checker with a nice preview pane?

Well, the preview pane is not very handy if it does not show the part that we are looking for, so let us improve that today. We will use regular expressions to achieve this. In short, a regular expression acts as a text search, but instead of searching for one specific keyword, it uses a pattern that matches a whole set of text results. So our first piece of code is to add a reference to System.Text.RegularExpressions at the top of our form code.
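Concretely, that means adding this using directive at the top of the form's code file, next to the ones Visual Studio generated (the same pattern we used for System.Net in the first part):

```csharp
using System.Text.RegularExpressions;
```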

But first, let’s digress a bit to describe how we are going to scroll the preview pane to our desired location. Since the web page is in HTML format, we can find an HTML element on the page that we can just scroll to. We just need the id of that element. Thankfully, with regular expressions we can get a list of all the ids inside the page, and then scroll to the one that is closest to our content. Simple.

The regex (a shorter term for regular expression) that we are going to use to find all matches of element ids is this: @"id=""\s*?\S*?""" (note that inside a C# verbatim string a double quote is written twice, so the actual pattern is id="\s*?\S*?").
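To get a feel for what this pattern matches, here is a small self-contained sketch. The HTML snippet is made up for illustration; a real page's content will of course be much longer:

```csharp
using System;
using System.Text.RegularExpressions;

class RegexIdDemo
{
    static void Main()
    {
        // Hypothetical page content standing in for a downloaded page.
        var pageContent = @"<div id=""header""><p id=""news"">Final Fantasy!</p></div>";

        // Each match is a full id attribute, e.g. id="header",
        // and each match also carries its position in the string.
        var pageIds = Regex.Matches(pageContent, @"id=""\s*?\S*?""");

        foreach (Match id in pageIds)
        {
            Console.WriteLine(id.Value + " found at index " + id.Index);
        }
        // Output:
        // id="header" found at index 5
        // id="news" found at index 20
    }
}
```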

And we will use it as follows:

var pageIds = Regex.Matches(pageContent, @"id=""\s*?\S*?""");

This will give us the page ids we wanted, conveniently stored as a list of matches in our pageIds variable. Now we also need a private function that will give us the element closest to our content. A function is a piece of code that does a specific task; we usually create one for each simple task we need, so that we can use it in different parts of our program without rewriting the same code over and over. It could even be used by other programs, if it weren’t for that private adjective I’ve used (in technical terms called an access modifier). The private access modifier limits the use of the function to within the same class, in our case the program’s form. We are happy with that, so let’s move on.

Here’s our function:

private string closestId(int keywordLocation, MatchCollection matchingIds)
{
    int? closestId = null;
    string closestIdName = null;
    foreach (Match id in matchingIds)
    {
        if (closestId != null)
        {
            int idDistance = Math.Abs(id.Index - keywordLocation);
            if (idDistance < closestId.Value)
            {
                closestId = idDistance;
                closestIdName = id.Value;
            }
        }
        else
        {
            closestId = Math.Abs(id.Index - keywordLocation);
            closestIdName = id.Value;
        }
    }
    return closestIdName;
}

The function, which I named closestId, takes two parameters. The first one is the index of our original keyword search (described in the first part of the tutorial), and the second is the list of regex matches. What is important is that this list of matches contains the value and index of each match. The function iterates through the list of matches to find the one closest to our keywordLocation. The distance between each match and the keyword is calculated with the absolute value function Math.Abs (now that is a handy public function!). Every time a new minimum distance is found, it replaces the current minimum, together with the name of the matching id. Initially the closest distance is null, so the first match in the list is always set as the closest one in the first iteration. Once the loop ends, we just return the name of the closest id that we found. The function is then called from the button’s click handler like this:

string matchedId = closestId(keywordLocation, pageIds);

Actually, we just need the id of the element without the id= part, so let’s go ahead and strip it off:

string idTag = matchedId.Substring(4, matchedId.Length - 5);
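For example, assuming the closest match turned out to be id="news" (a made-up value for illustration), the stripping works like this:

```csharp
using System;

class StripIdDemo
{
    static void Main()
    {
        // Hypothetical matched value; real values come from the regex matches.
        string matchedId = @"id=""news""";

        // Skip the 4 characters of id=" at the start and drop the closing quote.
        string idTag = matchedId.Substring(4, matchedId.Length - 5);

        Console.WriteLine(idTag); // prints: news
    }
}
```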

This last piece of code can also go inside the closestId function, so feel free to put it there. The last piece of the puzzle is to navigate to the page as we did before, but by adding the id to the url (prefixed with a hash sign) we get the nice effect of scrolling the element with this id into view.

brwPreview.Navigate(url + "#" + idTag);

This method is not guaranteed to work 100% of the time, as some websites may not have any elements with ids, or the closest id may still be far from our content, but it’s a start. I also increased the size of the window from the previous tutorial so that we have more space for the preview pane. The full source code for this tutorial is available on GitHub. Here is a sample screenshot.

It is all good to have the basics of a program working on the console, but we can do better than that. Today we shall add a graphical user interface (GUI) to our keyword checker.

Let us start a new project like we did in part one, only this time choose Windows Forms Application as the project type. This will allow us to add an interface with clickable buttons. Once the project is ready, we will have a number of components that can be dragged and dropped from our toolbox (shown below) to our form (the user window).

Add the following components to the form as follows:

Component type    Name           Text content
label             lblUrl         URL
label             lblKeywords    Keywords
textbox           txtUrl
textbox           txtKeywords
button            btnCheck       Check
label             lblResult
webBrowser        brwPreview

The resulting form should look as shown below. We have two text controls so that our users can enter the url and keywords, a button that will trigger the search process, a label to show a snippet of the website content containing the selected keywords, and a web browser to show a short preview of the page that the user is searching. There is no doubt that once finished, this interface will look better than the console application that we had before.

Now we need to convert the code from our console application to make it work with our new form. The code should run when the user clicks the button. We create a handler for the button by double-clicking it in the designer; the editor wires up the necessary handler for us. When the application runs, handlers make sure that the correct code is executed when particular events are triggered, in this case the click event of the Check button. Here is the code that goes in the btnCheck_Click(...) method that was just created for us, followed by the explanation:

var client = new WebClient();
var url = txtUrl.Text;
url = !string.IsNullOrEmpty(url) && Uri.IsWellFormedUriString(url, UriKind.Absolute)
    ? url
    : "http://www.gametrailers.com";
var keywords = txtKeywords.Text;
keywords = !string.IsNullOrEmpty(keywords) ? keywords : "final fantasy";
var pageContent = client.DownloadString(url);
var keywordLocation = pageContent.IndexOf(keywords,
    StringComparison.OrdinalIgnoreCase);
StringBuilder sb = new StringBuilder();
if (keywordLocation >= 0)
{
    sb.AppendFormat("{0} are talking about {1} today.", url, keywords);
    sb.Append("\n\nSnippet:\n" + pageContent.Substring(keywordLocation, 100));
    brwPreview.Navigate(url);
}
else
{
    sb.Append("Keyword not found!");
}
lblResult.Text = sb.ToString();

As usual, first we create a web client instance that will be used to fetch the results. Then comes the new part. Instead of reading the user’s input from the console, we read the url from the form’s textbox. This is done by using the textbox’s name (txtUrl), reading its Text value (which holds the user’s input), and assigning it to the url variable. We then check if the url is valid, and fall back to the default one otherwise (to understand how this works, take a look at the previous tutorial). We do likewise with the keywords textbox. Then, as before, we read the page’s content, check if the required keywords exist, and display the results on the user’s screen.

One difference this time is that we use the StringBuilder class to prepare the output before displaying it (by copying it to lblResult.Text), as opposed to directly writing each part of the result to the console.
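Here is a quick sketch of that pattern, with made-up values standing in for the real url and keywords:

```csharp
using System;
using System.Text;

class StringBuilderDemo
{
    static void Main()
    {
        // Build the whole message first, then hand it over in one go,
        // instead of writing each piece to the console separately.
        var sb = new StringBuilder();
        sb.AppendFormat("{0} are talking about {1} today.",
            "http://www.example.com", "final fantasy");
        sb.Append("\n\nSnippet:\nFinal Fantasy news...");

        // In the form, this single string would be assigned to lblResult.Text.
        Console.WriteLine(sb.ToString());
    }
}
```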

The other difference is that now that we have a graphical interface, we can embed a preview of the page inside our form. This can be achieved quickly by using the browser component from our toolbox and pointing it to the selected url (simply done by using the Navigate method). We will improve this in a future tutorial.

Running the project and entering some url and keywords will look as shown here:

Again, a full version of the code is also available on GitHub here.

In the last tutorial, I showed you how to do a quick web client to find out if a particular site contains the keywords that you are interested in. Today, we’ll make a small addition. We are going to add command line parameters, so we can look for the U.S. job situation on a whim.

Command line parameters allow us to add settings to a program when it is being launched, so that our program’s users can choose where and what to look for without having to recompile the program.

If we look at the Main method we wrote earlier, we can see that it accepts the following parameter:

string[] args

This means that any parameters from the command line are contained in this array of strings.

Let’s say that we are going to accept two parameters: the first one for the url to be checked, and the second one for the keywords to find. If only one or no parameters are passed, we will default to our settings that were used in the previous example. The following code is used to read the url parameter:

var url = args.Length > 0 && Uri.IsWellFormedUriString(args[0], UriKind.Absolute)
    ? args[0]
    : "http://www.gametrailers.com";

Here we are creating a new variable called url and setting it to a value. Now comes the interesting part. We are using a ternary operator to set this value. A ternary operator has the following format:

(condition) ? (value when true) : (value when false)

So it is really a shorthand to check for a particular condition and set a value accordingly. It is mostly used when the condition has two possible outcomes. In the conditional statement we check that there are command line parameters (args.Length > 0) and (&&) that the first parameter (args[0]) is a well formed url. The ternary operator then makes sure that if both of these conditions are true, we use the passed url; otherwise we use the default gametrailers url.
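Here is the same shape with simpler, made-up values, alongside the equivalent long form:

```csharp
using System;

class TernaryDemo
{
    static void Main()
    {
        int score = 80;

        // (condition) ? (value when true) : (value when false)
        string verdict = (score >= 50) ? "pass" : "fail";
        Console.WriteLine(verdict); // prints: pass

        // The equivalent if/else, for comparison:
        string verdict2;
        if (score >= 50)
            verdict2 = "pass";
        else
            verdict2 = "fail";
        Console.WriteLine(verdict2); // prints: pass
    }
}
```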

Next we do the same with the keywords parameter:

var keywords = args.Length > 1 ? args[1] : "final fantasy";

This time we only check if there is a second parameter (args.Length > 1) and use it (args[1]) if so, falling back to the default keywords otherwise. Now let’s end by trying some searches.
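To see both checks together, here is a self-contained sketch that simulates the command line arguments. The array values and the PickUrl/PickKeywords helper names are made up for illustration; in the real program the values come straight from Main’s args parameter:

```csharp
using System;

class ArgsDemo
{
    static string PickUrl(string[] args)
    {
        return args.Length > 0 && Uri.IsWellFormedUriString(args[0], UriKind.Absolute)
            ? args[0]
            : "http://www.gametrailers.com";
    }

    static string PickKeywords(string[] args)
    {
        return args.Length > 1 ? args[1] : "final fantasy";
    }

    static void Main()
    {
        // Simulating: keywordCheck.exe http://www.example.com "mass effect"
        var args = new[] { "http://www.example.com", "mass effect" };
        Console.WriteLine(PickUrl(args));        // prints: http://www.example.com
        Console.WriteLine(PickKeywords(args));   // prints: mass effect

        // Simulating: keywordCheck.exe (no parameters) -- both defaults kick in.
        var noArgs = new string[0];
        Console.WriteLine(PickUrl(noArgs));      // prints: http://www.gametrailers.com
        Console.WriteLine(PickKeywords(noArgs)); // prints: final fantasy
    }
}
```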

The full version of the code is also available on GitHub here.

In this short introduction to C#, I tried to do something different by making a program that scrapes a website to report whether it is showing content that we want to see today. Scraping means having an automated program retrieve a website's content in order to extract some useful information. The example that we will build will check if our favourite website (e.g. gametrailers.com) is posting some information about our favourite game (e.g. Final Fantasy) and any of its modern versions and updates. If so, we can then visit the Gametrailers website safe in the knowledge that we will see something about Final Fantasy.

First, open Visual Studio and create a new console application. I named mine keywordCheck, but you are free to choose your own name.

This will create a standard program class containing a Main method that will be executed every time that we run the program. It is currently empty, so let us fix that.

Since we will be using the system's web client library to connect to and fetch the required page, let us add a reference to that at the top of our class:

using System.Net;

Now let us try to fetch the page that we require, using the following code:

static void Main(string[] args)
{
    var client = new WebClient();
    var url = "http://www.gametrailers.com";
    Console.Write(client.DownloadString(url));
    Console.ReadKey();
}

Here, we are first initialising a new instance of a web client and setting it to the client variable. Then we are setting the url variable with the required url to fetch, and finally we instruct the client to fetch this url for us and output the page’s HTML to the console. When we run the program, we can confirm that we are indeed fetching the page:

That’s great, but we’re still not there yet. Let us add a new variable to hold the keywords in. Then we can make a check to see if these keywords are included in the downloaded web page. If the website includes the text that we are looking for, we display a confirmation message:

var client = new WebClient();
var url = "http://www.gametrailers.com";
var keywords = "final fantasy";
var pageContent = client.DownloadString(url);
if (pageContent.IndexOf(keywords, StringComparison.OrdinalIgnoreCase) >= 0)
{
    Console.WriteLine(url + " are talking about " + keywords + " today.");
}
Console.ReadKey();

The IndexOf method returns the position in the page where the keywords were found, counting from zero, or -1 if they are not present. We also instruct this method to ignore the case when comparing strings, so we still find the keywords even if they appear in a different case. The if statement will display a message if the number returned by IndexOf is zero or greater.
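A tiny sketch of this behaviour, with made-up text in place of the downloaded page:

```csharp
using System;

class IndexOfDemo
{
    static void Main()
    {
        var pageContent = "News about Final Fantasy today";

        // Found despite the different casing, thanks to OrdinalIgnoreCase.
        var hit = pageContent.IndexOf("final fantasy", StringComparison.OrdinalIgnoreCase);
        Console.WriteLine(hit);  // prints: 11

        // Not found: IndexOf returns -1.
        var miss = pageContent.IndexOf("half life", StringComparison.OrdinalIgnoreCase);
        Console.WriteLine(miss); // prints: -1
    }
}
```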

To finish off this tutorial, we will also display a snippet of the text where the keywords are included in the fetched website. Nothing big and fancy, but it will give us a general idea of what the page’s content is.

static void Main(string[] args)
{
    var client = new WebClient();
    var url = "http://www.gametrailers.com";
    var keywords = "final fantasy";
    var pageContent = client.DownloadString(url);
    var keywordLocation = pageContent.IndexOf(keywords,
        StringComparison.OrdinalIgnoreCase);
    if (keywordLocation >= 0)
    {
        Console.WriteLine(url + " are talking about " + keywords + " today.");
        Console.WriteLine("\nSnippet:\n" + pageContent.Substring(
            keywordLocation, 100));
    }
    Console.ReadKey();
}

And here is the result:

Next time we will see how to improve upon this code, by adding command line parameters or a GUI, for example. A full version of the code is also available on GitHub here, and part 2 of the tutorial is here.

After the recent release of Internet Explorer 11, you may have noticed that you cannot log in to your ASP.NET website with this browser if you are using forms authentication with cookies.

You may also have noticed that the session id is being stored in the url (some additional characters are being added to the site’s url seemingly out of nowhere), instead of a cookie when browsing the site only with IE11.

This happens because the cookieless parameter is not specified explicitly (with a value such as UseDeviceProfile or AutoDetect), so the behaviour is browser dependent. To solve this issue, the parameter has to be forced so that all browsers use cookies to store the session id. Here is an example of the required change:

<authentication mode="Forms">
  <forms cookieless="UseCookies" loginUrl="Account/Login" timeout="2880" />
</authentication>