Getting All the Links on a Page

One thing I keep having to do again and again (why???) is extract links from a webpage. I recently created a tiny application that gives you the list of all the links on a given web page, and I thought I’d share it with everyone. Given the URL of the page, the first thing to do is to get the source code of the page so that we can scan it for links. There are a couple of ways to do this, but the easiest way, as far as I’m aware, is to use a WebClient object:

    WebClient webClient = new WebClient();

    //Get the HTML from the given page
    byte[] response_html = webClient.DownloadData(url);

    UTF8Encoding utf8 = new UTF8Encoding();

    string html = utf8.GetString(response_html);

Read more about the WebClient class at MSDN.

Now that we have the source code, the next thing to do is search for all the links. You can do this the hard way, i.e., use the String class’s IndexOf method and hack your way out of the predicaments that come your way (and trust me, there are quite a few of them).

The easy way is to use regular expressions.

The pattern for matching the href = "wherever.whatever" (i.e., the URL) part of a link is: href\s*=\s*(?:"(?<1>[^"]*)"|(?<1>\S+)). It looks a little ugly, but the point is that it works in about 90% of cases. What that regular expression actually means is the subject of another article, but suffice it to say, it works.
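To see what the pattern actually captures, here is a minimal sketch that runs it against a made-up HTML string. The class name LinkPatternDemo, the method ExtractHrefs and the sample HTML are my own illustration, not part of the application described here.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkPatternDemo
{
    // The pattern from the article, written as a verbatim string so the
    // backslashes survive; inside @"..." a literal quote is written as "".
    private static readonly Regex HrefPattern = new Regex(
        @"href\s*=\s*(?:""(?<1>[^""]*)""|(?<1>\S+))",
        RegexOptions.IgnoreCase);

    // Returns the text captured by group 1 (the URL) for every href found.
    public static List<string> ExtractHrefs(string html)
    {
        List<string> links = new List<string>();
        foreach (Match m in HrefPattern.Matches(html))
        {
            links.Add(m.Groups[1].Value);
        }
        return links;
    }

    public static void Main()
    {
        // Hypothetical input: one quoted href and one unquoted href.
        string html =
            "<a href=\"http://example.com/a\">A</a> <A HREF = page.html >B</A>";

        foreach (string link in ExtractHrefs(html))
        {
            Console.WriteLine(link);
        }
    }
}
```

Note that the unquoted branch (\S+) keeps going until it hits whitespace, so an unquoted or single-quoted href that is followed directly by > will drag the rest of the tag along with it. That is one of the cases where the pattern falls short.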

So, the code then:

private void GetLinks(string url)
{
     //using System.Net
     WebClient webClient = new WebClient();

     //Get the HTML from the given page
     byte[] response_html = webClient.DownloadData(url);

     //using System.Text
     UTF8Encoding utf8 = new UTF8Encoding();

     string html = utf8.GetString(response_html);

     //using System.Text.RegularExpressions
     //Verbatim string: backslashes stay as they are, quotes are doubled
     Regex r = new Regex(
         @"href\s*=\s*(?:""(?<1>[^""]*)""|(?<1>\S+))",
         RegexOptions.IgnoreCase | RegexOptions.Compiled);

     //Get all the matches
     MatchCollection mcl = r.Matches(html);

     //using System.Collections
     ArrayList a = new ArrayList();

     //Group 1 holds the URL itself; m.Value would be the whole href="..." text
     foreach (Match m in mcl)
         a.Add(m.Groups[1].Value);

     //A gridview object
     grdLinks.DataSource = a;
     grdLinks.DataBind();
}

Changing Windows Icons Without 3rd Party Applications

Today, we’re going to learn such vitally important things as changing the folder icons in Windows XP. And no, you don’t actually need to install 3rd party applications to tweak your icons. You just need to know where to go to change most of the icons; changing the default folder icon, however, is a bit more complicated, but not much.

Let’s start with the easy part:

Changing Desktop Icons

Right-click on an empty area of your desktop and select Properties. Go to the Desktop tab and click the Customize Desktop button (below the list of wallpapers you can choose for your background). As might be self-evident from the screen, you basically click on each unhappy icon in the middle, click the Change Icon button and point the file browser to the proper location of your new happy icon.

Changing Other Icons

What if you want to change the Microsoft Word Document icon or your mp3 icon or maybe even your open folder icon? It’s very easy. Double-click on your My Computer folder (or practically any other folder, for that matter). Go to Tools » Folder Options » File Types. Once again, what you need to do is rather obvious. If you scroll down the list of registered file types, you’ll find that there’s practically everything in there, including the open folder one and the normal folder one. To change the icon, you basically do the same thing you did above: select the file type whose icon you want to change, click the Advanced button, click the Change Icon button in the dialog that pops up, and point to your new icon.

Changing the Default Folder Icon

If you tried to follow the method mentioned just above this paragraph to change the default folder icon, you might have noticed that the method…err, failed. For some odd reason, Windows doesn’t give you an easy way to change the default folder icon, which kind of sucks, because I rather dislike the default folder icon. It looks particularly ugly when compared to all my new icons. So, after some researching and much head-banging, I found out that there was a pretty easy way to change the default folder icon.

Open up your registry editor (Start » Run » type "regedit" in the textbox of the dialog that pops up). Navigate to HKEY_LOCAL_MACHINE / SOFTWARE / Microsoft / Windows / CurrentVersion / Explorer / Shell Icons. Find the registry value named 3. Right-click on it and select Modify in the menu. Type in the path to your new icon, click OK, and restart the computer. That’s it! You have a new default folder icon. :-) (Sometimes, though, you might need to refresh the icon cache for your new icon to turn up.)

Repairing Corrupted Registry File (If It Gets Corrupted, That Is)

As I was doing this, however, I encountered a strange problem: whenever I double-clicked on a folder, Windows opened the search page instead of the folder?? Bizarre. I thought I had irreparably screwed up my operating system until I found out that the responsible registry entry sometimes gets corrupted. The site provides a fix [.vbs file] and explains how to use it, but in case you don’t want to run the file, you can also repair the registry entry manually:

Open up your registry editor (Start » Run » type "regedit" in the textbox of the dialog that pops up). Navigate to HKEY_CLASSES_ROOT/Directory/shell. Now right-click on the (Default) value you can see in the right panel and select Modify from the menu. For the Value Data, enter none. This worked for me.

Now, if this tutorial means that you’ve run out of excuses to not go icon-hunting… ;-)

Sql Transactions 101

Transactions allow you to batch a set of SQL statements so that all of them either succeed or fail together.

In .NET, it's especially easy to create transactions using SqlTransaction. Suppose you have a monkey object and a fingers object. When you create a monkey, you would of course want to create the monkey’s fingers, as well. And so, you first create the monkey object, get the ID of that object, and then go create the fingers.

The code might go something like this:

public void CreateMonkey()
{
    Monkey monkey = new Monkey();

    monkey.Id = Guid.NewGuid();

    //Set other monkey data here

    bool result = Insert(monkey);

    if (result)
    {
        Fingers fingers = new Fingers(10);

        monkey.Hand.Fingers = fingers;

        result = Insert(monkey.Hand.Fingers);

        if (result)
            Console.WriteLine("Success! Good job!");
        else
            Console.WriteLine("Failure! Dismal, just dismal!");
    }
}

You’ll run into problems when you use this kind of code, as might be obvious to you already. Suppose you successfully created the monkey, but for some odd reason, creating the fingers failed. What now? You have a monkey without fingers running around, which is sad for the monkey and creepy for us humans.

Yes, this is exactly where transactions come in. Transactions let you batch SQL statements together so that either all of them succeed or all of them fail. So, if you were using transactions, you’d either get “no monkey and no fingers” or “a monkey with fingers.”

.NET 2.0 makes it especially easy to use transactions. ;-) I recently had to work with them, and I was impressed with how intuitive they were.

When you want to use a transaction, you basically need to follow these steps:

  1. Create a SqlConnection object and open it.
  2. Create a SqlTransaction object.
  3. Link the transaction to the SqlConnection before running the first SQL statement of the batch.
  4. Link the transaction to the SqlCommand object of each SQL statement in the batch.
  5. If everything succeeds, commit the transaction. If there’s an error, roll back the transaction.

So, the modified code:

//using System.Data.SqlClient
public void CreateMonkey()
{
    //Need to initialize it to null because otherwise .NET will
    //complain that we're using an uninitialized object.
    SqlTransaction tr = null;

    SqlConnection cn = new SqlConnection("ConnectionString");

    try
    {
        cn.Open();

        tr = cn.BeginTransaction();

        string sqlInsertMonkey = "Insert Monkey into database";

        SqlCommand cmd = new SqlCommand(sqlInsertMonkey, cn, tr);

        if (cmd.ExecuteNonQuery() == 0)
            throw new Exception("Failed to insert monkey");

        string sqlInsertFingers = "Insert Monkey's fingers into database";

        cmd = new SqlCommand(sqlInsertFingers, cn, tr);

        if (cmd.ExecuteNonQuery() == 0)
            throw new Exception("Failed to insert fingers");

        //Everything executed successfully, so commit the
        //transaction
        tr.Commit();
    }
    catch (Exception ex)
    {
        //We need to check this, because the exception might
        //have been thrown before the transaction was started
        //(for example, if opening the connection failed).
        if (tr != null)
            tr.Rollback();

        //Show the appropriate error message here
    }
    finally
    {
        //Close the SqlConnection whether we committed or rolled back
        cn.Close();
    }
}

It’s that easy! :-)
