So, what have I been talking about when I say I want to look at blogs, or rather mine them? Well, there is a lot to tell, so I shall proceed in no particular order.
One of the most important aspects of blogs (and also a pretty weak one) is linking, and that's what we will be looking at in this post. So, what are links, how do they affect the blog-ecosphere, and what kind of semantic information do they provide? How reliable is the information provided by links in the blogosphere? Can links tell us what people like to read? Can links tell us what people are writing about? And what about the ramifications of links? These are the types of questions I will be trying to answer in the following series.
So what are these links anyway?
For the HTML uninitiated, links are a specific type of tag that starts with an < a > and ends with an < /a > and refers to another HTML page through its href attribute. (For a more specific definition, check out the W3C.) But in the blogosphere, there is more than a single type of link. Without going into a pedantic taxonomy, we can basically classify them as "hyperlinks in posts" and "hyperlinks to other blogs".
Hyperlinks in posts are those that people put inside their posts to link to things they like or refer to within the content of the post. Some weblogs consist primarily of whole lots of interesting links (for an example of that type of weblog, look here). These are pretty important links: they can often help us guess better at the type of content in the post, and if they link to other blogs, they can also be used to analyse the contents of the 'linked' posts.
Hyperlinks to blogs are those that people post in their blog sidebars and the like. These hyperlinks provide two entirely different types of information: first, about the people themselves (who they are friends with, and so on), and second, about the content they like to read and most probably write about. But it's hard to tell which hyperlinks provide which sort of information.
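To make the two kinds of links a bit more concrete, here is a minimal sketch in Python that pulls the links out of a page and sorts them by where they sit. It assumes, purely for illustration, that post bodies live in a div with class "post" and blogrolls in a div with class "sidebar"; real blog templates vary a lot, so those class names (and the names `LinkCollector` and `classify_links`) are my own inventions, not a convention any blog engine guarantees.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values, grouped by the region of the page they sit in.

    The "post" and "sidebar" class names are assumptions for this sketch.
    """

    def __init__(self):
        super().__init__()
        self.region_stack = []  # one entry per currently open <div>
        self.links = {"post": [], "sidebar": [], "other": []}

    def current_region(self):
        return self.region_stack[-1] if self.region_stack else "other"

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            classes = (attrs.get("class") or "").split()
            if "post" in classes:
                self.region_stack.append("post")
            elif "sidebar" in classes:
                self.region_stack.append("sidebar")
            else:
                # A plain div inherits the region of whatever encloses it.
                self.region_stack.append(self.current_region())
        elif tag == "a" and attrs.get("href"):
            self.links[self.current_region()].append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "div" and self.region_stack:
            self.region_stack.pop()


def classify_links(html):
    """Return a dict mapping region name to the list of hrefs found there."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links
```

Feeding it a page with one link in a post, one in a sidebar and one elsewhere gives three separate lists, which is about as far as markup alone can take us; whether a sidebar link means "friend" or "favourite reading" is still anyone's guess.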
How do they affect the blog-ecosphere and what type of semantic information do they provide?
Links not only affect the blog-ecosphere, they define it. Inbound and outbound links tell us who links to whom. (The why can be difficult to figure out, and even when we do figure it out, it might not be entirely accurate, but we have to work with a certain degree of uncertainty.) The linked neighbourhood of a blog can, in general, give information about the contents of the blog. The semantic information is contained in the linked neighbourhood of the blogs and in the shape of the relationships (the shape of the graph) themselves.
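For the programmers in the audience, the "who links to whom" picture can be sketched as a small directed graph. This is only an illustration under my own assumptions: the input is a list of (source, target) pairs, and the function names `build_link_graph` and `neighbourhood` are made up for this post.

```python
from collections import defaultdict


def build_link_graph(edges):
    """Build outbound and inbound link maps from (source, target) pairs."""
    out_links = defaultdict(set)
    in_links = defaultdict(set)
    for source, target in edges:
        out_links[source].add(target)
        in_links[target].add(source)
    return out_links, in_links


def neighbourhood(blog, out_links, in_links):
    """Blogs directly connected to `blog`, in either direction."""
    linked = out_links.get(blog, set()) | in_links.get(blog, set())
    return linked - {blog}  # ignore self-links
```

Note that this only records that a link exists. It says nothing about why the link is there, which is exactly the uncertainty mentioned above.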
But this brings up another question. If immediate links can give information on the nature of blogs, what about 2nd-level or 3rd-level links? What about links to links to links, and things like that? Well, my contention is that they do, but the relation they hold to the original blog under consideration drops off exponentially with each level.
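One way to picture that contention is to walk outwards from a blog and give each newly reached blog a weight that shrinks by a constant factor per hop. The sketch below does this with a breadth-first search. The decay factor of 0.5 and the cut-off of three levels are arbitrary values I picked for illustration, not measured ones, and `relatedness_by_distance` is a hypothetical name.

```python
from collections import deque


def relatedness_by_distance(graph, start, decay=0.5, max_depth=3):
    """Weight each blog reachable from `start` by decay ** (number of hops).

    `graph` maps a blog to the blogs it links to. Both `decay` and
    `max_depth` are illustrative assumptions.
    """
    weights = {}
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        blog, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not look past the cut-off
        for linked in graph.get(blog, ()):
            if linked not in seen:
                seen.add(linked)
                # Breadth-first order means the first visit is the shortest path.
                weights[linked] = decay ** (depth + 1)
                queue.append((linked, depth + 1))
    return weights
```

With a chain of links a, b, c, d, e and the defaults above, b gets 0.5, c gets 0.25, d gets 0.125, and e is never reached because it sits four hops away.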
So, now what kind of semantic information do they contain? They contain enough information to allow us to categorise blogs in general. Categorising individual blogs might be a bit more difficult, because even a primarily technical blog can be a bit discursive and go into the finer points of Germanic languages. The spatial and topological nature of blogs can provide information about the related content of other blogs.
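As a very rough illustration of categorising by neighbourhood, here is a toy sketch that guesses a blog's category from the categories of the blogs it is linked with. Simple majority voting is my stand-in for the idea here, not a claim about the best method, and the names `guess_category`, `neighbours` and `known_categories` are hypothetical.

```python
from collections import Counter


def guess_category(blog, neighbours, known_categories):
    """Guess a category by majority vote among already-labelled neighbours.

    `neighbours` maps a blog to the blogs it is linked with;
    `known_categories` maps some blogs to a category label.
    Returns None when no neighbour has a known category.
    """
    votes = Counter(
        known_categories[n]
        for n in neighbours.get(blog, ())
        if n in known_categories
    )
    if not votes:
        return None
    return votes.most_common(1)[0][0]
```

A discursive technical blog is exactly the case where this kind of vote can go wrong, so treat the result as a hint and not a verdict.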
So what next?
Well, there is more to this than what's in this post, so I will follow it up with a part 2. Also, there are the temporal aspects of blogs, and why Mallika Sherawat is a bad data point ;-)
So, till then