We hear a lot about illegal downloads, pirating and torrents in the media and online, but how much do you really know about the subject and, maybe more importantly, how big is the problem really? Well, here at Dotegy, we used some of our technology to put together a little case study, just to show you how wide and deep the rabbit hole goes.

We decided to use just the film industry in this example, but illegal downloading (and sharing) spreads into software, games, books, music and television. FACT (The Federation Against Copyright Theft) estimates that

However, some of these statistics have been labelled as wildly inaccurate and scaremongering. It's easy to see why big numbers are appealing, but to use a concrete example: just because someone downloads a film, it does not mean that they would have purchased the film on DVD or gone to the cinema to see it. That's not a defence of illegal downloading and sharing, it's merely an important point to bear in mind. The honest truth is that the real number simply is not known, but we can say it DOES hurt sales.

So, back to our case study. Let's start by saying that it is incredibly easy to download and share files these days. It's difficult for me to quantify exactly how easy, but in the days of dial-up far less was being shared, the quality of what was shared was far poorer, and a download might take hours or days due to the bandwidth of the time (dial-up, ADSL, etc). So the time and effort involved in downloading and sharing wasn't necessarily worth the payoff, especially if, at the end of it all, you were left with a fake (more on that later). These days, it's a quantum leap forward (never understood why quantum leap means huge when a quantum is very, very small!). The speed of individuals' internet connections has had a lot to do with this: you can now download and share in minutes rather than hours or days. The number of versions or copies of what you're looking for is far greater, meaning that even if a download is a fake, you simply try the next copy. Probably most importantly, and linked to all of this, the number of people now doing it has risen, and therein lies the largest question: why?

Clearly, the freedom of the internet is still hugely important to a lot of internet users, and a lot of individuals feel that copyrighted content is not fairly priced. For them, that is justification enough to share content or make it available for people to download: crusaders of the cause, if you like. We're not just talking about these crusaders though, we're talking about the average Joe, who is looking for, downloading and then re-sharing that content OR uploading content they have themselves. So the why comes back to (a) it's easy (see above) and (b) they believe it's harmless and/or that they cannot get caught. This is the crucial part and the difficult nut to crack. We will come back to this later as well. Let's crack on with the example.

So, let's type 'The Brothers Grimsby' into Google and see what we get. Well, the first 3 pages are all reviews or marketing material related to the film. There are 2 DMCA delisted entries which might have related to illegal downloads, but we don't know. So let's change the query somewhat and add the word download. That changes A LOT! The results are:

Our first few results are all NOT torrent downloading sites but specific websites intended to promote downloading. Pretty much every one you visit will either include a link off to another site to actually do the work, or people will have added comments providing their own links to other download locations. In the case of the Facebook page it's a massive advert for the way .xyz is one of the new gTLD extensions (the largest and not without controversy).

We're going to take a step back at this point. As interesting as this type of manual search is, what we're going to do is automate our life slightly and behave a bit more like the serial downloader (I wanted to say professional but that doesn't feel right!). If you type 'where to download torrents' into Google, the first result you find will be a link to this page:

...and that's pretty much all you need to get going. At this point I'm going to mention something very important to bear in mind. Torrents, torrent files, sharing, downloading... none of this is illegal unless it involves copyrighted content that is expressly forbidden from being distributed in this manner. Torrents and peer-to-peer file sharing networks were NOT developed for the express purpose of pirating and sharing copyrighted content, despite what tabloid hacks might tell you. Unfortunately, like a lot of technology, it's become synonymous with illegal activity.

From here, we can examine each of the 21 different versions to see which torrent sites are holding a copy of the 'torrent file'. This yields a rather eye-watering 341 results, which means there are (at least) 341 copies of the torrent file(s) available on the internet. To make a little more sense of this for you: the first result, or type/version, of this film has a torrent file located at 10 different torrent websites. This torrent file is IDENTICAL at each of the torrent websites and it is NOT a copy of the film. A torrent file can be thought of as something which tells your download software where it CAN find the complete version, or parts, of the film for download. This first result we're examining has been voted the most popular by users of the site and verified as legitimate, best quality, etc. So we're pretty sure it's not a fake.
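
To make the "it is NOT a copy of the film" point concrete, here is a minimal Python sketch of what sits inside a torrent file. It is an illustration only, not the Dotegy tooling: the tracker URLs, file name and sizes are made-up placeholders, and the tiny encoder/decoder handles just the standard bencoding used by torrent files. It also shows how copies found on different sites can be recognised as the same torrent, via the SHA-1 "info hash" of the `info` section.

```python
import hashlib


def bencode(value):
    """Encode ints, strings/bytes, lists and dicts using bencoding."""
    if isinstance(value, int):
        return b"i" + str(value).encode() + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode() + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = [(k.encode() if isinstance(k, str) else k, v) for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def bdecode(data, pos=0):
    """Decode one bencoded value starting at pos; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            val, pos = bdecode(data, pos)
            result[key] = val
        return result, pos + 1
    # Anything else is a length-prefixed byte string: b"4:spam"
    colon = data.index(b":", pos)
    length = int(data[pos:colon])
    start = colon + 1
    return data[start:start + length], start + length


def info_hash(torrent_bytes):
    """SHA-1 of the re-encoded 'info' dictionary, as a hex string.

    Simplification: real clients hash the original bytes of the section.
    """
    meta, _ = bdecode(torrent_bytes)
    return hashlib.sha1(bencode(meta[b"info"])).hexdigest()


# Placeholder metadata for a pretend 700 MB video (all values are invented).
INFO = {
    "name": "holiday-video.mp4",
    "length": 734003200,
    "piece length": 262144,
    "pieces": hashlib.sha1(b"placeholder piece").digest(),
}
copy_on_site_a = bencode({"announce": "http://tracker-a.example/announce", "info": INFO})
copy_on_site_b = bencode({"announce": "http://tracker-b.example/announce", "info": INFO})

meta, _ = bdecode(copy_on_site_a)
print("torrent file size:", len(copy_on_site_a), "bytes")
print("content it describes:", meta[b"info"][b"length"], "bytes")
print("same torrent on both sites:", info_hash(copy_on_site_a) == info_hash(copy_on_site_b))
```

The torrent file here is a couple of hundred bytes describing hundreds of megabytes of content. Because the hash covers only the `info` section, two copies match even when the tracker address wrapped around them differs.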

Using a bit of clever Dotegy technology we now go off and verify these 341 links to (a) see if they still exist or have been removed, (b) check whether they contain a version of the torrent file that is good and can be used to download the film, and (c) identify the number of seeds (computers with complete copies of the file) and peers (computers with partial copies but involved in the sharing of the parts they have).
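
The triage in steps (a) and (b) can be sketched roughly as follows. This is a hypothetical illustration in Python, not the Dotegy implementation: the `LinkCheck` record, the category names and the example URLs and hashes are all assumptions, and the records are supplied by hand rather than fetched from anywhere.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class LinkCheck:
    """Outcome of visiting one torrent-site link (hypothetical record)."""
    url: str
    http_status: int
    info_hash: Optional[str]  # hash of the torrent file found, if any


def classify(check, expected_hash):
    """Sort one link into 'removed', 'good' or 'unusable'."""
    if check.http_status in (404, 410):
        return "removed"      # (a) the page or file has been taken down
    if check.http_status == 200 and check.info_hash == expected_hash:
        return "good"         # (b) live, and it is the torrent we expect
    return "unusable"         # live but wrong, broken, or unreadable


def summarise(checks, expected_hash):
    """Count how many links fall into each category."""
    return Counter(classify(c, expected_hash) for c in checks)


# Invented example data: three links that claim to offer the same torrent.
EXPECTED = "aa" * 20
checks = [
    LinkCheck("http://site-one.example/t/1", 200, EXPECTED),
    LinkCheck("http://site-two.example/t/9", 404, None),
    LinkCheck("http://site-three.example/t/4", 200, "bb" * 20),
]
print(dict(summarise(checks, EXPECTED)))
```

Keeping the classification separate from the fetching means the same rules can be re-run over stored results when a link is checked again later.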

Concentrating on this first highly rated version, we detect 111 seeds (that's complete copies of the file remember) and 79 peers (people with partial amounts of the file but involved in the upload/download).
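
For the curious, seed and peer counts like these are typically what a tracker reports when asked to "scrape" a torrent: `complete` is the number of machines holding the whole file and `incomplete` the number still downloading. The Python sketch below reads those two fields from an already-decoded response. It assumes the common tracker scrape layout, uses a placeholder info hash, and reuses the two figures quoted above; how the counts were actually collected for this case study is not shown here.

```python
def swarm_counts(scrape_response, info_hash):
    """Return (seeds, peers) for one torrent from a decoded scrape response."""
    stats = scrape_response[b"files"][info_hash]
    return stats[b"complete"], stats[b"incomplete"]


# Placeholder 20-byte info hash; it does not identify any real torrent.
EXAMPLE_HASH = bytes(20)
example_response = {
    b"files": {EXAMPLE_HASH: {b"complete": 111, b"incomplete": 79}}
}

seeds, peers = swarm_counts(example_response, EXAMPLE_HASH)
print(f"{seeds} seeds, {peers} peers, {seeds + peers} machines in the swarm")
```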

Picking 1 IP address from the seeds list, we can track it to Hereford in the UK, with BT as the Internet Service Provider. This part is where it gets geeky-interesting. This server/computer/individual appears to be a home user with a DSL connection to the internet. That suggests they are not using a VPN or proxy to try and hide their identity and that they are content to leave a complete copy of this file on their machine for anyone to download. This assumes, of course, that they are not unwitting in what is happening (hacked machine, someone else using their home network, etc). In any case it's one of the following: lazy, stupid, brazen... and I'm not sure which. The follow-up question, and part of the Big Data analysis, is 'what else do they have?' and remember I started off by saying this film was not a large success!

So, yes, there are processes in place to remove these files, but removal takes several business days, which means the files have already been shared a lot and downloads of complete copies have already happened. Torrent files can then be re-published with different information, and you're fighting an ongoing and difficult problem.

