What does global torrent traffic contain?

  • An attempt at an assessment. Just the facts.

    Princeton student Sauhard Sahi carried out a small study to assess what kind of data global torrent traffic consists of. To do this, he connected to the Mainline DHT network, the main DHT used by BitTorrent, uTorrent, Transmission, and others (Azureus/Vuze uses a different DHT system by default, but there is a plug-in that allows it to use Mainline DHT), and retrieved data and fragments for 1021 randomly selected torrents that were actively being shared.

    Note that this method only shows that a given torrent is among the active ones; it says nothing about how popular it is or how many seeders or downloads it has. In addition, files were not downloaded in full: only a representative fragment was retrieved, enough to form an idea of the file, or of the torrent's contents when the torrent contained many files.
    It is also worth noting that connecting to the DHT made it possible to run the analysis without being tied to the specifics of any particular tracker. However, it presumably excluded from the study some percentage of torrents and clients that do not use DHT (are there any?).
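
    The article does not say which software was used to join the DHT. As a rough illustration of what querying Mainline DHT involves, the sketch below builds a get_peers query in the KRPC message format described in BEP 5 (the BitTorrent DHT protocol specification). The helper names bencode and get_peers_query are hypothetical, and the node ID and info-hash are the placeholder values used in the BEP 5 examples, not data from the study. Sending the message over UDP to a DHT node and parsing the reply are left out.

```python
def bencode(value):
    """Bencode an int, bytes/str, list, or dict (the encoding KRPC messages use)."""
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = sorted(
            (key.encode("utf-8") if isinstance(key, str) else key, val)
            for key, val in value.items()
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def get_peers_query(node_id, info_hash, transaction_id=b"aa"):
    """Build a KRPC get_peers query asking which peers share info_hash."""
    if len(node_id) != 20 or len(info_hash) != 20:
        raise ValueError("node ID and info-hash must both be 20 bytes")
    return bencode({
        "t": transaction_id,   # transaction ID, echoed back in the reply
        "y": "q",              # message type: query
        "q": "get_peers",      # query name
        "a": {"id": node_id, "info_hash": info_hash},
    })


# Placeholder IDs taken from the BEP 5 examples, not from the study.
message = get_peers_query(b"abcdefghij0123456789", b"mnopqrstuvwxyz123456")
print(message)
```

    A node that knows peers for the requested info-hash replies with their addresses; otherwise it returns the closest nodes it knows, which is how a crawler can walk the network without relying on any single tracker.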

    The analysis gave the following results.
    By content type, the sampled torrents broke down as follows:
    46% - movies and video shows (excluding porn)
    14% - games and software
    14% - porn (video and photo)
    10% - music
    1% - books and manuals
    1% - pictures
    14% - failed to classify

    Movies and video shows
    This category consists mainly of AVI files, along with a number of other types, such as RMVB (RealVideo), MPEG, raw DVD (DVD rips), and various multi-volume RAR archives with such content. Notably, this segment shows a clear bias toward recent films.
    Of these randomly selected films and videos, 60% were in English, 8% in Spanish, 7% in Russian, 5% in Polish, 5% in Japanese, 4% in Chinese, 3% in French, 1% in Italian, and 2% in various other languages; for 4% the language could not be determined.

    Games and software
    No single file type dominated this category. The main file types in this segment were ISO images, multi-volume RAR archives, and EXE files (Windows executables). The games were for various platforms, such as Xbox 360, Nintendo Wii, and Windows PC. 74% of games and software were in English, 12% in Japanese, 5% in Spanish, 4% in Chinese, 2% in Polish, and 1% each in Russian and French.

    Porn
    The dominant format in this category is also AVI, as in the “Movies” category, but there are significantly more MPEG and WMV files. Also, most porn torrents consist of the full video file, a 1-5 minute sample, and a JPG poster.
    Porn videos were difficult to date, so it is only a suggestion that, unlike the “movies” group with its pronounced bias toward new films, the videos in the porn category are spread more evenly over time.
    We found that 53% of porn videos were in English, 16% in Chinese, 15% in Japanese, 6% in Russian, 3% in German, and 2% in French; 2% could not be classified, and other languages, such as Italian, Hindi, and Spanish, accounted for no more than 1% each.

    Music
    The dominant file type in this category is MP3, but some albums came in WMA, as ISO images, or in multi-volume RAR archives. There is also a steady bias toward new releases, although it is less pronounced than for movies, perhaps because seeders keep sharing music even when it is no longer new, so these files remain in the DHT and on torrent sites https://ext.to.
    By language, this category breaks down as follows: 78% English, 6% Russian, 4% Spanish, 2% Japanese, 2% Chinese, and no more than 1% each for the remaining, rarer languages.

    Books and manuals
    Books and manuals are a distinct minority. Only 15 torrents of this kind could be classified: 13 in English, 1 in French, and 1 in Russian. In addition, the sample included a set of national park posters, a collection of pictures of BMW cars (both in English), and a Japanese comic.

    Copyright status
    Our last classification attempts to determine what percentage of torrents infringe copyright.
    We classified torrents as non-infringing if they fell into one of three categories: works in the public domain, works freely available from legitimate sources, or user-generated content.
    Based on this classification, all 476 torrents in the “movies and video shows” category were found to be infringing. Seven of the 148 torrents in the “games and software” category appeared to be non-infringing (including two Linux distributions, one add-on pack for a game, as well as free software and beta versions). In the porn category, one of the 145 torrents looked like an amateur video, and we classified it as non-infringing. All 98 music torrents were infringing. Two of the 15 torrents in the “books and manuals” category appeared to be non-infringing.

    As a result, the authors found that approximately 10 of the 1021 torrents, or about 1%, could be considered non-infringing.
    This result should be treated with caution, since the authors could have missed some files, and the fragments available to them (under the chosen methodology, files were not downloaded in full) could give a wrong impression of a file's copyright status. However, the data from the survey lead to the conclusion that today the BitTorrent network is used, in the overwhelming majority of cases, to transmit illegally copied content that violates the copyrights of creators and owners.
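
    The figure of roughly 1% follows directly from the per-category counts reported above. A minimal sketch of that arithmetic, using only numbers stated in the article (the names CATEGORY_COUNTS, TOTAL_SAMPLE, and non_infringing_share are introduced here for illustration):

```python
# (category, torrents in category, torrents judged non-infringing), as reported in the article
CATEGORY_COUNTS = [
    ("movies and video shows", 476, 0),
    ("games and software", 148, 7),
    ("porn", 145, 1),
    ("music", 98, 0),
    ("books and manuals", 15, 2),
]
# The full sample also includes pictures and unclassified torrents,
# whose exact counts the article does not give.
TOTAL_SAMPLE = 1021


def non_infringing_share(rows, total):
    """Return (non-infringing count, percentage of the whole sample)."""
    legit = sum(non_infringing for _, _, non_infringing in rows)
    return legit, 100 * legit / total


legit, share = non_infringing_share(CATEGORY_COUNTS, TOTAL_SAMPLE)
print(f"{legit} of {TOTAL_SAMPLE} torrents look non-infringing ({share:.1f}%)")
# prints "10 of 1021 torrents look non-infringing (1.0%)"
```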
