Dark Web and Tor Forensics

Over the last few years, cybercrimes have become more intense, sophisticated and potentially debilitating for individuals, organizations and nations. Law enforcement agencies find it difficult to check and prevent crimes in cyberspace because the perpetrators are faceless and incur very low costs to execute a cybercrime, whereas the cost of prevention is extremely high. The number of targets has grown sharply with people's increasing reliance on the internet. Cybercrimes, which until some time ago were restricted to computer hacking, have diversified into data theft, ransomware, child pornography, attacks on Critical Information Infrastructure (CII) and so on. In this negative context, the Dark Web has regularly been in the focus of the media in recent years.

The Dark Web is becoming a familiar term. It evokes common impressions such as a place of shady dealings, anonymous communications and illicit materials. With the takedown of the ‘Silk Road’ website in October 2013 by the FBI, the Dark Web entered the awareness of large parts of the population. In February 2015, the FBI took the infamous Dark Web site ‘Playpen’ offline, which hosted more than 23,000 child pornographic images and videos and had more than 215,000 users. As part of the preparation for the terrorist attacks in Paris in November 2015, communication was anonymized using the Tor software, and the weapon used in the shooting rampage in Munich in July 2016 was acquired over the Dark Web. Besides drugs, weapons and child pornography, every kind of information is sold via marketplaces on the Dark Web, from credit cards to sensitive information captured during data leaks or hacking attacks. The latter can pose new challenges for society.

The inherent anonymity and closed nature of the Dark Web have turned it into a safe haven for cyber criminals and their wares. The Dark Web hosts a wide range of illegal online markets for cyber exploit kits, drugs, counterfeit documents, stolen credit cards, bank account credentials, human trafficking, illegal immigration, etc. The Dark Web has thousands of forums which operate in a tightly controlled environment. Cryptocurrencies are used for transactions so that these transactions are hard to trace to individuals or organizations.

To understand the opportunities and weaknesses of using the Dark Web, some knowledge of how anonymization networks work is required. The terms associated with the Dark Web are therefore explained first. They are often mixed up, but must be clearly separated. This is followed by an investigation into the security levels of the Dark Web, since this is fundamental for an evaluation of the transactions to be expected there.

Terminology

Quite often, the terms Dark net, Deep Web, Dark Web and other terms are improperly mixed or used interchangeably. When terms are insufficiently separated or misused, data and evaluations can be incorrectly assigned and distort the actual situation.

Deep Web

The Deep Web “refers to any Internet information or data that is inaccessible by a search engine and includes all websites, intranets, networks and online communities that are intentionally and/or unintentionally hidden, invisible or unreachable to search engine crawlers” (Janssen 2018). The term, Deep Web, “relates to deep sea/ocean environments that are virtually invisible and inaccessible” (Janssen 2018). Therefore, the Deep Web “contains data that is dynamically produced by an application, unlinked or standalone Web pages/websites, non-HTML content and data that is privately held and classified as confidential. Some estimate the size of the Deep Web as many times greater than the visible or Surface Web” (Janssen 2018).

Dark net

From a technical and historical point of view, the term ‘Dark net’ is used to describe the part of the IP address space which is routable, but not in use. This must be differentiated from addresses which, by definition, should not be routed. In the still predominantly used internet addressing architecture, Internet Protocol version 4 (IPv4), specific addresses are defined as private. By using them, a router can provide connectivity to numerous attached devices through its own public address, translating the traffic between the private network and the internet. The respective private addresses are not visible on the internet; they should not be routed, and only routable addresses can be seen. By monitoring unused but routable addresses, a lot of security-relevant observations can be made, because normally nobody should interact with them. If some interaction is seen, the underlying behaviour is typically malicious, e.g., an automated worm looking for target addresses to infect. This security-relevant part of the address space is called the Dark net.
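
As a rough illustration of the distinction drawn above, the following Python sketch uses the standard-library ipaddress module to separate private addresses from globally routable ones, and to flag traffic aimed at a monitored but unused prefix. The prefix 203.0.113.0/24 (a documentation range) and the function names are assumptions chosen purely for illustration; a real Dark net sensor would watch the operator's own unused address blocks.

```python
import ipaddress

# Hypothetical unused-but-routable prefix watched by a sensor (assumption).
MONITORED_UNUSED = [ipaddress.ip_network("203.0.113.0/24")]

def classify(addr: str) -> str:
    """Return 'private', 'global' or 'other' for an IP address string."""
    ip = ipaddress.ip_address(addr)
    if ip.is_private:
        return "private"   # e.g. 10.0.0.0/8, 192.168.0.0/16: not routed publicly
    if ip.is_global:
        return "global"    # publicly routable
    return "other"         # remaining special-purpose ranges

def is_darknet_hit(dst: str, unused=MONITORED_UNUSED) -> bool:
    """True if a packet's destination lies inside a monitored unused prefix."""
    ip = ipaddress.ip_address(dst)
    return any(ip in net for net in unused)
```

Any packet for which is_darknet_hit returns True was sent to an address nobody should be talking to, which is why such traffic is usually treated as a scan or worm probe.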

One of the early uses of the term with regard to digital content can be found in an article about content protection. It described Dark nets as a ‘collection of networks and technologies used to share digital content’ (Biddle 2002). Nowadays, the term is mainly used for overlay networks providing anonymous network connectivity and services. An overlay network is a layer of virtual network topology on top of the physical layer, which directly interfaces with users (Zhang 2003).

Tor is an example of an overlay network, and the biggest and most widely used anonymisation network; but there are numerous others, such as I2P, Freenet or ZeroNet. It is important to recognize that the term Dark net originally refers to the network itself, and therefore the technical base like the protocol and devices; but not the content which may be transported through the network, or can be found on its respective servers.

Dark Web

The Dark Web refers to the websites which are hosted within overlay networks, and are normally not accessible without special software like the Tor Browser. Nowadays, usage of the Tor network is easy and straightforward: the Tor Browser is a complete bundle ready to use without installation by providing a fully configured Firefox Browser. As in the case of the Deep Web, search engine crawlers are not able to index the websites of the Dark Web. But in contrast to it, its most important feature is that the users of a service stay anonymous – neither a provider of a website can identify the visitors, nor can a visitor identify the service provider. Given this, the respective services are also called ‘hidden services’; more recently, ‘onion services’.

Invisible Internet Project (I2P)

Invisible Internet Project (I2P) is an anonymous peer-to-peer network layer upon which any number of anonymous applications can work. The applications in I2P provide internet activities like anonymous web browsing, chat, file sharing, email, blogs and many more. The network has been in active development since 2003. All the work done on I2P is open source and freely available on the project website. The software participating in the I2P network is called a router. This software is separate from the anonymous endpoints, or destinations, associated with different applications, so an end user will have several local destinations on their router. Messages are sent through a tunnel created from an explicitly selected list of routers. A tunnel carries messages in one direction only, so another tunnel is required to send messages back; I2P communication therefore uses inbound and outbound tunnels. Layered encryption is used to ensure the anonymity of the communications. I2P uses garlic routing, a variant of onion routing that encrypts multiple messages together. As a result, traffic analysis becomes more difficult and the speed of data transfer also increases.

Tor

Few Internet technologies have had more of an impact on anonymous Internet use than The Onion Router, commonly known as “Tor”. Most users reach the Tor network through the Tor Browser, a modified version of the popular Firefox browser. The modifications route traffic through the Tor network and so hide the user’s originating Internet Protocol (IP) address when surfing websites or sending e-mail. Because the true IP address of the user is hidden, attempts to trace or identify the user are nearly impossible without the use of extraordinary methods.

Tor is a second-generation implementation of the onion routing topology initially developed for the U.S. Naval Research Laboratory in the mid-1990s (Goldschlag, Reed, Syverson, 1996; Tor, 2012). This implementation of the onion routing topology is intended to be a low-latency overlay network for TCP flows over the public Internet that provides privacy and anonymity to its users. Specifically, Tor provides functionality that “prevents a user from being linked with their communication partners” (Loesing, Murdoch, Dingledine, 204, 2010).

While the original design goal of the Tor network was to provide significantly more privacy to a user than default Internet communications provide, Tor has more recently been used to evade state-sponsored censorship attempts (Loesing, Murdoch, Dingledine, 2010). Dingledine has noted that there is an “ongoing trend in law, policy and technology” that “threaten anonymity and undermine our ability to speak and read freely” on the public Internet (n.d.). For example, in early 2012, Iran disallowed access to any sites that utilized HTTPS (Kabir News, 2012). In May 2012, the Palestinian government shut down eight news websites for posting opinion pieces critical of the president (Hale; OONI, 2012). In mid-2012, the Ethiopian Telecommunications Corporation began performing deep packet inspection on all traffic entering and leaving the country. As Ethiopia’s only service provider, it had direct access to all such traffic (Runa, 2012). York has provided multiple cases where there have been calls for additional censorship, the creation of a censorship body, and even examples where citizens have been arrested for political or religious reasons (2012b). Tor is therefore intended to provide not only privacy in these types of scenarios, but also anonymity, by protecting against eavesdropping and man-in-the-middle attacks. Additionally, by using multiple layers of encryption and intermediate nodes that define a radically different path from a client to its ultimate destination, Tor mitigates deep packet inspection, traffic analysis and timing analysis attempts.

Previous studies regarding identifying dark web activity have centred around identifying users on the Tor network by utilizing techniques that de-anonymize these users at the entry or exit nodes of the network. While they are difficult to exploit, vulnerabilities do exist in the Tor browser that can result in the de-anonymization of users (Jacoby MenTor & Chow, 2016). Two extremely unlikely scenarios are collecting data by controlling all of the nodes on a circuit used or controlling only the entry and exit nodes used by a particular user. There have been instances of malicious scripts being used to infect computers of visitors of the dark web that cause identification of those users.

How Tor Works

As discussed above, Tor uses an onion routing system. It relies on thousands of volunteer-operated relays to direct traffic over the internet so that the user's identity stays hidden from anyone intercepting the network. Tor reduces the risk of traffic analysis by distributing a transaction over several places, so that no single point can link the sender to the destination. For example, suppose user “A” wants to send a packet safely to user “B” over the Tor network. Tor creates a private network pathway for this communication. The first step is to identify the available nodes: user A’s Tor client obtains a list of Tor nodes from a directory server. It picks nodes at random each time so that an interceptor cannot observe a pattern. The client then generates an encrypted message and sends it to the first node. The Tor software on this node decrypts the first layer of encryption and identifies the next node. This continues until the final node learns the location of the actual recipient and transmits the message to it without Tor's encryption layers; the recipient sees only the final node's address, not user A's. For later connections, Tor builds a completely different path. (Source: https://www.torproject.org/about/overview.html.en)

Normally when a connection is made, the host computer connects directly to the destination, so each knows the other's IP address. Tor instead uses a series of intermediate relays, known as nodes, to connect the host to the destination. The first step in connecting to the Tor network is the client obtaining a list of Tor nodes from a directory server.

During the second step, the client selects a random path to the destination server. For each node the data will pass through, a layer of encryption is added. Each node has a decryption key for only their layer and no other.

As each node is reached, it decrypts a layer then sends the data to the next node. Only when reaching the final destination is the data in an unencrypted form. If the user visits another site, another random path is taken to the new destination.

This layering preserves the privacy of the data and prevents eavesdropping. Additionally, each node only knows the node it received the data from and the node it is sending data to. No node knows the full path from the host to the destination, which prevents traffic analysis and creates anonymity (Syverson, 2005).
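
The layering described above can be sketched in a few lines of Python. This is a toy model only: the XOR keystream below stands in for real cryptography and is not secure, the node names and keys are hypothetical, and Tor itself negotiates per-hop keys and uses fixed-size cells rather than JSON. The sketch merely shows that each node can remove exactly one layer and learns nothing but the next hop.

```python
import hashlib
import json

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a SHA-256-derived keystream (NOT secure)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def build_onion(message: bytes, path: list, keys: dict, destination: str) -> bytes:
    """Client side: wrap the message in one layer per node, innermost first."""
    next_hops = path[1:] + [destination]   # where each node must forward to
    payload = message
    for node, next_hop in zip(reversed(path), reversed(next_hops)):
        layer = json.dumps({"next": next_hop, "data": payload.hex()}).encode()
        payload = keystream_xor(keys[node], layer)
    return payload

def peel_layer(node: str, keys: dict, blob: bytes):
    """Node side: remove this node's layer; returns (next hop, inner payload)."""
    layer = json.loads(keystream_xor(keys[node], blob))
    return layer["next"], bytes.fromhex(layer["data"])
```

With a hypothetical path of entry, middle and exit nodes, calling peel_layer three times in path order yields the next hops "middle", "exit" and finally the destination, at which point the inner payload is the original message.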

Does TOR offer full anonymity?

Largely yes, as long as we make every effort not to reveal our identity and the device we are using. Most disclosures of TOR users result from their own inattention or ignorance. For example, it is enough to open a file downloaded from the Dark Web in an application other than the TOR browser for our real location to be revealed.

Safety settings

The TOR browser offers three levels of user protection: standard, safer and safest.

1. Standard – all functions of the browser and websites are enabled, which guarantees full functionality of the websites. Use the standard settings only when browsing known, trusted sites.

2. Safer – disables potentially dangerous functions of websites. Some fonts or symbols may not work properly, and multimedia such as audio and video clips only start with a click. Furthermore, JavaScript is disabled on all non-HTTPS pages.

3. Safest – allows websites to use only their basic functions. Scripts do not run on any page, even those served over HTTPS. Many pages will be non-functional in this mode.

TAILS – THE OPERATING SYSTEM CREATED FOR PRIVACY

Tails is a free, open-source live operating system focused on protecting the privacy of its users.

What is a live system? It is an operating system that runs directly from a flash drive or CD/DVD and is loaded into the computer’s RAM. Such a system does not use the hard drive, and therefore has no connection to data stored on our computer. When you turn off such a system, all temporary files, saved passwords, cookies and anything else that could leave a trace of us are deleted. In the case of Tails, the remaining empty space on the flash drive can be used as disk space. This allows us to save files, for example presentations or images created in the included software. Tails allows us to choose which files or program settings are important to us and saves them in an encrypted space, so we don’t have to worry about losing them. All other data is deleted when the system is switched off, so that no traces are left behind.

What exactly is Tails? Tails is a simple, fully functional and secure operating system. It has a built-in TOR browser, the Thunderbird email client, which can encrypt the messages we send, LibreOffice, graphics programs such as GIMP and Inkscape, and the OnionShare application, with which we can share files and folders over the TOR network. In addition, it includes everyday tools such as a calendar, calculator, notepad and many others. The built-in browser offers several plugins that can be turned on at any time and freely configured.

NoScript is an extension that, when activated, blocks any scripts that are launched when a page is opened. These can be animations, sounds, advertisements that direct to external sites, links infected with malware or tracking files. However, many sites will not work properly with scripts disabled, so if we trust a site, we can gradually increase its privileges until we can use it easily.

uBlock is a free, open-source extension that blocks ads (including infected ones) and gives us the ability to filter and block content of our choice.

HTTPS Everywhere is a plug-in that forces web pages to operate in a more secure, encrypted HTTPS connection. If a website does not support such a protocol, it may be blocked.

HOW TO DOWNLOAD AND USE TAILS?

1. Prepare a flash drive with a capacity of at least 8 GB.

2. Go to the official Tails website and download the USB image.

3. Download one of the free programs for creating bootable flash drives such as Rufus.

4. Connect the flash drive to your computer.

5. Run the program to create bootable media, enter the path to the USB image and then select the media on which you want to install the system. Confirm the settings and wait for the program to finish. (Note – this action will delete all data from the flash drive).

6. When the program is finished, restart your computer and wait for the computer or motherboard manufacturer’s startup screen. When it appears, press the key that opens the boot menu several times. It will probably be one of the keys from F1 to F12; this information can usually be found at the bottom of the startup screen.

7. From the boot menu, select the USB storage medium and wait for the system to load.

8. After some time you will see a window where you can change the language, time zone and keyboard type. Personally, I always leave the default settings – the less personalization, the less information about you. In the “more options” tab we can find such settings as MAC address spoofing and connecting to a proxy server (Enabling the proxy server is useful when you come from a country where the internet is filtered and censored.)

What is MAC address spoofing? Apart from the logical address, i.e. the IP address, network cards also have a physical address assigned by the manufacturer: the MAC address. The option mentioned above makes our network card present a fake MAC address instead of its real one. This will give you even more anonymity.
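
To make the idea concrete, here is a minimal Python sketch that generates a random, locally administered unicast MAC address of the kind a spoofing tool could present. It is an illustration under stated assumptions, not how Tails itself implements the option: the function name is hypothetical, and the sketch only produces the address string without applying it to a network card.

```python
import random

def random_local_mac(rng=None) -> str:
    """Return a random MAC address string such as '02:1a:2b:3c:4d:5e'."""
    rng = rng or random.Random()
    octets = [rng.randrange(256) for _ in range(6)]
    # Set the locally-administered bit (0x02) and clear the multicast bit (0x01)
    # in the first octet, so the address is a valid unicast, non-vendor address.
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{o:02x}" for o in octets)
```

Passing a seeded random.Random instance makes the output reproducible for testing; in real use the default unseeded generator would be the natural choice.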

You will find all the details about Tails on their official website.

https://tails.boum.org

At present, published research on forensic analysis of TOR bundle browser memory dumps is very limited. Existing research is only available for:

1. General memory dump forensic analysis and recovery.

2. Monitoring or detection of TOR browser in a network.

There is a large gap in this area, and in-depth research using the forensic process is needed to address it.

Computer Forensics and the Procedure Used

In the recent past, forensic tools have increasingly been used to produce evidence for court cases. Digital forensics (DF) is defined as the use of scientifically derived and proven methods to identify, transport, collect, analyze, store, present, distribute, return, destroy and/or interpret digital evidence from digital sources, in addition to preserving that evidence. DF is generally divided into several classes. These include network forensics, computer forensics, and mobile forensics. Each of these classes helps to identify the authors of cyber-attacks, phishers, and fraudsters. Details of the three classes are given below:

1- Computer forensics: The evaluation of digital media via a scientific process in order to reconstruct factual information for judicial review. It involves collecting and analyzing data from different computer resources, including computer networks, lines of communication, computer systems and storage media.

2- Network forensics: A telecommunication network enables computers to share data. Most digital devices such as PCs, notebooks and terminals are connected to the network through wired or wireless links. Network forensics aims to catch cybercriminals with evidence of their illegal actions, thus limiting online crime.

3- Mobile forensics: Mobile Forensics (MF) is a type of digital forensics concerned with recovering evidence from a mobile device. It is also called mobile device examination and involves interacting components such as authority, people, resources, the investigation team, procedures, and policy.

Investigators in computer forensics must follow proper procedure to obtain legally admissible evidence. Accuracy is the main priority in computer forensics. Forensic practitioners must strictly follow policies and procedures and maintain high ethical standards to ensure accuracy. Investigations in computer forensics follow a rigid set of methods to make sure that computer evidence is obtained correctly.
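
One common way to support this kind of accuracy is to record a cryptographic hash of each evidence file at acquisition and re-check it before analysis. The Python sketch below illustrates that practice under stated assumptions: the function names are hypothetical, SHA-256 is chosen here for illustration, and the text above does not prescribe any particular tool or algorithm.

```python
import hashlib

def sha256_of_file(path, chunk_size=1 << 20) -> str:
    """Hash a file in chunks so that large disk or memory images fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_evidence(path, recorded_hash: str) -> bool:
    """True if the file still matches the hash recorded at acquisition time."""
    return sha256_of_file(path) == recorded_hash.strip().lower()
```

If verify_evidence returns False, the working copy differs from what was originally acquired and should not be relied on.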

Darknet Forensics

The forensic methods recommended for Dark net forensics are divided into two categories, Bitcoin forensics and TOR forensics, because anyone can access the darknet using the TOR browser and most dark net sites transact in Bitcoin, a digital currency. The techniques for dark net forensics are described below.

EXISTING DETECTION METHODS AVAILABLE

Artefacts from Operating System

In a research paper, Runa A. Sandvik (2013) carried out a forensic analysis and documented the traceable evidence that the TOR bundle browser can leave on a Windows machine. The author also analysed and recorded the directory paths of TOR bundle browser artefacts acquired from the Windows OS. The locations with TOR artefacts identified in the paper were the following:

1. Prefetch folder C:\Windows\Prefetch\.

2. Thumbnail cache memory

3. Windows Paging File

4. Windows Registry (Sandvik, 2013).
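
As a hedged illustration of how the first location might be checked, the Python sketch below lists Prefetch entries whose names start with executable names associated with the TOR browser. Windows names Prefetch files after the executable followed by a hash and the .pf extension; the specific name prefixes used here are assumptions for illustration and are not taken from Sandvik's paper.

```python
from pathlib import Path

# Executable name prefixes to look for (assumed for illustration).
DEFAULT_PREFIXES = ("TOR.EXE", "FIREFOX.EXE", "TORBROWSER")

def find_tor_prefetch(prefetch_dir, prefixes=DEFAULT_PREFIXES):
    """Return sorted names of .pf files whose name starts with a given prefix."""
    hits = []
    for entry in Path(prefetch_dir).iterdir():
        if entry.suffix.lower() != ".pf":
            continue
        name = entry.name.upper()
        if any(name.startswith(prefix) for prefix in prefixes):
            hits.append(entry.name)
    return sorted(hits)
```

On a mounted forensic image, the function would typically be pointed at the image's copy of the Windows\Prefetch folder rather than at the live system. A hit on FIREFOX.EXE alone is weak evidence, since it may come from a regular Firefox installation.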

A similar analysis was performed by Andrew Case in “De-Anonymizing Live CDs through Physical Memory Analysis”, where the author discusses different forensic techniques for recovering evidence of the use of The Amnesic Incognito Live System (TAILS) (Case, n.d.). TAILS is a live (Linux) operating system bootable from a DVD or any portable device. All internet connections are forced to pass through the TOR network. The author briefly covers initial memory dump analysis for forensically retrieving artefacts, and illustrates that Python scripts can be used to analyse the specific TOR data structures where information regarding the artefacts is stored. However, the author has not demonstrated that the scripts work in practice for artefact recovery (Case, n.d.; Dodge, Mullins, Peterson, & Okolica, 2010; Sutherland, Evans, Tryfonas, & Blyth, 2008).

Artefacts from Network Traffic

The TOR bundle browser encrypts all traffic sent through the TOR entry, middle and exit nodes, adding a layer of encryption for each node on the path. Detecting TOR traffic and any other kind of proxy usage is therefore essential from a network forensics perspective (Fachkha et al., 2012; Berthier & Cukier, 2008).

In the research paper “Detecting and Preventing Anonymous Proxy usage”, John Brozycki describes techniques for detecting anonymous proxies using SNORT rules. For this purpose, the author used the Vidalia package, a TOR package for establishing TOR connections. The author claims that the SNORT rule given in the figure below can detect TOR bundle browser traffic (Brozycki, 2008; Mizoguchi, Fukushima, Kasahara, Hori, & Sakurai, 2010).

SNORT rule developed by John Brozycki

A similar method is supported by a SNORT rule from David, founder of the Seclits blog, who stated that it could also detect TOR browser usage. However, no proven results were documented.

SNORT rule developed by David

Malicious browser plugins

In August 2013, it was confirmed that attackers had exploited a vulnerability in the Firefox browser underlying the TOR browser bundle, which could disclose the source IP address of TOR users. The malicious code was injected from a darknet host called “Freedom Hosting”; users who visited hidden service websites on this host were compromised, as the code exploited a memory management vulnerability in the Firefox browser (Goodin, 2013). The malicious JavaScript made Firefox send a unique identifier to a public server, by which the source IP address could be traced back. In addition, a reverse engineering security specialist claimed that the malicious code pointed to an IP address in Reston, Virginia (Poulsen, 2013).

MEMORY DUMP ANALYSIS

Why memory dump analysis

Memory dump analysis has always been a critical and interesting area for forensic investigations, because the information stored in the memory of a computer is highly significant. For example, when a cyber criminal uses a bootable live CD or USB system such as TAILS on a Windows or Linux computer, no significant information is stored on the physical host computer. This is because the machine boots from the CD or portable USB drive, which carries its own self-contained system, so even if the physical computer is seized and forensically analysed, not much evidence can be retrieved from its disk. Consider a suspect who uses TAILS to connect to an illegal darknet website. In this scenario, all the internet connections established from that machine are forced to go through the TOR network with multi-level encryption, and thus the identity of the user is hidden to some extent, leaving behind little potential evidence. Retrieving the memory dump from the suspect machine and analysing it forensically could therefore provide more forensic evidence of TOR bundle browser usage (Aljaedi, Lindskog, Zavarsky, Ruhl, & Almari, 2011).

What information is stored in a memory dump?

An overview of the information stored in a memory dump is given below:

1. All the details about the image including date, time, and CPU usage are recorded.

2. Processes and process IDs – all processes running in the operating system.

3. Network connections – the network connections that were open at the time the memory dump was captured.

4. DLLs, memory maps, objects, encryption keys.

5. Programs, hidden programs, rootkits, malicious code.

6. Registry information of the operating system.

7. API functions, system call tables.

8. Graphic contents.

As the memory dump contains significant information, analysing it forensically will help to detect, recover and analyse artefacts of TOR bundle browser usage.
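
As a simple illustration of this kind of analysis, the Python sketch below scans a raw memory dump for strings that look like onion service addresses (16 or 56 base32 characters followed by ".onion"). It is a minimal sketch under stated assumptions: the function name is hypothetical, only plain ASCII strings are matched (addresses stored as UTF-16 would be missed), and a match shows only that the string was present in memory, not who visited it.

```python
import mmap
import os
import re

# Base32 label of 56 (current) or 16 (legacy) characters followed by ".onion".
ONION_RE = re.compile(rb"(?<![a-z2-7])(?:[a-z2-7]{56}|[a-z2-7]{16})\.onion")

def find_onion_addresses(dump_path):
    """Return a sorted list of onion-like hostnames found in a raw dump file."""
    if os.path.getsize(dump_path) == 0:
        return []                      # mmap cannot map an empty file
    found = set()
    with open(dump_path, "rb") as f:
        # Memory-map the file so multi-gigabyte dumps are not read into RAM.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mem:
            for match in ONION_RE.finditer(mem):
                found.add(match.group(0).decode("ascii"))
    return sorted(found)
```

The same pattern-matching approach can be extended to other artefacts, for example process names or file paths, by swapping the regular expression.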

TOR forensic techniques for Darknet investigations in brief

Technique	Tools	Goal
RAM forensics	1. Belkasoft RAM Capturer is used to capture the RAM dump. 2. Hex dump is used to view the RAM dump in hexadecimal.	Obtain evidence of the types of documents opened, websites visited and other downloaded content.
Registry forensics	Regshot (comparison of registry changes)	Obtain evidence of TOR installation and last access date information.
Network forensics	Wireshark and NetworkMiner	Gather evidence of web traffic information.
Database forensics	The TOR browser database, housed in TorBrowser\Browser\TorBrowser\Data\Browser\Profile.default	Access the content of the browser database.
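
As an illustration of the database forensics row, the Python sketch below opens a browser history database read-only and lists the most recently visited URLs. It assumes the standard Firefox layout, in which the profile folder contains a places.sqlite file with a moz_places table whose last_visit_date column holds microseconds since the Unix epoch; the function name is hypothetical. Because the TOR browser is designed not to keep history by default, the table may well be empty on a real system.

```python
import sqlite3
from contextlib import closing
from datetime import datetime, timezone
from pathlib import Path

def read_places(db_path, limit=20):
    """Return (url, title, visit_count, last_visit_utc) rows, newest first."""
    # Open read-only so that the examination does not modify the evidence copy.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT url, title, visit_count, last_visit_date "
            "FROM moz_places ORDER BY last_visit_date DESC LIMIT ?",
            (limit,),
        ).fetchall()
    result = []
    for url, title, count, last_us in rows:
        when = None
        if last_us is not None:
            # Firefox stores timestamps as microseconds since 1970-01-01 UTC.
            when = datetime.fromtimestamp(last_us / 1_000_000, tz=timezone.utc).isoformat()
        result.append((url, title, count, when))
    return result
```

In practice the function would be run against a working copy of the database taken from the forensic image, never against the original evidence.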

Bitcoin forensic techniques for Darknet investigations

Technique	Tools	Goal
Bitcoin wallet	Internet Evidence Finder (IEF)	Extract forensic artifacts from the Bitcoin wallet application installed on the client device; Internet Evidence Finder allows Bitcoin artifacts to be recovered.
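
Independently of any commercial tool, the basic idea of recovering Bitcoin artifacts can be sketched in Python: search recovered text for strings shaped like legacy Bitcoin addresses and keep only those whose Base58Check checksum is valid. This is a hedged sketch, not a description of how Internet Evidence Finder works; the function names are hypothetical, and newer address formats (such as those starting with "bc1") are not covered.

```python
import hashlib
import re

BASE58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
CANDIDATE_RE = re.compile(r"\b[13][1-9A-HJ-NP-Za-km-z]{25,34}\b")

def is_valid_base58check(addr: str) -> bool:
    """Check the 4-byte double-SHA-256 checksum of a legacy Bitcoin address."""
    num = 0
    for ch in addr:
        idx = BASE58.find(ch)
        if idx < 0:
            return False
        num = num * 58 + idx
    try:
        raw = num.to_bytes(25, "big")   # 1 version byte + 20 hash bytes + 4 checksum
    except OverflowError:
        return False
    payload, checksum = raw[:21], raw[21:]
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4] == checksum

def find_bitcoin_addresses(text: str):
    """Return candidate addresses in the text that pass the checksum test."""
    return [m for m in CANDIDATE_RE.findall(text) if is_valid_base58check(m)]
```

The checksum test removes most random strings that merely look like addresses, which keeps the number of false positives low when scanning large volumes of recovered text.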