April 16, 2008
The Johns Hopkins University Applied Physics Laboratory
Need: The biggest headlines in network security tend to feature the dark world of outside hackers, worms, and viruses. In reality, however, a company's computer network is more likely to be compromised by people inside the organization, through either malicious acts or simple non-compliance with established security protocols. Identifying the responsible user can be difficult when a company uses Network Address Translation (NAT), which lets multiple hosts on a private network access the internet through a single public Internet Protocol (IP) address.
A traditional approach to identifying a non-compliant user is to remove the drives from suspicious computers after hours, image them, and replace them, then analyze deleted files, fragments, inodes, logs, and histories for evidence of malicious or non-compliant use. This approach is labor-intensive, and there is a real chance that an individual who is intentionally compromising the network will discover the investigation and destroy evidence.
Technical Description: Researchers at The Johns Hopkins University Applied Physics Laboratory (JHU/APL) have developed a way to identify specific machines through Remote Network Fingerprinting.
The advantages of this system are that it is passive, networked, and stealthy, and that it exploits the physical uniqueness of each machine. The system can be used to identify the endpoints in a communication, to show that an endpoint participated in a transaction, or to show that an endpoint was not involved in a transaction.
The University of California, San Diego (UCSD) developed an approach that exploits microscopic deviations in device hardware, known as clock skews, to create a "fingerprint" of a machine. The UCSD methodology collects time-stamp values from an observed machine during a collection phase and plots them against the measurer's system time in a scatter plot. A line is then fitted to the points using a convex hull method, and the slope of this line is the clock skew of the observed machine. The investigator then groups similar skews to sort out individual machines. The UCSD work acknowledged, but did not resolve, the questions of required sample size and the effect of differing network topologies. It also did not apply statistical techniques: fitting with a convex hull rather than linear regression forgoes the whole body of error analysis theory.
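The scatter-plot step described above can be sketched as follows. This is a minimal illustration, not the UCSD or JHU/APL code: the function name clock_offsets, the parameter tick_hz (the nominal tick rate of the observed machine's time stamp), and the sample values are all assumptions made for the example, and counter wraparound is ignored.

```python
def clock_offsets(measurer_times, observed_ticks, tick_hz):
    """Convert raw observations into (elapsed, offset) pairs.

    measurer_times: arrival times in seconds on the measurer's clock.
    observed_ticks: time-stamp counter values reported by the observed machine.
    tick_hz: assumed nominal tick rate of the observed time stamp (hypothetical input).

    The slope of offset against elapsed time is the clock skew.
    """
    if not measurer_times or len(measurer_times) != len(observed_ticks):
        raise ValueError("inputs must be non-empty and of equal length")
    t0 = measurer_times[0]
    tick0 = observed_ticks[0]
    # Elapsed time according to the measurer.
    elapsed = [t - t0 for t in measurer_times]
    # Elapsed time according to the observed machine, minus the measurer's view.
    offsets = [(tick - tick0) / tick_hz - x
               for tick, x in zip(observed_ticks, elapsed)]
    return elapsed, offsets


# Illustrative values only: a 100 Hz time stamp on a clock running fast.
elapsed, offsets = clock_offsets([0.0, 1000.0, 2000.0], [0, 100010, 200020], 100)
for x, y in zip(elapsed, offsets):
    print(f"elapsed={x:8.1f} s  offset={y:.6f} s")
```

In this made-up example the offset grows by about 0.1 s per 1000 s of elapsed time, which corresponds to a skew of roughly 100 parts per million.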
The researchers at JHU/APL extended the UCSD research by estimating skew via linear regression and using error analysis theory to determine the required sample size. They simulated wide area network (WAN) delay and took measurements of the peripheral component interconnect (PCI) bus to tie clock skew to the physical machine, finding that PCI bus clock speed is directly related to clock skew. With linear regression, machines can be uniquely identified to within a couple of parts per million (ppm). In addition, the number of samples required is directly proportional to the observed time-stamp error and the confidence interval, and inversely proportional to the collection interval and the allowed ppm tolerance.
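As a rough illustration of skew estimation by linear regression, the sketch below fits an ordinary least-squares line to offset-versus-elapsed-time samples and reports the slope and its standard error in ppm. It is not the JHU/APL implementation: the function name estimate_skew_ppm and the synthetic data are assumptions, and the symmetric Gaussian noise is a simplification, since real network delay is one-sided.

```python
import math
import random


def estimate_skew_ppm(elapsed, offsets):
    """Ordinary least-squares fit of offset (s) against elapsed time (s).

    Returns (skew_ppm, stderr_ppm): the fitted slope and its standard
    error, both expressed in parts per million.
    """
    n = len(elapsed)
    if n < 3 or n != len(offsets):
        raise ValueError("need at least 3 paired samples")
    x_mean = sum(elapsed) / n
    y_mean = sum(offsets) / n
    sxx = sum((x - x_mean) ** 2 for x in elapsed)
    if sxx == 0:
        raise ValueError("elapsed times must not all be equal")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(elapsed, offsets))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual sum of squares drives the standard error of the slope.
    rss = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(elapsed, offsets))
    stderr = math.sqrt(rss / (n - 2) / sxx)
    return slope * 1e6, stderr * 1e6


# Synthetic data (assumed values): 50 ppm skew, 1 ms Gaussian time-stamp error,
# one sample per second for ten minutes.
rng = random.Random(0)
xs = [float(i) for i in range(600)]
ys = [50e-6 * x + rng.gauss(0.0, 1e-3) for x in xs]
skew, err = estimate_skew_ppm(xs, ys)
print(f"estimated skew: {skew:.2f} ppm (standard error {err:.2f} ppm)")
```

The standard error shrinks as more samples are collected or as the collection interval lengthens, which is the kind of relationship an investigator could use to judge whether enough data has been gathered for a given ppm tolerance.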
The JHU/APL approach to passive forensic identification of networked Transmission Control Protocol/Internet Protocol (TCP/IP) communication endpoints provides a repeatable way to fingerprint a specific machine using the simple technique of linear regression. Statistical error analysis then guides the investigator on how much information needs to be collected.
Stage of Development: Prototype
Figure 1. Results of a forensic investigation matching a machine with non-compliant activity.
The Applied Physics Laboratory, a division of The Johns Hopkins University, meets critical national challenges through the innovative application of science and technology. For information, visit www.jhuapl.edu.