Security watch

Where is my PII?
Frank Simorjay

We all talk about PII (Personally Identifiable Information) being the most important information to protect. But before you can protect PII, you must thoroughly understand what PII you have collected on your PC. It's easy to say that everything on your computer is sensitive, but what do you really mean by everything?

To shed light on this, I started to look at the problem in a bit more detail, breaking down data types that may be sensitive and figuring out where the data may end up on your computer.

First, just how sensitive information may be is often a personal judgment. For instance, some people feel threatened if their name shows up in a search result. Of course, unless you've been living under a rock, there is a good chance that someone has posted your name on the Internet in some form by now. To investigate, use your favourite search engine to search for your name online. Keep in mind that the more common your name, the harder it will be to find instances that refer specifically to you. And you might consider this a good thing.

If you are looking for yourself on the Internet, you might also want to check out some of the popular social networking sites, such as LinkedIn, Facebook, and YouTube. It is quite remarkable to see the Internet's ability to store and disperse private information that used to require diligent search efforts to uncover.

Knowing what information you need to protect is more of a science than it used to be. To that end, I thought it would be interesting to see if your computer has potentially private information that you may not be aware of and that you may want to protect.

While you might say that all personal information that can be used to steal your identity is sensitive, information can really be separated into two levels of detail. There is information that is readily available and information that is more private and generally considered critical to your personal identity.

Information that is readily available is not typically considered PII. This includes your name and may also consist of your phone number, street address, e-mail address, gender, and in many cases your place of employment and some educational information. These items are readily available on the Internet and in public directories such as phone books. Disclosure of this information, such as accidentally allowing a spammer to pick up your e-mail address, can be annoying, but alone it would not lead to identity theft.

Sensitive information consists of private data that provides a link to your identity. Data that you wouldn't want disclosed publicly includes your Social Security number (or other similar unique identifier provided by your government), bank account numbers, credit card numbers (particularly when accompanied by the expiration date and card member ID), your driver's licence number, and your fingerprint (or other biometric-related information). When in the wrong hands, these items can be used in very damaging ways. It is important that you control where and how this information is recorded and stored, on the Internet and on your PC. To this end, I will now discuss a couple of simple methods to find any PII that may be stored on your system's hard drive.

Finding PII on your computer

PII is scattered everywhere. In fact, if you were to go through your rubbish, you would probably find some PII quite easily. Protecting this information requires diligence and a bit of care. I recommend that everyone invest in a good paper shredder and shred anything that has personal information on it.

But what about the PII lurking about on your PC? Finding this data can be as challenging as storing it securely. Windows Vista, and several other desktop search tools, can help you find information on your system. But you need to know what information to look for.

TechNet Magazine September 2008 77


To illustrate the problem, I'm using a couple of simple tools that will allow me to provide quick hands-on examples of what's at stake. I'm using scripts with Windows PowerShell. Among the many things Windows PowerShell does, you'll find that it also provides excellent string-matching capabilities. For our purposes, I will be focusing on its ability to match regular expressions. Windows PowerShell (available at www.microsoft.com/powershell) is a powerful tool that has quickly become a standard for administrative tasks.

Additionally, I will use findstr.exe to provide a means to manage false positives, meaning the ability to ignore files that may contain strings that look interesting (due to the randomness of data strings in binary files) but are in fact of no interest here. In other words, non-text files can be ignored for this exercise.

I have selected two good PII data types: Social Security numbers and credit card information. This data should be easy to find if it is actually stored on your hard drive in clear text. The structure and pattern of both data types are unique enough to allow for a simple script to find the information. However, this data is also sensitive enough for me to ask why it needs to be stored on your PC. If you are inclined to store this information, you should ensure that it is protected. I'll cover ways to protect your PII in a moment. My discussion here is admittedly limited: there are other important PII data types that I haven't included here, such as user names and passwords.

Searching for a Social Security number

Here is a simple string that will look for any information in files that consists of a standard U.S. Social Security number structured as XXX XX XXXX or XXX-XX-XXXX. Using Windows PowerShell, you can simply enter the following lines:

    Get-ChildItem -rec -exclude *.exe,*.dll |
      select-string " [0-9]{3}[-| ][0-9]{2}[-| ][0-9]{4}"

Or you can use findstr.exe to ensure that binary files are not read for the search using this:

    Get-ChildItem -rec | ?{ findstr.exe /mprc:. $_.FullName } |
      select-string " [0-9]{3}[-| ][0-9]{2}[-| ][0-9]{4}"

In this sample, Get-ChildItem -rec conducts a recursive directory search of files that starts from the directory in which the command was executed. Findstr.exe searches for strings in files and Select-String is the Windows PowerShell string search function. (Findstr.exe provides similar functionality that I am not discussing here.) In addition, note that the leading space in the search string is deliberate. This helps to reduce false positives by eliminating unnecessary information, such as registry strings like HKLM\SOFTWARE\tool\XXX-XX-XXXX.

In my sample run, the search pattern returned a test sample file I put in a subdirectory, and it also found samples located in an XML file that outline file patterns for credit card and Social Security numbers (see Figure 1).

Figure 1 Results when searching for a number pattern

I use the exclude capability in the first example to drop all .exe and .dll files since they can generate unnecessary noise. You may discover other file types that also cause false positives. If you do, you can use exclude to fine-tune the search process.

If you are searching only for a specific Social Security number, you can do the following (replacing "123 45 6789" with your Social Security number):

    Get-ChildItem -rec | ?{ findstr.exe /mprc:. $_.FullName } |
      select-string "123 45 6789","123-45-6789"

The results of this search effort are shown in Figure 2.

Figure 2 Searching for a specific number

Searching for credit card information

Credit card information is a bit trickier since the formats vary. And I want to limit false positives (meaning results that just happen to look like a credit card number). Nonetheless, the search will probably turn up some random sequences that are merely similar to credit card numbers.

I am using information that is provided in the essay "Anatomy of Credit Card Numbers" by Michael Gilleland as a reference when building these strings (see www.merriampark.com/anat­mycc.htm). For instance, my search string specifies that the first number must be a 4, 5, or 6 since this is defined as the major industry identifier of the credit card.

Here I have constructed simple strings that will search for Discover, MasterCard, and Visa cards. In Windows PowerShell, my search string looks like this:

    Get-ChildItem -rec | ?{ findstr.exe /mprc:. $_.FullName } |
      select-string "[456][0-9]{15}","[456][0-9]{3}[-| ][0-9]{4}[-| ][0-9]{4}[-| ][0-9]{4}"

In the sample shown in Figure 3, I used the exclude function to eliminate noise from .rtf, .rbl, and .h file types.

Figure 3 Using exclude to eliminate noise from the results

Additionally, the sample code looks for credit card strings that have no spaces or dashes. This, unfortunately, may overload your display. So the following is an alternative command for the same function, but this one will not catch non-spaced or non-dashed card numbers:

    Get-ChildItem -rec | ?{ findstr.exe /mprc:. $_.FullName } |
      select-string "[456][0-9]{3}[-| ][0-9]{4}[-| ][0-9]{4}[-| ][0-9]{4}"

Since American Express cards are considerably different, I have created a modified search string to locate that card's pattern. In Windows PowerShell, the search string looks like this:

    Get-ChildItem -rec | ?{ findstr.exe /mprc:. $_.FullName } |
      select-string "3[47][0-9]{13}","3[47][0-9]{2}[-| ][0-9]{6}[-| ][0-9]{5}"

Overload of data may affect this result also. This alternative command is the same function but will not catch non-spaced or non-dashed card numbers:

    Get-ChildItem -rec | ?{ findstr.exe /mprc:. $_.FullName } |
      select-string "3[47][0-9]{2}[-| ][0-9]{6}[-| ][0-9]{5}"

When writing this column, I ran these searches on my own system and I was quite surprised to find several instances of my Social Security number saved in places where it should not have been stored. It turns out the information was located in a note I wrote a while ago and then forgot about. This made me rethink what I should and should not write down!

If you find that you do want to store this information but only in a safe way, try using a tool such as Password Safe (available at passwordsafe.sourceforge.). Or encrypt your hard drive with a tool such as BitLocker™ Drive Encryption. Finally, the Data Encryption Toolkit for Mobile PCs provides tested guidance on protecting data on a mobile PC. These solutions will at least make it a bit more difficult for someone who happens to be trolling your PC for personal information.

Wrapping up

Finding PII is fairly simple. Being aware of the information is the tricky part. But keep in mind that a piece of malware or a malicious user who has gained access to your system can use similar discovery techniques to find information on your system just as easily. Be careful about when and where you enter PII, and if you are inclined to store the information, be sure that you encrypt it.

I'd like to thank Matt Hainje for helping to troubleshoot my Windows PowerShell scripts. ■

Frank Simorjay is a Technical Program Manager for the Microsoft Solution Accelerator–Security and Compliance group. He designs security solutions for Microsoft customers, speaks at events such as the Secure World Exposition (of which he is a founder), provides security education and training, and has contributed to a variety of papers and books on security. His most recent work is the "Malware Removal Starter Kit."
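The search strings in this column can be tried against a few lines of text before they are turned loose on a whole drive. The following is a minimal sketch and not one of the original scripts: the sample lines and the variable names ($ssnPattern, $cardPattern, $amexPattern, $samples) are invented for illustration, and it assumes Select-String is fed strings through the pipeline rather than files.

```powershell
# Search strings copied from the column (dashed or spaced forms only).
$ssnPattern  = ' [0-9]{3}[-| ][0-9]{2}[-| ][0-9]{4}'
$cardPattern = '[456][0-9]{3}[-| ][0-9]{4}[-| ][0-9]{4}[-| ][0-9]{4}'
$amexPattern = '3[47][0-9]{2}[-| ][0-9]{6}[-| ][0-9]{5}'

# Hypothetical sample lines, invented for this sketch; none are real numbers.
$samples = @(
    'SSN on file: 123-45-6789',
    'SSN on file: 123 45 6789',
    'HKLM\SOFTWARE\tool\123-45-6789',
    'Card: 4111-1111-1111-1111',
    'Card: 3411 111111 11111',
    'Build 20080813 finished'
)

# Select-String accepts strings from the pipeline as well as files.
# Each match object exposes the matching input line through its Line property.
$ssnHits = @($samples |
    Select-String -Pattern $ssnPattern |
    ForEach-Object { $_.Line })

# -Pattern takes more than one search string, as in the column's commands.
$cardHits = @($samples |
    Select-String -Pattern $cardPattern, $amexPattern |
    ForEach-Object { $_.Line })

'SSN-like lines:'
$ssnHits
'Card-like lines:'
$cardHits
```

With these samples, the registry-style line is skipped because no space precedes the number, which is the effect the leading space in the Social Security search string is meant to have. Note that inside square brackets the | character is literal, so [-| ] also accepts a pipe between digit groups; [- ] would be a slightly tighter alternative.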
