I've searched, but I can't find a clear answer on which files get hashed and submitted to the cloud. Imagine a scenario where you receive a sensitive Word document or PDF--let's say it's the Pentagon Papers--as an email attachment. Does this file ever get hashed and submitted? If so, can Webroot be subpoenaed to reveal which accounts submitted that hash for inspection?
Best answer by Kit
This is an extremely complicated question, but let's break it down. As for how the files are hashed, we use the industry-standard MD5. Only files that are executable will be scanned and hashed. Archives are extracted, and their contents are hashed as well. The MD5 hashes are submitted to the cloud database and returned as "Good", "Bad", or "Unknown." These hashes do not contain data and could not be reassembled into a working exe. They are encrypted hashes of data...they can be decrypted, but we don't have the entire file. As far as being subpoenaed, yes, we can track which users or accounts submitted file information.
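To make the flow above concrete, here is a rough Python sketch of "hash only executables, then look the hash up." It is not Webroot's actual code. The function names, the in-memory `KNOWN_VERDICTS` dictionary standing in for the cloud database, and the "starts with `MZ`" executable check are all assumptions made for illustration.

```python
import hashlib

# Hypothetical stand-in for the cloud database: MD5 hex digest -> verdict.
KNOWN_VERDICTS = {}


def looks_like_executable(data: bytes) -> bool:
    # Assumption: treat anything starting with the DOS "MZ" magic as an
    # executable. A real scanner would inspect the file far more carefully.
    return data[:2] == b"MZ"


def md5_hex(data: bytes) -> str:
    # MD5 always yields a 128-bit digest (32 hex characters), whatever the
    # size of the input.
    return hashlib.md5(data).hexdigest()


def classify(data: bytes):
    # Non-executables (a PDF, a Word document) are never hashed or looked up.
    if not looks_like_executable(data):
        return None
    # Hashes the database has not seen before come back as "Unknown".
    return KNOWN_VERDICTS.get(md5_hex(data), "Unknown")
```

In this sketch, `classify(b"%PDF-1.7 ...")` returns `None` because nothing is hashed, while any two machines holding byte-identical executables would compute and submit the same digest.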
The question is not whether anyone can reconstruct the data. The question is whether someone in possession of the file, who therefore knows its hash value, can through some legal means identify all of Webroot's customers who also possess the file.
Also, acquirable non-modified information on a given file hash just includes the NUMBER of computers it was seen on, the geo-located country it was first seen in, the OS version, default browser, and a few internal things like the version of the file that was hashed, version of the WSA agent it was seen by, etc. Other information is anonymized instantly, for example, if the file was seen as C:\Users\Kit\Desktop\File.exe, that is stripped to be %desktop%\file.exe.
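The path stripping described above could look roughly like the following Python sketch. This is only an illustration: the `anonymize_path` name, the regular expression, and the choice to lowercase the result are assumptions based on the single example given, not a description of how the WSA agent actually does it.

```python
import re

# Assumption: a per-user Desktop path looks like <drive>:\Users\<name>\Desktop\
_DESKTOP_PREFIX = re.compile(r"^[A-Za-z]:\\Users\\[^\\]+\\Desktop\\", re.IGNORECASE)


def anonymize_path(path: str) -> str:
    # Swap the user-specific prefix for a generic token so the user name
    # never leaves the machine. A function is used as the replacement so the
    # trailing backslash is taken literally.
    stripped, count = _DESKTOP_PREFIX.subn(lambda m: "%desktop%\\", path)
    # Lowercasing mirrors the example above; paths that do not match the
    # pattern are returned untouched in this sketch.
    return stripped.lower() if count else path
```

Calling `anonymize_path(r"C:\Users\Kit\Desktop\File.exe")` gives `%desktop%\file.exe`, matching the example in the post.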
So basically, No. Webroot cannot see or provide information on every computer that scanned a specific file hash, especially not if it's not a PE. The most we could say is how many computers scanned it.
And also, No. Webroot does not maintain a history of all files seen by a given system indefinitely, or even for a short period of time. As a good example, when I look up my home computer's keycode on the system, I see files that were included in the most recent non-trivial (Deep) scan, but not, for example, an executable on my desktop that I deleted two days ago.
Honestly, given the number of cache files, temp files, etc, keeping a cross-linked record of every single file out of thousands of transient files per day per computer across every one of millions of computers would be prohibitive, data-wise, and would not help protect computers against threats.
Edit: And no, document hashes, even with macros or scripts, do not get submitted.
Absolutely. Sorry it took so long; weekends are my days off, and my wife would rather I play SW:TOR with her than do work things.