One of my favorite talks at last week’s BlackHat conference in Las Vegas was former FBI CISO Patrick Reidy's Combating the Insider Threat at the FBI: Real World Lessons Learned. I liked it because it was an informative, real-world discussion on the use of analytics to help detect (and prevent) insider attacks. But my favorite part was the honesty. Patrick told us his phase 1 was less than 50% accurate at detecting the types of behaviors that indicate an inside attacker. He emphasized this point by saying Punxsutawney Phil, the winter-predicting groundhog, has a 50% chance of being right. So phase 1 of the FBI’s insider detection analytics program was worse than a rodent. Pretty funny. He calculated his success rate by running a regression of the model against historical data and seeing if it identified previously known insiders. (Later phases were better.) It was pure scientific method. Nice.

Patrick then told us what he learned from his team’s more accurate later phases. His lessons included:

  • Goal is to deter, not detect. The end goal is to prevent people from becoming insider threats in the first place. Deterrence can sometimes be as simple as giving overt, gentle nudges if users stray too close to some boundary (like clicking on a directory to which they don't have access). Assume it was an honest mistake. Keep the warnings friendly. Encourage caution and awareness. Discourage the bad behavior.
  • Indicators, not predictions.   The analytics can help the human, but can’t spit out the final prediction.
  • Focus on just two activity feeds: In the end, he told us his model was most accurate when based on two data feeds: HR + data egress. For HR, insiders tend to have had some run-in or trouble where HR had to be involved. Things like a recent reprimand or even a security violation, such as poking into data they shouldn’t. This is a leading indicator that the employee may be disgruntled (where they were previously just gruntled). For egress, look for atypically large amounts of data being transferred (either through USB or direct transfer off-prem).
  • Baseline activity on a per-user basis: In order to measure ‘atypical’ behavior you need a baseline. Some users move a lot of data as part of their normal operation and will skew a global average. Instead, look at each individual's deltas from their own norm.
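
To make the per-user baseline idea concrete, here is a minimal Python sketch. It is my own illustration, not anything Patrick showed: the function name, the z-score approach, and the threshold of 3 are all assumptions. It compares a user's egress volume today against that same user's history, so a heavy data mover is not flagged for a normal day while a light user with a sudden spike is. The result is only an indicator for a human to look at, not a prediction.

```python
from statistics import mean, stdev


def egress_indicator(history, today, z_threshold=3.0):
    """Return (z_score, flagged) for one user's egress volume today.

    history: that user's own past daily egress volumes (e.g. in MB).
    z_threshold: hypothetical cutoff; tune it against historical data.
    """
    if len(history) < 2:
        # Not enough data to form a baseline for this user.
        return 0.0, False

    mu = mean(history)
    sigma = stdev(history)

    if sigma == 0:
        # Perfectly flat history: any increase at all is atypical.
        flagged = today > mu
        return (float("inf") if flagged else 0.0), flagged

    # How many standard deviations above this user's own norm?
    z = (today - mu) / sigma
    return z, z > z_threshold


# A heavy user having a slightly busy day: not flagged.
print(egress_indicator([5000, 5200, 4800, 5100, 4900], 5300))

# A light user suddenly moving far more than usual: flagged,
# even though 500 MB is tiny compared with the heavy user's normal day.
print(egress_indicator([10, 12, 8, 11, 9], 500))
```

Note that a single global threshold would get both of these cases wrong, which is exactly the skew the lesson warns about.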

The takeaways for me were: start with basic models; science counts (hypothesis, measure, adjust); don’t lose sight of the end goal (it's not just your awesome analytics); and maybe have a sense of humor as you make progress.