This is a continuation of our previous discussion on getting started with Social Network Analysis (SNA). Now that we can do some SNA with NodeXL, how do you go out and catch the bad guys? Well, remember, SNA is a network of entities. So let’s take auto insurance fraud as an example and show why you and NodeXL alone are not going to uncover the Mafia network up the street.

In a typical auto insurance claim you have a claimant, the person making the claim, and you might have another driver or passenger who is also making a claim. A tow truck operator, body shop, healthcare provider, lawyer, claims adjuster, witnesses, and so on could all be part of the same claim. The general auto industry rule of thumb is that around 1 in 10 claims involves some sort of fraud or abuse. Stated differently, 90% of claims are legit while 10% are suspicious.

Since SNA is a network of entities, we can network all of the parties involved in a claim. Below is an actual example from NodeXL built from a client’s claims. The red lines are a very prolific crime ring that was detected using SPSS Modeler and that I highlighted manually within NodeXL. The remaining networks are a combination of bad guys and good guys.
SNA does not know which is which; it only knows that there is some sort of connection going on. Only through data mining and predictive modeling can you determine which networks are just part of normal business and which are possibly fraudulent. In another post I will go through how this model worked, but at a high level it was tasked with finding crime rings, not networks. A ring can be thought of as an independent group of 2 or more entities, like a terror cell. A network is a connection of multiple entities, possibly rings or other connected entities, usually controlled centrally. A mafia or organized crime network would be an example. What this graph shows is why traditional SNA using just graphs was not the silver bullet that we had all hoped for, and why we still needed predictive modeling.
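To make the "network of entities" idea concrete, here is a minimal Python sketch of linking the parties on each claim and pulling out the independent groups. It is not the SPSS Modeler model mentioned above, and everything in it is an assumption for illustration: the claim IDs, the party names, and the helper functions `build_graph` and `connected_groups` are all hypothetical. Like the NodeXL graph, it only shows who is connected to whom. It says nothing about which groups are fraudulent.

```python
from collections import defaultdict

# Hypothetical claims: each claim ID maps to the parties named on it.
claims = {
    "C1": ["claimant:Ann", "bodyshop:FixIt", "lawyer:Lee"],
    "C2": ["claimant:Bob", "bodyshop:FixIt", "clinic:Westside"],
    "C3": ["claimant:Cy", "clinic:Westside", "lawyer:Lee"],
    "C4": ["claimant:Dee", "tow:QuickTow"],
    "C5": ["claimant:Eve"],
}

def build_graph(claims):
    """Link every pair of parties that appear on the same claim."""
    graph = defaultdict(set)
    for parties in claims.values():
        for p in parties:
            graph[p]  # touch the key so single-party claims still get a node
            for q in parties:
                if p != q:
                    graph[p].add(q)
    return graph

def connected_groups(graph):
    """Return the connected components as sorted lists (iterative DFS)."""
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            group.append(node)
            for nbr in graph[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        groups.append(sorted(group))
    return groups

graph = build_graph(claims)
groups = connected_groups(graph)

# Following the definition above, only independent groups of
# 2 or more entities are even candidates for being a ring.
candidates = [g for g in groups if len(g) >= 2]
for g in candidates:
    print(len(g), g)
```

With this made-up data, the body shop, clinic, and lawyer that keep showing up together end up in one group, while the tow truck pair forms another. Both could be perfectly normal business, which is exactly the point: deciding which candidate groups deserve a closer look would still take a predictive model trained on claims with known outcomes.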
In the next post I will start going into more detail with SNA and SPSS. If you are looking for some data to start playing with, here are a couple of sources at Stanford and Arizona State that you might find useful.