The Extent of Identity Theft in America

A.   Introduction

Identity theft is a concern because millions of families have been victimized, as extracts from the National Crime Victimization Survey (NCVS, conducted on a nationally representative sample of 40,000 household residents) for January to December 2005 show.  The survey estimates that about 5.5% of all households across the United States (or 6.4 million families) fell victim to identity theft at least once and that the “average” loss per household was $1,620.

Highlights include the following:

·  About 1.6 million households experienced theft of existing accounts other than a credit card (such as a banking account), and 1.1 million households discovered misuse of personal information (such as a Social Security number).

·  Ten percent of households with incomes of $75,000 or higher experienced identity theft, about twice the incidence among households earning less than $50,000.

B.   Coding Method and Implementation

In general, data coding involved pre-coded classifications of closed-ended items that fall under the four traditional types of measurement scale.  We also see the application of what Kerlin (2002) calls “Selective Coding” (reflected in the structural relationship between a core category and its related categories, which are integrated to form the theoretical structure of the identity theft analysis) and “Factual/Descriptive Coding” (ideas that lean toward the concrete, such as actions, definitions, events, properties, settings, conditions, and processes).

The coding of identity theft types, for instance, is an example of nominal scaling because the types of identity theft vulnerabilities or exploitable loopholes must fall into categories that are mutually exclusive and collectively exhaustive.  At the same time, it is an instance of selective coding because the separate categories define the structure of the problem:

Identity theft
    Existing credit card
    Other existing accounts
    Personal information
    Multiple types during same episode
No identity theft
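The mutually exclusive and collectively exhaustive property of this scheme can be made concrete with a short sketch. The class name, the numeric codes, the three boolean flags, and the rule that two or more flags map to "multiple types" are all assumptions made for illustration; the survey's actual coding procedure is not described in the source.

```python
from enum import Enum
from itertools import product


class TheftType(Enum):
    # The numbers are arbitrary labels (hypothetical); on a nominal scale
    # they carry no order and no magnitude.
    EXISTING_CREDIT_CARD = 1
    OTHER_EXISTING_ACCOUNTS = 2
    PERSONAL_INFORMATION = 3
    MULTIPLE_TYPES = 4
    NO_IDENTITY_THEFT = 5


def code_household(credit_card: bool, other_account: bool, personal_info: bool) -> TheftType:
    """Assign exactly one nominal category to a household (assumed rule)."""
    hits = sum([credit_card, other_account, personal_info])
    if hits == 0:
        return TheftType.NO_IDENTITY_THEFT
    if hits > 1:
        return TheftType.MULTIPLE_TYPES
    if credit_card:
        return TheftType.EXISTING_CREDIT_CARD
    if other_account:
        return TheftType.OTHER_EXISTING_ACCOUNTS
    return TheftType.PERSONAL_INFORMATION


# Every possible combination of flags receives one and only one category.
ALL_CODES = {flags: code_household(*flags) for flags in product([False, True], repeat=3)}
```

Because the function returns a single category for each of the eight possible flag combinations, no household can be left uncoded or coded twice, which is what nominal scaling requires.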


Unlike the other variables, “ways victims became aware of identity theft” is the classic case of open-ended responses coded in factual or descriptive fashion, because the responses are about events or processes such as noticing “missing money/unfamiliar charges on account,” being “contacted about late/unpaid bills,” or having “banking problems.”
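As a rough illustration of factual or descriptive coding, free-text answers can be matched against a codebook of keywords. The three category labels are taken from the report; the keyword lists, the fallback category "other", and the function name are hypothetical and stand in for whatever rules the survey coders actually applied.

```python
# Category labels come from the report; the keywords are invented for illustration.
CODEBOOK = {
    "missing money/unfamiliar charges on account": ("charge", "missing money", "withdrawal"),
    "contacted about late/unpaid bills": ("unpaid", "late bill", "collection"),
    "banking problems": ("bank", "bounced", "overdraft"),
}


def code_response(text: str) -> str:
    """Return the first codebook category whose keyword appears in the answer."""
    lowered = text.lower()
    for category, keywords in CODEBOOK.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return "other"  # hypothetical catch-all keeps the scheme exhaustive
```

A first-match rule like this keeps the categories mutually exclusive, at the cost of depending on the order in which the codebook is written.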

C.   Measurement of Variables Used in Data Analysis

There were three variable types in this study.  The first, the dependent variable, is identity theft.  The Bureau of Justice Statistics (BJS) defined this as credit card thefts, thefts from existing accounts, misuse of personal information, and multiple types at the same time.  As one may discern from the above list, this is an example of a nominal scale: the categories are set apart solely by a classificatory hypothesis about the circumstances in which rogues may acquire personal or financial information that was then misused for financial gain.

The explanatory or independent variables were age, race, and ethnicity of the household head; household income and composition; and location of the household.  Race, ethnicity, and location are also nominal variables: their categories are mutually exclusive but cannot be ordered or mathematically manipulated.  Age and income, on the other hand, are ratio scales in principle, since both have a true zero; as tabulated in class intervals of unequal size, however, they are effectively ordinal.

A third class of variable was descriptive in nature.  “Characteristics of the theft presented include economic loss, how the theft was discovered, whether misuse is ongoing, and problems experienced as a result of the identity theft” (Baum, 2006).

As reported in the article, the analysis was quite cursory and confined, for the most part, to cross-tabulating the dependent variable against the classificatory ones.

The breakdown of the ordinal scale, “time it took to resolve an identity theft once discovered”, is a vivid example of an unplanned measurement outcome.  The timescale is built in an intuitive way: “in a day or less, within the week, more than a week and up to 2 weeks, etc.”  And yet the distribution is skewed to the right: the largest single class of resolved cases (28.6%) was settled within a day, while a long tail of cases took months, which suggests that many identity thefts are resolved quickly and losses minimized.


Ongoing ..................................... 18.0%
Problems resolved ........................... 71%
    1 day or less ........................... 28.6%
    2-7 days ................................ 16.6%
    8-14 days ...............................  6.5%
    15-28 days ..............................  3.1%
    1-2 months .............................. 12.2%
    3 or more months ........................  4.8%
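One way to read an ordinal breakdown like this is to accumulate the class shares in order and find the class in which half of the resolved cases have been reached. The sketch below does so with the six percentages listed above; the function and variable names are illustrative. Note that the six classes as listed sum to 71.8 rather than the 71% shown for resolved problems, so the sketch normalizes by their own sum.

```python
# Percent of victimized households by time to resolve, as listed above.
RESOLUTION_SHARES = [
    ("1 day or less", 28.6),
    ("2-7 days", 16.6),
    ("8-14 days", 6.5),
    ("15-28 days", 3.1),
    ("1-2 months", 12.2),
    ("3 or more months", 4.8),
]


def cumulative_shares(classes):
    """Return (label, cumulative share of resolved cases) in class order."""
    total = sum(pct for _, pct in classes)
    running = 0.0
    result = []
    for label, pct in classes:
        running += pct
        result.append((label, running / total))
    return result


def median_class(classes):
    """First ordered class at which the cumulative share reaches one half."""
    for label, share in cumulative_shares(classes):
        if share >= 0.5:
            return label
    raise ValueError("no classes supplied")


if __name__ == "__main__":
    for label, share in cumulative_shares(RESOLUTION_SHARES):
        print(f"{label:<18}{share:6.1%}")
    print("median class:", median_class(RESOLUTION_SHARES))
```

On these figures the median resolved case falls in the 2-7 day class, even though some cases run to three months or more, which is the pattern of a distribution with a long right tail.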

D.   Choice of, and Rationale for, Measurement Scales

In reality, only the incidence and descriptive loss variables were specific to an identity theft investigation.  All the other classificatory variables fit the general mold of an “omnibus” crime incidence survey.  On the other hand, the Bureau of Justice Statistics did file a “Final Report of Cognitive Research on the New Identity Theft Questions for the National Crime Victimization Survey”, presumably to support more meaningful analysis, and has undertaken to unify its field research with that done by the FTC.

Amount of loss  Total    Existing      Other existing   Personal      Multiple types
                         credit card   accounts         information   during same episode
$0              18.3%    13.3%         16.8%            37.3%         14.8%
$1-99           16.7%    21.2%         18.8%            5.6%          10.6%
$100-249        12.0%    12.4%         14.8%            7.8%          10.1%
$250-499        10.8%    11.8%         12.2%            4.5%          12.2%
$500-999        10.8%    11.3%         11.7%            6.2%          13.2%
$1,000-2,499    10.2%    10.0%         10.8%            7.6%          13.1%
$2,500-4,999    3.6%     4.2%          2.3%             2.5%          5.5%
$5,000+         4.7%     3.8%          3.2%             6.6%          8.7%
Don't know      12.9%    11.9%         9.4%             21.8%         11.8%
Mean*           $1,620   $980          $1,220           $4,850        $2,460
Median*         $300

* N.B. Mean and median calculations are based on losses of $1 or more.

Table 1. Amount of financial loss due to identity theft

In this instance, reported losses were grouped into class intervals of unequal size because the distribution is strongly skewed to the right: the mode fell below $100, the median was $300, and the arithmetic mean was pulled up to $1,620 by outliers reporting losses of $5,000 or more.
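The mode, median, and direction of skew can be checked against the "Total" column of Table 1. The sketch below is a minimal illustration with hypothetical function names; it uses only the classes of $1 or more, as the table's footnote does, and treats the class with the largest share as the modal class. A mean well above the median is the usual sign of a long right tail.

```python
# "Total" column of Table 1, losses of $1 or more (percent of victimized households).
LOSS_CLASSES = [
    ("$1-99", 16.7),
    ("$100-249", 12.0),
    ("$250-499", 10.8),
    ("$500-999", 10.8),
    ("$1,000-2,499", 10.2),
    ("$2,500-4,999", 3.6),
    ("$5,000+", 4.7),
]
REPORTED_MEAN = 1620    # dollars, from Table 1
REPORTED_MEDIAN = 300   # dollars, from Table 1


def modal_class(classes):
    """Class holding the largest share of households."""
    return max(classes, key=lambda item: item[1])[0]


def median_class(classes):
    """First ordered class at which half of the total share is reached."""
    half = sum(pct for _, pct in classes) / 2
    running = 0.0
    for label, pct in classes:
        running += pct
        if running >= half:
            return label
    raise ValueError("no classes supplied")


def skew_direction(mean, median):
    """Rule of thumb: the mean is pulled toward the longer tail."""
    if mean > median:
        return "right"
    if mean < median:
        return "left"
    return "none"
```

The class that contains the median on this calculation, $250-499, is consistent with the reported median of $300.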

E.   Types of Graphic Presentations Used

Histogram (bar chart): used to illustrate, for example, the finding that “almost half of households experienced unauthorized use of an existing credit card, 2005”.  While a pie chart would not have been wrong here, Baum chose a bar chart, which makes the finding easier to visualize.

F.    Possible Improvements in Measurement and Coding Scales

One might take issue with limiting the nominal scaling of race to just four categories (“white, black, Other and More than one race”), since this ignores the sheer diversity of the present-day population.  Breakdowns for East and South Asians, for example, are commonplace in Census Bureau data-gathering and reports.

Given the explosion in e-commerce and consequent online use of credit cards since about 2001, one hazards that purchases over the Internet could account for a substantial proportion of identity theft cases.  Classifying such instances under the general heading “Existing credit card” does not help very much in warning potential victims about the risks of entrusting personal financial information to a faceless stranger who, more often than not, discloses no verifiable physical location.  Breaking out the data in future would also motivate the Justice Department to propose measures to close the gaps and to prosecute felons more vigorously.

Cross-tabulation of identity theft incidence against age and residence, a form of descriptive analysis, may focus the investigative efforts of local police or the FBI but, by failing to explain how the crime happens, such reporting does little to forestall these crimes in future.

Similarly, basic analysis of crime incidence by income (“one in 10 households with incomes of $75,000 or higher experienced identity theft, making this income group more vulnerable than households with lower annual incomes”) leaves the reader wondering whether a third, intervening variable may be at work.  For instance, the comparatively well-off may account for a greater share of card usage and are therefore more exposed to the risk of loss.
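The intervening-variable argument can be illustrated with a small stratified comparison. Every count below is hypothetical and invented purely for illustration (the report publishes no card-usage breakdown); the numbers are chosen so that the crude rates echo the reported ten percent versus roughly half that, while the rates within each level of card use are identical across income groups.

```python
# HYPOTHETICAL counts, invented for illustration only:
# (income group, card usage) -> (households, victimized households)
HOUSEHOLDS = {
    ("$75,000 or more", "heavy card use"): (750, 90),
    ("$75,000 or more", "light card use"): (250, 10),
    ("under $75,000", "heavy card use"): (125, 15),
    ("under $75,000", "light card use"): (875, 35),
}


def crude_rate(income):
    """Victimization rate for an income group, ignoring card usage."""
    households = sum(n for (inc, _), (n, _v) in HOUSEHOLDS.items() if inc == income)
    victims = sum(v for (inc, _), (_n, v) in HOUSEHOLDS.items() if inc == income)
    return victims / households


def stratum_rate(income, usage):
    """Victimization rate within one income group and one usage level."""
    households, victims = HOUSEHOLDS[(income, usage)]
    return victims / households


if __name__ == "__main__":
    for income in ("$75,000 or more", "under $75,000"):
        print(income, "crude rate:", crude_rate(income))
        for usage in ("heavy card use", "light card use"):
            print("   ", usage, stratum_rate(income, usage))
```

In this invented data set the higher-income group shows twice the crude rate, yet within each level of card use the two income groups are victimized at the same rate. The association with income is then entirely a product of how card usage is distributed, which is the kind of possibility a simple cross-tabulation cannot rule out.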

Even in the absence of relevant variables, the analysis team could have made the entire matter of measurement more meaningful by showing how the socio-demographic profile of identity theft victims differs from that of the population at large, as well as that of credit card holders, for example.


Baum, K. (2006). Identity theft: 2005. Bureau of Justice Statistics. Retrieved Nov. 23, 2007.

Kerlin, B. A. (2002). Chapter 6: Coding strategies. Nud.Ist4 Classic Guide. Retrieved Nov. 23, 2007.