Identity management 101: How digital identity works in 2020

The integrity of the global distributed computing network depends extensively on how well users’ digital identities can be protected. As difficult a problem as it seems, it’s actually much harder.

In a person-to-person business transaction, when the other person doesn't know you or know who you are, she may accept credentials that vouch for your identity. Identity management in computer networks is about the following:

  • Whether a set of digital credentials under examination by a service or application can vouch for your identity, and can attest to your authorization to conduct the transaction you request;
  • Whether an access management system can trust the results of that examination enough to grant you access to documents, protected services, or information;
  • Whether you personally can trust credentials presented by a website or web app, to represent the business, institution, or agency with which you intend to conduct a transaction.

Thomas Edison punching in for work at his Menlo Park laboratory, 1921.

From the US Library of Congress, in the public domain.

Data is being collected about you -- that much is undeniable. (Just today, you've probably thwarted a phishing attempt.) The common misconception is that your personally identifiable data (PII) resides natively in some single, centralized database. Clearly, Facebook is one of the most aggressive collectors of behavioral data and aggregated inferences about the personal information people post -- although it accuses others of "harvesting" this data. Indeed, social media's ability to influence public perceptions and opinion may have already changed the course of world history.

But not even the autonomously collected metadata about everyone's online behavior -- despite efforts to protect everyone's privacy -- constitutes a collective database of digital identities. Many people who claim to have had their digital identities stolen are actually among thousands of victims of the theft of a database which includes some element of personal data, such as a credit card number. What makes the collection of this data most dangerous to you as a person is the possibility that a system with access to data that authenticates you (credentials) can pair it with data that describes you (metadata). This way, conceivably, a malicious actor can gather this data together from multiple sources to impersonate you, and conduct financial and business transactions in your name.

So when we talk about "digital identity," what are we really saying?

  • Digital identity consists of the credentials necessary to gain access to resources in a network or online, in your name.
    • In the weakest possible secured system, just a password may suffice to assume some form of digital identity.
    • In an unsecured system -- for instance, utilizing a web browser so you can read some technology news site somewhere -- public servers may only require anonymous credentials, but even then there's a kind of temporary digital identity representing the browser, which is assumed to represent you because you gained access to the client where the browser was installed.
    • An enterprise network typically requires some form of authentication for you to attain access -- perhaps two-factor authentication, or even more. Those credentials are your digital identity in the context of that network, but not outside that network.
    • On the internet, which is a patchwork of networks linked together through gateways, the sharing of services requires digital identity to be exchanged in some form. This is the trickiest and most volatile part of the entire digital identity scheme.
  • Personal identity is the amalgam of information necessary for you, or anyone seeking to impersonate you, to be recognized as valid and to be authenticated. Someone stealing a password to a DMV database might be able to attain information about your driver's license or the make and model of car you drive. If that's enough information for someone else, in some other transaction, to pass this person off as you, then that malicious actor may have effectively "stolen your identity."

Identity management, therefore, consists of the practices and principles to which everyone in the transaction process adheres, including yourself, to protect those elements of digital identity that may be combined by a malicious actor to utilize your personal identity. Identity and access management (IAM) is the class of software and services on the computer networks' side of the transaction, dedicated to fulfilling their responsibilities to you in that regard. 

Digital identity and personal identity

You know who you are. However, digital identity in a computing system, as you've just seen, is a fuzzy topic.

Digital identity is asserted through credentials

The weakest form of access management in a computer system involves a single username paired with a single password. Password management is not identity management.

At any one time, your digital identity is composed of credentials, which are essentially tokens of data and metadata that represent you. They're not your personal papers, or anything such papers would contain. Whenever you enter a secured building, even if you work there, you present your credentials, typically an access card -- probably just to operate the elevator. That card is a token testifying to someone already having cleared you for entry. In most cases, digital identity is likewise a token composed of data.

But it's more complicated than that:

  • Services and software -- which are clearly not people -- may have some form of digital identity, because they too must access databases and resources as though they were "users."
  • Browsing the web does not require you, or any other user, to have enough identity for a person or system to identify you personally. You can read this ZDNet article without signing into some central internet service provider first. A browser may be assigned a kind of temporary "visitor's credentials," if you will, to establish a session with servers, but which do not exchange your credentials -- that may happen as part of a separate transaction.
  • There is no universally recognized, converged, single set of credentials that identifies "you" exclusively for the internet. In many regards, that's not good news. It means, if a malicious actor (a person) truly wants to impersonate you -- maybe to post text in your name on social media, or to transfer your cash into this person's account -- there is a slim possibility that this person could gather enough personally identifiable information (PII) about you to attain the data necessary to pass as you online.

Comparing identity management to data loss prevention

Businesses, especially financial institutions, do store PII for their customers in databases. The class of security service dedicated to preventing data breaches is called data breach prevention or data loss prevention (DLP). Such services focus on the integrity and impregnability of these databases themselves.

Data breaches appear to have become more common events in recent years. Statistically speaking, however, they may be declining in number at the same time they're increasing with respect to damage caused. The customer data breach reported in February 2020 by facial recognition analytics firm Clearview AI is an example of malicious PII acquisition on a relatively small scale, but with a potentially major impact, especially on the integrity of active law enforcement investigations.

It takes real, typically human, effort to coordinate PII from multiple sources -- for example, to acquire access to your lines of credit and make purchases using your name and numbers. Data breaches do happen, indeed all too frequently, but turning breached data into a usable identity is not a push-button process.

Identity management can and does thwart such efforts. But its focus is securing the data with which you identify yourself online, as well as by electronic means in-person (for example, using an ID card). A person who has successfully breached DLP, pilfering a firm's customer data, may use that data to impersonate customers of that firm. But your own digital identity links to all the credentials and permissions you may have to access your personal records, wherever they may be stored. Your digital identity would be invaluable to a malicious actor capable of gathering the databases acquired through multiple breaches, and joining them together. That's a bad actor who would open loan accounts in your name.

How digital identities have typically been protected

By now, in the margins of web pages and in cable TV news ads, you've seen the hype about "hackers" stealing your identity, or trying to, and selling it on the digital black market of the "dark web." These depictions represent the commonly marketed concept of digital identity, much of it based on 1980s pulp sci-fi. Its premise is that the internet is a massive, collective database of people's names, addresses, voting preferences, salary histories, credit ratings, and Social Security numbers, and that all anyone needs to do to steal these records, and then send you phishing email, is to swipe some eight-character password you or a relative might have used once to access CompuServe in the 1980s.

The first enterprise networks were portrayed as "fortresses," and the objective of security was to ensure that malicious actors could not breach them. This was the "endpoint security" or "perimeter security" model, and to a surprisingly large extent, it still exists today. It's the modern version of the "castle defense" model that dates back to the mainframe computing of the 1970s.

A modern enterprise security system is not focused on securing the perimeter to disallow something from coming in. The new security model is completely inverted, focused instead on securing the entity, sometimes called in this context the "identity." This is the credentialed user, who by default is not permitted to see or do anything at all -- a condition called zero trust (ZT). The purpose of the system, therefore, is to enable that user to see or do something, in accordance with policy and permissions. This way, in the absence of any operative security (for example, if the network is hacked and the system defeated), absolutely nothing is exposed.

Where trust comes from

In the everyday world, you conduct business transactions with other people relatively easily, for the most part, when they have confidence that you are who you say you are. You sign documents asserting your willingness to make these transactions. Even when you affix your signature to a document outside of any witnesses, people can generally trust your signature as an assertion that you are fully aware of what you're doing. There is an implied trust in a personal transaction. It can be violated, but it is not by nature volatile.

In the realm of digital transactions, this implied trust is effectively erased. Digits are symbols, and as such have little to distinguish themselves from one another. Any sequence of digits is logically as simple to forge as a single digit. Trust has to be constructed for each session of digital transactions. The simplest way to explain how this is done is to say that each person in a trusted transaction is given a puzzle box, and is trusted with the puzzle's solution. The solution opens the box without revealing its own details, or any part of the box's mechanism. If the box is opened and the contents accurately revealed, one can draw a conclusion with 99+% confidence that the person who opened the box is the person she claims to be.
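One way to picture the puzzle box is a challenge-response exchange. The sketch below is a minimal illustration, not a production protocol: it assumes both sides already share a secret (real systems often use public-key signatures instead), and the function names `new_challenge`, `respond`, and `verify` are hypothetical. The verifier issues a fresh random challenge for each session, and the claimant proves knowledge of the secret without ever sending it.

```python
import hashlib
import hmac
import secrets


def new_challenge() -> bytes:
    """Verifier side: a fresh random challenge (the "puzzle box") per session."""
    return secrets.token_bytes(32)


def respond(secret: bytes, challenge: bytes) -> bytes:
    """Claimant side: prove knowledge of the secret without transmitting it."""
    return hmac.new(secret, challenge, hashlib.sha256).digest()


def verify(secret: bytes, challenge: bytes, response: bytes) -> bool:
    """Verifier side: recompute the expected response, compare in constant time."""
    expected = hmac.new(secret, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


# Assumption: the secret was provisioned to both parties at enrollment.
shared_secret = secrets.token_bytes(32)
challenge = new_challenge()

ok = verify(shared_secret, challenge, respond(shared_secret, challenge))
forged = verify(shared_secret, challenge, respond(b"wrong guess", challenge))
print(ok, forged)
```

Because the challenge changes every session, a response captured from one session is useless in the next, which is why trust has to be constructed each time.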

With the earliest networked systems, you logged onto a computer or a server through a terminal (a command line where you type instructions). Whether you then gained access to resources located on that server depended on how much protection was assigned to your account. In the absence of a proactive IT security team, there may have been no protection at all. Some who tried to paint the best possible picture of this scenario called it "open architecture," defining the principle of implicit trust -- access without being challenged.

Modern systems that utilize identity management operate on zero trust, which sounds like the opposite because it is. Unless an access control list (ACL), an enforced policy, or some other mechanism explicitly grants access to a resource, your request for it will be denied. In a properly administered zero-trust system, no one has unrestricted access to any resource or domain. Nevertheless, you'll often find some IAM systems governed by a restriction-free administrator account.
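The default-deny behavior can be sketched in a few lines. This is an illustrative toy, assuming a simple in-memory access control list; the class `AccessControlList` and the principal and resource names are hypothetical. The point is that the check never needs a "deny" rule: anything not explicitly granted is refused, including requests from an account named "admin".

```python
from dataclasses import dataclass, field


@dataclass
class AccessControlList:
    # Maps (principal, resource) -> set of explicitly granted actions.
    grants: dict[tuple[str, str], set[str]] = field(default_factory=dict)

    def grant(self, principal: str, resource: str, action: str) -> None:
        self.grants.setdefault((principal, resource), set()).add(action)

    def is_allowed(self, principal: str, resource: str, action: str) -> bool:
        # Zero trust: the absence of an explicit grant means the answer is no.
        return action in self.grants.get((principal, resource), set())


acl = AccessControlList()
acl.grant("alice", "payroll-report", "read")

print(acl.is_allowed("alice", "payroll-report", "read"))   # explicitly granted
print(acl.is_allowed("alice", "payroll-report", "write"))  # never granted
print(acl.is_allowed("admin", "payroll-report", "read"))   # no special case
```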

Identity and access management in practice

An identity and access management (IAM) system builds permissions and accessibility for users, within a network where those users are otherwise untrusted. The mission of IAM is protecting access to information assets, and ensuring that only authorized people have a view to protected documents and services inside an enterprise. IAM protects and encapsulates one network domain, using a single directory of users, and a single directory of protected resources.

Where policy comes into play

In February 2020, the US National Institute of Standards and Technology (NIST) published the second draft of its proposed official description of zero trust architecture.

[Diagram: NIST zero trust architecture]

This particular diagram from NIST publication 800-207 depicts the "journey" that a requesting user takes from placing an untrusted request for a resource, to being granted access to that resource. When a user becomes the subject of an authentication inquiry, a digital identity is built around that subject. At that point, it becomes what security engineers call a "security principal" or just principal.

In the diagram, think of the big, rounded rectangle as corralling all the relevant components of the network. And think of the pictogram that looks like a budget desktop PC from the 1990s as something a user might actually operate today. Like most software-defined networks (SDN), the network diagrammed here is divided into two planes of traffic. The control plane is separated from any portion of the network to which the user may be given visibility, when attributes of the network become associated with the principal. From the principal's perspective, only the data plane of this network exists. Determinations about whether to grant access to a principal are made with each request, and no principal at any time is given unchallenged or permanent access to a resource, even if that principal, or another security principal attributed to the same user, has accessed this resource before.

It's in this control plane where the policy decision point (PDP) is located. In security architecture, a policy is a rule that sets the conditions under which a principal, or a group representing multiple principals, may be granted or denied access to a resource. Think of it like a computer program, but encapsulated into a single instruction line, with multiple lines running simultaneously.

The PDP is subdivided into two components. The first, the policy engine (PE), interprets policy rules and makes the ultimate decision to grant or deny the principal access to the requested resource; until it reaches a favorable finding, the answer will be no. The second, the policy administrator (PA), acts on that decision by establishing or shutting down the communication path between the principal and the resource. Because of the plane separation, the principal never actually "sees" the PDP, so a malicious actor cannot take down the PDP from within the data plane.

In the data plane, the policy enforcement point (PEP) contains the low-level agent that executes the policy directive from the PA. Rather than acting as a turnstile directing the principal towards permanent access to the resource, it serves as a go-between or proxy, facilitating a connection to the resource, but only through it, and only for the duration of time set by the PA.
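The division of labor among these three components can be sketched as follows. This is a simplified illustration under stated assumptions, not NIST's reference design: the class names, the `Request` and `Session` types, the five-minute lifetime, and the sample policy are all hypothetical, and time is passed in explicitly as `now` to keep the example deterministic.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Request:
    principal: str
    resource: str
    device_trusted: bool


@dataclass(frozen=True)
class Session:
    principal: str
    resource: str
    expires_at: float


# A policy is a rule: given a request, does it permit access?
Policy = Callable[[Request], bool]


class PolicyEngine:
    """Control plane: interprets the policy rules for each request."""

    def __init__(self, policies: list[Policy]) -> None:
        self.policies = policies

    def evaluate(self, request: Request) -> bool:
        # Default deny: access requires at least one rule to say yes.
        return any(policy(request) for policy in self.policies)


class PolicyAdministrator:
    """Control plane: turns a favorable finding into a time-limited session."""

    def __init__(self, engine: PolicyEngine, ttl_seconds: float) -> None:
        self.engine = engine
        self.ttl_seconds = ttl_seconds

    def authorize(self, request: Request, now: float) -> Optional[Session]:
        if not self.engine.evaluate(request):
            return None
        return Session(request.principal, request.resource, now + self.ttl_seconds)


class PolicyEnforcementPoint:
    """Data plane: proxies the resource, only while the session is valid."""

    def __init__(self, resources: dict[str, str]) -> None:
        self.resources = resources

    def fetch(self, session: Optional[Session], resource: str, now: float) -> Optional[str]:
        if session is None or session.resource != resource or now >= session.expires_at:
            return None
        return self.resources.get(resource)


def finance_on_trusted_device(request: Request) -> bool:
    # Hypothetical policy: alice may reach the ledger from a trusted device.
    return request.principal == "alice" and request.resource == "ledger" and request.device_trusted


pa = PolicyAdministrator(PolicyEngine([finance_on_trusted_device]), ttl_seconds=300)
pep = PolicyEnforcementPoint({"ledger": "Q1 figures"})

session = pa.authorize(Request("alice", "ledger", device_trusted=True), now=1000.0)
print(pep.fetch(session, "ledger", now=1100.0))  # inside the session window
print(pep.fetch(session, "ledger", now=1400.0))  # after the session expired
denied = pa.authorize(Request("alice", "ledger", device_trusted=False), now=1000.0)
print(denied)
```

Note that the principal only ever calls the enforcement point. The engine and administrator sit behind it, which mirrors the plane separation described above.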

All the other boxes to the left and right in the NIST diagram represent security components that may supply information to the PDP and PEP. But the dependency is indirect and event-driven. For instance, if "threat intelligence" were to provide some insight, the PE could make its own request for that data through an API call. No hard dependencies are added here -- which is important, because such dependencies could be exploitable by a malicious actor seeking to defeat the policy engine.

Attributes and assertions

Policy creation cannot happen without a way to identify the subjects of its rules. This happens explicitly and exclusively by way of identity management. In the context of a local enterprise network, NIST calls this identity governance. It's here where NIST's definition of identity (what we've been calling "digital identity") makes some modicum of sense. Its most succinct version of this definition appears in a report on identity federation, where it calls identity, "authentication attributes and subscriber attributes" in a networked system.

We can define these terms as follows:

  • Authentication attributes are elements of data attributable to a user which may be cross-referenced and verified to ensure that the user has authorization to assert itself as a certain entity or person.
  • Subscriber attributes consist of data that relates an authenticated user to the system hosting the directory to which that user, for lack of a better word, belongs -- in effect describing the user's relationship to the enterprise.

The ultimate objective of identity governance in the enterprise, in practice, is to become capable of restricting each user's view of the network exclusively to those resources to which the user is explicitly entitled, or to which an interpretation of policy may determine the user has legitimate access. Nothing else exists, from the user's perspective, but resources to which the user is entitled. It's security experts' view that, if the access management function of IAM is reliable enough, security professionals can re-focus their attentions on the integrity of digital identity.
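As a rough illustration, an identity built from the two attribute groups can be used to compute the only view of the network a user ever gets. The sketch below assumes a deliberately simple entitlement rule based on a "department" subscriber attribute; the `Identity` and `Resource` types, the attribute keys, and the catalog are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Identity:
    # Authentication attributes: data used to verify who the user is.
    authentication_attributes: dict[str, str]
    # Subscriber attributes: the user's relationship to the enterprise.
    subscriber_attributes: dict[str, str]


@dataclass
class Resource:
    name: str
    # Hypothetical policy: departments explicitly entitled to this resource.
    entitled_departments: set[str] = field(default_factory=set)


def visible_resources(identity: Identity, catalog: list[Resource]) -> list[str]:
    """Return only the resources this identity is explicitly entitled to see."""
    department = identity.subscriber_attributes.get("department")
    return [r.name for r in catalog if department in r.entitled_departments]


catalog = [
    Resource("ledger", {"finance"}),
    Resource("source-repo", {"engineering"}),
    Resource("unassigned-share"),  # no entitlements, so nobody sees it
]
alice = Identity(
    authentication_attributes={"username": "alice", "credential_id": "key-01"},
    subscriber_attributes={"department": "finance", "role": "analyst"},
)
print(visible_resources(alice, catalog))
```

From this user's perspective, the engineering repository and the unassigned share simply do not exist.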

One metaphor for this strategy involves defending entrance to a castle against agents from the outside using hundreds of guards stationed along its perimeter, compared to four or five guards encircling each foreign agent.

Making identity make sense across network boundaries

In any network of networks such as the internet, single sign-on (SSO) implies the ability for a user to pass his credentials through once and once only, generally in logging on to his local (client-side) operating system. Every other service or application requiring the user's authentication receives it from the service in which he signed on. 

Identity federation is an active effort for multiple networks to agree upon one protocol for allowing assertions of digital identity to traverse network boundaries, so that SSO becomes, at the very least, feasible. In any cross-network or cloud-oriented transaction where a process on one network requires a process or resource on another network, and both processes need to appear seamlessly integrated to an authenticated user, federation is the system that both networks rely upon to establish some level of trust with one another.

When federation enters into the picture, the trick becomes to define identity using something more permanent than just subscriber attributes. A subscription to a resource in one network probably should not entitle the subscriber to access to resources in other networks. However, for SSO to be feasible, other networks connected to one another through the internet should be able to vouch for the validity of each other's authentications.

[Diagram: NIST identity federation]

Another diagram, this time from NIST Special Publication 800-63C, represents the simplest form of identity federation, as an exchange between three parties: a principal who claims access to a resource (the "subscriber"), a relying party (RP, or "resource provider" depending upon whom you ask) and an identity provider (IdP). To access the RP, first the subscriber authenticates to the IdP. (If she's logged onto her employer's network using SSO, this part may have already happened.) The IdP then acts as a proxy for the subscriber, asserting credentials on her behalf. The RP then provides access to the subscriber, but only in an encrypted session that only the subscriber can decrypt and make sense of, if and only if the IdP was correct about who the subscriber is.

Federation happens, in this instance, when multiple networks that may act as RPs trust either a single IdP, or a network of IdPs that agree to use the same protocol.
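The assertion step in this exchange can be sketched as a signed, expiring voucher. This is a minimal illustration and not the wire format of any real federation protocol: it assumes the IdP and the RP share a signing key (production protocols typically rely on public-key signatures), and the function names, claim names, and host names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def issue_assertion(idp_key: bytes, subscriber: str, audience: str, expires_at: float) -> str:
    """IdP side: after authenticating the subscriber, sign a short-lived assertion."""
    payload = json.dumps(
        {"sub": subscriber, "aud": audience, "exp": expires_at}, sort_keys=True
    ).encode()
    signature = hmac.new(idp_key, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode()
        + "."
        + base64.urlsafe_b64encode(signature).decode()
    )


def verify_assertion(idp_key: bytes, token: str, audience: str, now: float) -> Optional[str]:
    """RP side: accept the subscriber only if the IdP's signature checks out."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        return None  # malformed token
    expected = hmac.new(idp_key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        return None  # not issued by the trusted IdP, or tampered with
    claims = json.loads(payload)
    if claims["aud"] != audience or now >= claims["exp"]:
        return None  # meant for a different RP, or expired
    return claims["sub"]


# Assumption: this key was exchanged when the RP agreed to trust the IdP.
idp_key = b"demo-key-shared-by-idp-and-rp"
token = issue_assertion(idp_key, "alice", "rp.example", expires_at=2000.0)

print(verify_assertion(idp_key, token, "rp.example", now=1500.0))        # accepted
print(verify_assertion(idp_key, token, "other-rp.example", now=1500.0))  # wrong RP
print(verify_assertion(idp_key, token, "rp.example", now=2500.0))        # expired
```

Any number of relying parties could run the same `verify_assertion` check against the same IdP, which is the sense in which they are federated.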

ZDNet asked a panel of executives in the security field, meeting at the last RSA Conference in San Francisco, how they perceived the problem of identity management on a relative scale of priorities.

"It comes down to a way to federate identity across lots of different organizations," responded Hank Thomas, CEO of security VC Strategic Cyber Ventures LLC.  "That comes back to people having to work together, and trust each other that, once one person has proven that someone is someone, that other organization is going to have the same level of trust in the same thing. There's ways to do that; it's just that that trust isn't necessarily there yet. Maybe it's for compliance reasons, and for other reasons."

If that didn't sound like the clearest response you've ever heard, it's because the direction of identity federation at present is just about that clear. Federation is a vital necessity only because digital identity is an ephemeral thing -- it's a digital voucher that expires. It has to be re-created -- which wouldn't be a bother if the world were just one single enterprise network. If human users all carried with them a digital authentication device such as, or similar to, a YubiKey U2F device -- something physical which an identity provider (IdP) could assume was on the user's person during work hours -- the type of cryptographic ping-pong that takes place today could perhaps be radically simplified.

Until human beings worldwide become more comfortable with the idea of letting something electronic stand for them, identity management will be a work in progress. Ironically, the state of affairs that many people fear -- that their personal identities will be stolen by way of their digital identities -- is more likely to happen, as long as there is no single source of identity, with a single root of trust, that can be the focus of security engineers' best work.
