Product Requirements Documents (PRDs) are a common tool for getting product teams on the same page at the beginning of product development cycles. Typically written by marketing, PRDs seek to lay out what functions a product must have in order to be competitive at launch in 2-3 years. As with any prediction of the future, they involve many assumptions about technologies, markets, and use cases.
In a written document, words matter, and team members should have a similar understanding of what the words mean and what the assumptions behind them are. Too often, though, that common understanding is lacking. And when you're two years into a three-year project, investing tens of millions of dollars, those miscommunications can be very costly.
For example, I once wrote a PRD that specified how the user interface should be structured. I assumed that the engineers would know that for every action there should be a confirming signal to the user once the requested function completed.
The engineers, fresh from grad school, didn't know that. The prototype UI had no confirming signals. The software schedule took a three month hit.
Enter natural language processing
In a recent paper, researchers at Delhi Technological University proposed an AI-assisted way to overcome some of these cross-domain ambiguities. As they note, eliciting product requirements has been termed ". . . the most difficult, critical, and communication-intensive aspect of software development."
It's difficult because the team members - and eventual customers - have different backgrounds. Marketing doesn't know what coming technology may make possible. Engineers rarely understand customer environments, or competitive pressures. Everyone uses words with domain-specific meanings whose connotations may be lost on other team members.
In their paper, the researchers propose:
". . . a natural language processing (NLP) approach based on linear transformation of word embedding spaces. . . . [to produce] a ranked list of potentially ambiguous terms for a given set of domains. . . . For each word in a set of dominant shared terms, . . . [a]n ambiguity score is then assigned to the word. . . ."
Word embedding is a language modeling technique based on the idea that words appearing in similar linguistic contexts share similar meanings. When a single word is used in dissimilar contexts - engineering and marketing, for instance - which the researchers detect by crawling the Wikipedia corpus, the frequency of its use in each context determines its dominant meaning in that domain.
The differences in those meanings can be measured to determine the word's cross-domain ambiguity. Ambiguous terms can be noted for extra discussion in the product requirements process, resolving differing understandings among team members, and reducing surprises later on.
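The core comparison can be sketched in a few lines of Python. Everything below is illustrative: the "embeddings" are toy, hand-made vectors, and the paper's full method additionally learns a linear transformation to align the two domains' embedding spaces before comparing, which this sketch skips. The ambiguity score here is simply one minus the cosine similarity of a word's vector in each domain.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def ambiguity_score(vec_domain_a, vec_domain_b):
    """Higher score = the word's usage contexts diverge more across domains."""
    return 1.0 - cosine(vec_domain_a, vec_domain_b)

# Toy vectors standing in for embeddings trained on each domain's corpus.
induction_cs = [0.9, 0.1, 0.2]   # "induction" near proofs and recursion
induction_ee = [0.1, 0.9, 0.3]   # "induction" near coils and motors
government_a = [0.5, 0.5, 0.5]   # "government" used much the same way
government_b = [0.5, 0.5, 0.49]  # in both domains

print(ambiguity_score(induction_cs, induction_ee))  # relatively high
print(ambiguity_score(government_a, government_b))  # near zero
```

Words whose scores exceed some threshold would then be flagged for discussion in the requirements process.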
The researchers tested their approach on words for specific projects in five different domains: computer science (CS), electronic engineering (EE), mechanical engineering (ME), medicine (MED), and sports (SPO). They used the Wikipedia API for Python to build domain corpora, with 20,000 maximum articles for each domain.
This yielded lists of words within each project with different ambiguity ratings. Across all the domains, for instance, the word induction was the most ambiguous, while the word government was the least. Assembly was another highly ambiguous term.
The Storage Bits take
Words are one of man's oldest forms of storage. Our need to communicate, coupled with network effects, has lifted English to the closest thing we have to a global language.
When I joined Sun, one of my first team meetings surprised me: of the dozen or so people present, I was the only native speaker of English. So the ambiguity issue is very real to me.
Correcting problems early in product development is quick and cheap. Anything that makes product development faster and cheaper is a win for a civilization relying on technical progress to solve our most threatening problems.
This research also points out just how far-reaching AI techniques will be in reshaping virtually every aspect of our daily lives. I can hardly wait!
Comments welcome. What is your best ambiguous word fail?