Universities around the world have been using plagiarism-detection software for a good few years now to crack down on copied work. Submitting plagiarised work is, after all, defrauding a university for the purpose of gaining a qualification, which can be, and at least once has been, classed as a criminal offence.
Some bright sparks at the University of Arizona have applied essentially the same intelligent properties of plagiarism software to work out who writes terrorist propaganda on extremist websites, so that the authors can be brought to justice. The project has been named, quite appropriately, "Dark Web".
Plagiarism software is "intelligent" in the sense that it can see things the average human cannot, using algorithms that match words, phrases and synonyms across documents. Give someone two essays and ask them to spot the copied passages, and it's certainly not easy. A computer program, however, can draw on millions of indexed books, academic papers, web pages and previously submitted work, and analyse which parts are different, which are similar, and which have been blatantly copied.
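To give a rough idea of the kind of matching described above, here is a minimal sketch of one common technique: comparing the word n-grams of a submission against a source. This is an illustrative toy, not the approach Turnitin or Dark Web actually uses; the function names and example sentences are my own invention.

```python
from collections import Counter

def ngrams(text, n=3):
    # Lowercased word trigrams: a common unit for detecting copied runs of text
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

def overlap_score(submission, source, n=3):
    # Fraction of the submission's n-grams that also appear in the source
    sub, src = ngrams(submission, n), ngrams(source, n)
    if not sub:
        return 0.0
    shared = sum(min(count, src[g]) for g, count in sub.items())
    return shared / sum(sub.values())

original = "the quick brown fox jumps over the lazy dog near the river bank"
copied = "the quick brown fox jumps over the lazy dog by the old mill"
fresh = "completely different sentences share almost no overlapping phrases at all"

print(overlap_score(copied, original))  # high: long shared runs of words
print(overlap_score(fresh, original))   # zero: nothing in common
```

Real systems scale this idea up with fingerprinting and indexing so a submission can be checked against millions of sources at once, rather than one pairwise comparison at a time.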
Every university in the United Kingdom has access to the Turnitin software, which is delivered through a web interface that allows the submitter to hand in their work electronically and have it analysed for plagiarism before it is finally submitted. Many universities around the world have a similar scheme in place, often customised for their own purposes. This saves the lecturer and the marker a great deal of time by essentially automating the process, but it also acts as a deterrent for students tempted to copy from someone else. You really can't fool this software, though it can produce false negatives.
Some systems can also identify a style of writing: where nouns and verbs meet, added hyperbole, and the overall discourse of the text. By analysing that style against past submissions, they can distinguish who actually wrote the piece being handed in. Building on this, the University of Arizona researchers have:
"...developed various multilingual data mining, text mining, and web mining techniques to perform link analysis, content analysis, web metrics (technical sophistication) analysis, sentiment analysis, authorship analysis, and video analysis in [their] research."
The Associated Press covered a story about Dark Web and how it works; however, due to the recent state-media running of the corporation, I can't quote anything they've written without incurring a fine. Here's the article, should you wish to read it, although it only seems to work properly in Windows Internet Explorer.
One of Dark Web's key features is its near-ability to learn as it goes. One sub-project involved searching, reading and studying how terrorists learn to build certain improvised explosive devices; by learning how that material spreads and which demographics read it, the software improves the intelligence behind how its authors can be caught.
Hsinchun Chen, director of the artificial intelligence lab at the University of Arizona, spoke to The National Student about Dark Web, describing how "analysts cannot effectively analyse writing styles in cyberspace, especially multilingual writings. But using our tool, we can get about 95% accuracy, because [the project is] utilising a lot of things your naked eye cannot see."
Sounds like something James Bond could be interested in, when he's not blowing something up of course.
Edit: some formatting was out of place - just tweaked it slightly.