Tuesday, April 8, 2014

Machine Learning and the Future of Authentication

Look what machines are teaching themselves to do:

All courtesy of what is called "machine learning": the ability of artificial intelligence programs to learn from their own performance working with large data sets, enabling them to do things they are not explicitly programmed to do. Basically, machine learning gives computers the ability to generalize from experience. 

In TrendsWatch 2014, I noted that, with regard to the implications of machine learning for society, "the biggest challenge facing doctors, investment analysts, engineers, policy makers and managers is learning to trust analytic algorithms rather than their own judgement." Or, to bring it down to a personal level--would you trust a self-driving car to "take the wheel" because it is a safer driver than a human pilot?

We have good reasons to learn such trust, because it turns out that recognizing patterns in data is something that computers can be really, really good at. For example, when IBM Watson looks at a patient's symptoms, and then combs medical databases for matching cases, it isn't hobbled by preconceptions about what this patient "should" be suffering from. This frees Dr. Watson from the classic "hoofbeats=horses, not zebras" paradigm. (Which is a fine rule of thumb, unless you happen to be the rare patient actually suffering from a zebra infestation.) It also helps avoid the all-too-common situation in which a cardiologist, an oncologist, and a neurologist each attribute the same problem to the heart, cancer, and nerves, respectively. Computers don't have preconceptions, or, so far, egos.

So, I'm wondering what happens when Watson, working its computational way up Maslow's hierarchy of needs, at last turns its attention to art. Art authentication is an increasingly fraught field, with artist-specific foundations, collectors and experts tangling over who has the final word on attribution. Some foundations, like the Pollock-Krasner and the Warhol, have ceased doing authentication, whether from fear of lawsuits or other concerns.

Often authentication rests on induction from objective evidence: the age and chemical nature of paint or canvas, or the presence of accidental inclusions such as pollen or hair. Even those clues, however, may only help to detect conscious, asynchronous forgeries. What about the endless reclassification of works into "by [insert great artist here]" versus "from the workshop of..." with attendant vast swings in monetary value and prestige?

When it comes down to recognizing what we have previously described as "style"--the ineffable quality that can only be recognized by instinct and training--I wonder if museums are going to need to learn to trust analytic algorithms rather than their own judgement. While art historians aren't driven by the need to process huge amounts of material (which has fueled the application of machine learning to text classification), they certainly could use an unbiased arbiter with no skin in the game. (Heck, in Watson's case, with no skin at all...)


Rich Cherry said...

I think this is a really important subject for museums to explore.

At our recent Museums and the Web Deep Dive on email archiving in art museums, we had Neal Fishman, who is the Program Director for Data Based Pathology within IBM's Software Group, talking about how we could use systems like Watson to help us mine email, now and in the future, to enhance scholarly research while protecting the privacy of email senders.

The museum field has a burgeoning problem in the loss of archival material that is being created digitally in the form of email. The scale of the problem is staggering when you realize that the Clinton presidential library holds only a small quantity of emails, while the Obama Library will likely have more than a billion. So if we solve the archiving problem, retrieval using AI might be the only way to handle the sheer volume of emails.

The Alliance's Center for the Future of Museums said...

Thanks for extending the conversation, Rich. Digital archives are a great example of data that is too big for museums to handle with traditional management methods. Can you recommend any links to resources/articles mentioned in the Deep Dive that readers can follow to explore this topic further?