February 2017
The Label Maker...is You
Appreciating How Metadata Makes AI Possible
Therese Sullivan, Principal, |
One of the most re-watched episodes of the comedy series Seinfeld is
‘The Label Maker,’ in which Elaine’s Christmas gift to a friend is
re-gifted to Jerry before the Super Bowl. True, a thing for tagging
other things was not a fun or romantic gift in 1995. And perhaps Bryan
Cranston’s fictional dentist didn’t get it. But Julia Louis-Dreyfus’s
Elaine was way out ahead in her thinking. Label making is important! If
you are a data wrangler today, you should appreciate any gift that
helps you tag things with metadata labels.
Frank Chen of Silicon Valley venture capital firm Andreessen Horowitz
(a16z) presents a timeline of how labeling became cool in his AI and deep learning mini-course. Released to all
interested students in June 2016, it is a fantastic history lesson and
a primer on what is happening in artificial intelligence (AI) today. He
writes:
“One person, in a literal garage, building a self-driving car.” That happened in 2015. Now to put that fact in context, compare this to 2004, when DARPA sponsored the very first driverless car Grand Challenge. Of the 20 entries they received then, the winning entry went 7.2 miles; in 2007, in the Urban Challenge, the winning entries went 60 miles under city-like constraints. Things are clearly progressing rapidly when it comes to machine intelligence. But how did we get here, after not one but multiple “A.I. winters”? What’s the breakthrough? And why is Silicon Valley buzzing about artificial intelligence again?
The same AI entering cars is impacting buildings too. Listen to Ken Sinclair discuss the surprising rate of
innovation in his latest ControlTrends interview. Chen answers his
own question this way: more compute power, more data, better algorithms,
and more investment. His research colleague at a16z, Ben Evans, explores
the topic of labeling in more depth in the blog post AI, Apple, and Google:
Did you catch that? The speech and image recognition
technology may be superficial eye-candy compared to the feat of putting
together the underlying knowledge graph. In other words, how you
classify and label objects is at the core of how well your AI works.
Knowledge graphs for the World Wide Web are the domain of semantic web
researchers. Three leading professors in the field from the University
of Zurich, Rensselaer Polytechnic Institute, and Stanford University
collaborated on the September 2016 article, A New Look at the Semantic Web. They say:
Labeling, edge computing, artificial intelligence: these are three
pieces of the same puzzle, and that puzzle seems to be coming together
very fast right now. (Don’t miss the recent slideshow of another a16z
thinker, Peter Levine, on how edge computing will soon eclipse the
cloud.) The concepts and timing that Silicon Valley’s a16z thought
leaders describe are as applicable to buildings as they are to cars,
dogs, and beaches. And the academics leading the semantic web
conversation are saying that the mark-up languages and metadata schemas
are coming from all corners of the web, not just ivory towers. Frank
Chen points out that the latest Google image recognition algorithms can
chow down on the entire collection of videos on YouTube. But when they
do that, they get a graph that skews in favor of cats doing funny
things. That doesn’t reflect the real world. Call the underlying ML
labeling technology that does the classifying what you like: knowledge
graphs, metadata schemas, or neural nets. The versions that work best
reflect the collective wisdom and first-hand evidence of those with
physical-world experience.
This brings us to Project Haystack, the open-source organization
launched in 2011 and devoted to developing a standard mark-up language
and a tagging schema for devices in commercial buildings. Given the
core importance to AI of getting standardized labeling right the first
time, it is no surprise that academia and big IT picked up on the
Haystack schema when they launched the Brick schema. One way to look at
it is that there is more industry, academic, and government energy,
focus, and money being invested in label-making than ever before. What a
gift! Seinfeld’s Elaine Benes would be such a supporter if she were here
today. And even the dentist who became Walter White of Breaking Bad
would not under-appreciate it. I hope more of those who hold the
evidence and wisdom to contribute get involved. (HaystackConnect 2017 in
Tampa in May is a good opportunity to do that.) Siloing data was the way
business was conducted in the last innovation cycle, but it won’t work
going forward in the age of AI and machine learning (ML).
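For readers who have not seen building metadata up close, here is a rough sketch in Python of what tag-based labeling can look like. It is an illustration only, not Project Haystack's own tooling: the tag names are chosen to resemble Haystack-style conventions, while the entity IDs, display names, and the helper functions find_by_tags and display_names are hypothetical.

```python
# A minimal sketch of tag-based labeling for building devices.
# Assumption: each entity is a plain dict of tags, and "marker" tags
# (site, equip, point, ahu, ...) are modeled simply as True values.
# All IDs and display names below are made up for illustration.
ENTITIES = [
    {"id": "site-1", "dis": "Example Office", "site": True},
    {"id": "ahu-1", "dis": "AHU-1", "equip": True, "ahu": True,
     "siteRef": "site-1"},
    {"id": "pt-1", "dis": "AHU-1 Discharge Air Temp", "point": True,
     "sensor": True, "discharge": True, "air": True, "temp": True,
     "equipRef": "ahu-1"},
    {"id": "pt-2", "dis": "AHU-1 Return Air Temp", "point": True,
     "sensor": True, "return": True, "air": True, "temp": True,
     "equipRef": "ahu-1"},
]


def find_by_tags(entities, *tags):
    """Return the entities that carry every one of the given tags."""
    return [e for e in entities if all(tag in e for tag in tags)]


def display_names(entities):
    """Return the human-readable 'dis' label of each entity."""
    return [e["dis"] for e in entities]


if __name__ == "__main__":
    # Every temperature point in the building, whatever vendor named it.
    print(display_names(find_by_tags(ENTITIES, "point", "temp")))
    # Only the discharge air temperature sensors.
    print(display_names(find_by_tags(ENTITIES, "discharge", "air", "temp")))
```

The point of the sketch is that once devices carry consistent labels, software can ask for "all discharge air temperature sensors" without knowing anything about how a particular vendor or contractor named them.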
Another reason to do your part: tomorrow, there may not be chief
marketing officers and chief technology officers, but rather chief
labelers of marketing things, chief labelers of technology things, and
so on. The labeling of training data for machine-learning algorithms is
about to consume us all, or at least everyone who works with computers,
mobile phones, and Internet-of-Things devices. So it is best to get
ahead of the game.