Gavin Miller: Adobe Research | Lex Fridman Podcast #23


00:00:00.000 | The following is a conversation with Gavin Miller.
00:00:02.400 | He's the head of Adobe Research.
00:00:04.840 | Adobe has empowered artists, designers, and creative minds
00:00:07.720 | from all professions working in the digital medium
00:00:10.560 | for over 30 years with software such as Photoshop,
00:00:13.680 | Illustrator, Premiere, After Effects, InDesign, Audition,
00:00:17.600 | software that works with images, video, and audio.
00:00:21.300 | Adobe Research is working to define the future evolution
00:00:24.520 | of these products in a way that makes the life
00:00:26.800 | of creatives easier, automates the tedious tasks,
00:00:29.780 | and gives more and more time to operate in the idea space
00:00:33.440 | instead of pixel space.
00:00:35.120 | This is where the cutting edge deep learning methods
00:00:37.620 | of the past decade can really shine
00:00:39.580 | more than perhaps any other application.
00:00:42.140 | Gavin is the embodiment of combining tech and creativity.
00:00:46.620 | Outside of Adobe Research, he writes poetry
00:00:48.940 | and builds robots, both things that are near and dear
00:00:52.540 | to my heart as well.
00:00:54.300 | This conversation is part
00:00:55.860 | of the Artificial Intelligence Podcast.
00:00:57.860 | If you enjoy it, subscribe on YouTube, iTunes,
00:01:00.960 | or simply connect with me on Twitter
00:01:02.920 | at Lex Fridman, spelled F-R-I-D.
00:01:06.060 | And now, here's my conversation with Gavin Miller.
00:01:09.860 | You're head of Adobe Research,
00:01:12.940 | leading a lot of innovative efforts in applications of AI,
00:01:15.980 | creating images, video, audio, language,
00:01:20.200 | but you're also yourself an artist, a poet, a writer,
00:01:24.400 | and even a roboticist.
00:01:25.600 | So, while I promise to everyone listening
00:01:28.760 | that I will not spend the entire time we have together
00:01:31.400 | reading your poetry, which I love,
00:01:33.560 | I have to sprinkle it in at least a little bit.
00:01:35.880 | So, some of them are pretty deep and profound,
00:01:39.320 | and some are light and silly.
00:01:40.520 | Let's start with a few lines from the silly variety.
00:01:43.820 | You write in Je Ne Regrette Rien,
00:01:49.720 | a poem that beautifully parodies both
00:01:52.600 | Edith Piaf's Je Ne Regrette Rien
00:01:55.360 | and "My Way" by Frank Sinatra.
00:01:56.840 | So, it opens with, "And now, dessert is near.
00:02:01.840 | "It's time to pay the final total.
00:02:05.020 | "I've tried to slim all year,
00:02:07.660 | "but my diets have been anecdotal."
00:02:10.820 | So, where does that love for poetry come from for you?
00:02:15.260 | And if we dissect your mind,
00:02:17.740 | how does it all fit together in the bigger puzzle
00:02:19.760 | of Dr. Gavin Miller?
00:02:22.360 | - Well, interesting you chose that one.
00:02:25.520 | That was a poem I wrote when I'd been to my doctor
00:02:28.120 | and he said, "You really need to lose some weight
00:02:29.640 | "and go on a diet."
00:02:30.880 | And whilst the rational part of my brain wanted to do that,
00:02:35.160 | the irrational part of my brain was protesting
00:02:37.160 | and sort of embraced the opposite idea.
00:02:39.360 | - I regret nothing, hence.
00:02:40.680 | - Yes, exactly.
00:02:41.520 | Taken to an extreme, I thought it would be funny.
00:02:43.580 | Obviously, it's a serious topic for some people.
00:02:46.320 | But I think for me, I've always been interested in writing
00:02:52.020 | since I was in high school,
00:02:53.160 | as well as doing technology and invention.
00:02:55.920 | And sometimes there are parallel strands in your life
00:02:58.320 | that carry on and one is more about your private life
00:03:01.440 | and one's more about your technological career.
00:03:05.800 | And then at sort of happy moments along the way,
00:03:08.400 | sometimes the two things touch.
00:03:09.840 | One idea informs the other.
00:03:11.640 | And we can talk about that as we go.
00:03:15.000 | - Do you think your writing, the art, the poetry
00:03:17.120 | contribute indirectly or directly to your research,
00:03:20.440 | to your work in Adobe?
00:03:22.100 | - Well, sometimes it does if I say,
00:03:24.520 | imagine a future in a science fiction kind of way.
00:03:28.180 | And then once it exists on paper,
00:03:30.140 | I think, well, why shouldn't I just build that?
00:03:32.500 | There was an example where,
00:03:35.820 | when realistic voice synthesis first started in the 90s
00:03:39.620 | at Apple, where I worked in research,
00:03:42.020 | it was done by a friend of mine.
00:03:44.140 | I sort of sat down and started writing a poem
00:03:46.340 | which each line I would enter into the voice synthesizer
00:03:49.380 | and see how it sounded and sort of wrote it for that voice.
00:03:53.240 | And at the time the agents weren't very sophisticated.
00:03:57.280 | So they'd sort of add random intonation
00:03:59.140 | and I kind of made up the poem
00:04:01.640 | to sort of match the tone of the voice.
00:04:03.520 | And it sounded slightly sad and depressed.
00:04:05.840 | So I pretended it was a poem written
00:04:08.080 | by an intelligent agent,
00:04:09.660 | sort of telling the user to go home and leave them alone.
00:04:13.520 | But at the same time, they were lonely
00:04:14.960 | and wanted to have company and learn
00:04:16.420 | from what the user was saying.
00:04:18.240 | And at the time it was way beyond anything
00:04:20.420 | that AI could possibly do.
00:04:21.860 | But since then,
00:04:23.880 | it's becoming more within the bounds of possibility.
00:04:28.040 | And then at the same time,
00:04:31.140 | I had a project at home where I did sort of a smart home.
00:04:34.320 | This was probably '93, '94.
00:04:36.900 | And I had the talking voice.
00:04:39.060 | It'd remind me when I walked in the door
00:04:40.660 | of what things I had to do.
00:04:42.020 | I had buttons on my washing machine
00:04:44.580 | 'cause I was a bachelor
00:04:45.420 | and I'd leave the clothes in there for three days
00:04:46.900 | and they'd go moldy.
00:04:47.740 | So as I got up in the morning,
00:04:49.740 | it would say, "Don't forget the washing," and so on.
00:04:52.580 | I made photo albums that used light sensors
00:04:57.100 | to know which page you were looking at,
00:04:58.580 | would send that over wireless radio to the agent
00:05:01.140 | who would then play sounds that matched the image
00:05:03.460 | you were looking at in the book.
00:05:05.180 | So I was kind of in love with this idea of magical realism
00:05:07.960 | and whether it was possible to do that with technology.
00:05:11.120 | So that was a case where the agent
00:05:14.820 | sort of intrigued me from a literary point of view
00:05:16.980 | and became a personality.
00:05:18.860 | I think more recently, I've also written plays.
00:05:23.100 | And when in plays you write dialogue,
00:05:24.660 | and obviously you write a fixed set of dialogue
00:05:27.140 | that follows a linear narrative.
00:05:29.180 | But with modern agents, as you design a personality
00:05:32.300 | or a capability for conversation,
00:05:33.980 | you're sort of thinking of,
00:05:35.460 | I kind of have imaginary dialogue in my head.
00:05:37.580 | And then I think, what would it take
00:05:39.440 | not only to have that be real,
00:05:41.580 | but for it to really know what it's talking about?
00:05:44.300 | So it's easy to fall into the uncanny valley
00:05:46.860 | with AI where it says something
00:05:48.900 | it doesn't really understand,
00:05:50.220 | but it sounds good to the person.
00:05:51.660 | But you rapidly realize that it's kind of just
00:05:55.620 | stimulus response, it doesn't really have
00:05:57.940 | real world knowledge about the thing it's describing.
00:06:00.740 | And so when you get to that point,
00:06:04.140 | it really needs to have multiple ways
00:06:05.820 | of talking about the same concept.
00:06:07.260 | So it sounds as though it really understands it.
00:06:09.340 | Now, what really understanding means
00:06:10.980 | is in the eye of the beholder, right?
00:06:13.140 | But if it only has one way of referring to something,
00:06:16.260 | it feels like it's a canned response.
00:06:17.900 | But if it can reason about it,
00:06:19.900 | or you can go at it from multiple angles
00:06:22.220 | and give a similar kind of response that people would,
00:06:24.500 | then it starts to seem more like
00:06:27.460 | there's something there that's sentient.
00:06:31.060 | - You can say the same thing,
00:06:32.380 | multiple things from different perspectives.
00:06:34.780 | I mean, with the automatic image captioning
00:06:37.180 | that I've seen the work that you're doing,
00:06:38.940 | there's elements of that, right?
00:06:40.320 | Being able to generate different kinds of statements
00:06:43.060 | about the same feature. - Right, so in my team,
00:06:44.700 | there's a lot of work on turning a medium
00:06:47.620 | from one form to another,
00:06:48.740 | whether it's auto-tagging imagery
00:06:50.460 | or making up full sentences about what's in the image,
00:06:54.260 | then changing the sentence,
00:06:56.300 | finding another image that matches the new sentence
00:06:58.820 | or vice versa.
00:06:59.700 | And in the modern world of GANs,
00:07:03.180 | you sort of give it a description
00:07:04.820 | and it synthesizes an asset that matches the description.
00:07:08.380 | So I've sort of gone on a journey.
00:07:11.440 | My early days in my career were about 3D computer graphics,
00:07:14.660 | the sort of pioneering work,
00:07:15.900 | sort of before movies had special effects
00:07:18.220 | done with 3D graphics,
00:07:19.660 | and sort of rode that revolution.
00:07:21.820 | And that was very much like the Renaissance
00:07:24.460 | where people would model light and color and shape
00:07:26.820 | and everything.
00:07:27.900 | And now we're kind of in another wave
00:07:29.780 | where it's more impressionistic
00:07:31.280 | and it's sort of the idea of something
00:07:33.660 | can be used to generate an image directly,
00:07:35.560 | which is sort of the new frontier
00:07:39.100 | in computer image generation using AI algorithms.
00:07:43.780 | - So the creative process is more in the space of ideas
00:07:46.620 | or becoming more in the space of ideas
00:07:48.180 | versus in the raw pixels.
00:07:50.820 | - Well, it's interesting.
00:07:51.940 | It depends.
00:07:52.780 | I think at Adobe,
00:07:53.600 | we really want to span the entire range
00:07:55.420 | from really, really good,
00:07:57.580 | what you might call low-level tools,
00:07:59.060 | and by low-level, I mean as close to, say,
00:08:00.700 | analog workflows as possible.
00:08:02.340 | So what we do there is we make up systems
00:08:05.860 | that do really realistic oil paint
00:08:07.540 | and watercolor simulation.
00:08:08.780 | So if you want every bristle to behave
00:08:11.020 | as it would in the real world
00:08:12.140 | and leave a beautiful analog trail of water
00:08:15.540 | and then flow after you've made the brushstroke,
00:08:17.940 | you can do that.
00:08:18.900 | And that's really important for people
00:08:20.500 | who want to create something really expressive
00:08:23.700 | or really novel,
00:08:24.660 | 'cause they have complete control.
00:08:26.860 | And then as certain other tasks become automated,
00:08:30.740 | it frees the artists up to focus on the inspiration
00:08:34.140 | and less of the perspiration.
00:08:35.660 | So I'm thinking about different ideas, obviously.
00:08:40.980 | Once you finish the design,
00:08:42.700 | there's a lot of work to, say,
00:08:45.060 | do it for all the different aspect ratios
00:08:46.580 | of phones or websites and so on.
00:08:49.080 | And that used to take up an awful lot of time for artists.
00:08:52.060 | It still does for many,
00:08:53.380 | what we call content velocity.
00:08:55.260 | And one of the targets of AI is actually to reason about,
00:08:59.500 | from the first example, what the layout intent is
00:09:02.340 | for these other formats?
00:09:03.740 | Maybe if you change the language to German
00:09:06.340 | and the words are longer,
00:09:08.160 | how do you reflow everything
00:09:09.460 | so that it looks nicely artistic in that way?
00:09:12.340 | And so the person can focus on
00:09:14.620 | the really creative bit in the middle,
00:09:15.940 | which is what is the look and style and feel
00:09:18.460 | and what's the message
00:09:19.300 | and what's the story and the human element?
00:09:21.580 | So I think creativity is changing.
00:09:25.440 | So that's one way in which we're trying
00:09:26.940 | to just make it easier and faster and cheaper to do
00:09:29.380 | so that there can be more of it,
00:09:31.620 | more demand 'cause it's less expensive.
00:09:33.580 | So everyone wants beautiful artwork for everything
00:09:36.100 | from a school website to a Hollywood movie.
00:09:39.540 | On the other side,
00:09:41.820 | as some of these things have automatic versions of them,
00:09:45.500 | people will possibly change role
00:09:48.060 | from being the hands-on artisan
00:09:50.860 | to being either the art director or the conceptual artist.
00:09:53.700 | And then the computer will be a partner
00:09:55.940 | to help create polished examples
00:09:57.960 | of the idea that they're exploring.
00:10:00.060 | - Let's talk about Adobe products first,
00:10:01.740 | like AI and Adobe products.
00:10:03.460 | Just so you know where I'm coming from,
00:10:06.300 | I'm a huge fan of Photoshop for images,
00:10:09.820 | Premiere for video, Audition for audio.
00:10:12.860 | I'll probably use Photoshop
00:10:15.020 | to create the thumbnail for this video,
00:10:16.940 | Premiere to edit the video,
00:10:18.060 | Audition to do the audio.
00:10:20.160 | That said, everything I do is really manual.
00:10:24.700 | And I set up, I use this old school Kinesis keyboard
00:10:27.540 | and I have AutoHotKey that just,
00:10:29.620 | it's really about optimizing the flow
00:10:33.340 | of just making sure there's as few clicks as possible
00:10:36.260 | to just being extremely efficient.
00:10:37.820 | It's something you started to speak to.
00:10:40.420 | So before we get into the fun,
00:10:42.380 | sort of awesome deep learning things,
00:10:45.580 | where does AI, if you could speak a little more to it,
00:10:48.020 | AI or just automation in general,
00:10:51.100 | do you see, in the coming months and years
00:10:54.540 | or in general going forward,
00:10:58.580 | fitting into making the life,
00:11:00.580 | the low-level pixel workflow, easier?
00:11:03.940 | - Yeah, that's a great question.
00:11:04.980 | So we have a very rich array of algorithms already
00:11:09.060 | in Photoshop, just classical procedural algorithms
00:11:12.540 | as well as ones based on data.
00:11:14.620 | In some cases, they end up with a large number of sliders
00:11:19.020 | and degrees of freedom.
00:11:20.100 | So one way in which AI can help is just an auto button,
00:11:23.380 | which comes up with default settings
00:11:25.260 | based on the content itself,
00:11:26.660 | rather than default values for the tool.
00:11:29.740 | At that point, you then start tweaking.
00:11:31.900 | So that's a very kind of make life easier for people
00:11:36.140 | whilst making use of common sense from other example images.
00:11:39.700 | - So like smart defaults.
00:11:41.020 | - Smart defaults, absolutely.
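
As a rough illustration of that kind of "auto" button, here is a minimal sketch of deriving starting slider values from simple image statistics rather than fixed tool defaults. The slider names and formulas are invented for illustration; a shipping feature would presumably learn this mapping from example edits.

```python
# Minimal sketch of a content-based "auto" button: instead of fixed slider
# defaults, derive starting values from simple image statistics (a crude
# stand-in for a model trained on example edits). Slider names are invented.
import numpy as np


def auto_defaults(image: np.ndarray) -> dict:
    gray = image.mean(axis=-1)
    return {
        "exposure": float(np.clip(0.5 - gray.mean() / 255.0, -1.0, 1.0)),  # brighten dark shots
        "contrast": float(np.clip(0.5 - gray.std() / 128.0, 0.0, 1.0)),    # boost flat histograms
    }


# Example: a uniformly dark photo gets a positive exposure nudge,
# and the user then tweaks from there.
dark_photo = np.full((100, 100, 3), 40, dtype=np.uint8)
print(auto_defaults(dark_photo))
```
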
00:11:43.420 | Another one is something we've spent a lot of work
00:11:46.980 | over the last 20 years I've been at Adobe,
00:11:49.580 | well 19, thinking about selection, for instance,
00:11:53.060 | where, you know, with a quick select,
00:11:56.460 | you would look at color boundaries
00:11:58.820 | and figure out how to sort of flood fill into regions
00:12:01.420 | that you thought were physically connected
00:12:03.140 | in the real world.
00:12:04.780 | But that algorithm had no visual common sense
00:12:06.980 | about what a cat looks like or a dog.
00:12:08.700 | It would just do it based on rules of thumb,
00:12:10.620 | which were applied to graph theory.
00:12:12.940 | And it was a big improvement over the previous work
00:12:16.180 | where you had sort of almost click everything by hand,
00:12:19.140 | or if it just did similar colors,
00:12:21.100 | it would do little tiny regions
00:12:22.500 | that wouldn't be connected.
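
A toy illustration of the classical idea described here: flood-filling outward from a clicked pixel across similar colors. The real Quick Select uses graph-based optimization rather than a fixed threshold, so treat this only as a sketch of the rule-of-thumb approach.

```python
# Toy "quick select": BFS flood fill from a seed pixel, adding neighbors
# whose color is within a tolerance of the seed color. No visual common
# sense about cats or dogs, just connectivity and color similarity.
from collections import deque

import numpy as np


def quick_select(image: np.ndarray, seed: tuple, tol: float = 30.0) -> np.ndarray:
    """Return a boolean mask of pixels connected to `seed` within `tol` color distance."""
    h, w, _ = image.shape
    seed_color = image[seed].astype(float)
    mask = np.zeros((h, w), dtype=bool)
    queue = deque([seed])
    mask[seed] = True
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx]:
                if np.linalg.norm(image[ny, nx].astype(float) - seed_color) <= tol:
                    mask[ny, nx] = True
                    queue.append((ny, nx))
    return mask


# Example: select the dark square in a synthetic two-region image.
img = np.full((64, 64, 3), 255, dtype=np.uint8)
img[16:48, 16:48] = (40, 40, 40)
print(quick_select(img, seed=(32, 32)).sum(), "pixels selected")  # 1024
```
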
00:12:24.540 | But in the future, using neural nets
00:12:26.940 | to actually do a great job with say a single click,
00:12:30.220 | or even in the case of well-known categories
00:12:32.860 | like people or animals, no click,
00:12:34.700 | where you just say, select the object
00:12:36.740 | and it just knows the dominant object
00:12:38.420 | as a person in the middle of the photograph.
00:12:41.060 | Those kinds of things are really valuable
00:12:45.300 | if they can be robust enough
00:12:46.780 | to give you good quality results.
00:12:48.420 | Or they can be a great start for like tweaking it.
00:12:51.980 | - So for example, background removal.
00:12:54.300 | - Correct.
00:12:55.140 | - Like one thing I'll, in a thumbnail,
00:12:57.420 | I'll take a picture of you right now
00:12:59.300 | and essentially remove the background behind you.
00:13:01.580 | And I wanna make that as easy as possible.
00:13:04.020 | Now you don't have flowing hair,
00:13:06.820 | like rich-- - Sadly, at the moment, yes.
00:13:08.540 | - Rich sort of--
00:13:09.660 | - I had it in the past,
00:13:10.580 | it may come again in the future, but for now.
00:13:12.580 | (laughing)
00:13:13.620 | - So that sometimes makes it a little more challenging
00:13:16.060 | to remove the background.
00:13:17.420 | How difficult do you think that problem is for AI,
00:13:20.020 | for basically making the quick selection tool
00:13:23.740 | smarter and smarter and smarter?
00:13:25.140 | - Well, we have a lot of research on that already.
00:13:28.420 | If you want a sort of quick, cheap and cheerful,
00:13:32.580 | look, I'm pretending I'm in Hawaii,
00:13:34.220 | but it's sort of a joke,
00:13:35.780 | then you don't need perfect boundaries.
00:13:37.540 | And you can do that today with a single click
00:13:39.300 | with the algorithms we have.
00:13:40.740 | We have other algorithms where,
00:13:44.260 | with a little bit more guidance on the boundaries,
00:13:46.580 | like you might need to touch it up a little bit.
00:13:49.980 | We have other algorithms that can pull a nice mat
00:13:52.580 | from a crude selection.
00:13:54.420 | So we have combinations of tools that can do all of that.
00:13:58.660 | And at our recent MAX conference, Adobe MAX,
00:14:02.860 | we demonstrated how very quickly,
00:14:05.620 | just by drawing a simple polygon
00:14:07.620 | around the object of interest,
00:14:09.300 | we could not only do it for a single still,
00:14:11.820 | but we could pull a mat,
00:14:14.420 | well, pull at least a selection mask from a moving target,
00:14:17.980 | like a person dancing in front of a brick wall or something.
00:14:21.060 | And so it's going from hours to a few seconds
00:14:24.420 | for workflows that are really nice.
00:14:27.820 | And then you might go in and touch up a little.
00:14:30.300 | - So that's a really interesting question.
00:14:31.860 | You mentioned the word robust.
00:14:33.460 | You know, there's like a journey for an idea, right?
00:14:37.660 | And what you presented probably at Max
00:14:40.260 | has elements of just sort of, it inspires the concept.
00:14:44.540 | It can work pretty well in majority of cases.
00:14:47.340 | But how do you make something that works,
00:14:49.380 | well, in majority of cases,
00:14:50.940 | how do you make something that works maybe in all cases
00:14:54.580 | or it becomes a robust tool that can-
00:14:56.660 | - There are a couple of things.
00:14:57.620 | So that really touches on the difference
00:15:00.500 | between academic research and industrial research.
00:15:02.980 | So in academic research, it's really about
00:15:05.580 | who's the person to have the great new idea
00:15:07.620 | that shows promise.
00:15:09.420 | And we certainly love to be those people too,
00:15:12.380 | but we have sort of two forms of publishing.
00:15:15.100 | One is academic peer review,
00:15:16.900 | which we do a lot of,
00:15:17.940 | and we have great success there
00:15:19.340 | as much as some universities.
00:15:21.700 | But then we also have shipping,
00:15:24.820 | which is a different type of peer.
00:15:26.780 | And then we get customer review
00:15:29.180 | as well as product critics.
00:15:30.820 | And that might be a case where it's not about
00:15:35.820 | being perfect every single time,
00:15:37.420 | but perfect enough of the time,
00:15:39.580 | plus a mechanism to intervene and recover
00:15:41.820 | where you do have mistakes.
00:15:43.380 | So we have the luxury of very talented customers.
00:15:46.140 | We don't want them to be overly taxed
00:15:49.660 | doing it every time.
00:15:50.700 | But if they can go in and just take it from 99 to 100
00:15:55.700 | with the touch of a mouse or something,
00:15:59.060 | then for the professional end,
00:16:00.900 | that's something that we definitely
00:16:02.300 | want to support as well.
00:16:03.940 | And for them, it went from having to do that
00:16:06.420 | tedious task all the time to much less often.
00:16:09.860 | So I think that gives us an out.
00:16:12.740 | If it had to be 100% automatic all the time,
00:16:15.940 | then that would delay the time
00:16:17.540 | at which we could get to market.
00:16:19.780 | - So on that thread, maybe you can untangle something.
00:16:23.820 | Again, I'm sort of just speaking to my own experience.
00:16:27.380 | - Yeah, no, that's fine.
00:16:29.140 | - Maybe that is the most useful I can get at.
00:16:32.020 | So I think Photoshop, as an example, or Premiere,
00:16:36.380 | has a lot of amazing features that I haven't touched.
00:16:42.060 | And so what's the, in terms of AI,
00:16:44.300 | helping make my life or the life of creatives easier,
00:16:49.300 | how, this collaboration between human and machine,
00:16:54.660 | how do you learn to collaborate better?
00:16:57.380 | How do you learn the new algorithms?
00:17:00.140 | Is it something where you have to watch tutorials
00:17:02.860 | and you have to watch videos and so on?
00:17:04.980 | Or do you ever think, do you think about
00:17:08.140 | the experience itself through exploration,
00:17:10.260 | being the teacher?
00:17:11.140 | - We absolutely do.
00:17:12.180 | So we, I'm glad that you brought this up.
00:17:16.980 | We sort of think about two things.
00:17:18.900 | One is helping the person in the moment
00:17:20.740 | to do the task that they need to do,
00:17:22.060 | but the other is thinking more holistically
00:17:23.900 | about their journey learning a tool.
00:17:25.940 | And we think of it as Adobe University,
00:17:28.500 | where if you use the tool long enough, you become an expert.
00:17:31.220 | And not necessarily an expert in everything.
00:17:33.180 | It's like living in a city.
00:17:34.300 | You don't necessarily know every street,
00:17:35.860 | but you know the important ones you need to get to.
00:17:38.420 | So we have projects in research,
00:17:40.860 | which actually look at the thousands of hours
00:17:42.940 | of tutorials online and try to understand
00:17:45.500 | what's being taught in them.
00:17:47.220 | And then we had one publication at CHI
00:17:49.820 | where it was looking at,
00:17:51.500 | given the last three or four actions you did,
00:17:55.140 | what did other people in tutorials do next?
00:17:57.540 | So if you want some inspiration for what you might do next,
00:18:00.820 | or you just want to watch the tutorial and see,
00:18:03.300 | learn from people who are doing similar workflows to you,
00:18:05.620 | you can without having to go and search
00:18:07.740 | on keywords and everything.
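
A minimal sketch of that idea: treat each tutorial as a sequence of tool actions, count what follows each short action prefix, and suggest the most common continuations of the user's recent actions. This toy n-gram counter stands in for whatever richer model the actual research uses, and the action names are invented.

```python
# Toy next-action suggester mined from tutorials, modeled as an n-gram
# lookup: prefix of the last n actions -> counts of what came next.
from collections import Counter, defaultdict


def build_model(tutorials: list, n: int = 3) -> dict:
    nexts = defaultdict(Counter)
    for actions in tutorials:
        for i in range(len(actions) - n):
            nexts[tuple(actions[i:i + n])][actions[i + n]] += 1
    return nexts


def suggest(model: dict, recent: list, n: int = 3, k: int = 3) -> list:
    """Top-k most common next actions after the user's recent actions."""
    return [a for a, _ in model.get(tuple(recent[-n:]), Counter()).most_common(k)]


tutorials = [
    ["open", "select_subject", "mask", "curves", "export"],
    ["open", "select_subject", "mask", "dodge", "export"],
]
model = build_model(tutorials)
print(suggest(model, ["open", "select_subject", "mask"]))  # ['curves', 'dodge']
```
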
00:18:09.340 | So really trying to use the context of your use of the app
00:18:14.020 | to make intelligent suggestions,
00:18:16.620 | either about choices that you might make,
00:18:19.140 | or in a more assistive way where it could say,
00:18:24.060 | if you did this next, we could show you.
00:18:25.780 | And that's basically the frontier
00:18:27.700 | that we're exploring now,
00:18:28.780 | which is if we really deeply understand the domain
00:18:31.940 | in which designers and creative people work,
00:18:35.380 | can we combine that with AI
00:18:37.740 | and pattern matching of behavior
00:18:39.540 | to make intelligent suggestions,
00:18:42.020 | either through verbal possibilities,
00:18:47.020 | or just showing the results of if you try this.
00:18:49.580 | And that's really the sort of,
00:18:51.820 | I was in a meeting today thinking about these things.
00:18:55.460 | - Well, I-- - So it's still
00:18:56.940 | a grand challenge.
00:18:58.100 | We'd all love an artist over one shoulder
00:19:00.300 | and a teacher over the other, right?
00:19:02.260 | And we hope to get there.
00:19:05.340 | And the right thing to do is to give enough at each stage
00:19:09.100 | that it's useful in itself,
00:19:10.300 | but it builds a foundation
00:19:11.420 | for the next level of expectation.
00:19:14.540 | - Are you aware of this gigantic medium of YouTube
00:19:19.260 | that's creating just a bunch of creative people,
00:19:22.860 | both artists and teachers of different kinds?
00:19:26.340 | - Absolutely.
00:19:27.180 | And the more we can understand those media types,
00:19:29.900 | both visually and in terms of transcripts and words,
00:19:33.220 | the more we can bring the wisdom that they embody
00:19:35.780 | into the guidance that's embedded in the tool.
00:19:38.300 | - That'll be brilliant.
00:19:39.740 | To remove the barrier from having to yourself
00:19:42.740 | type in the keyword, searching, so on.
00:19:45.340 | - Absolutely.
00:19:46.180 | And then in the longer term,
00:19:48.060 | an interesting discussion is,
00:19:50.580 | does it ultimately not just assist
00:19:52.780 | with learning the interface we have,
00:19:54.100 | but does it modify the interface to be simpler?
00:19:57.020 | Or do you fragment into a variety of tools,
00:19:59.340 | each of which has a different level
00:20:01.220 | of visibility of the functionality?
00:20:04.460 | I like to say that if you add a feature to a GUI,
00:20:07.900 | you have to have yet more visual complexity
00:20:11.500 | confronting the new user.
00:20:12.740 | Whereas if you have an assistant with a new skill,
00:20:15.780 | if you know they have it, so you know to ask for it,
00:20:18.260 | then it's sort of additive without being more intimidating.
00:20:21.820 | So we definitely think about new users
00:20:24.300 | and how to onboard them.
00:20:26.540 | Many actually value the idea
00:20:28.260 | of being able to master that complex interface
00:20:31.180 | and keyboard shortcuts like you were talking about earlier,
00:20:34.420 | because with great familiarity,
00:20:36.820 | it becomes a musical instrument
00:20:38.260 | for expressing your visual ideas.
00:20:40.540 | And other people just want to get something done quickly
00:20:44.620 | in the simplest way possible.
00:20:45.940 | And that's where a more assistive version
00:20:48.340 | of the same technology might be useful,
00:20:50.460 | maybe on a different class of device,
00:20:52.540 | which is more in context for capture, say.
00:20:54.840 | Whereas somebody who's in a deep post-production workflow
00:20:58.780 | maybe want to be on a laptop or a big screen desktop
00:21:03.300 | and have more knobs and dials
00:21:07.260 | to really express the subtlety of what they want to do.
00:21:10.820 | - So there's so many exciting applications
00:21:14.820 | of computer vision and machine learning
00:21:16.420 | that Adobe is working on.
00:21:18.540 | Like scene stitching, sky replacement,
00:21:21.260 | foreground, background removal,
00:21:23.300 | spatial object-based image search,
00:21:25.780 | automatic image captioning, like we mentioned,
00:21:27.860 | Project Cloak, Project Deep Fill,
00:21:30.060 | filling in parts of the images,
00:21:32.020 | Project Scribbler, Style Transfer Video,
00:21:34.940 | Style Transfer Faces and Video,
00:21:37.140 | with Project Puppetron, best name ever.
00:21:40.060 | Can you talk through a favorite or some of them
00:21:45.060 | or examples that popped in mind?
00:21:49.260 | I'm sure I'll be able to provide links
00:21:50.980 | to other ones we don't talk about,
00:21:53.660 | 'cause there's visual elements to all of them
00:21:56.860 | that are exciting.
00:21:59.100 | - Why they're interesting for different reasons
00:22:00.620 | might be a good way to go.
00:22:01.820 | So I think sky replace is interesting
00:22:04.380 | because we talked about selection
00:22:06.780 | being sort of an atomic operation.
00:22:08.420 | It's almost like a, if you think of an assembly language,
00:22:11.540 | it's like a single instruction.
00:22:13.140 | Whereas sky replace is a compound action
00:22:16.780 | where you automatically select the sky,
00:22:18.820 | you look for stock content
00:22:20.700 | that matches the geometry of the scene.
00:22:23.140 | You try to have variety in your choices
00:22:25.980 | so that you do coverage of different moods.
00:22:28.300 | It then mattes in the sky behind the foreground,
00:22:32.380 | but then importantly, it uses the foreground
00:22:35.740 | of the other image that you just searched on
00:22:38.220 | to recolor the foreground of the image that you're editing.
00:22:41.380 | So if you say go from a midday sky to an evening sky,
00:22:45.980 | it will actually add sort of an orange glow
00:22:48.940 | to the foreground objects as well.
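
Here is a hedged sketch of that compound workflow: mask the sky, composite in the new one, then nudge the foreground colors toward the reference photo's foreground so the light matches. A crude color threshold stands in for a learned sky segmenter, and a mean-color shift stands in for the recoloring; none of these function names are Adobe's.

```python
# Sketch of a compound "sky replace" action: (1) mask the sky, (2) matte in
# a new sky, (3) shift foreground colors toward the reference foreground
# (e.g., an orange cast for an evening sky). Illustrative only.
import numpy as np


def naive_sky_mask(image: np.ndarray) -> np.ndarray:
    """Crude stand-in for a learned sky segmenter: bright, blue-dominant pixels."""
    r, g, b = (image[..., i].astype(float) for i in range(3))
    return (b > 120) & (b > r) & (b > g)


def replace_sky(image: np.ndarray, new_sky: np.ndarray, reference: np.ndarray) -> np.ndarray:
    sky = naive_sky_mask(image)
    out = image.copy()
    out[sky] = new_sky[sky]                          # matte the new sky behind the scene
    ref_fg = reference[~naive_sky_mask(reference)]   # reference foreground pixels, (n, 3)
    shift = ref_fg.astype(float).mean(axis=0) - image[~sky].astype(float).mean(axis=0)
    out[~sky] = np.clip(image[~sky] + 0.5 * shift, 0, 255).astype(np.uint8)  # match the mood
    return out
```
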
00:22:50.740 | I was a big fan in college of Magritte
00:22:53.820 | and he has a number of paintings where it's surrealism
00:22:57.500 | because he'll like do a composite,
00:22:59.700 | but the foreground building will be at night
00:23:01.900 | and the sky will be during the day.
00:23:03.220 | There's one called the empire of light,
00:23:04.620 | which was on my wall in college.
00:23:06.580 | And we're trying not to do surrealism.
00:23:09.180 | It can be a choice,
00:23:10.700 | but we'd rather have it be natural by default
00:23:14.140 | rather than it looking fake
00:23:15.780 | and then you have to do a whole bunch of post-production
00:23:17.660 | to fix it.
00:23:18.740 | So that's a case where we're kind of capturing
00:23:21.460 | an entire workflow into a single action
00:23:23.620 | and doing it in about a second rather than a minute or two.
00:23:27.900 | And when you do that, you can not just do it once,
00:23:30.300 | but you can do it for say like 10 different backgrounds
00:23:33.060 | and then you're almost back to this inspiration idea of,
00:23:37.100 | I don't know quite what I want,
00:23:38.380 | but I'll know it when I see it.
00:23:39.900 | And you can just explore the design space
00:23:43.180 | as close to final production value as possible.
00:23:46.100 | And then when you really pick one,
00:23:47.340 | you might go back and slightly tweak the selection mask
00:23:49.620 | just to make it perfect and do that kind of polish
00:23:52.540 | that professionals like to bring to their work.
00:23:55.300 | - So then there's this idea of,
00:23:57.700 | you mentioned the sky,
00:23:58.660 | replacing it to different stock images of the sky.
00:24:01.940 | In general, you have this idea-
00:24:03.420 | - Or it could be on your disc or whatever.
00:24:04.980 | - Disc, right.
00:24:05.860 | But making even more intelligent choices
00:24:08.500 | about ways to search stock images,
00:24:11.100 | which is really interesting.
00:24:12.380 | It's kind of spatial.
00:24:13.860 | - Absolutely.
00:24:14.700 | - Being able to specify.
00:24:15.540 | - Right, so that was something we called concept canvas.
00:24:18.660 | So normally when you do a, say an image search,
00:24:21.860 | you would, assuming it's just based on text,
00:24:24.740 | you would give the keywords
00:24:26.420 | of the things you want to be in the image,
00:24:27.820 | and it would find the nearest one that had those tags.
00:24:30.580 | For many tasks, you really want
00:24:35.020 | to be able to say, I want a big person in the middle,
00:24:36.780 | or a dog to the right and an umbrella above on the left,
00:24:39.420 | 'cause you want to leave space for the text or whatever.
00:24:42.940 | And so concept canvas lets you assign spatial regions
00:24:46.980 | to the keywords.
00:24:48.220 | And then we've already pre-indexed the images
00:24:50.900 | to know where the important concepts are in the picture.
00:24:54.140 | So we then go through that index matching to assets.
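
A small sketch of what that spatial matching could look like: the query assigns keywords to canvas regions, and each pre-indexed image is scored by how many of its detected concept centers land in the requested regions. The index format here is hypothetical.

```python
# Toy "concept canvas" scorer: rank indexed images by how well their
# detected concept centers fall inside the query's spatial regions.
def spatial_score(query: dict, index_entry: dict) -> float:
    """query: {"dog": (x0, y0, x1, y1), ...} regions in normalized coords.
    index_entry: {"dog": (cx, cy), ...} concept centers detected offline."""
    hits = 0
    for concept, (x0, y0, x1, y1) in query.items():
        center = index_entry.get(concept)
        if center and x0 <= center[0] <= x1 and y0 <= center[1] <= y1:
            hits += 1
    return hits / len(query)


# Example: person in the middle, dog on the right.
query = {"person": (0.3, 0.2, 0.7, 0.9), "dog": (0.7, 0.5, 1.0, 1.0)}
catalog = [
    {"person": (0.5, 0.5), "dog": (0.85, 0.8)},   # matches the layout
    {"person": (0.1, 0.5), "dog": (0.2, 0.9)},    # everything on the left
]
print(max(catalog, key=lambda entry: spatial_score(query, entry)))
```
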
00:24:58.140 | And even though it's just another form of search,
00:25:01.260 | because you're doing spatial design or layout,
00:25:03.860 | it starts to feel like design.
00:25:05.980 | You sort of feel oddly responsible for the image
00:25:08.260 | that comes back as if you invented it.
00:25:10.100 | - Yeah.
00:25:10.940 | - So it's a good example where giving enough control
00:25:15.940 | starts to make people have a sense of ownership
00:25:18.820 | over the outcome of the event.
00:25:20.620 | And then we also have technologies in Photoshop
00:25:22.460 | where you physically can move the dog in post as well.
00:25:25.740 | But for concept canvas, it was just a very fast way
00:25:29.100 | to sort of loop through and be able to lay things out.
00:25:31.940 | - In terms of being able to remove objects from a scene
00:25:38.700 | and fill in the background automatically.
00:25:41.680 | So that's extremely exciting.
00:25:45.660 | And that's so neural networks are stepping in there.
00:25:48.300 | I just talked this week with Ian Goodfellow.
00:25:51.300 | - Yes, GANs for doing that are definitely one approach.
00:25:55.420 | - So is that a really difficult problem?
00:25:57.740 | Is it as difficult as it looks, again,
00:25:59.940 | to take it to a robust product level?
00:26:03.820 | - Well, there are certain classes of image
00:26:06.060 | for which the traditional algorithms
00:26:07.620 | like Content-Aware Fill work really well.
00:26:10.140 | Like if you have a naturalistic texture,
00:26:12.020 | like a gravel path or something,
00:26:13.940 | because it's patch-based, it will make up
00:26:15.700 | a very plausible looking intermediate thing.
00:26:18.260 | And fill in the hole.
00:26:19.100 | And then we use some algorithms
00:26:21.620 | to sort of smooth out the lighting.
00:26:23.020 | So you don't see any brightness contrasts in that region.
00:26:25.820 | Or you've gradually ramped from dark to light
00:26:28.620 | if it straddles a boundary.
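
A toy version of that patch-based idea, copying best-matching patches from the known region into the hole. Real implementations like PatchMatch are far faster and also blend lighting across the seam, which this sketch omits.

```python
# Toy patch-based hole fill in the spirit of Content-Aware Fill: for each
# hole location on a coarse grid, sample random candidate patches from the
# known region and copy the best match into the hole pixels.
import numpy as np


def patch_fill(image: np.ndarray, hole: np.ndarray, patch: int = 8,
               samples: int = 200) -> np.ndarray:
    rng = np.random.default_rng(0)
    out = image.copy().astype(float)
    h, w = hole.shape
    ys, xs = np.where(hole)
    for y, x in zip(ys[::patch], xs[::patch]):        # visit hole pixels coarsely
        y0, x0 = min(y, h - patch), min(x, w - patch)
        target = out[y0:y0 + patch, x0:x0 + patch]
        known = ~hole[y0:y0 + patch, x0:x0 + patch]
        best, best_err = None, np.inf
        for _ in range(samples):                       # random candidate patches
            cy = int(rng.integers(0, h - patch))
            cx = int(rng.integers(0, w - patch))
            if hole[cy:cy + patch, cx:cx + patch].any():
                continue                               # candidates must be fully known
            cand = out[cy:cy + patch, cx:cx + patch]
            err = ((cand[known] - target[known]) ** 2).sum()
            if err < best_err:
                best, best_err = cand, err
        if best is not None:
            mask = hole[y0:y0 + patch, x0:x0 + patch]
            target[mask] = best[mask]                  # copy patch into the hole
    return out.astype(np.uint8)
```
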
00:26:30.020 | Where it gets complicated is if you have to infer
00:26:34.020 | invisible structure behind the person in front.
00:26:37.540 | And that really requires a common sense knowledge
00:26:40.840 | of the world to know what, you know,
00:26:42.580 | if I see three quarters of a house,
00:26:44.720 | do I have a rough sense of what the rest
00:26:46.380 | of the house looks like?
00:26:47.820 | If you just fill it in with patches,
00:26:49.260 | it can end up sort of doing things
00:26:51.340 | that make sense locally.
00:26:52.260 | But you look at the global structure
00:26:53.540 | and it looks like it's just sort of crumpled or messed up.
00:26:57.260 | And so what GANs and neural nets bring to the table
00:27:00.700 | is this common sense learned from the training set.
00:27:03.740 | And the challenge right now is that the generative methods
00:27:08.740 | that can make up missing holes using that kind of technology
00:27:13.020 | are still only stable at low resolutions.
00:27:15.640 | And so you either need to then go from a low resolution
00:27:18.180 | to a high resolution using some other algorithm,
00:27:20.580 | or we need to push the state of the art
00:27:22.180 | and it's still in research to get to that point.
00:27:25.060 | Of course, if you show it something,
00:27:27.600 | say it's trained on houses,
00:27:29.980 | and then you show it an octopus,
00:27:31.580 | it's not gonna do a very good job
00:27:34.120 | of showing common sense about octopuses.
00:27:37.180 | So again, you're asking about how you know
00:27:42.180 | that it's ready for prime time.
00:27:44.560 | You really need a very diverse training set of images.
00:27:47.660 | And ultimately, that may be a case
00:27:51.540 | where you put it out there with some guardrails
00:27:55.260 | where you might do a detector,
00:27:59.340 | which looks at the image
00:28:00.420 | and sort of estimates its own competence
00:28:03.060 | of how good a job this algorithm could do.
00:28:06.060 | So eventually, there may be this idea
00:28:08.580 | of what we call an ensemble of experts
00:28:10.220 | where any particular expert is specialized
00:28:13.340 | in certain things.
00:28:14.200 | And then there's sort of a,
00:28:15.560 | either they vote to say how confident they are
00:28:17.480 | about what to do.
00:28:18.320 | This is sort of more future looking,
00:28:20.200 | or there's some dispatcher which says,
00:28:22.540 | "You're good at houses, you're good at trees."
00:28:24.840 | (laughing)
00:28:25.680 | And so, I mean, all this adds up to a lot of work
00:28:30.080 | 'cause each of those models will be a whole bunch of work.
00:28:32.320 | But I think over time, you'd gradually fill out the set
00:28:36.520 | and initially focus on certain workflows
00:28:39.420 | and then sort of branch out as you get more capable.
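
A minimal sketch of that dispatcher idea: each expert reports a self-estimated competence on the input, and the dispatcher either runs the most confident one or declines when nobody is competent enough. The interface is invented for illustration.

```python
# Sketch of an "ensemble of experts" dispatcher: pick the expert most
# confident about this input, or fall back (return None) if none qualify.
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Expert:
    name: str
    confidence: Callable[[object], float]   # self-estimated competence, 0..1
    run: Callable[[object], object]         # the actual fill/edit model


def dispatch(experts: list, image: object, threshold: float = 0.6) -> Optional[object]:
    best = max(experts, key=lambda e: e.confidence(image))
    if best.confidence(image) < threshold:
        return None                          # fall back to classical tools / manual edit
    return best.run(image)


house_expert = Expert("houses", lambda img: 0.9, lambda img: "filled by house model")
tree_expert = Expert("trees", lambda img: 0.3, lambda img: "filled by tree model")
print(dispatch([house_expert, tree_expert], image="photo of a house"))
```
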
00:28:42.320 | - You mentioned workflows,
00:28:44.080 | and have you considered maybe looking far into the future,
00:28:48.920 | first of all, using the fact that there is
00:28:54.520 | a huge amount of people that use Photoshop, for example,
00:28:58.760 | and have certain workflows,
00:29:00.400 | being able to collect the information by which they,
00:29:05.400 | basically get information about their workflows,
00:29:08.320 | about what they need, the ways to help them,
00:29:11.840 | whether it is houses or octopus that people work on more.
00:29:15.640 | - Right.
00:29:16.480 | - Like basically getting a beat on what kind of data
00:29:20.400 | is needed to be annotated and collected for people
00:29:23.520 | to build tools that actually work well for people.
00:29:26.360 | - Right, absolutely, and this is a big topic
00:29:28.280 | and the whole world of AI is what data can you gather
00:29:31.360 | and why.
00:29:32.200 | - Right.
00:29:33.320 | - At one level, a way to think about it is,
00:29:35.740 | we not only want to train our customers
00:29:38.660 | in how to use our products,
00:29:39.800 | but we want them to teach us what's important
00:29:42.040 | and what's useful.
00:29:43.580 | At the same time, we want to respect their privacy.
00:29:46.400 | And obviously we wouldn't do things
00:29:49.720 | without their explicit permission.
00:29:51.480 | And I think the modern spirit of the age around this
00:29:55.840 | is you have to demonstrate to somebody
00:29:57.560 | how they're benefiting from sharing their data
00:29:59.480 | with the tool.
00:30:01.560 | Either it's helping in the short term
00:30:03.120 | to understand their intent
00:30:04.240 | so you can make better recommendations,
00:30:06.440 | or if they're friendly to your cause or your tool
00:30:09.840 | or they want to help you evolve quickly
00:30:11.800 | 'cause they depend on you for their livelihood,
00:30:14.360 | they may be willing to share some of their workflows
00:30:18.040 | or choices with the dataset to be then trained.
00:30:23.040 | There are technologies for looking at learning
00:30:27.560 | without necessarily storing all the information permanently
00:30:31.340 | so that you can sort of learn on the fly
00:30:33.560 | but not keep a record of what somebody did.
00:30:36.680 | So we're definitely exploring all of those possibilities.
00:30:39.240 | - And I think Adobe exists in a space where Photoshop,
00:30:43.200 | like if I look at the data I've created and own,
00:30:46.640 | you know, I'm less comfortable sharing data
00:30:48.640 | with social networks than I am with Adobe
00:30:51.720 | because there's just exactly as you said,
00:30:55.080 | there's an obvious benefit for sharing the data
00:31:00.080 | that I use to create in Photoshop
00:31:02.720 | because it's helping improve the workflow in the future.
00:31:06.000 | - Right. - As opposed to,
00:31:06.840 | it's not clear what the benefit is in social networks.
00:31:10.240 | - It's nice of you to say that.
00:31:11.280 | I mean, I think there are some professional workflows
00:31:13.960 | where people might be very protective
00:31:15.360 | of what they're doing,
00:31:16.200 | such as if I was preparing evidence for a legal case,
00:31:19.560 | I wouldn't want any of that, you know,
00:31:22.720 | phoning home to help train the algorithm or anything.
00:31:25.740 | There may be other cases where people say
00:31:28.840 | having a trial version or they're doing some,
00:31:30.880 | I'm not saying we're doing this today,
00:31:31.920 | but there's a future scenario
00:31:33.240 | where somebody has a more permissive relationship
00:31:36.400 | with Adobe where they explicitly say, I'm fine.
00:31:39.440 | I'm only doing hobby projects
00:31:41.080 | or things which are non-confidential
00:31:44.480 | and in exchange for some benefit, tangible or otherwise,
00:31:49.060 | I'm willing to share very fine-grained data.
00:31:51.880 | So another possible scenario
00:31:54.160 | is to capture relatively crude high-level things
00:31:57.600 | from more people and then more detailed knowledge
00:32:02.280 | from people who willingly participate.
00:32:02.280 | We do that today with explicit customer studies
00:32:04.720 | where, you know, we go and visit somebody
00:32:07.240 | and ask them to try the tool
00:32:08.400 | and we human observe what they're doing.
00:32:10.960 | In the future, to be able to do that enough,
00:32:14.480 | to be able to train an algorithm,
00:32:16.420 | we'd need a more systematic process,
00:32:18.520 | but we'd have to do it very consciously
00:32:20.000 | because one of the things people treasure about Adobe
00:32:23.200 | is a sense of trust.
00:32:24.600 | And we don't want to endanger that
00:32:26.820 | through overly aggressive data collection.
00:32:28.960 | So we have a chief privacy officer
00:32:31.640 | and it's definitely front and center
00:32:34.560 | of thinking about AI rather than an afterthought.
00:32:37.520 | - Well, when you start that program, sign me up.
00:32:40.000 | - Okay, happy to.
00:32:41.280 | (laughing)
00:32:42.480 | - Is there other projects that you wanted to mention
00:32:45.120 | that I didn't perhaps that pop into mind?
00:32:48.800 | - Well, you covered a number.
00:32:49.800 | I think you mentioned Project Puppetron.
00:32:51.960 | I think that one is interesting
00:32:53.520 | because you might think of Adobe as only thinking in 2D,
00:32:59.360 | and that's a good example
00:33:01.920 | where we're actually thinking more three-dimensionally
00:33:04.280 | about how to assign features to faces so that we can,
00:33:07.440 | you know, if you take, so what Puppetron does,
00:33:09.580 | it takes either a still or a video of a person talking,
00:33:13.680 | and then it can take a painting of somebody else
00:33:16.840 | and then apply the style of the painting
00:33:18.640 | to the person who's talking in the video.
00:33:20.840 | And it's unlike a sort of screen door post filter effect
00:33:28.280 | that you sometimes see online.
00:33:30.480 | It really looks as though it's sort of somehow attached
00:33:34.440 | or reflecting the motion of the face.
00:33:36.240 | And so that's the case where even to do a 2D workflow
00:33:39.440 | like stylization, you really need to infer more
00:33:42.480 | about the 3D structure of the world.
00:33:44.280 | And I think as 3D computer vision algorithms get better,
00:33:48.680 | initially they'll focus on particular domains like faces
00:33:52.000 | where you have a lot of prior knowledge about structure
00:33:54.400 | and you can maybe have a parameterized template
00:33:57.000 | that you fit to the image.
00:33:58.800 | But over time, this should be possible
00:34:00.480 | for more general content.
00:34:01.960 | And it might even be invisible to the user
00:34:05.000 | that you're doing 3D reconstruction, but under the hood,
00:34:08.840 | but it might then let you do edits much more reliably
00:34:13.160 | or correctly than you would otherwise.
00:34:15.920 | - And, you know, the face is a very important application,
00:34:20.520 | right? - Absolutely.
00:34:21.360 | - So making things work.
00:34:22.680 | - And a very sensitive one.
00:34:23.800 | If you do something uncanny, it's very disturbing.
00:34:26.760 | - That's right.
00:34:27.600 | - Yeah. - You have to get it right.
00:34:30.080 | So in the space of augmented reality and virtual reality,
00:34:35.080 | what do you think is the role of AR and VR
00:34:39.600 | in the content we consume as people, as consumers
00:34:43.400 | and the content we create as creators?
00:34:45.520 | - Now that's a great question.
00:34:46.560 | We think about this a lot too.
00:34:48.880 | So I think VR and AR serve slightly different purposes.
00:34:52.880 | So VR can really transport you to an entire immersive world,
00:34:57.520 | no matter what your personal situation is.
00:35:01.000 | To that extent, it's a bit like a really, really
00:35:03.240 | widescreen television where it sort of snaps you out
00:35:05.400 | of your context and puts you in a new one.
00:35:08.280 | And I think it's still evolving in terms of the hardware.
00:35:12.680 | I actually worked on VR in the '90s,
00:35:14.560 | trying to solve the latency and sort of nausea problem,
00:35:17.200 | which we did, but it was very expensive and a bit early.
00:35:20.840 | There's a new wave of that now.
00:35:22.720 | I think, and increasingly those devices are becoming
00:35:25.560 | all in one rather than something that's tethered to a box.
00:35:28.920 | I think the market seems to be bifurcating into things
00:35:32.800 | for consumers and things for professional use cases,
00:35:35.800 | like for architects and people designing
00:35:38.440 | where your product is a building,
00:35:39.800 | and you really want to experience it better
00:35:42.240 | than looking at a scale model or a drawing,
00:35:44.960 | I think, or even than a video.
00:35:47.480 | So I think for that, where you need a sense of scale
00:35:49.880 | and spatial relationships, it's great.
00:35:52.400 | I think AR holds the promise of sort of taking
00:35:57.200 | digital assets off the screen and putting them
00:36:00.640 | in the context in the real world,
00:36:02.040 | on the table in front of you, on the wall behind you.
00:36:04.880 | And that has the corresponding need that the assets
00:36:09.240 | need to adapt to the physical context
00:36:11.120 | in which they're being placed.
00:36:12.400 | I mean, it's a bit like having a live theater troupe
00:36:15.560 | come to your house and put on "Hamlet."
00:36:18.360 | My mother had a friend who used to do this
00:36:20.040 | at stately homes in England for the National Trust,
00:36:23.000 | and they would adapt the scenes,
00:36:24.800 | and even they'd walk the audience through the rooms
00:36:27.680 | to see the action based on the country house
00:36:31.720 | they found themselves in for two days.
00:36:33.240 | And I think AR will have the same issue that,
00:36:36.920 | if you have a tiny table in a big living room or something,
00:36:39.240 | it'll try to figure out what can you change
00:36:41.960 | and what's fixed.
00:36:43.360 | And there's a little bit of a tension between fidelity,
00:36:47.600 | where if you captured Sayin Uriah doing a fantastic ballet,
00:36:52.600 | you'd want it to be sort of exactly reproduced,
00:36:54.880 | and maybe all you could do is scale it down.
00:36:57.360 | Whereas somebody telling you a story
00:37:00.560 | might be walking around the room doing some gestures,
00:37:03.680 | and that could adapt to the room
00:37:05.840 | in which they were telling the story.
00:37:07.880 | - And do you think fidelity is that important in that space,
00:37:10.680 | or is it more about the storytelling?
00:37:12.840 | - I think it may depend on the characteristic of the media,
00:37:16.640 | if it's a famous celebrity,
00:37:17.960 | then it may be that you want to catch every nuance,
00:37:20.120 | and they don't want to be reanimated by some algorithm.
00:37:23.560 | It could be that if it's really, you know,
00:37:27.480 | a lovable frog telling you a story,
00:37:29.840 | and it's about a princess and a frog,
00:37:32.080 | then it doesn't matter if the frog moves in a different way.
00:37:35.640 | I think a lot of the ideas that have sort of grown up
00:37:37.800 | in the game world will now come
00:37:40.040 | into the broader commercial sphere
00:37:42.080 | once they're needing adaptive characters in AR.
00:37:46.120 | - Are you thinking of engineering tools
00:37:47.880 | that allow creators to create in the augmented world,
00:37:52.560 | basically making a Photoshop for the augmented world?
00:37:56.400 | - Well, we have shown a few demos
00:37:59.120 | of sort of taking a Photoshop layer stack
00:38:01.440 | and then expanding it into 3D.
00:38:03.000 | That's actually been shown publicly as one example in AR.
00:38:06.280 | Where we're particularly excited at the moment is in 3D.
00:38:10.920 | 3D design is still a very challenging space,
00:38:14.800 | and we believe that it's a worthwhile experiment
00:38:18.280 | to try to figure out if AR or immersive
00:38:20.760 | makes 3D design more spontaneous.
00:38:23.440 | - Can you give me an example of 3D design,
00:38:25.840 | just like applications that we're talking about?
00:38:27.080 | - Well, literally, a simple one
00:38:28.560 | would be laying out objects, right?
00:38:30.080 | So on a conventional screen,
00:38:32.240 | you'd sort of have a plan view and a side view
00:38:34.120 | and a perspective view,
00:38:34.960 | and you'd sort of be dragging it around with a mouse,
00:38:36.800 | and if you're not careful,
00:38:37.880 | it would go through the wall and all that.
00:38:39.560 | Whereas if you were really laying out objects,
00:38:43.200 | say in a VR headset,
00:38:44.760 | you could literally move your head
00:38:46.840 | to see a different viewpoint.
00:38:48.040 | They'd be in stereo, so you'd have a sense of depth,
00:38:50.920 | 'cause you're already wearing the depth glasses, right?
00:38:53.880 | So it would be those sort of big, gross motor,
00:38:57.560 | move things around kind of skills
00:38:59.160 | seem much more spontaneous,
00:39:00.480 | just like they are in the real world.
00:39:02.320 | The frontier for us, I think,
00:39:05.400 | is whether that same medium can be used
00:39:07.960 | to do fine-grained design tasks,
00:39:09.680 | like very accurate constraints on, say, a CAD model
00:39:13.640 | or something that may be better done on a desktop,
00:39:16.960 | but it may just be a matter of inventing the right UI.
00:39:20.240 | So we're hopeful that,
00:39:22.440 | because there will be this potential explosion of demand
00:39:26.680 | for 3D assets driven by AR
00:39:29.560 | and more real-time animation on conventional screens,
00:39:33.240 | that those tools will also help with,
00:39:37.880 | or those devices will help with
00:39:39.440 | designing the content as well.
00:39:40.920 | - You've mentioned quite a few interesting new ideas.
00:39:43.680 | And at the same time, there's old timers like me
00:39:47.840 | that are stuck in their old ways and are-
00:39:50.320 | - I think I'm the old timer.
00:39:51.440 | - Okay, all right, all right.
00:39:52.600 | But they're opposed to all change at all costs, kind of.
00:39:56.480 | Is there, when you're thinking about creating new interfaces,
00:40:00.760 | do you feel the burden of just this giant user base
00:40:04.480 | that loves the current product?
00:40:06.920 | So anything new you do,
00:40:09.800 | any new idea, comes at a cost, in that it'll be resisted?
00:40:14.080 | - Well, I think if you have to trade off control
00:40:18.080 | for convenience, then our existing user base
00:40:21.320 | would definitely be offended by that.
00:40:23.840 | I think if there are some things where
00:40:26.280 | you have more convenience and just as much control,
00:40:29.240 | that may be more welcome.
00:40:31.760 | We do think about not breaking
00:40:33.720 | well-known metaphors for things.
00:40:36.200 | So things should sort of make sense.
00:40:39.000 | Photoshop has never been a static target.
00:40:41.120 | It's always been evolving and growing.
00:40:43.040 | And to some extent, there's been a lot of brilliant thought
00:40:47.160 | along the way of how it works today.
00:40:48.960 | So we don't want to just throw all that out.
00:40:52.000 | If there's a fundamental breakthrough,
00:40:53.360 | like a single click is good enough to select an object
00:40:55.680 | rather than having to do lots of strokes,
00:40:58.560 | that actually fits in quite nicely to the existing tool set,
00:41:02.600 | either as an optional mode or as a starting point.
00:41:05.840 | I think where we're looking at radical simplicity,
00:41:09.120 | where you could encapsulate an entire workflow
00:41:12.240 | with a much simpler UI,
00:41:14.160 | then sometimes that's easier to do in the context
00:41:16.640 | of either a different device, like a mobile device,
00:41:19.600 | where the affordances are naturally different,
00:41:22.080 | or in a tool that's targeted at a different workflow,
00:41:26.040 | where it's about spontaneity and velocity
00:41:28.840 | rather than precision.
00:41:30.640 | And we have projects like Rush,
00:41:32.200 | which can let you do professional quality
00:41:35.080 | video editing for a certain class of media output
00:41:39.720 | that is targeted very differently
00:41:43.400 | in terms of users and the experience.
00:41:46.400 | And ideally, people would go,
00:41:48.720 | if I'm feeling like doing Premiere, big project,
00:41:51.720 | I'm doing a four-part television series,
00:41:55.640 | that's definitely a Premiere thing.
00:41:56.800 | But if I want to do something to show my recent vacation,
00:42:00.240 | maybe I'll just use Rush because I can do it
00:42:03.040 | in the half an hour I have free at home,
00:42:05.600 | rather than the four hours I'd need to do it at work.
00:42:08.240 | And for the use cases, which we can do well,
00:42:12.600 | it really is much faster to get the same output,
00:42:15.320 | but the more professional tools obviously
00:42:16.920 | have a much richer toolkit and more flexibility
00:42:20.440 | in what they can do.
00:42:21.720 | - And then at the same time, with the flexibility control,
00:42:24.040 | I like this idea of smart defaults,
00:42:27.200 | of using AI to coach you, like what Google has,
00:42:32.280 | an "I'm Feeling Lucky" button.
00:42:34.120 | - Right.
00:42:34.960 | - Or one button kind of gives you
00:42:36.800 | a pretty good set of settings.
00:42:39.240 | And then you almost, that's almost an educational tool.
00:42:42.120 | - Absolutely, yeah.
00:42:43.560 | - To show, because sometimes when you have all this control,
00:42:48.280 | you're not sure about the correlation
00:42:51.840 | between the different bars that control
00:42:53.800 | different elements of the image and so on.
00:42:55.760 | And sometimes there's a degree of,
00:42:59.120 | you don't know what the optimal is.
00:43:02.040 | And then some things are sort of on demand,
00:43:03.920 | like help, right?
00:43:05.480 | Where I'm stuck, I need to know what to look for,
00:43:08.440 | I'm not quite sure what it's called.
00:43:10.480 | And something that was proactively
00:43:12.360 | making helpful suggestions or,
00:43:14.360 | you could imagine a make a suggestion button
00:43:19.320 | where you'd use all of that knowledge
00:43:20.800 | of workflows and everything to maybe suggest something
00:43:23.080 | to go and learn about or just to try or show the answer.
00:43:26.360 | And maybe it's not one intelligent default,
00:43:29.720 | but it's like a variety of defaults.
00:43:32.240 | And then you go, "Oh, I like that one."
00:43:33.840 | - Yeah, yeah, yeah.
00:43:35.080 | Several options.
00:43:36.000 | - Yeah.
00:43:37.240 | - So back to poetry.
00:43:39.760 | - Ah, yes.
00:43:40.840 | - We're gonna interleave.
00:43:43.480 | So first few lines of a recent poem of yours,
00:43:46.680 | before I ask the next question.
00:43:49.000 | This is about the smartphone.
00:43:52.480 | Today I left my phone at home and went down to the sea.
00:43:57.220 | The sand was soft, the ocean glass,
00:44:00.600 | but I was still just me.
00:44:02.760 | So this is a poem about you leaving your phone behind
00:44:05.400 | and feeling quite liberated because of it.
00:44:09.080 | So this is kind of a difficult topic
00:44:11.880 | and let's see if we can talk about it, figure it out.
00:44:14.960 | But so with the help of AI, more and more,
00:44:17.900 | we can create sort of versions of ourselves,
00:44:20.680 | versions of reality that are in some ways
00:44:23.800 | more beautiful than actual reality.
00:44:26.340 | - Mm-hmm.
00:44:27.180 | - And some of the creative effort there
00:44:32.300 | is part of doing this, creating this illusion.
00:44:35.360 | So of course this is inevitable,
00:44:38.020 | but how do you think we should adjust as human beings
00:44:41.340 | to live in this digital world that's partly artificial,
00:44:45.140 | that's better than the world that we lived in
00:44:49.740 | a hundred years ago when you didn't have Instagram
00:44:53.260 | and Facebook versions of ourselves
00:44:55.980 | and the online-- - Oh, this is sort of
00:44:57.580 | showing off better versions of ourselves.
00:44:59.720 | - We're using the tooling to modify the images,
00:45:03.480 | or even, with artificial intelligence,
00:45:05.700 | ideas of deepfakes, creating adjusted
00:45:10.700 | or fake versions of ourselves in reality.
00:45:13.200 | - I think it's an interesting question.
00:45:15.680 | You asked with sort of a historical bent on this.
00:45:17.920 | (Lex laughs)
00:45:19.560 | I actually wonder if 18th century aristocrats
00:45:22.440 | who commissioned famous painters to paint portraits of them
00:45:25.680 | had portraits that were slightly nicer
00:45:27.460 | than they actually looked in practice.
00:45:29.160 | - Touche, well played, sir.
00:45:30.280 | - So human desire to put your best foot forward
00:45:33.980 | has always been true.
00:45:35.120 | I think it's interesting, you sort of framed it in two ways.
00:45:40.200 | One is if we can imagine alternate realities
00:45:43.000 | and visualize them, is that a good or bad thing?
00:45:45.400 | In the old days, you do it with storytelling
00:45:48.000 | and words and poetry, which still resides sometimes
00:45:51.120 | on websites, but we've become a very visual culture
00:45:55.640 | in particular.
00:45:56.600 | In the 19th century, we were very much a text-based culture.
00:46:02.160 | People would read long tracks,
00:46:03.880 | political speeches were very long.
00:46:05.720 | Nowadays, everything's very kind of quick
00:46:08.560 | and visual and snappy.
00:46:10.240 | I think it depends on how harmless your intent is.
00:46:16.000 | A lot of it's about intent.
00:46:17.880 | So if you have a somewhat flattering photo
00:46:22.400 | that you pick out of the photos that you have
00:46:24.360 | in your inbox to say, "This is what I look like,"
00:46:27.160 | it's probably fine.
00:46:31.360 | If someone's gonna judge you by how you look,
00:46:34.820 | then they'll decide soon enough when they meet you
00:46:37.360 | whether the reality matches, you know.
00:46:39.000 | I think where it can be harmful is if people
00:46:43.880 | hold themselves up to an impossible standard,
00:46:46.180 | which they then feel bad about themselves for not meeting.
00:46:49.120 | I think that can definitely be an issue.
00:46:53.240 | But I think the ability to imagine and visualize
00:46:57.960 | an alternate reality, which sometimes
00:47:00.360 | you then go off and build later,
00:47:03.760 | can be a wonderful thing too.
00:47:04.880 | People can imagine architectural styles,
00:47:07.520 | which they then, you know, have a startup,
00:47:09.800 | make a fortune and then build a house
00:47:11.400 | that looks like their favorite video game.
00:47:13.640 | Is that a terrible thing?
00:47:14.880 | I think, I used to worry about exploration actually,
00:47:21.520 | that part of the joy of going to the moon
00:47:24.720 | when I was a tiny child, I remember it
00:47:27.000 | in grainy black and white,
00:47:28.600 | was to know what it would look like when you got there.
00:47:31.400 | And I think now we have such good graphics
00:47:33.520 | for knowing, for visualizing the experience
00:47:35.800 | before it happens, that I slightly worry
00:47:38.200 | that it may take the edge off actually wanting to go,
00:47:41.600 | you know what I mean?
00:47:42.560 | 'Cause we've seen it on TV, we kind of,
00:47:44.720 | you know, by the time we finally get to Mars,
00:47:46.320 | we'll go, "Yeah, yeah, so it's Mars,
00:47:47.440 | "it's what it looks like."
00:47:50.240 | But then, you know, the outer exploration,
00:47:53.320 | I mean, I think Pluto was a fantastic recent discovery
00:47:57.040 | where nobody had any idea what it looked like
00:47:59.080 | and it was just breathtakingly varied and beautiful.
00:48:01.960 | So I think expanding the ability of the human toolkit
00:48:06.640 | to imagine and communicate on balance is a good thing.
00:48:10.880 | I think there are abuses, we definitely take them seriously
00:48:13.480 | and try to discourage them.
00:48:17.600 | I think there's a parallel side where the public needs
00:48:21.080 | to know what's possible through events like this, right?
00:48:24.320 | So that you don't believe everything you read
00:48:27.600 | in print anymore and it may over time become true
00:48:31.320 | of images as well.
00:48:33.020 | Or you need multiple sets of evidence
00:48:34.880 | to really believe something rather
00:48:36.280 | than a single media asset.
00:48:38.040 | So I think it's a constantly evolving thing,
00:48:40.340 | it's been true forever.
00:48:42.000 | There's a famous story about Anne of Cleves
00:48:44.640 | and Henry VIII where, luckily for Anne,
00:48:48.920 | they didn't get married, right?
00:48:53.840 | Or rather, they got married and it was quickly broken off.
00:48:53.840 | - What's the story?
00:48:54.680 | - Oh, so Holbein went and painted a picture
00:48:57.120 | and then Henry VIII wasn't pleased and, you know.
00:48:59.760 | - Oh, yeah.
00:49:00.800 | - History doesn't record whether Anne was pleased,
00:49:02.600 | but I think she was pleased not to be married
00:49:04.560 | more than a day or something.
00:49:06.080 | So, I mean, this has gone on for a long time,
00:49:08.040 | but I think it's just part of the magnification
00:49:11.880 | of human capability.
00:49:14.440 | - You've kind of built up an amazing research environment
00:49:19.360 | here, research culture, research lab,
00:51:21.560 | and you've written that the secret
00:51:22.800 | to a thriving research lab is interns.
00:49:24.840 | Can you unpack that a little bit?
00:49:26.320 | - Oh, absolutely.
00:49:27.160 | So, a couple of reasons.
00:49:29.940 | As you see, looking at my personal history,
00:49:34.080 | there are certain ideas you bond with
00:49:35.660 | at a certain stage of your career
00:49:37.160 | and you tend to keep revisiting them through time.
00:49:40.440 | If you're lucky, you pick one that doesn't just get solved
00:49:43.160 | in the next five years and then you're sort of out of luck.
00:49:46.560 | So, I think a constant influx of new people
00:49:48.800 | brings new ideas with it.
00:49:50.120 | From the point of view of industrial research,
00:49:53.680 | because a big part of what we do
00:49:56.040 | is really taking those ideas to the point
00:49:58.000 | where they can ship as very robust features,
00:50:00.480 | you end up investing a lot in a particular idea.
00:50:05.140 | And if you're not careful, people can get too conservative
00:50:08.400 | in what they choose to do next,
00:50:09.560 | knowing that the product teams will want it.
00:50:12.000 | And interns let you explore the more fanciful
00:50:16.040 | or unproven ideas in a relatively lightweight way,
00:50:20.280 | ideally leading to new publications for the intern
00:50:22.840 | and for the researcher.
00:50:24.520 | And it gives you then a portfolio from which to draw
00:50:27.600 | which idea am I gonna then try to take all the way through
00:50:30.100 | to being robust in the next year or two to ship.
00:50:33.000 | So, it sort of becomes part of the funnel.
00:50:36.100 | It's also a great way for us
00:50:37.440 | to identify future full-time researchers.
00:50:40.480 | Many of our greatest researchers were former interns.
00:50:43.080 | It builds a bridge to university departments
00:50:46.400 | so we can get to know and build an enduring relationship
00:50:50.280 | with the professors
00:50:51.160 | to whom we often give academic gift funds as well,
00:50:54.120 | as an acknowledgement of the value the interns add
00:50:56.520 | and of their own collaborations.
00:50:58.920 | So, it's sort of a virtuous cycle.
00:51:01.480 | And then the long-term legacy of a great research lab
00:51:04.920 | hopefully will be not only the people who stay
00:51:07.920 | but the ones who move through
00:51:09.280 | and then go off and carry that same model to other companies.
00:51:12.840 | And so, we believe strongly in industrial research
00:51:16.440 | and how it can complement academia.
00:51:18.560 | And we hope that this model will continue to propagate
00:51:21.560 | and be invested in by other companies,
00:51:23.640 | which makes it harder for us to recruit, of course,
00:51:25.880 | but that's a sign of success.
00:51:28.760 | And a rising tide lifts all ships in that sense.
00:51:32.440 | - And where's the idea born with the interns?
00:51:35.560 | Is there brainstorming?
00:51:37.160 | Are there discussions about, you know, like what-
00:51:42.160 | - Where did the ideas come from?
00:51:43.920 | - Yeah, as I'm asking the question,
00:51:46.360 | I realize how dumb it is,
00:51:47.240 | but I'm hoping you have a better answer-
00:51:49.280 | - No, it's not a dumb question at all.
00:51:50.960 | It's a question I ask
00:51:54.160 | at the beginning of every summer.
00:51:54.160 | So, what will happen is we'll send out a call for interns.
00:52:02.080 | We'll have a number of resumes come in.
00:52:02.080 | People will contact the candidates,
00:52:03.680 | talk to them about their interests.
00:52:05.760 | They'll usually try to find somebody
00:52:07.960 | who has a reasonably good match
00:52:09.800 | to what they're already doing,
00:52:11.360 | or just has a really interesting domain
00:52:13.360 | that they've been pursuing in their PhD.
00:52:15.800 | And we think we'd love to do one of those projects too.
00:52:18.960 | And then the intern stays in touch with the mentor,
00:52:22.880 | as we call them.
00:52:23.960 | And then they come, and at the end
00:52:26.480 | of the first two weeks, they have to decide.
00:52:28.400 | So, they'll often have a general sense
00:52:31.000 | by the time they arrive,
00:52:32.760 | and we'll have internal discussions
00:52:34.960 | about what are all the general ideas
00:52:37.320 | that we're wanting to pursue
00:52:38.600 | to see whether two people have the same idea
00:52:40.800 | and maybe they should talk and all that.
00:52:42.880 | But then once the intern actually arrives,
00:52:44.760 | sometimes the idea goes linearly,
00:52:47.160 | and sometimes it takes a giant left turn and we go,
00:52:49.600 | that sounded good, but when we thought about it,
00:52:51.640 | there's this other project, or it's already been done,
00:52:53.680 | and we found this paper, we were scooped.
00:52:55.920 | But we have this other great idea.
00:52:57.200 | So, it's pretty, pretty flexible at the beginning.
00:52:59.760 | One of the questions for research labs is,
00:53:04.560 | who's deciding what to do?
00:53:06.360 | And then who's to blame if it goes wrong,
00:53:08.400 | who gets the credit if it goes right?
00:53:10.520 | And so, in Adobe, we push the needle very much
00:53:14.920 | towards freedom of choice of projects
00:53:18.320 | by the researchers and the interns.
00:53:21.080 | But then we reward people based on impact.
00:53:23.640 | So, if the projects ultimately end up impacting the products
00:53:27.000 | and having papers and so on.
00:53:28.840 | And so, the alternative model, just to be clear,
00:53:31.960 | is that you have one lab director
00:53:33.920 | who thinks he's a genius and tells everybody what to do,
00:53:36.920 | takes all the credit if it goes well,
00:53:38.280 | blames everybody else if it goes badly.
00:53:39.680 | So, we don't want that model.
00:53:41.840 | And this helps new ideas percolate up.
00:53:45.560 | The art of running such a lab is that
00:53:47.640 | there are strategic priorities for the company,
00:53:49.960 | and there are areas where we do want to invest
00:53:52.000 | in pressing problems.
00:53:54.000 | And so, it's a little bit of trickle-down
00:53:56.280 | and filter-up meeting in the middle.
00:53:58.680 | And so, you don't tell people you have to do X,
00:54:01.480 | but you say X would be particularly appreciated this year.
00:54:05.720 | And then people reinterpret X through the filter
00:54:07.960 | of things they want to do and they're interested in.
00:54:10.080 | And miraculously, it usually comes together very well.
00:54:14.840 | One thing that really helps is Adobe
00:54:16.440 | has a really broad portfolio of products.
00:54:18.680 | So, if we have a good idea,
00:54:20.760 | there's usually a product team
00:54:23.240 | that is intrigued or interested.
00:54:26.080 | So, it means we don't have to
00:54:27.720 | qualify things too much ahead of time.
00:54:30.360 | Once in a while, the product teams sponsor an extra intern
00:54:34.360 | 'cause they have a particular problem
00:54:35.640 | that they really care about,
00:54:36.960 | in which case it's a little bit more,
00:54:38.800 | we really need one of these.
00:54:40.560 | And then we sort of say, great, I get an extra intern.
00:54:43.120 | We find an intern who thinks that's a great problem.
00:54:45.440 | But that's not the typical model.
00:54:46.800 | That's sort of the icing on the cake
00:54:48.240 | as far as the budget's concerned.
00:54:49.880 | And all of the above end up being important.
00:54:54.160 | It's really hard to predict at the beginning of the summer
00:54:56.960 | which ones will pan out; we all have high hopes
00:54:58.480 | for all of the intern projects,
00:54:59.680 | but ultimately some of them pay off
00:55:01.680 | and some of them sort of are a nice paper
00:55:03.920 | but don't turn into a feature.
00:55:05.120 | Others turn out not to be as novel as we thought
00:55:07.960 | but they'd be a great feature but not a paper.
00:55:11.440 | And then others, we make a little bit of progress
00:55:14.920 | and we realize how much we don't know.
00:55:16.400 | And maybe we revisit that problem several years in a row
00:55:19.680 | until finally we have a breakthrough
00:55:22.120 | and then it becomes more on track to impact a product.
00:55:26.200 | - Jumping back to a big overall view of Adobe Research,
00:55:31.200 | what are you looking forward to in 2019 and beyond?
00:55:35.400 | What is, you mentioned there's a giant suite of products.
00:55:38.960 | - Yes. - Giant suite of ideas.
00:55:41.080 | New interns, a large team of researchers.
00:55:46.080 | Where do you think, what do you think the future holds?
00:55:52.040 | - In terms of the technological breakthroughs?
00:55:54.520 | - Technological breakthroughs,
00:55:56.440 | especially ones that will make it into product,
00:56:00.000 | will get to impact the world.
00:56:01.760 | - So I think the creative or the analytics assistants
00:56:04.920 | that we talked about where they're constantly trying
00:56:08.000 | to figure out what you're trying to do
00:56:09.480 | and how can they be helpful and make useful suggestions
00:56:12.120 | is a really hot topic.
00:56:14.000 | And it's very unpredictable as to when it'll be ready
00:56:16.920 | but I'm really looking forward to seeing
00:56:18.480 | how much progress we make against that.
00:56:21.000 | I think some of the core technologies
00:56:25.320 | like generative adversarial networks are immensely promising
00:56:29.400 | and seeing how quickly those become practical
00:56:33.040 | for mainstream use cases at high resolution
00:56:35.320 | with really good quality is also exciting.
00:56:38.120 | And they also have this sort of strange quality
00:56:40.160 | where even the things they do oddly are odd
00:56:42.000 | in an interesting way.
00:56:42.960 | So it can look like dreaming or something.
00:56:45.960 | So that's fascinating.
00:56:49.240 | I think internally we have the Sensei platform,
00:56:53.640 | which is a way in which we're pulling our neural nets
00:56:57.920 | and other intelligence models into a central platform
00:57:01.600 | which can then be leveraged
00:57:03.480 | by multiple product teams at once.
00:57:05.120 | So we're in the middle of transitioning from a model where,
00:57:08.400 | once you have a good idea,
00:57:09.320 | you pick a product team to work with
00:57:10.720 | and hand-design it for that use case,
00:57:14.400 | to a more sort of Henry Ford approach:
00:57:17.040 | stand it up in a standard way
00:57:18.400 | which can be accessed in a standard way
00:57:20.600 | which should mean that the time between a good idea
00:57:22.840 | and impacting our products will be greatly shortened.
00:57:26.200 | And when one product has a good idea,
00:57:28.680 | many of the other products can just leverage it too.
00:57:31.840 | So it's sort of an economy of scale.
00:57:33.760 | So that's more about the how than the what
00:57:35.800 | but that combination of this sort of renaissance in AI,
00:57:39.640 | there's a comparable one in graphics
00:57:41.200 | with real-time ray tracing
00:57:42.480 | and other really exciting emerging technologies.
00:57:46.000 | And when these all come together,
00:57:47.360 | you'll sort of basically be dancing with light, right?
00:57:50.400 | Where you'll have real-time shadows, reflections
00:57:53.160 | and as if it's a real world in front of you
00:57:56.320 | but then with all these magical properties brought by AI
00:57:58.720 | where it sort of anticipates or modifies itself
00:58:01.760 | in ways that make sense based on how it understands
00:58:04.480 | the creative task you're trying to do.
00:58:06.640 | - That's a really exciting future for creatives,
00:58:10.320 | and for myself too as a creator.
00:58:12.160 | So first of all, I work in autonomous vehicles.
00:58:14.120 | I'm a roboticist, I love robots.
00:58:16.320 | And I think you have a fascination with snakes,
00:58:19.040 | both natural and artificial, robots.
00:58:21.760 | I share your fascination.
00:58:23.600 | I mean, their movement is beautiful, adaptable.
00:58:26.040 | The adaptability is fascinating.
00:58:28.600 | There are, I looked it up,
00:58:30.720 | 2,900 species of snakes in the world,
00:58:34.120 | 375 of them venomous. Some are tiny, some are huge.
00:58:37.640 | I saw that there's one that's 25 feet long in some cases.
00:58:41.480 | So what's the most interesting thing
00:58:44.720 | that you connect with in terms of snakes,
00:58:47.480 | both natural and artificial?
00:58:49.920 | What was the connection with robotics AI
00:58:53.920 | and this particular form of a robot?
00:58:56.440 | - Well, it actually came out of my work in the '80s
00:58:58.640 | on computer animation where I started doing things
00:59:01.400 | like cloth simulation and other kinds of soft-body simulation,
00:59:04.880 | and you'd sort of drop it and it would bounce
00:59:07.560 | and then it would just sort of stop moving.
00:59:08.760 | And I thought, well, what if you animate the spring lengths
00:59:11.440 | and simulate muscles and the simplest object
00:59:14.480 | I could do that for was an earthworm.
00:59:16.200 | So I actually did a paper in 1988
00:59:18.680 | called "The Motion Dynamics of Snakes and Worms."
00:59:21.120 | And I read the physiology literature
00:59:23.400 | on both how snakes and worms move
00:59:26.120 | and then did some of the early computer animation
00:59:29.400 | examples of that.
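[Editor's note: a minimal sketch of the idea Gavin describes, a chain of point masses joined by springs whose rest lengths are animated like muscles. This is an illustrative Python reconstruction under invented parameters, not the formulation from the 1988 paper; the anisotropic-friction model is one common way to turn a travelling compression wave into net crawling.]

```python
# Muscle-driven soft body: a 1-D chain of unit masses joined by springs
# whose rest lengths are animated over time. A compressive wave travelling
# along the body, combined with direction-dependent ground friction,
# produces earthworm-like crawling. All constants are invented.
import numpy as np

N = 20            # point masses along the body
L0 = 1.0          # nominal segment rest length
k = 50.0          # spring stiffness
damping = 2.0     # velocity damping
dt = 0.005        # integration time step

x = np.arange(N) * L0      # positions along a line
v = np.zeros(N)            # velocities

def rest_lengths(t):
    # "Muscle" activation: modulate each segment's rest length with a
    # wave whose phase advances from segment to segment.
    phase = 2 * np.pi * (t - np.arange(N - 1) / (N - 1))
    return L0 * (1.0 + 0.3 * np.sin(phase))

def step(t):
    global x, v
    stretch = np.diff(x) - rest_lengths(t)   # per-segment extension
    f = np.zeros(N)
    f[:-1] += k * stretch                    # extended spring pulls mass forward
    f[1:]  -= k * stretch                    # ...and its neighbor backward
    f -= damping * v
    # Anisotropic friction: sliding backward is resisted more than sliding
    # forward, which rectifies the wave into net motion; direction of travel
    # depends on the wave direction and these friction constants.
    f -= np.where(v < 0, 8.0, 1.0) * v
    v += f * dt                              # semi-implicit Euler
    x += v * dt

for i in range(20000):
    step(i * dt)
print("net displacement of body center:", x.mean() - (N - 1) * L0 / 2)
```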
00:59:30.520 | - So your interest in robotics started with graphics.
00:59:34.760 | - Came out of simulation and graphics.
00:59:37.000 | When I moved from Alias to Apple,
00:59:40.440 | we actually did a movie called "Her Majesty's Secret Serpent"
00:59:42.960 | which is about a secret agent snake that parachutes in
00:59:46.240 | and captures a film canister from a satellite
00:59:48.600 | which tells you how old-fashioned our thinking was back then,
00:59:51.080 | sort of a classic 1950s or '60s Bond movie
00:59:54.160 | kind of thing.
00:59:55.000 | And at the same time, I'd always made radio-controlled ships
00:59:59.520 | from scratch when I was a child.
01:00:02.120 | And I thought, well, how hard can it be to build a real one?
01:00:05.040 | And so then started what turned out to be
01:00:08.200 | like a 15 year obsession with trying to build
01:00:10.400 | better snake robots.
01:00:11.800 | And the first one that I built
01:00:14.040 | just sort of slithered sideways
01:00:15.240 | but didn't actually go forward.
01:00:16.600 | Then I added wheels and building things in real life
01:00:19.880 | makes you honest about the friction.
01:00:21.920 | The thing that appeals to me is I love creating
01:00:25.840 | the illusion of life, which is what drove me to animation.
01:00:29.640 | And if you have a robot with enough degrees
01:00:31.520 | of coordinated freedom that move in a kind of
01:00:34.400 | biological way, then it starts to cross the uncanny valley
01:00:37.520 | and to seem like a creature rather than a thing.
01:00:40.920 | And I certainly got that with the early snakes. By S3,
01:00:45.200 | I had it able to sidewind as well as go directly forward.
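[Editor's note: a hedged sketch of how coordinated slithering and sidewinding gaits are commonly generated for snake robots with alternating horizontal and vertical joints, following the serpenoid-curve idea from the robotics literature. It is not necessarily how S3 was programmed; the joint count, amplitudes, and phase step below are invented.]

```python
# Gait generator: each joint follows a sine wave with a per-joint phase
# offset. Driving only the horizontal (yaw) joints gives lateral
# undulation; adding a vertical (pitch) wave a quarter period out of
# phase turns slithering into sidewinding.
import math

def joint_angles(t, n_joints=12, amp_h=40.0, amp_v=15.0,
                 freq=1.0, phase_step=math.pi / 3, sidewind=False):
    """Return per-joint angles in degrees at time t (seconds)."""
    angles = []
    for i in range(n_joints):
        if i % 2 == 0:
            # Horizontal joint: the main propagating body wave.
            a = amp_h * math.sin(2 * math.pi * freq * t + i * phase_step)
        elif sidewind:
            # Vertical joint: same wave, 90 degrees behind, lifting
            # segments so the body rolls sideways across the ground.
            a = amp_v * math.sin(2 * math.pi * freq * t
                                 + i * phase_step - math.pi / 2)
        else:
            a = 0.0  # keep vertical joints flat for plain slithering
        angles.append(a)
    return angles

# Example: sample the sidewinding gait at t = 0.25 s.
print(joint_angles(0.25, sidewind=True))
```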
01:00:50.000 | My wife-to-be suggested that it would be the ring bearer
01:00:52.560 | at our wedding.
01:00:53.400 | So it actually went down the aisle carrying the rings
01:00:56.120 | and got in the local paper for that, which was really fun.
01:01:00.200 | And this was all done as a hobby.
01:01:03.640 | And then, at the time, onboard compute
01:01:06.600 | was incredibly limited.
01:01:07.760 | It was sort of-
01:01:08.600 | - Yeah, so you should explain that.
01:01:09.600 | These things, the whole idea is that
01:01:11.960 | you're trying to run it autonomously.
01:01:15.240 | - Autonomously on board power.
01:01:17.360 | On board, right.
01:01:18.240 | And so the very first one, I actually built the controller
01:01:21.840 | from discrete logic, 'cause I used to do LSI,
01:01:25.360 | you know, circuits and things when I was a teenager.
01:01:28.200 | And then for the second and third ones,
01:01:30.720 | eight-bit microprocessors were available
01:01:32.800 | with like a whole 256 bytes of RAM,
01:01:36.120 | which you could just about squeeze in.
01:01:37.680 | So they were radio-controlled rather than autonomous
01:01:40.480 | and really were more about the physicality
01:01:43.360 | and coordinated motion.
01:01:44.800 | I've occasionally taken a sidestep into thinking,
01:01:49.960 | if only I could make it cheaply enough,
01:01:51.560 | it would make a great toy, which has been a lesson
01:01:54.000 | in how clockwork is its own magical realm
01:01:58.640 | that you venture into and learn things about backlash
01:02:02.120 | and other things you don't take into account
01:02:03.680 | as a computer scientist,
01:02:04.720 | which is why what seemed like a good idea doesn't work.
01:02:07.160 | So it was quite humbling.
01:02:08.920 | And then more recently I've been building S9,
01:02:11.920 | which is a much better engineered version of S3
01:02:15.160 | where the motors wore out and it doesn't work anymore.
01:02:17.280 | And you can't buy replacements,
01:02:18.520 | which is sad given that it was such a meaningful one.
01:02:22.320 | S5 was about twice as long
01:02:24.640 | and looked much more biologically inspired.
01:02:29.640 | Unlike the typical roboticist, I taper my snakes.
01:02:33.560 | There are good mechanical reasons to do that,
01:02:35.440 | but it also makes them look more biological,
01:02:37.200 | although it means every segment's unique
01:02:39.200 | rather than a repetition,
01:02:42.360 | which is why most engineers don't do it.
01:02:44.400 | It actually saves weight and improves leverage and everything.
01:02:47.040 | And that one is currently on display
01:02:50.360 | at the International Spy Museum in Washington, DC.
01:02:53.080 | Not that it's done any spying.
01:02:54.800 | It was on YouTube and it got its own conspiracy theory
01:02:58.880 | where people thought that it wasn't real
01:03:00.200 | 'cause I work at Adobe, it must be fake graphics.
01:03:03.080 | And people would write to me, "Tell me it's real."
01:03:05.480 | You know, they say, "The background doesn't move."
01:03:06.800 | And it's like, it's on a tripod, you know?
01:03:08.960 | So that one, but you can see the real thing,
01:03:12.480 | so it really is true.
01:03:13.600 | And then the latest one is the first one
01:03:17.160 | where I could put a Raspberry Pi,
01:03:19.120 | which leads to all sorts of terrible jokes
01:03:20.600 | about pythons and things, but.
01:03:22.200 | - Yeah. - Yeah.
01:03:24.240 | But this one can have onboard compute.
01:03:26.480 | And then where my hobby work
01:03:28.960 | and my work work are converging is
01:03:31.440 | you can now add vision accelerator chips,
01:03:34.320 | which can evaluate neural nets
01:03:36.200 | and do object recognition and everything.
01:03:37.840 | So both for the snakes and more recently
01:03:40.680 | for the spider that I've been working on,
01:03:42.720 | having, you know, desktop level compute
01:03:46.200 | is now opening up a whole world of true autonomy
01:03:50.200 | with onboard compute, onboard batteries,
01:03:52.320 | and still having that sort of biomimetic quality
01:03:56.640 | that appeals to children in particular;
01:04:00.040 | they are really drawn to them.
01:04:01.320 | And adults think they look creepy,
01:04:02.920 | but children actually think they look charming.
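[Editor's note: a hedged sketch of the kind of onboard inference loop a Raspberry Pi plus vision accelerator enables, assuming the tflite_runtime package and a pre-trained image classification model; the model file name and the way the result gates robot behavior are hypothetical placeholders.]

```python
# On-board object recognition on a Raspberry Pi-class computer using
# TensorFlow Lite. The model path below is a hypothetical placeholder.
import numpy as np
from tflite_runtime.interpreter import Interpreter

interpreter = Interpreter(model_path="mobilenet_v2.tflite")  # hypothetical file
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

def classify(frame_rgb):
    """frame_rgb: HxWx3 uint8 image already resized to the model's input size."""
    tensor = np.expand_dims(frame_rgb, axis=0).astype(inp["dtype"])
    interpreter.set_tensor(inp["index"], tensor)
    interpreter.invoke()
    scores = interpreter.get_tensor(out["index"])[0]
    return int(np.argmax(scores))  # index into the model's label list

# In a robot control loop, the predicted class could gate behavior,
# e.g. orient toward a person or freeze when an obstacle class is seen.
```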
01:04:05.680 | And I gave a series of lectures at Girls Who Code
01:04:10.600 | to encourage people to take an interest in technology.
01:04:14.280 | And at the moment, I'd say they're still more expensive
01:04:18.080 | than the value that they add,
01:04:19.320 | which is why they're a great hobby for me,
01:04:20.760 | but they're not really a great product.
01:04:22.920 | It makes me think about doing that very early thing I did
01:04:28.120 | at Alias with changing the muscle rest lengths.
01:04:31.120 | If I could do that with a real artificial muscle material,
01:04:34.400 | then the next snake ideally would use that
01:04:37.600 | rather than motors and gearboxes and everything.
01:04:39.600 | It would be lighter, much stronger,
01:04:42.280 | and more continuous and smooth.
01:04:44.960 | So it's, I like to say being in research
01:04:48.680 | is a license to be curious.
01:04:50.000 | And I have the same feeling with my hobby.
01:04:52.000 | It forced me to read biology and be curious about things
01:04:55.920 | that otherwise would have just been, you know,
01:04:58.560 | a National Geographic special.
01:04:59.760 | Suddenly I'm thinking, how does that snake move?
01:05:02.000 | Can I copy it?
01:05:03.400 | I look at the trails that sidewinding snakes leave in sand
01:05:06.200 | and see if my snake robots would do the same thing.
01:05:09.160 | - There's something in that,
01:05:11.040 | I mean, I like how you put it,
01:05:12.120 | trying to bring life into it, and beauty.
01:05:13.920 | - Absolutely.
01:05:14.760 | And then ultimately give it a personality,
01:05:17.120 | which is where the intelligent agent research
01:05:19.160 | will converge with the vision and voice synthesis
01:05:22.560 | to give it a sense of having,
01:05:25.160 | not necessarily human level intelligence.
01:05:27.080 | I think the Turing test is such a high bar.
01:05:29.680 | It's a little bit self-defeating,
01:05:32.480 | but having one that you can have
01:05:34.880 | a meaningful conversation with,
01:05:36.760 | especially if you have a reasonably good sense
01:05:39.360 | of what you can say.
01:05:40.360 | So not trying to have it so a stranger could walk up
01:05:45.040 | and have one, but so as a pet owner or a robot pet owner,
01:05:49.520 | you could know what it thinks about
01:05:51.640 | and what it can reason about.
01:05:53.840 | - Or sometimes just a meaningful interaction.
01:05:56.240 | If you have the kind of interaction you have with a dog,
01:05:58.960 | sometimes you might have a conversation,
01:06:00.880 | but it's usually one way.
01:06:02.160 | - Absolutely.
01:06:03.000 | - And nevertheless, it feels like a meaningful connection.
01:06:06.960 | - And one of the things that I'm trying to do
01:06:09.200 | in the sample audio that we'll play you
01:06:11.600 | is beginning to get towards the point
01:06:14.320 | where the reasoning system can explain
01:06:16.680 | why it knows something or why it thinks something.
01:06:19.440 | And that again, creates the sense
01:06:21.360 | that it really does know what it's talking about,
01:06:23.200 | but also for debugging,
01:06:26.760 | as you get more and more elaborate behavior,
01:06:29.680 | it's like, why did you decide to do that?
01:06:32.280 | How do you know that?
01:06:33.400 | I think the robot's really my muse
01:06:37.960 | for helping me think about the future of AI
01:06:40.280 | and what to invent next.
01:06:42.680 | - So even at Adobe,
01:06:44.760 | that's mostly operating in the digital world.
01:06:47.200 | - Correct.
01:06:48.040 | - Do you ever see a future where Adobe
01:06:51.160 | even expands into the more physical world, perhaps?
01:06:54.680 | So bringing life not just into animations,
01:06:57.920 | but bringing life into physical objects,
01:07:01.240 | whether it's... well, do you?
01:07:04.080 | - I'd have to say at the moment, it's a twinkle in my eye.
01:07:06.520 | I think the more likely thing is that we will bring
01:07:10.200 | virtual objects into the physical world
01:07:13.160 | through augmented reality.
01:07:15.000 | And many of the ideas that might take five years
01:07:17.960 | to build a robot to do,
01:07:19.520 | you can do in a few weeks with digital assets.
01:07:22.920 | So I think when really intelligent robots
01:07:27.560 | finally become commonplace,
01:07:29.160 | they won't be that surprising
01:07:30.440 | because we'll have been living with those personalities
01:07:32.600 | in the virtual sphere for a long time.
01:07:35.400 | And then they'll just say, oh, it's Siri with legs
01:07:38.000 | or Alexa on hooves or something.
01:07:41.400 | So I can see that world coming.
01:07:44.720 | And for now, it's still an adventure
01:07:47.800 | and we don't know quite what the experience will be like.
01:07:51.160 | And it's really exciting to sort of see
01:07:53.840 | all of these different strands of my career converge.
01:07:56.520 | - Yeah, in interesting ways.
01:07:59.040 | And it is definitely a fun adventure.
01:08:01.360 | So let me end with my favorite poem,
01:08:06.360 | the last few lines of my favorite poem of yours
01:08:08.880 | that ponders mortality.
01:08:10.520 | And in some sense, immortality,
01:08:13.280 | as our ideas live through the ideas of others,
01:08:16.040 | through the work of others,
01:08:17.360 | it ends with: "Do not weep or mourn.
01:08:20.760 | It was enough the little atomies
01:08:22.320 | permitted just a single dance.
01:08:24.680 | Scatter them as deep as your eyes can see.
01:08:27.880 | I'm content to have another chance.
01:08:30.440 | Sweeping more centered parts along
01:08:33.000 | to join a jostling lifting throng as others danced in me."
01:08:37.280 | Beautiful poem, beautiful way to end it.
01:08:40.840 | Gavin, thank you so much for talking today.
01:08:42.720 | And thank you for inspiring and empowering
01:08:45.240 | millions of people like myself for creating amazing stuff.
01:08:49.560 | - Oh, thank you. It was a great conversation.
01:08:51.760 | (upbeat music)