
How to Become an AI Engineer from a Fullstack Background - Reid Mayo


Chapters

0:00 Introduction
1:33 Learning Techniques
2:33 Overview of Large Language Models
3:05 Prompt Engineering
4:37 OpenAI
5:16 Langchain
7:26 Evaluation
8:10 Fine-Tuning
9:36 Advanced Study


00:00:00.400 | Hi, I'm Reid Mayo, founder of Reema AI. Welcome to Shift Left, how to become an AI engineer
00:00:19.520 | from a full-stack background. In this talk, we'll provide and review a syllabus that walks
00:00:23.960 | you step-by-step through a defined process with practical tutorials that teach you the
00:00:28.880 | comprehensive best practice skills and knowledge required to launch a professional AI engineering
00:00:34.300 | career. Think of it in a way as an AI engineering boot camp. So this talk assumes that you have a
00:00:39.800 | strong full-stack engineering background. You should be comfortable building modern tech from
00:00:44.040 | the ground up across all of the different layers, infrastructure, database and persistence, and
00:00:48.940 | applications both on the back end and on the client side. However, this talk assumes zero background with
00:00:54.840 | any AI or machine learning. We're going to start from scratch there. So why would you be interested
00:01:00.940 | in becoming an AI engineer? For an in-detail summary, I'd encourage you to read The Rise of the AI
00:01:06.080 | Engineer by Shawn "swyx" Wang that inspired the name of this talk. One critical takeaway from this essay is
00:01:12.320 | when swyx identifies that full-stack engineers can now deploy a wide variety of legitimately useful AI
00:01:18.480 | solutions by leveraging new foundational models. Previously, such solutions would have
00:01:23.800 | required substantial experience in traditional ML techniques and costly investment in upfront data
00:01:29.720 | collection. So let's go ahead and move forward. So before we dive into the syllabus itself, I want
00:01:36.640 | you to follow a few techniques from the book, The Art of Learning, that this course was designed
00:01:40.600 | around. This is going to make your learning more effective and efficient. First, stay focused and limit
00:01:45.900 | distractions. There's a lot of low-quality information out there with diminishing returns. Stay focused on
00:01:50.880 | the important topics. Speaking of important topics, we're going to invest heavily in the fundamentals in
00:01:55.840 | this course. By understanding fundamental building blocks well, we'll be able to build sophisticated AI
00:02:00.960 | products through composition of those blocks. Lastly, as you go through the syllabus, use ChatGPT as a
00:02:06.920 | private tutor. Anytime you come across a new concept, use the Socratic method with ChatGPT to unfurl the
00:02:13.200 | topic until you understand it thoroughly. You'd be surprised how many concepts pre-date ChatGPT's
00:02:18.180 | January 2022 knowledge cutoff date. So regarding the syllabus itself, as we go through each section,
00:02:24.180 | I'll be spending most of our limited time talking about the why. We'll summarize what you will learn and
00:02:29.980 | why it is important. Let's go ahead and dive in. Section one, overview of large language models. Before we start
00:02:36.860 | working with large language models, it's useful to start with a short but respectably thorough overview of what
00:02:42.320 | they are and how they work at a high level. Cohere is a company founded by one of the creators of the
00:02:47.000 | Transformer architecture and they've got a great overview of these core concepts in their educational
00:02:51.700 | docs. So we'll start there. Remember, stay focused. Only review module one in its entirety and keep pairing
00:02:58.680 | the Socratic method with ChatGPT to flesh out your knowledge as you go along. Okay, moving forward.
00:03:05.300 | Section two, prompt engineering. So on its face, prompt engineering feels like a bunch
00:03:09.980 | of voodoo mumbo jumbo. It feels absurd, really, because we're used to working with symbolic
00:03:15.180 | architectures based on code logic. So it's strange to imagine getting higher quality output by prompting
00:03:21.520 | an AI model politely. But the language models are neural architectures. They're inspired by our brains.
00:03:27.660 | So different techniques are required. The bottom line is that prompt engineering objectively
00:03:32.340 | increases the output quality of neural architectures such as language models. So now you might be tempted
00:03:38.840 | to say, all right, I'm going to skip all this prompt engineering stuff and get straight to fine tuning
00:03:42.540 | models. But fine tuning quality is often increased by starting with the best performing prompts and using
00:03:48.720 | those prompts in your fine tuning training data. Lastly, it's important to really sink your hands into the
00:03:54.180 | prompt engineering clay to see what language models are capable of, and also to probe their limitations.
00:03:59.220 | So regarding course materials, start out by watching the overview video from prompt engineering guide
00:04:05.220 | founder Elvis Saravia, then dive directly into the guide itself. Read it cover to cover and pay special
00:04:11.940 | attention to the graduate job classification case study that shows how layering on prompt engineering techniques
00:04:17.940 | iteratively increases quality of output in aggregate. Next, read the Learn Prompting (learnprompting.org) docs
00:04:23.940 | favored by OpenAI cover to cover. The redundant concepts in this second guide are useful to review to
00:04:30.020 | really lock in these critical concepts. And also this guide does cover additional concepts as well.
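To make the layering idea concrete, here is a minimal sketch of two techniques both guides cover: few-shot examples plus an explicit output-format cue. The sentiment task and labels are made up for illustration, not taken from the case study.

```python
# Assemble a few-shot classification prompt as a single string.
# The examples teach the model the expected label set and format.

FEW_SHOT_EXAMPLES = [
    ("I loved this product, works perfectly.", "positive"),
    ("Arrived broken and support never replied.", "negative"),
]

def build_prompt(text: str) -> str:
    """Build a few-shot sentiment prompt ending in an open 'Sentiment:' cue."""
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for review, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Review: {review}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Review: {text}")
    lines.append("Sentiment:")  # the model completes with just the label
    return "\n".join(lines)

prompt = build_prompt("Great value for the price.")
```

Layering more techniques (chain-of-thought, role instructions, delimiters) onto a builder like this is exactly the iterative quality work the case study demonstrates.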
00:04:35.140 | All right, moving on. Section three, OpenAI. OpenAI does two things incredibly well. One,
00:04:42.980 | they provide state-of-the-art AI models and two, they make them incredibly accessible. By learning OpenAI,
00:04:49.060 | you can understand the art of what's possible today. You can also start building and experimenting with
00:04:54.100 | AI engineering quickly. However, there are some practical limitations to consider that we will
00:04:59.460 | address further on. So regarding course material, we're going to read the open AI docs and API reference cover
00:05:05.540 | to cover. Then I would encourage you to quickly review the practical hands-on examples in their cookbook.
00:05:10.260 | Don't spend too much time there. You can come back later and we want to keep marching.
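As a small preview of the request shape the docs walk through, here is a sketch that only builds a chat completions payload; the actual SDK call is left commented out so nothing here needs a network or an API key, and the model name is just an example placeholder.

```python
# Build the chat completions request body the OpenAI docs describe:
# a model name plus a list of role-tagged messages.

def build_chat_request(system: str, user: str, model: str = "gpt-3.5-turbo") -> dict:
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
        "temperature": 0.0,  # low temperature for more reproducible output
    }

request = build_chat_request(
    "You are a concise assistant.",
    "Summarize what an embedding is in one sentence.",
)

# The commented lines show where the real SDK call would go:
# from openai import OpenAI
# client = OpenAI()  # reads OPENAI_API_KEY from the environment
# response = client.chat.completions.create(**request)
# print(response.choices[0].message.content)
```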
00:05:14.740 | Okay, moving on. Section four, Langchain. Langchain is the applications framework that allows you to
00:05:21.940 | put AI tech together in an organized and well-architected way. So it is highly maintainable,
00:05:27.220 | modular, and scalable. So Langchain integrates all the different parts and pieces required for a modern AI
00:05:33.620 | system. Models, prompts, long and short-term memory for retrieval augmented generation and conversations,
00:05:40.500 | practically everything. Furthermore, for any components that aren't supported yet,
00:05:44.980 | Langchain is flexible enough to allow straightforward integration of these new components, including your
00:05:49.620 | proprietary needs. Lastly, and this is very important in the context of this syllabus, because
00:05:55.460 | Langchain is the glue layer for most everything else in the AI ecosystem, you will learn a lot about the
00:06:01.140 | comprehensive practice of AI engineering by building a comprehensive understanding of Langchain.
00:06:06.020 | Now, onto the course materials. So building AI apps is a new paradigm. There's a lot to absorb.
00:06:13.140 | So we're going to prime you with a non-technical comprehensive executive summary by command bar.
00:06:18.900 | First, then we'll follow up with a simple plain English technical guide that covers only some basic
00:06:24.020 | Langchain building blocks. So you can begin to quickly grok how a more complex AI system can be built up
00:06:29.940 | modularly with this framework. So as you might imagine, the meat and potatoes of this section will be the
00:06:35.140 | Langchain docs and code base. Langchain's documentation is highly thorough. So take full advantage of it.
00:06:41.540 | I encourage reading both the Python and the JavaScript Tyscript docs cover to cover as the review helps lock in
00:06:48.260 | your knowledge and there are important concepts in each version that aren't yet in the other. As you read
00:06:54.260 | through the docs, pop over to GitHub and stick your head under the code base hood to see how Langchain
00:06:59.940 | implements the features and functionality that the documentation covers. This will give you in-depth
00:07:04.980 | practical knowledge on how to build AI tech the right way. Lastly, for real world Langchain app tutorials,
00:07:11.620 | Mayo Oshin has great video walkthroughs. Specifically, I would encourage reviewing his Langchain beginners tutorial
00:07:18.100 | as it covers the fundamentals. His other videos take these fundamentals and apply them towards more complex tasks.
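The modular-composition idea at the heart of Langchain (prompt template into model into output parser) can be previewed in plain Python. To be clear, this is not Langchain's actual API; it is only an offline illustration of the pipeline pattern the docs formalize, with a stand-in function instead of a real model call.

```python
# Plain-Python sketch of the prompt-template -> model -> parser pipeline.
# Each step is an ordinary function; chain() composes them in order.

from typing import Callable

def prompt_template(template: str) -> Callable[[dict], str]:
    """Return a step that fills a format-string template from a variables dict."""
    return lambda variables: template.format(**variables)

def fake_model(prompt: str) -> str:
    # Stand-in for an LLM call so the pipeline runs offline; wraps the prompt.
    return f"ANSWER({prompt})"

def parser(raw: str) -> str:
    """Strip the fake model's wrapper, mimicking an output parser step."""
    return raw.removeprefix("ANSWER(").removesuffix(")")

def chain(*steps):
    def run(value):
        for step in steps:
            value = step(value)
        return value
    return run

qa = chain(prompt_template("Q: {question}"), fake_model, parser)
result = qa({"question": "What is RAG?"})  # -> "Q: What is RAG?"
```

Swapping any single step (a different template, a real model client, a JSON parser) without touching the others is the maintainability property the section describes.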
00:07:24.500 | Alright, moving on. Section five, evaluating AI models. Coming from a full-stack background, evals are
00:07:32.100 | basically your software tests. Before we start fine-tuning black box AI models, we need a scientific
00:07:38.100 | process that can evaluate our changes iteratively. Otherwise, how do we know we're making improvements and not
00:07:43.540 | regressions, right? So regarding the course materials, OpenAI has a great cookbook that walks you through
00:07:49.380 | writing some example evals. Note that the nature of AI output often means you're going to have to be a
00:07:54.820 | little bit creative when writing effective evals. Furthermore, OpenAI also provides a framework that
00:08:00.580 | includes a robust eval suite and allows for writing your custom evals as well. Review these materials quickly.
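A minimal sketch of the "be a little creative" point: because model output is free-form text, an eval often normalizes before comparing instead of requiring an exact byte match. The task and expected answers below are made up for illustration.

```python
# Tiny eval harness: normalize free-form model output, then check whether
# the expected answer appears, and report accuracy over a set of cases.

import re

def normalize(text: str) -> str:
    """Lowercase and strip punctuation so formatting differences don't fail the eval."""
    return re.sub(r"[^\w\s]", "", text).strip().lower()

def contains_answer(model_output: str, expected: str) -> bool:
    return normalize(expected) in normalize(model_output)

def run_eval(cases: list[tuple[str, str]]) -> float:
    """Return accuracy over (model_output, expected_answer) pairs."""
    passed = sum(contains_answer(out, exp) for out, exp in cases)
    return passed / len(cases)

score = run_eval([
    ("The capital of France is Paris.", "paris"),
    ("I believe the answer is 42.", "42"),
    ("Sorry, I am not sure.", "42"),
])  # 2 of 3 cases pass
```

Tracking a score like this across prompt or fine-tuning changes is what turns tweaking into the iterative, scientific process the section calls for.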
00:08:09.220 | Alright, moving on. Section six, fine-tuning. By this point, you've already gained some exposure
00:08:14.180 | to fine-tuning OpenAI's models. We're going to take that further by going step-by-step through their
00:08:18.980 | fine-tuning cookbook. So knowledge of how to fine-tune OpenAI models will take you a long way. However,
00:08:24.820 | there are practical limitations to relying on OpenAI alone. For example, it can be cost prohibitive and you
00:08:30.420 | can run into latency or rate limiting issues in production. This is in addition to standard privacy and control
00:08:36.100 | concerns. Because of this, an efficient pattern is to prototype and ship a solution quickly using OpenAI's
00:08:42.660 | models, start gathering usage and training data. Then, if the solution needs to start scaling, see if
00:08:48.500 | you can fine-tune a smaller and cheaper open-source model to match or out-compete OpenAI's model on your
00:08:54.820 | target use case. So regarding course materials, first completely go through the OpenAI fine-tuning hands-on
00:09:01.540 | cookbook. After that, we'll walk through Anyscale's tutorial that demonstrates how to fine-tune an
00:09:06.900 | open-source model, Meta's Llama 2, such that it can match or even beat OpenAI's models on target tasks.
00:09:14.100 | Finally, we're going to skim OpenPipe's cost-savings case study that shows how, on an example task
00:09:20.340 | (and it's not cherry-picked), a smaller fine-tuned Llama 2 model, at a cost of $19, can match results
00:09:28.100 | from OpenAI's state-of-the-art model, which would cost around $24,000 for the same task.
00:09:34.100 | Final section, advanced study. So by this point, you've completed the bootcamp section of the syllabus.
00:09:41.060 | I'd encourage you to start deploying your AI engineering skills in the real world before moving on to these
00:09:45.380 | advanced studies. However, once you're ready to take your skills well beyond the basics,
00:09:50.260 | FastAI's Practical Deep Learning course and Hugging Face's NLP course, and their docs, will give you a
00:09:56.260 | rich understanding of deep learning theory. In addition to learning fine-tuning further, you will also be
00:10:01.380 | able to train models from scratch. All right, so we've reached the end. So the syllabus is linked to my left.
00:10:07.380 | Thanks for joining me today, and for any questions, please reach out to me on LinkedIn. Bye!