Hi, I'm Reid Mayo, founder of Reema AI. Welcome to Shift Left, how to become an AI engineer from a full-stack background. In this talk, we'll provide and review a syllabus that walks you step-by-step through a defined process with practical tutorials that teach you the comprehensive best practice skills and knowledge required to launch a professional AI engineering career.
Think of it in a way as an AI engineering boot camp. So this talk assumes that you have a strong full-stack engineering background. You should be comfortable building modern tech from the ground up across all of the different layers, infrastructure, database and persistence, and applications both on the back end and on the client side.
However, this talk assumes zero background in AI or machine learning; we're going to start from scratch there. So why would you be interested in becoming an AI engineer? For a detailed summary, I'd encourage you to read The Rise of the AI Engineer by Shawn "swyx" Wang, the essay that inspired the name of this talk.
One critical takeaway from this essay is when Swix identifies that full-stack engineers can now deploy a wide variety of legitimately useful AI solutions by leveraging new foundational models. Previously, such solutions would have required substantial experience in traditional ML techniques and costly investment in upfront data collection. So let's go ahead and move forward.
So before we dive into the syllabus itself, I want you to follow a few techniques from the book The Art of Learning, which this course was designed around. This will make your learning more effective and efficient. First, stay focused and limit distractions. There's a lot of low-value information out there with diminishing returns.
Stay focused on the important topics. Speaking of important topics, we're going to invest heavily in the fundamentals in this course. By understanding fundamental building blocks well, we'll be able to build sophisticated AI products through composition of those blocks. Lastly, as you go through the syllabus, use ChatGPT as a private tutor.
Anytime you come across a new concept, use the Socratic method with ChatGPT to unfurl the topic until you understand it thoroughly. You'd be surprised how many concepts pre-date ChatGPT's January 2022 knowledge cutoff date. So regarding the syllabus itself, as we go through each section, I'll be spending most of our limited time talking about the why.
We'll summarize what you will learn and why it is important. Let's go ahead and dive in. Section one, overview to large language models. Before we start working with large language models, it's useful to start with a short but respectably thorough overview of what they are and how they work at a high level.
Cohere is a company founded by one of the creators of the Transformer architecture, and they've got a great overview of these core concepts in their educational docs, so we'll start there. Remember, stay focused: review only module one, in its entirety, and keep pairing the Socratic method with ChatGPT to flesh out your knowledge as you go along.
Okay, moving forward. Section two, prompt engineering. So on its face, prompt engineering feels like a bunch of voodoo mumbo jumbo. It feels absurd, really, because we're used to working with symbolic architectures based on code logic. So it's strange to imagine getting higher quality output by prompting an AI model politely.
But language models are neural architectures; they're inspired by our brains, so different techniques are required. The bottom line is that prompt engineering objectively increases the quality of a neural architecture's output, language models included. So now you might be tempted to say, all right, I'm going to skip all this prompt engineering stuff and get straight to fine-tuning models.
But fine-tuning quality is often improved by starting with the best-performing prompts and using those prompts in your fine-tuning training data. Lastly, it's important to really sink your hands into the prompt engineering clay to see what language models are capable of, and also to probe their limitations.
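To make one of these techniques concrete, here's a minimal, dependency-free sketch of few-shot prompting, where a handful of worked examples are prepended to the query so the model can infer the task from the pattern. The sentiment task and the example reviews are my own illustrations, not taken from the guides:

```python
# Few-shot prompting sketch: prepend worked (input, label) examples to the
# query. The task, examples, and labels below are illustrative only.

def build_few_shot_prompt(examples, query):
    """Assemble a few-shot classification prompt from (text, label) pairs."""
    lines = ["Classify the sentiment of each review as Positive or Negative.", ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # End with an unanswered instance for the model to complete.
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)

examples = [
    ("Great product, works perfectly.", "Positive"),
    ("Broke after two days.", "Negative"),
]
prompt = build_few_shot_prompt(examples, "Exceeded my expectations.")
```

The same scaffolding extends naturally to the layered techniques the guides cover, such as adding chain-of-thought rationales to each worked example.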
So regarding course materials, start out by watching the overview video from Prompt Engineering Guide founder Elvis Saravia, then dive directly into the guide itself. Read it cover to cover and pay special attention to the graduate job classification case study, which shows how iteratively layering on prompt engineering techniques increases output quality in aggregate.
Next, read the Learn Prompting docs (learnprompting.org), favored by OpenAI, cover to cover. The concepts that overlap with the first guide are useful to review to really lock in these critical ideas, and this guide covers additional concepts as well. All right, moving on. Section three, OpenAI. OpenAI does two things incredibly well.
One, they provide state-of-the-art AI models, and two, they make them incredibly accessible. By learning OpenAI, you can understand the art of what's possible today. You can also start building and experimenting with AI engineering quickly. However, there are some practical limitations to consider that we will address further on.
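To give a feel for that accessibility, here's a minimal sketch of the shape of a Chat Completions request. The helper function and example strings are my own illustrative assumptions, and the actual network call (which needs the openai package and an API key in the environment) is left commented out so the sketch runs standalone:

```python
# Sketch of an OpenAI Chat Completions request. The helper and example
# strings are illustrative; the real call is commented out below.

def build_messages(system_prompt, user_input):
    """Build the messages payload the Chat Completions endpoint expects."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_input},
    ]

messages = build_messages("You are a concise assistant.", "What is an LLM?")

# Uncomment to make a real call (requires `pip install openai` and
# an OPENAI_API_KEY environment variable):
# from openai import OpenAI
# client = OpenAI()
# resp = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages)
# print(resp.choices[0].message.content)
```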
So regarding course material, we're going to read the OpenAI docs and API reference cover to cover. Then I would encourage you to quickly review the practical hands-on examples in their cookbook. Don't spend too much time there; you can come back later, and we want to keep marching. Okay, moving on.
Section four, LangChain. LangChain is the application framework that lets you put AI tech together in an organized and well-architected way, so it is highly maintainable, modular, and scalable. LangChain integrates all the different parts and pieces required for a modern AI system: models, prompts, long- and short-term memory for retrieval-augmented generation and conversations, practically everything.
Furthermore, for any components that aren't supported yet, LangChain is flexible enough to allow straightforward integration of new components, including your proprietary needs. Lastly, and this is very important in the context of this syllabus: because LangChain is the glue layer for most everything else in the AI ecosystem, you will learn a lot about the comprehensive practice of AI engineering by building a comprehensive understanding of LangChain.
Now, onto the course materials. Building AI apps is a new paradigm, and there's a lot to absorb. So we're going to prime you with a non-technical, comprehensive executive summary by CommandBar first, then we'll follow up with a simple plain-English technical guide that covers only some basic LangChain building blocks.
So you can begin to quickly grok how a more complex AI system can be built up modularly with this framework. As you might imagine, the meat and potatoes of this section will be the LangChain docs and code base. LangChain's documentation is highly thorough, so take full advantage of it.
I encourage reading both the Python and the JavaScript/TypeScript docs cover to cover, as the review helps lock in your knowledge and there are important concepts in each version that aren't yet in the other. As you read through the docs, pop over to GitHub and stick your head under the hood of the code base to see how LangChain implements the features and functionality that the documentation covers.
This will give you in-depth practical knowledge of how to build AI tech the right way. Lastly, for real-world LangChain app tutorials, Mayo Oshin has great video walkthroughs. Specifically, I would encourage reviewing his LangChain beginners tutorial, as it covers the fundamentals. His other videos take those fundamentals and apply them to more complex tasks.
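Before diving into the docs, it helps to see the composition pattern LangChain is built around: a prompt template feeding a model, with the pieces swappable and reusable. This dependency-free sketch illustrates that pattern using stand-in classes of my own; it is deliberately not LangChain's actual API:

```python
# Sketch of the prompt-template -> model composition pattern that frameworks
# like LangChain formalize. All classes here are illustrative stand-ins.

class PromptTemplate:
    """Fills named slots in a template string."""
    def __init__(self, template):
        self.template = template
    def format(self, **kwargs):
        return self.template.format(**kwargs)

class FakeModel:
    """Stands in for an LLM; echoes the prompt so the sketch is runnable."""
    def invoke(self, prompt):
        return f"MODEL OUTPUT for: {prompt}"

class Chain:
    """Wires a template into a model; either part can be swapped out."""
    def __init__(self, prompt, model):
        self.prompt, self.model = prompt, model
    def run(self, **kwargs):
        return self.model.invoke(self.prompt.format(**kwargs))

chain = Chain(PromptTemplate("Translate to French: {text}"), FakeModel())
result = chain.run(text="hello")
```

Once this shape clicks, the real framework's components (templates, models, parsers, memory) read as richer versions of the same idea.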
Alright, moving on. Section five, evaluating AI models. Coming from a full-stack background, evals are basically your software tests. Before we start fine-tuning black box AI models, we need a scientific process that can evaluate our changes iteratively. Otherwise, how do we know we're making improvements and not regressions, right? So regarding the course materials, OpenAI has a great cookbook that walks you through writing some example evals.
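To make the idea concrete, here's a minimal sketch of an eval harness. Because model output rarely matches a reference string exactly, it uses a looser contains-the-answer check; the check, the stub model, and the test cases are all illustrative assumptions of mine, not part of OpenAI's framework:

```python
# Minimal eval harness sketch. Exact string matching is often too brittle
# for LLM output, so this uses a looser "contains the answer" check.

def contains_answer_eval(model_output, expected):
    """Pass if the expected answer appears anywhere in the model's output."""
    return expected.lower() in model_output.lower()

def run_evals(model_fn, cases):
    """Score a model function over (input, expected) cases; return accuracy."""
    passed = sum(contains_answer_eval(model_fn(q), a) for q, a in cases)
    return passed / len(cases)

# A stub "model" so the sketch runs without an API key:
def stub_model(question):
    if "France" in question:
        return "The capital of France is Paris."
    return "I don't know."

cases = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Spain?", "Madrid"),
]
accuracy = run_evals(stub_model, cases)
```

Run before and after each fine-tuning or prompt change, a score like this is what lets you tell improvements from regressions.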
Note that the nature of AI output often means you're going to have to be a little bit creative when writing effective evals. Furthermore, OpenAI also provides a framework that includes a robust eval suite and allows for writing your custom evals as well. Review these materials quickly. Alright, moving on.
Section six, fine-tuning. By this point, you've already gained some exposure to fine-tuning OpenAI's models. We're going to take that further by going step-by-step through their fine-tuning cookbook. Knowledge of how to fine-tune OpenAI models will take you a long way. However, there are practical limitations to relying on OpenAI alone.
For example, it can be cost prohibitive and you can run into latency or rate limiting issues in production. This is in addition to standard privacy and control concerns. Because of this, an efficient pattern is to prototype and ship a solution quickly using OpenAI's models, start gathering usage and training data.
Then, if the solution needs to start scaling, see if you can fine-tune a smaller and cheaper open-source model to match or out-compete OpenAI's model on your target use case. So regarding course materials, first completely go through the OpenAI fine-tuning hands-on cookbook. After that, we'll walk through Anyscale's tutorial that demonstrates how to fine-tune an open-source model, Meta's Llama 2, such that it can match or even beat OpenAI's models on target tasks.
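For reference, OpenAI's chat fine-tuning expects training data as JSONL, one complete conversation per line in the same messages format the chat API uses. This sketch assembles one such line; the ticket-classification task and strings are hypothetical examples of mine:

```python
# Build one line of OpenAI chat fine-tuning training data (JSONL format:
# one JSON-encoded conversation per line). Example content is hypothetical.
import json

def to_training_line(system, user, assistant):
    """Serialize one (system, user, assistant) conversation as a JSONL line."""
    return json.dumps({"messages": [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
        {"role": "assistant", "content": assistant},
    ]})

line = to_training_line(
    "You classify support tickets as 'billing' or 'technical'.",
    "I was charged twice this month.",
    "billing",
)
# Write many such lines to a .jsonl file and upload it via the Files API.
```

Note how this connects back to prompt engineering: the system and user text here should be your best-performing prompts, captured as training data.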
Finally, we're going to skim OpenPipe's cost-savings case study, which shows how, on an example task (and it's not cherry-picked), a smaller fine-tuned Llama 2 model costing $19 can match results from OpenAI's state-of-the-art model, which would cost around $24,000 for the same task. Final section, advanced study.
So by this point, you've completed the bootcamp section of the syllabus. I'd encourage you to start deploying your AI engineering skills in the real world before moving on to these advanced studies. However, once you're ready to take your skills well beyond the basics, fast.ai's Practical Deep Learning course and Hugging Face's NLP course, along with their docs, will give you a rich understanding of deep learning theory.
In addition to learning fine-tuning further, you will also be able to train models from scratch. All right, so we've reached the end. So the syllabus is linked to my left. Thanks for joining me today, and for any questions, please reach out to me on LinkedIn. Bye!