How LLMs will affect Jobs?

LLMs
Jobs
Work
Author

Lawrence Wu

Published

August 29, 2023

I have been interested in how language models will affect the future of work. Having worked with LLMs the last ~6 months, I’ve seen their impact on my work as a data scientist. Jason Wei gave a talk at KDD 2023 called Scaling, Emergence and Reasoning in Large Language Models that was great. He works at the intersection of LLMs and AI Research and has a good view into how LLMs will change the field of AI Research. We need this same sort of thinking applied in other fields to get a sense for how to answer the question “How will LLMs change the future of work?” It’s important for more people to be aware the potential for LLMs to disrupt jobs and to get ahead of any changes.

How will LLMs affect data science work?

In terms of how I’ve seen LLMs affect data science work are:

5 years ago:

  1. Start with data: First step after a use case is defined for any data science problem –> acquire a data set and spend time preparing/cleaning the data
  2. Need to hire data labelers to label data or evaluate model output at scale
  3. Coding with Stack Overflow

Now

  1. Start with a model: Does a pre-trained model (embeddings, LLMs) solve your use case?
  2. A pre-trained model can generate required labels or evaluate model results directly
  3. Coding with GitHub Copilot, ChatGPT and Code Interpreter

How will LLMs affect the future of work?

I had the idea of using an LLM to assess the impact of LLMs on jobs. Having got llama2 up and running on my Mac M1 Pro using llama.cpp, it was pretty easy to use langchain, a set of occupations from ONET, and the llama2-13b-chat model to generate 10 ways that LLMs will affect a given occupation for the 1,016 ONET occupations. This took about 50 seconds per title to run and in total around 14 hours (and an unknown amount of electricity).

The idea and the prompt I used was partially inspired by the paper GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models where the framework they used to evaluate the impact of LLMs on jobs used LLMs to do the evaluation and their measure of a given job’s exposure used:

  • Direct exposure (E1) if using the described LLM via ChatGPT or the OpenAI playground can decrease the time required to complete the DWA or task by at least half (50%).
  • LLM+ Exposed (E2) if access to the described LLM alone would not reduce the time required to complete the activity/task by at least half, but additional software could be developed on top of the LLM that could reduce the time it takes to complete the specific activity/task with quality by at least half. Among these systems, we count access to image generation systems.

So the prompt I used was:

"As an {TITLE}, what are 10 areas of your work that you think will get disrupted by the widespread usage of LLMs or tools built on LLMs?"

Code

The code to do this:

from langchain.llms import LlamaCpp

MODEL_LLAMA2_13B = "llama-2-13b-chat.ggmlv3.q4_0.bin"
n_gpu_layers = 1  # Metal set to 1 is enough.
n_batch = 512  # Should be between 1 and n_ctx, consider the amount of RAM of your Apple Silicon Chip.

llm = LlamaCpp(
            model_path=MODEL_LLAMA2_13B,
            temperature=1,
            top_p=1,
            max_tokens=2048,
            n_ctx=2048,
            n=-1,
            repeat_penalty=1.1,
            n_threads=8,
            # callback_manager=callback_manager,
            n_gpu_layers=n_gpu_layers,
            n_batch=2048,  # Should be between 1 and n_ctx, consider the amount of RAM of your Apple Silicon Chip.
            f16_kv=True,  # MUST set to True, otherwise you will run into problem after a couple of calls
            verbose=False,
)

A Couple Examples

Running this for an entry-level data scientist:

llm("As an entry-level Data Scientist, what are 10 areas of your work that you think will get disrupted by the widespread usage of LLMs or tools built on LLMs?")

# Here's a list of 10 areas of my work as an entry-level Data Scientist that I think will be disrupted by the widespread usage of Large Language Models (LLMs) or tools built on LLMs:
# 1. Data Preprocessing: Many tasks in data preprocessing, such as tokenization, stemming, and lemmatization, can be automated using LLMs. This could make my work more efficient and reduce the time spent on these tasks.
# 2. Feature Engineering: LLMs can help generate new features that can improve model performance. For example, LLMs can be used to generate sentiment scores or topic models for text data. This could lead to better insights and improved accuracy in machine learning models.
# 3. Data Visualization: LLMs can be used to generate interactive visualizations that provide more detailed insights into data. For example, LLMs can be used to create interactive heatmaps or word clouds that show the distribution of words in a dataset.
# 4. Text Summarization: LLMs can be used to summarize long documents or articles into shorter summaries. This could be useful for quickly grasping the main points of a document or article.
# 5. Sentiment Analysis: LLMs can be used to analyze text data and provide sentiment scores. This could be useful for understanding public opinion on a particular topic or product.
# 6. Named Entity Recognition (NER): LLMs can be used to automatically identify and extract named entities such as people, organizations, and locations from text data. This could save time and improve the accuracy of downstream tasks such as information extraction or summarization.
# 7. Part-of-speech Tagging: LLMs can be used to automatically assign part-of-speech tags to words in a sentence. This could improve the accuracy of downstream tasks such as named entity recognition and sentiment analysis.
# 8. Dependency Parsing: LLMs can be used to analyze the grammatical structure of sentences and identify the relationships between words. This could lead to better insights into language usage and improved accuracy in natural language processing tasks.
# 9. Information Extraction: LLMs can be used to automatically extract structured data from unstructured text. For example, LLMs can be used to extract names, dates, and locations from news articles or social media posts.
# 10. Question Answering: LLMs can be used to answer questions based on the content of a document or article. This could be useful for providing quick answers to common customer support queries or for automating content search engines.
# Overall, the widespread adoption of LLMs and tools built on LLMs has the potential to significantly improve the efficiency and accuracy of many tasks in data science, particularly those involving natural language processing.

Repeating this for a Pediatric Pharmacist:

prompt="""
As an entry-level Pediatric Pharmacist, what are 10 areas of your work that you think will get disrupted by the widespread usage of LLMs or tools built on LLMs?
"""
llm(prompt)
# Although I am an early career pharmacist, and I lack industry expertise as a seasoned professional in the field of pediatrics or AI technology I can imagine several ways that LLMs could disrupt various aspects of my work:
# 1. Medication dosing: One potential area of disruption is medication dosing. LLMs may be able to quickly and accurately determine appropriate drug dosages for children based on their weight, age, and other factors, potentially reducing the need for human intervention and improving medication safety. 
# 2.Drug interactions: Another possible area of disruption is the identification of potential drug interactions. LLMs may be able to analyze large amounts of data and identify potential drug interaction that a human pharmacist might miss, helping to prevent adverse events and improve patient outcomes. 
# 3. Allergic reactions: With their ability to analyze large amounts of data, LLMs may also be useful in identifying patterns related to allergic reactions, allowing healthcare providers to take proactive steps to prevent these reactions or quickly respond to them if they occur.
# 4. Developmental delays: The use of LLMs could help identify children at risk for developmental delays and provide early interventions that can improve outcomes for those children.
# 5. Pediatric clinical decision support: With the ability to process vast amounts of data, LLMs may be able provide real-time clinical decision support to healthcare providers, improving diagnostic accuracy and reducing errors in patient care.  
# 6. Medical coding and billing: The use of LLMs could improve the accuracy of medical coding and billing, potentially reducing errors and improving reimbursement for healthcare providers. 
# 7. Pharmacovigilance: With their ability to analyze large amounts of data, LLMs may also be useful in pharmacovigilance by identifying potential safety issues related to pediatric medication use and allowing healthcare providers to take proactive steps to mitigate these risks.
# 8 Patient education: The use of LLMs could improve patient education, allowing healthcare providers to provide tailored information that addresses the unique needs of each child based on their medical history, allergies, and other factors. 
# 9. Referrals and consultations: With the ability to diagnose and treat a wide range of pediatric conditions, LLMs could help identify children who require specialist care or consultation with other healthcare providers, potentially reducing unnecessary referrals and improving patient outcomes.  
# 10. Public health surveillance: Finally, the use of LLMs could improve public health surveillance by analyzing large amounts of data related to pediatric health issues, allowing healthcare providers to identify trends and patterns that can inform population-based interventions and policies. 
# As an early career pharmacist, I recognize that the widespread adoption of LLMs or tools built on these technologies will require significant investment in training and infrastructure, as well as careful consideration of ethical and legal issues related to health data privacy and security.  
# However, if properly implemented, these technologies have the potential to transform the field of pediatric pharmacology, improving patient outcomes and reducing healthcare costs over time. Despite the challenges ahead, I remain excited about the potential applications of LLMs in pediatric pharmacology practice and look forward to seeing how these developments shape the future of our profession.

Conclusion

Hopefully this inspires some food for thought on how LLMs will affect the future of work. I think it would benefit most people to think about how a general purpose technology like LLMs (Andrew Ng calls AI like electricity in their broad applicability) will affect the specific things they are working and how they can use this technology to make themselves more productive. A similar analogy is those who have gotten good at searching with Google are that much more productive than those who haven’t. LLMs are a much more powerful tool than a search engine (after all you can give an LLM a search tool) and it’s important more people understand how to use this new tool.

Streamlit App

Here’s a Streamlit app you can use to explore all 1,016 titles and how it will potentially be impacted by LLMs: https://llm-impact-on-jobs.streamlit.app/