Get live statistics and analysis of Omar Khattab's profile on X / Twitter

Asst professor @MIT EECS & CSAIL (@nlp_mit). Author of ColBERT.ai and DSPy.ai (@DSPyOSS). Prev: CS PhD @StanfordNLP. Research @Databricks.

2k following24k followers

Archetype analysis

The Thought Leader

Omar Khattab is an assistant professor at MIT EECS & CSAIL, sharing deep insights into NLP, ML systems, and AI research. With a prolific tweet presence and broad academic influence, he consistently bridges cutting-edge research with practical knowledge for both students and fellow researchers. His Twitter is a dynamic mix of thoughtful commentary, research updates, and mentorship calls.

Recent engagement

Impressions

4.5M-582.7k

Estimate earning$845.17

Likes

8.7k-1.9k

53%

Retweets

3.3k-120

20%

Replies

599-83

Bookmarks

3.7k-1k

23%

Get more insights about Omar Khattab with SuperX

🔥 Roast

Sure, Omar tweets like he’s debugging a giant AI, but with 10,191 tweets, you’d think he’d have fixed that one typo from 2014 by now. At this rate, even his retweets get their own PhD thesis.

⚡️ Nice achievement

Successfully transitioning from Stanford NLP PhD to an MIT CSAIL assistant professorship, all while maintaining an influential and highly engaged online presence that shapes AI discourse.

🌟 Life's purpose

To advance the frontier of AI and NLP research while cultivating the next generation of computer scientists through impactful teaching and open knowledge sharing.

💬 Values and Beliefs

Omar values rigorous research, transparency, and the democratization of AI knowledge. He believes impactful science is grounded in accessible tools, open-source collaboration, and clear communication that demystifies complex AI concepts.

💪 Strength

His academic credibility combined with active and engaging communication creates a strong, trusted voice in the AI research community. He knows how to translate dense research topics into accessible ideas without losing nuance.

🫣 Weakness

His very technical and research-heavy focus may limit engagement with broader, less specialized audiences and sometimes results in fewer casual or light-hearted interactions, which could restrict community growth outside academia.

⚡️ Growth audience tips

To grow his audience on X, Omar could experiment with more approachable content like AI myths debunked, behind-the-scenes of research life, or quick tips for students. Engaging more in conversations and Q&A threads could also amplify his reach beyond academia.

💁 Bonus

Fun fact: Omar’s prolific tweet count (over 10,000) means he’s likely shared more AI insights on Twitter than some entire textbooks contain, making his feed a goldmine for AI enthusiasts and scholars alike!

Omar Khattab@lateinteraction · Sep 09

that sounds plausible

539k

Omar Khattab@lateinteraction · May 31, 2024

I'm excited to share that I will be joining MIT EECS as an assistant professor in Fall 2025! I'll be recruiting PhD students from the December 2024 application pool. Indicate interest if you'd like to work with me on NLP, IR, or ML Systems! Stay tuned for more about my new lab.

305k

Omar Khattab@lateinteraction · Jun 13

After ~6 years of building these types of architectures (starting with BERT, eg see Baleen), I think calling these multi-agent systems is a distraction. This is just software. Happens to be AI software. It doesn’t seem so complicated once you internalize it’s just a program.

171k

Omar Khattab@lateinteraction · Sep 12

It just hit me that sub 1B-parameter models that are way better than 175B GPT-3 are a dime a dozen today. Kinda cool.

214k

Omar Khattab@lateinteraction · Sep 04, 2024

🔗 Thoughts on Research Impact in AI. Grad students often ask: how do I do research that makes a difference in the current, crowded AI space? This is a blogpost that summarizes my perspective in six guidelines for making research impact via open-source artifacts. Link below.

231k

Omar Khattab@lateinteraction · Jan 24, 2023

Introducing Demonstrate–Search–Predict (𝗗𝗦𝗣), a framework for composing search and LMs w/ up to 120% gains over GPT-3.5. No more prompt engineering.❌ Describe a high-level strategy as imperative code and let 𝗗𝗦𝗣 deal with prompts and queries.🧵 arxiv.org/abs/2212.14024

215k

Omar Khattab@lateinteraction · Oct 11, 2023

A cool thread yesterday used GPT4 ($50), a 500-word ReAct prompt, and ~400 lines of code to finetune Llama2-7B to get 26% HotPotQA EM. Let's use 30 lines of DSPy—without any hand-written prompts or any calls to OpenAI ($0)—to teach a 9x smaller T5 (770M) model to get 39% EM! 🧵

247k

Omar Khattab@lateinteraction · Jan 23, 2024

We started this project thinking LMs can’t be prompted to do classification tasks with over 10,000 classes — especially when documents are long! But the incredible @KarelDoostrlnck found this elegant DSPy program that, once optimized on ~50 examples, sets the state of the art.

Omar Khattab@lateinteraction · Aug 17

Every once in a while, it hits you that word2vec and attention were one year apart.

90k

Omar Khattab@lateinteraction · May 11

DSPy's biggest strength is also the reason it can admittedly be hard to wrap your head around it. It's basically say: LLMs & their methods will continue to improve but not equally in every axis, so: - What's the smallest set of fundamental abstractions that allow you to build downstream AI software that is "future-proof" and rides the tide of progress? - Equivalently, what are the right algorithmic problems that researchers should focus on to enable as much progress as possible for AI software? But this is necessarily complex, in the sense that the answer has to be composed of a few things, not one concept only. (Though if you had to understand one concept only, the fundamental glue is DSPy Signatures.) It's actually only a handful of bets, though, not too many. I've been tweeting them non-stop since late 2022, but I've never collected them in one place. All of these have proven beyond a doubt to have been the right bets so far for 2.5 years, and I think they'll stay the right bets for the next 3 years at least. 1) Information Flow is the single most key aspect of good AI software. As foundation models improve, the bottleneck becomes basically whether you can actually (1) ask them the right question and (2) provide them with all the necessary context to address it. Since 2022, DSPy addressed this in two directions: (i) free-form control flow ("Compound AI Systems" / LM programs) and (ii) Signatures. Prompts have been a massive distraction here, with people thinking they need to find the magical keyword to talk to LLMs. From 2022, DSPy put the focus on *Signatures* (back then called Templates) which force you to break down LM interactions into *structured and named* input fields and *structured and named output fields*. Getting simply those fields right was (and has been) a lot more important than "engineering" the "right prompt". That's the point of Signatures. (We know it's hard for people to force them to define their signatures so carefully, but if you can't do that, your system is going to be bad.) 2) Interactions with LLMs should be Functional and Structured. Again, prompts are bad. People are misled from their chat interaction with LLMs to think that LLMs should take "strings", hence the magical status of "prompts". But actually, you should define a functional contract. What are the things you will give to the function? What is the function supposed to do with them? What is it then supposed to give you back? This is again Signatures. It's (i) structured *inputs*, (ii) structured *outputs*, and (iii) instructions. You've got to decouple these three things, which until DSP (2022) and really until very recently with mainstream structured outputs, were just meshed together into "prompts". This bears repeating: your programmatic LLM interactions need to be functions, not strings. Why? Because there are many concerns that are actually not part of the LLM behavior that you'd otherwise need to handle ad-hoc when working with strings: - How do you format the *inputs* to your LLM into a string? - How do you separate *instructions* and *inputs* (data)? - How do you *specify* the output format (string) that your LLM should produce so you can parse it? - How do you layer on top of this the inference strategy, like CoT or ReAct, without entirely rewriting your prompt? Signatures solve this. They ask you to *just* specify the input fields, output fields, and task instruction. The rest are the job of Modules and Optimizers, which instantiate Signatures. 3) Inference Strategies should be Polymorphic Modules. This sounds scary but the point is that all the cool general-purpose prompting techniques or inference-scaling strategies should be Modules, like the layers in DNN frameworks like PyTorch. Modules are generic functions, which in this case take *any* Signature, and instantiate *its* behavior generically into a well-defined strategy. This means that we can talk about "CoT" or "ReAct" without actually committing at all to the specific task (Signature) you want to apply them to. This is a huge deal, which again only exists in DSPy. One key thing that Modules do is that they define *parameters*. What part(s) of the Module are fixed and which parts can be learned? For example, in CoT, the specific string that asks the model to think step by step could be learned. Or the few-shot examples of thinking step by step should be learnable. In ReAct, demonstrations of good trajectories should be learnable. 4) Specification of your AI software behavior should be decoupled from learning paradigms. Before DSPy, every time a new ML paradigm came by, we re-wrote our AI software. Oh, we moved from LSTMs to Transformers? Or we moved from fine-tuning BERT to ICL with GPT-3? Entirely new system. DSPy says: if you write signatures and instantiate Modules, the Modules actually know exactly what about them can be optimized: the LM underneath, the instructions in the prompt, the demonstrations, etc. The learning paradigms (RL, prompt optimization, program transformations that respect the signature) should be layered on top, with the same frontend / language for expressing the programmatic behavior. This means that the *same programs* you wrote in 2023 in DSPy can now be optimized with dspy.GRPO, the way they could be optimized with dspy.MIPROv2, the way they were optimized with dspy.BootstrapFS before that. The second half of this piece is Downstream Alignment or compile-time scaling. Basically, no matter how good LLMs get, they might not perfectly align with your downstream task, especially when your information flow requires multiple modules and multiple LLM interactions. You need to "compile" towards a metric "late", i.e. after the system is fully defined, no matter how RLHF'ed your models are. 5) Natural Language Optimization is a powerful paradigm of learning. We've said this for years, like with the BetterTogether optimizer paper, but you need both *fine-tuning* and *coarse-tuning* at a higher level in natural language. The analogy I use all the time is riding a bike: it's very hard to learn to ride a bike without practice (fine-tuning), but it's extremely inefficient to learn *avoiding to ride the bike on the side walk* from rewards, you want to understand and learn this rule in natural language to adhere ASAP. This is the source of DSPy's focus on prompt optimizers as a foundational piece here; it's often far superior in sample efficiency to doing policy gradient RL if your problem has the right information flow structure. That's it. That's the set of core bets DSPy has made since 2022/2023 until today. Compiling Declarative AI Functions into LM Calls, with Signatures, Modules, and Optimizers. 1) Information Flow is the single most key aspect of good AI software. 2) Interactions with LLMs should be Functional and Structured. 3) Inference Strategies should be Polymorphic Modules. 4) Specification of your AI software behavior should be decoupled from learning paradigms. 5) Natural Language Optimization is a powerful paradigm of learning.

257k

Omar Khattab@lateinteraction · Dec 18, 2023

Progress on dense retrievers is saturating. The best retrievers in 2024 will apply new forms of late interaction, i.e. scalable attention-like scoring for multi-vector embeddings. A🧵on late interaction, how it works efficiently, and why/where it's been shown to improve quality

264k

Omar Khattab@lateinteraction · Aug 24, 2023

🚨Announcing 𝗗𝗦𝗣𝘆, the framework for solving advanced tasks w/ LMs. Express *any* pipeline as clean, Pythonic control flow. Just ask DSPy to 𝗰𝗼𝗺𝗽𝗶𝗹𝗲 your modular code into auto-tuned chains of prompts or finetunes for GPT, Llama, and/or T5.🧵 github.com/stanfordnlp/ds…

225k

Omar Khattab@lateinteraction · Apr 12, 2024

I worry about a bubble burst once people realize that no AGI is near—no reliably generalist LLMs or “agents”. Might seem less ambitious but it's far wiser to recognize: LLMs mainly create opportunities for making *general* progress for building AIs that solve *specific* tasks.

85k

Omar Khattab@lateinteraction · Aug 19, 2024

Some personal news: I'm thrilled to have joined @Databricks @DbrxMosaicAI as a Research Scientist last month, before I start as MIT faculty in July 2025! Expect increased investment into the open-source DSPy community, new research, & strong emphasis on production concerns 🧵.

81k

Omar Khattab@lateinteraction · Jun 18, 2024

🚨Announcing the largest study focused on *how* to optimize the prompts within LM programs, a key DSPy challenge. Should we use LMs to… Craft instructions? Self-generate examples? Handle credit assignment? Specify a Bayesian model? By @kristahopsalong* @michaelryan207* &team🧵

155k

Omar Khattab@lateinteraction · May 05

So many things in the run-up to DSPy 3. Here's a first, EXPERIMENTAL one: 🚨We're releasing dspy.GRPO, an online RL optimizer for DSPy programs Your DSPy code as-is can be dspy.GRPO'ed. Yes, even compound multi-module programs. Led by @NoahZiems @LakshyAAAgrawal @dilarafsoylu.

175k

Most engaged tweets of Omar Khattab

Omar Khattab@lateinteraction · May 31, 2024

305k

Omar Khattab@lateinteraction · Sep 09

that sounds plausible

539k

Omar Khattab@lateinteraction · Mar 02, 2025

I frankly think I’ve done a mediocre job at explaining the right way to approach DSPy as a paradigm and tool. Thinking of making fresh material. What would make most sense. Regular technical threads? Videos where I rant while tinkering on a VS Code notebook? Lecture-y videos?

61k

Omar Khattab@lateinteraction · Jun 04

Just to gauge interest: If I hosted an hour zoom call bi-weekly to chat or answer any questions about DSPy/ColBERT, AI systems, and research/engineering broadly Would people be interested? Is there a precedent for how to structure this while staying super flexible & informal?

15k

Omar Khattab@lateinteraction · Aug 19, 2024

81k

Omar Khattab@lateinteraction · Jul 23, 2024

Hmm the Llama-3.1 is a big deal but not really seeing a lot of excitement around here for some reason. Release fatigue or it's been known for too long already?

25k

Omar Khattab@lateinteraction · Sep 12

It just hit me that sub 1B-parameter models that are way better than 175B GPT-3 are a dime a dozen today. Kinda cool.

214k

Omar Khattab@lateinteraction · Feb 28, 2025

I’m surprised we’re not really seeing research into “micro-LMs” or something like so. You can probably build out a whole ecosystem of software architectures based on aggressively specialized 30M-parameter LMs or something, right? Wonder how small they can get and be useful.

37k

Omar Khattab@lateinteraction · May 11

257k

Omar Khattab@lateinteraction · Jun 13

171k

Omar Khattab@lateinteraction · Dec 28, 2024

Surprised this is still not a common way to properly evaluate models

51k

Omar Khattab@lateinteraction · Nov 08, 2024

waiting for the day when people start selling Organic Intelligence: 100% certified GPT-free, whole-brain human

133k

Omar Khattab@lateinteraction · Sep 06

Ok, I'll say it. One of the biggest challenges facing DSPy is the lack of competition. As far as I can tell, there's just no other serious programming model for general-purpose, declarative AI programming. I can spend hours listing the costs/downsides of this, not the least of which is the implicit comparison against LLM libraries in too many people's minds. Another downside is that it's really hard for an entity to improve by competing against baselines that suck. Too easy to be complacent, so we have to start & keep challenging ourselves.

35k

Omar Khattab@lateinteraction · Aug 08

Naming things is hard... maybe this will do?

25k

Omar Khattab@lateinteraction · Apr 12, 2024

85k

Omar Khattab@lateinteraction · Mar 29, 2024

I’ll finally have a short break from travel soon. I can do either: (1) a blogpost on how using DSPy well answers most questions I see around building effective LM systems, (2) a DSPy program that shows what more complex compositions allow you to do. What would you like to see?

21k

People with Thought Leader archetype

The Thought Leader

Asad Iqbal@an_asad_

18. Developer. SaaS

18 following20 followers

The Thought Leader

Alex Yao@TheAlexYao

Build World Scale Products | Play Infinite Games | Health: systema.health | @v0 ambassador

3k following2k followers

The Thought Leader

Barnacules Nerdgasm@Barnacules

2k following107k followers

The Thought Leader

Anmol Mahajan@mister_mahajan

Naturally Curious | Building Suitable AI

201 following124 followers

The Thought Leader

Zipporah Leonard@porahwrites

A passionate writer. Just from my heart to you (see link). || COPYWRITER. With captivating, constructive and strategic copy to put your brand at the forefront.

33 following12 followers

The Thought Leader

Abhi$hek@abhiisbuilding

Solo founder 🤹 | Waiting for my - OVERNIGHT SUCCESS ⏳| Built Digital desktop clock - thedigitalclock.com

86 following182 followers

The Thought Leader

stewones@stewones

@statuz_app @userpath_app

94 following317 followers

The Thought Leader

Noah Ziems@NoahZiems

Visiting Researcher @MIT_CSAIL. PhD student @NotreDame advised by @Meng_CS. Creator of Arbor RL library for @DSPyOSS

1k following1k followers

The Thought Leader

前端輕鬆聊@fetalkpodcast

前端輕鬆聊是一個在溫哥華科技公司擔任資深前端工程師 Eric @sdusteric 所主持的頻道。想帶給聽眾前端的知識、一些你可能不知道的tips、工程師的職涯發展以及國外第一手前端技術的新聞。想要跟我一起變成更好的工程師，記得follow喔！

118 following2k followers

The Thought Leader

DHH@dhh

Father of three, Creator of Ruby on Rails + Omarchy, Co-owner & CTO of 37signals, Shopify director, NYT best-selling author, and Le Mans 24h class-winner.

132 following556k followers

The Thought Leader